Set up comprehensive Varnish monitoring using prometheus-varnish-exporter, custom Grafana dashboards, and performance alerting rules for production cache optimization.
Prerequisites
- Varnish 7 installed and running
- Prometheus server configured
- Grafana dashboard access
- Basic systemd knowledge
What this solves
Varnish cache performance monitoring helps you track hit rates, response times, backend health, and memory usage in real-time. This tutorial sets up prometheus-varnish-exporter to collect metrics, configures Prometheus scraping, creates comprehensive Grafana dashboards, and establishes alerting rules for cache misses, high latency, and backend failures.
Step-by-step configuration
Verify Varnish statistics
First verify that Varnish is running and that varnishstat can read its counters. The exporter collects metrics by invoking varnishstat, so no separate statistics daemon is required.
sudo systemctl status varnish
varnishstat -1
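As a quick sanity check before the exporter is in place, the lifetime hit ratio can be computed directly from this output. The sketch below assumes the usual four-column `varnishstat -1` layout (counter name, value, per-second average, description). The `hit_ratio` helper is a hypothetical name, and the sample values are made up so the snippet runs without a live Varnish instance.

```shell
# hit_ratio: read `varnishstat -1` style lines on stdin and print the
# lifetime cache hit ratio as a percentage (hypothetical helper).
hit_ratio() {
  awk '
    $1 == "MAIN.cache_hit"  { hit  = $2 }
    $1 == "MAIN.cache_miss" { miss = $2 }
    END {
      if (hit + miss > 0) printf "%.2f\n", 100 * hit / (hit + miss)
      else print "n/a"
    }'
}

# Sample input standing in for a live instance (values are made up).
sample_stats='MAIN.cache_hit          8000       12.50 Cache hits
MAIN.cache_miss         2000        3.10 Cache misses'

printf '%s\n' "$sample_stats" | hit_ratio
```

On a live host the same helper would be fed with `varnishstat -1 | hit_ratio`. The figure covers the whole uptime of the instance, so it will differ from the five-minute windows used later in the dashboard.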
Install prometheus-varnish-exporter
Download and install prometheus_varnish_exporter, a community-maintained Prometheus exporter for Varnish metrics.
sudo apt update
wget https://github.com/jonnenauha/prometheus_varnish_exporter/releases/download/1.6.1/prometheus_varnish_exporter-1.6.1.linux-amd64.tar.gz
tar xzf prometheus_varnish_exporter-1.6.1.linux-amd64.tar.gz
sudo cp prometheus_varnish_exporter-1.6.1.linux-amd64/prometheus_varnish_exporter /usr/local/bin/
sudo chmod +x /usr/local/bin/prometheus_varnish_exporter
Create varnish-exporter user
Create a dedicated system user for running the exporter securely without shell access.
sudo useradd --system --no-create-home --shell /bin/false varnish-exporter
sudo usermod -a -G varnish varnish-exporter
Create systemd service
Save the following unit as /etc/systemd/system/varnish-exporter.service so the exporter runs as a systemd service with hardened security settings and automatic restart.
[Unit]
Description=Prometheus Varnish Exporter
After=network.target varnish.service
Requires=varnish.service
[Service]
Type=simple
User=varnish-exporter
Group=varnish-exporter
ExecStart=/usr/local/bin/prometheus_varnish_exporter \
-web.listen-address :9131 \
-web.telemetry-path /metrics
Restart=always
RestartSec=10
KillMode=process
# Security settings
NoNewPrivileges=yes
PrivateTmp=yes
ProtectSystem=strict
ProtectHome=yes
ReadOnlyPaths=/
[Install]
WantedBy=multi-user.target
Start and enable varnish-exporter
Enable the service to start on boot and verify it's collecting metrics properly.
sudo systemctl daemon-reload
sudo systemctl enable --now varnish-exporter
sudo systemctl status varnish-exporter
Test metrics endpoint
Verify the exporter is serving Varnish metrics in Prometheus format on port 9131.
curl http://localhost:9131/metrics | grep varnish_main_cache_hit
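For scripted checks it can be handy to pull a single sample out of the exporter's text output rather than eyeballing grep results. The sketch below assumes unlabelled samples in the Prometheus text exposition format (`name value`). The `metric_value` helper is a hypothetical name, and the sample text stands in for a real scrape.

```shell
# metric_value NAME: print the value of the unlabelled sample NAME from
# Prometheus text-format input on stdin; exit non-zero if it is absent
# (hypothetical helper).
metric_value() {
  awk -v name="$1" '
    /^#/ { next }                              # skip HELP and TYPE lines
    $1 == name { print $2; found = 1; exit }   # first matching sample wins
    END { if (!found) exit 1 }'
}

# Sample scrape output (values are made up).
sample_metrics='# HELP varnish_main_cache_hit Cache hits
# TYPE varnish_main_cache_hit counter
varnish_main_cache_hit 8000
varnish_main_cache_miss 2000'

printf '%s\n' "$sample_metrics" | metric_value varnish_main_cache_miss
```

Against the running exporter this would be `curl -s http://localhost:9131/metrics | metric_value varnish_main_cache_hit`. The non-zero exit status on a missing metric makes the helper usable in health-check scripts.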
Configure firewall
Allow the Prometheus server to reach the exporter port while blocking other hosts. Replace 203.0.113.10 with the address of your Prometheus server.
sudo ufw allow from 203.0.113.10 to any port 9131
sudo ufw reload
Add Prometheus scrape configuration
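Add the following job under scrape_configs in your Prometheus configuration to scrape Varnish metrics every 15 seconds. The localhost target assumes Prometheus runs on the Varnish host; otherwise use the Varnish server's address.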
  - job_name: 'varnish'
    static_configs:
      - targets: ['localhost:9131']
    scrape_interval: 15s
    metrics_path: /metrics
    scrape_timeout: 10s
    honor_labels: true
Restart Prometheus
Apply the new configuration and verify Varnish targets are being scraped successfully.
sudo systemctl restart prometheus
sudo systemctl status prometheus
curl -s http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | select(.labels.job=="varnish")'
Create Grafana dashboard
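Save the following dashboard definition as /tmp/varnish-dashboard.json. It defines panels for cache hit rate, request rate, backend response time, and storage usage. Metric names depend on the exporter and Varnish versions, so confirm that each one appears in the exporter's /metrics output before relying on a panel.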
{
"dashboard": {
"id": null,
"title": "Varnish Cache Performance",
"tags": ["varnish", "cache", "performance"],
"timezone": "browser",
"panels": [
{
"id": 1,
"title": "Cache Hit Rate",
"type": "stat",
"targets": [
{
"expr": "100 * (rate(varnish_main_cache_hit[5m]) / (rate(varnish_main_cache_hit[5m]) + rate(varnish_main_cache_miss[5m])))",
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"unit": "percent",
"min": 0,
"max": 100,
"thresholds": {
"steps": [
{"color": "red", "value": 0},
{"color": "yellow", "value": 80},
{"color": "green", "value": 95}
]
}
}
},
"gridPos": {"h": 4, "w": 6, "x": 0, "y": 0}
},
{
"id": 2,
"title": "Requests per Second",
"type": "stat",
"targets": [
{
"expr": "rate(varnish_main_client_req[5m])",
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"unit": "reqps"
}
},
"gridPos": {"h": 4, "w": 6, "x": 6, "y": 0}
},
{
"id": 3,
"title": "Backend Response Time",
"type": "timeseries",
"targets": [
{
"expr": "histogram_quantile(0.95, rate(varnish_backend_req_duration_seconds_bucket[5m]))",
"refId": "A",
"legendFormat": "95th percentile"
},
{
"expr": "histogram_quantile(0.50, rate(varnish_backend_req_duration_seconds_bucket[5m]))",
"refId": "B",
"legendFormat": "50th percentile"
}
],
"fieldConfig": {
"defaults": {
"unit": "s"
}
},
"gridPos": {"h": 8, "w": 12, "x": 0, "y": 4}
},
{
"id": 4,
"title": "Memory Usage",
"type": "timeseries",
"targets": [
{
"expr": "varnish_sma_g_bytes{type=\"s0\"}",
"refId": "A",
"legendFormat": "Used"
},
{
"expr": "varnish_sma_g_space{type=\"s0\"}",
"refId": "B",
"legendFormat": "Available"
}
],
"fieldConfig": {
"defaults": {
"unit": "bytes"
}
},
"gridPos": {"h": 8, "w": 12, "x": 12, "y": 4}
}
],
"time": {
"from": "now-1h",
"to": "now"
},
"refresh": "5s"
}
}
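The Cache Hit Rate panel divides the growth in hits by the growth in hits plus misses over a five-minute window. The arithmetic can be sketched from two counter readings taken at the start and end of a window. This is a simplification: Prometheus's `rate()` also handles counter resets and extrapolation, which the sketch ignores. The `window_hit_pct` helper name and the sample readings are hypothetical.

```shell
# window_hit_pct HIT_START MISS_START HIT_END MISS_END
# Print the hit percentage for the interval between two counter readings
# (hypothetical helper; assumes the counters did not reset in between).
window_hit_pct() {
  awk -v h1="$1" -v m1="$2" -v h2="$3" -v m2="$4" 'BEGIN {
    dh = h2 - h1    # hits gained during the window
    dm = m2 - m1    # misses gained during the window
    if (dh + dm > 0) printf "%.1f\n", 100 * dh / (dh + dm)
    else print "n/a"
  }'
}

# 950 new hits and 50 new misses during the window.
window_hit_pct 8000 2000 8950 2050
```

With these readings the lifetime ratio is still close to 80%, while the window shows the cache is currently serving 95% of lookups. That difference is why the panel uses rates rather than raw counters.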
Import dashboard via API
Use the Grafana HTTP API to import the dashboard. The admin:admin credentials below are Grafana's defaults; substitute your own.
curl -X POST \
http://admin:admin@localhost:3000/api/dashboards/db \
-H 'Content-Type: application/json' \
-d @/tmp/varnish-dashboard.json
Create alerting rules
Save the following rules as /etc/prometheus/rules/varnish.yml to alert on low cache hit rates, slow backend responses, backend failures, storage pressure, and client errors.
groups:
  - name: varnish_alerts
    rules:
      - alert: VarnishLowCacheHitRate
        expr: 100 * (rate(varnish_main_cache_hit[5m]) / (rate(varnish_main_cache_hit[5m]) + rate(varnish_main_cache_miss[5m]))) < 80
        for: 2m
        labels:
          severity: warning
          service: varnish
        annotations:
          summary: "Varnish cache hit rate is below 80%"
          description: "Cache hit rate is {{ $value }}% for the last 5 minutes"
      - alert: VarnishHighBackendResponseTime
        expr: histogram_quantile(0.95, rate(varnish_backend_req_duration_seconds_bucket[5m])) > 1
        for: 2m
        labels:
          severity: critical
          service: varnish
        annotations:
          summary: "Varnish backend response time is high"
          description: "95th percentile backend response time is {{ $value }}s"
      - alert: VarnishBackendDown
        expr: varnish_backend_up == 0
        for: 30s
        labels:
          severity: critical
          service: varnish
        annotations:
          summary: "Varnish backend is down"
          description: "Backend {{ $labels.backend }} is not responding"
      - alert: VarnishHighMemoryUsage
        # g_space is the free space left in the storage backend, so total size is g_bytes + g_space
        expr: 100 * varnish_sma_g_bytes{type="s0"} / (varnish_sma_g_bytes{type="s0"} + varnish_sma_g_space{type="s0"}) > 90
        for: 5m
        labels:
          severity: warning
          service: varnish
        annotations:
          summary: "Varnish memory usage is high"
          description: "Memory usage is {{ $value }}% of allocated space"
      - alert: VarnishClientErrors
        expr: rate(varnish_main_client_resp_4xx[5m]) > 10
        for: 2m
        labels:
          severity: warning
          service: varnish
        annotations:
          summary: "High rate of 4xx errors from Varnish"
          description: "4xx error rate is {{ $value }} per second"
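For the memory alert it helps to see how storage utilization is derived. In Varnish's malloc storage counters, `g_bytes` is the number of bytes in use and `g_space` is the number of bytes still available, so the total is their sum. The sketch below works that arithmetic through. The `sma_usage_pct` helper name and the sample figures (900 MiB used, 124 MiB free of a 1 GiB cache) are hypothetical.

```shell
# sma_usage_pct USED FREE: percentage of the storage backend in use, where
# USED is the g_bytes reading and FREE is the g_space reading
# (hypothetical helper).
sma_usage_pct() {
  awk -v used="$1" -v free="$2" 'BEGIN {
    total = used + free    # total storage = in use + still available
    if (total > 0) printf "%.1f\n", 100 * used / total
    else print "n/a"
  }'
}

# 900 MiB used and 124 MiB free in a 1 GiB malloc store.
sma_usage_pct 943718400 130023424
```

Dividing used bytes by free bytes instead would yield values far above 100% as the cache fills, so a threshold such as 90 only makes sense against used divided by total.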
Update Prometheus configuration
Add the new alerting rules file to Prometheus configuration and restart the service.
rule_files:
- "/etc/prometheus/rules/varnish.yml"
sudo systemctl restart prometheus
sudo systemctl status prometheus
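A restart with a broken configuration or rules file takes Prometheus down, so it can be worth validating first. The sketch below wraps `promtool check config`, which also checks the rule files the configuration references, in a hypothetical `validate_then_restart` helper. The default path /etc/prometheus/prometheus.yml is an assumption, and the `PROMTOOL` and `RESTART_CMD` variables are illustrative overrides that allow a dry run without touching the service.

```shell
# validate_then_restart [CONFIG]: restart Prometheus only if promtool accepts
# the configuration (hypothetical helper). PROMTOOL and RESTART_CMD can be
# overridden to dry-run the logic; the default config path is an assumption.
validate_then_restart() {
  config="${1:-/etc/prometheus/prometheus.yml}"
  if ${PROMTOOL:-promtool} check config "$config"; then
    ${RESTART_CMD:-sudo systemctl restart prometheus}
  else
    echo "config check failed; Prometheus left running with the old config" >&2
    return 1
  fi
}
```

Calling `validate_then_restart` with no arguments checks the assumed default path. Setting `RESTART_CMD='echo would restart'` beforehand shows what would happen without restarting anything.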
Configure advanced dashboard panels
Append the following panels to the panels array of the dashboard definition to track thread usage, object counts, and purge operations.
{
"panels": [
{
"id": 5,
"title": "Thread Pool Usage",
"type": "timeseries",
"targets": [
{
"expr": "varnish_main_threads",
"refId": "A",
"legendFormat": "Active Threads"
},
{
"expr": "varnish_main_thread_queue_len",
"refId": "B",
"legendFormat": "Queued Requests"
}
],
"gridPos": {"h": 6, "w": 8, "x": 0, "y": 12}
},
{
"id": 6,
"title": "Object Lifecycle",
"type": "timeseries",
"targets": [
{
"expr": "varnish_main_n_object",
"refId": "A",
"legendFormat": "Objects in cache"
},
{
"expr": "varnish_main_n_objecthead",
"refId": "B",
"legendFormat": "Object heads in cache"
}
],
"gridPos": {"h": 6, "w": 8, "x": 8, "y": 12}
},
{
"id": 7,
"title": "Purge Operations",
"type": "timeseries",
"targets": [
{
"expr": "rate(varnish_main_n_purges[5m])",
"refId": "A",
"legendFormat": "Purges per second"
},
{
"expr": "rate(varnish_main_n_obj_purged[5m])",
"refId": "B",
"legendFormat": "Objects purged per second"
}
],
"gridPos": {"h": 6, "w": 8, "x": 16, "y": 12}
}
]
}
Set up Grafana notifications
Configure a notification channel for critical Varnish alerts. The target below is a Slack incoming webhook, so the channel type is slack rather than a generic webhook. The /api/alert-notifications endpoint belongs to Grafana's legacy alerting; on installations that use unified alerting, create a contact point instead.
curl -X POST \
http://admin:admin@localhost:3000/api/alert-notifications \
-H 'Content-Type: application/json' \
-d '{
"name": "varnish-alerts",
"type": "slack",
"settings": {
"url": "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
}
}'
Verify your setup
Test that all components are working together and collecting metrics properly.
# Check exporter is running and serving metrics
sudo systemctl status varnish-exporter
curl -s http://localhost:9131/metrics | grep -E "varnish_main_(cache_hit|cache_miss|client_req)"
# Verify Prometheus is scraping Varnish metrics
curl -s -G http://localhost:9090/api/v1/query --data-urlencode 'query=up{job="varnish"}'
# Test that the dashboard was imported
curl -s 'http://admin:admin@localhost:3000/api/search?query=varnish'
# Generate some cache activity to see metrics
# (varnishtest starts its own throwaway instances, so send requests through the running cache instead;
# port 6081 is an assumption, adjust it to your Varnish listen address)
for i in $(seq 1 20); do curl -s -o /dev/null http://localhost:6081/; done
# Check alert rules are loaded
curl -s http://localhost:9090/api/v1/rules | jq '.data.groups[] | select(.name=="varnish_alerts")'
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| Exporter returns "permission denied" | User not in varnish group | sudo usermod -a -G varnish varnish-exporter |
| No metrics in Prometheus | Firewall blocking scrape | Check firewall rules and Prometheus config |
| Dashboard shows "No data" | Wrong metric names or time range | Verify metric names with curl localhost:9131/metrics |
| Alerts not firing | Rules syntax error or thresholds too high | promtool check rules /etc/prometheus/rules/varnish.yml |
| High memory usage alerts | Varnish cache size misconfigured | Adjust malloc size in varnish systemd config |
Next steps
- Set up Prometheus and Grafana monitoring stack with Docker Compose for a complete monitoring solution
- Install and configure Varnish Cache 7 with NGINX backend to optimize your web acceleration setup
- Configure Varnish cache warming strategies to improve cache hit rates
- Set up Varnish cluster with load balancing for high availability deployments
- Configure advanced alerting rules for comprehensive Varnish performance monitoring
Automated install script
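This script automates the exporter installation, systemd service, and firewall steps. Prometheus scrape configuration, alerting rules, and Grafana dashboards still need to be set up manually.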
#!/usr/bin/env bash
set -euo pipefail
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
# Configuration
EXPORTER_VERSION="1.6.1"
EXPORTER_URL="https://github.com/jonnenauha/prometheus_varnish_exporter/releases/download/${EXPORTER_VERSION}/prometheus_varnish_exporter-${EXPORTER_VERSION}.linux-amd64.tar.gz"
PROMETHEUS_SERVER_IP="${1:-127.0.0.1}"
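# Usage function
usage() {
echo "Usage: $0 [prometheus_server_ip]"
echo "Example: $0 192.168.1.100"
exit 1
}
# Show usage when help is requested
if [[ "${1:-}" == "-h" || "${1:-}" == "--help" ]]; then
usage
fi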
# Error handling
cleanup_on_error() {
echo -e "${RED}[ERROR] Installation failed. Cleaning up...${NC}"
systemctl stop varnish-exporter 2>/dev/null || true
systemctl disable varnish-exporter 2>/dev/null || true
rm -f /etc/systemd/system/varnish-exporter.service
userdel -f varnish-exporter 2>/dev/null || true
rm -f /usr/local/bin/prometheus_varnish_exporter
rm -rf /tmp/varnish_exporter_install
exit 1
}
trap cleanup_on_error ERR
# Check if running as root
if [[ $EUID -ne 0 ]]; then
echo -e "${RED}This script must be run as root${NC}"
exit 1
fi
# Detect distribution
if [ -f /etc/os-release ]; then
. /etc/os-release
case "$ID" in
ubuntu|debian)
PKG_MGR="apt"
PKG_INSTALL="apt install -y"
PKG_UPDATE="apt update"
FIREWALL_CMD="ufw"
;;
almalinux|rocky|centos|rhel|ol|fedora)
PKG_MGR="dnf"
PKG_INSTALL="dnf install -y"
PKG_UPDATE="dnf makecache"
FIREWALL_CMD="firewalld"
;;
amzn)
PKG_MGR="yum"
PKG_INSTALL="yum install -y"
PKG_UPDATE="yum makecache"
FIREWALL_CMD="firewalld"
;;
*)
echo -e "${RED}Unsupported distribution: $ID${NC}"
exit 1
;;
esac
else
echo -e "${RED}/etc/os-release not found. Cannot detect distribution.${NC}"
exit 1
fi
echo -e "${GREEN}Installing Varnish Prometheus Exporter on $PRETTY_NAME${NC}"
# Step 1: Verify prerequisites
echo -e "${YELLOW}[1/9] Checking prerequisites...${NC}"
if ! systemctl is-active varnish >/dev/null 2>&1; then
echo -e "${RED}Varnish service is not running. Please install and start Varnish first.${NC}"
exit 1
fi
if ! command -v varnishstat >/dev/null 2>&1; then
echo -e "${RED}varnishstat command not found. Please install Varnish tools.${NC}"
exit 1
fi
# Step 2: Update packages
echo -e "${YELLOW}[2/9] Updating package repositories...${NC}"
$PKG_UPDATE >/dev/null
# Step 3: Install required packages
echo -e "${YELLOW}[3/9] Installing required packages...${NC}"
$PKG_INSTALL wget tar curl >/dev/null
# Step 4: Download and install exporter
echo -e "${YELLOW}[4/9] Downloading Varnish Prometheus Exporter...${NC}"
cd /tmp
mkdir -p varnish_exporter_install
cd varnish_exporter_install
wget -q "$EXPORTER_URL"
tar xzf "prometheus_varnish_exporter-${EXPORTER_VERSION}.linux-amd64.tar.gz"
cp "prometheus_varnish_exporter-${EXPORTER_VERSION}.linux-amd64/prometheus_varnish_exporter" /usr/local/bin/
chmod 755 /usr/local/bin/prometheus_varnish_exporter
chown root:root /usr/local/bin/prometheus_varnish_exporter
# Step 5: Create system user
echo -e "${YELLOW}[5/9] Creating varnish-exporter user...${NC}"
if ! id varnish-exporter >/dev/null 2>&1; then
useradd --no-create-home --shell /bin/false --system varnish-exporter
fi
usermod -a -G varnish varnish-exporter
# Step 6: Create systemd service
echo -e "${YELLOW}[6/9] Creating systemd service...${NC}"
cat > /etc/systemd/system/varnish-exporter.service << 'EOF'
[Unit]
Description=Prometheus Varnish Exporter
After=network.target varnish.service
Requires=varnish.service
[Service]
Type=simple
User=varnish-exporter
Group=varnish-exporter
ExecStart=/usr/local/bin/prometheus_varnish_exporter \
-web.listen-address :9131 \
-web.telemetry-path /metrics
Restart=always
RestartSec=10
KillMode=process
# Security settings
NoNewPrivileges=yes
PrivateTmp=yes
ProtectSystem=strict
ProtectHome=yes
ReadOnlyPaths=/
[Install]
WantedBy=multi-user.target
EOF
chmod 644 /etc/systemd/system/varnish-exporter.service
# Step 7: Start and enable service
echo -e "${YELLOW}[7/9] Starting varnish-exporter service...${NC}"
systemctl daemon-reload
systemctl enable varnish-exporter
systemctl start varnish-exporter
# Wait for service to start
sleep 3
# Step 8: Configure firewall
echo -e "${YELLOW}[8/9] Configuring firewall...${NC}"
if [[ "$FIREWALL_CMD" == "ufw" ]]; then
if systemctl is-active ufw >/dev/null 2>&1; then
ufw allow from "$PROMETHEUS_SERVER_IP" to any port 9131 >/dev/null
ufw --force reload >/dev/null
fi
elif [[ "$FIREWALL_CMD" == "firewalld" ]]; then
if systemctl is-active firewalld >/dev/null 2>&1; then
firewall-cmd --permanent --add-rich-rule="rule family=\"ipv4\" source address=\"$PROMETHEUS_SERVER_IP\" port protocol=\"tcp\" port=\"9131\" accept" >/dev/null
firewall-cmd --reload >/dev/null
fi
fi
# Step 9: Verification
echo -e "${YELLOW}[9/9] Verifying installation...${NC}"
# Check service status
if ! systemctl is-active varnish-exporter >/dev/null; then
echo -e "${RED}varnish-exporter service is not running${NC}"
systemctl status varnish-exporter
exit 1
fi
# Check metrics endpoint
if ! curl -s http://localhost:9131/metrics | grep -q varnish_main_cache_hit; then
echo -e "${RED}Metrics endpoint is not responding correctly${NC}"
exit 1
fi
# Cleanup temp files
rm -rf /tmp/varnish_exporter_install
echo -e "${GREEN}✓ Varnish Prometheus Exporter installed successfully!${NC}"
echo ""
echo "Service status:"
systemctl status varnish-exporter --no-pager -l
echo ""
echo "Metrics available at: http://localhost:9131/metrics"
echo "Scrapes allowed from (if a firewall is active): $PROMETHEUS_SERVER_IP"
echo ""
echo "Add this to your Prometheus configuration:"
echo "  - job_name: 'varnish'"
echo "    static_configs:"
echo "      - targets: ['$(hostname -I | awk '{print $1}'):9131']"
echo "    scrape_interval: 15s"
Review the script before running. It must run as root and accepts the Prometheus server IP as an optional argument: sudo bash install.sh 192.168.1.100