Implement Varnish cache warming with automated content preloading for high-performance websites

Intermediate 35 min Apr 05, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Set up automated Varnish cache warming with priority URL preloading, systemd timers for scheduled content refreshing, and comprehensive monitoring to optimize cache hit rates and reduce backend server load for high-traffic websites.

Prerequisites

  • Existing Varnish installation
  • Root or sudo access
  • Python 3.6 or higher
  • Basic understanding of HTTP caching

What this solves

Varnish cache warming prevents cold cache performance issues by automatically preloading critical content before users request it. This eliminates first-request latency spikes and maintains consistently fast response times. Cache warming is essential for high-traffic websites that need predictable performance after deployments, server restarts, or cache invalidations.

Step-by-step configuration

Update system packages

Start by updating your package manager to ensure you get the latest versions of required tools.

# Debian/Ubuntu
sudo apt update && sudo apt upgrade -y

# AlmaLinux/Rocky Linux
sudo dnf update -y

Install cache warming dependencies

Install curl for making HTTP requests, jq for JSON parsing, and Python for advanced warming scripts.

# Debian/Ubuntu
sudo apt install -y curl jq python3 python3-pip python3-venv bc

# AlmaLinux/Rocky Linux
sudo dnf install -y curl jq python3 python3-pip bc

Create cache warming directory structure

Set up organized directories for warming scripts, URL lists, logs, and configuration files.

sudo mkdir -p /opt/varnish-warming/{scripts,config,logs,urls}
sudo mkdir -p /var/log/varnish-warming

Create the main cache warming script

Save this script as /opt/varnish-warming/scripts/warm-cache.sh. It handles URL loading with priority levels, parallel processing, and detailed logging.

#!/bin/bash
# Varnish Cache Warming Script
# Usage: ./warm-cache.sh [priority_level] [max_parallel]

set -euo pipefail

# Configuration
CONFIG_DIR="/opt/varnish-warming/config"
LOG_DIR="/var/log/varnish-warming"
URL_DIR="/opt/varnish-warming/urls"
VARNISH_HOST="127.0.0.1"
VARNISH_PORT="80"
DEFAULT_TIMEOUT="30"
DEFAULT_PARALLEL="10"
USER_AGENT="VarnishWarming/1.0"

# Load configuration if it exists
if [[ -f "$CONFIG_DIR/warming.conf" ]]; then
    source "$CONFIG_DIR/warming.conf"
fi

# Parameters
PRIORITY_LEVEL=${1:-"high"}
MAX_PARALLEL=${2:-$DEFAULT_PARALLEL}
LOG_FILE="$LOG_DIR/warming-$(date +%Y%m%d-%H%M%S).log"
STATS_FILE="$LOG_DIR/warming-stats.json"

# Logging function
log() {
    local level=$1
    shift
    echo "[$(date +'%Y-%m-%d %H:%M:%S')] [$level] $*" | tee -a "$LOG_FILE"
}

# Warm a single URL
warm_url() {
    local url=$1
    local priority=${2:-"medium"}
    local start_time=$(date +%s.%N)

    # First request with cache-busting headers
    local response=$(curl -s -w "%{http_code},%{time_total},%{size_download}" \
        -H "Cache-Control: no-cache" \
        -H "Pragma: no-cache" \
        -H "User-Agent: $USER_AGENT" \
        -H "X-Cache-Warming: true" \
        --max-time "$DEFAULT_TIMEOUT" \
        -o /dev/null \
        "$url" 2>/dev/null || echo "000,0,0")

    local http_code=$(echo "$response" | cut -d',' -f1)
    local response_time=$(echo "$response" | cut -d',' -f2)
    local content_size=$(echo "$response" | cut -d',' -f3)

    # Second request to populate the cache
    if [[ "$http_code" == "200" ]]; then
        curl -s \
            -H "User-Agent: $USER_AGENT" \
            --max-time "$DEFAULT_TIMEOUT" \
            -o /dev/null \
            "$url" 2>/dev/null || true
    fi

    local end_time=$(date +%s.%N)
    local total_time=$(echo "$end_time - $start_time" | bc)

    # Log result and emit one CSV record per URL
    if [[ "$http_code" == "200" ]]; then
        log "INFO" "SUCCESS: $url ($priority) - ${response_time}s, ${content_size} bytes"
        echo "success,$url,$priority,$http_code,$response_time,$content_size,$total_time"
    else
        log "WARN" "FAILED: $url ($priority) - HTTP $http_code"
        echo "failed,$url,$priority,$http_code,$response_time,$content_size,$total_time"
    fi
}
export -f warm_url
export -f log
export VARNISH_HOST VARNISH_PORT DEFAULT_TIMEOUT USER_AGENT LOG_FILE

# Warm all URLs for one priority level
warm_cache() {
    local priority=$1
    local url_file="$URL_DIR/urls-$priority.txt"

    if [[ ! -f "$url_file" ]]; then
        log "WARN" "URL file not found: $url_file"
        return 1
    fi

    # Count only real URLs, skipping comments and blank lines
    local url_count=$(grep -cv -e '^#' -e '^$' "$url_file" || true)
    log "INFO" "Starting cache warming for $priority priority ($url_count URLs)"

    # Process URLs in parallel
    grep -v '^#' "$url_file" | grep -v '^$' | \
        parallel -j "$MAX_PARALLEL" --will-cite \
        "warm_url {} $priority" > "/tmp/warming-results-$priority.csv"

    # Calculate statistics
    local success_count=$(grep -c '^success,' "/tmp/warming-results-$priority.csv" || true)
    local failed_count=$(grep -c '^failed,' "/tmp/warming-results-$priority.csv" || true)
    local success_rate=0
    if [[ $url_count -gt 0 ]]; then
        success_rate=$(echo "scale=2; $success_count * 100 / $url_count" | bc)
    fi

    log "INFO" "Completed $priority priority: $success_count/$url_count successful (${success_rate}%)"

    # Append a stats record (these fields are read by the monitoring script)
    cat <<EOF >> "$STATS_FILE"
{"timestamp": "$(date -Iseconds)", "priority": "$priority", "successful": $success_count, "failed": $failed_count, "success_rate": $success_rate}
EOF

    # Cleanup
    rm -f "/tmp/warming-results-$priority.csv"
}

# Install GNU parallel if not available
if ! command -v parallel &> /dev/null; then
    log "INFO" "Installing GNU parallel for concurrent processing"
    if command -v apt &> /dev/null; then
        sudo apt install -y parallel
    elif command -v dnf &> /dev/null; then
        sudo dnf install -y parallel
    fi
fi

# Main execution
log "INFO" "Starting Varnish cache warming (priority: $PRIORITY_LEVEL, parallel: $MAX_PARALLEL)"

case "$PRIORITY_LEVEL" in
    "high")   warm_cache "high" ;;
    "medium") warm_cache "medium" ;;
    "low")    warm_cache "low" ;;
    "all")
        warm_cache "high"
        warm_cache "medium"
        warm_cache "low"
        ;;
    *)
        log "ERROR" "Invalid priority level: $PRIORITY_LEVEL (use: high, medium, low, all)"
        exit 1
        ;;
esac

log "INFO" "Cache warming completed"
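Each warm_url call emits one CSV record, and warm_cache aggregates them into a success rate. The aggregation step can be checked standalone with made-up sample records (the URLs and values below are illustrative only):

```shell
# Made-up sample of the records warm_url emits: status,url,priority,code,time,size,total
cat > /tmp/results.csv <<'EOF'
success,http://example.com/,high,200,0.12,5120,0.30
failed,http://example.com/missing,high,404,0.05,0,0.10
success,http://example.com/about,high,200,0.09,2048,0.21
EOF

# Aggregate the way warm_cache does: count successes, derive a percentage
awk -F',' '{n++} $1=="success"{s++} END{printf "%d/%d (%.2f%%)\n", s, n, s*100/n}' /tmp/results.csv
# → 2/3 (66.67%)
```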

Create configuration file

Set up the main configuration file with customizable parameters for your environment. Create it as /opt/varnish-warming/config/warming.conf, which the warming script sources on startup.

# Varnish Cache Warming Configuration

# Varnish connection settings
VARNISH_HOST="127.0.0.1"
VARNISH_PORT="80"

# Request settings
DEFAULT_TIMEOUT="30"
DEFAULT_PARALLEL="15"
USER_AGENT="VarnishWarming/1.0"

# Retry settings
MAX_RETRIES="2"
RETRY_DELAY="1"

# Logging
LOG_RETENTION_DAYS="7"
DEBUG_MODE="false"

# Performance settings
RATE_LIMIT_DELAY="0.1"     # Seconds between requests to the same host
MAX_CONCURRENT_HOSTS="5"   # Max parallel hosts to warm simultaneously
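Because the warming script sources this file after defining its own defaults, any value in warming.conf wins. A quick standalone sketch of that mechanism, using a throwaway file in /tmp:

```shell
# Default hard-coded in the script
DEFAULT_PARALLEL="10"

# A minimal throwaway config overriding it
cat > /tmp/warming.conf <<'EOF'
DEFAULT_PARALLEL="15"
EOF

# Same mechanism the script uses: source the config after setting defaults
source /tmp/warming.conf
echo "$DEFAULT_PARALLEL"
# → 15
```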

Create priority URL lists

Set up URL lists organized by priority level for efficient cache warming scheduling. Save the high priority list as /opt/varnish-warming/urls/urls-high.txt; lines starting with # are comments and are skipped by the warming script.

# High priority URLs - critical pages that need immediate caching

# Homepage and main landing pages
http://example.com/
http://example.com/index.html
http://example.com/about
http://example.com/contact

# Critical API endpoints
http://example.com/api/health
http://example.com/api/config

# Essential assets
http://example.com/css/main.css
http://example.com/js/app.js
http://example.com/images/logo.png

Create medium and low priority URL lists

Add lists for medium and low priority content as /opt/varnish-warming/urls/urls-medium.txt and /opt/varnish-warming/urls/urls-low.txt to broaden cache warming coverage.

# Medium priority URLs - important but not critical

# Product categories and popular content
http://example.com/products
http://example.com/services
http://example.com/blog
http://example.com/news

# User-facing pages
http://example.com/login
http://example.com/register
http://example.com/dashboard

# Common resources
http://example.com/css/theme.css
http://example.com/js/utils.js

# Low priority URLs - nice to have cached

# Secondary pages and content
http://example.com/privacy
http://example.com/terms
http://example.com/help
http://example.com/faq

# Archive and historical content
http://example.com/archive
http://example.com/old-blog-posts

# Non-critical assets
http://example.com/images/background.jpg
http://example.com/fonts/custom.woff2
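The warming script strips comment and blank lines from these lists before handing URLs to parallel. A standalone sketch of that filter, with a throwaway demo file:

```shell
# Throwaway list mixing real URLs, a comment, and a blank line
cat > /tmp/urls-demo.txt <<'EOF'
# High priority URLs
http://example.com/

http://example.com/about
EOF

# The same filter the warming script applies before processing
grep -v '^#' /tmp/urls-demo.txt | grep -v '^$'
# → http://example.com/
# → http://example.com/about
```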

Create URL discovery script

Save this script as /opt/varnish-warming/scripts/discover-urls.py. It automatically discovers URLs from sitemaps and access logs to keep the URL lists current.

#!/usr/bin/env python3

import requests
import xml.etree.ElementTree as ET
import re
import argparse
from urllib.parse import urljoin
from collections import defaultdict
import logging

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)


class URLDiscovery:
    def __init__(self, base_url, output_dir='/opt/varnish-warming/urls'):
        self.base_url = base_url.rstrip('/')
        self.output_dir = output_dir
        self.discovered_urls = defaultdict(list)

    def discover_from_sitemap(self, sitemap_url=None):
        """Discover URLs from an XML sitemap."""
        if not sitemap_url:
            sitemap_url = urljoin(self.base_url, '/sitemap.xml')
        try:
            response = requests.get(sitemap_url, timeout=30)
            response.raise_for_status()
            root = ET.fromstring(response.content)

            # Handle sitemap index files by recursing into each child sitemap
            if 'sitemapindex' in root.tag:
                for sitemap in root.findall('.//{http://www.sitemaps.org/schemas/sitemap/0.9}sitemap'):
                    loc = sitemap.find('{http://www.sitemaps.org/schemas/sitemap/0.9}loc')
                    if loc is not None:
                        self.discover_from_sitemap(loc.text)
            else:
                # Handle a URL set
                for url in root.findall('.//{http://www.sitemaps.org/schemas/sitemap/0.9}url'):
                    loc = url.find('{http://www.sitemaps.org/schemas/sitemap/0.9}loc')
                    priority = url.find('{http://www.sitemaps.org/schemas/sitemap/0.9}priority')
                    if loc is not None:
                        url_text = loc.text
                        priority_value = float(priority.text) if priority is not None else 0.5
                        # Categorize by sitemap priority
                        if priority_value >= 0.8:
                            self.discovered_urls['high'].append(url_text)
                        elif priority_value >= 0.5:
                            self.discovered_urls['medium'].append(url_text)
                        else:
                            self.discovered_urls['low'].append(url_text)

            logger.info(f"Discovered {sum(len(urls) for urls in self.discovered_urls.values())} URLs from sitemap")
        except Exception as e:
            logger.error(f"Error parsing sitemap {sitemap_url}: {e}")

    def discover_from_access_log(self, log_file, min_requests=5):
        """Discover popular URLs from access logs."""
        url_counts = defaultdict(int)
        try:
            with open(log_file, 'r') as f:
                for line in f:
                    # Parse common log format: "GET /path HTTP/1.1"
                    match = re.search(r'"[A-Z]+ (\S+)', line)
                    if match:
                        path = match.group(1)
                        if not path.startswith('/'):
                            continue
                        # Skip common non-cacheable paths
                        if any(skip in path for skip in ['.php', '.cgi', '/admin/', '/api/auth']):
                            continue
                        url = urljoin(self.base_url, path)
                        url_counts[url] += 1

            # Categorize by popularity
            sorted_urls = sorted(url_counts.items(), key=lambda x: x[1], reverse=True)
            for url, count in sorted_urls:
                if count >= min_requests * 5:
                    self.discovered_urls['high'].append(url)
                elif count >= min_requests * 2:
                    self.discovered_urls['medium'].append(url)
                elif count >= min_requests:
                    self.discovered_urls['low'].append(url)

            logger.info(f"Discovered {len(sorted_urls)} URLs from access logs")
        except Exception as e:
            logger.error(f"Error parsing access log {log_file}: {e}")

    def save_url_lists(self):
        """Save discovered URLs to priority files."""
        for priority, urls in self.discovered_urls.items():
            # Remove duplicates and sort
            unique_urls = sorted(set(urls))
            filename = f"{self.output_dir}/urls-{priority}.txt"
            try:
                with open(filename, 'w') as f:
                    f.write(f"# {priority.title()} priority URLs - Auto-generated\n")
                    f.write(f"# Total URLs: {len(unique_urls)}\n\n")
                    for url in unique_urls:
                        f.write(f"{url}\n")
                logger.info(f"Saved {len(unique_urls)} {priority} priority URLs to {filename}")
            except Exception as e:
                logger.error(f"Error saving URL list {filename}: {e}")


def main():
    parser = argparse.ArgumentParser(description='Discover URLs for Varnish cache warming')
    parser.add_argument('base_url', help='Base URL of the website')
    parser.add_argument('--sitemap', help='Sitemap URL (default: /sitemap.xml)')
    parser.add_argument('--access-log', help='Path to access log file')
    parser.add_argument('--output-dir', default='/opt/varnish-warming/urls', help='Output directory')
    parser.add_argument('--min-requests', type=int, default=5,
                        help='Minimum requests for log-based discovery')
    args = parser.parse_args()

    discovery = URLDiscovery(args.base_url, args.output_dir)

    # Discover from sitemap
    discovery.discover_from_sitemap(args.sitemap)

    # Discover from access logs if provided
    if args.access_log:
        discovery.discover_from_access_log(args.access_log, args.min_requests)

    # Save results
    discovery.save_url_lists()

    total_urls = sum(len(urls) for urls in discovery.discovered_urls.values())
    logger.info(f"URL discovery completed. Total URLs: {total_urls}")


if __name__ == '__main__':
    main()
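The sitemap branch buckets URLs by the sitemap's priority value (>= 0.8 high, >= 0.5 medium, else low). The same thresholds can be checked from the shell with awk, which handles the float comparison:

```shell
# Same thresholds as URLDiscovery.discover_from_sitemap, as a standalone sketch
bucket() {
  awk -v p="$1" 'BEGIN { if (p >= 0.8) print "high"; else if (p >= 0.5) print "medium"; else print "low" }'
}
bucket 1.0   # → high
bucket 0.6   # → medium
bucket 0.3   # → low
```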

Set proper permissions

Configure ownership and permissions for the warming scripts and directories. The scripts need execute permissions while config files should be readable by the service user.

sudo chmod +x /opt/varnish-warming/scripts/warm-cache.sh
sudo chmod +x /opt/varnish-warming/scripts/discover-urls.py
sudo chown -R www-data:www-data /opt/varnish-warming
sudo chmod 755 /opt/varnish-warming/scripts/
sudo chmod 644 /opt/varnish-warming/config/*
sudo chmod 644 /opt/varnish-warming/urls/*
sudo chown -R www-data:www-data /var/log/varnish-warming
sudo chmod 755 /var/log/varnish-warming

Note: Directories get 755 (read/execute for everyone, write for owner) and files get 644 (read for everyone, write for owner). The www-data user owns both trees because the systemd service runs the warming and discovery scripts as www-data, which must be able to write URL lists, logs, and stats files.
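To confirm what a given octal mode means, GNU stat can print both forms side by side. A quick sketch with a throwaway file:

```shell
# Throwaway file to illustrate mode 644 (owner read/write, everyone else read-only)
touch /tmp/permdemo
chmod 644 /tmp/permdemo
stat -c '%a %A' /tmp/permdemo
# → 644 -rw-r--r--
```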

Create systemd service for cache warming

Set up a systemd service to manage cache warming operations with proper resource limits and logging. Create it as /etc/systemd/system/varnish-warming.service.

[Unit]
Description=Varnish Cache Warming Service
After=network.target varnish.service
Requires=varnish.service
Wants=network-online.target

[Service]
Type=oneshot
User=www-data
Group=www-data
ExecStart=/opt/varnish-warming/scripts/warm-cache.sh all 15
WorkingDirectory=/opt/varnish-warming
Environment=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
StandardOutput=journal
StandardError=journal

# Resource limits
MemoryMax=512M
CPUQuota=50%
TasksMax=50

# Security settings
NoNewPrivileges=yes
PrivateTmp=yes
ProtectSystem=strict
ProtectHome=yes
ReadWritePaths=/var/log/varnish-warming /tmp

# Timeout settings
TimeoutStartSec=300
TimeoutStopSec=30

[Install]
WantedBy=multi-user.target

Create systemd timer for scheduled warming

Configure automatic cache warming with a systemd timer for consistent performance optimization. Create it as /etc/systemd/system/varnish-warming.timer.

[Unit]
Description=Varnish Cache Warming Timer
Requires=varnish-warming.service

[Timer]
# Run warming every 30 minutes
OnCalendar=*:0/30
RandomizedDelaySec=60
Persistent=true

[Install]
WantedBy=timers.target

Create additional timer for full warming

Set up a separate service and timer pair for complete cache warming, including URL rediscovery, during off-peak hours. Create /etc/systemd/system/varnish-warming-full.service:

[Unit]
Description=Varnish Full Cache Warming Service
After=network.target varnish.service
Requires=varnish.service

[Service]
Type=oneshot
User=www-data
Group=www-data
ExecStartPre=/opt/varnish-warming/scripts/discover-urls.py http://example.com
ExecStart=/opt/varnish-warming/scripts/warm-cache.sh all 20
WorkingDirectory=/opt/varnish-warming
StandardOutput=journal
StandardError=journal

# Resource limits for full warming
MemoryMax=1G
CPUQuota=75%
TasksMax=100

# Security settings
NoNewPrivileges=yes
PrivateTmp=yes
ProtectSystem=strict
ProtectHome=yes
ReadWritePaths=/var/log/varnish-warming /tmp /opt/varnish-warming/urls

# Timeout settings
TimeoutStartSec=1800
TimeoutStopSec=60

Then create the matching timer as /etc/systemd/system/varnish-warming-full.timer:

[Unit]
Description=Varnish Full Cache Warming Timer
Requires=varnish-warming-full.service

[Timer]
# Run full warming twice daily during off-peak hours
OnCalendar=02:00
OnCalendar=14:00
RandomizedDelaySec=300
Persistent=true

[Install]
WantedBy=timers.target

Enable and start the services

Reload systemd configuration and enable the cache warming services and timers.

sudo systemctl daemon-reload
sudo systemctl enable varnish-warming.timer
sudo systemctl enable varnish-warming-full.timer
sudo systemctl start varnish-warming.timer
sudo systemctl start varnish-warming-full.timer

Create monitoring and metrics script

Set up comprehensive monitoring for cache warming performance and hit rate analysis. Save this script as /opt/varnish-warming/scripts/monitor-warming.sh.

#!/bin/bash
# Varnish Cache Warming Monitor Script

set -euo pipefail

LOG_DIR="/var/log/varnish-warming"
STATS_FILE="$LOG_DIR/warming-stats.json"
METRICS_FILE="$LOG_DIR/warming-metrics.json"

# Get Varnish statistics as JSON
get_varnish_stats() {
    if command -v varnishstat &> /dev/null; then
        varnishstat -1 -j 2>/dev/null || echo '{}'
    else
        echo '{}'
    fi
}

# Calculate the cache hit rate from varnishstat counters
calculate_hit_rate() {
    local varnish_stats=$1
    local cache_hits=$(echo "$varnish_stats" | jq -r '.MAIN.cache_hit.value // 0')
    local cache_misses=$(echo "$varnish_stats" | jq -r '.MAIN.cache_miss.value // 0')
    local total_requests=$((cache_hits + cache_misses))
    if [[ $total_requests -gt 0 ]]; then
        echo "scale=4; $cache_hits * 100 / $total_requests" | bc
    else
        echo "0"
    fi
}

# Summarize the last 24 hours of warming statistics
get_warming_stats() {
    if [[ -f "$STATS_FILE" ]]; then
        local cutoff_time=$(date -d '24 hours ago' -Iseconds)
        jq -s --arg cutoff "$cutoff_time" '
            map(select(.timestamp >= $cutoff)) |
            {
                total_warming_sessions: length,
                total_urls_warmed: (map(.successful) | add // 0),
                total_failed_urls: (map(.failed) | add // 0),
                average_success_rate: (if length > 0 then (map(.success_rate) | add / length) else 0 end),
                by_priority: (group_by(.priority) | map({
                    priority: .[0].priority,
                    sessions: length,
                    urls_warmed: (map(.successful) | add // 0),
                    avg_success_rate: (map(.success_rate) | add / length)
                }))
            }' "$STATS_FILE" 2>/dev/null || echo '{}'
    else
        echo '{}'
    fi
}

# Analyze warming log files from the last 24 hours
analyze_logs() {
    local recent_logs=$(find "$LOG_DIR" -name "warming-*.log" -mtime -1 2>/dev/null || true)
    if [[ -n "$recent_logs" ]]; then
        local total_successes=$( (grep -h "SUCCESS:" $recent_logs 2>/dev/null || true) | wc -l)
        local total_failures=$( (grep -h "FAILED:" $recent_logs 2>/dev/null || true) | wc -l)
        local top_errors=$( (grep -h "FAILED:" $recent_logs 2>/dev/null || true) | \
            awk '{print $NF}' | sort | uniq -c | sort -nr | head -5 | tr '\n' ';')
        jq -n --arg s "$total_successes" --arg f "$total_failures" --arg e "$top_errors" '
            {log_analysis: {period: "24h", total_successes: ($s | tonumber),
             total_failures: ($f | tonumber), top_errors: $e}}'
    else
        echo '{"log_analysis": {"period": "24h", "status": "no_recent_logs"}}'
    fi
}

# Main monitoring function
main() {
    local timestamp=$(date -Iseconds)
    local varnish_stats=$(get_varnish_stats)
    local hit_rate=$(calculate_hit_rate "$varnish_stats")
    local warming_stats=$(get_warming_stats)
    local log_analysis=$(analyze_logs)

    # Combine all metrics into one JSON record
    local combined_metrics=$(jq -n \
        --arg timestamp "$timestamp" \
        --argjson varnish_stats "$varnish_stats" \
        --arg hit_rate "$hit_rate" \
        --argjson warming_stats "$warming_stats" \
        --argjson log_analysis "$log_analysis" '
        {
            timestamp: $timestamp,
            cache_hit_rate: ($hit_rate | tonumber),
            varnish_stats: {
                cache_hits: ($varnish_stats.MAIN.cache_hit.value // 0),
                cache_misses: ($varnish_stats.MAIN.cache_miss.value // 0),
                backend_requests: ($varnish_stats.MAIN.backend_req.value // 0),
                objects_cached: ($varnish_stats.MAIN.n_object.value // 0)
            },
            warming_performance: $warming_stats,
            log_analysis: $log_analysis.log_analysis
        }')
    echo "$combined_metrics" >> "$METRICS_FILE"

    # Keep only the last 7 days of metrics
    local cutoff_time=$(date -d '7 days ago' -Iseconds)
    local temp_file="/tmp/warming-metrics-temp.json"
    if [[ -f "$METRICS_FILE" ]]; then
        jq -c --arg cutoff "$cutoff_time" 'select(.timestamp >= $cutoff)' "$METRICS_FILE" > "$temp_file" 2>/dev/null || true
        if [[ -s "$temp_file" ]]; then
            mv "$temp_file" "$METRICS_FILE"
        fi
    fi

    # Output current status
    echo "Cache Warming Status Report - $(date)"
    echo "=========================================="
    echo "Cache Hit Rate: ${hit_rate}%"
    echo "Warming Stats: $(echo "$warming_stats" | jq -c .)"
    echo "Full metrics saved to: $METRICS_FILE"
}

# Run monitoring
main "$@"

Set up log rotation

Configure logrotate to manage cache warming log files and prevent disk space issues. Create the policy as /etc/logrotate.d/varnish-warming.

/var/log/varnish-warming/*.log {
    daily
    missingok
    rotate 7
    compress
    delaycompress
    notifempty
    create 644 www-data www-data
    postrotate
        systemctl reload rsyslog > /dev/null 2>&1 || true
    endscript
}

/var/log/varnish-warming/*.json {
    weekly
    missingok
    rotate 4
    compress
    delaycompress
    notifempty
    create 644 www-data www-data
}

Make monitoring script executable

Set proper permissions for the monitoring script and test the setup.

sudo chmod +x /opt/varnish-warming/scripts/monitor-warming.sh
sudo /opt/varnish-warming/scripts/monitor-warming.sh

Verify your setup

Test the cache warming system and verify all components are working correctly.

# Check systemd services and timers
sudo systemctl status varnish-warming.timer
sudo systemctl status varnish-warming-full.timer

List next scheduled runs

sudo systemctl list-timers | grep warming

Test manual cache warming

sudo -u www-data /opt/varnish-warming/scripts/warm-cache.sh high 5

Check warming logs

sudo tail -f /var/log/varnish-warming/warming-*.log

Verify URL discovery

/opt/varnish-warming/scripts/discover-urls.py http://example.com --output-dir /tmp

Check cache statistics

varnishstat -1 | grep -E '(cache_hit|cache_miss)'

Monitor warming performance

/opt/varnish-warming/scripts/monitor-warming.sh

For enhanced monitoring integration, you can connect cache warming metrics to existing monitoring systems like Prometheus and cAdvisor or set up performance analysis similar to NGINX cache optimization.

Common issues

Symptom | Cause | Fix
--- | --- | ---
Warming script fails with permission denied | Incorrect file permissions or ownership | sudo chown www-data:www-data /opt/varnish-warming -R and verify execute permissions
URLs return 404 during warming | Outdated URL lists or incorrect base URL | Run the URL discovery script and update configuration with the correct domain
High memory usage during warming | Too many parallel workers or large responses | Reduce MAX_PARALLEL in the config and add response size limits to curl
Systemd timer not running | Service file errors or timer not enabled | sudo systemctl daemon-reload && sudo systemctl enable --now varnish-warming.timer
Cache hit rate not improving | Cache invalidation or TTL too low | Check the Varnish VCL configuration and ensure proper cache headers
Warming takes too long | Sequential processing or network timeouts | Install GNU parallel and adjust timeout values in the config
Log files growing too large | Missing logrotate configuration | Apply the logrotate config and run sudo logrotate -f /etc/logrotate.d/varnish-warming
