Configure Gunicorn performance monitoring with Prometheus metrics and Grafana dashboards

Intermediate 45 min Apr 08, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Set up comprehensive performance monitoring for Gunicorn WSGI servers using Prometheus metrics collection and Grafana visualization. Monitor request rates, response times, worker processes, memory usage, and implement automated alerting for production Python applications.

Prerequisites

  • Python 3.8+
  • Root or sudo access
  • At least 2GB RAM
  • Basic knowledge of Python WSGI applications

What this solves

Gunicorn applications in production need comprehensive monitoring to track performance metrics, identify bottlenecks, and maintain service reliability. This tutorial sets up Prometheus metrics collection for Gunicorn workers, request handling, and resource usage, then creates Grafana dashboards for visualization and alerting.

Step-by-step configuration

Update system packages

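Start by refreshing the package index and upgrading installed packages so you get the latest versions of the monitoring tools. Run the command that matches your distribution.

# Ubuntu / Debian
sudo apt update && sudo apt upgrade -y

# AlmaLinux / Rocky Linux
sudo dnf update -y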

Install Python dependencies and Prometheus client

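Install the prometheus_client library and the other Python packages the application needs, including Flask for the sample application. The packages go into a virtual environment at /opt/myapp/venv, the path the systemd unit puts on PATH; Ubuntu 24.04 and Debian 12 also refuse system-wide pip installs outside a virtual environment.

# Ubuntu / Debian
sudo apt install -y python3-pip python3-venv python3-dev

# AlmaLinux / Rocky Linux
sudo dnf install -y python3-pip python3-devel gcc

# All distributions: create the virtual environment and install into it
sudo mkdir -p /opt/myapp
sudo python3 -m venv /opt/myapp/venv
sudo /opt/myapp/venv/bin/pip install prometheus_client gunicorn psutil flask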

Create Gunicorn metrics middleware

Create a WSGI middleware that collects and exposes Prometheus metrics for your Python application.

# metrics_middleware.py
import os
import threading
import time

import psutil
from prometheus_client import (
    CONTENT_TYPE_LATEST,
    CollectorRegistry,
    Counter,
    Gauge,
    Histogram,
    generate_latest,
)

# Create a custom registry for this application.
# Each Gunicorn worker process holds its own copy, so a scrape of /metrics
# reports the worker that happened to answer it.
registry = CollectorRegistry()

# Request metrics
REQUEST_COUNT = Counter(
    'gunicorn_requests_total',
    'Total number of HTTP requests',
    ['method', 'endpoint', 'status'],
    registry=registry
)

REQUEST_DURATION = Histogram(
    'gunicorn_request_duration_seconds',
    'Time spent processing HTTP requests',
    ['method', 'endpoint'],
    registry=registry,
    buckets=[0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0]
)

ACTIVE_REQUESTS = Gauge(
    'gunicorn_active_requests',
    'Number of active requests being processed',
    registry=registry
)

# Worker and system metrics
WORKER_CONNECTIONS = Gauge(
    'gunicorn_worker_connections',
    'Number of active connections per worker',
    ['worker_pid'],
    registry=registry
)

MEMORY_USAGE = Gauge(
    'gunicorn_memory_usage_bytes',
    'Memory usage in bytes',
    ['worker_pid', 'type'],
    registry=registry
)

CPU_USAGE = Gauge(
    'gunicorn_cpu_usage_percent',
    'CPU usage percentage',
    ['worker_pid'],
    registry=registry
)


class PrometheusMiddleware:
    def __init__(self, app):
        self.app = app
        self.active_requests = 0
        self.lock = threading.Lock()
        # PID of the process that owns the collector thread. Threads do not
        # survive fork(), so with preload_app = True each worker must start
        # its own thread instead of inheriting one from the master.
        self.collector_pid = None

    def start_system_metrics_collection(self):
        # Start the background thread once per worker process
        with self.lock:
            if self.collector_pid == os.getpid():
                return
            self.collector_pid = os.getpid()

        def collect_system_metrics():
            # Reuse one Process object: cpu_percent() reports usage since
            # the previous call on the same object.
            process = psutil.Process()
            worker_pid = str(os.getpid())

            while True:
                try:
                    # Memory metrics
                    memory_info = process.memory_info()
                    MEMORY_USAGE.labels(worker_pid=worker_pid, type='rss').set(memory_info.rss)
                    MEMORY_USAGE.labels(worker_pid=worker_pid, type='vms').set(memory_info.vms)

                    # CPU metrics
                    cpu_percent = process.cpu_percent()
                    CPU_USAGE.labels(worker_pid=worker_pid).set(cpu_percent)

                    # Connection metrics (approximate)
                    connections = len(process.connections())
                    WORKER_CONNECTIONS.labels(worker_pid=worker_pid).set(connections)
                except Exception as e:
                    print(f"Error collecting system metrics: {e}")

                time.sleep(15)  # Collect every 15 seconds

        thread = threading.Thread(target=collect_system_metrics, daemon=True)
        thread.start()

    def __call__(self, environ, start_response):
        method = environ.get('REQUEST_METHOD', 'GET')
        path = environ.get('PATH_INFO', '/')

        self.start_system_metrics_collection()

        # Handle metrics endpoint
        if path == '/metrics':
            status = '200 OK'
            headers = [('Content-Type', CONTENT_TYPE_LATEST)]
            start_response(status, headers)
            return [generate_latest(registry)]

        # Track active requests
        with self.lock:
            self.active_requests += 1
            ACTIVE_REQUESTS.set(self.active_requests)

        start_time = time.time()
        # Reported as 500 if the app raises before calling start_response
        response_status = {'code': '500'}

        def custom_start_response(status, headers, exc_info=None):
            # Extract status code
            response_status['code'] = status.split(' ', 1)[0]
            return start_response(status, headers, exc_info)

        try:
            return self.app(environ, custom_start_response)
        finally:
            # Record metrics exactly once per request, even on errors
            duration = time.time() - start_time
            REQUEST_COUNT.labels(method=method, endpoint=path, status=response_status['code']).inc()
            REQUEST_DURATION.labels(method=method, endpoint=path).observe(duration)

            # Decrease active requests
            with self.lock:
                self.active_requests -= 1
                ACTIVE_REQUESTS.set(self.active_requests)
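
The wrapping pattern the middleware relies on (intercept `start_response` to capture the status code, and time the call into the wrapped application) can be tried without Gunicorn or Prometheus. The following is a minimal standard-library sketch of that pattern only; `TimingMiddleware`, `demo_app`, `call_once`, and the `records` list are illustrative names that do not appear in the tutorial's code.

```python
import time
from wsgiref.util import setup_testing_defaults


def demo_app(environ, start_response):
    """Tiny WSGI app used only for this illustration."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello']


class TimingMiddleware:
    """Records (method, path, status, seconds) for every request."""

    def __init__(self, app):
        self.app = app
        self.records = []

    def __call__(self, environ, start_response):
        method = environ.get('REQUEST_METHOD', 'GET')
        path = environ.get('PATH_INFO', '/')
        # Reported if the app fails before it calls start_response
        captured = {'status': '500'}

        def capturing_start_response(status, headers, exc_info=None):
            captured['status'] = status.split(' ', 1)[0]
            return start_response(status, headers, exc_info)

        started = time.perf_counter()
        try:
            return self.app(environ, capturing_start_response)
        finally:
            elapsed = time.perf_counter() - started
            self.records.append((method, path, captured['status'], elapsed))


def call_once(app, path='/'):
    """Invoke a WSGI callable with a synthetic environ and return the body."""
    environ = {}
    setup_testing_defaults(environ)  # fills in REQUEST_METHOD, SERVER_NAME, ...
    environ['PATH_INFO'] = path
    body = app(environ, lambda status, headers, exc_info=None: None)
    return b''.join(body)


wrapped = TimingMiddleware(demo_app)
print(call_once(wrapped, '/api/data'))   # b'hello'
print(wrapped.records[0][:3])            # ('GET', '/api/data', '200')
```

Calling the middleware directly like this is a quick way to check label values before any metrics reach Prometheus.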

Create sample Flask application with monitoring

Create a sample Flask application, saved as app.py next to the middleware module, that wraps its WSGI callable in the metrics middleware for demonstration purposes.

from flask import Flask, jsonify, request
from metrics_middleware import PrometheusMiddleware
import time
import random

app = Flask(__name__)
app.wsgi_app = PrometheusMiddleware(app.wsgi_app)

@app.route('/')
def index():
    return jsonify({
        'message': 'Hello from monitored Gunicorn application',
        'status': 'healthy'
    })

@app.route('/api/data')
def get_data():
    # Simulate variable response time
    time.sleep(random.uniform(0.1, 0.5))
    return jsonify({
        'data': [1, 2, 3, 4, 5],
        'timestamp': time.time()
    })

@app.route('/api/slow')
def slow_endpoint():
    # Simulate slow endpoint
    time.sleep(random.uniform(1.0, 3.0))
    return jsonify({'message': 'This was slow'})

@app.route('/health')
def health_check():
    return jsonify({'status': 'ok'}), 200

if __name__ == '__main__':
    app.run(debug=True)

Configure Gunicorn with monitoring settings

Create a Gunicorn configuration file optimized for monitoring and performance tracking.

# /opt/myapp/gunicorn.conf.py
# Gunicorn configuration for monitoring
import multiprocessing

# Server socket
bind = "0.0.0.0:8000"
backlog = 2048

# Worker processes
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = "sync"
worker_connections = 1000  # only affects threaded/async workers; ignored by "sync"
preload_app = True
timeout = 30
keepalive = 2

# Logging
accesslog = "/var/log/gunicorn/access.log"
errorlog = "/var/log/gunicorn/error.log"
loglevel = "info"
access_log_format = '%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s" %(D)s'

# Process naming
proc_name = "gunicorn-monitored-app"

# Server mechanics
daemon = False
pidfile = "/var/run/gunicorn/gunicorn.pid"
# www-data exists by default on Ubuntu and Debian; on AlmaLinux and Rocky Linux
# create it first or substitute another service account.
user = "www-data"
group = "www-data"
tmp_upload_dir = None

# SSL (if needed)
# keyfile = "/path/to/keyfile"
# certfile = "/path/to/certfile"

# Worker process monitoring
worker_tmp_dir = "/dev/shm"

# Restart workers after this many requests, with up to jitter additional requests
max_requests = 1200
max_requests_jitter = 200

# Timeout for graceful workers restart
graceful_timeout = 30


def when_ready(server):
    server.log.info("Gunicorn server is ready. Prometheus metrics available at /metrics")


def worker_int(worker):
    worker.log.info("Worker received INT or QUIT signal")


def pre_fork(server, worker):
    # The worker has not been forked yet, so it has no real PID to report
    server.log.info("Forking a new worker")


def post_fork(server, worker):
    server.log.info("Worker spawned (pid: %s)", worker.pid)
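
The `workers` expression in the configuration follows the common (2 x cores) + 1 rule of thumb, and `max_requests_jitter` spreads worker restarts out so the workers do not all recycle at the same moment. The sketch below works through both calculations. `recommended_workers` and `restart_threshold` are hypothetical helper names, and the random draw is an approximation of how the jitter is applied, not a copy of Gunicorn's internals.

```python
import multiprocessing
import random


def recommended_workers(cpu_count=None):
    """Apply the (2 x cores) + 1 rule of thumb used in the config."""
    if cpu_count is None:
        cpu_count = multiprocessing.cpu_count()
    return cpu_count * 2 + 1


def restart_threshold(max_requests, max_requests_jitter, rng=random):
    """Pick a per-worker request limit somewhere in the jitter window."""
    return max_requests + rng.randint(0, max_requests_jitter)


for cores in (1, 2, 4, 8):
    print(cores, 'cores ->', recommended_workers(cores), 'workers')

# With max_requests = 1200 and max_requests_jitter = 200, every worker
# restarts after some request count between 1200 and 1400.
print([restart_threshold(1200, 200) for _ in range(5)])
```

For CPU-bound applications the rule of thumb can oversubscribe the machine, so treat the result as a starting point and adjust it against the CPU and latency panels.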

Create necessary directories and set permissions

Create log directories and set appropriate permissions for the Gunicorn application.

sudo mkdir -p /var/log/gunicorn /var/run/gunicorn
sudo chown -R www-data:www-data /var/log/gunicorn /var/run/gunicorn
sudo chmod 755 /var/log/gunicorn /var/run/gunicorn
sudo chown -R www-data:www-data /opt/myapp
sudo chmod 755 /opt/myapp

Warning: Never use chmod 777. It gives every user on the system full access to your files. Instead, fix ownership with chown and use minimal permissions like 755 for directories and 644 for files.
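
To see what the octal modes in this step actually grant, the standard-library `stat` module can render them in the familiar `ls -l` notation. This is only an illustrative sketch; `describe` and `world_writable` are hypothetical helper names.

```python
import stat


def describe(mode, is_dir=False):
    """Render an octal permission mode the way `ls -l` would show it."""
    kind = stat.S_IFDIR if is_dir else stat.S_IFREG
    return stat.filemode(kind | mode)


def world_writable(mode):
    """True when users outside the owner and group may write."""
    return bool(mode & stat.S_IWOTH)


print(describe(0o755, is_dir=True))   # drwxr-xr-x
print(describe(0o644))                # -rw-r--r--
print(describe(0o777, is_dir=True))   # drwxrwxrwx

print(world_writable(0o755))          # False
print(world_writable(0o777))          # True
```

The last two lines show the difference that matters here: 755 leaves write access with the owner only, while 777 extends it to every account on the system.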

Create systemd service for Gunicorn

Create a systemd service file at /etc/systemd/system/gunicorn-monitored.service so that systemd runs Gunicorn as www-data, restarts it on failure, and restricts its filesystem access.

[Unit]
Description=Gunicorn instance to serve monitored Python app
After=network.target

[Service]
User=www-data
Group=www-data
WorkingDirectory=/opt/myapp
Environment="PATH=/opt/myapp/venv/bin"
ExecStart=/opt/myapp/venv/bin/gunicorn --config /opt/myapp/gunicorn.conf.py app:app
ExecReload=/bin/kill -s HUP $MAINPID
Restart=always
RestartSec=3
KillMode=mixed
TimeoutStopSec=5
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
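# /run is cleared at boot; let systemd recreate /run/gunicorn on every start
RuntimeDirectory=gunicorn
ReadWritePaths=/var/log/gunicorn /var/run/gunicorn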

[Install]
WantedBy=multi-user.target

Install and configure Prometheus

Install Prometheus server to collect metrics from your Gunicorn application.

# Ubuntu / Debian
sudo apt install -y prometheus

# AlmaLinux / Rocky Linux
sudo dnf install -y prometheus2

Configure Prometheus to scrape Gunicorn metrics

Configure Prometheus to collect metrics from your Gunicorn application endpoints.

global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  - "/etc/prometheus/rules/*.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets:
          - localhost:9093

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'gunicorn-app'
    metrics_path: '/metrics'
    scrape_interval: 10s
    scrape_timeout: 5s
    scheme: http
    static_configs:
      - targets: ['localhost:8000']
        labels:
          service: 'gunicorn-app'
          environment: 'production'

  - job_name: 'node-exporter'
    static_configs:
      - targets: ['localhost:9100']
    scrape_interval: 15s

Create Prometheus alerting rules

Define alerting rules for Gunicorn application health monitoring and performance thresholds.

sudo mkdir -p /etc/prometheus/rules

groups:
  - name: gunicorn_alerts
    rules:
      - alert: GunicornHighErrorRate
        expr: |
          (
            sum by (service) (rate(gunicorn_requests_total{status=~"5.."}[5m])) /
            sum by (service) (rate(gunicorn_requests_total[5m]))
          ) > 0.1
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "High error rate in Gunicorn application"
          description: "Gunicorn application {{ $labels.service }} has error rate above 10% for more than 2 minutes"

      - alert: GunicornHighLatency
        expr: |
          histogram_quantile(0.95, rate(gunicorn_request_duration_seconds_bucket[5m])) > 2
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High latency in Gunicorn application"
          description: "95th percentile latency is {{ $value }}s for {{ $labels.service }}"

      - alert: GunicornHighMemoryUsage
        expr: gunicorn_memory_usage_bytes{type="rss"} > 500 * 1024 * 1024
        for: 3m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage in Gunicorn worker"
          description: "Worker {{ $labels.worker_pid }} is using {{ $value | humanize }}B of memory"

      - alert: GunicornHighCPUUsage
        expr: gunicorn_cpu_usage_percent > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage in Gunicorn worker"
          description: "Worker {{ $labels.worker_pid }} has {{ $value }}% CPU usage"

      - alert: GunicornDown
        expr: up{job="gunicorn-app"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Gunicorn application is down"
          description: "Gunicorn application {{ $labels.service }} has been down for more than 1 minute"
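
The GunicornHighLatency rule depends on how `histogram_quantile` estimates a percentile from cumulative buckets: it finds the bucket that contains the target rank and interpolates linearly inside it. The sketch below reproduces that estimate in plain Python for the bucket boundaries defined in the middleware. The observation counts are invented for illustration, and Prometheus applies the same arithmetic to per-second rates, not raw counts.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound, ending with the +Inf bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Rank falls in the +Inf bucket: report the highest finite bound
                return buckets[-2][0]
            if count == prev_count:
                return bound
            # Linear interpolation inside the bucket that holds the rank
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


# Hypothetical cumulative counts for 100 observed requests
observed = [
    (0.1, 10), (0.25, 40), (0.5, 80), (1.0, 90),
    (2.5, 98), (5.0, 100), (10.0, 100), (math.inf, 100),
]

print(histogram_quantile(0.95, observed))  # 1.9375
print(histogram_quantile(0.50, observed))  # 0.3125
```

Because the estimate is interpolated, its precision is limited by bucket width: with these boundaries, any p95 between 1.0 and 2.5 seconds is resolved only approximately, which matters for a threshold of 2 seconds.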

Install and configure Grafana

Install Grafana for creating dashboards and visualizing Gunicorn performance metrics.

# Ubuntu / Debian
sudo apt install -y software-properties-common wget
wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
echo "deb https://packages.grafana.com/oss/deb stable main" | sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt update && sudo apt install -y grafana

# AlmaLinux / Rocky Linux
sudo dnf install -y grafana

Create Grafana dashboard configuration

Create a comprehensive dashboard configuration for monitoring Gunicorn performance metrics.

{
  "dashboard": {
    "id": null,
    "title": "Gunicorn Performance Monitoring",
    "tags": ["gunicorn", "python", "wsgi"],
    "timezone": "browser",
    "panels": [
      {
        "id": 1,
        "title": "Request Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(gunicorn_requests_total[5m])",
            "legendFormat": "{{method}} {{status}}"
          }
        ],
        "yAxes": [
          {
            "label": "Requests/sec"
          }
        ],
        "gridPos": {"h": 8, "w": 12, "x": 0, "y": 0}
      },
      {
        "id": 2,
        "title": "Response Time (95th percentile)",
        "type": "graph",
        "targets": [
          {
            "expr": "histogram_quantile(0.95, rate(gunicorn_request_duration_seconds_bucket[5m]))",
            "legendFormat": "95th percentile"
          }
        ],
        "yAxes": [
          {
            "label": "Seconds"
          }
        ],
        "gridPos": {"h": 8, "w": 12, "x": 12, "y": 0}
      },
      {
        "id": 3,
        "title": "Active Requests",
        "type": "singlestat",
        "targets": [
          {
            "expr": "gunicorn_active_requests",
            "legendFormat": "Active"
          }
        ],
        "gridPos": {"h": 4, "w": 6, "x": 0, "y": 8}
      },
      {
        "id": 4,
        "title": "Worker Memory Usage",
        "type": "graph",
        "targets": [
          {
            "expr": "gunicorn_memory_usage_bytes{type=\"rss\"}",
            "legendFormat": "Worker {{worker_pid}}"
          }
        ],
        "yAxes": [
          {
            "label": "Bytes"
          }
        ],
        "gridPos": {"h": 8, "w": 12, "x": 0, "y": 12}
      },
      {
        "id": 5,
        "title": "Worker CPU Usage",
        "type": "graph",
        "targets": [
          {
            "expr": "gunicorn_cpu_usage_percent",
            "legendFormat": "Worker {{worker_pid}}"
          }
        ],
        "yAxes": [
          {
            "label": "Percent",
            "max": 100
          }
        ],
        "gridPos": {"h": 8, "w": 12, "x": 12, "y": 12}
      }
    ],
    "time": {
      "from": "now-1h",
      "to": "now"
    },
    "refresh": "10s"
  }
}
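
The graph panels in the dashboard JSON share one structure and differ only in title, query, and grid position, so the file can also be generated instead of edited by hand. The following is a small sketch under that assumption; `graph_panel` is a hypothetical helper, only two of the five panels are shown, and the script prints JSON without contacting Grafana.

```python
import json

GRID_COLUMNS = 24  # Grafana lays panels out on a 24-column grid


def graph_panel(panel_id, title, expr, legend, y_label, x, y, w=12, h=8):
    """Return a panel dict shaped like the graph panels in the JSON above."""
    if x + w > GRID_COLUMNS:
        raise ValueError(f"panel {title!r} overflows the {GRID_COLUMNS}-column grid")
    return {
        "id": panel_id,
        "title": title,
        "type": "graph",
        "targets": [{"expr": expr, "legendFormat": legend}],
        "yAxes": [{"label": y_label}],
        "gridPos": {"h": h, "w": w, "x": x, "y": y},
    }


panels = [
    graph_panel(
        1, "Request Rate",
        "rate(gunicorn_requests_total[5m])",
        "{{method}} {{status}}", "Requests/sec", x=0, y=0,
    ),
    graph_panel(
        2, "Response Time (95th percentile)",
        "histogram_quantile(0.95, rate(gunicorn_request_duration_seconds_bucket[5m]))",
        "95th percentile", "Seconds", x=12, y=0,
    ),
]

dashboard = {
    "dashboard": {
        "id": None,
        "title": "Gunicorn Performance Monitoring",
        "tags": ["gunicorn", "python", "wsgi"],
        "timezone": "browser",
        "panels": panels,
        "time": {"from": "now-1h", "to": "now"},
        "refresh": "10s",
    }
}

print(json.dumps(dashboard, indent=2))
```

Generating the file this way keeps panel IDs and grid positions consistent as panels are added, and the width check catches layouts that would not fit the grid.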

Configure Grafana data source

Configure Grafana to use Prometheus as a data source for the dashboards.

apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
    editable: true

Start all monitoring services

Enable and start all the monitoring services in the correct order.

# Start Gunicorn application
sudo systemctl daemon-reload
sudo systemctl enable --now gunicorn-monitored

# Start Prometheus
sudo systemctl enable --now prometheus

# Start Grafana
sudo systemctl enable --now grafana-server

# Verify services are running
sudo systemctl status gunicorn-monitored prometheus grafana-server

Configure firewall rules

Open the necessary ports for accessing your monitoring services while maintaining security.

# Ubuntu / Debian (ufw)
sudo ufw allow 8000/tcp comment "Gunicorn application"
sudo ufw allow 9090/tcp comment "Prometheus"
sudo ufw allow 3000/tcp comment "Grafana"
sudo ufw reload

# AlmaLinux / Rocky Linux (firewalld)
sudo firewall-cmd --permanent --add-port=8000/tcp
sudo firewall-cmd --permanent --add-port=9090/tcp
sudo firewall-cmd --permanent --add-port=3000/tcp
sudo firewall-cmd --reload

Verify your setup

Test that all components are working correctly and collecting metrics.

# Check Gunicorn application and metrics endpoint
curl -s http://localhost:8000/ | jq
curl -s http://localhost:8000/metrics | grep gunicorn

# Generate some test traffic
for i in {1..20}; do curl -s http://localhost:8000/api/data > /dev/null; done
for i in {1..5}; do curl -s http://localhost:8000/api/slow > /dev/null & done

# Verify Prometheus is scraping metrics
curl -s "http://localhost:9090/api/v1/query?query=gunicorn_requests_total" | jq

# Check Grafana is accessible
curl -s http://localhost:3000/api/health

# View service logs
sudo journalctl -u gunicorn-monitored -f --lines=20

Note: Default Grafana login is admin/admin. You'll be prompted to change the password on first login. Access Grafana at http://your-server-ip:3000 and import the dashboard configuration.
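
The Prometheus check in the verification commands can also be scripted, which is convenient for smoke tests after a deploy. The sketch below builds the same `/api/v1/query` URL and sums the values of an instant-vector response. `build_query_url` and `total_from_response` are hypothetical helper names, and `SAMPLE_RESPONSE` is a hand-written payload in the shape of the query API's JSON, not captured output.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Return the instant-query URL for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def total_from_response(payload):
    """Sum the sample values of an instant-vector query result."""
    if payload.get('status') != 'success':
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    return sum(float(series['value'][1]) for series in payload['data']['result'])


# Hand-written example payload (assumption: 20 fast and 5 slow requests)
SAMPLE_RESPONSE = json.loads("""
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {"__name__": "gunicorn_requests_total", "endpoint": "/api/data",
                  "method": "GET", "status": "200"},
       "value": [1712570000.0, "20"]},
      {"metric": {"__name__": "gunicorn_requests_total", "endpoint": "/api/slow",
                  "method": "GET", "status": "200"},
       "value": [1712570000.0, "5"]}
    ]
  }
}
""")

print(build_query_url("http://localhost:9090", "gunicorn_requests_total"))
print(total_from_response(SAMPLE_RESPONSE))  # 25.0
```

Against a live server, the payload would come from `json.load(urllib.request.urlopen(url))` in place of the sample; an empty `result` list means Prometheus has not scraped the application yet.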

Common issues

Symptom | Cause | Fix
Metrics endpoint returns 404 | Middleware not properly configured | Check the middleware import in app.py and verify the WSGI wrapping
Prometheus can't scrape metrics | Firewall blocking or wrong target | Verify port 8000 is open: sudo ss -tlnp | grep :8000
Gunicorn workers crash | Memory limits or Python errors | Check logs: sudo journalctl -u gunicorn-monitored
High memory usage alerts | Memory leaks or worker recycling needed | Lower max_requests in gunicorn.conf.py or check for memory leaks
Grafana shows no data | Data source not configured correctly | Verify the Prometheus data source URL and target health: curl http://localhost:9090/api/v1/targets
Permission denied errors | Incorrect file ownership | Fix ownership: sudo chown -R www-data:www-data /opt/myapp
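
When working through the "no data" rows of the table, it helps to confirm which series the application actually exposes before looking at Prometheus or Grafana. The sketch below extracts metric names from text in the Prometheus exposition format and reports any expected ones that are missing. `exposed_metric_names`, `missing_metrics`, and the `EXPECTED` set are illustrative, and `SAMPLE_METRICS` is a hand-written excerpt, not real output.

```python
import re

# A metric name starts each sample line, before any labels or value
METRIC_NAME = re.compile(r'[a-zA-Z_:][a-zA-Z0-9_:]*')

EXPECTED = {
    'gunicorn_requests_total',
    'gunicorn_request_duration_seconds_bucket',
    'gunicorn_active_requests',
    'gunicorn_memory_usage_bytes',
}

# Hand-written excerpt in the exposition format (assumption, not real output)
SAMPLE_METRICS = """\
# HELP gunicorn_requests_total Total number of HTTP requests
# TYPE gunicorn_requests_total counter
gunicorn_requests_total{endpoint="/",method="GET",status="200"} 3.0
gunicorn_request_duration_seconds_bucket{endpoint="/",le="0.1",method="GET"} 3.0
gunicorn_active_requests 0.0
"""


def exposed_metric_names(text):
    """Collect the metric names of all sample lines, skipping comments."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        match = METRIC_NAME.match(line)
        if match:
            names.add(match.group(0))
    return names


def missing_metrics(text, expected):
    """Return the expected metric names that the text does not expose."""
    return sorted(set(expected) - exposed_metric_names(text))


print(missing_metrics(SAMPLE_METRICS, EXPECTED))  # ['gunicorn_memory_usage_bytes']
```

In this excerpt the memory gauge is absent, which in the real application would suggest that the background collection thread has not run yet in the worker that answered the request.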
