Set up audit log analysis dashboard with Grafana and Prometheus for security monitoring

Intermediate 45 min May 20, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Build a comprehensive security monitoring dashboard that collects Linux audit logs through auditd, exports metrics to Prometheus, and visualizes security events in Grafana with automated alerting for suspicious activities.

Prerequisites

  • Root or sudo access
  • Minimum 2GB RAM
  • 10GB free disk space
  • Network access for package downloads

What this solves

System administrators need visibility into security events across their infrastructure to detect unauthorized access, privilege escalations, and compliance violations. This tutorial sets up a complete audit log analysis pipeline that captures system events with auditd, transforms them into metrics with custom exporters, and creates actionable dashboards in Grafana with automated alerting for security incidents.

Step-by-step installation

Update system packages

Start by updating your package manager to ensure you get the latest security patches and package versions.

# Ubuntu / Debian
sudo apt update && sudo apt upgrade -y

# AlmaLinux / Rocky Linux
sudo dnf update -y

Install audit daemon and dependencies

Install auditd for system auditing, along with tools for log processing and metric collection. This includes the audit daemon, log utilities, and Python tools for custom exporters.

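# Ubuntu / Debian (the system Python is externally managed, so install the libraries from the distribution packages instead of pip)
sudo apt install -y auditd audispd-plugins python3-pip python3-venv python3-prometheus-client python3-psutil curl wget jq

# AlmaLinux / Rocky Linux
sudo dnf install -y audit audit-libs python3-pip curl wget jq
sudo pip3 install prometheus_client psutil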

Configure audit rules for security monitoring

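Create audit rules that watch identity and privilege files, login records, network and cron configuration, the SSH daemon configuration, time changes, mounts, and file deletions. Each rule carries a key (-k) so related events can be filtered later without wading through unrelated noise. Save the rules in /etc/audit/rules.d/audit.rules.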

# Delete existing rules
-D

# Set buffer size
-b 8192

# Set failure mode (1 = print a kernel message, 2 = panic)
-f 1

# Monitor authentication events
-w /etc/passwd -p wa -k identity
-w /etc/group -p wa -k identity
-w /etc/shadow -p wa -k identity
-w /etc/sudoers -p wa -k privilege_escalation
-w /etc/sudoers.d/ -p wa -k privilege_escalation

# Monitor login/logout events
-w /var/log/faillog -p wa -k logins
-w /var/log/lastlog -p wa -k logins
-w /var/log/tallylog -p wa -k logins

# Monitor network configuration
-w /etc/hosts -p wa -k network_config
-w /etc/network/ -p wa -k network_config

# Monitor cron jobs
-w /etc/cron.allow -p wa -k cron
-w /etc/cron.deny -p wa -k cron
-w /etc/cron.d/ -p wa -k cron
-w /etc/cron.daily/ -p wa -k cron
-w /etc/cron.hourly/ -p wa -k cron
-w /etc/cron.monthly/ -p wa -k cron
-w /etc/cron.weekly/ -p wa -k cron
-w /etc/crontab -p wa -k cron

# Monitor SSH configuration
-w /etc/ssh/sshd_config -p wa -k ssh_config

# Monitor system calls
-a always,exit -F arch=b64 -S adjtimex -S settimeofday -k time_change
-a always,exit -F arch=b32 -S adjtimex -S settimeofday -S stime -k time_change
-a always,exit -F arch=b64 -S mount -k mounts
-a always,exit -F arch=b32 -S mount -k mounts

# Monitor file deletions
-a always,exit -F arch=b64 -S unlink -S unlinkat -S rename -S renameat -k delete
-a always,exit -F arch=b32 -S unlink -S unlinkat -S rename -S renameat -k delete

# Make rules immutable (changes then require a reboot)
-e 2
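
As a quick sanity check on a rules file like the one above, the following minimal sketch counts how many rules are attached to each key. The helper name count_rule_keys and the shortened sample rules are illustrative assumptions, and the sketch only understands the -k form of keys, not -F key=.

```python
import shlex
from collections import Counter


def count_rule_keys(rules_text: str) -> Counter:
    """Count audit rules per -k key, skipping comments and control lines."""
    keys = Counter()
    for raw in rules_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        tokens = shlex.split(line)
        # Only watch (-w) and syscall (-a) rules carry a key; -D, -b, -f, -e do not.
        if tokens[0] not in ("-w", "-a"):
            continue
        for flag, value in zip(tokens, tokens[1:]):
            if flag == "-k":
                keys[value] += 1
    return keys


# Shortened sample for illustration only, not the full rule set.
SAMPLE_RULES = """\
-D
-b 8192
-w /etc/passwd -p wa -k identity
-w /etc/group -p wa -k identity
-w /etc/sudoers -p wa -k privilege_escalation
-a always,exit -F arch=b64 -S mount -k mounts
-e 2
"""

summary = count_rule_keys(SAMPLE_RULES)
for key, count in sorted(summary.items()):
    print(f"{key}: {count} rule(s)")
```

Running it against the real file (for example by passing the contents of /etc/audit/rules.d/audit.rules) gives a quick view of which keys exist before dashboards or searches rely on them.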

Configure auditd daemon

Configure auditd with log rotation, formatting, and performance settings so that events are captured efficiently without filling the disk. Save these settings in /etc/audit/auditd.conf.

log_file = /var/log/audit/audit.log
log_format = RAW
log_group = adm
priority_boost = 4
flush = INCREMENTAL_ASYNC
freq = 50
num_logs = 10
max_log_file = 100
max_log_file_action = ROTATE
space_left = 500
space_left_action = SYSLOG
admin_space_left = 100
admin_space_left_action = SUSPEND
disk_full_action = SUSPEND
disk_error_action = SUSPEND
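
To see what these rotation settings mean for disk usage, here is a rough sketch that reads key = value pairs and multiplies num_logs by max_log_file (which auditd interprets in megabytes). The function names are hypothetical, and the result is an approximate upper bound for the rotated log files only.

```python
def parse_auditd_conf(text: str) -> dict:
    """Parse 'key = value' lines, ignoring blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip()
    return settings


def max_log_footprint_mb(settings: dict) -> int:
    """Approximate ceiling: num_logs files of at most max_log_file MB each."""
    return int(settings["num_logs"]) * int(settings["max_log_file"])


# Values copied from the configuration in this step.
SAMPLE_CONF = """
num_logs = 10
max_log_file = 100
max_log_file_action = ROTATE
space_left = 500
admin_space_left = 100
"""

conf = parse_auditd_conf(SAMPLE_CONF)
print(f"Rotated audit logs can grow to about {max_log_footprint_mb(conf)} MB")
```

With the values used here the logs can reach roughly 1000 MB, which is worth comparing against the 10GB disk prerequisite and the space_left thresholds.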

Install and configure Prometheus Node Exporter

Download and install Prometheus Node Exporter to collect system metrics. This will work alongside our custom audit exporter to provide comprehensive monitoring data.

cd /tmp
wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz
tar xfz node_exporter-1.7.0.linux-amd64.tar.gz
sudo mv node_exporter-1.7.0.linux-amd64/node_exporter /usr/local/bin/
sudo useradd --no-create-home --shell /bin/false node_exporter
sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter

Create Node Exporter systemd service

Create a systemd unit, for example /etc/systemd/system/node_exporter.service, so that Node Exporter starts at boot and runs as the dedicated unprivileged node_exporter user.

[Unit]
Description=Node Exporter
After=network.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter --web.listen-address=:9100
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target

Create custom audit log exporter

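Build a custom Python exporter that tails the audit log, matches records with regular expressions, and exposes the results to Prometheus as counters and gauges on port 9101. It tracks authentication failures, sudo usage, sensitive file access, and cron changes. Save it as /usr/local/bin/audit_exporter.py.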

#!/usr/bin/env python3
"""Tail the audit log and expose security metrics to Prometheus on port 9101."""
import re
import subprocess
import threading
import time
from collections import deque

from prometheus_client import Counter, Gauge, start_http_server

# Prometheus metrics
auth_failures = Counter('audit_authentication_failures_total', 'Authentication failures', ['user', 'source'])
auth_successes = Counter('audit_authentication_successes_total', 'Authentication successes', ['user', 'source'])
privilege_escalations = Counter('audit_privilege_escalations_total', 'Privilege escalations', ['user', 'command'])
file_accesses = Counter('audit_file_accesses_total', 'File accesses', ['user', 'file', 'action'])
sudo_commands = Counter('audit_sudo_commands_total', 'Sudo commands executed', ['user', 'command'])
login_events = Counter('audit_login_events_total', 'Login events', ['user', 'type', 'result'])
cron_changes = Counter('audit_cron_changes_total', 'Cron configuration changes', ['user', 'file'])
ssh_events = Counter('audit_ssh_events_total', 'SSH events', ['user', 'event_type'])

# Gauges for current state
active_sessions = Gauge('audit_active_sessions', 'Currently active user sessions')
failed_login_rate = Gauge('audit_failed_login_rate_per_minute', 'Failed login attempts per minute')

# Rate tracking
failed_logins_window = deque(maxlen=100)

SENSITIVE_PATHS = ('/etc/passwd', '/etc/shadow', '/var/log')


class AuditLogParser:
    def __init__(self):
        # In a record, uid/auid come before the quoted msg='...' part and res= comes last.
        self.patterns = {
            'auth_failure': re.compile(r'type=USER_AUTH\b.*?\buid=(\d+)\s+auid=(\d+).*\bres=failed'),
            'auth_success': re.compile(r'type=USER_AUTH\b.*?\buid=(\d+)\s+auid=(\d+).*\bres=success'),
            'sudo_command': re.compile(r'type=USER_CMD\b.*?\buid=(\d+).*?\bcmd=("[^"]+"|\S+)'),
            'syscall': re.compile(r'type=SYSCALL msg=audit\([\d.]+:(\d+)\).*?\buid=(\d+).*?\bkey="([^"]+)"'),
            'path': re.compile(r'type=PATH msg=audit\([\d.]+:(\d+)\).*?\bname="([^"]+)"'),
            'user_start': re.compile(r'type=USER_START\b.*?\buid=(\d+)\s+auid=(\d+)'),
            'user_end': re.compile(r'type=USER_END\b.*?\buid=(\d+)\s+auid=(\d+)'),
            'ssh_login': re.compile(r'type=USER_LOGIN\b.*?\buid=(\d+).*?\baddr=([\d.]+)'),
        }
        # (event_id, uid, key) of the most recent SYSCALL record that matched a keyed rule
        self.last_syscall = None

    @staticmethod
    def decode_field(value):
        """Audit quotes plain values and hex-encodes values that contain spaces."""
        if value.startswith('"'):
            return value.strip('"')
        try:
            return bytes.fromhex(value).decode('utf-8', 'replace')
        except ValueError:
            return value

    def get_username(self, uid):
        try:
            result = subprocess.run(['getent', 'passwd', str(uid)],
                                    capture_output=True, text=True, timeout=5)
            if result.returncode == 0:
                return result.stdout.split(':')[0]
        except (OSError, subprocess.SubprocessError):
            pass
        return f'uid_{uid}'

    def parse_line(self, line):
        current_time = time.time()

        # Authentication failures
        match = self.patterns['auth_failure'].search(line)
        if match:
            user = self.get_username(match.group(1))
            failed_logins_window.append(current_time)
            auth_failures.labels(user=user, source='unknown').inc()

        # Authentication successes
        match = self.patterns['auth_success'].search(line)
        if match:
            user = self.get_username(match.group(1))
            auth_successes.labels(user=user, source='unknown').inc()

        # Sudo commands
        match = self.patterns['sudo_command'].search(line)
        if match:
            user = self.get_username(match.group(1))
            cmd = self.decode_field(match.group(2))[:50]  # Truncate long commands
            sudo_commands.labels(user=user, command=cmd).inc()
            privilege_escalations.labels(user=user, command=cmd).inc()

        # Remember the acting user and rule key of the latest keyed syscall event
        match = self.patterns['syscall'].search(line)
        if match:
            self.last_syscall = match.groups()

        # PATH records name the file; the user and key come from the SYSCALL
        # record that shares the same event ID
        match = self.patterns['path'].search(line)
        if match and self.last_syscall and match.group(1) == self.last_syscall[0]:
            _, uid, key = self.last_syscall
            filepath = match.group(2)
            user = self.get_username(uid)
            # Only track sensitive file access
            if any(sensitive in filepath for sensitive in SENSITIVE_PATHS):
                file_accesses.labels(user=user, file=filepath, action='access').inc()
            # Cron changes
            if key == 'cron':
                cron_changes.labels(user=user, file=filepath).inc()

        # SSH events
        match = self.patterns['ssh_login'].search(line)
        if match:
            user = self.get_username(match.group(1))
            ssh_events.labels(user=user, event_type='login').inc()

    def update_gauges(self):
        # Calculate failed login rate
        current_time = time.time()
        recent_failures = sum(1 for t in failed_logins_window if current_time - t < 60)
        failed_login_rate.set(recent_failures)

        # Update active sessions (simplified)
        try:
            result = subprocess.run(['who', '-q'], capture_output=True, text=True, timeout=5)
            if result.returncode == 0:
                # The last line of `who -q` looks like "# users=3"
                count_line = result.stdout.strip().split('\n')[-1]
                if '=' in count_line:
                    active_sessions.set(int(count_line.split('=')[1].strip()))
        except (OSError, ValueError, subprocess.SubprocessError):
            pass


def tail_audit_log(parser):
    try:
        # stderr is discarded so rotation notices from tail cannot fill an unread pipe
        process = subprocess.Popen(['tail', '-F', '/var/log/audit/audit.log'],
                                   stdout=subprocess.PIPE, stderr=subprocess.DEVNULL,
                                   universal_newlines=True)
        for line in iter(process.stdout.readline, ''):
            if line:
                parser.parse_line(line.strip())
    except Exception as e:
        print(f"Error reading audit log: {e}")
        time.sleep(5)


def update_metrics_periodically(parser):
    while True:
        parser.update_gauges()
        time.sleep(30)


if __name__ == '__main__':
    # Start metrics server
    start_http_server(9101)
    print("Audit exporter started on port 9101")

    parser = AuditLogParser()

    # Start gauge update thread
    gauge_thread = threading.Thread(target=update_metrics_periodically, args=(parser,))
    gauge_thread.daemon = True
    gauge_thread.start()

    # Start log tailing (blocking)
    while True:
        try:
            tail_audit_log(parser)
        except KeyboardInterrupt:
            break
        except Exception as e:
            print(f"Error in main loop: {e}")
            time.sleep(10)
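
Before relying on the exporter, it can help to test a pattern against a single record offline. The sketch below is a standalone illustration, not part of the exporter: the sample line is a made-up USER_AUTH record in the usual layout (uid and auid first, res= last), and parse_auth is a hypothetical helper name.

```python
import re

# uid/auid appear before the quoted msg='...' section; res= is the final field.
AUTH_RE = re.compile(
    r'type=USER_AUTH\b.*?\buid=(?P<uid>\d+)\s+auid=(?P<auid>\d+)'
    r'.*?\bacct="(?P<acct>[^"]*)".*\bres=(?P<res>\w+)'
)

# Fabricated sample record, for illustration only.
SAMPLE = (
    'type=USER_AUTH msg=audit(1716200000.123:456): pid=2211 uid=1000 '
    'auid=1000 ses=3 msg=\'op=PAM:authentication grantors=? acct="root" '
    'exe="/usr/bin/su" hostname=? addr=? terminal=/dev/pts/0 res=failed\''
)


def parse_auth(line):
    """Return the interesting fields of a USER_AUTH record, or None."""
    match = AUTH_RE.search(line)
    if match is None:
        return None
    return {
        "uid": int(match["uid"]),      # user who ran the program
        "auid": int(match["auid"]),    # login UID, stable across su/sudo
        "acct": match["acct"],         # account being authenticated
        "failed": match["res"] == "failed",
    }


print(parse_auth(SAMPLE))
```

Feeding a few real lines from /var/log/audit/audit.log through a helper like this is a cheap way to confirm that the field order on a given distribution matches what the patterns expect.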

Make audit exporter executable and create service

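Make the exporter script executable and create a dedicated service user that can read the audit log through the adm group. Then create a systemd unit, for example /etc/systemd/system/audit_exporter.service, to run the exporter automatically.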

sudo chmod +x /usr/local/bin/audit_exporter.py
sudo useradd --no-create-home --shell /bin/false audit_exporter
sudo usermod -a -G adm audit_exporter

[Unit]
Description=Audit Log Prometheus Exporter
After=network.target auditd.service
Requires=auditd.service

[Service]
User=audit_exporter
Group=audit_exporter
Type=simple
ExecStart=/usr/bin/python3 /usr/local/bin/audit_exporter.py
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target

Install Prometheus

Download and install Prometheus to collect metrics from both Node Exporter and our custom audit exporter.

cd /tmp
wget https://github.com/prometheus/prometheus/releases/download/v2.48.0/prometheus-2.48.0.linux-amd64.tar.gz
tar xfz prometheus-2.48.0.linux-amd64.tar.gz
sudo mv prometheus-2.48.0.linux-amd64/prometheus /usr/local/bin/
sudo mv prometheus-2.48.0.linux-amd64/promtool /usr/local/bin/
sudo useradd --no-create-home --shell /bin/false prometheus
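sudo mkdir -p /etc/prometheus /var/lib/prometheus
sudo cp -r prometheus-2.48.0.linux-amd64/consoles prometheus-2.48.0.linux-amd64/console_libraries /etc/prometheus/
sudo chown -R prometheus:prometheus /etc/prometheus /var/lib/prometheus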

Configure Prometheus

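Create the Prometheus configuration in /etc/prometheus/prometheus.yml. It scrapes Prometheus itself and both exporters, and loads the alerting rules from /etc/prometheus/alert_rules.yml. The alerting block points at an Alertmanager on localhost:9093, which this tutorial does not install; without it, firing alerts are still visible on the Alerts page of the Prometheus UI.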

global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  - "/etc/prometheus/alert_rules.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets:
          - localhost:9093

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
        
  - job_name: 'node_exporter'
    static_configs:
      - targets: ['localhost:9100']
        
  - job_name: 'audit_exporter'
    static_configs:
      - targets: ['localhost:9101']
    scrape_interval: 30s

Create Prometheus alerting rules

Define alerting rules in /etc/prometheus/alert_rules.yml that trigger on suspicious security events such as repeated authentication failures, bursts of sudo usage, access to sensitive files, and cron changes.

groups:
  - name: security_alerts
    rules:
      - alert: HighAuthenticationFailures
        expr: increase(audit_authentication_failures_total[5m]) > 10
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High number of authentication failures"
          description: "User {{ $labels.user }} has {{ $value }} authentication failures in the last 5 minutes"
          
      - alert: PrivilegeEscalation
        expr: increase(audit_privilege_escalations_total[10m]) > 5
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Multiple privilege escalations detected"
          description: "User {{ $labels.user }} executed {{ $value }} sudo commands in 10 minutes"
          
      - alert: SensitiveFileAccess
        expr: increase(audit_file_accesses_total{file=~".*(/etc/passwd|/etc/shadow|/var/log/).*"}[5m]) > 3
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Unusual access to sensitive files"
          description: "User {{ $labels.user }} accessed sensitive file {{ $labels.file }} {{ $value }} times in 5 minutes"
          
      - alert: CronConfigurationChanges
        expr: increase(audit_cron_changes_total[30m]) > 0
        for: 0m
        labels:
          severity: warning
        annotations:
          summary: "Cron configuration modified"
          description: "User {{ $labels.user }} modified cron file {{ $labels.file }}"
          
      - alert: HighFailedLoginRate
        expr: audit_failed_login_rate_per_minute > 20
        for: 3m
        labels:
          severity: critical
        annotations:
          summary: "High rate of failed login attempts"
          description: "{{ $value }} failed login attempts per minute detected - possible brute force attack"
          
  - name: system_health
    rules:
      - alert: AuditExporterDown
        expr: up{job="audit_exporter"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Audit exporter is down"
          description: "The audit log exporter has been down for more than 5 minutes"
          
      - alert: NodeExporterDown
        expr: up{job="node_exporter"} == 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Node exporter is down"
          description: "The node exporter has been down for more than 5 minutes"
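
The HighFailedLoginRate rule compares a per-minute gauge against a threshold of 20. One way such a gauge can be derived is a sliding window over event timestamps; the following is a minimal sketch of that idea, with SlidingWindowRate as a hypothetical class name and hand-picked timestamps for illustration.

```python
from collections import deque


class SlidingWindowRate:
    """Count events whose timestamp falls inside the last `window_seconds`."""

    def __init__(self, window_seconds: float = 60.0):
        self.window = window_seconds
        self.events = deque()

    def record(self, timestamp: float) -> None:
        self.events.append(timestamp)

    def count(self, now: float) -> int:
        # Drop timestamps that have aged out of the window, oldest first.
        while self.events and now - self.events[0] >= self.window:
            self.events.popleft()
        return len(self.events)


THRESHOLD = 20  # same threshold as the HighFailedLoginRate rule

window = SlidingWindowRate(window_seconds=60)
for second in range(25):          # 25 failures, one per second
    window.record(float(second))

rate = window.count(now=30.0)
print(f"{rate} failures in the last minute, above threshold: {rate > THRESHOLD}")
```

Unlike a window capped at a fixed number of entries, this variant keeps every event until it ages out, so the reported rate is not clipped during a heavy brute-force burst.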

Create Prometheus systemd service

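Fix the ownership of the configuration files, then create a systemd unit, for example /etc/systemd/system/prometheus.service, that runs Prometheus as the prometheus user with 30 days of retention.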

sudo chown -R prometheus:prometheus /etc/prometheus

[Unit]
Description=Prometheus
After=network.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus/ \
  --web.console.templates=/etc/prometheus/consoles \
  --web.console.libraries=/etc/prometheus/console_libraries \
  --web.listen-address=0.0.0.0:9090 \
  --web.enable-lifecycle \
  --storage.tsdb.retention.time=30d
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target

Install Grafana

Install Grafana to create dashboards for visualizing audit logs and security metrics from Prometheus.

# Ubuntu / Debian
curl -fsSL https://packages.grafana.com/gpg.key | sudo gpg --dearmor -o /usr/share/keyrings/grafana.gpg
echo "deb [signed-by=/usr/share/keyrings/grafana.gpg] https://packages.grafana.com/oss/deb stable main" | sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt update
sudo apt install -y grafana

# AlmaLinux / Rocky Linux
sudo tee /etc/yum.repos.d/grafana.repo <

Configure firewall rules

Open necessary ports for Prometheus, Grafana, and the exporters while maintaining security. Only expose Grafana externally.

# Ubuntu / Debian (ufw)
# Keep SSH reachable before the firewall is enabled
sudo ufw allow 22/tcp comment 'SSH'
sudo ufw allow 3000/tcp comment 'Grafana'
sudo ufw allow from 127.0.0.1 to any port 9090 comment 'Prometheus localhost only'
sudo ufw allow from 127.0.0.1 to any port 9100 comment 'Node Exporter localhost only'
sudo ufw allow from 127.0.0.1 to any port 9101 comment 'Audit Exporter localhost only'
sudo ufw --force enable

# AlmaLinux / Rocky Linux (firewalld)
sudo firewall-cmd --permanent --add-port=3000/tcp
sudo firewall-cmd --permanent --add-rich-rule="rule family='ipv4' source address='127.0.0.1' port protocol='tcp' port='9090' accept"
sudo firewall-cmd --permanent --add-rich-rule="rule family='ipv4' source address='127.0.0.1' port protocol='tcp' port='9100' accept"
sudo firewall-cmd --permanent --add-rich-rule="rule family='ipv4' source address='127.0.0.1' port protocol='tcp' port='9101' accept"
sudo firewall-cmd --reload

Start and enable all services

Enable and start all services in the correct order to ensure dependencies are met and the monitoring stack comes online properly.

sudo systemctl daemon-reload
sudo systemctl enable --now auditd
sudo systemctl enable --now node_exporter
sudo systemctl enable --now audit_exporter
sudo systemctl enable --now prometheus
sudo systemctl enable --now grafana-server
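
Once the services are started, a quick way to confirm that each one is actually accepting connections is a TCP connect test. This is a small sketch using only the standard library; the port numbers come from this tutorial, while the function names are illustrative.

```python
import socket

# Ports used in this tutorial.
SERVICE_PORTS = {
    "grafana-server": 3000,
    "prometheus": 9090,
    "node_exporter": 9100,
    "audit_exporter": 9101,
}


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str = "127.0.0.1", ports: dict = SERVICE_PORTS) -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(host, port) for name, port in ports.items()}


# Example: print(check_services())
```

A port that accepts connections only shows that the process is up; whether it serves useful metrics is checked in the verification section.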

Configure Grafana data source

Add Prometheus as a data source in Grafana and create the security monitoring dashboard. First, access Grafana and change the default password.

Note: Default Grafana credentials are admin/admin. You'll be prompted to change the password on first login.
echo "Grafana is available at http://your-server-ip:3000"
echo "Default credentials: admin / admin"
echo "Prometheus URL for data source: http://localhost:9090"

Create audit dashboard JSON

Create a Grafana dashboard definition that visualizes authentication failures, privilege escalations, session counts, sensitive file access, and cron changes. Save it as /tmp/audit_dashboard.json.

{
  "dashboard": {
    "id": null,
    "title": "Security Audit Dashboard",
    "tags": ["security", "audit", "monitoring"],
    "timezone": "browser",
    "panels": [
      {
        "id": 1,
        "title": "Authentication Failures by User",
        "type": "stat",
        "targets": [
          {
            "expr": "sum by (user) (increase(audit_authentication_failures_total[1h]))",
            "format": "time_series",
            "legendFormat": "{{user}}"
          }
        ],
        "gridPos": {"h": 8, "w": 12, "x": 0, "y": 0},
        "fieldConfig": {
          "defaults": {
            "color": {"mode": "thresholds"},
            "thresholds": {
              "steps": [
                {"color": "green", "value": null},
                {"color": "yellow", "value": 5},
                {"color": "red", "value": 15}
              ]
            }
          }
        }
      },
      {
        "id": 2,
        "title": "Privilege Escalations Over Time",
        "type": "timeseries",
        "targets": [
          {
            "expr": "rate(audit_privilege_escalations_total[5m]) * 60",
            "format": "time_series",
            "legendFormat": "{{user}} - {{command}}"
          }
        ],
        "gridPos": {"h": 8, "w": 12, "x": 12, "y": 0},
        "fieldConfig": {
          "defaults": {
            "min": 0,
            "custom": {"axisLabel": "Escalations per minute"}
          }
        }
      },
      {
        "id": 3,
        "title": "Failed Login Rate",
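        "type": "stat",
        "targets": [
          {
            "expr": "audit_failed_login_rate_per_minute",
            "format": "time_series"
          }
        ],
        "gridPos": {"h": 4, "w": 6, "x": 0, "y": 8},
        "options": {"reduceOptions": {"calcs": ["lastNotNull"]}},
        "fieldConfig": {
          "defaults": {
            "color": {"mode": "thresholds"},
            "thresholds": {
              "steps": [
                {"color": "green", "value": null},
                {"color": "yellow", "value": 10},
                {"color": "red", "value": 20}
              ]
            }
          }
        }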
      },
      {
        "id": 4,
        "title": "Active Sessions",
        "type": "stat",
        "targets": [
          {
            "expr": "audit_active_sessions",
            "format": "time_series"
          }
        ],
        "gridPos": {"h": 4, "w": 6, "x": 6, "y": 8},
        "options": {"reduceOptions": {"calcs": ["lastNotNull"]}}
      },
      {
        "id": 5,
        "title": "Sensitive File Access",
        "type": "table",
        "targets": [
          {
            "expr": "sum by (user, file) (increase(audit_file_accesses_total[24h]))",
            "format": "table",
            "instant": true
          }
        ],
        "gridPos": {"h": 8, "w": 12, "x": 12, "y": 8},
        "transformations": [
          {"id": "organize", "options": {"excludeByName": {"Time": true}}}
        ]
      },
      {
        "id": 6,
        "title": "Cron Changes",
        "type": "timeseries",
        "targets": [
          {
            "expr": "increase(audit_cron_changes_total[24h])",
            "format": "time_series"
          }
        ],
        "gridPos": {"h": 6, "w": 24, "x": 0, "y": 16}
      }
    ],
    "time": {"from": "now-6h", "to": "now"},
    "refresh": "30s"
  },
  "folderId": null,
  "overwrite": false
}
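
Before importing, the dashboard file can be checked for simple layout mistakes. The sketch below assumes the API wrapper shape used above (a top-level "dashboard" object with a "panels" list) and relies on Grafana's 24-column grid; check_dashboard is a hypothetical helper, and the sample document is a trimmed-down illustration.

```python
import json


def check_dashboard(doc: dict) -> list:
    """Return a list of human-readable problems found in the panel layout."""
    problems = []
    seen_ids = set()
    for panel in doc["dashboard"]["panels"]:
        title = panel.get("title", "<untitled>")
        panel_id = panel.get("id")
        if panel_id in seen_ids:
            problems.append(f"{title}: duplicate panel id {panel_id}")
        seen_ids.add(panel_id)
        pos = panel.get("gridPos", {})
        if pos.get("x", 0) + pos.get("w", 0) > 24:
            problems.append(f"{title}: exceeds the 24-column grid")
        if not any(target.get("expr") for target in panel.get("targets", [])):
            problems.append(f"{title}: no query expression")
    return problems


# Trimmed-down sample; the second panel is deliberately too wide.
SAMPLE = json.loads("""
{
  "dashboard": {
    "title": "Security Audit Dashboard",
    "panels": [
      {"id": 1, "title": "Active Sessions", "type": "stat",
       "targets": [{"expr": "audit_active_sessions"}],
       "gridPos": {"h": 4, "w": 6, "x": 6, "y": 8}},
      {"id": 2, "title": "Too Wide", "type": "stat",
       "targets": [{"expr": "audit_failed_login_rate_per_minute"}],
       "gridPos": {"h": 4, "w": 12, "x": 18, "y": 8}}
    ]
  }
}
""")

for problem in check_dashboard(SAMPLE):
    print(problem)
```

To check the real file, load it with json.load(open("/tmp/audit_dashboard.json")) and pass the result to the same function; a JSON syntax error will surface at load time.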

Import the dashboard into Grafana

Use the Grafana API or web interface to import the security audit dashboard. You can also create additional dashboards for specific security use cases.

# Import dashboard via API (replace with your Grafana admin password)
curl -X POST \
  http://admin:your-password@localhost:3000/api/dashboards/db \
  -H 'Content-Type: application/json' \
  -d @/tmp/audit_dashboard.json

Or manually import through Grafana UI:

1. Go to http://your-server-ip:3000

2. Click '+' > Import

3. Copy and paste the JSON content from /tmp/audit_dashboard.json
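
The same API import can be scripted from Python with the standard library. This is a sketch under assumptions: it targets the /api/dashboards/db endpoint used by the curl command, uses HTTP basic authentication, and build_import_request is an illustrative helper name. The request is only built here, not sent.

```python
import base64
import json
import urllib.request


def build_import_request(base_url: str, user: str, password: str,
                         payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that imports a dashboard."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/dashboards/db",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Placeholder payload and credentials; replace with the real file and password.
payload = {"dashboard": {"id": None, "title": "Security Audit Dashboard", "panels": []},
           "overwrite": False}
request = build_import_request("http://localhost:3000", "admin", "your-password", payload)
print(request.get_method(), request.full_url)

# To actually send it:
#   with urllib.request.urlopen(request) as response:
#       print(response.status, response.read().decode())
```

Sending the credentials in an Authorization header rather than inside the URL keeps the password out of shell history and process listings.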

Verify your setup

Check that all services are running correctly and metrics are being collected and displayed properly.

# Check service status
sudo systemctl status auditd node_exporter audit_exporter prometheus grafana-server

# Verify audit rules are active
sudo auditctl -l

# Check audit log is being generated
sudo tail -f /var/log/audit/audit.log

# Test Prometheus metrics endpoints
curl http://localhost:9100/metrics | head -20
curl http://localhost:9101/metrics | head -20

# Check Prometheus targets
curl http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | {job: .labels.job, health}'

# Test a security event (this should appear in metrics and dashboard)
sudo su - nonexistentuser 2>/dev/null || echo "Expected failure - check dashboard for auth failure metric"
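
The target check can also be done from Python when jq is not available. The sketch below parses a response in the shape returned by the /api/v1/targets endpoint, where the job name sits under each target's labels; the sample response is a shortened, made-up illustration and summarize_targets is a hypothetical helper.

```python
import json


def summarize_targets(api_response: str) -> dict:
    """Map each scrape job to the health reported for its target."""
    body = json.loads(api_response)
    return {
        target["labels"]["job"]: target["health"]
        for target in body["data"]["activeTargets"]
    }


# Shortened sample response for illustration only.
SAMPLE_RESPONSE = """
{
  "status": "success",
  "data": {
    "activeTargets": [
      {"labels": {"instance": "localhost:9090", "job": "prometheus"}, "health": "up"},
      {"labels": {"instance": "localhost:9100", "job": "node_exporter"}, "health": "up"},
      {"labels": {"instance": "localhost:9101", "job": "audit_exporter"}, "health": "down"}
    ],
    "droppedTargets": []
  }
}
"""

health = summarize_targets(SAMPLE_RESPONSE)
unhealthy = sorted(job for job, state in health.items() if state != "up")
print("Unhealthy jobs:", unhealthy or "none")
```

This assumes one target per job, as in the configuration above; with several targets per job the dictionary would keep only the last one.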

Common issues

Symptom | Cause | Fix
Audit exporter not collecting metrics | Insufficient permissions to read audit log | sudo usermod -a -G adm audit_exporter && sudo systemctl restart audit_exporter
Prometheus can't scrape exporters | Firewall blocking localhost connections | Check firewall rules allow loopback connections or use 127.0.0.1 explicitly
Grafana shows "No data" for dashboards | Prometheus data source not configured | Add Prometheus data source at http://localhost:9090 in Grafana settings
Audit rules not generating events | Rules not loaded or auditd not running | sudo systemctl restart auditd && sudo auditctl -R /etc/audit/rules.d/audit.rules (once -e 2 is active, rule changes only take effect after a reboot)
High CPU usage from audit exporter | Too many audit events being processed | Reduce audit rule scope or increase processing intervals in Python script
Grafana dashboard panels show errors | Metric names changed or not available | Check available metrics: curl http://localhost:9101/metrics | grep audit_
