Configure Netdata alerts with Slack and Microsoft Teams for real-time monitoring notifications

Intermediate 25 min Apr 10, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Set up comprehensive Netdata alerting with Slack and Microsoft Teams integration. Configure custom alert thresholds, webhook notifications, and automated monitoring responses for real-time system health alerts.

Prerequisites

  • Root or sudo access
  • Active Slack workspace
  • Microsoft Teams channel access
  • Internet connectivity for webhooks

What this solves

Netdata provides real-time system monitoring with powerful alerting capabilities, but the default email notifications aren't always sufficient for modern DevOps teams. This tutorial shows you how to integrate Netdata with Slack and Microsoft Teams to receive instant notifications about system performance issues, resource usage spikes, and service failures. You'll configure custom alert rules with specific thresholds and set up webhook-based notifications that reach your team wherever they work.

Step-by-step installation

Update system packages

Start by updating your package manager to ensure you get the latest versions of all dependencies.

# Debian/Ubuntu
sudo apt update && sudo apt upgrade -y

# AlmaLinux/Rocky Linux
sudo dnf update -y

Install required dependencies

Install curl and other tools needed for webhook notifications and JSON processing.

# Debian/Ubuntu
sudo apt install -y curl jq wget gnupg

# AlmaLinux/Rocky Linux
sudo dnf install -y curl jq wget gnupg

Install Netdata

Download and install Netdata using the official kickstart script with security features enabled.

bash <(curl -Ss https://my-netdata.io/kickstart.sh) --stable-channel --disable-telemetry

Verify Netdata installation

Check that Netdata is running and accessible on the default port 19999.

sudo systemctl status netdata
ss -tlnp | grep 19999

Create Slack webhook integration

Go to your Slack workspace and create an incoming webhook for your channel. Navigate to Apps > Incoming Webhooks > Add to Slack, select your channel, and copy the webhook URL.

Note: Keep your webhook URL secure. Anyone with access to it can send messages to your Slack channel.
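Before wiring the webhook into Netdata, it helps to see the JSON shape Slack expects. A minimal sketch, assuming jq is installed; the webhook URL is a placeholder and the channel/username values are illustrative:

```shell
# Minimal sketch of the payload Slack incoming webhooks accept. The URL is a
# placeholder; channel/username values here are illustrative assumptions.
SLACK_WEBHOOK_URL="https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"

slack_payload=$(jq -n \
  --arg channel "#monitoring" \
  --arg text "Netdata test: webhook reachable" \
  '{channel: $channel, username: "netdata", icon_emoji: ":warning:", text: $text}')

echo "$slack_payload"
# Send it for real once you have a valid webhook URL:
# curl -X POST -H 'Content-type: application/json' --data "$slack_payload" "$SLACK_WEBHOOK_URL"
```

If the curl line returns "ok", the webhook is live and Netdata will be able to post to it.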

Create Microsoft Teams webhook

In Microsoft Teams, go to your target channel, click the three dots menu, select Connectors, find "Incoming Webhook", and configure it. Copy the generated webhook URL for later use.
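Teams webhooks expect a different payload than Slack: a "MessageCard" JSON document rather than a flat {"text": ...} object. A hedged sketch (webhook URL and titles are placeholders):

```shell
# Teams incoming webhooks expect a "MessageCard" JSON document rather than
# Slack's flat {"text": ...} shape. URL and titles here are placeholders.
MSTEAMS_WEBHOOK_URL="https://outlook.office.com/webhook/YOUR/TEAMS/WEBHOOK"

teams_payload=$(jq -n \
  --arg summary "Netdata webhook test" \
  '{"@type": "MessageCard",
    "@context": "https://schema.org/extensions",
    "themeColor": "0076D7",
    "summary": $summary,
    "sections": [{"activityTitle": "netdata", "text": $summary}]}')

echo "$teams_payload"
# curl -X POST -H 'Content-type: application/json' --data "$teams_payload" "$MSTEAMS_WEBHOOK_URL"
```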

Configure Netdata health notifications

Copy the stock notification configuration into /etc/netdata and create the directory for custom health rules. (You can also open the file with Netdata's helper: sudo /etc/netdata/edit-config health_alarm_notify.conf.)

sudo mkdir -p /etc/netdata/health.d
sudo cp /usr/lib/netdata/conf.d/health_alarm_notify.conf /etc/netdata/health_alarm_notify.conf

Configure Slack notifications

Edit the health notification configuration to enable Slack webhooks with your specific channel settings.

# Enable Slack notifications
SEND_SLACK="YES"

# Slack webhook URL (replace with your actual webhook)
SLACK_WEBHOOK_URL="https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"

# Default Slack channel for alerts
DEFAULT_RECIPIENT_SLACK="#monitoring"

# Role-based recipients for different alert types
role_recipients_slack[sysadmin]="#alerts"
role_recipients_slack[domainadmin]="#infrastructure"
role_recipients_slack[dba]="#database"
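If you are provisioning many hosts, editing the file by hand does not scale. A hypothetical sed-based sketch of the same edits, run here against a scratch copy; on a real system point CONF at /etc/netdata/health_alarm_notify.conf and run the sed commands with sudo:

```shell
# Hypothetical automation of the Slack settings with sed, against a
# scratch copy of the config (swap CONF for the real file under sudo).
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
SEND_SLACK="NO"
SLACK_WEBHOOK_URL=""
DEFAULT_RECIPIENT_SLACK=""
EOF

WEBHOOK="https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"   # placeholder
sed -i \
  -e 's|^SEND_SLACK=.*|SEND_SLACK="YES"|' \
  -e "s|^SLACK_WEBHOOK_URL=.*|SLACK_WEBHOOK_URL=\"$WEBHOOK\"|" \
  -e 's|^DEFAULT_RECIPIENT_SLACK=.*|DEFAULT_RECIPIENT_SLACK="#monitoring"|' \
  "$CONF"

cat "$CONF"
```

Anchoring each pattern to the start of the line (^) avoids accidentally rewriting the commented-out examples that ship in the stock file.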

Configure Microsoft Teams notifications

Add Microsoft Teams webhook configuration to the same notification file.

# Enable Microsoft Teams notifications
SEND_MSTEAMS="YES"

# Microsoft Teams webhook URL (replace with your actual webhook)
MSTEAMS_WEBHOOK_URL="https://outlook.office.com/webhook/YOUR/TEAMS/WEBHOOK"

# Default Teams channel
DEFAULT_RECIPIENT_MSTEAMS="General"

# Teams notification settings
MSTEAMS_ICON_DEFAULT="https://registry.my-netdata.io/api/v1/badge.svg?chart=system.cpu"
MSTEAMS_COLOR_DEFAULT="0076D7"

# Role-based Teams recipients
role_recipients_msteams[sysadmin]="Infrastructure Team"
role_recipients_msteams[domainadmin]="Operations Team"

Create custom alert rules

Define specific monitoring thresholds for CPU, memory, disk usage, and network performance.

# High CPU usage alert
template: cpu_usage_high
      on: system.cpu
    calc: $user + $system + $softirq + $irq + $guest
   units: %
   every: 10s
    warn: $this > 80
    crit: $this > 95
   delay: down 5m multiplier 1.5 max 1h
    info: CPU utilization is high
      to: sysadmin
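The delay line deserves a closer look, since it controls escalation. Roughly: every further status change while a notification is pending multiplies the current delay by the multiplier, capped at max. A quick awk model of `delay: down 5m multiplier 1.5 max 1h` (a simplified illustration, not Netdata's exact internal algorithm):

```shell
# Rough model of how `delay: down 5m multiplier 1.5 max 1h` escalates:
# each further flap multiplies the pending delay by 1.5, capped at 1 hour.
schedule=$(awk 'BEGIN {
  delay = 300                      # down 5m, in seconds
  max   = 3600                     # max 1h
  for (i = 1; i <= 8; i++) {
    printf "flap %d: delay %ds\n", i, delay
    delay *= 1.5
    if (delay > max) delay = max
  }
}')
echo "$schedule"
```

So a flapping alert backs off from 5 minutes to the 1-hour ceiling instead of paging the channel on every transition.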

Memory usage alert

template: ram_usage_high
      on: system.ram
    calc: $used * 100 / ($used + $cached + $free + $buffers)
   units: %
   every: 10s
    warn: $this > 80
    crit: $this > 90
   delay: down 5m multiplier 1.5 max 1h
    info: RAM usage is high
      to: sysadmin

Disk space alert

template: disk_space_usage
      on: disk.space
    calc: $used * 100 / ($avail + $used)
   units: %
   every: 1m
    warn: $this > 80
    crit: $this > 90
   delay: up 1m down 15m multiplier 1.5 max 1h
    info: Disk space usage is high
      to: sysadmin

Network interface errors

template: network_interface_errors
      on: net.errors
    calc: $inbound + $outbound
   units: errors/s
   every: 10s
    warn: $this > 5
    crit: $this > 20
   delay: up 1m down 5m multiplier 1.5 max 2h
    info: Network interface errors detected
      to: sysadmin

Configure service monitoring alerts

Set up alerts for critical services like web servers, databases, and system services.

# Apache/Nginx status monitoring
template: web_service_requests
      on: apache.requests
    calc: $requests
   units: requests/s
   every: 10s
    warn: $this > 1000
    crit: $this > 2000
   delay: up 2m down 5m multiplier 1.5 max 1h
    info: High web server request rate
      to: sysadmin

Database connection monitoring

template: mysql_connections
      on: mysql.connections
    calc: $connections
   units: connections
   every: 10s
    warn: $this > 80
    crit: $this > 95
   delay: up 1m down 5m multiplier 1.5 max 1h
    info: High MySQL connection usage
      to: dba

System service status

template: systemd_service_failed
      on: systemd.service_units
    calc: $failed
   units: units
   every: 10s
    crit: $this > 0
   delay: up 30s down 5m multiplier 1.5 max 1h
    info: Systemd service failure detected
      to: sysadmin

Set correct file permissions

Ensure Netdata can read the configuration files with proper ownership and permissions.

sudo chown netdata:netdata /etc/netdata/health_alarm_notify.conf
sudo chown -R netdata:netdata /etc/netdata/health.d/
sudo chmod 640 /etc/netdata/health_alarm_notify.conf
sudo chmod 644 /etc/netdata/health.d/*.conf

Never use chmod 777: it gives every user on the system full access to your files. Fix ownership with chown and keep permissions minimal instead.

Restart Netdata service

Apply the new configuration by restarting the Netdata service.

sudo systemctl restart netdata
sudo systemctl status netdata

Test notification delivery

Trigger a test alert to verify both Slack and Teams notifications are working properly.

# Generate CPU load for testing: one worker per core, so the
# system-wide percentage actually crosses the alert threshold
for i in $(seq "$(nproc)"); do yes > /dev/null & done
sleep 60
kill $(jobs -p)

Check Netdata logs for notification attempts

sudo tail -f /var/log/netdata/error.log | grep -E "(slack|msteams|alarm)"

Configure advanced alert customization

Create role-based alert routing

Configure different notification channels based on alert severity and system component.

# Critical system alerts go to all channels
template: system_critical_all
      on: system.load
    calc: $load1
   units: load
   every: 10s
    crit: $this > (($system.cpu.cores) * 2)
   delay: up 1m down 15m multiplier 1.5 max 1h
    info: System load is critically high
      to: sysadmin domainadmin

Database alerts only to DBA team

template: database_performance
      on: mysql.queries
    calc: $select + $insert + $update + $delete
   units: queries/s
   every: 10s
    warn: $this > 500
    crit: $this > 1000
   delay: up 2m down 10m multiplier 1.5 max 1h
    info: High database query rate
      to: dba

Configure alert suppression and maintenance windows

Set up alert suppression during maintenance windows and prevent alert storms.

# Longer delays smooth over transient spikes (e.g. a nightly backup briefly
# filling a disk); Netdata's health syntax has no built-in time-window option
template: maintenance_window_disk
      on: disk.space
    calc: $used * 100 / ($avail + $used)
   units: %
   every: 1m
    warn: $this > 85
    crit: $this > 95
   delay: up 5m down 15m multiplier 1.5 max 1h
    info: Disk space usage is high (maintenance window considered)
      to: sysadmin
 # no-clear-notification suppresses the "recovered" notification for this alert
 options: no-clear-notification
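If you need a true scheduled maintenance window (say, 02:00 to 04:00 for backups), that can be scripted with Netdata's health management API. A hedged sketch, assuming the stock API key location /var/lib/netdata/netdata.api.key and an agent listening on localhost; the helper script name and path are hypothetical:

```shell
#!/bin/sh
# /usr/local/sbin/netdata-maintenance -- hypothetical helper; assumes the
# stock API key file /var/lib/netdata/netdata.api.key and a local agent.
TOKEN=$(cat /var/lib/netdata/netdata.api.key)
case "$1" in
  start) CMD="SILENCE%20ALL" ;;   # stop sending notifications
  stop)  CMD="RESET" ;;           # resume normal alerting
  *)     echo "usage: $0 start|stop" >&2; exit 1 ;;
esac
curl -s -H "X-Auth-Token: $TOKEN" "http://localhost:19999/api/v1/health?cmd=$CMD"
```

Driven from cron, it opens and closes the window automatically:

```shell
# /etc/cron.d/netdata-maintenance -- silence 02:00-04:00 daily
0 2 * * * root /usr/local/sbin/netdata-maintenance start
0 4 * * * root /usr/local/sbin/netdata-maintenance stop
```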

Rate limiting for noisy alerts

template: network_drops_limited
      on: net.drops
    calc: $inbound + $outbound
   units: drops/s
   every: 30s
    warn: $this > 10
    crit: $this > 50
   delay: up 5m down 30m multiplier 2.0 max 4h
    info: Network packet drops detected (rate limited)
      to: sysadmin

Verify your setup

# Check Netdata service status
sudo systemctl status netdata

Reload the health configuration (parse errors show up in Netdata's error log)

sudo netdatacli reload-health

Check active alerts

curl -s "http://localhost:19999/api/v1/alarms?active" | jq .
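The raw alarms response is verbose; a jq filter reduces it to what is actually firing. The sample below mimics the assumed shape of the v1 response (an "alarms" object keyed per alarm, field names are assumptions); on a live host, pipe the curl output above into the same filter:

```shell
# Filter the alarms API response down to what is firing. The sample mimics
# the assumed v1 response shape; on a live host pipe curl output into jq.
sample='{"alarms": {
  "system.cpu.cpu_usage_high":     {"name": "cpu_usage_high",   "chart": "system.cpu", "status": "WARNING",  "value": 87.2},
  "disk_space._.disk_space_usage": {"name": "disk_space_usage", "chart": "disk.space", "status": "CRITICAL", "value": 93.1}
}}'

firing=$(echo "$sample" | jq -r '.alarms | to_entries[]
  | "\(.value.status)\t\(.value.chart)\t\(.value.name)\t\(.value.value)"')
echo "$firing"
```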

Verify webhook connectivity

curl -X POST -H 'Content-type: application/json' \
  --data '{"text":"Netdata webhook test"}' \
  "YOUR_SLACK_WEBHOOK_URL"

Check Netdata logs for notification attempts

sudo journalctl -u netdata -f | grep -E "(ALARM|notification|webhook)"

Test alert configuration

sudo /usr/libexec/netdata/plugins.d/alarm-notify.sh test

Verify health check endpoints

curl -s "http://localhost:19999/api/v1/info" | jq .version

Common issues

| Symptom | Cause | Fix |
|---|---|---|
| Notifications not sending | Incorrect webhook URLs or permissions | Verify URLs with a curl test and check file permissions |
| Alerts triggering too frequently | Thresholds too low or delay settings wrong | Increase warn/crit values and adjust delay parameters |
| Missing alerts in Teams/Slack | Role recipients not configured correctly | Check role_recipients_slack and role_recipients_msteams settings |
| Configuration syntax errors | Invalid health-config syntax or missing quotes | Reload with netdatacli reload-health and check the error log |
| Webhook SSL certificate errors | Outdated CA certificates | Update the ca-certificates package and restart Netdata |
| High false positive rate | Default thresholds not tuned for environment | Monitor baseline metrics for 24h before setting final thresholds |

Need help?

Don't want to manage this yourself?

We handle managed DevOps services for businesses that depend on uptime, from initial setup to ongoing operations.