Telegraf Custom Plugins with Prometheus & InfluxDB

Learn to build custom Telegraf input plugins for application metrics collection, configure dual output to Prometheus and InfluxDB backends, and create comprehensive monitoring dashboards with Grafana for production observability.

Prerequisites

Root or sudo access
Python 3.6+ installed
At least 2GB RAM
Network access for package downloads

What this solves

Telegraf's built-in plugins don't always capture the specific metrics your applications generate. This tutorial shows you how to collect application-specific metrics with a custom script run through Telegraf's exec input plugin, send them to both Prometheus and InfluxDB, and build Grafana dashboards on top of both backends.

Step-by-step installation

Install Telegraf agent

Start by installing Telegraf from the official InfluxData repository to ensure you get the latest features and security updates.

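On Ubuntu/Debian:
wget -qO- https://repos.influxdata.com/influxdata-archive_compat.key | sudo apt-key add -
echo "deb https://repos.influxdata.com/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/influxdata.list
sudo apt update
sudo apt install -y telegraf

On RHEL-based distributions (AlmaLinux, Rocky Linux, Fedora):
sudo tee /etc/yum.repos.d/influxdata.repo <<'EOF'
[influxdata]
name = InfluxData Repository - Stable
baseurl = https://repos.influxdata.com/stable/$basearch/main
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdata-archive_compat.key
EOF
sudo dnf install -y telegraf
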
Create custom input plugin directory
Set up a dedicated directory for your custom plugins and ensure proper permissions for the telegraf user to execute them.
sudo mkdir -p /etc/telegraf/scripts
sudo chown telegraf:telegraf /etc/telegraf/scripts
sudo chmod 755 /etc/telegraf/scripts



Build custom application metrics script
Create a script that gathers application-specific metrics and save it as /etc/telegraf/scripts/app_metrics.py. This example monitors a web application's response time, database connection pool, business counters, and process resource usage.
#!/usr/bin/env python3
"""Collect application metrics and print them as one line of InfluxDB line protocol."""
import socket
import time

import psutil
import requests


def collect_app_metrics():
    metrics = {}
    timestamp = int(time.time() * 1000000000)  # nanoseconds for InfluxDB

    # Application response time check
    try:
        start_time = time.time()
        response = requests.get('http://localhost:8080/health', timeout=5)
        response_time = (time.time() - start_time) * 1000
        metrics['app_response_time'] = response_time
        metrics['app_status_code'] = response.status_code
        metrics['app_available'] = 1 if response.status_code == 200 else 0
    except requests.RequestException:
        # Report the application as unavailable instead of failing the whole run
        metrics['app_response_time'] = 0.0
        metrics['app_status_code'] = 0
        metrics['app_available'] = 0

    # Database connection pool metrics (skipped if the endpoint is down or returns invalid JSON)
    try:
        db_response = requests.get('http://localhost:8080/metrics/db', timeout=2)
        if db_response.status_code == 200:
            db_data = db_response.json()
            metrics['db_active_connections'] = db_data.get('active_connections', 0)
            metrics['db_idle_connections'] = db_data.get('idle_connections', 0)
            metrics['db_max_connections'] = db_data.get('max_connections', 0)
    except (requests.RequestException, ValueError):
        pass

    # Custom business metrics (skipped on the same failures as above)
    try:
        business_response = requests.get('http://localhost:8080/metrics/business', timeout=2)
        if business_response.status_code == 200:
            business_data = business_response.json()
            metrics['active_users'] = business_data.get('active_users', 0)
            metrics['orders_per_minute'] = business_data.get('orders_per_minute', 0)
            metrics['revenue_last_hour'] = business_data.get('revenue_last_hour', 0)
    except (requests.RequestException, ValueError):
        pass

    # System resource usage for the application
    for proc in psutil.process_iter(['name']):
        try:
            if 'myapp' in (proc.info['name'] or ''):
                # cpu_percent() needs a sampling interval here: in a freshly started
                # script the first non-blocking call always returns 0.0
                metrics['app_cpu_percent'] = proc.cpu_percent(interval=0.1)
                metrics['app_memory_mb'] = proc.memory_info().rss / 1024 / 1024
                break
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            continue

    # Output in InfluxDB line protocol format
    tags = f"host={socket.gethostname()},app=myapp"
    fields = []
    for key, value in metrics.items():
        if isinstance(value, (int, float)):
            fields.append(f"{key}={value}")
        else:
            fields.append(f'{key}="{value}"')

    line = f"custom_app_metrics,{tags} {','.join(fields)} {timestamp}"
    print(line)

if __name__ == "__main__":
    collect_app_metrics()
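
The script above interpolates tag and field values directly into the output line. In line protocol, commas, spaces, and equals signs inside tag keys, tag values, and field keys act as delimiters, and a double quote inside a string field ends the string, so values that contain them need escaping. A minimal sketch of a formatter that handles this, assuming the measurement name itself needs no escaping; `escape_key`, `format_field`, and `to_line` are illustrative names, not part of Telegraf or the script above:

```python
def escape_key(value):
    """Escape commas, equals signs and spaces in tag keys, tag values and field keys."""
    return str(value).replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def format_field(key, value):
    """Format one field; numbers and booleans are written bare, everything else as a quoted string."""
    if isinstance(value, (bool, int, float)):
        return f"{escape_key(key)}={value}"
    # String field values must escape backslashes and double quotes
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'{escape_key(key)}="{text}"'


def to_line(measurement, tags, fields, timestamp_ns):
    """Build one line of line protocol (assumes a measurement name without commas or spaces)."""
    tag_part = ",".join(f"{escape_key(k)}={escape_key(v)}" for k, v in sorted(tags.items()))
    field_part = ",".join(format_field(k, v) for k, v in fields.items())
    head = f"{measurement},{tag_part}" if tag_part else measurement
    return f"{head} {field_part} {timestamp_ns}"


print(to_line(
    "custom_app_metrics",
    {"host": "web 01", "app": "myapp"},
    {"app_available": 1, "note": 'say "hi"'},
    1700000000000000000,
))
```

Sorting the tags by key is optional, but it keeps the output stable between runs.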



Make the script executable
Set proper permissions and ownership for the custom metrics script.
sudo chmod 755 /etc/telegraf/scripts/app_metrics.py
sudo chown telegraf:telegraf /etc/telegraf/scripts/app_metrics.py
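
Telegraf's exec input runs the command on every collection interval and parses whatever it prints to stdout, so it helps to check the script the same way before wiring it in. A rough sketch of such a check, not Telegraf's actual implementation; `run_collector` is a hypothetical helper, and the 10-second default mirrors the timeout used in the Telegraf configuration in this tutorial:

```python
import subprocess
import sys


def run_collector(command, timeout=10):
    """Run a collector command and return its non-empty stdout lines.

    Raises subprocess.TimeoutExpired if the command runs past the timeout and
    subprocess.CalledProcessError if it exits with a non-zero status.
    """
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text (also works on Python 3.6)
        timeout=timeout,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


# Stand-in command so the sketch runs anywhere; on the server you would pass
# ["/etc/telegraf/scripts/app_metrics.py"] instead.
demo_command = [sys.executable, "-c", "print('custom_app_metrics,app=myapp app_available=1')"]
for line in run_collector(demo_command):
    print(line)
```

Run it as the telegraf user to catch permission problems as well as runtime errors.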



Install Python dependencies
Install the Python packages the script imports. The distribution packages below already provide requests and psutil.

On Ubuntu/Debian:
sudo apt install -y python3-pip python3-requests python3-psutil

On RHEL-based distributions:
sudo dnf install -y python3-pip python3-requests python3-psutil





Configure Telegraf with custom plugin and dual outputs
Replace the contents of /etc/telegraf/telegraf.conf with a configuration that runs your custom script through the exec input plugin and writes to both Prometheus and InfluxDB.
# Global agent configuration
[agent]
  interval = "30s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  hostname = ""
  omit_hostname = false

# Custom application metrics input plugin
[[inputs.exec]]
  commands = ["/etc/telegraf/scripts/app_metrics.py"]
  timeout = "10s"
  data_format = "influx"
  interval = "30s"

# System metrics for context
[[inputs.cpu]]
  percpu = true
  totalcpu = true
  collect_cpu_time = false
  report_active = false

[[inputs.disk]]
  ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]

[[inputs.diskio]]

[[inputs.kernel]]

[[inputs.mem]]

[[inputs.processes]]

[[inputs.swap]]

[[inputs.system]]

# Network interface monitoring
[[inputs.net]]

# HTTP response time monitoring
[[inputs.http_response]]
  urls = ["http://localhost:8080/health"]
  response_timeout = "5s"
  method = "GET"
  follow_redirects = false

# InfluxDB output
[[outputs.influxdb]]
  urls = ["http://localhost:8086"]
  database = "telegraf"
  retention_policy = ""
  write_consistency = "any"
  timeout = "5s"
  username = "telegraf_user"
  password = "your_influxdb_password"

# Prometheus output
[[outputs.prometheus_client]]
  listen = ":9273"
  metric_version = 2
  collectors_exclude = ["gocollector", "process"]
  string_as_label = false
  export_timestamp = false
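
With metric_version = 2, the prometheus_client output exposes each numeric field as its own metric, named by joining the measurement and field names with an underscore, while tags become labels. That is why the dashboard and alert expressions in this tutorial use names such as custom_app_metrics_app_response_time. A simplified sketch of the mapping; the `prometheus_name` helper is illustrative, and Telegraf's real sanitizing rules cover more cases than this:

```python
import re


def prometheus_name(measurement, field):
    """Approximate the metric name exposed for a measurement/field pair."""
    name = f"{measurement}_{field}"
    # Replace characters that are not valid in a Prometheus metric name
    return re.sub(r"[^a-zA-Z0-9_:]", "_", name)


for field in ("app_response_time", "app_available", "db_active_connections"):
    print(prometheus_name("custom_app_metrics", field))
```

You can compare the result with the names actually served at http://localhost:9273/metrics once Telegraf is running.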



Install and configure InfluxDB
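Set up InfluxDB as one of your time-series backends for storing metrics data. The influxdb package from the InfluxData repository is InfluxDB 1.x, which the InfluxQL commands in the next step assume.

On Ubuntu/Debian:
sudo apt install -y influxdb
sudo systemctl enable --now influxdb

On RHEL-based distributions:
sudo dnf install -y influxdb
sudo systemctl enable --now influxdb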





Create InfluxDB database and user
Open the InfluxDB shell with influx, then create the database and the user credentials Telegraf will use to write metrics data.
influx
CREATE DATABASE telegraf
CREATE USER "telegraf_user" WITH PASSWORD 'your_influxdb_password'
GRANT ALL ON "telegraf" TO "telegraf_user"
EXIT
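
To confirm the new credentials work before Telegraf starts, you can write a single test point through the InfluxDB 1.x HTTP API, which accepts line protocol on POST /write. A minimal sketch using only the standard library; `build_write_request` is a hypothetical helper, the `setup_check` measurement is made up for this test, and the URL and credentials are the placeholder values from this tutorial:

```python
import base64
import urllib.parse
import urllib.request


def build_write_request(base_url, database, username, password, line):
    """Build a POST /write request carrying one line of line protocol."""
    query = urllib.parse.urlencode({"db": database, "precision": "ns"})
    request = urllib.request.Request(
        f"{base_url}/write?{query}",
        data=line.encode("utf-8"),
        method="POST",
    )
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    return request


req = build_write_request(
    "http://localhost:8086", "telegraf",
    "telegraf_user", "your_influxdb_password",
    "setup_check,source=manual value=1",
)
print(req.get_method(), req.full_url)

# To send it on the InfluxDB host (a 204 status means the write was accepted):
#   with urllib.request.urlopen(req, timeout=5) as response:
#       print(response.status)
```

InfluxDB 1.x only checks the credentials when authentication is enabled in its configuration, so a successful write on its own does not prove the password is right.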



Install Prometheus
Set up Prometheus to scrape metrics from Telegraf's Prometheus output endpoint.






sudo useradd --no-create-home --shell /bin/false prometheus
sudo mkdir /etc/prometheus /var/lib/prometheus
sudo chown prometheus:prometheus /etc/prometheus /var/lib/prometheus
cd /tmp
wget https://github.com/prometheus/prometheus/releases/download/v2.45.0/prometheus-2.45.0.linux-amd64.tar.gz
tar xzf prometheus-2.45.0.linux-amd64.tar.gz
sudo cp prometheus-2.45.0.linux-amd64/prometheus /usr/local/bin/
sudo cp prometheus-2.45.0.linux-amd64/promtool /usr/local/bin/
sudo chown prometheus:prometheus /usr/local/bin/prometheus /usr/local/bin/promtool







Configure Prometheus to scrape Telegraf
Save the following as /etc/prometheus/prometheus.yml so that Prometheus collects metrics from Telegraf's endpoint on port 9273.
global:
  scrape_interval: 30s
  evaluation_interval: 30s

scrape_configs:
  - job_name: 'telegraf'
    static_configs:
      - targets: ['localhost:9273']
    scrape_interval: 30s
    metrics_path: /metrics
    
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093



Create Prometheus systemd service
Save the following unit as /etc/systemd/system/prometheus.service so that systemd can start and manage Prometheus.
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file /etc/prometheus/prometheus.yml \
    --storage.tsdb.path /var/lib/prometheus/ \
    --web.console.templates=/etc/prometheus/consoles \
    --web.console.libraries=/etc/prometheus/console_libraries \
    --web.listen-address=0.0.0.0:9090 \
    --web.enable-lifecycle

[Install]
WantedBy=multi-user.target



Install and configure Grafana
Set up Grafana for creating dashboards that visualize data from both Prometheus and InfluxDB.






sudo apt install -y software-properties-common
wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
echo "deb https://packages.grafana.com/oss/deb stable main" | sudo tee -a /etc/apt/sources.list.d/grafana.list
sudo apt update
sudo apt install -y grafana


sudo tee /etc/yum.repos.d/grafana.repo <






Start all services
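Reload systemd so it picks up the new Prometheus unit, then enable and start all the monitoring stack components. Restart Telegraf at the end so it loads the new configuration if the package had already started it.
sudo systemctl daemon-reload
sudo systemctl enable --now influxdb
sudo systemctl enable --now prometheus
sudo systemctl enable --now grafana-server
sudo systemctl enable --now telegraf
sudo systemctl restart telegraf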



Create Grafana dashboard configuration
Set up a comprehensive dashboard that displays metrics from both data sources.
{
  "dashboard": {
    "id": null,
    "title": "Custom Application Monitoring",
    "tags": ["telegraf", "custom", "application"],
    "timezone": "browser",
    "panels": [
      {
        "id": 1,
        "title": "Application Response Time",
        "type": "stat",
        "targets": [
          {
            "expr": "custom_app_metrics_app_response_time",
            "legendFormat": "Response Time (ms)",
            "refId": "A",
            "datasource": "Prometheus"
          }
        ],
        "gridPos": {"h": 8, "w": 12, "x": 0, "y": 0},
        "fieldConfig": {
          "defaults": {
            "unit": "ms",
            "thresholds": {
              "steps": [
                {"color": "green", "value": null},
                {"color": "yellow", "value": 100},
                {"color": "red", "value": 500}
              ]
            }
          }
        }
      },
      {
        "id": 2,
        "title": "Database Connections",
        "type": "graph",
        "targets": [
          {
            "query": "SELECT mean(\"db_active_connections\") FROM \"custom_app_metrics\" WHERE time >= now() - 1h GROUP BY time(1m) fill(null)",
            "refId": "A",
            "datasource": "InfluxDB"
          }
        ],
        "gridPos": {"h": 8, "w": 12, "x": 12, "y": 0}
      },
      {
        "id": 3,
        "title": "Business Metrics",
        "type": "table",
        "targets": [
          {
            "expr": "custom_app_metrics_active_users",
            "legendFormat": "Active Users",
            "refId": "A",
            "datasource": "Prometheus"
          },
          {
            "expr": "custom_app_metrics_orders_per_minute",
            "legendFormat": "Orders/min",
            "refId": "B",
            "datasource": "Prometheus"
          }
        ],
        "gridPos": {"h": 8, "w": 24, "x": 0, "y": 8}
      }
    ],
    "time": {"from": "now-1h", "to": "now"},
    "refresh": "30s"
  }
}
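
The JSON above is wrapped in a top-level dashboard key, which is the shape Grafana's HTTP API expects on POST /api/dashboards/db. A minimal sketch of uploading it that way; `build_dashboard_request` and the dashboard.json file name are assumptions made for illustration, and admin/admin is Grafana's default login:

```python
import base64
import json
import urllib.request


def build_dashboard_request(grafana_url, payload, username, password):
    """Build a POST /api/dashboards/db request for a {"dashboard": {...}} payload."""
    # overwrite=True lets a repeated upload replace the previous version
    body = dict(payload, overwrite=True)
    request = urllib.request.Request(
        f"{grafana_url}/api/dashboards/db",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
    )
    request.add_header("Content-Type", "application/json")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    return request


example = {"dashboard": {"id": None, "title": "Custom Application Monitoring"}}
req = build_dashboard_request("http://localhost:3000", example, "admin", "admin")
print(req.get_method(), req.full_url)

# To upload the real file on the Grafana host:
#   with open("dashboard.json") as handle:
#       req = build_dashboard_request("http://localhost:3000", json.load(handle), "admin", "admin")
#   with urllib.request.urlopen(req, timeout=10) as response:
#       print(response.status)
```

The panels refer to data sources by name, so data sources called Prometheus and InfluxDB need to exist before the panels show data.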


Configure advanced alerting rules


Create Prometheus alerting rules
Set up alert rules for your custom metrics so you are notified when issues occur. Save them as /etc/prometheus/app_alerts.yml.
groups:
- name: custom_app_alerts
  rules:
  - alert: ApplicationDown
    expr: custom_app_metrics_app_available == 0
    for: 2m
    labels:
      severity: critical
    annotations:
      summary: "Application is down"
      description: "Application has been down for more than 2 minutes"
      
  - alert: HighResponseTime
    expr: custom_app_metrics_app_response_time > 1000
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "High application response time"
      description: "Application response time is {{ $value }}ms"
      
  - alert: DatabaseConnectionsHigh
    expr: custom_app_metrics_db_active_connections > 80
    for: 3m
    labels:
      severity: warning
    annotations:
      summary: "High database connection usage"
      description: "Database has {{ $value }} active connections"
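
The for clause in each rule means the expression has to stay true across consecutive evaluations for that long before the alert fires; until then the alert is only pending. A simplified model of that behaviour, assuming one sample per rule evaluation; the `alert_state` function is illustrative and is not how Prometheus is implemented:

```python
def alert_state(samples, condition, for_seconds):
    """Return 'inactive', 'pending' or 'firing' as of the last sample.

    samples is a list of (unix_timestamp, value) pairs in time order.
    """
    active_since = None
    for timestamp, value in samples:
        if condition(value):
            if active_since is None:
                active_since = timestamp  # the condition just became true
        else:
            active_since = None  # any false evaluation resets the timer
    if active_since is None:
        return "inactive"
    last_timestamp = samples[-1][0]
    return "firing" if last_timestamp - active_since >= for_seconds else "pending"


# ApplicationDown: custom_app_metrics_app_available == 0 for 2m, evaluated every 30s
is_down = lambda value: value == 0
print(alert_state([(0, 0), (30, 0), (60, 0), (90, 0), (120, 0)], is_down, 120))
print(alert_state([(0, 0), (30, 0), (60, 1), (90, 0), (120, 0)], is_down, 120))
```

In the second call the application recovers at the 60-second mark, so the timer restarts and the alert is still pending at the end.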



Update Prometheus configuration for alerts
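List the rules file under rule_files in /etc/prometheus/prometheus.yml (for example, - "/etc/prometheus/app_alerts.yml"), then fix its ownership and restart Prometheus.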
sudo chown prometheus:prometheus /etc/prometheus/app_alerts.yml
sudo systemctl restart prometheus


Verify your setup
# Check all services are running
sudo systemctl status telegraf prometheus grafana-server influxdb

# Test custom script execution
sudo -u telegraf /etc/telegraf/scripts/app_metrics.py

# Check Telegraf is collecting metrics
sudo journalctl -u telegraf -f

# Verify Prometheus is scraping Telegraf
curl http://localhost:9090/api/v1/targets

# Check InfluxDB has data
influx -execute 'SHOW MEASUREMENTS ON telegraf'

# Test Grafana access
curl http://localhost:3000/api/health
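
The /api/v1/targets response is verbose JSON, and the part that matters here is the health of each active target. A small sketch that extracts the failing targets; `unhealthy_targets` is a hypothetical helper, and the sample payload is trimmed and hand-written for illustration rather than captured from a real server:

```python
def unhealthy_targets(payload):
    """Return (job, scrape URL, last error) for every active target that is not up."""
    problems = []
    for target in payload.get("data", {}).get("activeTargets", []):
        if target.get("health") != "up":
            problems.append((
                target.get("labels", {}).get("job"),
                target.get("scrapeUrl"),
                target.get("lastError"),
            ))
    return problems


# Trimmed example of the response shape, with made-up values
sample = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "prometheus"}, "scrapeUrl": "http://localhost:9090/metrics",
             "health": "up", "lastError": ""},
            {"labels": {"job": "telegraf"}, "scrapeUrl": "http://localhost:9273/metrics",
             "health": "down", "lastError": "connection refused"},
        ]
    },
}
print(unhealthy_targets(sample))
```

An empty list means every configured target was scraped successfully on the last attempt.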


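Note: Access Grafana at http://your-server:3000 with the default credentials admin/admin and change the password at first login. Add data sources for both Prometheus (http://localhost:9090) and InfluxDB (http://localhost:8086, database telegraf), and name them Prometheus and InfluxDB so they match the datasource values in the dashboard JSON.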

Common issues



Symptom	Cause	Fix
Custom script not executing	Permission or path issues	`sudo -u telegraf /etc/telegraf/scripts/app_metrics.py` to test manually
No metrics in Prometheus	Telegraf Prometheus output not running	Check `curl localhost:9273/metrics` and verify port 9273 is open
InfluxDB connection failed	Database or user doesn't exist	Recreate database and user with proper permissions
Python import errors	Missing dependencies	Install missing packages with `pip3 install requests psutil`
High memory usage	Too frequent collection interval	Increase interval in telegraf.conf from 30s to 60s or higher
Grafana dashboards empty	Data source not configured	Add both Prometheus and InfluxDB as data sources in Grafana settings




Next steps

Configure advanced Grafana dashboards and alerting with Prometheus integration
Set up Alertmanager with email and Slack notifications for monitoring alerts
Implement Telegraf clustering for high availability monitoring
Configure Telegraf custom processors and aggregators for data transformation






    
            
            
                
                    
                        
                            
                        
                        
Automated install script

Run this to automate the Telegraf installation and configuration. It does not install InfluxDB, Prometheus, or Grafana. Save it as install.sh.

#!/usr/bin/env bash

set -euo pipefail

# Colors for output
readonly RED='\033[0;31m'
readonly GREEN='\033[0;32m'
readonly YELLOW='\033[1;33m'
readonly BLUE='\033[0;34m'
readonly NC='\033[0m' # No Color

# Global variables
SCRIPT_NAME=$(basename "$0")
TEMP_FILES=()
SERVICES_STARTED=()

# Cleanup function
cleanup() {
    local exit_code=$?
    echo -e "\n${YELLOW}[CLEANUP] Cleaning up temporary files...${NC}"

    # The ${arr[@]+...} form keeps "set -u" from failing on empty arrays in older bash
    for file in ${TEMP_FILES[@]+"${TEMP_FILES[@]}"}; do
        [ -f "$file" ] && rm -f "$file"
    done

    if [ $exit_code -ne 0 ]; then
        echo -e "${RED}[ERROR] Installation failed. Stopping started services...${NC}"
        for service in ${SERVICES_STARTED[@]+"${SERVICES_STARTED[@]}"}; do
            systemctl stop "$service" 2>/dev/null || true
            systemctl disable "$service" 2>/dev/null || true
        done
    fi

    exit $exit_code
}

# With "set -e" a failing command exits the script, so the EXIT trap alone
# covers both success and failure; also trapping ERR would run cleanup twice
trap cleanup EXIT

usage() {
    cat << EOF
Usage: $SCRIPT_NAME [OPTIONS]

Install and configure Telegraf with custom plugin support for application monitoring

OPTIONS:
    -h, --help          Show this help message
    -p, --prometheus    Enable Prometheus output (default: disabled)
    -i, --influxdb      Enable InfluxDB output (default: disabled)
    --influx-url URL    InfluxDB URL (default: http://localhost:8086)
    --influx-db NAME    InfluxDB database name (default: telegraf)

Examples:
    $SCRIPT_NAME --prometheus --influxdb
    $SCRIPT_NAME -p -i --influx-url http://influxdb:8086 --influx-db metrics

EOF
}

log_step() {
    echo -e "${BLUE}[$1] $2${NC}"
}

log_success() {
    echo -e "${GREEN}[SUCCESS] $1${NC}"
}

log_error() {
    echo -e "${RED}[ERROR] $1${NC}"
}

log_warning() {
    echo -e "${YELLOW}[WARNING] $1${NC}"
}

# Parse command line arguments
ENABLE_PROMETHEUS=false
ENABLE_INFLUXDB=false
INFLUX_URL="http://localhost:8086"
INFLUX_DB="telegraf"

while [[ $# -gt 0 ]]; do
    case $1 in
        -h|--help)
            usage
            exit 0
            ;;
        -p|--prometheus)
            ENABLE_PROMETHEUS=true
            shift
            ;;
        -i|--influxdb)
            ENABLE_INFLUXDB=true
            shift
            ;;
        --influx-url)
            INFLUX_URL="$2"
            shift 2
            ;;
        --influx-db)
            INFLUX_DB="$2"
            shift 2
            ;;
        *)
            log_error "Unknown option: $1"
            usage
            exit 1
            ;;
    esac
done

# Check if at least one output is enabled
if [ "$ENABLE_PROMETHEUS" = false ] && [ "$ENABLE_INFLUXDB" = false ]; then
    log_warning "No outputs enabled. Enabling InfluxDB output by default."
    ENABLE_INFLUXDB=true
fi

# Check prerequisites
log_step "1/8" "Checking prerequisites..."

if [[ $EUID -ne 0 ]]; then
    log_error "This script must be run as root or with sudo"
    exit 1
fi

# Detect distribution
if [ -f /etc/os-release ]; then
    . /etc/os-release
    case "$ID" in
        ubuntu|debian)
            PKG_MGR="apt"
            PKG_INSTALL="apt install -y"
            PKG_UPDATE="apt update"
            ;;
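        almalinux|rocky|centos|rhel|ol|fedora)
            PKG_MGR="dnf"
            PKG_INSTALL="dnf install -y"
            # "|| true" stored in a variable is passed to dnf as arguments, not run as shell syntax,
            # so refresh the metadata cache instead
            PKG_UPDATE="dnf makecache"
            ;;
        amzn)
            PKG_MGR="yum"
            PKG_INSTALL="yum install -y"
            PKG_UPDATE="yum makecache"
            ;;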
        *)
            log_error "Unsupported distribution: $ID"
            exit 1
            ;;
    esac
else
    log_error "Cannot detect distribution. /etc/os-release not found."
    exit 1
fi

log_success "Running on $PRETTY_NAME with $PKG_MGR package manager"

# Check for required tools (curl and wget are installed in the next step)
if ! command -v systemctl &> /dev/null; then
    log_error "systemctl is required but not installed"
    exit 1
fi

# Install prerequisites
log_step "2/8" "Installing prerequisites..."

case "$PKG_MGR" in
    apt)
        $PKG_UPDATE
        $PKG_INSTALL curl wget gnupg2 lsb-release
        ;;
    dnf|yum)
        $PKG_UPDATE
        $PKG_INSTALL curl wget gnupg2
        ;;
esac

log_success "Prerequisites installed"

# Add InfluxData repository
log_step "3/8" "Adding InfluxData repository..."

case "$PKG_MGR" in
    apt)
        # Add GPG key
        curl -sL https://repos.influxdata.com/influxdata-archive_compat.key | apt-key add -
        
        # Add repository
        echo "deb https://repos.influxdata.com/ubuntu $(lsb_release -cs) stable" > /etc/apt/sources.list.d/influxdata.list
        
        # Update package list
        $PKG_UPDATE
        ;;
    dnf|yum)
        # Create repository file
        cat > /etc/yum.repos.d/influxdata.repo << 'EOF'
[influxdata]
name = InfluxData Repository - Stable
baseurl = https://repos.influxdata.com/stable/$basearch/main
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdata-archive_compat.key
EOF
        ;;
esac

log_success "InfluxData repository added"

# Install Telegraf
log_step "4/8" "Installing Telegraf..."

$PKG_INSTALL telegraf

log_success "Telegraf installed"

# Create custom plugins directory
log_step "5/8" "Setting up custom plugins directory..."

PLUGINS_DIR="/etc/telegraf/plugins"
mkdir -p "$PLUGINS_DIR"
chown telegraf:telegraf "$PLUGINS_DIR"
chmod 755 "$PLUGINS_DIR"

# Create example custom plugin script
cat > "$PLUGINS_DIR/example_app_metrics.sh" << 'EOF'
#!/bin/bash
# Example custom plugin for application metrics
# This script should output metrics in InfluxDB line protocol format

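# Example: Application response time metric
# curl prints time_total even when the request fails, so fall back to 0 by assignment
RESPONSE_TIME=$(curl -o /dev/null -s -w '%{time_total}' http://localhost:8080/health 2>/dev/null) || RESPONSE_TIME=0
echo "app_metrics,service=example_app response_time=${RESPONSE_TIME}"

# Example: Application memory usage
if command -v pgrep &> /dev/null; then
    # -x matches the exact process name, so this script (example_app_metrics.sh) is not picked up
    APP_PID=$(pgrep -x "example_app" | head -1)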
    if [ -n "$APP_PID" ] && [ -f "/proc/$APP_PID/status" ]; then
        MEMORY_KB=$(grep VmRSS /proc/$APP_PID/status | awk '{print $2}')
        MEMORY_BYTES=$((MEMORY_KB * 1024))
        echo "app_metrics,service=example_app memory_usage=${MEMORY_BYTES}"
    fi
fi
EOF

chmod 755 "$PLUGINS_DIR/example_app_metrics.sh"
chown telegraf:telegraf "$PLUGINS_DIR/example_app_metrics.sh"

log_success "Custom plugins directory created"

# Backup original configuration
log_step "6/8" "Configuring Telegraf..."

cp /etc/telegraf/telegraf.conf /etc/telegraf/telegraf.conf.backup

# Create new Telegraf configuration
cat > /etc/telegraf/telegraf.conf << EOF
# Telegraf Configuration with Custom Plugins

[global_tags]
  environment = "production"

[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  hostname = ""
  omit_hostname = false

###############################################################################
#                            OUTPUT PLUGINS                                   #
###############################################################################

EOF

# Add InfluxDB output if enabled
if [ "$ENABLE_INFLUXDB" = true ]; then
    cat >> /etc/telegraf/telegraf.conf << EOF
# InfluxDB Output Plugin
[[outputs.influxdb]]
  urls = ["${INFLUX_URL}"]
  database = "${INFLUX_DB}"
  timeout = "5s"

EOF
fi

# Add Prometheus output if enabled
if [ "$ENABLE_PROMETHEUS" = true ]; then
    cat >> /etc/telegraf/telegraf.conf << EOF
# Prometheus Output Plugin
[[outputs.prometheus_client]]
  listen = ":9273"
  metric_version = 2

EOF
fi

# Add input plugins configuration
cat >> /etc/telegraf/telegraf.conf << EOF
###############################################################################
#                            INPUT PLUGINS                                    #
###############################################################################

# System metrics
[[inputs.cpu]]
  percpu = true
  totalcpu = true
  collect_cpu_time = false
  report_active = false

[[inputs.disk]]
  ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]

[[inputs.diskio]]

[[inputs.kernel]]

[[inputs.mem]]

[[inputs.processes]]

[[inputs.swap]]

[[inputs.system]]

[[inputs.net]]

# Custom application metrics via exec plugin
[[inputs.exec]]
  commands = [
    "${PLUGINS_DIR}/example_app_metrics.sh"
  ]
  timeout = "5s"
  data_format = "influx"
  interval = "30s"

# HTTP response monitoring
[[inputs.http_response]]
  urls = ["http://localhost:8080/health"]
  method = "GET"
  response_timeout = "5s"
  follow_redirects = false
  [inputs.http_response.tags]
    service = "example_app"
EOF

# Set proper permissions for configuration
chown telegraf:telegraf /etc/telegraf/telegraf.conf
chmod 644 /etc/telegraf/telegraf.conf

log_success "Telegraf configuration created"

# Configure firewall if Prometheus output is enabled
log_step "7/8" "Configuring firewall..."

if [ "$ENABLE_PROMETHEUS" = true ]; then
    if command -v firewall-cmd &> /dev/null; then
        # RHEL-based systems with firewalld
        if systemctl is-active --quiet firewalld; then
            firewall-cmd --permanent --add-port=9273/tcp
            firewall-cmd --reload
            log_success "Firewall configured for Prometheus (port 9273)"
        fi
    elif command -v ufw &> /dev/null; then
        # Ubuntu/Debian with ufw
        if ufw status | grep -q "Status: active"; then
            ufw allow 9273/tcp
            log_success "Firewall configured for Prometheus (port 9273)"
        fi
    fi
else
    log_success "Firewall configuration skipped (Prometheus not enabled)"
fi

# Start and enable Telegraf service
log_step "8/8" "Starting Telegraf service..."

systemctl enable telegraf
systemctl start telegraf
SERVICES_STARTED+=("telegraf")

# Wait a moment for service to initialize
sleep 3

log_success "Telegraf service started and enabled"

# Verification checks
echo -e "\n${BLUE}[VERIFICATION] Running verification checks...${NC}"

# Check service status
if systemctl is-active --quiet telegraf; then
    log_success "Telegraf service is running"
else
    log_error "Telegraf service is not running"
    systemctl status telegraf
    exit 1
fi

# Check configuration syntax
if telegraf --test --config /etc/telegraf/telegraf.conf &>/dev/null; then
    log_success "Telegraf configuration is valid"
else
    log_error "Telegraf configuration has errors"
    telegraf --test --config /etc/telegraf/telegraf.conf
    exit 1
fi

# Check if Prometheus endpoint is accessible (if enabled)
if [ "$ENABLE_PROMETHEUS" = true ]; then
    if curl -sf http://localhost:9273/metrics > /dev/null; then
        log_success "Prometheus metrics endpoint is accessible"
    else
        log_warning "Prometheus metrics endpoint is not accessible yet (may need more time to initialize)"
    fi
fi

# Display final information
echo -e "\n${GREEN}[INSTALLATION COMPLETE]${NC}"
echo "=============================="
echo "Telegraf has been successfully installed and configured!"
echo
echo "Configuration file: /etc/telegraf/telegraf.conf"
echo "Custom plugins directory: $PLUGINS_DIR"
echo "Service status: $(systemctl is-active telegraf)"
echo
if [ "$ENABLE_PROMETHEUS" = true ]; then
    echo "Prometheus metrics: http://$(hostname -I | awk '{print $1}'):9273/metrics"
fi
if [ "$ENABLE_INFLUXDB" = true ]; then
    echo "InfluxDB output: $INFLUX_URL (database: $INFLUX_DB)"
fi
echo
echo "To add custom metrics:"
echo "1. Create executable scripts in $PLUGINS_DIR"
echo "2. Add [[inputs.exec]] sections to telegraf.conf"
echo "3. Restart telegraf: systemctl restart telegraf"
echo
echo "Logs: journalctl -u telegraf -f"
                    
Review the script before running. It must run as root, so execute it with: sudo bash install.sh
                
            
        
    
    
            