Configure Nagios Custom Plugins Development

Learn to develop custom Nagios plugins for specialized monitoring requirements: set up the development environment, write check scripts in Bash and Python, and integrate them into your Nagios Core monitoring infrastructure.

Prerequisites

Nagios Core 4.5 installed
Python 3 and pip3
Basic knowledge of shell scripting
Understanding of monitoring concepts

What this solves

Default Nagios plugins cover basic system monitoring, but production environments often need custom checks for applications, APIs, databases, or business metrics. This guide shows you how to develop, test, and deploy custom Nagios plugins that extend your monitoring capabilities beyond standard system resources.

Step-by-step plugin development setup

Install development dependencies

Set up the essential tools for plugin development including compilers, interpreters, and Nagios plugin utilities.

# Debian/Ubuntu
sudo apt update
sudo apt install -y build-essential python3 python3-pip perl libmonitoring-plugin-perl curl wget git

# RHEL/AlmaLinux/Rocky Linux/Fedora
sudo dnf install -y gcc gcc-c++ make python3 python3-pip perl-Monitoring-Plugin curl wget git

Create plugin development directory

Organize your custom plugins in a dedicated directory structure with proper permissions.

sudo mkdir -p /usr/local/nagios/plugins/custom
sudo mkdir -p /usr/local/nagios/plugins/development
sudo chown -R nagios:nagios /usr/local/nagios/plugins/
sudo chmod 755 /usr/local/nagios/plugins/custom
sudo chmod 775 /usr/local/nagios/plugins/development

Install Python monitoring libraries

Install essential Python libraries for building robust monitoring plugins with proper exit codes and performance data.

sudo pip3 install nagiosplugin requests psutil pymongo redis elasticsearch psycopg2-binary

Download Nagios plugin development utils

Get the official plugin development utilities that provide standard functions and exit codes.

cd /tmp
wget https://www.nagios-plugins.org/download/nagios-plugins-2.4.6.tar.gz
tar -xzf nagios-plugins-2.4.6.tar.gz
cd nagios-plugins-2.4.6
./configure --with-nagios-user=nagios --with-nagios-group=nagios
make
sudo cp plugins-scripts/utils.sh /usr/local/nagios/plugins/
sudo cp plugins/utils.c /usr/local/nagios/plugins/
sudo chmod 644 /usr/local/nagios/plugins/utils.*

Writing custom check plugins

Create a basic shell script plugin

Start with a simple disk usage plugin that demonstrates proper Nagios plugin structure and exit codes. Save it as /usr/local/nagios/plugins/development/check_custom_disk.sh.

#!/bin/bash

# Source the utils for standard functions
. /usr/local/nagios/plugins/utils.sh

# Plugin info
PLUGIN_NAME="Custom Disk Check"
PLUGIN_VERSION="1.0"

# Default values
WARNING_THRESHOLD=80
CRITICAL_THRESHOLD=90
PATH_TO_CHECK="/"

# Function to display help
print_help() {
    echo "$PLUGIN_NAME $PLUGIN_VERSION"
    echo "Usage: $0 -w <warning> -c <critical> -p <path>"
    echo "  -w: Warning threshold (default: 80%)"
    echo "  -c: Critical threshold (default: 90%)"
    echo "  -p: Path to check (default: /)"
    echo "  -h: Show this help"
}

# Parse command line arguments
while getopts "w:c:p:h" opt; do
    case $opt in
        w) WARNING_THRESHOLD=$OPTARG ;;
        c) CRITICAL_THRESHOLD=$OPTARG ;;
        p) PATH_TO_CHECK=$OPTARG ;;
        h) print_help; exit $STATE_OK ;;
        *) print_help; exit $STATE_UNKNOWN ;;
    esac
done

# Validate thresholds
if [ "$WARNING_THRESHOLD" -ge "$CRITICAL_THRESHOLD" ]; then
    echo "UNKNOWN - Warning threshold must be less than critical threshold"
    exit $STATE_UNKNOWN
fi

# Check if path exists
if [ ! -d "$PATH_TO_CHECK" ]; then
    echo "UNKNOWN - Path $PATH_TO_CHECK does not exist"
    exit $STATE_UNKNOWN
fi

# Get disk usage (-P keeps each filesystem on one line, even with long device names)
USAGE=$(df -P "$PATH_TO_CHECK" | tail -1 | awk '{print $5}' | sed 's/%//')

# Check if we got a valid number
if ! [[ "$USAGE" =~ ^[0-9]+$ ]]; then
    echo "UNKNOWN - Could not determine disk usage for $PATH_TO_CHECK"
    exit $STATE_UNKNOWN
fi

# Performance data
PERF_DATA="usage=${USAGE}%;${WARNING_THRESHOLD};${CRITICAL_THRESHOLD};0;100"

# Determine status and exit
if [ $USAGE -ge $CRITICAL_THRESHOLD ]; then
    echo "CRITICAL - Disk usage ${USAGE}% on $PATH_TO_CHECK|$PERF_DATA"
    exit $STATE_CRITICAL
elif [ $USAGE -ge $WARNING_THRESHOLD ]; then
    echo "WARNING - Disk usage ${USAGE}% on $PATH_TO_CHECK|$PERF_DATA"
    exit $STATE_WARNING
else
    echo "OK - Disk usage ${USAGE}% on $PATH_TO_CHECK|$PERF_DATA"
    exit $STATE_OK
fi
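
The threshold comparison above is the core of most numeric checks. As a minimal sketch in Python, the same decision and performance-data formatting might look like the following. The helper names `evaluate`, `format_perfdata`, and `build_output` are illustrative only and do not come from any Nagios library; the sketch assumes a metric where higher values are worse.

```python
# Illustrative sketch: mirrors the shell plugin's threshold comparison.
# evaluate(), format_perfdata(), and build_output() are hypothetical helpers.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_TEXT = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warning, critical):
    """Return the Nagios state for a value where higher is worse."""
    if warning >= critical:
        return UNKNOWN  # misconfigured thresholds are a usage error
    if value >= critical:
        return CRITICAL
    if value >= warning:
        return WARNING
    return OK


def format_perfdata(label, value, uom, warning, critical, minimum, maximum):
    """Build one perfdata item: label=value[UOM];warn;crit;min;max."""
    return f"{label}={value}{uom};{warning};{critical};{minimum};{maximum}"


def build_output(path, usage, warning, critical):
    """Return (exit_code, output_line) in the shape the shell plugin prints."""
    state = evaluate(usage, warning, critical)
    if state == UNKNOWN:
        return state, "UNKNOWN - Warning threshold must be less than critical threshold"
    perf = format_perfdata("usage", usage, "%", warning, critical, 0, 100)
    return state, f"{STATUS_TEXT[state]} - Disk usage {usage}% on {path}|{perf}"
```

Keeping the decision in a pure function like this makes it possible to unit test every threshold branch without touching a real filesystem.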

Create a Python API monitoring plugin

Build a more advanced plugin that monitors API endpoints with JSON response validation and performance metrics. Save it as /usr/local/nagios/plugins/development/check_api_endpoint.py.

#!/usr/bin/env python3

import sys
import argparse
import requests
import time
import json
from urllib.parse import urlparse

# Nagios exit codes
OK = 0
WARNING = 1
CRITICAL = 2
UNKNOWN = 3

def parse_args():
    parser = argparse.ArgumentParser(description='Monitor API endpoint health')
    parser.add_argument('-u', '--url', required=True, help='API endpoint URL')
    parser.add_argument('-t', '--timeout', type=int, default=10, help='Request timeout in seconds')
    parser.add_argument('-w', '--warning', type=float, default=2.0, help='Warning threshold for response time')
    parser.add_argument('-c', '--critical', type=float, default=5.0, help='Critical threshold for response time')
    parser.add_argument('-k', '--key', help='JSON key to validate in response')
    parser.add_argument('-v', '--value', help='Expected value for the JSON key')
    parser.add_argument('-H', '--header', action='append', help='Custom headers (format: "Key: Value")')
    parser.add_argument('-s', '--status', type=int, default=200, help='Expected HTTP status code')
    return parser.parse_args()

def make_request(url, timeout, headers=None):
    """Make HTTP request and return response with timing"""
    custom_headers = {}
    if headers:
        for header in headers:
            key, value = header.split(':', 1)
            custom_headers[key.strip()] = value.strip()
    
    start_time = time.time()
    try:
        response = requests.get(url, timeout=timeout, headers=custom_headers)
        response_time = time.time() - start_time
        return response, response_time
    except requests.exceptions.Timeout:
        return None, timeout
    except requests.exceptions.RequestException as e:
        print(f"CRITICAL - Request failed: {str(e)}")
        sys.exit(CRITICAL)

def validate_json_content(response, key, expected_value):
    """Validate JSON response content"""
    try:
        data = response.json()
        if key in data:
            actual_value = data[key]
            if str(actual_value) == str(expected_value):
                return True, f"JSON validation passed: {key}={actual_value}"
            else:
                return False, f"JSON validation failed: {key}={actual_value}, expected={expected_value}"
        else:
            return False, f"JSON key '{key}' not found in response"
    except ValueError:  # covers json.JSONDecodeError and the JSON errors raised by requests
        return False, "Response is not valid JSON"

def main():
    args = parse_args()
    
    # Validate URL
    parsed_url = urlparse(args.url)
    if not parsed_url.scheme or not parsed_url.netloc:
        print("UNKNOWN - Invalid URL format")
        sys.exit(UNKNOWN)
    
    # Validate thresholds
    if args.warning >= args.critical:
        print("UNKNOWN - Warning threshold must be less than critical threshold")
        sys.exit(UNKNOWN)
    
    # Make request
    response, response_time = make_request(args.url, args.timeout, args.header)
    
    if response is None:
        print(f"CRITICAL - Request timed out after {args.timeout} seconds")
        sys.exit(CRITICAL)
    
    # Check HTTP status
    if response.status_code != args.status:
        print(f"CRITICAL - HTTP {response.status_code}, expected {args.status}")
        sys.exit(CRITICAL)
    
    # Validate JSON content if specified
    json_message = ""
    if args.key and args.value:
        is_valid, message = validate_json_content(response, args.key, args.value)
        if not is_valid:
            print(f"CRITICAL - {message}")
            sys.exit(CRITICAL)
        json_message = f", {message}"
    
    # Performance data
    perf_data = f"response_time={response_time:.3f}s;{args.warning};{args.critical};0"
    
    # Determine status based on response time
    if response_time >= args.critical:
        print(f"CRITICAL - Response time {response_time:.3f}s{json_message}|{perf_data}")
        sys.exit(CRITICAL)
    elif response_time >= args.warning:
        print(f"WARNING - Response time {response_time:.3f}s{json_message}|{perf_data}")
        sys.exit(WARNING)
    else:
        print(f"OK - Response time {response_time:.3f}s{json_message}|{perf_data}")
        sys.exit(OK)

if __name__ == "__main__":
    main()
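
The plugin above only validates top-level JSON keys. If the value you care about is nested, one possible extension is a dotted-path lookup. The sketch below is an assumption about how that could be done; `lookup_path` is a hypothetical helper, not part of the plugin as written, and the dotted syntax is an invented convention.

```python
# Illustrative sketch: resolve a dotted path such as "slideshow.title"
# inside a decoded JSON document. lookup_path() is a hypothetical helper.
def lookup_path(data, dotted_key):
    """Return (found, value) for a dotted key path in nested dicts and lists."""
    current = data
    for part in dotted_key.split("."):
        if isinstance(current, dict) and part in current:
            current = current[part]
        elif isinstance(current, list) and part.isdigit() and int(part) < len(current):
            # Numeric path segments index into lists, e.g. "slides.0.type"
            current = current[int(part)]
        else:
            return False, None
    return True, current
```

With a helper like this, `validate_json_content` could call `lookup_path(response.json(), key)` instead of testing `key in data`, so that `-k slideshow.title` reaches one level down.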

Create a database connection plugin

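Build a plugin that monitors database connectivity and query performance for PostgreSQL. Save it as /usr/local/nagios/plugins/development/check_postgres_query.py; it requires the psycopg2 driver.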

#!/usr/bin/env python3

import sys
import argparse
import time
try:
    import psycopg2
except ImportError:
    print("UNKNOWN - psycopg2 library not installed. Run: pip3 install psycopg2-binary")
    sys.exit(3)

# Nagios exit codes
OK = 0
WARNING = 1
CRITICAL = 2
UNKNOWN = 3

def parse_args():
    parser = argparse.ArgumentParser(description='Monitor PostgreSQL query performance')
    parser.add_argument('-H', '--host', default='localhost', help='Database host')
    parser.add_argument('-P', '--port', type=int, default=5432, help='Database port')
    parser.add_argument('-d', '--database', required=True, help='Database name')
    parser.add_argument('-u', '--username', required=True, help='Database username')
    parser.add_argument('-p', '--password', required=True, help='Database password')
    parser.add_argument('-q', '--query', default='SELECT 1', help='SQL query to execute')
    parser.add_argument('-w', '--warning', type=float, default=1.0, help='Warning threshold in seconds')
    parser.add_argument('-c', '--critical', type=float, default=3.0, help='Critical threshold in seconds')
    parser.add_argument('-t', '--timeout', type=int, default=10, help='Connection timeout')
    return parser.parse_args()

def execute_query(host, port, database, username, password, query, timeout):
    """Execute database query and return timing"""
    try:
        start_time = time.time()
        conn = psycopg2.connect(
            host=host,
            port=port,
            database=database,
            user=username,
            password=password,
            connect_timeout=timeout
        )
        
        cursor = conn.cursor()
        cursor.execute(query)
        result = cursor.fetchall()
        
        query_time = time.time() - start_time
        
        cursor.close()
        conn.close()
        
        return True, query_time, len(result)
        
    except psycopg2.OperationalError as e:
        return False, 0, f"Connection error: {str(e)}"
    except psycopg2.Error as e:
        return False, 0, f"Database error: {str(e)}"
    except Exception as e:
        return False, 0, f"Unexpected error: {str(e)}"

def main():
    args = parse_args()
    
    # Validate thresholds
    if args.warning >= args.critical:
        print("UNKNOWN - Warning threshold must be less than critical threshold")
        sys.exit(UNKNOWN)
    
    # Execute query
    success, query_time, result_info = execute_query(
        args.host, args.port, args.database, 
        args.username, args.password, args.query, args.timeout
    )
    
    if not success:
        print(f"CRITICAL - {result_info}")
        sys.exit(CRITICAL)
    
    # Performance data
    perf_data = f"query_time={query_time:.3f}s;{args.warning};{args.critical};0"
    
    # Determine status
    if query_time >= args.critical:
        print(f"CRITICAL - Query took {query_time:.3f}s, returned {result_info} rows|{perf_data}")
        sys.exit(CRITICAL)
    elif query_time >= args.warning:
        print(f"WARNING - Query took {query_time:.3f}s, returned {result_info} rows|{perf_data}")
        sys.exit(WARNING)
    else:
        print(f"OK - Query took {query_time:.3f}s, returned {result_info} rows|{perf_data}")
        sys.exit(OK)

if __name__ == "__main__":
    main()
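
Passing the password with `-p` exposes it to any local user who can list processes, and it also ends up in the Nagios object configuration. One possible mitigation, sketched here under the assumption that `--password` is made optional, is to fall back to the `PGPASSWORD` environment variable, which libpq also honours on its own. `resolve_password` is a hypothetical helper name.

```python
import os


def resolve_password(cli_password=None, environ=None):
    """Pick the database password: explicit CLI value first, then PGPASSWORD.

    Returns None when neither is set; in that case libpq can still fall back
    to a ~/.pgpass file belonging to the user running the check.
    """
    env = os.environ if environ is None else environ
    if cli_password:
        return cli_password
    return env.get("PGPASSWORD") or None
```

When the result is `None`, the `password` argument can be passed to `psycopg2.connect` as `None` so that libpq's own lookup applies.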

Set executable permissions on plugins

Make your custom plugins executable by the Nagios user with proper security permissions.

sudo chmod 755 /usr/local/nagios/plugins/development/check_custom_disk.sh
sudo chmod 755 /usr/local/nagios/plugins/development/check_api_endpoint.py
sudo chmod 755 /usr/local/nagios/plugins/development/check_postgres_query.py
sudo chown nagios:nagios /usr/local/nagios/plugins/development/*

Plugin testing and validation

Test plugins manually

Run each plugin from the command line to verify functionality and output format before integration.

# Test disk check plugin
sudo -u nagios /usr/local/nagios/plugins/development/check_custom_disk.sh -w 80 -c 90 -p /

# Test API endpoint plugin
sudo -u nagios /usr/local/nagios/plugins/development/check_api_endpoint.py -u https://httpbin.org/json -w 2 -c 5

# Test with JSON validation (the /get endpoint echoes the request URL in a top-level "url" key)
sudo -u nagios /usr/local/nagios/plugins/development/check_api_endpoint.py -u https://httpbin.org/get -k url -v https://httpbin.org/get -w 2 -c 5

Validate plugin output format

Ensure plugins follow Nagios standards for output format, exit codes, and performance data.

# Check exit codes
echo $?  # Should be 0, 1, 2, or 3

# Validate performance data format
# Should be: label=value[UOM];[warn];[crit];[min];[max]

# Test all threshold conditions
sudo -u nagios /usr/local/nagios/plugins/development/check_custom_disk.sh -w 10 -c 20 -p /
sudo -u nagios /usr/local/nagios/plugins/development/check_api_endpoint.py -u https://httpbin.org/delay/3 -w 1 -c 2
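
The performance data format can also be checked mechanically. The following is a simplified sketch, not a full implementation of the plugin guidelines: it accepts plain numeric thresholds only (range syntax such as `10:20` is not handled), and the unit list is an assumption covering commonly seen units. `split_output` and `is_valid_perfdata` are hypothetical helper names.

```python
import re

# Simplified pattern: numeric thresholds only. Range syntax such as "10:20"
# or "@5:10" is allowed by the guidelines but not handled in this sketch.
_NUM = r"-?\d+(?:\.\d+)?"
PERFDATA_RE = re.compile(
    rf"^(?P<label>'[^'=]+'|[^\s'=]+)="
    rf"(?P<value>{_NUM}|U)"
    rf"(?P<uom>us|ms|s|%|B|KB|MB|GB|TB|c)?"
    rf"(?:;(?P<warn>{_NUM})?)?"
    rf"(?:;(?P<crit>{_NUM})?)?"
    rf"(?:;(?P<min>{_NUM})?)?"
    rf"(?:;(?P<max>{_NUM})?)?$"
)


def split_output(line):
    """Split a plugin output line into (status_text, perfdata) at the first pipe."""
    text, _, perf = line.partition("|")
    return text.strip(), perf.strip()


def is_valid_perfdata(item):
    """Return True if a single perfdata item matches the simplified pattern."""
    return PERFDATA_RE.match(item) is not None
```

A test script could capture a plugin's stdout, pass it through `split_output`, and run `is_valid_perfdata` on each whitespace-separated item.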

Create plugin validation script

Build an automated test script to verify plugin behavior across different scenarios. Save it as /usr/local/nagios/plugins/development/validate_plugins.sh.

#!/bin/bash

PLUGIN_DIR="/usr/local/nagios/plugins/development"
TEST_RESULTS="/tmp/plugin_tests.log"

echo "Nagios Plugin Validation Results" > $TEST_RESULTS
echo "Generated: $(date)" >> $TEST_RESULTS
echo "========================================" >> $TEST_RESULTS

test_plugin() {
    local plugin="$1"
    local args="$2"
    local expected_exit="$3"
    local test_name="$4"
    
    echo "Testing: $test_name" >> $TEST_RESULTS
    echo "Command: $plugin $args" >> $TEST_RESULTS
    
    output=$(sudo -u nagios $plugin $args 2>&1)
    exit_code=$?
    
    echo "Output: $output" >> $TEST_RESULTS
    echo "Exit Code: $exit_code" >> $TEST_RESULTS
    
    if [ "$exit_code" = "$expected_exit" ]; then
        echo "Result: PASS" >> $TEST_RESULTS
        echo "✓ $test_name"
    else
        echo "Result: FAIL (expected $expected_exit, got $exit_code)" >> $TEST_RESULTS
        echo "✗ $test_name"
    fi
    echo "" >> $TEST_RESULTS
}

# Test disk plugin
test_plugin "$PLUGIN_DIR/check_custom_disk.sh" "-w 80 -c 90 -p /" "0" "Disk Check - Normal"
# -w 0 -c 101 forces WARNING regardless of the actual disk usage
test_plugin "$PLUGIN_DIR/check_custom_disk.sh" "-w 0 -c 101 -p /" "1" "Disk Check - Warning"
test_plugin "$PLUGIN_DIR/check_custom_disk.sh" "-w 90 -c 80 -p /" "3" "Disk Check - Invalid Thresholds"

# Test API plugin
test_plugin "$PLUGIN_DIR/check_api_endpoint.py" "-u https://httpbin.org/status/200 -w 2 -c 5" "0" "API Check - Normal"
test_plugin "$PLUGIN_DIR/check_api_endpoint.py" "-u https://httpbin.org/status/404 -w 2 -c 5" "2" "API Check - 404 Error"
test_plugin "$PLUGIN_DIR/check_api_endpoint.py" "-u https://httpbin.org/delay/3 -w 1 -c 2" "2" "API Check - Slow Response"

echo "Validation complete. Results saved to $TEST_RESULTS"
echo "Review with: cat $TEST_RESULTS"

sudo chmod 755 /usr/local/nagios/plugins/development/validate_plugins.sh
sudo /usr/local/nagios/plugins/development/validate_plugins.sh

Integration with Nagios Core

Deploy tested plugins to production directory

Move validated plugins to the production plugin directory and set final permissions.

sudo cp /usr/local/nagios/plugins/development/check_custom_disk.sh /usr/local/nagios/plugins/custom/
sudo cp /usr/local/nagios/plugins/development/check_api_endpoint.py /usr/local/nagios/plugins/custom/
sudo cp /usr/local/nagios/plugins/development/check_postgres_query.py /usr/local/nagios/plugins/custom/
sudo chown nagios:nagios /usr/local/nagios/plugins/custom/*
sudo chmod 755 /usr/local/nagios/plugins/custom/*

Define command definitions

Create Nagios command definitions for your custom plugins in the commands configuration file, typically /usr/local/nagios/etc/objects/commands.cfg.

# Custom Disk Check Command
define command {
    command_name    check_custom_disk
    command_line    /usr/local/nagios/plugins/custom/check_custom_disk.sh -w $ARG1$ -c $ARG2$ -p $ARG3$
}

# API Endpoint Check Command
define command {
    command_name    check_api_endpoint
    command_line    /usr/local/nagios/plugins/custom/check_api_endpoint.py -u $ARG1$ -w $ARG2$ -c $ARG3$ -s $ARG4$
}

# API with JSON validation
define command {
    command_name    check_api_json
    command_line    /usr/local/nagios/plugins/custom/check_api_endpoint.py -u $ARG1$ -w $ARG2$ -c $ARG3$ -k $ARG4$ -v $ARG5$
}

# PostgreSQL Query Check Command
define command {
    command_name    check_postgres_query
    command_line    /usr/local/nagios/plugins/custom/check_postgres_query.py -H $ARG1$ -d $ARG2$ -u $ARG3$ -p $ARG4$ -q "$ARG5$" -w $ARG6$ -c $ARG7$
}

Configure service checks

Create service definitions that use your custom plugins for monitoring specific resources.

# Custom disk monitoring
define service {
    use                 generic-service
    host_name           localhost
    service_description Custom Disk Usage /var
    check_command       check_custom_disk!80!90!/var
    check_interval      5
    retry_interval      1
}

# API endpoint monitoring
define service {
    use                 generic-service
    host_name           localhost
    service_description API Health Check
    check_command       check_api_endpoint!https://api.example.com/health!2!5!200
    check_interval      2
    retry_interval      1
}

# Database query monitoring
define service {
    use                 generic-service
    host_name           localhost
    service_description Database Query Performance
    check_command       check_postgres_query!localhost!myapp!monitor_user!secure_password!SELECT COUNT(*) FROM users!1!3
    check_interval      5
    retry_interval      1
}
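
In the service definitions above, the `!`-separated fields of `check_command` are mapped onto the `$ARG1$`, `$ARG2$`, ... macros of the matching command definition. The sketch below imitates that substitution so you can preview the final command line; it is a simplified illustration, not how Nagios Core is implemented, and `expand_command` is a hypothetical helper. It handles only `$ARGn$` macros and no escaping.

```python
import re


def expand_command(command_line, check_command):
    """Substitute $ARGn$ macros from the "!" fields of a check_command value.

    Returns (command_name, expanded_command_line). Only $ARGn$ is handled;
    host, service, and $USERn$ macros are left untouched in this sketch.
    """
    name, *args = check_command.split("!")
    expanded = command_line
    for index, value in enumerate(args, start=1):
        expanded = expanded.replace(f"$ARG{index}$", value)
    # Any $ARGn$ without a matching field becomes an empty string
    expanded = re.sub(r"\$ARG\d+\$", "", expanded)
    return name, expanded
```

Previewing the expansion this way helps catch arguments that need shell quoting, such as a URL or an SQL query containing spaces.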

Validate Nagios configuration

Test the configuration to ensure your custom plugins integrate properly with Nagios Core.

sudo /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

# If validation passes, restart Nagios
sudo systemctl restart nagios
sudo systemctl status nagios

Monitor plugin execution in logs

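Watch the Nagios log to verify that your custom checks run and produce the expected results. The log records alerts by service description rather than by command name, so filter on the service descriptions you defined.

sudo tail -f /usr/local/nagios/var/nagios.log | grep -E "Custom Disk Usage|API Health Check|Database Query Performance"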
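
For anything beyond eyeballing the log, the alert lines can be parsed. The sketch below assumes the usual `SERVICE ALERT` layout of `[epoch] SERVICE ALERT: host;service;state;state_type;attempt;output`; the sample line in the usage note is invented for illustration, and `parse_service_alert` is a hypothetical helper.

```python
import re
from datetime import datetime, timezone

ALERT_RE = re.compile(
    r"^\[(?P<ts>\d+)\] SERVICE ALERT: "
    r"(?P<host>[^;]+);(?P<service>[^;]+);(?P<state>[^;]+);"
    r"(?P<state_type>[^;]+);(?P<attempt>\d+);(?P<output>.*)$"
)


def parse_service_alert(line):
    """Parse one nagios.log SERVICE ALERT line into a dict, or return None."""
    match = ALERT_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    # Log timestamps are Unix epoch seconds
    entry["time"] = datetime.fromtimestamp(int(entry.pop("ts")), tz=timezone.utc)
    entry["attempt"] = int(entry["attempt"])
    return entry
```

For example, `parse_service_alert("[1700000000] SERVICE ALERT: localhost;Custom Disk Usage /var;WARNING;SOFT;1;WARNING - Disk usage 85% on /var")` yields a dict whose `service` field is `Custom Disk Usage /var`.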

Advanced plugin development patterns

Create a plugin template

Build a reusable template that includes all best practices for consistent plugin development.

#!/usr/bin/env python3
"""
Nagios Plugin Template

This template provides a standardized structure for developing Nagios plugins
with proper argument parsing, logging, and error handling.
"""

import sys
import argparse
import logging
import time
from pathlib import Path

# Nagios exit codes
OK = 0
WARNING = 1
CRITICAL = 2
UNKNOWN = 3

class NagiosPlugin:
    def __init__(self, plugin_name, version="1.0"):
        self.plugin_name = plugin_name
        self.version = version
        self.args = None
        self.logger = self._setup_logging()
        
    def _setup_logging(self):
        """Setup logging for debugging (optional)"""
        logger = logging.getLogger(self.plugin_name)
        # Only log to file in development
        if Path('/tmp/nagios_debug').exists():
            handler = logging.FileHandler(f'/tmp/{self.plugin_name}.log')
            formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
            handler.setFormatter(formatter)
            logger.addHandler(handler)
            logger.setLevel(logging.DEBUG)
        return logger
        
    def add_arguments(self, parser):
        """Override this method to add plugin-specific arguments"""
        parser.add_argument('-w', '--warning', type=float, default=80.0,
                          help='Warning threshold')
        parser.add_argument('-c', '--critical', type=float, default=90.0,
                          help='Critical threshold')
        
    def parse_args(self):
        """Parse command line arguments"""
        parser = argparse.ArgumentParser(
            description=f'{self.plugin_name} v{self.version}',
            formatter_class=argparse.RawDescriptionHelpFormatter
        )
        
        # Add standard arguments
        parser.add_argument('-V', '--version', action='version',
                          version=f'{self.plugin_name} {self.version}')
        parser.add_argument('-t', '--timeout', type=int, default=10,
                          help='Plugin timeout in seconds')
        parser.add_argument('-v', '--verbose', action='store_true',
                          help='Verbose output')
        
        # Add plugin-specific arguments
        self.add_arguments(parser)
        
        self.args = parser.parse_args()
        
        # Validate thresholds
        if hasattr(self.args, 'warning') and hasattr(self.args, 'critical'):
            if self.args.warning >= self.args.critical:
                self.nagios_exit(UNKNOWN, "Warning threshold must be less than critical")
                
        return self.args
        
    def check_health(self):
        """Override this method with your check logic"""
        # Example implementation
        import random
        value = random.uniform(0, 100)
        
        perf_data = f"value={value:.2f}%;{self.args.warning};{self.args.critical};0;100"
        
        if value >= self.args.critical:
            return CRITICAL, f"Value is {value:.2f}%", perf_data
        elif value >= self.args.warning:
            return WARNING, f"Value is {value:.2f}%", perf_data
        else:
            return OK, f"Value is {value:.2f}%", perf_data
            
    def nagios_exit(self, status, message, perf_data=None):
        """Exit with proper Nagios format"""
        status_text = {OK: 'OK', WARNING: 'WARNING', CRITICAL: 'CRITICAL', UNKNOWN: 'UNKNOWN'}
        
        output = f"{status_text[status]} - {message}"
        if perf_data:
            output += f"|{perf_data}"
            
        print(output)
        sys.exit(status)
        
    def run(self):
        """Main execution method"""
        try:
            self.parse_args()
            self.logger.debug(f"Starting {self.plugin_name} with args: {vars(self.args)}")
            
            status, message, perf_data = self.check_health()
            self.nagios_exit(status, message, perf_data)
            
        except KeyboardInterrupt:
            self.nagios_exit(UNKNOWN, "Plugin interrupted")
        except Exception as e:
            self.logger.error(f"Unexpected error: {str(e)}")
            self.nagios_exit(UNKNOWN, f"Plugin error: {str(e)}")

# Example usage:
class CustomHealthCheck(NagiosPlugin):
    def __init__(self):
        super().__init__("Custom Health Check", "1.0")
        
    def add_arguments(self, parser):
        super().add_arguments(parser)
        parser.add_argument('-u', '--url', required=True,
                          help='URL to check')
                          
    def check_health(self):
        # Your custom check logic here
        return OK, "Custom check passed", "response_time=0.123s;2;5;0"

if __name__ == "__main__":
    plugin = CustomHealthCheck()
    plugin.run()
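
The template parses a `--timeout` option but never enforces it. One way to do so on Unix-like systems is an alarm signal, sketched below. This is an assumption about how the template could be extended, not part of it; `CheckTimeout` and `run_with_timeout` are hypothetical names, and the approach only works in the main thread of the process.

```python
import signal


class CheckTimeout(Exception):
    """Raised when the check runs longer than the allowed number of seconds."""


def run_with_timeout(check, seconds):
    """Run check() and raise CheckTimeout if it exceeds `seconds` (Unix only)."""

    def _on_alarm(signum, frame):
        raise CheckTimeout(f"Plugin timed out after {seconds} seconds")

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)  # whole seconds only
    try:
        return check()
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler
```

Inside `run()`, the call to `self.check_health()` could be wrapped as `run_with_timeout(self.check_health, self.args.timeout)`, with `CheckTimeout` mapped to an UNKNOWN exit. Nagios Core also applies its own `service_check_timeout`, so the plugin value should stay below it.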

Create plugin configuration management

Build a system for managing plugin configurations and deployment across multiple Nagios instances.

#!/usr/bin/env python3
"""
Nagios Plugin Manager
Manages deployment and configuration of custom plugins
"""

import os
import sys
import shutil
import json
import subprocess
from pathlib import Path
import argparse

class PluginManager:
    def __init__(self):
        self.dev_dir = Path('/usr/local/nagios/plugins/development')
        self.prod_dir = Path('/usr/local/nagios/plugins/custom')
        self.config_file = self.dev_dir / 'plugin_config.json'
        
    def load_config(self):
        """Load plugin configuration"""
        if self.config_file.exists():
            with open(self.config_file) as f:
                return json.load(f)
        return {}
        
    def save_config(self, config):
        """Save plugin configuration"""
        with open(self.config_file, 'w') as f:
            json.dump(config, f, indent=2)
            
    def validate_plugin(self, plugin_path):
        """Validate plugin before deployment"""
        if not plugin_path.exists():
            return False, f"Plugin {plugin_path} does not exist"
            
        # Check if executable
        if not os.access(plugin_path, os.X_OK):
            return False, f"Plugin {plugin_path} is not executable"
            
        # Try to run with --help
        try:
            result = subprocess.run([str(plugin_path), '--help'], 
                                  capture_output=True, text=True, timeout=5)
            if result.returncode not in [0, 1, 2, 3]:
                return False, f"Plugin returned invalid exit code: {result.returncode}"
        except subprocess.TimeoutExpired:
            return False, "Plugin help command timed out"
        except Exception as e:
            return False, f"Error testing plugin: {str(e)}"
            
        return True, "Plugin validation passed"
        
    def deploy_plugin(self, plugin_name):
        """Deploy plugin from development to production"""
        dev_path = self.dev_dir / plugin_name
        prod_path = self.prod_dir / plugin_name
        
        # Validate plugin
        is_valid, message = self.validate_plugin(dev_path)
        if not is_valid:
            print(f"✗ Validation failed: {message}")
            return False
            
        # Copy to production
        try:
            shutil.copy2(dev_path, prod_path)
            # Set the mode while we still own the copy, then hand it over to nagios
            os.chmod(prod_path, 0o755)
            subprocess.run(['sudo', 'chown', 'nagios:nagios', str(prod_path)], check=True)
            print(f"✓ Deployed {plugin_name} to production")
            return True
        except Exception as e:
            print(f"✗ Deployment failed: {str(e)}")
            return False
            
    def list_plugins(self):
        """List available plugins"""
        print("Development Plugins:")
        for plugin in self.dev_dir.glob('check_*'):
            if plugin.is_file() and os.access(plugin, os.X_OK):
                print(f"  📝 {plugin.name}")
                
        print("\nProduction Plugins:")
        for plugin in self.prod_dir.glob('check_*'):
            if plugin.is_file():
                print(f"  ✅ {plugin.name}")
                
    def test_plugin(self, plugin_name, args='--help'):
        """Test plugin execution"""
        plugin_path = self.dev_dir / plugin_name
        if not plugin_path.exists():
            plugin_path = self.prod_dir / plugin_name
            
        if not plugin_path.exists():
            print(f"✗ Plugin {plugin_name} not found")
            return False
            
        try:
            cmd = ['sudo', '-u', 'nagios', str(plugin_path)] + args.split()
            result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
            
            print(f"Exit Code: {result.returncode}")
            print(f"Output: {result.stdout}")
            if result.stderr:
                print(f"Error: {result.stderr}")
                
            return True
        except Exception as e:
            print(f"✗ Test failed: {str(e)}")
            return False

def main():
    parser = argparse.ArgumentParser(description='Nagios Plugin Manager')
    parser.add_argument('action', choices=['list', 'deploy', 'test', 'validate'])
    parser.add_argument('plugin', nargs='?', help='Plugin name')
    parser.add_argument('--args', default='--help', help='Arguments for test')
    
    args = parser.parse_args()
    manager = PluginManager()
    
    if args.action == 'list':
        manager.list_plugins()
    elif args.action == 'deploy':
        if not args.plugin:
            print("Plugin name required for deploy")
            sys.exit(1)
        manager.deploy_plugin(args.plugin)
    elif args.action == 'test':
        if not args.plugin:
            print("Plugin name required for test")
            sys.exit(1)
        manager.test_plugin(args.plugin, args.args)
    elif args.action == 'validate':
        if not args.plugin:
            print("Plugin name required for validate")
            sys.exit(1)
        dev_path = manager.dev_dir / args.plugin
        is_valid, message = manager.validate_plugin(dev_path)
        print(f"Validation: {'✓' if is_valid else '✗'} {message}")

if __name__ == '__main__':
    main()

sudo chmod 755 /usr/local/nagios/plugins/development/plugin_manager.py
python3 /usr/local/nagios/plugins/development/plugin_manager.py list
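
The manager's `test_plugin` method splits the `--args` string with `str.split()`, which breaks quoted arguments such as an SQL query. A small sketch of an alternative using the standard library's `shlex.split` follows; `build_test_command` is a hypothetical helper, not part of the script above.

```python
import shlex


def build_test_command(plugin_path, arg_string, run_as="nagios"):
    """Build the argv list for running a plugin as another user via sudo."""
    # shlex.split honours shell-style quoting, so "SELECT 1" stays one argument
    return ["sudo", "-u", run_as, str(plugin_path)] + shlex.split(arg_string)
```

With plain `str.split()`, the string `-q "SELECT 1"` becomes three tokens (`-q`, `"SELECT`, `1"`), whereas `shlex.split` keeps the query as a single argument. Note that argparse needs the `--args="..."` form when the value itself starts with a dash.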

Verify your setup

# Check plugin directory structure
ls -la /usr/local/nagios/plugins/custom/
ls -la /usr/local/nagios/plugins/development/

# Test plugin execution as nagios user
sudo -u nagios /usr/local/nagios/plugins/custom/check_custom_disk.sh -w 80 -c 90 -p /

# Verify Nagios configuration includes custom commands
sudo /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

# Check plugin execution in Nagios web interface
sudo systemctl status nagios
curl -u nagiosadmin:password http://localhost/nagios/

Common issues

Symptom	Cause	Fix
Plugin returns "Permission denied"	Wrong file permissions or ownership	`sudo chmod 755 plugin_file && sudo chown nagios:nagios plugin_file`
"Command not found" in Nagios	Plugin path incorrect in command definition	Verify path in `/usr/local/nagios/etc/objects/commands.cfg`
Plugin works manually but fails in Nagios	Environment variables missing for nagios user	Add `export PATH=/usr/bin:/bin` at top of shell scripts
"No output returned from plugin"	Plugin doesn't print to stdout or exits without message	Ensure plugin always prints status message before exit
Performance data not appearing in graphs	Incorrect performance data format	Follow format: `label=value[UOM];[warn];[crit];[min];[max]`
Python modules not found	Modules installed for different user than nagios	`sudo -u nagios pip3 install module_name`

Next steps

Set up Nagios distributed monitoring with NRPE to use custom plugins on remote hosts
Configure Nagios SSL certificates and security hardening to secure your monitoring infrastructure
Configure Grafana custom plugins for advanced visualization of specialized monitoring data
Implement Nagios plugin performance optimization and caching for high-frequency checks
Configure Nagios plugin automated testing with CI/CD integration for production-ready workflows

