Develop custom Nagios plugins for specialized monitoring requirements

Intermediate 45 min May 08, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Learn to develop custom Nagios plugins for specialized monitoring requirements including setting up the development environment, writing check scripts in multiple languages, and integrating them into your Nagios Core monitoring infrastructure.

Prerequisites

  • Nagios Core 4.5 installed
  • Python 3 and pip3
  • Basic knowledge of shell scripting
  • Understanding of monitoring concepts

What this solves

Default Nagios plugins cover basic system monitoring, but production environments often need custom checks for applications, APIs, databases, or business metrics. This guide shows you how to develop, test, and deploy custom Nagios plugins that extend your monitoring capabilities beyond standard system resources.

Step-by-step plugin development setup

Install development dependencies

Set up the essential tools for plugin development including compilers, interpreters, and Nagios plugin utilities.

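# Ubuntu 24.04 / Debian 12
sudo apt update
sudo apt install -y build-essential python3 python3-pip perl libmonitoring-plugin-perl curl wget git

# AlmaLinux 9 / Rocky Linux 9 (perl-Monitoring-Plugin is typically provided by the EPEL repository)
sudo dnf install -y gcc gcc-c++ make python3 python3-pip perl-Monitoring-Plugin curl wget git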

Create plugin development directory

Organize your custom plugins in a dedicated directory structure with proper permissions.

sudo mkdir -p /usr/local/nagios/plugins/custom
sudo mkdir -p /usr/local/nagios/plugins/development
sudo chown -R nagios:nagios /usr/local/nagios/plugins/
sudo chmod 755 /usr/local/nagios/plugins/custom
sudo chmod 775 /usr/local/nagios/plugins/development

Install Python monitoring libraries

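Install the Python libraries used by the plugins in this guide. nagiosplugin provides helpers for exit codes and performance data, requests handles HTTP checks, and psycopg2-binary is required by the PostgreSQL plugin later on. On Ubuntu 24.04 and Debian 12, pip may refuse a system-wide install with an "externally-managed-environment" error; in that case use the distribution packages (for example python3-requests and python3-psycopg2) or a virtual environment.

sudo pip3 install nagiosplugin requests psutil pymongo redis elasticsearch psycopg2-binary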

Download Nagios plugin development utils

Get the official plugin development utilities that provide standard functions and exit codes.

cd /tmp
wget https://www.nagios-plugins.org/download/nagios-plugins-2.4.6.tar.gz
tar -xzf nagios-plugins-2.4.6.tar.gz
cd nagios-plugins-2.4.6
./configure --with-nagios-user=nagios --with-nagios-group=nagios
make
sudo cp plugins-scripts/utils.sh /usr/local/nagios/plugins/
sudo cp plugins/utils.c /usr/local/nagios/plugins/
sudo chmod 644 /usr/local/nagios/plugins/utils.*
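
For plugins written in Python, the exit codes that utils.sh exposes as STATE_OK through STATE_UNKNOWN can be mirrored in a small enum. The sketch below is only illustrative: the names `State` and `format_status` are assumptions of this example and are not part of the nagios-plugins distribution.

```python
from enum import IntEnum


class State(IntEnum):
    """Exit codes Nagios expects from a plugin (same values as STATE_* in utils.sh)."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def format_status(state: State, message: str) -> str:
    """Build the first output line in the 'STATE - message' style used in this guide."""
    return f"{state.name} - {message}"


if __name__ == "__main__":
    print(format_status(State.WARNING, "Disk usage 85% on /"))
```

Because `State` is an `IntEnum`, a member can be passed directly to `sys.exit()`.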

Writing custom check plugins

Create a basic shell script plugin

Start with a simple disk usage plugin that demonstrates the standard Nagios plugin structure and exit codes. Save it as /usr/local/nagios/plugins/development/check_custom_disk.sh.

#!/bin/bash

# Source the utils for the standard STATE_* exit codes
. /usr/local/nagios/plugins/utils.sh

# Plugin info
PLUGIN_NAME="Custom Disk Check"
PLUGIN_VERSION="1.0"

# Default values
WARNING_THRESHOLD=80
CRITICAL_THRESHOLD=90
PATH_TO_CHECK="/"

# Function to display help
print_help() {
    echo "$PLUGIN_NAME $PLUGIN_VERSION"
    echo "Usage: $0 -w <warning> -c <critical> -p <path>"
    echo "  -w: Warning threshold (default: 80%)"
    echo "  -c: Critical threshold (default: 90%)"
    echo "  -p: Path to check (default: /)"
    echo "  -h: Show this help"
}

# Parse command line arguments
while getopts "w:c:p:h" opt; do
    case $opt in
        w) WARNING_THRESHOLD=$OPTARG ;;
        c) CRITICAL_THRESHOLD=$OPTARG ;;
        p) PATH_TO_CHECK=$OPTARG ;;
        h) print_help; exit $STATE_OK ;;
        *) print_help; exit $STATE_UNKNOWN ;;
    esac
done

# Validate thresholds
if [ "$WARNING_THRESHOLD" -ge "$CRITICAL_THRESHOLD" ]; then
    echo "UNKNOWN - Warning threshold must be less than critical threshold"
    exit $STATE_UNKNOWN
fi

# Check if path exists
if [ ! -d "$PATH_TO_CHECK" ]; then
    echo "UNKNOWN - Path $PATH_TO_CHECK does not exist"
    exit $STATE_UNKNOWN
fi

# Get disk usage (-P keeps each filesystem on a single output line)
USAGE=$(df -P "$PATH_TO_CHECK" | tail -1 | awk '{print $5}' | sed 's/%//')

# Check if we got a valid number
if ! [[ "$USAGE" =~ ^[0-9]+$ ]]; then
    echo "UNKNOWN - Could not determine disk usage for $PATH_TO_CHECK"
    exit $STATE_UNKNOWN
fi

# Performance data
PERF_DATA="usage=${USAGE}%;${WARNING_THRESHOLD};${CRITICAL_THRESHOLD};0;100"

# Determine status and exit
if [ "$USAGE" -ge "$CRITICAL_THRESHOLD" ]; then
    echo "CRITICAL - Disk usage ${USAGE}% on $PATH_TO_CHECK|$PERF_DATA"
    exit $STATE_CRITICAL
elif [ "$USAGE" -ge "$WARNING_THRESHOLD" ]; then
    echo "WARNING - Disk usage ${USAGE}% on $PATH_TO_CHECK|$PERF_DATA"
    exit $STATE_WARNING
else
    echo "OK - Disk usage ${USAGE}% on $PATH_TO_CHECK|$PERF_DATA"
    exit $STATE_OK
fi
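
The threshold logic of the shell plugin can also be sketched in Python, which makes it easy to unit test without touching a real disk. The function names `evaluate` and `disk_usage_percent` are assumptions of this example. Note that `shutil.disk_usage` reports used space against total space, so the percentage may differ slightly from the `Use%` column of `df`, which excludes reserved blocks.

```python
import shutil

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(usage_percent: float, warning: float, critical: float) -> int:
    """Map a usage percentage to a Nagios exit code, in the same order as the shell script."""
    if warning >= critical:
        return UNKNOWN  # misconfigured thresholds
    if usage_percent >= critical:
        return CRITICAL
    if usage_percent >= warning:
        return WARNING
    return OK


def disk_usage_percent(path: str) -> float:
    """Return used space as a percentage of total space for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100
```

Keeping the decision in a pure function such as `evaluate` means every threshold branch can be exercised with plain numbers.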

Create a Python API monitoring plugin

Build a more advanced plugin that monitors API endpoints with JSON response validation and performance metrics. Save it as /usr/local/nagios/plugins/development/check_api_endpoint.py.

#!/usr/bin/env python3

import sys
import argparse
import requests
import time
from urllib.parse import urlparse

# Nagios exit codes
OK = 0
WARNING = 1
CRITICAL = 2
UNKNOWN = 3


def parse_args():
    parser = argparse.ArgumentParser(description='Monitor API endpoint health')
    parser.add_argument('-u', '--url', required=True, help='API endpoint URL')
    parser.add_argument('-t', '--timeout', type=int, default=10, help='Request timeout in seconds')
    parser.add_argument('-w', '--warning', type=float, default=2.0, help='Warning threshold for response time')
    parser.add_argument('-c', '--critical', type=float, default=5.0, help='Critical threshold for response time')
    parser.add_argument('-k', '--key', help='JSON key to validate in response')
    parser.add_argument('-v', '--value', help='Expected value for the JSON key')
    parser.add_argument('-H', '--header', action='append', help='Custom headers (format: "Key: Value")')
    parser.add_argument('-s', '--status', type=int, default=200, help='Expected HTTP status code')
    return parser.parse_args()


def make_request(url, timeout, headers=None):
    """Make HTTP request and return response with timing"""
    custom_headers = {}
    if headers:
        for header in headers:
            key, value = header.split(':', 1)
            custom_headers[key.strip()] = value.strip()

    start_time = time.time()
    try:
        response = requests.get(url, timeout=timeout, headers=custom_headers)
        response_time = time.time() - start_time
        return response, response_time
    except requests.exceptions.Timeout:
        return None, timeout
    except requests.exceptions.RequestException as e:
        print(f"CRITICAL - Request failed: {str(e)}")
        sys.exit(CRITICAL)


def validate_json_content(response, key, expected_value):
    """Validate a top-level key in the JSON response"""
    try:
        data = response.json()
        if key in data:
            actual_value = data[key]
            if str(actual_value) == str(expected_value):
                return True, f"JSON validation passed: {key}={actual_value}"
            else:
                return False, f"JSON validation failed: {key}={actual_value}, expected={expected_value}"
        else:
            return False, f"JSON key '{key}' not found in response"
    except ValueError:
        # JSON decode errors raised by response.json() are ValueError subclasses
        return False, "Response is not valid JSON"


def main():
    args = parse_args()

    # Validate URL
    parsed_url = urlparse(args.url)
    if not parsed_url.scheme or not parsed_url.netloc:
        print("UNKNOWN - Invalid URL format")
        sys.exit(UNKNOWN)

    # Validate thresholds
    if args.warning >= args.critical:
        print("UNKNOWN - Warning threshold must be less than critical threshold")
        sys.exit(UNKNOWN)

    # Make request
    response, response_time = make_request(args.url, args.timeout, args.header)
    if response is None:
        print(f"CRITICAL - Request timed out after {args.timeout} seconds")
        sys.exit(CRITICAL)

    # Check HTTP status
    if response.status_code != args.status:
        print(f"CRITICAL - HTTP {response.status_code}, expected {args.status}")
        sys.exit(CRITICAL)

    # Validate JSON content if both -k and a non-empty -v were given
    json_message = ""
    if args.key and args.value:
        is_valid, message = validate_json_content(response, args.key, args.value)
        if not is_valid:
            print(f"CRITICAL - {message}")
            sys.exit(CRITICAL)
        json_message = f", {message}"

    # Performance data
    perf_data = f"response_time={response_time:.3f}s;{args.warning};{args.critical};0"

    # Determine status based on response time
    if response_time >= args.critical:
        print(f"CRITICAL - Response time {response_time:.3f}s{json_message}|{perf_data}")
        sys.exit(CRITICAL)
    elif response_time >= args.warning:
        print(f"WARNING - Response time {response_time:.3f}s{json_message}|{perf_data}")
        sys.exit(WARNING)
    else:
        print(f"OK - Response time {response_time:.3f}s{json_message}|{perf_data}")
        sys.exit(OK)


if __name__ == "__main__":
    main()

Create a database connection plugin

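Build a plugin that monitors database connectivity and query performance for PostgreSQL. Save it as /usr/local/nagios/plugins/development/check_postgres_query.py. A password passed with -p is visible in the process list while the check runs, so use a dedicated monitoring account with minimal privileges.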

#!/usr/bin/env python3

import sys
import argparse
import time
try:
    import psycopg2
except ImportError:
    print("UNKNOWN - psycopg2 library not installed. Run: pip3 install psycopg2-binary")
    sys.exit(3)

# Nagios exit codes
OK = 0
WARNING = 1
CRITICAL = 2
UNKNOWN = 3


def parse_args():
    parser = argparse.ArgumentParser(description='Monitor PostgreSQL query performance')
    parser.add_argument('-H', '--host', default='localhost', help='Database host')
    parser.add_argument('-P', '--port', type=int, default=5432, help='Database port')
    parser.add_argument('-d', '--database', required=True, help='Database name')
    parser.add_argument('-u', '--username', required=True, help='Database username')
    parser.add_argument('-p', '--password', required=True, help='Database password')
    parser.add_argument('-q', '--query', default='SELECT 1', help='SQL query to execute')
    parser.add_argument('-w', '--warning', type=float, default=1.0, help='Warning threshold in seconds')
    parser.add_argument('-c', '--critical', type=float, default=3.0, help='Critical threshold in seconds')
    parser.add_argument('-t', '--timeout', type=int, default=10, help='Connection timeout')
    return parser.parse_args()


def execute_query(host, port, database, username, password, query, timeout):
    """Execute database query and return timing"""
    try:
        # The timer covers both connecting and running the query
        start_time = time.time()
        conn = psycopg2.connect(
            host=host,
            port=port,
            database=database,
            user=username,
            password=password,
            connect_timeout=timeout
        )
        cursor = conn.cursor()
        cursor.execute(query)
        result = cursor.fetchall()
        query_time = time.time() - start_time
        cursor.close()
        conn.close()
        return True, query_time, len(result)
    except psycopg2.OperationalError as e:
        return False, 0, f"Connection error: {str(e)}"
    except psycopg2.Error as e:
        return False, 0, f"Database error: {str(e)}"
    except Exception as e:
        return False, 0, f"Unexpected error: {str(e)}"


def main():
    args = parse_args()

    # Validate thresholds
    if args.warning >= args.critical:
        print("UNKNOWN - Warning threshold must be less than critical threshold")
        sys.exit(UNKNOWN)

    # Execute query
    success, query_time, result_info = execute_query(
        args.host, args.port, args.database,
        args.username, args.password, args.query, args.timeout
    )

    if not success:
        print(f"CRITICAL - {result_info}")
        sys.exit(CRITICAL)

    # Performance data
    perf_data = f"query_time={query_time:.3f}s;{args.warning};{args.critical};0"

    # Determine status
    if query_time >= args.critical:
        print(f"CRITICAL - Query took {query_time:.3f}s, returned {result_info} rows|{perf_data}")
        sys.exit(CRITICAL)
    elif query_time >= args.warning:
        print(f"WARNING - Query took {query_time:.3f}s, returned {result_info} rows|{perf_data}")
        sys.exit(WARNING)
    else:
        print(f"OK - Query took {query_time:.3f}s, returned {result_info} rows|{perf_data}")
        sys.exit(OK)


if __name__ == "__main__":
    main()
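
Because the plugin takes the database password as a command-line argument, it shows up in process listings and in the Nagios object configuration. One possible alternative, sketched here under assumptions, is to read the password from a file that only the nagios user can access. The helper name `read_password_file` and the idea of adding a password-file option to the plugin are hypothetical and not part of the plugin above.

```python
import os
import stat


def read_password_file(path: str) -> str:
    """Return the first line of a credentials file, refusing group- or world-accessible files."""
    mode = os.stat(path).st_mode
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(f"{path} must not be accessible by group or others")
    with open(path, encoding="utf-8") as handle:
        return handle.readline().rstrip("\n")
```

The permission check mirrors what many database clients do for their own credential files: a file that other users can read is rejected instead of silently used.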

Set executable permissions on plugins

Make your custom plugins executable by the Nagios user with proper security permissions.

sudo chmod 755 /usr/local/nagios/plugins/development/check_custom_disk.sh
sudo chmod 755 /usr/local/nagios/plugins/development/check_api_endpoint.py
sudo chmod 755 /usr/local/nagios/plugins/development/check_postgres_query.py
sudo chown nagios:nagios /usr/local/nagios/plugins/development/*

Plugin testing and validation

Test plugins manually

Run each plugin from the command line to verify functionality and output format before integration.

# Test disk check plugin
sudo -u nagios /usr/local/nagios/plugins/development/check_custom_disk.sh -w 80 -c 90 -p /

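# Test API endpoint plugin
sudo -u nagios /usr/local/nagios/plugins/development/check_api_endpoint.py -u https://httpbin.org/json -w 2 -c 5

# Test with JSON validation
# The plugin compares a top-level key against a non-empty -v value; with -v "" the
# validation step is skipped, so substitute a key and value from your own API here
sudo -u nagios /usr/local/nagios/plugins/development/check_api_endpoint.py -u https://httpbin.org/json -k slideshow -v "" -w 2 -c 5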
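
Manual tests against a public service depend on network access and on that service being up. As a rough sketch, a small local stub built on the standard library can serve a predictable JSON body for the API plugin to query. The `/health` path, the `{"status": "ok"}` payload, and the names `HealthHandler` and `start_stub_server` are all assumptions of this example.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Serve a fixed JSON body on /health and 404 elsewhere."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Silence per-request logging so test output stays clean
        pass


def start_stub_server(port: int = 0) -> HTTPServer:
    """Start the stub on localhost in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

With the stub listening on a fixed port of your choice, the plugin could be pointed at `http://127.0.0.1:<port>/health` with `-k status -v ok` to exercise the JSON validation path.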

Validate plugin output format

Ensure plugins follow Nagios standards for output format, exit codes, and performance data.

# Check exit codes
echo $?  # Should be 0, 1, 2, or 3

# Validate performance data format
# Should be: label=value[UOM];[warn];[crit];[min];[max]

# Test all threshold conditions
sudo -u nagios /usr/local/nagios/plugins/development/check_custom_disk.sh -w 10 -c 20 -p /
sudo -u nagios /usr/local/nagios/plugins/development/check_api_endpoint.py -u https://httpbin.org/delay/3 -w 1 -c 2
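
The performance data format above is easy to get subtly wrong when it is assembled by hand in every plugin. A minimal sketch of a formatter follows; the function name `format_perfdata` and its parameters are assumptions of this example, and it covers only a single perfdata item.

```python
from typing import Optional, Union

Number = Union[int, float]


def _fmt(value: Optional[Number]) -> str:
    """Render an optional number; missing fields stay empty."""
    return "" if value is None else str(value)


def format_perfdata(label: str, value: Number, uom: str = "",
                    warn: Optional[Number] = None, crit: Optional[Number] = None,
                    minimum: Optional[Number] = None, maximum: Optional[Number] = None) -> str:
    """Build one perfdata item: label=value[UOM];[warn];[crit];[min];[max]."""
    # Labels containing spaces or quotes are wrapped in single quotes
    if " " in label or "'" in label:
        label = "'" + label.replace("'", "''") + "'"
    fields = [f"{_fmt(value)}{uom}", _fmt(warn), _fmt(crit), _fmt(minimum), _fmt(maximum)]
    # Trailing empty fields can be dropped
    while len(fields) > 1 and fields[-1] == "":
        fields.pop()
    return f"{label}=" + ";".join(fields)
```

For example, `format_perfdata("usage", 85, "%", 80, 90, 0, 100)` produces the same string the disk plugin builds in its `PERF_DATA` variable for 85% usage with the default thresholds.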

Create plugin validation script

Build an automated test script to verify plugin behavior across different scenarios.

#!/bin/bash

PLUGIN_DIR="/usr/local/nagios/plugins/development"
TEST_RESULTS="/tmp/plugin_tests.log"

echo "Nagios Plugin Validation Results" > $TEST_RESULTS
echo "Generated: $(date)" >> $TEST_RESULTS
echo "========================================" >> $TEST_RESULTS

test_plugin() {
    local plugin="$1"
    local args="$2"
    local expected_exit="$3"
    local test_name="$4"
    
    echo "Testing: $test_name" >> $TEST_RESULTS
    echo "Command: $plugin $args" >> $TEST_RESULTS
    
    output=$(sudo -u nagios $plugin $args 2>&1)
    exit_code=$?
    
    echo "Output: $output" >> $TEST_RESULTS
    echo "Exit Code: $exit_code" >> $TEST_RESULTS
    
    if [ "$exit_code" = "$expected_exit" ]; then
        echo "Result: PASS" >> $TEST_RESULTS
        echo "✓ $test_name"
    else
        echo "Result: FAIL (expected $expected_exit, got $exit_code)" >> $TEST_RESULTS
        echo "✗ $test_name"
    fi
    echo "" >> $TEST_RESULTS
}

# Test disk plugin
# The expected exit codes depend on the actual usage of /: the "Normal" case assumes
# usage below 80% and the "Warning" case assumes usage of at least 10% but below 20%.
# Adjust the thresholds to match your system.
test_plugin "$PLUGIN_DIR/check_custom_disk.sh" "-w 80 -c 90 -p /" "0" "Disk Check - Normal"
test_plugin "$PLUGIN_DIR/check_custom_disk.sh" "-w 10 -c 20 -p /" "1" "Disk Check - Warning"
test_plugin "$PLUGIN_DIR/check_custom_disk.sh" "-w 90 -c 80 -p /" "3" "Disk Check - Invalid Thresholds"

# Test API plugin
test_plugin "$PLUGIN_DIR/check_api_endpoint.py" "-u https://httpbin.org/status/200 -w 2 -c 5" "0" "API Check - Normal"
test_plugin "$PLUGIN_DIR/check_api_endpoint.py" "-u https://httpbin.org/status/404 -w 2 -c 5" "2" "API Check - 404 Error"
test_plugin "$PLUGIN_DIR/check_api_endpoint.py" "-u https://httpbin.org/delay/3 -w 1 -c 2" "2" "API Check - Slow Response"

echo "Validation complete. Results saved to $TEST_RESULTS"
echo "Review with: cat $TEST_RESULTS"

Save the script as /usr/local/nagios/plugins/development/validate_plugins.sh, then make it executable and run it:

sudo chmod 755 /usr/local/nagios/plugins/development/validate_plugins.sh
sudo /usr/local/nagios/plugins/development/validate_plugins.sh
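
The same idea can be expressed in Python when you would rather assert on results than read a log file. The sketch below checks two parts of the plugin contract: the exit code, and that the first output line starts with the state name matching that code. The names `run_plugin` and `check_contract` are assumptions of this example.

```python
import subprocess
from typing import List, Tuple

STATE_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_plugin(command: List[str], timeout: float = 30.0) -> Tuple[int, str]:
    """Run a plugin command and return (exit code, first line of stdout)."""
    result = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    lines = result.stdout.splitlines()
    return result.returncode, lines[0] if lines else ""


def check_contract(command: List[str], expected_exit: int) -> bool:
    """True when the exit code is as expected and the first line starts with the matching state."""
    code, first_line = run_plugin(command)
    prefix = STATE_NAMES.get(code)
    if code != expected_exit or prefix is None:
        return False
    return first_line.startswith(prefix + " - ")
```

A call such as `check_contract(["/usr/local/nagios/plugins/development/check_custom_disk.sh", "-w", "90", "-c", "80"], 3)` would verify the invalid-threshold case from the shell script above.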

Integration with Nagios Core

Deploy tested plugins to production directory

Move validated plugins to the production plugin directory and set final permissions.

sudo cp /usr/local/nagios/plugins/development/check_custom_disk.sh /usr/local/nagios/plugins/custom/
sudo cp /usr/local/nagios/plugins/development/check_api_endpoint.py /usr/local/nagios/plugins/custom/
sudo cp /usr/local/nagios/plugins/development/check_postgres_query.py /usr/local/nagios/plugins/custom/
sudo chown nagios:nagios /usr/local/nagios/plugins/custom/*
sudo chmod 755 /usr/local/nagios/plugins/custom/*

Define command definitions

Create Nagios command definitions for your custom plugins in the commands configuration file, /usr/local/nagios/etc/objects/commands.cfg.

# Custom Disk Check Command
define command {
    command_name    check_custom_disk
    command_line    /usr/local/nagios/plugins/custom/check_custom_disk.sh -w $ARG1$ -c $ARG2$ -p $ARG3$
}

# API Endpoint Check Command
define command {
    command_name    check_api_endpoint
    command_line    /usr/local/nagios/plugins/custom/check_api_endpoint.py -u $ARG1$ -w $ARG2$ -c $ARG3$ -s $ARG4$
}

# API with JSON validation
define command {
    command_name    check_api_json
    command_line    /usr/local/nagios/plugins/custom/check_api_endpoint.py -u $ARG1$ -w $ARG2$ -c $ARG3$ -k $ARG4$ -v $ARG5$
}

# PostgreSQL Query Check Command
define command {
    command_name    check_postgres_query
    command_line    /usr/local/nagios/plugins/custom/check_postgres_query.py -H $ARG1$ -d $ARG2$ -u $ARG3$ -p $ARG4$ -q "$ARG5$" -w $ARG6$ -c $ARG7$
}

Configure service checks

Create service definitions that use your custom plugins for monitoring specific resources.

# Custom disk monitoring
define service {
    use                 generic-service
    host_name           localhost
    service_description Custom Disk Usage /var
    check_command       check_custom_disk!80!90!/var
    check_interval      5
    retry_interval      1
}

# API endpoint monitoring
define service {
    use                 generic-service
    host_name           localhost
    service_description API Health Check
    check_command       check_api_endpoint!https://api.example.com/health!2!5!200
    check_interval      2
    retry_interval      1
}

# Database query monitoring
define service {
    use                 generic-service
    host_name           localhost
    service_description Database Query Performance
    check_command       check_postgres_query!localhost!myapp!monitor_user!secure_password!SELECT COUNT(*) FROM users!1!3
    check_interval      5
    retry_interval      1
}
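
The link between a service's check_command and the command definition is positional: the text before the first `!` names the command, and each following segment fills `$ARG1$`, `$ARG2$`, and so on. The sketch below imitates that substitution so you can preview the resulting command line. It is a simplified illustration (the function name `expand_check_command` is an assumption of this example), and it ignores escaped `!` characters and all other Nagios macros.

```python
from typing import Dict


def expand_check_command(check_command: str, commands: Dict[str, str]) -> str:
    """Substitute $ARGn$ macros in a command_line from a '!'-separated check_command."""
    name, *args = check_command.split("!")
    command_line = commands[name]
    for index, value in enumerate(args, start=1):
        command_line = command_line.replace(f"$ARG{index}$", value)
    return command_line


if __name__ == "__main__":
    definitions = {
        "check_custom_disk": "/usr/local/nagios/plugins/custom/check_custom_disk.sh "
                             "-w $ARG1$ -c $ARG2$ -p $ARG3$",
    }
    print(expand_check_command("check_custom_disk!80!90!/var", definitions))
```

Previewing the expanded line this way helps catch arguments passed in the wrong order before Nagios runs the check.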

Validate Nagios configuration

Test the configuration to ensure your custom plugins integrate properly with Nagios Core.

sudo /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

# If validation passes, restart Nagios
sudo systemctl restart nagios
sudo systemctl status nagios

Monitor plugin execution in logs

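Watch the Nagios log to verify that your custom checks run and produce the expected results. Nagios records alerts under the service description rather than the command name, so filter on the descriptions used in the service definitions.

sudo tail -f /usr/local/nagios/var/nagios.log | grep -E "Custom Disk Usage|API Health Check|Database Query Performance"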
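
For anything beyond watching the log by eye, the alert lines can be parsed. The sketch below assumes the usual `SERVICE ALERT` layout of a bracketed epoch timestamp followed by semicolon-separated host, service, state, state type, attempt, and plugin output; verify it against your own nagios.log before relying on it. The name `parse_service_alert` is an assumption of this example.

```python
import re
from typing import Dict, Optional

ALERT = re.compile(
    r"^\[(?P<timestamp>\d+)\] SERVICE ALERT: "
    r"(?P<host>[^;]+);(?P<service>[^;]+);(?P<state>[^;]+);"
    r"(?P<state_type>[^;]+);(?P<attempt>\d+);(?P<output>.*)$"
)


def parse_service_alert(line: str) -> Optional[Dict[str, str]]:
    """Return the fields of a SERVICE ALERT log line, or None for any other line."""
    match = ALERT.match(line.strip())
    return match.groupdict() if match else None
```

Lines that are not service alerts return `None`, so the function can be applied to the whole log and the results filtered.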

Advanced plugin development patterns

Create a plugin template

Build a reusable template that includes all best practices for consistent plugin development.

#!/usr/bin/env python3
"""
Nagios Plugin Template

This template provides a standardized structure for developing Nagios plugins
with proper argument parsing, logging, and error handling.
"""

import sys
import argparse
import logging
import time
from pathlib import Path

# Nagios exit codes
OK = 0
WARNING = 1
CRITICAL = 2
UNKNOWN = 3


class NagiosPlugin:
    def __init__(self, plugin_name, version="1.0"):
        self.plugin_name = plugin_name
        self.version = version
        self.args = None
        self.logger = self._setup_logging()

    def _setup_logging(self):
        """Setup logging for debugging (optional)"""
        logger = logging.getLogger(self.plugin_name)
        # NullHandler keeps error records off stderr when file logging is disabled
        logger.addHandler(logging.NullHandler())

        # Only log to file in development
        if Path('/tmp/nagios_debug').exists():
            handler = logging.FileHandler(f'/tmp/{self.plugin_name}.log')
            formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
            handler.setFormatter(formatter)
            logger.addHandler(handler)
            logger.setLevel(logging.DEBUG)

        return logger

    def add_arguments(self, parser):
        """Override this method to add plugin-specific arguments"""
        parser.add_argument('-w', '--warning', type=float, default=80.0, help='Warning threshold')
        parser.add_argument('-c', '--critical', type=float, default=90.0, help='Critical threshold')

    def parse_args(self):
        """Parse command line arguments"""
        parser = argparse.ArgumentParser(
            description=f'{self.plugin_name} v{self.version}',
            formatter_class=argparse.RawDescriptionHelpFormatter
        )

        # Add standard arguments
        parser.add_argument('-V', '--version', action='version', version=f'{self.plugin_name} {self.version}')
        parser.add_argument('-t', '--timeout', type=int, default=10, help='Plugin timeout in seconds')
        parser.add_argument('-v', '--verbose', action='store_true', help='Verbose output')

        # Add plugin-specific arguments
        self.add_arguments(parser)

        self.args = parser.parse_args()

        # Validate thresholds
        if hasattr(self.args, 'warning') and hasattr(self.args, 'critical'):
            if self.args.warning >= self.args.critical:
                self.nagios_exit(UNKNOWN, "Warning threshold must be less than critical")

        return self.args

    def check_health(self):
        """Override this method with your check logic"""
        # Example implementation
        import random
        value = random.uniform(0, 100)

        perf_data = f"value={value:.2f}%;{self.args.warning};{self.args.critical};0;100"

        if value >= self.args.critical:
            return CRITICAL, f"Value is {value:.2f}%", perf_data
        elif value >= self.args.warning:
            return WARNING, f"Value is {value:.2f}%", perf_data
        else:
            return OK, f"Value is {value:.2f}%", perf_data

    def nagios_exit(self, status, message, perf_data=None):
        """Exit with proper Nagios format"""
        status_text = {OK: 'OK', WARNING: 'WARNING', CRITICAL: 'CRITICAL', UNKNOWN: 'UNKNOWN'}

        output = f"{status_text[status]} - {message}"
        if perf_data:
            output += f"|{perf_data}"

        print(output)
        sys.exit(status)

    def run(self):
        """Main execution method"""
        try:
            self.parse_args()
            self.logger.debug(f"Starting {self.plugin_name} with args: {vars(self.args)}")

            status, message, perf_data = self.check_health()
            self.nagios_exit(status, message, perf_data)

        except KeyboardInterrupt:
            self.nagios_exit(UNKNOWN, "Plugin interrupted")
        except Exception as e:
            self.logger.error(f"Unexpected error: {str(e)}")
            self.nagios_exit(UNKNOWN, f"Plugin error: {str(e)}")


# Example usage:
class CustomHealthCheck(NagiosPlugin):
    def __init__(self):
        super().__init__("Custom Health Check", "1.0")

    def add_arguments(self, parser):
        super().add_arguments(parser)
        parser.add_argument('-u', '--url', required=True, help='URL to check')

    def check_health(self):
        # Your custom check logic here
        return OK, "Custom check passed", "response_time=0.123s;2;5;0"


if __name__ == "__main__":
    plugin = CustomHealthCheck()
    plugin.run()
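
The template accepts a -t/--timeout argument but does not enforce it. One way to do so on Unix, sketched here under assumptions, is an interval timer that interrupts the check and converts the overrun into an UNKNOWN result. The names `CheckTimeout` and `run_with_timeout` are hypothetical, and the approach works only in the main thread of the process.

```python
import signal
from typing import Callable, Tuple

UNKNOWN = 3


class CheckTimeout(Exception):
    """Raised inside the check when the time budget is exhausted."""


def run_with_timeout(check: Callable[[], Tuple[int, str]], seconds: float) -> Tuple[int, str]:
    """Run check(); return (UNKNOWN, message) if it exceeds the limit (Unix, main thread only)."""

    def on_alarm(signum, frame):
        raise CheckTimeout()

    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        return check()
    except CheckTimeout:
        return UNKNOWN, f"Plugin timed out after {seconds}s"
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)   # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)   # restore the earlier handler
```

In the template, `run()` could wrap the call to `check_health()` this way, passing `self.args.timeout` as the limit, so that a hung check reports UNKNOWN instead of being killed by Nagios without a message.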

Create plugin configuration management

Build a system for managing plugin configurations and deployment across multiple Nagios instances.

#!/usr/bin/env python3
"""
Nagios Plugin Manager
Manages deployment and configuration of custom plugins
"""

import os
import sys
import shutil
import json
import subprocess
from pathlib import Path
import argparse

class PluginManager:
    def __init__(self):
        self.dev_dir = Path('/usr/local/nagios/plugins/development')
        self.prod_dir = Path('/usr/local/nagios/plugins/custom')
        self.config_file = self.dev_dir / 'plugin_config.json'
        
    def load_config(self):
        """Load plugin configuration"""
        if self.config_file.exists():
            with open(self.config_file) as f:
                return json.load(f)
        return {}
        
    def save_config(self, config):
        """Save plugin configuration"""
        with open(self.config_file, 'w') as f:
            json.dump(config, f, indent=2)
            
    def validate_plugin(self, plugin_path):
        """Validate plugin before deployment"""
        if not plugin_path.exists():
            return False, f"Plugin {plugin_path} does not exist"
            
        # Check if executable
        if not os.access(plugin_path, os.X_OK):
            return False, f"Plugin {plugin_path} is not executable"
            
        # Try to run with --help
        try:
            result = subprocess.run([str(plugin_path), '--help'], 
                                  capture_output=True, text=True, timeout=5)
            if result.returncode not in [0, 1, 2, 3]:
                return False, f"Plugin returned invalid exit code: {result.returncode}"
        except subprocess.TimeoutExpired:
            return False, "Plugin help command timed out"
        except Exception as e:
            return False, f"Error testing plugin: {str(e)}"
            
        return True, "Plugin validation passed"
        
    def deploy_plugin(self, plugin_name):
        """Deploy plugin from development to production"""
        dev_path = self.dev_dir / plugin_name
        prod_path = self.prod_dir / plugin_name
        
        # Validate plugin
        is_valid, message = self.validate_plugin(dev_path)
        if not is_valid:
            print(f"✗ Validation failed: {message}")
            return False
            
        # Copy to production
        try:
            shutil.copy2(dev_path, prod_path)
            # Set the mode before handing ownership to nagios; a non-root caller
            # can no longer chmod the file once it belongs to another user
            os.chmod(prod_path, 0o755)
            subprocess.run(['sudo', 'chown', 'nagios:nagios', str(prod_path)], check=True)
            print(f"✓ Deployed {plugin_name} to production")
            return True
        except Exception as e:
            print(f"✗ Deployment failed: {str(e)}")
            return False
            
    def list_plugins(self):
        """List available plugins"""
        print("Development Plugins:")
        for plugin in self.dev_dir.glob('check_*'):
            if plugin.is_file() and os.access(plugin, os.X_OK):
                print(f"  📝 {plugin.name}")
                
        print("\nProduction Plugins:")
        for plugin in self.prod_dir.glob('check_*'):
            if plugin.is_file():
                print(f"  ✅ {plugin.name}")
                
    def test_plugin(self, plugin_name, args='--help'):
        """Test plugin execution"""
        plugin_path = self.dev_dir / plugin_name
        if not plugin_path.exists():
            plugin_path = self.prod_dir / plugin_name
            
        if not plugin_path.exists():
            print(f"✗ Plugin {plugin_name} not found")
            return False
            
        try:
            cmd = ['sudo', '-u', 'nagios', str(plugin_path)] + args.split()
            result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
            
            print(f"Exit Code: {result.returncode}")
            print(f"Output: {result.stdout}")
            if result.stderr:
                print(f"Error: {result.stderr}")
                
            return True
        except Exception as e:
            print(f"✗ Test failed: {str(e)}")
            return False

def main():
    parser = argparse.ArgumentParser(description='Nagios Plugin Manager')
    parser.add_argument('action', choices=['list', 'deploy', 'test', 'validate'])
    parser.add_argument('plugin', nargs='?', help='Plugin name')
    parser.add_argument('--args', default='--help', help='Arguments for test')
    
    args = parser.parse_args()
    manager = PluginManager()
    
    if args.action == 'list':
        manager.list_plugins()
    elif args.action == 'deploy':
        if not args.plugin:
            print("Plugin name required for deploy")
            sys.exit(1)
        manager.deploy_plugin(args.plugin)
    elif args.action == 'test':
        if not args.plugin:
            print("Plugin name required for test")
            sys.exit(1)
        manager.test_plugin(args.plugin, args.args)
    elif args.action == 'validate':
        if not args.plugin:
            print("Plugin name required for validate")
            sys.exit(1)
        dev_path = manager.dev_dir / args.plugin
        is_valid, message = manager.validate_plugin(dev_path)
        print(f"Validation: {'✓' if is_valid else '✗'} {message}")

if __name__ == '__main__':
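    main()

Save the script as /usr/local/nagios/plugins/development/plugin_manager.py, then make it executable and list the available plugins:

sudo chmod 755 /usr/local/nagios/plugins/development/plugin_manager.py
python3 /usr/local/nagios/plugins/development/plugin_manager.py list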
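
When the same plugins live in both the development and production directories, it helps to know whether the two copies have drifted apart before deploying. A minimal sketch using content hashes follows; the helper names `file_sha256` and `needs_deploy` are assumptions of this example and are not part of the manager above.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_deploy(dev_path: Path, prod_path: Path) -> bool:
    """True when the production copy is missing or differs from the development copy."""
    if not prod_path.exists():
        return True
    return file_sha256(dev_path) != file_sha256(prod_path)
```

A digest could also be stored in plugin_config.json after each deployment to keep a record of which version is live.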

Verify your setup

# Check plugin directory structure
ls -la /usr/local/nagios/plugins/custom/
ls -la /usr/local/nagios/plugins/development/

# Test plugin execution as nagios user
sudo -u nagios /usr/local/nagios/plugins/custom/check_custom_disk.sh -w 80 -c 90 -p /

# Verify Nagios configuration includes custom commands
sudo /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

# Check plugin execution in Nagios web interface
sudo systemctl status nagios
curl -u nagiosadmin:password http://localhost/nagios/

Common issues

Symptom | Cause | Fix
Plugin returns "Permission denied" | Wrong file permissions or ownership | sudo chmod 755 plugin_file && sudo chown nagios:nagios plugin_file
"Command not found" in Nagios | Plugin path incorrect in command definition | Verify the path in /usr/local/nagios/etc/objects/commands.cfg
Plugin works manually but fails in Nagios | Environment variables missing for the nagios user | Add export PATH=/usr/bin:/bin at the top of shell scripts
"No output returned from plugin" | Plugin doesn't print to stdout or exits without a message | Ensure the plugin always prints a status message before exiting
Performance data not appearing in graphs | Incorrect performance data format | Follow the format: label=value[UOM];[warn];[crit];[min];[max]
Python modules not found | Modules installed for a different user than nagios | sudo -u nagios pip3 install module_name
