Learn to develop custom Nagios plugins for specialized monitoring requirements: set up the development environment, write check scripts in multiple languages, and integrate them into your Nagios Core monitoring infrastructure.
Prerequisites
- Nagios Core 4.5 installed
- Python 3 and pip3
- Basic knowledge of shell scripting
- Understanding of monitoring concepts
What this solves
Default Nagios plugins cover basic system monitoring, but production environments often need custom checks for applications, APIs, databases, or business metrics. This guide shows you how to develop, test, and deploy custom Nagios plugins that extend your monitoring capabilities beyond standard system resources.
Step-by-step plugin development setup
Install development dependencies
Set up the essential tools for plugin development including compilers, interpreters, and Nagios plugin utilities.
sudo apt update
sudo apt install -y build-essential python3 python3-pip perl libmonitoring-plugin-perl curl wget git
Create plugin development directory
Organize your custom plugins in a dedicated directory structure with proper permissions.
sudo mkdir -p /usr/local/nagios/plugins/custom
sudo mkdir -p /usr/local/nagios/plugins/development
sudo chown -R nagios:nagios /usr/local/nagios/plugins/
sudo chmod 755 /usr/local/nagios/plugins/custom
sudo chmod 775 /usr/local/nagios/plugins/development
Install Python monitoring libraries
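Install the Python libraries used to build monitoring plugins with proper exit codes and performance data, including psycopg2-binary for the PostgreSQL check later in this guide. On systems where pip refuses to modify the system Python (PEP 668, for example Debian 12 or Ubuntu 23.04 and later), install into a virtual environment or use the distribution's packages instead.
sudo pip3 install nagiosplugin requests psutil pymongo redis elasticsearch psycopg2-binary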
Download Nagios plugin development utils
Get the official plugin development utilities that provide standard functions and exit codes.
cd /tmp
wget https://www.nagios-plugins.org/download/nagios-plugins-2.4.6.tar.gz
tar -xzf nagios-plugins-2.4.6.tar.gz
cd nagios-plugins-2.4.6
./configure --with-nagios-user=nagios --with-nagios-group=nagios
make
sudo cp plugins-scripts/utils.sh /usr/local/nagios/plugins/
sudo cp plugins/utils.c /usr/local/nagios/plugins/
sudo chmod 644 /usr/local/nagios/plugins/utils.*
Writing custom check plugins
Create a basic shell script plugin
Start with a simple disk usage plugin that demonstrates proper Nagios plugin structure and exit codes. Save it as /usr/local/nagios/plugins/development/check_custom_disk.sh.
#!/bin/bash
# Source the utils for the standard STATE_* exit codes
. /usr/local/nagios/plugins/utils.sh
# Plugin info
PLUGIN_NAME="Custom Disk Check"
PLUGIN_VERSION="1.0"
# Default values
WARNING_THRESHOLD=80
CRITICAL_THRESHOLD=90
PATH_TO_CHECK="/"
# Function to display help
print_help() {
    echo "$PLUGIN_NAME $PLUGIN_VERSION"
    echo "Usage: $0 -w WARN -c CRIT -p PATH"
    echo " -w: Warning threshold (default: 80%)"
    echo " -c: Critical threshold (default: 90%)"
    echo " -p: Path to check (default: /)"
    echo " -h: Show this help"
}
# Parse command line arguments
while getopts "w:c:p:h" opt; do
    case $opt in
        w) WARNING_THRESHOLD=$OPTARG ;;
        c) CRITICAL_THRESHOLD=$OPTARG ;;
        p) PATH_TO_CHECK=$OPTARG ;;
        h) print_help; exit $STATE_OK ;;
        *) print_help; exit $STATE_UNKNOWN ;;
    esac
done
# Validate thresholds: whole numbers, warning below critical
if ! [[ "$WARNING_THRESHOLD" =~ ^[0-9]+$ && "$CRITICAL_THRESHOLD" =~ ^[0-9]+$ ]]; then
    echo "UNKNOWN - Thresholds must be whole numbers"
    exit $STATE_UNKNOWN
fi
if [ "$WARNING_THRESHOLD" -ge "$CRITICAL_THRESHOLD" ]; then
    echo "UNKNOWN - Warning threshold must be less than critical threshold"
    exit $STATE_UNKNOWN
fi
# Check if path exists
if [ ! -d "$PATH_TO_CHECK" ]; then
    echo "UNKNOWN - Path $PATH_TO_CHECK does not exist"
    exit $STATE_UNKNOWN
fi
# Get disk usage (-P keeps each filesystem on one line, even with long device names)
USAGE=$(df -P "$PATH_TO_CHECK" | tail -1 | awk '{print $5}' | sed 's/%//')
# Check if we got a valid number
if ! [[ "$USAGE" =~ ^[0-9]+$ ]]; then
    echo "UNKNOWN - Could not determine disk usage for $PATH_TO_CHECK"
    exit $STATE_UNKNOWN
fi
# Performance data
PERF_DATA="usage=${USAGE}%;${WARNING_THRESHOLD};${CRITICAL_THRESHOLD};0;100"
# Determine status and exit
if [ "$USAGE" -ge "$CRITICAL_THRESHOLD" ]; then
    echo "CRITICAL - Disk usage ${USAGE}% on $PATH_TO_CHECK|$PERF_DATA"
    exit $STATE_CRITICAL
elif [ "$USAGE" -ge "$WARNING_THRESHOLD" ]; then
    echo "WARNING - Disk usage ${USAGE}% on $PATH_TO_CHECK|$PERF_DATA"
    exit $STATE_WARNING
else
    echo "OK - Disk usage ${USAGE}% on $PATH_TO_CHECK|$PERF_DATA"
    exit $STATE_OK
fi
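For comparison, the same decision logic can be sketched in Python, the language used for the remaining plugins in this guide. This is a minimal illustration rather than a replacement for the shell script: the function names `evaluate` and `check_disk` are made up for this example, and it reads usage through the standard library's `shutil.disk_usage` instead of parsing `df`.

```python
import shutil

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_TEXT = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(percent_used, warning, critical):
    """Map a usage percentage to a Nagios exit code (same comparisons as the shell script)."""
    if percent_used >= critical:
        return CRITICAL
    if percent_used >= warning:
        return WARNING
    return OK


def check_disk(path="/", warning=80, critical=90):
    """Return (exit_code, output_line) for the filesystem that contains path."""
    try:
        usage = shutil.disk_usage(path)
        # df computes Use% from used and available space, so this used/total
        # figure can differ slightly from what the shell script reports.
        percent = round(usage.used / usage.total * 100)
    except (OSError, ZeroDivisionError) as exc:
        return UNKNOWN, f"UNKNOWN - Could not determine disk usage for {path}: {exc}"
    code = evaluate(percent, warning, critical)
    perf = f"usage={percent}%;{warning};{critical};0;100"
    return code, f"{STATUS_TEXT[code]} - Disk usage {percent}% on {path}|{perf}"


if __name__ == "__main__":
    exit_code, line = check_disk("/")
    # A real plugin would print the line and finish with sys.exit(exit_code).
    print(exit_code, line)
```

Keeping the threshold comparison in its own function makes it possible to test the status mapping without touching a real filesystem.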
Create a Python API monitoring plugin
Build a more advanced plugin that monitors API endpoints with JSON response validation and performance metrics. Save it as /usr/local/nagios/plugins/development/check_api_endpoint.py.
#!/usr/bin/env python3
import sys
import argparse
import requests
import time
import json
from urllib.parse import urlparse
# Nagios exit codes
OK = 0
WARNING = 1
CRITICAL = 2
UNKNOWN = 3
def parse_args():
    parser = argparse.ArgumentParser(description='Monitor API endpoint health')
    parser.add_argument('-u', '--url', required=True, help='API endpoint URL')
    parser.add_argument('-t', '--timeout', type=int, default=10, help='Request timeout in seconds')
    parser.add_argument('-w', '--warning', type=float, default=2.0, help='Warning threshold for response time')
    parser.add_argument('-c', '--critical', type=float, default=5.0, help='Critical threshold for response time')
    parser.add_argument('-k', '--key', help='JSON key to validate in response')
    parser.add_argument('-v', '--value', help='Expected value for the JSON key')
    parser.add_argument('-H', '--header', action='append', help='Custom headers (format: "Key: Value")')
    parser.add_argument('-s', '--status', type=int, default=200, help='Expected HTTP status code')
    return parser.parse_args()
def make_request(url, timeout, headers=None):
    """Make HTTP request and return response with timing"""
    custom_headers = {}
    if headers:
        for header in headers:
            # A header without a colon would otherwise crash with a traceback
            if ':' not in header:
                print(f"UNKNOWN - Invalid header format: {header}")
                sys.exit(UNKNOWN)
            key, value = header.split(':', 1)
            custom_headers[key.strip()] = value.strip()
    start_time = time.time()
    try:
        response = requests.get(url, timeout=timeout, headers=custom_headers)
        response_time = time.time() - start_time
        return response, response_time
    except requests.exceptions.Timeout:
        return None, timeout
    except requests.exceptions.RequestException as e:
        print(f"CRITICAL - Request failed: {str(e)}")
        sys.exit(CRITICAL)
def validate_json_content(response, key, expected_value):
    """Validate JSON response content (top-level keys only)"""
    try:
        data = response.json()
    except ValueError:
        # JSON decode errors raised by response.json() are ValueError subclasses
        return False, "Response is not valid JSON"
    if not isinstance(data, dict) or key not in data:
        return False, f"JSON key '{key}' not found in response"
    actual_value = data[key]
    if str(actual_value) == str(expected_value):
        return True, f"JSON validation passed: {key}={actual_value}"
    return False, f"JSON validation failed: {key}={actual_value}, expected={expected_value}"
def main():
    args = parse_args()
    # Validate URL
    parsed_url = urlparse(args.url)
    if not parsed_url.scheme or not parsed_url.netloc:
        print("UNKNOWN - Invalid URL format")
        sys.exit(UNKNOWN)
    # Validate thresholds
    if args.warning >= args.critical:
        print("UNKNOWN - Warning threshold must be less than critical threshold")
        sys.exit(UNKNOWN)
    # Make request
    response, response_time = make_request(args.url, args.timeout, args.header)
    if response is None:
        print(f"CRITICAL - Request timed out after {args.timeout} seconds")
        sys.exit(CRITICAL)
    # Check HTTP status
    if response.status_code != args.status:
        print(f"CRITICAL - HTTP {response.status_code}, expected {args.status}")
        sys.exit(CRITICAL)
    # Validate JSON content if both a key and a non-empty value were given
    json_message = ""
    if args.key and args.value:
        is_valid, message = validate_json_content(response, args.key, args.value)
        if not is_valid:
            print(f"CRITICAL - {message}")
            sys.exit(CRITICAL)
        json_message = f", {message}"
    # Performance data
    perf_data = f"response_time={response_time:.3f}s;{args.warning};{args.critical};0"
    # Determine status based on response time
    if response_time >= args.critical:
        print(f"CRITICAL - Response time {response_time:.3f}s{json_message}|{perf_data}")
        sys.exit(CRITICAL)
    elif response_time >= args.warning:
        print(f"WARNING - Response time {response_time:.3f}s{json_message}|{perf_data}")
        sys.exit(WARNING)
    else:
        print(f"OK - Response time {response_time:.3f}s{json_message}|{perf_data}")
        sys.exit(OK)
if __name__ == "__main__":
    main()
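`validate_json_content` only inspects top-level keys, so a value nested inside another object cannot be matched directly. One possible extension, sketched here with a hypothetical `lookup_path` helper and a dot-separated key syntax that the plugin above does not define, walks the parsed JSON one segment at a time:

```python
def lookup_path(data, dotted_key):
    """Return (found, value) for a dot-separated key such as 'slideshow.title'.

    Hypothetical helper: the dotted syntax is an assumption made for this
    sketch. Numeric segments index into lists.
    """
    current = data
    for segment in dotted_key.split("."):
        if isinstance(current, dict) and segment in current:
            current = current[segment]
        elif isinstance(current, list) and segment.isdigit() and int(segment) < len(current):
            current = current[int(segment)]
        else:
            return False, None
    return True, current


if __name__ == "__main__":
    sample = {"slideshow": {"title": "Sample Slide Show", "slides": [{"type": "all"}]}}
    print(lookup_path(sample, "slideshow.title"))
    print(lookup_path(sample, "slideshow.slides.0.type"))
    print(lookup_path(sample, "slideshow.missing"))
```

Returning a `(found, value)` pair rather than the bare value lets the caller distinguish a missing key from a key whose value is `null`.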
Create a database connection plugin
Build a plugin that monitors database connectivity and query performance for PostgreSQL. Save it as /usr/local/nagios/plugins/development/check_postgres_query.py.
#!/usr/bin/env python3
import sys
import argparse
import time
try:
    import psycopg2
except ImportError:
    print("UNKNOWN - psycopg2 library not installed. Run: pip3 install psycopg2-binary")
    sys.exit(3)
# Nagios exit codes
OK = 0
WARNING = 1
CRITICAL = 2
UNKNOWN = 3
def parse_args():
    parser = argparse.ArgumentParser(description='Monitor PostgreSQL query performance')
    parser.add_argument('-H', '--host', default='localhost', help='Database host')
    parser.add_argument('-P', '--port', type=int, default=5432, help='Database port')
    parser.add_argument('-d', '--database', required=True, help='Database name')
    parser.add_argument('-u', '--username', required=True, help='Database username')
    parser.add_argument('-p', '--password', required=True, help='Database password')
    parser.add_argument('-q', '--query', default='SELECT 1', help='SQL query to execute')
    parser.add_argument('-w', '--warning', type=float, default=1.0, help='Warning threshold in seconds')
    parser.add_argument('-c', '--critical', type=float, default=3.0, help='Critical threshold in seconds')
    parser.add_argument('-t', '--timeout', type=int, default=10, help='Connection timeout')
    return parser.parse_args()
def execute_query(host, port, database, username, password, query, timeout):
    """Execute database query and return timing (connection time included)"""
    try:
        start_time = time.time()
        conn = psycopg2.connect(
            host=host,
            port=port,
            database=database,
            user=username,
            password=password,
            connect_timeout=timeout
        )
        cursor = conn.cursor()
        cursor.execute(query)
        result = cursor.fetchall()
        query_time = time.time() - start_time
        cursor.close()
        conn.close()
        return True, query_time, len(result)
    except psycopg2.OperationalError as e:
        return False, 0, f"Connection error: {str(e)}"
    except psycopg2.Error as e:
        return False, 0, f"Database error: {str(e)}"
    except Exception as e:
        return False, 0, f"Unexpected error: {str(e)}"
def main():
    args = parse_args()
    # Validate thresholds
    if args.warning >= args.critical:
        print("UNKNOWN - Warning threshold must be less than critical threshold")
        sys.exit(UNKNOWN)
    # Execute query
    success, query_time, result_info = execute_query(
        args.host, args.port, args.database,
        args.username, args.password, args.query, args.timeout
    )
    if not success:
        print(f"CRITICAL - {result_info}")
        sys.exit(CRITICAL)
    # Performance data
    perf_data = f"query_time={query_time:.3f}s;{args.warning};{args.critical};0"
    # Determine status
    if query_time >= args.critical:
        print(f"CRITICAL - Query took {query_time:.3f}s, returned {result_info} rows|{perf_data}")
        sys.exit(CRITICAL)
    elif query_time >= args.warning:
        print(f"WARNING - Query took {query_time:.3f}s, returned {result_info} rows|{perf_data}")
        sys.exit(WARNING)
    else:
        print(f"OK - Query took {query_time:.3f}s, returned {result_info} rows|{perf_data}")
        sys.exit(OK)
if __name__ == "__main__":
    main()
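Because the script takes `-p` on the command line, the password is visible in the process list and has to be written into the Nagios service definition. A possible alternative, sketched here under the assumption that `--password` is made optional, falls back to the `PGPASSWORD` environment variable that libpq also reads. `resolve_password` is a hypothetical helper and not part of the script above.

```python
import os


def resolve_password(cli_value, env=None):
    """Pick the database password without requiring it on the command line.

    Hypothetical helper. Order of precedence: explicit CLI value, then the
    PGPASSWORD environment variable. Returns None when neither is set, so the
    caller can decide whether to fail or let libpq look for a password file.
    """
    env = os.environ if env is None else env
    if cli_value:
        return cli_value
    return env.get("PGPASSWORD") or None


if __name__ == "__main__":
    print(resolve_password(None, {"PGPASSWORD": "from-env"}))
    print(resolve_password("from-cli", {"PGPASSWORD": "from-env"}))
    print(resolve_password(None, {}))
```

Passing the environment as a parameter keeps the helper easy to test without changing the real process environment.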
Set executable permissions on plugins
Make your custom plugins executable by the Nagios user with proper security permissions.
sudo chmod 755 /usr/local/nagios/plugins/development/check_custom_disk.sh
sudo chmod 755 /usr/local/nagios/plugins/development/check_api_endpoint.py
sudo chmod 755 /usr/local/nagios/plugins/development/check_postgres_query.py
sudo chown nagios:nagios /usr/local/nagios/plugins/development/*
Plugin testing and validation
Test plugins manually
Run each plugin from the command line to verify functionality and output format before integration.
# Test disk check plugin
sudo -u nagios /usr/local/nagios/plugins/development/check_custom_disk.sh -w 80 -c 90 -p /
# Test API endpoint plugin
sudo -u nagios /usr/local/nagios/plugins/development/check_api_endpoint.py -u https://httpbin.org/json -w 2 -c 5
# Test with JSON validation (the plugin compares a top-level key against a non-empty expected value)
sudo -u nagios /usr/local/nagios/plugins/development/check_api_endpoint.py -u https://httpbin.org/get -k url -v https://httpbin.org/get -w 2 -c 5
Validate plugin output format
Ensure plugins follow Nagios standards for output format, exit codes, and performance data.
# Check the exit code of the plugin you just ran
echo $? # Should be 0, 1, 2, or 3
# Validate performance data format
# Should be: label=value[UOM];[warn];[crit];[min];[max]
# Test all threshold conditions
sudo -u nagios /usr/local/nagios/plugins/development/check_custom_disk.sh -w 10 -c 20 -p /
sudo -u nagios /usr/local/nagios/plugins/development/check_api_endpoint.py -u https://httpbin.org/delay/3 -w 1 -c 2
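The performance data format can also be checked mechanically. The sketch below splits a plugin's output line at the `|` separator and tests each item against a regular expression for `label=value[UOM];[warn];[crit];[min];[max]`. The names `PERF_ITEM`, `split_output`, and `invalid_perf_items` are illustrative, and the pattern is deliberately simplified: quoted labels containing spaces are not handled, and thresholds are matched loosely.

```python
import re

# label=value[UOM];[warn];[crit];[min];[max]
# Threshold fields accept anything except ';' and whitespace so that range
# expressions such as 10:20 also pass.
PERF_ITEM = re.compile(
    r"^(?P<label>[^\s'=]+)"
    r"=(?P<value>-?\d+(?:\.\d+)?)"
    r"(?P<uom>[a-zA-Z%]*)"
    r"(?:;(?P<warn>[^;\s]*))?"
    r"(?:;(?P<crit>[^;\s]*))?"
    r"(?:;(?P<min>[^;\s]*))?"
    r"(?:;(?P<max>[^;\s]*))?$"
)


def split_output(line):
    """Split one plugin output line into (status text, list of perf data items)."""
    text, _, perf = line.partition("|")
    return text.strip(), perf.split()


def invalid_perf_items(line):
    """Return the perf data items in line that do not match the expected shape."""
    _, items = split_output(line)
    return [item for item in items if not PERF_ITEM.match(item)]


if __name__ == "__main__":
    good = "OK - Disk usage 42% on /|usage=42%;80;90;0;100"
    bad = "OK - Disk usage 42% on /|usage=;80;90"
    print(invalid_perf_items(good))
    print(invalid_perf_items(bad))
```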
Create plugin validation script
Build an automated test script to verify plugin behavior across different scenarios. Save it as /usr/local/nagios/plugins/development/validate_plugins.sh.
#!/bin/bash
PLUGIN_DIR="/usr/local/nagios/plugins/development"
TEST_RESULTS="/tmp/plugin_tests.log"
echo "Nagios Plugin Validation Results" > "$TEST_RESULTS"
echo "Generated: $(date)" >> "$TEST_RESULTS"
echo "========================================" >> "$TEST_RESULTS"
test_plugin() {
    local plugin="$1"
    local args="$2"
    local expected_exit="$3"
    local test_name="$4"
    echo "Testing: $test_name" >> "$TEST_RESULTS"
    echo "Command: $plugin $args" >> "$TEST_RESULTS"
    # $args is intentionally unquoted so it splits into separate arguments
    output=$(sudo -u nagios "$plugin" $args 2>&1)
    exit_code=$?
    echo "Output: $output" >> "$TEST_RESULTS"
    echo "Exit Code: $exit_code" >> "$TEST_RESULTS"
    if [ "$exit_code" = "$expected_exit" ]; then
        echo "Result: PASS" >> "$TEST_RESULTS"
        echo "✓ $test_name"
    else
        echo "Result: FAIL (expected $expected_exit, got $exit_code)" >> "$TEST_RESULTS"
        echo "✗ $test_name"
    fi
    echo "" >> "$TEST_RESULTS"
}
# Test disk plugin
# The "Warning" case only passes when / is between 10% and 19% full; adjust -w and -c for your system
test_plugin "$PLUGIN_DIR/check_custom_disk.sh" "-w 80 -c 90 -p /" "0" "Disk Check - Normal"
test_plugin "$PLUGIN_DIR/check_custom_disk.sh" "-w 10 -c 20 -p /" "1" "Disk Check - Warning"
test_plugin "$PLUGIN_DIR/check_custom_disk.sh" "-w 90 -c 80 -p /" "3" "Disk Check - Invalid Thresholds"
# Test API plugin
test_plugin "$PLUGIN_DIR/check_api_endpoint.py" "-u https://httpbin.org/status/200 -w 2 -c 5" "0" "API Check - Normal"
test_plugin "$PLUGIN_DIR/check_api_endpoint.py" "-u https://httpbin.org/status/404 -w 2 -c 5" "2" "API Check - 404 Error"
test_plugin "$PLUGIN_DIR/check_api_endpoint.py" "-u https://httpbin.org/delay/3 -w 1 -c 2" "2" "API Check - Slow Response"
echo "Validation complete. Results saved to $TEST_RESULTS"
echo "Review with: cat $TEST_RESULTS"
sudo chmod 755 /usr/local/nagios/plugins/development/validate_plugins.sh
sudo /usr/local/nagios/plugins/development/validate_plugins.sh
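The same idea can be sketched in Python, which makes it easy to assert on the output text as well as the exit code. In this illustration `run_check` and `STATUS_NAMES` are made-up names, and a one-line Python command stands in for a real plugin so that the example runs on any machine.

```python
import subprocess
import sys

STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(command, expected_exit, timeout=30):
    """Run a plugin command and report whether it behaved as expected.

    Returns (passed, exit_code, first_line). A run passes when the exit code
    matches and the first output line starts with the matching status word,
    which is the convention the plugins in this guide follow.
    """
    result = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    lines = result.stdout.splitlines()
    first_line = lines[0] if lines else ""
    status_word = STATUS_NAMES.get(result.returncode, "")
    passed = (
        result.returncode == expected_exit
        and status_word != ""
        and first_line.startswith(status_word)
    )
    return passed, result.returncode, first_line


if __name__ == "__main__":
    # Stand-in "plugin"; replace the list with a real plugin path and arguments.
    fake_plugin = [sys.executable, "-c", "import sys; print('WARNING - demo|x=1'); sys.exit(1)"]
    print(run_check(fake_plugin, expected_exit=1))
```

Checking the first output line catches plugins that exit with a valid code but print nothing, which Nagios reports as "No output returned from plugin".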
Integration with Nagios Core
Deploy tested plugins to production directory
Move validated plugins to the production plugin directory and set final permissions.
sudo cp /usr/local/nagios/plugins/development/check_custom_disk.sh /usr/local/nagios/plugins/custom/
sudo cp /usr/local/nagios/plugins/development/check_api_endpoint.py /usr/local/nagios/plugins/custom/
sudo cp /usr/local/nagios/plugins/development/check_postgres_query.py /usr/local/nagios/plugins/custom/
sudo chown nagios:nagios /usr/local/nagios/plugins/custom/*
sudo chmod 755 /usr/local/nagios/plugins/custom/*
Define command definitions
Create Nagios command definitions for your custom plugins in the commands configuration file.
# Custom Disk Check Command
define command {
    command_name check_custom_disk
    command_line /usr/local/nagios/plugins/custom/check_custom_disk.sh -w $ARG1$ -c $ARG2$ -p $ARG3$
}
# API Endpoint Check Command
define command {
    command_name check_api_endpoint
    command_line /usr/local/nagios/plugins/custom/check_api_endpoint.py -u $ARG1$ -w $ARG2$ -c $ARG3$ -s $ARG4$
}
# API with JSON validation
define command {
    command_name check_api_json
    command_line /usr/local/nagios/plugins/custom/check_api_endpoint.py -u $ARG1$ -w $ARG2$ -c $ARG3$ -k $ARG4$ -v $ARG5$
}
# PostgreSQL Query Check Command
define command {
    command_name check_postgres_query
    command_line /usr/local/nagios/plugins/custom/check_postgres_query.py -H $ARG1$ -d $ARG2$ -u $ARG3$ -p $ARG4$ -q "$ARG5$" -w $ARG6$ -c $ARG7$
}
Configure service checks
Create service definitions that use your custom plugins for monitoring specific resources.
# Custom disk monitoring
define service {
    use generic-service
    host_name localhost
    service_description Custom Disk Usage /var
    check_command check_custom_disk!80!90!/var
    check_interval 5
    retry_interval 1
}
# API endpoint monitoring
define service {
    use generic-service
    host_name localhost
    service_description API Health Check
    check_command check_api_endpoint!https://api.example.com/health!2!5!200
    check_interval 2
    retry_interval 1
}
# Database query monitoring
define service {
    use generic-service
    host_name localhost
    service_description Database Query Performance
    check_command check_postgres_query!localhost!myapp!monitor_user!secure_password!SELECT COUNT(*) FROM users!1!3
    check_interval 5
    retry_interval 1
}
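The link between a service's `check_command` and the command's `command_line` is positional: the text before the first `!` names the command, and each following `!`-separated field fills `$ARG1$`, `$ARG2$`, and so on. The sketch below models only that substitution; `expand_command` is an illustrative function, and real Nagios additionally expands host, service, and user macros that are not modeled here.

```python
import re


def expand_command(command_line, check_command):
    """Substitute $ARGn$ macros from a 'name!arg1!arg2' check_command string.

    Simplified illustration of positional argument macros only.
    """
    _name, *arguments = check_command.split("!")

    def replace(match):
        index = int(match.group(1)) - 1
        # Arguments that were not supplied expand to an empty string in this sketch.
        return arguments[index] if 0 <= index < len(arguments) else ""

    return re.sub(r"\$ARG(\d+)\$", replace, command_line)


if __name__ == "__main__":
    line = "/usr/local/nagios/plugins/custom/check_custom_disk.sh -w $ARG1$ -c $ARG2$ -p $ARG3$"
    print(expand_command(line, "check_custom_disk!80!90!/var"))
```

Tracing a definition through this substitution by hand is a quick way to spot a `check_command` that supplies fewer arguments than its `command_line` expects.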
Validate Nagios configuration
Test the configuration to ensure your custom plugins integrate properly with Nagios Core.
sudo /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
# If validation passes, restart Nagios
sudo systemctl restart nagios
sudo systemctl status nagios
Monitor plugin execution in logs
Watch Nagios logs to verify your custom plugins execute correctly and produce expected results.
sudo tail -f /usr/local/nagios/var/nagios.log | grep -E "Custom Disk Usage|API Health Check|Database Query Performance"
Advanced plugin development patterns
Create a plugin template
Build a reusable template that includes all best practices for consistent plugin development.
#!/usr/bin/env python3
"""
Nagios Plugin Template
This template provides a standardized structure for developing Nagios plugins
with proper argument parsing, logging, and error handling.
"""
import sys
import argparse
import logging
import time
from pathlib import Path
# Nagios exit codes
OK = 0
WARNING = 1
CRITICAL = 2
UNKNOWN = 3
class NagiosPlugin:
    def __init__(self, plugin_name, version="1.0"):
        self.plugin_name = plugin_name
        self.version = version
        self.args = None
        self.logger = self._setup_logging()
    def _setup_logging(self):
        """Setup logging for debugging (optional)"""
        logger = logging.getLogger(self.plugin_name)
        # Only log to file in development
        if Path('/tmp/nagios_debug').exists():
            handler = logging.FileHandler(f'/tmp/{self.plugin_name}.log')
            formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
            handler.setFormatter(formatter)
            logger.addHandler(handler)
            logger.setLevel(logging.DEBUG)
        else:
            # Keep log records off stderr when debugging is disabled
            logger.addHandler(logging.NullHandler())
        return logger
    def add_arguments(self, parser):
        """Override this method to add plugin-specific arguments"""
        parser.add_argument('-w', '--warning', type=float, default=80.0,
                            help='Warning threshold')
        parser.add_argument('-c', '--critical', type=float, default=90.0,
                            help='Critical threshold')
    def parse_args(self):
        """Parse command line arguments"""
        parser = argparse.ArgumentParser(
            description=f'{self.plugin_name} v{self.version}',
            formatter_class=argparse.RawDescriptionHelpFormatter
        )
        # Add standard arguments
        parser.add_argument('-V', '--version', action='version',
                            version=f'{self.plugin_name} {self.version}')
        parser.add_argument('-t', '--timeout', type=int, default=10,
                            help='Plugin timeout in seconds')
        parser.add_argument('-v', '--verbose', action='store_true',
                            help='Verbose output')
        # Add plugin-specific arguments
        self.add_arguments(parser)
        self.args = parser.parse_args()
        # Validate thresholds
        if hasattr(self.args, 'warning') and hasattr(self.args, 'critical'):
            if self.args.warning >= self.args.critical:
                self.nagios_exit(UNKNOWN, "Warning threshold must be less than critical")
        return self.args
    def check_health(self):
        """Override this method with your check logic"""
        # Example implementation: a random value stands in for a real measurement
        import random
        value = random.uniform(0, 100)
        perf_data = f"value={value:.2f}%;{self.args.warning};{self.args.critical};0;100"
        if value >= self.args.critical:
            return CRITICAL, f"Value is {value:.2f}%", perf_data
        elif value >= self.args.warning:
            return WARNING, f"Value is {value:.2f}%", perf_data
        else:
            return OK, f"Value is {value:.2f}%", perf_data
    def nagios_exit(self, status, message, perf_data=None):
        """Exit with proper Nagios format"""
        status_text = {OK: 'OK', WARNING: 'WARNING', CRITICAL: 'CRITICAL', UNKNOWN: 'UNKNOWN'}
        output = f"{status_text[status]} - {message}"
        if perf_data:
            output += f"|{perf_data}"
        print(output)
        sys.exit(status)
    def run(self):
        """Main execution method"""
        try:
            self.parse_args()
            self.logger.debug(f"Starting {self.plugin_name} with args: {vars(self.args)}")
            status, message, perf_data = self.check_health()
            self.nagios_exit(status, message, perf_data)
        except KeyboardInterrupt:
            self.nagios_exit(UNKNOWN, "Plugin interrupted")
        except Exception as e:
            self.logger.error(f"Unexpected error: {str(e)}")
            self.nagios_exit(UNKNOWN, f"Plugin error: {str(e)}")
# Example usage:
class CustomHealthCheck(NagiosPlugin):
    def __init__(self):
        super().__init__("Custom Health Check", "1.0")
    def add_arguments(self, parser):
        super().add_arguments(parser)
        parser.add_argument('-u', '--url', required=True,
                            help='URL to check')
    def check_health(self):
        # Your custom check logic here
        return OK, "Custom check passed", "response_time=0.123s;2;5;0"
if __name__ == "__main__":
    plugin = CustomHealthCheck()
    plugin.run()
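The template compares the measured value against plain numbers. The Monitoring Plugins development guidelines also describe a range syntax for `-w` and `-c` (`10`, `10:`, `~:10`, `10:20`, `@10:20`), which the template does not implement. The sketch below shows one way such ranges could be parsed; `parse_range` and `should_alert` are illustrative names, not part of the template or of any library.

```python
import math


def parse_range(spec):
    """Parse a threshold range such as '10', '10:', '~:10', '10:20' or '@10:20'.

    Returns (start, end, inside), where inside=True (the '@' prefix) means
    "alert when the value is inside the range, endpoints included".
    """
    inside = spec.startswith("@")
    if inside:
        spec = spec[1:]
    if ":" in spec:
        start_text, end_text = spec.split(":", 1)
    else:
        # A bare number N is shorthand for the range 0:N
        start_text, end_text = "0", spec
    if start_text == "~":
        start = -math.inf
    elif start_text == "":
        start = 0.0
    else:
        start = float(start_text)
    end = math.inf if end_text == "" else float(end_text)
    if start > end:
        raise ValueError(f"range start {start} is greater than end {end}")
    return start, end, inside


def should_alert(value, spec):
    """Return True when value violates the threshold range in spec."""
    start, end, inside = parse_range(spec)
    in_range = start <= value <= end
    return in_range if inside else not in_range


if __name__ == "__main__":
    for spec in ("10", "10:", "~:10", "10:20", "@10:20"):
        print(spec, [v for v in (-1, 5, 10, 15, 25) if should_alert(v, spec)])
```

Note that a bare `10` alerts only above 10 (or below 0), whereas the template's `>=` comparison would already alert at exactly 10.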
Create plugin configuration management
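Build a small manager script that lists, validates, tests, and deploys plugins from the development directory to the production directory on a Nagios host. Save it as /usr/local/nagios/plugins/development/plugin_manager.py.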
#!/usr/bin/env python3
"""
Nagios Plugin Manager
Manages deployment and configuration of custom plugins
"""
import os
import sys
import shutil
import json
import subprocess
from pathlib import Path
import argparse
class PluginManager:
    def __init__(self):
        self.dev_dir = Path('/usr/local/nagios/plugins/development')
        self.prod_dir = Path('/usr/local/nagios/plugins/custom')
        self.config_file = self.dev_dir / 'plugin_config.json'
    def load_config(self):
        """Load plugin configuration"""
        if self.config_file.exists():
            with open(self.config_file) as f:
                return json.load(f)
        return {}
    def save_config(self, config):
        """Save plugin configuration"""
        with open(self.config_file, 'w') as f:
            json.dump(config, f, indent=2)
    def validate_plugin(self, plugin_path):
        """Validate plugin before deployment"""
        if not plugin_path.exists():
            return False, f"Plugin {plugin_path} does not exist"
        # Check if executable
        if not os.access(plugin_path, os.X_OK):
            return False, f"Plugin {plugin_path} is not executable"
        # Try to run with --help
        try:
            result = subprocess.run([str(plugin_path), '--help'],
                                    capture_output=True, text=True, timeout=5)
            if result.returncode not in [0, 1, 2, 3]:
                return False, f"Plugin returned invalid exit code: {result.returncode}"
        except subprocess.TimeoutExpired:
            return False, "Plugin help command timed out"
        except Exception as e:
            return False, f"Error testing plugin: {str(e)}"
        return True, "Plugin validation passed"
    def deploy_plugin(self, plugin_name):
        """Deploy plugin from development to production"""
        dev_path = self.dev_dir / plugin_name
        prod_path = self.prod_dir / plugin_name
        # Validate plugin
        is_valid, message = self.validate_plugin(dev_path)
        if not is_valid:
            print(f"✗ Validation failed: {message}")
            return False
        # Copy to production
        try:
            shutil.copy2(dev_path, prod_path)
            # Set the mode while this user still owns the copy, then hand it to nagios
            os.chmod(prod_path, 0o755)
            subprocess.run(['sudo', 'chown', 'nagios:nagios', str(prod_path)], check=True)
            print(f"✓ Deployed {plugin_name} to production")
            return True
        except Exception as e:
            print(f"✗ Deployment failed: {str(e)}")
            return False
    def list_plugins(self):
        """List available plugins"""
        print("Development Plugins:")
        for plugin in self.dev_dir.glob('check_*'):
            if plugin.is_file() and os.access(plugin, os.X_OK):
                print(f" 📝 {plugin.name}")
        print("\nProduction Plugins:")
        for plugin in self.prod_dir.glob('check_*'):
            if plugin.is_file():
                print(f" ✅ {plugin.name}")
    def test_plugin(self, plugin_name, args='--help'):
        """Test plugin execution"""
        plugin_path = self.dev_dir / plugin_name
        if not plugin_path.exists():
            plugin_path = self.prod_dir / plugin_name
        if not plugin_path.exists():
            print(f"✗ Plugin {plugin_name} not found")
            return False
        try:
            cmd = ['sudo', '-u', 'nagios', str(plugin_path)] + args.split()
            result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
            print(f"Exit Code: {result.returncode}")
            print(f"Output: {result.stdout}")
            if result.stderr:
                print(f"Error: {result.stderr}")
            return True
        except Exception as e:
            print(f"✗ Test failed: {str(e)}")
            return False
def main():
    parser = argparse.ArgumentParser(description='Nagios Plugin Manager')
    parser.add_argument('action', choices=['list', 'deploy', 'test', 'validate'])
    parser.add_argument('plugin', nargs='?', help='Plugin name')
    parser.add_argument('--args', default='--help', help='Arguments for test')
    args = parser.parse_args()
    manager = PluginManager()
    if args.action == 'list':
        manager.list_plugins()
    elif args.action == 'deploy':
        if not args.plugin:
            print("Plugin name required for deploy")
            sys.exit(1)
        manager.deploy_plugin(args.plugin)
    elif args.action == 'test':
        if not args.plugin:
            print("Plugin name required for test")
            sys.exit(1)
        manager.test_plugin(args.plugin, args.args)
    elif args.action == 'validate':
        if not args.plugin:
            print("Plugin name required for validate")
            sys.exit(1)
        dev_path = manager.dev_dir / args.plugin
        is_valid, message = manager.validate_plugin(dev_path)
        print(f"Validation: {'✓' if is_valid else '✗'} {message}")
if __name__ == '__main__':
    main()
sudo chmod 755 /usr/local/nagios/plugins/development/plugin_manager.py
python3 /usr/local/nagios/plugins/development/plugin_manager.py list
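The manager defines `load_config` and `save_config` but does not yet store anything in `plugin_config.json`. One possible use, sketched below, records a SHA-256 digest for each deployed plugin so that a later run can tell whether the development copy has changed. The `{"plugins": {name: {"sha256": ...}}}` layout and the function names are assumptions made for this example; the script above does not define a schema.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def record_deployment(config_file, plugin_path):
    """Store the plugin's digest in the JSON config file and return the updated config."""
    config_file = Path(config_file)
    config = json.loads(config_file.read_text()) if config_file.exists() else {}
    plugins = config.setdefault("plugins", {})
    plugins[Path(plugin_path).name] = {"sha256": file_digest(plugin_path)}
    config_file.write_text(json.dumps(config, indent=2))
    return config


def needs_redeploy(config_file, plugin_path):
    """True when the plugin is unrecorded or its contents changed since it was recorded."""
    config_file = Path(config_file)
    if not config_file.exists():
        return True
    config = json.loads(config_file.read_text())
    recorded = config.get("plugins", {}).get(Path(plugin_path).name, {})
    return recorded.get("sha256") != file_digest(plugin_path)


if __name__ == "__main__":
    # Temporary files stand in for the development directory.
    with tempfile.TemporaryDirectory() as tmp:
        plugin = Path(tmp) / "check_demo.sh"
        plugin.write_text("#!/bin/bash\necho 'OK - demo'\n")
        config_path = Path(tmp) / "plugin_config.json"
        print(needs_redeploy(config_path, plugin))
        record_deployment(config_path, plugin)
        print(needs_redeploy(config_path, plugin))
```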
Verify your setup
# Check plugin directory structure
ls -la /usr/local/nagios/plugins/custom/
ls -la /usr/local/nagios/plugins/development/
# Test plugin execution as nagios user
sudo -u nagios /usr/local/nagios/plugins/custom/check_custom_disk.sh -w 80 -c 90 -p /
# Verify Nagios configuration includes custom commands
sudo /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
# Check plugin execution in Nagios web interface
sudo systemctl status nagios
curl -u nagiosadmin:password http://localhost/nagios/
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| Plugin returns "Permission denied" | Wrong file permissions or ownership | sudo chmod 755 plugin_file && sudo chown nagios:nagios plugin_file |
| "Command not found" in Nagios | Plugin path incorrect in command definition | Verify path in /usr/local/nagios/etc/objects/commands.cfg |
| Plugin works manually but fails in Nagios | Environment variables missing for nagios user | Add export PATH=/usr/bin:/bin at top of shell scripts |
| "No output returned from plugin" | Plugin doesn't print to stdout or exits without message | Ensure plugin always prints status message before exit |
| Performance data not appearing in graphs | Incorrect performance data format | Follow format: label=value[UOM];[warn];[crit];[min];[max] |
| Python modules not found | Modules installed for different user than nagios | sudo -u nagios pip3 install module_name |
Next steps
- Set up Nagios distributed monitoring with NRPE to use custom plugins on remote hosts
- Configure Nagios SSL certificates and security hardening to secure your monitoring infrastructure
- Configure Grafana custom plugins for specialized monitoring requirements for advanced visualization
- Implement Nagios plugin performance optimization and caching for high-frequency checks
- Configure Nagios plugin automated testing with CI/CD integration for production-ready workflows