Configure OpenTelemetry Collector with custom metrics exporters and processors, set up application instrumentation with SDKs, and integrate with Prometheus and Grafana for comprehensive distributed system monitoring and observability.
Prerequisites
- Root or sudo access
- Python 3.8+ for sample applications
- Node.js 16+ for sample applications
- At least 2GB RAM
- Ports 4317, 4318, 8888, 8889, 9090, 3000, 5000, 13133, and 1777 available
What this solves
OpenTelemetry provides a unified way to collect, process, and export telemetry data from your applications and infrastructure. This tutorial shows you how to set up custom instrumentation and metrics collection with Prometheus integration, enabling comprehensive monitoring of distributed systems with standardized telemetry data.
Step-by-step installation
Update system packages
Start by updating your package manager to ensure you get the latest versions of dependencies.
sudo apt update && sudo apt upgrade -y
sudo apt install -y wget curl unzip
Download and install OpenTelemetry Collector
Download the OpenTelemetry Collector binary from the official releases and install it in a standard location.
OTEL_VERSION="0.91.0"
wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v${OTEL_VERSION}/otelcol_${OTEL_VERSION}_linux_amd64.tar.gz
tar -xzf otelcol_${OTEL_VERSION}_linux_amd64.tar.gz
sudo mv otelcol /usr/local/bin/
sudo chmod +x /usr/local/bin/otelcol
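The downloaded tarball can be checked against its published SHA-256 digest before the binary is trusted. The sketch below assumes you have copied the expected digest from the checksum file on the release page; `sha256_of` and `verify_download` are illustrative helper names, not part of any OpenTelemetry tooling.

```python
import hashlib
import hmac


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with the expected one, ignoring case and whitespace."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

Calling `verify_download("otelcol_0.91.0_linux_amd64.tar.gz", expected)` returns `True` only when the digests match, so a script can stop before installing a corrupted or tampered archive.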
Create OpenTelemetry user and directories
Create a dedicated user and directory structure for OpenTelemetry Collector with proper permissions.
sudo useradd --system --no-create-home --shell /bin/false otelcol
sudo mkdir -p /etc/otelcol /var/log/otelcol /var/lib/otelcol
sudo chown otelcol:otelcol /var/log/otelcol /var/lib/otelcol
sudo chmod 755 /etc/otelcol
sudo chmod 750 /var/log/otelcol /var/lib/otelcol
Configure OpenTelemetry Collector
Create the main configuration file at /etc/otelcol/config.yaml with receivers, processors, exporters, and service pipeline definitions.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  prometheus:
    config:
      scrape_configs:
        - job_name: 'otel-collector'
          static_configs:
            - targets: ['localhost:8888']
  hostmetrics:
    collection_interval: 30s
    scrapers:
      cpu: {}
      disk: {}
      filesystem: {}
      memory: {}
      network: {}
      process: {}
processors:
  batch:
    timeout: 1s
    send_batch_size: 1024
  memory_limiter:
    # check_interval is required; the memory limiter rejects a zero interval
    check_interval: 1s
    limit_mib: 512
  resource:
    attributes:
      - key: environment
        value: production
        action: upsert
      - key: service.instance.id
        from_attribute: host.name
        action: insert
exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
    namespace: "otel"
    const_labels:
      environment: "production"
  otlp/jaeger:
    # Assumes a Jaeger instance that accepts OTLP over gRPC on this port;
    # this guide does not install Jaeger
    endpoint: http://localhost:14250
    tls:
      insecure: true
  logging:
    # verbosity replaces the deprecated loglevel setting
    verbosity: normal
service:
  extensions: [health_check, pprof]
  pipelines:
    metrics:
      receivers: [otlp, prometheus, hostmetrics]
      processors: [memory_limiter, resource, batch]
      exporters: [prometheus, logging]
    traces:
      receivers: [otlp]
      processors: [memory_limiter, resource, batch]
      exporters: [otlp/jaeger, logging]
  telemetry:
    logs:
      level: info
    metrics:
      address: 0.0.0.0:8888
extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  pprof:
    endpoint: 0.0.0.0:1777
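A pipeline can only reference components that are defined in the top-level `receivers`, `processors`, and `exporters` sections, and `service.extensions` must match entries under `extensions`. The sketch below checks that rule on a configuration already loaded into a Python dict (for example with a YAML parser). `find_undefined_components` is a hypothetical helper, and the collector's own validation remains the authority.

```python
from typing import Dict, List


def find_undefined_components(config: Dict) -> List[str]:
    """List pipeline and extension references that have no top-level definition."""
    problems = []
    service = config.get("service", {})
    for pipeline_name, pipeline in service.get("pipelines", {}).items():
        for kind in ("receivers", "processors", "exporters"):
            defined = config.get(kind) or {}
            for component in pipeline.get(kind, []):
                if component not in defined:
                    problems.append(
                        f"{pipeline_name}: {kind[:-1]} '{component}' is not defined"
                    )
    defined_extensions = config.get("extensions") or {}
    for extension in service.get("extensions", []):
        if extension not in defined_extensions:
            problems.append(f"service: extension '{extension}' is not defined")
    return problems


# A deliberately incomplete config: 'logging' and 'pprof' are referenced but never defined
config = {
    "receivers": {"otlp": {}, "hostmetrics": {}},
    "processors": {"memory_limiter": {}, "batch": {}},
    "exporters": {"prometheus": {}},
    "extensions": {"health_check": {}},
    "service": {
        "extensions": ["health_check", "pprof"],
        "pipelines": {
            "metrics": {
                "receivers": ["otlp", "hostmetrics"],
                "processors": ["memory_limiter", "batch"],
                "exporters": ["prometheus", "logging"],
            }
        },
    },
}

for problem in find_undefined_components(config):
    print(problem)
```

A check like this catches the most common editing mistake, a component added to a pipeline list without a matching definition, before the service is restarted.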
Create systemd service
Set up a systemd service at /etc/systemd/system/otelcol.service to manage the OpenTelemetry Collector with proper restart policies and security settings.
[Unit]
Description=OpenTelemetry Collector
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=otelcol
Group=otelcol
ExecStart=/usr/local/bin/otelcol --config=/etc/otelcol/config.yaml
ExecReload=/bin/kill -HUP $MAINPID
Restart=always
RestartSec=5
StandardOutput=journal
StandardError=journal
SyslogIdentifier=otelcol
KillMode=mixed
KillSignal=SIGTERM
TimeoutStopSec=30
# Security settings
NoNewPrivileges=yes
ProtectSystem=strict
ProtectHome=yes
ReadWritePaths=/var/lib/otelcol /var/log/otelcol
ProtectKernelTunables=yes
ProtectKernelModules=yes
ProtectControlGroups=yes
[Install]
WantedBy=multi-user.target
Install Prometheus for metrics storage
Install Prometheus to scrape and store metrics from the OpenTelemetry Collector.
sudo apt install -y prometheus
sudo systemctl enable prometheus
Configure Prometheus to scrape OpenTelemetry metrics
Add the OpenTelemetry Collector as a scrape target in the Prometheus configuration at /etc/prometheus/prometheus.yml.
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'otel-collector-metrics'
    static_configs:
      - targets: ['localhost:8889']
    scrape_interval: 30s
    metrics_path: /metrics
  - job_name: 'otel-collector-internal'
    static_configs:
      - targets: ['localhost:8888']
    scrape_interval: 30s
    metrics_path: /metrics
Set up Python application instrumentation
Install OpenTelemetry Python SDK and create a sample application with custom metrics.
pip3 install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp opentelemetry-instrumentation-requests opentelemetry-instrumentation-flask
Create instrumented Python application
Create a sample Flask application at /opt/sample-app.py with OpenTelemetry instrumentation and custom metrics.
from flask import Flask
import time
import random
from opentelemetry import trace, metrics
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
from opentelemetry.instrumentation.flask import FlaskInstrumentor
from opentelemetry.instrumentation.requests import RequestsInstrumentor

# Initialize tracing
trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer(__name__)

# Initialize metrics: export to the collector's OTLP gRPC endpoint every 30 seconds
metric_reader = PeriodicExportingMetricReader(
    exporter=OTLPMetricExporter(endpoint="http://localhost:4317", insecure=True),
    export_interval_millis=30000
)
metrics.set_meter_provider(MeterProvider(metric_readers=[metric_reader]))
meter = metrics.get_meter(__name__)

# Create custom metrics
request_counter = meter.create_counter(
    "http_requests_total",
    description="Total number of HTTP requests",
    unit="1"
)
response_time_histogram = meter.create_histogram(
    "http_request_duration_seconds",
    description="HTTP request duration in seconds",
    unit="s"
)
active_connections = meter.create_up_down_counter(
    "active_connections",
    description="Number of active connections",
    unit="1"
)

# Configure OTLP exporter for traces
otlp_exporter = OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True)
span_processor = BatchSpanProcessor(otlp_exporter)
trace.get_tracer_provider().add_span_processor(span_processor)

app = Flask(__name__)
FlaskInstrumentor().instrument_app(app)
RequestsInstrumentor().instrument()


@app.route('/api/users')
def get_users():
    start_time = time.time()
    # Count the request as in flight; the matching decrement is in the finally block
    active_connections.add(1)
    try:
        with tracer.start_as_current_span("get_users") as span:
            span.set_attribute("operation", "fetch_users")
            span.set_attribute("user.count", 100)
            # Simulate work
            processing_time = random.uniform(0.1, 0.5)
            time.sleep(processing_time)
            # Record metrics
            request_counter.add(1, {"method": "GET", "endpoint": "/api/users"})
            response_time_histogram.record(time.time() - start_time, {"method": "GET", "endpoint": "/api/users"})
            return {"users": ["user1", "user2", "user3"]}
    finally:
        active_connections.add(-1)


@app.route('/api/health')
def health_check():
    request_counter.add(1, {"method": "GET", "endpoint": "/api/health"})
    return {"status": "healthy"}


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000, debug=False)
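An up-down counter such as `active_connections` reflects a live value only if every `add(1)` is eventually matched by an `add(-1)`. One way to guarantee the pairing is a context manager. The sketch below uses a stand-in `FakeUpDownCounter` (hypothetical, so the example runs without the SDK or a collector) that exposes the same `add(amount, attributes)` shape as the SDK instrument; `track_in_flight` is likewise an illustrative name.

```python
from contextlib import contextmanager


class FakeUpDownCounter:
    """Stand-in for an OpenTelemetry up-down counter; it only keeps a running sum."""

    def __init__(self):
        self.value = 0

    def add(self, amount, attributes=None):
        self.value += amount


@contextmanager
def track_in_flight(counter, attributes=None):
    """Increment on entry and always decrement on exit, even if the body raises."""
    counter.add(1, attributes)
    try:
        yield
    finally:
        counter.add(-1, attributes)


active = FakeUpDownCounter()
with track_in_flight(active, {"endpoint": "/api/users"}):
    in_flight_during = active.value  # 1 while the request is being handled
in_flight_after = active.value  # back to 0 once the block exits
```

In a Flask handler the same helper would wrap the request body, with the real instrument passed in place of the stand-in.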
Create Node.js application instrumentation
Install OpenTelemetry Node.js SDK and create a sample Express application with custom metrics.
mkdir -p /opt/nodejs-app
cd /opt/nodejs-app
npm init -y
npm install express @opentelemetry/api @opentelemetry/sdk-node @opentelemetry/sdk-metrics @opentelemetry/auto-instrumentations-node @opentelemetry/exporter-trace-otlp-grpc @opentelemetry/exporter-metrics-otlp-grpc
Create instrumented Node.js application
Create a sample Express application at /opt/nodejs-app/app.js with OpenTelemetry instrumentation and custom business metrics.
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-grpc');
const { OTLPMetricExporter } = require('@opentelemetry/exporter-metrics-otlp-grpc');
const { PeriodicExportingMetricReader } = require('@opentelemetry/sdk-metrics');
const { metrics, trace, SpanStatusCode } = require('@opentelemetry/api');

// Initialize OpenTelemetry before loading Express so auto-instrumentation can patch it
const sdk = new NodeSDK({
  traceExporter: new OTLPTraceExporter({
    url: 'http://localhost:4317',
  }),
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({
      url: 'http://localhost:4317',
    }),
    exportIntervalMillis: 30000,
  }),
  instrumentations: [getNodeAutoInstrumentations()],
});
sdk.start();

const express = require('express');
const app = express();

// Get meter and tracer
const meter = metrics.getMeter('nodejs-app', '1.0.0');
const tracer = trace.getTracer('nodejs-app', '1.0.0');

// Create custom metrics
const orderCounter = meter.createCounter('orders_total', {
  description: 'Total number of orders processed',
});
const orderValueHistogram = meter.createHistogram('order_value_dollars', {
  description: 'Order value distribution in dollars',
});
const inventoryGauge = meter.createUpDownCounter('inventory_items', {
  description: 'Current inventory levels',
});

app.use(express.json());

app.get('/api/orders', (req, res) => {
  const span = tracer.startSpan('get_orders');
  span.setAttributes({
    'operation': 'fetch_orders',
    'user.id': req.query.user_id || 'anonymous'
  });
  try {
    // Simulate fetching orders
    const orders = [
      { id: 1, value: 29.99, status: 'completed' },
      { id: 2, value: 149.50, status: 'pending' }
    ];
    // Record metrics
    orderCounter.add(orders.length, {
      status: 'success',
      endpoint: '/api/orders'
    });
    orders.forEach(order => {
      orderValueHistogram.record(order.value, {
        status: order.status
      });
    });
    span.setStatus({ code: SpanStatusCode.OK });
    res.json({ orders });
  } catch (error) {
    span.recordException(error);
    span.setStatus({
      code: SpanStatusCode.ERROR,
      message: error.message
    });
    res.status(500).json({ error: 'Internal server error' });
  } finally {
    span.end();
  }
});

app.post('/api/orders', (req, res) => {
  const span = tracer.startSpan('create_order');
  span.setAttributes({
    'operation': 'create_order',
    'order.value': req.body.value
  });
  try {
    const order = {
      id: Math.floor(Math.random() * 10000),
      value: req.body.value || 0,
      status: 'created'
    };
    // Record metrics
    orderCounter.add(1, {
      status: 'created',
      endpoint: '/api/orders'
    });
    orderValueHistogram.record(order.value, {
      status: order.status
    });
    inventoryGauge.add(-1, {
      item: req.body.item || 'unknown'
    });
    span.setStatus({ code: SpanStatusCode.OK });
    res.status(201).json({ order });
  } catch (error) {
    span.recordException(error);
    span.setStatus({
      code: SpanStatusCode.ERROR,
      message: error.message
    });
    res.status(500).json({ error: 'Internal server error' });
  } finally {
    span.end();
  }
});

app.get('/health', (req, res) => {
  res.json({ status: 'healthy', timestamp: new Date().toISOString() });
});

const port = process.env.PORT || 3000;
app.listen(port, () => {
  console.log(`Server running on port ${port}`);
});
Start all services
Enable and start OpenTelemetry Collector, Prometheus, and verify they are running correctly.
sudo systemctl daemon-reload
sudo systemctl enable --now otelcol
sudo systemctl restart prometheus
sudo systemctl status otelcol prometheus
Configure firewall rules
Open necessary ports for OpenTelemetry Collector, Prometheus, and application access.
sudo ufw allow 4317/tcp comment 'OpenTelemetry OTLP gRPC'
sudo ufw allow 4318/tcp comment 'OpenTelemetry OTLP HTTP'
sudo ufw allow 8889/tcp comment 'OpenTelemetry Prometheus metrics'
sudo ufw allow 9090/tcp comment 'Prometheus web UI'
sudo ufw reload
Install and configure Grafana
Install Grafana for visualizing metrics collected by Prometheus from OpenTelemetry.
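sudo mkdir -p /etc/apt/keyrings
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee /etc/apt/sources.list.d/grafana.list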
sudo apt update
sudo apt install -y grafana
sudo systemctl enable --now grafana-server
Configure Prometheus data source in Grafana
Add Prometheus data source
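Provision Prometheus as a Grafana data source for OpenTelemetry metrics visualization. Save the following to a file under /etc/grafana/provisioning/datasources/ (for example, prometheus.yaml) and restart grafana-server to load it.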
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
    editable: true
    jsonData:
      httpMethod: POST
      exemplarTraceIdDestinations:
        - name: trace_id
          datasourceUid: jaeger
          urlDisplayLabel: "View in Jaeger"
Create OpenTelemetry dashboard
Create a custom Grafana dashboard for monitoring OpenTelemetry metrics and application performance.
{
"dashboard": {
"id": null,
"title": "OpenTelemetry Application Metrics",
"tags": ["opentelemetry", "monitoring"],
"timezone": "browser",
"panels": [
{
"id": 1,
"title": "HTTP Requests Total",
"type": "stat",
"targets": [
{
"expr": "sum(rate(otel_http_requests_total[5m]))",
"legendFormat": "Requests/sec"
}
],
"fieldConfig": {
"defaults": {
"unit": "reqps"
}
},
"gridPos": {"h": 8, "w": 12, "x": 0, "y": 0}
},
{
"id": 2,
"title": "Request Duration",
"type": "graph",
"targets": [
{
"expr": "histogram_quantile(0.95, rate(otel_http_request_duration_seconds_bucket[5m]))",
"legendFormat": "95th percentile"
},
{
"expr": "histogram_quantile(0.50, rate(otel_http_request_duration_seconds_bucket[5m]))",
"legendFormat": "50th percentile"
}
],
"yAxes": [
{
"unit": "s"
}
],
"gridPos": {"h": 8, "w": 12, "x": 12, "y": 0}
},
{
"id": 3,
"title": "Order Metrics",
"type": "graph",
"targets": [
{
"expr": "rate(otel_orders_total[5m])",
"legendFormat": "Orders/sec"
}
],
"gridPos": {"h": 8, "w": 24, "x": 0, "y": 8}
}
],
"time": {
"from": "now-1h",
"to": "now"
},
"refresh": "5s"
}
}
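The Request Duration panel relies on `histogram_quantile`, which estimates a quantile from cumulative bucket counts by interpolating linearly inside the bucket that contains the target rank. The following is a simplified sketch of that idea, not Prometheus's implementation. It assumes buckets arrive as `(upper_bound, cumulative_count)` pairs sorted by bound, with at least one finite bucket followed by `+Inf`, and it ignores edge cases such as negative bounds. `estimate_quantile` is an illustrative name.

```python
import math
from typing import List, Tuple


def estimate_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate the q-quantile from sorted (upper_bound, cumulative_count) buckets."""
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the target rank
    for index, (upper, cumulative) in enumerate(buckets):
        if cumulative >= rank:
            break
    if math.isinf(upper):
        # The rank falls in the +Inf bucket: report the highest finite bound
        return buckets[-2][0]
    lower = buckets[index - 1][0] if index > 0 else 0.0
    below = buckets[index - 1][1] if index > 0 else 0.0
    in_bucket = cumulative - below
    if in_bucket == 0:
        return lower
    # Assume observations are spread evenly between the bucket's bounds
    return lower + (upper - lower) * (rank - below) / in_bucket


# Made-up bucket counts for request durations in seconds
buckets = [(0.1, 10), (0.25, 60), (0.5, 95), (1.0, 100), (math.inf, 100)]
p50 = estimate_quantile(0.50, buckets)
p95 = estimate_quantile(0.95, buckets)
```

Because the result is interpolated, it can never be more precise than the bucket boundaries, so latency targets are easier to read off the dashboard when a bucket bound sits near each target.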
Test custom instrumentation
Start sample applications
Run the Python and Node.js applications to generate telemetry data for testing.
# Start Python app in background
python3 /opt/sample-app.py &
# Start Node.js app in background (Grafana already listens on 3000, so run the app on 3001)
cd /opt/nodejs-app
PORT=3001 node app.js &
# Generate test traffic
curl http://localhost:5000/api/users
curl http://localhost:3001/api/orders
curl -X POST http://localhost:3001/api/orders -H "Content-Type: application/json" -d '{"value": 99.99, "item": "laptop"}'
Verify your setup
# Check OpenTelemetry Collector status
sudo systemctl status otelcol
# Verify collector is receiving metrics
curl http://localhost:8888/metrics
# Check Prometheus metrics endpoint
curl http://localhost:8889/metrics
# Verify Prometheus is scraping targets
curl http://localhost:9090/api/v1/targets
# Check Grafana is running
sudo systemctl status grafana-server
# View collector logs
sudo journalctl -u otelcol -f
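The JSON returned by the Prometheus targets endpoint is easier to act on when reduced to the targets that are failing. A minimal sketch, assuming the response has already been fetched as a string (for example with the `curl` command above); `unhealthy_targets` is a hypothetical helper, and the sample payload is hand-written in the shape of the API response rather than captured output.

```python
import json
from typing import Dict, List


def unhealthy_targets(targets_response: str) -> List[Dict[str, str]]:
    """Return job, scrape URL, and last error for each active target that is not 'up'."""
    payload = json.loads(targets_response)
    if payload.get("status") != "success":
        raise ValueError("Prometheus API call did not succeed")
    failing = []
    for target in payload["data"]["activeTargets"]:
        if target.get("health") != "up":
            failing.append({
                "job": target.get("labels", {}).get("job", ""),
                "scrapeUrl": target.get("scrapeUrl", ""),
                "lastError": target.get("lastError", ""),
            })
    return failing


# Hand-written sample: one healthy target and one that cannot be reached
sample = json.dumps({
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "prometheus"},
             "scrapeUrl": "http://localhost:9090/metrics",
             "health": "up", "lastError": ""},
            {"labels": {"job": "otel-collector-metrics"},
             "scrapeUrl": "http://localhost:8889/metrics",
             "health": "down", "lastError": "connection refused"},
        ]
    },
})

for target in unhealthy_targets(sample):
    print(f"{target['job']} ({target['scrapeUrl']}): {target['lastError']}")
```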
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| Collector fails to start | Invalid YAML configuration | otelcol validate --config=/etc/otelcol/config.yaml |
| No metrics in Prometheus | Firewall blocking port 8889 | Check firewall rules and collector endpoint |
| Application spans not appearing | OTLP exporter connection failure | Verify port 4317/4318 accessibility |
| High memory usage | Memory limiter not configured | Adjust memory_limiter processor settings |
| Permission denied on log directory | Incorrect ownership | sudo chown otelcol:otelcol /var/log/otelcol |
| Grafana dashboard shows no data | Prometheus data source misconfigured | Check data source URL and connectivity |
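Several of the issues above come down to whether a port is reachable. The sketch below probes the ports used in this guide with plain TCP connections; `check_ports` is an illustrative helper, and a successful connection only shows that something is listening, not that it is the expected service.

```python
import socket
from typing import Dict, Iterable


def check_ports(host: str, ports: Iterable[int], timeout: float = 1.0) -> Dict[int, bool]:
    """Map each port to True if a TCP connection to host:port succeeds."""
    results = {}
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise
            results[port] = sock.connect_ex((host, port)) == 0
    return results


if __name__ == "__main__":
    # OTLP gRPC/HTTP, collector internal and exported metrics, Prometheus, Grafana
    for port, is_open in check_ports("127.0.0.1", [4317, 4318, 8888, 8889, 9090, 3000]).items():
        print(f"{port}: {'open' if is_open else 'closed'}")
```

Running it from another machine with the server's address in place of `127.0.0.1` helps separate firewall problems from services that are not listening at all.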
Next steps
- Set up OpenTelemetry metrics collection with Prometheus integration for distributed system monitoring
- Configure advanced Grafana dashboards and alerting with Prometheus integration
- Set up distributed tracing for Node.js and Python microservices with OpenTelemetry and Jaeger
- Integrate OpenTelemetry with ELK stack for unified observability and distributed tracing
- Configure OpenTelemetry sampling strategies for high-traffic applications
Automated install script
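Run this to automate the OpenTelemetry Collector and Prometheus setup. Grafana and the sample applications are not installed by the script.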
#!/usr/bin/env bash
set -euo pipefail
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Default configuration
OTEL_VERSION="0.91.0"
PROMETHEUS_VERSION="2.47.2"
# Usage function
usage() {
echo "Usage: $0 [OPTIONS]"
echo "Options:"
echo " --otel-version VERSION OpenTelemetry Collector version (default: $OTEL_VERSION)"
echo " --prometheus-version VER Prometheus version (default: $PROMETHEUS_VERSION)"
echo " -h, --help Show this help message"
exit 1
}
# Parse command line arguments
while [[ $# -gt 0 ]]; do
case $1 in
--otel-version)
OTEL_VERSION="$2"
shift 2
;;
--prometheus-version)
PROMETHEUS_VERSION="$2"
shift 2
;;
-h|--help)
usage
;;
*)
echo -e "${RED}Unknown option: $1${NC}"
usage
;;
esac
done
# Cleanup function for rollback
cleanup() {
echo -e "${YELLOW}Cleaning up on failure...${NC}"
systemctl stop otelcol prometheus 2>/dev/null || true
systemctl disable otelcol prometheus 2>/dev/null || true
rm -f /etc/systemd/system/otelcol.service /etc/systemd/system/prometheus.service
rm -f /usr/local/bin/otelcol /usr/local/bin/prometheus /usr/local/bin/promtool
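# userdel accepts a single login name, so remove each account separately
userdel otelcol 2>/dev/null || true
userdel prometheus 2>/dev/null || true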
rm -rf /etc/otelcol /var/lib/otelcol /var/log/otelcol
rm -rf /etc/prometheus /var/lib/prometheus
systemctl daemon-reload
}
trap cleanup ERR
# Color echo functions
echo_info() { echo -e "${BLUE}$1${NC}"; }
echo_success() { echo -e "${GREEN}$1${NC}"; }
echo_warning() { echo -e "${YELLOW}$1${NC}"; }
echo_error() { echo -e "${RED}$1${NC}"; }
# Check prerequisites
echo_info "[1/10] Checking prerequisites..."
if [[ $EUID -ne 0 ]]; then
echo_error "This script must be run as root or with sudo"
exit 1
fi
# Detect distribution
if [ -f /etc/os-release ]; then
. /etc/os-release
case "$ID" in
ubuntu|debian)
PKG_MGR="apt"
PKG_UPDATE="apt update && apt upgrade -y"
PKG_INSTALL="apt install -y"
;;
almalinux|rocky|centos|rhel|ol|fedora)
PKG_MGR="dnf"
PKG_UPDATE="dnf update -y"
PKG_INSTALL="dnf install -y"
;;
amzn)
PKG_MGR="yum"
PKG_UPDATE="yum update -y"
PKG_INSTALL="yum install -y"
;;
*)
echo_error "Unsupported distribution: $ID"
exit 1
;;
esac
else
echo_error "Cannot detect distribution - /etc/os-release not found"
exit 1
fi
echo_success "Detected distribution: $ID using $PKG_MGR"
# Update system packages
echo_info "[2/10] Updating system packages..."
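# eval is needed because the apt variant contains "&&", which plain expansion would pass as an argument
eval "$PKG_UPDATE"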
$PKG_INSTALL wget curl tar gzip systemd
# Download and install OpenTelemetry Collector
echo_info "[3/10] Installing OpenTelemetry Collector..."
cd /tmp
wget "https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v${OTEL_VERSION}/otelcol_${OTEL_VERSION}_linux_amd64.tar.gz"
tar -xzf "otelcol_${OTEL_VERSION}_linux_amd64.tar.gz"
mv otelcol /usr/local/bin/
chmod 755 /usr/local/bin/otelcol
rm -f "otelcol_${OTEL_VERSION}_linux_amd64.tar.gz"
# Create OpenTelemetry user and directories
echo_info "[4/10] Creating OpenTelemetry user and directories..."
useradd --system --no-create-home --shell /bin/false otelcol || true
mkdir -p /etc/otelcol /var/log/otelcol /var/lib/otelcol
chown otelcol:otelcol /var/log/otelcol /var/lib/otelcol
chmod 755 /etc/otelcol
chmod 750 /var/log/otelcol /var/lib/otelcol
# Configure OpenTelemetry Collector
echo_info "[5/10] Configuring OpenTelemetry Collector..."
cat > /etc/otelcol/config.yaml << 'EOF'
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  prometheus:
    config:
      scrape_configs:
        - job_name: 'otel-collector'
          static_configs:
            - targets: ['localhost:8888']
  hostmetrics:
    collection_interval: 30s
    scrapers:
      cpu: {}
      disk: {}
      filesystem: {}
      memory: {}
      network: {}
      process: {}
processors:
  batch:
    timeout: 1s
    send_batch_size: 1024
  memory_limiter:
    # check_interval is required; the memory limiter rejects a zero interval
    check_interval: 1s
    limit_mib: 512
  resource:
    attributes:
      - key: environment
        value: production
        action: upsert
      - key: service.instance.id
        from_attribute: host.name
        action: insert
exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
    namespace: "otel"
    const_labels:
      environment: "production"
  logging:
    # verbosity replaces the deprecated loglevel setting
    verbosity: normal
service:
  extensions: [health_check, pprof]
  pipelines:
    metrics:
      receivers: [otlp, prometheus, hostmetrics]
      processors: [memory_limiter, resource, batch]
      exporters: [prometheus, logging]
  telemetry:
    logs:
      level: info
    metrics:
      address: 0.0.0.0:8888
extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  pprof:
    endpoint: 0.0.0.0:1777
EOF
chmod 644 /etc/otelcol/config.yaml
chown otelcol:otelcol /etc/otelcol/config.yaml
# Create OpenTelemetry systemd service
echo_info "[6/10] Creating OpenTelemetry systemd service..."
cat > /etc/systemd/system/otelcol.service << 'EOF'
[Unit]
Description=OpenTelemetry Collector
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=otelcol
Group=otelcol
ExecStart=/usr/local/bin/otelcol --config=/etc/otelcol/config.yaml
ExecReload=/bin/kill -HUP $MAINPID
Restart=always
RestartSec=5
StandardOutput=journal
StandardError=journal
SyslogIdentifier=otelcol
KillMode=mixed
KillSignal=SIGTERM
TimeoutStopSec=30
# Security settings
NoNewPrivileges=yes
ProtectSystem=strict
ProtectHome=yes
ReadWritePaths=/var/lib/otelcol /var/log/otelcol
ProtectKernelTunables=yes
ProtectKernelModules=yes
ProtectControlGroups=yes
[Install]
WantedBy=multi-user.target
EOF
# Download and install Prometheus
echo_info "[7/10] Installing Prometheus..."
cd /tmp
wget "https://github.com/prometheus/prometheus/releases/download/v${PROMETHEUS_VERSION}/prometheus-${PROMETHEUS_VERSION}.linux-amd64.tar.gz"
tar -xzf "prometheus-${PROMETHEUS_VERSION}.linux-amd64.tar.gz"
mv "prometheus-${PROMETHEUS_VERSION}.linux-amd64/prometheus" /usr/local/bin/
mv "prometheus-${PROMETHEUS_VERSION}.linux-amd64/promtool" /usr/local/bin/
chmod 755 /usr/local/bin/prometheus /usr/local/bin/promtool
rm -rf "prometheus-${PROMETHEUS_VERSION}.linux-amd64" "prometheus-${PROMETHEUS_VERSION}.linux-amd64.tar.gz"
# Create Prometheus user and directories
echo_info "[8/10] Creating Prometheus user and directories..."
useradd --system --no-create-home --shell /bin/false prometheus || true
mkdir -p /etc/prometheus /var/lib/prometheus
chown prometheus:prometheus /var/lib/prometheus
chmod 755 /etc/prometheus
chmod 750 /var/lib/prometheus
# Configure Prometheus
cat > /etc/prometheus/prometheus.yml << 'EOF'
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'otel-collector'
    static_configs:
      - targets: ['localhost:8889']
    scrape_interval: 10s
    metrics_path: /metrics
EOF
chmod 644 /etc/prometheus/prometheus.yml
chown prometheus:prometheus /etc/prometheus/prometheus.yml
# Create Prometheus systemd service
cat > /etc/systemd/system/prometheus.service << 'EOF'
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
--config.file /etc/prometheus/prometheus.yml \
--storage.tsdb.path /var/lib/prometheus/ \
--web.console.templates=/etc/prometheus/consoles \
--web.console.libraries=/etc/prometheus/console_libraries \
--web.listen-address=0.0.0.0:9090 \
--web.enable-lifecycle
ExecReload=/bin/kill -HUP $MAINPID
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
# Configure firewall
echo_info "[9/10] Configuring firewall..."
if command -v firewall-cmd >/dev/null 2>&1; then
firewall-cmd --permanent --add-port=4317/tcp --add-port=4318/tcp --add-port=8888/tcp --add-port=8889/tcp --add-port=9090/tcp --add-port=13133/tcp || true
firewall-cmd --reload || true
elif command -v ufw >/dev/null 2>&1; then
ufw allow 4317/tcp || true
ufw allow 4318/tcp || true
ufw allow 8888/tcp || true
ufw allow 8889/tcp || true
ufw allow 9090/tcp || true
ufw allow 13133/tcp || true
fi
# Start and enable services
echo_info "[10/10] Starting and enabling services..."
systemctl daemon-reload
systemctl enable otelcol prometheus
systemctl start otelcol
systemctl start prometheus
# Verification checks
echo_info "Performing verification checks..."
sleep 5
# Check service status
if ! systemctl is-active --quiet otelcol; then
echo_error "OpenTelemetry Collector failed to start"
journalctl -u otelcol --no-pager -l
exit 1
fi
if ! systemctl is-active --quiet prometheus; then
echo_error "Prometheus failed to start"
journalctl -u prometheus --no-pager -l
exit 1
fi
# Check if ports are listening
if ! ss -tlnp | grep -q ":4317"; then
echo_error "OpenTelemetry GRPC port 4317 not listening"
exit 1
fi
if ! ss -tlnp | grep -q ":9090"; then
echo_error "Prometheus port 9090 not listening"
exit 1
fi
echo_success "✓ OpenTelemetry Collector is running"
echo_success "✓ Prometheus is running"
echo_success "✓ All services are healthy"
echo_info "Installation completed successfully!"
echo_info "OpenTelemetry Collector endpoints:"
echo_info " - GRPC: localhost:4317"
echo_info " - HTTP: localhost:4318"
echo_info " - Metrics: localhost:8889/metrics"
echo_info " - Health: localhost:13133"
echo_info "Prometheus:"
echo_info " - Web UI: http://localhost:9090"
trap - ERR
Review the script before running. It must run as root, so execute it with: sudo bash install.sh