Infrastructure tutorials

Production-grade guides for Linux, servers, security, and performance. Copy-paste commands, multi-distro support, written by engineers who run these systems in production.

monitoring Advanced

Implement OpenTelemetry distributed context propagation across microservices with automatic instrumentation

Set up comprehensive distributed tracing across microservices using OpenTelemetry with automatic context propagation, trace correlation headers, and framework-specific auto-instrumentation for Python, Java, and Node.js applications.

45 min 4 distros 304 views
monitoring Advanced

Configure Thanos Ruler for distributed alerting across multiple Prometheus clusters

Set up Thanos Ruler to create a unified alerting layer across distributed Prometheus instances. This tutorial covers installation, global rule configuration, and cross-cluster alert federation for enterprise monitoring.

45 min 4 distros 247 views
security Advanced

Configure ClickHouse users and RBAC for production environments with authentication and access control

Secure your ClickHouse deployment with user authentication, role-based access control, and production-grade security policies. Learn to create users, manage roles, configure quotas, and monitor access patterns for enterprise environments.

45 min 4 distros 221 views
monitoring Intermediate

Integrate Nagios Core 4.5 with Grafana dashboards for advanced monitoring visualization

Connect Nagios Core 4.5 to Grafana through NDOUtils and MySQL to build monitoring dashboards. The integration adds richer visualization, real-time alerting, and a consolidated view of your infrastructure's monitoring data.

45 min 4 distros 255 views
devops Advanced

Implement Apache Spark 3.5 cluster with YARN and HDFS for distributed computing

Set up a production-grade Apache Spark 3.5 cluster with YARN resource management and HDFS distributed storage for scalable big data processing. This tutorial covers multi-node Hadoop cluster configuration, YARN integration, and monitoring setup.

45 min 4 distros 198 views
devops Advanced

Set up Istio multi-cluster service mesh with cross-cluster communication

Deploy and configure Istio across multiple Kubernetes clusters with secure cross-cluster communication, shared service discovery, and unified traffic management for a distributed microservices architecture.

45 min 4 distros 232 views
monitoring Advanced

Set up Thanos Receiver for remote write scalability with Prometheus integration

Configure Thanos Receiver to handle high-volume remote write traffic from multiple Prometheus instances. This tutorial covers installation, multi-tenancy setup, and performance optimization for large-scale metrics ingestion.

45 min 4 distros 304 views
security Intermediate

Implement HAProxy rate limiting and DDoS protection with advanced security rules

Configure HAProxy with comprehensive rate limiting, connection throttling, and DDoS protection using stick tables, ACLs, and advanced security rules to protect your applications from malicious traffic and ensure service availability.

45 min 4 distros 261 views
monitoring Intermediate

Monitor Consul with Prometheus and Grafana for service discovery observability

Set up comprehensive monitoring for HashiCorp Consul using Prometheus metrics collection and Grafana dashboards. Configure telemetry export, alerting rules, and visualization for service discovery health and performance.

35 min 4 distros 273 views
security Advanced

Implement Jaeger security with TLS encryption and authentication for distributed tracing

Secure your Jaeger distributed tracing infrastructure with TLS encryption, JWT-based authentication, and RBAC policies. This tutorial covers certificate generation, collector/query service encryption, and UI authentication through reverse proxy integration.

45 min 4 distros 275 views
monitoring Intermediate

Configure OpenTelemetry sampling strategies for high-traffic applications

Learn how to implement probabilistic, deterministic, and adaptive sampling strategies in OpenTelemetry to optimize distributed tracing performance and reduce storage costs in high-traffic production environments.

25 min 4 distros 213 views
monitoring Advanced

Configure Prometheus long-term storage with Thanos for unlimited data retention

Deploy Thanos components alongside Prometheus to retain metrics indefinitely in object storage. This advanced setup lets you query years of historical metrics while maintaining high availability and reducing local storage costs.

45 min 4 distros 267 views
