Configure advanced Jaeger sampling strategies for high-traffic environments

Advanced 45 min Jun 06, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Configure probabilistic, adaptive, and remote sampling strategies for Jaeger distributed tracing to optimize performance and storage costs in high-throughput production environments while maintaining observability.

Prerequisites

  • Existing Jaeger deployment
  • Prometheus for metrics collection
  • Kubernetes cluster (for Kubernetes examples)
  • Administrative access to Jaeger components

What this solves

High-traffic applications generate massive amounts of trace data that can overwhelm Jaeger collectors and storage backends. Without proper sampling strategies, you'll face storage bloat, degraded query performance, and unnecessary network overhead while still missing critical traces for debugging. Advanced sampling configurations help you balance observability needs with system performance by intelligently selecting which traces to collect and store.

Prerequisites and system setup

This tutorial assumes you have a functional Jaeger deployment. If you need to set up Jaeger first, follow our Jaeger Kubernetes deployment guide or Jaeger alerting configuration for monitoring setup.

Verify Jaeger components

Check that your Jaeger collector and agent are running correctly before configuring sampling.

kubectl get pods -n jaeger-system

For non-Kubernetes deployments:

sudo systemctl status jaeger-collector
sudo systemctl status jaeger-agent

Install sampling configuration tools

Install curl and jq for testing sampling configurations and querying Jaeger APIs.

# Ubuntu / Debian
sudo apt update
sudo apt install -y curl jq

# AlmaLinux / Rocky Linux
sudo dnf install -y curl jq

Understanding Jaeger sampling fundamentals

Sampling strategy types overview

Jaeger supports four main sampling strategies, each suited for different traffic patterns and observability requirements.

Strategy | Use Case | Traffic Impact | Storage Efficiency
Probabilistic | High uniform traffic | Low overhead | Predictable reduction
Rate Limiting | Burst protection | Variable overhead | Consistent volume
Adaptive | Variable traffic patterns | Medium overhead | Dynamic optimization
Remote | Centralized control | Medium overhead | Service-specific tuning

Check current sampling configuration

Query your Jaeger collector to see the current sampling strategy being used.

curl -s "http://jaeger-collector:14268/api/sampling?service=your-service-name" | jq .
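
The response is a small JSON document describing the strategy the collector would hand to a client. The Python sketch below shows one way to turn it into a one-line summary. The field names (probabilisticSampling, rateLimitingSampling, operationSampling and their children) are assumptions based on the JSON shape Jaeger commonly returns, so check them against your own output before relying on this.

```python
import json


def describe_strategy(response_text: str) -> str:
    """Summarise a sampling-strategy response in one line.

    Field names are assumed, not guaranteed for every Jaeger version.
    """
    body = json.loads(response_text)

    # Per-operation strategies are checked first because a response that
    # carries them may also carry a top-level probabilistic section.
    if "operationSampling" in body:
        ops = body["operationSampling"].get("perOperationStrategies", [])
        default = body["operationSampling"].get("defaultSamplingProbability")
        return f"per-operation: default probability {default}, {len(ops)} override(s)"

    if "probabilisticSampling" in body:
        rate = body["probabilisticSampling"]["samplingRate"]
        return f"probabilistic: {rate:.2%} of traces"

    if "rateLimitingSampling" in body:
        limit = body["rateLimitingSampling"]["maxTracesPerSecond"]
        return f"rate limiting: at most {limit} traces/second"

    return "unknown strategy"
```

Piping the curl output into a script that calls describe_strategy gives a quick sanity check after every configuration change.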

Configure probabilistic sampling strategies

Set up basic probabilistic sampling

Probabilistic sampling captures a fixed percentage of traces. This is ideal for high-volume services with consistent traffic patterns.

{
  "service_strategies": [
    {
      "service": "frontend-service",
      "type": "probabilistic",
      "param": 0.1
    },
    {
      "service": "payment-service",
      "type": "probabilistic",
      "param": 0.5
    }
  ],
  "default_strategy": {
    "type": "probabilistic",
    "param": 0.01
  }
}
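
To build intuition for what param means, here is a minimal Python sketch of a probabilistic decision. It assumes the sampler compares a random 64-bit trace ID against a threshold derived from param, which is the general idea behind probabilistic samplers. The function names are hypothetical and this is not Jaeger's actual implementation.

```python
import random


def sample_probabilistic(trace_id: int, param: float) -> bool:
    """Keep the trace when its 64-bit ID falls below param * 2**64."""
    return trace_id < int(param * 2**64)


def simulate(requests: int, param: float, seed: int = 42) -> int:
    """Count how many of `requests` random trace IDs would be kept."""
    rng = random.Random(seed)
    return sum(
        sample_probabilistic(rng.getrandbits(64), param)
        for _ in range(requests)
    )


if __name__ == "__main__":
    # With param 0.1, about one request in ten is kept.
    print(simulate(100_000, 0.1))
```

Because the decision depends only on the trace ID, every service that sees the same trace reaches the same verdict, so traces are kept or dropped as a whole.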

Configure operation-level probabilistic sampling

Fine-tune sampling rates for specific operations within services to capture more data from critical endpoints.

{
  "service_strategies": [
    {
      "service": "api-gateway",
      "type": "probabilistic",
      "param": 0.05,
      "operation_strategies": [
        {
          "operation": "POST /api/v1/orders",
          "type": "probabilistic",
          "param": 1.0
        },
        {
          "operation": "GET /health",
          "type": "probabilistic",
          "param": 0.001
        }
      ]
    }
  ]
}
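
The lookup order implied by this file is: operation-level entry first, then the service-level entry, then the default. The Python sketch below illustrates that precedence. It is a simplification with hypothetical names, and the fallback of probabilistic 0.001 used when no default_strategy is present is an assumption, not something taken from this configuration.

```python
# Assumed fallback when the file defines no default_strategy.
FALLBACK = {"type": "probabilistic", "param": 0.001}


def resolve_strategy(config: dict, service: str, operation: str) -> tuple:
    """Return (type, param) for a service/operation pair.

    Precedence: operation-level, then service-level, then default.
    """
    for svc in config.get("service_strategies", []):
        if svc["service"] != service:
            continue
        for op in svc.get("operation_strategies", []):
            if op["operation"] == operation:
                return op["type"], op["param"]
        return svc["type"], svc["param"]

    default = config.get("default_strategy", FALLBACK)
    return default["type"], default["param"]
```

With the configuration above, POST /api/v1/orders on api-gateway resolves to a param of 1.0, any other api-gateway operation to 0.05, and an unlisted service to the fallback.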

Implement rate limiting strategies

Configure rate limiting sampling

Rate limiting controls the absolute number of traces per second, protecting storage from traffic spikes while ensuring consistent data volume.

{
  "service_strategies": [
    {
      "service": "high-traffic-api",
      "type": "ratelimiting",
      "param": 100
    },
    {
      "service": "background-worker",
      "type": "ratelimiting",
      "param": 10
    }
  ],
  "default_strategy": {
    "type": "ratelimiting",
    "param": 50
  }
}
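
A rate-limiting sampler is typically built on a token bucket: tokens refill at param per second, and each sampled trace spends one. The Python sketch below shows the idea under that assumption. The class name is hypothetical, and the bucket capacity of max(rate, 1) is a design choice for this illustration.

```python
class RateLimitingSampler:
    """Token-bucket sketch: at most `max_traces_per_second` on average."""

    def __init__(self, max_traces_per_second: float, now: float = 0.0):
        self.rate = max_traces_per_second
        # Capacity of at least 1 so that rates below 1/s can still sample.
        self.capacity = max(max_traces_per_second, 1.0)
        self.tokens = self.capacity
        self.last = now

    def should_sample(self, now: float) -> bool:
        """Refill for the elapsed time, then spend one token if available."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Note the initial burst: a full bucket lets the first param traces through immediately, after which the steady-state rate applies. That is why a limit of 100 can briefly admit more than 100 traces in the first second after startup.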

Combine rate limiting with probabilistic sampling

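Use hybrid strategies to apply both probabilistic filtering and rate limits for tighter control over trace volume. The max_traces_per_second field below is not part of the type/param schema used in the earlier examples, so confirm that your Jaeger version accepts it before relying on it.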

{
  "service_strategies": [
    {
      "service": "web-frontend",
      "type": "probabilistic",
      "param": 0.1,
      "max_traces_per_second": 200
    }
  ]
}

Configure adaptive sampling

Enable adaptive sampling on Jaeger collector

Adaptive sampling lets the collector observe incoming traffic and adjust sampling probabilities automatically, so that the volume of sampled traces stays close to a configured target.

sampling:
  adaptive:
    max_traces_per_second: 1000
    strategies_reload_frequency: 1m
    aggregation_buckets: 10
    delay: 2m
    initial_sampling_probability: 0.001
    min_sampling_probability: 0.00001
    max_sampling_probability: 1.0
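
The core feedback step can be sketched in a few lines of Python: scale the current probability by the ratio of target to observed throughput, then clamp it to the configured bounds. This is a simplified illustration with hypothetical names, not Jaeger's actual algorithm, which adds smoothing over several aggregation buckets. The defaults mirror min_sampling_probability and max_sampling_probability above.

```python
def next_probability(
    current: float,
    observed_tps: float,
    target_tps: float,
    min_p: float = 0.00001,
    max_p: float = 1.0,
) -> float:
    """Propose the next sampling probability for one service or operation.

    observed_tps is the rate of sampled traces seen at `current`.
    """
    if observed_tps <= 0:
        # Nothing observed in this window: keep the current probability.
        return current
    proposed = current * target_tps / observed_tps
    return max(min_p, min(max_p, proposed))
```

For example, if a probability of 0.001 yields 2000 sampled traces per second against a target of 1000, the next probability is halved. The bounds keep quiet services from being sampled at more than 100% and very busy ones from being silenced entirely.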

Configure adaptive sampling parameters

Set target trace volumes and adjustment intervals for different services based on their importance and traffic patterns.

{
  "strategies": [
    {
      "service": "payment-processor",
      "max_traces_per_second": 500,
      "lookback_period": "5m"
    },
    {
      "service": "user-auth",
      "max_traces_per_second": 200,
      "lookback_period": "2m"
    },
    {
      "service": "logging-service",
      "max_traces_per_second": 10,
      "lookback_period": "10m"
    }
  ],
  "default": {
    "max_traces_per_second": 100,
    "lookback_period": "5m"
  }
}

Start Jaeger collector with adaptive sampling

Restart the collector with adaptive sampling configuration enabled.

# For Kubernetes deployments
kubectl patch deployment jaeger-collector -n jaeger-system -p '{
  "spec": {
    "template": {
      "spec": {
        "containers": [{
          "name": "jaeger-collector",
          "args": [
            "--sampling.strategies-file=/etc/jaeger/adaptive-sampling.json",
            "--sampling.strategies-reload-frequency=60s"
          ]
        }]
      }
    }
  }
}'

# For systemd deployments

sudo systemctl restart jaeger-collector

Set up remote sampling configuration

Deploy sampling strategy server

Set up a centralized sampling strategy server that can be updated without restarting services.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: jaeger-sampling-server
  namespace: jaeger-system
spec:
  replicas: 2
  selector:
    matchLabels:
      app: jaeger-sampling-server
  template:
    metadata:
      labels:
        app: jaeger-sampling-server
    spec:
      containers:
      - name: sampling-server
        image: jaegertracing/jaeger-collector:latest
        command:
        - /go/bin/collector-linux
        args:
        - --sampling.strategies-file=/etc/sampling/strategies.json
        - --sampling.strategies-reload-frequency=30s
        - --http.host-port=:14269
        ports:
        - containerPort: 14269
        volumeMounts:
        - name: sampling-config
          mountPath: /etc/sampling
      volumes:
      - name: sampling-config
        configMap:
          name: sampling-strategies

Create sampling strategy ConfigMap

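Store sampling strategies in a ConfigMap that can be updated dynamically without container restarts. The key is named strategies.json so that the mounted file matches the path the sampling server reads (/etc/sampling/strategies.json).

kubectl create configmap sampling-strategies -n jaeger-system --from-file=strategies.json=/etc/jaeger/sampling-strategies.json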
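
A malformed strategies file is easy to miss until traces stop arriving, so it can help to check the file before loading it into the ConfigMap. The Python sketch below validates the fields used in this tutorial (type, param, operation_strategies). The function names are hypothetical and the rules are assumptions drawn from the examples above, not an official schema.

```python
import json

VALID_TYPES = {"probabilistic", "ratelimiting"}


def validate_strategy(entry: dict, where: str) -> list:
    """Return a list of problems found in one strategy entry."""
    errors = []
    kind = entry.get("type")
    param = entry.get("param")
    if kind not in VALID_TYPES:
        errors.append(f"{where}: unknown type {kind!r}")
    elif isinstance(param, bool) or not isinstance(param, (int, float)):
        errors.append(f"{where}: param must be a number")
    elif kind == "probabilistic" and not 0.0 <= param <= 1.0:
        errors.append(f"{where}: probabilistic param must be between 0 and 1")
    elif kind == "ratelimiting" and param < 0:
        errors.append(f"{where}: ratelimiting param must not be negative")
    # Operation-level entries follow the same rules as their parent.
    for op in entry.get("operation_strategies", []):
        errors += validate_strategy(op, f"{where}/{op.get('operation', '?')}")
    return errors


def validate_config(text: str) -> list:
    """Validate a whole strategies document given as JSON text."""
    config = json.loads(text)
    errors = []
    for svc in config.get("service_strategies", []):
        errors += validate_strategy(svc, svc.get("service", "?"))
    if "default_strategy" in config:
        errors += validate_strategy(config["default_strategy"], "default_strategy")
    return errors
```

An empty list means the file passed these checks. Anything else is a list of human-readable problems to fix before running kubectl.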

Apply the deployment

kubectl apply -f /etc/jaeger/sampling-server.yaml

Configure applications for remote sampling

Update your application configuration to use the remote sampling server instead of local configuration.

JAEGER_SAMPLER_TYPE=remote
JAEGER_SAMPLER_MANAGER_HOST_PORT=jaeger-sampling-server.jaeger-system.svc.cluster.local:14269
JAEGER_SAMPLER_REFRESH_INTERVAL=60

Monitor and optimize sampling performance

Set up sampling metrics monitoring

Configure Prometheus to collect sampling metrics from Jaeger components for performance monitoring.

- job_name: 'jaeger-sampling'
  static_configs:
  - targets: ['jaeger-collector:14269', 'jaeger-agent:14271']
  metrics_path: /metrics
  scrape_interval: 30s
  scrape_timeout: 10s

Create sampling performance alerts

Set up alerts for sampling rate anomalies and storage pressure indicators.

groups:
  - name: jaeger_sampling
    rules:
      - alert: JaegerSamplingRateHigh
        expr: jaeger_agent_sampling_rate > 0.5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Jaeger sampling rate is unusually high"
          description: "Service {{ $labels.service }} has sampling rate {{ $value }} which may indicate traffic anomaly"
      - alert: JaegerTraceVolumeHigh
        expr: rate(jaeger_collector_traces_received_total[5m]) > 10000
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Jaeger receiving excessive trace volume"
          description: "Collector is receiving {{ $value }} traces per second"

Update sampling strategies based on metrics

Use sampling metrics to dynamically adjust strategy parameters for optimal performance.

# List the services known to Jaeger
curl -s "http://jaeger-query:16686/api/services" | jq '.data[]'

# Check trace volume by service (-G with --data-urlencode keeps curl from treating [1h] as a URL glob)
curl -s -G "http://prometheus:9090/api/v1/query" --data-urlencode 'query=rate(jaeger_collector_traces_received_total[1h])' | jq '.data.result'

# Update sampling configuration

kubectl patch configmap sampling-strategies -n jaeger-system --type merge -p '{ "data": { "strategies.json": "{ \"service_strategies\": [ { \"service\": \"high-volume-service\", \"type\": \"probabilistic\", \"param\": 0.01 } ] }" } }'
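
Hand-escaping JSON inside JSON, as in the patch above, is error-prone. One alternative is to generate the patch body programmatically. The Python sketch below does that with two nested json.dumps calls. The function name is hypothetical, and the key strategies.json is assumed to match the file name the collector reads.

```python
import json


def build_configmap_patch(strategies: dict, key: str = "strategies.json") -> str:
    """Return a merge-patch body that sets `key` to the serialized strategies.

    The inner dumps produces the file content; the outer one escapes it
    so it can be embedded as a string value in the ConfigMap data.
    """
    return json.dumps({"data": {key: json.dumps(strategies, indent=2)}})


if __name__ == "__main__":
    patch = build_configmap_patch(
        {
            "service_strategies": [
                {"service": "high-volume-service", "type": "probabilistic", "param": 0.01}
            ]
        }
    )
    print(patch)
```

Saved as a script (for example, a hypothetical build_patch.py), the output can be passed to kubectl patch via -p "$(python3 build_patch.py)".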

Verify your sampling configuration

# Check sampling strategy endpoints
curl -s "http://jaeger-collector:14268/api/sampling?service=your-service" | jq .

# Verify sampling metrics
curl -s "http://jaeger-collector:14269/metrics" | grep jaeger_agent_sampling

# Test trace collection with different sampling rates
for i in {1..100}; do
  curl -X POST "http://your-app/api/test-endpoint"
  sleep 0.1
done

# Query traces to verify sampling effectiveness

curl -s "http://jaeger-query:16686/api/traces?service=your-service&limit=100" | jq '.data | length'
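
With only 100 test requests, the number of stored traces will rarely match requests × param exactly. The Python sketch below estimates a plausible range using a normal approximation to the binomial distribution. The function name and the default of three standard deviations are assumptions chosen for illustration.

```python
import math


def expected_sampled_range(requests: int, param: float, z: float = 3.0) -> tuple:
    """Return (low, high) bounds for the number of sampled traces.

    Uses mean ± z standard deviations of a binomial(requests, param).
    """
    mean = requests * param
    spread = z * math.sqrt(requests * param * (1 - param))
    low = max(0, math.floor(mean - spread))
    high = min(requests, math.ceil(mean + spread))
    return low, high


if __name__ == "__main__":
    print(expected_sampled_range(100, 0.1))
```

For 100 requests at a param of 0.1 the range is wide (roughly 1 to 19 traces), so a count of 6 or 14 is not evidence of a misconfiguration. Sending more test requests narrows the relative spread.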

Common issues

Symptom | Cause | Fix
Sampling config not applied | Collector not reloading strategies | Check --sampling.strategies-reload-frequency setting
High trace volume despite sampling | Probabilistic sampling with high param | Reduce probabilistic parameter or add rate limiting
Missing critical traces | Overly aggressive sampling | Increase sampling for critical operations
Adaptive sampling not working | Insufficient traffic for algorithm | Use probabilistic sampling for low-traffic services
Remote sampling server unreachable | Network connectivity issues | Verify service discovery and DNS resolution
Storage still growing rapidly | Default sampling too high | Lower default sampling rate and implement retention policies
