Configure Elasticsearch cross-cluster replication for disaster recovery

Advanced 45 min May 25, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Set up Elasticsearch cross-cluster replication (CCR) to replicate indices between clusters for disaster recovery. Configure remote clusters, create follower indices, and monitor replication status for high availability.

Prerequisites

  • Elasticsearch 6.7+ on both clusters, with a license that includes CCR (Platinum or Enterprise, or an active trial)
  • Network connectivity between clusters on port 9300
  • TLS certificates configured for the transport layer
  • Sufficient storage space for replicated data

What this solves

Elasticsearch cross-cluster replication (CCR) asynchronously copies indices from a leader cluster to a follower cluster in near real time, for disaster recovery and data redundancy. This protects against data center outages and enables geographic data distribution.

Prerequisites and cluster setup

Verify cluster compatibility

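CCR requires Elasticsearch 6.7+ and a license that includes it (Platinum or Enterprise, or an active trial). The follower cluster should run the same version as the leader or a newer one. Check the version and license status on both clusters.

curl -X GET "localhost:9200/"
curl -X GET "localhost:9200/_license"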
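The same check can be scripted so it runs the same way on every node. The sketch below is only an illustration: `version_at_least` and `license_supports_ccr` are hypothetical helper names, and the list of license types that include CCR is an assumption to confirm against your subscription.

```shell
# Return 0 when version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort) option.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Return 0 for license types assumed to include CCR.
license_supports_ccr() {
  case "$1" in
    platinum|enterprise|trial) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (requires jq and a reachable node):
#   version_at_least "$(curl -s localhost:9200 | jq -r .version.number)" 6.7.0 && echo "version ok"
#   license_supports_ccr "$(curl -s localhost:9200/_license | jq -r .license.type)" && echo "license ok"
```

Run the example on one node of each cluster; a mismatch is cheaper to find here than after the follower index fails to start.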

Configure network connectivity

Ensure clusters can communicate over the network. Configure firewall rules to allow Elasticsearch transport traffic between clusters.

# Ubuntu / Debian (ufw)
sudo ufw allow proto tcp from 203.0.113.0/24 to any port 9300
sudo ufw reload
# AlmaLinux / Rocky Linux (firewalld)
sudo firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="203.0.113.0/24" port protocol="tcp" port="9300" accept'
sudo firewall-cmd --reload
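Once the rules are in place, it is worth confirming from a follower node that the transport port is actually reachable. A minimal sketch, assuming bash and the coreutils `timeout` command are available; `port_open` is a hypothetical helper name.

```shell
# Return 0 when a TCP connection to host $1 on port $2 succeeds within 3 seconds.
# Uses bash's /dev/tcp redirection, so bash must be installed.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example: check every leader seed node from a follower node.
#   for host in 203.0.113.10 203.0.113.11 203.0.113.12; do
#     port_open "$host" 9300 && echo "$host reachable" || echo "$host blocked"
#   done
```

A successful connection only proves the port is open; TLS and authentication problems surface later, when the remote cluster is registered.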

Configure SSL certificates

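Enable TLS for secure cross-cluster communication. Each cluster must trust the certificate authority that signed the other cluster's node certificates; the simplest setup is to generate one CA and use it to sign the certificates for both clusters.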

sudo /usr/share/elasticsearch/bin/elasticsearch-certutil ca
sudo /usr/share/elasticsearch/bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12

Update Elasticsearch configuration

Configure transport TLS settings in both leader and follower clusters.

xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.client_authentication: required
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12
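The relative keystore path above is resolved against the Elasticsearch config directory (`/etc/elasticsearch` for package installs), so the file has to be present there on every node. A small sketch of staging it with restrictive permissions; `stage_cert` is a hypothetical helper, and group ownership is assumed to be set separately.

```shell
# Copy keystore $1 into config directory $2 with mode 640.
# Assumption: the elasticsearch user gets read access through group ownership,
# which you set separately (for example: sudo chgrp elasticsearch <file>).
stage_cert() {
  local src="$1" conf_dir="$2"
  [ -f "$src" ] || { echo "missing keystore: $src" >&2; return 1; }
  install -m 640 "$src" "$conf_dir/$(basename "$src")"
}

# Example (run as root on each node):
#   stage_cert ./elastic-certificates.p12 /etc/elasticsearch
```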

Restart Elasticsearch services

Apply the configuration changes by restarting Elasticsearch on all cluster nodes.

sudo systemctl restart elasticsearch
sudo systemctl status elasticsearch

Configure cross-cluster replication settings

Set up remote cluster connection

Configure the follower cluster to connect to the leader cluster. Replace the IP addresses with your actual leader cluster nodes.

curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster": {
      "remote": {
        "leader_cluster": {
          "seeds": [
            "203.0.113.10:9300",
            "203.0.113.11:9300",
            "203.0.113.12:9300"
          ]
        }
      }
    }
  }
}'
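If the seed list differs between environments, the request body can be generated instead of edited by hand. This is a sketch under that assumption; `remote_cluster_payload` is a hypothetical helper that only builds the JSON and does not contact the cluster.

```shell
# Print the cluster settings body that registers remote cluster alias $1
# with the seed addresses given as the remaining arguments.
remote_cluster_payload() {
  local alias="$1"; shift
  local seeds="" seed
  for seed in "$@"; do
    seeds="${seeds:+$seeds,}\"$seed\""
  done
  printf '{"persistent":{"cluster":{"remote":{"%s":{"seeds":[%s]}}}}}\n' "$alias" "$seeds"
}

# Example:
#   remote_cluster_payload leader_cluster 203.0.113.10:9300 203.0.113.11:9300 |
#     curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d @-
```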

Configure CCR user roles

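Create a dedicated role named ccr_user on both clusters; with certificate-based remote cluster security the role name must match on each side. The definition below is the leader-side role, which needs the read_ccr cluster privilege plus monitor and read on the leader indices. On the follower cluster, the role of the same name needs the manage_ccr cluster privilege plus monitor, read, write, and manage_follow_index on the follower indices.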

curl -X POST "localhost:9200/_security/role/ccr_user" -H 'Content-Type: application/json' -d'
{
  "cluster": ["monitor", "read_ccr"],
  "indices": [
    {
      "names": ["*"],
      "privileges": ["monitor", "read", "view_index_metadata"]
    }
  ]
}'
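For the follower side, a role body along these lines could be used. It is a sketch rather than a definitive policy: `follower_role_payload` is a hypothetical helper, and the index pattern argument is an assumption you should narrow to your own follower index names.

```shell
# Print a follower-side role definition.
# $1 is the follower index pattern the role covers (defaults to "*").
follower_role_payload() {
  local pattern="${1:-*}"
  cat <<EOF
{
  "cluster": ["manage_ccr"],
  "indices": [
    {
      "names": ["$pattern"],
      "privileges": ["monitor", "read", "write", "manage_follow_index"]
    }
  ]
}
EOF
}

# Example (run against the follower cluster):
#   follower_role_payload '*-replica' |
#     curl -X POST "localhost:9200/_security/role/ccr_user" -H 'Content-Type: application/json' -d @-
```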

Create CCR authentication user

Set up a user account for cross-cluster replication and assign it the ccr_user role. Replace the example password with a strong secret of your own.

curl -X POST "localhost:9200/_security/user/ccr_user" -H 'Content-Type: application/json' -d'
{
  "password": "StrongCCRPassword123!",
  "roles": ["ccr_user"],
  "full_name": "Cross Cluster Replication User"
}'

Test remote cluster connectivity

Verify the connection between clusters is established and working correctly.

curl -X GET "localhost:9200/_remote/info"
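The response lists each remote cluster alias with a `connected` flag. A quick way to turn that into a pass/fail check is sketched below; `remote_connected` is a hypothetical helper and deliberately simple.

```shell
# Read a _remote/info response on stdin; succeed if it reports "connected": true.
# Limitation: this matches any remote cluster in the response, not a specific alias.
remote_connected() {
  grep -q '"connected"[[:space:]]*:[[:space:]]*true'
}

# Example:
#   curl -s "localhost:9200/_remote/info" | remote_connected && echo "leader reachable"
```

With more than one remote cluster configured, a JSON-aware tool such as jq is a better fit because it can test a single alias.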

Set up follower indices and replication policies

Create a follower index

Set up a follower index that replicates data from a specific leader index. Adjust the index names to match your setup.

curl -X PUT "localhost:9200/logs-follower/_ccr/follow?wait_for_active_shards=1" -H 'Content-Type: application/json' -d'
{
  "remote_cluster": "leader_cluster",
  "leader_index": "logs-2024",
  "settings": {
    "index.number_of_replicas": 1
  }
}'
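Because the follower exists for disaster recovery, it helps to know how it would become a writable index if the leader cluster is lost. The sketch below chains the standard pause_follow, close, unfollow, and open APIs; `promote_follower` and the `ES_URL` variable are hypothetical names, and authentication options are left out.

```shell
ES_URL="${ES_URL:-http://localhost:9200}"

# Convert follower index $1 into a regular, writable index.
# Order matters: replication is paused and the index closed before unfollow.
# The chain stops at the first request that fails (-f makes curl fail on HTTP errors).
promote_follower() {
  local index="$1"
  curl -fsS -X POST "$ES_URL/$index/_ccr/pause_follow" &&
  curl -fsS -X POST "$ES_URL/$index/_close" &&
  curl -fsS -X POST "$ES_URL/$index/_ccr/unfollow" &&
  curl -fsS -X POST "$ES_URL/$index/_open"
}

# Example (run against the follower cluster during a failover):
#   promote_follower logs-follower
```

Unfollowing is one-way: to replicate again after the original leader returns, a new follower index has to be created.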

Configure auto-follow patterns

Set up automatic replication for indices matching specific patterns to reduce manual management.

curl -X PUT "localhost:9200/_ccr/auto_follow/logs_pattern" -H 'Content-Type: application/json' -d'
{
  "remote_cluster": "leader_cluster",
  "leader_index_patterns": ["logs-*", "metrics-*"],
  "follow_index_pattern": "{{leader_index}}-replica",
  "settings": {
    "index.number_of_replicas": 0
  },
  "max_read_request_operation_count": 5120,
  "max_outstanding_read_requests": 12,
  "max_read_request_size": "32mb",
  "max_write_request_operation_count": 5120,
  "max_write_request_size": "9mb",
  "max_outstanding_write_requests": 9,
  "max_write_buffer_count": 2147483647,
  "max_write_buffer_size": "512mb",
  "max_retry_delay": "500ms",
  "read_poll_timeout": "1m"
}'
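To predict which leader indices a pattern will pick up and what the follower will be called, the matching can be mimicked locally. This sketch assumes wildcard patterns `logs-*` and `metrics-*` and the `{{leader_index}}-replica` naming template; `follower_name` is a hypothetical helper that approximates the matching with shell globs.

```shell
# Print the follower index name for leader index $1, or return 1 when the
# name matches neither of the assumed patterns.
follower_name() {
  case "$1" in
    logs-*|metrics-*) printf '%s-replica\n' "$1" ;;
    *) return 1 ;;
  esac
}

# Example:
#   follower_name logs-2024      # prints logs-2024-replica
#   follower_name test-index     # prints nothing, exit status 1
```

Auto-follow only applies to indices created on the leader after the pattern exists; indices that already exist need an explicit follow request.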

Configure replication performance settings

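Tune CCR performance parameters based on your network bandwidth and cluster capacity. ccr.indices.recovery.max_bytes_per_sec caps the bandwidth each node uses when a follower index is first copied from the leader (default 40mb), and ccr.auto_follow.wait_for_metadata_timeout controls how long auto-follow waits for new leader metadata.

curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "ccr.indices.recovery.max_bytes_per_sec": "40mb",
    "ccr.auto_follow.wait_for_metadata_timeout": "60s"
  }
}'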

Set up index lifecycle management

Configure an Index Lifecycle Management (ILM) policy to manage storage and retention of replicated data automatically. Follower indices are read-only while they follow a leader, so ILM runs an implicit unfollow step before actions such as rollover or shrink on a follower.

curl -X PUT "localhost:9200/_ilm/policy/ccr_policy" -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "10gb",
            "max_age": "7d"
          }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "allocate": {
            "number_of_replicas": 0
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}'

Monitor replication status and troubleshooting

Check follower index statistics

Monitor replication lag, operations count, and error rates for all follower indices.

curl -X GET "localhost:9200/_ccr/stats"
curl -X GET "localhost:9200/logs-follower/_ccr/stats"
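Replication lag per shard can be derived from the stats response by subtracting the follower's global checkpoint from the leader's. A minimal sketch, assuming jq is installed; `ccr_shard_lag` is a hypothetical helper that expects the cluster-wide `_ccr/stats` response on stdin.

```shell
# Read a GET _ccr/stats response on stdin and print "<index> <shard> <lag>" per shard,
# where lag = leader_global_checkpoint - follower_global_checkpoint.
ccr_shard_lag() {
  jq -r '.follow_stats.indices[]
         | .index as $name
         | .shards[]
         | "\($name) \(.shard_id) \(.leader_global_checkpoint - .follower_global_checkpoint)"'
}

# Example:
#   curl -s "localhost:9200/_ccr/stats" | ccr_shard_lag
```

A lag that stays at zero or returns to it quickly is healthy; a value that keeps growing points at bandwidth or resource limits on either cluster.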

Monitor auto-follow patterns

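List all auto-follow patterns or inspect a single one. Success and failure counters for auto-follow are reported in the auto_follow_stats section of the _ccr/stats response.

curl -X GET "localhost:9200/_ccr/auto_follow"
curl -X GET "localhost:9200/_ccr/auto_follow/logs_pattern"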

Set up monitoring with Prometheus

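Configure metrics collection for CCR monitoring using the Elasticsearch exporter alongside a Prometheus and Grafana monitoring stack. The settings below mirror the exporter's --es.* command-line options; check which of them your exporter version accepts.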

es:
  uri: http://localhost:9200
  timeout: 30s
  all: true
  indices: true
  indices_settings: true
  cluster_settings: true
  shards: true
  snapshots: true

Create alerting rules

Set up Prometheus alerting rules to detect replication issues and lag.

groups:
  - name: elasticsearch.ccr
    rules:
      - alert: CCRReplicationLag
        expr: elasticsearch_ccr_follower_operations_read_total - elasticsearch_ccr_follower_operations_written_total > 1000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "CCR replication lag detected"
          description: "Follower index {{ $labels.index }} is lagging behind leader by {{ $value }} operations"
      - alert: CCRFollowerIndexDown
        expr: up{job="elasticsearch-ccr"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "CCR follower index unavailable"
          description: "Follower cluster is down or unreachable"

Configure log monitoring

Monitor Elasticsearch logs for CCR-related errors and warnings.

tail -f /var/log/elasticsearch/elasticsearch.log | grep -i ccr
journalctl -u elasticsearch -f | grep -i "cross.cluster"

Verify your setup

# Check remote cluster connection
curl -X GET "localhost:9200/_remote/info"

Verify follower indices

curl -X GET "localhost:9200/_cat/indices/logs-follower,*-replica?v"

Check replication statistics

curl -s -X GET "localhost:9200/_ccr/stats" | jq '.follow_stats.indices[].shards[0].leader_global_checkpoint'

Test data replication

# logs-test matches the logs-* auto-follow pattern, so its follower is logs-test-replica
curl -X POST "leader-cluster:9200/logs-test/_doc" -H 'Content-Type: application/json' -d '{"message": "test replication", "timestamp": "2024-01-15T10:30:00Z"}'
sleep 5
curl -X GET "localhost:9200/logs-test-replica/_search"
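A fixed sleep can be too short when auto-follow has to create the follower index first. Polling the document count is more reliable; the sketch below uses the `_count` API, with `doc_count`, `wait_for_docs`, and the `ES_URL` variable as hypothetical names and authentication options left out.

```shell
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the document count of index $1 using the _count API
# (prints nothing if the index does not exist yet).
doc_count() {
  curl -s "$ES_URL/$1/_count" |
    sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Poll until index $1 holds at least $2 documents, or give up after $3 seconds.
wait_for_docs() {
  local index="$1" expected="$2" timeout="${3:-30}" elapsed=0 count
  while [ "$elapsed" -lt "$timeout" ]; do
    count="$(doc_count "$index")"
    if [ "${count:-0}" -ge "$expected" ]; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}

# Example (run against the follower cluster):
#   wait_for_docs logs-test-replica 1 60 && echo "replicated" || echo "not replicated in time"
```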

Common issues

Symptom | Cause | Fix
Remote cluster connection fails | Network connectivity or TLS configuration | Check firewall rules for port 9300 and inspect the transport certificate with openssl s_client -connect leader-node:9300
High replication lag | Network bandwidth or cluster resources | Increase the max_read_request_size and max_outstanding_read_requests parameters
Follower index creation fails | Insufficient permissions or license issues | Verify CCR user roles and check license status with GET _license
Auto-follow pattern not working | Pattern matching issues or timing | Check the pattern syntax with GET _ccr/auto_follow/pattern_name and the auto_follow_stats section of GET _ccr/stats
Replication stops unexpectedly | Leader index closure or mapping conflicts | Check leader index status and resolve mapping differences before resuming with POST follower_index/_ccr/resume_follow
