Set up Elasticsearch cross-cluster replication (CCR) to replicate indices between clusters for disaster recovery. Configure remote clusters, create follower indices, and monitor replication status for high availability.
Prerequisites
- Elasticsearch 6.7+ on both clusters, with a license that includes CCR (Platinum or higher, or an active trial)
- Network connectivity between clusters on port 9300
- SSL certificates configured for transport layer
- Sufficient storage space for replicated data
What this solves
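Elasticsearch cross-cluster replication (CCR) maintains near-real-time copies of indices on a separate cluster for disaster recovery and data redundancy. Replication is asynchronous and one-way: follower indices are read-only copies that pull changes from the leader index. This protects against data center outages and enables geographic data distribution.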
Prerequisites and cluster setup
Verify cluster compatibility
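CCR requires Elasticsearch 6.7+ and a license that includes it (Platinum or higher, or an active trial). The follower cluster must run the same Elasticsearch version as the leader cluster or a newer one. Check the version and license status on both clusters. On 6.x clusters the license endpoint is _xpack/license; that path was deprecated in 7.0.
curl -X GET "localhost:9200/_license"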
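The license response carries the license level in its license.type field. As a rough sketch, a small shell helper can turn that value into a pass/fail check. The function name license_supports_ccr is hypothetical, and the list of qualifying tiers (platinum, enterprise, trial) is an assumption about Elastic's subscription levels that you should confirm against your own subscription.

```shell
# Hypothetical helper: succeeds if the given license type is assumed to include CCR.
# The tier list is an assumption; adjust it to match your subscription terms.
license_supports_ccr() {
  case "$1" in
    platinum|enterprise|trial) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use against a reachable cluster (requires curl and jq):
#   type=$(curl -s "localhost:9200/_license" | jq -r '.license.type')
#   license_supports_ccr "$type" && echo "CCR available" || echo "CCR not licensed"
```

Run the check on both clusters, since each one needs a qualifying license.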
Configure network connectivity
Ensure clusters can communicate over the network. Configure firewall rules to allow Elasticsearch transport traffic between clusters.
sudo ufw allow from 203.0.113.0/24 to any port 9300
sudo ufw reload
Configure SSL certificates
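Enable TLS for secure cross-cluster communication. Each cluster must trust the certificate authority that signed the other cluster's transport certificates; the simplest approach is to sign the node certificates of both clusters with the same CA. Generate a CA and certificates if they are not already configured.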
sudo /usr/share/elasticsearch/bin/elasticsearch-certutil ca
sudo /usr/share/elasticsearch/bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12
Update Elasticsearch configuration
Add the transport TLS settings to elasticsearch.yml on every node of both the leader and follower clusters.
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.client_authentication: required
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12
Restart Elasticsearch services
Apply the configuration changes by restarting Elasticsearch on all cluster nodes.
sudo systemctl restart elasticsearch
sudo systemctl status elasticsearch
Configure cross-cluster replication settings
Set up remote cluster connection
Configure the follower cluster to connect to the leader cluster. Replace the IP addresses with your actual leader cluster nodes.
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
"persistent": {
"cluster": {
"remote": {
"leader_cluster": {
"seeds": [
"203.0.113.10:9300",
"203.0.113.11:9300",
"203.0.113.12:9300"
]
}
}
}
}
}'
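When the list of seed nodes changes often, it can help to generate the settings body rather than edit the JSON by hand. The following is a minimal sketch; build_remote_settings is a hypothetical helper name, and it assumes seed addresses contain no characters that need JSON escaping.

```shell
# Hypothetical helper: print a _cluster/settings body for one remote cluster.
# Usage: build_remote_settings ALIAS HOST:PORT [HOST:PORT ...]
build_remote_settings() {
  alias_name=$1
  shift
  seeds=""
  for seed in "$@"; do
    # Append each seed as a JSON string, comma-separated after the first.
    seeds="${seeds:+$seeds,}\"$seed\""
  done
  printf '{"persistent":{"cluster":{"remote":{"%s":{"seeds":[%s]}}}}}\n' \
    "$alias_name" "$seeds"
}

# Example use (requires a reachable follower cluster):
#   build_remote_settings leader_cluster 203.0.113.10:9300 203.0.113.11:9300 |
#     curl -X PUT "localhost:9200/_cluster/settings" \
#       -H 'Content-Type: application/json' -d @-
```

Listing more than one seed lets the follower cluster reach the leader even when one seed node is down.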
Configure CCR user roles
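Create a dedicated role with the privileges CCR needs. The role below grants read_ccr and read access, which is what the leader cluster requires; on the follower cluster the replication user additionally needs the manage_ccr cluster privilege and write and manage_follow_index privileges on the follower indices.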
curl -X POST "localhost:9200/_security/role/ccr_user" -H 'Content-Type: application/json' -d'
{
"cluster": ["monitor", "read_ccr"],
"indices": [
{
"names": ["*"],
"privileges": ["monitor", "read", "view_index_metadata"]
}
]
}'
Create CCR authentication user
Set up a user account for cross-cluster replication with the appropriate role.
curl -X POST "localhost:9200/_security/user/ccr_user" -H 'Content-Type: application/json' -d'
{
"password": "StrongCCRPassword123!",
"roles": ["ccr_user"],
"full_name": "Cross Cluster Replication User"
}'
Test remote cluster connectivity
Verify the connection between clusters is established and working correctly.
curl -X GET "localhost:9200/_remote/info"
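The response contains one object per configured remote cluster, including a connected flag and a num_nodes_connected count. For scripted checks, a sketch like the following can gate later steps on that flag. The helper name remote_is_connected is hypothetical, and the grep-based match is a deliberate simplification: it succeeds if any remote cluster in the response reports connected as true.

```shell
# Hypothetical helper: read a _remote/info response on stdin and succeed
# only if at least one remote cluster reports "connected": true.
remote_is_connected() {
  grep -Eq '"connected"[[:space:]]*:[[:space:]]*true'
}

# Example use (requires a reachable follower cluster):
#   curl -s "localhost:9200/_remote/info" | remote_is_connected \
#     && echo "remote cluster connected" \
#     || echo "remote cluster NOT connected"
```

If several remote clusters are configured, a JSON-aware tool such as jq is a better fit for checking one specific alias.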
Set up follower indices and replication policies
Create a follower index
Set up a follower index that replicates data from a specific leader index. Adjust the index names to match your setup.
curl -X PUT "localhost:9200/logs-follower/_ccr/follow" -H 'Content-Type: application/json' -d'
{
"remote_cluster": "leader_cluster",
"leader_index": "logs-2024",
"settings": {
"index.number_of_replicas": 1
}
}'
Configure auto-follow patterns
Set up automatic replication for indices matching specific patterns to reduce manual management.
curl -X PUT "localhost:9200/_ccr/auto_follow/logs_pattern" -H 'Content-Type: application/json' -d'
{
"remote_cluster": "leader_cluster",
"leader_index_patterns": ["logs-*", "metrics-*"],
"follow_index_pattern": "{{leader_index}}-replica",
"settings": {
"index.number_of_replicas": 0
},
"max_read_request_operation_count": 5120,
"max_outstanding_read_requests": 12,
"max_read_request_size": "32mb",
"max_write_request_operation_count": 5120,
"max_write_request_size": "9mb",
"max_outstanding_write_requests": 9,
"max_write_buffer_count": 2147483647,
"max_write_buffer_size": "512mb",
"max_retry_delay": "500ms",
"read_poll_timeout": "1m"
}'
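To see which leader indices a pattern would pick up and what the follower index will be called, the mapping can be sketched in shell. This assumes wildcard patterns logs-* and metrics-* and the {{leader_index}}-replica naming template; follower_name_for is a hypothetical helper, and shell globbing only approximates Elasticsearch's wildcard matching.

```shell
# Hypothetical helper: print the follower index name that the auto-follow
# pattern would produce, or fail if the leader index name does not match.
follower_name_for() {
  case "$1" in
    logs-*|metrics-*) printf '%s-replica\n' "$1" ;;
    *) return 1 ;;
  esac
}

# Example use:
#   follower_name_for logs-2024    # prints logs-2024-replica
#   follower_name_for test-index   # prints nothing, returns non-zero
```

Auto-follow patterns apply only to indices created on the leader after the pattern exists; indices that already exist need a manual follow request.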
Configure replication performance settings
Tune CCR performance parameters based on your network bandwidth and cluster capacity.
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
"persistent": {
"ccr.auto_follow.wait_for_metadata_timeout": "60s",
"ccr.indices.recovery.max_bytes_per_sec": "40mb"
}
}'
Set up index lifecycle management
Configure an Index Lifecycle Management (ILM) policy for follower indices to manage storage and retention automatically.
curl -X PUT "localhost:9200/_ilm/policy/ccr_policy" -H 'Content-Type: application/json' -d'
{
"policy": {
"phases": {
"hot": {
"actions": {
"rollover": {
"max_size": "10gb",
"max_age": "7d"
}
}
},
"warm": {
"min_age": "7d",
"actions": {
"allocate": {
"number_of_replicas": 0
}
}
},
"delete": {
"min_age": "30d",
"actions": {
"delete": {}
}
}
}
}
}'
Monitor replication status and troubleshooting
Check follower index statistics
Monitor replication lag, operations count, and error rates for all follower indices.
curl -X GET "localhost:9200/_ccr/stats"
curl -X GET "localhost:9200/logs-follower/_ccr/stats"
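Each shard entry in the follow stats reports a leader_global_checkpoint and a follower_global_checkpoint; the difference between them is the number of operations the follower still has to apply. A minimal sketch of that arithmetic follows. The helper names ccr_lag and ccr_lag_exceeds are hypothetical.

```shell
# Hypothetical helper: operations the follower shard still has to apply.
# Arguments: LEADER_GLOBAL_CHECKPOINT FOLLOWER_GLOBAL_CHECKPOINT
ccr_lag() {
  echo $(( $1 - $2 ))
}

# Hypothetical helper: succeed if the lag is greater than THRESHOLD.
# Arguments: LEADER_GLOBAL_CHECKPOINT FOLLOWER_GLOBAL_CHECKPOINT THRESHOLD
ccr_lag_exceeds() {
  [ "$(ccr_lag "$1" "$2")" -gt "$3" ]
}

# Example use with jq against a reachable follower cluster:
#   curl -s "localhost:9200/_ccr/stats" |
#     jq -r '.follow_stats.indices[].shards[]
#            | "\(.leader_global_checkpoint) \(.follower_global_checkpoint)"' |
#     while read -r leader follower; do ccr_lag "$leader" "$follower"; done
```

A lag that stays near zero means the follower is keeping up; a value that keeps growing points to a bandwidth or resource bottleneck.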
Monitor auto-follow patterns
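Check the configured auto-follow patterns. Auto-follow activity (successful and failed follow attempts and recent errors) is reported in the auto_follow_stats section of the CCR stats response.
curl -X GET "localhost:9200/_ccr/auto_follow"
curl -X GET "localhost:9200/_ccr/auto_follow/logs_pattern"
curl -X GET "localhost:9200/_ccr/stats"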
Set up monitoring with Prometheus
Configure metrics collection for CCR monitoring using the Elasticsearch exporter, which works alongside a Prometheus and Grafana monitoring stack.
es:
uri: http://localhost:9200
timeout: 30s
all: true
indices: true
indices_settings: true
cluster_settings: true
shards: true
snapshots: true
Create alerting rules
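Set up Prometheus alerting rules to detect replication lag and unreachable scrape targets. The metric names below depend on the exporter in use, so confirm that they appear on its /metrics endpoint before deploying the rules.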
groups:
- name: elasticsearch.ccr
rules:
- alert: CCRReplicationLag
expr: elasticsearch_ccr_follower_operations_read_total - elasticsearch_ccr_follower_operations_written_total > 1000
for: 5m
labels:
severity: warning
annotations:
summary: "CCR replication lag detected"
description: "Follower index {{ $labels.index }} is lagging behind leader by {{ $value }} operations"
- alert: CCRFollowerIndexDown
expr: up{job="elasticsearch-ccr"} == 0
for: 2m
labels:
severity: critical
annotations:
summary: "CCR follower index unavailable"
description: "Follower cluster is down or unreachable"
Configure log monitoring
Monitor Elasticsearch logs for CCR-related errors and warnings.
tail -f /var/log/elasticsearch/elasticsearch.log | grep -i ccr
journalctl -u elasticsearch -f | grep -i "cross.cluster"
Verify your setup
# Check remote cluster connection
curl -X GET "localhost:9200/_remote/info"
# Verify follower indices
curl -X GET "localhost:9200/_cat/indices/*follower*,*-replica?v"
# Check replication statistics
curl -s -X GET "localhost:9200/_ccr/stats" | jq '.follow_stats.indices[].shards[0].leader_global_checkpoint'
# Test data replication (the leader index must be new and its name must match an auto-follow pattern, here the logs- prefix)
curl -X POST "leader-cluster:9200/logs-test/_doc" -H 'Content-Type: application/json' -d '{"message": "test replication", "timestamp": "2024-01-15T10:30:00Z"}'
sleep 5
curl -X GET "localhost:9200/logs-test-replica/_search"
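A fixed sleep can be too short on a slow link, because the follower index may not exist yet when the search runs. One option is to poll with a bounded retry loop instead. The sketch below uses a hypothetical retry helper; the attempt count and delay are arbitrary example values.

```shell
# Hypothetical helper: run a command until it succeeds, up to MAX attempts,
# sleeping DELAY seconds between attempts.
# Usage: retry MAX DELAY COMMAND [ARGS...]
retry() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  while ! "$@"; do
    # Give up once the attempt budget is used.
    [ "$attempt" -ge "$max" ] && return 1
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example use: wait up to ~20 seconds for the follower index to appear.
# FOLLOWER_INDEX is a placeholder for the follower index name.
#   retry 10 2 curl -sf -o /dev/null "localhost:9200/$FOLLOWER_INDEX"
```

The -f flag makes curl return a non-zero exit code on HTTP errors such as 404, which is what lets the loop detect that the index does not exist yet.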
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| Remote cluster connection fails | Network connectivity or SSL configuration | Check that the transport port is reachable from the follower nodes (for example nc -zv leader-node 9300) and that both clusters trust each other's transport certificates |
| High replication lag | Network bandwidth or cluster resources | Increase max_read_request_size and max_outstanding_read_requests parameters |
| Follower index creation fails | Insufficient permissions or license issues | Verify CCR user roles and check license status with GET _license (GET _xpack/license on 6.x) |
| Auto-follow pattern not working | Pattern matching issues or timing | Check pattern syntax with GET _ccr/auto_follow/pattern_name and review auto_follow_stats in GET _ccr/stats; patterns only apply to indices created after the pattern |
| Replication stops unexpectedly | Leader index closure or mapping conflicts | Check leader index status and resolve mapping differences before resuming |
Next steps
- Configure InfluxDB 2.7 clustering for high availability with data replication and automated failover
- Set up Prometheus and Grafana monitoring stack with Docker Compose
- Implement Elasticsearch backup lifecycle management with S3 integration
- Configure Elasticsearch cluster security hardening and authentication
- Set up Elasticsearch multi-datacenter deployment with cross-region replication