Set up Consul WAN federation to connect multiple datacenters for global service discovery and failover. This tutorial covers primary and secondary datacenter configuration with ACL token replication and cross-datacenter networking.
Prerequisites
- Multiple servers across different datacenters
- Network connectivity between datacenters
- Basic understanding of Consul architecture
- Permission to open firewall ports 8300-8302 (server RPC and gossip), 8500 (HTTP API), and 8502 (gRPC)
What this solves
Consul multi-datacenter WAN federation allows you to connect Consul clusters across different geographic locations or network segments. This provides global service discovery, cross-datacenter failover, and centralized configuration management while maintaining local autonomy in each datacenter. You need this when running distributed applications across multiple regions or availability zones.
Multi-datacenter architecture planning
Before configuring WAN federation, plan your datacenter topology. Each datacenter runs an independent Consul cluster with its own servers and agents. The primary datacenter handles ACL tokens and configuration replication, while secondary datacenters federate with the primary for global coordination.
Step-by-step configuration
Install Consul on all nodes
Install Consul on the servers in both datacenters. Use the same version across all nodes to ensure compatibility. The commands below are for Debian and Ubuntu.
wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update
sudo apt install -y consul
Generate encryption keys and certificates
Create shared encryption keys and TLS certificates for secure communication between datacenters.
consul keygen
consul tls ca create
consul tls cert create -server -dc dc1
consul tls cert create -server -dc dc2
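Save the encryption key output. Distribute the CA certificate (consul-agent-ca.pem) to all nodes in both datacenters, and copy each server certificate and its private key only to the server that uses it. Keep the CA private key (consul-agent-ca-key.pem) in a secure location rather than on every node.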
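A mistyped or truncated gossip key is easy to miss when it is copied between datacenters, so it can help to sanity-check the value before pasting it into the configuration. The sketch below assumes GNU coreutils `base64`; `is_valid_gossip_key` is a hypothetical helper name, and it only checks that the value decodes to 32 bytes, which is what `consul keygen` emits in recent Consul versions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 when the argument is base64 that decodes
# to exactly 32 bytes (the size `consul keygen` produces in recent versions).
is_valid_gossip_key() {
  local key="$1" bytes
  # Count the decoded bytes; invalid or truncated input yields a different count.
  bytes=$(printf '%s' "$key" | base64 -d 2>/dev/null | wc -c)
  [ "$bytes" -eq 32 ]
}
```

For example, `is_valid_gossip_key "$(consul keygen)" && echo "key looks valid"` should print the message for a freshly generated key.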
Configure primary datacenter (dc1)
Configure the primary datacenter servers with WAN federation enabled. This datacenter will be the authoritative source for ACL tokens.
datacenter = "dc1"
primary_datacenter = "dc1"
data_dir = "/opt/consul"
log_level = "INFO"
server = true
bootstrap_expect = 3
bind_addr = "203.0.113.10"
client_addr = "0.0.0.0"
retry_join = ["203.0.113.11", "203.0.113.12"]
connect {
  enabled = true
}
ports {
  grpc = 8502
}
encrypt = "your-encryption-key-here"
tls {
  defaults {
    ca_file = "/etc/consul.d/tls/consul-agent-ca.pem"
    cert_file = "/etc/consul.d/tls/dc1-server-consul-0.pem"
    key_file = "/etc/consul.d/tls/dc1-server-consul-0-key.pem"
    verify_incoming = true
    verify_outgoing = true
  }
  internal_rpc {
    verify_server_hostname = true
  }
}
acl = {
  enabled = true
  default_policy = "deny"
  enable_token_persistence = true
}
ui_config {
  enabled = true
}
Configure secondary datacenter (dc2)
Configure secondary datacenter servers to federate with the primary datacenter for ACL token replication and global coordination.
datacenter = "dc2"
primary_datacenter = "dc1"
data_dir = "/opt/consul"
log_level = "INFO"
server = true
bootstrap_expect = 3
bind_addr = "198.51.100.10"
client_addr = "0.0.0.0"
retry_join = ["198.51.100.11", "198.51.100.12"]
retry_join_wan = ["203.0.113.10", "203.0.113.11", "203.0.113.12"]
connect {
  enabled = true
}
ports {
  grpc = 8502
}
encrypt = "your-encryption-key-here"
tls {
  defaults {
    ca_file = "/etc/consul.d/tls/consul-agent-ca.pem"
    cert_file = "/etc/consul.d/tls/dc2-server-consul-0.pem"
    key_file = "/etc/consul.d/tls/dc2-server-consul-0-key.pem"
    verify_incoming = true
    verify_outgoing = true
  }
  internal_rpc {
    verify_server_hostname = true
  }
}
acl = {
  enabled = true
  default_policy = "deny"
  enable_token_persistence = true
  enable_token_replication = true
}
ui_config {
  enabled = true
}
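Whether a server belongs to the primary or a secondary datacenter follows from two settings: when `datacenter` and `primary_datacenter` match, the server is in the primary. As a rough illustration, the following sketch reads both values from a configuration file and reports the role. The function name `consul_dc_role` is hypothetical, and the parsing assumes the simple `key = "value"` layout used in this tutorial rather than arbitrary HCL.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "primary" or "secondary" for a Consul config file.
# Assumes top-level lines of the form: key = "value" (as in this tutorial).
consul_dc_role() {
  local file="$1" dc primary
  dc=$(sed -n 's/^[[:space:]]*datacenter[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$file" | head -n 1)
  primary=$(sed -n 's/^[[:space:]]*primary_datacenter[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$file" | head -n 1)
  # Both settings must be present to decide the role.
  if [ -z "$dc" ] || [ -z "$primary" ]; then
    echo "unknown"
    return 1
  fi
  if [ "$dc" = "$primary" ]; then
    echo "primary"
  else
    echo "secondary"
  fi
}
```

Running `consul_dc_role /etc/consul.d/consul.hcl` on a dc2 server configured as above would print `secondary`, which is a quick way to catch a configuration file copied to the wrong datacenter.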
Create Consul user and set permissions
Create a dedicated user for Consul if one does not already exist (the package may have created it), and set restrictive file permissions.
id consul >/dev/null 2>&1 || sudo useradd --system --home /etc/consul.d --shell /bin/false consul
sudo mkdir -p /opt/consul /etc/consul.d/tls
sudo chown -R consul:consul /opt/consul /etc/consul.d
sudo chmod 640 /etc/consul.d/consul.hcl
sudo chmod 640 /etc/consul.d/tls/*
Configure firewall rules
Open the ports Consul needs between datacenters. WAN federation relies on port 8302 (TCP and UDP) for WAN gossip and port 8300 (TCP) for server RPC across datacenters.
sudo ufw allow 8300/tcp comment "Consul server RPC"
sudo ufw allow 8301/tcp comment "Consul serf LAN"
sudo ufw allow 8301/udp comment "Consul serf LAN"
sudo ufw allow 8302/tcp comment "Consul serf WAN"
sudo ufw allow 8302/udp comment "Consul serf WAN"
sudo ufw allow 8500/tcp comment "Consul HTTP API"
sudo ufw allow 8502/tcp comment "Consul gRPC API"
sudo ufw reload
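Firewall rules on the local host are only half of the path, so it is worth confirming from the other datacenter that the TCP ports are reachable. A minimal sketch, assuming bash (for its `/dev/tcp` pseudo-device) and the coreutils `timeout` command; `tcp_port_open` is a hypothetical helper, and it does not test the UDP side of ports 8301 and 8302.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a TCP connection to host:port opens within 3s.
tcp_port_open() {
  local host="$1" port="$2"
  # The inner bash receives host and port as $0 and $1 to avoid quoting issues.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

From a dc2 server, `tcp_port_open 203.0.113.10 8302 && echo open || echo closed` probes the Serf WAN port on a primary server; repeat for port 8300.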
Start Consul services
Start Consul on all servers in the primary datacenter first, then start the secondary datacenter nodes.
sudo systemctl enable consul
sudo systemctl start consul
sudo systemctl status consul
Bootstrap ACL system in primary datacenter
Initialize the ACL system in the primary datacenter to generate the bootstrap token for administration.
consul acl bootstrap
Save the SecretID from the output as your bootstrap token and export it for subsequent commands.
export CONSUL_HTTP_TOKEN="your-bootstrap-token-here"
Create replication token
Create an ACL token for cross-datacenter replication with appropriate permissions.
consul acl policy create -name "replication" -rules @- << EOF
acl = "write"
service_prefix "" {
  policy = "read"
  intentions = "read"
}
node_prefix "" {
  policy = "read"
}
EOF
consul acl token create -description "Replication token" -policy-name "replication"
Save the token SecretID for configuring secondary datacenters.
Configure ACL replication on secondary datacenter
Set the replication token on each secondary datacenter server to enable ACL synchronization.
consul acl set-agent-token replication "your-replication-token-here"
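Once the token is set, replication status is exposed by the `/v1/acl/replication` HTTP endpoint, which returns a JSON object that includes `Enabled` and `Running` fields. The sketch below is one way to turn that response into a pass/fail check without extra tooling; `acl_replication_running` is a hypothetical helper that reads the JSON on standard input and uses a plain pattern match rather than a JSON parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read the /v1/acl/replication JSON on stdin and
# succeed only when both "Enabled" and "Running" are true.
acl_replication_running() {
  local body
  body=$(cat)
  printf '%s' "$body" | grep -Eq '"Enabled":[[:space:]]*true' &&
    printf '%s' "$body" | grep -Eq '"Running":[[:space:]]*true'
}
```

On a dc2 server it could be used as `curl -s --header "X-Consul-Token: $CONSUL_HTTP_TOKEN" http://127.0.0.1:8500/v1/acl/replication | acl_replication_running && echo "replicating"`.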
Verify WAN federation
Check that datacenters are properly federated and can communicate across the WAN.
consul members -wan
consul catalog datacenters
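`consul catalog datacenters` prints one datacenter name per line, so a scripted check can compare that list against the names you expect. A small sketch under that assumption; `missing_datacenters` is a hypothetical helper that reads the command's output on standard input and prints any expected name that is absent.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each expected datacenter that is absent from the
# list read on stdin (one name per line); exit non-zero if any are missing.
missing_datacenters() {
  local known missing=0 dc
  known=$(cat)
  for dc in "$@"; do
    # -F: fixed string, -x: whole-line match, so "dc1" does not match "dc10".
    if ! printf '%s\n' "$known" | grep -Fxq -- "$dc"; then
      echo "$dc"
      missing=1
    fi
  done
  return "$missing"
}
```

A typical invocation is `consul catalog datacenters | missing_datacenters dc1 dc2`, which prints nothing and exits zero when both datacenters are federated.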
Cross-datacenter service discovery and monitoring
With WAN federation configured, services can discover and communicate across datacenters. Configure service definitions with datacenter-aware health checks and failover policies.
Register cross-datacenter service
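Register a service with a health check so that it can be discovered from any federated datacenter. Save the following service definition in the agent's configuration directory (/etc/consul.d/ in this tutorial), then reload the agent.
{
  "service": {
    "name": "web",
    "port": 80,
    "check": {
      "http": "http://localhost:80/health",
      "interval": "10s"
    },
    "meta": {
      "datacenter": "dc1"
    }
  }
}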
sudo systemctl reload consul
Query services across datacenters
Test cross-datacenter service discovery using the Consul API and DNS interface.
consul catalog services -datacenter dc1
consul catalog services -datacenter dc2
dig @127.0.0.1 -p 8600 web.service.dc1.consul
dig @127.0.0.1 -p 8600 web.service.dc2.consul
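The DNS names above follow the pattern `[tag.]<service>.service.<datacenter>.consul`. As a small illustration, the hypothetical helper `consul_service_fqdn` assembles such names so they can be fed to `dig` in a loop; it assumes the default `consul` DNS domain.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a Consul DNS name for a service in a datacenter.
# Usage: consul_service_fqdn <service> <datacenter> [tag]
consul_service_fqdn() {
  local service="$1" dc="$2" tag="${3:-}"
  # An optional tag is prepended as its own DNS label.
  printf '%s%s.service.%s.consul\n' "${tag:+$tag.}" "$service" "$dc"
}
```

For instance, `for dc in dc1 dc2; do dig @127.0.0.1 -p 8600 +short "$(consul_service_fqdn web "$dc")"; done` queries the web service in each datacenter in turn.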
Configure prepared queries for failover
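Create a prepared query that automatically fails over to another datacenter when no healthy local instance is available. Prepared queries are managed through the HTTP API (/v1/query).
curl --request POST \
  --header "X-Consul-Token: $CONSUL_HTTP_TOKEN" \
  --data '{"Name": "web-failover", "Service": {"Service": "web", "OnlyPassing": true, "Failover": {"Datacenters": ["dc2"]}}}' \
  http://127.0.0.1:8500/v1/query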
Verify your setup
consul members -wan
consul catalog datacenters
consul acl token list
consul monitor -log-level=DEBUG
dig @127.0.0.1 -p 8600 web-failover.query.consul
The output should show servers from both datacenters in the WAN member list, and cross-datacenter service queries should resolve successfully.
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| WAN join fails | Firewall blocking port 8302 | Open TCP/UDP port 8302 between datacenters |
| ACL token replication fails | Incorrect replication token | Verify token has "acl = write" permission |
| Cross-DC service queries timeout | Network connectivity issues | Test connectivity on ports 8300-8302 between DCs |
| TLS handshake failures | Certificate hostname mismatch | Ensure each server certificate was created with the matching -dc value, so that its name is server.<datacenter>.consul |
| Bootstrap token creation fails | ACLs already initialized | Use existing bootstrap token or reset ACL system |
Next steps
- Set up Consul multi-datacenter replication with ACL token replication
- Implement Consul backup and disaster recovery with automated snapshots and restoration
- Configure Consul Connect service mesh with Envoy proxy for secure microservices communication
- Monitor Consul with Prometheus and Grafana for service discovery observability
- Configure advanced network monitoring with SmokePing for detailed latency analysis
Automated install script
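Run this script on each node to automate installation and configuration. Distributing the encryption key and TLS certificates and bootstrapping the ACL system remain manual steps.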
#!/usr/bin/env bash
set -euo pipefail
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
# Default values
DATACENTER=""
PRIMARY_DC=""
NODE_ROLE=""
BIND_ADDR=""
JOIN_ADDRS=""
WAN_JOIN_ADDRS=""
usage() {
echo "Usage: $0 --datacenter <dc> --primary-dc <primary> --role <server|client> --bind-addr <ip> [options]"
echo "Options:"
echo " --datacenter <dc> Datacenter name (e.g., dc1, dc2)"
echo " --primary-dc <primary> Primary datacenter name"
echo " --role <server|client> Node role"
echo " --bind-addr <ip> Bind address for this node"
echo " --join-addrs <ips> Comma-separated list of local cluster IPs"
echo " --wan-join-addrs <ips> Comma-separated list of primary DC IPs (for secondary DC)"
exit 1
}
cleanup() {
if [ $? -ne 0 ]; then
echo -e "${RED}[ERROR] Installation failed. Cleaning up...${NC}"
systemctl stop consul 2>/dev/null || true
systemctl disable consul 2>/dev/null || true
rm -f /etc/consul.d/consul.hcl
userdel consul 2>/dev/null || true
fi
}
trap cleanup ERR
# Parse arguments
while [[ $# -gt 0 ]]; do
case $1 in
--datacenter) DATACENTER="$2"; shift 2 ;;
--primary-dc) PRIMARY_DC="$2"; shift 2 ;;
--role) NODE_ROLE="$2"; shift 2 ;;
--bind-addr) BIND_ADDR="$2"; shift 2 ;;
--join-addrs) JOIN_ADDRS="$2"; shift 2 ;;
--wan-join-addrs) WAN_JOIN_ADDRS="$2"; shift 2 ;;
*) usage ;;
esac
done
# Validate required arguments
if [[ -z "$DATACENTER" || -z "$PRIMARY_DC" || -z "$NODE_ROLE" || -z "$BIND_ADDR" ]]; then
usage
fi
if [[ "$NODE_ROLE" != "server" && "$NODE_ROLE" != "client" ]]; then
echo -e "${RED}[ERROR] Role must be 'server' or 'client'${NC}"
exit 1
fi
echo -e "${GREEN}[1/10] Checking prerequisites...${NC}"
if [[ $EUID -ne 0 ]]; then
echo -e "${RED}[ERROR] This script must be run as root${NC}"
exit 1
fi
# Detect OS
if [ -f /etc/os-release ]; then
. /etc/os-release
case "$ID" in
ubuntu|debian)
PKG_MGR="apt"
PKG_INSTALL="apt install -y"
PKG_UPDATE="apt update"
FIREWALL_CMD="ufw"
;;
almalinux|rocky|centos|rhel|ol|fedora)
PKG_MGR="dnf"
PKG_INSTALL="dnf install -y"
PKG_UPDATE="dnf makecache"
FIREWALL_CMD="firewall-cmd"
;;
amzn)
PKG_MGR="yum"
PKG_INSTALL="yum install -y"
PKG_UPDATE="yum makecache"
FIREWALL_CMD="firewall-cmd"
;;
*)
echo -e "${RED}[ERROR] Unsupported distribution: $ID${NC}"
exit 1
;;
esac
else
echo -e "${RED}[ERROR] Cannot detect OS version${NC}"
exit 1
fi
echo -e "${GREEN}[2/10] Installing HashiCorp repository...${NC}"
case "$PKG_MGR" in
apt)
$PKG_INSTALL wget gpg lsb-release
wget -O- https://apt.releases.hashicorp.com/gpg | gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" > /etc/apt/sources.list.d/hashicorp.list
$PKG_UPDATE
;;
dnf|yum)
$PKG_INSTALL dnf-plugins-core 2>/dev/null || $PKG_INSTALL yum-utils
if command -v dnf >/dev/null 2>&1; then
dnf config-manager --add-repo https://rpm.releases.hashicorp.com/RHEL/hashicorp.repo
else
yum-config-manager --add-repo https://rpm.releases.hashicorp.com/RHEL/hashicorp.repo
fi
;;
esac
echo -e "${GREEN}[3/10] Installing Consul...${NC}"
$PKG_INSTALL consul
echo -e "${GREEN}[4/10] Creating consul user and directories...${NC}"
if ! id consul >/dev/null 2>&1; then
useradd --system --home /etc/consul.d --shell /bin/false consul
fi
mkdir -p /opt/consul /etc/consul.d/tls
chown -R consul:consul /opt/consul /etc/consul.d
chmod 755 /opt/consul /etc/consul.d
chmod 700 /etc/consul.d/tls
echo -e "${GREEN}[5/10] Generating encryption key and certificates...${NC}"
if [[ ! -f /etc/consul.d/encrypt.key ]]; then
ENCRYPT_KEY=$(consul keygen)
echo "$ENCRYPT_KEY" > /etc/consul.d/encrypt.key
chmod 600 /etc/consul.d/encrypt.key
chown consul:consul /etc/consul.d/encrypt.key
else
ENCRYPT_KEY=$(cat /etc/consul.d/encrypt.key)
fi
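# NOTE: this creates a new CA when none is present. In a multi-node cluster
# every node must trust the same CA, so copy the shared CA and this node's
# certificate into /etc/consul.d/tls before running the script on other nodes.
if [[ ! -f /etc/consul.d/tls/consul-agent-ca.pem ]]; then
  cd /etc/consul.d/tls
  consul tls ca create
  consul tls cert create -server -dc "$DATACENTER"
  chown -R consul:consul /etc/consul.d/tls
  chmod 600 /etc/consul.d/tls/*.pem
fi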
echo -e "${GREEN}[6/10] Creating Consul configuration...${NC}"
cat > /etc/consul.d/consul.hcl << EOF
datacenter = "$DATACENTER"
primary_datacenter = "$PRIMARY_DC"
data_dir = "/opt/consul"
log_level = "INFO"
server = $([ "$NODE_ROLE" = "server" ] && echo "true" || echo "false")
bind_addr = "$BIND_ADDR"
client_addr = "0.0.0.0"
EOF
if [[ "$NODE_ROLE" = "server" ]]; then
cat >> /etc/consul.d/consul.hcl << EOF
bootstrap_expect = 3
ui_config {
enabled = true
}
EOF
fi
if [[ -n "$JOIN_ADDRS" ]]; then
IFS=',' read -ra ADDR_ARRAY <<< "$JOIN_ADDRS"
echo "retry_join = [" >> /etc/consul.d/consul.hcl
for addr in "${ADDR_ARRAY[@]}"; do
echo " \"$addr\"," >> /etc/consul.d/consul.hcl
done
echo "]" >> /etc/consul.d/consul.hcl
echo "" >> /etc/consul.d/consul.hcl
fi
if [[ -n "$WAN_JOIN_ADDRS" && "$DATACENTER" != "$PRIMARY_DC" ]]; then
IFS=',' read -ra WAN_ADDR_ARRAY <<< "$WAN_JOIN_ADDRS"
echo "retry_join_wan = [" >> /etc/consul.d/consul.hcl
for addr in "${WAN_ADDR_ARRAY[@]}"; do
echo " \"$addr\"," >> /etc/consul.d/consul.hcl
done
echo "]" >> /etc/consul.d/consul.hcl
echo "" >> /etc/consul.d/consul.hcl
fi
cat >> /etc/consul.d/consul.hcl << EOF
connect {
enabled = true
}
ports {
grpc = 8502
}
encrypt = "$ENCRYPT_KEY"
tls {
defaults {
ca_file = "/etc/consul.d/tls/consul-agent-ca.pem"
cert_file = "/etc/consul.d/tls/$DATACENTER-server-consul-0.pem"
key_file = "/etc/consul.d/tls/$DATACENTER-server-consul-0-key.pem"
verify_incoming = true
verify_outgoing = true
}
internal_rpc {
verify_server_hostname = true
}
}
acl = {
enabled = true
default_policy = "deny"
enable_token_persistence = true
EOF
if [[ "$DATACENTER" != "$PRIMARY_DC" ]]; then
echo " enable_token_replication = true" >> /etc/consul.d/consul.hcl
fi
echo "}" >> /etc/consul.d/consul.hcl
chown consul:consul /etc/consul.d/consul.hcl
chmod 640 /etc/consul.d/consul.hcl
echo -e "${GREEN}[7/10] Creating systemd service...${NC}"
cat > /etc/systemd/system/consul.service << EOF
[Unit]
Description=Consul
Documentation=https://www.consul.io/
Requires=network-online.target
After=network-online.target
ConditionFileNotEmpty=/etc/consul.d/consul.hcl
[Service]
Type=notify
User=consul
Group=consul
ExecStart=/usr/bin/consul agent -config-dir=/etc/consul.d/
ExecReload=/bin/kill -HUP \$MAINPID
KillMode=process
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable consul
echo -e "${GREEN}[8/10] Configuring firewall...${NC}"
case "$FIREWALL_CMD" in
ufw)
if systemctl is-active --quiet ufw; then
ufw allow 8300/tcp
ufw allow 8301/tcp
ufw allow 8301/udp
ufw allow 8302/tcp
ufw allow 8302/udp
ufw allow 8500/tcp
ufw allow 8502/tcp
ufw allow 8600/tcp
ufw allow 8600/udp
fi
;;
firewall-cmd)
if systemctl is-active --quiet firewalld; then
firewall-cmd --permanent --add-port=8300/tcp
firewall-cmd --permanent --add-port=8301/tcp
firewall-cmd --permanent --add-port=8301/udp
firewall-cmd --permanent --add-port=8302/tcp
firewall-cmd --permanent --add-port=8302/udp
firewall-cmd --permanent --add-port=8500/tcp
firewall-cmd --permanent --add-port=8502/tcp
firewall-cmd --permanent --add-port=8600/tcp
firewall-cmd --permanent --add-port=8600/udp
firewall-cmd --reload
fi
;;
esac
echo -e "${GREEN}[9/10] Starting Consul service...${NC}"
systemctl start consul
echo -e "${GREEN}[10/10] Verifying installation...${NC}"
sleep 5
if systemctl is-active --quiet consul; then
echo -e "${GREEN}✓ Consul service is running${NC}"
else
echo -e "${RED}✗ Consul service failed to start${NC}"
exit 1
fi
if consul members >/dev/null 2>&1; then
echo -e "${GREEN}✓ Consul cluster is accessible${NC}"
else
echo -e "${YELLOW}⚠ Consul cluster not yet accessible (this is normal for initial setup)${NC}"
fi
echo -e "${GREEN}Installation completed successfully!${NC}"
echo -e "${YELLOW}Next steps:${NC}"
echo "1. Copy the encryption key to all nodes: $(cat /etc/consul.d/encrypt.key)"
echo "2. Copy TLS certificates to other nodes"
echo "3. Bootstrap ACL system: consul acl bootstrap"
echo "4. Access UI at: http://$BIND_ADDR:8500"
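Review the script before running. It must run as root and requires arguments, for example: sudo bash install.sh --datacenter dc1 --primary-dc dc1 --role server --bind-addr 203.0.113.10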