Configure Elasticsearch 8 ILM policies to automatically manage log data through hot-warm-cold phases, optimize storage costs, and enforce retention policies for production workloads.
Prerequisites
- Elasticsearch 8.x cluster running
- Cluster health status yellow or green
- Administrative access to Elasticsearch
- At least 10GB free disk space
What this solves
Index lifecycle management (ILM) automates the movement of log data through different storage tiers based on age, size, and performance requirements. This prevents disk space issues, reduces storage costs, and maintains query performance by moving older data to cheaper storage while keeping recent data on fast disks.
Prerequisites and Elasticsearch verification
Verify Elasticsearch cluster health
Check that your Elasticsearch cluster is running and healthy before configuring ILM policies.
curl -X GET "localhost:9200/_cluster/health?pretty"
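The prerequisite of a yellow or green status can be turned into a gate for the remaining steps. The following is a minimal sketch, not part of the original procedure: `health_ok` is a hypothetical helper name, and it assumes an unauthenticated cluster on localhost:9200 as used throughout this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when a _cluster/health response on stdin
# reports a green or yellow status. Plain grep keeps the sketch free of jq.
health_ok() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"(green|yellow)"'
}

# Usage (assumes no authentication is required):
#   curl -s "localhost:9200/_cluster/health" | health_ok || echo "cluster not ready"
```

A red status, or no response at all, makes the function return non-zero, so it can guard a setup script with `|| exit 1`.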
Check current ILM status
Verify that ILM is enabled in your cluster and see existing policies.
curl -X GET "localhost:9200/_ilm/status?pretty"
curl -X GET "localhost:9200/_ilm/policy?pretty"
Create data directories for different tiers
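Set up separate directories for hot, warm, and cold data storage tiers. A node only uses one of these directories if its path.data setting points at it; creating them does not by itself assign data to a tier.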
sudo mkdir -p /var/lib/elasticsearch/hot
sudo mkdir -p /var/lib/elasticsearch/warm
sudo mkdir -p /var/lib/elasticsearch/cold
sudo chown -R elasticsearch:elasticsearch /var/lib/elasticsearch
Configure ILM policies for log retention
Create a basic log retention policy
Define a policy that moves data through hot, warm, and cold phases and deletes each index 90 days after it rolls over.
curl -X PUT "localhost:9200/_ilm/policy/logs-policy" -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "5GB",
            "max_age": "7d",
            "max_docs": 10000000
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "allocate": {
            "number_of_replicas": 0
          },
          "forcemerge": {
            "max_num_segments": 1
          },
          "set_priority": {
            "priority": 50
          }
        }
      },
      "cold": {
        "min_age": "30d",
        "actions": {
          "allocate": {
            "number_of_replicas": 0
          },
          "set_priority": {
            "priority": 0
          }
        }
      },
      "delete": {
        "min_age": "90d"
      }
    }
  }
}'
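The min_age values in this policy grow from phase to phase (7d, 30d, 90d). A small pre-flight check can catch an out-of-order value before the policy is sent. This is a sketch under assumptions: the helper names `to_hours` and `min_ages_ordered` are hypothetical, and only the d and h units are handled.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert an ILM duration such as 7d or 12h to whole hours.
# Only the d and h units are handled in this sketch.
to_hours() {
  local value=${1%[dh]} unit=${1: -1}
  [[ $value =~ ^[0-9]+$ ]] || { echo "unsupported duration: $1" >&2; return 1; }
  case "$unit" in
    d) echo $(( value * 24 )) ;;
    h) echo "$value" ;;
    *) echo "unsupported unit: $1" >&2; return 1 ;;
  esac
}

# Succeeds only if the min_age values, passed in phase order, never decrease.
min_ages_ordered() {
  local prev=0 hours age
  for age in "$@"; do
    hours=$(to_hours "$age") || return 1
    (( hours >= prev )) || return 1
    prev=$hours
  done
}

# warm, cold, delete from logs-policy
min_ages_ordered 7d 30d 90d && echo "min_age values are in order"
```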
Create an application-specific policy
Configure a more aggressive policy for high-volume application logs with shorter retention.
curl -X PUT "localhost:9200/_ilm/policy/app-logs-policy" -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "2GB",
            "max_age": "1d"
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "min_age": "1d",
        "actions": {
          "allocate": {
            "number_of_replicas": 0
          },
          "forcemerge": {
            "max_num_segments": 1
          },
          "shrink": {
            "number_of_shards": 1
          }
        }
      },
      "delete": {
        "min_age": "30d"
      }
    }
  }
}'
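Shorter rollover intervals mean more indices alive at the same time. The following back-of-the-envelope sketch estimates that count. It assumes rollover fires on max_age alone (size-based rollover would produce more indices), that the delete min_age is counted from rollover, and it ignores the ILM poll interval. The function name `estimate_live_indices` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate how many indices one policy keeps alive at once.
# Assumes age-only rollover and a delete min_age measured from rollover time.
estimate_live_indices() {
  local rollover_days=$1 delete_days=$2
  # Rolled-over indices still inside the retention window (rounded up),
  # plus the current write index.
  echo $(( (delete_days + rollover_days - 1) / rollover_days + 1 ))
}

estimate_live_indices 1 30   # app-logs-policy: daily rollover, 30d retention
estimate_live_indices 7 90   # logs-policy: weekly rollover, 90d retention
```

Under these assumptions the two calls print 31 and 14, so the application policy keeps roughly twice as many indices around despite its shorter retention.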
Create a security audit policy
Define a policy for security logs that need longer retention (2555 days, roughly seven years) and that become read-only once they leave the hot phase.
curl -X PUT "localhost:9200/_ilm/policy/security-logs-policy" -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "10GB",
            "max_age": "30d"
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "min_age": "30d",
        "actions": {
          "allocate": {
            "number_of_replicas": 1
          },
          "forcemerge": {
            "max_num_segments": 1
          },
          "readonly": {}
        }
      },
      "cold": {
        "min_age": "90d",
        "actions": {
          "allocate": {
            "number_of_replicas": 0
          }
        }
      },
      "delete": {
        "min_age": "2555d"
      }
    }
  }
}'
Create index templates with ILM integration
Create template for application logs
Configure an index template that automatically applies the ILM policy to matching indices.
curl -X PUT "localhost:9200/_index_template/app-logs-template" -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["app-logs-*"],
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "app-logs-policy",
          "rollover_alias": "app-logs"
        },
        "number_of_shards": 2,
        "number_of_replicas": 1,
        "codec": "best_compression"
      }
    },
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "level": {
          "type": "keyword"
        },
        "message": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "host": {
          "type": "keyword"
        },
        "application": {
          "type": "keyword"
        }
      }
    }
  },
  "priority": 200,
  "version": 1
}'
Create template for system logs
Set up a template for system logs with different mapping and ILM policy.
curl -X PUT "localhost:9200/_index_template/system-logs-template" -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["system-logs-*"],
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "logs-policy",
          "rollover_alias": "system-logs"
        },
        "number_of_shards": 1,
        "number_of_replicas": 1,
        "refresh_interval": "30s"
      }
    },
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "severity": {
          "type": "keyword"
        },
        "facility": {
          "type": "keyword"
        },
        "program": {
          "type": "keyword"
        },
        "message": {
          "type": "text"
        },
        "host": {
          "type": "keyword"
        }
      }
    }
  },
  "priority": 100
}'
Create initial indices with aliases
Create the initial write indices with proper aliases for ILM rollover functionality.
curl -X PUT "localhost:9200/app-logs-000001" -H 'Content-Type: application/json' -d'
{
  "aliases": {
    "app-logs": {
      "is_write_index": true
    }
  }
}'
curl -X PUT "localhost:9200/system-logs-000001" -H 'Content-Type: application/json' -d'
{
  "aliases": {
    "system-logs": {
      "is_write_index": true
    }
  }
}'
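The numeric suffix on the initial index matters because rollover increments it and zero-pads the result to six digits. The sketch below only illustrates that naming rule; `next_rollover_name` is a hypothetical helper, not an Elasticsearch command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: predict the name rollover will give the next index.
next_rollover_name() {
  local name=$1
  local prefix=${name%-*} suffix=${name##*-}
  # Rollover only increments names that end in a dash followed by digits.
  [[ $suffix =~ ^[0-9]+$ ]] || { echo "no numeric suffix: $name" >&2; return 1; }
  # 10# forces base 10 so a suffix like 000009 is not parsed as octal.
  printf '%s-%06d\n' "$prefix" $(( 10#$suffix + 1 ))
}

next_rollover_name app-logs-000001
```

An initial index named without a numeric suffix (for example app-logs-first) makes the helper fail, mirroring why the write indices above end in -000001.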
Monitor and optimize ILM performance
Check ILM policy execution
Monitor how your indices are progressing through lifecycle phases.
curl -X GET "localhost:9200/*/_ilm/explain?pretty"
curl -X GET "localhost:9200/_ilm/status?pretty"
View index allocation across nodes
Check how indices are distributed across your cluster nodes and storage tiers.
curl -X GET "localhost:9200/_cat/indices?v&s=store.size:desc"
curl -X GET "localhost:9200/_cat/allocation?v"
Configure ILM polling interval
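Adjust how frequently Elasticsearch checks for ILM actions to balance responsiveness with cluster load. The default interval is 10m, so the request below only restores that default; use a shorter value while testing to see phase transitions sooner.
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "indices.lifecycle.poll_interval": "10m"
  }
}'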
Create monitoring script
Set up automated monitoring to track ILM policy performance and storage usage. Save the following script as /usr/local/bin/monitor-ilm.sh; it requires curl and jq.
#!/bin/bash
# Monitor Elasticsearch ILM policies
ELASTIC_HOST="localhost:9200"
LOG_FILE="/var/log/elasticsearch-ilm-monitor.log"
echo "$(date): Starting ILM monitoring check" >> "$LOG_FILE"
# Check ILM status
ILM_STATUS=$(curl -s "$ELASTIC_HOST/_ilm/status" | jq -r '.operation_mode')
echo "ILM Status: $ILM_STATUS" >> "$LOG_FILE"
# Check indices in each phase
echo "Checking index phases:" >> "$LOG_FILE"
curl -s "$ELASTIC_HOST/*/_ilm/explain" | jq -r '.indices[] | "\(.index): \(.phase)"' >> "$LOG_FILE"
# Check cluster disk usage (the _cat/allocation column is disk.percent)
echo "Cluster storage usage:" >> "$LOG_FILE"
curl -s "$ELASTIC_HOST/_cat/allocation?v&h=node,disk.percent" >> "$LOG_FILE"
# Alert if any index is sitting in the ILM ERROR step
ERRORS=$(curl -s "$ELASTIC_HOST/*/_ilm/explain" | jq -r '.indices[] | select(.step == "ERROR") | .index')
if [ -n "$ERRORS" ]; then
  echo "ERROR: Indices with ILM errors: $ERRORS" >> "$LOG_FILE"
fi
echo "$(date): ILM monitoring check completed" >> "$LOG_FILE"
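The script records disk usage but does not act on it. A possible extension is a threshold filter over the `_cat/allocation` output, sketched below. The helper name `nodes_over_threshold` and the 85 percent limit are assumptions chosen for illustration, and the input is expected in the column order `disk.percent,node` without a header row.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the nodes whose disk usage exceeds a percentage.
# Expects "<disk.percent> <node>" lines on stdin; rows without a numeric
# percentage (such as the UNASSIGNED row) are skipped.
nodes_over_threshold() {
  local limit=$1
  awk -v limit="$limit" '$1 ~ /^[0-9]+$/ && $1 + 0 > limit { print $2 }'
}

# Usage (assumes an unauthenticated cluster on localhost:9200):
#   curl -s "localhost:9200/_cat/allocation?h=disk.percent,node" | nodes_over_threshold 85
```

Putting the percentage first keeps the parsing simple, since it is always a single token.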
Make monitoring script executable and schedule it
Set proper permissions and create a cron job for regular ILM monitoring.
sudo chmod +x /usr/local/bin/monitor-ilm.sh
sudo chown root:root /usr/local/bin/monitor-ilm.sh
Add monitoring to cron
Schedule the monitoring script to run every hour and track ILM policy execution.
sudo crontab -e
Add this line to run monitoring every hour:
0 * * * * /usr/local/bin/monitor-ilm.sh
Advanced ILM configuration options
Configure node attributes for data tiers
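Set node roles and a custom attribute in each node's elasticsearch.yml to control where hot, warm, and cold data is stored, then restart the node. The data_hot, data_warm, and data_cold roles drive ILM's automatic migration between tiers; the node.attr.data attribute only matters for policies that filter on it in an allocate action.
For hot nodes:
node.attr.data: hot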
node.roles: ["data_hot", "data_content"]
For warm nodes:
node.attr.data: warm
node.roles: ["data_warm", "data_content"]
For cold nodes:
node.attr.data: cold
node.roles: ["data_cold", "data_content"]
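After the nodes restart, the custom attribute can be checked through the `_cat/nodeattrs` API. The sketch below filters that output for one tier. `nodes_for_tier` is a hypothetical helper, and it assumes the attribute is named data as in the configuration above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list nodes whose custom "data" attribute equals a tier.
# Expects "<node> <attr> <value>" lines on stdin.
nodes_for_tier() {
  local tier=$1
  awk -v tier="$tier" '$2 == "data" && $3 == tier { print $1 }'
}

# Usage (assumes an unauthenticated cluster on localhost:9200):
#   curl -s "localhost:9200/_cat/nodeattrs?h=node,attr,value" | nodes_for_tier warm
```

An empty result for a tier suggests the attribute was not picked up, for example because the node was not restarted.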
Create policy with searchable snapshots
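Configure ILM to use searchable snapshots for long-term cold storage. The policy assumes a snapshot repository named backup-repo is already registered and that your license level includes searchable snapshots.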
curl -X PUT "localhost:9200/_ilm/policy/archive-policy" -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50GB",
            "max_age": "30d"
          }
        }
      },
      "warm": {
        "min_age": "30d",
        "actions": {
          "forcemerge": {
            "max_num_segments": 1
          }
        }
      },
      "cold": {
        "min_age": "90d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "backup-repo"
          }
        }
      },
      "frozen": {
        "min_age": "180d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "backup-repo",
            "force_merge_index": true
          }
        }
      },
      "delete": {
        "min_age": "2555d"
      }
    }
  }
}'
Verify your setup
# Check all ILM policies
curl -X GET "localhost:9200/_ilm/policy?pretty"
# Verify index templates
curl -X GET "localhost:9200/_index_template?pretty"
# Check index lifecycle status
curl -X GET "localhost:9200/*/_ilm/explain?pretty"
# Test with sample data
curl -X POST "localhost:9200/app-logs/_doc" -H 'Content-Type: application/json' -d'
{
  "@timestamp": "2024-01-15T10:30:00Z",
  "level": "INFO",
  "message": "Application started successfully",
  "host": "web01",
  "application": "myapp"
}'
# Verify the document was indexed
curl -X GET "localhost:9200/app-logs/_search?pretty"
To trigger a phase transition manually for testing, use curl -X POST "localhost:9200/_ilm/move/your-index" with a JSON body that names the index's current_step and the next_step to move to.
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| Index stuck in hot phase | Rollover conditions not met | Check rollover settings and index size: curl -X GET "localhost:9200/_cat/indices?v" |
| ILM policy not applied | Index created before template | Apply policy manually: curl -X PUT "localhost:9200/index/_settings" -H 'Content-Type: application/json' -d '{"index.lifecycle.name":"policy"}' |
| Phase transition failed | Node allocation issues | Check cluster allocation: curl -X GET "localhost:9200/_cluster/allocation/explain" |
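| Delete phase not working | min_age is counted from rollover rather than index creation, or an earlier step failed | Check the index age and current step: curl -X GET "localhost:9200/index/_ilm/explain?pretty" |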
| High memory usage | Too many active indices | Adjust rollover settings to create fewer, larger indices |
Next steps
- Set up centralized log aggregation with Elasticsearch, Logstash, and Kibana to collect logs from multiple sources
- Monitor Elasticsearch cluster with Prometheus and Grafana dashboards for comprehensive performance tracking
- Implement Elasticsearch snapshot lifecycle management with S3 storage for automated backups
- Configure Elasticsearch cross-cluster replication for disaster recovery to ensure data availability
- Set up Elasticsearch SSL/TLS encryption and advanced security hardening for production environments