Elasticsearch 8 ILM Setup Guide

Configure Elasticsearch 8 ILM policies to automatically manage log data through hot-warm-cold phases, optimize storage costs, and enforce retention policies for production workloads.

Prerequisites

Elasticsearch 8.x cluster running
Cluster health status yellow or green
Administrative access to Elasticsearch
At least 10GB free disk space

What this solves

Index lifecycle management (ILM) automates the movement of log data through different storage tiers based on age, size, and performance requirements. This prevents disk space issues, reduces storage costs, and maintains query performance by moving older data to cheaper storage while keeping recent data on fast disks.

Prerequisites and Elasticsearch verification

Verify Elasticsearch cluster health

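Check that your Elasticsearch cluster is running and healthy before configuring ILM policies. The commands in this guide assume an unsecured HTTP endpoint at localhost:9200. Elasticsearch 8 enables security by default, so on a default installation switch to https and pass credentials (for example with curl's -u and --cacert options).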

curl -X GET "localhost:9200/_cluster/health?pretty"
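If you want a setup script to stop early on an unhealthy cluster, the check can be wrapped in a small helper. This is only a sketch: the function name `health_is_usable` is made up for illustration, and the commented call assumes the `_cat/health` endpoint with `h=status`, which returns the status as plain text.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when the status meets the prerequisite
# (green or yellow), fails for red or anything unexpected.
health_is_usable() {
  case "$1" in
    green|yellow) return 0 ;;
    *) return 1 ;;
  esac
}

# Example call against a live cluster (not run here):
#   status=$(curl -s "localhost:9200/_cat/health?h=status" | tr -d '[:space:]')
#   health_is_usable "$status" || { echo "cluster is $status" >&2; exit 1; }
```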

Check current ILM status

Confirm that ILM is running (operation_mode should be RUNNING) and list the policies that already exist.

curl -X GET "localhost:9200/_ilm/status?pretty"
curl -X GET "localhost:9200/_ilm/policy?pretty"
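The status response carries a single `operation_mode` field. As a rough sketch, the helper below extracts it with sed so the check also works on hosts without jq. The name `ilm_operation_mode` is an assumption for this example. The commented call shows how ILM could be started with `POST _ilm/start` if it is not running.

```shell
#!/bin/bash
# Hypothetical helper: read an _ilm/status response on stdin and print
# the value of operation_mode (for example RUNNING or STOPPED).
ilm_operation_mode() {
  sed -n 's/.*"operation_mode"[[:space:]]*:[[:space:]]*"\([A-Z]*\)".*/\1/p'
}

# Example call against a live cluster (not run here):
#   mode=$(curl -s "localhost:9200/_ilm/status" | ilm_operation_mode)
#   [ "$mode" = "RUNNING" ] || curl -X POST "localhost:9200/_ilm/start"
```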

Create data directories for different tiers

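Set up separate directories for hot, warm, and cold data. Directories alone do not create tiers: a directory only takes effect when a node's path.data points at it and that node has the matching data tier role (see the node configuration later in this guide).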

sudo mkdir -p /var/lib/elasticsearch/hot
sudo mkdir -p /var/lib/elasticsearch/warm
sudo mkdir -p /var/lib/elasticsearch/cold
sudo chown -R elasticsearch:elasticsearch /var/lib/elasticsearch

Configure ILM policies for log retention

Create a basic log retention policy

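Define a policy that moves data through hot, warm, and cold phases and deletes each index 90 days after it rolls over. Once an index has rolled over, the min_age values in later phases are counted from the rollover time, not from index creation.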

curl -X PUT "localhost:9200/_ilm/policy/logs-policy" -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "5GB",
            "max_age": "7d",
            "max_docs": 10000000
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "allocate": {
            "number_of_replicas": 0
          },
          "forcemerge": {
            "max_num_segments": 1
          },
          "set_priority": {
            "priority": 50
          }
        }
      },
      "cold": {
        "min_age": "30d",
        "actions": {
          "allocate": {
            "number_of_replicas": 0
          },
          "set_priority": {
            "priority": 0
          }
        }
      },
      "delete": {
        "min_age": "90d"
      }
    }
  }
}'
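Because min_age is counted from rollover, the oldest document in an index can be older than 90 days when the index is deleted. The arithmetic below is a sketch of that upper bound. It assumes rollover is triggered by max_age alone (size or document limits would make it happen sooner) and ignores polling delays. The function name is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: approximate upper bound on document age (in days)
# at deletion, assuming the index rolls over on max_age and that the
# delete phase min_age is counted from the rollover time.
max_doc_age_days() {
  local rollover_max_age_days=$1 delete_min_age_days=$2
  echo $(( rollover_max_age_days + delete_min_age_days ))
}

max_doc_age_days 7 90   # logs-policy values: prints 97
```

If a strict 90-day limit matters, the delete phase min_age would need to be shortened by the rollover max_age.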

Create an application-specific policy

Configure a more aggressive policy for high-volume application logs with shorter retention.

curl -X PUT "localhost:9200/_ilm/policy/app-logs-policy" -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "2GB",
            "max_age": "1d"
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "min_age": "1d",
        "actions": {
          "allocate": {
            "number_of_replicas": 0
          },
          "forcemerge": {
            "max_num_segments": 1
          },
          "shrink": {
            "number_of_shards": 1
          }
        }
      },
      "delete": {
        "min_age": "30d"
      }
    }
  }
}'

Create a security audit policy

Define a policy for security logs that keeps data for 2555 days (roughly seven years) and marks indices read-only once they enter the warm phase.

curl -X PUT "localhost:9200/_ilm/policy/security-logs-policy" -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "10GB",
            "max_age": "30d"
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "min_age": "30d",
        "actions": {
          "allocate": {
            "number_of_replicas": 1
          },
          "forcemerge": {
            "max_num_segments": 1
          },
          "readonly": {}
        }
      },
      "cold": {
        "min_age": "90d",
        "actions": {
          "allocate": {
            "number_of_replicas": 0
          }
        }
      },
      "delete": {
        "min_age": "2555d"
      }
    }
  }
}'

Create index templates with ILM integration

Create template for application logs

Configure an index template that automatically applies the ILM policy to matching indices.

curl -X PUT "localhost:9200/_index_template/app-logs-template" -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["app-logs-*"],
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "app-logs-policy",
          "rollover_alias": "app-logs"
        },
        "number_of_shards": 2,
        "number_of_replicas": 1,
        "codec": "best_compression"
      }
    },
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "level": {
          "type": "keyword"
        },
        "message": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "host": {
          "type": "keyword"
        },
        "application": {
          "type": "keyword"
        }
      }
    }
  },
  "priority": 200,
  "version": 1
}'
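To see which settings and mappings a new index would receive before creating it, Elasticsearch offers a simulate-index API. The sketch below only builds the request URL. The helper name `simulate_index_url` and the `ELASTIC_HOST` variable are assumptions for this example.

```shell
#!/bin/bash
ELASTIC_HOST="${ELASTIC_HOST:-localhost:9200}"

# Hypothetical helper: URL of the simulate-index API for a given index name.
simulate_index_url() {
  echo "$ELASTIC_HOST/_index_template/_simulate_index/$1"
}

# Example call against a live cluster (not run here):
#   curl -s -X POST "$(simulate_index_url app-logs-000001)?pretty"
```

The response should list index.lifecycle.name as app-logs-policy if the template matches the index name.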

Create template for system logs

Set up a template for system logs with its own mapping and the logs-policy ILM policy.

curl -X PUT "localhost:9200/_index_template/system-logs-template" -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["system-logs-*"],
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "logs-policy",
          "rollover_alias": "system-logs"
        },
        "number_of_shards": 1,
        "number_of_replicas": 1,
        "refresh_interval": "30s"
      }
    },
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "severity": {
          "type": "keyword"
        },
        "facility": {
          "type": "keyword"
        },
        "program": {
          "type": "keyword"
        },
        "message": {
          "type": "text"
        },
        "host": {
          "type": "keyword"
        }
      }
    }
  },
  "priority": 100
}'

Create initial indices with aliases

Create the initial write indices with the aliases that ILM rollover needs. Create them after the index templates exist so that the lifecycle settings are applied at creation time.

curl -X PUT "localhost:9200/app-logs-000001" -H 'Content-Type: application/json' -d'
{
  "aliases": {
    "app-logs": {
      "is_write_index": true
    }
  }
}'

curl -X PUT "localhost:9200/system-logs-000001" -H 'Content-Type: application/json' -d'
{
  "aliases": {
    "system-logs": {
      "is_write_index": true
    }
  }
}'
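The numeric suffix matters: on rollover, the new index takes the same name with the number incremented and zero-padded to six digits. A minimal sketch of that naming rule, using a hypothetical helper name:

```shell
#!/bin/bash
# Hypothetical helper: name of the index that rollover would create next,
# assuming the current name ends in a dash followed by a number.
next_rollover_name() {
  local name=$1
  local prefix=${name%-*} suffix=${name##*-}
  # 10# forces base 10 so zero-padded values such as 000009 are not read as octal.
  printf '%s-%06d\n' "$prefix" "$(( 10#$suffix + 1 ))"
}

next_rollover_name app-logs-000001   # prints app-logs-000002
```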

Monitor and optimize ILM performance

Check ILM policy execution

Monitor how your indices are progressing through lifecycle phases.

curl -X GET "localhost:9200/*/_ilm/explain?pretty"
curl -X GET "localhost:9200/_ilm/status?pretty"

View index allocation across nodes

Check how indices are distributed across your cluster nodes and storage tiers.

curl -X GET "localhost:9200/_cat/indices?v&s=store.size:desc"
curl -X GET "localhost:9200/_cat/allocation?v"

Configure ILM polling interval

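Adjust how frequently Elasticsearch checks for ILM actions to balance responsiveness with cluster load. The default is 10m, so the request below only makes that value explicit. Use a shorter interval such as 1m while testing.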

curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "indices.lifecycle.poll_interval": "10m"
  }
}'
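While testing you may want a shorter interval and then a return to the default. As a sketch, the helper below builds the request body for either case. Passing no argument produces a null value, which removes the persistent override. The function name is an assumption for this example.

```shell
#!/bin/bash
# Hypothetical helper: build a cluster-settings body for the ILM poll interval.
# With an argument, set that interval; without one, reset to the default (null).
poll_interval_body() {
  if [ -n "$1" ]; then
    printf '{"persistent":{"indices.lifecycle.poll_interval":"%s"}}\n' "$1"
  else
    printf '{"persistent":{"indices.lifecycle.poll_interval":null}}\n'
  fi
}

# Example calls against a live cluster (not run here):
#   curl -X PUT "localhost:9200/_cluster/settings" \
#     -H 'Content-Type: application/json' -d "$(poll_interval_body 1m)"
#   curl -X PUT "localhost:9200/_cluster/settings" \
#     -H 'Content-Type: application/json' -d "$(poll_interval_body)"
```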

Create monitoring script

Set up automated monitoring to track ILM policy performance and storage usage. Save the following script as /usr/local/bin/monitor-ilm.sh. It requires curl and jq.

#!/bin/bash

# Monitor Elasticsearch ILM policies
ELASTIC_HOST="localhost:9200"
LOG_FILE="/var/log/elasticsearch-ilm-monitor.log"

echo "$(date): Starting ILM monitoring check" >> "$LOG_FILE"

# Check ILM status
ILM_STATUS=$(curl -s "$ELASTIC_HOST/_ilm/status" | jq -r '.operation_mode')
echo "ILM Status: $ILM_STATUS" >> "$LOG_FILE"

# Check indices in each phase (only ILM-managed indices report a phase)
echo "Checking index phases:" >> "$LOG_FILE"
curl -s "$ELASTIC_HOST/*/_ilm/explain" | jq -r '.indices[] | select(.managed) | "\(.index): \(.phase)"' >> "$LOG_FILE"

# Check cluster disk usage (disk.percent is the _cat/allocation column name)
echo "Cluster storage usage:" >> "$LOG_FILE"
curl -s "$ELASTIC_HOST/_cat/allocation?v&h=node,disk.percent" >> "$LOG_FILE"

# Alert if any index is in error state (ILM parks failed indices on the ERROR step)
ERRORS=$(curl -s "$ELASTIC_HOST/*/_ilm/explain" | jq -r '.indices[] | select(.step == "ERROR") | .index')
if [ -n "$ERRORS" ]; then
    echo "ERROR: Indices with ILM errors: $ERRORS" >> "$LOG_FILE"
fi

echo "$(date): ILM monitoring check completed" >> "$LOG_FILE"
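The script logs one "index: phase" line per index. If you would rather see a per-phase summary, lines in that format can be tallied with awk. This is a sketch with a hypothetical helper name and made-up index names.

```shell
#!/bin/bash
# Hypothetical helper: read "index: phase" lines on stdin and print
# one "phase count" line per phase, sorted by phase name.
count_by_phase() {
  awk -F': ' 'NF == 2 { count[$2]++ } END { for (p in count) print p, count[p] }' | sort
}

# Example with made-up index names:
printf '%s\n' "app-logs-000001: warm" "app-logs-000002: hot" "system-logs-000001: hot" | count_by_phase
# prints:
# hot 2
# warm 1
```

In the monitoring script, the jq output for the phase check could be piped through this function before being appended to the log.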

Make monitoring script executable and schedule it

Set proper permissions and create a cron job for regular ILM monitoring.

sudo chmod +x /usr/local/bin/monitor-ilm.sh
sudo chown root:root /usr/local/bin/monitor-ilm.sh

Add monitoring to cron

Schedule the monitoring script to run every hour and track ILM policy execution.

sudo crontab -e

Add this line to run monitoring every hour:

0 * * * * /usr/local/bin/monitor-ilm.sh

Advanced ILM configuration options

Configure node attributes for data tiers

Set node roles and attributes in elasticsearch.yml on each node to control where hot, warm, and cold data is stored, then restart the node.

For hot nodes:
node.attr.data: hot
node.roles: ["data_hot", "data_content"]

For warm nodes:
node.attr.data: warm
node.roles: ["data_warm", "data_content"]

For cold nodes:
node.attr.data: cold
node.roles: ["data_cold", "data_content"]
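After restarting the nodes, `_cat/nodes` reports each node's roles as a string of single letters. The sketch below decodes only the data tier letters (h, w, c, f, and s for content) and ignores the rest. The helper name is made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: translate the role letters shown by _cat/nodes into
# data tier role names. Non-tier letters (m, i, and so on) are skipped.
tier_roles() {
  local letters=$1 out="" i ch
  for (( i = 0; i < ${#letters}; i++ )); do
    ch=${letters:i:1}
    case "$ch" in
      h) out+="data_hot " ;;
      w) out+="data_warm " ;;
      c) out+="data_cold " ;;
      f) out+="data_frozen " ;;
      s) out+="data_content " ;;
    esac
  done
  echo "${out% }"
}

tier_roles "hs"   # prints data_hot data_content

# Example call against a live cluster (not run here):
#   curl -s "localhost:9200/_cat/nodes?v&h=name,node.role"
```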

Create policy with searchable snapshots

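Configure ILM to use searchable snapshots for long-term cold storage. The snapshot repository named in the policy (backup-repo here) must already be registered in the cluster, and searchable snapshots require a subscription level that includes them.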

curl -X PUT "localhost:9200/_ilm/policy/archive-policy" -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50GB",
            "max_age": "30d"
          }
        }
      },
      "warm": {
        "min_age": "30d",
        "actions": {
          "forcemerge": {
            "max_num_segments": 1
          }
        }
      },
      "cold": {
        "min_age": "90d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "backup-repo"
          }
        }
      },
      "frozen": {
        "min_age": "180d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "backup-repo",
            "force_merge_index": true
          }
        }
      },
      "delete": {
        "min_age": "2555d"
      }
    }
  }
}'

Verify your setup

# Check all ILM policies
curl -X GET "localhost:9200/_ilm/policy?pretty"

# Verify index templates
curl -X GET "localhost:9200/_index_template?pretty"

# Check index lifecycle status
curl -X GET "localhost:9200/*/_ilm/explain?pretty"

# Test with sample data
curl -X POST "localhost:9200/app-logs/_doc" -H 'Content-Type: application/json' -d'
{
  "@timestamp": "2024-01-15T10:30:00Z",
  "level": "INFO",
  "message": "Application started successfully",
  "host": "web01",
  "application": "myapp"
}'

# Verify the document was indexed
curl -X GET "localhost:9200/app-logs/_search?pretty"

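Note: ILM checks for pending actions once per polling interval, so transitions are not immediate. To force a transition while testing, call POST "localhost:9200/_ilm/move/your-index" with a JSON body that names the index's current_step and the next_step to move to. A request without that body is rejected.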
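As a rough sketch of a move request body: the helper below fills in a current step and a target phase. The function name is hypothetical, and the step values in the commented call are only an example. Read the real phase, action, and step from `_ilm/explain` first, because the request fails if current_step does not match. Recent 8.x releases are assumed to accept a next_step that names only the phase. Older versions may need action and name as well.

```shell
#!/bin/bash
# Hypothetical helper: build a request body for the ILM move-to-step API.
# Arguments: current phase, current action, current step name, target phase.
move_step_body() {
  cat <<EOF
{
  "current_step": { "phase": "$1", "action": "$2", "name": "$3" },
  "next_step": { "phase": "$4" }
}
EOF
}

# Example call against a live cluster (not run here); the current step
# values must match what _ilm/explain reports for the index:
#   curl -X POST "localhost:9200/_ilm/move/app-logs-000001" \
#     -H 'Content-Type: application/json' \
#     -d "$(move_step_body hot rollover check-rollover-ready warm)"
```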

Common issues

Symptom	Cause	Fix
Index stuck in hot phase	Rollover conditions not met	Check rollover settings and index size: `curl -X GET "localhost:9200/_cat/indices?v"`
ILM policy not applied	Index created before template	Apply policy manually: `curl -X PUT "localhost:9200/index/_settings" -H 'Content-Type: application/json' -d '{"index.lifecycle.name":"policy"}'`
Phase transition failed	Node allocation issues	Check cluster allocation: `curl -X GET "localhost:9200/_cluster/allocation/explain"`
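Delete phase not working	min_age not yet reached, or ILM stopped	min_age counts from rollover, not index creation. Check the index age and current step with `curl -X GET "localhost:9200/index/_ilm/explain"` and confirm ILM is running with `curl -X GET "localhost:9200/_ilm/status"`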
High memory usage	Too many active indices	Adjust rollover settings to create fewer, larger indices

Next steps

Setup centralized log aggregation with Elasticsearch, Logstash, and Kibana to collect logs from multiple sources
Monitor Elasticsearch cluster with Prometheus and Grafana dashboards for comprehensive performance tracking
Implement Elasticsearch snapshot lifecycle management with S3 storage for automated backups
Configure Elasticsearch cross-cluster replication for disaster recovery to ensure data availability
Setup Elasticsearch SSL/TLS encryption and advanced security hardening for production environments
