Set up Elasticsearch 8 index lifecycle management for automated log retention and storage optimization

Intermediate 25 min May 17, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Configure Elasticsearch 8 ILM policies to automatically manage log data through hot-warm-cold phases, optimize storage costs, and enforce retention policies for production workloads.

Prerequisites

  • Elasticsearch 8.x cluster running
  • Cluster health status yellow or green
  • Administrative access to Elasticsearch
  • At least 10GB free disk space

What this solves

Index lifecycle management (ILM) automates the movement of log data through different storage tiers based on age, size, and performance requirements. This prevents disk space issues, reduces storage costs, and maintains query performance by moving older data to cheaper storage while keeping recent data on fast disks.

Prerequisites and Elasticsearch verification

Verify Elasticsearch cluster health

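Check that your Elasticsearch cluster is running and reports a yellow or green status before configuring ILM policies. The commands in this guide assume an unsecured HTTP endpoint on localhost:9200. A default Elasticsearch 8 installation enables TLS and authentication, so on such a cluster use https:// and pass credentials and the cluster CA certificate to curl (for example with -u and --cacert).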

curl -X GET "localhost:9200/_cluster/health?pretty"
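To make this prerequisite check scriptable, the sketch below wraps it in two small functions. It assumes the same unsecured endpoint as the command above; the names health_ok and fetch_health and the ES_HOST variable are illustrative and not part of Elasticsearch. It reads the status through the _cat/health API with h=status, which returns the status as a single plain-text word.

```shell
# Hypothetical helpers; function names and the ES_HOST default are assumptions.
ES_HOST="${ES_HOST:-localhost:9200}"

# Succeed when the status meets this guide's prerequisite (yellow or green).
health_ok() {
  case "$1" in
    green|yellow) return 0 ;;
    *)            return 1 ;;
  esac
}

# Fetch the cluster status as plain text and strip surrounding whitespace.
fetch_health() {
  curl -s "$ES_HOST/_cat/health?h=status" | tr -d '[:space:]'
}
```

A possible use is `health_ok "$(fetch_health)" && echo "cluster ready for ILM setup"`, which prints nothing when the cluster is red or unreachable.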

Check current ILM status

Verify that ILM is enabled in your cluster and see existing policies.

curl -X GET "localhost:9200/_ilm/status?pretty"
curl -X GET "localhost:9200/_ilm/policy?pretty"

Create data directories for different tiers

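Set up separate data directories for the hot, warm, and cold tiers. ILM moves indices between nodes according to their tier roles, not between directories on a single node, so each directory is only used once a node of the matching tier points its path.data setting at it.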

sudo mkdir -p /var/lib/elasticsearch/hot
sudo mkdir -p /var/lib/elasticsearch/warm
sudo mkdir -p /var/lib/elasticsearch/cold
sudo chown -R elasticsearch:elasticsearch /var/lib/elasticsearch

Configure ILM policies for log retention

Create a basic log retention policy

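Define a policy that moves data through hot, warm, and cold phases and deletes each index 90 days after rollover. Because the hot phase uses rollover, the min_age of every later phase is measured from the time the index rolled over, not from its creation.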

curl -X PUT "localhost:9200/_ilm/policy/logs-policy" -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "5GB",
            "max_age": "7d",
            "max_docs": 10000000
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "allocate": {
            "number_of_replicas": 0
          },
          "forcemerge": {
            "max_num_segments": 1
          },
          "set_priority": {
            "priority": 50
          }
        }
      },
      "cold": {
        "min_age": "30d",
        "actions": {
          "allocate": {
            "number_of_replicas": 0
          },
          "set_priority": {
            "priority": 0
          }
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}'
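To see how the min_age thresholds translate into phases, the sketch below maps the number of days since rollover to the phase logs-policy would target. The thresholds (7, 30, and 90 days) are copied from the policy above; phase_for_age is a hypothetical helper, and a real cluster only transitions at the next ILM poll after the threshold is reached and the previous phase's actions have finished.

```shell
# Thresholds copied from logs-policy; phase_for_age is an illustrative helper.
# Argument: whole days elapsed since the index rolled over.
phase_for_age() {
  days="$1"
  if   [ "$days" -ge 90 ]; then echo "delete"
  elif [ "$days" -ge 30 ]; then echo "cold"
  elif [ "$days" -ge 7 ];  then echo "warm"
  else                          echo "hot"
  fi
}

# Print the target phase at a few boundary ages.
for d in 0 7 29 30 89 90; do
  printf '%3d days after rollover -> %s\n' "$d" "$(phase_for_age "$d")"
done
```

The boundary cases show that an index stays in a phase until the next min_age is reached, so a 29-day-old index is still warm.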

Create an application-specific policy

Configure a more aggressive policy for high-volume application logs with shorter retention.

curl -X PUT "localhost:9200/_ilm/policy/app-logs-policy" -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "2GB",
            "max_age": "1d"
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "min_age": "1d",
        "actions": {
          "allocate": {
            "number_of_replicas": 0
          },
          "forcemerge": {
            "max_num_segments": 1
          },
          "shrink": {
            "number_of_shards": 1
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}'

Create a security audit policy

Define a policy for security logs that keeps them for 2555 days (about seven years) and makes each index read-only once it enters the warm phase.

curl -X PUT "localhost:9200/_ilm/policy/security-logs-policy" -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "10GB",
            "max_age": "30d"
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "min_age": "30d",
        "actions": {
          "allocate": {
            "number_of_replicas": 1
          },
          "forcemerge": {
            "max_num_segments": 1
          },
          "readonly": {}
        }
      },
      "cold": {
        "min_age": "90d",
        "actions": {
          "allocate": {
            "number_of_replicas": 0
          }
        }
      },
      "delete": {
        "min_age": "2555d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}'

Create index templates with ILM integration

Create template for application logs

Configure an index template that automatically applies the ILM policy to matching indices.

curl -X PUT "localhost:9200/_index_template/app-logs-template" -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["app-logs-*"],
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "app-logs-policy",
          "rollover_alias": "app-logs"
        },
        "number_of_shards": 2,
        "number_of_replicas": 1,
        "codec": "best_compression"
      }
    },
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "level": {
          "type": "keyword"
        },
        "message": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "host": {
          "type": "keyword"
        },
        "application": {
          "type": "keyword"
        }
      }
    }
  },
  "priority": 200,
  "version": 1
}'
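A template only applies to indices whose names match its index_patterns, which is a common reason for a policy not being attached. The sketch below checks a name against a pattern locally and defines a wrapper for the simulate index API, which reports the settings a new index would receive. The shell glob match is only an approximation of Elasticsearch's own wildcard handling, and the function names and ES_HOST variable are assumptions.

```shell
# Succeed when an index name matches a template pattern such as "app-logs-*".
# Shell globbing stands in for Elasticsearch's "*" wildcard (an approximation).
matches_pattern() {
  case "$1" in
    $2) return 0 ;;
    *)  return 1 ;;
  esac
}

# Ask the cluster which template settings a new index would get (not called here).
simulate_index() {
  curl -s -X POST "${ES_HOST:-localhost:9200}/_index_template/_simulate_index/$1?pretty"
}

matches_pattern "app-logs-000001" "app-logs-*" && echo "app-logs-000001: template applies"
matches_pattern "applogs-000001"  "app-logs-*" || echo "applogs-000001: no match"
```

Running `simulate_index app-logs-000001` against a live cluster should show index.lifecycle.name in the resolved settings if the template was stored correctly.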

Create template for system logs

Set up a template for system logs with different mapping and ILM policy.

curl -X PUT "localhost:9200/_index_template/system-logs-template" -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["system-logs-*"],
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "logs-policy",
          "rollover_alias": "system-logs"
        },
        "number_of_shards": 1,
        "number_of_replicas": 1,
        "refresh_interval": "30s"
      }
    },
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "severity": {
          "type": "keyword"
        },
        "facility": {
          "type": "keyword"
        },
        "program": {
          "type": "keyword"
        },
        "message": {
          "type": "text"
        },
        "host": {
          "type": "keyword"
        }
      }
    }
  },
  "priority": 100
}'

Create initial indices with aliases

Create the initial write indices with proper aliases for ILM rollover functionality.

curl -X PUT "localhost:9200/app-logs-000001" -H 'Content-Type: application/json' -d'
{
  "aliases": {
    "app-logs": {
      "is_write_index": true
    }
  }
}'

curl -X PUT "localhost:9200/system-logs-000001" -H 'Content-Type: application/json' -d'
{
  "aliases": {
    "system-logs": {
      "is_write_index": true
    }
  }
}'
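The initial index names end in -000001 because rollover derives the next name by incrementing that trailing number and zero-padding it to six digits. The sketch below reproduces that naming rule locally; next_rollover_name is a hypothetical helper for illustration and does not contact the cluster.

```shell
# Compute the name rollover would give the next index, e.g. app-logs-000002.
next_rollover_name() {
  prefix="${1%-*}"     # everything before the last hyphen
  counter="${1##*-}"   # trailing number, e.g. 000001
  # expr reads the zero-padded counter as a decimal number.
  printf '%s-%06d\n' "$prefix" "$(expr "$counter" + 1)"
}

next_rollover_name "app-logs-000001"
next_rollover_name "system-logs-000009"
```

An index whose name does not end in a hyphen followed by a number cannot be rolled over by ILM this way, which is why the initial indices are created by hand with numbered names.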

Monitor and optimize ILM performance

Check ILM policy execution

Monitor how your indices are progressing through lifecycle phases.

curl -X GET "localhost:9200/*/_ilm/explain?pretty"
curl -X GET "localhost:9200/_ilm/status?pretty"

View index allocation across nodes

Check how indices are distributed across your cluster nodes and storage tiers.

curl -X GET "localhost:9200/_cat/indices?v&s=store.size:desc"
curl -X GET "localhost:9200/_cat/allocation?v"

Configure ILM polling interval

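Adjust how frequently Elasticsearch checks for ILM actions to balance responsiveness with cluster load. The default is 10m, so the request below only makes that default explicit; use a shorter interval temporarily when testing policies.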

curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "indices.lifecycle.poll_interval": "10m"
  }
}'

Create monitoring script

Set up automated monitoring to track ILM policy performance and storage usage. Save the following script as /usr/local/bin/monitor-ilm.sh; it requires curl and jq.

#!/bin/bash
# Monitor Elasticsearch ILM policies

ELASTIC_HOST="localhost:9200"
LOG_FILE="/var/log/elasticsearch-ilm-monitor.log"

echo "$(date): Starting ILM monitoring check" >> "$LOG_FILE"

# Check ILM status
ILM_STATUS=$(curl -s "$ELASTIC_HOST/_ilm/status" | jq -r '.operation_mode')
echo "ILM Status: $ILM_STATUS" >> "$LOG_FILE"

# Check indices in each phase (fetch the explain output once and reuse it)
EXPLAIN=$(curl -s "$ELASTIC_HOST/*/_ilm/explain")
echo "Checking index phases:" >> "$LOG_FILE"
printf '%s' "$EXPLAIN" | jq -r '.indices[] | select(.managed) | "\(.index): \(.phase)"' >> "$LOG_FILE"

# Check cluster disk usage
echo "Cluster storage usage:" >> "$LOG_FILE"
curl -s "$ELASTIC_HOST/_cat/allocation?v&h=node,disk.percent" >> "$LOG_FILE"

# Alert if any index is stuck in the ERROR step
ERRORS=$(printf '%s' "$EXPLAIN" | jq -r '.indices[] | select(.step == "ERROR") | .index')
if [ -n "$ERRORS" ]; then
  echo "ERROR: Indices with ILM errors: $ERRORS" >> "$LOG_FILE"
fi

echo "$(date): ILM monitoring check completed" >> "$LOG_FILE"
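The monitoring script only logs raw disk usage. As a possible extension, the sketch below filters node and percentage pairs and prints the nodes at or above a threshold. It assumes input in the form produced by `_cat/allocation?h=node,disk.percent` (node name, then percentage); disk_alerts and the sample node names are hypothetical. The threshold of 85 matches the default low disk watermark, at which Elasticsearch stops allocating new shards to a node.

```shell
# Print nodes whose disk usage is at or above a threshold (in percent).
# Reads "node percent" pairs on stdin; rows without a numeric percentage
# (such as an UNASSIGNED row) are skipped.
disk_alerts() {
  awk -v t="$1" '$2 ~ /^[0-9]+$/ && $2 + 0 >= t { print $1 " " $2 "%" }'
}

# Sample input standing in for real cluster output (hypothetical node names).
printf 'es-hot-1 91\nes-warm-1 62\nes-cold-1 85\nUNASSIGNED\n' | disk_alerts 85
```

With a live cluster the sample line would be replaced by `curl -s "localhost:9200/_cat/allocation?h=node,disk.percent" | disk_alerts 85`.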

Make monitoring script executable and schedule it

Set proper permissions and create a cron job for regular ILM monitoring.

sudo chmod +x /usr/local/bin/monitor-ilm.sh
sudo chown root:root /usr/local/bin/monitor-ilm.sh

Add monitoring to cron

Schedule the monitoring script to run every hour and track ILM policy execution.

sudo crontab -e

Add this line to run monitoring every hour:

0 * * * * /usr/local/bin/monitor-ilm.sh

Advanced ILM configuration options

Configure node attributes for data tiers

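Set up node roles and attributes in elasticsearch.yml (/etc/elasticsearch/elasticsearch.yml on package installs) on each node to control where hot, warm, and cold data is stored. The data_hot, data_warm, and data_cold roles place a node in the matching data tier; the custom node.attr.data attribute is only needed if you also use attribute-based allocation rules.

For hot nodes:

node.attr.data: hot
node.roles: ["data_hot", "data_content"]

For warm nodes:

node.attr.data: warm
node.roles: ["data_warm", "data_content"]

For cold nodes:

node.attr.data: cold
node.roles: ["data_cold", "data_content"]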
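After restarting the nodes, the roles can be checked with `curl -s "localhost:9200/_cat/nodes?h=name,node.role"`, which prints one abbreviated role string per node. The sketch below decodes the tier-related letters of such a string (h for hot, w for warm, c for cold, f for frozen, s for content). tiers_for_roles is a hypothetical helper, and it deliberately ignores the generic d (data) role, which can hold data for any tier.

```shell
# Translate a _cat/nodes role string such as "hs" into data tier names.
tiers_for_roles() {
  tiers=""
  case "$1" in *h*) tiers="$tiers hot" ;; esac
  case "$1" in *w*) tiers="$tiers warm" ;; esac
  case "$1" in *c*) tiers="$tiers cold" ;; esac
  case "$1" in *f*) tiers="$tiers frozen" ;; esac
  case "$1" in *s*) tiers="$tiers content" ;; esac
  # Strip the leading space; print "none" when no tier role was found.
  tiers="${tiers# }"
  echo "${tiers:-none}"
}

tiers_for_roles "hs"
tiers_for_roles "cs"
```

A node configured as in the hot example above should decode to "hot content"; a node that decodes to "none" will not receive indices from any tier-based phase.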

Create policy with searchable snapshots

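Configure ILM to use searchable snapshots for long-term cold storage. The snapshot repository named in the policy (backup-repo here) must already be registered in the cluster, and searchable snapshots require a subscription level that includes the feature.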

curl -X PUT "localhost:9200/_ilm/policy/archive-policy" -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50GB",
            "max_age": "30d"
          }
        }
      },
      "warm": {
        "min_age": "30d",
        "actions": {
          "forcemerge": {
            "max_num_segments": 1
          }
        }
      },
      "cold": {
        "min_age": "90d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "backup-repo"
          }
        }
      },
      "frozen": {
        "min_age": "180d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "backup-repo",
            "force_merge_index": true
          }
        }
      },
      "delete": {
        "min_age": "2555d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}'
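The policy refers to a snapshot repository named backup-repo, which has to exist before an index reaches the cold phase. As a minimal sketch, the functions below build and send a registration request for a shared-filesystem (fs) repository. The location /mnt/es-snapshots is an assumption, it would also need to be listed under path.repo in elasticsearch.yml on every node, and repo_body and register_repo are illustrative names.

```shell
# Build the request body for a shared-filesystem ("fs") snapshot repository.
repo_body() {
  printf '{"type":"fs","settings":{"location":"%s"}}\n' "$1"
}

# Register the repository: register_repo <name> <location> (not called here).
register_repo() {
  curl -s -X PUT "${ES_HOST:-localhost:9200}/_snapshot/$1" \
    -H 'Content-Type: application/json' -d "$(repo_body "$2")"
}

# Show the body that would be sent for the assumed location.
repo_body "/mnt/es-snapshots"
```

Calling `register_repo backup-repo /mnt/es-snapshots` would register the repository under the name the policy expects; object-store repository types take different settings.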

Verify your setup

# Check all ILM policies
curl -X GET "localhost:9200/_ilm/policy?pretty"

# Verify index templates
curl -X GET "localhost:9200/_index_template?pretty"

# Check index lifecycle status
curl -X GET "localhost:9200/*/_ilm/explain?pretty"

# Test with sample data
curl -X POST "localhost:9200/app-logs/_doc" -H 'Content-Type: application/json' -d'
{
  "@timestamp": "2024-01-15T10:30:00Z",
  "level": "INFO",
  "message": "Application started successfully",
  "host": "web01",
  "application": "myapp"
}'

# Verify the document was indexed
curl -X GET "localhost:9200/app-logs/_search?pretty"
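
Note: ILM actions run on the polling interval, so phase transitions are not immediate. For testing, you can move an index manually with a POST request to "localhost:9200/_ilm/move/your-index". The request needs a JSON body that names the index's current_step (as reported by _ilm/explain) and the next_step to move to.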
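The move request fails unless current_step matches what the cluster reports for the index, so the body is usually assembled from the phase, action, and step fields of _ilm/explain. The sketch below builds such a body with only a target phase in next_step, in which case ILM starts at the first step of that phase. move_body is a hypothetical helper and the step values shown are placeholders; copy the real ones from the explain output.

```shell
# Build a move-to-step body:
#   move_body <current_phase> <current_action> <current_step_name> <next_phase>
move_body() {
  printf '{"current_step":{"phase":"%s","action":"%s","name":"%s"},"next_step":{"phase":"%s"}}\n' \
    "$1" "$2" "$3" "$4"
}

# Placeholder values: an index that finished the hot phase, moved to warm.
move_body hot complete complete warm
```

The result can be sent with `curl -X POST "localhost:9200/_ilm/move/app-logs-000001" -H 'Content-Type: application/json' -d "$(move_body hot complete complete warm)"`. Moving steps by hand skips the policy's age checks, so it is best kept to test indices.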

Common issues

Symptom | Cause | Fix
Index stuck in hot phase | Rollover conditions not met | Check rollover settings and index size: curl -X GET "localhost:9200/_cat/indices?v"
ILM policy not applied | Index created before template | Apply policy manually: curl -X PUT "localhost:9200/index/_settings" -H 'Content-Type: application/json' -d '{"index.lifecycle.name":"policy"}'
Phase transition failed | Node allocation issues | Check cluster allocation: curl -X GET "localhost:9200/_cluster/allocation/explain"
Delete phase not working | Delete phase has no delete action, or min_age (counted from rollover) has not elapsed | Add "actions": {"delete": {}} to the delete phase and check the index age with _ilm/explain
High memory usage | Too many active indices | Adjust rollover settings to create fewer, larger indices
