Elasticsearch Index Lifecycle Management for Fluentd

External tools, such as Curator, used to be a necessity for managing Elasticsearch indexes. This changed with the introduction of Index Lifecycle Management (ILM) in Elasticsearch 6.6, which has all but eliminated the need for other tools. While it was developed primarily with Logstash in mind, you can also take advantage of it when using Fluentd. It works with both data streams and regular indexes. Because most people are probably familiar with the latter, this post will explain how to set up ILM for your Fluentd indexes.

ILM Phases

First of all, I need to explain how ILM works. It consists of four different phases:

  • hot - index you actively query and write to; has the highest priority
  • warm - the index is read-only and shrunk to 1 shard; has a lower priority
  • cold - read-only and, for the most part, not loaded to memory, making it slow to query; has even lower priority
  • delete - it’s gone, Jim

You can use any or all of these phases. On top of that, ILM can perform automatic rollovers, but since Fluentd creates a new index every day by default, you can leave that feature off.

Create an ILM policy

You can create a policy either with a direct call to Elasticsearch or through Kibana. A simple policy that keeps the newest index hot, moves it to warm after a day and to cold after a week, and then deletes it after 30 days can be created using the following request:

PUT /_ilm/policy/hot-warm-cold-delete-30d
{
  "policy" : {
    "phases" : {
      "warm" : {
        "min_age" : "1d",
        "actions" : {
          "forcemerge" : {
            "max_num_segments" : 1
          }
        }
      },
      "cold" : {
        "min_age" : "7d",
        "actions" : {
          "freeze" : { }
        }
      },
      "hot" : {
        "min_age" : "0ms",
        "actions" : { }
      },
      "delete" : {
        "min_age" : "30d",
        "actions" : {
          "delete" : {
            "delete_searchable_snapshot" : true
          }
        }
      }
    }
  }
}

If you prefer to do it within Kibana, go to Index Management -> Index Lifecycle Policies.
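
Whichever way you created it, you can read the policy back to confirm that Elasticsearch stored it as intended. A minimal check, assuming the policy name used above:

GET /_ilm/policy/hot-warm-cold-delete-30d

The response should echo the phases and their min_age values, along with a version number and modification date.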

Apply policy to indexes

Now that you have created a policy, you need to apply it to your Fluentd indexes. That’s what you use an index template for. Assuming your indexes follow the fluentd-* pattern, you can create a template that attaches the policy to them using the following API call:

PUT /_index_template/hot-warm-cold-delete-30d
{
  "priority": 100,
  "index_patterns" : [
    "fluentd*"
  ],
  "template" : {
    "settings" : {
      "index" : {
        "lifecycle" : {
          "name" : "hot-warm-cold-delete-30d"
        }
      }
    }
  }
}

Alternatively, you can create it in Kibana at Index Management -> Index Templates.
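
If you want to check which settings a new index would receive before Fluentd creates one, recent Elasticsearch versions offer a simulate API for index templates. A sketch, where fluentd-2021.01.01 is only a hypothetical index name chosen to match the pattern:

POST /_index_template/_simulate_index/fluentd-2021.01.01

If the template matches, the resolved settings in the response should include index.lifecycle.name set to hot-warm-cold-delete-30d.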

Enable ILM in Fluentd

The only thing remaining now is to enable ILM in Fluentd. It is fairly straightforward: you only need to add enable_ilm true to your elasticsearch store configuration.
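
Once Fluentd has written a new index, one way to verify that the policy is actually attached is the ILM explain API. A minimal sketch, assuming your index names match fluentd-*:

GET /fluentd-*/_ilm/explain

For each index, the response reports whether it is managed by ILM and, if so, the policy name and the phase it is currently in.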

Conclusion

Index Lifecycle Management was a much-needed feature that greatly simplified index management. It is very easy to implement and will save you both time and cluster resources.