Tutorial: Create a data stream with a lifecycle
Stack Serverless
To create a data stream with a built-in lifecycle, follow these steps:
A data stream requires a matching index template. You can configure the data stream lifecycle by setting the lifecycle field in the index template, the same way you do for mappings and index settings. You can define an index template that sets a lifecycle as follows:
- Include the data_stream object to enable data streams.
- Define the lifecycle in the template section or include a composable template that defines the lifecycle.
- Use a priority higher than 200 to avoid collisions with built-in templates. See Avoid index pattern collisions.
You can use the create index template API.
PUT _index_template/my-index-template
{
"index_patterns": ["my-data-stream-test"],
"data_stream": { },
"priority": 500,
"template": {
"lifecycle": {
"data_retention": "7d"
}
},
"_meta": {
"description": "Template with data stream lifecycle"
}
}
- In this case the index template will be applied to a data stream named my-data-stream-test. You can optionally use a wildcard (*) in the index pattern to match all data streams created (either manually or using an indexing request) that have a name matching the specified pattern.
You can create a data stream in two ways:
By manually creating the stream using the create data stream API. The stream's name must still match one of your template's index patterns.
PUT _data_stream/my-data-stream-test
By indexing requests that target the stream's name. This name must match one of your index template's index patterns.
PUT my-data-stream-test/_bulk
{ "create":{ } }
{ "@timestamp": "2099-05-06T16:21:15.000Z", "message": "192.0.2.42 - - [06/May/2099:16:21:15 +0000] \"GET /images/bg.jpg HTTP/1.0\" 200 24736" }
{ "create":{ } }
{ "@timestamp": "2099-05-06T16:25:42.000Z", "message": "192.0.2.255 - - [06/May/2099:16:25:42 +0000] \"GET /favicon.ico HTTP/1.0\" 200 3638" }
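The bulk body is newline-delimited JSON: each action line and each document sits on its own line, and the body ends with a newline. A minimal Python sketch of producing that body from a list of documents follows; the helper name to_bulk_ndjson is hypothetical, and the sketch only builds the payload without sending it.

```python
import json


def to_bulk_ndjson(docs):
    """Serialize documents as a bulk body of create actions.

    Hypothetical helper: emits one action line plus one source line
    per document, each terminated by a newline.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"create": {}}))  # action line, as in the request above
        lines.append(json.dumps(doc))             # document source line
    return "\n".join(lines) + "\n"                # trailing newline ends the body


docs = [
    {
        "@timestamp": "2099-05-06T16:21:15.000Z",
        "message": '192.0.2.42 - - [06/May/2099:16:21:15 +0000] "GET /images/bg.jpg HTTP/1.0" 200 24736',
    },
    {
        "@timestamp": "2099-05-06T16:25:42.000Z",
        "message": '192.0.2.255 - - [06/May/2099:16:25:42 +0000] "GET /favicon.ico HTTP/1.0" 200 3638',
    },
]

payload = to_bulk_ndjson(docs)
print(payload, end="")
```

Letting json.dumps handle the escaping of the quotes inside each message avoids hand-writing the backslashes shown in the raw request.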
You can use the get data stream lifecycle API to see the data stream lifecycle of your data stream and the explain data stream lifecycle API to see the exact state of each backing index.
GET _data_stream/my-data-stream-test/_lifecycle
The result will look like this:
{
"data_streams": [
{
"name": "my-data-stream-test",
"lifecycle": {
"enabled": true,
"data_retention": "7d",
"effective_retention": "7d",
"retention_determined_by": "data_stream_configuration"
}
}
],
"global_retention": {}
}
- The name of your data stream.
- Shows if the data stream lifecycle is enabled for this data stream.
- The retention period of the data indexed in this data stream, as configured by the user.
- The retention period that will be applied by the data stream lifecycle. This means that the data in this data stream will be kept for at least 7 days. After that, Elasticsearch can delete it at its own discretion.
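When checking several data streams at once, it can help to reduce this response to the fields described above. The sketch below, in Python, works on the sample response from this page as a plain dictionary; the function name summarize_lifecycles is hypothetical, and treating a missing enabled field as false is an assumption of this sketch.

```python
# Sample response copied from the tutorial above.
response = {
    "data_streams": [
        {
            "name": "my-data-stream-test",
            "lifecycle": {
                "enabled": True,
                "data_retention": "7d",
                "effective_retention": "7d",
                "retention_determined_by": "data_stream_configuration",
            },
        }
    ],
    "global_retention": {},
}


def summarize_lifecycles(resp):
    """Return (name, enabled, effective_retention) for each data stream.

    Hypothetical helper; missing fields fall back to False / None,
    which is an assumption of this sketch.
    """
    rows = []
    for stream in resp.get("data_streams", []):
        lifecycle = stream.get("lifecycle", {})
        rows.append(
            (
                stream["name"],
                lifecycle.get("enabled", False),
                lifecycle.get("effective_retention"),
            )
        )
    return rows


for name, enabled, retention in summarize_lifecycles(response):
    print(f"{name}: enabled={enabled}, effective_retention={retention}")
```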
If you want to see more information about how the data stream lifecycle is applied to individual backing indices, use the explain data stream lifecycle API:
GET .ds-my-data-stream-test-*/_lifecycle/explain
You can use a wildcard (*) in the data stream name to retrieve the lifecycle status for all data streams matching the pattern.
The result will look like this:
{
"indices": {
".ds-my-data-stream-test-2023.04.19-000001": {
"index": ".ds-my-data-stream-test-2023.04.19-000001",
"managed_by_lifecycle": true,
"index_creation_date_millis": 1681918009501,
"time_since_index_creation": "1.6m",
"lifecycle": {
"enabled": true,
"data_retention": "7d"
}
}
}
}
- The name of the backing index.
- Whether the index is managed by the built-in data stream lifecycle.
- Time since the index was created.
- The lifecycle configuration that is applied on this backing index.
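The explain response can be post-processed in the same way. The following Python sketch uses the sample response from this page to list backing indices that are not managed by the lifecycle and to convert index_creation_date_millis into a UTC timestamp; the helper names unmanaged_indices and creation_time are hypothetical.

```python
from datetime import datetime, timezone

# Sample response copied from the tutorial above.
explain = {
    "indices": {
        ".ds-my-data-stream-test-2023.04.19-000001": {
            "index": ".ds-my-data-stream-test-2023.04.19-000001",
            "managed_by_lifecycle": True,
            "index_creation_date_millis": 1681918009501,
            "time_since_index_creation": "1.6m",
            "lifecycle": {"enabled": True, "data_retention": "7d"},
        }
    }
}


def unmanaged_indices(resp):
    """Names of backing indices not managed by the data stream lifecycle."""
    return [
        name
        for name, info in resp.get("indices", {}).items()
        if not info.get("managed_by_lifecycle", False)
    ]


def creation_time(info):
    """Convert the epoch-milliseconds creation date to an aware UTC datetime."""
    return datetime.fromtimestamp(info["index_creation_date_millis"] / 1000, tz=timezone.utc)


print("unmanaged:", unmanaged_indices(explain))
for name, info in explain["indices"].items():
    print(name, "created on", creation_time(info).date().isoformat())
```

For the sample above, the creation date falls on 2023-04-19, which matches the date embedded in the backing index name.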