Merge pull request #3381 from dlvenable/data-prepper-2.10-release-blog
Data Prepper 2.10.0 release blog
kolchfa-aws authored Oct 16, 2024
2 parents 5e69aa7 + d201fd2 commit 25a06f6
Showing 1 changed file with 146 additions and 0 deletions.
146 changes: 146 additions & 0 deletions _posts/2024-10-15-Announcing-Data-Prepper-2.10.0.md
@@ -0,0 +1,146 @@
---
layout: post
title: Announcing Data Prepper 2.10.0
authors:
- sb2k16
- dvenable
date: 2024-10-15 12:30:00 -0600
categories:
- releases
excerpt: Data Prepper 2.10.0 offers an OpenSearch _bulk API and reads from Amazon Kinesis.
meta_keywords: Data Prepper, OpenSearch bulk API, Kinesis data streams, Kafka, SASL/SCRAM authentication, streaming data ingestion
meta_description: Data Prepper 2.10.0 adds OpenSearch API and Kinesis Data Streams sources for seamless ingestion, plus Kafka SASL/SCRAM support and OpenTelemetry log parsing.
---

## Introduction

Data Prepper 2.10 is now available!
The two major features are a source that accepts data through an API mimicking the OpenSearch `_bulk` API and the ability to read from Amazon Kinesis Data Streams.


## OpenSearch API source

Many existing OpenSearch clients that perform ingestion directly to OpenSearch can now send that data to Data Prepper first.
This means that you can use Data Prepper's buffering and rich processor set before sending data to OpenSearch without having to change clients that are using the OpenSearch `_bulk` API.
A new Data Prepper source named `opensearch_api` has been added that accepts [OpenSearch Document API bulk operation](https://opensearch.org/docs/latest/api-reference/document-apis/bulk/) requests from clients using REST and ingests data into OpenSearch.
The behavior of this source is also quite similar to the existing `http` source.
It supports industry-standard encryption in the form of TLS/HTTPS and HTTP basic authentication.
It also parses incoming requests and creates Data Prepper events and associated event metadata, making it compatible with the `opensearch` sink.
The request body is compatible with the OpenSearch Document API bulk operation and supports all actions: index, create, delete, and update.

The following two API endpoints are now supported, both using the HTTP `POST` method:

```
POST _bulk
POST <index>/_bulk
```

The second API specifies the index in the path, so you don't need to include it in the request body.

Additionally, the following OpenSearch Document API bulk operation query parameters are supported:

* `pipeline`
* `routing`
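
As a rough illustration of how a client might target these two endpoints, the following Python sketch builds the request path and the newline-delimited JSON body. The helper name `build_bulk_request` and its arguments are hypothetical and not part of Data Prepper; the host, port, and any base path depend on how your `opensearch_api` source is configured, so only the relative path is produced here.

```python
import json
from urllib.parse import urlencode


def build_bulk_request(operations, index=None, pipeline=None, routing=None):
    """Return (path, body) for a _bulk-style request.

    operations: list of (action, metadata, document) tuples. Use None as the
    document for actions such as delete that carry no source line.
    """
    # POST <index>/_bulk puts the index in the path; POST _bulk does not.
    path = f"/{index}/_bulk" if index else "/_bulk"

    # Only the two query parameters listed above are passed through.
    params = {k: v for k, v in (("pipeline", pipeline), ("routing", routing)) if v}
    if params:
        path += "?" + urlencode(params)

    lines = []
    for action, metadata, document in operations:
        lines.append(json.dumps({action: metadata}))
        if document is not None:
            lines.append(json.dumps(document))

    # Bulk bodies are newline-delimited JSON and end with a newline.
    body = "\n".join(lines) + "\n"
    return path, body


# The index comes from the path, so the action line only needs the ID.
path, body = build_bulk_request(
    [("index", {"_id": "tt1979320"}, {"title": "Rush", "year": 2013})],
    index="movies",
    routing="r1",
)
```

You could send the resulting `path` and `body` with any HTTP client by prefixing the address of your Data Prepper instance.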

The following example demonstrates how to use the source:

```
version: "2"
opensearch-api-pipeline:
  source:
    opensearch_api:
  sink:
    - opensearch:
        hosts: ["https://opensearch:9200"]
        insecure: true
        username: "admin"
        password: "admin"
        index: "${getMetadata(\"opensearch_index\")}"
        action: "${getMetadata(\"opensearch_action\")}"
        document_id: "${getMetadata(\"opensearch_id\")}"
        routing: "${getMetadata(\"opensearch_routing\")}"
        pipeline: "${getMetadata(\"opensearch_pipeline\")}"
```

Consider the following example request:

```
POST _bulk
{ "index": { "_index": "movies", "_id": "tt1979320" } }
{ "title": "Rush", "year": 2013 }
```

This request will be ingested into OpenSearch, and a new document will be created under the index `movies` with the document ID `tt1979320` and the document source `{ "title": "Rush", "year": 2013 }`.
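
To make the mapping from request to event more concrete, here is a minimal Python sketch of how a bulk body could be split into documents plus metadata. It is not Data Prepper's actual implementation: the function `parse_bulk` is hypothetical, and the only names taken from this post are the metadata keys that the pipeline above reads with `getMetadata`.

```python
import json

# Actions that are followed by a document (or partial-document) line.
ACTIONS_WITH_BODY = {"index", "create", "update"}


def parse_bulk(body, path_index=None):
    """Split a _bulk body into (metadata, document) pairs."""
    lines = [line for line in body.splitlines() if line.strip()]
    events = []
    i = 0
    while i < len(lines):
        action_line = json.loads(lines[i])
        # Each action line has exactly one key: the action name.
        action, meta = next(iter(action_line.items()))
        i += 1

        document = None
        if action in ACTIONS_WITH_BODY:
            document = json.loads(lines[i])
            i += 1

        metadata = {
            "opensearch_action": action,
            # Fall back to the index given in the request path, if any.
            "opensearch_index": meta.get("_index", path_index),
            "opensearch_id": meta.get("_id"),
        }
        events.append((metadata, document))
    return events


body = (
    '{ "index": { "_index": "movies", "_id": "tt1979320" } }\n'
    '{ "title": "Rush", "year": 2013 }\n'
)
events = parse_bulk(body)
```

In this sketch, a `delete` action yields a metadata entry with no document, because a delete operation has no source line in the bulk format.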

The Data Prepper maintainers are interested in further expanding this source to support other indexing APIs, allowing it to stand in for an OpenSearch cluster in ingestion workloads.
To learn more or provide feedback, see [Provide an OpenSearch API source #4180](https://github.com/opensearch-project/data-prepper/issues/4180).


## Kinesis source

[Amazon Kinesis Data Streams](https://docs.aws.amazon.com/streams/latest/dev/introduction.html) is a high-speed streaming data service.
Data Prepper has also introduced a new source named `kinesis` that can be used to ingest stream record data from multiple Kinesis data streams into OpenSearch clusters.
You can configure it to read stream records from either the oldest untrimmed record or from the most recent record.
Moreover, if you enable end-to-end acknowledgments, Kinesis data streams will be checkpointed to prevent duplicate processing of records.

The following is an example pipeline:

```
version: "2"
kinesis-pipeline:
  source:
    kinesis:
      codec:
        newline:
      streams:
        - stream_name: "MyStream1"
          initial_position: LATEST
          checkpoint_interval: "PT5M"
        - stream_name: "MyStream2"
          # Enable this if ingestion should start from the start of the stream.
          initial_position: EARLIEST
      consumer_strategy: "polling"
      polling:
        max_polling_records: 100
        idle_time_between_reads: "250ms"
```
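
The interaction between `initial_position` and checkpointing can be easier to see in a toy model. The following Python sketch uses an in-memory list in place of a stream. The classes `ToyStream` and `ToyReader` are hypothetical and do not call Kinesis or Data Prepper. The sketch assumes that a saved checkpoint takes precedence over the initial position, which is how checkpoint-based consumers commonly behave.

```python
class ToyStream:
    """In-memory stand-in for a single stream; not a Kinesis client."""

    def __init__(self, records):
        self.records = list(records)

    def put(self, record):
        self.records.append(record)


class ToyReader:
    def __init__(self, stream, initial_position, checkpoint=None):
        self.stream = stream
        if checkpoint is not None:
            # Assumption: a saved checkpoint wins over initial_position.
            self.next_index = checkpoint
        elif initial_position == "EARLIEST":
            # Start from the oldest record still in the stream.
            self.next_index = 0
        elif initial_position == "LATEST":
            # Skip everything that was already in the stream.
            self.next_index = len(stream.records)
        else:
            raise ValueError(f"unknown initial_position: {initial_position}")

    def poll(self, max_records):
        end = self.next_index + max_records
        batch = self.stream.records[self.next_index:end]
        self.next_index += len(batch)
        return batch

    def checkpoint(self):
        # In a real pipeline this would be saved only after the batch
        # has been acknowledged by the sink.
        return self.next_index


stream = ToyStream(["a", "b", "c"])
earliest = ToyReader(stream, "EARLIEST")
latest = ToyReader(stream, "LATEST")
stream.put("d")  # arrives after both readers were created

first_batch = earliest.poll(max_records=2)
saved = earliest.checkpoint()

# A restarted reader resumes from the checkpoint instead of re-reading.
resumed = ToyReader(stream, "EARLIEST", checkpoint=saved)
```

Here the `EARLIEST` reader sees the records that existed before it started, the `LATEST` reader sees only the record added afterward, and the resumed reader continues from the saved position rather than reprocessing the first batch.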



## Other features and improvements

Data Prepper 2.10 has introduced a number of other improvements:

* The `kafka` source now supports authentication with an Apache Kafka cluster using SASL/SCRAM in addition to the SASL/PLAIN authentication provided in previous versions.
* Data Prepper can now parse OpenTelemetry logs from sources such as Amazon Simple Storage Service (Amazon S3). The new `otel_logs` codec parses data from OpenTelemetry Protocol (OTLP) JSON-formatted files. You can now write OpenTelemetry logs to Amazon S3 using the [AWS S3 Exporter for OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/awss3exporter/README.md) and read them using Data Prepper.
* Additionally, the maintainers have worked to improve performance through the addition of an internal cache for event keys. Data Prepper administrators can configure this cache as necessary.
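
For readers unfamiliar with the OTLP JSON log format mentioned in the list above, the following Python sketch walks its nesting (`resourceLogs`, then `scopeLogs`, then `logRecords`) and yields one flat dictionary per log record. It is only an illustration of the file structure, not the `otel_logs` codec itself. The function names and the output field names are hypothetical, and the sample values are made up.

```python
import json


def _attributes(holder):
    # OTLP encodes attributes as a list of
    # {"key": ..., "value": {"<type>Value": ...}} objects.
    result = {}
    for attr in holder.get("attributes", []):
        value = attr.get("value", {})
        # Take whichever typed value is present (stringValue, intValue, ...).
        result[attr["key"]] = next(iter(value.values()), None)
    return result


def iter_otlp_log_records(otlp_json):
    """Yield one flat dict per log record in an OTLP JSON logs payload."""
    payload = json.loads(otlp_json)
    for resource_logs in payload.get("resourceLogs", []):
        resource_attrs = _attributes(resource_logs.get("resource", {}))
        for scope_logs in resource_logs.get("scopeLogs", []):
            for record in scope_logs.get("logRecords", []):
                yield {
                    "time_unix_nano": record.get("timeUnixNano"),
                    "severity": record.get("severityText"),
                    "body": record.get("body", {}).get("stringValue"),
                    "resource": resource_attrs,
                }


sample = json.dumps({
    "resourceLogs": [{
        "resource": {
            "attributes": [
                {"key": "service.name", "value": {"stringValue": "checkout"}}
            ]
        },
        "scopeLogs": [{
            "logRecords": [{
                "timeUnixNano": "1728993000000000000",
                "severityText": "INFO",
                "body": {"stringValue": "order placed"},
            }]
        }],
    }]
})
records = list(iter_otlp_log_records(sample))
```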


## Next steps

* To download Data Prepper, visit the [OpenSearch downloads](https://opensearch.org/downloads.html) page.
* For instructions on how to get started with Data Prepper, see [Getting started with Data Prepper](https://opensearch.org/docs/latest/data-prepper/getting-started/).
* To learn more about the work in progress for Data Prepper 2.11 and other releases, see the [Data Prepper Project Roadmap](https://github.com/orgs/opensearch-project/projects/221).

## Thanks to our contributors!

The following community members contributed to this release. Thank you!

* [chenqi0805](https://github.com/chenqi0805) -- Qi Chen
* [danhli](https://github.com/danhli) -- Daniel Li
* [dependabot[bot]](https://github.com/apps/dependabot)
* [dinujoh](https://github.com/dinujoh) -- Dinu John
* [dlvenable](https://github.com/dlvenable) -- David Venable
* [franky-m](https://github.com/franky-m)
* [graytaylor0](https://github.com/graytaylor0) -- Taylor Gray
* [jayeshjeh](https://github.com/jayeshjeh) -- Jayesh Parmar
* [KarstenSchnitter](https://github.com/KarstenSchnitter) -- Karsten Schnitter
* [kkondaka](https://github.com/kkondaka) -- Krishna Kondaka
* [LeeroyHannigan](https://github.com/LeeroyHannigan) -- Lee
* [linghengqian](https://github.com/linghengqian) -- Ling Hengqian
* [oeyh](https://github.com/oeyh) -- Hai Yan
* [quanghungb](https://github.com/quanghungb) -- qhung
* [san81](https://github.com/san81) -- Santhosh Gandhe
* [sb2k16](https://github.com/sb2k16) -- Souvik Bose
* [shenkw1](https://github.com/shenkw1) -- Katherine Shen
* [srikanthjg](https://github.com/srikanthjg) -- Srikanth Govindarajan
