Hands-on lab around open-source streaming technologies such as Flink, Debezium, and langchain4j

decodableco/oss-streaming-lab

👩‍🔬 Hands-on Stream Processing 🧪 Labs 🥽

👋 Welcome! Great to have you in our hands-on lab. 🤩

In this lab, you’ll build a real-time data pipeline from Postgres to OpenSearch, enabling use cases such as full-text search, analytics, and dashboarding on data that is stored and maintained in an operational database. You’ll work on top of Apache Flink and primarily use Flink SQL, which lets you express your data processing needs declaratively.

You’ll learn how to set up end-to-end streaming data pipelines using existing Flink connectors for the source (Postgres) and sink (OpenSearch) systems used in this lab. You’ll apply Flink SQL to filter, join, group, and transform your data in flight. More specific processing needs, for instance interacting with AI-related building blocks such as (large) language models, transformers, or embedding models, can be met in SQL by extending Flink's built-in capabilities with custom user-defined functions (UDFs).

Under the hood, Debezium will be used for extracting change events from the source database via change data capture (CDC).
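To give a rough idea of what such a pipeline can look like, here is a minimal Flink SQL sketch. It assumes the Debezium-based `postgres-cdc` connector from Flink CDC and the Flink `opensearch` connector are available on the classpath. All table names, columns, hosts, and credentials are hypothetical placeholders, not the ones used in this lab, and the exact connector identifier and options may differ depending on the connector versions you run.

```sql
-- Source: a Postgres table read via change data capture.
-- Hypothetical table and credentials; adjust to your environment.
CREATE TABLE customers_src (
  id INT,
  first_name STRING,
  last_name STRING,
  email STRING,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'postgres-cdc',
  'hostname' = 'localhost',
  'port' = '5432',
  'username' = 'postgres',
  'password' = 'postgres',
  'database-name' = 'postgres',
  'schema-name' = 'public',
  'table-name' = 'customers',
  'slot.name' = 'customers_slot',
  'decoding.plugin.name' = 'pgoutput'
);

-- Sink: an OpenSearch index. Because a primary key is declared,
-- rows are upserted by id rather than appended.
CREATE TABLE customers_idx (
  id INT,
  full_name STRING,
  email STRING,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'opensearch',
  'hosts' = 'http://localhost:9200',
  'index' = 'customers'
);

-- Continuous query: filter and transform rows in flight.
INSERT INTO customers_idx
SELECT
  id,
  CONCAT(first_name, ' ', last_name) AS full_name,
  email
FROM customers_src
WHERE email IS NOT NULL;
```

Once submitted, the `INSERT INTO` statement runs as a long-lived Flink job: inserts, updates, and deletes in the Postgres table are propagated to the OpenSearch index as they happen.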

Ready to go? Then let’s get started by checking the local infrastructure setup for this lab!

License

This repository and its resources are licensed under Creative Commons BY-NC-ND 4.0.
