Abstract: An instant world requires instant decisions at scale. This includes the ability to digest and react to changes in real-time. Thus, event logs such as Apache Kafka can be found in almost every architecture, while databases and similar systems still provide the foundation. Change Data Capture (CDC) has become popular for propagating changes. Nevertheless, integrating all these systems, which often have slightly different semantics, can be a challenge.
In this talk, we highlight what it means for Apache Flink to be a general data processor that acts as a data integration hub. Looking under the hood, we demonstrate Flink's SQL engine as a changelog processor that ships with an ecosystem tailored to processing CDC data and maintaining materialized views. We will discuss the semantics of different data sources and how to perform joins or stream enrichment between them. This talk illustrates how Flink can be used with systems such as Kafka (for upsert logging), Debezium, JDBC, and others.
Programming knowledge would help, but SQL knowledge might be sufficient to get the concepts.
Bio: Timo Walther is a Principal Software Engineer at Confluent and a long-time member of Apache Flink’s management committee. He studied Computer Science at TU Berlin and was part of the Database Group there - the origins of Apache Flink. He worked as a software engineer at DataArtisans and led SQL team at Ververica. He was a Co-Founder of Immerok which was acquired by Confluent in 2023. In Flink, he is working on various topics in the Table & SQL ecosystem to make stream processing accessible for everyone.