Abstract: Machine Learning (ML) is separated into model training and model inference. Today, most projects use a cloud-native data lake to process historical data and train analytic models. But it is possible to avoid such a data store entirely by using a modern data streaming platform.
This talk compares a cloud-native data streaming architecture to traditional batch and big data alternatives and explains its benefits: a simplified architecture, the ability to reprocess events in the same order when training different models, and the possibility of building a scalable, mission-critical ML architecture for real-time predictions with far fewer headaches and problems.
The talk explains how this can be achieved with Apache Kafka, Tiered Storage and TensorFlow, and also explores when it is better to combine data streaming with a data lake.
Bio: Kai Waehner is Field CTO at Confluent. He works with customers and partners across the globe and with internal teams like engineering and marketing. Kai’s main area of expertise lies within the fields of Data Streaming, Analytics, Hybrid Cloud Architectures and Internet of Things. Kai is a regular speaker at international conferences, writes articles for professional journals, and shares his experiences with industry use cases and new technologies on his blog: www.kai-waehner.de. Contact: firstname.lastname@example.org / @KaiWaehner / linkedin.com/in/kaiwaehner.
Global Field CTO | Author | International Speaker