Simplify and Scale Data Engineering Pipelines with Open Source Delta Lake


A common data engineering pipeline architecture uses tables that correspond to different quality levels, progressively adding structure to the data: data ingestion (“Bronze” tables), transformation/feature engineering (“Silver” tables), and machine learning training or prediction (“Gold” tables). Together, these tables form what we call a “multi-hop” architecture. It allows data engineers to build a pipeline that begins with raw data as a “single source of truth” from which everything flows. In this session, we will show how to build a scalable data engineering pipeline using Delta Lake.
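To make the multi-hop idea concrete, here is a minimal sketch of the Bronze, Silver, and Gold stages in plain Python. It only illustrates how data gains structure at each hop; it does not use Delta Lake or Apache Spark, and the sensor records, field names, and function names (`to_bronze`, `to_silver`, `to_gold`) are hypothetical, chosen for illustration rather than taken from the session.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw events, as they might arrive from an upstream source.
# Some records are deliberately malformed.
RAW_EVENTS = [
    {"device_id": "a1", "temp_c": "21.5"},
    {"device_id": "a1", "temp_c": "bad"},   # unparseable reading
    {"device_id": "b2", "temp_c": "19.0"},
    {"device_id": None, "temp_c": "20.0"},  # missing device id
    {"device_id": "b2", "temp_c": "21.0"},
]


def to_bronze(raw):
    """Bronze: keep every ingested record as-is, tagged with its arrival order."""
    return [{"ingest_seq": i, **record} for i, record in enumerate(raw)]


def to_silver(bronze):
    """Silver: drop malformed rows and cast fields to proper types."""
    silver = []
    for row in bronze:
        if not row["device_id"]:
            continue  # skip rows without a device id
        try:
            temp = float(row["temp_c"])
        except (TypeError, ValueError):
            continue  # skip rows whose reading cannot be parsed
        silver.append({"device_id": row["device_id"], "temp_c": temp})
    return silver


def to_gold(silver):
    """Gold: aggregate cleaned rows into per-device features."""
    readings = defaultdict(list)
    for row in silver:
        readings[row["device_id"]].append(row["temp_c"])
    return {
        device: {"n_readings": len(temps), "mean_temp_c": mean(temps)}
        for device, temps in readings.items()
    }


bronze = to_bronze(RAW_EVENTS)  # raw data: the single source of truth
silver = to_silver(bronze)      # validated, typed records
gold = to_gold(silver)          # aggregates ready for training or prediction
```

Because the Bronze stage keeps every record unchanged, the Silver and Gold stages can be recomputed from it at any time if the cleaning or aggregation logic changes.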

Delta Lake is an open-source storage layer that brings reliability to data lakes. It offers ACID transactions and scalable metadata handling, and it unifies streaming and batch data processing. It runs on top of your existing data lake and is fully compatible with Apache Spark APIs.

In this session you will learn about:
- The data engineering pipeline architecture
- Data engineering pipeline scenarios
- Data engineering pipeline best practices
- How Delta Lake enhances data engineering pipelines
- The ease of adopting Delta Lake for building your data engineering pipelines


Joshua Cook has been teaching in one capacity or another for nearly fifteen years. He currently works as a Data Architect for Databricks. Most recently, he taught Data Science for UCLA Extension. Prior to this, he taught Data Science for General Assembly, in the Master of Education program at UCLA, high school mathematics at Crenshaw and Jefferson High Schools in Los Angeles, and early childhood literacy in West Oakland. Additionally, Joshua is trained as a computational mathematician. He has production experience with model prediction and deployment using the Python numerical stack and Amazon Web Services. He is the author of the book Docker for Data Science, published by Apress Media.

Open Data Science
One Broadway
Cambridge, MA 02142
