Data Engineering in the Era of Gen AI


Gen AI is reshaping data engineering, and this talk examines the role data engineering plays in putting artificial intelligence to work. The session explores the interplay between data engineering and the emerging generation of AI technologies, highlighting key strategies for adapting to them. It begins with the challenges and opportunities posed by Gen AI, where advanced machine learning algorithms and neural networks demand a sophisticated, scalable data infrastructure. The speaker emphasizes building resilient pipelines that can integrate diverse and massive datasets, providing a robust foundation for training and deploying AI models. The talk also covers data quality and governance in the context of Gen AI, and the careful data engineering practices needed to mitigate bias and support ethical AI development. It then turns to current technologies and best practices, such as real-time data processing and federated learning, that help data engineers keep pace with innovation. Overall, the talk serves as a guide for data engineers navigating the complexities of Gen AI, offering insights, strategies, and real-world examples.

Session Outline:

- Comprehensive Understanding of Gen AI Landscape: Gain a deep comprehension of the Gen AI ecosystem, exploring the various components, challenges, and opportunities associated with advanced artificial intelligence.

- Strategies for Scalable Data Engineering: Learn strategies for building scalable data engineering pipelines that can efficiently handle the massive and diverse datasets required for training and deploying Gen AI models.

- Data Quality and Ethical Considerations: Understand the critical role of data quality and governance in Gen AI, addressing biases and ensuring ethical AI development through meticulous data engineering practices.

- Integration of Cutting-edge Technologies: Explore the integration of cutting-edge technologies, such as real-time data processing and federated learning, to stay at the forefront of innovation in the rapidly evolving landscape of Gen AI.

- Practical Application and Case Studies: Apply theoretical knowledge through practical exercises and case studies, gaining hands-on experience with tools and techniques relevant to data engineering in the era of Gen AI.

Background Knowledge:

Some data engineering and AI exposure


Anindita is a Lead Solutions Architect at Databricks in the Financial Services vertical, where she focuses on helping organizations make the most of their data investments.

She has co-authored two patents and has over 20 years of industry experience in software development, consulting, and client-facing roles. Notably, she has authored the book "Simplifying Data Engineering and Analytics with Delta," a guide to crafting analytics-ready data that powers artificial intelligence.

She holds a Master's in Computer Science from Boston University and a Master's in Liberal Arts and Management from Harvard Extension School.

Anindita's commitment to knowledge dissemination is evident through her current role as a graduate course instructor on Data Engineering at Harvard Extension School.

Open Data Science
One Broadway
Cambridge, MA 02142