The English SDK for Apache Spark™


In this session, we explore how large language models (LLMs) can streamline the development of Apache Spark™ applications.

We'll demonstrate how LLMs can simplify SQL query creation, data ingestion, and DataFrame transformations, leading to faster development and more precise code that's easier to review and understand. We'll also show how LLMs can assist in creating visualizations and clarifying data insights, making complex data easy to understand.

Furthermore, we'll discuss how LLMs can be used to create user-defined data sources and functions, making Apache Spark applications more adaptable.

The session is built around practical examples that show where LLMs fit into Apache Spark development. We invite you to join us in exploring how these models can boost efficiency in data science and AI.

Attendees will learn how to simplify code generation for open-source Apache Spark using both open-source and proprietary LLMs.


Xinrong Meng is an Apache Spark PMC (Project Management Committee) Member and Committer, with deep technical expertise in PySpark. She is one of the main contributors to the Pandas API on Spark and the English SDK. She works as a software engineer at Databricks.

Open Data Science




