Analyze Data, Build a UI and Deploy on the Cloud with Apache Spark, Notebooks and PixieDust


Working with Apache Spark and Jupyter Notebooks, but want to be more efficient? You need a dash of PixieDust, a fast-growing open source library. PixieDust speeds up data manipulation and display with features like auto-visualization of Spark DataFrames, real-time Spark Job progress monitoring directly from the Notebook, seamless integration with cloud services, and automated local installation of Python and Scala kernels running with Spark. It also adds the ability to build embedded apps with minimal coding, making data more shareable and consumable.

In this 90-minute workshop, IBM Distinguished Engineer David Taieb will walk through how to use PixieDust with Spark and Notebooks to analyze open data around traffic accidents in the UK, and then build charts and maps to discover insights. David will then show how to build a dashboard that drills down into specific areas and how to combine multiple data sources like crime or speeding zones to further refine your analysis. We’ll also introduce new capabilities that let you deploy your Notebook and PixieApp - with one click - as a full-fledged web application running on the cloud.

This session is open to attendees at any skill level with Spark, Notebooks and data science. Please bring your own network-enabled computer, as you will need to work in a web browser for the hands-on portion of the workshop. Before attending the session, you should also install PixieDust, which runs on IBM Data Science Experience or any Jupyter Notebook deployment of your choice.


David Taieb is a Distinguished Engineer for the Watson Data Platform Developer Advocacy team at IBM, leading a team of avid technologists with the mission of educating developers on the art of the possible with cloud technologies. He’s passionate about innovation and building open source tools, like the PixieDust Python library for Jupyter Notebooks and Apache Spark, that help improve developer productivity and overall experience. David enjoys sharing his experience by speaking at conferences and meeting as many people as possible.
