Develop and Deploy a Machine Learning Pipeline in 45 Minutes with Ploomber

Abstract: 

Description

Development tools such as Jupyter are prevalent among data scientists because they provide an environment to explore data visually and interactively. However, when deploying a project, we must ensure the analysis can run reliably in a production environment like Airflow or Argo. As a result, data scientists end up moving code back and forth between their notebooks and these production tools. They also have to learn an unfamiliar framework and write pipeline code, which severely delays deployment.

Ploomber solves this problem by providing:

1. A workflow orchestrator that automatically infers task execution order using static analysis.
2. A sensible layout to bootstrap projects.
3. A development environment integrated with Jupyter.
4. Capabilities to export to production systems (Airflow and Argo) without code changes.
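
To make the first point concrete, here is a minimal sketch of how task execution order can be inferred with static analysis. It is an illustration of the general idea, not Ploomber's implementation: it assumes each task declares its dependencies in a module-level `upstream` variable, and the task names and sources (`load`, `clean`, `features`, `train`) are hypothetical. The sources are parsed with Python's `ast` module and never executed, and the order comes from the standard-library `graphlib` module (Python 3.9+).

```python
import ast
from graphlib import TopologicalSorter

# Hypothetical task sources, keyed by task name. Each one is assumed to
# declare its dependencies in a module-level `upstream` variable.
TASKS = {
    "train": "upstream = ['features', 'clean']\nprint('training')",
    "features": "upstream = ['clean']\nprint('building features')",
    "clean": "upstream = ['load']\nprint('cleaning')",
    "load": "upstream = None\nprint('loading')",
}


def extract_upstream(source):
    """Return the task names assigned to `upstream`, without running the code."""
    tree = ast.parse(source)
    for node in tree.body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == "upstream":
                # literal_eval only accepts literals (lists, strings, None),
                # so no task code is ever executed here.
                value = ast.literal_eval(node.value)
                return list(value or [])
    return []


def execution_order(tasks):
    """Build the dependency graph and return one valid execution order."""
    graph = {name: extract_upstream(src) for name, src in tasks.items()}
    # TopologicalSorter expects a mapping of node -> predecessors.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(TASKS)
print(order)
```

Because the dependencies are read from the source text, the tasks can be listed in any order and no separate pipeline definition has to be kept in sync with the code.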

This talk develops and deploys a Machine Learning pipeline in 30 minutes to demonstrate how Ploomber streamlines the Machine Learning development and deployment process.

Who and why

This talk is for data scientists (with experience developing Machine Learning projects) looking to enhance their workflow. Experience with production tools such as Airflow or Argo is not necessary.

The talk has two objectives:

1. Advocate for more development-friendly tools that let data scientists focus on analyzing data by removing the overhead of popular production tools.
2. Demonstrate an example workflow using Ploomber where a pipeline is developed interactively (using Jupyter) and deployed without code changes.


GitHub: https://github.com/ploomber/ploomber

Bio: 

Eduardo is interested in developing tools to deliver reliable Machine Learning products. Towards that end, he created Ploomber, an open-source Python library to compose production-ready data workflows. Eduardo holds an M.S. in Data Science from Columbia University, where he took part in Computational Neuroscience research. Eduardo started his Data Science career in 2015 at the Center for Data Science and Public Policy at The University of Chicago.

Open Data Science

 

 

 

