
Abstract: Artificial Intelligence is already helping many businesses become more competitive, but how do you move machine learning (ML) models efficiently from research to production? We believe it is imperative to plan for production from day one, both by using a disciplined process and by choosing the right tools. Fortunately, we don't have to start from scratch: as machine learning engineers, we can adapt many of the best practices from the DevOps playbook and apply them to the ML workflow.
In this session, we explain some of the ML development best practices we have developed from working in the trenches. The core of our approach comes down to a philosophy of engineering discipline around "being the Navy, not pirates." Wherever possible, we explain how to reduce incidental complexity in ML development through appropriate tooling and process. We dive deep into specific best practices such as using Docker from day one, continuous integration for ML, experiment tracking with MLflow, and experimenting at scale in the cloud.
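To give a flavor of the experiment-tracking practice mentioned above, here is a minimal sketch of logging a run with MLflow. The dataset, experiment name, and hyperparameters are illustrative placeholders, not material from the session itself.

```python
# Minimal MLflow tracking sketch (experiment name, params, and data are illustrative).
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Point runs at a tracking server, or omit to log to the local ./mlruns directory.
# mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("demo-classifier")  # hypothetical experiment name

X, y = load_breast_cancer(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=42)

params = {"n_estimators": 200, "max_depth": 8}

with mlflow.start_run(run_name="baseline-rf"):
    # Record hyperparameters so every run is reproducible and comparable.
    mlflow.log_params(params)

    model = RandomForestClassifier(**params, random_state=42).fit(X_train, y_train)
    val_auc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])

    # Record metrics and the trained model artifact alongside the run.
    mlflow.log_metric("val_auc", val_auc)
    mlflow.sklearn.log_model(model, "model")
```

Runs logged this way can be browsed and compared in the MLflow UI (`mlflow ui`), which is part of what keeps experiments reproducible as models move toward production.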
Session Outline
* Background (10 min)
* Key Lessons
** Use Docker on Day One (25 min)
** Use a Structured ML Software Workflow (25 min)
** Downstream Containerization Benefits (15 min)
* Conclusion / Q&A (15 min)
Background Knowledge
Python, Git
Bio: As CTO for Manifold, Sourav is responsible for the overall delivery of data science and data product services to make clients successful. Before Manifold, Sourav led teams to build data products across the technology stack, from smart thermostats and security cams (Google / Nest) to power grid forecasting (AutoGrid) to wireless communication chips (Qualcomm). He holds patents for his work, has been published in several IEEE journals, and has won numerous awards. He earned his PhD, MS, and BS degrees from MIT in Electrical Engineering and Computer Science.