Engineering a Performant Machine Learning Pipeline: From Dask to Kubeflow


The lifecycle of any machine learning model, classical or deep, consists of (a) pre-processing, transforming, and augmenting the data, (b) training the model with different hyper-parameter values and learning rates, and (c) computing results on new data or test sets. Whether you use transfer learning or a from-scratch model, this process requires a large amount of computation, careful management of your experiments, and a quick way to review their results. In this workshop, we will learn how to combine off-the-shelf clustering software such as Kubernetes and Dask with learning systems such as TensorFlow, PyTorch, and scikit-learn, on cloud infrastructure such as AWS, Google Cloud, or Azure, to construct a machine learning system for your data science team. We'll start with an understanding of Kubernetes, move on to analysis pipelines in scikit-learn and Dask, and finally arrive at Kubeflow. Participants should install minikube on their laptops and create accounts on Google Cloud.
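The three lifecycle stages above can be sketched in a few lines. This is only an illustration, not workshop material: plain Python stands in for scikit-learn, the data is synthetic, and every function name (`make_data`, `fit_scaler`, `train`, `run_pipeline`) is hypothetical.

```python
import random
import statistics


def make_data(n, seed):
    """Synthetic stand-in for a real dataset: y = 3x + 2 plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [3.0 * x + 2.0 + rng.gauss(0, 1) for x in xs]
    return xs, ys


def fit_scaler(xs):
    """Stage (a): learn the standardization parameters from training data only."""
    return statistics.mean(xs), statistics.pstdev(xs)


def transform(xs, scaler):
    """Stage (a): apply the learned standardization to any split."""
    mean, std = scaler
    return [(x - mean) / std for x in xs]


def train(xs, ys, learning_rate, epochs=200):
    """Stage (b): fit y = w*x + b by batch gradient descent on squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mse(model, xs, ys):
    """Stage (c): mean squared error of a fitted model on a given split."""
    w, b = model
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def run_pipeline(learning_rates=(0.001, 0.01, 0.1)):
    """Run stages (a) to (c) once per learning rate; return the best rate and all scores."""
    train_x, train_y = make_data(200, seed=0)
    test_x, test_y = make_data(50, seed=1)

    scaler = fit_scaler(train_x)
    train_z = transform(train_x, scaler)
    test_z = transform(test_x, scaler)

    results = {}
    for lr in learning_rates:
        model = train(train_z, train_y, lr)
        results[lr] = mse(model, test_z, test_y)

    best = min(results, key=results.get)
    return best, results


if __name__ == "__main__":
    best, results = run_pipeline()
    for lr, score in sorted(results.items()):
        print(f"learning_rate={lr}: test MSE={score:.3f}")
    print("best learning rate:", best)
```

Each pass through the loop in `run_pipeline` is independent of the others, which is why this kind of hyper-parameter sweep is a natural candidate for distribution across a cluster.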


Richard Kim is the founder and CEO of Markov Lab, an AI startup that explores the application of probabilistic modeling and deep learning to the analysis and prediction of financial data. Richard is a Chartered Financial Analyst (CFA) with years of fundamental equity research experience and academic research experience in artificial intelligence at MIT. He earned his master's degree from the Massachusetts Institute of Technology, where he authored several papers on computational cognitive models of ethical decision-making for autonomous vehicles, one of which, “The Moral Machine experiment,” was published in the October 2018 issue of Nature.

Open Data Science




