Scalable Machine Learning with Kubernetes and Kubeflow

Abstract: Containers have emerged as an excellent tool for building portable and scalable applications, with Kubernetes as a leading system for orchestrating container deployments across clusters. However, for data science teams to reap the benefits of Kubernetes for machine learning, they must first overcome myriad engineering and DevOps challenges.

Kubeflow seeks to simplify the process of building, deploying, and scaling machine learning workflows on Kubernetes. It began as an internal project at Google for streamlining TensorFlow jobs on Kubernetes, but it has since been open sourced and now supports a variety of frameworks and workflows. Kubeflow takes care of many common pain points, allowing data science teams to focus on building and deploying models instead of managing a brittle network of systems held together with glue code.

This workshop will provide an overview of the benefits of using containers, Kubernetes, and Kubeflow to build portable and scalable machine learning pipelines, as well as hands-on exercises in building, deploying, and consuming Kubeflow-based models on local and public cloud infrastructure. Attendees can follow the examples shown or code along on their own machines using the provided data and scripts.

Bio: As a Senior Data Scientist and Instructor at Metis, John Tate leads immersive data science programs, training students in areas including machine learning, math and statistics, and the Python data science toolkit. Previously, John worked as a consultant developing end-to-end data science solutions for clients in industries such as Healthcare, Pharma, Finance, Automotive, and Marketing Technology.