Model Training with GPUs and Live Metrics Tracking with TensorBoard on Kubeflow

Abstract: 

Kubeflow is the de facto standard for running Machine Learning (ML) workflows on Kubernetes. Its goal is to simplify the day-to-day operations of data scientists and accelerate the production deployment of models.

Kubeflow comes with the tools and technologies that end users are accustomed to, such as Jupyter Notebooks, TensorFlow, and TensorBoard. It also provides intuitive UIs for managing and consuming the data stored in the cluster.

In this session you will: 1) learn the basics of Kubeflow, including configuring a Jupyter Notebook on a K8s cluster, 2) upload data from your local machine directly to the cluster using Kubeflow’s UIs, 3) tackle a real-world ML problem using Keras and GPUs to train a dog breed identifier, and 4) track and visualize training metrics using TensorBoard.

Session Outline
* Lesson 1: Learn the basics of Kubeflow
Discover the different tools and services of Kubeflow. Configure a Jupyter Notebook, including injecting Object Store credentials and assigning GPUs. All of this happens within Kubeflow’s UIs, with no need to open a terminal.

* Lesson 2: Learn how to upload your local data to the Kubeflow cluster
You will learn how to use Kubeflow’s intuitive UIs and applications to upload files and folders from your local machine directly to the cloud with simple drag-and-drop mechanisms. You will also be able to navigate and explore the data that lives inside your cluster’s volumes.


* Lesson 3: Train a Keras model using GPUs
Create and train a Keras CNN model with GPUs. Given a dog image, the final model should be able to identify the dog breed reliably.

* Lesson 4: Launch TensorBoard to visualize your training metrics
You will learn how to launch and use TensorBoard with Kubeflow to track and visualize your training metrics live, as the Jupyter Notebook generates them.
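
As a rough illustration of Lessons 3 and 4, the sketch below builds a small Keras CNN and trains it with a TensorBoard callback, so that metrics are written to a log directory while training runs. It is a minimal sketch under stated assumptions, not the session's actual notebook: the class count, image size, layer choices, log location, and the helper names `make_log_dir`, `build_model`, and `train` are all hypothetical. TensorFlow is imported inside the functions that need it.

```python
from datetime import datetime
from pathlib import Path

NUM_BREEDS = 120         # assumption: number of dog breed classes in the dataset
IMAGE_SIZE = (224, 224)  # assumption: input image resolution


def make_log_dir(root, run_name=None):
    """Return a per-run log directory under `root` (hypothetical helper)."""
    run_name = run_name or datetime.now().strftime("%Y%m%d-%H%M%S")
    return Path(root) / run_name


def build_model(num_classes=NUM_BREEDS, image_size=IMAGE_SIZE):
    """Build and compile a small CNN classifier (illustrative architecture)."""
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(*image_size, 3)),
        tf.keras.layers.Rescaling(1.0 / 255),  # scale pixel values to [0, 1]
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer class labels
        metrics=["accuracy"],
    )
    return model


def train(train_ds, val_ds, log_root="logs", epochs=5):
    """Train the model and write TensorBoard event files once per epoch."""
    import tensorflow as tf

    # An empty list here means training will fall back to the CPU.
    print("GPUs visible:", tf.config.list_physical_devices("GPU"))

    log_dir = make_log_dir(log_root)
    tensorboard_cb = tf.keras.callbacks.TensorBoard(
        log_dir=str(log_dir), update_freq="epoch"
    )
    model = build_model()
    model.fit(train_ds, validation_data=val_ds, epochs=epochs,
              callbacks=[tensorboard_cb])
    return model, log_dir
```

Here `train_ds` and `val_ds` are assumed to be batched datasets of images with integer labels, for example as produced by `tf.keras.utils.image_dataset_from_directory`. Pointing a TensorBoard instance at the `log_root` directory should then show one run per call to `train`, updated as each epoch finishes.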

Background Knowledge
Attendees should be familiar with Kubernetes, Jupyter Notebooks, and TensorBoard.

Bio: 

Kimonas is a Software Engineer at Arrikto, working on storage solutions in the cloud. He loves Open Source and has been a core contributor to the Kubeflow project for more than a year. Kimonas is the owner of the platform's Jupyter infrastructure, and his main goal is to improve the way users manage the lifecycle of their ML tools, such as Notebooks, and their data on top of Kubeflow. He is also a mentor for the Kubeflow project in Google Summer of Code 2020, providing guidance on adding seamless support for launching TensorBoard instances.