Abstract: MLOps (Machine Learning Operations) has become the linchpin for deploying robust and reliable AI applications in the rapidly evolving landscape of machine learning. Using the intriguing Brackish Underwater dataset, this immersive workshop will take you through the entire ML pipeline, from exploratory data analysis (EDA) to deployment, with real-world insights and hands-on experience.
Our workshop begins with exploratory data analysis on the Brackish Underwater dataset. This dataset comprises images captured 9 meters below the surface on the Limfjords bridge in northern Denmark by Aalborg University. Dive deep into the world of underwater AI as we explore marine life, including fish, crabs, and other captivating aquatic creatures.
With a solid understanding of the dataset, we guide you through the Model Development Lifecycle, emphasizing the importance of consuming artifacts and training models effectively, complete with rigorous evaluation. We will also cover active learning, which becomes an essential part of the model development lifecycle.
Next, we will showcase the why and what of Hyperparameter Optimization. To improve the performance of the baseline model while making it manageable for real-time inference in a production environment, we need not only the best model but also the best set of hyperparameters to train it with.
Managing models is a crucial aspect of MLOps, and our workshop introduces you to Model Registry best practices. We then transition to the ML CI/CD (GitOps) process, showcasing how to seamlessly integrate machine learning into your software development workflow and automate a few steps.
Finally, we cap off the workshop by exploring deployment using Huggingface Spaces, ensuring your underwater models are ready for real-world applications, whether in marine research, environmental monitoring, or beyond.
**Module 1: Exploratory Data Analysis (EDA) and Dataset Introduction**
- Understand the Brackish Underwater dataset.
- Learn the basics of exploratory data analysis.
- Introduction to the Brackish Underwater dataset (14,674 images).
- Hands-on exploration of the dataset, including data loading and visualization.
- Introduction to key EDA tools, with a focus on Weights & Biases (W&B) Tables.
- Exercise: Perform EDA on the dataset, visualizing marine life and understanding data distribution.
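
As a rough illustration of the kind of distribution check this exercise involves, the sketch below counts how often each class appears across a set of annotations. The record layout, file names, and class names are hypothetical placeholders rather than the dataset's actual schema, and in the workshop this view would be built with W&B Tables instead of plain Python.

```python
from collections import Counter


def class_distribution(annotations):
    """Return each label's share of all annotated objects.

    `annotations` uses an assumed layout: one dict per image with a
    "labels" list holding the class name of every annotated object.
    """
    counts = Counter(label for ann in annotations for label in ann["labels"])
    total = sum(counts.values())
    if total == 0:
        return {}  # no annotated objects, so there is nothing to report
    return {label: n / total for label, n in counts.most_common()}


# Hypothetical records for illustration only.
sample = [
    {"image": "frame_0001.png", "labels": ["fish", "fish", "crab"]},
    {"image": "frame_0002.png", "labels": ["fish"]},
    {"image": "frame_0003.png", "labels": []},
]
distribution = class_distribution(sample)
```

A heavily skewed result from a check like this is an early hint that per-class metrics, not just an overall score, will matter during evaluation.
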
**Module 2: Model Development Lifecycle and Active Learning**
- Comprehend the Model Development Lifecycle.
- Understand the concept and significance of active learning in MLOps.
- Learn how active learning improves model performance.
- Overview of the Model Development Lifecycle from data ingestion to model training.
- In-depth explanation of active learning, its benefits, and why it's vital in the model development process.
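
Active learning can be implemented in many ways. One common strategy is least-confidence sampling, sketched below under the assumption that the model returns a list of class probabilities per image. The function name, the input format, and the sample values are illustrative; the workshop may use a different selection strategy, and an object detector would first need to aggregate per-box scores.

```python
def least_confident(predictions, k):
    """Pick the k samples whose top predicted probability is lowest.

    `predictions` maps a sample id to a list of class probabilities
    (an assumed format, chosen to keep the sketch small).
    """
    ranked = sorted(predictions, key=lambda sid: max(predictions[sid]))
    return ranked[:k]


# Hypothetical model outputs for four unlabeled images.
pool = {
    "img_a": [0.95, 0.03, 0.02],
    "img_b": [0.40, 0.35, 0.25],
    "img_c": [0.70, 0.20, 0.10],
    "img_d": [0.34, 0.33, 0.33],
}
to_label = least_confident(pool, k=2)
```

The selected images are the ones sent for annotation next, so labeling effort goes where the current model is least sure.
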
**Module 3: Hyperparameter Optimization and Model Management**
- Understand the importance of hyperparameter optimization (HPO).
- Learn how to optimize hyperparameters for improved model performance.
- Introduction to hyperparameter optimization (HPO) and its role in model performance.
- Exploring different HPO techniques, focusing on Weights & Biases (W&B) Sweeps.
- Hands-on HPO using the Brackish Underwater dataset.
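
For orientation, a W&B Sweep is driven by a configuration that names a search method, a metric to optimize, and the hyperparameters to vary. The sketch below shows one possible configuration; the metric name, parameter ranges, and project name are assumptions chosen for illustration, not the values used in the workshop. Launching it requires the `wandb` package and a logged-in account.

```python
sweep_config = {
    "method": "bayes",  # W&B also supports "grid" and "random"
    "metric": {"name": "val_accuracy", "goal": "maximize"},  # assumed metric name
    "parameters": {
        "learning_rate": {
            "distribution": "log_uniform_values",
            "min": 1e-5,
            "max": 1e-2,
        },
        "batch_size": {"values": [16, 32, 64]},
    },
}


def launch_sweep(train_fn, project="brackish-hpo", count=10):
    """Register the sweep and run `count` trials of `train_fn`.

    `train_fn` is a user-supplied training function that calls
    `wandb.init()` and logs the metric named in `sweep_config`.
    The project name is a placeholder.
    """
    import wandb  # imported lazily so the config can be inspected offline

    sweep_id = wandb.sweep(sweep_config, project=project)
    wandb.agent(sweep_id, function=train_fn, project=project, count=count)
```

Keeping the configuration as a plain dictionary makes it easy to review and version alongside the training code.
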
**Module 4: ML CI/CD (GitOps) and Deployment with Huggingface Spaces**
- Understand the ML CI/CD (GitOps) process.
- Learn how to automate machine learning workflows.
- Explore deployment with Huggingface Spaces.
- Overview of ML CI/CD (GitOps) principles and practices.
- Introduction to deployment using Huggingface Spaces.
- Hands-on exercise: Automate the ML CI/CD pipeline for your underwater object detection model.
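
One step that is often automated in an ML CI/CD pipeline is a promotion gate: a candidate model moves forward only if it beats the current baseline on an agreed metric. The sketch below is a hypothetical version of such a check. The function names, the threshold, and the idea of returning a process exit code are assumptions for illustration, not the workshop's actual pipeline.

```python
def should_promote(candidate, baseline, min_gain=0.0):
    """Return True if the candidate metric beats the baseline by more than min_gain."""
    return candidate - baseline > min_gain


def gate(candidate, baseline, min_gain=0.0):
    """Translate the decision into an exit code: 0 continues the pipeline, 1 stops it."""
    return 0 if should_promote(candidate, baseline, min_gain) else 1
```

In a CI job, a script could end with `sys.exit(gate(candidate_metric, baseline_metric))`, where both values come from the evaluation step, so that a regression fails the run before any deployment happens.
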
Participants should be familiar with Python, have some knowledge of machine learning, and have previously trained a neural-network-based model.
Bio: Ayush Thakur is an MLE at Weights & Biases and a Google Developer Expert in Machine Learning. He is interested in everything computer vision and representation learning. For the past 8 months he has been working with LLMs and has covered RLHF and the how and what of building LLM-based systems.