Automated Machine Learning with Python – from scikit-learn to auto-sklearn


Developing well-performing machine learning pipelines requires a lot of expertise, time and manual tuning. AutoML automates the development process to make machine learning more accessible and efficient. In this workshop we will cover how to move from manually constructing and tuning machine learning pipelines to using efficient hyperparameter optimization algorithms and full AutoML using the popular Auto-sklearn library. After this tutorial you will be able to use Auto-sklearn to build machine learning pipelines for tabular datasets and analyze them.

Session Outline
Part 1: Why is finding the right pipeline hard and how to search efficiently?
You will learn about the vast design space of machine learning pipelines, including seemingly subtle design choices that have a huge impact. As a first step we will use simple hyperparameter optimization methods and show when and how they break down as problems become more complex.
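As a rough illustration of a simple hyperparameter optimization method, the sketch below runs a random search over a small scikit-learn pipeline. The dataset, the SVM model, the search ranges and the budget of 20 samples are assumptions chosen for the example, not material taken from the workshop. Treating the scaling step itself as a searchable choice is one way to expose a seemingly subtle design decision to the search.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Example dataset (an assumption for this sketch; any tabular dataset works).
X, y = load_breast_cancer(return_X_y=True)

# A two-step pipeline: feature scaling followed by a support vector machine.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("svm", SVC()),
])

# Search space. Keys address pipeline steps as "<step>__<parameter>".
# The "scale" entry lets the search switch the scaling step on or off.
param_distributions = {
    "scale": [StandardScaler(), "passthrough"],
    "svm__C": loguniform(1e-3, 1e3),
    "svm__gamma": loguniform(1e-5, 1e1),
}

# Random search: sample 20 configurations and score each with 3-fold CV.
search = RandomizedSearchCV(
    pipeline,
    param_distributions=param_distributions,
    n_iter=20,
    cv=3,
    random_state=0,
)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

Every sampled configuration is trained in full on every fold, so the cost grows quickly once more hyperparameters are added or a single model fit becomes slow.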

Part 2: AutoML methods
Next, we will discuss advanced hyperparameter optimization methods such as Bayesian optimization and Hyperband and use them to optimize larger numbers of hyperparameters for machine learning models with longer runtimes. Based on this knowledge we will introduce Auto-sklearn, one of the leading open source AutoML libraries, and leverage it to simplify the machine learning workflow.
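To give a feel for the idea behind Hyperband, here is a minimal sketch of successive halving, the subroutine that Hyperband runs with several different trade-offs between the number of configurations and the budget per configuration. This is not the full Hyperband algorithm and not Auto-sklearn's implementation. The function `toy_validation_loss` is a hypothetical stand-in for training a model for `budget` epochs, and the values `eta=3` and 27 candidates are assumptions chosen so the rounds divide evenly.

```python
import math
import random


def toy_validation_loss(learning_rate, budget, rng):
    """Hypothetical stand-in for 'train for `budget` epochs, return validation loss'.

    The loss is lowest near learning_rate = 0.1; both the offset and the
    noise shrink as the budget grows, mimicking more reliable evaluations.
    """
    distance = abs(math.log10(learning_rate) + 1.0)
    noise = rng.gauss(0.0, 0.3 / budget)
    return distance + 1.0 / budget + noise


def successive_halving(configs, rng, min_budget=1, eta=3):
    """Evaluate all configs cheaply, keep the best 1/eta, repeat with eta times the budget."""
    survivors = list(configs)
    budget = min_budget
    spent = 0  # total budget (e.g. epochs) consumed across all evaluations
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda lr: toy_validation_loss(lr, budget, rng))
        spent += budget * len(survivors)
        survivors = ranked[: max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0], spent


rng = random.Random(0)
# 27 learning rates sampled log-uniformly between 1e-4 and 1.
candidates = [10 ** rng.uniform(-4, 0) for _ in range(27)]
best_lr, spent = successive_halving(candidates, rng)

print(best_lr)
print(spent)
```

With these settings the rounds evaluate 27, 9 and 3 configurations at budgets 1, 3 and 9, so most of the budget goes to the configurations that looked promising early on.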

Part 3: Advanced use cases of Auto-sklearn
In the last session we will demonstrate advanced use cases of Auto-sklearn, such as inspecting the final model, obtaining model-independent feature importance measures, restricting Auto-sklearn to interpretable models, and extending Auto-sklearn with additional components from third-party libraries.
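One common model-independent feature importance measure is permutation importance, which scikit-learn provides as `sklearn.inspection.permutation_importance`. The sketch below applies it to a random forest, used here only as a stand-in for a fitted model; the dataset and all parameter values are assumptions for illustration. Because the function only needs a fitted estimator that can be scored, the same call should in principle work with other scikit-learn-compatible models.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Example dataset (an assumption for this sketch).
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=0
)

# Stand-in model; any fitted scikit-learn-compatible estimator could be used.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Shuffle each feature 10 times on held-out data and record the score drop.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)

# Report the five features whose shuffling hurts the score the most.
top = result.importances_mean.argsort()[::-1][:5]
for i in top:
    name = data.feature_names[i]
    mean = result.importances_mean[i]
    std = result.importances_std[i]
    print(f"{name}: {mean:.3f} +/- {std:.3f}")
```

The importances are computed on held-out data, so they reflect what the model relies on for generalization rather than what it memorized during training.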

Background Knowledge
Participants should have basic knowledge of machine learning and model selection strategies. To follow the code examples, familiarity with Python and scikit-learn is recommended. This tutorial is a practical complement to the talk by Prof. Dr. Frank Hutter in the main track; attending that talk is a plus.


Katharina Eggensperger is a doctoral candidate at the Machine Learning Lab at the University of Freiburg, Germany. Her research focuses on empirical performance modeling, automated machine learning and hyperparameter optimization. She was an invited speaker at the BayesOpt workshop at NeurIPS 2016 and co-organized the AutoML workshop in 2019 and 2020. She was also part of the winning team of the 1st and 2nd AutoML challenges and of the BBO challenge at NeurIPS 2020.

Open Data Science
