Advanced Machine Learning with scikit-learn: Imbalanced Classification and Text Data


scikit-learn is a machine learning library for Python that has become a valuable tool for many data science practitioners. This training covers advanced topics in using scikit-learn, including how to build your own models or feature extraction methods that are compatible with scikit-learn. We will also discuss different approaches to feature selection, as well as resampling methods for imbalanced data. Finally, we'll discuss how to classify text data using the bag-of-words model and its variants.
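To make two of these ideas concrete, here is a minimal sketch in plain Python rather than scikit-learn itself. It shows what a bag-of-words count vector looks like and one common way to weight classes inversely to their frequency, which is an alternative to resampling. The function names, the toy documents, and the labels are all assumptions made up for illustration; they are not material from the training. In practice, scikit-learn's `CountVectorizer` builds the count matrix with many more options.

```python
from collections import Counter


def build_vocabulary(docs):
    """Map each distinct lowercase token to a column index, in sorted order."""
    tokens = sorted({tok for doc in docs for tok in doc.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(doc, vocabulary):
    """Return a count vector for one document; unknown tokens are ignored."""
    counts = Counter(doc.lower().split())
    vec = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vec[vocabulary[tok]] = n
    return vec


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


# Hypothetical toy corpus: one "spam" document against three "ham" documents.
docs = ["free prize now", "meeting at noon", "lunch at noon", "notes from meeting"]
labels = ["spam", "ham", "ham", "ham"]

vocab = build_vocabulary(docs)
X = [bag_of_words(d, vocab) for d in docs]
weights = balanced_class_weights(labels)
```

Word order is discarded: each document becomes a fixed-length vector of token counts, and the rare class receives a larger weight than the common one so that a classifier accepting per-class weights does not simply favor the majority.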


Andreas Mueller is an Associate Research Scientist at the Data Science Institute at Columbia University and author of the O'Reilly book "Introduction to Machine Learning with Python". He is one of the core developers of the scikit-learn machine learning library and has co-maintained it for several years.

His mission is to create open tools that lower the barrier to entry for machine learning applications, promote reproducible science, and democratize access to high-quality machine learning algorithms.

Open Data Science



