Introduction to Machine Learning with scikit-learn


Machine learning has become an indispensable tool across many areas of research and in commercial applications. From text-to-speech on your phone to detecting the Higgs boson, machine learning excels at extracting knowledge from large amounts of data. This workshop will give a general introduction to machine learning and introduce practical tools for applying it in your research. We will focus on one particularly important subfield of machine learning: supervised learning. The goal of supervised learning is to "learn" a function that maps inputs x to an output y from a collection of training data consisting of input-output pairs. We will walk through formulating a problem as a supervised machine learning problem, creating the necessary training data, and applying and evaluating a machine learning algorithm. This workshop should give you the necessary background to start using machine learning yourself.
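
As a rough illustration of the supervised learning workflow described above, here is a minimal sketch in scikit-learn. It is not taken from the workshop materials: the iris dataset, the k-nearest-neighbors classifier, and the parameter values are assumptions chosen only to keep the example short.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Inputs X (flower measurements) and outputs y (species labels).
# The dataset is an illustrative choice, not the workshop's own data.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the input-output pairs for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# "Learn" the mapping from X to y using the training pairs only.
# The classifier and n_neighbors=3 are assumptions for this sketch.
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

# Evaluate on pairs the model has not seen during training.
test_accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.2f}")
```

Scoring on held-out pairs rather than on the training data gives a more honest estimate of how well the learned function generalizes to new inputs.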

● The language used will be Python. Please use the link provided to install Jupyter Notebook via Anaconda.
● Training materials can be found at the GitHub repository link provided.
● Attendees can also follow the Google Drive link provided for an overview of the content.
● This workshop assumes familiarity with Jupyter notebooks and the basics of the following libraries:
• pandas
• matplotlib
• numpy


Andreas Mueller is an Associate Research Scientist at the Data Science Institute at Columbia University and author of the O'Reilly book "Introduction to Machine Learning with Python". He is one of the core developers of the scikit-learn machine learning library and has co-maintained it for several years.

His mission is to create open tools that lower the barrier to entry for machine learning applications, promote reproducible science, and democratize access to high-quality machine learning algorithms.