Guided Labeling: Human-in-the-Loop Label Generation with Active Learning and Weak Supervision


We are in the age of data. In recent years, many companies have started collecting large amounts of data about their business, and many others are starting now. However, before you can train any decent supervised model, you need ground truth data. And this is the ugly truth: before proceeding, you need a sufficiently large set of correctly labeled data records to describe your problem. And data labeling, especially in sufficiently large amounts, is … expensive. In this presentation we explain the main parts of the guided labeling procedure and show a blueprint web application, based on active learning and weak supervision, for interactively labeling any document set while investing only a fraction of the time that fully manual labeling would take. Additionally, the user can provide labeling functions, or rules, that label portions of the dataset. Both the labels and the labeling functions provided by the human-in-the-loop are processed by the guided labeling application to train a machine learning model, to which the boring and expensive task of data labeling is then delegated.
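To make the two ideas concrete, here is a minimal sketch of how labeling functions and active learning can work together. It is not the application shown in the talk: the labeling functions, the class names, and the helper names are all hypothetical, and the majority vote with add-one smoothing is a deliberately simple stand-in for a real weak supervision model. The sketch turns labeling-function votes into a label distribution per document and then asks the human to label the document whose distribution is most uncertain (highest entropy).

```python
import math
from collections import Counter

ABSTAIN = None  # a labeling function returns this when its rule does not apply


# Hypothetical labeling functions: each encodes one simple rule from the user.
def lf_refund(doc):
    return "complaint" if "refund" in doc.lower() else ABSTAIN


def lf_broken(doc):
    return "complaint" if "broken" in doc.lower() else ABSTAIN


def lf_thanks(doc):
    return "praise" if "thank" in doc.lower() else ABSTAIN


LABELING_FUNCTIONS = [lf_refund, lf_broken, lf_thanks]


def weak_label_distribution(doc, classes):
    """Weak supervision (simplified): majority vote with add-one smoothing."""
    votes = Counter()
    for lf in LABELING_FUNCTIONS:
        vote = lf(doc)
        if vote is not ABSTAIN:
            votes[vote] += 1
    total = sum(votes.values()) + len(classes)
    return {c: (votes[c] + 1) / total for c in classes}


def entropy(dist):
    """Shannon entropy in bits; higher means a less certain label."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def next_document_to_label(docs, classes, already_labeled):
    """Active learning (uncertainty sampling): pick the least certain document."""
    candidates = [i for i in range(len(docs)) if i not in already_labeled]
    return max(
        candidates,
        key=lambda i: entropy(weak_label_distribution(docs[i], classes)),
    )


docs = [
    "I want a refund, the device arrived broken",
    "Thank you for the quick help",
    "Where is your office located?",
]
classes = ["complaint", "praise"]

first = next_document_to_label(docs, classes, already_labeled=set())
second = next_document_to_label(docs, classes, already_labeled={first})
```

In this toy run, the third document is queried first because no rule fires on it, and the second document comes next because only a single rule votes on it. In a fuller system, the human's answers would be fed back into a trained model whose predicted probabilities replace the vote-based distribution used here.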


Paolo Tamagnini is a data scientist at KNIME. He holds a master’s degree in data science from Sapienza University of Rome and has research experience at NYU in data visualization techniques for machine learning interpretability.

Open Data Science
One Broadway
Cambridge, MA 02142
