Software 2.0 and Snorkel: Beyond hand-labeled data

Abstract: 

In the last several years, deep learning models have become both more performant and more readily available as easy-to-use commodity tools. However, their deployment in practice is bottlenecked by the need for large, hand-labeled training sets. This talk describes Snorkel, a system that focuses on this emerging training data bottleneck in the Software 2.0 stack. In Snorkel, instead of tediously hand-labeling individual data items, a user implicitly defines large training sets by writing simple programs, called labeling functions, that label subsets of data points. Users can build high-quality models this way even though the labeling functions vary in quality, coverage, and specificity, and may be correlated in unknown ways. A key technical challenge in Snorkel is to estimate the quality of, and correlations among, these labeling functions without hand-labeled data. This talk will explain a theory of learning without labeled data and survey a host of recent applications in natural language processing, structured data problems, and computer vision. It will also briefly discuss recent extensions of these core ideas to automatically generating data augmentations, synthesizing training data, and learning from multi-task supervision.
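
To make the idea of labeling functions concrete, here is a minimal sketch in plain Python. It does not use the Snorkel library or its API: the spam/ham task, the three heuristics, and every name in it (`lf_contains_link`, `apply_lfs`, `majority_vote`, and so on) are hypothetical, chosen only for illustration. It also combines votes with a simple majority vote, which is a stand-in; the abstract describes estimating the quality and correlations of the labeling functions, which this sketch does not attempt.

```python
from collections import Counter

# Hypothetical label scheme for an illustrative spam/ham task.
ABSTAIN, HAM, SPAM = -1, 0, 1


# Each labeling function encodes one noisy heuristic. It labels only the
# data points it has an opinion about and abstains on the rest.
def lf_contains_link(text):
    return SPAM if "http" in text.lower() else ABSTAIN


def lf_mentions_prize(text):
    return SPAM if "prize" in text.lower() else ABSTAIN


def lf_short_message(text):
    return HAM if len(text.split()) <= 4 else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_link, lf_mentions_prize, lf_short_message]


def apply_lfs(texts):
    """Build a label matrix: one row per data point, one column per function."""
    return [[lf(t) for lf in LABELING_FUNCTIONS] for t in texts]


def majority_vote(votes):
    """Combine one row of votes; abstain when there are no votes or a tie.

    Assumption: this simple vote stands in for a learned model of
    labeling-function accuracies and correlations.
    """
    counts = Counter(v for v in votes if v != ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


texts = [
    "Claim your prize now at http://example.com",
    "See you at noon",
    "The quarterly report is attached for your review today",
]
label_matrix = apply_lfs(texts)
weak_labels = [majority_vote(row) for row in label_matrix]
print(weak_labels)  # [1, 0, -1]: SPAM, HAM, and one abstention
```

The third message gets no label because every function abstains on it, which illustrates the coverage issue the abstract mentions: the resulting training set contains only the points that at least one labeling function covers.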

Snorkel is open source on GitHub. Technical blog posts and tutorials are available at Snorkel.Stanford.edu.

Bio: 

Alex Ratner is a Ph.D. candidate in computer science at Stanford, advised by Chris Ré. His research focuses on weak supervision: the idea of using higher-level, noisier input from domain experts to train complex state-of-the-art models where limited or no hand-labeled training data is available. He leads the development of the Snorkel framework (snorkel.stanford.edu) for weakly supervised ML, which has been applied to machine learning problems in domains like genomics, radiology, and political science. He is supported by a Stanford Bio-X SIGF fellowship.
