Tell Me What To Do: Prioritizing Data Labeling for NLP Systems with Active Learning


In this talk, we will discuss the use of active learning to prioritize which samples human experts should label, in the context of automated information extraction from cancer pathology reports. Pathology reports are medical documents containing a detailed description of a patient's diagnosis, made by examining cancerous cells and tissues under a microscope. In the US, cancer registries are responsible for collecting, storing, and processing pathology reports from all cancer cases diagnosed in the country to help with cancer surveillance. The registries manually process hundreds of thousands of these free-text reports, so machine learning and NLP tools can provide an effective way to automate this work. To accelerate the development of such AI-powered systems, active learning can be used to intelligently guide the acquisition of additional data samples for training better machine learning models.
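The abstract does not say which acquisition strategy the talk uses. As a rough illustration only, the sketch below assumes one common choice, uncertainty sampling by predictive entropy: the reports whose predicted class distribution is closest to uniform are sent to human experts first. The function names, the `budget` parameter, and the probability values are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(predicted_probs, budget):
    """Return indices of the `budget` most uncertain unlabeled samples.

    Assumption: uncertainty is measured by predictive entropy. Other
    acquisition functions (margin, least confidence, ...) would slot in here.
    """
    scores = [entropy(p) for p in predicted_probs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Made-up model outputs for four unlabeled reports over three classes.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low labeling priority
    [0.40, 0.35, 0.25],  # uncertain
    [0.70, 0.20, 0.10],  # moderately confident
    [0.34, 0.33, 0.33],  # nearly uniform, highest labeling priority
]

to_label = select_for_labeling(pool_probs, budget=2)
print(to_label)  # indices of the reports to send to human experts first
```

In a full loop, the newly labeled reports would be added to the training set, the model retrained, and the selection repeated on the remaining pool.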


Andre Goncalves serves as a Machine Learning Research Scientist within the Machine Learning group at the Lawrence Livermore National Laboratory. Dr. Goncalves provides his machine learning expertise to a variety of projects across the Lab, including cancer prognosis from clinical and genomic information, seasonal and inter-seasonal climate forecasting, antibody design to counter the SARS-CoV-2 virus that causes COVID-19, and small molecule drug discovery. His particular expertise lies in probabilistic machine learning, multi-task/transfer learning, uncertainty quantification, and deep learning. Dr. Goncalves received his Ph.D. in Computer Engineering from the University of Campinas (Brazil) and the University of Minnesota, Twin Cities (USA) in 2016, after conducting his thesis work on multi-task learning models for climate forecasting.

Open Data Science




