Abstract: In this talk, we will discuss the use of active learning to prioritize samples for labeling by human experts in the context of automated information extraction from cancer pathology reports. Pathology reports are medical documents containing a detailed description of a patient's diagnosis, made by examining cancerous cells and tissues under a microscope. In the US, cancer registries are responsible for collecting, storing, and processing pathology reports from all cancer cases diagnosed in the country to support cancer surveillance. Hundreds of thousands of these free-text reports are processed manually by the registries, so machine learning and NLP tools can provide an effective way to automate this process. To accelerate the development of such AI-powered systems, active learning can be used to intelligently guide the acquisition of additional data samples for training better machine learning models.
Bio: Andre Goncalves is a Machine Learning Research Scientist in the Machine Learning group at Lawrence Livermore National Laboratory. Dr. Goncalves contributes his machine learning expertise to a variety of projects across the Lab, including cancer prognosis from clinical and genomic information, seasonal and inter-seasonal climate forecasting, antibody design to counter SARS-CoV-2, the virus that causes COVID-19, and small molecule drug discovery. His particular expertise lies in probabilistic machine learning, multi-task/transfer learning, uncertainty quantification, and deep learning. Dr. Goncalves received his Ph.D. in Computer Engineering from the University of Campinas (Brazil) and the University of Minnesota, Twin Cities (USA) in 2016, after conducting his thesis work on multi-task learning models for climate forecasting.