Introduction to Clinical Natural Language Processing: Predicting Hospital Readmission with Discharge Summaries

Abstract: Clinical notes from physicians and nurses contain a wealth of knowledge and insight that predictive models can use to improve patient care and hospital workflow. In this workshop, we will introduce a few Natural Language Processing techniques for building a machine learning model in Python with clinical notes. As an example, we will focus on predicting unplanned hospital readmission from discharge summaries using the MIMIC-III data set. After completing this tutorial, the audience will know how to prepare data for a machine learning project, preprocess unstructured notes using a bag-of-words approach, build a simple predictive model, assess the model's quality, and strategize how to improve it. Note to the audience: access to the MIMIC-III data set must be requested in advance, so please request access as early as possible.
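
For readers new to the bag-of-words approach named in the abstract, here is a minimal sketch of the idea: each note is reduced to a vector of word counts over a shared vocabulary, ignoring word order. This is an illustration only, not the workshop's actual code. The function names `tokenize` and `bag_of_words` are hypothetical, the tokenizer is a simple assumption (lowercase alphabetic words), and the two notes are hand-written examples rather than MIMIC-III data.

```python
import re
from collections import Counter


def tokenize(note):
    """Lowercase a note and split it into alphabetic word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z]+", note.lower())


def bag_of_words(notes):
    """Return (vocabulary, count_matrix) for a list of note strings.

    vocabulary is a sorted list of unique tokens; count_matrix has one row
    per note and one column per vocabulary entry.
    """
    tokenized = [tokenize(note) for note in notes]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocabulary)}

    matrix = []
    for toks in tokenized:
        row = [0] * len(vocabulary)
        for tok, count in Counter(toks).items():
            row[index[tok]] = count
        matrix.append(row)
    return vocabulary, matrix


# Hypothetical, hand-written notes for illustration (not MIMIC-III data).
notes = [
    "Patient discharged home in stable condition.",
    "Patient readmitted with chest pain; chest x-ray ordered.",
]
vocabulary, matrix = bag_of_words(notes)

# Show how often "chest" appears in each note.
chest_column = vocabulary.index("chest")
print([row[chest_column] for row in matrix])
```

Count vectors like these are the kind of numeric features a simple predictive model can be trained on. In practice a library vectorizer would typically replace this hand-rolled version.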

Bio: Andrew Long is a Data Scientist at Fresenius Medical Care North America (FMCNA). Andrew holds a PhD in biomedical engineering from Johns Hopkins University and a Master’s degree in mechanical engineering from Northwestern University. Andrew joined FMCNA last year after participating in the Insight Health Data Fellows Program. At FMCNA, he is responsible for building predictive models using machine learning to improve the quality of life of every patient who receives dialysis from FMCNA. He is currently creating a model to predict which patients are at the highest risk of imminent hospitalization.