Abstract: Datasets are almost never complete and this can introduce various biases to your analysis. Due to these biases, your supervised machine learning model can produce incorrect predictions. The goal of this post is to give you an idea of why some of the most common approaches for dealing with missing values often introduce some type of bias. At ODSC Europe 2020, I will describe the methods and techniques that can help you to arrive at an unbiased conclusion in the face of missing data.
Bio: Andras Zsom is a Lead Data Scientist in the Center for Computation and Visualization group at Brown University, Providence, RI. He works with high-level academic administrators to tackle predictive modeling problems, he collaborates with faculty members on data-intensive research projects, and he was the instructor of a data science course offered to the data science master students at Brown.