The Last Frontier of Machine Learning – Data Wrangling

Abstract: The bane of any organization deploying high-quality machine learning technology is data wrangling. Data wrangling consists of data pre-processing, feature cleaning, and feature engineering. We estimate that data scientists spend upwards of 90% of their time wrestling with data, making it the biggest bottleneck to widespread machine learning adoption. Automating aspects of data wrangling would dramatically increase the adoption of machine learning technology across enterprise organizations. In this talk, Alex Holub, Ph.D., draws upon his experience in industry and academia to illustrate why data wrangling is a challenge and to present some of the solutions being developed to automate it.

Bio: Alex Holub is the Co-founder and CEO of Vidora, a Machine Learning company focused on automating data wrangling and enabling operational intelligence for everyone. He received his Ph.D. in Machine Learning and Computer Vision from Caltech and has published over 20 peer-reviewed articles in the areas of Machine Learning, Statistics, and Artificial Intelligence. Prior to Caltech, Alex received undergraduate degrees in Computer Science and Neurobiology from Cornell and spent one year as a visiting scientist at the Max Planck Institute for Biological Cybernetics in Tübingen.