Myths of Data Science: Things You Should and Should Not Believe

Abstract: Our most important data science tools are our theories and methods. In this talk, we will go back to fundamentals and look closely at some assumptions about statistics and machine learning that usually go unexamined. We will look at "myths" that arise in three common data science tasks: predictive modeling, analyzing the reliability or validity of results, and running controlled experiments (A/B testing). We will "debunk" these myths and offer some potential fixes for the issues that can arise, all in a (hopefully) entertaining way.

Bio: Dr. Nina Zumel is co-founder and principal at Win-Vector LLC, a data science consultancy based in San Francisco. She frequently writes and speaks on statistics and machine learning. She is also the coauthor of the popular book Practical Data Science with R (Manning, 2014). Nina started her advanced education in electrical engineering at UC Berkeley and holds a Ph.D. in robotics from Carnegie Mellon.