The Wisdom of the Cloud

Abstract: 

What can we learn about data science by watching data science competitions?

During a data science competition like the ones hosted by DrivenData and Kaggle, the leaderboard lists the teams that have submitted models and the scores the top models have achieved.

As the competition proceeds, the scores often improve quickly as teams explore a variety of models and then more slowly as they approach the limits of what’s possible.

Using 170,000 scores from more than 50 competitions hosted by DrivenData, we explore the aggregated behavior of the competing teams.

What patterns can we see?

Based on early returns, can we predict the limits?
What factors influence the time and the number of submissions it takes to reach the performance plateau?
Do models tend to overfit the data as the contest progresses?
And what guidance can we provide for deciding when to stop searching?

In this talk, we will answer these questions and share further observations from the other side of the leaderboard.

Bio: 

Isaac is a co-founder and Principal Data Scientist at DrivenData, Inc., where he leads client engagements and spearheads development of the data science competition platform. He holds a master's in Computational Science and Engineering from Harvard’s School of Engineering and Applied Sciences and a BS in Operations Research from the U.S. Coast Guard Academy. He previously spent seven years as a Coast Guard officer, serving in a variety of operational and quantitative roles.

Open Data Science