Solving the Data Scientist’s Cold-Start Problem with Machine Learning Examples

Abstract: 

Unsupervised learning models (including analysis of correlations, clusters, and associations in data) converge more readily to a useful solution if we start with good model parameterizations. Feature engineering is key, but selection of features often becomes guesswork. Similarly, in supervised machine learning, the choice of which features in the labeled data to use for training may still seem arbitrary. So, how does model-building start and move towards an optimal solution? This challenge is known as the cold-start problem! The solution to the problem is easy (sort of): We start with a guess, a totally random guess! That sounds so random, and so wrong! But there is an orderly and productive way forward from such a start, which we will describe in this workshop. We will present several machine learning modeling examples, suggested solutions to their cold-start challenges, and related concepts, including the objective function, genetic algorithms, backpropagation, gradient descent, and meta-learning.
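
To make the "random guess, then an orderly way forward" idea concrete, here is a minimal sketch in Python. It is not taken from the workshop materials: the objective function, its optimum at 3.0, the learning rate, and the function names are all assumptions chosen for illustration. It starts from a random parameter value and uses gradient descent to reduce the objective step by step.

```python
import random


def objective(w):
    # Hypothetical objective function (an assumption for illustration):
    # squared distance from an optimum placed at w = 3.0.
    return (w - 3.0) ** 2


def gradient(w):
    # Derivative of the objective above with respect to w.
    return 2.0 * (w - 3.0)


def gradient_descent(seed=0, learning_rate=0.1, steps=100):
    rng = random.Random(seed)
    # Cold start: the initial parameter is a random guess.
    w = rng.uniform(-10.0, 10.0)
    history = [objective(w)]
    for _ in range(steps):
        # Move against the gradient to reduce the objective.
        w -= learning_rate * gradient(w)
        history.append(objective(w))
    return w, history


if __name__ == "__main__":
    w, history = gradient_descent()
    print("initial objective:", history[0])
    print("final objective:", history[-1])
    print("final parameter:", w)
```

Whatever the seed, the starting guess is arbitrary, yet each update shrinks the distance to the optimum by a constant factor, so the final parameter lands very close to 3.0. Real models have many parameters and less friendly objective functions, but the pattern of "guess, measure, adjust" is the same.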

Session Outline
1. Data Science Preliminaries: Discovery from Data Using Algorithms
2. The Cold-Start Problem and its Challenges
3. Machine Learning Examples and Their Solutions: 5-10 Specific Use Cases

Background Knowledge
- Basic Machine Learning Algorithms;
- Algorithm Concepts.

Bio: 

Dr. Kirk Borne is the Principal Data Scientist and an Executive Advisor at global technology and consulting firm Booz Allen Hamilton. In those roles, he focuses on applications of data science, data management, machine learning, A.I., and modeling across a wide variety of disciplines. He also provides training and mentoring to executives and data scientists within numerous external organizations, industries, agencies, and partners in the use of large data repositories and machine learning for discovery, decision support, and innovation. Previously, he was Professor of Astrophysics and Computational Science at George Mason University for 12 years where he did research, taught, and advised students in data science. Prior to that, Kirk spent nearly 20 years supporting data systems activities on NASA space science programs, which included a period as NASA's Data Archive Project Scientist for the Hubble Space Telescope. Dr. Borne has a B.S. degree in Physics from LSU, and a Ph.D. in Astronomy from Caltech. In 2016 he was elected Fellow of the International Astrostatistics Association for his lifelong contributions to big data research in astronomy. As a global speaker, he has given hundreds of invited talks worldwide, including conference keynote presentations at many dozens of data science, A.I. and big data analytics events globally. He is an active contributor on social media, where he has been named consistently among the top worldwide influencers in big data and data science since 2013. He was recently identified as the #1 digital influencer worldwide for 2018-2019. You can follow him on Twitter at @KirkDBorne.

Open Data Science
One Broadway
Cambridge, MA 02142
info@odsc.com
