Evaluation of Statistical Models based on Stakeholder Needs

Abstract: 

While stakeholders’ needs from a data science team vary considerably, those needs are often met using the same statistical model. I propose a framework in which a data scientist can evaluate a statistical model using one of three approaches: (1) models for feature/predictor evaluation, which allow stakeholders to take specific actions that maximize impact on the model’s outcome variable; (2) models for actionable output, where the outcome variable’s value on a prediction set is of primary interest; and (3) machine learning models for real-time prediction, where the algorithm and its value are of interest for their active and continued impact on a user.

In this 90-minute tutorial, we will evaluate a set of best practices and methods for each of the three approaches to statistical model evaluation. Leveraging common Python libraries such as scikit-learn and statsmodels, we will examine the available methods and model output that deliver the value stakeholders need, depending on their specific question. We will also discuss common stakeholder questions that fall into each of the three categories above, how to identify which category a question belongs to, and common follow-up questions to anticipate when presenting your work.

By the end of this tutorial you will be able to:
identify the most effective libraries in Python (with additional recommendations in R) for each category of statistical model
effectively interpret and leverage available model output
classify stakeholder needs into one of the three categories above
prepare stakeholders’ requested output with the necessary documentation for audiences of varying technical proficiency

Bio: 

Mona is a Senior Data Scientist at Greenhouse Software in New York City, where they contribute to data-informed decision making across the company and to machine learning solutions that improve the hiring process for Greenhouse customers. They’ve previously worked in government, creating analytics and machine learning solutions to improve the lives of New Yorkers, and continue to be involved in civic projects through a number of volunteer and non-profit organizations. They’ve also been a statistics and data science educator with DataCamp, Emeritus, and in university settings. They hold a graduate degree in Developmental Psychology, and are passionate about contributing to the ethical use of data science methodology in the public and private sectors.

Open Data Science

Innovation Center
101 Main St
Cambridge, MA 02142
info@odsc.com
