Visualization in Bayesian Workflow Using Python or R

Abstract:

Visualization can be a powerful tool to help you build better statistical models. In this tutorial, you will learn how to create and interpret visualizations that are useful in each step of a Bayesian regression workflow. A Bayesian workflow includes three steps, (1) model building, (2) model interpretation, and (3) model checking/improvement, along with model comparison. Visualization is helpful in each of these steps: graphical representations of the model and plots of prior distributions aid model building; MCMC diagnostics and plots of posterior distributions aid interpretation; and posterior predictive checks, counterfactual plots, and model comparison plots aid model checking/improvement.

Session Outline:

Hour 1: Provide an overview of visualizations in a Bayesian workflow (60 minutes)
- Plot graphical representation of model
- Plot prior distributions of parameters
- Plot MCMC diagnostics
- Plot posterior predictive checks
- Plot posterior distributions of parameters
- Plot the data and fitted model
- Plot uncertainty in the mean and predictive uncertainty
- Plot posterior prediction and counterfactual
- Plot model comparisons

Hour 2: Visualization in Bayesian workflow: Linear regression (60 minutes)
- Describe the example
- Step through a Bayesian workflow (explaining interpretation of visualizations)
- Provide time for some exercises for participants

Hour 3: Visualization in Bayesian workflow: Logistic regression (60 minutes)
- Describe the example
- Step through a Bayesian workflow (explaining interpretation of visualizations)
- Provide time for some exercises for participants

Learning Objectives:

- Understand Bayesian regression workflows
- Understand visualization concepts, guidelines, and best practices
- Understand how to interpret and use Bayesian workflow visualizations
- Understand how to create Bayesian workflow visualizations in Python or R

Tools:

- Python: pandas, bambi, pymc, arviz
- R: tidyverse, rstanarm, brms, bayesplot

Level of Background Assumed:

For this tutorial, I assume workshop attendees have a basic understanding of linear and logistic regression, as well as a basic understanding of (or an interest in learning) a Bayesian analysis framework (i.e., combining prior credibilities of hypotheses with the data to produce posterior credibilities of those hypotheses). I also assume attendees have a basic understanding of Python or R, but not necessarily of the specific packages covered in the tutorial.
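
For readers new to the Bayesian framework, the idea of combining prior credibilities with data can be sketched with a small grid approximation. This is an illustrative sketch only, not material from the tutorial: the data (6 successes in 9 trials), the flat prior, and all variable names are assumptions chosen for the example, and it uses only the Python standard library rather than the packages listed under Tools.

```python
from math import comb

# Hypothetical data (illustrative assumption): 6 successes in 9 trials.
successes, trials = 6, 9

# Candidate hypotheses: 101 evenly spaced values of the success probability.
grid = [i / 100 for i in range(101)]

# Prior credibility of each hypothesis: a flat prior (an assumption).
prior = [1.0 / len(grid)] * len(grid)

# Likelihood of the observed data under each hypothesis (binomial).
likelihood = [
    comb(trials, successes) * t**successes * (1 - t) ** (trials - successes)
    for t in grid
]

# Posterior credibility: prior times likelihood, normalized to sum to 1.
unnormalized = [p * l for p, l in zip(prior, likelihood)]
evidence = sum(unnormalized)
posterior = [u / evidence for u in unnormalized]

# Simple summaries of the posterior over the grid.
posterior_mode = grid[posterior.index(max(posterior))]
posterior_mean = sum(t * p for t, p in zip(grid, posterior))

print(f"posterior mode: {posterior_mode:.2f}")
print(f"posterior mean: {posterior_mean:.3f}")
```

Plotting `prior` and `posterior` against `grid` gives a first example of the prior and posterior distribution plots named in the session outline.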

Bio:

Clinton Brownley, Ph.D., is currently a lead data scientist at Tala with a focus on causal inference, machine learning, and experimentation. Before joining Tala, he was a data scientist at Meta (formerly Facebook), where he was responsible for a variety of analytics projects designed to empower employees to do their best work. Before that, he was a data scientist at WhatsApp, working to improve messaging and VoIP calling performance and reliability. Before WhatsApp, he worked on large-scale infrastructure analytics projects to inform hardware acquisition, maintenance, and datacenter operations decisions at Facebook.

As an avid student and teacher of modern data analysis and visualization techniques, Clinton teaches a graduate course in interactive data visualization for UC Berkeley's MIDS program and a graduate course in regression analysis for NYU's A3SR program. He also leads an annual machine learning in Python workshop at the ML Week and ODSC West conferences. Clinton is also the author of two books, "Foundations for Analytics with Python" and "Multi-objective Decision Analysis".

Clinton is a past president of the San Francisco Bay Area Chapter of the American Statistical Association and is a council member for the Section on Practice of the Institute for Operations Research and the Management Sciences. Clinton received degrees from Carnegie Mellon University and American University.
