Abstract: This tutorial will introduce you to the wonderful world of Bayesian data science through the lens of probabilistic programming in Python. In the first half of the tutorial, we will introduce the key concepts of probability distributions via hacker statistics, hands-on simulation, and telling stories of the data-generation processes. We will also cover the basics of joint and conditional probability, Bayes' rule, and Bayesian inference, all through hands-on coding and real-world examples. In the second half of the tutorial, we will use a series of models to build your familiarity with PyMC3, showcasing how to perform the foundational inference tasks of parameter estimation, group comparison (for example, A/B tests and hypothesis testing), and arbitrary curve regression. By the end of this tutorial, you will have a solid grounding in Bayesian inference, be able to write arbitrary models, and have experienced a basic model-checking workflow.
After attending this tutorial, participants will:
- have a solid foundation in probability, viewed through the lens of computational simulation, and see how probability distributions can be matched to real-world data-generating processes.
- understand how to use `numpy.random` to simulate draws from a probability distribution, use those simulations to calculate summary statistics, and use those summary statistics in testing hypotheses against data in a Bayesian fashion.
- understand how to use the probabilistic programming language PyMC3 to build arbitrary statistical models.
- be able to build and validate statistical models in a robust and principled fashion.
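As a taste of the simulation-based ("hacker statistics") approach described in the objectives above, here is a minimal sketch using `numpy.random`. It is not taken from the tutorial materials: the scenario (website visitors clicking a link) and all numbers (`n_visitors`, `p_click`, the threshold of 15 clicks) are assumptions chosen purely for illustration.

```python
import numpy as np

# Seeded generator so the simulation is reproducible.
rng = np.random.default_rng(42)

# Hypothetical data-generating story (illustrative values, not from the tutorial):
# 100 visitors each click a link independently with probability 0.1.
n_visitors = 100
p_click = 0.1
n_simulations = 10_000

# Each draw is the number of clicks observed in one simulated experiment.
clicks = rng.binomial(n=n_visitors, p=p_click, size=n_simulations)

# Summary statistics computed directly from the simulated draws.
mean_clicks = clicks.mean()
std_clicks = clicks.std()

# Fraction of simulated experiments with at least 15 clicks: a simulation-based
# estimate of how surprising such an observation would be under this story.
prob_at_least_15 = (clicks >= 15).mean()

print(f"mean clicks: {mean_clicks:.2f}")
print(f"std of clicks: {std_clicks:.2f}")
print(f"estimated P(clicks >= 15): {prob_at_least_15:.3f}")
```

The same pattern (tell a story, simulate it many times, summarize the draws) carries over to other distributions by swapping `rng.binomial` for another sampler such as `rng.poisson` or `rng.normal`.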
Knowledge of Python, `numpy`, and `matplotlib` is a prerequisite for this tutorial, as are curiosity and an excitement to learn new things!
Bio: Hugo Bowne-Anderson is a Data Scientist at DataCamp and has extensive experience teaching basic to advanced data science topics at institutions such as Yale University and Cold Spring Harbor Laboratory, at conferences such as SciPy and PyCon, and with organizations such as Data Carpentry. He has developed over 25 courses on the DataCamp platform, impacting over 300,000 learners worldwide through his own courses. He previously hosted DataFramed, the DataCamp podcast, loves teaching Bayesian data analysis, and aspires to reduce as much “computational anxiety” in the world as he can through pedagogy.