Introduction to Math for Data Science


With the growing availability of data, there is rising demand for people who can analyze and make sense of it. This makes practical math all the more important, because it is what lets us draw insights from data. However, mathematics comprises many topics, and it is hard to identify which ones are relevant to a data science career. Knowing these essential math topics is key to integrating knowledge across data science, statistics, and machine learning. This has become even more important with the prevalence of libraries like PyTorch and scikit-learn, which can encourage "black box" approaches where data science professionals use these libraries without fully understanding how they work.

In this training, Thomas Nield (author of the O'Reilly book "Essential Math for Data Science") will provide a crash course in carefully curated topics to jumpstart proficiency in key areas of mathematics. These include probability, statistics, hypothesis testing, and linear algebra. Along the way you’ll integrate what you’ve learned and see practical applications to real-world problems. Examples include how statistical concepts apply to machine learning, and how linear algebra is used to fit a linear regression. We will also use Python to explore ideas in calculus and model fitting, using a combination of libraries and from-scratch approaches.

Session Outline:

Lesson 1: Calculus and Probability Basics

Become acquainted with fundamental calculus concepts like derivatives and integrals, and how to calculate them from scratch in Python as well as using the SymPy library. Afterwards, we will use this knowledge to build a normal distribution and interpret its PDF, CDF, and PPF.
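
As a rough illustration of the from-scratch side of this lesson, the sketch below approximates a derivative and an integral numerically, then queries a standard normal distribution for its PDF, CDF, and PPF. It is not taken from the course materials: the helper names `derivative` and `integral` are hypothetical, and it uses only the Python standard library (`statistics.NormalDist`, where the PPF is called `inv_cdf`). SymPy's `diff` and `integrate`, and SciPy's `scipy.stats.norm` with its `pdf`, `cdf`, and `ppf` methods, would be the library counterparts.

```python
from statistics import NormalDist


def derivative(f, x, h=1e-6):
    """Approximate f'(x) with a central difference (hypothetical helper)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def integral(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with midpoint rectangles."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


def square(x):
    return x ** 2


slope_at_3 = derivative(square, 3.0)      # exact answer is 2 * 3 = 6
area_0_to_1 = integral(square, 0.0, 1.0)  # exact answer is 1 / 3

# Standard normal distribution (mean 0, standard deviation 1).
z = NormalDist(mu=0.0, sigma=1.0)
density_at_0 = z.pdf(0.0)    # PDF: height of the bell curve at x = 0
prob_below_1 = z.cdf(1.0)    # CDF: P(X <= 1), the area under the PDF up to 1
x_at_95 = z.inv_cdf(0.95)    # PPF: the x value whose CDF equals 0.95

# The CDF is the integral of the PDF, so integrating the PDF from far in
# the left tail up to 1 should closely match z.cdf(1.0).
cdf_from_scratch = integral(z.pdf, -8.0, 1.0)

print(slope_at_3, area_0_to_1)
print(density_at_0, prob_below_1, x_at_95, cdf_from_scratch)
```

The last comparison is the link between the two halves of the lesson: the CDF is the accumulated area under the PDF, and the PPF runs that lookup in reverse.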

Lesson 2: Statistics and Hypothesis Testing

Building on probability and calculus, this section turns to hypothesis testing and the relationship between a sample and a population. We will also see how samples and populations figure in data collection, and what role hypothesis testing plays in machine learning.
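
One way to picture the sample-versus-population question is a two-tailed, one-sample z-test, sketched below with the standard library only. Every number here is invented for illustration, and the population standard deviation is assumed to be known, which is rarely true in practice. This is not the course's own example; SciPy's `scipy.stats` module provides ready-made tests for real work.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical numbers, made up for this sketch.
population_mean = 100.0  # mean claimed by the null hypothesis
population_std = 15.0    # population standard deviation, assumed known
sample_mean = 104.0      # mean observed in our sample
sample_size = 50

# The standard error measures how much a sample mean is expected to
# vary from sample to sample.
standard_error = population_std / sqrt(sample_size)

# How many standard errors the sample mean sits from the claimed mean.
z_score = (sample_mean - population_mean) / standard_error

# Two-tailed p-value: probability of a sample mean at least this far
# from the claimed mean, in either direction, if the null were true.
p_value = 2 * (1 - NormalDist().cdf(abs(z_score)))

alpha = 0.05  # significance threshold, chosen before looking at the data
reject_null = p_value < alpha

print(z_score, p_value, reject_null)
```

With these made-up numbers the p-value lands just above 0.05, so the null hypothesis is not rejected at that threshold, even though the sample mean is four points higher than the claimed mean.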

Lesson 3: Linear Algebra and Basic Machine Learning

Finally, we turn our attention to some basic linear algebra concepts, like linear transformations, and how they play a role in machine learning. We will use the most basic machine learning model, linear regression, to learn about matrix decomposition and gradient descent, and see how these ideas extend into more advanced topics like neural networks and deep learning.
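
To make the model-fitting idea concrete, here is a minimal from-scratch sketch that fits a one-variable linear regression by gradient descent and checks the result against the closed-form least-squares formulas. The data points, learning rate, and iteration count are assumptions chosen for illustration, not values from the course. The closed form used here is the simple-regression shortcut; it stands in for the matrix-decomposition approach, which would typically rely on a library such as NumPy.

```python
# Hypothetical data, invented for this sketch (roughly y = 2x).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(xs)

# Closed-form least squares for a single input variable.
x_mean = sum(xs) / n
y_mean = sum(ys) / n
closed_slope = (
    sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    / sum((x - x_mean) ** 2 for x in xs)
)
closed_intercept = y_mean - closed_slope * x_mean

# Gradient descent on mean squared error, starting from a flat line.
slope, intercept = 0.0, 0.0
learning_rate = 0.05  # assumed step size, small enough for this data

for _ in range(5000):
    # Prediction error for each point under the current line.
    errors = [(slope * x + intercept) - y for x, y in zip(xs, ys)]
    # Partial derivatives of mean squared error.
    grad_slope = (2 / n) * sum(e * x for e, x in zip(errors, xs))
    grad_intercept = (2 / n) * sum(errors)
    # Step downhill, against the gradient.
    slope -= learning_rate * grad_slope
    intercept -= learning_rate * grad_intercept

print(closed_slope, closed_intercept)
print(slope, intercept)
```

Both approaches should arrive at essentially the same line. The gradient descent loop is the part that carries over to neural networks, where no closed-form solution is available and the gradients are computed for many more parameters.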

After this session, attendees will understand a variety of from-scratch and library approaches to exploring mathematical concepts in data science. They will be able to compute derivatives and integrals using SymPy, and work with probability distributions and hypothesis tests using SciPy. Most importantly, they will integrate probability, statistics, and linear algebra concepts and understand their relevance to model fitting and machine learning.

Background Knowledge:

Some experience with Python recommended


Thomas Nield is the founder of Nield Consulting Group and Yawman Flight, as well as an instructor at the University of Southern California. He enjoys making technical content relatable and relevant to those unfamiliar with or intimidated by it. Thomas regularly teaches classes on data analysis, machine learning, mathematical optimization, and practical artificial intelligence. At USC he teaches AI System Safety, developing systematic approaches for identifying AI-related hazards in aviation and ground vehicles. He has authored three books, including Essential Math for Data Science (O’Reilly) and Getting Started with SQL (O'Reilly).

He is also the founder and inventor of Yawman Flight, a company developing universal handheld flight controls for flight simulation and unmanned aerial vehicles.

Open Data Science




One Broadway
Cambridge, MA 02142
