Mathematics for Data Science and Machine Learning


The field of machine learning and data science has experienced a resurgence in the last few years. The contributions of machine learning to solving data-driven problems and creating intelligent applications cannot be overemphasized. This field, which intersects statistics, probability, mathematics, computer science and algorithms, can be used to learn iteratively from complex data and find hidden insights. Understanding the mathematics behind machine learning allows us to choose the right algorithms for a problem, make good choices for parameter settings and validation strategies, recognize under- and over-fitting, troubleshoot ambiguous results, and put appropriate confidence bounds on results.

By completing this workshop, you will develop an understanding of some of the most important mathematical concepts in machine learning and data science, and how useful they are in practice. You will familiarize yourself with using multivariate calculus to understand the foundations of feedforward neural networks, the linear algebra concepts behind dimensionality reduction, how maximum likelihood estimation can be used to derive machine learning cost functions, and the building blocks of continuous optimization.

Session Outline
Lesson 1: Multivariate Calculus and Neural Networks
Training a neural network means optimizing its parameters. In this lesson, you will familiarize yourself with using differential calculus to compute gradients of a loss function with respect to the parameters of a neural network. You will understand the building blocks of multivariate calculus: the sum rule, the product rule, the chain rule, the Jacobian and the Hessian.
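As a small illustration of the chain rule at work, the sketch below (all names and values are illustrative, not from the workshop materials) computes the analytic gradient of a one-hidden-unit network's squared loss and checks it against a central finite difference:

```python
import numpy as np

# Tiny network: y_hat = w2 * tanh(w1 * x), loss = (y_hat - y)^2.
# By the chain rule: dL/dw1 = 2*(y_hat - y) * w2 * (1 - tanh(w1*x)^2) * x.

def loss(w1, w2, x, y):
    return (w2 * np.tanh(w1 * x) - y) ** 2

def grad_w1(w1, w2, x, y):
    h = np.tanh(w1 * x)                      # hidden activation
    return 2 * (w2 * h - y) * w2 * (1 - h ** 2) * x

x, y, w1, w2 = 0.5, 1.0, 0.3, -0.7
eps = 1e-6
# Central finite difference approximates the same derivative numerically.
numeric = (loss(w1 + eps, w2, x, y) - loss(w1 - eps, w2, x, y)) / (2 * eps)
print(abs(grad_w1(w1, w2, x, y) - numeric))  # agreement to ~1e-8 or better
```

The same pattern, applied layer by layer, is backpropagation: each parameter's gradient is a product of local derivatives dictated by the chain rule.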

Lesson 2: From Linear Algebra to Dimensionality Reduction
The goal of dimensionality reduction is to replace a large matrix with two or more matrices whose combined size is smaller than the original, but from which the original can be approximately reconstructed, usually by taking their product. In this lesson, we will explore the basic concepts of linear algebra: eigenvalues, eigenvectors and matrix multiplication. You will be able to understand dimensionality reduction techniques like Principal Component Analysis (PCA) and Singular Value Decomposition (SVD).
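A minimal sketch of this idea, using PCA computed via SVD in NumPy (the data and the choice of two components are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))           # 100 samples, 5 features
Xc = X - X.mean(axis=0)                 # center each feature

# SVD factors the centered data: Xc = U @ diag(S) @ Vt.
# Rows of Vt are the principal directions (eigenvectors of the covariance).
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 2                                   # keep the top-k components
Z = Xc @ Vt[:k].T                       # projected data, shape (100, k)
X_approx = Z @ Vt[:k] + X.mean(axis=0)  # approximate reconstruction

# The two small factors Z (100x2) and Vt[:k] (2x5) stand in for X (100x5).
print(Z.shape, X_approx.shape)
```

Here the two smaller matrices, the projected scores and the principal directions, multiply back to an approximation of the original data, which is exactly the compression the lesson describes.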

Lesson 3: Maximum Likelihood Estimation and Gradient-Based Optimization
In supervised machine learning, cost functions measure the performance of a trained model. In this lesson, you will learn how to derive the cost function for regression (mean squared error) and for binary classification (cross entropy) using maximum likelihood estimation. You will also build intuition for gradient-based optimization, its difficulties, its variants, and tips for optimal performance.
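To connect the two halves of this lesson: under a Gaussian noise model, maximizing the likelihood of a linear regression is equivalent to minimizing the mean squared error, and that cost can then be minimized by gradient descent. The sketch below illustrates the second step (the data, learning rate, and iteration count are illustrative assumptions):

```python
import numpy as np

# Synthetic linear-regression data: y = X @ true_w + noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)

# Gradient descent on the MSE cost J(w) = (1/n) * ||X @ w - y||^2.
w = np.zeros(3)
lr = 0.1
for _ in range(500):
    grad = (2 / len(y)) * X.T @ (X @ w - y)  # gradient of MSE w.r.t. w
    w -= lr * grad

print(w)  # recovers approximately true_w
```

The step size here is small enough for convergence on this well-conditioned problem; choosing it poorly, or facing ill-conditioned or non-convex costs, is exactly the kind of difficulty the lesson's variants and tips address.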


Adewale (Wale) Akinfaderin is a Data Scientist at Amazon Web Services. His expertise is in machine learning, deep learning, statistical experimentation and general information theory. He has broad experience implementing and extending ML techniques to solve practical and business problems. In his spare time, he conducts research on Machine Learning for the Developing World.