Hands-on Training Sessions

– Taught by World-Class Data Scientists –

Learn the latest data science concepts, tools, and techniques from the best. Forge a connection with these rock stars from industry and academia, who are passionate about molding the next generation of data scientists.

Highly Experienced Instructors

Our instructors are highly regarded in data science, coming from both academia and notable companies.

Real World Applications

Gain the skills and knowledge to use data science in your career and business, without breaking the bank.

Cutting Edge Subject Matter

Find training sessions offered on a wide variety of data science topics from machine learning to data visualization.

WHY ODSC?

Form a working relationship with some of the world’s top data scientists for follow-up questions and advice.

Access to 50+ talks and workshops.

Recordings of workshop and talk sessions for later review.

Equivalent training at other conferences costs much more.

Professionally prepared learning materials, custom tailored to each course.

Opportunities to connect with other ambitious, like-minded data scientists.

Current Data Science Training Instructors


East Trainings
30th March, Tuesday
31st March, Wednesday
1st April, Thursday
East Workshops & Tutorials
30th March, Tuesday
31st March, Wednesday
1st April, Thursday
10:00 - 13:00
Session Title by Leonardo De Marchi Coming Soon!

Half-Day Training

Leonardo De Marchi
Head of Data Science and Analytics | Badoo (now MagicLab, which owns several apps)
10:00 - 13:00
Probabilistic Programming and Bayesian Inference with Python

Half-Day Training | Machine Learning | ML for Programmers | Intermediate

 

If you can write a model in sklearn, you can make the leap to Bayesian inference with PyMC3, a user-friendly intro to probabilistic programming (PP) in Python. PP just means building models where the building blocks are probability distributions! And we can use PP to do Bayesian inference easily. Bayesian inference allows us to solve problems that aren’t otherwise tractable with classical methods...

Lara Kattan
Data Science Manager | EY
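
To make the idea in the session description above concrete, here is a minimal sketch of Bayesian inference built from probability distributions. It deliberately does not use PyMC3: it is plain Python that approximates a posterior on a grid, and the coin-flip scenario, the uniform prior, and the function names grid_posterior and posterior_mean are all assumptions made for illustration.

```python
# Toy illustration only (not PyMC3): infer a coin's bias from observed flips
# by evaluating prior x likelihood on a grid of candidate bias values.

def grid_posterior(heads, tails, grid_size=101):
    """Return (grid, posterior) for P(heads), assuming a uniform prior."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0] * grid_size  # assumed: every bias is equally plausible a priori
    # Binomial likelihood up to a constant factor, which cancels on normalization.
    likelihood = [p ** heads * (1 - p) ** tails for p in grid]
    unnormalized = [pr * lk for pr, lk in zip(prior, likelihood)]
    total = sum(unnormalized)
    posterior = [u / total for u in unnormalized]
    return grid, posterior


def posterior_mean(grid, posterior):
    """Expected value of the bias under the posterior."""
    return sum(p * w for p, w in zip(grid, posterior))


grid, posterior = grid_posterior(heads=7, tails=3)
print(round(posterior_mean(grid, posterior), 3))
```

With a uniform prior the exact posterior is a Beta distribution, so the grid result can be checked against its known mean of (heads + 1) / (heads + tails + 2). A probabilistic programming library replaces the grid with sampling so that the same approach scales to models with many parameters.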
10:00 - 16:30
Modern Machine Learning in R Part I

Full-Day Training | Machine Learning | ML for Programmers | Intermediate

 

With hundreds of machine learning models available in R, having one unified interface makes life so much easier. The author of the popular {caret} package has reinvented it for the modern age in the form of {tidymodels}. We’ll use the suite of packages to specify models, perform feature engineering, tune over hyperparameters and make predictions...

Jared Lander
Chief Data Scientist, Author of R for Everyone, Professor | Lander Analytics, Columbia Business School
10:00 - 16:30
Modern Machine Learning in R Part II

Full-Day Training | Machine Learning | ML for Programmers | Intermediate

 

This second day of modern machine learning has us fitting forecasting models on time series data using the new {fable} package. Then we’ll take our fitted models and turn them into an API using {plumber} and expose them in Docker containers so they are ready for production…

Jared Lander
Chief Data Scientist, Author of R for Everyone, Professor | Lander Analytics, Columbia Business School
10:00 - 13:00
Intermediate Machine Learning with Scikit-learn: Evaluation, Calibration, and Inspection

Half-Day Training | Machine Learning | Intermediate

 

Scikit-learn is a machine learning library in Python that is used by many data science practitioners. In this training, we will learn about model evaluation, model calibration, and model inspection. We will start by learning about evaluating a machine learning model after it is trained. We will compare various metrics such as ROC AUC and mean average precision and see how they behave on datasets with different characteristics. We will use scikit-learn’s plotting API to easily visualize the performance of a model and to compare multiple models. Next, we will learn about how to calibrate a machine learning model with scikit-learn. A well-calibrated model will predict probabilities that reflect the true likelihood of an event...

Thomas Fan
Staff Associate - Machine Learning | Columbia University in the City of New York
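
As a rough illustration of one metric named in the session description above, the sketch below computes ROC AUC by hand in plain Python. It is not scikit-learn's implementation; it uses the pairwise interpretation of the metric (the chance that a randomly chosen positive example is scored above a randomly chosen negative one), and the function name roc_auc and the sample data are assumptions for illustration.

```python
def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, scores))  # 0.75: three of the four pairs are ordered correctly
```

This quadratic-time version is only meant to show what the number means. Because it depends on ranking alone, a model can have a high ROC AUC and still be poorly calibrated, which is why the session treats calibration separately.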
10:00 - 13:00
Introduction to Scikit-learn: Machine Learning in Python

Half-Day Training | Machine Learning | Beginner

 

Scikit-learn is a machine learning library in Python that is used by many data science practitioners. Machine learning is a valuable tool used across many domains such as medicine, physics, and finance. We will start this training by learning about scikit-learn’s API for supervised machine learning. Scikit-learn’s API mainly consists of three methods: fit, to build models, predict, to make predictions from models, and transform, to change the representation of the input data...

Thomas Fan
Staff Associate - Machine Learning | Columbia University in the City of New York
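
The fit, predict, and transform convention described above can be sketched without the library itself. The two classes below are hypothetical toy stand-ins written in plain Python, not scikit-learn classes; they only mimic the calling pattern, and their names and the sample data are assumptions for illustration.

```python
class MeanCenterer:
    """Hypothetical transformer: fit learns column means, transform subtracts them."""

    def fit(self, X, y=None):
        n_rows = len(X)
        self.means_ = [sum(column) / n_rows for column in zip(*X)]
        return self

    def transform(self, X):
        return [[value - mean for value, mean in zip(row, self.means_)] for row in X]


class NearestCentroidClassifier:
    """Hypothetical model: fit stores one centroid per label, predict picks the closest."""

    def fit(self, X, y):
        rows_by_label = {}
        for row, label in zip(X, y):
            rows_by_label.setdefault(label, []).append(row)
        self.centroids_ = {
            label: [sum(column) / len(rows) for column in zip(*rows)]
            for label, rows in rows_by_label.items()
        }
        return self

    def predict(self, X):
        def squared_distance(a, b):
            return sum((u - v) ** 2 for u, v in zip(a, b))

        return [
            min(self.centroids_, key=lambda label: squared_distance(row, self.centroids_[label]))
            for row in X
        ]


X_train = [[1.0, 2.0], [1.5, 1.8], [8.0, 8.0], [9.0, 9.5]]
y_train = ["a", "a", "b", "b"]

centerer = MeanCenterer().fit(X_train)                       # learn from training data
model = NearestCentroidClassifier().fit(centerer.transform(X_train), y_train)
print(model.predict(centerer.transform([[1.2, 1.9], [8.5, 9.0]])))  # ['a', 'b']
```

The point of the pattern is that anything learned from data is learned in fit, so the same fitted objects can be reused unchanged on new data.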
10:00 - 13:00
Atypical Applications of Typical Machine Learning Algorithms

Half-Day Training | Machine Learning | Intermediate-Advanced

 

How could a violation of the triangle inequality theorem in mathematics lead to a cure for cancer? How can a mathematical concept from the 18th century be used to estimate the mass density of galaxies across the Universe? How could a marketing segmentation algorithm protect astronauts traveling to Mars from certain death? How does an F1 race from the 1950s inspire one of the greatest machine learning use cases for the Internet of Things? This workshop will answer these questions, and more, by presenting several examples of typical algorithms that were adopted for specific use cases or application domains, then showing how each one can be adapted to an atypical (often mind-bending) use case, producing significantly surprising results in some other domain…

Dr. Kirk Borne
Principal Data Scientist | Booz Allen Hamilton
10:00 - 13:00
Network Analysis Made Simple

Half-Day Training | Data Visualization | Machine Learning | Beginner-Intermediate

 

Upon completing this tutorial, you will be:
– familiar with how to use the NetworkX and nxviz Python packages for modelling and rationally visualizing networks,
– able to load node and edge data from a Pandas dataframe,
– familiar with object-oriented and matrix-oriented representations of graphs,
– able to find paths between nodes, interesting structures in graphs, and projections of bipartite graphs,
– (if time permits) able to use matrix operations to simulate diffusion of information on networks...

Eric Ma, PhD
Author of nxviz Package
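
One of the learning goals listed above, finding paths between nodes, can be sketched in a few lines of plain Python. This is not the NetworkX API; it is a breadth-first search over an adjacency dictionary, and the function names build_graph and shortest_path and the sample edges are assumptions for illustration.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency mapping from (node, node) pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def shortest_path(graph, source, target):
    """Return a shortest path (fewest edges) as a list of nodes, or None."""
    if source not in graph or target not in graph:
        return None
    previous = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "e"), ("e", "d")]
graph = build_graph(edges)
print(shortest_path(graph, "a", "d"))  # ['a', 'e', 'd']
```

The dictionary of neighbor sets is a small example of the object-oriented view of a graph mentioned in the list; the matrix-oriented view would store the same edges as an adjacency matrix instead.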
10:00 - 16:30
Programming with Data: Python and Pandas

Full-Day Training | Kick-starter | Intermediate

 

In this training, you will learn how to accelerate your data analyses using the Python language and Pandas, a library specifically designed for tabular data analysis. We start by learning the core Pandas data structures, the Series and DataFrame. From these foundations, we will learn to use the split-apply-combine paradigm for grouped computations, manipulate time series, and perform advanced joins between datasets. Specifically, we will cover loading, filtering, grouping, and transforming data. Having completed this workshop, you will understand the fundamentals and advanced features of Pandas, be aware of common pitfalls, and be ready to perform your own analyses…

 

Daniel Gerlanc
President | Enplus Advisors Inc.
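
The split-apply-combine paradigm mentioned in the description above can be shown in plain Python before reaching for Pandas. The sketch below is an illustration only; the function name group_mean and the sample records are assumptions. In Pandas, the same computation is typically written as df.groupby("city")["temp"].mean().

```python
from collections import defaultdict


def group_mean(rows, key, value):
    """Group rows by `key`, then average the `value` field within each group."""
    groups = defaultdict(list)
    for row in rows:  # split: route each row to its group
        groups[row[key]].append(row[value])
    # apply (mean per group) and combine (collect results into one mapping)
    return {name: sum(values) / len(values) for name, values in groups.items()}


rows = [
    {"city": "Boston", "temp": 10.0},
    {"city": "Boston", "temp": 14.0},
    {"city": "Austin", "temp": 30.0},
]
print(group_mean(rows, key="city", value="temp"))  # {'Boston': 12.0, 'Austin': 30.0}
```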
10:00 - 13:00
Solving the Data Scientist’s Cold-Start Problem with Machine Learning Examples

Half-Day Training  | Machine Learning | Intermediate-Advanced

 

Unsupervised learning models (including analysis of correlations, clusters, and associations in data) converge more readily to a useful solution if we start with good model parameterizations. Feature engineering is key, but selection of features often becomes guesswork. Similarly, in supervised machine learning, the choice of features in labeled data to use in training may still seem arbitrary. So, how does model-building start and move towards an optimal solution? This challenge is known as the cold-start problem! The solution to the problem is easy (sort of): We start with a guess, a totally random guess! That sounds so random and so wrong! But there is an orderly and productive way forward from such a start, which we will describe in this workshop…

Dr. Kirk Borne
Principal Data Scientist | Booz Allen Hamilton
13:30 - 16:30
Painting with Data: Introduction to d3.js

Half-Day Training | Data Visualization | Intermediate-Advanced

 

In this workshop we will build an interactive data visualization from scratch using d3.js in the browser. The possibilities shown in d3 examples are exciting, but the API surface of d3 and the various browser standards like HTML, CSS, SVG and JavaScript can be overwhelming. Think of this workshop as a guided tour that will point out the important things to pay attention to as we go step-by-step from CSV file to interactive visualization...

Ian Johnson
Data Visualization Developer | Observable
13:30 - 16:30
Advanced Machine Learning with Scikit-learn: Text Data, Imbalanced Data, and Poisson Regression

Half-Day Training | Machine Learning | Intermediate

 

Scikit-learn is a machine learning library in Python that is used by many data science practitioners. In this training, we will learn about processing text data, working with imbalanced data and Poisson regression. We will start by learning about processing text data with scikit-learn’s CountVectorizer and TfidfVectorizer. The CountVectorizer converts a collection of text documents into a matrix of token counts. We will explore the hyper-parameters that the CountVectorizer provides for creating these token counts. The TfidfVectorizer weights the count features into floating point values using the term frequency and inverse document-frequency...

Thomas Fan
Staff Associate - Machine Learning | Columbia University in the City of New York
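
To illustrate what the description above means by a matrix of token counts and by weighting counts with inverse document frequency, here is a plain-Python sketch. It is not scikit-learn's CountVectorizer or TfidfVectorizer: the whitespace tokenizer, the textbook weighting count * log(N / df), and the function names are assumptions for illustration, and the library's own tokenization, smoothing, and normalization differ.

```python
import math
from collections import Counter


def tokenize(text):
    """Assumed tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def count_matrix(docs):
    """Return (vocabulary, matrix) with one row per document, one column per token."""
    tokenized = [tokenize(doc) for doc in docs]
    vocabulary = sorted({token for doc in tokenized for token in doc})
    counts = [Counter(doc) for doc in tokenized]
    matrix = [[count[token] for token in vocabulary] for count in counts]
    return vocabulary, matrix


def tfidf_matrix(matrix):
    """Weight raw counts by log(N / document frequency), the textbook variant."""
    n_docs = len(matrix)
    n_terms = len(matrix[0])
    doc_freq = [sum(1 for row in matrix if row[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / df) for df in doc_freq]
    return [[row[j] * idf[j] for j in range(n_terms)] for row in matrix]


docs = ["the cat sat", "the dog sat down", "the cat"]
vocabulary, counts = count_matrix(docs)
print(vocabulary)  # ['cat', 'dog', 'down', 'sat', 'the']
print(counts)      # [[1, 0, 0, 1, 1], [0, 1, 1, 1, 1], [1, 0, 0, 0, 1]]
weights = tfidf_matrix(counts)
```

In this variant a token that appears in every document, such as "the" here, gets a weight of zero, while rarer tokens are weighted up.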
13:30 - 16:30
Intermediate Machine Learning with Scikit-learn: Cross-validation, Parameter Tuning, Pandas Interoperability, and Missing Values

Half-Day Training | Machine Learning | Intermediate

 

Scikit-learn is a machine learning library in Python that is used by many data science practitioners. In this training, we will learn about cross validation, tuning machine learning algorithms and pandas interoperability. We will start by learning about cross validation for machine learning. Cross validation enables us to evaluate our machine learning models by splitting our data into training and testing datasets. We will cover cross validation schemes such as K-Fold cross-validation and the importance of stratifying your data. Next, we will learn about tuning algorithms in scikit-learn with grid search and random search. These hyper-parameter searching techniques help find hyper-parameter combinations that are suited for your dataset...

Thomas Fan
Staff Associate - Machine Learning | Columbia University in the City of New York
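
The K-Fold scheme mentioned in the description above comes down to index bookkeeping, which the plain-Python sketch below makes explicit. It is an illustration rather than scikit-learn's implementation: it does no shuffling or stratification, and the function name k_fold_indices is an assumption.

```python
def k_fold_indices(n_samples, n_splits):
    """Yield (train_indices, test_indices) for each of n_splits contiguous folds."""
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [
        n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
        for i in range(n_splits)
    ]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


for train, test in k_fold_indices(n_samples=5, n_splits=2):
    print(train, test)
# [3, 4] [0, 1, 2]
# [0, 1, 2] [3, 4]
```

Each sample lands in exactly one test fold, so every observation is used for evaluation once and for training in all other rounds. Stratification, which the session also covers, additionally keeps the class proportions similar across folds.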
13:30 - 16:30
Applied Deep Learning: Building a Chess Object Detection Model with TensorFlow

Half-Day Training | Deep Learning | Intermediate

 

In this tutorial, we will introduce how to build an object detection model. Specifically, we will build an object detection model that identifies chess pieces (a custom dataset provided by the presenter). In doing so, participants will gain insight into the fundamentals of computer vision: structuring a good problem for object detection, dataset collection and annotation, data preparation through preprocessing, data augmentation to support a well-fit model, training a model, debugging a model’s fit, and using the model for inference.

We will introduce Keras and TensorFlow as our specific libraries for writing computer vision models and train a YOLOv3 (You Only Look Once) model for real-time detection. Participants will leverage Colab for GPU compute...

Joseph Nelson
Cofounder, Principal Data Scientist & Faculty | Roboflow.ai, BetaVector, General Assembly
13:30 - 16:30
Good, Fast, Cheap: How to do Data Science with Missing Data

Half-Day Training | Machine Learning | Beginner

 

If you’ve never heard of the “good, fast, cheap” dilemma, it goes something like this: You can have something good and fast, but it won’t be cheap. You can have something good and cheap, but it won’t be fast. You can have something fast and cheap, but it won’t be good. In short, you can pick two of the three but you can’t have all three. If you’ve tackled a data science problem before, I can all but guarantee that you’ve run into missing data. How do we handle it? Well, we can avoid, ignore, or try to account for missing data. The problem is, none of these strategies are good, fast, *and* cheap...

Matt Brems
Global Lead Data Science Instructor | General Assembly
10:15 - 11:45
Session Title by Julie Josse, PhD Coming Soon!

Tutorial

Julie Josse, PhD
Advanced Researcher | Inria
10:15 - 11:45
Session Title by Freddy Lecue, PhD Coming Soon!

Workshop

Freddy Lecue, PhD
Chief AI Scientist at CortAIx | Thales
10:15 - 11:45
Session Title by Laura A. Seaman, PhD Coming Soon!

Tutorial

Laura A. Seaman, PhD
Machine Intelligence Scientist | Draper
10:15 - 11:45
Session Title by Dr. Jon Krohn Coming Soon!

Tutorial

Dr. Jon Krohn
Chief Data Scientist, Author of Deep Learning Illustrated | Untapt
10:15 - 11:45
Exploring the Interconnected World: Network/Graph Analysis in Python

Tutorial | Data Visualization | Beginner-Intermediate

 

Networks, also known as graphs, are one of the most crucial data structures in our increasingly intertwined world. Social friendship networks, the world-wide-web, financial systems, infrastructure (power grid, streets), etc. are all network structures. Knowing how to analyze the underlying network topology of interconnected systems is an invaluable skill in anyone’s toolbox. This tutorial will provide a hands-on guide on how to approach a network analysis project from scratch and end-to-end: how to generate, manipulate, analyze and visualize graph structures that will help you gain insight about relationships between elements in your data...

Noemi Derzsy, PhD
Senior Inventive Scientist | AT&T Chief Data Office
11:55 - 13:25
Session Title by Christopher Kanan, PhD Coming Soon!

Tutorial | Deep Learning

Christopher Kanan, PhD
Professor, Lab Director, Sr. AI Scientist | Cornell Tech, Rochester Institute of Technology, Paige
11:55 - 13:25
Session Title by Jayeeta Putatunda Coming Soon!

Workshop

Jayeeta Putatunda
Data Scientist | Indellient
11:55 - 13:25
Session Title by Paige Roberts Coming Soon!

Tutorial

Paige Roberts
Open Source Relations Manager | Vertica
13:40 - 15:10
Session Title by Shagun Sodhani Coming Soon!

Tutorial

Shagun Sodhani
Research Engineer | Facebook AI Research Group
13:40 - 15:10
Accelerating The Journey To Document Understanding AI

Tutorial | NLP | Beginner-Intermediate

 

Companies are sitting on a goldmine. You have all these documents – docs, emails, PDFs, forms, images… Some of these you need to understand as a requirement of doing business but others could give you valuable insight into your business and customers to help you make better decisions. But instead most documents are just sitting there, untapped, because it’s difficult to read, compare and understand the relationship between them.

Document AI turns unstructured documents into structured data with the power to read, understand and make it useful. Under the hood, Vision and NLP technology give you the capability to unlock the value of documents and offer you a competitive advantage to re-invest & repurpose the resources for your organization…

Elliott Ning
Cloud Architect | Google
13:40 - 15:10
Session Title by Sage Elliott Coming Soon!

Workshop

Sage Elliott
Developer Evangelist, Machine Learning | Sixgill, LLC
13:40 - 15:10
Echo State Networks for Time-Series Data

Tutorial | Machine Learning | Beginner-Intermediate

 

In this session, participants will be introduced to Echo State Networks (a type of recurrent neural network) including theory, key parameters in implementation and practical considerations. Participants will have the opportunity to use a publicly available Echo State Network implementation on open data. Additional results will be shown based on a highly customized implementation.
Participants will come away with a basic understanding of Echo State Networks, how to use the EchoTorch Python module, the impact key parameters have on algorithm performance and potential application areas…

Teal Guidici, PhD
Machine Intelligence Scientist | Draper
15:20 - 16:50
ML Inference on Mobile Device with ONNX Runtime

Tutorial | Machine Learning | Intermediate

 

AI model inference on the phone is important for delivering real-time experiences that require local execution on the device. But there are many constraints on building a scalable solution that addresses all the different hardware specs and platform requirements across Android and iOS. Devices have limited storage and memory, and the platform app stores impose restrictions on the size of the app package.
ONNX Runtime Mobile is a new feature that addresses developers’ need to build solutions for mobile devices. You can build a reduced-size binary package to integrate with the phone application and run inference on your ONNX models locally on devices.
ONNX quantization techniques can be used to reduce the model size by converting FP32 weights to INT8. Improved performance is achieved with INT8 execution on ARM processors on the mobile phone...

Manash Goswami
Principal Program Manager | Microsoft
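
The FP32-to-INT8 conversion mentioned in the description above can be sketched in plain Python. This is not the ONNX Runtime quantization API; it is a minimal symmetric scheme with a single scale for the whole list, and the function names quantize_int8 and dequantize and the sample weights are assumptions for illustration. Production toolkits typically add details such as zero points, per-channel scales, and calibration data.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0  # avoid dividing by zero
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


weights = [-1.0, -0.31, 0.0, 0.2, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Each restored weight is within half a quantization step of the original.
worst_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, worst_error <= scale / 2 + 1e-12)
```

Storing one byte per weight instead of four is where the size reduction comes from; the price is the small rounding error measured above.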
15:20 - 16:50
Scaling Machine Learning with Dask

Tutorial | MLOps & Data Engineering | Intermediate

 

In this talk, attendees will get an introduction to Dask, a distributed computing framework in the PyData ecosystem.

The first half of the talk will describe the current state of the project and its ecosystem including distributed data collections, cloud deployment options, distributed machine learning projects, and workflow orchestration.

The second half of the talk will be a live demo showing the programming model for machine learning on Dask. Dask’s potential for speeding up machine learning workflows will be demonstrated with an intermediate-level tutorial on training XGBoost and LightGBM models with Dask...

James Lamb
Software Engineer | Saturn Cloud

Save 40% Off Limited Offer Ends Soon

SAVE NOW

Confirmed Trainings

Half-Day Training

Focus Area: Machine Learning   Session: Half-Day Training

Solving the Data Scientist’s Cold-Start Problem with Machine Learning Examples

Instructor: Dr. Kirk Borne, Principal Data Scientist | Booz Allen Hamilton


Full-Day Training

Focus Area: Machine Learning   Session: Full-Day Training

Modern Machine Learning in R

Instructor: Jared Lander, Chief Data Scientist, Author of R for Everyone, Professor | Lander Analytics, Columbia Business School


Half-Day Training

Focus Area: Machine Learning   Session: Half-Day Training

Good, Fast, Cheap: How to do Data Science with Missing Data

Instructor: Matt Brems, Global Lead Data Science Instructor | General Assembly


Half-Day Training

Focus Area: Machine Learning   Session: Half-Day Training 

Introduction to Scikit-learn: Machine Learning in Python

Instructor: Thomas Fan, Staff Associate – Machine Learning | Columbia University in the City of New York


Half-Day Training

Focus Area: Machine Learning   Session: Half-Day Training

Probabilistic Programming and Bayesian Inference with Python  

Instructor: Lara Kattan, Risk Management Specialist | Federal Reserve Bank of Chicago


Half-Day Training

Focus Area: Machine Learning   Session: Half-Day Training

Atypical Applications of Typical Machine Learning Algorithms

Instructor: Dr. Kirk Borne, Principal Data Scientist | Booz Allen Hamilton


Half-Day Training

Focus Area: Machine Learning   Session: Half-Day Training

Intermediate Machine Learning with Scikit-learn: Cross-validation, Parameter Tuning, Pandas Interoperability, and Missing Values

Instructor: Thomas Fan, Staff Associate – Machine Learning | Columbia University in the City of New York


Half-Day Training

Focus Area: Data Visualization   Session: Half-Day Training

Painting with Data: Introduction to d3.js

Instructor: Ian Johnson, Data Visualization Developer | Observable


Half-Day Training

Focus Area: Data Visualization   Session: Half-Day Training

Network Analysis Made Simple

Instructor: Eric Ma, PhD, Author of nxviz Package


Full-Day Training

Focus Area: Data Science Kickstart   Session: Full-Day Training

Programming with Data: Python and Pandas

Instructor: Daniel Gerlanc, President | Enplus Advisors, Inc.


Half-Day Training

Focus Area: Deep Learning   Session: Half-Day Training

Applied Deep Learning: Building a Chess Object Detection Model with TensorFlow

Instructor: Joseph Nelson, Co-Founder, Principal Data Scientist & Faculty | Roboflow.ai, BetaVector, General Assembly

