ODSC EAST 2023

Conference Preliminary Schedule

more sessions added weekly

For East 2022, please refer to the event app for the in-person conference and to our virtual platform for the most current virtual schedule.


East 2023 Schedule

We are delighted to announce our East 2023 Preliminary Schedule!
Note: In-Person sessions will not be recorded. If you have a virtual Pass, please note that we will not live-stream any in-person sessions. Only virtual sessions will be recorded.

 

East Bootcamp Sessions – EST
Monday, 8th May
East Trainings/Workshops/Tutorials – EST
Tuesday, 9th May
Wednesday, 10th May
Thursday, 11th May
East Talks – EST
Tuesday, 9th May
Wednesday, 10th May
Thursday, 11th May
Programming with Data: Python and Pandas

Virtual | Bootcamp | Machine Learning | MLOps | Intermediate

 

 

In this training, you will learn how to accelerate your data analyses using the Python language and Pandas, a library specifically designed for tabular data analysis. We start by learning the core Pandas data structures, the Series and DataFrame. From these foundations, we will learn to use the split-apply-combine paradigm for grouped computations, manipulate time series, and perform advanced joins between datasets. Specifically, we cover loading, filtering, grouping, and transforming data. Having completed this workshop, you will understand the fundamentals and advanced features of Pandas, be aware of common pitfalls, and be ready to perform your own analyses…more details

Daniel Gerlanc
Sr. Director - Data Science & ML Engineering | Ampersand
An Introduction to Data Wrangling with SQL

Virtual | Bootcamp | Machine Learning | Beginner

 

 

Data wrangling is an essential foundational topic for anyone considering a role in data engineering, data science, or machine learning. This session will help you understand core data wrangling concepts including what data is, data generation and collection, data cleaning, profiling, transformation, and other essential data wrangling topics. As this is an interactive training session, in addition to covering these topics, we will layer on hands-on SQL training and an introduction to relational databases. As we journey through the data workflow we will use SQL to wrangle and transform the data as needed. SQL consistently makes the top 5 job requirements list for data scientists, data analysts, machine learning engineers, and other related data roles. The SQL standard is the universal go-to tool for manipulating structured data stores including relational databases…more details

Sheamus McGovern
Founder and CEO | ODSC
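
As a small illustration of the kind of SQL wrangling the session above describes (profiling, cleaning, and transforming data), here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table, its columns, and the sample rows are hypothetical and invented for this example; they are not taken from the session materials.

```python
import sqlite3

# In-memory database; the table and rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, " alice ", 20.0),
        (2, "Bob", None),  # missing amount
        (3, "ALICE", 5.5),
        (4, "bob", 10.0),
    ],
)

# Profiling: count all rows and the rows with a missing amount.
total_rows, missing_amounts = conn.execute(
    "SELECT COUNT(*), SUM(amount IS NULL) FROM orders"
).fetchone()

# Cleaning and transformation: normalise customer names, drop rows with
# missing amounts, then aggregate per customer.
summary = conn.execute(
    """
    SELECT LOWER(TRIM(customer)) AS customer,
           COUNT(*)              AS n_orders,
           SUM(amount)           AS total
    FROM orders
    WHERE amount IS NOT NULL
    GROUP BY LOWER(TRIM(customer))
    ORDER BY customer
    """
).fetchall()

print(total_rows, missing_amounts)
print(summary)
conn.close()
```

Normalising inside the query rather than rewriting the stored rows keeps the raw data untouched, which is often convenient while a dataset is still being explored.
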
Hyper-productive NLP with Hugging Face Transformers

In-person | Workshop | NLP | Machine Learning | Beginner-Intermediate

 

 

In this workshop, you’ll walk through a complete end-to-end example of using Hugging Face Transformers, involving both our open-source libraries and some of our commercial products. Starting from a dataset containing real-life product reviews from Amazon.com, you’ll train and deploy a text classification model predicting the star rating for similar reviews…more details

Julien Simon
Chief Evangelist | Hugging Face
Introduction to AutoML: Hyperparameter Optimization and Neural Architecture Search

In-person | Tutorial | Machine Learning | Intermediate

 

 

This tutorial is targeted towards data scientists and machine learning engineers who work on machine learning and deep learning models.
Given a task, one is interested in finding a well-performing model to solve that task. Very often, this would involve tweaking the model either by changing the hyperparameters or modifying its architecture in order to find a better performing model. In the past, this was always done manually. But, with the advent of Automated Machine Learning, we can now leave that to the machines. In this tutorial, we will provide an overview of Hyperparameter Optimization (HPO) and Neural Architecture Search (NAS)…more details

Tejaswini Pedapati
Research Engineer | IBM TJ Watson
The Data Cards Playbook: A Toolkit for Transparency in Dataset Documentation

Virtual | Tutorial | Responsible AI | Machine Learning | All Levels

 

 

Data Cards are transparency artifacts that provide structured summaries of ML datasets with explanations of processes and rationale that shape the data. They also describe how the data may be used to train or evaluate ML models. In practice, two critical factors determine the success of a transparency artifact: (1) the ability to identify the information decision-makers use and (2) the establishment of processes and guidance needed to acquire that information. To initiate practice-oriented foundations in transparency that support responsible AI development in cross-functional groups and organizations, we created the Data Cards Playbook — an open-sourced, self-service, comprehensive toolkit consisting of participatory activities, frameworks, and guidance designed to address specific challenges faced by teams, product areas and companies when setting up an AI dataset transparency effort…more details

Andrew Zaldivar, PhD
Senior Developer Relations Engineer | Google Research
NLP Fundamentals

Virtual | Full-Day Training | NLP | Beginner

 

In this course we will go through Natural Language Processing fundamentals, such as pre-processing techniques, tf-idf, embeddings, and more. It will be followed by practical coding examples, in Python, to teach how to apply the theory to real use cases…more details

Leonardo De Marchi
VP of Labs | Thomson Reuters
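
For readers unfamiliar with tf-idf, one of the fundamentals named in the abstract above, the following is a minimal pure-Python sketch. It assumes whitespace tokenisation, tf as a term's count divided by document length, and idf as ln(N / df). This is one common variant among several (libraries often add smoothing) and is not necessarily the formulation used in the course. The `tf_idf` helper and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumes tf = count / document length and idf = ln(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus.
docs = ["the cat sat", "the dog sat", "the cat ran"]
scores = tf_idf(docs)
print(scores[1])
```

In this toy corpus, "the" appears in every document and so receives a weight of zero, while "dog", which appears in only one document, receives the highest weight in the second document.
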
Modern NLP: Pre-training, Fine-tuning, Prompt Engineering, and Human Feedback

In-person | Workshop | NLP | Intermediate

 


Leaving this workshop, you will understand each of these topics, and you will have gained the practical, hands-on expertise to start integrating modern NLP in your domain. Participants will fine-tune and prompt engineer state-of-the-art models like BART and XLM-RoBERTa, and they will peer behind the curtain of world-shaking technologies like ChatGPT to understand their utility and architectures…more details

Daniel Whitenack, PhD
Data Scientist | SIL International
Self-Supervised and Unsupervised Learning for Conversational AI and NLP

Virtual | Workshop | NLP | Machine Learning | Beginner-Intermediate

 

 

In this session, I will give some background on Conversational AI and NLP, along with self-supervised and unsupervised techniques. Transformer-based large language models (LLMs) such as GPT-3, Jurassic, and T5 have been foundational to the advances that we see. I will walk the audience through hands-on examples and show how they can leverage transformers and large language models for few-shot and zero-shot learning in a variety of NLP applications such as text classification, summarization and question-answering…more details

Chandra Khatri
Chief Scientist and Head of AI | Got It AI
Advanced Fraud Modeling & Anomaly Detection with Python & R

In-person | Full-Day Training | Machine Learning Safety and Security | All Levels

 

This course outlines the typical fraud framework at an organization and where data science can play a role, and lays out how to build an analytically advanced fraud system. It then covers statistical and machine learning approaches to anomaly detection…more details

Aric LaBarr, PhD
Associate Professor of Analytics | Institute for Advanced Analytics at NC State University
Idiomatic Pandas

In-person | Workshop | Machine Learning | Beginner

 

Pandas can be tricky, and there is a lot of bad advice floating around. This tutorial will cut through some of the biggest issues I’ve seen with Pandas code after working with the library for a while and writing three books on it…more details

Matt Harrison
Python & Data Science Corporate Trainer | Consultant | MetaSnake
Unifying ML With One Line of Code

In-person | Tutorial | Deep Learning | Machine Learning | All Levels

 

Why should we try to unify the ML frameworks? Won’t we just create a new incompatible standard and make the ML fragmentation even worse? I will argue that the answer to these sensible and important questions is no…more details

Daniel Lenton, PhD
CEO | Ivy
Introduction to Large-scale Analytics with PySpark

Virtual | Workshop | Machine Learning | Deep Learning, Data Engineering & Big Data | Beginner-Intermediate

 

The amount of data being generated today is staggering and growing. Apache Spark has emerged as the de facto tool to analyze big data over the last few years and is now a critical part of the data science toolbox. This workshop will introduce you to the fundamentals of PySpark, Spark’s Python API, and other best practices in Spark programming…more details

Akash Tandon
Co-Founder | Co-author, Advanced Analytics with PySpark | Looppanel | O'Reilly Media
Beyond the Basics: Data Visualization in Python

In-person | Full-Day Training | Data Visualization & Data Analysis | Machine Learning | Intermediate-Advanced

 

The human brain excels at finding patterns in visual representations, which is why data visualizations are essential to any analysis. Done right, they bridge the gap between those analyzing the data and those consuming the analysis. However, learning to create impactful, aesthetically-pleasing visualizations can often be challenging. This session will equip you with the skills to make customized visualizations for your data using Python…more details

Stefanie Molin
Data Scientist | Bloomberg | Author of Hands-On Data Analysis with Pandas
A Practical Tutorial on Building Machine Learning Demos with Gradio

In-person | Workshop | Machine Learning | Deep Learning | Data Engineering | NLP | Beginner

 

In this workshop, we will cover how to build machine learning web applications using the Gradio (www.gradio.dev) library…more details

Freddy Boulton
Senior Software Engineer | Alteryx Innovation Labs
Getting Started with Hyperparameter Optimisation

In-person | Half-Day Training | Machine Learning | Intermediate

 

 

The workshop participants will then get a chance to complete a set of tasks revolving around the various optimisation techniques and observe the outcomes. The tasks will include hyperparameter optimisation for a deep neural network and optimisation of the parameters of one ensemble model (Random Forest)…more details

Nikolay Manchev, PhD
Head of Data Science for EMEA | Domino Data Lab
Advanced Gradient Boosting (II): Calibration, Probabilistic Regression and Conformal Prediction

In-person | Half-Day Training | Machine Learning | Intermediate-Advanced

 

Gradient Boosting remains the most effective method for classification and regression problems on tabular data. This session is Part Two of two, covering advanced topics that are newer and may be less familiar. First, we will discuss how to calibrate the probabilities of classification models, reviewing the major techniques. Next, we will discuss Probabilistic Regression, wherein the goal is to predict the full probability distribution of the numerical target given the features, demonstrating different approaches to this problem. Finally, we will present tools for Conformal Prediction – a hot topic which can provide prediction intervals with strong theoretical guarantees…more details

Brian Lucena, PhD
Principal | Numeristical
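
To make the idea of prediction intervals from Conformal Prediction in the abstract above more concrete, here is a minimal sketch of split (inductive) conformal prediction with absolute residuals as the conformity score. This is one standard approach and is not necessarily the tooling presented in the session. The `predict` function is a hypothetical stand-in for any fitted regressor, such as a gradient boosting model, and the calibration data is synthetic.

```python
import math
import random


def conformal_quantile(residuals, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest absolute residual."""
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(residuals)[k - 1]


def predict(x):
    # Hypothetical stand-in for a fitted regressor (e.g. a boosted model).
    return 2.0 * x


random.seed(0)
# Hypothetical calibration set held out from training: y = 2x + noise.
calib = [(x, 2.0 * x + random.gauss(0, 1)) for x in range(200)]
residuals = [abs(y - predict(x)) for x, y in calib]

alpha = 0.1  # target 90% coverage
q = conformal_quantile(residuals, alpha)

# The interval for a new point is the prediction plus or minus q.
x_new = 5.0
interval = (predict(x_new) - q, predict(x_new) + q)
print(interval)
```

Under the usual exchangeability assumption between calibration and test points, intervals built this way cover the true target with probability at least 1 - alpha, regardless of which model produced the predictions.
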
Advanced Gradient Boosting (I): Fundamentals, Interpretability, and Categorical Structure

In-person | Half-Day Training | Machine Learning | Intermediate-Advanced

 

Gradient Boosting remains the most effective method for classification and regression problems on tabular data. This session is Part One of two. We will start with the fundamentals of how boosting works and best practices for model building and hyper-parameter tuning. Next, we will discuss how to interpret the model, understanding what features are important generally and for a specific prediction. Finally, we will discuss how to exploit categorical structure, when the different values of a categorical variable have a known relationship to one another…more details

Brian Lucena, PhD
Principal | Numeristical
Machine Learning with XGBoost

In-person | Workshop | Machine Learning | Intermediate

 

This workshop will show how to use XGBoost. It will demonstrate model creation, model tuning, model evaluation, and model interpretation…more details

Matt Harrison
Python & Data Science Corporate Trainer | Consultant | MetaSnake
Deepfakes: How’re They Made, Detected, and How They Impact Society

In-person | Tutorial | Deep Learning | All Levels

 

Deepfake photos and videos are already impacting many industries and sectors of society, in both positive and negative ways. In this session I’ll weave between the social context of deepfakes (how they’ve been used and what impact they’ve had) and the technical side of them (how they’re made, and some approaches to detecting them). This is the multifaceted story of deepfakes…more details

Noah Giansiracusa, PhD
Assistant Professor of Mathematics and Data Science | Bentley University
Mastering Adversarial Evaluation for NLP: A Practical Workshop

Virtual | Workshop | NLP | Beginner-Intermediate

 

This workshop will equip participants with the skills and knowledge to conduct adversarial evaluation of NLP systems. Through active exercises and examples, we will discuss how to identify and address system weaknesses and explore how this approach can improve accuracy, reduce risk, and uncover potential blind spots. Participants will gain a greater understanding of how to use adversarial evaluation to detect and prevent errors in their NLP systems…more details

Panos Alexopoulos, PhD
Head of Ontology | Textkernel BV
Deep Learning with PyTorch and TensorFlow

In-person | Full-Day Training | Deep Learning | Machine Learning | Beginner-Intermediate

 

 

Obscure until recently, Deep Learning is ubiquitous today across data-driven applications as diverse as machine vision, natural language processing, generative A.I., and superhuman game-playing.

This training is an introduction to Deep Learning that brings high-level theory to life with interactive examples featuring PyTorch, TensorFlow 2, and Keras — all three of the principal Python libraries for Deep Learning. Essential theory will be covered in a manner that provides students with a complete intuitive understanding of Deep Learning’s underlying foundations…more details

Dr. Jon Krohn
Chief Data Scientist, Author of Deep Learning Illustrated | Nebula.io
Next-Level Data Visualization in Python: A Practical Guide to Upgrading Your Plots by Making the Most of Matplotlib and More

Virtual | Tutorial | Data Visualization & Data Analysis | All Levels

 

 

Amid the explosion of interactive data visualization options, Matplotlib remains the go-to for quick data visualization for researchers working in Python, either directly or via libraries like Pandas and Seaborn. It also remains one of the most robust and flexible tools for customizing static visualizations. This tutorial will discuss ways to level up your quick plots for use in published papers, automated reporting, and any other scenario where crisp and/or custom visualizations are called for…more details

Building Recommendation Systems

In-person | Workshop | Deep Learning | NLP | Machine Learning | Beginner-Intermediate

 

 

Recommendation describes suggesting, or recommending, items tailored to a particular user. As generative AI creates an explosion of digital content, personalization will be more important than ever! Whether the application is sneaker designs, blog posts, or even pre-trained machine learning model weights, most recommendation tasks have a similar underlying structure. We need some way to represent items and users, typically as vectors, as well as a way to index them for fast computation. We also need to design intuitive APIs that interface the recommendation system to application developers. Weaviate is an open-source vector search database that has many unique search and database features…more details

Connor Shorten, PhD
Research Scientist | SeMI Technologies
Hybrid AI for Complex Applications with Scruff

In-person | Workshop | Machine Learning | Intermediate-Advanced

 


In this workshop, I will explain the core principles of Scruff and the main programming concepts. I will then demonstrate how we used Scruff to create a tool for wildfire risk assessment and mitigation that includes climate models, historical fire data, and fire propagation simulators. Finally, we will work through a hands-on session of getting up and running with Scruff and implementing and running simple models…more details

Avi Pfeffer, PhD
Author, Chief Scientist | Charles River Analytics