ODSC East 2020

Virtual Conference

The latest tools, breakthrough models, & frameworks streamed live to your laptop.

 
 

April 13th to April 17th, 2020

ODSC Livestream & On Demand

Hear from the best and brightest in data science and AI.


ODSC is the best community data science event on the planet. There are other events that cover special topics, or industries, etc., but ODSC is comprehensive and totally community-focused: it's the conference to engage, to build, to develop, and to learn from the whole data science community.

Kirk Borne - Principal Data Scientist and Executive Advisor at Booz Allen Hamilton @ ODSC East 2019

East 2020 Speakers & Instructors

Announcing our first 50 speakers and instructors. 230+ more coming soon!

 

See our full list of speakers and instructors here.

Livestream & On Demand Registration


Choose your Pass

Streaming Access to ODSC Keynotes and Breakout Sessions (Thu & Fri)

Immediate Access to ODSC East 2019 Recordings

ODSC Livestream Talks (Thu & Fri)

Remote Livestream Speaker Track Access

Access to Thurs/Fri On Demand Recordings

Access to Hands-on Workshops & Training Sessions & On Demand (Wednesday)

Access to Hands-on Workshops & Training Sessions & On Demand (Tuesday)

Livestream Only 2-Day Pass

$499

2 days (Thu & Fri)

Livestream & On Demand Pass 

$849

2 days (Thu & Fri)

Training & On Demand - 3 Day

$999

3 days (Wed, Thu & Fri)

Training & On Demand - 4 Day

$1,449

4 days (Tues to Friday)


Confirmed Sessions for Livestream and On-Demand Recordings

Please see details of confirmed livestream sessions below. Confirmed session times will be updated in the coming weeks. The full lineup of speakers can be found here. Tuesday sessions will be added soon.

ODSC Livestream Sessions
Thursday, April 16th
Friday, April 17th
10:40 - 11:25
Improving Subseasonal Forecasting in the Western U.S. with Machine Learning

Track Keynote | Research Frontiers | Data for good | All Levels

 

Here we present and evaluate our machine learning approach to the Rodeo and release our SubseasonalRodeo dataset, collected to train and evaluate our forecasting system.

Our system is an ensemble of two nonlinear regression models. The first integrates the diverse collection of meteorological measurements and dynamic model forecasts in the SubseasonalRodeo dataset and prunes irrelevant predictors using a customized multitask model selection procedure. The second uses only historical measurements of the target variable (temperature or precipitation) and introduces multitask nearest neighbor features into a weighted local linear regression. Each model alone is significantly more accurate than the debiased operational U.S. Climate Forecasting System (CFSv2), and our ensemble skill exceeds that of the top Rodeo competitor for each target variable and forecast horizon. Moreover, over 2011-2018, an ensemble of our regression models and debiased CFSv2 improves debiased CFSv2 skill by 40-50% for temperature and 129-169% for precipitation. We hope that both our dataset and our methods will help to advance the state of the art in subseasonal forecasting…

Lester Mackey, PhD
ML Researcher, Adjunct Professor | Microsoft Research New England, Stanford University
11:30 - 12:15
Finding Correlated Trends Across Multiple Data Sets Using Matrix Factorization

Talk | Machine Learning | R-programming | Intermediate-Advanced

 

Matrix factorization or latent variable analysis is a powerful approach to identify trends in data or reduce the dimension of high dimensional data. Whilst most data scientists are familiar with principal component analysis (PCA), this talk will describe extensions to PCA that can be used to examine trends and extract the most variant component across many datasets. I will compare matrix factorization approaches for integrative analysis of multiple datasets (including canonical correlation analysis, multiple factor analysis, joint non-negative matrix factorization) and describe how we apply these methods to identify biomarkers of disease in oncology...

Aedin Culhane, PhD
Senior Research Scientist | Dana-Farber Cancer Institute, Harvard TH Chan School of Public Health
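For readers unfamiliar with the PCA baseline that the abstract above takes as its starting point, here is a minimal sketch of extracting the direction of greatest variance with power iteration. It is an illustration only, not the speaker's method: the function name `first_principal_component`, the toy data, and the iteration count are all hypothetical, and the sketch assumes the data is not constant (otherwise the covariance matrix is zero and the normalization would divide by zero).

```python
import math


def first_principal_component(rows, iterations=200):
    """Approximate the top principal component of `rows` by power iteration.

    `rows` is a list of equal-length numeric lists (one list per sample).
    Hypothetical helper for illustration; assumes non-constant data.
    """
    n = len(rows)
    d = len(rows[0])

    # Center each column so the covariance is computed around the mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]

    # Repeatedly multiply by the covariance matrix and renormalize.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Toy data lying exactly on the line y = 2x.
points = [[float(x), 2.0 * x] for x in range(5)]
component = first_principal_component(points)
print(component)
```

On this toy data the returned unit vector points along the line y = 2x. The multi-dataset methods named in the abstract (canonical correlation analysis, multiple factor analysis, joint non-negative matrix factorization) extend this single-dataset idea and are not shown here.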
11:30 - 12:15
Training and Operationalizing Interpretable Machine Learning Models

Talk | Machine Learning | MLOps & Data Engineering | Intermediate-Advanced

 

AI offers companies the possibility to transform their operations: from AI applications able to predict and schedule equipment maintenance, to intelligent R&D applications able to estimate the success of future drugs, to AI-powered HR tools able to enhance the hiring process and employee retention strategy. However, to leverage this opportunity, companies have to learn how to successfully build, train, test, and push hundreds of machine learning models into production, and to move models from development to their production environment in ways that are robust, explainable, and repeatable.

Nowadays data scientists and developers have a much easier experience when building AI-based solutions through the availability and accessibility of data and open-source machine learning frameworks. However, this process becomes a lot more complex when they need to think about model deployment and pick the best strategy to scale up to a production-grade system.

In this talk, we will introduce some common challenges of machine learning model deployment and discuss some points to help you tackle some of those challenges…

Francesca Lazzeri, PhD
Senior ML Scientist | Microsoft
12:45 - 13:30
The Hamiltonian Monte Carlo Revolution is Open Source: Probabilistic Programming with PyMC3

Talk | ML for Programmers | Beginner

 

In the last ten years, there have been a number of advancements in the study of Hamiltonian Monte Carlo algorithms that have enabled effective Bayesian statistical computation for much more complicated models than were previously feasible. These algorithmic advancements have been accompanied by a number of open source probabilistic programming packages that make them accessible to programmers and statisticians. PyMC3 is one such package written in Python and supported by NumFOCUS. This talk will give an introduction to probabilistic programming with PyMC3. No preexisting knowledge of Bayesian statistics is necessary; a working knowledge of Python will be helpful…

Austin Rochford
Chief Data Scientist | Monetate Labs
12:45 - 13:30
Accelerate ML Lifecycle with Kubernetes and Containerized Data Science Tools

Talk | ML for Programmers | MLOps & Data Engineering | Beginner-Intermediate

 

Kubernetes and container platforms provide the agility, flexibility, scalability, and portability data scientists need to train, test, and deploy ML models quickly, without IT dependency. The session will provide an overview of containers and Kubernetes, and how these technologies can help solve the challenges faced by data scientists, ML engineers, and application developers. Next, we will review the key capabilities required in a container and Kubernetes platform to help data scientists easily use technologies like Jupyter Notebooks, ML frameworks, and programming languages to innovate faster. Finally, we will share the available platform options (e.g. Red Hat OpenShift, Kubeflow, etc.) and some examples of how data scientists are accelerating their ML initiatives with container and Kubernetes platforms…

 

Abhinav Joshi
Sr. Principal Marketing Manager | Red Hat
Tushar Katarki
Sr. Principal Product Manager | Red Hat
12:45 - 13:30
The Art (and Importance) of Data Storytelling

Talk | Data Visualization | ML for Programmers | Beginner

 

In this talk, you’ll learn more about the importance of developing a story for your data and some of the initial ways to build a cohesive narrative. We’ll use the science of human vision and processing to talk through some best practices for creating high-leverage visualizations that strengthen data stories. We’ll also look at how we can make improvements and enhancements to visualizations to add power to our stories in order to get and sustain our audiences’ attention. The most impressive data analysis is useless without the ability to clearly communicate essential takeaways and offer up persuasive recommendations. Data scientists of any technical level will benefit from learning about how to better communicate with data. Taking a storytelling approach to sharing your data can help get your work noticed and recommendations heard…

Diedre Downing
Lead Data Storytelling Trainer | StoryIQ
12:45 - 13:30
Looking from Above: Object Detection and Other Computer Vision Tasks on Satellite Imagery

Talk | Machine Learning | ML for Programmers | Intermediate

 

The talk aims to introduce the attendees to the application of computer vision techniques to overhead imagery such as satellite, aerial and drone imagery. The emphasis is on object detection on satellite images as we share our learnings from dealing with those datasets (such as the xView Object Detection Challenge). The attendees will learn about challenges common in such datasets, such as scale variance and images with more than three channels, as well as approaches to address them and the results of our experiments. The session will also introduce datasets available in the growing field of CV on remote sensing data, as well as various use cases in disaster relief, refugee tracking, animal population counting, geospatial intelligence for business decisions and defense. The attendees will walk away with a good grasp of what is possible when machine learning at scale is applied to geospatial data…

Xiaoyong Zhu
Senior Data Scientist | Microsoft
12:45 - 13:30
Explainable AI for Training with Weakly Annotated Data

Talk | Machine Learning | Research Frontiers | Intermediate-Advanced

 

Deep learning technologies commonly suffer from a lack of explainability, which is an important aspect for the acceptance of AI into the highly regulated and high-stakes healthcare industry. For example, in addition to accurately classifying an image as containing a critical finding such as pneumothorax, it’s important to also localize where the pneumothorax is in the image to explain to the radiologist the reason for the algorithm’s prediction.

In this talk, we address these shortcomings with an interpretable AI algorithm that can classify and localize critical findings in medical images without the need for expensive pixel-level annotations, providing a general solution for training with weakly annotated data that has the potential to be adopted for a host of applications in the healthcare domain…

Evan Schwab, PhD
Research Scientist | Philips Research North America
13:35 - 14:20
A Data Science Playbook for Explainable AI – Navigating Predictive and Interpretable Models

Talk | Machine Learning | Deep Learning | Intermediate

 

Model ethics, interpretability, and trust will be seminal issues in data science in the coming decade. This technical talk discusses traditional and modern approaches for interpreting black box models. Additionally, we will review cutting edge research coming out of UCSF, CMU, and industry. This new research reveals holes in traditional approaches like SHAP and LIME when applied to some deep net architectures and introduces a new approach to explainable modeling where interpretability is a hyperparameter in the model building phase rather than a post-modeling exercise. We will provide step-by-step guides that practitioners can use in their work to navigate this interesting space. We will review code examples of interpretability techniques and provide notebooks for attendees to download…

Joshua Poduska
Chief Data Scientist | Domino Data Lab
13:35 - 14:20
Distributed Training Platform at Facebook

Talk | Deep Learning | MLOps & Data Engineering | Intermediate-Advanced

 

Large-scale distributed training has become essential to scaling the productivity of ML engineers. Today, ML models are getting larger and more complex in terms of compute and memory requirements. The amount of data we train on at Facebook is huge. In this talk, we will learn about the Distributed Training Platform to support large-scale data and model parallelism. We will touch on Distributed Training support for PyTorch and how we offer a flexible training platform for ML engineers to increase their productivity at Facebook scale…

Mohamed Fawzy
Senior Engineering Manager | Facebook
Kiuk Chung
Software Engineer | Facebook
14:25 - 15:10
AI / Machine Learning Driven Improvement of Demand Forecasts

Talk | Machine Learning | Intermediate

 

This talk will cover technical challenges related to scaling computational architecture, data engineering complexity, demand forecasting approaches and their limitations, integrating AutoML engines, and architecting for future algorithm inclusion to handle the complexities of product demand heterogeneity, all while building a maintainable system that is future-proofed for business continuity. Additionally, the speaker will cover the full scope of transformational challenges this situation presents and the operating model for addressing these challenges using more AI/ML-driven approaches. The speaker will also cover how to use such a demand forecasting system to help drive strategic actions in Product Portfolio Management, Promotions, Pricing, Sales, and Marketing to showcase an ecosystem of business decision making…

Prabhakar Narasimhadevara
Director of Data Science | Stanley Black & Decker
14:25 - 15:10
Applying State-of-the-art Natural Language Processing for Personalized Healthcare

Talk | NLP | Deep Learning | Intermediate

 

Accelerating progress in personalized healthcare requires learning the causal relationships between diseases, genes, treatments, medications, labs, and other clinical information – at scale over a large population and time range. More than half of the clinically relevant data in oncology is only found in free-text pathology reports, radiology reports, sequencing reports, and progress notes.

Extracting and normalizing these facts from clinical documents requires training oncology-specific models that can accurately extract them from a variety of documents. This talk describes results and lessons learned from a real-world project doing this at scale…

David Talby, PhD
CTO | Pacific AI
Guneet Walia, PhD
Principal Data Scientist | Genentech
14:25 - 15:10
Managing Data Projects Like a Software Engineer

Talk | ML for Programmers | Kick-starter | Beginner

 

In this talk we’ll go over how to write code that is reproducible and easy for other people to work with.

We’ll start by talking about virtual environments. Virtual environments allow you to define the dependencies for your projects (such as NumPy or Matplotlib) and to keep these dependencies separated between projects. We’ll also outline some choices you have about how to manage your virtual environments.

Next we’ll talk about version control and why you should be using it even if you’re the only contributor to a project. Version control helps create a log of what work was done and why, and will give you the ability to go back when you inevitably make a change to your project that you can’t figure out how to undo.

Then we’ll discuss project structure by reviewing DrivenData’s Cookiecutter Data Science template. The template encourages a number of best practices and ensures that anyone familiar with it who looks at your code for the first time will be reasonably well oriented.

Finally, we’ll briefly cover why you should establish coding styles and always use a linter...

Michael Jalkio
Data Engineer | Amazon
15:15 - 16:00
Fast AI: Enabling Rapid Prototyping of AI Solutions

Talk | Research Frontiers | Machine Learning | Beginner-Intermediate

 

Recent advances in Artificial Intelligence (AI) and Machine Learning (ML) have relied on access to modern computing hardware, massive quantities of data, and advanced algorithms that leverage high-performance computing centers such as the MIT Lincoln Laboratory Supercomputing Center (LLSC). While AI will play an important role in many organizations, rapid adoption of AI is often limited by legacy hardware, siloed and messy data, and sparse/unsupervised ML algorithms in data-starved application domains. As a world leader in developing high-performance computing (HPC) tools that are easy to use without compromising performance, the LLSC has been developing a number of novel technologies that aim to overcome these technical hurdles that restrict easy adoption of AI. This research in “Fast AI” is pillared on modern computing, big data management, and interfaces & algorithms. In this talk, I will discuss the AI landscape; highlight a few AI adoption challenges; provide an overview of research that simplifies AI adoption; and discuss novel AI applications to cybersecurity and video recognition…

Vijay Gadepally, PhD
Senior Scientist | Massachusetts Institute of Technology, Lincoln Laboratory
15:15 - 16:00
The What, Why, and How of Weighting

Talk | Machine Learning | Deep Learning | Beginner

 

In this talk, we will start with some basics about model evaluation to help motivate using weighting techniques to improve model building. We’ll discuss accuracy, why it’s useful, but why it’s also a flawed metric in many cases. From here, we’ll move on to discuss some other tools for model evaluation which can take a larger variety of perspectives into account. This will lead us to introduce weighting as a natural technique to help influence the models we’re building in order to make choices which may seem sub-optimal in some metrics (for instance, the metrics being used to train the model), but cause desired behavior in other metrics. Then we’ll take a step back to consider what specific types of problems weighting should be used to help solve, as well as some types of problems it might sometimes be used for in practice, but probably shouldn’t be. We’ll discuss some alternatives to weighting (specifically thresholding) and consider cases where each is preferable. Finally, we’ll discuss a strategy for choosing optimal weights in order to minimize a cost function in the case of cost-based classification problems…

Eric Hart, PhD
Senior Data Scientist | Altair Engineering
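To make the abstract's points about accuracy and thresholding concrete, here is a minimal sketch, assuming made-up labels, scores, and misclassification costs. It is not the speaker's material: the names `accuracy`, `total_cost`, and `best_threshold`, along with the cost values, are hypothetical. It shows two classifiers with identical accuracy on imbalanced data but very different costs, and picks the score threshold that minimizes total cost.

```python
def accuracy(labels, preds):
    """Fraction of predictions that match the labels."""
    return sum(int(y == p) for y, p in zip(labels, preds)) / len(labels)


def total_cost(labels, scores, threshold, fp_cost, fn_cost):
    """Total misclassification cost when predicting 1 for score >= threshold."""
    cost = 0.0
    for y, s in zip(labels, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 0:
            cost += fp_cost  # false positive
        elif pred == 0 and y == 1:
            cost += fn_cost  # false negative
    return cost


def best_threshold(labels, scores, fp_cost, fn_cost):
    """Try every observed score (plus 'never predict 1') and keep the cheapest."""
    candidates = sorted(set(scores)) + [float("inf")]
    return min(
        candidates,
        key=lambda t: total_cost(labels, scores, t, fp_cost, fn_cost),
    )


# Hypothetical imbalanced data: nine negatives and one positive.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
scores = [0.05, 0.10, 0.12, 0.15, 0.20, 0.22, 0.30, 0.35, 0.60, 0.40]

# Assumed costs: a missed positive is ten times worse than a false alarm.
FP_COST, FN_COST = 1, 10

always_negative = [0] * len(labels)
chosen = best_threshold(labels, scores, FP_COST, FN_COST)
thresholded = [1 if s >= chosen else 0 for s in scores]

print(accuracy(labels, always_negative), accuracy(labels, thresholded))
print(total_cost(labels, scores, float("inf"), FP_COST, FN_COST))
print(total_cost(labels, scores, chosen, FP_COST, FN_COST))
```

Under these assumed numbers both classifiers score the same accuracy, yet the thresholded one catches the single positive and incurs a much lower cost. Choosing training weights, the abstract's main topic, is a different lever than choosing a threshold and is not shown here.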

More sessions, including Tuesday sessions, will be added soon.


How It Works

  • Access multiple livestream tracks on Wednesday, Thursday, or Friday

  • Switch between sessions or tracks as your interests dictate

  • Multiple focus areas include deep learning, machine learning, research frontiers, AI X for business

  • Sessions you missed can be viewed on demand at your leisure

  • Participate in Q&A sessions with your speaker over live chat

  • Directly download slides and other session materials

  • (Training only) Access training and workshop prerequisites, notebooks, and other materials before the training session starts

  • (Training only) Access hands-on training and workshops with instructor-led code labs and notebooks

Quick Pass Guide

Livestream 2-Day Pass | Access multiple tracks on April 16th & 17th that include 6 keynotes and all breakout sessions, as well as multiple remote speaker tracks. On demand recordings are not included.

Livestream & On Demand Recordings Pass | Get access to everything included in the Livestream 2-Day pass plus on demand recordings of all sessions from Thursday and Friday. On Demand recordings will be accessible immediately, with continued access for over 12 months. This does not include Tuesday and Wednesday workshop or training livestream or on demand sessions.

Livestream Trainings, Talks, and Workshop & On Demand Recordings Pass | 3 Day pass (Wed-Fri). Get access to everything included in the Livestream 2-Day + On Demand pass, plus access to all of Wednesday’s workshops and training sessions. You also get immediate On Demand access to all recorded sessions, including talks, keynotes, AIX business, training, workshops, etc.

Livestream Trainings, Talks, and Workshop & On Demand Recordings Pass | 4 Day pass (Tue-Fri). Includes all access and On Demand content included in the 3-day pass, plus all Tuesday training session livestream tracks and On Demand recordings.

 

Livestream FAQ

How will I access the Livestream?
You will receive instructions for accessing the Livestream 24-48 hours prior to the start of the event.

Is my registration/ticket transferable?
Yes, up until 7 days prior to the event. Email info@odsc.com with your full name, email address and the full name, email address, company name, and title of the person to whom you are transferring. Please be advised that transfers are not available within 7 days of the start of the event.

How can I contact the organizer with any questions?
Email info@odsc.com

What is the Privacy Policy?
In summary, registrant contact information is NOT shared with third parties without your consent. Registrant information is primarily used to verify registration and notify you of similar events held by ODSC in the future. We may share your contact information with sponsors, but only with your consent upon registering.

Please note:

  • Ticket prices and availability are subject to increase or decrease, at the discretion of ODSC, before and/or after you have made your purchase, and do not entitle the purchaser to a refund or credit (partial or full).