Boston | April 14th – April 17th, 2020

Research Frontiers

The Most Advanced Active Research, Summarized

Rapid Pace of Advancement

Data science is a broad field that is advancing at a tremendous pace. Every few months, new research, models, and techniques are announced. For data science practitioners, it’s essential to keep abreast of the latest advances. However, given the demands on our time, that can be a daunting task.

The Most Advanced Research, Summarized

The Data Science Research track is the first of its kind. Instead of having to parse the contents of countless papers or attend academic conferences, we bring the best to you. World-class academics, researchers, and professionals condense the latest research across focus areas and detail what’s important. This summary accelerates your insights on the latest research and serves as a foundation for more in-depth analysis.

Some Current Research Frontiers Speakers

Click Here For Full Lineup
2020 Speakers

Sample Talk, Workshop, and Training Sessions

Research Frontiers Sessions
Friday, April 17th
Thursday, April 16th
10:40 - 11:25
Improving Subseasonal Forecasting in the Western U.S. with Machine Learning

Track Keynote | Research Frontiers | Data for good | All Levels


Here we present and evaluate our machine learning approach to the Rodeo and release our SubseasonalRodeo dataset, collected to train and evaluate our forecasting system.

Our system is an ensemble of two nonlinear regression models. The first integrates the diverse collection of meteorological measurements and dynamic model forecasts in the SubseasonalRodeo dataset and prunes irrelevant predictors using a customized multitask model selection procedure. The second uses only historical measurements of the target variable (temperature or precipitation) and introduces multitask nearest neighbor features into a weighted local linear regression. Each model alone is significantly more accurate than the debiased operational U.S. Climate Forecasting System (CFSv2), and our ensemble skill exceeds that of the top Rodeo competitor for each target variable and forecast horizon. Moreover, over 2011-2018, an ensemble of our regression models and debiased CFSv2 improves debiased CFSv2 skill by 40-50% for temperature and 129-169% for precipitation. We hope that both our dataset and our methods will help to advance the state of the art in subseasonal forecasting…more details

Lester Mackey, PhD
ML Researcher, Adjunct Professor | Microsoft Research New England, Stanford University
10:40 - 11:25
Smart Technologies in Enhancing Browsing Experiences

Talk | Data Visualization | Research Frontiers | Intermediate


In this talk, I would like to focus on the design methods used for some of the visualization systems I have been working on for the past 3 years. The systems aim to bridge the physical and digital arenas, using digital data associated with physically situated objects and transforming and visualizing this data in relation to a given context. The systems take the form of a web-based app: they serve as a visual “companion” that recognizes objects and uses them instantly to provide users with information or insight. After a user snaps a photo or uses an AR headset, the applications generate the object-related data or visual dashboard that users may use for further exploration. With these systems and their interplay between real and digital worlds, new avenues could be opened for creating new dimensions for adaptive visualizations…more details

Zona Kostic, PhD
Research Fellow | Harvard University
12:40 - 14:10
Uplift Modeling Tutorial: Predictive and Prescriptive Analytics

Tutorial | Machine Learning | Research Frontiers | Intermediate

This tutorial will cover both introductory and advanced topics. I will first introduce the uplift concept, contrast it with the traditional response modeling method, and review various predictive analytics approaches to Uplift Modeling. Our discussion extends from experimental data to observational data by integrating Uplift Modeling with Causal Inference. I will also discuss the multiple-treatment situation, where the optimal treatment for each person needs to be determined. Prescriptive analytics from the optimization field will be employed to handle the uncertainty of lift estimates. I will illustrate the application and methodologies with examples from multiple industries…more details

Victor Lo, PhD
Head of Data Science & Artificial Intelligence | Fidelity Investments
12:45 - 13:30
Explainable AI for Training with Weakly Annotated Data

Talk | Machine Learning | Research Frontiers | Intermediate-Advanced


Deep learning technologies, however, commonly suffer from a lack of explainability, which is an important aspect for the acceptance of AI into the highly regulated and high-stakes healthcare industry. For example, in addition to accurately classifying an image as containing a critical finding such as pneumothorax, it’s important to also localize where the pneumothorax is in the image to explain to the radiologist the reason for the algorithm’s prediction.

In this talk, we address these shortcomings with an interpretable AI algorithm that can classify and localize critical findings in medical images without the need for expensive pixel-level annotations, providing a general solution for training with weakly annotated data that has the potential to be adapted to a host of applications in the healthcare domain…more details

Evan Schwab, PhD
Research Scientist | Philips Research North America
13:35 - 14:20
Outlier Robust Machine Learning

Talk | Research Frontiers | Advanced


In this talk, we provide a new class of computationally efficient machine learning algorithms that are provably robust in a variety of settings, such as arbitrary outliers and heavy-tailed data, among others. Our workhorse is a novel robust variant of gradient descent, and we provide conditions under which this variant yields accurate and robust estimators in any general convex risk minimization problem. These results provide some of the first computationally tractable and provably robust machine learning algorithms for general machine learning models…more details

Pradeep Ravikumar, PhD
Associate Professor | CMU
15:15 - 16:00
Delivering on the Promise of AI in Precision Medicine Oncology

Talk | Machine Learning | Research Frontiers | All levels


At Foundation Medicine, we have a unique clinicogenomics database, the world’s largest, that links comprehensive genomic profiles to clinical outcomes. This talk will discuss the promise and power of this data as we continue to push the boundaries of harnessing AI to transform cancer care. We’ll begin with a tour of the history of next-generation sequencing technology and how it has transformed oncology from single-gene and single-therapy analysis to comprehensive genomic profiling that requires large-scale computation, analysis, and machine learning. Next, we’ll cover the history of, and conventional statistical modeling in, the field of clinicogenomics and how we are complementing and extending these methods with machine learning models to empower patients, doctors, and biopharma companies to fight cancer…more details

John Mercer
Head of Data Science | Foundation Medicine
15:15 - 16:00
Hybrid Deep Learning Approach to Speed up Certain Numerical Simulations

Talk | Deep Learning | Research Frontiers | Intermediate


In this presentation, we will demonstrate how to leverage deep learning to speed up production forecasting, as well as seismic imaging.
To accelerate reservoir production simulations, we build a recurrent neural network model using production forecasting results. The model can be viewed as a proxy and allows us to understand the reservoir much more quickly. Once the model is released, we can rely on it to predict reservoir performance and make economic decisions. The model will be monitored and updated if necessary.
When it comes to seismic imaging, we attempt to decode the wave equations and inversions in the ML framework. First, we start with some randomly generated depth velocity models and perform forward modeling to generate shot gathers. Next, we run reverse time migration (RTM) to generate depth images. Once the training data set is ready, we leverage state-of-the-art machine learning to extract features from the shots and velocity models to generate seismic images. Once we have a satisfactory ML model, future work could focus on applying reinforcement learning to improve velocity model building…more details

Cheng Zhan, PhD
Senior Data Scientist | Microsoft

See all our talks, hands-on workshops, and training sessions
See all sessions

Active Research Focus Areas

Data science is a broad and expanding field with many areas of study. Here are some of the main areas that our presenting researchers will be addressing:
  • Neural Networks

  • Machine Learning

  • Transfer Learning

  • Machine Vision

  • Natural Language Processing

  • Predictive Analytics

  • Pattern Recognition

  • Quantitative Finance

  • Speech Recognition

  • Time Series Analysis

  • Graph Theory

  • Network Analysis

  • Data Visualization

  • Anomaly Detection

Previous Sessions in Research Frontiers Track

  • Workshop: Deciphering the Black Box: Latest Tools and Techniques for Interpretability

  • Talk: Adversarial Attacks on Deep Neural Networks

  • Training: Integrating Pandas with Scikit-Learn, an Exciting New Workflow

  • Workshop: Machine Learning for Digital Identity

  • Talk: Adding Context and Cognition to Modern NLP Techniques

  • Training: Good, Fast, Cheap: How to do Data Science with Missing Data

  • Workshop: Open Data Hub workshop on OpenShift

  • Talk: Practical AI solutions within healthcare and biotechnology

  • Training: Apache Spark for Fast Data Science (and Fast Python Integration!) at Scale

  • Workshop: Reproducible Data Science Using Orbyter

  • Talk: Combining millions of products into one marketplace using computer vision and natural language processing

  • See the whole schedule!

Why Attend?

Hear from world-class researchers and academics about the top areas of active research

Take time out of your busy schedule to accelerate your knowledge of the latest advances in data science

Be the first amongst your peers to grasp changes that will affect the field in the next few years

Take advantage of, and choose from, another 120 talks, tutorials, and workshops at ODSC West

Learn directly from top researchers what works and what doesn’t 

Connect and network with academics, researchers, and fellow professionals

Meet with peers and professionals looking to learn, connect, and collaborate

Get access to other focus area content including ML / DL, Data Visualization, Quant finance, and Open Data Science

Who Should Attend

The Data Science Research track will prove invaluable to those of us looking to quickly understand in detail the topics that matter most in data science now:

  • Experienced data scientists

  • Students and academics

  • Software engineers and architects

  • Business professionals interested in data science advancements

  • Experts from other domains looking to leverage data science

  • Beginners interested in the latest research

  • Researchers from academia and industry

  • Industry professionals

  • Technologists interested in new data science applications

  • Industry experts looking to assess the impact of data science

Sign Up for ODSC East | April 14th – April 17th, 2020

Register Now