Choose your Pass
Streaming Access to ODSC Keynotes and breakout sessions (Thu&Fri)
Immediate Access to ODSC East 2019 Recordings
ODSC Livestream Talks (Thu&Fri)
Remote Livestream Speaker Track Access
Access to Thurs/Fri On Demand Recordings
Access to Hands-on Workshops & Training Sessions & On Demand (Wednesday)
Access to Hands-on Workshops & Training Sessions & On Demand (Tuesday)
Livestream Only 2-Day
2 days (Thu& Fri)
Livestream & On Demand Pass
2 days (Thu&Fri)
Training & On Demand - 3 Day
3 days (Wed,Thu&Fri)
Training & On Demand - 4 Day
4 days (Tues to Friday)
Confirmed Sessions for Livestream and On-Demand Recordings
Please see details of confirmed livestream sessions below. Confirmed session times will be updated in the coming weeks. Full lineup of speakers can be found here. Tuesday sessions will be added soon.
Track Keynote | Research Frontiers | Data for good | All Levels
Here we present and evaluate our machine learning approach to the Rodeo and release our SubseasonalRodeo dataset, collected to train and evaluate our forecasting system.
Our system is an ensemble of two nonlinear regression models. The first integrates the diverse collection of meteorological measurements and dynamic model forecasts in the SubseasonalRodeo dataset and prunes irrelevant predictors using a customized multitask model selection procedure. The second uses only historical measurements of the target variable (temperature or precipitation) and introduces multitask nearest neighbor features into a weighted local linear regression. Each model alone is significantly more accurate than the debiased operational U.S. Climate Forecasting System (CFSv2), and our ensemble skill exceeds that of the top Rodeo competitor for each target variable and forecast horizon. Moreover, over 2011-2018, an ensemble of our regression models and debiased CFSv2 improves debiased CFSv2 skill by 40-50% for temperature and 129-169% for precipitation. We hope that both our dataset and our methods will help to advance the state of the art in subseasonal forecasting…more details
Talk | Machine Learning | R-programming | Intermediate-Advanced
Matrix factorization or latent variable analysis is a powerful approach to identify trends in data or reduce the dimension of high dimensional data. Whilst most data scientists are familiar with principal component analysis (PCA), this talk will describe extensions to PCA that can be used to examine trends and extract the most variant component across many datasets. I will compare matrix factorization approaches for integrative analysis of multiple datasets (including canonical correlation analysis, multiple factor analysis, joint non-negative matrix factorization) and describe how we apply these methods to identify biomarkers of disease in oncology...more details
Talk | Machine Learning | MLOps & Data Engineering | Intermediate-Advanced
AI offers companies the possibility to transform their operations: from AI applications able to predict and schedule equipment maintenance, to intelligent R&D applications able to estimate the success of future drugs, to AI-powered HR tools able to enhance the hiring process and employee retention strategy. However, to leverage this opportunity, companies have to learn how to successfully build, train, test, and push hundreds of machine learning models into production, and to move models from development to their production environment in ways that are robust, explainable, and repeatable.
Nowadays data scientists and developers have a much easier experience when building AI-based solutions through the availability and accessibility of data and open-source machine learning frameworks. However, this process becomes a lot more complex when they need to think about model deployment and pick the best strategy to scale up to a production-grade system.
In this talk, we will introduce some common challenges of machine learning model deployment and we will discuss some points in order to enable you to tackle some of those challenges…more details
Talk | ML for Programmers | Beginner
In the last ten years, there have been a number of advancements in the study of Hamiltonian Monte Carlo algorithms that have enabled effective Bayesian statistical computation for much more complicated models than were previously feasible. These algorithmic advancements have been accompanied by a number of open source probabilistic programming packages that make them accessible to programmers and statisticians. PyMC3 is one such package written in Python and supported by NumFOCUS. This talk will give an introduction to probabilistic programming with PyMC3. No preexisting knowledge of Bayesian statistics is necessary; a working knowledge of Python will be helpful…more details
Talk | ML for Programmers | MLOps & Data Engineering | Beginner-Intermediate
Kubernetes and container platforms provide the agility, flexibility, scalability, and portability data scientists need to train, test, and deploy ML models quickly, without IT dependency. The session will provide an overview of containers and Kubernetes, and how these technologies can help solve the challenges faced by data scientists, ML engineers, and application developers. Next, we will review the key capabilities required in a containers and Kubernetes platform to help data scientists easily use technologies like Jupyter Notebooks, ML frameworks, and programming languages to innovate faster. Finally, we will share the available platform options (e.g. Red Hat OpenShift, Kubeflow, etc.), and some examples of how data scientists are accelerating their ML initiatives with a containers and Kubernetes platform…more details
Talk | Data Visualization | ML for Programmers | Beginner
In this talk, you’ll learn more about the importance of developing a story for your data and some of the initial ways to build a cohesive narrative. We’ll use the science of human vision and processing to talk through some best practices for creating high leverage visualizations that strengthen data stories. We’ll also look at how we can make improvements and enhancements to visualizations to add power to our stories in order to get and sustain our audiences’ attention. The most impressive data analysis is useless without the ability to clearly communicate essential takeaways and offer up persuasive recommendations. Data scientists of any technical level will benefit from learning about how to better communicate with data. Taking a storytelling approach to sharing your data can help get your work noticed and recommendations heard…more details
Talk | Machine Learning | ML for Programmers | Intermediate
The talk aims to introduce the attendees to the application of computer vision techniques to overhead imagery such as satellite, aerial and drone imagery. The emphasis is on object detection on satellite images as we share our learnings from dealing with those datasets (such as the xView Object Detection Challenge). The attendees will learn about challenges common in such datasets, such as scale variance and images with more than three channels, as well as approaches to address them and the results of our experiments. The session will also introduce datasets available in the growing field of CV on remote sensing data, as well as various use cases in disaster relief, refugee tracking, animal population counting, geospatial intelligence for business decisions and defense. The attendees will walk away with a good grasp of what is possible when machine learning at scale is applied to geospatial data…more details
Talk | Machine Learning | Research Frontiers | Intermediate-Advanced
Deep learning technologies, however, commonly suffer from a lack of explainability, which is an important aspect for the acceptance of AI into the highly regulated and high-stakes healthcare industry. For example, in addition to accurately classifying an image as containing a critical finding such as pneumothorax, it’s important to also localize where the pneumothorax is in the image to explain to the radiologist the reason for the algorithm’s prediction.
In this talk, we address these shortcomings with an interpretable AI algorithm that can classify and localize critical findings in medical images without the need for expensive pixel-level annotations, providing a general solution for training with weakly annotated data that has the potential to be adapted to a host of applications in the healthcare domain…more details
Talk | Machine Learning | Deep Learning | Intermediate
Model ethics, interpretability, and trust will be seminal issues in data science in the coming decade. This technical talk discusses traditional and modern approaches for interpreting black box models. Additionally, we will review cutting edge research coming out of UCSF, CMU, and industry. This new research reveals holes in traditional approaches like SHAP and LIME when applied to some deep net architectures and introduces a new approach to explainable modeling where interpretability is a hyperparameter in the model building phase rather than a post-modeling exercise. We will provide step-by-step guides that practitioners can use in their work to navigate this interesting space. We will review code examples of interpretability techniques and provide notebooks for attendees to download…more details
Talk | Deep Learning | MLOps & Data Engineering | Intermediate-Advanced
Large scale distributed training has become an essential element of scaling productivity for ML engineers. Today, ML models are getting larger and more complex in terms of compute and memory requirements. The amount of data we train on at Facebook is huge. In this talk, we will learn about the Distributed Training Platform to support large scale data and model parallelism. We will touch on Distributed Training support for PyTorch and how we are offering a flexible training platform for ML engineers to increase their productivity at Facebook scale…more details
Talk | Machine Learning | Intermediate
This talk will cover technical challenges related to scaling computational architecture, data engineering complexity, demand forecasting approaches and limitations, challenges of integration of AutoML engines, challenges of architecting for future algorithm inclusion to handle the complexities of product demand heterogeneity while building a maintainable system future proofed for business continuity. Additionally, the speaker will cover the full scope of transformational challenges this situation provides and the operating model to implement these challenges using more AI/ML driven approaches. The speaker will also cover how to use such a demand forecasting system to help drive strategic actions in Product Portfolio Management, Promotions, Pricing, Sales and Marketing to showcase an ecosystem of business decision making…more details
Talk | NLP | Deep Learning | Intermediate
Accelerating progress in personalized healthcare requires learning the causal relationships between diseases, genes, treatments, medications, labs, and other clinical information – at scale over a large population and time range. More than half of the clinically relevant data in oncology is only found in free-text pathology reports, radiology reports, sequencing reports, and progress notes.
Extracting and normalizing these facts from clinical documents requires training oncology-specific models that can accurately extract them from a variety of documents. This talk describes results and lessons learned from a real-world project doing this at scale…more details
Talk | ML for Programmers | Kick-starter | Beginner
In this talk we’ll go over how to write code that is reproducible and easy for other people to work with.
We’ll start by talking about virtual environments. Virtual environments allow you to define the dependencies for your projects (such as NumPy or Matplotlib) and to keep these dependencies separated between projects. We’ll also outline some choices you have about how to manage your virtual environments.
Next we’ll talk about version control and why you should be using it even if you’re the only contributor to a project. Version control helps create a log of what work was done and why, and will give you the ability to go back when you inevitably make a change to your project that you can’t figure out how to undo.
Then we’ll discuss project structure by reviewing DrivenData’s Cookiecutter Data Science template. The template encourages a number of best practices, and ensures that anyone familiar with the template who looks at your code for the first time will be reasonably well oriented.
Finally, we’ll briefly cover why you should establish coding styles and always use a linter...more details
Talk | Research Frontiers | Machine Learning | Beginner-Intermediate
Recent advances in Artificial Intelligence (AI) and Machine Learning (ML) have relied on access to modern computing hardware, massive quantities of data, and advanced algorithms that leverage high-performance computing centers such as the MIT Lincoln Laboratory Supercomputing Center (LLSC). While AI will play an important role in many organizations, rapid adoption of AI is often limited by legacy hardware, siloed and messy data, and sparse/unsupervised ML algorithms in data-starved application domains. As a world leader in developing high-performance computing (HPC) tools that are easy to use without compromising performance, the LLSC has been developing a number of novel technologies that aim to overcome the technical hurdles that restrict easy adoption of AI. This research in “Fast AI” is pillared on modern computing, big data management, and interfaces & algorithms. In this talk, I will discuss the AI landscape; highlight a few AI adoption challenges; provide an overview of research that simplifies AI adoption; and discuss novel AI applications to cybersecurity and video recognition…more details
Talk | Machine Learning | Deep Learning | Beginner
In this talk, we will start with some basics about model evaluation to help motivate using weighting techniques to improve model building. We’ll discuss accuracy, why it’s useful, but why it’s also a flawed metric in many cases. From here, we’ll move on to discuss some other tools for model evaluation which can take a larger variety of perspectives into account. This will lead us to introduce weighting as a natural technique to help influence the models we’re building in order to make choices which may seem sub-optimal in some metrics (for instance, the metrics being used to train the model), but cause desired behavior in other metrics. Then we’ll take a step back to consider what specific types of problems weighting should be used to help solve, as well as some types of problems it might sometimes be used for in practice, but probably shouldn’t be. We’ll discuss some alternatives to weighting (specifically thresholding) and consider cases where each is preferable. Finally, we’ll discuss a strategy for choosing optimal weights in order to minimize a cost function in the case of cost-based classification problems…more details
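As a rough illustration of the thresholding idea mentioned in the abstract above (not taken from the talk itself), the following minimal Python sketch compares accuracy with a cost-based view on a small, imbalanced toy dataset. The labels, scores, cost values, and helper names are all hypothetical and chosen only for illustration.

```python
# Minimal sketch: accuracy versus cost-based threshold selection.
# All data, costs, and function names below are hypothetical.

def predict(scores, threshold):
    """Label an example positive (1) when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def total_cost(y_true, y_pred, fp_cost, fn_cost):
    """Sum of misclassification costs: false positives and false negatives."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return fp * fp_cost + fn * fn_cost

def best_threshold(y_true, scores, fp_cost, fn_cost, candidates):
    """Pick the candidate threshold with the lowest total cost."""
    return min(
        candidates,
        key=lambda th: total_cost(y_true, predict(scores, th), fp_cost, fn_cost),
    )

# Imbalanced toy data: eight negatives followed by two positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.12, 0.18, 0.22, 0.27, 0.33, 0.38, 0.45, 0.31, 0.72]

# Assumed costs: a missed positive is ten times worse than a false alarm.
FP_COST, FN_COST = 1, 10
candidates = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

default_pred = predict(scores, 0.5)
chosen = best_threshold(y_true, scores, FP_COST, FN_COST, candidates)
chosen_pred = predict(scores, chosen)

print("default threshold 0.5:", accuracy(y_true, default_pred),
      total_cost(y_true, default_pred, FP_COST, FN_COST))
print("cost-based threshold", chosen, ":", accuracy(y_true, chosen_pred),
      total_cost(y_true, chosen_pred, FP_COST, FN_COST))
```

On this toy data the default threshold gives the higher accuracy, while the cost-minimizing threshold accepts more false positives in exchange for a lower total cost. That is the kind of trade-off between metrics the abstract describes.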
More Sessions, Including Tuesday Sessions, Will Be Added Soon
How It Works
Access multiple livestream tracks on Wednesday, Thursday, or Friday
Switch between sessions or tracks as your interests dictate
Multiple focus areas include deep learning, machine learning, research frontiers, AI X for business
Sessions you missed can be viewed on demand at your leisure
Participate in Q&A sessions with your speaker over live chat
Directly download slides and other session materials
(Training only) Access training and workshop prerequisites, notebooks, and other materials before the training session starts
(Training only) Access hands-on training and workshops with instructor-led code labs and notebooks
Quick Pass Guide
Livestream 2-Day Pass | Access multiple tracks on April 16th & 17th that include 6 keynotes and all breakout sessions, as well as multiple remote speaker tracks. On demand recordings are not included.
Livestream & On Demand Recordings Pass | Get access to everything included in the Livestream 2-Day pass plus on demand recordings of all sessions from Thursday and Friday. On Demand recordings will be accessible immediately, with continued access for over 12 months. This does not include Tuesday and Wednesday workshop or training livestream or on demand sessions.
Livestream Trainings, Talks, and Workshop & On Demand Recordings Pass | 3 Day pass (Wed-Fri) Get access to everything included in the Livestream 2-Day + On Demand pass + access all of Wednesday’s workshops and training sessions. You also get immediate On Demand access to all recorded sessions including talks, keynotes, AIX business, training, workshops etc.
Livestream Trainings, Talks, and Workshop & On Demand Recordings Pass | 4 Day pass (Tue-Fri)
Includes all access and On Demand included in the 3-day pass PLUS all Tuesday training session livestream tracks and On Demand Recordings
How will I access the Livestream?
You will receive instructions for accessing the Livestream 24-48 hours prior to the start of the event.
Is my registration/ticket transferable?
Yes, up until 7 days prior to the event. Email firstname.lastname@example.org with your full name, email address and the full name, email address, company name, and title of the person to whom you are transferring. Please be advised that transfers are not available within 7 days of the start of the event.
Where can I contact the organizer with any questions?
In summary, registrant contact information is NOT shared with third parties without your consent. Registrant information is primarily used to verify registration and notify you of similar events held by ODSC in the future. We may share your contact information with sponsors, but only with your consent upon registering.
- Ticket prices and availability are subject to increase or decrease, at the discretion of ODSC, before and/or after you have made your purchase, and do not entitle the purchaser to a refund or credit (partial or full).