ODSC Masterclass Summit 2017

Premium Data Science Training

March 1-2 • San Francisco Airport Marriott Waterfront


Superior Training for Career Enhancing Skills

The ODSC Masterclass Summit offers premium training taught by the best and brightest AI and data science instructors on the planet. Our hands-on workshops and in-depth training sessions will help accelerate your career by quickly getting you up to speed on some of the most sought-after skills in the job market. Over two intense days of workshop and training modules, you will train on the hottest skills in data science, including deep learning, machine learning, predictive analytics, text analytics, Python, R and many more.
Andreas Mueller

Core Contributor to scikit-learn

Rumman Chowdhury

Senior Manager at Accenture AI

Jared Lander

Author of R for Everyone

CEO at Lander Analytics

Full Speaker List

Why Attend?

Superior Instructors: Our instructors are not just anybody. We only accept the best. These are some of the top contributors and data scientists in their field, such as Andreas Mueller from scikit-learn and Jared Lander from Lander Analytics. You learn not only the content but also insights you won’t get elsewhere.
Training for all Levels: Whether you are just starting your AI and data science career or are already an established data scientist, we have training modules for you. Workshops and trainings are structured to go as broad or as in-depth on a topic as you feel comfortable with.
Better Networking: Connect with top-level instructors and network with your peers. We will also have an on-site career fair with many companies looking to hire.

Workshops & Trainings

We have 16 workshops (1.5-hour) and 28 trainings (4-hour) for you to choose from.

Our intensive, hands-on training will accelerate your knowledge of machine learning, deep learning, data visualization and other in-demand data science topics.

Taught by some of the best minds in data science, these workshops will give you the knowledge and experience that are in rapidly growing demand across all industries.

Career Fair

Representatives from top data science companies will be seeking talented individuals at the Masterclass Summit. Stop by the career fair and bring your A-game. You could get hired in no time.

Interested in Hiring?

Contact Us

Partnering With ODSC

Last year, ODSC welcomed nearly 12,000 attendees to an unparalleled range of events, from large conferences to hackathons and small community gatherings.
Request 2017 Partnership Brief


Workshop & Training List

More workshops (1.5-hour) & trainings (4-hour) coming soon!

Introduction to Machine Learning with Andreas Mueller, core developer of scikit-learn

The resurging interest in machine learning is due to multiple factors, including growing volumes and varieties of data and cheaper computational processing. Together these make it possible to quickly and automatically produce models that can analyze bigger, more complex data and deliver faster, more accurate results on a very large scale. scikit-learn (http://scikit-learn.org/) has emerged as one of the most popular open source machine learning toolkits, now widely used in academia and industry.

scikit-learn provides easy-to-use interfaces in Python to perform advanced analysis and build powerful predictive models.


Andreas Mueller received his MS degree in Mathematics (Dipl.-Math.) in 2008 from the Department of Mathematics at the University of Bonn. In 2013, he completed his PhD thesis at the Institute for Computer Science at the University of Bonn. After working as a machine learning scientist at the Amazon Development Center Germany in Berlin for a year, he joined the Center for Data Science at New York University at the end of 2014. In his current position as assistant research engineer at the Center for Data Science, he works on open source tools for machine learning and data science. He has been one of the core contributors to scikit-learn, a machine learning toolkit widely used in industry and academia, for several years, and has authored and contributed to a number of open source projects related to machine learning.


This workshop will cover basic concepts of machine learning, such as supervised and unsupervised learning, cross-validation and model selection, and how they map to programming concepts in scikit-learn. Andreas will demonstrate how to prepare data for machine learning, and go from applying a single algorithm to building a machine learning pipeline. We will cover the trade-offs of learning on large datasets, and describe some techniques to handle larger-than-RAM and streaming data on a single machine.
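For readers new to cross-validation, here is a minimal plain-Python sketch of the idea. It is not scikit-learn code and not taken from the workshop materials: the `NearestCentroid` class, the `k_fold_accuracy` helper and the toy data are all hypothetical, written only to mirror the fit/predict convention that scikit-learn estimators follow.

```python
import random


class NearestCentroid:
    """Toy classifier with a scikit-learn-style fit/predict interface."""

    def fit(self, X, y):
        # Average the feature vectors of each class to get one centroid per label.
        sums, counts = {}, {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids_ = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, X):
        # Assign each row to the label of the closest centroid (squared distance).
        def closest(row):
            return min(
                self.centroids_,
                key=lambda label: sum(
                    (a - b) ** 2 for a, b in zip(row, self.centroids_[label])
                ),
            )

        return [closest(row) for row in X]


def k_fold_accuracy(make_model, X, y, k=5, seed=0):
    """Return the held-out accuracy of each of k folds."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint index sets
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        # Fit only on the training part, then score on the unseen fold.
        model = make_model().fit([X[i] for i in train], [y[i] for i in train])
        predictions = model.predict([X[i] for i in held_out])
        correct = sum(p == y[i] for p, i in zip(predictions, held_out))
        scores.append(correct / len(held_out))
    return scores


# Hypothetical toy data: two well-separated clusters in two dimensions.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
X += [[rng.gauss(10, 1), rng.gauss(10, 1)] for _ in range(50)]
y = [0] * 50 + [1] * 50

scores = k_fold_accuracy(NearestCentroid, X, y, k=5)
print("mean held-out accuracy:", sum(scores) / len(scores))
```

Each fold is scored by a model that never saw it during fitting, which is what makes the averaged score a fairer estimate than accuracy on the training data.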



Introduction to Deep Learning with TensorFlow Contributor and Kaggle Winner Dan Becker

Traditional machine learning applications often depend on a researcher handcrafting features. Deep learning has become hugely important because it allows a deep network to learn features by itself. By utilizing large-scale neural nets, deep learning allows computers to learn useful representations directly from data, with far less direct human intervention. Deep learning promises to revolutionize tasks such as image recognition, speech recognition, and other AI challenges.


Dan is the Technical Product Director at DataRobot. He has broad data science expertise, with consulting experience for six companies from the Fortune 100, a 2nd place finish in Kaggle’s $3 million Heritage Health Prize, and contributions to the Keras and TensorFlow libraries for deep learning. Dan has a PhD in Econometrics from the University of Virginia.


– Basic model set-up in Keras
– Multi-class classification
– Convolutional neural networks 
– Debugging deep learning models



Deep Learning with H2O Open Platform with Jo-fai (Joe) Chow

H2O is a fast, scalable, open-source machine learning and deep learning platform. Using in-memory compression techniques, H2O can handle billions of data rows in-memory, even on small compute clusters. The platform includes interfaces for R, Python, Scala, Java, JS and JSON, along with its interactive graphical Flow interface, which makes it easier for non-engineers to stitch together complete analytic workflows. H2O was built alongside (and on top of) both Hadoop and Spark clusters and can be deployed within minutes. It is a math and machine learning engine that brings distribution and parallelism to powerful algorithms, enabling you to make better predictions and build more accurate models faster.


Jo-fai (Joe) is a data scientist at H2O.ai. Before joining H2O, he was in the business intelligence team at Virgin Media where he developed data products to enable quick and smart business decisions. He also worked remotely for Domino Data Lab as a data science evangelist promoting products via blogging and giving talks at meetups.


  • Gridlines
  • Pipe Search
  • Deep Water



Step by Step Bot Construction - Micheleen Harris

This tutorial will ramp up the attendee very quickly on the Microsoft Bot Framework, providing sample code upon which to base a bot experience. In this tutorial the attendee will build their own intelligent bot. This tutorial will help attendees decide whether they want to build a bot to solve a repetitive task they have encountered, or one they know might be useful to others. User experience will be heavily emphasized to create the best bot experiences. Components will be laid out for the attendee.


Micheleen is a Data Scientist and trainer at Microsoft where she shares her Python, R and advanced analytics experience internally and externally. She has led or co-led workshops around data science and analytics concepts in Python and R, often utilizing Jupyter notebooks for interactive coding. Micheleen has developed a “Python for the Data Scientist” course delivered on Jupyter notebooks, has delivered it at Microsoft several times, and looks forward to its external release. She has also delivered courses utilizing Microsoft Azure and covering DocumentDB, Cognitive Services, the Bot Framework, as well as other components of the Cortana Intelligence Suite. She enjoys teaching/training and finding the most effective ways to teach data science and advanced analytics on any size dataset.


  1. Cognitive services overview
    1. What are Cognitive APIs
    2. Demos
  2. Introduction for Bot Framework Part
    • Syllabus
    • Learning objectives
  3. Bot Framework Overview
    1. What a bot is and is not
    2. The major components of the Bot Framework
    3. Deploying and working with channels
    4. Your arsenal or toolbox
  4. Developer’s Introduction and Building an intelligent bot with Bot Builder Node.js SDK
    1. Toolbox – Go over prereqs
    2. Setup project in VSCode (and set up debugger)
    3. Get code from course website with Git
    4. Update with Vision API key from Cognitive Services “My Account”
    5. Test with emulator
  5. Create more bots! Follow along or create your own
  6. Summary


There are a few things you will need in order to take full advantage of the course:

Please bring a laptop with internet connectivity.

  1. Node.js with npm installed locally – get the latest at:
  2. Visual Studio Code [recommended] or equivalent code editing and debugging environment with IntelliSense.
  3. Bot Framework Emulator (Windows and Unix-compatible) installed locally – information and links at
  4. GitHub Account – a code repository and collaboration tool we’ll use
  5. Git Bash – included in git download
  6. [Recommended] Azure account – use the one you have, sign up for a free trial at https://azure.microsoft.com/en-us/free/, or, if you have an MSDN account and Azure as a benefit, link your Microsoft Account or Work/School Account to MSDN and activate the Azure benefit by following this guide

We will assume you already have the following background:

  1. Basic knowledge around using and navigating in a unix-style command line or terminal (for using Git Bash) (good basic guide at http://linuxcommand.org/lc3_learning_the_shell.php)
  2. Familiarity with Git and GitHub as tools for software development, versioning and collaboration. (great book on Git at https://git-scm.com/book/en/v2)
  3. Familiarity with debugging bots in VSCode, as covered in the docs at https://docs.botframework.com/en-us/node/builder/guides/debug-locally-with-vscode/
  4. If you are new to Node, here’s a good video tutorial series at https://www.youtube.com/playlist?list=PL6gx4Cwl9DGBMdkKFn3HasZnnAqVjzHn_

Modeling in R with Jared Lander, Chief Data Scientist of Lander Analytics and Author of R for Everyone

At one point the open source language R was considered the lingua franca of data science programming. Although more languages now compete for that title, R still has a very passionate following. In fact, one of the main strengths of R is its huge community, which provides open source user-contributed packages (CRAN), documentation and a very active user support group. R packages are collections of R functions and data that make it easy to get immediate access to the latest techniques and functionality without needing to develop everything from scratch yourself.


Jared Lander is the Chief Data Scientist at Lander Analytics, Columbia Professor, Author of R for Everyone and Organizer of the World’s Largest R Meetup.


The linear model, and its extensions, forms the backbone of statistical analysis. In this course we cover Linear Regression using `lm`, Generalized Linear Models using `glm` and model assessment using `AIC`, `BIC` and other measures. The focus will be mainly on applied programming, though theoretical properties and derivations will be taught where appropriate. Attendees should already have a basic knowledge of linear models and have R and RStudio installed, along with the `UsingR`, `ggplot2` and `coefplot` packages.

Linear Models:

  • Learn about the best fit line
  • Understand the formula interface in R
  • Understand the design matrix
  • Fit models with `lm`
  • Visualize the coefficients with `coefplot`
  • Make predictions on new data

Generalized Linear Models:

  • Learn about Logistic Regression for classification
  • Learn about Poisson Regression for count data
  • Fit models with `glm`
  • Visualize the coefficients with `coefplot`
  • Model Assessment
  • Compare models
  • `AIC`, `BIC`
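The course itself works in R with `lm`, but the best fit line and AIC-based model comparison can be illustrated without any libraries. The following is a rough plain-Python sketch under stated assumptions: the helper names `fit_line` and `gaussian_aic` and the five data points are hypothetical, and the AIC formula assumes Gaussian errors with the variance set to its maximum-likelihood estimate.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def gaussian_aic(xs, ys, intercept, slope, n_params):
    """AIC = 2k - 2 log L for a Gaussian model with variance RSS / n.

    n_params (k) counts every estimated quantity, including the variance.
    """
    n = len(xs)
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    log_likelihood = -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)
    return 2 * n_params - 2 * log_likelihood


# Hypothetical toy data that roughly follows y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(xs, ys)

# Predict on new data with the fitted line.
new_x = 6.0
prediction = intercept + slope * new_x

# Compare the line (intercept, slope, variance) with an intercept-only
# model (mean, variance); the lower AIC is preferred.
aic_line = gaussian_aic(xs, ys, intercept, slope, n_params=3)
aic_flat = gaussian_aic(xs, ys, sum(ys) / len(ys), 0.0, n_params=2)
print(intercept, slope, prediction, aic_line, aic_flat)
```

AIC rewards a better fit but charges a penalty of 2 for each estimated parameter, so the extra slope term has to earn its place.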



Deploying and Scaling Spark ML and TensorFlow AI Models with Chris Fregly, Research Scientist, Contributor, Author and Trainer

Apache Spark is becoming increasingly popular for big data science projects. Spark can handle large volumes of data significantly faster and more easily than many other platforms, and it includes tools for real-time processing, machine learning, and interactive SQL. It is quickly being adopted by industry to achieve business objectives that need data and data science at scale.


Chris Fregly is a Research Scientist at Pipeline.IO. Chris is also the founder of the global Advanced Apache Spark Meetup and author of the upcoming book, Advanced Spark @ advancedspark.com. Previously, Chris was a Data Solutions Engineer at Databricks and a Streaming Data Engineer at Netflix. When Chris isn’t contributing to Spark and other open source projects, he’s creating book chapters, slides, and demos to share knowledge with his peers at meetups and conferences throughout the world.


In this workshop, we will train, deploy, and scale Spark ML and TensorFlow AI models in a distributed, hybrid-cloud and on-premise production environment. We will use 100% open source tools including TensorFlow, Spark ML, Jupyter Notebook, Docker, Kubernetes, and NetflixOSS Microservices. We will discuss the trade-offs of mutable vs. immutable model deployments, on-the-fly JVM byte-code generation, global request batching, microservice circuit breakers, and dynamic cluster scaling – all from within a Jupyter notebook. All code and Docker images are 100% open source and available from GitHub and DockerHub at http://pipeline.io.



San Francisco Airport Marriott Waterfront

Book a Discounted Hotel Room
1800 Old Bayshore Highway, Burlingame, CA 94010
Open Data Science
One Broadway
Cambridge, MA 02142
