Training Sessions

– Taught by World-Class Data Scientists –

Learn the latest data science concepts, tools and techniques from the best. Forge a connection with these rockstars from industry and academia, who are passionate about molding the next generation of data scientists.

Highly Experienced Instructors

Our instructors are highly regarded in data science, coming from both academia and notable companies.

Real World Applications

Gain the skills and knowledge to use data science in your career and business, without breaking the bank.

Cutting Edge Subject Matter

Find training sessions offered on a wide variety of data science topics from machine learning to data visualization.

ODSC Training Includes

Form a working relationship with some of the world’s top data scientists for follow-up questions and advice.

Additionally, your ticket includes access to 50+ talks and workshops.

High quality recordings of each session, exclusively available to premium training attendees.

Equivalent training at other conferences costs much more.

Professionally prepared learning materials, custom tailored to each course.

Opportunities to connect with other ambitious like-minded data scientists.

Training Sessions

Deep Learning with TensorFlow for Absolute Beginners with Lead Staff Developer Advocate at Google, Deep Learning TensorFlow & Machine Learning Expert, Kaz Sato

Bio

Kaz Sato is a Staff Developer Advocate on the Google Cloud team at Google Inc., focusing on Machine Learning and Data Analytics products such as TensorFlow, Cloud ML and BigQuery. Kaz has spoken at major events including Google Cloud Next SF, Google I/O and Strata NYC, authored many GCP blog posts, and led developer communities for Google Cloud for over 8 years. He is also interested in hardware and IoT, and has been hosting FPGA meetups since 2013.

Abstract

When I opened a neural network textbook and saw a bunch of math formulas, I felt like “this is not for me”. But wait, TensorFlow now provides a high-level API that lets you write a few lines of Python code to get started with neural networks, without understanding the hard math. Try this codelab to see how machine learning works on your laptop.

This codelab is designed as an easy TensorFlow introduction for non-ML experts. All you need to know is how to use Python. It takes about 2 to 3 hours to go through all the sections.

Curriculum

TBD

Prerequisites

TBD

Deep Learning with TensorFlow for Absolute Beginners with Head of Data Science at Datatonic, Matthias Feys

Bio

Matthias is a data geek, passionate about gaining insights from data. By day, as head of data science at Datatonic, he helps corporations unleash the power of data using Google Cloud Platform with custom Machine Learning solutions. By night, Matthias is a co-organizer of GDG Cloud Belgium and TensorFlow Belgium, and is an active participant in other data-related meetups. He shares his knowledge by organizing and presenting talks and free training sessions.

Abstract

When I opened a neural network textbook and saw a bunch of math formulas, I felt like “this is not for me”. But wait, TensorFlow now provides a high-level API that lets you write a few lines of Python code to get started with neural networks, without understanding the hard math. Try this codelab to see how machine learning works on your laptop.

This codelab is designed as an easy TensorFlow introduction for non-ML experts. All you need to know is how to use Python. It takes about 2 to 3 hours to go through all the sections.

Curriculum

TBD

Prerequisites

TBD

An Introduction to Python in Data Science with Owner of Global Sports Statistics, Robert Mastrodomenico, PhD

Bio

Rob studied for a BSc in Mathematics and Statistics at the University of Reading. Thereafter, having particularly enjoyed the Statistics side of his degree, he stayed on at Reading to study for a Statistics PhD. Following the completion of his studies, he worked as a Quantitative Analyst at a statistical consultancy which undertook statistical research and provided sports modeling services in the betting sector. In 2011 he started up his own company called Global Sports Statistics and has continued working in the sports modeling sector. His interest in Python really began back in 2010 when he was looking at alternatives to R, and since then it has become his language of choice. Rob conducts Python training courses for the Royal Statistical Society and has seen his Introduction to Python Course feature on Sages online training.

Abstract

Python is awesome. I may be biased, but it can do loads of things, and Data Science is one of them. With Data Science being such a wide subject, we will concentrate on dealing with and manipulating data in Python.

We begin with a whistle-stop tour of the core Python library to give attendees a sense of how Python works and how to do the basics.
Next we expand on that by introducing the Pandas library and showing how it allows us to manipulate and deal with data.
The culmination of this session will be a data analysis task putting together everything covered.

Note that this course is very much intended for beginners; those with Python experience may not benefit from what is taught.
All attendees will be required to have installed Python 3 (https://www.python.org/downloads/) and the Anaconda distribution (https://www.anaconda.com/download/#macos) prior to the course; this will contain all packages used through the session. Attendees will also be required to download the file http://www.football-data.co.uk/mmz4281/1617/E0.csv for use in the data analysis.
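
As a rough idea of the kind of data handling the session builds towards, the sketch below tallies full-time results using only the standard library `csv` module. It is not taken from the course materials. The column names (`HomeTeam`, `AwayTeam`, `FTHG`, `FTAG`, `FTR`) are an assumption about the layout of the downloaded file, and the three rows are invented placeholders rather than real results; in the session itself the Pandas library would do this work.

```python
import csv
import io
from collections import Counter

# Placeholder rows standing in for E0.csv; teams and scores are invented.
# The column names are an assumption about the downloaded file's layout.
SAMPLE = """HomeTeam,AwayTeam,FTHG,FTAG,FTR
Team A,Team B,2,1,H
Team C,Team A,0,0,D
Team B,Team C,1,3,A
"""


def result_counts(csv_file):
    """Count home wins (H), draws (D) and away wins (A)."""
    reader = csv.DictReader(csv_file)
    return Counter(row["FTR"] for row in reader)


def total_goals(csv_file):
    """Sum home and away goals over every match in the file."""
    reader = csv.DictReader(csv_file)
    return sum(int(row["FTHG"]) + int(row["FTAG"]) for row in reader)


counts = result_counts(io.StringIO(SAMPLE))
goals = total_goals(io.StringIO(SAMPLE))
print(counts["H"], counts["D"], counts["A"], goals)
```

To try it on the real download, replace `io.StringIO(SAMPLE)` with `open("E0.csv", newline="")`, assuming the file sits in the working directory.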

Curriculum

TBD

Prerequisites

TBD

Machine Learning in R Part I with Statistics Professor at Columbia University and Author of R for Everyone, Jared Lander

Bio

Jared Lander is the Chief Data Scientist of Lander Analytics, a data science consultancy based in New York City, the Organizer of the New York Open Statistical Programming Meetup and the New York R Conference, and an Adjunct Professor of Statistics at Columbia University. With a master's from Columbia University in statistics and a bachelor's from Muhlenberg College in mathematics, he has experience in both academic research and industry. His work for both large and small organizations ranges from music and fundraising to finance and humanitarian relief efforts.

He specializes in data management, multilevel models, machine learning, generalized linear models and statistical computing. He is the author of R for Everyone: Advanced Analytics and Graphics, a book about R Programming geared toward Data Scientists and Non-Statisticians alike, and is creating a course on glmnet with DataCamp.

Abstract

Modern statistics has become almost synonymous with machine learning, a collection of techniques that utilize today’s incredible computing power. This two-part course focuses on the available methods for implementing machine learning algorithms in R, and will examine some of the underlying theory behind the curtain. We start with the foundation of it all, the linear model, and its generalization, the glm. We look at how to assess model quality with traditional measures and cross-validation, and how to visualize models with coefficient plots. Next we turn to penalized regression with the Elastic Net. After that we turn to Boosted Decision Trees utilizing xgboost. Attendees should have a good understanding of linear models and classification and should have R and RStudio installed, along with the `glmnet`, `xgboost`, `boot`, `ggplot2`, `UsingR` and `coefplot` packages.

Linear Models
Learn about the best fit line
Understand the formula interface in R
Understand the design matrix
Fit Models with `lm`
Visualize the coefficients with `coefplot`
Make predictions on new data

Generalized Linear Models
Learn about Logistic Regression for classification
Learn about Poisson Regression for count data
Fit models with `glm`
Visualize the coefficients with `coefplot`

Model Assessment
Compare models
`AIC`
`BIC`

Cross-validation
Learn the reasoning and process behind cross-validation

Elastic Net
Learn about penalized regression with the Lasso and Ridge
Fit models with `glmnet`
Understand the coefficient path
View coefficients with `coefplot`

Boosted Decision Trees
Learn how to make classifications (and regression) using recursive partitioning
Fit models with `xgboost`
Make compelling visualizations with `DiagrammeR`
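
The course itself works in R with `lm`. Purely as an illustration of the first topic in the outline above, the best fit line, here is a minimal Python sketch of ordinary least squares with a single predictor. The function names and the toy data are hypothetical and are not taken from the course materials.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, new_xs):
    """Apply the fitted line to new x values."""
    return [intercept + slope * x for x in new_xs]


# Toy data lying exactly on y = 1 + 2x, so the fit should recover it.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_line(xs, ys)
preds = predict(intercept, slope, [5.0])
print(intercept, slope, preds)
```

With several predictors the same idea is expressed through the design matrix mentioned in the outline, which is what a fitting routine such as `lm` builds from the formula.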

Curriculum

TBD

Prerequisites

TBD

Machine Learning in R Part II with Statistics Professor at Columbia University and Author of R for Everyone, Jared Lander

Bio

Jared Lander is the Chief Data Scientist of Lander Analytics, a data science consultancy based in New York City, the Organizer of the New York Open Statistical Programming Meetup and the New York R Conference, and an Adjunct Professor of Statistics at Columbia University. With a master's from Columbia University in statistics and a bachelor's from Muhlenberg College in mathematics, he has experience in both academic research and industry. His work for both large and small organizations ranges from music and fundraising to finance and humanitarian relief efforts.

He specializes in data management, multilevel models, machine learning, generalized linear models and statistical computing. He is the author of R for Everyone: Advanced Analytics and Graphics, a book about R Programming geared toward Data Scientists and Non-Statisticians alike, and is creating a course on glmnet with DataCamp.

Abstract

Modern statistics has become almost synonymous with machine learning, a collection of techniques that utilize today’s incredible computing power. This two-part course focuses on the available methods for implementing machine learning algorithms in R, and will examine some of the underlying theory behind the curtain. We start with the foundation of it all, the linear model, and its generalization, the glm. We look at how to assess model quality with traditional measures and cross-validation, and how to visualize models with coefficient plots. Next we turn to penalized regression with the Elastic Net. After that we turn to Boosted Decision Trees utilizing xgboost. Attendees should have a good understanding of linear models and classification and should have R and RStudio installed, along with the `glmnet`, `xgboost`, `boot`, `ggplot2`, `UsingR` and `coefplot` packages.

Linear Models
Learn about the best fit line
Understand the formula interface in R
Understand the design matrix
Fit Models with `lm`
Visualize the coefficients with `coefplot`
Make predictions on new data

Generalized Linear Models
Learn about Logistic Regression for classification
Learn about Poisson Regression for count data
Fit models with `glm`
Visualize the coefficients with `coefplot`

Model Assessment
Compare models
`AIC`
`BIC`

Cross-validation
Learn the reasoning and process behind cross-validation

Elastic Net
Learn about penalized regression with the Lasso and Ridge
Fit models with `glmnet`
Understand the coefficient path
View coefficients with `coefplot`

Boosted Decision Trees
Learn how to make classifications (and regression) using recursive partitioning
Fit models with `xgboost`
Make compelling visualizations with `DiagrammeR`

Curriculum

TBD

Prerequisites

TBD

Algorithmic Trading with Machine & Deep Learning with Lecturer, Author of Derivatives Analytics with Python and Python for Finance, and Founder of PyQuants, Yves Hilpisch

Bio

Dr. Yves J. Hilpisch is founder and managing partner of The Python Quants (http://tpq.io), a group focusing on the use of open source technologies for financial data science, algorithmic trading and computational finance. He is the author of the books

* Python for Finance (O’Reilly, 2014),
* Derivatives Analytics with Python (Wiley, 2015) and
* Listed Volatility and Variance Derivatives (Wiley, 2017).

Yves lectures on computational finance at the CQF Program (http://cqf.com) and on data science at htw saar University of Applied Sciences (http://htwsaar.de), and is the director of the first online training program leading to a Python for Algorithmic Trading University Certificate (awarded by htw saar).

Yves has written the financial analytics library DX Analytics (http://dx-analytics.com) and organizes meetups and conferences about Python for quantitative finance in Frankfurt, London and New York. He has given keynote speeches at technology conferences in the United States, Europe and Asia.

Abstract

This workshop illustrates the use of machine and deep learning algorithms for classification in the context of predicting stock market movements. The workshop shows that there are parallels between building self-driving cars and deploying automated algorithmic trading strategies.

Curriculum

TBD

Prerequisites

TBD

PipelineAI's High Performance, Distributed Spark ML, TensorFlow AI with Founder and Research Scientist at PipelineAI, Apache Spark Contributor, Author of the upcoming book, Advanced Spark, Chris Fregly

Bio

Chris Fregly is Founder and Research Engineer at PipelineAI, a Streaming Machine Learning and Artificial Intelligence Startup based in San Francisco. He is also an Apache Spark Contributor, a Netflix Open Source Committer, founder of the Global Advanced Spark and TensorFlow Meetup, and author of the O’Reilly Training and Video Series titled “High Performance TensorFlow in Production.” Previously, Chris was a Distributed Systems Engineer at Netflix, a Data Solutions Engineer at Databricks, and a Founding Member and Principal Engineer at the IBM Spark Technology Center in San Francisco.

Abstract

We will each build an end-to-end, continuous TensorFlow AI model training and deployment pipeline on our own GPU-based cloud instance. At the end, we will combine our cloud instances to create the LARGEST Distributed TensorFlow AI Training and Serving Cluster in the WORLD!

Curriculum

Spark ML
TensorFlow AI
Storing and Serving Models with HDFS
Trade-offs of CPU vs. GPU, Scale Up vs. Scale Out
CUDA + cuDNN GPU Development Overview
TensorFlow Model Checkpointing, Saving, Exporting, and Importing
Distributed TensorFlow AI Model Training (Distributed TensorFlow)
TensorFlow’s Accelerated Linear Algebra Framework (XLA)
TensorFlow’s Just-in-Time (JIT) Compiler, Ahead of Time (AOT) Compiler
Centralized Logging and Visualizing of Distributed TensorFlow Training (TensorBoard)
Distributed TensorFlow AI Model Serving/Predicting (TensorFlow Serving)
Centralized Logging and Metrics Collection (Prometheus, Grafana)
Continuous TensorFlow AI Model Deployment (TensorFlow, Airflow)
Hybrid Cross-Cloud and On-Premise Deployments (Kubernetes)
High-Performance and Fault-Tolerant Micro-services (NetflixOSS)
More Info including GitHub and Docker Repos
http://pipeline.ai

Prerequisites

Just a modern browser, internet connection, and a good night’s sleep! We’ll provide the rest.

A Gentle Introduction to Predictive Analytics with R with Dr. Colin Gillespie, Author of Efficient R Programming, R Trainer/Consultant, and Senior Lecturer at Newcastle University

Bio

Dr. Colin Gillespie is a Senior Lecturer (Associate Professor) at Newcastle University, UK. His research interests are high performance statistical computing and Bayesian statistics. He is regularly employed as a consultant by Jumping Rivers and has been teaching R since 2005 at a variety of levels, ranging from beginners to advanced programming.

Abstract

This course aims to give a gentle introduction to the ideas behind some of the standard algorithms used in predictive analytics. The goal is to dispel the “magic” around common methods such as the lasso, logistic regression, naive Bayes and others. The workshop will consist of a combination of programming (with R) and lectures. If you’ve ever wondered how these algorithms work, then this session is for you.
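
To give a flavour of how little “magic” is involved in one of the methods named above, here is a minimal sketch of logistic regression with one predictor, fitted by plain gradient descent. The workshop itself uses R; this Python version is only an illustration and is not drawn from the course materials. The function names, the toy data, the learning rate and the number of iterations are all assumptions chosen for the example.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(b0 + b1 * x) by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the average negative log-likelihood.
        errors = [sigmoid(b0 + b1 * x) - y for x, y in zip(xs, ys)]
        g0 = sum(errors) / n
        g1 = sum(e * x for e, x in zip(errors, xs)) / n
        b0 -= lr * g0
        b1 -= lr * g1
    return b0, b1


def predict_proba(b0, b1, x):
    """Predicted probability that y = 1 at the given x."""
    return sigmoid(b0 + b1 * x)


# Toy data: larger x values are more likely to be labelled 1,
# with some overlap so the classes are not perfectly separable.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_logistic(xs, ys)
low = predict_proba(b0, b1, 0.5)
high = predict_proba(b0, b1, 4.0)
print(b0, b1, low, high)
```

The fitted slope `b1` comes out positive, so the predicted probability rises with `x`; a library routine would reach a similar fit with a faster optimiser.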

Curriculum

TBD

Prerequisites

TBD

Graph Data - Modelling and Querying with Neo4j and Cypher with Software Engineer at PRODYNA AG and Python Software Developer Iryna Feurstein

Bio

Iryna is a passionate graphista and co-organiser of the Graph Database NRW Meetup. Her data science experience includes serving as an IT-Consultant and Software Engineer at PRODYNA AG in Düsseldorf, Germany, where she has been involved in different customer projects based on graph models and graph databases.

Previously, Iryna served as a Python Software Developer at University Library TU München, Germany, where she was responsible for the development and maintenance of mediaTUM, a media and publications repository. Additionally, Iryna worked as a Mathematical Technical Software Developer (IHK) at Technische Universität München.

Abstract

Neo4j is an innovative open-source NoSQL database. It is based on graph theory, in contrast to relational databases, which are based on set theory. This half-day course teaches the core functionality of Neo4j. By means of a fancy but realistic use case, participants learn how to leverage the power of graph databases through Cypher, the graph query language. The session covers querying graph patterns with Cypher, using some basic graph algorithms, as well as designing and implementing a graph database model.

Curriculum

TBD

Prerequisites

TBD

Sign Up for ODSC EUROPE 2017 | October 12-14

Register Now