ODSC Webinar Calendar

ODSC’s free webinar series educates our community on the languages, tools, and topics of AI and Data Science.


ODSC West Warm-Up

October 23rd, 2018

starting at 1:00 PM PT

4 speakers from our upcoming ODSC West conference
30-minute sessions


Matthew Rubashkin, Ph.D. AI Program Director at Insight Data Science

Building an image search service from scratch

 


This workshop shows how to build your own representations for both image and text data and how to run efficient similarity search over them. By the end of the workshop, you should be able to build a quick semantic search model from scratch, no matter the size of your dataset.
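As a rough illustration of the similarity-search step described above (not the presenter's workshop material), here is a minimal sketch that ranks items by cosine similarity between embedding vectors. The file names and the three-dimensional vectors are hypothetical placeholders; in practice the vectors would come from a trained image or text model, and a large dataset would need an approximate nearest-neighbor index rather than this brute-force scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def search(query, index, k=2):
    """Return the names of the k items most similar to the query vector."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Hypothetical embeddings, made up for illustration only.
index = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "kitten.jpg": [0.8, 0.2, 0.1],
    "truck.jpg": [0.0, 0.1, 0.9],
}

results = search([1.0, 0.0, 0.0], index, k=2)
print(results)
```

The brute-force scan compares the query against every stored vector, which is fine for a demonstration but grows linearly with the size of the collection.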

Presenter Bio - Matthew Rubashkin, Ph.D. AI Program Director at Insight Data Science


Michael Mahoney, PhD, Professor at UC Berkeley

Matrix Algorithms at Scale: Randomization and using Alchemist to bridge the Spark-MPI gap

 


In this talk we will describe some of the underlying randomized linear algebra techniques. We’ll then describe Alchemist, a system for interfacing between Spark and existing MPI libraries that is designed to address the Spark-MPI performance gap. The libraries can be called from a Spark application with little effort, and we illustrate how the resulting system leads to efficient and scalable performance on large datasets. We describe use cases from scientific data analysis that motivated the development of Alchemist and that benefit from this system. Finally, we’ll describe related work on communication-avoiding machine learning, optimization-based methods that can call these algorithms, and extending Alchemist to provide an IPython notebook <=> MPI interface.

Presenter Bio - Michael Mahoney, PhD, Professor at UC Berkeley

Michael Mahoney is at the University of California at Berkeley in the Department of Statistics and at the International Computer Science Institute (ICSI). He works on algorithmic and statistical aspects of modern large-scale data analysis. Much of his recent research has focused on large-scale machine learning, including randomized matrix algorithms and randomized numerical linear algebra, geometric network analysis tools for structure extraction in large informatics graphs, scalable implicit regularization methods, and applications in genetics, astronomy, medical imaging, social network analysis, and internet data analysis. He received his PhD from Yale University with a dissertation in computational statistical mechanics, and he has worked and taught at Yale University in the mathematics department, at Yahoo Research, and at Stanford University in the mathematics department. Among other things, he is on the national advisory committee of the Statistical and Applied Mathematical Sciences Institute (SAMSI), he was on the National Research Council’s Committee on the Analysis of Massive Data, he runs the biennial MMDS Workshops on Algorithms for Modern Massive Data Sets, and he spent fall 2013 at UC Berkeley co-organizing the Simons Foundation’s program on the Theoretical Foundations of Big Data Analysis.

George Williams, Director of Data Science at GSI Technology, Inc.

Visual Search: The Next Frontier of Search


In this session, the speakers will share their in-depth knowledge of state-of-the-art visual search research and techniques. You will learn how to scale your visual search solution to address the billion-scale problem and how to train models that provide more specific and accurate results for visually rich categories.

Presenter Bio - George Williams, Director of Data Science at GSI Technology, Inc.

 

Joshua Cook, Curriculum Developer at Databricks

Engineering for Data Science


This talk will discuss Docker as a tool for the data scientist, in particular in conjunction with the popular interactive programming platform Jupyter and the cloud computing platform Amazon Web Services (AWS). Using Docker, Jupyter, and AWS, the data scientist can take control of their environment configuration, prototype scalable data architectures, and trivially clone their work toward replicability and communication. This talk will work toward developing a set of best practices for Engineering for Data Science.

Presenter Bio - Joshua Cook, Curriculum Developer at Databricks

Joshua Cook is a mathematician. He writes code in Bash, C, and Python and has done pure and applied computational work in geospatial predictive modeling, quantum mechanics, semantic search, and artificial intelligence. He also has ten years' experience teaching mathematics at the secondary and post-secondary level. His research interests lie in high-performance computing, interactive computing, feature extraction, and reinforcement learning. He is always willing to discuss orthogonality or to explain why Fortran is the language of the future over a warm or cold beverage.

 

Nisha Talagala, CTO/VP of Engineering at ParallelM

Bringing Your Machine Learning and Deep Learning Algorithms to Life: From Experiments to Production Use


In this hands-on workshop, attendees will learn how to take Machine Learning and Deep Learning programs into a production use case and manage the full production lifecycle. This workshop is aimed at data scientists with some basic knowledge of Machine Learning and/or Deep Learning algorithms who would like to learn how to turn promising experimental results on ML and DL algorithms into production success. In the first half of the workshop, attendees will learn how to develop an ML algorithm in a Jupyter notebook and transition this algorithm into an automated production scoring environment using Apache Spark. They will then learn how to diagnose production scenarios for their application (for example, data and model drift) and optimize their ML performance further using retraining. In the second half of the workshop, attendees will perform a similar exercise for Deep Learning. They will learn how to experiment with Convolutional Neural Network algorithms in TensorFlow and then deploy their chosen algorithm into production use. They will also learn how to monitor the behavior of Deep Learning algorithms in production and how to optimize production DL behavior via retraining and transfer learning.

Attendees should have basic knowledge of ML and DL algorithm types. Deep mathematical knowledge of algorithm internals is not required. All experiments will use Python. Environments will be provided in Azure for hands-on use by all attendees. Each attendee will receive an account for use during the workshop and access to the notebook environments, Spark and TensorFlow engines, as well as an ML lifecycle management environment. For the ML experiments, sample algorithms and public data sets will be provided for Anomaly Detection and Classification. For the DL experiments, sample algorithms and public data sets will be provided for Image Classification and Text Recognition.

Presenter Bio - Nisha Talagala, CTO/VP of Engineering at ParallelM

Nisha Talagala is Co-Founder, CTO/VP of Engineering at ParallelM, a startup focused on Production Machine Learning. As Fellow at SanDisk and Fellow/Lead Architect at Fusion-io, she led advanced technology development in Non-Volatile Memory and applications. Nisha has more than 15 years of expertise in software, distributed systems, machine learning, persistent memory, and flash. Nisha was also technology lead for server flash at Intel and the CTO of Gear6. Nisha earned her PhD at UC Berkeley on distributed systems research. Nisha holds 54 patents, is a frequent speaker at both industry and academic conferences, and serves on multiple technical conference program committees.





Free access to ODSC talks and content is available at our AI Learning Accelerator.

ODSC EAST | Boston

– May 1-4, 2018 –

The World’s Largest Applied Data Science Conference

ODSC EUROPE | London

– Sept 19-22, 2018 –

Europe’s Fastest Growing Data Science Community

ODSC WEST | San Francisco

– Oct 31 - Nov 3, 2018 –

The World’s Largest Applied Data Science Conference

Accelerate AI

Business Conference

The Accelerate AI conference series is where executives and business professionals meet the best and brightest innovators in AI and Data Science. The conference brings together top industry executives and CxOs to help you understand how AI and data science will transform your business.

Accelerate AI East | Boston

– May 1 to 4, 2018 –

The ODSC summit on accelerating your business growth with AI

Accelerate AI Europe | London 

– Sept 19 to 22, 2018 –

The ODSC summit on accelerating your business growth with AI

Accelerate AI West | San Francisco 

– Oct 31 to Nov 3, 2018 –

The ODSC summit on accelerating your business growth with AI
Open Data Science Conference