ODSC EAST 2023
Conference Preliminary Schedule
more sessions added weekly
East 2023 Schedule
We are delighted to announce our East 2023 Preliminary Schedule!
Note: In-person sessions will not be recorded. If you have a Virtual Pass, please note that we will not live-stream any in-person sessions. Only virtual sessions will be recorded.
Virtual | Bootcamp | Machine Learning | MLOps | Intermediate
In this training, you will learn how to accelerate your data analyses using the Python language and Pandas, a library specifically designed for tabular data analysis. We start by learning the core Pandas data structures, the Series and DataFrame. From these foundations, we will learn to use the split-apply-combine paradigm for grouped computations, manipulate time series, and perform advanced joins between datasets. Specifically, we will cover loading, filtering, grouping, and transforming data. Having completed this workshop, you will understand the fundamentals and advanced features of Pandas, be aware of common pitfalls, and be ready to perform your own analyses…more details
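As a rough illustration of the split-apply-combine paradigm mentioned above, here is a minimal sketch. The `sales` table and its column names are invented for this example and are not taken from the training materials.

```python
import pandas as pd

# Hypothetical sales records, invented for illustration only.
sales = pd.DataFrame(
    {
        "region": ["east", "east", "west", "west", "west"],
        "units": [10, 20, 5, 15, 10],
    }
)

# Split the rows by region, apply a sum to each group,
# and combine the per-group results into one Series.
units_by_region = sales.groupby("region")["units"].sum()

# transform() returns a result aligned with the original rows, which makes
# it easy to compute each row's share of its group total.
region_totals = sales.groupby("region")["units"].transform("sum")
sales["share_of_region"] = sales["units"] / region_totals

print(units_by_region)
print(sales)
```

The same pattern extends to other aggregations (mean, count, custom functions) by swapping the function applied to each group.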
Virtual | Bootcamp | Machine Learning | Beginner
Data wrangling is an essential foundational topic for anyone considering a role in data engineering, data science, or machine learning. This session will help you understand core data wrangling concepts including what data is, data generation and collection, data cleaning, profiling, transformation, and other essential data wrangling topics. As this is an interactive training session, in addition to covering these topics, we will layer on hands-on SQL training and an introduction to relational databases. As we journey through the data workflow we will use SQL to wrangle and transform the data as needed. SQL consistently makes the top 5 job requirements list for data scientists, data analysts, machine learning engineers, and other related data roles. The SQL standard is the universal go-to tool for manipulating structured data stores including relational databases…more details
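To give a flavor of using SQL to clean and transform data, the sketch below runs a small query against an in-memory SQLite database from Python. The `orders` table and its rows are hypothetical and chosen only to show trimming, case normalization, filtering of missing values, and aggregation; the session itself may use a different database and dataset.

```python
import sqlite3

# In-memory database with a hypothetical, deliberately messy table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(" Alice ", 20.0), ("alice", 5.0), ("Bob", None), ("bob", 12.5)],
)

# Clean: trim whitespace and lowercase the names, drop rows with no amount.
# Transform: total the amounts per cleaned customer name.
rows = conn.execute(
    """
    SELECT LOWER(TRIM(customer)) AS customer_name, SUM(amount) AS total
    FROM orders
    WHERE amount IS NOT NULL
    GROUP BY LOWER(TRIM(customer))
    ORDER BY total DESC
    """
).fetchall()
conn.close()

print(rows)
```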
In-person | Workshop | NLP | Machine Learning | Beginner-Intermediate
In this workshop, you’ll walk through a complete end-to-end example of using Hugging Face Transformers, involving both our open-source libraries and some of our commercial products. Starting from a dataset containing real-life product reviews from Amazon.com, you’ll train and deploy a text classification model predicting the star rating for similar reviews…more details
In-person | Tutorial | Machine Learning | Intermediate
This tutorial is targeted towards Data Scientists and machine learning engineers who work on machine learning and deep learning models.
Given a task, one is interested in finding a well-performing model to solve that task. Very often, this would involve tweaking the model either by changing the hyperparameters or modifying its architecture in order to find a better-performing model. In the past, this was always done manually. With the advent of Automated Machine Learning, we can now leave that to the machines. In this tutorial, we will provide an overview of Hyperparameter Optimization (HPO) and Neural Architecture Search (NAS)…more details
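One of the simplest forms of automated hyperparameter optimization is random search, sketched below. This is only an illustrative baseline, not the method taught in the tutorial. The `validation_loss` function is a hypothetical stand-in for training a model and measuring its validation loss, and the search ranges are arbitrary assumptions.

```python
import random


def validation_loss(learning_rate, num_layers):
    # Hypothetical stand-in for "train a model, return its validation loss".
    # The optimum is placed at learning_rate=0.01 and num_layers=3.
    return (learning_rate - 0.01) ** 2 * 1e4 + (num_layers - 3) ** 2


def random_search(num_trials, seed=0):
    """Sample configurations at random and keep the best one seen."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(num_trials):
        config = {
            # Sample the learning rate on a log scale between 1e-4 and 1e-1.
            "learning_rate": 10 ** rng.uniform(-4, -1),
            "num_layers": rng.randint(1, 6),
        }
        loss = validation_loss(**config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


best_config, best_loss = random_search(num_trials=50)
print(best_config, best_loss)
```

More sophisticated HPO methods replace the random sampling step with a strategy that uses the results of earlier trials to choose the next configuration.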
Virtual | Tutorial | Responsible AI | Machine Learning | All Levels
Data Cards are transparency artifacts that provide structured summaries of ML datasets with explanations of processes and rationale that shape the data. They also describe how the data may be used to train or evaluate ML models. In practice, two critical factors determine the success of a transparency artifact: (1) the ability to identify the information decision-makers use and (2) the establishment of processes and guidance needed to acquire that information. To initiate practice-oriented foundations in transparency that support responsible AI development in cross-functional groups and organizations, we created the Data Cards Playbook — an open-sourced, self-service, comprehensive toolkit consisting of participatory activities, frameworks, and guidance designed to address specific challenges faced by teams, product areas and companies when setting up an AI dataset transparency effort…more details
Virtual | Full-Day Training | NLP | Beginner
In this course we will go through Natural Language Processing fundamentals, such as pre-processing techniques, tf-idf, embeddings, and more. It will be followed by practical coding examples, in Python, to teach how to apply the theory to real use cases…more details
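For readers unfamiliar with tf-idf, the sketch below computes one common unsmoothed variant (term frequency times the log of inverse document frequency) with whitespace tokenization. The three example documents are invented, and libraries often use smoothed or normalized variants that give different numbers.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document (unsmoothed tf-idf)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append(
            {
                term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return scores


docs = ["the cat sat", "the dog barked", "the cat purred"]
scores = tf_idf(docs)
print(scores[0])
```

A term such as "the" that occurs in every document gets a score of zero, while rarer terms such as "sat" score higher.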
In-person | Workshop | NLP | Intermediate
Leaving this workshop, you will understand each of these topics, and you will have gained the practical, hands-on expertise to start integrating modern NLP in your domain. Participants will fine-tune and prompt engineer state-of-the-art models like BART and XLM-RoBERTa, and they will peer behind the curtain of world-shaking technologies like ChatGPT to understand their utility and architectures…more details
Virtual | Workshop | NLP | Machine Learning | Beginner-Intermediate
In this session, I will give some background on Conversational AI and NLP, along with self-supervised and unsupervised techniques. Transformer-based large language models (LLMs) such as GPT-3, Jurassic, and T5 have been foundational to the advances that we see. I will walk the audience through hands-on examples of how they can leverage transformers and large language models for few-shot and zero-shot learning in a variety of NLP applications such as text classification, summarization, and question answering…more details
In-person | Full-Day Training | Machine Learning Safety and Security | All Levels
This course outlines the typical fraud framework at an organization and where data science can play a role, and lays out how to build an analytically advanced fraud system. It then covers statistical and machine learning approaches to anomaly detection…more details
In-person | Workshop | Machine Learning | Beginner
Pandas can be tricky, and there is a lot of bad advice floating around. This tutorial will cut through some of the biggest issues I’ve seen with Pandas code after working with the library for a while and writing three books on it…more details
In-person | Tutorial | Deep Learning | Machine Learning | All Levels
Why should we try to unify the ML frameworks? Won’t we just create a new incompatible standard and make the ML fragmentation even worse? I will argue that the answer to these sensible and important questions is no…more details
Virtual | Workshop | Machine Learning | Deep Learning | Data Engineering & Big Data | Beginner-Intermediate
The amount of data being generated today is staggering and growing. Apache Spark has emerged as the de facto tool to analyze big data over the last few years and is now a critical part of the data science toolbox. This workshop will introduce you to the fundamentals of PySpark, Spark’s Python API, and other best practices in Spark programming…more details
In-person | Full-Day Training | Data Visualization & Data Analysis | Machine Learning | Intermediate-Advanced
The human brain excels at finding patterns in visual representations, which is why data visualizations are essential to any analysis. Done right, they bridge the gap between those analyzing the data and those consuming the analysis. However, learning to create impactful, aesthetically pleasing visualizations can often be challenging. This session will equip you with the skills to make customized visualizations for your data using Python…more details
In-person | Workshop | Machine Learning | Deep Learning | Data Engineering | NLP | Beginner
In this workshop, we will cover how to build machine learning web applications using the Gradio (www.gradio.dev) library…more details
In-person | Half-Day Training | Machine Learning | Intermediate
The workshop participants will then get a chance to complete a set of tasks revolving around the various optimization techniques and observe the outcomes. The tasks will include hyperparameter optimization for a deep neural network and optimization of the parameters of one ensemble model (Random Forest)…more details
In-person | Half-Day Training | Machine Learning | Intermediate-Advanced
Gradient Boosting remains the most effective method for classification and regression problems on tabular data. This session is Part Two of two, covering advanced topics that are newer and may be less familiar. First, we will discuss how to calibrate the probabilities of classification models, reviewing the major techniques. Next, we will discuss Probabilistic Regression, wherein the goal is to predict the full probability distribution of the numerical target given the features, demonstrating different approaches to this problem. Finally, we will present tools for Conformal Prediction – a hot topic which can provide prediction intervals with strong theoretical guarantees…more details
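To make the idea of conformal prediction intervals concrete, here is a minimal sketch of split conformal prediction using absolute residuals from a held-out calibration set. This is one basic variant and is not necessarily what the session's tools implement. The residual values and the point prediction are invented for illustration.

```python
import math


def conformal_quantile(calibration_residuals, alpha):
    """Residual quantile that targets (1 - alpha) coverage.

    Uses the finite-sample rank ceil((n + 1) * (1 - alpha)); if that rank
    exceeds n, the interval has to be unbounded.
    """
    n = len(calibration_residuals)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(calibration_residuals)[rank - 1]


def prediction_interval(point_prediction, quantile):
    """Symmetric interval around a model's point prediction."""
    return (point_prediction - quantile, point_prediction + quantile)


# Hypothetical |y - y_hat| values measured on a held-out calibration set.
residuals = [0.3, 1.2, 0.7, 2.5, 0.1, 1.9, 0.4, 3.0, 0.9, 1.5]
q = conformal_quantile(residuals, alpha=0.2)
interval = prediction_interval(10.0, q)
print(q, interval)
```

The coverage guarantee for this variant relies on the calibration data and new data being exchangeable, and on the calibration set not having been used to fit the model.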
In-person | Half-Day Training | Machine Learning | Intermediate-Advanced
Gradient Boosting remains the most effective method for classification and regression problems on tabular data. This session is Part One of two. We will start with the fundamentals of how boosting works and best practices for model building and hyperparameter tuning. Next, we will discuss how to interpret the model, understanding what features are important generally and for a specific prediction. Finally, we will discuss how to exploit categorical structure, when the different values of a categorical variable have a known relationship to one another…more details
In-person | Workshop | Machine Learning | Intermediate
This workshop will show how to use XGBoost. It will demonstrate model creation, model tuning, model evaluation, and model interpretation…more details
In-person | Tutorial | Deep Learning | All Levels
Deepfake photos and videos are already impacting many industries and sectors of society, in both positive and negative ways. In this session I’ll weave between the social context of deepfakes (how they’ve been used and what impact they’ve had) and the technical side of them (how they’re made, and some approaches to detecting them). This is the multifaceted story of deepfakes…more details
Virtual | Workshop | NLP | Beginner-Intermediate
This workshop will equip participants with the skills and knowledge to conduct adversarial evaluation of NLP systems. Through active exercises and examples, we will discuss how to identify and address system weaknesses and explore how this approach can improve accuracy, reduce risk, and uncover potential blind spots. Participants will gain a greater understanding of how to use adversarial evaluation to detect and prevent errors in their NLP systems…more details
In-person | Full-Day Training | Deep Learning | Machine Learning | Beginner-Intermediate
Obscure until recently, Deep Learning is ubiquitous today across data-driven applications as diverse as machine vision, natural language processing, generative A.I., and superhuman game-playing.
This training is an introduction to Deep Learning that brings high-level theory to life with interactive examples featuring PyTorch, TensorFlow 2, and Keras — all three of the principal Python libraries for Deep Learning. Essential theory will be covered in a manner that provides students with a complete intuitive understanding of Deep Learning’s underlying foundations…more details
Virtual | Tutorial | Data Visualization & Data Analysis | All Levels
Amid the explosion of interactive data visualization options, Matplotlib remains the go-to for quick data visualization for researchers working in Python, either directly or via libraries like Pandas and Seaborn. It also remains one of the most robust and flexible tools for customizing static visualizations. This tutorial will discuss ways to level up your quick plots for use in published papers, automated reporting, and any other scenario where crisp and/or custom visualizations are called for…more details
In-person | Workshop | Deep Learning | NLP | Machine Learning | Beginner-Intermediate
Recommendation describes suggesting, or recommending, items tailored to a particular user. As generative AI creates an explosion of digital content, personalization will be more important than ever! Whether the application is sneaker designs, blog posts, or even pre-trained machine learning model weights, most recommendation tasks have a similar underlying structure. We need some way to represent items and users, typically as vectors, as well as a way to index them for fast computation. We also need to design intuitive APIs that interface the recommendation system to application developers. Weaviate is an open-source vector search database that has many unique search and database features…more details
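The underlying structure described above (items and users as vectors, ranked by similarity) can be sketched with a brute-force cosine similarity search. This does not use Weaviate's API; the item names and vectors are invented, and a vector database would typically replace the exhaustive scan with an index built for fast approximate search.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(user_vector, item_vectors, k):
    """Return the ids of the k items most similar to the user vector."""
    ranked = sorted(
        item_vectors,
        key=lambda item_id: cosine_similarity(user_vector, item_vectors[item_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical item embeddings and a hypothetical user embedding.
items = {
    "sneaker_a": [0.9, 0.1, 0.0],
    "blog_post_b": [0.1, 0.8, 0.3],
    "sneaker_c": [0.8, 0.2, 0.1],
}
user = [1.0, 0.1, 0.0]

top_items = recommend(user, items, k=2)
print(top_items)
```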
In-person | Workshop | Machine Learning | Intermediate-Advanced
In this workshop, I will explain the core principles of Scruff and the main programming concepts. I will then demonstrate how we used Scruff to create a tool for wildfire risk assessment and mitigation that includes climate models, historical fire data, and fire propagation simulators. Finally, we will work through a hands-on session of getting up and running with Scruff and implementing and running simple models…more details