May 9th - 11th, 2023
Machine Learning & Deep Learning
Learn the latest models, advancements, and trends from the top practitioners behind two of data science’s hottest topics
Comprising multiple tracks, this focus area is where leading experts in the rapidly expanding fields of Deep Learning and Machine Learning gather to discuss the latest advances, trends, and models in these exciting fields.
Attend talks, tutorials, and workshops and hear from the creators and top practitioners as they teach the latest models and trends in Machine Learning and Deep Learning to solve problems in business and society. Some of the topics you’ll learn include:
Machine Learning
Deep Learning
Deep Reinforcement Learning
Neural Networks
LSTMs, CNNs, RNNs, & GANs
Computer Vision
Pattern Recognition
TensorFlow
Scikit-learn
Keras
Caffe2
PyTorch
Theano
Apache Spark & MLlib
and many more…
Federated Learning
Transfer Learning
Autonomous Machines
MLOps and Kubeflow
Recommendation Systems
Never Ending Learning for ML
Causal Inference
Some of Our Past ML & DL Speakers

John Peach
A modern polymath, John holds advanced degrees in mechanical engineering, kinesiology and data science, with a focus on solving novel and ambiguous problems. As a senior applied data scientist at Amazon, John worked closely with engineering to create machine learning models to arbitrate chatbot skills, entity resolution, search, and personalization.
As a principal data scientist for Oracle Cloud Infrastructure, he is now defining tooling for data science at scale. John frequently gives talks on best practices and reproducible research. To that end, he has developed an approach to improve validation and reliability by using data unit tests and has pioneered Data Science Design Thinking. He also coordinates SoCal RUG, the largest R meetup group in Southern California.
Tired of Cleaning your Data? Have Confidence in Data with Feature Types(Workshop)

Gijsbert Janssen Van Doorn
Gijsbert Janssen van Doorn is Director of Technical Product Marketing at Run:AI. He is a passionate advocate for technology that will shape the future of how organizations run AI. Gijsbert comes from a technical engineering background, with six years in multiple roles at Zerto, a Cloud Data Management and Protection vendor.

Milecia McGregor
Milecia is a senior software engineer, international tech speaker, and mad scientist who works with hardware and software. She will try to make anything with JavaScript first. In her free time, she enjoys learning random things, like how to ride a unicycle, and playing with her dog.
Preventing Stale Models in Production(Talk)

Anais Dotis-Georgiou
Anais Dotis-Georgiou is a Developer Advocate for InfluxData with a passion for making data beautiful with the use of Data Analytics, AI, and Machine Learning. She takes the data that she collects, does a mix of research, exploration, and engineering to translate the data into something of function, value, and beauty. When she is not behind a screen, you can find her outside drawing, stretching, boarding, or chasing after a soccer ball.
InfluxDB: The Database for Your Time Series Data Science Problems(Demo Talk)

Tim Kraska, PhD
Tim Kraska is an Associate Professor of Electrical Engineering and Computer Science in MIT’s Computer Science and Artificial Intelligence Laboratory, co-director of the Data System and AI Lab at MIT (DSAIL@CSAIL), and co-founder of Einblick Analytics. Currently, his research focuses on building systems for machine learning, and using machine learning for systems. Before joining MIT, Tim was an Assistant Professor at Brown, spent time at Google Brain, and was a PostDoc in the AMPLab at UC Berkeley after he got his PhD from ETH Zurich. Tim is a 2017 Alfred P. Sloan Research Fellow in computer science and received several awards including the VLDB Early Career Research Contribution Award, the VMware Systems Research Award, the university-wide Early Career Research Achievement Award at Brown University, an NSF CAREER Award, as well as several best paper and demo awards at VLDB, SIGMOD, and ICDE.
Data Boards: A Collaborative and Interactive Space for Data Science(Demo Talk)

Jai Natarajan
Jai Natarajan is the Vice President, Strategic Business Development at iMerit, a global AI data solutions company delivering high-quality data that powers machine learning and artificial intelligence applications for Fortune 500 companies. Bringing more than 24 years of experience, Jai works with more than 5500 data experts who label and enrich data at scale to help customers get better results from their machine learning algorithms. Jai works with iMerit’s partner ecosystem to develop iMerit’s solutions for its customers, and provides strategic inputs to the company. Previously, Jai worked at Lucasfilm and Sony, and founded Xentrix, an Emmy-winning animation studio. He is a board member of the Anudip Foundation. Jai has an M.S. in Computer Science from UCLA, and undergraduate degrees from Birla Institute of Technology and Science.

Matthew Rocklin, PhD
Matthew Rocklin is the CEO and founder of Coiled and the initial author of Coiled’s underlying technology, Dask. He developed Dask to help people solve challenging distributed computing problems while working at Anaconda. While he is primarily known for his work on Dask, he also coordinates and maintains several dozen libraries within Python’s numeric computing ecosystem, with a substantial focus on efficient and scalable computing. Matthew is a frequent speaker at several technical, academic, and industry events, such as PyData, SciPy, Google Next, O’Reilly’s Strata, AGU, AMS, and ICML. He has a Doctorate of Philosophy in Computer Science from the University of Chicago, and a Bachelors in Physics, Mathematics, and Astronomy from the University of California.

Nick Rabinowitz
Nick Rabinowitz has over 25 years of experience in software engineering, data visualization, and information management. He is currently a Senior Staff Engineer at Foursquare, where he works on web-based geospatial applications. Prior to joining Foursquare, he spent 5 years as a software engineer at Uber, where he led the development of business intelligence applications, helped to launch the Open Source H3 library, and created the Two Percent Pledge to promote charitable giving for Uber employees. He recently started geocaching and is unduly proud of his 138 finds.
The Power Of Hexagons: How H3 & Foursquare Are Transforming Spatial Analytics(Talk)

Bradley Franko
Brad leads business development in North America in Data Science for all commercial and enterprise accounts. He brings a unique background in SAAS sales, technology, and leadership to HP’s Advanced Compute Solutions organization.
Data Science Innovation with Z by HP Workstations and Software Stack(Business Talk)

Hunter Kempf
Hunter’s data science journey began when working for AT&T. During this time, he was also earning his master’s degree from the University of Notre Dame. In the 4 years he worked for AT&T, Hunter worked on a variety of projects in Planning/Forecasting and Fraud/Cyber Threat Prevention. Today, Hunter works for an Internet Infrastructure and Cybersecurity Company and is finishing up his second master’s degree, in Cybersecurity, from Georgia Tech. Outside of work, Hunter enjoys analyzing Entertainment, Video Game and Streaming data.
Data Science Innovation with Z by HP Workstations and Software Stack(Business Talk)

Max Urbany
As Max progresses through his Master’s Program, he is particularly interested in intelligent digital accessibility design, along with the ethical analysis of existing predictive models. His passion for creating quality user-centered tools drives him to understand as much as he can about end users while leveraging what data can reveal.
Z by HP Panel Discussion on the Diverse Role of Data Science in Education(Talk)

Dan Chaney
Dan Chaney is the VP, Enterprise AI / Data Science Solutions, for Future Tech Enterprise, Inc., an award-winning global IT solutions provider. He oversees all sales, marketing, and technical activities focused on Future Tech’s comprehensive range of AI and data science workstation solutions. Prior to joining Future Tech, Dan spent 20 years at Northrop Grumman, most recently serving as the company’s Enterprise Director of IT Solution Architecture & Engineering. Dan earned his bachelor’s and master’s degrees in communication and computer science from the University of Kentucky. Dan is a Certified Information Systems Security Professional (CISSP) and adjunct instructor for the University of Louisville’s cybersecurity workforce program sponsored by the National Centers of Academic Excellence in Cybersecurity.
Z by HP Panel Discussion on the Diverse Role of Data Science in Education(Talk)

Kristin Hempstead
Kristin has been with HP for 11 years and is currently the North America business development manager for HP’s data science and artificial intelligence solutions focusing on federal, education, and public sector customers. She has an MBA from the University of South Florida with a specialization in Finance and MIS and a BS in Agriculture from the University of Georgia.
Z by HP Panel Discussion on the Diverse Role of Data Science in Education(Talk)

Dan S. Camper
Dan has been with LexisNexis Risk Solutions Group since 2014 and is an Enterprise Architect in the Solutions Lab Group. He has worked for Apple as well as Dun & Bradstreet, and he ran his own custom programming shop for a decade. He’s been writing software professionally for more than 40 years and has worked on a myriad of systems, using many different programming languages.
A New Indexing Technique for Quickly Fuzzy-Matching Entire Dataset Records(Talk)

Dmitry Petrov, PhD
Dmitry Petrov is an ex-Data Scientist at Microsoft with a Ph.D. in Computer Science and an active open source contributor. He wrote and open sourced the first version of DVC.org, a machine learning workflow management tool. He also implemented a wavelet-based image hashing algorithm (wHash) in the open source library ImageHash for Python. Dmitry now works on tools for machine learning and ML workflow management as a co-founder and CEO of Iterative in San Francisco.
Model Registry with Open Source Tools: Git, GitHub and CI/CD(Track Keynote)

Jason Hepp
Jason is a Solutions Architecture Director with a background in building data engineering platforms to facilitate both streaming and batch analytics at scale. He is an experienced architect across verticals such as government, retail, manufacturing, and finance allowing for a unique perspective to customer problems.
Aiven is Your One Stop Shop for Open-source Database Solutions to Power ML/AI in the Cloud.(Demo Talk)

Matt Tolley
Matt Tolley is a Sales Director working with Aiven customers across North America. Matt formerly worked at a Boston-based DBaaS company focused on commercial DB virtualization that was acquired by Google in 2020. He then stepped into the world of open source and has been with Aiven for 2 years, with a core focus on consulting with customers that are looking to solve business-critical challenges through the use of open source database and streaming technology in the cloud.
Aiven is Your One Stop Shop for Open-source Database Solutions to Power ML/AI in the Cloud.(Demo Talk)

Hannes Hapke
Hannes Hapke works in machine learning at Digits. Previously, he was a senior machine learning scientist for Concur Labs at SAP Concur, where he explored innovative ways to use machine learning to improve the experience of a business traveler. Hannes has also solved machine learning and ML infrastructure problems in various industries including healthcare, retail, recruiting, and renewable energies. He was recognized as a Google Developer Expert for ML and has co-authored two machine learning publications: “Building Machine Learning Pipelines” by O’Reilly Media and “NLP in Action” by Manning Publications.

Tina Eliassi-Rad, PhD
Tina Eliassi-Rad is a Professor of Computer Science at Northeastern University. She is also a core faculty member at Northeastern University’s Network Science Institute. Prior to joining Northeastern, Tina was an Associate Professor of Computer Science at Rutgers University; and before that she was a Member of Technical Staff and Principal Investigator at Lawrence Livermore National Laboratory. Tina earned her Ph.D. in Computer Sciences at the University of Wisconsin-Madison. Her research is rooted in data mining and machine learning; and spans theory, algorithms, and applications of big data from networked representations of physical and social phenomena. She has over 100 peer-reviewed publications (including a few best paper and best paper runner-up awardees). Tina’s work has been applied to personalized search on the World-Wide Web, statistical indices of large-scale scientific simulation data, fraud detection, mobile ad targeting, cyber situational awareness, and ethics in machine learning. Her algorithms have been incorporated into systems used by the government and industry (e.g., IBM System G Graph Analytics) as well as open-source software (e.g., Stanford Network Analysis Project). In 2017, Tina served as the program co-chair for the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining and as the program co-chair for the International Conference on Network Science. In 2020, she is serving as the program co-chair for the International Conference on Computational Social Science. Tina received an Outstanding Mentor Award from the Office of Science at the US Department of Energy in 2010; and became a Fellow of the ISI Foundation in Turin, Italy in 2019.
Just Machine Learning(Talk)

Julie Josse, PhD
Julie Josse is a senior researcher in statistics and machine learning applied to health at Inria, a French research institute in digital sciences, and Professor at Ecole Polytechnique (Paris). She is an expert in the treatment of missing values (inference, multiple imputation, matrix completion, MNAR, supervised learning with missing values) and has created a website on the topic (https://rmisstastic.netlify.app/) for users. Her research also focuses on causal inference techniques (causal inference with missing values, combining RCT and observational data) for personalized medicine. Julie Josse is dedicated to reproducible research with R statistical software: she has developed packages including FactoMineR and missMDA to transfer her work.

Sharmistha Chatterjee
Sharmistha Chatterjee is a Data Science Evangelist with 16+ years of professional experience in the field of Machine Learning (AI research and productionizing scalable solutions) and Cloud applications. She has worked in both Fortune 500 companies and very early-stage startups. She is currently working as a Senior Manager of Data Sciences at Publicis Sapient where she leads the digital transformation of clients across industry verticals. She is an active blogger, an international speaker at various tech conferences, and a 2X Google Developer Expert in Machine Learning and Google Cloud. She is also the Hackernoon Tech award winner for 2020, and has been listed as a 40 under 40 Data Scientist by AIM and as one of 21 tech trailblazers 2021 by Google. She is involved in mentoring startups in Google Startup for Accelerators Program. She has also recently completed Business in AI from the London School of Business.
Need of Adaptive Ethical ML Models in Post Pandemic Era(Talk)

Mona Khalil
Mona is a Data Science Manager at Greenhouse Software in New York City, where they contribute to data-informed decision making across the company and machine learning solutions to improve the hiring process for Greenhouse customers. They’ve previously worked in government, creating analytics and machine learning solutions to improve the lives of New Yorkers, and continue to be involved in civic projects through a number of volunteer and non-profit organizations. They’ve also been a statistics and data science educator with DataCamp, Emeritus, and in university settings. They hold a graduate degree in Developmental Psychology, and are passionate about contributing to the ethical use of data science methodology in the public and private sector.
Leveling Up Your Organization’s Capacity for Data-informed Decisions(Talk)
SQL for Data Science(Training)

Chandra Khatri
Chandra Khatri is the Chief Scientist and Head of AI at Got It AI, where his team is transforming the AI space by leveraging state-of-the-art technologies in order to deliver Self-Discovering, Self-Training, and Self-Optimizing products. Under his leadership, Got It AI is democratizing Conversational AI and related ecosystems through automation. Prior to Got It AI, Chandra led various applied research projects at Uber AI in areas such as Conversational AI, Multi-modal AI, and Recommendation Systems.
Prior to Uber AI, he was a founding member of the Alexa Prize Competition at Amazon, where he led the R&D and had the opportunity to significantly advance the field of Conversational AI, particularly Open-domain Dialog Systems, which is considered the holy grail of Conversational AI and is one of the open-ended problems in AI. Prior to Alexa AI, he drove applied research in NLP, Deep Learning, and Recommendation Systems at eBay. He graduated from Georgia Tech with a specialization in Deep Learning in 2015 and holds an undergraduate degree from BITS Pilani, India.
His current areas of research include Artificial and General Intelligence, Democratization of AI, Reinforcement Learning, Language Understanding, Conversational AI, Multi-modal and Human-agent Interactions, and Introducing Common Sense within Artificial Agents.
Self-Supervised and Unsupervised Learning for Conversational AI and NLP(Workshop)

Laura Ham
Laura is a ML Product Researcher at SeMI Technologies, the company behind the open-source vector search engine Weaviate. She researches new machine learning features for Weaviate and works on everything UX/DX related to Weaviate. For example, she is responsible for the GraphQL API design. She is in close contact with our open source community. Additionally, she likes to solve custom use cases with Weaviate, and introduces Weaviate to other people by means of Meetups, talks and presentations.
What is Vector Search?(Talk)

Gaël Varoquaux, PhD
Gaël Varoquaux is a research director working on data science and health at Inria (French Computer Science National research). His research focuses on using data and machine learning for scientific inference, with applications to health and social science, as well as developing tools that make it easier for non-specialists to use machine learning. He has long applied it to brain-imaging data to understand cognition. Years before the NSA, he was hoping to make bleeding-edge data processing available across new fields, and he has been working on a mastermind plan building easy-to-use open-source software in Python. He is a core developer of scikit-learn, joblib, Mayavi and nilearn, a nominated member of the PSF, and often teaches scientific computing with Python using the scipy lecture notes.
Prediction with Missing Values(Tutorial)

Dr. Clair J. Sullivan
Dr. Clair Sullivan is currently a graph data science advocate at Neo4j, working to expand the community of data scientists and machine learning engineers using graphs to solve challenging problems. She received her doctorate degree in nuclear engineering from the University of Michigan in 2002. After that, she began her career in nuclear emergency response at Los Alamos National Laboratory where her research involved signal processing of spectroscopic data. She spent 4 years working in the federal government on related subjects and returned to academic research in 2012 as an assistant professor in the Department of Nuclear, Plasma, and Radiological Engineering at the University of Illinois at Urbana-Champaign. While there, her research focused on using machine learning to analyze the data from large sensor networks. Deciding to focus more on machine learning, she accepted a job at GitHub as a machine learning engineer while maintaining adjunct assistant professor status at the University of Illinois. In 2021 she joined Neo4j as a Graph Data Science Advocate. Additionally, she founded a company, La Neige Analytics, whose purpose is to provide data science expertise to the ski industry. She has authored 4 book chapters, over 20 peer-reviewed papers, and more than 30 conference papers. Dr. Sullivan was the recipient of the DARPA Young Faculty Award in 2014 and the American Nuclear Society’s Mary J. Oestmann Professional Women’s Achievement Award in 2015.
When SQL is Not the Best Answer: Identifying “Graph-y” Problems and When Graphs Can Help(Talk)

James Skelton
James Skelton is a technical evangelist who specializes in Machine Learning. After graduating from the University of St. Andrews and completing Galvanize’s Data Science Immersive program, James has been focused on creating accessible educational content, community building, and identifying meaningful trends surrounding the worlds of Deep Learning and AI development.
What We’ve Learned Pushing Nearly 100M Hours of GPU Compute(Talk)

Yong Tang, PhD
Yong Tang, Ph.D., is Director of Engineering at Ivanti. He is a core contributor of many open-source projects in machine learning and cloud native areas. He is a maintainer and SIG I/O lead of the TensorFlow project, and received the Open Source Peer Bonus award from Google for his contributions to TensorFlow. He is also a maintainer of Docker/Moby, the widely used open-source container platform, and a core maintainer of CoreDNS, a Cloud Native Computing Foundation (CNCF) graduated project for service discovery.
Tutorial: Building and Deploying Machine Learning Models with TensorFlow and Keras(Training)

Justin Gottschlich, Ph.D.
Justin Gottschlich is the Founder, CEO & Chief Scientist of Merly, Inc. (http://merly.ai), a company aimed at making software developers more productive using state-of-the-art machine programming systems. Justin also has an academic appointment as an Adjunct Assistant Professor at the University of Pennsylvania. Before founding Merly, Justin was a Principal AI Scientist and the Founder & Director of Machine Programming Research at Intel Labs. In 2017, he co-founded the ACM SIGPLAN Machine Programming Symposium (MAPS) and now serves as its Steering Committee Chair. Justin also serves on the 2020 NSF Expeditions advisory board “Understanding the World Through Code” led by MIT Prof. Armando Solar-Lezama. Justin received his PhD in Computer Engineering from the University of Colorado-Boulder in 2011 and has 40+ peer-reviewed publications, 50+ issued patents, with 100+ patents pending. Justin’s research has been highlighted in venues like The New York Times, Communications of the ACM, MIT Technology Review, and The Wall Street Journal.
The Future of Software Development Using Machine Programming(Tutorial)

Gulrez Khan
Gulrez Khan is a Lead Data Scientist at PayPal and has been wrangling data for more than 14 years. He champions Data Visualization & Storytelling with data in the community and has been giving talks and speaking at conferences to democratize data. Outside his day job, he loves to play with public datasets and is also teaching kids to draw data.
Telling Stories with Data(Tutorial)

Robert Nishihara
Robert Nishihara is one of the creators of Ray, a distributed system for scaling Python and machine learning applications. He is one of the co-founders and CEO of Anyscale, which is the company behind Ray. He did his PhD in machine learning and distributed systems in the computer science department at UC Berkeley. Before that, he majored in math at Harvard.

Joel Z. Leibo, PhD
Joel Z. Leibo is a research scientist at DeepMind. He obtained his PhD in 2013 from MIT where he worked on the computational neuroscience of face recognition. Nowadays, Joel’s research is aimed at the following questions: How can we get deep reinforcement learning agents to perform complex cognitive behaviors like cooperating with one another in groups? How should we evaluate the performance of deep reinforcement learning agents? How can we model processes like cumulative culture that gave rise to unique aspects of human intelligence?
Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot(Talk)

Ronny Mathew
Ronny Mathew is a Data Science lead at Rue Gilt Groupe building next-generation online shopping experiences for their members. He is passionate about applied machine learning and deep learning and works on recommendation systems, computer vision, and Natural language processing for big data. At RGG, they are currently building the next generation of their personalization platform leveraging cutting-edge tools and algorithms.
Quick to Production with the Best of Both Spark and Tensorflow(Talk)

Devavrat Shah, PhD
Devavrat Shah is Andrew (1956) and Erna Viterbi Professor with the department of Electrical Engineering and Computer Science at Massachusetts Institute of Technology. He is the founding director of Statistics and Data Science at MIT. He is also a member of IDSS, LIDS, CSAIL and ORC at MIT. He co-founded Celect, Inc. (now part of Nike) in 2013 to help retailers decide what to put where by accurately predicting demand using omni-channel data. He is a co-founder and CTO of IkigaiLabs with the mission to build self-driving organizations by enabling data-driven operations with human-in-the-loop. His research focuses on statistical inference and stochastic networks. His contributions span a variety of areas including resource allocation in communications networks, inference and learning on graphical models, algorithms for social data processing including ranking, recommendations and crowdsourcing and more recently causal inference. He has made foundational contributions to the development of “gossip” protocols and “message-passing” algorithms for statistical inference which have been the building blocks of modern distributed data processing systems. His work spans a range of areas across electrical engineering, computer science and operations research. His work has received broad recognition, including prize paper awards in Machine Learning, Operations Research and Computer Science, and career prizes including 2010 Erlang prize from the INFORMS Applied Probability Society, awarded bi-annually to a young researcher who has made outstanding contributions to applied probability. He is a distinguished alumnus of his alma mater IIT Bombay from where he graduated with the honor of President of India Gold Medal. His work has been covered in popular press including NY Times, Forbes, Wired and Reddit.
Automation for Data Professionals(Training)

Ryan Wright
Ryan Wright is the creator of Quine, and has been leading software teams focused on data infrastructure and data science for two decades. He has served as principal engineer, director of engineering, principal investigator on DARPA-funded research programs, and is currently the founder and CEO of thatDot—the company supporting Quine. Ryan particularly enjoys taking the philosophical ends of computer science—usually problems related to language, meaning, and data—and making them more practical.
Noiseless Anomaly Detection with Streaming Graph A.I.(Demo Talk)
Quine: A Streaming Graph for Event-Driven Data Pipelines(Talk)

Jerry Zhang
Jerry Zhang is a Software Engineer on the PyTorch Architecture Optimization team under the AI Frameworks org at Meta. He has been working on PyTorch Quantization for the past three years, trying to provide self-serve and easy-to-use tools for people to optimize the inference speed of their models while maintaining accuracy. Before Meta, he was a master’s student in computer science at Carnegie Mellon University, and he received his Bachelor’s degree in Computer Science and Technology from Zhejiang University, China.
Quantization in PyTorch(Tutorial)

Sabrina Smai
Sabrina Smai is a Product Manager in Microsoft’s AI Frameworks team. She works with all things PyTorch and ONNX Runtime.
Profiling and Optimizing PyTorch Applications with the PyTorch Profiler(Tutorial)

Dalton Lunga, PhD
Dalton Lunga is a senior R&D scientist and research group leader for GeoAI at the Oak Ridge National Laboratory. His research interests are in domain adaptation, manifold learning, and unsupervised representation learning using deep learning approaches for geospatial imagery analytics. His technical background includes image processing, statistical machine learning, remote sensing, and geospatial data analysis. He currently conducts research and development in machine learning techniques and advanced workflows for handling large volumes of geospatial data. Before ORNL, Dalton worked as a machine learning research scientist at the Council for Scientific and Industrial Research in South Africa on various projects. He received his Ph.D. in electrical and computer engineering from Purdue University, West Lafayette, Indiana.

Jacob Arndt
Jacob Arndt received the B.S. degree in geography and the master’s degree in geographic information science (MGIS) from the University of Minnesota – Twin Cities, Minneapolis, MN, USA, in 2016 and 2018, respectively. He is a Geospatial Data Scientist with the Geospatial Artificial Intelligence (GeoAI) Group, Oak Ridge National Laboratory, Oak Ridge, TN, USA. His research interests include remote sensing, machine learning, and geographic information systems.

Jesse Piburn
Jesse Piburn is a Research Scientist in Geographic Data Science at Oak Ridge National Laboratory. His work includes research and development in spatiotemporal analytics, time series data mining, and machine learning. As a member of the GeoAI group at ORNL, he has had the opportunity to work across multiple domains including population dynamics, socio-economic modeling, and image analysis. Jesse has a background in spatial statistics and geographic information science. He received his M.S. in Geography and is currently pursuing a PhD in Data Science and Engineering from the University of Tennessee.

Eric Ma, PhD
Eric is an Investigator at the Novartis Institutes for Biomedical Research, where he solves biological problems using machine learning. He obtained his Doctor of Science (ScD) from the Department of Biological Engineering, MIT, and was an Insight Health Data Fellow in the summer of 2017. He has taught Network Analysis at a variety of data science venues, including PyCon USA, SciPy, PyData, and ODSC, and has also co-developed the Python Network Analysis curriculum on DataCamp. As an open-source contributor, he has made contributions to PyMC3, matplotlib, and bokeh. He has also led the development of the graph visualization package nxviz, and a data cleaning package pyjanitor (a Python port of the R package).
Network Analysis Made Simple(Training)

Sean Owen
Sean is a principal solutions architect focusing on machine learning and data science at Databricks. He is an Apache Spark committer and PMC member, and co-author of Advanced Analytics with Spark. Previously, he was director of Data Science at Cloudera and an engineer at Google.
MLOps: Relieving Technical Debt in ML with MLflow, Delta and Databricks(Talk)

Stefan Wager, PhD
Stefan Wager is an Associate Professor of Operations, Information and Technology at Stanford Graduate School of Business, and an Associate Professor of Statistics (by courtesy). He received his PhD in Statistics from Stanford in 2016, and has worked with or consulted for several Silicon Valley companies, including Dropbox, Facebook, Google and Uber. His research lies at the intersection of causal inference, optimization, and statistical learning. He is particularly interested in developing new solutions to problems in statistics, economics and decision making that leverage recent advances in machine learning.
Machine Learning for Causal Inference(Tutorial)

Luke Metz
Luke Metz is a research scientist at Google Brain working on meta-learning and learned optimizers. He’s interested in building general-purpose, learned learning algorithms that not only perform well but also generalize to new types of never-before-seen problems.
Learned Optimizers: Learning to Learn Optimization Algorithms(Talk)

Jirka Borovec, PhD
Jirka has been working in machine learning and data science for several years. He holds a Ph.D. in Medical Imaging. In parallel, he gained practical experience working for a few IT companies as a consultant or data scientist. Currently, he is focusing on exploring interesting world problems and solving them with state-of-the-art techniques. He has developed several open-source Python packages, is a core contributor to `PyTorch-Lightning` and `TorchMetrics`, and actively participates in other well-known projects.

Thomas Fan
Thomas J. Fan is a Senior Software Engineer at Quansight Labs, working to sustain and evolve the PyData open-source ecosystem. He is a maintainer for scikit-learn, an open-source machine learning library written for Python. Previously, he worked at Columbia University, improving the interoperability between scikit-learn and AutoML systems. Thomas holds a Masters in Physics from Stony Brook University and a Masters in Mathematics from New York University.
Introduction to Scikit-learn: Machine Learning in Python(Training)
Intermediate Machine Learning with Scikit-learn: Evaluation, Calibration, and Inspection(Training)
Advanced Machine Learning with Scikit-learn: Text Data, Imbalanced Data, and Poisson Regression(Training)

Ville Tuulos
Ville has been developing infrastructure for machine learning for over two decades. He has worked as an ML researcher in academia and as a leader at a number of companies, including Netflix where he led the ML infrastructure team that created Metaflow, a popular open-source framework for data science infrastructure. He is a co-founder and CEO of Outerbounds, a company developing modern human-centric ML. He is also the author of an upcoming book, Effective Data Science Infrastructure, published by Manning.
Human-Friendly, Production-Ready Data Science with Metaflow(Talk)

Richard Liaw, PhD
Richard Liaw is an engineering manager at Anyscale, where he leads a team in building open source machine learning libraries on top of Ray. He is on leave from the PhD program at UC Berkeley, where he worked at the RISELab advised by Ion Stoica, Joseph Gonzalez, and Ken Goldberg. In his time in the PhD program, he was part of the Ray team, building scalable ML libraries on top of Ray.
Hands-on Reinforcement Learning with Ray and RLlib(Tutorial)

J K Terry
Jordan is a PhD student at the University of Maryland, College Park, and the founder and CEO of the Farama Foundation. Jordan is the maintainer of Gym, the largest reinforcement learning library in the world, and the maintainer and creator of PettingZoo, which serves an analogous role to Gym in multi-agent reinforcement learning.

Hugo Bowne-Anderson, PhD
Hugo Bowne-Anderson is a data scientist, writer, educator & podcaster. His interests include promoting data & AI literacy/fluency, helping to spread data skills through organizations and society and doing amateur stand up comedy in NYC. He does many of these at DataCamp, a data science training company educating over 3 million learners worldwide through interactive courses on the use of Python, R, SQL, Git, Bash and Spreadsheets in a data science context. He has spearheaded the development of over 25 courses in DataCamp’s Python curriculum, impacting over 170,000 learners worldwide through his own courses. He hosts and produces the data science podcast DataFramed, in which he uses long-format interviews with working data scientists to delve into what actually happens in the space and what impact it can and does have. He earned his PhD in Mathematics from the University of New South Wales, Australia and has conducted biomedical research at the Max Planck Institute in Germany and Yale University, New Haven.
Full-stack Machine Learning for Data Scientists(Tutorial)

Robert Crowe
A data scientist and TensorFlow addict, Robert has a passion for helping developers quickly learn what they need to be productive. He’s used TensorFlow since the very early days and is excited about how it’s evolving quickly to become even better than it already is. Before moving to data science Robert led software engineering teams for both large and small companies, always focusing on clean, elegant solutions to well-defined needs.
From Experimentation to Products: The Production ML Journey(Talk)

Ankur Taly, PhD
Ankur Taly is a Staff Research Scientist at Google, where he carries out research in Machine Learning and Explainable AI. Previously, he served as the Head of Data Science at Fiddler labs, where he was responsible for developing, productionizing, and evangelizing core explainable AI technology. Ankur is most well-known for his contribution to developing and applying Integrated Gradients, a new interpretability algorithm for deep networks. His research in this area has resulted in publications at top-tier machine learning conferences and prestigious journals like the American Academy of Ophthalmology (AAO) and Proceedings of the National Academy of Sciences (PNAS). Besides explainable AI, Ankur has a broad research background and has published 30+ papers in areas including computer security, programming languages, formal verification, and machine learning. He has served on several academic conference program committees, and instructed short courses at summer schools and conferences. Ankur earned his PhD in computer science from Stanford University in 2012 and a BTech in Computer Science from IIT Bombay in 2007.
Evaluating, Interpreting and Monitoring Machine Learning Models(Talk)

Matt Harrison
Matt Harrison has been using Python since 2000. He runs MetaSnake, a Python and Data Science consultancy and corporate training shop. In the past, he has worked across the domains of search, build management and testing, business intelligence, and storage.
He has presented and taught tutorials at conferences such as Strata, SciPy, SCALE, PyCON, and OSCON as well as local user conferences.
End to End Machine Learning with XGBoost(Training)
Effective Pandas(Workshop)

Stephanie Wang
Stephanie is a final-year PhD student at UC Berkeley and a software engineer at Anyscale. She is interested in abstractions for distributed computing and problems in fault tolerance. Towards this end, she is also a maintainer for the open-source project Ray, which provides a simple, universal API for building distributed applications in Python.
Distributed Python with Ray: Hands-on with the Ray Core APIs (Tutorial)

Marinela Profi
As a Marketing Manager for data science and open-source, Marinela uses her cross-domain expertise in statistics, business and marketing to position SAS as a leader in the Data Science and Machine Learning Platform market. She focuses on helping customers apply advanced analytics, machine learning, natural language processing and forecasting to solve their most complex problems. Over the past 5 years, Profi honed her skills mining data, developing models and technical/business solutions, including deploying AI at scale. Her experience spans banking, manufacturing, retail and energy. She is a keynote speaker and presenter at different global conferences, where she shares trends and priorities of the data science industry. She is a published author, contributor to several eBooks, and blog writer on major industry and data science blogs. She has a bachelor’s in economics, an MBA and a master’s in statistics. She is passionate about getting more young people to code and pursue careers in STEM.
Developing, Deploying and Managing Models at Scale with SAS® Viya®(Talk)

Jennifer Hobbs, PhD
Jennifer Hobbs is the Director of Machine Learning at Intelinair. Her team is responsible for the development and delivery of computer vision and machine learning models to deliver intelligence and insights to the agriculture industry. She completed her PhD in Physics and Astronomy at Northwestern University. Throughout her career she has been involved in all phases of the machine learning lifecycle, transforming raw data into compelling technology products through data modeling and architecture, pipeline design and management, machine learning, and visualization.
Deep Learning Enables a New View in the Agriculture Industry(Talk)

Martin Frigaard
Martin is a Senior Clinical Programmer at BioMarin, where he builds dashboards and tools for making data-informed decisions. Previously, Martin built statistical tools and dashboards for the Diabetes Technology Society and other volunteer and non-profit organizations, and was a contributing author for Data Journalism in R on the Northeastern University School of Journalism blog/website. He’s a data journalism instructor for California State University, Chico. Martin holds a graduate degree in Clinical Research and is passionate about data literacy and open source technologies.
Data Visualization with ggplot2(Workshop)

Yuan Tang
Yuan is currently a founding engineer at Akuity. Previously he was a senior software engineer at Alibaba Group, building AI infrastructure and AutoML platforms. He’s a PMC member of XGBoost and Apache MXNet, co-chair of Kubeflow, maintainer of TensorFlow, Argo, Couler, and ElasticDL, as well as author of numerous open source projects. He’s the author of Distributed Machine Learning Patterns (https://github.com/terrytangyuan/distributed-ml-patterns) as well as the co-author of TensorFlow in Practice and Dive into Deep Learning (with TensorFlow).

Jennifer Dawn Davis, PhD
Jennifer Davis, Ph.D. is a Staff Field Data Scientist at Domino Data Lab, where she empowers clients on complex data science projects. She has completed two postdocs in computational and systems biology, trained at a supercomputing center at the University of Texas, Austin, and worked on hundreds of consulting projects with companies ranging from start-ups to the Fortune 100. Jennifer has previously presented topics at conferences for the Association for Computing Machinery on LSTMs and Natural Language Generation and at conferences across the US and in Italy. Jennifer was part of a panel discussion for an IEEE conference on artificial intelligence in biology and medicine. She has practical experience teaching both corporate classes and at the college level. Jennifer enjoys working with clients and helping them achieve their goals.
Creating a Benchmark for a Large-Scale Image Captioning Pipeline(Talk)

Stefanie Molin
Stefanie Molin is a data scientist and software engineer at Bloomberg in New York City, where she tackles tough problems in information security, particularly those revolving around anomaly detection, building tools for gathering data, and knowledge sharing. She is also the author of “Hands-On Data Analysis with Pandas,” which is currently in its second edition. She holds a bachelor of science degree in operations research from Columbia University’s Fu Foundation School of Engineering and Applied Science. She is currently pursuing a master’s degree in computer science, with a specialization in machine learning, from Georgia Tech. In her free time, she enjoys traveling the world, inventing new recipes, and learning new languages spoken among both people and computers.

Ed Shee
Ed Shee is Head of Developer Relations at Seldon. Having previously led a tech team at IBM, Ed comes from a cloud computing background and is a strong believer in making deployments as easy as possible for developers. With an education in computational modelling and an enthusiasm for machine learning, Ed has blended his work in ML and cloud native computing together to cement himself firmly in the emerging field of MLOps.
An Introduction to Drift Detection(Workshop)

Ashley Scillitoe
Ashley is a data science research engineer at Seldon, where he works on developing production-ready tools for drift, adversarial and outlier detection. Prior to joining Seldon, he spent a number of years as a Research Fellow at The Alan Turing Institute. Here, he explored the use of machine learning for tackling aerospace engineering problems, with a focus on explainability and uncertainty quantification. Ashley also completed a PhD at the University of Cambridge, and is a keen proponent of open-source software.
An Introduction to Drift Detection(Workshop)

Yaron Haviv
Yaron Haviv is a serial entrepreneur who has been applying his deep technological experience in data, cloud, AI and networking to leading startups and enterprise companies since the late 1990s. As the co-founder and CTO of Iguazio, Yaron drives the strategy for the company’s MLOps platform and led the shift towards the production-first approach to data science and catering to real-time AI use cases. He also initiated and built Nuclio, a leading open source serverless platform with over 4,000 GitHub stars, and MLRun, Iguazio’s open source MLOps orchestration framework. Prior to co-founding Iguazio in 2014, Yaron was the Vice President of Datacenter Solutions at Mellanox (now NVIDIA), where he led technology innovation, software development and solution integrations. He was also the CTO and Vice President of R&D at Voltaire, a high-performance computing, IO and networking company which floated on the NYSE in 2007. Yaron is an active contributor to the CNCF Working Group and was one of the foundation’s first members. He presents at major industry events and writes tech content for leading publications including TheNewStack, Hackernoon, DZone, Towards Data Science and more.
MLOps Beyond Training: The Production-First Approach to AI(Track Keynote)

Dr. Jacqueline Nolis
Dr. Jacqueline Nolis is a data science leader with 15 years of experience in running data science teams and projects at companies ranging from Airbnb to Boeing. She is the Head of Data Science at Saturn Cloud where she helps design products for data scientists. Jacqueline has a PhD in Industrial Engineering and her academic research focused on optimization under uncertainty. For fun Jacqueline likes to use data science for humor—like using deep learning to generate offensive license plates.
Intro to Deep Learning in R(Workshop)

Ralf Gommers, PhD
Ralf has been deeply involved in the SciPy and PyData communities for over a decade. He is a maintainer of NumPy, SciPy and data-apis.org, and has contributed widely throughout the SciPy ecosystem. Ralf is currently the SciPy Steering Council Chair, and he served on the NumFOCUS Board of Directors from 2012-2018. Ralf co-directs Quansight Labs, which consists of developers, community managers, designers, and documentation writers who build open-source technology and grow open-source communities around data science and scientific computing projects. Previously Ralf has worked in industrial R&D, on topics as diverse as MRI, lithography and forestry.
Understanding and Optimizing Parallelism in NumPy-based Programs(Talk)

Alex Peysakhovich, PhD
Alexander Peysakhovich is technically a behavioral economist, but he bristles a bit at being defined that narrowly. He’s a scientist in Facebook’s artificial intelligence research lab, as well as a prolific scholar, having posted five papers in 2016 alone. He has a Ph.D. from Harvard University, where he won a teaching award, and has published articles in the New York Times, Wired, and several prestigious academic journals. Despite these accomplishments, Peysakhovich says, “I’m most proud of the fact that I’ve managed to learn enough of lots of different fields so that I can work on problems that I’m interested in using those methods. I’ve co-authored with economists, game theorists, computer scientists, neuroscientists, psychologists, evolutionary biologists, and statisticians.”
Peysakhovich’s interdisciplinary work is driven by his deep interest in understanding decision-making—both human and machine—and by his desire to figure out how artificial intelligence can improve our decision-making processes. He builds tools that help people make better choices, and machines that can turn data into, as he puts it, “not just correlations but actual causal relationships.”

Ramakrishna Vedantam
Ramakrishna Vedantam is a Research Scientist at Facebook AI Research (FAIR) in New York. Previously, he obtained his Ph.D. at the Georgia Institute of Technology (2018), an MS from Virginia Tech (2016) and did his undergraduate studies at IIIT, Hyderabad (2013). His research interest is in machine learning that mimics the capabilities of human learning and reasoning. He has been awarded the Google Ph.D. fellowship in Machine Perception and has received best reviewer awards at ICCV and CVPR.

Sagar Samtani, PhD
Dr. Sagar Samtani is an Assistant Professor and Grant Thornton Scholar in the Department of Operations and Decision Technologies at Indiana University. Dr. Samtani graduated with his Ph.D. from the AI Lab at the University of Arizona. Dr. Samtani’s research interests are in AI for Cybersecurity, developing deep learning approaches for cyber threat intelligence, vulnerability assessment, open-source software, AI risk management, and Dark Web analytics. He has received funding from NSF’s SaTC, CICI, and SFS programs and has published over 40 peer-reviewed articles in leading information systems, machine learning, and cybersecurity venues. He is deeply involved with industry, serving on the Board of Directors for the DEFCON AI Village and Executive Advisory Council for the CompTIA ISAO.
An Overview of AI for Cybersecurity: An Overview of the Field and Promising Future Directions(Workshop)

Benjamin Batorsky, PhD
Ben is a Senior Data Scientist at the Institute for Experiential AI. He obtained his Masters in Public Health (MPH) from Johns Hopkins and his PhD in Policy Analysis from the Pardee RAND Graduate School. Since 2014, he has been working in data science for government, academia and the private sector. His major focus has been on Natural Language Processing (NLP) technology and applications. Throughout his career, he has pursued opportunities to contribute to the larger data science community. He has spoken at data science conferences, taught courses in Data Science, and helped organize the Boston chapter of PyData. He also contributes to volunteer projects applying data science tools for public good.
Neural Named-Entity Recognition pipelines with spaCy(Workshop)

Allen Downey, PhD
Allen Downey is a Professor of Computer Science at Olin College of Engineering in Needham, MA. He is the author of several books related to computer science and data science, including Think Python, Think Stats, Think Bayes, and Think Complexity. Prof Downey has taught at Colby College and Wellesley College, and in 2009 he was a Visiting Scientist at Google. He received his Ph.D. in Computer Science from U.C. Berkeley, and M.S. and B.S. degrees from MIT.
Bayesian Decision Analysis(Workshop)
The Wisdom of the Cloud(Talk)

Tomasz Adamusiak, MD, PhD
Tomasz Adamusiak MD Ph.D. is a Chief Scientist in the Clinical Insights & Innovation Cell at MITRE. He leads a multi-disciplinary group driving high-impact contributions to private and public sectors in Clinical and Genomic Data Science. Before MITRE, Tomasz was the Head of Data Science in the Pfizer Innovation Research (PfIRe) Lab. His team was responsible for developing novel digital endpoints, designing decentralized approaches for clinical trials, and applying AI/machine learning methods to generate novel insights from clinical data. Tomasz served in leadership and advisory roles in the American Medical Informatics Association, SNOMED International, and the Epic Research Data Network.
Applying AI/ML Methods to Generate Digital Endpoints in Clinical Trials(Talk)

Isha Chaturvedi
Isha is a principal data scientist at Capital One, working in the conversational AI space. Prior to that, she worked at Ericsson as a data scientist in the computer vision team. She completed her master’s at New York University in an Urban Data Science program in 2018. She has worked in different NYU research labs (NYU Urban Observatory, NYU Sounds of New York City (SONYC) lab). Before moving to New York, she lived in Hong Kong for 5 years, where she earned her bachelor’s from Hong Kong University of Science & Tech (HKUST) in Environmental Technology and Computer Science and later worked in HKUST- Deutsche Telecom Systems and Media lab (an Augmented Reality and Computer Vision focused lab) as a Research Assistant.
Few-Shot Learning(Workshop)

Julia Lintern
Julia Lintern currently works as an instructor for the Metis Data Science Flex Program. Previously, she worked as a Data Scientist for the New York Times. Julia began her career as a structures engineer designing repairs for damaged aircraft. Julia holds an MA in applied math from Hunter College, where she focused on visualizations of various numerical methods and discovered a deep appreciation for the combination of mathematics and visualizations. During certain seasons of her career, she has also worked on creative side projects such as Lia Lintern, her own fashion label.
Introduction to Machine Learning (Bootcamp)

Anish Shah
Anish loves turning ML ideas into ML products. Anish started his career working with multiple Data Science teams within SAP, using traditional ML, deep learning, and recommendation systems before landing at Weights & Biases. With the art of programming and a little bit of magic, Anish crafts ML projects to help better serve our customers, turning “oh nos” to “a-ha”s!
Orchestrating Reproducible AutoML Experiments using PyCaret, W&B, and Prefect(Workshop)

Isaac Slavitt
Isaac is a co-founder and Principal Data Scientist at DrivenData, Inc, where he leads client engagements and spearheads development of the data science competition platform. He holds a master’s in Computational Science and Engineering from Harvard’s School of Engineering and Applied Sciences and a BS in Operations Research from the U.S. Coast Guard Academy, and previously spent seven years as a Coast Guard officer serving in a variety of operational and quantitative roles.
The Wisdom of the Cloud(Talk)

Andras Zsom, PhD
Andras Zsom is an Assistant Professor of the Practice of Data Science and Director of Industry and Research Engagement at Brown University, Providence, RI. He works with high-level academic administrators to tackle predictive modeling problems, collaborates with faculty members on data-intensive research projects, and was the instructor of a data science course offered to the data science master’s students at Brown.
Introduction to Interpretability in Machine Learning(Workshop)

Karin Wolok
Karin currently leads developer community programming in the Developer Relations team at StarTree. Karin initially began her career in entertainment marketing, working with names like Eminem and Live Nation. She also launched a successful professional women’s network in two major cities in the U.S., organized events for her local Data Science meetup, and helped lead an ongoing hackathon to put machine learning in the hands of cancer biologists. Her journey working in data eventually led her to a position as Program Manager for Community Development for the leading graph database in the world, Neo4j. Most recently, she was brought on to StarTree to improve the adoption and success of the overall developer community.
Real-Time Analytics: Going Beyond Stream Processing with Apache Pinot(Workshop)

Tim Berglund
Tim is a teacher, author, and technology leader with StarTree, where he serves as the Vice President of Developer Relations. He is a regular speaker at conferences and a presence on YouTube explaining complex technology topics in an accessible way. He tweets as @tlberglund, blogs every few years at http://timberglund.com, and lives in Littleton, CO, USA. He has three grown children and one grandchild.
Using Apache Kafka and Apache Pinot for User-Facing, Real-Time Analytics(Talk)

Dan Hendrycks
Dan Hendrycks is a PhD candidate at UC Berkeley, advised by Jacob Steinhardt and Dawn Song. His research aims to disentangle and concretize the components necessary for safe AI. His research is supported by the NSF GRFP and the Open Philanthropy AI Fellowship. Dan has helped contribute the GELU activation function, the default activation in most Transformers including BERT, GPT, and Vision Transformers.
Unsolved ML Safety Problems(Talk)

Sujit Pal
Sujit Pal is an applied data scientist at Elsevier Labs, an advanced technology group within the Reed-Elsevier Group of companies. His areas of interest include Semantic Search, Natural Language Processing, Machine Learning and Deep Learning. At Elsevier, he has worked on several machine learning initiatives involving large image and text corpora, and other initiatives around recommendation systems and knowledge graph development. He has co-authored Deep Learning with Keras (https://www.packtpub.com/big-data-and-business-intelligence/deep-learning-keras) and Deep Learning with Tensorflow 2.x and Keras (https://www.packtpub.com/data/deep-learning-with-tensorflow-2-0-and-keras-second-edition), and writes about technology on his blog Salmon Run (https://sujitpal.blogspot.com/).
Transformer Based Approaches to Named Entity Recognition (NER) and Relationship Extraction (RE)(Training)

Abdel-rahman Mohamed, PhD
Abdelrahman Mohamed (PhD) is a research scientist at Meta AI Research (previously, Facebook AI Research (FAIR)). He was a principal scientist/manager in Amazon Alexa and a researcher in Microsoft Research. Abdelrahman was part of the team that started the Deep Learning revolution in Spoken Language Processing in 2009. He is the recipient of the IEEE Signal Processing Society Best Journal Paper Award for 2016. His current research interest focuses on improving, using, and benchmarking learned speech representations, e.g. HuBERT, Wav2vec 2.0, TextlessNLP, and SUPERB.
Self-supervised Representation Learning for Speech Processing(Workshop)

Violeta Misheva, PhD
Violeta has been interested in understanding the causes of social inequalities and to what extent bad experiences early in life propagate to negative outcomes later. When she realized ML can result in widening already existing social gaps, she became an advocate for the responsible development and deployment of ML. Violeta currently works as a data scientist at ABN Amro. Before that, she worked in consultancy and obtained her PhD in applied econometrics. Violeta likes sharing her knowledge with others in the form of workshops on data science and online courses. Violeta proposes that developers of ML solutions alone cannot ensure their safety but, rather, that the additional efforts of multidisciplinary experts as well as proper regulation are also needed.

Daniel Vale
Daniel has long been interested in the intersection between the law, technology and society. Unsurprisingly, this drew him into the field of data science and law. Daniel currently works as legal counsel for AI & data science at the H&M Group, where his principal focus is on developing and maturing the company’s MLOps (business, governance, and regulatory) capacities. Daniel is also completing his PhD in law, MLOps, & finance at Leiden University. His education is in behavioural science, statistics, and law. Having worked at corporate law firms and as a consultant, Daniel has practical legal and commercial experience in the field. He proposes that responsible ML is centred around two essential themes – (a) a constant appreciation of context, and (b) prudent MLOps & project management.

Leonardo De Marchi
Leonardo De Marchi holds a Master’s in Artificial Intelligence and has worked as a Data Scientist in the sports world, with clients such as the New York Knicks and Manchester United, and with large social networks, like Justgiving. His previous experience includes Head of Data Science and Analytics at Bumble, the largest dating site with over 500 million users, heading the team through an acquisition and an IPO. He is also the lead instructor at ideai.io, a company specialized in Reinforcement Learning, Deep Learning and Machine Learning training. He is also a contractor for several companies and for the European Commission, as an expert in AI and Machine Learning. As an author he wrote “Hands On Deep Learning” and he authored an online training course for O’Reilly, Introduction to Reinforcement Learning. In the academic world, he also helped set up the PhD center on Interactive Artificial Intelligence and will take part in the Inner Assessment Board to assign funding to Irish research in AI.
NLP Fundamentals(Training)

Juhi Pandey
Juhi Pandey is an Artificial Intelligence and Machine Learning Evangelist, a Speaker, and a Mentor. She has nearly 11 years of statistical and architectural experience in different domains like Life Science, Marketing, Finance, and Supply Chain Management. She has rich experience in building and scaling AI and Machine Learning businesses. She is currently working as a Senior Data Scientist at Publicis Sapient where she is part of the core data science team, working on various Machine Learning, Deep Learning, Natural Language Processing, and Artificial Intelligence engagements by applying state-of-the-art techniques in this space. She is Azure Data Science Certified and a Certified Business Analysis Professional (CBAP). She participated in the International Conference for Engineering 2021, where she talked about anomaly detection. She holds a bachelor’s degree in Computer Science. She’s an active blogger. She engages in technical reading, blogging, answering technical queries, and mentoring budding Data Scientists in her leisure time.
Need of Adaptive Ethical ML Models in Post Pandemic Era(Talk)

Jeremy Irvin
Jeremy is a PhD candidate at Stanford University advised by Professor Andrew Ng. Jeremy is interested in developing machine learning tools for climate change and medicine. His current research is focused on developing machine learning approaches using remote sensing data for mapping energy and transportation infrastructure, with an emphasis on identifying sources of methane emissions globally.
Mapping for Climate Change with Deep Learning on Remotely Sensed Imagery(Talk)

Sanghamitra Deb, PhD
Sanghamitra Deb is a Staff Data Scientist at Chegg, where she works on problems related to school and college education to sustain and improve the learning process. Her work involves recommendation systems, computer vision, graph modeling, deep NLP analysis, data pipelines and machine learning. Previously, Sanghamitra was a data scientist at Accenture, where she worked on a wide variety of problems related to data modeling, architecture and visual storytelling. She is an avid fan of Python and has been programming for more than a decade. Trained as an astrophysicist (she holds a PhD in physics), she applies her analytical mind not only to work in a range of domains such as education, healthcare and recruitment but also to her leadership style. She mentors junior data scientists at her current organization and coaches students from various fields to transition into Data Science. Sanghamitra enjoys addressing technical and non-technical audiences at conferences and encourages women to join tech careers. She is passionate about diversity and has organized Women In Data Science meetups.
Intro to NLP: Text Categorization and Topic Modeling(Workshop)

Arle Lommel, PhD
Dr. Arle Lommel is a senior analyst with independent market research firm CSA Research. He is a recognized expert in translation quality processes and interoperability standards. Arle’s research focuses on translation technology and the intersection of language and artificial intelligence as well as the value of language in the economy. Born in Alaska, he holds a PhD from Indiana University. Prior to joining CSA Research he worked at the German Research Center for Artificial Intelligence (DFKI) in its Berlin-based language technology lab. In addition to English he speaks fluent Hungarian and passable German, along with bits and pieces of other languages.
How Can We Make Machine Translation Responsive and Responsible?(Business Talk)

Andrew Engel, PhD
Andrew Engel is the Chief Data Scientist at Rasgo. He has been working as a data scientist and leading teams of data scientists for over ten years in a wide variety of domains from fraud prediction to marketing analytics. Andrew received his Ph.D. in Systems and Industrial Engineering with a focus on optimization and stochastic modeling. He has worked for Towson University, SAS Institute, the US Navy, Websense (now ForcePoint), Stics, HP and led DataRobot’s efforts in Entertainment, Sports and Gaming before joining Rasgo in August of 2020.
Feature Engineering on the Modern Data Stack(Demo Talk)

Serdar Cellat, PhD
Serdar Cellat is currently working as a Lead Machine Learning Engineer at Y Meadows, an early stage startup helping customer service teams with NLP solutions. Serdar’s work includes a variety of NLP tasks such as Intent Classification, Named Entity Recognition, Emotion Detection, Topic Modeling, Question and Answering Systems, and Semantic Textual Similarity. Prior to that, Serdar was a Senior Data Scientist at Liberty Mutual Insurance Legal Department where he built NLP systems to detect excessive charges in legal invoices. Serdar has also served as an instructor and mentor for post graduate certificate programs in Machine Learning and Data Science offered by MIT and UT Austin. Serdar holds a PhD degree in Mathematics from Florida State University, and his research focus was on Machine Learning and Optimization.

Srinivasa Kadamati
Srini is a PMC member for the Apache Superset project and heads up community & developer relations efforts at Preset. Before joining Preset, Srini worked as a data scientist and a data science educator for over 6 years. Most recently, he was the first employee at Dataquest and helped lead educational content, engineering, and product efforts there.
Deep Dive Workshop for Apache Superset(Workshop)

Jay Lowe
Jay is a field engineer with a background in deep learning, full stack development, and marine research. At Roboflow, he combines technical CV skills with business acumen to help customers rapidly build value and empower developers to integrate CV into their own applications.

Patrick von Platen
Patrick von Platen is a research engineer at Hugging Face and one of the core maintainers of the popular Transformers library. He specializes in speech recognition, encoder-decoder models and long-range sequence modeling. Before joining Hugging Face, Patrick conducted research in speech recognition at Uber AI, Cambridge University, and RWTH Aachen University.
Transformers & Datasets for Research and Production(Training)

Adrien Treuille, PhD
Adrien is co-founder and CEO of Streamlit. Previously, Dr. Treuille was VP of Simulation at Zoox, led a Google X project, and was a Professor of Computer Science and Robotics at Carnegie Mellon. He gives talks around the world, including to the President’s Council of Advisors on Science and Technology, and has won numerous scientific awards, including the MIT TR35. Adrien and his work have been featured in the documentaries “What Will the Future Be Like” by PBS/NOVA, and “Lo and Behold” by Werner Herzog.
Streamlit: Next-generation Communication of Data Insights(Workshop)

Shan He
Shan He is Senior Director of Engineering at Foursquare and Co-Founder of Unfolded – acquired by Foursquare. She is an engineer, a designer, and a data artist who has built her career in geospatial analytics and visualization. Before founding Unfolded, Shan was the first member of Uber’s data visualization team. At Uber, she created and open-sourced kepler.gl, an advanced geospatial visualization tool and the 2018 Kantar Information is Beautiful Award Gold winner.
Supercharging Geospatial Analysis In Your Data Science Workflow(Demo Talk)

Akram Dweikat
Akram Dweikat is a computer engineer and entrepreneur, specialized in machine learning & AI. He has been recognized by the UK Government as an Exceptional Talent in computer engineering, innovation, and entrepreneurship. Akram is currently the Engineering Manager for Deliveroo’s Network Economics (ML) team. Also, he is a global data science ambassador for Z by HP. He has been appointed as an AI Expert by the World Economic Forum, serving on their Global Future Council on Artificial Intelligence for Humanity. In his spare time, Akram helps build agricultural gardens for income and food security in his native Palestine. Earlier in his career, Akram helped establish the entrepreneurial community in Nablus and was one of eight youth selected to meet US President Barack Obama on his official visit to Palestine.
Introduction to WSL2 for Data Science with Z by HP(Demo Talk)

Bob Foreman
Bob Foreman has worked with LexisNexis, the open source big data HPCC Systems technology platform, and the ECL programming language for more than 10 years, and has been a technical trainer for more than 25 years. He is the developer and designer of the HPCC Systems Online Training Courses, and is the Senior Instructor for all classroom and remote training. In addition to being one of HPCC Systems' favorite trainers, Bob has more than 30 years of industry experience in training, consulting and technical writing with RDBMS platforms and, most recently, large scale data technology.
HPCC Systems – The Kit and Kaboodle for Big Data and Data Science(Demo Talk)

Gavin McCormick
While a PhD student in energy econometrics at UC Berkeley, Gavin McCormick invented “Automated Emissions Reduction”: software that instantly reduces pollution from smart devices such as electric vehicles and smart thermostats. Today he is Executive Director of WattTime, a nonprofit that helps Fortune 500 companies and governments use this technology to lower their carbon footprint at scale. He is also a cofounder of Climate TRACE: a coalition of nonprofits, tech companies, and universities using satellites and AI to monitor every source of greenhouse gas emissions on Earth.
Timing IoT Devices to Slash Carbon Emissions at Scale(Business Talk)

Glen Ford
Glen Ford is VP of Product at iMerit — a leading AI data solutions company — where he leads the product management and design teams. Glen holds more than two decades of product development experience across the technology sector. A Graduate of Texas A&M University—Commerce, Glen began his career as a consultant where he handled full-stack web programming and architecture for clients including Time Warner and AIM Funds. Over the years, he has held senior and director-level product management roles at several companies including Demand Media, WP Engine and Humanify. Most recently, Glen spent four years at Alegion — an ML-powered data annotation platform — where he helped the company grow from eight full-time employees to more than 100 in a challenging, emerging market.
The Hidden Layers of Tech Behind Successful Data Labeling(Demo Talk)

Aishwarya Srinivasan
Aishwarya works as a Data Scientist in the Google Cloud AI Services team, building machine learning solutions for customer use cases and leveraging core Google products including TensorFlow, DataFlow, and AI Platform. Previously, Aishwarya worked as an AI & ML Innovation Leader at IBM Data & AI, where she worked cross-functionally with the product team, data science team and sales to research AI use-cases for clients by conducting discovery workshops and building assets to showcase the business value of the technology. She is an advocate for open-source technologies; she was previously a developer advocate for PyTorch Lightning and a contributor to scikit-learn. She holds a post-graduate degree in Data Science from Columbia University. She has worked with clients all across the globe and has traveled internationally to London, Dubai, Istanbul, and India to lead and work with them. She is very focused on expanding her horizons in the machine learning research community, including her Patent Award won in 2018 for developing a Reinforcement Learning model for Machine Trading.
She is an ambassador for the Women in Data Science community, originating from Stanford University. She has a huge follower base on LinkedIn and actively organizes events and conferences to inspire budding data scientists. She has been spotlighted as a LinkedIn Top Voice 2020 for Data Science and AI, which features Top 10 Machine Learning influencers across the world.
She is an ardent reader and has contributed to the scholastic community. To spread her knowledge in the space of data science, and to inspire budding Data Scientists, she actively writes blogs related to machine learning on LinkedIn: https://www.linkedin.com/in/aishwarya-srinivasan/

Jared Lander
Jared Lander is the Chief Data Scientist of Lander Analytics, a data science consultancy based in New York City, the Organizer of the New York Open Statistical Programming Meetup and the New York R Conference, and an Adjunct Professor of Statistics at Columbia University. With a master's from Columbia University in statistics and a bachelor's from Muhlenberg College in mathematics, he has experience in both academic research and industry. His work for both large and small organizations ranges from music and fundraising to finance and humanitarian relief efforts.
He specializes in data management, multilevel models, machine learning, generalized linear models and statistical computing. He is the author of R for Everyone: Advanced Analytics and Graphics, a book about R Programming geared toward Data Scientists and Non-Statisticians alike, and is creating a course on glmnet with DataCamp.
Manipulating and Visualizing Data with R(Bootcamp)

Peter Spangler
Peter is a hands-on data science leader with a business-focused approach to building data science solutions and telling stories with data. He is experienced in translating business problems into data products using advanced statistical techniques and ML to support decision making in a variety of rapid growth environments. He has scaled data science solutions for user acquisition, retention, channel optimization, revenue and fraud at Lyft, Alibaba and Citrix, and currently leads Marketing Science for Growth at Nextdoor.
Data Visualization with ggplot2(Workshop)

Zoe Steinkamp
Zoe Steinkamp is a Developer Advocate for InfluxData. She is new to the developer advocate role. She has worked for InfluxData as a front end software engineer for over two years. Before InfluxData, she worked as a front end engineer for over 5 years in the original AngularJS. She originally went to a bootcamp for training in Python. Her favorite activities outside of work include traveling and gardening.
InfluxDB: The Database for Your Time Series Data Science Problem(Demo Talk)

Kaushik Bokka
Kaushik Bokka is a Senior Research Engineer at Grid.ai and one of the core maintainers of the PyTorch Lightning library. He has prior experience in building production scale Machine Learning and Computer Vision systems for several products ranging from Video Analytics to Fashion AI workflows. He has also been a contributor to a few other open source projects and aims to empower the way people and organizations build AI applications.

Manuela Veloso, PhD
Manuela Veloso is Head of J.P. Morgan Chase AI Research and Herbert A. Simon University Professor Emerita at Carnegie Mellon University, where she was previously Faculty in the Computer Science Department and Head of the Machine Learning Department. She is past president of the Association for the Advancement of Artificial Intelligence (AAAI), and the co-founder and a Past President of the RoboCup Federation. In her career she has received numerous awards and honors, including: National Science Foundation CAREER Award, Allen Newell Medal for Excellence in Research, Radcliffe Fellow at the Radcliffe Institute for Advanced Study (Harvard University), Einstein Chair Professor of the Chinese Academy of Sciences, and the ACM/SIGART Autonomous Agents Research Award for “contributions to the field of artificial intelligence, in particular in planning, learning, multi-agent systems, and robotics.” Veloso is a Fellow of AAAI, AAAS, ACM, and IEEE. She was elected in 2022 to the National Academy of Engineering.

Usama Fayyad, PhD
Usama Fayyad is the Inaugural Executive Director of the Institute for Experiential AI at Northeastern University, where he is also professor of computer science. He is Chairman at Open Insights, focusing on AI/ML/BigData, Data strategy, and new business models for Data. He was Global Chief Data Officer at Barclays Bank in London (2013-2016), after which he served as Co-founder and CTO at OODA Health in San Francisco, focused on AI in the healthcare space. He was Chairman/CEO/CTO at several Seattle/Silicon Valley tech startups and the first person to hold the title of Chief Data Officer, when Yahoo! acquired his 2nd startup in 2004. He held leadership roles at Microsoft (1996-2000) and founded the Machine Learning Systems group at NASA’s Jet Propulsion Laboratory (1989-1996), where he was awarded Caltech’s top Excellence in Research award and a U.S. Government medal from NASA. Usama has published over 100 technical articles, holds over 30 patents, and is a Fellow of both the Association for the Advancement of Artificial Intelligence and the Association for Computing Machinery. He is a recipient of both the ACM SIGKDD Awards for Innovation and for Service. He earned his Ph.D. from the University of Michigan and holds two BSEs in Electrical and Computer Engineering, an MSE in Computer Engineering and an M.Sc. in Mathematics.
Data Science and AI in Digital Transformation: Digital Can Lead to Blindness(Keynote)

Hilary Mason
Hilary Mason is the co-founder and CEO of Hidden Door. Prior to Hidden Door she was General Manager of the Machine Learning business unit at Cloudera (NYSE: CLDR). She previously founded Fast Forward Labs, an applied machine learning research and consulting startup which Cloudera acquired in 2017. Additionally, she was Data Scientist in Residence at Accel Partners, co-founded HackNY, and was Chief Scientist at bitly. Hilary has received numerous awards, is a regular keynote speaker, and has advised startups, corporations, and governments.
Making Story Computable: The Future of Co-creative Entertainment(Keynote)

Christy Bergman
Christy is a Developer Advocate at Anyscale. Her work involves figuring out how to parallelize different AI algorithms and creating demos and tutorials on how to use Ray and Anyscale. Before that, she was a Senior AI/ML Specialist Solutions Architect at AWS and Data Scientist at a banking startup and at Atlassian. In her spare time, she enjoys hiking and bird watching.
Hands-on Reinforcement Learning with Ray and RLlib(Tutorial)

Avnish Narayan
Avnish Narayan is an ML Engineer at Anyscale where he works on RLlib. He’s passionate about exploring where RL can improve upon existing solutions in industrial applications. He previously received his MS in Computer Science at USC, where he did research on the applications of RL in robotic manipulation problems.
Hands-on Reinforcement Learning with Ray and RLlib(Tutorial)

Karthik Rao
Karthik Rao is a machine learning engineer at Arthur AI (Monitoring, Performance, Explainability). He was previously an undergraduate at Harvard where he focused on big data systems for machine learning. He is passionate about designing and building novel machine learning solutions using state-of-the-art frameworks.
FastCFE: A Distributed Deep Reinforcement Learning Counterfactual Explainer(Talk)

Steve Dietrich
Steve Dietrich delivers B2B and B2C AI/ML-based solution outcomes through Iguazio’s distinctive MLOps automation technology.
Steve has built and implemented MLOps, industrial analytics and operations research-oriented software products and solutions in individual contributor and leadership roles in industries including oil and gas E&P, energy trading, derivatives trading, B2B medical, B2B high-tech manufacturing, process manufacturing, industrial goods and consulting/SI.
How Enterprises use MLOps Automation to Continuously Roll out New AI Services(Talk)

Jessica Dai
Jessica Dai is a Machine Learning Engineer at Arthur AI, where she works on research and development for fairness-related features. Previously, she conducted research with collaborators from CMU, Harvard, and Brown.
Operationalizing Fair ML: From Industry to Research and Back(Talk)

Laura Mariano
Laura Mariano is a data scientist and engineer with twelve years of industry experience developing machine learning and signal processing algorithms across a wide range of domains, including: computer vision, weather sensing, behavioral fingerprinting, biometrics, disease diagnostics, RF signal processing, and brain computer interfaces. She is the Lead Ethical AI Data Scientist at Humana, where she is developing technology and policy that translate the principles of Responsible and Ethical AI development into practice, at enterprise scale.
Operationalizing Fair ML: From Industry to Research and Back(Talk)
See all our talks and hands-on workshop and training sessions
See all sessions
You Will Meet
Top speakers and practitioners in Machine Learning and Deep Learning
Data Scientists and Data Analysts
Decision makers
Software Developers focused on Machine Learning and Deep Learning
Data Science Innovators
CEOs, CTOs, CIOs
Industry leaders
Core contributors in the fields of Machine Learning and Deep Learning
Data Science Enthusiasts
Why Attend?
Immerse yourself in talks, tutorials, and workshops on Machine Learning and Deep Learning tools, topics, models and advanced trends
Expand your network and connect with like-minded attendees to discover how Machine Learning and Deep Learning knowledge can transform not only your data models but also your business and career
Meet and connect with the core contributors and top practitioners in the expanding and exciting fields of Machine Learning and Deep Learning
Learn how the rapid rise of intelligent machines is revolutionizing how we make sense of data in the real world and its coming impact on the domains of business, society, healthcare, finance, manufacturing, and more
ODSC EAST 2023 | May 9th-11th
Register your interest for East 2023
Sessions on Machine Learning & Deep Learning Track
Workshop: Deciphering the Black Box: Latest Tools and Techniques for Interpretability
Talk: Adversarial Attacks on Deep Neural Networks
Training: Integrating Pandas with Scikit-Learn, an Exciting New Workflow
Workshop: Machine Learning for Digital Identity
Talk: Adding Context and Cognition to Modern NLP Techniques
Training: Good, Fast, Cheap: How to do Data Science with Missing Data
Workshop: Open Data Hub workshop on OpenShift
Talk: Practical AI solutions within healthcare and biotechnology
Training: Apache Spark for Fast Data Science (and Fast Python Integration!) at Scale
Workshop: Reproducible Data Science Using Orbyter
Talk: Combining millions of products into one marketplace using computer vision and natural language processing