ODSC Europe | June 14-15th, 2023 | In-person and Virtual
Data Engineering & MLOps
Focusing on the practice, engineering, workflows and DataOps of data science
Understand the Practice of Data Engineering in the Real World
As data science extends its reach across an enterprise, the need for better management, workflow, production and deployment practices increases. The challenges of deploying and monitoring models in production, managing data science workflows and teams, and understanding ROI are a few of the issues organizations wrestle with.
Learn best practices for effective data science management
Sessions in this broad focus area will look at use cases, best practices, and stories from the field to show how to effectively incorporate data science practice into the wider business process. This focus area will look beyond data sourcing and modeling towards the many challenges teams need to overcome to effectively apply data science in their organization.
What You'll Learn
Data science has many focus areas. The goal of this track is to accelerate your knowledge of data science through a series of introductory level training sessions, talks, tutorials and workshops on the most important data science tools and topics.
Machine Learning Pipelines
Kubeflow & Kubernetes
Automated Machine Learning
Data Science Architecture
Debugging Machine Learning
Data Gathering
Data Analysis
Data Transformation & Preparation
Model Training & Development
Model Validation, Monitoring, and Re-training
Best Practices & Use Cases
Our Previous MLOps & Data Engineering Speakers

Yaron Haviv
Yaron Haviv is a serial entrepreneur who has been applying his deep technological experience in data, cloud, AI and networking to leading startups and enterprise companies since the late 1990s. As the co-founder and CTO of Iguazio, Yaron drives the strategy for the company’s MLOps platform and led the shift towards the production-first approach to data science and catering to real-time AI use cases. He also initiated and built Nuclio, a leading open source serverless platform with over 4,000 Github stars and MLRun, Iguazio’s open source MLOps orchestration framework. Prior to co-founding Iguazio in 2014, Yaron was the Vice President of Datacenter Solutions at Mellanox (now NVIDIA), where he led technology innovation, software development and solution integrations. He was also the CTO and Vice President of R&D at Voltaire, a high-performance computing, IO and networking company which floated on the NYSE in 2007. Yaron is an active contributor to the CNCF Working Group and was one of the foundation’s first members. He presents at major industry events and writes tech content for leading publications including TheNewStack, Hackernoon, DZone, Towards Data Science and more.
From AutoML to AutoMLOps (Track Keynote)

Wade Schulz, MD, PhD
Dr. Schulz is a physician scientist with a background in computational healthcare, molecular biology, and virology. Dr. Schulz has over 20 years’ experience in software development with a focus on enterprise system architecture and has research interests in the management of large biomedical data sets and the use of real-world data for predictive modeling. At Yale School of Medicine, he has led the deployment of the organization’s data science infrastructure, which consists of a composable computing infrastructure to support the development of biomedical AI applications. Dr. Schulz is also a co-founder of Refactor Health, a digital health startup focused on the development of AI-driven digital signatures and automated healthcare DataOps.

Karin Wolok
Karin currently leads developer community programming in the Developer Relations team at StarTree. Karin began her career in entertainment marketing, working with names like Eminem and Live Nation. She also launched a successful professional women’s network in two major cities in the U.S., organized events for her local Data Science meetup, and helped lead an ongoing hackathon to put machine learning in the hands of cancer biologists. Her journey working in data eventually led her to a position as Program Manager for Community Development for the leading graph database in the world, Neo4j. Most recently, she was brought on to StarTree to improve the adoption and success of the overall developer community.
Real-Time Analytics: Going Beyond Stream Processing with Apache Pinot (Workshop)

Ville Tuulos
Ville has been developing infrastructure for machine learning for over two decades. He has worked as an ML researcher in academia and as a leader at a number of companies, including Netflix where he led the ML infrastructure team that created Metaflow, a popular open-source framework for data science infrastructure. He is a co-founder and CEO of Outerbounds, a company developing modern human-centric ML. He is also the author of an upcoming book, Effective Data Science Infrastructure, published by Manning.
Human-Friendly, Production-Ready Data Science with Metaflow (Talk)

Adi Hirschtein
Adi Hirschtein brings 20 years of experience as an executive, product manager and entrepreneur building and driving innovation in technology companies. As the VP of Product at Iguazio, the MLOps platform built for production and real-time use cases, he leads the product roadmap and strategy. His previous roles spanned technology companies such as Dell EMC, Zettapoint and InfraGate, in diverse positions including product management, business development, marketing, sales and execution, with a strong focus on machine learning, database and storage technology. When working with startups and corporates, Adi’s passion lies in taking a team’s ideas from their very first day, through a successful market penetration, all the way to an established business. Adi holds a B.A. in Business Administration and Information Technology from the College of Management Academic Studies.

Hugo Bowne-Anderson, PhD
Hugo Bowne-Anderson is a data scientist, writer, educator & podcaster. His interests include promoting data & AI literacy/fluency, helping to spread data skills through organizations and society and doing amateur stand-up comedy in NYC. He does many of these at DataCamp, a data science training company educating over 3 million learners worldwide through interactive courses on the use of Python, R, SQL, Git, Bash and Spreadsheets in a data science context. He has spearheaded the development of over 25 courses in DataCamp’s Python curriculum, impacting over 170,000 learners worldwide through his own courses. He hosts and produces the data science podcast DataFramed, in which he uses long-format interviews with working data scientists to delve into what actually happens in the space and what impact it can and does have. He earned a PhD in Mathematics from the University of New South Wales, Australia and has conducted biomedical research at the Max Planck Institute in Germany and Yale University, New Haven.
Full-stack Machine Learning for Data Scientists (Tutorial)

Gilad Shaham
Gilad has over 15 years of experience in product management and a solid R&D background. He combines analytical skills and technical innovation with Data Science market experience. Gilad’s passion is to define a product vision and turn it into reality. As Director of Product Management at Iguazio, Gilad manages both the Enterprise MLOps Platform product and MLRun, Iguazio’s open source MLOps orchestration framework. Prior to joining Iguazio, Gilad managed several different products at NICE-Actimize, a leading vendor of financial crime prevention solutions, including coverage of machine-learning based solutions, formation of a marketplace and addressing customer needs across different domains. Gilad holds a B.A. in Computer Science, an M.Sc. in Biomedical Engineering and an MBA from Tel-Aviv University.
It worked on my Laptop, now what? Using OS tool MLRun to Automate the Path to Production (Demo Talk)

Stefanie Molin
Stefanie Molin is a software engineer and data scientist at Bloomberg in New York City, where she tackles tough problems in information security, particularly those revolving around data wrangling/visualization, building tools for gathering data, and knowledge sharing. She is also the author of “Hands-On Data Analysis with Pandas,” which is currently in its second edition. She holds a bachelor’s of science degree in operations research from Columbia University’s Fu Foundation School of Engineering and Applied Science, as well as a master’s degree in computer science, with a specialization in machine learning, from Georgia Tech. In her free time, she enjoys traveling the world, inventing new recipes, and learning new languages spoken among both people and computers.

Jacob Tomlinson
Jacob Tomlinson is a senior Python software engineer at NVIDIA with a focus on deployment tooling for distributed systems. His work involves maintaining open source projects including RAPIDS and Dask. RAPIDS is a suite of GPU accelerated open source Python tools which mimic APIs from the PyData stack including those of Numpy, Pandas and SciKit-Learn. Dask provides advanced parallelism for analytics with out-of-core computation, lazy evaluation and distributed execution of the PyData stack. He also tinkers with the open source chatbot automation framework Opsdroid in his spare time. Jacob volunteers with the local tech community group Tech Exeter and lives in Exeter, UK.
GPU Development with Python 101 (Workshop)

Hadrien Jean, PhD
Hadrien Jean is a machine learning scientist working at My Medical Assistent where he is developing deep learning models in the medical domain. He wrote the book Essential Math for Data Science (https://www.essentialmathfordatascience.com/) aimed at helping people to get the math needed in data science from a coding perspective. He previously worked at Ava on speech diarization. He also worked on a bird detection project using deep learning. He completed his Ph.D. in cognitive science at the École Normale Supérieure (Paris, France) on the topic of auditory perceptual learning with a behavioral and electrophysiological approach. He has published a series of blog articles aiming at building intuition on mathematics through code and visualization (https://hadrienj.github.io/posts/).
Introduction to Linear Algebra for Data Science and Machine Learning With Python (Bootcamp)

Dan Sullivan, PhD
Dan Sullivan is a Principal Data Architect at 4 Mile Analytics. He has a PhD in Genetics, Bioinformatics and Computational Biology and is a former research scientist in infectious disease genomics. He is the author of Google Cloud Certification Study Guides for the Professional Data Engineer, Professional Architect, and Associate Cloud Engineer certifications and an instructor on LinkedIn Learning and Udemy where he provides courses on data science, data modeling, and cloud computing.

Mike Tapi Nzali, PhD
Mike Tapi Nzali, PhD, is a machine learning engineer at CybelAngel with a PhD in Computer Science. He likes to work in a startup environment, leading the development of machine learning products from idea to production. He is interested in cutting-edge technology, sharing knowledge and the industrialization of machine learning.
Reproducible and Shareable Notebooks Across a Data Science Team (Talk)

Aoife Cahill, PhD
Aoife Cahill is a Natural Language Processing (NLP) expert and a director of AI research at Dataminr, the leading real-time information discovery platform. Since joining in 2021, Aoife has led a team of data scientists focused on the efficient iterative process of developing and evaluating AI technology that supports the expansion of Dataminr’s internal and external products.
Prior to Dataminr, Aoife led a team of research scientists and engineers working on high-stakes NLP applications in the educational domain at the Educational Testing Service (ETS). The NLP teams at ETS are known leaders in the field of developing and deploying robust, well-documented, scalable NLP prototypes that maintain fairness across user groups.
Aoife holds a PhD in Computational Linguistics from Dublin City University, Ireland, and has also spent time conducting NLP research in Germany, Norway and in the U.S. As an active member of the computational linguistics research community, her research has been published in top-tier journals including Computational Linguistics and the Journal of Research on Language and Computation, as well as conference proceedings at the annual conference for the Association for Computational Linguistics (ACL), the International Conference on Computational Linguistics (COLING) and the Conference on Empirical Methods in Natural Language Processing (EMNLP).
AI for Emergency Response (Demo Talk)

Tuhin Sharma
Tuhin Sharma is a Senior Principal Data Scientist at Red Hat in the Corporate Development and Strategy group. Prior to that he worked at Hpersonix as an AI Architect. He also co-founded and has been CEO of Binaize, a website conversion intelligence product for e-commerce SMBs. He received a master’s degree in Computer Science, with a specialization in Data Mining, from the Indian Institute of Technology Roorkee, and a bachelor’s degree in Computer Science from the Indian Institute of Engineering Science and Technology Shibpur. He loves to code and collaborate on open source and research projects. He has 4 research papers and 5 patents in the field of AI and NLP. He is a reviewer for the IEEE MASS conference in the AI track. He writes deep learning articles for O’Reilly in collaboration with the AWS MXNet team. He loves to play TT and guitar in his leisure time. His favorite quote is “Life is Beautiful”.
Eagleeye: Data Pipeline for Anomaly Detection in Cyber Security (Talk)

Ryan Dawson
Ryan Dawson is a technologist passionate about data. Ryan works with clients on large-scale data and AI initiatives, helping organizations get more value from data. His work includes strategies to productionize machine learning, organizing the way data is captured and shared, selecting the right data technologies and optimal team structures, as well as writing the code to make it happen. He has over 15 years of experience and, as well as many widely read articles about MLOps, software design, and delivery, is the author of the Thoughtworks Guide to Evaluating MLOps Platforms.
Data Mesh: From Concept to Code (Talk)
Data Science in the Industry: Continuous Delivery for Machine Learning (Training)

Christos Hadjinikolis, PhD
Christos has a PhD in Computing and has worked for many years as an ML consultant for companies across different domains (telecom, finance, gaming). For the last 3 years, he has been focussing on ML-Ops, defining and curating the ML-Development Lifecycle for the companies that hire him. He has recently embarked on a new adventure with Vortexa Ltd, working as a Lead ML Engineer and helping the company scale technically as it grows.
Dynamicio (a pandas I/O wrapper); Why you Should Start your ML-Ops Journey with Wrapping your I/O (Talk)

Shawn Kyzer
Shawn is passionate about harnessing the power of data strategy, engineering and analytics in order to help businesses uncover new opportunities. As an innovative technologist with over 13 years of experience, Shawn removes technology as a barrier, and broadens the art of the possible for business and product leaders. His holistic view of technology and emphasis on developing and motivating strong engineering talent, with a focus on delivering outcomes whilst minimising outputs, is one of the characteristics which sets him apart from the crowd.
Shawn’s deep technical knowledge includes distributed computing, cloud architecture, data science, machine learning and engineering analytics platforms. He has years of experience working as a consultant practitioner for a variety of prestigious clients ranging from secret clearance level government organizations to Fortune 500 companies.

Itai Bar-Sinai
With over 10 years of experience (Google, AI-focused startups) with big data and as the CPO and head of customer success at Mona, the leading AI monitoring intelligence company, Itai has a unique view of the AI industry. Working closely with data science and ML teams applying dozens of solutions in over 10 industries, Itai encounters a wide variety of business use-cases, organizational structures and cultures, and technologies used in today’s AI world.
Utilizing Advanced Monitoring Capabilities to Promote Product-oriented Data Science (Tutorial)

Prathiba Krishna
Prathiba is an experienced Data Scientist with a rich background in the Insurance industry. With a Master’s degree in Operational Research with Applied Statistics and Risk, her passion takes form through seeing the varying applications of Machine Learning and AI techniques, and how they propel data scientists to build better models and solutions. Skilled in data analysis and modelling, she utilizes SAS software and Open Source to assess and address problems within enterprise organizations.
Interpretability vs Explainability: Unpacking the Role of Human Morality in AI Models (Talk)

Gal Naamani
Gal Naamani has been working as a data scientist for 4 years, with the past 3 years being at Fiverr. As the Senior Data Scientist, Gal works closely with developers, analysts, product managers, and business owners on growth opportunities and new ideas, from research to production. Gal currently has leading roles in projects that are focused around search engine ranking, promoted ads, online bidding optimization, exploration-exploitation problems, monitoring, and more.
Utilizing Advanced Monitoring Capabilities to Promote Product-oriented Data Science (Tutorial)

Felipe de Pontes
Felipe is a Data Scientist at WhyLabs. He is a core contributor to whylogs, an open-source data logging library, and focuses on writing technical content and expanding the whylogs library in order to make AI more accessible, robust, and responsible. Previously, Felipe was an AI Researcher at WEG, where he researched and deployed Natural Language Processing approaches to extract knowledge from textual information about electric machinery. He also holds a master’s degree in Electronic Systems Engineering from UFSC (Universidade Federal de Santa Catarina), with research focused on developing and deploying fault detection strategies based on machine learning for unmanned underwater vehicles. Felipe has published a series of blog articles about MLOps, Monitoring, and Natural Language Processing in publications such as Towards Data Science, Analytics Vidhya, and Google Cloud Community.
Visually Inspecting Data Profiles for Data Distribution Shifts (Workshop)

Mihir Mathur
Mihir Mathur is the lead Product Manager for Machine Learning at Lyft, where he works on building ML and AI tools that power automated intelligent decisions across realtime pricing, ETAs, fraud detection, safety classification etc. In the past Mihir has worked on building delightful products for millions of users at Quora, Houzz, and Thomson Reuters. Mihir graduated magna cum laude from UCLA with a Bachelor’s and Master’s in Computer Science.
A Systematic Approach for Building Full-Spectrum Model Monitoring (Talk)

Robert Magno
Rob Magno is a Sales Engineer/Solution Architect at Run:AI based in New Jersey. He has been working in the Docker and Kubernetes space for the past five years. He enjoys tackling the diverse customer challenges that come with orchestrating AI/ML workloads through Kubernetes.
Building the Best AI Infrastructure Stack to Accelerate Your Data Science (Demo Talk)

Gijsbert Janssen Van Doorn
Gijsbert Janssen van Doorn is Director of Technical Product Marketing at Run:AI. He is a passionate advocate for technology that will shape the future of how organizations run AI. Gijsbert comes from a technical engineering background, with six years in multiple roles at Zerto, a Cloud Data Management and Protection vendor.

Jake Bengtson
Jake is currently working as a Senior Product Marketing Manager for ML Lifecycle products at Cloudera. Before joining Cloudera, Jake worked as a Data Scientist and Solution Architect at ExxonMobil. Additionally, he worked as a Senior Data Scientist at FarmersEdge. Before starting his professional career, Jake obtained his bachelor’s and master’s degrees from Brigham Young University. When he isn’t working, Jake enjoys skiing, golfing, and spending time with his family in the mountains.
Forecasting Crypto Currency Prices with Cloudera Applied Machine Learning Prototypes (Demo Talk)

Terry McCann
Terry is the Director of AI for Advancing Analytics and Microsoft Artificial Intelligence MVP with a focus on all things AI and Data Science. Terry has a passion for applying traditional Software Engineering techniques to Data, to improve the way teams deliver Machine Learning projects. Terry is the host of the popular podcasts Data Science in Production and Totally Skewed, and organises the Global AI Bootcamp London event.
Simplifying Model Production with MLFlow Pipelines and Delta (Demo Talk)

Simon Whiteley
Simon is the Director of Engineering for Advancing Analytics, a Microsoft Data Platform MVP and one of the few Databricks Beacons globally. Simon has pioneered Lakehouse Architectures for some of the world’s largest companies, challenging traditional analytical solutions and pushing for the very best for the data industry. Simon runs the Advancing Spark YouTube channel, where he can often be found digging into Spark features, investigating new Microsoft technologies and cheering on the Delta Lake project.
A Dive into Delta Lake: A Modern File Format for the Next-Generation Lake (Workshop)

Oryan Omer
Oryan is a Lead Software Engineer with a passion for Machine Learning and DevOps, with 7 years of experience developing services for production and development environments and leading teams.
Data-driven ML Retraining with Production Insights (Demo Talk)

Lee Baker
Lee is the General Secretary for the AI Infrastructure Alliance. Based out of the UK, he is responsible for crafting and nurturing relationships with companies to build a canonical stack for AI and ML. When not shuttling his 3 children around, he can most often be found cycling, running and swimming around England’s South Coast.
The Rapid Evolution of the Canonical Stack for Machine Learning (Demo Talk)

Carl Osipov
Carl implemented his first neural net in 2000. He is a senior director of the AI / ML practice at Cognizant, focusing on communications, technology, and media customers. Previously he worked on deep learning and machine learning at Google and IBM. Carl is an author of over 20 articles in professional, trade, and academic journals, an inventor with 6 patents at USPTO, and holds 3 corporate awards from IBM for his innovative work. His machine learning book, “MLOps Engineering at Scale” continues to receive reader acclaim. You can find out more about Carl from his blog www.cloudswithcarl.com
Revealing the Inner Self: Automatic Differentiation (Autodiff) Clearly Explained (Workshop)

Kaxil Naik
Kaxil is currently the Director of the Airflow Engineering Team at Astronomer. He is one of the top three committers to the Airflow project by number of commits and one of Airflow’s release managers. His most prominent work includes co-authoring DAG Serialization, Scheduler HA and the Secrets Backend.
He earned his Master’s in Data Science & Analytics from Royal Holloway, University of London. He started as a Data Scientist and then gained experience in the Data Engineering, Big Data and DevOps space. He began working on Airflow in 2017 while at Data Reply as a Big Data consultant, became a PMC member in 2018, and now works full-time at astronomer.io making Airflow better for everyone. He is a huge cricket fan and his favourite cricketers are Rahul Dravid and Virat Kohli.

Mark Needham
Bio Coming Soon!
Real-Time Analytics: Going Beyond Stream Processing with Apache Pinot (Workshop)

Tyler Ferguson
Tyler works as a Data Scientist at Vortexa where he focuses on building machine learning models that capture the dynamics of the energy markets. Prior to Vortexa, Tyler was doing research in clinical machine learning and published work in sports injury analytics and mathematical optimisation. He previously worked as a software engineer working for startups and clients in finance and uses this experience to contribute to the full lifecycle of building machine learning pipelines.
Dynamicio (a pandas I/O wrapper); Why you Should Start your ML-Ops Journey with Wrapping your I/O (Talk)

Meissane Chami
Meissane Chami serves ThoughtWorks, Inc. as a Senior ML Engineer, advising on and developing innovative data science and machine learning solutions from proof of concept to production. She has gained expertise setting up innovation frameworks and conducting fast-cycle proofs of concept. Her primary areas of expertise are Natural Language Processing, MLOps, DevOps, cloud computing, containerisation and Python. She holds an MSc degree in Machine Learning and Data Science from University College London School of Engineering.
Data Science in the Industry: Continuous Delivery for Machine Learning (Training)

Dr. Colin Gillespie
Dr Colin Gillespie is the Co-Founder and CTO of Jumping Rivers, a data science consultancy that specialises in all things R and Python. He is also a Senior Statistics lecturer at Newcastle University, has published over eighty peer-reviewed papers, and co-authored the O’Reilly book, Efficient R programming.

Andy Symonds, PhD
Andy Symonds is a technologist passionate about using data science in new and interesting ways. With a background in academia before moving into consulting, he loves problem solving and experimentation. Andy works with clients to help them gain insights and drive business value by developing proofs of concept and moving these solutions into production.
Data Science in the Industry: Continuous Delivery for Machine Learning (Training)

Alice Grout-Smith
Alice Grout-Smith is a Data Science Manager at Jaguar Land Rover (JLR). Over the past 5 years at JLR she has enjoyed being part of the data science team that has enabled £300m+ of value to the business. After presenting at the Open Data Science Conference back in 2019 on Hierarchical Bayesian Models, Alice and the team have been busy pursuing the exciting applications of causal inference. They have observed that data science in a business context often involves making interventions and taking actions, requiring techniques beyond traditional machine learning. Prior to joining JLR, Alice graduated from the University of Oxford with a degree in Chemistry and a Masters specialising in Quantum Mechanics, which was later published.

Jamie Hilton
Jamie Hilton is a Senior Data Scientist at Jaguar Land Rover (JLR) with over 5 years of experience realising business value through data insights. At JLR his work focuses on driving digital transformation with data, helping the business to make the right decisions at the right time. Previously, he led advanced analytics initiatives as Head of Customer Science at Manchester-based e-commerce business THG. He holds an MA in Mathematics from the University of Cambridge.
Jamie is particularly passionate about the application of data science to the automotive and motorsport industries. In 2021, he worked with leading Formula 2 team Virtuosi Racing to deliver a competitive advantage by leveraging their data, having studied Advanced Motorsport Engineering at Cranfield University.
Why Attend
Immerse yourself in talks and workshops on AI in MLOps and Data Engineering
With numerous introductory level workshops, you get hands-on experience to quickly build up your skills
Post-conference, get online access to over 100 high-quality recorded sessions that let you review content at your own pace
Take time out of your busy schedule to accelerate your knowledge of the latest advances in data science
Learn directly from world-class instructors who are the authors and contributors to many of the tools and languages used in data science today
Meet hiring companies, ranging from hot startups to Fortune 500, looking to hire professionals with data science skills at all levels
Get speaker insights and training in AI frameworks such as TensorFlow, MXNet, PyTorch, Spark, Storm, Drill, Keras, and other AI platforms
Get access to other focus area content, including ML/DL, Data Visualization, Big Data, and Open Data Science
Who Should Attend
Data scientists looking to configure data pipelines for building and deploying Machine Learning algorithms
Data scientists seeking to learn automation skills for experimentation and execution of Machine Learning algorithms
Anyone interested in understanding underlying frameworks in terms of machine configurations, storage indexing or data loading
Business professionals and industry experts looking to understand data science in practice
Software engineers and technologists who need to configure and administer Machine Learning Analytics related technologies such as Kafka, Spark, and Hadoop
CTO, CDS, and other managerial roles that require a bigger picture view of data science
Technologists in the field of MLOps looking to break into data science
Students and academics looking for more practical applied training in data science tools and techniques
ODSC Newsletter
Stay current with the latest news and updates in open source data science. In addition, we’ll inform you about our many upcoming virtual and in-person events in Boston, NYC, Sao Paulo, San Francisco, and London. And keep a lookout for special discount codes, only available to our newsletter subscribers!