ODSC EAST | IN-PERSON & VIRTUAL CONFERENCE
AI EXPO & DEMO HALL
AI Solutions Showcase | Networking
Boston Hynes Convention Center, May 9-11, 2023
TALKS AND DEMOS
PARTNERS
NETWORKING EVENTS
ATTENDEES
Learn how to Build AI Better
Want to keep up with the latest AI developments, trends, and insights? Dealing with the build or buy dilemma to grow your business? Seeking to interact with data-obsessed peers and build your network?
Look no further: The ODSC AI Expo & Demo Hall is the right destination for you.
Expo Hall Topics
Partner sessions offer compelling insights on how to make data science and AI work for your industry. Here are some of the topics you can expect at AI Expo & Demo Hall. Full agenda is coming soon.
Save 60% on Full Price
Visionaries and Thought Leaders
With an AI Expo Pass you can take advantage of 40+ demo sessions and ODSC Keynotes. Our speakers will provide compelling insights on how to make data science and AI work for your industry.
Past ODSC Keynote Speakers

Dr Hari Bhaskar, PhD
Hari Bhaskar is an engineering leader with hands-on experience in designing and developing the AI platform at OCI. He is a researcher with a PhD in big data architectures and machine learning. His expertise and interests include model life cycle management, MLOps, and ML security and bias assessment. He has published 25+ papers in leading academic journals such as IEEE and Springer, and presented at international conferences on topics related to AI and machine learning. He is passionate about model security, a nascent area of research in which threat vectors emerge as sophisticated, crafted attacks that mine models and associated information about data sets.
Is Your ML Secure? Cybersecurity and Threats in the ML World (Keynote)

Hilary Mason
Hilary Mason is the co-founder and CEO of Hidden Door. Prior to Hidden Door she was General Manager of the Machine Learning business unit at Cloudera (NYSE: CLDR). She previously founded Fast Forward Labs, an applied machine learning research and consulting startup which Cloudera acquired in 2017. Additionally, she was Data Scientist in Residence at Accel Partners, co-founded HackNY, and was Chief Scientist at bitly. Hilary has received numerous awards, is a regular keynote speaker, and has advised startups, corporations, and governments.
Making Story Computable: The Future of Co-creative Entertainment (Keynote)

Jean-René Gauthier, PhD
Jean-René Gauthier is the product architect behind the Oracle Cloud Infrastructure AI platform. Previously at DataScience.com, Jean-René designed the datascience.com platform model management features and roadmap. In addition, he managed a team of data experts in developing algorithms and analytics models to solve customers’ unique business problems. He is also responsible for educating clients on these algorithms and models, ensuring that they are incorporated into the business to add maximum value. Prior to his three years at DataScience.com, Jean-René was a data scientist at AuriQ Systems where he focused on online marketing analytics and data engineering, often involving high-speed processing of massive data sets. He holds a PhD in astrophysics from the University of Chicago and was a Millikan fellow at the California Institute of Technology.
Is Your ML Secure? Cybersecurity and Threats in the ML World (Keynote)

Luis Vargas, PhD
Luis Vargas is a Partner Technical Advisor to the CTO of Microsoft. He is responsible for Microsoft’s AI at Scale initiative, coordinating efforts across infrastructure, systems software, models, and products. He bootstrapped the productization of Automated ML and Reinforcement Learning in the Azure AI Platform, worked on the launch of Azure Database Services, and led the high-availability area for SQL Server. Luis has a PhD in Computer Science from Cambridge University.
The Big Wave of AI at Scale (Keynote)

Padhraic Smyth, PhD
Padhraic Smyth is a Chancellor’s Professor at the University of California, Irvine (UCI) with appointments in the Department of Computer Science and in the Department of Statistics. His research interests include machine learning, pattern recognition, and applied statistics, and he has published over 200 research papers on these topics. He is a Fellow of the Association for Computing Machinery (ACM) and the Association for the Advancement of Artificial Intelligence (AAAI) and has served in editorial and advisory positions for journals such as the Journal of Machine Learning Research and the Journal of the American Statistical Association. He has co-authored two texts, Principles of Data Mining (MIT Press, 2001) and Modeling the Internet and the Web (Wiley, 2003). While at UCI he has received research funding from federal agencies such as NSF, NIH, NASA, and NIST, as well as from companies such as Google, Qualcomm, SAP, Adobe, IBM, Experian, and Microsoft. In addition to his academic research he is also active in industry consulting, working on the development of new machine learning algorithms and methods across multiple application areas. He also served as an academic advisor to Netflix for the Netflix prize competition from 2006 to 2009. Padhraic grew up in the west of Ireland and received a bachelor’s degree in Electronic Engineering from the National University of Ireland (Galway) in 1984. He then received master’s and PhD degrees (in 1985 and 1988 respectively) in Electrical Engineering from the California Institute of Technology. From 1988 to 1996 he was a Technical Group Leader at the Jet Propulsion Laboratory, Pasadena, and has been on the faculty at UC Irvine since 1996.
Overconfidence in Machine Learning: Do Our Models Know What They Don’t Know? (Keynote)

Manuela Veloso, PhD
Manuela Veloso is Head of J.P. Morgan Chase AI Research and Herbert A. Simon University Professor Emerita at Carnegie Mellon University, where she was previously Faculty in the Computer Science Department and Head of the Machine Learning Department. She is past president of the Association for the Advancement of Artificial Intelligence (AAAI), and the co-founder and a Past President of the RoboCup Federation. In her career she has received numerous awards and honors, including: National Science Foundation CAREER Award, Allen Newell Medal for Excellence in Research, Radcliffe Fellow at the Radcliffe Institute for Advanced Study (Harvard University), Einstein Chair Professor of the Chinese Academy of Sciences, and the ACM/SIGART Autonomous Agents Research Award for “contributions to the field of artificial intelligence, in particular in planning, learning, multi-agent systems, and robotics.” Veloso is a Fellow of AAAI, AAAS, ACM, and IEEE. She was elected in 2022 to the National Academy of Engineering.

Ken Jee
As the Head of Data Science at Scouts Consulting Group, Ken spends his workdays improving the performance of athletes and teams by analyzing the data collected on them. He also dabbles in entrepreneurship and content creation, best known for his YouTube channel where he helps over 80,000 people navigate the data science landscape. More recently, Ken is focused on project-based learning through Kaggle. He hopes to share the processes that data scientists take when approaching Kaggle competitions and new datasets. He started the #66DaysOfData challenge to help people create the habit of learning and working on projects every day.
Bridging the Gap Between Data Scientists and Decision Makers (Keynote)

Usama Fayyad, PhD
Usama Fayyad is the Inaugural Executive Director of the Institute for Experiential AI at Northeastern University, where he is also a professor of computer science. He is Chairman at Open Insights, focusing on AI/ML/Big Data, data strategy, and new business models for data. He was Global Chief Data Officer at Barclays Bank in London (2013-2016), after which he served as Co-founder and CTO at OODA Health in San Francisco, focused on AI in the healthcare space. He was Chairman/CEO/CTO at several Seattle/Silicon Valley tech startups and became the first person to hold the title of Chief Data Officer when Yahoo! acquired his second startup in 2004. He held leadership roles at Microsoft (1996-2000) and founded the Machine Learning Systems group at NASA’s Jet Propulsion Laboratory (1989-1996), where he was awarded Caltech’s top Excellence in Research award and a U.S. Government medal from NASA. Usama has published over 100 technical articles, holds over 30 patents, and is a Fellow of both the Association for the Advancement of Artificial Intelligence and the Association for Computing Machinery. He is a recipient of both the ACM SIGKDD Awards for Innovation and for Service. He earned his Ph.D. from the University of Michigan and holds two BSEs in Electrical and Computer Engineering, an MSE in Computer Engineering, and an M.Sc. in Mathematics.
Data Science and AI in Digital Transformation: Digital Can Lead to Blindness (Keynote)
Featured AI Expo Speakers

Robert Magno
Rob Magno is a Sales Engineer/Solution Architect at Run:AI based in New Jersey. He has been working in the Docker and Kubernetes space for the past five years. He enjoys tackling the diverse customer challenges that come with orchestrating AI/ML workloads through Kubernetes.
Building the Best AI Infrastructure Stack to Accelerate Your Data Science (Demo Talk)

Adam Pocock, PhD
Adam Pocock is a Machine Learning researcher at Oracle Labs. He’s the lead developer of the Tribuo machine learning library, and maintains several other machine learning libraries on the JVM including TensorFlow-Java and ONNX Runtime’s Java API. Adam’s research has covered several areas of ML & applications, from work on scaling up and parallelizing Bayesian inference, to building multilingual NLP systems. He holds a PhD in Computer Science from the University of Manchester where his research focused on the theoretical underpinnings of feature selection algorithms.
Building Provenance and Reproducibility into ML Systems (Demo Talks)

Sanjay Patil
Sanjay Patil is the senior client solutions partner at Quantiphi and leads the Google Cloud business for US North-East. His passion and expertise in Data and AI/ML solutions drive Sanjay to build transformation roadmaps for customers and help them harness the power of data and the cloud.
Prior to Quantiphi, he worked with CVS Health as a Business Analyst, supporting the GM/Seasonal Business Unit in analyzing sales data and building seasonal budgets. Sanjay holds a Master’s in Marketing Analytics from the Graduate School at Bentley University, an MBA in Marketing, and a Graduate Certificate in Business Analytics.
Sanjay is an avid reader and continuous learner who enjoys discussions on AI/ML, cloud technology, leadership, and career coaching.
Reimagine Clinical Research with the Power of Artificial Intelligence (Demo Talk)

Tim Kraska, PhD
Tim Kraska is an Associate Professor of Electrical Engineering and Computer Science in MIT’s Computer Science and Artificial Intelligence Laboratory, co-director of the Data Systems and AI Lab at MIT (DSAIL@CSAIL), and co-founder of Einblick Analytics. Currently, his research focuses on building systems for machine learning and using machine learning for systems. Before joining MIT, Tim was an Assistant Professor at Brown, spent time at Google Brain, and was a postdoc in the AMPLab at UC Berkeley after earning his PhD from ETH Zurich. Tim is a 2017 Alfred P. Sloan Research Fellow in computer science and has received several awards, including the VLDB Early Career Research Contribution Award, the VMware Systems Research Award, the university-wide Early Career Research Achievement Award at Brown University, an NSF CAREER Award, as well as several best paper and demo awards at VLDB, SIGMOD, and ICDE.
Data Boards: A Collaborative and Interactive Space for Data Science (Demo Talk)

Jimmy Whitaker
Jimmy Whitaker is the Chief Scientist of AI at Pachyderm. He focuses on creating a great data science experience and sharing best practices for how to use Pachyderm. When he isn’t at work, he’s either playing music or trying to learn something new, because “You suddenly understand something you’ve understood all your life, but in a new way.”

Kevin Hu
Kevin Hu is co-founder and CEO of Metaplane, a data observability company based in Boston focused on helping every team find and fix data quality problems with as little setup as possible. Metaplane is backed by leading investors including Y Combinator and the founders of Okta, HubSpot, and Lookout, and is used across high-growth teams and large enterprises.
Kevin has over a decade of experience working in data. Most recently, he researched the intersection of machine learning and data science at MIT, where he collaborated with Fortune 500 companies while earning his PhD, SM, and SB. His research has been published in top computer science venues like ACM CHI, KDD, and SIGMOD, and featured in the New York Times, Wired, and The Economist.
The Origins, Purpose, and Practice of Data Observability (Talk)
Data Observability in 10 Minutes (Demo Talk)

Alex Kim
Alex Kim is a Solutions Engineer at Iterative. His background is in physics, software engineering, and machine learning. In the last couple of years, he became increasingly interested in the engineering side of ML projects: processes and tools needed to go from an idea to a production solution.

Nathan Ballou
Nathan Ballou is a Senior Data Scientist at Saturn Cloud, a cloud workspace for the whole data science team. Prior to working at Saturn Cloud, Nathan worked as a data science consultant and as an operations research analyst. When Nathan’s not evangelizing machine learning at Saturn Cloud he can be found rowing on the Patapsco River in Baltimore.
What to Do When Your Data Gets Big (Demo Talk)

Ryan Urabe
Ryan Urabe is CTO and co-founder of dataPlor. He co-founded BrandFolder ($155M exit) and has 10+ years of engineering experience in the startup space, in addition to 5 years as a data analyst in financial litigation consulting.

Alexander Dean
Alexander Dean is Co-founder and Chief Executive Officer at Snowplow Analytics. Alexander is a keen technologist with a passion for functional programming, cloud-based architectures and big data technologies. He also has a passion for innovation and organizational change. Before co-founding Snowplow, Alexander worked in technology roles at OpenX and in the Business Intelligence department at Deloitte Consulting, as well as strategy roles at Fathom Partners and Keplar LLP. Alexander holds a BA in History from the University of Cambridge.
Snowplow: Creating AI-ready Data for Better Models and Predictions (Demo Talk)

Jason Hepp
Jason is a Solutions Architecture Director with a background in building data engineering platforms to facilitate both streaming and batch analytics at scale. He is an experienced architect across verticals such as government, retail, manufacturing, and finance, which gives him a unique perspective on customer problems.
Aiven is Your One Stop Shop for Open-source Database Solutions to Power ML/AI in the Cloud (Demo Talk)

Dr. Alison Cozad
Dr. Alison Cozad holds a Ph.D. in Chemical Engineering from Carnegie Mellon University where she leveraged mixed-integer and semi-infinite optimization methods to improve machine learning algorithms. Prior to joining Gurobi, she held multiple roles at ExxonMobil, including as a Senior Data Science Lead and Real-time Optimization Engineer.
In her free time, Alison loves making things, from CNC woodworking to electronics to cheese making to sock puppetry.
From Data to Decisions: Make your Machine Learning Models Mean more with Mathematical Optimization (Demo Talk)

Anais Dotis-Georgiou
Anais Dotis-Georgiou is a Developer Advocate for InfluxData with a passion for making data beautiful with the use of Data Analytics, AI, and Machine Learning. She takes the data that she collects, does a mix of research, exploration, and engineering to translate the data into something of function, value, and beauty. When she is not behind a screen, you can find her outside drawing, stretching, boarding, or chasing after a soccer ball.
InfluxDB: The Database for Your Time Series Data Science Problems (Demo Talk)

Phil Taylor
Phil Taylor is the Product Manager for Kensho NERD (Named Entity Recognition and Disambiguation). Prior to joining Kensho, Phil worked at IBM as a product manager for their data and AI SaaS platform and as a strategy and operations consultant. He earned his MBA from MIT Sloan in 2019 and previously worked as a consultant at firms such as Charles River Associates.
Unlocking the Power of AI and Machine Learning with Kensho NERD (Demo Talk)

Doris Zhong
Doris Zhong is a Product Manager in the Azure AI Platform organization at Microsoft, where she focuses on machine learning in the hybrid cloud. She loves talking with customers to gain deep insights and help solve their real problems. Early in her career, she worked on building Microsoft’s internal GPU training platform, which managed tens of thousands of GPUs and served thousands of users.
Run Azure Machine Learning Anywhere in Multi-cloud or on Premises (Demo Talk)

Shan He
Shan He is Senior Director of Engineering at Foursquare and Co-Founder of Unfolded – acquired by Foursquare. She is an engineer, a designer, and a data artist who has built her career in geospatial analytics and visualization. Before founding Unfolded, Shan was the first member of Uber’s data visualization team. At Uber, she created and open-sourced kepler.gl, an advanced geospatial visualization tool and the 2018 Kantar Information is Beautiful Award Gold winner.
Supercharging Geospatial Analysis In Your Data Science Workflow (Demo Talk)

Ben Amaba, Ph.D.
Dr. Ben Amaba is focused on AI, IoT, Data, and Edge Computing. Ben received his Ph.D. in Industrial Engineering from the University of Miami. Dr. Amaba is a registered and licensed Professional Engineer with International Registry; certified in Production, Operations, and Inventory Management by APICS ®; LEED® Accredited Professional (Leadership in Energy & Environmental Design); and certified in Corporate Strategy by Massachusetts Institute of Technology. Ben holds a copyright and several patents. Ben earned his BS in Electrical Engineering as well as his Master’s in Engineering/Industrial Management. Dr. Amaba holds positions as Board Member to the Oakland University Artificial Intelligence Research Center (OUAIRC), Founding member to the Institute of Advanced Systems Engineering, Founding member to the Center of Advanced Supply Chain Management, Industry Council Advisor for the Project Production Institute, Industrial Engineering Fellow, Board Member to the Council on Industrial and Systems Engineering (CISE), Executive Board Member of Applied Human Factors and Ergonomics (AHFE) and Editorial Board Member to IEEE (Institute of Electrical and Electronics Engineers) IT Professionals, and Editorial Board of The Open Cybernetics and Systemics Journal.
Demystifying AI — Everything You Need to Know for Successful Deployment (Demo Talk)

Jake Bengtson
Jake is currently working as a Senior Product Marketing Manager over ML Lifecycle products at Cloudera. Before joining Cloudera, Jake worked as a Data Scientist and Solution Architect at ExxonMobil. Additionally, he worked as a Senior Data Scientist at FarmersEdge. Before starting his professional career, Jake obtained his bachelor’s and master’s degrees from Brigham Young University. When he isn’t working, Jake enjoys skiing, golfing, and spending time with his family in the mountains.
Forecasting Crypto Currency Prices with Cloudera Applied Machine Learning Prototypes (Demo Talk)

Aurick Qiao, PhD
Aurick Qiao is the Chief Executive Officer at Petuum. Aurick received his Ph.D. from Carnegie Mellon University, where he researched distributed machine learning systems. His work on elastic scheduling for deep learning training recently won the Jay Lepreau Best Paper Award at OSDI 2021. Together with his experience at top technology companies such as Microsoft, Facebook, and Dropbox, Aurick is building products to support the next generation of AI/ML operations.
Supercharging MLOps with Composability, Automation, and Scalability (Demo Talk)

Kirk DeBaets
Kirk DeBaets is a Senior Solution Engineer at Clarifai. He has an MBA and a passion for turning technologies into positive business outcomes. A former VP of Database Engineering in both the Investment Bank and Global Technology lines of business at JP Morgan Chase, he has spent the last several years working with customers to derive business value from their AI/ML investment.
Demystifying AI — Everything You Need to Know for Successful Deployment (Demo Talk)

Akram Dweikat
Akram Dweikat is a computer engineer and entrepreneur, specialized in machine learning & AI. He has been recognized by the UK Government as an Exceptional Talent in computer engineering, innovation, and entrepreneurship. Akram is currently the Engineering Manager for Deliveroo’s Network Economics (ML) team. Also, he is a global data science ambassador for Z by HP. He has been appointed as an AI Expert by the World Economic Forum, serving on their Global Future Council on Artificial Intelligence for Humanity. In his spare time, Akram helps build agricultural gardens for income and food security in his native Palestine. Earlier in his career, Akram helped establish the entrepreneurial community in Nablus and was one of eight youth selected to meet US President Barack Obama on his official visit to Palestine.
Introduction to WSL2 for Data Science with Z by HP (Demo Talk)
Data Science Innovation with Z by HP Workstations and Software Stack (Talk)

Glen Ford
Glen Ford is VP of Product at iMerit — a leading AI data solutions company — where he leads the product management and design teams. Glen holds more than two decades of product development experience across the technology sector. A Graduate of Texas A&M University—Commerce, Glen began his career as a consultant where he handled full-stack web programming and architecture for clients including Time Warner and AIM Funds. Over the years, he has held senior and director-level product management roles at several companies including Demand Media, WP Engine and Humanify. Most recently, Glen spent four years at Alegion — an ML-powered data annotation platform — where he helped the company grow from eight full-time employees to more than 100 in a challenging, emerging market.
The Hidden Layers of Tech Behind Successful Data Labeling (Demo Talk)

Edoardo Riva
Edoardo Riva has more than 20 years of experience enabling clients, partners, and colleagues in the areas of software architecture, integration, and deployment. Over the last 10 years, he has been a featured presenter at industry and technology conferences. He has co-authored and delivered workshops globally on topics at the forefront of technology, including distributed computing, high-performance analytics, in-database processing, workload management, and, most recently, cloud computing. From Microsoft Azure to Red Hat OpenShift, he shares his experience running analytic workloads on different platforms, either on-prem or in the cloud. Edoardo holds a bachelor’s in computer engineering from Politecnico di Milano.
Automating Deployment using GitOps: SAS Viya on Red Hat OpenShift (Demo Talk)

Andrew Engel, PhD
Andrew Engel is the Chief Data Scientist at Rasgo. He has been working as a data scientist and leading teams of data scientists for over ten years in a wide variety of domains from fraud prediction to marketing analytics. Andrew received his Ph.D. in Systems and Industrial Engineering with a focus on optimization and stochastic modeling. He has worked for Towson University, SAS Institute, the US Navy, Websense (now ForcePoint), Stics, HP and led DataRobot’s efforts in Entertainment, Sports and Gaming before joining Rasgo in August of 2020.
Feature Engineering on the Modern Data Stack (Demo Talk)

Ryan Wright
Ryan Wright is the creator of Quine, and has been leading software teams focused on data infrastructure and data science for two decades. He has served as principal engineer, director of engineering, principal investigator on DARPA-funded research programs, and is currently the founder and CEO of thatDot—the company supporting Quine. Ryan particularly enjoys taking the philosophical ends of computer science—usually problems related to language, meaning, and data—and making them more practical.
Noiseless Anomaly Detection with Streaming Graph A.I. (Demo Talk)
Quine: A Streaming Graph for Event-Driven Data Pipelines (Talk)

Lipika Ramaswamy
Lipika Ramaswamy is a Senior Applied Scientist at Gretel.ai where she focuses on developing advanced synthetic data generation technologies that include privacy guarantees. Prior to Gretel.ai, she worked as a data scientist at LeapYear, a differential privacy software company. Lipika attended Bryn Mawr College for her undergrad, where she began her STEM career, and holds a Master’s in Data Science from Harvard University.
Democratizing Access to Data with Synthetic Data Generation (Demo Talk)

Marcelo Litovsky
Marcelo Litovsky is an experienced Information Technology professional with 30 years of diverse experience in Enterprise Architecture, AI, Systems and Database Management, and Programming. Over his career he has worked in multiple industries: Financial Services, Entertainment, and Information Technology. Today, he serves as Director of Sales Engineering at Iguazio, bringing his expertise to help Data Scientists, Data Engineers, and Systems Engineers work together to deploy AI/ML applications faster, more efficiently, and in a reproducible way. When he is not installing software, talking to customers, or writing Python code, you can find him at the gym or preparing healthy vegan meals.
“It worked on my laptop, now what?” Using OS Tool MLRun to Automate the Path to Production (Demo Talk)

Matt Tolley
Matt Tolley is a Sales Director working with Aiven customers across North America. Matt formerly worked at a Boston-based DBaaS company focused on commercial DB virtualization that was acquired by Google in 2020. He then stepped into the world of open source and has been with Aiven for 2 years, with a core focus on consulting with customers that are looking to solve business-critical challenges through the use of open source database and streaming technology in the cloud.
Aiven is Your One Stop Shop for Open-source Database Solutions to Power ML/AI in the Cloud (Demo Talk)

Denis Coady
An experienced product manager with a demonstrated history of providing valuable products and services in the big data and AI/ML industry, Denis currently serves as a Technical Product Manager for Molecula. Denis is driven to empower organizations with easier-to-use data products and to make cutting-edge advancements accessible to more people. He has a strong engineering background that informs his work. He most recently worked as a Senior Solutions Architect at Cloudera, and has previous experience at IBM, Microsoft, and Boeing.
A New Data Format to Deliver Real-Time Data at Massive Scale (Demo Talk)

Zoe Steinkamp
Zoe Steinkamp is a Developer Advocate for InfluxData. She was a front-end software engineer for over 6 years before she moved into a developer advocate role. She has been with InfluxDB for over 3 years and she looks forward to sharing her knowledge of the platform and databases. She enjoys learning about awesome new technologies and doing at-home tech projects to help make her life as well as other people’s lives easier. Her passions besides new technology include traveling and gardening.
Methods and Tools for Time Series Data Science Problems with InfluxDB (Demo Talk)
Schedule
Keynote | Virtual | Machine Learning | All Levels
In his Keynote, Ken dives deep into 5 of the main causes of misunderstandings between data scientists and decision makers. He highlights the actionable strategies to get everyone on the same page so that data scientists and decision makers are working with, not against, each other…
As the Head of Data Science at Scouts Consulting Group, Ken spends his workdays improving the performance of athletes and teams by analyzing the data collected on them. In his free time, Ken dabbles in entrepreneurship and content creation. He is best known for his YouTube channel where he helps over 150,000 people navigate the data science landscape. He enjoys making commentary and tutorial videos that make the fields of data science and machine learning accessible to everyone. He started the #66DaysOfData challenge to help people create the habit of learning and working on projects every day. In his free time, Ken enjoys golf, fly fishing, yoga, jiu-jitsu, and cooking.
Keynote | Virtual | Machine Learning
In this session we’ll discuss the current trend of increasingly larger AI models, empowering a wider range of tasks in the language, vision, and multi-modality space, with growing levels of capability. We’ll give an overview of the research and engineering efforts supporting the trend, its product and engineering impact at Microsoft, and the implications for other companies…
Luis Vargas is a Partner Technical Advisor to the CTO of Microsoft. He is responsible for Microsoft’s AI at Scale initiative, coordinating efforts across infrastructure, systems software, models, and products. He bootstrapped the productization of Automated ML and Reinforcement Learning in the Azure AI Platform, worked on the launch of Azure Database Services, and led the high-availability area for SQL Server. Luis has a PhD in Computer Science from Cambridge University.
Keynote | In-person | Machine Learning | All Levels
In his Keynote, Ken dives deep into 5 of the main causes of misunderstandings between data scientists and decision makers. He highlights the actionable strategies to get everyone on the same page so that data scientists and decision makers are working with, not against, each other…
As the Head of Data Science at Scouts Consulting Group, Ken spends his workdays improving the performance of athletes and teams by analyzing the data collected on them. In his free time, Ken dabbles in entrepreneurship and content creation. He is best known for his YouTube channel where he helps over 150,000 people navigate the data science landscape. He enjoys making commentary and tutorial videos that make the fields of data science and machine learning accessible to everyone. He started the #66DaysOfData challenge to help people create the habit of learning and working on projects every day. In his free time, Ken enjoys golf, fly fishing, yoga, jiu-jitsu, and cooking.
Keynote | Virtual | Machine Learning | Machine Learning Safety & Security
Just like any other piece of software, machine learning models are vulnerable to attacks from malicious agents. However, data scientists and ML engineers rarely think about the security of their models. Models are vulnerable too: they’re representations of underlying training datasets, and are susceptible to attacks that can compromise the privacy and confidentiality of data. Every single step in the machine learning lifecycle is susceptible to various security threats. But there are steps you can take…
Hari Bhaskar is an engineering leader with hands-on experience in designing and developing the AI platform at OCI. He is a researcher with a PhD in big data architectures and machine learning. His expertise and interests include model life cycle management, MLOps, and ML security and bias assessment. He has published 25+ papers in leading academic journals such as IEEE and Springer, and presented at international conferences on topics related to AI and machine learning. He is passionate about model security, a nascent area of research in which threat vectors emerge as sophisticated, crafted attacks that mine models and associated information about data sets.
Jean-René Gauthier is the Product Architect behind the Oracle Cloud Infrastructure AI platform. Previously at DataScience.com, Jean-René designed the datascience.com platform model management features and roadmap. In addition, he managed a team of data experts in developing algorithms and analytics models to solve customers’ unique business problems. He is also responsible for educating clients on these algorithms and models, ensuring that they are incorporated into the business to add maximum value. Prior to his three years at DataScience.com, Jean-René was a data scientist at AuriQ Systems where he focused on online marketing analytics and data engineering, often involving high-speed processing of massive data sets. He holds a PhD in astrophysics from the University of Chicago and was a Millikan fellow at the California Institute of Technology.
Demo Talk | In-person | Machine Learning
Feature engineering is more than simply imputing missing values, handling outliers and categorical variables, and scaling numerical variables. It is an opportunity to allow a data scientist’s creativity to shine and, as Andrew Ng stated, “Applied machine learning is basically feature engineering.” In this talk, we will show how to aggregate time series data and calculate moving averages in pandas, directly on the data warehouse using SQL, and by leveraging Rasgo to calculate and publish those features on Snowflake…more details
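As a rough illustration of the pandas portion of this talk, the sketch below aggregates a small event-level time series to daily totals and then computes a trailing moving average. The column names (`ts`, `amount`), the sample values, and the 3-day window are assumptions made for this example, not details from the session.

```python
import pandas as pd

# Hypothetical event-level data: one row per transaction.
events = pd.DataFrame(
    {
        "ts": pd.to_datetime(
            [
                "2022-01-01 09:00",
                "2022-01-01 15:00",
                "2022-01-02 10:00",
                "2022-01-03 11:00",
                "2022-01-04 12:00",
            ]
        ),
        "amount": [10.0, 20.0, 30.0, 60.0, 30.0],
    }
)

# Aggregate to one row per calendar day.
daily = events.set_index("ts")["amount"].resample("D").sum()

# Trailing 3-day moving average; min_periods=1 keeps the first days defined.
moving_avg = daily.rolling(window=3, min_periods=1).mean()

features = pd.DataFrame({"daily_amount": daily, "amount_ma_3d": moving_avg})
print(features)
```

The same aggregation could be expressed in SQL with a window function; the pandas version is shown only because it is the easiest to run locally.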
Andrew Engel is the Chief Data Scientist at Rasgo. He has been working as a data scientist and leading teams of data scientists for over ten years in a wide variety of domains from fraud prediction to marketing analytics. Andrew received his Ph.D. in Systems and Industrial Engineering with a focus on optimization and stochastic modeling. He has worked for Towson University, SAS Institute, the US Navy, Websense (now ForcePoint), Stics, HP and led DataRobot’s efforts in Entertainment, Sports and Gaming before joining Rasgo in August of 2020.
Demo Talk | In-person | Machine Learning
In this session, attendees will learn how iMerit is solving the problem of scaling data pipelines with accuracy using unique technology. Join iMerit’s VP of Product, Glen Ford, as he uncovers the invisible technology building successful data labeling workflows and discovering anomalous and novel classes for customers using iMerit’s Edge Case technology…more details
Glen Ford is VP of Product at iMerit — a leading AI data solutions company — where he leads the product management and design teams. Glen holds more than two decades of product development experience across the technology sector. A Graduate of Texas A&M University—Commerce, Glen began his career as a consultant where he handled full-stack web programming and architecture for clients including Time Warner and AIM Funds. Over the years, he has held senior and director-level product management roles at several companies including Demand Media, WP Engine and Humanify. Most recently, Glen spent four years at Alegion — an ML-powered data annotation platform — where he helped the company grow from eight full-time employees to more than 100 in a challenging, emerging market.
Demo Talk | In-person | MLOps & Data Engineering
Machine learning models are never done. The world is always changing and models rely on data to learn useful information about this world. In ML systems we need to be able to embrace change without sacrificing reliability. But how do we do it? MLOps. MLOps, the process of operationalizing your machine learning technology, is fundamental to any organization leveraging AI. However, the complexities of machine learning require managing two lifecycles: the code and the data. Pachyderm is a platform that provides the foundation for unifying these two lifecycles. In this session, you will learn how to manage constantly changing data through versioning, unify data and code lifecycles, and institute data-driven automation…more details
Jimmy Whitaker is the Chief Scientist of AI at Pachyderm. He focuses on creating a great data science experience and sharing best practices for how to use Pachyderm. When he isn’t at work, he’s either playing music or trying to learn something new, because “You suddenly understand something you’ve understood all your life, but in a new way.”
Demo Talk | In-person | MLOps & Data Engineering
The capabilities of MLRun are extensive, and we will cover the basics to get you started. You will leave this session with enough information to:
Get started with MLRun on your own in 10 minutes, so you can automate and accelerate your path to production
Run locally, then move to Kubernetes
Understand how your Python code can run as a Kubernetes job with no code changes
Track your experiments
Get an introduction to advanced MLOps topics using MLRun
Marcelo Litovsky is an experienced Information Technology professional with 30 years of diverse experience in Enterprise Architecture, AI, Systems and Database Management, and Programming. He has worked in multiple industries over his career: Financial Services, Entertainment, and Information Technology. Today, he serves as Director of Sales Engineering at Iguazio, bringing his expertise to help Data Scientists, Data Engineers, and Systems Engineers work together to deploy AI/ML applications faster, more efficiently, and in a reproducible way. When he is not installing software, talking to customers, or writing Python code, you can find him at the gym or preparing healthy vegan meals.
Demo Talk | In-person | Machine Learning | All Levels
This talk will review relevant use cases leveraging Kensho NERD to uncover the companies, subsidiaries, and other organizations appearing in textual data to power smart search, supercharge research workflows, and more. By linking to broad knowledge bases with tens of millions of entities, you’ll see first-hand how you can differentiate your organization with data and machine learning…more details
Phil Taylor is the Product Manager for Kensho NERD (Named Entity Recognition and Disambiguation). Prior to joining Kensho, Phil worked at IBM as a product manager for their data and AI SaaS platform and as a strategy and operations consultant. He earned his MBA from MIT Sloan in 2019 and previously worked as a consultant at firms such as Charles River Associates.
Demo Talk | In-person | Machine Learning
In this talk, I will present Northstar, a novel system for Interactive Data Exploration that we developed at MIT and Brown University and that is now commercialized by Einblick Analytics, Inc. I will explain why Northstar required us to completely rethink the entire analytics stack, from the interface to the “guts,” and highlight a few selected techniques we developed to provide a truly novel user interface (see http://www.einblick.ai/ for a video demonstration) and interactive speeds even over the largest datasets and complex ML operations…more details
Tim Kraska is an Associate Professor of Electrical Engineering and Computer Science in MIT’s Computer Science and Artificial Intelligence Laboratory, co-director of the Data Systems and AI Lab at MIT (DSAIL@CSAIL), and co-founder of Einblick Analytics. Currently, his research focuses on building systems for machine learning, and using machine learning for systems. Before joining MIT, Tim was an Assistant Professor at Brown, spent time at Google Brain, and was a PostDoc in the AMPLab at UC Berkeley after he got his PhD from ETH Zurich. Tim is a 2017 Alfred P. Sloan Research Fellow in computer science and has received several awards, including the VLDB Early Career Research Contribution Award, the VMware Systems Research Award, the university-wide Early Career Research Achievement Award at Brown University, an NSF CAREER Award, as well as several best paper and demo awards at VLDB, SIGMOD, and ICDE.
Demo Talk | In-person | MLOps & Data Engineering | All Levels
In this talk, I’ll describe an approach that streamlines all three phases. As our demo project, I’ve selected a very common deployment pattern in CV projects: a CV model wrapped in a web API service. Automatic defect detection is an example problem I am addressing with this pattern…more details
Alex Kim is a Solutions Engineer at Iterative. His background is in physics, software engineering, and machine learning. In the last couple of years, he became increasingly interested in the engineering side of ML projects: processes and tools needed to go from an idea to a production solution.
Demo Talk | In-person | MLOps & Data Engineering | All Levels
Feature stores are the newest idea that is supposed to help us, but it turns out that’s not enough. In this session, you’ll learn how to craft production-ready features and build training datasets at the right points-in-time from event-based data. Specifically, we’ll be covering strategies for powering feature stores with a feature engine to:
– Compute directly from event-based data to try new features
– Iterate on feature definitions and time selection across historical data instantly
– Join values between different entities at precise times — without leakage
– Eliminate data discrepancies in production
Come join us to learn how to finally iterate on amazing ML models with event-based data…more details
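To make the idea of joining values "at precise times without leakage" concrete, here is a minimal sketch using plain pandas rather than a feature engine. It is a generic analogue, not Kaskada's API; the table layout, column names (`user`, `ts`, `purchases_30d`, `churned`), and values are all hypothetical.

```python
import pandas as pd

# Hypothetical label events: the moments at which a prediction would be made.
labels = pd.DataFrame(
    {
        "user": ["a", "a", "b"],
        "ts": pd.to_datetime(["2022-01-05", "2022-01-10", "2022-01-07"]),
        "churned": [0, 1, 0],
    }
).sort_values("ts")

# Hypothetical feature values, each stamped with the time it became known.
features = pd.DataFrame(
    {
        "user": ["a", "a", "b", "b"],
        "ts": pd.to_datetime(
            ["2022-01-01", "2022-01-08", "2022-01-06", "2022-01-09"]
        ),
        "purchases_30d": [1, 4, 2, 5],
    }
).sort_values("ts")

# For each label row, take the latest feature value for the same user that
# was recorded strictly before the label time, so no future data leaks in.
training = pd.merge_asof(
    labels,
    features,
    on="ts",
    by="user",
    direction="backward",
    allow_exact_matches=False,
)
print(training)
```

Both frames must be sorted on the time key before calling `merge_asof`. Setting `allow_exact_matches=False` is a conservative choice that also excludes feature values stamped at exactly the label time.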
Dr. Charna Parkey is Vice President of Product at Kaskada, where she co-created the first commercially available feature engine with time travel. She has over 15 years’ experience in enterprise data science and adaptive algorithms in the defense and startup tech sectors and has worked with dozens of Fortune 500 companies in her work as a data scientist. She earned her Ph.D. in Electrical Engineering at the University of Central Florida.
Demo Talk | Virtual | Machine Learning
In this session, Senior Director of Engineering Shan He will tangibly demonstrate how geospatial analysis can help improve user experiences, product design and business decisions…more details
Shan He is Senior Director of Engineering at Foursquare and Co-Founder of Unfolded – acquired by Foursquare. She is an engineer, a designer, and a data artist who has built her career in geospatial analytics and visualization. Before founding Unfolded, Shan was the first member of Uber’s data visualization team. At Uber, she created and open-sourced kepler.gl, an advanced geospatial visualization tool and the 2018 Kantar Information is Beautiful Award Gold winner.
Demo Talk | Virtual | Machine Learning
In this talk, we’ll learn about the advantages of time series databases and InfluxDB when tackling time series data science problems. Next, we’ll dive into the solutions that InfluxDB offers which enable you to prepare your data and send it to the data warehouse of your choice. You can also use InfluxDB for MLOps monitoring. Finally, a demo will show just how easy it is to collect and write time series data into InfluxDB so you can focus on the analysis of your data…more details
Anais Dotis-Georgiou is a Developer Advocate for InfluxData with a passion for making data beautiful with the use of Data Analytics, AI, and Machine Learning. She takes the data that she collects, does a mix of research, exploration, and engineering to translate the data into something of function, value, and beauty. When she is not behind a screen, you can find her outside drawing, stretching, boarding, or chasing after a soccer ball.
Demo Talk | In-person | Machine Learning
This talk will present a new technique we call “novelty detection” which uses the freely available “Quine” streaming graph to score incoming event data immediately. This technique is able to use categorical data directly instead of requiring the traditional one-hot encoding (or other encodings) and makes use of context to accurately score events never seen before. The end result of this process is a live stream of real-time explanations and “novelty scores” which provide a total ordering of how unusual each observation is compared to all data seen so far…more details
Ryan Wright is the creator of Quine, and has been leading software teams focused on data infrastructure and data science for two decades. He has served as principal engineer, director of engineering, principal investigator on DARPA-funded research programs, and is currently the founder and CEO of thatDot—the company supporting Quine. Ryan particularly enjoys taking the philosophical ends of computer science—usually problems related to language, meaning, and data—and making them more practical.
Demo Talk | Virtual | Machine Learning
WSL2 (Windows Subsystem for Linux 2) is a layer for running Linux binary executables natively on Windows. What is WSL2? How does it fit within your workflow? What is its value for data science? How do you set up your machine? How do you run your first code? This introductory session aims to answer these questions, introduce you to WSL2, and get you started by configuring your machine and running your first code…more details
Akram Dweikat is a computer engineer and entrepreneur, specialized in machine learning & AI. He has been recognized by the UK Government as an Exceptional Talent in computer engineering, innovation, and entrepreneurship. Akram is currently the Engineering Manager for Deliveroo’s Network Economics (ML) team. Also, he is a global data science ambassador for Z by HP. He has been appointed as an AI Expert by the World Economic Forum, serving on their Global Future Council on Artificial Intelligence for Humanity. In his spare time, Akram helps build agricultural gardens for income and food security in his native Palestine. Earlier in his career, Akram helped establish the entrepreneurial community in Nablus and was one of eight youth selected to meet US President Barack Obama on his official visit to Palestine.
Demo Talk | Virtual | Machine Learning
In this session, attendees will learn how iMerit is solving the problem of scaling data pipelines with accuracy using unique technology. Join iMerit’s VP of Product, Glen Ford, as he uncovers the invisible technology building successful data labeling workflows and discovering anomalous and novel classes for customers using iMerit’s Edge Case technology…more details
Glen Ford is VP of Product at iMerit — a leading AI data solutions company — where he leads the product management and design teams. Glen holds more than two decades of product development experience across the technology sector. A Graduate of Texas A&M University—Commerce, Glen began his career as a consultant where he handled full-stack web programming and architecture for clients including Time Warner and AIM Funds. Over the years, he has held senior and director-level product management roles at several companies including Demand Media, WP Engine and Humanify. Most recently, Glen spent four years at Alegion — an ML-powered data annotation platform — where he helped the company grow from eight full-time employees to more than 100 in a challenging, emerging market.
Demo Talk | In-person
In this talk, we will hear from Quantiphi’s Sanjay Patil, Senior Client Solution Partner, Applied AI, US North-East, on how they:
Accelerated Research and Clinical Trials
Simplified technology to enable clinicians to detect hemorrhage accurately and administer non-invasive treatment
Implemented Google’s ‘Healthcare API’ into Medical Imaging Diagnostics
Productionalized the solution while scaling to multiple trial sites and remote locations on edge devices
Unlocked the technology potential of using the abundance of Imaging data for research & diagnosis for CT scans
Sanjay Patil is the senior client solutions partner at Quantiphi and leads the Google Cloud business for US North-East. His passion and expertise in Data and AI/ML solutions drive Sanjay to build transformation roadmaps for customers and help them harness the power of data and the cloud.
Prior to Quantiphi, he worked with CVS Health as a Business Analyst, supporting the GM/Seasonal Business Unit in analyzing sales data and building seasonal budgets. Sanjay holds a Master’s in Marketing Analytics from Graduate school at Bentley University, an MBA in Marketing, and a Graduate Certificate in Business Analytics.
Sanjay is an avid reader and continuous learner who enjoys discussions on AI/ML, cloud technology, leadership, and career coaching.
Demo Talk | In-person
We will discuss different approaches you can take to adapt your data so that it fits in your existing analysis framework. Then we will review the steps you can take when the analysis is simply too big to fit in the RAM of a single machine. We will examine how you might speed up calculations by using parallel processes and/or GPUs and by using frameworks such as Python’s Dask and the R future package. This discussion will equip you with strategies to tackle larger datasets. More data does not have to mean more problems!…more details
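One simple way to picture the "too big for RAM" case is to process the data in chunks and combine partial results. The sketch below does this by hand with pandas; the in-memory CSV, its columns (`store`, `sales`), and the chunk size of 2 are stand-ins chosen for illustration. Frameworks such as Dask apply a similar partition-and-combine idea and can run the partitions in parallel.

```python
import io

import pandas as pd

# Stand-in for a file too large to load at once (hypothetical columns).
csv_data = io.StringIO("store,sales\nA,10\nB,5\nA,7\nB,3\nA,1\n")

totals = {}
# Read two rows at a time so only one small chunk is in memory at once.
for chunk in pd.read_csv(csv_data, chunksize=2):
    partial = chunk.groupby("store")["sales"].sum()
    # Fold each chunk's partial sums into the running totals.
    for store, value in partial.items():
        totals[store] = totals.get(store, 0) + int(value)

print(totals)
```

This pattern works for aggregations that can be combined across chunks, such as sums and counts; statistics like medians need a different strategy.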
Nathan Ballou is a Senior Data Scientist at Saturn Cloud, a cloud workspace for the whole data science team. Prior to working at Saturn Cloud, Nathan worked as a data science consultant and as an operations research analyst. When Nathan’s not evangelizing machine learning at Saturn Cloud he can be found rowing on the Patapsco River in Baltimore.
Demo Talk | In-person | Machine Learning Safety & Security | MLOps & Data Engineering | All Levels
This demo is a deep dive into Metaplane, the only engineer-first, self-serve data observability platform. Data teams at high-growth companies (like Vendr, Reforge, Weedmaps, Teachable, SpotOn, etc.) use Metaplane to save engineering time and increase trust in data by understanding when things break, what went wrong, and how to fix it, before the CMO messages them about a broken dashboard. By the end of the demo, you’ll know how to set up out-of-the-box and custom tests, automatically extract lineage throughout your data stack, and triage data quality alerts with your team…more details
Kevin Hu is co-founder and CEO of Metaplane, a data observability company based in Boston focused on helping every team find and fix data quality problems with as little setup as possible. Metaplane is backed by leading investors including Y Combinator and the founders of Okta, HubSpot, and Lookout, and is used across high-growth teams and large enterprises.
Kevin has over a decade of experience working in data. Most recently, he researched the intersection of machine learning and data science at MIT, where he collaborated with Fortune 500 companies while earning his PhD, SM, and SB. His research has been published in top computer science venues like ACM CHI, KDD, and SIGMOD, and featured in the New York Times, Wired, and The Economist.
Demo Talk | In-person
In this session, Cloudera will demonstrate how an AMP can be used for structural time series analysis. An AutoML approach will be employed to forecast future cryptocurrency prices. To facilitate easy application usage, a web-based RESTful endpoint will be exposed to retrieve model predictions…more details

Troy is currently a Partner Solutions Engineer at Cloudera focused primarily on integration efforts with other software vendors. Before joining Cloudera, Troy spent a number of years in the banking industry with Freddie Mac, Truist, and Capital One, with a long stint beforehand at Oracle. His undergraduate and graduate school work was in Mathematical Statistics at The American University. In his free time, Troy is an amateur jazz musician and enjoys traveling with his family.

Demo Talk | Virtual | Machine Learning Safety & Security | MLOps & Data Engineering | All Levels
This demo is a deep dive into Metaplane, the only engineer-first, self-serve data observability platform. Data teams at high-growth companies (like Vendr, Reforge, Weedmaps, Teachable, SpotOn, etc.) use Metaplane to save engineering time and increase trust in data by understanding when things break, what went wrong, and how to fix it, before the CMO messages them about a broken dashboard. By the end of the demo, you’ll know how to set up out-of-the-box and custom tests, automatically extract lineage throughout your data stack, and triage data quality alerts with your team…more details
Kevin Hu is co-founder and CEO of Metaplane, a data observability company based in Boston focused on helping every team find and fix data quality problems with as little setup as possible. Metaplane is backed by leading investors including Y Combinator and the founders of Okta, HubSpot, and Lookout, and is used across high-growth teams and large enterprises.
Kevin has over a decade of experience working in data. Most recently, he researched the intersection of machine learning and data science at MIT, where he collaborated with Fortune 500 companies while earning his PhD, SM, and SB. His research has been published in top computer science venues like ACM CHI, KDD, and SIGMOD, and featured in the New York Times, Wired, and The Economist.
Demo Talk | Virtual | Responsible AI
In this talk we’ll discuss our approach to solving the problems of provenance tracking and reproducibility by engineering a machine learning library from the ground up to incorporate first-class notions of provenance and reproducibility, automatically capturing provenance for all ML computations…more details
Adam Pocock is a Machine Learning researcher at Oracle Labs. He’s the lead developer of the Tribuo machine learning library, and maintains several other machine learning libraries on the JVM including TensorFlow-Java and ONNX Runtime’s Java API. Adam’s research has covered several areas of ML & applications, from work on scaling up and parallelizing Bayesian inference, to building multilingual NLP systems. He holds a PhD in Computer Science from the University of Manchester where his research focused on the theoretical underpinnings of feature selection algorithms.
Demo Talk | In-person
The analytical journey from data to decisions may involve adding a new skill to your analytical toolbox. Join Dr. Alison Cozad for a discussion on:
When and why to ask your business user, “What are you going to do with these results?”
How mathematical optimization can complement your machine learning models
A combined data science and optimization Python example and demo with Gurobi
Dr. Alison Cozad holds a Ph.D. in Chemical Engineering from Carnegie Mellon University where she leveraged mixed-integer and semi-infinite optimization methods to improve machine learning algorithms. Prior to joining Gurobi, she held multiple roles at ExxonMobil, including as a Senior Data Science Lead and Real-time Optimization Engineer.
In her free time, Alison loves making things from CNC woodworking to electronics to cheese making to sock puppetry.
Demo Talk | In-person | Big Data Analytics
This presentation explores:
– Why AI?
– Best practices from process to unstructured data
– Modeling technology across industry, academia, and government…more details
Kirk DeBaets is a Senior Solution Engineer at Clarifai. He has an MBA and a passion for turning technologies into positive business outcomes. A former VP of Database Engineering in both the Investment Bank and Global Technology lines of business at JP Morgan Chase, he has spent the last several years working with customers to derive business value from their AI/ML investment.
Dr. Ben Amaba is focused on AI, IoT, Data, and Edge Computing. Ben received his Ph.D. in Industrial Engineering from the University of Miami. Dr. Amaba is a registered and licensed Professional Engineer with International Registry; certified in Production, Operations, and Inventory Management by APICS ®; LEED® Accredited Professional (Leadership in Energy & Environmental Design); and certified in Corporate Strategy by Massachusetts Institute of Technology. Ben holds a copyright and several patents. Ben earned his BS in Electrical Engineering as well as his Master’s in Engineering/Industrial Management. Dr. Amaba holds positions as Board Member to the Oakland University Artificial Intelligence Research Center (OUAIRC), Founding member to the Institute of Advanced Systems Engineering, Founding member to the Center of Advanced Supply Chain Management, Industry Council Advisor for the Project Production Institute, Industrial Engineering Fellow, Board Member to the Council on Industrial and Systems Engineering (CISE), Executive Board Member of Applied Human Factors and Ergonomics (AHFE) and Editorial Board Member to IEEE (Institute of Electrical and Electronics Engineers) IT Professionals, and Editorial Board of The Open Cybernetics and Systemics Journal.
Demo Talk | Virtual | Big Data Analytics
In this talk, we’ll walk through the requirements of delivering real-time analytics at scale, the workarounds practitioners are forced to make in order to achieve them, and a new feature-oriented data format that enables you to deliver real-time data at scale without making costly concessions…more details
An experienced product manager with a demonstrated history of providing valuable products and services in the big data and AI/ML industry, Denis currently serves as a Technical Product Manager for Molecula. Denis is driven to empower organizations with easier-to-use data products and to make cutting-edge advancements accessible to more people. He has a strong engineering background that informs his work. He most recently worked as a Senior Solutions Architect at Cloudera, and has previous experience at IBM, Microsoft, and Boeing.
Demo Talk | In-person | Machine Learning
Aiven’s database services, including Apache Kafka®, Apache Flink®, Apache Cassandra®, PostgreSQL®, OpenSearch®, and Redis™, are all at your disposal to meet any ML or AI algorithm data needs. Rather than struggling with database setup and management, let Aiven’s fully managed, open source cloud data platform work for you and your data. Our set-it-and-forget-it solutions take the pain out of cloud data infrastructure. Come see how we can help turbocharge your ML/AI data needs with Aiven…more details
Jason is a Solutions Architecture Director with a background in building data engineering platforms to facilitate both streaming and batch analytics at scale. He is an experienced architect across verticals such as government, retail, manufacturing, and finance, which gives him a unique perspective on customer problems.
Matt Tolley is a Sales Director working with Aiven customers across North America. Matt formerly worked at a Boston-based DBaaS company focused on commercial DB virtualization that was acquired by Google in 2020. He then stepped into the world of open source and has been with Aiven for 2 years, with a core focus on consulting with customers that are looking to solve business-critical challenges through the use of open source database and streaming technology in the cloud.
Demo Talk | Virtual | MLOps & Data Engineering
Machine learning models are never done. The world is always changing and models rely on data to learn useful information about this world. In ML systems we need to be able to embrace change without sacrificing reliability. But how do we do it? MLOps. MLOps, the process of operationalizing your machine learning technology, is fundamental to any organization leveraging AI. However, the complexities of machine learning require managing two lifecycles: the code and the data. Pachyderm is a platform that provides the foundation for unifying these two lifecycles. In this session, you will learn how to manage constantly changing data through versioning, unify data and code lifecycles, and institute data-driven automation…more details
Jimmy Whitaker is the Chief Scientist of AI at Pachyderm. He focuses on creating a great data science experience and sharing best practices for how to use Pachyderm. When he isn’t at work, he’s either playing music or trying to learn something new, because “You suddenly understand something you’ve understood all your life, but in a new way.”
Demo Talk | In-person | Responsible AI | Machine Learning Safety & Security
Join us, as we go through the many ways you can interact with and utilize Gretel Synthetics, an open-source synthetic data generator that features differentially private learning. Whether you’re a developer, data scientist, or just a data enthusiast, this hands-on workshop will show you how using either Gretel’s APIs, CLIs, SaaS Console or SDK can offer any user an easy experience generating synthetic data…more details
Lipika Ramaswamy is a Senior Applied Scientist at Gretel.ai where she focuses on developing advanced synthetic data generation technologies that include privacy guarantees. Prior to Gretel.ai, she worked as a data scientist at LeapYear, a differential privacy software company. Lipika attended Bryn Mawr College for her undergrad, where she began her STEM career, and holds a Master’s in Data Science from Harvard University.
Demo Talk | In-person | MLOps & Data Engineering
In this talk, we’ll tackle the challenge of optimizing the AI infrastructure stack using Kubernetes, NVIDIA GPUs, and Run:AI. Walking through an example of a well-architected AI Infrastructure stack, we’ll discuss how Kubernetes can be augmented with advanced GPU scheduling to maximize efficiency and speed up data science initiatives…more details
Rob Magno is a Sales Engineer/Solution Architect at Run:AI based in New Jersey. He has been working in the Docker and Kubernetes space for the past five years. He enjoys tackling the diverse customer challenges that come with orchestrating AI/ML workloads through Kubernetes.
Demo Talk | Virtual
In this talk, we will hear from Quantiphi’s Sanjay Patil, Senior Client Solution Partner, Applied AI, US North-East, on how they:
Accelerated Research and Clinical Trials
Simplified technology to enable clinicians to detect hemorrhage accurately and administer non-invasive treatment
Implemented Google’s ‘Healthcare API’ into Medical Imaging Diagnostics
Productionalized the solution while scaling to multiple trial sites and remote locations on edge devices
Unlocked the technology potential of using the abundance of Imaging data for research & diagnosis for CT scans
Sanjay Patil is the senior client solutions partner at Quantiphi and leads the Google Cloud business for US North-East. His passion and expertise in Data and AI/ML solutions drive Sanjay to build transformation roadmaps for customers and help them harness the power of data and the cloud.
Prior to Quantiphi, he worked with CVS Health as a Business Analyst, supporting the GM/Seasonal Business Unit in analyzing sales data and building seasonal budgets. Sanjay holds a Master’s in Marketing Analytics from Graduate school at Bentley University, an MBA in Marketing, and a Graduate Certificate in Business Analytics.
Sanjay is an avid reader and continuous learner who enjoys discussions on AI/ML, cloud technology, leadership, and career coaching.
Demo Talk | Virtual
We will discuss different approaches you can take to adapt your data so that it fits in your existing analysis framework. Then we will review the steps you can take when the analysis is simply too big to fit in the RAM of a single machine. We will examine how you might speed up calculations by using parallel processes and/or GPUs and by using frameworks such as Python’s Dask and the R future package. This discussion will equip you with strategies to tackle larger datasets. More data does not have to mean more problems!…more details
Nathan Ballou is a Senior Data Scientist at Saturn Cloud, a cloud workspace for the whole data science team. Prior to working at Saturn Cloud, Nathan worked as a data science consultant and as an operations research analyst. When Nathan’s not evangelizing machine learning at Saturn Cloud he can be found rowing on the Patapsco River in Baltimore.
Demo Talk | In-person
In this talk we’ll explore indexing strategies to reduce exponential Big O problems to reasonable workloads through the polymorphism offered by the GiST index type in PostgreSQL and look at how to incorporate these strategies with other tools to build robust and performant analysis pipelines…more details
CTO and co-founder of dataPlor. Co-founded BrandFolder ($155M exit) and has 10+ years of engineering experience in the startup space, in addition to 5 years as a data analyst in financial litigation consulting.
Demo Talk | Virtual
This presentation and demo will show how your team can easily compose, manage, and monitor AI/ML infrastructure across multiple systems on a single pane of glass, seamlessly scale ML pipelines from local development to batch execution and online serving, and optimize end-to-end ML pipelines in an automatic and cost-efficient way. We will discuss new innovations in Composable, Automatic, and Scalable ML (CASL), developed in collaboration with CMU, UC Berkeley, and Stanford, and how they play a pivotal role in the Petuum Platform…more details
Aurick Qiao is the Chief Executive Officer at Petuum. Aurick received his Ph.D. from Carnegie Mellon University, where he researched distributed machine learning systems. His work on elastic scheduling for deep learning training recently won the Jay Lepreau Best Paper Award at OSDI 2021. Together with his experience at top technology companies such as Microsoft, Facebook, and Dropbox, Aurick is building products to support the next generation of AI/ML operations.
Tong Wen is an Architect and Director of Engineering at Petuum. Tong joined Petuum from Microsoft where he was a member of the founding team of Azure Machine Learning. Tong has 10+ years of experience in building innovative and high-impact AI/ML and HPC platforms with a proven track record. Before his first startup experience in 2008, Tong was a researcher in computational science and engineering at IBM Research and Lawrence Berkeley National Lab. He holds a Ph.D. degree in applied mathematics from MIT.
Keynote | Virtual | Machine Learning | All Levels
In his Keynote, Ken dives deep into 5 of the main causes of misunderstandings between data scientists and decision makers. He highlights the actionable strategies to get everyone on the same page so that data scientists and decision makers are working with, not against, each other…more details
As the Head of Data Science at Scouts Consulting Group, Ken spends his workdays improving the performance of athletes and teams by analyzing the data collected on them. In his free time, Ken dabbles in entrepreneurship and content creation. He is best known for his YouTube channel where he helps over 150,000 people navigate the data science landscape. He enjoys making commentary and tutorial videos that make the fields of data science and machine learning accessible to everyone. He started the #66DaysOfData challenge to help people create the habit of learning and working on projects every day. In his free time, Ken enjoys golf, fly fishing, yoga, jiu-jitsu, and cooking.
Keynote | Virtual | Machine Learning
In this session we’ll discuss the current trend of increasingly larger AI models, empowering a wider range of tasks in the language, vision, and multi-modality space, with growing levels of capability. We’ll give an overview of the research and engineering efforts supporting the trend, its product and engineering impact at Microsoft, and the implications for other companies…more details
Luis Vargas is a Partner Technical Advisor to the CTO of Microsoft. He is responsible for Microsoft’s AI at Scale initiative, coordinating efforts across infrastructure, systems software, models, and products. He bootstrapped the productization of Automated ML and Reinforcement Learning in the Azure AI Platform, worked on the launch of Azure Database Services, and led the high-availability area for SQL Server. Luis has a PhD in Computer Science from Cambridge University.
Keynote | In-person | Machine Learning | All Levels
In his Keynote, Ken dives deep into 5 of the main causes of misunderstandings between data scientists and decision makers. He highlights the actionable strategies to get everyone on the same page so that data scientists and decision makers are working with, not against, each other…more details
As the Head of Data Science at Scouts Consulting Group, Ken spends his workdays improving the performance of athletes and teams by analyzing the data collected on them. In his free time, Ken dabbles in entrepreneurship and content creation. He is best known for his YouTube channel where he helps over 150,000 people navigate the data science landscape. He enjoys making commentary and tutorial videos that make the fields of data science and machine learning accessible to everyone. He started the #66DaysOfData challenge to help people create the habit of learning and working on projects every day. In his free time, Ken enjoys golf, fly fishing, yoga, jiu-jitsu, and cooking.
Keynote | Virtual | Machine Learning | Machine Learning Safety & Security
Just like any other piece of software, machine learning models are vulnerable to attacks from malicious agents. However, data scientists and ML engineers rarely think about the security of their models. Models are vulnerable too: they’re representations of underlying training datasets, and are susceptible to attacks that can compromise the privacy and confidentiality of data. Every single step in the machine learning lifecycle is susceptible to various security threats. But there are steps you can take…more details
Hari Bhaskar is an engineering leader with hands-on experience in designing and developing the AI platform at OCI. He is a researcher with a PhD on big data architectures and machine learning. His expertise and interests include the areas of model life cycle management, MLOps, and ML security and bias assessment. He has published 25+ papers in leading academic journals such as IEEE and Springer, and presented at international conferences on topics related to AI and machine learning. He is passionate about model security as it is one of the nascent areas of research where threat vectors emerge in terms of sophisticated and crafted attacks to mine models and associated information on data sets.
Jean-René Gauthier is the Product Architect behind the Oracle Cloud Infrastructure AI platform. Previously at DataScience.com, Jean-René designed the datascience.com platform model management features and roadmap. In addition, he managed a team of data experts in developing algorithms and analytics models to solve customers’ unique business problems. He is also responsible for educating clients on these algorithms and models, ensuring that they are incorporated into the business to add maximum value. Prior to his three years at DataScience.com, Jean-René was a data scientist at AuriQ Systems where he focused on online marketing analytics and data engineering, often involving high-speed processing of massive data sets. He holds a PhD in astrophysics from the University of Chicago and was a Millikan fellow at the California Institute of Technology.
Demo Talk | In-person | Machine Learning
Feature engineering is more than simply missing value imputation, handling outliers and categorical variables, and scaling numerical variables. It is an opportunity to allow a data scientist’s creativity to shine and, as Andrew Ng stated, “Applied machine learning is basically feature engineering.” In this talk, we will show how to aggregate time series data and calculate moving averages in pandas, directly on the data warehouse using SQL and leveraging Rasgo to calculate and publish those features on Snowflake…more details
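As a rough illustration of the two feature types named above (time-based aggregation and moving averages), here is a minimal sketch in plain Python. It is not the talk's pandas, SQL, or Rasgo code; the function names `daily_totals` and `moving_average` and the sample events are assumptions made up for this example.

```python
from collections import defaultdict, deque
from datetime import datetime


def daily_totals(events):
    """Sum (timestamp, value) events into one total per calendar day."""
    totals = defaultdict(float)
    for ts, value in events:
        totals[ts.date()] += value
    return sorted(totals.items())


def moving_average(values, window):
    """Trailing moving average; early points average over the values seen so far."""
    buf = deque(maxlen=window)  # keeps only the last `window` values
    out = []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / len(buf))
    return out


# Hypothetical sample data: four events spread over three days.
events = [
    (datetime(2022, 4, 1, 9), 10.0),
    (datetime(2022, 4, 1, 17), 20.0),
    (datetime(2022, 4, 2, 12), 60.0),
    (datetime(2022, 4, 3, 8), 30.0),
]
days = daily_totals(events)
smoothed = moving_average([total for _, total in days], window=2)
```

In pandas, the same two steps are usually written with `resample` and `rolling`. Note that `rolling` returns missing values for the first points unless `min_periods=1` is set, whereas this sketch averages over whatever has been seen so far.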
Andrew Engel is the Chief Data Scientist at Rasgo. He has been working as a data scientist and leading teams of data scientists for over ten years in a wide variety of domains from fraud prediction to marketing analytics. Andrew received his Ph.D. in Systems and Industrial Engineering with a focus on optimization and stochastic modeling. He has worked for Towson University, SAS Institute, the US Navy, Websense (now ForcePoint), Stics, HP and led DataRobot’s efforts in Entertainment, Sports and Gaming before joining Rasgo in August of 2020.
Demo Talk | In-person | Machine Learning
In this session, attendees will learn how iMerit is solving the problem of scaling data pipelines with accuracy using unique technology. Join iMerit’s VP of Product, Glen Ford, as he uncovers the invisible technology building successful data labeling workflows and discovering anomalous and novel classes for customers using iMerit’s Edge Case technology…more details
Glen Ford is VP of Product at iMerit, a leading AI data solutions company, where he leads the product management and design teams. Glen holds more than two decades of product development experience across the technology sector. A graduate of Texas A&M University-Commerce, Glen began his career as a consultant where he handled full-stack web programming and architecture for clients including Time Warner and AIM Funds. Over the years, he has held senior and director-level product management roles at several companies including Demand Media, WP Engine and Humanify. Most recently, Glen spent four years at Alegion, an ML-powered data annotation platform, where he helped the company grow from eight full-time employees to more than 100 in a challenging, emerging market.
Demo Talk | In-person | MLOps & Data Engineering
Machine learning models are never done. The world is always changing and models rely on data to learn useful information about this world. In ML systems we need to be able to embrace change without sacrificing reliability. But how do we do it? MLOps. MLOps, the process of operationalizing your machine learning technology, is fundamental to any organization leveraging AI. However, the complexities of machine learning require managing two lifecycles: the code and the data. Pachyderm is a platform that provides the foundation for unifying these two lifecycles. In this session, you will learn how to manage constantly changing data through versioning, unify data and code lifecycles, and institute data-driven automation…more details
Jimmy Whitaker is the Chief Scientist of AI at Pachyderm. He focuses on creating a great data science experience and sharing best practices for how to use Pachyderm. When he isn’t at work, he’s either playing music or trying to learn something new, because “You suddenly understand something you’ve understood all your life, but in a new way.”
Demo Talk | In-person | MLOps & Data Engineering
The capabilities of MLRun are extensive, and we will cover the basics to get you started. You will leave this session with enough information to:
Get started with MLRun on your own in 10 minutes, so you can automate and accelerate your path to production
Run locally, then move to Kubernetes
Understand how your Python code can run as a Kubernetes job with no code changes
Track your experiments
Get an introduction to advanced MLOps topics using MLRun
Marcelo Litovsky is an experienced Information Technology professional with 30 years of diverse background in Enterprise Architecture, AI, Systems and Database Management, and Programming. He has worked in multiple industries: Financial Services, Entertainment, and Information Technology in his career. Today, he serves as Director of Sales Engineering at Iguazio, bringing his expertise to help Data Scientists, Data Engineers, and Systems Engineers work together to deploy AI/ML applications faster, more efficiently and in a reproducible way. When he is not installing software, talking to customers, or writing Python code, you can find him at the gym or preparing healthy vegan meals.
Demo Talk | In-person | Machine Learning | All Levels
This talk will review relevant use cases leveraging Kensho NERD to uncover the companies, subsidiaries, and other organizations appearing in textual data to power smart search, supercharge research workflows, and more. By linking to broad knowledge bases with tens of millions of entities, you’ll see first-hand how you can differentiate your organization with data and machine learning…more details
Phil Taylor is the Product Manager for Kensho NERD (Named Entity Recognition and Disambiguation). Prior to joining Kensho, Phil worked at IBM as a product manager for their data and AI SaaS platform and as a strategy and operations consultant. He earned his MBA from MIT Sloan in 2019 and previously worked as a consultant at firms such as Charles River Associates.
Demo Talk | In-person | Machine Learning
In this talk, I will present Northstar, a novel system we developed for Interactive Data Exploration at MIT and Brown University and which is now commercialized by Einblick Analytics, Inc. I will explain why Northstar required us to completely rethink the entire analytics stack, from the interface to the “guts”, and highlight a few selected techniques we developed to provide a truly novel user interface (see http://www.einblick.ai/ for a video demonstration) and interactive speeds even over the largest datasets and complex ML operations…more details
Tim Kraska is an Associate Professor of Electrical Engineering and Computer Science in MIT’s Computer Science and Artificial Intelligence Laboratory, co-director of the Data System and AI Lab at MIT (DSAIL@CSAIL), and co-founder of Einblick Analytics. Currently, his research focuses on building systems for machine learning, and using machine learning for systems. Before joining MIT, Tim was an Assistant Professor at Brown, spent time at Google Brain, and was a PostDoc in the AMPLab at UC Berkeley after he got his PhD from ETH Zurich. Tim is a 2017 Alfred P. Sloan Research Fellow in computer science and received several awards including the VLDB Early Career Research Contribution Award, the VMware Systems Research Award, the university-wide Early Career Research Achievement Award at Brown University, an NSF CAREER Award, as well as several best paper and demo awards at VLDB, SIGMOD, and ICDE.
Demo Talk | In-person | MLOps & Data Engineering | All Levels
In this talk, I’ll describe an approach that streamlines all three phases. As our demo project, I’ve selected a very common deployment pattern in CV projects: a CV model wrapped in a web API service. Automatic defect detection is an example problem I am addressing with this pattern…more details
Alex Kim is a Solutions Engineer at Iterative. His background is in physics, software engineering, and machine learning. In the last couple of years, he became increasingly interested in the engineering side of ML projects: processes and tools needed to go from an idea to a production solution.
Demo Talk | In-person | MLOps & Data Engineering | All Levels
Feature stores are the newest idea that is supposed to help us, but it turns out that’s not enough. In this session, you’ll learn how to craft production-ready features and build training datasets at the right points-in-time from event-based data. Specifically, we’ll be covering strategies for powering feature stores with a feature engine to:
– Compute directly from event-based data to try new features
– Iterate on feature definitions and time selection across historical data instantly
– Join values between different entities at precise times — without leakage
– Eliminate data discrepancies in production
Come join us to learn how to finally iterate on amazing ML models with event-based data…more details
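To make the "precise times, without leakage" idea in the list above concrete, here is a minimal sketch of a point-in-time join in plain Python. It is an illustration under assumptions, not the feature engine described in this session; the function name `point_in_time_join` and the toy data are hypothetical.

```python
from bisect import bisect_right


def point_in_time_join(label_times, feature_events):
    """For each label time, return the latest feature value observed at or before it.

    feature_events must be a list of (time, value) pairs sorted by time.
    None is returned where no feature value existed yet, so a training row
    never sees a value from its own future (no leakage).
    """
    times = [t for t, _ in feature_events]
    joined = []
    for lt in label_times:
        idx = bisect_right(times, lt)  # count of events with time <= lt
        joined.append(feature_events[idx - 1][1] if idx else None)
    return joined


# Hypothetical data: feature values recorded at times 1, 5 and 9.
feature_events = [(1, "a"), (5, "b"), (9, "c")]
rows = point_in_time_join([0, 5, 7, 12], feature_events)
```

The label at time 7 receives the value recorded at time 5, not the later one from time 9; a plain "latest value" lookup would leak that later value into the training set.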
Dr. Charna Parkey is Vice President of Product at Kaskada, where she co-created the first commercially available feature engine with time travel. She has over 15 years’ experience in enterprise data science and adaptive algorithms in the defense and startup tech sectors and has worked with dozens of Fortune 500 companies in her work as a data scientist. She earned her Ph.D. in Electrical Engineering at the University of Central Florida.
Demo Talk | Virtual | Machine Learning
In this session, Senior Director of Engineering Shan He will tangibly demonstrate how geospatial analysis can help improve user experiences, product design and business decisions…more details
Shan He is Senior Director of Engineering at Foursquare and Co-Founder of Unfolded – acquired by Foursquare. She is an engineer, a designer, and a data artist who has built her career in geospatial analytics and visualization. Before founding Unfolded, Shan was the first member of Uber’s data visualization team. At Uber, she created and open-sourced kepler.gl, an advanced geospatial visualization tool and the 2018 Kantar Information is Beautiful Award Gold winner.
Demo Talk | Virtual | Machine Learning
In this talk, we’ll learn about the advantages of time series databases and InfluxDB when tackling time series data science problems. Next, we’ll dive into the solutions that InfluxDB offers which enable you to prepare your data and send it to the data warehouse of your choice. You can also use InfluxDB for MLOps monitoring. Finally, a demo will show just how easy it is to collect and write time series data into InfluxDB so you can focus on the analysis of your data…more details
Anais Dotis-Georgiou is a Developer Advocate for InfluxData with a passion for making data beautiful with the use of Data Analytics, AI, and Machine Learning. She takes the data that she collects, does a mix of research, exploration, and engineering to translate the data into something of function, value, and beauty. When she is not behind a screen, you can find her outside drawing, stretching, boarding, or chasing after a soccer ball.
Demo Talk | In-person | Machine Learning
This talk will present a new technique we call “novelty detection” which uses the freely available “Quine” streaming graph to score incoming event data immediately. This technique is able to use categorical data directly instead of requiring the traditional one-hot encoding (or other encodings) and makes use of context to accurately score events never seen before. The end result of this process is a live stream of real-time explanations and “novelty scores” which provide a total ordering of how unusual each observation is compared to all data seen so far…more details