ODSC West will host more than 280 speakers and instructors. Speaker profiles are added weekly. Check back for updates. You’re welcome to check out some speaker blogs here.
ODSC hosts a fantastic lineup of some of the best and brightest expert speakers and core contributors in data science
Michael I. Jordan is the Pehong Chen Distinguished Professor in the Department of Electrical Engineering and Computer Science and the Department of Statistics at the University of California, Berkeley. He received his Masters in Mathematics from Arizona State University, and earned his PhD in Cognitive Science in 1985 from the University of California, San Diego. He was a professor at MIT from 1988 to 1998. His research interests bridge the computational, statistical, cognitive, biological and social sciences. Prof. Jordan is a member of the National Academy of Sciences, a member of the National Academy of Engineering, a member of the American Academy of Arts and Sciences, and a Foreign Member of the Royal Society. He is a Fellow of the American Association for the Advancement of Science. He was a Plenary Lecturer at the International Congress of Mathematicians in 2018. He received the Ulf Grenander Prize from the American Mathematical Society in 2021, the IEEE John von Neumann Medal in 2020, the IJCAI Research Excellence Award in 2016, the David E. Rumelhart Prize in 2015, and the ACM/AAAI Allen Newell Award in 2009. He gave the Inaugural IMS Grace Wahba Lecture in 2022, the IMS Neyman Lecture in 2011, and an IMS Medallion Lecture in 2004. He is a Fellow of the AAAI, ACM, ASA, CSS, IEEE, IMS, ISBA and SIAM.
In 2016, Prof. Jordan was named the “most influential computer scientist” worldwide in an article in Science, based on rankings from the Semantic Scholar search engine.
On Learning-Aware Mechanism Design (Keynote)
Kay Firth-Butterfield is Head of Artificial Intelligence and a member of the Executive Committee at the World Economic Forum and is one of the foremost experts in the world on the governance of AI. She is a Barrister, former Judge and Professor, technologist and entrepreneur who has an abiding interest in how humanity can equitably benefit from new technologies, especially AI. Kay is an Associate Barrister (Doughty Street Chambers), Master of the Inner Temple, London and serves on the Lord Chief Justice’s Advisory Panel on AI and Law. She co-founded AI Global and was the world’s first Chief AI Ethics officer in 2014 and created the AIEthics twitter hashtag. Kay is Vice-Chair of The IEEE Global Initiative for Ethical Considerations in Artificial Intelligence and Autonomous Systems and was part of the group which met at Asilomar to create the Asilomar AI Ethical Principles. She is on the Polaris Council for the Government Accountability Office (USA), the Advisory Board for UNESCO International Research Centre on AI and AI4All. Kay has advanced degrees in Law and International Relations and regularly speaks to international audiences addressing many aspects of the beneficial and challenging technical, economic and social changes arising from the use of AI. She has been consistently recognized as a leading woman in AI since 2018 and was featured in the New York Times as one of 10 Women Changing the Landscape of Leadership.
Responsible AI a Global Imperative for Governments and Business – Now and the Future (Keynote)
Professor Pieter Abbeel is Director of the Berkeley Robot Learning Lab and Co-Director of the Berkeley Artificial Intelligence (BAIR) Lab. Abbeel’s research strives to build ever more intelligent systems, and his lab pushes the frontiers of deep reinforcement learning and deep unsupervised learning, especially as they pertain to robotics. Abbeel’s Intro to AI class has been taken by over 100K students through edX, and his Deep Unsupervised Learning materials are standard references for AI researchers. Abbeel has founded several companies, including Gradescope (AI to help instructors with grading homework, projects and exams) and Covariant (AI for robotic automation of warehouses and factories). He advises many AI and robotics start-ups, and is a frequently sought-after speaker worldwide for C-suite sessions on AI future and strategy. Abbeel has received many awards and honors, including ACM Prize, IEEE Fellow, PECASE, NSF-CAREER, ONR-YIP, AFOSR-YIP, DARPA-YFA, TR35, and 10+ best paper awards/finalists. His work is frequently featured in the press, including the New York Times, Wall Street Journal, BBC, Rolling Stone, Wired, and Tech Review.
Monica Lam has been a Professor in the Computer Science Department at Stanford University since 1988. She is the faculty director of the Open Virtual Assistant Lab (OVAL). She received a B.Sc. from the University of British Columbia in 1980 and a Ph.D. in Computer Science from Carnegie Mellon University in 1987. Monica is a Member of the National Academy of Engineering and an ACM Fellow. She is a co-author of the popular text Compilers: Principles, Techniques, and Tools (2nd Edition), also known as the Dragon book. Professor Lam’s current research is on conversational virtual assistants with an emphasis on privacy protection. Her research uses deep learning to map task-oriented natural language dialogues into formal semantics, represented by a new executable programming language called ThingTalk. Her Almond virtual assistant, trained on open knowledge graphs and IoT API standards, can be easily customized to perform new tasks. She is leading an Open Virtual Assistant Initiative to create the largest, open, crowdsourced language semantics model to promote open access in all languages. Her decentralized Almond virtual assistant that supports fine-grain sharing with privacy received Popular Science’s Best of What’s New Award in Security in 2019.
Prof. Lam is also an expert in compilers for high-performance machines. Her pioneering work of affine partitioning provides a unifying theory to the field of loop transformations for parallelism and locality. Her software pipelining algorithm is used in commercial systems for instruction level parallelism. Her research team created the first, widely adopted research compiler, SUIF. Her contributions in computer architecture include the CMU Warp Systolic Array and the Stanford DASH Distributed Memory Multiprocessor. She was on the founding team of Tensilica, now a part of Cadence.
She received an NSF Young Investigator award in 1992, the ACM Most Influential Programming Language Design and Implementation Paper Award in 2001, an ACM SIGSOFT Distinguished Paper Award in 2002, the ACM Programming Language Design and Implementation Best Paper Award in 2004, and the ACM SIGARCH/SIGPLAN/SIGOPS ASPLOS Influential Paper Award in two consecutive years, 2021 and 2022. She was the author of two of the papers in “20 Years of PLDI–a Selection (1979-1999)”, and one paper in the “25 Years of the International Symposia on Computer Architecture”. She received the University of British Columbia Computer Science 50th Anniversary Research Award in 2018.
Taming Large Language Models into Trustworthy Conversational Virtual Assistants (Keynote)
Ion Stoica is a Professor in the EECS Department at University of California at Berkeley. He does research on cloud computing and networked computer systems. Past work includes Apache Spark, Apache Mesos, Tachyon, Chord DHT, and Dynamic Packet State (DPS). He is an ACM Fellow and has received numerous awards, including the SIGOPS Hall of Fame Award (2015), the SIGCOMM Test of Time Award (2011), and the ACM doctoral dissertation award (2001). In 2013, he co-founded Databricks, a startup to commercialize technologies for Big Data processing, and in 2006 he co-founded Conviva, a startup to commercialize technologies for large scale video distribution.
Making ML Scaling Easy (Keynote)
Graham Neubig is an associate professor at the Language Technologies Institute of Carnegie Mellon University. His research focuses on multilingual natural language processing, natural language interfaces to computers, and machine learning methods for NLP, with the final goal of every person in the world being able to communicate with each other, and with computers, in their own language. He also contributes to making NLP research more accessible through open publishing of research papers, advanced NLP course materials and video lectures, and open-source software, all of which are available on his web site.
Is My NLP Model Working? The Answer is Harder Than You Think (Keynote)
Yaron Haviv is a serial entrepreneur who has been applying his deep technological experience in data, cloud, AI and networking to leading startups and enterprise companies since the late 1990s. As the co-founder and CTO of Iguazio, Yaron drives the strategy for the company’s MLOps platform and led the shift towards the production-first approach to data science and catering to real-time AI use cases. He also initiated and built Nuclio, a leading open source serverless platform with over 4,000 Github stars and MLRun, Iguazio’s open source MLOps orchestration framework. Prior to co-founding Iguazio in 2014, Yaron was the Vice President of Datacenter Solutions at Mellanox (now NVIDIA), where he led technology innovation, software development and solution integrations. He was also the CTO and Vice President of R&D at Voltaire, a high-performance computing, IO and networking company which floated on the NYSE in 2007. Yaron is an active contributor to the CNCF Working Group and was one of the foundation’s first members. He presents at major industry events and writes tech content for leading publications including TheNewStack, Hackernoon, DZone, Towards Data Science and more.
From AutoML to AutoMLOps (Track Keynote)
Dawn Song is a Professor in the Department of Electrical Engineering and Computer Science at UC Berkeley. Her research interest lies in deep learning, security, and blockchain. She has studied diverse security and privacy issues in computer systems and networks, including areas ranging from software security, networking security, distributed systems security, applied cryptography, blockchain and smart contracts, to the intersection of machine learning and security. She is the recipient of various awards including the MacArthur Fellowship, the Guggenheim Fellowship, the NSF CAREER Award, the Alfred P. Sloan Research Fellowship, the MIT Technology Review TR-35 Award, the Faculty Research Award from IBM, Google and other major tech companies, and Best Paper Awards from top conferences in Computer Security and Deep Learning. She is an IEEE Fellow. She is ranked the most cited scholar in computer security (AMiner Award). She obtained her Ph.D. degree from UC Berkeley. Prior to joining UC Berkeley as a faculty member, she was a faculty member at Carnegie Mellon University from 2002 to 2007. She is also a serial entrepreneur.
(Talk)
Jon Krohn is Chief Data Scientist at the machine learning company untapt. He authored the book Deep Learning Illustrated, which was released by Addison-Wesley in 2019 and became an instant #1 bestseller that was translated into six languages. Jon is renowned for his compelling lectures, which he offers in-person at Columbia University, New York University, and the NYC Data Science Academy, as well as online via O’Reilly, YouTube, and his A4N podcast on A.I. news. Jon holds a doctorate in neuroscience from Oxford and has been publishing on machine learning in leading academic journals since 2010.
Andreas Mueller is a Principal Research SDE at Microsoft (previously Columbia, NYU, Amazon), and author of the O’Reilly book “Introduction to machine learning with Python”, describing a practical approach to machine learning with python and scikit-learn. He is one of the core developers of the scikit-learn machine learning library, and has been co-maintaining it for several years. Andreas is also a Software Carpentry instructor.
Automatic DataFrame Profiling and Visualization for Machine Learning (Talk)
Dr. Kira Radinsky is the CEO and CTO of Diagnostic Robotics, where the most advanced technologies in the field of artificial intelligence are harnessed to make healthcare better, cheaper, and more widely available. In the past, she co-founded SalesPredict, acquired by eBay in 2016, and served as eBay director of data science and IL chief scientist. One of the up-and-coming voices in the data science community, she is pioneering the field of medical data mining. Dr. Radinsky gained international recognition for her work at Microsoft Research, where she developed predictive algorithms that recognized the early warning signs of globally impactful events, including political riots and disease epidemics. In 2013, she was named to the MIT Technology Review’s 35 Young Innovators Under 35; in 2015, she was named one of the Forbes 30 Under 30 rising stars in enterprise technology; and in 2016, she was selected as “woman of the year” by Globes. She is a frequent presenter at global tech events, including TEDx, Wired, Strata Data Science, Techcrunch and academic conferences, and she publishes in the Harvard Business Review. Radinsky serves as a board member of the Israel Securities Authority and the Maccabi Research Institute, and on the technology board of HSBC bank. Dr. Radinsky also serves as visiting professor at the Technion, Israel’s leading science and technology institute, where she focuses on the application of predictive data mining in medicine.
Jess Garcia is the Founder of the global Cybersecurity/DFIR firm One eSecurity and a Senior Instructor with the SANS Institute.
During his 25 years in the field, Jess has led a myriad of complex multinational investigations for Fortune 500 companies and global organizations. Jess is one of the most prolific and veteran SANS Instructors, having taught 10+ different highly technical Cybersecurity/DFIR courses in hundreds of conferences worldwide over the last 19 years.
Jess is also an active Cybersecurity/DFIR Researcher. With the mission of bringing Data Science/AI to the DFIR field, in 2020 Jess launched the DS4N6 initiative (www.ds4n6.io), under which he is leading the development of multiple open source tools, standards and analysis platforms for DS/AI+DFIR interoperability.
DS/AI for Incident Response & Threat Hunting with CHRYSALIS & DAISY (Talk)
Dr. Jennifer Prendki is the founder and CEO of Alectio, the first startup focused on DataPrepOps, a portmanteau term that she coined to refer to the nascent field focused on automating the optimization of a training dataset. She and her team are on a fundamental mission to help ML teams build models with less data (leading to both the reduction of ML operations costs and CO2 emissions) and have developed technology that dynamically selects and tunes a dataset that facilitates the training process of a specific ML model. Prior to Alectio, Jennifer was the VP of Machine Learning at Figure Eight; she also built an entire ML function from scratch at Atlassian, and led multiple Data Science projects on the Search team at Walmart Labs. She is recognized as one of the top industry experts on Data Preparation, Active Learning and ML lifecycle management, and is an accomplished speaker who enjoys addressing both technical and non-technical audiences.
Stefanie Molin is a software engineer and data scientist at Bloomberg in New York City, where she tackles tough problems in information security, particularly those revolving around data wrangling/visualization, building tools for gathering data, and knowledge sharing. She is also the author of “Hands-On Data Analysis with Pandas,” which is currently in its second edition. She holds a bachelor’s of science degree in operations research from Columbia University’s Fu Foundation School of Engineering and Applied Science, as well as a master’s degree in computer science, with a specialization in machine learning, from Georgia Tech. In her free time, she enjoys traveling the world, inventing new recipes, and learning new languages spoken among both people and computers.
Dr. Stonebraker has been a pioneer of database research and technology for more than forty years. He was the main architect of the INGRES relational DBMS, and the object-relational DBMS, POSTGRES. These prototypes were developed at the University of California at Berkeley where Stonebraker was a Professor of Computer Science for twenty-five years. More recently at M.I.T. he was a co-architect of the Aurora/Borealis stream processing engine, the C-Store column-oriented DBMS, the H-Store transaction processing engine, the SciDB array DBMS, and the Data Tamer data curation system. Presently he serves as Chief Technology Officer of Hopara and Tamr, Inc.
Professor Stonebraker was awarded the ACM System Software Award in 1992 for his work on INGRES. Additionally, he was awarded the first annual SIGMOD Innovation award in 1994, and was elected to the National Academy of Engineering in 1997. He was awarded the IEEE John Von Neumann award in 2005 and the 2014 Turing Award, and is presently an Adjunct Professor of Computer Science at M.I.T.
Ville has been developing infrastructure for machine learning for over two decades. He has worked as an ML researcher in academia and as a leader at a number of companies, including Netflix where he led the ML infrastructure team that created Metaflow, a popular open-source framework for data science infrastructure. He is a co-founder and CEO of Outerbounds, a company developing modern human-centric ML. He is also the author of an upcoming book, Effective Data Science Infrastructure, published by Manning.
Human-Friendly, Production-Ready Data Science with Metaflow (Talk)
Mosharaf Chowdhury is a Morris Wellman associate professor of CSE at the University of Michigan, Ann Arbor, where he leads the SymbioticLab. His work improves application performance and system efficiency of machine learning and big data workloads. He is also building software solutions to monitor and optimize the impact of machine learning systems on energy consumption and data privacy. His group developed Infiniswap, the first scalable software solution for memory disaggregation; Salus, the first software-only GPU sharing system for deep learning; FedScale, the largest federated learning benchmark and a scalable and extensible federated learning engine; and Zeus, the first GPU energy-vs-training performance tradeoff optimizer for DNN training. In the past, Mosharaf did seminal work on coflows and virtual network embedding, and he was a co-creator of Apache Spark. He has received many individual awards and fellowships, thanks to his stellar students and collaborators. His work has received seven paper awards from top venues, including NSDI, OSDI, and ATC, and over 22,000 citations. Mosharaf received his Ph.D. from UC Berkeley in 2015.
Jennifer Davis, Ph.D. is a Staff Field Data Scientist at Domino Data Labs, where she empowers clients on complex data science projects. She has completed two postdocs in computational and systems biology, trained at a supercomputing center at the University of Texas, Austin, and worked on hundreds of consulting projects with companies ranging from start-ups to the Fortune 100. Jennifer has previously presented topics at conferences for Association for Computing Machinery on LSTMs and Natural Language Generation and at conferences across the US and in Italy. Jennifer was part of a panel discussion for an IEEE conference on artificial intelligence in biology and medicine. She has practical experience teaching both corporate classes and at the college level. Jennifer enjoys working with clients and helping them achieve their goals.
Large Scale Deep Learning using the High-Performance Computing Library OpenMPI and DeepSpeed (Workshop)
Leonidas (Leo) is a Senior Data Scientist at AstraZeneca. His work focuses on machine learning in oncology, including clinical and non-clinical applications. He is also enthusiastic about NLP applications in oncology and how they can be leveraged in patient treatment. He is also a workshop facilitator at the European Leadership University (ELU), NL, and has been a data science educator at DataCamp. He holds a PhD in bioinformatics and ML from the University of Warwick, UK, an MSc in statistics from Imperial College London, UK, and a BSc in Statistics and Insurance Science from the University of Piraeus, GR.
Introduction to Python for Data Analysis (Bootcamp)
Nick is a passionate machine learning, data science, and MLOps enthusiast with experience across multiple domains including fraud detection, natural language processing, computer vision, and data mining. Nick holds a BSc in Cognitive Science with a specialization in ML and Neural Computation from the University of California, San Diego. He is an AWS Certified Solutions Architect, and has earned certifications in Python, PyTorch, Apache Airflow, PySpark and other frameworks. Currently, Nick is a pre-sales MLOps Engineer at Iguazio, where he specializes in helping enterprises create real-world impact with their data science initiatives, with expertise in deployments on AWS, GCP, and Azure as well as on-premise Kubernetes architecture. Nick speaks at global industry events and blogs about MLOps, data science and ML Engineering.
Building an AI App in Under 20 Minutes Using OS MLOps tool MLRun (Demo Talk)
Hugo Bowne-Anderson is a data scientist, writer, educator & podcaster. His interests include promoting data & AI literacy/fluency, helping to spread data skills through organizations and society and doing amateur stand-up comedy in NYC. He does many of these at DataCamp, a data science training company educating over 3 million learners worldwide through interactive courses on the use of Python, R, SQL, Git, Bash and Spreadsheets in a data science context. He has spearheaded the development of over 25 courses in DataCamp’s Python curriculum, impacting over 170,000 learners worldwide through his own courses. He hosts and produces the data science podcast DataFramed, in which he uses long-format interviews with working data scientists to delve into what actually happens in the space and what impact it can and does have. He earned a PhD in Mathematics from the University of New South Wales, Australia and has conducted biomedical research at the Max Planck Institute in Germany and Yale University, New Haven.
Full-stack Machine Learning for Data Scientists (Tutorial)
Clinton Brownley, Ph.D., is a data scientist at Meta (formerly Facebook), where he’s responsible for a variety of analytics projects designed to empower employees to do their best work. Prior to this role, he was a data scientist at WhatsApp, working to improve messaging and VoIP calling performance and reliability. Before WhatsApp, he worked on large-scale infrastructure analytics projects to inform hardware acquisition, maintenance, and data center operations decisions at Facebook.
As an avid student and teacher of modern data analysis and visualization techniques, Clinton teaches a graduate course in interactive data visualization for UC Berkeley’s MIDS program, has taught a short-term graduate course in regression analysis and a machine learning workshop for NYU’s A3SR program, leads an annual machine learning in Python workshop, and is the author of two books, “Foundations for Analytics with Python” and “Multi-objective Decision Analysis”.
Clinton is a past-president of the San Francisco Bay Area Chapter of the American Statistical Association and is a council member for the Section on Practice of the Institute for Operations Research and the Management Sciences. Clinton received degrees from Carnegie Mellon University and American University.
Machine Learning with Python: A Hands-On Introduction (Training)
David has over 20 years of experience in the fields of data, AI and enterprise cloud. He has led teams for EMC Dell, Hitachi and Cisco, working with some of the most innovative companies in the world in both classified and commercial environments. Today, David acts as the Western Regional Director at Iguazio, working with Enterprise customers to help them bring their data science initiatives to life. David is passionate about applying MLOps principles to real-world AI projects, on-premise, in multi-cloud environments, on a SCIF or all of the above. When he’s not working with customers on AI projects, he volunteers at the Salvation Army and Rotary International. He and his wife have twins – a boy and a girl, as well as a 94lb/43kg Labrador that eats everything.
Balaji is currently a Staff Research Scientist at Google Brain working on Machine Learning and its applications. Previously, he was a research scientist at DeepMind for 4.5+ years. Before that, he received a PhD in machine learning from Gatsby Unit, UCL supervised by Yee Whye Teh. His research interests are in scalable, probabilistic machine learning. More recently, he has focused on:
– Uncertainty and out-of-distribution robustness in deep learning
– Deep generative models including generative adversarial networks (GANs), normalizing flows and variational auto-encoders (VAEs)
– Applying probabilistic deep learning ideas to solve challenging real-world problems.
Practical Tutorial on Uncertainty and Out-of-distribution Robustness in Deep Learning (Tutorial)
Guy Van den Broeck is an Associate Professor and Samueli Fellow at UCLA, in the Computer Science Department, where he directs the Statistical and Relational Artificial Intelligence (StarAI) lab. His research interests are in Machine Learning, Knowledge Representation and Reasoning, and Artificial Intelligence in general. His work has been recognized with best paper awards from key artificial intelligence venues such as UAI, ILP, KR, and AAAI (honorable mention). He also serves as Associate Editor for the Journal of Artificial Intelligence Research (JAIR). Guy is the recipient of an NSF CAREER award, a Sloan Fellowship, and the IJCAI-19 Computers and Thought Award.
Artificial Intelligence Can Learn from Data. But Can It Learn to Reason? (Talk)
Steven Bird has spent much of his career pursuing scalable computational methods for capturing, enriching, and analysing data from endangered languages, drawing on fieldwork in West Africa, South America, and Melanesia. Over the past 5 years he has shifted to working from the ground up with remote Aboriginal communities in Australia, supporting language learning and development in an Aboriginal ranger program, school, and arts centre. He is a co-developer of the Natural Language Toolkit (NLTK), co-founder of the Open Language Archives Community (OLAC), founder of the ACL Anthology, and director of the Aikuma Project. He has held academic appointments at the universities of Edinburgh, Pennsylvania, UC Berkeley, and Melbourne, and is now professor at Charles Darwin University, in Darwin, Australia.
Oliver is a software developer from Hamburg, Germany, and has been a practitioner for more than 3 decades. He specializes in frontend development and machine learning. He is the author of many video courses and textbooks.
Image Recognition with OpenCV and TensorFlow (Training)
A Teaching Associate Professor in the Institute for Advanced Analytics, Dr. Aric LaBarr is passionate about helping people solve challenges using their data. There he helps design the innovative program to prepare a modern workforce to wisely communicate and handle a data-driven future at the nation’s first Master of Science in Analytics degree program. He teaches courses in predictive modeling, forecasting, simulation, financial analytics, and risk management. Previously, he was Director and Senior Scientist at Elder Research, where he mentored and led a team of data scientists and software engineers. As director of the Raleigh, NC office he worked closely with clients and partners to solve problems in the fields of banking, consumer product goods, healthcare, and government. Dr. LaBarr holds a B.S. in economics, as well as a B.S., M.S., and Ph.D. in statistics — all from NC State University.
Advanced Fraud Modeling & Anomaly Detection with Python & R (Training)
Dr. Prabhanjan (Anju) Kambadur heads the AI Engineering group at Bloomberg. Anju leads a group of 100+ researchers and engineers who build solutions for Bloomberg clients in the areas of machine learning, natural language processing (NLP) and natural language understanding, information extraction, knowledge graphs, question answering, and table understanding. Previously, Anju was a research staff member in the Business Analytics and Mathematical Sciences Department at IBM Research’s Thomas J. Watson Research Center, where he worked on problems in machine learning, such as matrix sketching, genome-wide association studies, temporal causal modeling, and high-performance computing. He received his PhD from Indiana University. Anju has published peer-reviewed articles in the fields of high-performance computing, machine learning, and natural language processing.
Cal Al-Dhubaib is a data scientist, entrepreneur, and professional speaker on Artificial Intelligence. He founded Pandata to help organizations plan, design, and scale human-centered AI solutions. Pandata has overseen 80+ transformative projects with leading global brands including Parker Hannifin, the Cleveland Museum of Art, FirstEnergy, and Penn State University.
Cal is especially passionate about orchestrating inclusive teams that are empowered to build Trusted AI solutions. He has been recognized as a Notable Immigrant Entrepreneur, Crain’s Cleveland 20 in their 20s, and two-time Cleveland Smart 50 recipient. In addition to becoming the first data science graduate from Case Western Reserve University, Cal is also known for his role in advocating for careers and educational pathways in Data Science through workforce development initiatives.
Addressing the High Failure Rate of AI & ML Projects – Practical Examples from the Field (Talk)
Meg is currently a UX Researcher for Google Cloud AI and Industry Solutions, where she focuses her research on Explainable AI and Model Understanding. She has had a varied career working for start-ups and large corporations alike, in fields as diverse as EdTech, weather forecasting, and commercial robotics. She has published and spoken on topics such as user research, information visualization, educational-technology design, human-robot interaction (HRI), and voice user interface (VUI) design. Meg is also a proud alumna of Virginia Tech, where she received her Ph.D. in Human-Computer Interaction.
Utkarsh Contractor is the VP of AI and Machine Learning at Aisera, where he leads the data science team working on machine learning and artificial intelligence applications in the fields of Natural Language Processing and Computer Vision. As a graduate student at Stanford University, his research focussed on experiments in computer vision, using Deep Neural Networks to analyze surveillance scene imagery and footage. Utkarsh has a decade of industry experience in Computer Vision, NLP and other Machine Learning domains working at companies such as Aisera, LinkedIn and AT&T Labs.
Eitan is the Chief Data Scientist at Bill.com and has many years of experience as a researcher. His recent focus is on machine learning, deep learning, applied statistics and software engineering. Previously, he was a Postdoctoral Scholar at Lawrence Berkeley National Lab. He received his PhD in Physics from Boston University and his B.S. in Astrophysics from the University of California, Santa Cruz. Eitan holds 4 patents and 11 publications to date and has spoken about data at various conferences around the world.
Real-time Field Extraction on Mobile-devices Using Machine Learning (Talk)
Malte Pietsch is CTO & Co-Founder at deepset. His current focus is on building deepset Cloud – a SaaS platform for developers to build, deploy and operate modern NLP pipelines. He holds a M.Sc. with honors from TU Munich and conducted research at Carnegie Mellon University. Before founding deepset he worked as a data scientist for multiple startups. He is an active open-source contributor and author of the NLP framework Haystack.
Building Modern Search Pipelines with Haystack, Large Language Models and Hybrid Retrieval(Talk)
Julia Lintern currently works as an instructor for the Metis Data Science Flex Program. Previously, she worked as a Data Scientist for the New York Times. Julia began her career as a structures engineer designing repairs for damaged aircraft. Julia holds an MA in applied math from Hunter College, where she focused on visualizations of various numerical methods and discovered a deep appreciation for the combination of mathematics and visualizations. During certain seasons of her career, she has also worked on creative side projects such as Lia Lintern, her own fashion label.
Introduction to Machine Learning(Bootcamp)
Chandra Khatri is the Chief Scientist and Head of AI at Got It AI, where his team is transforming the AI space by leveraging state-of-the-art technologies to deliver the world’s first fully autonomous Conversational AI system. Under his leadership, Got It AI is democratizing Conversational AI and related ecosystems through automation. Prior to Got It AI, Chandra led various applied AI and research groups at Uber, Amazon Alexa and eBay.
At Uber, he led Conversational AI, Multi-modal AI, and Recommendation Systems. At Amazon, he was a founding member of the Alexa Prize Competition and Alexa AI, where he led R&D and had the opportunity to significantly advance the field of Conversational AI, particularly Open-domain Dialog Systems, which are considered the holy grail of Conversational AI and one of the open-ended problems in AI. At eBay, he drove applied research projects in NLP, Deep Learning, and Recommendation Systems.
He graduated from Georgia Tech with a specialization in Deep Learning in 2015 and holds an undergraduate degree from BITS Pilani, India. His current areas of research include Artificial and General Intelligence, Democratization of AI, Reinforcement Learning, Language and Multi-modal Understanding, and Introducing Common Sense within Artificial Agents.
Serg Masís has been at the confluence of the internet, application development, and analytics for the last two decades. Currently, he’s a Climate and Agronomic Data Scientist at Syngenta, a leading agribusiness company with a mission to improve global food security. Before that role, he co-founded a search engine startup, incubated by Harvard Innovation Labs, that combined the power of cloud computing and machine learning with principles in decision-making science to expose users to new places and events efficiently. Whether it pertains to leisure activities, plant diseases, or customer lifetime value, Serg is passionate about providing the often-missing link between data and decision-making. He wrote the bestselling book “Interpretable Machine Learning with Python” and is currently working on a new book titled “DIY AI” for Addison-Wesley for a broader audience of curious developers, makers, and hackers.
Enhance Trust with Machine Learning Model Error Analysis(Workshop)
Martin is a Senior Clinical Programmer at BioMarin, where he builds dashboards and tools for making data-informed decisions. Previously, Martin built statistical tools and dashboards for the Diabetes Technology Society and for other volunteer and non-profit organizations, and was a contributing author for Data Journalism in R on the Northeastern University School of Journalism blog. He’s a data journalism instructor for California State University, Chico. Martin holds a graduate degree in Clinical Research and is passionate about data literacy and open source technologies.
Data Visualization with ggplot2(Workshop)
Matt Harrison has been using Python since 2000. He runs MetaSnake, a Python and Data Science consultancy and corporate training shop. In the past, he has worked across the domains of search, build management and testing, business intelligence, and storage.
He has presented and taught tutorials at conferences such as Strata, SciPy, SCALE, PyCON, and OSCON as well as local user conferences.
Machine Learning with XGBoost(Workshop)
Idiomatic Pandas(Workshop)
Scott Zoldi is chief analytics officer at FICO, responsible for advancing the company's leadership in artificial intelligence (AI) and analytics in its product and technology solutions. At FICO Scott has authored more than 120 analytic patents, with 71 granted and 49 pending. Scott is actively involved in the development of analytics applications, Responsible AI technologies and AI governance frameworks, the latter including FICO's blockchain-based model development governance methodology. Scott is a member of the Board of Advisors of FinRegLab, a Cybersecurity Advisory Board Member of the California Technology Council, and a Board Member of Tech San Diego and the San Diego Cyber Center of Excellence. He is also a member of the CNBC Technology Executive Council. Scott received his Ph.D. in theoretical and computational physics from Duke University.
Adam Breindel consults and teaches widely on Apache Spark and other technologies. Adam’s experience includes work with banks on neural-net fraud detection, streaming analytics, cluster management code, and web apps, as well as development at a variety of startup and established companies in the travel, productivity, and entertainment industries. He is excited by the way that Spark and other modern big-data tech remove so many old obstacles to system design and make it possible to explore new categories of interesting, fun, hard problems.
Joseph M. Hellerstein is the Jim Gray Professor of Computer Science at the University of California, Berkeley, whose work focuses on data-centric systems and the way they drive computing. He is an ACM Fellow, an Alfred P. Sloan Research Fellow and the recipient of three ACM-SIGMOD “Test of Time” awards for his research. Fortune Magazine has included him in their list of 50 smartest people in technology, and MIT’s Technology Review magazine included his work on their TR10 list of the 10 technologies “most likely to change our world”.
Hellerstein is a co-founder of Aqueduct, which is bringing new open source technology for Prediction Infrastructure to market. Previously he co-founded Trifacta, the pioneering company in Data Preparation, where he served as founding CEO and Chief Strategy Officer. Hellerstein has served on the technical advisory boards of a number of computing and Internet companies including Dell EMC, SurveyMonkey, Datometry and Acryl Data.
Diego Klabjan is a professor at Northwestern University, Department of Industrial Engineering and Management Sciences. He is also Founding Director, Master of Science in Analytics, and the Deep Learning Lab. His expertise is focused on data science and deep learning with a concentration in finance, insurance, and healthcare. Professor Klabjan has led projects with large companies such as The Chicago Mercantile Exchange Group, Intel, General Motors and many others, and he is also assisting numerous start-ups with their analytics needs. He is also a founder of Opex Analytics.
MLOps for Deep Learning(Talk)
Kirstin Aschbacher is a Data Scientist, with a background in PsychoNeuroImmunology Research from her days as an Associate Professor at the University of California, San Francisco (UCSF), Department of Psychology, Weill Institute for Neurosciences, and the Division of Cardiology. She has a PhD in Clinical Psychology and is also a licensed Psychologist with a certificate in HRV Biofeedback. She uses her cross-functional skill-sets to drive innovative, AI-based products that enhance user well-being and stress-resilience. In her current role as Senior Director of Health Data Science at Meru Health, she has focused on HRV Biofeedback and Precision Care algorithms.
Nidhin is a Machine Learning Engineer at Walmart, where he works on Walmart’s E-commerce Search Engine. Before Walmart, he worked for two startups.
Building a Semantic Search Engine (Training)
Manu Ram Pandit is a Staff Software Engineer on the data analytics and infrastructure team at LinkedIn, where he has influenced the design and implementation of hosted notebooks, providing a seamless experience to end users. Manu has worked on multiple platform features, such as sharing and choosing custom Docker environments, and more recently has been involved in efforts to visualize big data effectively. He works closely with customers, engineers, and product to understand and define the requirements and design of the system. He has extensive experience in building complex and scalable applications. Previously, he was with Paytm, Amadeus, and Samsung, where he built scalable applications for various domains.
Unified Data Science Platform for Accelerating Data Insights(Talk)
Experienced Data Scientist and Tech Lead at Imperva’s threat research group where I work on creating machine learning algorithms to help protect our customers against web app and DDoS attacks. Before joining Imperva, I obtained a B.Sc and M.Sc in Bioinformatics from Bar Ilan University.
Brian Lucena is Principal at Numeristical, where he advises companies of all sizes on how to apply modern machine learning techniques to solve real-world problems with data. He is the creator of three Python packages: StructureBoost, ML-Insights, and SplineCalib. In previous roles he has served as Principal Data Scientist at Clover Health, Senior VP of Analytics at PCCI, and Chief Mathematician at Guardian Analytics. He has taught at numerous institutions including UC-Berkeley, Brown, USF, and the Metis Data Science Bootcamp.
Advanced Gradient Boosting (I): Fundamentals, Interpretability, and Categorical Structure(Training)
Advanced Gradient Boosting (II): Calibration, Probabilistic Regression and Conformal Prediction(Training)
Dr. Sagar Samtani is an Assistant Professor and Grant Thornton Scholar in the Department of Operations and Decision Technologies at Indiana University. Dr. Samtani received his Ph.D. from the AI Lab at the University of Arizona. His research interests are in AI for Cybersecurity, including developing deep learning approaches for cyber threat intelligence, vulnerability assessment, open-source software, AI risk management, and Dark Web analytics. He has received funding from NSF’s SaTC, CICI, and SFS programs and has published over 40 peer-reviewed articles in leading information systems, machine learning, and cybersecurity venues. He is deeply involved with industry, serving on the Board of Directors for the DEFCON AI Village and the Executive Advisory Council for the CompTIA ISAO.
Alex Ratner is the co-founder and CEO at Snorkel AI, and an Assistant Professor of Computer Science at the University of Washington. Prior to Snorkel AI and UW, he completed his Ph.D. in CS advised by Christopher Ré at Stanford, where he started and led the Snorkel open source project, and where his research focused on applying data management and statistical learning techniques to emerging machine learning workflows such as creating and managing training data and applying this to real-world problems in medicine, knowledge base construction, and more. Previously, he earned his A.B. in Physics from Harvard University.
Operationalizing Organizational Knowledge with Data-Centric AI(Talk)
Carl Gold is currently the Data Science Director at OfferFit.ai, an AI-as-a-Service reinforcement learning engine that maximizes customer upsell and retention. Before coming to OfferFit, Carl was Chief Data Scientist of Zuora, the leading billing platform for the Subscription Economy. Based on his experiences fighting churn for SaaS companies during his time at Zuora, Carl wrote the first book dedicated to customer churn analytics and data science: “Fighting Churn With Data”. Carl has a PhD from the California Institute of Technology and first-author publications in leading Machine Learning and Neuroscience journals.
Fighting Churn With Data(Workshop)
Veena Mendiratta is an applied researcher in network reliability and analytics at Nokia Bell Labs based in Naperville, Illinois, USA. Her research interests include network dependability, software reliability engineering, programmable networks resiliency, and telecom data analytics. Her current work focuses on network reliability and analytics: architecting and modeling the reliability of next-generation programmable networks, and developing analytics-based algorithms for anomaly detection, network slicing and network control to improve network performance and reliability. She has led projects on customer experience analytics using data mining and social network analysis techniques, and the development of algorithms and visual analytics for anomaly detection in telecommunications networks. She is a member of the SIAM Visiting Lecturer Program, a Life Member of SIAM, a Senior Member of IEEE, and a member of INFORMS and ASA. She was a Fulbright Specialist Scholar for 5 years, during which time she visited universities in India, Norway and New Zealand. She holds a B.Tech in engineering from IIT-Delhi, India, and a Ph.D. in operations research from Northwestern University, USA.
Dr. Victor Zitian Chen, CFA, is a believer and action-taker on the idea of a world brain. Dr. Chen is currently the Director of Data Analytics and Insights, Experimental Design and Causal Inference at Fidelity Investments. He leads the causal analytics efforts across the personal investing business at Fidelity, including experimentation, prescriptive analytics, and causal knowledge graph-based applications. Before joining Fidelity, Dr. Chen was a tenured professor in management and data science at the University of North Carolina, Charlotte, and a visiting professor in international business at Copenhagen Business School, Denmark. He led two major National Science Foundation (NSF) grants focusing on causal knowledge graph-based explainable AI and analytics applications. He founded and led the Global OpenLabs for Performance Enhancement-Analytics and Knowledge System (GoPeaks), a startup to advance and commercialize knowledge synthesis and causal/prescriptive analytics solutions for business decisions.
Causal/Prescriptive Analytics in Business Decisions(Business Talk)
Sheamus McGovern is the founder of ODSC (The Open Data Science Conference). He is also a software architect, data engineer, and AI expert. He started his career in finance by building stock and bond trading systems and risk assessment platforms and has worked for numerous financial institutions and quant hedge funds. Over the last decade, Sheamus has consulted with dozens of companies and startups to build leading-edge data-driven applications in finance, healthcare, eCommerce, and venture capital. He holds degrees from Northeastern University, Boston University, Harvard University, and a CQF in Quantitative Finance.
Ben is a Senior Data Scientist at the Institute for Experiential AI at Northeastern University. He obtained his Masters in Public Health (MPH) from Johns Hopkins and his PhD in Policy Analysis from the Pardee RAND Graduate School. Since 2014, he has been working in data science for government, academia and the private sector. His major focus has been on Natural Language Processing (NLP) technology and applications. Throughout his career, he has pursued opportunities to contribute to the larger data science community. He has presented his work at conferences, published articles, taught courses in data science and NLP, and is co-organizer of the Boston chapter of PyData. He also contributes to volunteer projects applying data science tools for public good.
David Koll is a Senior Data Scientist at Continental Tires, Germany. He holds a PhD in Computer Science from the University of Göttingen with research visits to the University of Oregon (USA), Uppsala University (Sweden), and Fudan University (China). Most of his academic work involved analyses of social media. Since joining Continental in 2018 he has developed different analytical solutions that are now running in production, with a focus on both forecasting and Industry 4.0.
Any Way You Want It: Integrating Complex Business Requirements into ML Forecasting Systems(Workshop)
Michelle Hoogenhout is the lead data scientist at Hydrostasis, Inc. Hydrostasis is pioneering hydration monitoring by collecting optical changes in blood flow and water content from wrist-worn sensors. Michelle holds a PhD in Psychology (Neuropsychology) from the University of Cape Town and a neuropsychiatric genetics training fellowship from the Harvard T.H. Chan School of Public Health. She has over 10 years of experience in machine learning and insight generation from physiological and psychological data. Her research interests include the intersection between physical states and emotional and cognitive performance, as well as developmental disorders and empathy. Michelle also loves teaching and instructional design: she’s taught data science, psychology, and statistics. In her free time Michelle loves hiking, board games and swimming.
Jacob Schreiber is a post-doctoral researcher at the Stanford School of Medicine. As a researcher, he has developed machine learning approaches to integrate thousands of genomics data sets, to design biological sequences with desired characteristics, and has described how statistical pitfalls can be encountered and accounted for in genomics data sets. As an engineer, he has contributed to the community as a core contributor to scikit-learn and as the developer of several machine learning toolkits, including pomegranate for probabilistic modeling and apricot for submodular optimization.
Navigating the Pitfalls of Applying Machine Learning in Practice(Talk)
Daniel Lenton is the creator of Ivy, which is an open-source framework with an ambitious mission to unify all other ML frameworks. Prior to starting Ivy, Daniel was a PhD student at Imperial College London, where he published research in the areas of machine learning, robotics and computer vision.
Unifying ML With One Line of Code(Tutorial)
Stefano Ermon is an Associate Professor of Computer Science in the CS Department at Stanford University, where he is affiliated with the Artificial Intelligence Laboratory, and a fellow of the Woods Institute for the Environment. His research is centered on techniques for probabilistic modeling of data and is motivated by applications in the emerging field of computational sustainability. He has won several awards, including Best Paper Awards (ICLR, AAAI, UAI and CP), a NSF Career Award, ONR and AFOSR Young Investigator Awards, Microsoft Research Fellowship, Sloan Fellowship, and the IJCAI Computers and Thought Award. Stefano earned his Ph.D. in Computer Science at Cornell University in 2015.
Yegna Jambunath is a Researcher at Centre for Deep Learning, Northwestern University. Yegna has six years of total work experience with four years of industry focused research experience in ML and Data Science. His areas of interest are MLOps, ML in Healthcare and RL.
MLOps for Deep Learning(Talk)
Swasti Kakker is a senior software development engineer on the data analytics and infrastructure team at LinkedIn, where she worked on the design and implementation of Darwin, a hosted Jupyter notebook solution. She has worked on features like scheduling notebooks based on a cron expression, creating publishable reports from executions of a notebook, introducing language servers in notebooks and integrating notebooks with various apps at LinkedIn. She works closely with stakeholders to understand the expectations and requirements of the platform that would improve developer productivity. Her passion lies in increasing and improving developer productivity by designing and implementing scalable platforms. She has also spoken previously at international conferences, including Grace Hopper in Orlando and O’Reilly Strata in New York in 2019.
Unified Data Science Platform for Accelerating Data Insights(Talk)
Arun heads the Bloomberg Quantitative Research Solutions Team. Arun’s work initially focused on Stochastic Volatility Models for Derivatives & Exotics pricing/hedging and more generally around asset pricing using traditional quantitative finance methods. More recently, he has enjoyed working at the intersection of diverse areas such as data science, innovative quantitative finance models and using AI/Machine Learning methods to help reveal embedded signals in traditional & alternative data such as Company Financials, ESG, News/Social, Supply Chain, Geolocational & Extreme Weather and their potential impact on capital markets. Most recently, in an attempt to come full circle, he has been exploring the use of ML methods in asset pricing, e.g. Derivatives pricing and illiquid instrument pricing.
Prior to joining Bloomberg, he earned his Ph.D from Cornell University in the areas of computer science and applied mathematics and a B. Tech in Computer Science from IIT Delhi, India. Arun is also an editorial board member of The Journal of Financial Data Science.
Machine Learning Models for Quantitative Finance and Trading(Talk)
Oswald is a former PhD Candidate (ABD) in Mathematics, an education fanatic (5 degrees), and an author of 40 technical books. He has worked for Oracle, AAA, and Just Systems of Japan as well as various startups. He has lived and worked in 5 countries on three continents, and in a previous career he worked in South America, Italy, and the French Riviera, and has traveled to 70 countries on five continents. He has held roles ranging from C/C++/Java developer to CTO and is comfortable in 4 languages. He is currently an adjunct AI instructor (ML, DL, NLP, DRL) at UCSC and works on NLP-related tasks at a start-up in the Bay Area.
James Demmel is the Dr. Richard Carl Dehmel Distinguished Professor of Computer Science and Mathematics at the University of California at Berkeley, and former Chair of the EECS Dept. He also serves as Chief Strategy Officer for the start-up HPC-AI Tech, whose goal is to make large-scale machine learning much more efficient, with little programming effort required by users. Demmel’s research is in high performance computing, numerical linear algebra, and communication avoiding algorithms. He is known for his work on the widely used LAPACK and ScaLAPACK linear algebra libraries. He is a member of the National Academy of Sciences, National Academy of Engineering, and American Academy of Arts and Sciences; a Fellow of the AAAS, ACM, AMS, IEEE and SIAM; and winner of the IPDPS Charles Babbage Award, IEEE Computer Society Sidney Fernbach Award, the ACM Paris Kanellakis Award, the J. H. Wilkinson Prize in Numerical Analysis and Scientific Computing, and numerous best paper prizes.
Robert is a Principal Data Scientist at SAS where he builds end-to-end artificial intelligence applications. He also researches, consults, and teaches machine learning with an emphasis on deep learning and computer vision for SAS. Robert has authored an introductory book on computer vision and has written several professional courses on topics including neural networks, deep learning, and optimization modeling. Before joining SAS, Robert worked under the Senior Vice Provost at North Carolina State University, where he built models pertaining to student success, faculty development, and resource management. Prior to working in academia, Robert was a member of the research and development group on the Workforce Optimization team at Travelers Insurance. His models at Travelers focused on forecasting and optimizing resources. Robert graduated with a master’s degree in Business Analytics and Project Management from the University of Connecticut and a master’s degree in Applied and Resource Economics from East Carolina University.
Shoili Pal is a Data Scientist at The Home Depot where she currently works on Recommendations and Personalization. She has also worked in product data science teams, a finance team and two early stage startups. She holds a Masters in Analytics from Georgia Tech and a Masters in Operations Research from the London School of Economics. In her spare time she reads fantasy and science fiction, builds Lego sets and goes on bike rides.
Justin is a Developer Advocate at Airbyte. He has been an active content creator since 2019, documenting his journey as a self-taught developer through YouTube videos and live-streaming on Twitch. He’s excited about the power of the open-source modern data stack and how these new tools help evolve our workflows as data engineers, analysts and scientists!
Open Source ELT For Everyone – Level Up With Custom Connectors(Talk)
John is a Data Architect at Airbyte where he enjoys helping companies move data from where it’s created to where they want it to live. Before Airbyte he worked as a Global Solutions Architect at LiveRamp where he helped companies activate data to transform customer experiences. Besides being in the weeds about data, John is an avid bike rider and golfer.
Open Source Powers the Modern Data Stack (Demo Talk)
Max is a Staff Data Scientist at Wish where he focuses on online experimentation (A/B testing) and machine learning. He has been revamping the A/B testing platform at Wish on various fronts, including infrastructure, statistical testing, usability, etc. His passion is to empower data-driven decision-making through the rigorous use of data. Max earned his Ph.D. in Statistical Informatics from the University of Arizona.
Ali Vanderveld is a Senior Staff Data Scientist at Wayfair, where she serves as a technical leader for machine learning, currently leading the development of novel search and recommendation technologies. Prior to Wayfair, she led a team focused on language AI at Amazon Web Services and was the Director of Data Science at ShopRunner. She has also worked at Civis Analytics, at Groupon, and as a technical mentor for the Data Science for Social Good Fellowship. Ali has a PhD in theoretical astrophysics from Cornell University and got her start working as an academic researcher at Caltech, the NASA Jet Propulsion Laboratory, and the University of Chicago, working on the development teams for several space telescope missions, including ESA’s Euclid.
Optimizing Recommendations for Competing Business Objectives(Talk)
Danny Chiao is an engineering lead at Tecton/Feast Inc working on building a next-generation feature store. Previously, Danny was a technical lead at Google working on end to end machine learning problems within Google Workspace, helping build privacy-aware ML platforms / data pipelines and working with research and product teams to deliver large-scale ML powered enterprise functionality. Danny holds a Bachelor’s degree in Computer Science from MIT.
Building Production-Ready Recommender Systems with Feast(Talk)
Yang You is a Presidential Young Professor at the National University of Singapore. He is on an early career track at NUS for exceptional young academic talents with great potential to excel. He received his PhD in Computer Science from UC Berkeley. His advisor was Prof. James Demmel, former chair of the Computer Science Division and EECS Department. Yang You’s research interests include Parallel/Distributed Algorithms, High Performance Computing, and Machine Learning. The focus of his current research is scaling up deep neural network training on distributed systems or supercomputers. In 2017, his team broke the world record of ImageNet training speed, which was covered by technology media such as NSF, ScienceDaily, Science NewsLine, and i-programmer. In 2019, his team broke the world record of BERT training speed. The BERT training techniques have been used by many tech giants like Google, Microsoft, and NVIDIA. Yang You’s LARS and LAMB optimizers are available in the industry benchmark MLPerf. He is a winner of the IPDPS 2015 Best Paper Award (0.8%), the ICPP 2018 Best Paper Award (0.3%) and the ACM/IEEE George Michael HPC Fellowship. Yang You is a Siebel Scholar and a winner of the Lotfi A. Zadeh Prize. Yang You was nominated by UC Berkeley for the ACM Doctoral Dissertation Award (2 out of 81 Berkeley EECS PhD students graduated in 2020). He also made the Forbes 30 Under 30 Asia list (2021) and won the IEEE CS TCHPC Early Career Researchers Award for Excellence in High Performance Computing. For more information, please check his lab’s homepage at https://ai.comp.nus.edu.sg/
Ysis Tarter is a senior data engineer at Absci, where deep learning AI and synthetic biology are harnessed to translate ideas into drugs. She leads the development of data platforms and pipelines for high-throughput biological data, as well as scientific tools for data analysis. Ysis is also the co-tech lead of the Bay Area chapter of Black Girls Code and teaches data visualization and analytics. She has lectured at several institutions, including Columbia, USC, and UC Berkeley. Tarter holds an MS in Applied Biomedical Engineering from Johns Hopkins University and a BS in Computer Science from Stanford University where she specialized in biocomputation. She has published peer-reviewed articles in the fields of scalable neuroscience and synthetic biological design.
Julien is currently Chief Evangelist at Hugging Face. He previously spent 6 years at Amazon Web Services, where he was the Global Technical Evangelist for AI & Machine Learning. Prior to joining AWS, Julien served for 10 years as CTO/VP Engineering in large-scale startups.
Sadie St Lawrence is the Founder and CEO of Women in Data, a community of 30,000+ data leaders, practitioners, and citizens whose mission is to increase diversity in data careers. Women in Data has been named a Top 50 Leading Company of The Year, and has been rated as the #1 community for Women in AI and Tech. Sadie has trained over 400,000 people in data science and has developed multiple programs in machine learning and career development. Sadie has been awarded Top 30 Most Inspiring Women in AI, Top 10 Most Admired Businesswomen to Watch in 2021, and Top 21 Influencer in Data, and is the recipient of the Outstanding Service Award from UC Davis. In addition, she serves on boards and is the host of the Data Bytes podcast.
Creating An Ethical AI Environment (Business Talk)
Aaron is our Director of Solutions Engineering at Appen. He works closely with the Sales and Solutions teams to manage Fortune 500 deals through the pipeline. Aaron has lived in 7 cities around the world and is a geek at heart. He loves solving problems, breaking new technologies and identifying opportunities where technology can have a real impact on how we get things done.
Peter is VP of Engineering at Mindtech. Peter has many years of experience in semiconductors, with expertise in AI, GPU and VR/AR, and has worked at companies including Highwai, Imagination Technologies and ST. Peter has also been highly active in Khronos, including chairing the NNEF working group.
Chip Kent is the chief data scientist at Deephaven Data Labs. He holds a Ph.D. from Caltech, with decades of quantitative, mathematical, and computer science experience. Chip comes from a background in quantitative private investment, using data to make investments at Walleye Capital.
Vini Jaiswal is a Developer Advocate at Databricks. She co-leads the advocacy for the open-source project Delta Lake. She helped advance data science and AI uses for over a decade with companies of different sizes. She loves to help with social causes through data and AI skills, and actively contributes to modern Data Science and Eng.
Building Reliable Lakehouses for your ML pipelines with Delta Lake(Talk)
Leonardo De Marchi holds a Master’s in Artificial Intelligence and has worked as a Data Scientist in the sports world, with clients such as the New York Knicks. He now works at Thomson Reuters as VP of Labs, and also provides consultancy and training for small and large companies. His previous experience includes being Head of Data Science and Analytics at Bumble, the largest dating site with over 500 million users, heading the team through acquisition and an IPO.
Corey Wade, MS Mathematics, MFA Writing & Consciousness, is the director and founder of Berkeley Coding Academy, an online program with live classes where teenagers learn Python Programming, Data Analytics, and Machine Learning. Author of Hands-on Gradient Boosting with XGBoost and scikit-learn, and lead author of The Python Workshop, Corey also teaches Math, Programming, and Data Science at Berkeley Independent Study. Corey has published iPhone apps with students, designed classes to build websites, and run after-school coding programs to support girls and underserved students. A Springboard Data Science graduate and multiple grant award-winner, Corey has also worked in industry developing Data Science curricula for Pathstream and Hello World while contributing articles for Towards Data Science. When not coding or teaching, Corey reads poetry and studies the stars.
Introduction to scikit-learn: Machine Learning in Python(Training)
Pete spent more than two decades on Wall Street, growing and running automated trading groups. In 2005, he was the founding CEO of Walleye Capital, a multi-billion-dollar quant fund that derives value at the intersection of real-time data and automated applications. In 2017, Pete and some engineers spun a proprietary data engine out of Walleye, forming an independent company called Deephaven Data Labs. Deephaven is an open-first software shop, delivering a real-time query engine, APIs, UIs, and integrations to the community via open projects designed for diverse teams. Deephaven complements streaming technologies and makes dynamic data easy and accessible.
Real-time Analytics, AI&Apps with Deephaven Data Labs(Demo Talk)
Raluca Ada Popa is the Robert E. and Beverly A. Brooks associate professor of computer science at UC Berkeley working in computer security, systems, and applied cryptography. She is a co-founder and co-director of the RISELab and SkyLab at UC Berkeley, as well as a co-founder of Opaque Systems and PreVeil, two cybersecurity companies. Raluca received her PhD in computer science as well as her Masters and two BS degrees, in computer science and in mathematics, from MIT. She is the recipient of the 2021 ACM Grace Murray Hopper Award, a Sloan Foundation Fellowship award, Jay Lepreau Best Paper Award at OSDI 2021, Distinguished Paper Award at IEEE Euro S&P 2022, Jim and Donna Gray Excellence in Undergraduate Teaching Award, NSF Career Award, Technology Review 35 Innovators under 35, Microsoft Faculty Fellowship, and a George M. Sprowls Award for best MIT CS doctoral thesis.
Abubakar Abid completed his PhD at Stanford in applied machine learning. During his PhD, he founded Gradio (www.gradio.dev), an open-source Python library that has been used to build over 500,000 machine learning demos. Gradio was acquired by Hugging Face, which is where Abubakar now serves as a machine learning team lead.
A Practical Tutorial on Building Machine Learning Demos with Gradio(Workshop)
Amita Kapoor is the author of best-selling books in the field of Artificial Intelligence and Deep Learning. She mentors students at different online platforms such as Udacity and Coursera and is a research and tech advisor to organizations like DeepSight AI Labs and MarkTechPost. She started her academic career in the Department of Electronics, SRCASW, the University of Delhi, where she was an Associate Professor. She has over 20 years of experience in actively researching and teaching neural networks and artificial intelligence at the university level. A DAAD fellow, she has won many accolades with the most recent being Intel AI Spotlight award 2019, Europe. An active researcher, she has more than 50 publications in international journals and conferences. Extremely passionate about using AI for the betterment of society and humanity in general, she is ready to embark on her second innings as a digital nomad.
Deep Learning with Python and Keras (Tensorflow 2)(Training)
Andrew is a Ph.D. Astrophysicist who made the switch from academia to data science (via the Insight Data Science program) in 2014. He was the first data scientist hired at Greenhouse Software where he has worked on many internal data science projects and a few customer-facing data-powered product features. Andrew lives in New Jersey with his wife and son.
Statistics for Data Science(Bootcamp)
Aaron Roth is the Henry Salvatori Professor of Computer and Cognitive Science, in the Computer and Information Sciences department at the University of Pennsylvania, with a secondary appointment in the Wharton statistics department. He is affiliated with the Warren Center for Network and Data Science, and co-director of the Networked and Social Systems Engineering (NETS) program. He is also an Amazon Scholar at Amazon AWS. He is the recipient of a Presidential Early Career Award for Scientists and Engineers (PECASE) awarded by President Obama in 2016, an Alfred P. Sloan Research Fellowship, an NSF CAREER award, and research awards from Yahoo, Amazon, and Google. His research focuses on the algorithmic foundations of data privacy, algorithmic fairness, game theory, learning theory, and machine learning. Together with Cynthia Dwork, he is the author of the book “The Algorithmic Foundations of Differential Privacy.” Together with Michael Kearns, he is the author of “The Ethical Algorithm”.
Ajay Thampi is a machine learning engineer at Meta where he works on large recommender systems, responsible AI and fairness. He holds a PhD and his research was focused on signal processing and machine learning. He has published papers at leading conferences and journals on reinforcement learning, convex optimization, and classical machine learning techniques applied to 5G cellular networks.
Interpretable AI or How I Learned to Stop Worrying and Trust AI(Talk)
Ilana is a Director in PwC Labs (Emerging Tech & AI), where she serves as one of the leads for Artificial Intelligence. Ilana specializes in applying machine learning and simulation modeling to address client needs across sectors regarding strategic deployment of new services, operational efficiencies, geospatial analytics, explainability and bias. Ilana is a Certified Ethical Emerging Technologist, is listed as one of 100 “Brilliant Women in AI Ethics” in 2020, and was recently recognized in Forbes as one of 15 leaders advancing Ethical AI. Since 2018, she has led PwC’s efforts globally in the development of cutting-edge approaches to build and deploy Responsible AI.
Emerging Approaches to AI Governance: Tech-Led vs Policy-Led(Talk)
Frank Zickert is a quantum machine learning engineer and the author of Hands-On Quantum Machine Learning With Python. He teaches quantum machine learning in an accessible way to help those without a degree in math or physics to get started in the field.
In his research, Frank strives to use quantum machine learning to advance the field of knowledge graph-based natural language processing. He is also the Chief Technology Officer of Ihr MPE B+C where he supports medical physicists to provide radiation protection services for clinical customers. Previously he worked at Aperto-An IBM Company and Deutsche Bank.
Frank earned his Ph.D. in Information Systems Development from Goethe University Frankfurt am Main, Germany.
Getting Started With Quantum Bayesian Networks in Python and Qiskit(Tutorial)
Albert applies his skills in machine learning and big data to solve (financial) optimization problems. He developed projects of different skill levels for Taipy’s tutorial videos. He holds a Bachelor of Science from McGill University, with a major in Computer Science & Statistics and a minor in Finance.
How to build stunning Data Science Web applications in Python – Taipy Tutorial(Workshop)
Martin has over 30 years of experience in Data Science, AI, and Decision Optimization. He has worked as a Consulting Project Manager, in Technical Sales, and as a Data Scientist with organizations including ILOG, IBM, Manhattan Associates, and Emptoris. He has strong modeling skills in constraint programming, mathematical programming, and machine learning, and is skilled in C++, Java, and Python. Martin’s main objective is to help organizations identify and deploy analytics that maximize ROI. He was selected as an INFORMS Franz Edelman Award finalist. He holds an M.S. in Operations Research from the Massachusetts Institute of Technology.
Turning your Data/AI algorithms into full web apps in no time with Taipy (Demo Talk)
How to Build Stunning Data Science Web applications in Python – Taipy Tutorial(Workshop)
Bob has worked with the HPCC Systems technology platform and the ECL programming language for over a decade and has been a technical trainer for over 30 years. He is the developer and designer of the HPCC Systems Online Training Courses and is the Senior Instructor for all classroom and remote based training.
Relational Dataset Analytics for Clear Customer Insights(Workshop)
Celia Cintas is a Research Scientist at IBM Research Africa – Nairobi. She is a member of the AI Science team at the Kenya Lab. Her current research explores subset scanning for anomalous pattern detection under generative models and the improvement of ML techniques to address challenges in Global Health. Previously, she was a grantee of the National Scientific and Technical Research Council at LCI-UNS and IPCSH-CONICET. She holds a Ph.D. in Computer Science from Universidad del Sur (Argentina). More info: https://celiacintas.github.io/about/
A Tale of Adversarial Attacks & Out-of-Distribution Detection Stories in the Activation Space(Talk)
Roger is a Senior Architect leading the Machine Learning and Analytics Library team at LexisNexis Risk Solutions. Roger has been involved in the implementation and utilization of machine learning and AI techniques for many years, and he has more than 20 patents in diverse areas of software technology.
Open-source Data Curation and Governance for Large and Growing Data Lakes(Talk)
Tamoghna is an AI Solution Architect in the Client Computing Group at Intel, working on building next-generation AI solutions for edge computing. Prior to this role he worked as a data scientist at Intel in various domains, including supply chain inventory optimization, anomaly detection and failure prediction for IT infrastructure across Intel, and advanced search tools for bug sightings, to name a few. After his Masters in Computer Science from Indian Statistical Institute and a Masters in Mathematics from Calcutta University, he worked as a research assistant at Microsoft Research India for 3+ years and then moved to other product companies to start his journey in the ML and AI space. He has been teaching AI courses at Intel and has trained 250+ employees. He was also a core member of the internal AI training academy and of AI content development for a 3-level course in AI for Intel employees. He mentors many people on their AI projects. He has 4 US patents filed on various innovative AI applications and products and has also published a few papers related to his work at Intel. He published a book on hands-on transfer learning with Python in 2018 from Packt (packtpub.com) and is working on another book to be published this year from bpb publications.
Dr. Mohit Bansal is the John R. & Louise S. Parker Professor and the Director of the MURGe-Lab in the Computer Science department at University of North Carolina (UNC) Chapel Hill. He received his PhD from UC Berkeley and his BTech from IIT Kanpur. His research expertise is in natural language processing and multimodal machine learning, with a particular focus on grounded and embodied semantics, human-like language generation, and interpretable and generalizable deep learning. He is a recipient of DARPA Director’s Fellowship, NSF CAREER Award, Army Young Investigator Award, Google Focused Research Award, Microsoft Investigator Fellowship, and outstanding paper awards at ACL, CVPR, EACL, COLING, and CoNLL. His service includes ACL Executive Committee, ACM Doctoral Dissertation Award Committee, Program Co-Chair for CoNLL 2019, ACL Americas Sponsorship Co-Chair, and Associate/Action Editor for TACL, CL, IEEE/ACM TASLP, and CSL journals. Webpage: https://www.cs.unc.edu/~mbansal/
Unified and Efficient Multimodal Pretraining Across Vision and Language(Talk)
Allison Portis is a software engineer at Databricks working on Delta Lake. She recently graduated from Cornell University where she studied computer science. Allison previously worked on open source feature engineering projects as an intern at Feature Labs and is excited to now be a part of the Delta Lake community.
Building Reliable Lakehouses for your ML pipelines with Delta Lake(Talk)
Jimmy Whitaker is the Chief Scientist of AI at Pachyderm. He focuses on creating a great data science experience and sharing best practices for how to use Pachyderm. When he isn’t at work, he’s either playing music or trying to learn something new, because “You suddenly understand something you’ve understood all your life, but in a new way.”
Ben is a machine learning solutions consultant with W&B. He trains our customers to use W&B and works with them to improve their machine learning workflow. Prior to joining W&B he was training models and developing ML infrastructure for Samsung Research.
ML Tools for Humans(Demo Talk)
Kaushik Bokka is a Senior Research Engineer at Lightning AI and one of the core maintainers of the PyTorch Lightning library. He has prior experience in building production scale Machine Learning and Computer Vision systems for several products ranging from Video Analytics to Fashion AI workflows. He has also been a contributor to a few other open-source projects and aims to empower the way people and organizations build AI applications.
Florian Jacta is a specialist in Taipy, a low-code open-source Python package that enables any Python developer to easily develop a production-ready AI application. He handles pre-sales and after-sales functions for the package. He is a data scientist for Groupe Les Mousquetaires (Intermarche) and ATOS, and has developed several predictive models as part of strategic AI projects. Florian holds a master’s degree in Applied Mathematics from INSA, with a major in Data Science and Mathematical Optimization.
How to build stunning Data Science Web applications in Python – Taipy Tutorial(Workshop)
Nadia Fawaz is a Senior Staff Applied Research Scientist and the Technical Lead for Inclusive AI at Pinterest. Her research and engineering interests include machine learning for personalization, AI fairness and data privacy, and her work aims at bridging theory and practice. She was named one of the 100 Brilliant Women in AI Ethics 2021, her work on Hair Pattern Search was recognized in the AI and Data category on Fast Company’s World Changing Ideas 2022 list with an honorable mention, and her work on inclusive AI was featured in many news outlets, including The Wall Street Journal, Fast Company, Vogue Business and CBS. She was a winner of the ACM RecSyS challenge on Context-Aware Movie Recommendations CAMRa2011 and her 2012 UAI paper was featured in an MIT TechReview article as “The Ultimate Challenge For Recommendation Engines”. Earlier, she was a Staff Software Engineer in Machine Learning and the Tech Lead for the job recommendation AI team at LinkedIn, a Principal Research Scientist at Technicolor Research lab, and a postdoctoral researcher at the Massachusetts Institute of Technology, Research Laboratory of Electronics. She received her Ph.D. in 2008 and her Diplome d’ingenieur (M.Sc.) in 2005 both in EECS from Telecom ParisTech and EURECOM, France. She is a member of the IEEE and of the ACM.
Sandeep Agrawal leads the HeatWave Machine Learning (HeatWave ML) project within MySQL HeatWave. HeatWave ML is the product of years of research and advanced development, and aims to help both data scientists and non-data scientists quickly apply ML to a given problem. Prior to HeatWave, Sandeep led the Oracle AutoML project within Oracle labs, creating a state-of-the-art distributed AutoML engine. He is passionate about Machine Learning and Systems Architecture, and a project like HeatWave ML that combines the two is heaven for him. Prior to Oracle, he completed his PhD in Computer Science from Duke University in 2015.
Chase is a solutions architect at Arrikto with a passion for connecting people to technical solutions that can prevent them from wasting precious time and mental energy solving the same problems over and over. Chase is a certified Kubernetes Administrator, Developer, and Security Specialist who works to help clients reduce MLOps friction and toil while ensuring the “non-negotiables” are enforced to provide the best return on their production models.
How Far Left Can You Shift? The Tension Between Data Science and ML Engineering(Talk)
Personal to Product to Platform: Reporting Your Results with Kubeflow(Demo Talk)
Souheil is the Head of Field Data Science at Arrikto where he helps build machine learning solutions for clients. Previously, Souheil worked at Freddie Mac and Capital One where he built models and machine learning platforms. Prior to becoming a data scientist, he spent 15 years in academia working on MRI and Brain Imaging. Souheil holds a BS and PhD in Physics from Yale and MIT respectively.
How Far Left Can You Shift? The Tension Between Data Science and ML Engineering(Talk)
Personal to Product to Platform: Reporting Your Results with Kubeflow(Demo Talk)
Audrey Reznik has been in the IT industry (private and public sectors) for 27 years in multiple verticals. In the last 4 years, she worked as a Data Scientist at ExxonMobil where she created a Data Science Enablement team to help data scientists easily deploy ML models in a Hybrid Cloud environment. Audrey was instrumental in educating scientists about what the OpenShift platform was and how to use OpenShift containers (images) to organize, run, and visualize data analysis results. Audrey now works as a Data Scientist with the Red Hat OpenShift Data Science Team where she is focused on next-generation applications. She is passionate about Data Science and in particular the current opportunities with ML and Federated Data.
MLOps GitOps/Pipelines(Demo Talk)
Dancing with Data Science and Security on the Edge(Workshop)
Vishal Rathi is a Software Engineer at Walmart where he works on Walmart’s E-commerce Search Engine. He received his Masters in Computer Science with a concentration in Machine Learning from Georgia Institute of Technology.
Building a Semantic Search Engine (Training)
Erik passionately advocates for tomorrow’s solutions, with a keen focus on pragmatically getting there today. With over 20 years’ experience in operations, sales, and engineering in the language services and data annotation industries, Appen’s VP of Enterprise Solutions brings a holistic approach to building creative fit-for-use solutions from discovery through delivery. Erik’s broad background in business strategy and people-centric leadership is focused on building more compelling and ethical value propositions for clients, people, and shareholders. Erik has an MBA, an MS in Management and Leadership, and a BA in Psychology.
Lucas is the Product Manager for Ground Control, iMerit’s single source of truth platform for managing data annotation workflows through reporting, analytics, and insights. Prior to iMerit, he designed and launched mapping technology for self-driving cars and developed electronics systems for high-performance vehicles. When not working in the trenches of machine learning, either as an engineer or Product Manager, you can find Lucas experimenting with ML in a variety of side projects, like using computer vision to optimize human biomechanics.
Jeffrey is currently the Global Head of Data Science and Analytics at Amazon Music. Prior to Amazon, Jeffrey worked at WalmartLabs as the VP of Data Science, Data Engineering, and Platform Engineering. Before joining WalmartLabs, he spent nearly his entire career in quantitative finance. His last role in the investment management industry was the Chief Data Scientist and Global Head of Data Science at AllianceBernstein (AB), a global investment management firm that managed almost $800B. Before AB, he was the VP and Head of Data Science at Silicon Valley Data Science, a startup acquired by Apple in 2017. Earlier in his career, he held various quantitative leadership positions, including the Corporate VP and Head of Risk Analytics and Quantitative Research at Charles Schwab Corporation, Director of Financial Risk Consulting at KPMG, and Assistant Director at Moody’s Analytics. Jeffrey enjoys academic research and teaching. He has taught finance, economics, machine learning, and statistics at University of Pennsylvania, Virginia Tech, Cornell, NYU, and UC Berkeley. He is a frequent speaker at national and international A.I., data science, and technology conferences, such as Spark&AI Summit, Strata, ODSC, PyCon, and many others. He holds a Ph.D. and an M.A. in Economics from the University of Pennsylvania and a B.S. in Mathematics and Economics from UCLA.
Nick Singh is an Ex-Facebook & Google Engineer turned best-selling author of Ace the Data Science Interview, and founder of SQL Interview Platform DataLemur.com. His career advice on LinkedIn has earned him 100,000 followers, and he’s successfully career coached 578 people to land their dream job in data!
Ace the Data Job Hunt(Career Talk)
Ace the Data Science Interview with Nick Singh(Career Workshop)
Guglielmo is a Biomedical Engineer and lifelong learner with an extensive background in Software Engineering and Data Science applied to different contexts, most recently Biotech Manufacturing, Healthcare and DevOps. He is currently busy unlocking business value through Deep Learning projects, mostly, though not exclusively, in Computer Vision.
He has been recognized as DataOps Champion at the Streamsets DataOps Summit 2019 and awarded as one of the Top 50 Tech Visionaries at the 2019 Dubai Intercon Conference.
He is also an international speaker and author of the following book: Hands-on Deep Learning with Apache Spark @Packt https://www.packtpub.com/big-data-and-business-intelligence/hands-deep-learning-apache-spark
Not Just Deep Fakes: Applications of Visual Generative Models in Pharma Manufacturing(Tutorial)
Mr. Yurchisin has over ten years’ experience applying operations research, machine learning, statistics, and data visualization to improve decision making. Before joining Gurobi, Jerry (who also goes by Jerome) was a Senior Consultant at OnLocation, Inc. where he customized several linear programming models within the National Energy Modeling System (NEMS) to analyze implementing specific energy policies and utilizing new technologies.
Prior to OnLocation, Jerry was an Operations Research Analyst & Data Scientist at Booz Allen Hamilton for over seven years. There he formulated scheduling and staffing integer programming models for the US Coast Guard, as well as led a project to quantify the maritime risks of offshore energy installations with the Research & Development Center. Further, Jerry was the technical lead on several Coast Guard studies including Living Marine Resources and Maritime Domain Awareness, providing statistical analysis and building supervised and unsupervised machine learning models. He also performed statistical analyses, machine learning modeling, and data visualization for cyberspace directorates at DoD and DHS.
Jerry has several years of experience teaching a wide variety of college-level mathematics and statistics courses and has a passion for education. He also enjoys golfing, biking, and writing about sports from an analytics point of view. He lives in Alexandria, Virginia with his wife, son, and two dogs.
Jerry holds B.S., Ed. and M.S., Mathematics degrees from Ohio University and an M.S. in Operations Research and Statistics from The University of North Carolina at Chapel Hill.
From Data to Decisions: Make your Machine Learning Models mean more with Mathematical Optimization (Demo Talk)
Akash Tandon is co-founder and CTO of Looppanel where he builds software to help product teams record, store and analyze user research data. He is a co-author of Advanced Analytics with PySpark, published by O’Reilly. Previously, Akash worked as a senior data engineer at Atlan, SocialCops and RedCarpet where he built data infrastructure for enterprise, government and finance use-cases. He has also been a participant and mentor in the Google Summer of Code program with the R Project for Statistical Computing.
Introduction to Large-scale Analytics with PySpark(Workshop)
Andrew is a Research Engineer at Cloudera Fast Forward Labs where he spends his time researching the latest advances in the field of machine learning and building prototypes applied to real-world use cases. Prior to joining Cloudera, Andrew worked as a Data Scientist in Deloitte’s Analytics & Cognitive practice developing data products and delivering insights for Government and Public Sector organizations. Andrew holds a Bachelor’s Degree in Mechanical Engineering from Virginia Tech.
Neutralizing Subjectivity Bias with HuggingFace Transformers(Talk)
Rajsekhar Aikat is iMerit’s newly appointed Chief Technology & Product Officer. Rajsekhar joins iMerit from Qualcomm, where he was Senior Director of Product Management. He has over 18 years of technical & product experience across multiple verticals, including automotive, IoT, robotics and telecom. Before Qualcomm, he was the Director of Product at Brain Corporation, where he was responsible for scaling BrainOS™, an autonomous mobile robot platform & ecosystem, as well as overseeing the development and commercialization of commercial cleaning and delivery robots globally.
“DataOps 2.0” – How the Changing MLOps Landscape is Reinventing DataOps(Talk)
Yotam is a machine learning and deep learning expert with extensive hands-on experience in neural network development. Prior to co-founding Tensorleap, Yotam developed and led AI and Big Data projects from research to production for companies in the automotive and other sectors, as well as developing machine learning algorithms for large government projects, including the Soreq Nuclear Research Center (Israel).
Unleash your Neural Networks with Applied Explainability(Demo Talk)
Bio Coming Soon!
Sandy works at Elementl as the lead engineer for the Dagster project. Previously, he led machine learning and data science teams at KeepTruckin and Clover Health. He’s a committer on Spark and Hadoop, and co-authored O’Reilly’s Advanced Analytics with Spark.
Orchestrating Data Assets instead of Tasks, with Dagster(Talk)
Priya Donti is a Co-founder and Chair of Climate Change AI, a non-profit initiative to catalyze impactful work at the intersection of climate change and machine learning, which she is currently running through the Cornell Tech Runway Startup Postdoc Program. She will also join MIT EECS as an Assistant Professor in Fall 2023. Her research focuses on machine learning for forecasting, optimization, and control in high-renewables power grids. Specifically, her work explores methods to incorporate the physics and hard constraints associated with electric power systems into deep learning models. Priya received her Ph.D. in Computer Science and Public Policy from Carnegie Mellon University, and is a recipient of the MIT Technology Review’s 2021 “35 Innovators Under 35” award, the Siebel Scholarship, the U.S. Department of Energy Computational Science Graduate Fellowship, and best paper awards at ICML (honorable mention), ACM e-Energy (runner-up), PECI, the Duke Energy Data Analytics Symposium, and the NeurIPS workshop on AI for Social Good.
Ke works as a lead data scientist in the Data Science & Analytics Lab (DSAL) at American Family Insurance. She leads a data science and engineering team that builds AI-powered solutions, turning business initiatives into large-scale ML models, reusable ML systems and reliable ML products. She is a believer in, advocate for, and practitioner of AI adoption to advance business and society growth. Her expertise is in solving data-driven problems, designing data science strategy, and building scalable end-to-end ML applications in production. Ke earned her Master’s degree in Statistics at University of Illinois Urbana-Champaign in 2018, and her B.S. in Statistics at Renmin University of China.
Continual Learning: Build Sustainable AI Models in Production(Talk)
Nikolay is an experienced Data Science professional who currently leads the EMEA Data Science team at Domino Data Lab. He holds an MSc in Software Technologies, an MSc in Data Science, and is currently undertaking postgraduate research at King’s College London. His area of expertise is Statistics, Mathematics, and Data Science in general, and his research interests are in Neural Networks with emphasis on biological plausibility. He writes articles and blogs regularly and speaks at various European conferences (ODSC, Big Data Spain, Strata, Big Data London etc.) to build awareness about data science and artificial intelligence. He is also the organizer of the London Data Science and Machine Learning meetup and recipient of several technical mastery awards like the Oracle ACE Award and the IBM Outstanding Technical Achievement Award.
Elijah Meeks is a co-founder and Chief Innovation Officer of Noteable, a startup focused on evolving how we analyze and communicate data. He is known for his pioneering work in the digital humanities while at Stanford, where he was the technical lead for acclaimed works like ORBIS and Kindred Britain. He was Netflix’s first Senior Data Visualization Engineer, and while at Netflix and Apple worked to develop the charting library Semiotic as well as bring cutting-edge data visualization techniques to analytical applications for stakeholders across the organization including A/B testing, conversation flows, algorithms, membership, people analytics, content, image testing and social media. He is a prolific writer, speaker and leader in the field of data visualization and the co-founder and first executive director of the Data Visualization Society.
Building a Great Data Visualization Portfolio(Career Workshop)
Oryan is a Lead Software Engineer with a passion for Machine Learning and DevOps, with 7 years of experience developing services for production and development environments and leading teams.
Data-driven ML Retraining with Production Insights(Demo Talk)
As a member of the Neo4j Field Engineering team, Stuart brings 15 years of experience helping many Global 2000 organizations solve their business challenges leveraging semantic technologies, natural language processing, search and graphs. In addition, he has experience across a wide range of industries, including healthcare, finance, manufacturing and retail. Based in the Bay Area, Stuart works with large enterprise companies including Wells Fargo, eBay, Visa, Adobe, Genentech, Kaiser and Cisco.
Neo4j Demo: A Graph Data Science Framework for the Enterprise(Demo Talk)
Katie is a Data Science Solution Architect at Neo4j. She completed her degree in Cognitive Neuroscience at Harvard University. Passionate about people and problem solving, she transitioned to focusing on helping people and businesses leverage data for impactful outcomes. As a customer-facing data scientist she has had the opportunity to work with large and small organizations across a variety of industries. At Neo4j she helps teams up-level their data science practice with graph data science.
Graph Data Science: The Secret Ingredient for Relationship-Driven AI(Talk)
Justin Emerson is a Principal Technology Evangelist at Pure Storage focused on the FlashBlade product portfolio. He joined Pure in 2020 as a FlashBlade Data Architect for the San Francisco Bay Area. Prior to that, he worked at storage-focused reseller partners for more than a decade.
Turbo Boost Workflows for AI, ML, DevOps and EDA with Modern File Utilities(Demo Talk)
AI TCO (Total Cost of Ownership) Considerations from Pilot to Production Scale(Talk)
Robert Osazuwa Ness is a researcher at Microsoft Research and author of the book Causal Machine Learning. He leads the development of MSR’s causal machine learning platform and conducts research into probabilistic models for advanced causal reasoning. He has worked as a machine learning engineer in various machine learning startups. He attended graduate school at both Johns Hopkins SAIS (Hopkins-Nanjing Center) and Purdue University. He received his Ph.D. in Statistics from Purdue, where his dissertation research focused on Bayesian active learning models for causal discovery.
Causal AI(Talk)
Kyle Kirwan wants to help the world make magic with data. He is the co-founder and CEO of Bigeye, a data reliability engineering platform that helps data teams build trust in the data their organizations depend on. As one of the first data scientists, data analysts, and product managers at Uber, he helped launch teams like Experimentation Platform, and products like Databook.
Data Observability for Data Science Teams(Demo Talk)
Laura Skylaki is a Manager of Applied Research in Thomson Reuters Labs, where she leads advanced machine learning projects in the domain of Legal and Tax AI. With a career spanning more than a decade at the intersection of research and practical application, she has contributed technical expertise in diverse fields such as bioinformatics and stem cell biology, image processing and natural language processing. She holds a doctorate in stem cell bioinformatics from the University of Edinburgh, UK, and has been publishing on machine learning applications in leading academic journals since 2012.
NLP Fundamentals(Training)
Jeremy Wenxiao Gu is the Director of Data Science and manages the statistical experimentation team at Shipt Inc, San Francisco. The company is an American delivery service owned by Target Corporation. Before Shipt, Jeremy was Data Science Manager at Stitch Fix (2020-2021), Data Scientist and Manager at Uber (2017-2019), and Data Scientist at Amazon (2014-2017). Jeremy received an MS in Statistics from the University of Washington (2014) and a BS in Mathematics and Statistics from the University of Minnesota (2012). Jeremy is also a member of the American Statistical Association (ASA). He served as Chapter Vice President and Chapter Representative of ASA for three years (2015-2018).
Using Causal Inference Model to Set Up Financial Goals of Company(Business Talk)
Anna Litvak-Hinenzon is SVP, Global Head of Data Science at The RepTrak Company. She leads RepTrak’s global international data organization, providing clients with actionable data insights on Reputation, Brand, and ESG. As a data and technology leader for over 15 years, Anna helps organizations to achieve goals with data products powered by cutting-edge machine learning and AI models leveraging multiple data sources. Anna is a passionate digital transformation leader, an author of numerous patents and papers, with a Ph.D. in Applied Mathematics in her background.
How to Model Public Opinions in the Media Age(Business Talk)
Gaurav is currently the Executive Vice President and General Manager of Machine Learning and AI at AtScale. He is responsible for defining and leading the business that extends the company’s semantic layer platform to address the rapidly expanding set of Enterprise AI and machine learning applications.
Most recently, Gaurav served as VP of Product at Neural Magic – innovators in software acceleration for deep learning utilizing sparse model architectures. Previously, he served in a number of executive roles at IBM spanning product, engineering, and sales that were focused on taking cutting edge data science, machine learning, and AI products and solutions to market; specializing in model training, serving, mlops, and trusted AI in the context of driving business outcomes for enterprise applications. He is also an advisor to data and AI companies.
A New Era of Applied AI: How to Accelerate Enterprise Adoption of AI for Business Impact (Business Talk)
Younes Ben Brahim is a senior Product Marketing Manager at Red Hat and focuses on AI/ML, data analytics and HPC solutions on OpenShift. Previously, Younes worked as a product manager for NetApp and held various roles in Sales, and Consulting at Cisco, Nokia and Perficient. Younes holds a Bachelor of Science from Colorado State University and an MBA from the University of Denver.
Accelerate AI/ML Deployments with Enterprise-Grade MLOps(AIx Keynote)
Scott McClellan is a senior director of product management at NVIDIA, focused on data science workflows. Before joining NVIDIA, Scott was the chief technology officer of PRGX Inc. He has been chief technologist and led engineering and product development at companies including RedHat and HP, where he guided strategies across HPC, cloud, big data and AI solutions. Scott holds a Bachelor of Science from the University of Iowa.
Accelerate AI/ML Deployments with Enterprise-Grade MLOps(AIx Keynote)
Savita is a Data & AI evangelist based out of the San Francisco Bay Area, USA. Savita brings 15 years of experience as a technology professional, during which she worked on the Microsoft platform architecting and developing applications, automating solutions and integrations across Azure, M365, Power Platform and Teams.
She believes AI is a game changing development in human history which can solve some of the most daunting challenges humanity faces today. This idea drives her to relentlessly engage with customers, educate them on the potential of AI, showcase practical use cases and ultimately get them excited and thinking about how they can use AI to solve their organization’s pressing challenges.
Azure AI Powered Global Translator(Demo Talk)
Dr. Blaine Nelson earned his B.S. (University of South Carolina), M.S. and Ph.D (UC Berkeley) degrees in Computer Science. He was a Humboldt Postdoctoral Research Fellow at the University of Tübingen (2011-13) and a Postdoctoral Researcher at the University of Potsdam (2013-14) in Germany. As a graduate student and post-doc, Dr. Nelson co-established the foundations of adversarial machine learning. He has twice co-chaired the ACM CCS workshop on Artificial Intelligence & Security, and co-coordinated the Dagstuhl Perspectives Workshop on Machine Learning Methods for Computer Security (2012).
Following his post-doctoral work, Dr. Nelson worked as a software engineer in Google’s fraud detection group (2014-2016) where he built models and designed infrastructure for large scale machine learning. He then became a senior software engineer at Google’s counter-abuse technology team (2016-2021) where he designed and built a large scale machine learning workflow system. Currently, Dr. Nelson is a principal machine learning engineer at Robust Intelligence where he works in a multi-faceted role to build infrastructure for testing the reliability and security of machine learned models by finding potential flaws or vulnerabilities in their behavior.
Practical Adversarial Learning: How to Evaluate, Test, and Build Better Models(Training)
Shivnath Babu is CTO & Co-Founder at Unravel Data, helping our team innovate and solve the difficult challenges enterprises face with their data. Previously, he was a tenured professor of Computer Science at Duke University doing research on the ease-of-use and manageability of data-intensive systems, automated problem diagnosis, and cluster-sizing for applications running on cloud platforms.
Dr. Inchiosa’s passion for AI drives his work as Principal Data Scientist Manager in Azure Data’s Advanced Workload Engineering team, where he leads a team of data scientists focused on AI-led co-innovation engagements with strategic customers and partners. Previously, Mario served as Revolution Analytics’ Chief Scientist and as Analytics Architect in IBM’s Big Data organization, where he worked on advanced analytics in Hadoop, Teradata, and R. Prior to that, Mario was US Chief Scientist in Netezza Labs, bringing advanced analytics and R integration to Netezza’s SQL-based data warehouse appliances. He also served as US Chief Science Officer at NuTech Solutions, a computer science consultancy specializing in simulation, optimization, and data mining, and Senior Scientist at BiosGroup, a complexity science spin-off of the Santa Fe Institute. Mario holds Bachelor’s, Master’s, and PhD degrees in Physics from Harvard University. He has been awarded four patents and has published over 30 research papers, earning Publication of the Year and Open Literature Publication Excellence awards.
Feathr: Scalable Feature Store that opens the Window to Infinite Possibilities(Keynote)
Hunter Kempf is a Data Scientist working in the cybersecurity industry and a Z by HP Global Data Science Ambassador. In his free time he works on various side projects relating to Data Science and some of those projects end up as articles for his Medium blog. Previously Hunter worked as a Data Scientist at AT&T working on preventing Fraud and Security incidents and graduated from the Georgia Institute of Technology (Georgia Tech) with a masters in Cybersecurity and the University of Notre Dame with a masters in Applied and Computational Mathematics and Statistics.
Introduction to Generative Art with Stable Diffusion, presented by HP Inc(Talk)
Alex first joined HP as a program manager on the Advanced Compute Solutions OEM team, providing extended life hardware solutions to critical partners across industry. Now, Alex is the product manager for Data Science Workstations. Alex manages Data Science Hardware, operating systems, and the Z by HP Data Science Stack Manager. Prior to HP, Alex received his MBA from the University of Texas-Austin and served in the United States Army.
Data Science at 200mph, How HP Data Science Powers Winning Racing, Presented by HP Inc(Demo Talk)
Steve is the Software Engineering Lead for HP’s Data Science Solutions Team. For two years he’s been curating HP’s Data Science Stack and building processes to ensure compatibility across HP workstations. He studied Computer Science at Colorado State University, and no matter the season – he tries his hardest to get lost in the Rocky Mountains.
Data Science at 200mph, How HP Data Science Powers Winning Racing, Presented by HP Inc(Demo Talk)
Jun Zeng is HP’s Distinguished Technologist and founding manager of the 3D Digital Twin group. Jun has 20 years of industrial experience in creating and commercializing software for improving cyber-physical systems. His publications include a co-edited book on computer-aided design, a co-authored book on the digital factory, and 50+ peer-reviewed papers. He has 58 U.S. patents granted and more pending. His academic training includes a Ph.D. in mechanical engineering and an M.S. in computer science, both from Johns Hopkins University. He is an ACM member and an IEEE senior member.
Jack McCauley is an Innovator in Residence at the Jacobs Institute for Design Innovation at UC Berkeley, Professor at UC Berkeley, Co-Founder of Oculus, an American engineer, hardware designer, inventor, video game developer and philanthropist. Jack is best known for designing the guitars and drums for the Guitar Hero video game series, and as a co-founder and former chief engineer at Oculus VR. At Oculus, Jack designed and built the Oculus DK1 and DK2 virtual reality headsets. Oculus was acquired by Facebook for $2 Billion. McCauley holds numerous U.S. patents for inventions in software, audio effects, virtual reality, motion control, computer peripherals, and video game hardware and controllers. Jack was awarded a full scholarship to attend the University of California, Berkeley, where he earned a B.Sc. in Electrical Engineering and Computer Science (EECS) in 1986. Jack has authored numerous research papers in the field of artificial intelligence (AI) and mathematical modeling of AI-based systems and is currently pursuing new projects at his private R&D facility and hardware incubator in Livermore, California.
Salil Pradhan is a Product Manager in MySQL HeatWave team. His interests include distributed data processing, machine learning, cloud computing, middleware technologies as well as application areas such as Marketing Automation and Supply Chain Management.
Bio Coming Soon!
Hanna Hajishirzi is an Associate Professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington and a Senior Research Manager at the Allen Institute for AI. Her research spans different areas in NLP and AI, focusing on developing general-purpose machine learning algorithms that can solve diverse NLP tasks. Applications for these algorithms include question answering, representation learning, green AI, knowledge extraction, and conversational dialogue. Honors include the NSF CAREER Award, Sloan Fellowship, Allen Distinguished Investigator Award, Intel rising star award, best paper and honorable mention awards, and several industry research faculty awards. Hanna received her PhD from University of Illinois and spent a year as a postdoc at Disney Research and CMU.
Toward Robust, Knowledge-Rich Natural Language Processing(Talk)
Jake is currently working as a Senior Product Marketing Manager over ML Lifecycle products at Cloudera. Before joining Cloudera, Jake worked as a Data Scientist and Solution Architect at ExxonMobil. Additionally, he worked as a Senior Data Scientist at FarmersEdge. Before starting his professional career, Jake obtained his bachelor’s and master’s degree from Brigham Young University. When he isn’t working, Jake enjoys skiing, golfing, and spending time with his family in the mountains.
Forecasting Crypto Currency Prices with Cloudera Applied Machine Learning Prototypes(Demo Talk)
As the Head of Machine Learning at Abnormal Security, Dan builds cybercrime detection algorithms to keep people and businesses safe. Before joining Abnormal Dan worked at Twitter: first as an ML researcher working on recommendation systems, and then as the head of web ads machine learning. Before Twitter Dan built smartphone sensor algorithms at TrueMotion and Computer Vision systems at the Serre Lab.
Sanjay Hariharan is a Principal Data Scientist with QuantumBlack Labs, AI by McKinsey, where he serves as a technical Data Science leader. His expertise lies in the development, assetization, and deployment of analytics solutions for Life Sciences and Healthcare problems, including measuring drug efficacy, improving patient access, segmenting physician populations, and expanding indications. These problems combine domain expertise with broad applications of Machine Learning and Optimization. As part of his previous role, Sanjay managed the data science workstream for many client-facing engagements, working in industries including manufacturing, retail, public sector, and healthcare. Sanjay holds a B.A. in Mathematics from the University of Pennsylvania and an M.S. in Statistical Science from Duke University. In his free time, he enjoys playing tennis, traveling, and exploring New York City!
Kedro: The Open Source Python Library for Production-Ready Data Science Code(Talk)
Rongyao is a data scientist bootstrapped from stats and social science training and an industry-honed engineer. She has worked across domains from anti-corruption research to digital advertising and finance. She specializes in end-to-end problem solving from ideation to deployment. She has spent the past four years absorbing the explosion of deep learning and NLP and leveraging them to scale information extraction at CB Insights R&D.
Jacob Nelson is a data scientist working in the market research industry. Jacob designs research, collects survey data, and uses statistics and machine learning to find insights about consumer behavior that help steer clients’ brand strategy. In his field, he is well appreciated for boldly finding new approaches to traditional tasks to improve research outcomes and predictive insights. He earned his Master’s Degree in Political Science at Utah State University in 2016, acquiring skills in data science and quantitative research methodology along the way. He has been working as a data scientist in the market research industry ever since. Jacob currently works for Harris Poll, an American market research and analytics company.
Archetypal Analysis: Maintaining Contrastive Categories in Cluster Analysis(Talk)
Jun Gong is a senior member of the ML team at Anyscale working on RLlib. He is a co-author of a Nature publication on the topic of stratospheric balloon navigation using reinforcement learning. Before Anyscale, he had extensive experience working on distributed and ML systems at Google and Facebook.
Write your first RL Recommender System with Ray and RLlib(Workshop)
Consulting leader that has advised and worked closely with Fortune 500 enterprises to implement and adopt usage of cloud, analytics/AI and IoT solutions. He is a consulting leader at Tiger Analytics, solving complex problems for supply chain, manufacturing, and sustainability while creating business value. In addition, he has served as innovation thought leader with US federal agencies – DoE, NSF, and Materials Genome Initiative to drive usage of Cloud/AI in R&D and manufacturing to launch consumer products.
Tackling Supply Chain Issues with Data and Analytics – Tales from the War Room(Talk)
Kyle is the Chief Architect at Noteable and a core developer of the IPython/Jupyter project. He wants to help build great environments for collaborative analysis, development, and production workloads for everyone; from small teams to massive scale. His passion for open source has enabled him to build better systems with staying power, enable peers, support companies he’s worked for, and drive growth. As an active member of local politics, Kyle has focused on schools, active transportation, transit, and housing all to have a good impact on climate change and equity.
Tushar Mehrotra is SVP of data & analytics for OptumInsight. In his role he oversees product, services, and data asset and platform strategy across markets, and is responsible for driving the vision and direction for the organization encompassing data scientists, actuaries, engineers, product leaders, and medical informatics teams. He is responsible for building GTM and end-to-end analytics solutions, across AI/ML, Social Determinants of Health and Health Equity, Population Health and VBC, Inpatient and Quality domains. Tushar is also accountable for growth, solutioning/design, and delivery for analytics as part of Optum’s Market Performance Partnerships. Prior to joining Optum, Mehrotra spent seven years at McKinsey & Company in Washington, D.C. He was one of the core leaders in the North America Digital and Analytics practice for health care. He has published articles on health care digital and analytics topics and has served on panels and roundtables with leaders in the industry. Mehrotra earned his Bachelor of Science from the University of Illinois at Urbana-Champaign in electrical engineering, and an MBA from the Wharton School at the University of Pennsylvania in strategic management and finance.
As the Vice President, Chief of Staff, Systems Improvement of Bassett Healthcare Network, Michael Thompson partners with administrative and medical staff leadership to develop and implement systems to manage strategic performance improvement plans for all entities across Bassett Healthcare Network. He works collaboratively with network staff and leaders to leverage cultural systems to accelerate improvement initiatives, and partners to drive quality, safety, experience, and access priorities. Michael serves as the performance delivery manager and analytics lead for the Optum Insights/Bassett market performance partnership, leads and oversees the development of key process standard work and integrates improvement and change management principles across the Bassett network. Prior to arriving at Bassett, Michael spent five years at INTEGRIS in Oklahoma City in a variety of roles – most recently vice president for provider services. Prior to working for INTEGRIS, Michael spent 10 years with the U.S. Postal Service in a variety of industrial engineering and leadership positions. Michael holds an MS in industrial and systems engineering and a master’s in business administration. He is a Lean Six Sigma Black Belt and has extensive experience in lean methodology and system design.
Sophia Liu is a Senior Data Scientist at Netflix. She leads the data science initiatives for Netflix games offerings. She specializes in online controlled experimentation (A/B tests), causal inference and analytics. Before Netflix, she was a senior data scientist in the Analysis and Experimentation (A&E) team at Microsoft. Dr. Liu received her M.S. and PhD degrees in Electrical Engineering from Columbia University and Northwestern University in 2012 and 2016, respectively. During her graduate study, she won two best paper awards out of 14 international publications and conducted internships at Bell Labs, Cisco and Alliance Data Systems.
5 Things We Have Learned From Continuous Explore Exploit Applications at Netflix(Talk)
Anna Filippova tends to the dbt Community garden of over 25,000 at dbt Labs as the Director of Community. Prior to dbt Labs, Anna built the first Analytics Engineering team at GitHub. Today, she writes about the intersection of modern data tools and open source in the Analytics Engineering Roundup.
In her past life, Anna published research on building, maintaining and sustaining open source communities. She has also studied how distributed and open source communities worked, fought and learned in a Postdoc at Carnegie Mellon, and acquired a PhD in Communication and Media from the National University of Singapore. From time to time you can find Anna traveling the coast of California and working from her campervan and she is always open to an AMA session.
Bhakti is a technical lead at Responsible AI, Google where she leads applied research teams to build tools to evaluate and mitigate fairness and robustness issues at scale in Google’s tech. She and her team have landed several quality improvements in Google’s ML models by architecting and deploying AI models, systems, and platforms that are trustworthy, fair and explainable by design.
Rajiv Shah is a leading expert on practical AI. At Hugging Face, his primary focus is on enabling enterprises to succeed with AI. He previously led data science enablement efforts across hundreds of data scientists at DataRobot and has been part of data science teams at Snorkel AI, Caterpillar, and State Farm. He is a widely recognized speaker on AI, has received many patents, and published research papers in several domains, including sports analytics, deep learning, and interpretability. He received a Ph.D. and a J.D. from the University of Illinois at Urbana Champaign.
Transforming Enterprise Data Science with Transformers(Workshop)
Sourav Mazumder is an IBM Data Scientist Thought Leader and The Open Group Distinguished Data Scientist. Sourav has consistently driven business innovation and values through methodologies and Technologies related to Artificial Intelligence, Data Science and Big Data transpired through his knowledge, insights, experience and influencing skills across multiple industries including Manufacturing, Insurance, Telecom, Banking, Media, Health Care and Retail industries in USA, Europe, Australia, Japan and India. Over the last 10 years, he has influenced key decision makers of several fortune 500 companies to adopt Artificial Intelligence, Data Science, and Big Data related technologies to address complex business needs. Sourav has also consistently provided directions to and successfully led numerous challenging Artificial Intelligence, Data Science and Big Data projects, applying various related methodologies ranging from Descriptive statistics, Probabilistic Modelling, Algorithmic Modelling, Natural Language Processing, etc., to solve critical business problems. Sourav has also successfully partnered with academia within North America, India, South Africa to mentor students and enable them in this field. Sourav has experience and exposure in working with a variety of Artificial Intelligence, Data Science and Big Data related technologies such as Watson Open Scale, Watson Natural Language Processing, Watson Machine Learning, IBM Cloud Pak for Data, Spark, Hadoop, BigSQL, HBase, MongoDb, Solr, System ML, Cognos, R, Python, Scala/Java and using them in projects involving phases from creation of Minimum Viable Product to Productionization at an enterprise level. Sourav is an Open Source enthusiast and contributes to Open Source regularly. Sourav holds patents in the Data and AI space (patent profile https://patents.justia.com/search?q=Sourav+Mazumder). Sourav consistently publishes papers/blogs/articles in various industry forums. 
Sourav is co-author, guest editor and chief editor of multiple books in the AI, Data Science and Big Data space (https://www.researchgate.net/profile/Sourav-Mazumder). Sourav is regularly invited to speak on this subject at various industry conferences, such as Open Data Science Conference, Spark Summit, IBM Think, and Global AI Conference. He can be found on LinkedIn (https://www.linkedin.com/in/souravmazumder/).
Data Drift Identification for NLP Models in the Context of AI Governance for Enterprises(Workshop)
Prem Prakash heads product marketing for Machine Learning, Developer Tools and Training at Databricks. Previously he led product marketing at Microsoft for the Azure AI portfolio of services. With an academic background in engineering and an MBA, and over 15 years of experience in technology sales and marketing, he is passionate about building and scaling marketing teams to drive awareness and adoption of technology products.
5 Questions Business Leaders Should Ask When Investing in Machine Learning Projects(Business Talk)
Ben Taylor has over 17 years of machine-learning experience. After studying chemical engineering, Taylor joined Intel and Micron and worked in their photolithography, process control, and yield prediction groups. Pursuing his love for high-performance computing (HPC) and predictive modeling, Taylor joined an artificial intelligence hedge fund (AIQ) as their AI expert. Taylor then joined a young HR startup called HireVue, built out their data science group, and helped launch HireVue’s AI insights product using video/audio from candidate interviews. In 2017 Taylor co-founded Zeff.ai to pursue deep learning for image, audio, video, and text for the enterprise.
Building & Selling AI Startups(Business Talk)
Manesh specializes in thought leadership and building end-to-end technology solutions in cloud computing, data platforms and DevOps, with a key focus on hybrid workloads. He works with CXOs on understanding business problems and designs systems to deliver customer success through Spektra’s innovative cloud solutions and services.
Making Data-driven Decisions with Azure Machine Learning & Responsible AI Dashboard(Workshop)
Mathias Ciliberto is a data scientist at Hydrostasis, Inc. He started working on wearable sensors at ST Microelectronics, before completing his PhD in Engineering at the University of Sussex with a thesis on movement recognition with template matching methods for power-aware applications. At Hydrostasis Inc., he works on methods for hydration assessment using near-infrared signals. His expertise includes machine learning, deep learning and template matching methods for physiological sensing and assessment, with a focus on health-care and low power applications. His research interests also include signal processing, embedded systems and novel wearable applications.
Hugo Shi is a data science leader with 15 years of experience with data science and software projects at companies ranging from JP Morgan to the Chicago Trading Company. He is the CTO and co-founder of Saturn Cloud, where he helps to make sure that Saturn Cloud is secure, scalable and easy to use for all data science teams. Hugo has a PhD in Signal Processing and his academic research focused on iterative reconstruction algorithms in medical imaging.
Data Science Platforms are Bad(Demo Talk)
Dr. Bryan Bischof is the Head of Data Science at Weights and Biases, and adjunct professor of Data Science at Rutgers University. He’s previously worked in Time Series Signal Processing at Scale, Demand Forecasting, Global Optimization and Logistics, and Personalized Recommendations. He’s obsessed with math, and has a dog named Ravioli.
Mike Wong is a Solutions Engineer at Unravel Data, helping customers navigate the challenges of the modern data economy and optimize complex data stacks. Previously, he spent nearly 20 years as a solution architect in a range of technology roles from PLM to Hadoop. His robust experience in the DataOps domain allows Mike to help customers achieve their vision with data applications and infrastructure.
Empowering DevOps for Data Teams(Demo Talk)
Vincent has 30+ years of experience as an AI specialist with ILOG and IBM. He has mentored several Data Science teams. Vincent has designed and modeled several major AI projects for customers such as Samsung Electronics, McDonald’s, Dassault Aviation, Carhartt, Toyota, TSMC, Disney, etc. He is skilled in Mathematical Modeling, Machine Learning, and Time Series prediction. He has strong experience in the Manufacturing, Retail & Logistics industries. His main objective is to “Help companies go beyond AI pilots and be successful in bringing AI to their end-users”. He received his MSc in Computer Science & AI from Paris-Saclay University.
Turning your Data/AI Algorithms into Full Web Apps in no Time with Taipy(Demo Talk)
Subho Majumdar is a technical leader in applied trustworthy machine learning who believes in a community-centric approach to data-driven decision making. He has pioneered the use of trustworthy ML methods in industry settings, co-wrote a book, and founded multiple nonprofit efforts in this area. In past work, he has helped drive policy changes in government and nonprofit organizations through successful collaborations in the data for good space. Subho has 10 years of R&D experience in ML, data science, and statistics, with 30+ publications and 15+ filed patents (2 granted). Currently, Subho is a senior scientist in the Security ML group of Splunk. He holds a PhD and masters in statistics from University of Minnesota.
Practicing Trustworthy Machine Learning: A Tutorial (Tutorial)
Matthew McAteer is the creator of 5cube Labs, an ML consultancy that has worked with over 100 companies in industries ranging from architecture to medicine to agriculture to drug discovery. Matthew worked with the Tensorflow team at Google on probabilistic programming, and previously worked in biomedical research in labs at MIT and Harvard Medical School.
Practicing Trustworthy Machine Learning: A Tutorial (Tutorial)
Bio Coming Soon!
Practicing Trustworthy Machine Learning: A Tutorial (Tutorial)
David Patterson received BA, MS, and PhD degrees from UCLA. He is a UC Berkeley Pardee professor emeritus, a Google distinguished engineer since 2016, the RIOS Laboratory Director, and the RISC-V International Vice-Chair.
His most influential Berkeley projects likely were RISC and RAID. He received service awards for his roles as ACM President, Berkeley CS Division Chair, and CRA Chair and awards for his teaching. The most prominent of his seven co-authored books is Computer Architecture: A Quantitative Approach.
He and his co-author John Hennessy shared the 2017 ACM A.M. Turing Award, the 2021 BBVA Foundation Frontiers of Knowledge Award, and the 2022 NAE Charles Stark Draper Prize for Engineering. The Turing Award is often referred to as the “Nobel Prize of Computing” and the Draper Prize is considered a “Nobel Prize of Engineering.”
Outside of work he plays soccer, lifts weights, cycles, and bodysurfs. He has been married to his high-school sweetheart since 1967, and they have raised two sons, who in turn are raising three grandchildren.
A Decade of Machine Learning Accelerators: Lessons Learned and Carbon Footprint (Talk)
Balamurugan Gangadharan is a Senior Staff Software Engineer in the Data and Artificial Intelligence Platform group at LinkedIn, where he is the Tech Lead for DARWIN, a Data Science and Artificial Intelligence platform. He plays a key role in shaping the product roadmap by working with its various stakeholders, and in designing and implementing multiple features in DARWIN. He has extensive experience building highly scalable and complex distributed systems. Previously, he served in senior roles at companies such as Qubole, Nutanix, and Dell, where he built enterprise and cloud-native solutions for complex problems.
Unified Data Science Platform for Accelerating Data Insights (Talk)
Harini Kannan is a data scientist at Sophos AI. She has been in security data science for the last 5 years. She was previously the Principal Data Scientist at Capsule8, which was acquired by Sophos. She has given talks at Defcon AIVillage, CAMLIS, BlackHat (USA), ODSC East, Data Science Salon, PyData (Boston), and Data Connectors. Her areas of research include detecting hardware-based attacks using performance counters, user behavior analysis, applied NLP, interpretable ML, and unsupervised anomaly detection.
Konstantin Berlin is currently the Head of AI at Sophos, where he manages a team of machine learning researchers and big data engineers. His group is responsible for developing and maintaining headline ML models that are actively deployed and used by millions of Sophos customers every day. His areas of interest and work cover all aspects of the ML development cycle. This includes leading a team of researchers in developing novel ML cybersecurity models, working across organizations to integrate the models into products, and expanding and architecting Sophos AI infrastructure and MLOps capabilities.
George Williams is the Head of AI at Smile Identity, an identity management and computer vision-based biometrics provider. He has held senior leadership roles in software engineering, system design, data science, and AI research, including tenures at Apple’s New Product Architecture group and at New York University’s Courant Institute. He can talk on a broad range of topics at the intersection of e-commerce, machine learning, cybersecurity, computer hardware, and computer science. He is an author of several research papers in computer vision and deep learning, published at NeurIPS, CVPR, ICASSP, ICCV, and SIGGRAPH. George is regularly invited to present at meetups and technology conferences, including recent talks at Blackhat, Open Data Science Conference, Apache Spark Summit, JupyterCon, AnacondaCon, and Space Computing. He is a track chair at the Valleyml.ai conference and a workshop chair for the Neural Information Processing Conference.
Nick is currently a software engineer at Google working on macOS endpoint security systems. He was previously a senior threat researcher at Capsule8 (acquired by Sophos), focusing on Linux server defense. His background is primarily in low-level systems and kernel exploitation research. Nick is also a Hacker in Residence and former student of NYU Tandon School of Engineering’s OSIRIS Lab.
Cameron Wolf is a Ph.D. student in Computer Science at Rice University in Houston, TX advised by Dr. Anastasios Kyrillidis. His interests are loosely related to math and machine/deep learning, including non-convex optimization, theoretically-grounded algorithms for deep neural networks, continual learning, and deep learning on video data. Prior to Rice, he was an undergraduate student in Computer Science at UT Austin, where he worked with the Neural Networks Research Group on research related to genetic algorithms and evolutionary computation.
Outside of academia, he is a Research Scientist at Alegion, a software startup based in Austin, TX. At Alegion, he focuses on the development of long-term, practical research projects in several areas, including streaming training of deep neural networks and evaluating the quality of video-based annotations for computer vision applications. Additionally, he produces a lot of technical content, both internally and externally, with the goal of familiarizing those outside of academia with important topics and considerations relevant to artificial intelligence.
Perspectives on Hyperparameter Scheduling in Deep Learning (Workshop)
Greg Michaelson is Cofounder and Chief Product Officer at Zerve, a young, stealthy startup that’s rethinking the data science development experience. Previously, Greg was an early joiner at DataRobot where he played many roles, including Chief Customer Officer. Prior to that, he worked as a data scientist in the financial sector after earning a PhD in Applied Statistics from the University of Alabama. In his spare time, Greg manufactures a line of flavored breakfast cereal toppings called Cerup. He lives in Spring Creek, Nevada with his wife, four children, and two Clumber Spaniels.
Four Reasons the Data Science Development Experience Sucks (Talk)
Zoe Steinkamp is a Developer Advocate at InfluxData. She was a front-end software engineer for over 6 years before she moved into a developer advocate role. She has been with InfluxData for over 3 years and she looks forward to sharing her knowledge of the platform and databases. She enjoys learning about awesome new technologies and doing at-home tech projects to help make her life, as well as other people’s lives, easier. Her passions besides new technology include traveling and gardening.
Methods and Tools for Time Series Data Science Problems with InfluxDB (Demo Talk)
Mohamed is the Co-founder & CEO of Kolena and the author of Manning’s book: “Deep Learning for Vision Systems”. Previously, he built and managed AI/ML organizations at Amazon, Twilio, Rakuten, and Synapse (acq. by Palantir). Mohamed regularly speaks at AI conferences like Amazon’s DevCon, O’Reilly’s AI conference, and Google’s I/O.
Build an ML Testing Infrastructure for Rigorous and Systematic Model Testing (Talk)
Bing Liu is a distinguished professor at the University of Illinois at Chicago. He received his Ph.D. in AI from the University of Edinburgh. His current research interests include continual/lifelong learning, lifelong learning dialogue systems, open-world learning, natural language processing, and machine learning. His previous research interests include sentiment analysis, fake review detection, and Web data mining. He has published extensively in prestigious conferences and journals and authored four books: one about lifelong/continual learning, two about sentiment analysis, and one about Web mining. Three of his papers have received Test-of-Time awards and another one received a Test-of-Time honorable mention. Some of his works have also been widely reported in the popular and technology press internationally. He served as the Chair of ACM SIGKDD from 2013 to 2017 and as program chair of many leading conferences. He is the winner of the 2018 ACM SIGKDD Innovation Award, and is a Fellow of AAAI, ACM, and IEEE.
Continual Learning of Natural Language Processing Tasks (Talk)
Kourosh Hakhamaneshi is a Reinforcement Learning Engineer on the Ray RLlib team at Anyscale. Prior to joining Anyscale, Kourosh was a PhD student in EECS at Berkeley AI Research, working on machine learning, especially reinforcement learning, unsupervised learning, and their applications in robotics and automated design. He was co-advised by Pieter Abbeel and Vladimir Stojanovic.
Write your first RL Recommender System with Ray and RLlib (Workshop)
Melanie is the Research Engineering Manager of Cloudera Fast Forward Labs, an applied machine learning research team within Cloudera. As a researcher and data scientist, she is passionate about democratizing machine learning by turning academic breakthroughs into useful and accessible applications, especially in the NLP space. With experience as a data scientist in multiple industries from hardware manufacturing to cybersecurity, she is a jack of all trades who loves to share what she’s learned. She is also an avid knitter and a reformed astrophysicist, holding a Ph.D. in astrophysics from the University of Minnesota.
Neutralizing Subjectivity Bias with HuggingFace Transformers (Talk)
Danny D. Leybzon has worn many hats, all of them related to data. He studied computational statistics at UCLA and has worked in the data and ML space ever since. In his role as MLOps architect, he has worked to evangelize machine learning best practices, speaking on subjects such as distributed deep learning, productionizing machine learning models, and automated machine learning; lately he has been talking about AI observability and data logging. When Danny’s not researching, practicing, or talking about data science, he’s usually pursuing one of his numerous outside hobbies: rock climbing, backcountry backpacking, skiing, etc.
Achieving Better Models Through Monitoring (Demo Talk)
Vishakha Gupta-Cledat is Co-founder and CEO of ApertureData. Prior to that, she worked at Intel Labs for over 7 years, where she led the design and development of VDMS (the Visual Data Management System), which forms the core of ApertureData’s product, ApertureDB. Vishakha holds a Ph.D. in Computer Science from the Georgia Institute of Technology and an M.S. in Information Networking from Carnegie Mellon University. She has worked on scheduling in heterogeneous multi-core environments, graph-based storage and applications on non-volatile memory systems, and visual data management challenges for analytics use cases.
Can We Simplify Image and Video Management for Analytics? (Women’s Ignite)