ODSC Europe 2023
Conference Schedule
more sessions added weekly
Europe 2022 Schedule
We are delighted to announce our Europe 2022 Preliminary Schedule!
Please note: In-person attendees will have access to virtual sessions. If you have a virtual pass, please note that we will not live-stream any in-person sessions. Only virtual sessions will be recorded.
The prerequisites and slides for the sessions can be found here
Workshop | In-person | MLOps and Data Engineering | Machine Learning | All Levels
MLOps means different things to different people. However, the fundamental essence of MLOps is to deliver models into production faster with a consistent, repeatable and reliable approach. Based on our experience of working with various large and small customers across the world, Microsoft has developed an accelerator to do exactly what the word suggests – accelerate our customers’ journey to production…more details
Cindy Weng is a Senior Cloud Solution Architect at Microsoft in Data & AI. She specializes in architecting MLOps solutions for customers across a variety of industries including retail, financial services, consumer goods, and tech. She is one of the authors of the MLOps V2 unified accelerator by Microsoft.
Manu Kanwarpal is a Senior Specialist in the EMEA AI Global Black Belt team at Microsoft. He specialises in Azure Machine Learning and works with some of Microsoft’s largest customers on establishing their end-to-end processes for Data Science & MLOps. He is one of the authors of Microsoft’s unified MLOps accelerator called “MLOps v2”.
Workshop | In-person | Deep Learning | Machine Learning | Intermediate-Advanced
Recently, OpenAI showcased its latest text-to-image model, known as DALL-E 2. It generates photorealistic images from text, including some unusual ones, e.g. “astronaut riding a horse”. Soon after, it was superseded by Google’s Imagen as the state-of-the-art model. Both models have one thing in common: they use diffusion models as the core algorithm, which is the topic of our workshop…more details
Bio Coming Soon!
Workshop | In-person | MLOps and Data Engineering | Intermediate
Come to this talk to learn how you can add real-time analytics capability to your data pipeline…more details
Karin currently leads developer community programming in the Developer Relations team at StarTree. Karin began her career in entertainment marketing, working with the likes of Eminem and Live Nation. She also launched a successful professional women’s network in two major cities in the U.S., organized events for her local Data Science meetup, and helped lead an ongoing hackathon to put machine learning in the hands of cancer biologists. Her journey working in data eventually led her to a position as Program Manager for Community Development for the leading graph database in the world, Neo4j. Most recently, she was brought on to StarTree to improve the adoption and success of the overall developer community.
Bootcamp | Virtual | Kickstarter | Beginner
In this workshop, you will get acquainted with the pandas library, which is the most widely used package for reading, analyzing and exporting datasets in Python. You will also learn how to visualize many kinds of tabular data using the plotnine package, along with some tips and tricks on how to make your visualizations stand out. Lastly, you will have the opportunity to make predictions and take decisions using data, based on basic statistical methods…more details
Leonidas (Leo) is a Senior Data Scientist at AstraZeneca. His work is focused on machine learning in oncology, including clinical and non-clinical applications. He is also enthusiastic about NLP applications in oncology and how they can be used to improve patient treatment. He is also a workshop facilitator at the European Leadership University (ELU), NL and has been a data science educator at DataCamp. He holds a PhD in bioinformatics and ML from the University of Warwick, UK, an MSc in statistics from Imperial College London, UK and a BSc in Statistics and Insurance Science from the University of Piraeus, GR.
Tutorial | In-Person | Deep Learning | NLP
We’ll spend most of our time with Natural Language Processing and how to train models with NLP techniques. We’ll see how computers break down words into ‘embeddings’ – or higher dimensional vectors, whose ‘direction’ can be used to establish sentiment. With this, you’ll then see how a computer can begin to understand the ‘meaning’ of text – and we’ll see how to train AI models to detect things like whether a movie review was positive or not. The techniques can then be extended to text prediction – which leads to text generation – and you’ll learn how to create a simple AI model that creates its own text…more details
Laurence Moroney leads AI Advocacy at Google, working with the Google AI Research and product development teams. He’s the best-selling author of ‘AI and Machine Learning for Coders,’ as well as the instructor on the Fundamentals of TinyML course at HarvardX, and the popular TensorFlow specializations with deeplearning.ai and Coursera. He’s passionate about empowering software developers to succeed in Machine Learning, democratizing AI as a result. Laurence is based in Washington State in the USA.
Tutorial | In-person | Machine Learning for Finance | Intermediate
Financial documents such as news, central bank releases, company earnings calls and press releases can significantly alter market movement. We can therefore mine these to extract sentiment to understand and anticipate relevant market movement, strengthen investment theses, and inform trading strategies…more details
Chandini Jain is the CEO/founder of Auquan – a London-based fintech using NLP and AI to distill relevant and impactful information from unstructured text. Prior to Auquan, she worked as a derivatives trader at Optiver in Chicago/Amsterdam and Deutsche Bank. At Auquan, she oversees the development of its machine learning strategies.
With a bachelor’s degree in Engineering Physics from IIT Delhi, Vishal has a strong background in mathematics and statistics. He thoroughly enjoys working with Data Science, Machine Learning, and Big Data technologies, with a firm belief in learning by doing.
Workshop | In-person | Machine Learning, Big Data Analytics, MLOps and Data Engineering | Beginner
We will work through various materials and examples to get you started with GPU development in Python using open source libraries…more details
Jacob Tomlinson is a senior Python software engineer at NVIDIA with a focus on deployment tooling for distributed systems. His work involves maintaining open source projects including RAPIDS and Dask. RAPIDS is a suite of GPU accelerated open source Python tools which mimic APIs from the PyData stack including those of NumPy, pandas and scikit-learn. Dask provides advanced parallelism for analytics with out-of-core computation, lazy evaluation and distributed execution of the PyData stack. He also tinkers with the open source chatbot automation framework Opsdroid in his spare time. Jacob volunteers with the local tech community group Tech Exeter and lives in Exeter, UK.
Workshop | In-Person | Deep Learning | NLP | All Levels
Over the past few years speech synthesis or text-to-speech (TTS) has seen rapid advances thanks to deep learning. As anyone who owns a voice assistant will know, artificial voices are becoming more and more natural and convincing. The good news is you can recreate this impressive technology yourself, using high quality open-source tools. In this workshop, we’ll learn all about TTS and create a custom speech synthesis system from scratch. We’ll take a look at the development of TTS systems up to the present day, investigate the challenges that researchers are still grappling with, and walk through an end-to-end example of creating a deep learning-based TTS system – including data preparation, training, inference and evaluation. This workshop doesn’t require any prior knowledge of TTS or deep learning…more details
Alex Peattie is the co-founder and CTO of Peg, a technology platform helping multinational brands and agencies to find and work with top YouTubers. Peg is used by over 1500 organisations worldwide including Coca-Cola, L’Oreal and Google.
An experienced digital entrepreneur, Alex spent six years as a developer and consultant for the likes of Grubwithus, Huckberry, UNICEF and Nike, before joining coding bootcamp Makers Academy as senior coach, where he trained hundreds of junior developers. Alex was also a technical judge at this year’s TechCrunch Disrupt conference.
Workshop | In-person | Machine Learning | Deep Learning
This session discusses the fundamentals of distributed training and how scaling works with ML model training. Then we’ll look into the PyTorch Lightning framework’s core components. Using these components, we demonstrate how to implement a simple model and scale it with different distributed strategies and accelerators with ease without worrying about the hassles of engineering…more details
Workshop | In-person | Deep Learning | Machine Learning | Intermediate
This tutorial will introduce the glossary of uncertainty quantification relevant for deep learning and contextualise which aspects are most important in order to ease model deployment. While the identification of difficult samples for collaborative approaches between human and AI can be very successful in-domain, the reliable detection of out-of-distribution samples via uncertainty remains an active field of research. We’ll provide an overview of promising recent developments and end with building a neural network that knows when it does not know – in a simple setting and for illustrative purposes…more details
Christian Leibig is Director of Machine Learning at Vara, leading the development of methods from research to production. He obtained a Ph.D. in Neural Information Processing from the International Max Planck Research School in Tübingen and a diploma in physics from the University of Konstanz. Before joining Vara, he worked as a Postdoctoral Researcher at the University Clinics in Tübingen on the applicability of Bayesian Deep Learning and machine learning applications for the healthcare space for ZEISS and held research and internship positions with Max Planck, LMU Munich and the Natural and Medical Sciences Institute in Reutlingen. The method and software of his PhD work, an unsupervised solution for neural spike sorting from HDCMOS-MEA data, is distributed by Multichannel Systems (Harvard Bioscience). His work on applying and assessing uncertainty methods to large scale medical imaging was among the first in the field and was recognized with keynote speaker invitations. He enjoys all of theory, software engineering, and people management, in particular for applications that have a meaningful impact, such as diagnosing cancer early.
Workshop | Virtual | Deep Learning | Beginner-Intermediate
The main goal of this session is to show you how GANs work: we will start with a simple example using synthetic data (not generated by GANs) to learn about latent spaces and how to use them to generate more synthetic data (using GANs to generate them). We will improve on the model’s architecture, incorporating convolutional layers (DCGAN), different loss functions (WGAN, WGAN-GP) and use them to generate synthetic images of flowers (the roses!)…more details
Daniel has been teaching machine learning and distributed computing technologies at Data Science Retreat, the longest-running Berlin-based bootcamp, for more than three years, helping more than 150 students advance their careers.
He writes regularly for Towards Data Science. His blog post “Understanding PyTorch with an example: a step-by-step tutorial” has reached more than 220,000 views since it was published.
The positive feedback from the readers motivated him to write the book Deep Learning with PyTorch Step-by-Step, which covers a broader range of topics.
Daniel is also the main contributor of two Python packages: HandySpark and DeepReplay.
His professional background includes 20 years of experience working for companies in several industries: banking, government, fintech, retail and mobility.
Workshop | Virtual | Deep Learning | Machine Learning | Beginner
Learn the basics of building a PyTorch model using a structured, incremental, first-principles approach. Find out why PyTorch is the fastest growing Deep Learning framework and how to make use of its capabilities: autograd, dynamic computation graph, model classes, data loaders and more. The main goal of this session is to show you how PyTorch works: we will start with a simple and familiar example in NumPy and “torch” it! At the end of it, you should be able to understand PyTorch’s key components and how to assemble them into a working model…more details
Daniel has been teaching machine learning and distributed computing technologies at Data Science Retreat, the longest-running Berlin-based bootcamp, for more than three years, helping more than 150 students advance their careers.
He writes regularly for Towards Data Science. His blog post “Understanding PyTorch with an example: a step-by-step tutorial” has reached more than 220,000 views since it was published.
The positive feedback from the readers motivated him to write the book Deep Learning with PyTorch Step-by-Step, which covers a broader range of topics.
Daniel is also the main contributor of two Python packages: HandySpark and DeepReplay.
His professional background includes 20 years of experience working for companies in several industries: banking, government, fintech, retail and mobility.
Tutorial | Virtual | Machine Learning for Finance | Deep Learning
Our discussion will start with a brief overview of the value of machine learning in economic applications. We will then introduce TensorFlow 2 and discuss its advantages as a tool for solving prediction and modeling problems in economics and finance. The remainder of the presentation will be dedicated to two applications. The first will center on the use of natural language processing methods as a means of extracting text features from central bank communications, such as speeches and policy statements. The second will examine the use of generative adversarial networks (GANs) as a tool for simulating financial data for Monte Carlo experiments. Code will be provided for all worked examples included in the presentation…more details
Isaiah Hull is a senior economist in the research division of Sweden’s Central Bank (Sveriges Riksbank). He holds a PhD in economics from Boston College and conducts research on computational economics, machine learning, and quantum computing. He is also the instructor for DataCamp’s “Introduction to TensorFlow in Python” course and the author of “Machine Learning for Economics and Finance in TensorFlow 2.”
Tutorial | Virtual | MLOps & Data Engineering | Machine Learning | Beginner-Intermediate
One of the key questions in modern data science and machine learning, for businesses and practitioners alike, is how do you move machine learning projects from prototype and experiment to production as a repeatable process. In this workshop, we present an introduction to the landscape of production-grade tools, techniques, and workflows that bridge the gap between laptop data science and production ML workflows…more details
Hugo Bowne-Anderson is a data scientist, writer, educator & podcaster. His interests include promoting data & AI literacy/fluency, helping to spread data skills through organizations and society and doing amateur stand up comedy in NYC. He does many of these at DataCamp, a data science training company educating over 3 million learners worldwide through interactive courses on the use of Python, R, SQL, Git, Bash and Spreadsheets in a data science context. He has spearheaded the development of over 25 courses in DataCamp’s Python curriculum, impacting over 170,000 learners worldwide through his own courses. He hosts and produces the data science podcast DataFramed, in which he uses long-format interviews with working data scientists to delve into what actually happens in the space and what impact it can and does have. He earned a PhD in Mathematics from the University of New South Wales, Australia and has conducted biomedical research at the Max Planck Institute in Germany and Yale University, New Haven.
Workshop | In-person | Machine Learning | Beginner
Learn how to detect common objects in real-time using a TensorFlow.js pre-trained model in your browser to give your next web application superpowers (or to get more eyes on your research). Walk through an end-to-end creation of a smart camera application in this workshop and discover how you can benefit from the reach and scale of the web by performing inference on device in the browser…more details
Jason is the public face of TensorFlow.js, helping web engineers globally take their first steps with machine learning in JavaScript. He also combines his knowledge of the technical and creative worlds to develop innovative prototypes for Google’s largest customers and internal teams, drawing on over 15 years’ experience in web engineering and investigating emerging technologies.
He holds an MEng in Computer Science, is a member of the British Computer Society, and is a certified information privacy technologist. Jason loves sharing knowledge online, which has attracted a global following. In his spare time he can be found walking the wings of flying aircraft, being one of the few people in the world who has been trained in the art of wing walking.
Tutorial | Virtual | Machine Learning
According to industry surveys, the number one hassle for data scientists is cleaning the data before analyzing it. We will focus on the specific problem of missing values in a prediction setting. We will show how machine-learning practice can be adapted to work with data tables that have missing values. We will start from classic missing-value results from statistics and show how they transfer to supervised learning. We will then discuss specific machine-learning methods suited for prediction with missing values. From a statistical point of view, the supervised learning setting leads to different tradeoffs than the classic statistical results…more details
Gaël Varoquaux is a research director working on data science and health at Inria (the French national research institute for computer science). His research focuses on using data and machine learning for scientific inference, with applications to health and social science, as well as developing tools that make it easier for non-specialists to use machine learning. He has long applied it to brain-imaging data to understand cognition. Years before the NSA, he was hoping to make bleeding-edge data processing available across new fields, and he has been working on a mastermind plan building easy-to-use open-source software in Python. He is a core developer of scikit-learn, joblib, Mayavi and nilearn, a nominated member of the PSF, and often teaches scientific computing with Python using the scipy lecture notes.
Workshop | In-person | MLOps and Data Engineering | Intermediate
In this workshop, I will introduce the Delta file format and how it works, before taking a tour of the many features available in the open source project. I will show you how to get started with Delta in a Spark environment, covering a range of features from simple merge statements and temporal querying, right down to some deeper performance tuning. You will leave this workshop ready to work in a Delta Lake architecture, confident that you will avoid the dreaded swamp!…more details
Simon is the Director of Engineering for Advancing Analytics, a Microsoft Data Platform MVP and one of the few Databricks Beacons globally. Simon has pioneered Lakehouse Architectures for some of the world’s largest companies, challenging traditional analytical solutions and pushing for the very best for the data industry. Simon runs the Advancing Spark YouTube channel, where he can often be found digging into Spark features, investigating new Microsoft technologies and cheering on the Delta Lake project.
Workshop | In-Person | Deep Learning | Machine Learning | Intermediate
Neural networks are powerful approximators for any function. However, they do not have the slightest idea of common knowledge of the world, which often makes them fail miserably, especially when extrapolating to areas not covered by training data.
We, as human beings, have that knowledge about the world and our domains of expertise, and it can make deep learning models much more robust and even able to extrapolate. But how do we encode this?…more details
Oliver is a software developer from Hamburg, Germany and has been a practitioner for more than 3 decades. He specializes in frontend development and machine learning. He is the author of many video courses and textbooks.
Bootcamp | Virtual | Machine Learning
The first half of the session will include a discussion of both the theory and the implementation of Supervised modeling such as: Linear Regression, Logistic Regression, Decision Trees, Random Forest and XGBoost. We will then shift our focus to Unsupervised models including Clustering and Topic Modeling algorithms. This session will be interactive such that students will be able to develop their own models within the scope of the workshop…more details
Julia Lintern currently works as an instructor for the Metis Data Science Flex Program. Previously, she worked as a Data Scientist for the New York Times. Julia began her career as a structures engineer designing repairs for damaged aircraft. Julia holds an MA in applied math from Hunter College, where she focused on visualizations of various numerical methods and discovered a deep appreciation for the combination of mathematics and visualizations. During certain seasons of her career, she has also worked on creative side projects such as Lia Lintern, her own fashion label.
Bootcamp | Virtual | All Focus Areas
The goal of this session is to show you that you can start learning the math needed for machine learning and data science using code. You’ll learn about scalars, vectors, matrices and tensors, and see how to use linear algebra on your data. Don’t worry if you don’t have a math background, we’ll explain the mathematical notations and conventions. At the end of the session, you’ll know how to operate on vectors, matrices and tensors, use the norm of vectors, and apply the dot product to vectors…more details
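As a small taste of the operations named in this abstract, here is a minimal sketch in plain Python. It is not taken from the session materials; the helper names `dot` and `norm` and the example vectors are illustrative assumptions.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors (illustrative helper)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    """Euclidean (L2) norm: square root of a vector's dot product with itself."""
    return math.sqrt(dot(v, v))

# Example vectors chosen only for illustration.
u = [3.0, 4.0]
v = [1.0, 2.0]

print(dot(u, v))  # 3*1 + 4*2 = 11.0
print(norm(u))    # sqrt(9 + 16) = 5.0
```

Libraries such as NumPy provide the same operations on arrays; the pure-Python version is used here only to keep the arithmetic visible.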
Hadrien Jean is a machine learning scientist working at My Medical Assistent where he is developing deep learning models in the medical domain. He wrote the book Essential Math for Data Science (https://www.essentialmathfordatascience.com/) aimed at helping people to get the math needed in data science from a coding perspective. He previously worked at Ava on speech diarization. He also worked on a bird detection project using deep learning. He completed his Ph.D. in cognitive science at the École Normale Supérieure (Paris, France) on the topic of auditory perceptual learning with a behavioral and electrophysiological approach. He has published a series of blog articles aiming at building intuition on mathematics through code and visualization (https://hadrienj.github.io/posts/).
Tutorial | Virtual | ML for Finance | Intermediate
This session demonstrates the application of reinforcement learning to create a financial model-free solution to the asset allocation problem, learning to solve the problem using time series and deep neural networks. We demonstrate this on daily data for the top 24 stocks in the US equities universe with daily rebalancing. We use a deep reinforcement model on US stocks using different deep learning architectures. We use Long Short Term Memory networks, Convolutional Neural Networks, and Recurrent Neural Networks and compare them with more traditional portfolio management approaches…more details
Sonam Srivastava is the founder of Wright Research, an India-based Robo-advisor, where she creates data-driven portfolios out of her deep passion for quant finance.
Wright Research is a wealth creator in the digital space that uses scientific data-driven methods to tactically extract opportunities across assets in the public markets to grow clients’ wealth. Wright functions as a SEBI-registered robo-advisor and is among the most popular advisors among millennial investors, with more than 30,000 clients and 125 crore+ in assets. Wright Research has delivered a 90%+ outperformance over the index in the last 2.5 years.
She has 10+ years of experience in investment research and portfolio management, working on systematic strategies, long-short strategies, and algorithmic trading. She started her career in the field with Mumbai-based Forefront Capital, which got acquired by Edelweiss. At Edelweiss, she worked as an algorithm designer at Edelweiss’s institutional equity broking desk. After that, she worked at HSBC Europe as a quant building factor-driven portfolio solutions. Before starting Wright Research, she also worked at Qplum, doing portfolio management at the artificial intelligence-driven Robo-advisor.
She graduated from IIT Kanpur and has a master’s in financial engineering from WorldQuant University. She is a globally recognized researcher and works as visiting faculty at the AI in Finance Institute, New York, and BSE Institute Limited.
Workshop | Virtual | NLP | Intermediate
This talk will highlight the general concepts and ways of implementing the language model DistilBERT, and of using transfer learning on the base model to build an efficient question-answering model. Using the available open-source platforms in this way can deliver better business outputs as well as a better environment, because training a single AI model can contribute as much carbon emissions as five cars over their lifetimes. A basic understanding of Python is desirable. Code can be made available via GitHub for everyone to examine after the talk…more details
Jayeeta is a Senior Data Scientist with 6+ years of industry experience. She received her MS in Quantitative Methods and Modeling from NY, and a BS in Economics and Statistics. Currently, Jayeeta works at Fitch Ratings, a global leader in financial information services. Jayeeta is an avid NLP researcher who gets to explore a lot of state-of-the-art models to build cool products, and she firmly believes that data, of all forms, is the best storyteller. She has also led multiple NLP workshops in association with Women Who Code and GitNation, among others. Jayeeta has also been invited to speak at the International Conference on Machine Learning (ICML 2020, 2022), ODSC East, MLConf EU, WomenTech Global Conference, and Data Summit Connect, among others. Jayeeta is passionate about promoting initiatives to inspire more women to take up STEM. Jayeeta lives in New York, loves to cook, and spends her summers hiking and traveling with her husband.
Tutorial | Virtual | Deep Learning | Machine Learning | NLP | Intermediate
In the tutorial, we plan to provide a comprehensive description of the main categories of incremental learning methods, e.g., based on distillation loss, growing the capacity of the network, introducing regularization constraints, or using autoencoders to capture knowledge from the initial training set, as well as recent advances in the context of self-supervised learning…more details
Karteek Alahari is a senior researcher (known as chargé de recherche in France, which is equivalent to a tenured associate professor) at Inria. He is based in the Thoth research team at the Inria Grenoble – Rhône-Alpes center. He was previously a postdoctoral fellow in the Inria WILLOW team at the Department of Computer Science in ENS (École Normale Supérieure), after completing his PhD in 2010 in the UK. His current research focuses on addressing the visual understanding problem in the context of large-scale datasets. In particular, he works on learning robust and effective visual representations, when only partially-supervised data is available. This includes frameworks such as incremental learning, weakly-supervised learning, adversarial training, etc. Dr. Alahari’s research has been funded by a Google research award, the French national research agency, and other industrial grants, including Facebook, NaverLabs Europe, Valeo.
Training | In-person | MLOps and Data Engineering | Machine Learning | Deep Learning | NLP | Intermediate
Machine learning is usually taught from tutorials using small, clean datasets put into data-frames and orchestrated with Jupyter notebooks; all done in one, in-memory, local environment. This is a fine style for presenting a new topic and teaching the main ideas, but unfortunately, these patterns are not conducive to the delivery of real production applications at scale…more details
Ryan Dawson is a technologist passionate about data. Ryan works with clients on large-scale data and AI initiatives, helping organizations get more value from data. His work includes strategies to productionize machine learning, organizing the way data is captured and shared, selecting the right data technologies and optimal team structures, as well as writing the code to make it happen. He has over 15 years of experience and, as well as writing many widely read articles about MLOps, software design, and delivery, is the author of the Thoughtworks Guide to Evaluating MLOps Platforms.
Meissane Chami serves ThoughtWorks, Inc. as a Senior ML Engineer, advising and developing innovative data science and machine learning solutions from proof of concept to production. She has gained expertise setting up innovation frameworks and conducting fast-cycle proofs of concept. Her primary areas of expertise are in Natural Language Processing, MLOps, DevOps, cloud computing, containerisation and Python. She holds an MSc degree in Machine Learning and Data Science from University College London School of Engineering.
Andy Symonds is a technologist passionate about using data science in new and interesting ways. With a background in academia before moving into consulting, he loves problem solving and experimentation. Andy works with clients to help them gain insights and drive business value, by developing proofs of concept and moving these solutions into production.
Workshop | In-person | Responsible AI | Beginner-Intermediate
In this workshop, I will walk you through some worked examples using open-source H2O (available in both Python and R). You will learn how to build models with good accuracy while keeping fairness and interpretability in mind. The worked examples can also be used as templates for you to try H2O with your own datasets…more details
Jo-fai (or Joe) has multiple roles (data scientist / evangelist / community manager) at H2O.ai. Since joining the company in 2016, Joe has delivered H2O talks/workshops in 40+ cities around Europe, US, and Asia. Nowadays, he is best known as the H2O #360Selfie guy. He is also the co-organiser of H2O’s EMEA meetup groups including London Artificial Intelligence & Deep Learning – one of the biggest data science communities in the world with more than 11,000 members (https://www.meetup.com/London-Artificial-Intelligence-Deep-Learning/).
Tutorial | In-Person | Deep Learning | Advanced
Sadly, Deep Learning-based generative models are often used to create visual deep fakes, but many use cases that unfortunately don’t make the headlines exist in diverse industries. In this talk the speaker is going to highlight the potential business value of generative models with reference to a real use case scenario in pharma manufacturing…more details
Guglielmo is a Biomedical Engineer and lifelong learner with an extensive background in Software Engineering and Data Science applied to different contexts, such as Biotech Manufacturing, Healthcare and DevOps, to mention just the latest. He is currently busy unlocking business value through Deep Learning projects, mostly in Computer Vision (though not restricted to this field).
He has been recognized as DataOps Champion at the Streamsets DataOps Summit 2019 and awarded as one of the Top 50 Tech Visionaries at the 2019 Dubai Intercon Conference.
He is also an international speaker and author of the following book:
Hands-on Deep Learning with Apache Spark @Packt https://www.packtpub.com/big-data-and-business-intelligence/hands-deep-learning-apache-spark
Training | Virtual | Machine Learning | Beginner
In this workshop we cover the basics of modern R. You’ll learn how to read data from a CSV using the readr package, manipulate data with dplyr and make compelling visualizations with ggplot2…more details
Tutorial | In-person | Machine Learning | Intermediate
Data preprocessing is an important part of data science, even more so when dealing with time series. Signal processing and the Fourier Transform can move your time series into the frequency domain, where events are easier to detect. Learn how to apply the Fourier Transform to build a classification solution on top of IoT time series data. In this tutorial, Corey will introduce the theory behind the Fourier Transform and demonstrate how to build a classification model in the frequency domain by using KNIME Analytics Platform…more details
Corey studied Mathematics at Michigan State University and works as a Data Scientist with KNIME where he focuses on Time Series Analysis, Forecasting, and Signal Analytics. He is the creator and instructor of the KNIME Time Series Analysis course, author of the e-book: Alteryx to KNIME, creator of the KNIME Time Series Analysis components, and Co-Author of the upcoming Codeless Time Series Analysis Book with Packt.
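As a rough illustration of the frequency-domain idea described in the tutorial abstract above, here is a minimal Python sketch. It is not the KNIME workflow used in the session: it assumes two synthetic signals, uses a naive discrete Fourier transform from the standard library, and the helper names `dft_magnitudes` and `dominant_bin` are hypothetical. The point is only that the strongest frequency bin can serve as a simple feature for telling signals apart.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitude of each DFT bin from 0 up to the Nyquist bin (naive O(n^2) DFT)."""
    n = len(signal)
    magnitudes = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        magnitudes.append(abs(coeff))
    return magnitudes


def dominant_bin(signal):
    """Index of the strongest frequency bin, ignoring the constant (bin 0) term."""
    magnitudes = dft_magnitudes(signal)
    return max(range(1, len(magnitudes)), key=lambda k: magnitudes[k])


# Synthetic stand-ins for two IoT sensor readings (assumed data, 64 samples each):
# one completes 3 cycles over the window, the other 12.
n = 64
slow = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
fast = [math.sin(2 * math.pi * 12 * t / n) for t in range(n)]

# The dominant bin is a one-number frequency-domain feature; a classifier
# could be trained on features like this instead of on the raw samples.
print(dominant_bin(slow), dominant_bin(fast))
```

In practice a fast Fourier transform and more than one feature per signal would be used; the naive transform here is chosen only to keep the sketch dependency-free.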
Workshop | In-person | Machine Learning | NLP | Intermediate
In the workshop, I will walk you through building a recommender system from learned embeddings in a fashion e-commerce platform. You will learn how to collect and represent the users’ interaction data, how to build a model to generate embeddings, and how to use approximate nearest neighbour algorithms to build a similarity model. You will also learn how to tackle the cold start problem and, finally, the techniques you can use to evaluate recommendation systems…more details
Seyed Saeid Masoumzadeh is a senior data scientist at Lyst, the world’s largest fashion search platform. He has extensive experience in researching and developing Machine Learning, Deep Learning and NLP solutions, and delivering them into production. Saeid is also the Co-founder of Cyra, a smart AI-based recruiting assistant, backed by “Entrepreneur First”, an international Talent Investor. Saeid received his master’s degree in artificial intelligence and his PhD in computer science from the University of Vienna. He has published several peer-reviewed papers in reputed international journals and conferences.
Workshop | Virtual | NLP | Machine Learning | All Levels
Weakly supervised approaches have gained popularity in the last two years, but there is still a significant amount of overhead in applying these methods to more complex NLP tasks. The performance of weakly supervised systems is contingent on both the quality and quantity of independent sources of weak signal: if a practitioner cannot come up with sufficient sources themselves, then weak supervision is largely impractical…more details
Shayan Mohanty is the CEO and Co-Founder of Watchful, a company that largely automates the process of creating labeled training data. He has spent over a decade leading data engineering teams at various companies including Facebook, where he served as lead for the stream processing team responsible for processing 100% of the ads metrics data for all FB products. He is also a Guest Scientist at Los Alamos National Laboratory and has given talks on topics ranging from Automata Theory to Machine Teaching.
Workshop | Virtual | Machine Learning | MLOps and Data Engineering
In this talk, Data Scientist Felipe Adachi will talk about different types of data distribution shifts in ML applications, such as covariate shift, label shift, and concept drift, and how these issues can affect your ML application. Furthermore, the speaker will discuss the challenges of enabling distribution shift detection in data in a lightweight and scalable manner by calculating approximate statistics for drift measurements. Finally, the speaker will walk through steps that data scientists and ML engineers can take in order to surface data distribution shift issues in a practical manner, rather than reacting to the impacts of performance degradation reported by their customers…more details
Felipe is a Data Scientist at WhyLabs. He is a core contributor to whylogs, an open-source data logging library, and focuses on writing technical content and expanding the whylogs library in order to make AI more accessible, robust, and responsible. Previously, Felipe was an AI Researcher at WEG, where he researched and deployed Natural Language Processing approaches to extract knowledge from textual information about electric machinery. He also holds a Master’s degree in Electronic Systems Engineering from UFSC (Universidade Federal de Santa Catarina), with research focused on developing and deploying fault detection strategies based on machine learning for unmanned underwater vehicles. Felipe has published a series of blog articles about MLOps, Monitoring, and Natural Language Processing in publications such as Towards Data Science, Analytics Vidhya, and Google Cloud Community.
Bernease Herman is a senior data scientist at WhyLabs, the AI Observability company, and a research scientist at the University of Washington eScience Institute. At WhyLabs, she is building model and data monitoring solutions using approximate statistics techniques. Earlier in her career, Bernease built ML-driven solutions for inventory planning at Amazon and conducted quantitative research at Morgan Stanley. Her academic research focuses on evaluation metrics and interpretable ML, with a specialty in synthetic data and societal implications. She has published work in top machine learning conferences and workshops such as NeurIPS, ICLR, and FAccT.
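To make the notion of a drift measurement from the abstract above a little more concrete, the following is a minimal Python sketch of one common choice, the two-sample Kolmogorov-Smirnov statistic, applied to a single numeric feature. It is an assumption-laden illustration, not the whylogs approach: it computes the statistic exactly on raw values rather than from approximate sketches, the data is synthetic, and the function name `ks_statistic` and the alert threshold are hypothetical.

```python
import bisect


def ks_statistic(reference, current):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for point in sorted(set(ref) | set(cur)):
        cdf_ref = bisect.bisect_right(ref, point) / len(ref)
        cdf_cur = bisect.bisect_right(cur, point) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap


# Synthetic feature values (assumed data): a training-time reference window
# and a production window whose distribution has shifted upwards.
reference = [float(v) for v in range(100)]
shifted = [v + 50.0 for v in reference]

DRIFT_THRESHOLD = 0.2  # hypothetical alerting threshold, to be tuned per feature

for name, window in [("unchanged", reference), ("shifted", shifted)]:
    score = ks_statistic(reference, window)
    print(name, round(score, 3), "drift" if score > DRIFT_THRESHOLD else "ok")
```

A statistic like this only detects changes in the input distribution (covariate shift); label shift and concept drift need the labels or model outputs as well.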
Tutorial | Virtual | NLP | Intermediate
This workshop includes all the tips and tricks to deploy a successful sentiment analysis model. We’ll start from word level semantics and dissect vocab words with spaCy. More specifically, we’ll classify vocabulary words with their syntactic categories. Next we’ll continue with statistical word semantics and play with word vectors. Finally we’ll move on to sentence level semantics, play with context dependent word vectors, then experiment with the mighty BERT…more details
Duygu Altinok is a senior NLP engineer with 12 years of experience in almost all areas of NLP including search engine technology, speech recognition, text analytics and conversational AI. She has authored several publications in the NLP area at conferences such as LREC and CLNLP. She also enjoys working on open-source projects and is a contributor to the spaCy library.
Duygu earned her undergraduate degree in Computer Engineering from METU, Ankara in 2010 and later earned her Master’s degree in Mathematics from Bilkent University, Ankara in 2012. She spent 2 years at the University of Bonn for her PhD studies. She is currently a senior engineer at Deepgram with a focus on conversational AI and speech technology.
Originally from Istanbul, Duygu currently resides in Berlin, DE with her cute dog Adele.
Tutorial | In-person | MLOps & Data Engineering | Machine Learning | Intermediate
In this talk, Gal (Senior Data Scientist, Fiverr) and Itai (CPO, Mona) discuss how Fiverr utilizes advanced tools, both home-grown and bought, to bridge the gap between data science and business, empower data scientists to understand the behavior of their models in production and make sure their AI solutions bring the value they’re expected to deliver…more details
With over 10 years of experience (Google, AI-focused startups) with big data and as the CPO and head of customer success at Mona, the leading AI monitoring intelligence company, Itai has a unique view of the AI industry. Working closely with data science and ML teams applying dozens of solutions in over 10 industries, Itai encounters a wide variety of business use-cases, organizational structures and cultures, and technologies used in today’s AI world.
Gal Naamani has been working as a data scientist for 4 years, the past 3 of them at Fiverr. As a Senior Data Scientist, Gal works closely with developers, analysts, product managers, and business owners on growth opportunities and new ideas, from research to production. Gal currently has leading roles in projects focused on search engine ranking, promoted ads, online bidding optimization, exploration-exploitation problems, monitoring, and more.
Tutorial | In-person | Deep Learning | Machine Learning | All Levels
In this session, we will present a brief introduction to causality in AI. We will start with fundamental statistics, introduce causal graphs, do-calculus, and general methods for causal inference. We will give an overview of methods that allow the discovery of causal structures from observational data, as well as several applications in business settings. This session will cover topics from the very basic (what is a correlation?) to more complex ideas such as causal representation learning. This is meant to serve as a general and practical overview of the fastest-growing research area in AI…more details
Andre joined causaLens from Goldman Sachs, where he was an executive director in the Model Risk Management group in Hong Kong and Frankfurt. Today he is working with industry-leading global organisations to apply cutting-edge Causal AI research in production-level solutions that empower individuals and teams to make better decisions. Andre received his PhD in theoretical physics from the University of Munich, where he studied the interplay between quantum mechanics and general relativity in black holes.
Tutorial | In-person | Machine Learning | Deep Learning | Advanced
In this short tutorial, we go through an example of writing our own estimator, test it against scikit-learn’s common tests, and see how it behaves inside a pipeline and a grid search. There have also been recent developments related to the general API of the estimators which require slight modifications by third-party developers. In this talk we cover these changes and point you to the activities to watch as well as some of the private utilities which you can use to improve your experience of developing an estimator…more details
He is a computer scientist / bioinformatician who has become a core developer of `scikit-learn` and `fairlearn`, and works as a Machine Learning Engineer at Hugging Face. He is also an organizer of PyData Berlin.
These days he mostly focuses on aspects of machine learning and tools which help with creating more ethical and fair decision-making systems. This has led him to work on `fairlearn`, and on aspects of `scikit-learn` which would help tools such as `fairlearn` work more fluently with the package. At Hugging Face, his focus is to enable the communities of these libraries to share their models more easily and be more open about their work.
Workshop | In-person | Machine Learning | Beginner-Intermediate
Tuning a model is a core element of a data scientist’s work. It is often very difficult, requiring both experience and expertise to do effectively. An important and integral part of the model tuning process is the feature selection process. This is because in many cases, the model itself is a ‘black box’, which makes it hard to understand features’ performance…more details
Ori Nakar is a principal cyber-security researcher, a data engineer and a data scientist in the Imperva Threat Research group. Ori has many years of experience as a software engineer and engineering manager, focused on cloud technologies and big data infrastructure. In the Threat Research group, Ori is responsible for the data infrastructure and is involved in analytics, machine learning and innovation projects.
Tutorial | Virtual | Machine Learning | Intermediate
Advances in information extraction have enabled the automatic construction of large knowledge graphs (KGs) like DBpedia, YAGO, Wikidata or Google Knowledge Graph. Learning rules from KGs is a crucial task for KG completion, cleaning and curation. This tutorial presents state-of-the-art rule induction methods, recent advances, research opportunities as well as open challenges along this avenue…more details
Daria Stepanova is a lead research scientist at Bosch Center for Artificial Intelligence. Her research interests include knowledge representation and reasoning, machine learning and neuro-symbolic AI. Previously Daria was a senior researcher at the Max Planck Institute for Informatics (Germany), where she headed a group on semantic data. Daria received her PhD in Computational Logic from Vienna University of Technology (Austria) in 2015. Before starting her PhD she worked as a visiting researcher at the School of Computing Science at Newcastle University (UK) on an industrially-oriented project.
Training | In-person | MLOps and Data Engineering | Machine Learning | Deep Learning | NLP | Intermediate
Data science teams differ from traditional software groups as the former often simultaneously tackle multiple short-lived projects. It’s not uncommon for a small group to have a few days to construct a dashboard. We have to balance the business need with long term reproducibility requirements. This workshop takes you through strategies that you can adopt…more details
Dr Colin Gillespie is the Co-Founder and CTO of Jumping Rivers, a data science consultancy that specialises in all things R and Python. He is also a Senior Statistics lecturer at Newcastle University, has published over eighty peer-reviewed papers, and co-authored the O’Reilly book, Efficient R programming.
Training | In-person | Machine Learning for Finance | Intermediate
This half-day training session covers the most important Python topics and skills needed to apply AI and Machine Learning (ML) to Algorithmic Trading. The session shows how to make use of the Oanda trading API (via a demo account) to retrieve data, to stream data, to place orders, etc. Building on this, an ML-based trading strategy is formulated and backtested. Finally, the trading strategy is transformed into an online trading algorithm and is deployed for real-time trading on the Oanda trading platform…more details
Dr. Yves J. Hilpisch is founder and CEO of The Python Quants (http://tpq.io), a group focusing on the use of open source technologies for financial data science, artificial intelligence, algorithmic trading, and computational finance. He is also founder and CEO of The AI Machine (http://aimachine.io), a company focused on AI-powered algorithmic trading based on a proprietary strategy execution platform.
Yves has a Diploma in Business Administration, a Ph.D. in Mathematical Finance and is Adjunct Professor for Computational Finance at Miami Herbert Business School.
Workshop | In-person | Responsible AI | Machine Learning | Beginner-Intermediate
Human In The Loop (HITL) is a process in which, as part of the ML workflow, experts are asked their opinion about predictions made by an ML model in order to tune and improve the model. In this talk we’ll explain how we collaborated with and integrated engineers as a core part of our machine learning process, in order to create a mechanism to automatically predict the best security policies for our customers. We’ll go through the different stages of the project, discuss the challenges we faced along the way and how we overcame them, and show how you can use a similar process for any heuristic/ML project you have…more details
Adam is an experienced Data Scientist at Imperva’s threat research group where he works on creating machine learning algorithms to help protect Imperva’s customers against database attacks. Before joining Imperva, he obtained a PhD in Neuroscience from Ben-Gurion University of the Negev.
Workshop | In-person | Deep Learning | MLOps and Data Engineering | Intermediate
This live coding session introduces the ideas behind autodiff and teaches its fundamentals by walking you through a simple example of implementing autodiff using the core Python programming language features, without PyTorch. In the process, you will gain a deeper understanding of the PyTorch autodiff functionality and develop the knowledge that will help you troubleshoot PyTorch model training (for example, using Horovod) in your projects. You will see that while autodiff can be straightforward, it scales to complex applications of the calculus chain rule…more details
Carl implemented his first neural net in 2000. He is a senior director of the AI / ML practice at Cognizant, focusing on communications, technology, and media customers. Previously he worked on deep learning and machine learning at Google and IBM. Carl is an author of over 20 articles in professional, trade, and academic journals, an inventor with 6 patents at USPTO, and holds 3 corporate awards from IBM for his innovative work. His machine learning book, “MLOps Engineering at Scale” continues to receive reader acclaim. You can find out more about Carl from his blog www.cloudswithcarl.com
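For readers curious what "autodiff using the core Python programming language features" can look like, here is a minimal sketch of reverse-mode automatic differentiation for scalars. It is not the code from the live session and does not mirror PyTorch internals; the `Value` class and its attributes are hypothetical names, and only addition and multiplication are supported. Each operation records its inputs and local derivatives, and `backward` applies the chain rule from the output back to the inputs.

```python
class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # Values this one was computed from
        self._local_grads = local_grads  # d(self)/d(parent) for each parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    # Addition and multiplication are commutative, so the reflected
    # versions (e.g. 2 * value) can reuse the same methods.
    __radd__ = __add__
    __rmul__ = __mul__

    def backward(self):
        """Accumulate d(self)/d(node) into node.grad for every node in the graph."""
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        # Walk from the output towards the inputs, applying the chain rule.
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad


# A tiny "model": y = w * x + b, with loss = y * y.
x, w, b = Value(3.0), Value(2.0), Value(1.0)
y = w * x + b
loss = y * y
loss.backward()
print(loss.data, w.grad, x.grad, b.grad)
```

With these inputs the loss is 49 and the gradients are 42 for `w`, 28 for `x` and 14 for `b`, which matches the hand-derived values 2y·x, 2y·w and 2y for y = 7.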
Workshop | Virtual | Machine Learning | Deep Learning
Time-series processing is highly important in many domains and sets of problems. However, preprocessing and machine learning models for time-series come with their own unique challenges. We’ll go through a time-series dataset and look at opportunities for optimisation. After the tutorial, you should be more comfortable with time-series datasets and know about a few modelling approaches…more details
Ben Auffarth is the head of data science at loveholidays. With a background and Ph.D. in computational and cognitive neuroscience, he has designed and conducted wet lab experiments on cell cultures, analysed experiments with terabytes of data, and run brain models on IBM supercomputers with up to 64k cores. More recently, he’s built production systems processing hundreds of thousands of transactions per day, and trained neural networks on millions of text documents. He’s authored two books on machine learning. When he’s not at work, you might find him on a playground with his young son in West London. He co-founded and is the former president of Data Science Speakers, London.
Workshop | In-person | MLOps and Data Engineering | Machine Learning | Deep Learning | NLP | Beginner-Intermediate
By completing this workshop, you will develop an understanding of the core principles of causal inference and how it differs from AI. You will see how in this context often data is not enough, and incorporating business knowledge and understanding is key. You will also become familiar with tools to implement causal inference in your own decision making solutions…more details
Alice Grout-Smith is a Data Scientist at Jaguar Land Rover’s Corporate Analytics Centre of Excellence. She graduated from the University of Oxford with a degree in Chemistry and a Masters in the field of Quantum Mechanics, which was later published. She became one of the first members of JLR’s Analytics team and has recently completed the Analytics Graduate Scheme. She has gained expertise in time series modelling and is interested in the challenges raised by scarce data. Her work focuses on developing forecasting models, from flat linear to hierarchical Bayesian models. She is excited to be speaking at the ODSC, having been inspired by it at the start of her career.
Jamie Hilton is a Senior Data Scientist at Jaguar Land Rover (JLR) with over 5 years of experience realising business value through data insights. At JLR his work focuses on driving digital transformation with data, helping the business to make the right decisions at the right time. Previously, he led advanced analytics initiatives as Head of Customer Science at Manchester-based e-commerce business THG. He holds an MA in Mathematics from the University of Cambridge.
Jamie is particularly passionate about the application of data science to the automotive and motorsport industries. In 2021, he worked with leading Formula 2 team Virtuosi Racing to deliver a competitive advantage by leveraging their data, having studied Advanced Motorsport Engineering at Cranfield University.
Workshop | In-person | NLP | Intermediate
Natural language generation is one of the key areas of Natural Language Processing, with a range of applications such as dialogue generation, question-answering, machine translation, summarisation, etc. Most recently, controlled text generation techniques have been actively applied for data augmentation purposes in the general NLP domain, which is notorious for its data sparsity issue. This makes this task the principal tool in the toolkit of any Data Science or AI practitioner. The current state-of-the-art in language generation predominantly uses pre-trained Transformer-based language models. Despite the progress of these powerful models, the task of controlling text generation remains a challenge and mainly relies on best practices…more details
Julia Ive is a Lecturer in Natural Language Processing at Queen Mary University of London, UK. She is the author of many mono- and multimodal text generation approaches in Machine Translation and Summarisation. Currently, she is working on the theoretical aspects of style preservation and privacy-safety in artificial text generation.
Talk | In-person | Responsible AI | Intermediate
In this talk, data scientist Prathiba Krishna will delve into the implications of deploying AI solutions and how to instil conscious considerations so that those solutions are ethical. The speaker will also talk about the importance of interpretability and explainability of models within the Analytics lifecycle and shed light on the possible actions that organisations may want to consider while implementing AI responsibly. The talk will be structured around the benefits and pitfalls of this journey…more details
Prathiba is an experienced Data Scientist with a rich background in the Insurance industry. With a Master’s degree in Operational Research with Applied Statistics and Risk, her passion takes form through seeing the varying applications of Machine Learning and AI techniques, and how they propel data scientists to build better models and solutions. Skilled in data analysis and modelling, she utilizes SAS software and Open Source to assess and address problems within enterprise organizations.
Keynote | Virtual | Machine Learning | Deep Learning | All Levels
The message-passing paradigm has been the “battle horse” of deep learning on graphs for several years, making graph neural networks a big success in a wide range of applications, from particle physics to protein design. From a theoretical viewpoint, it established the link to the Weisfeiler-Lehman hierarchy, allowing the expressive power of GNNs to be analysed. I argue that the very “node-and-edge”-centric mindset of current graph deep learning schemes may hinder future progress in the field. As an alternative, I propose physics-inspired “continuous” learning models that open up a new trove of tools from the fields of differential geometry, algebraic topology, and differential equations so far largely unexplored in graph ML…more details
Michael Bronstein is the DeepMind Professor of AI at the University of Oxford and Head of Graph Learning Research at Twitter. He was previously a professor at Imperial College London and held visiting appointments at Stanford, MIT, and Harvard, and has also been affiliated with three Institutes for Advanced Study (at TUM as a Rudolf Diesel Fellow (2017-2019), at Harvard as a Radcliffe fellow (2017-2018), and at Princeton as a short-time scholar (2020)). Michael received his PhD from the Technion in 2007. He is the recipient of the Royal Society Wolfson Research Merit Award, Royal Academy of Engineering Silver Medal, five ERC grants, two Google Faculty Research Awards, and two Amazon AWS ML Research Awards. He is a Member of the Academia Europaea, Fellow of IEEE, IAPR, BCS, and ELLIS, ACM Distinguished Speaker, and World Economic Forum Young Scientist. In addition to his academic career, Michael is a serial entrepreneur and founder of multiple startup companies, including Novafora, Invision (acquired by Intel in 2012), Videocites, and Fabula AI (acquired by Twitter in 2019).
Talk | In-person | Machine Learning | Deep Learning | All Levels
In this talk I will explain how Bayesian modeling addresses these issues by (i) incorporating expert knowledge of the structure as well as of plausible parameter ranges; (ii) connecting multiple different data sets to increase circumstantial evidence of latent user features; and (iii) providing principled quantification of uncertainty to increase robustness of model fits and interpretation of the results. Inspired by real-world problems we encountered at PyMC Labs, we will look at Media Mix Models for marketing attribution and Customer Lifetime Value models and various hybrids between them…more details
Dr. Thomas Wiecki is an author of PyMC, the leading platform for statistical data science. To help businesses solve some of their trickiest data science problems, he assembled some of the best Bayesian modelers out there and founded PyMC Labs — the Bayesian consultancy. He did his PhD at Brown University. Website link: https://www.pymc-labs.io
Talk | In-person | Machine Learning | Deep Learning | All Levels
In this session, communication consultant and data visualization specialist Alan Rutter will break down common challenges that data teams face in presenting their work to others, and suggest practical solutions – which are often rooted in people and culture, rather than data and code. It is aimed at any data practitioner who is frequently presenting insights to other stakeholders…more details
Alan Rutter is the founder of consultancy Fire Plus Algebra, and is a specialist in communicating complex subjects through data visualisation, writing and design. He has worked as a journalist, product owner and trainer for brands and organisations including Guardian Masterclasses, WIRED, Time Out, the Home Office, the Biotechnology and Biological Sciences Research Council and Liverpool School of Tropical Medicine.
Track Keynote | In-person | MLOps and Data Engineering | All Levels
In this session, we will describe the challenges in operationalizing machine & deep learning. We’ll explain the production-first approach to MLOps pipelines – using a modular strategy, where the different components provide a continuous, automated, and far simpler way to move from research and development to scalable production pipelines, without the need to refactor code, add glue logic, and spend significant efforts on data and ML engineering…more details
Yaron Haviv is a serial entrepreneur who has been applying his deep technological experience in data, cloud, AI and networking to leading startups and enterprise companies since the late 1990s. As the co-founder and CTO of Iguazio, Yaron drives the strategy for the company’s MLOps platform and led the shift towards the production-first approach to data science and catering to real-time AI use cases. He also initiated and built Nuclio, a leading open source serverless platform with over 4,000 GitHub stars, and MLRun, Iguazio’s open source MLOps orchestration framework. Prior to co-founding Iguazio in 2014, Yaron was the Vice President of Datacenter Solutions at Mellanox (now NVIDIA), where he led technology innovation, software development and solution integrations. He was also the CTO and Vice President of R&D at Voltaire, a high-performance computing, IO and networking company which floated on the NYSE in 2007. Yaron is an active contributor to the CNCF Working Group and was one of the foundation’s first members. He presents at major industry events and writes tech content for leading publications including TheNewStack, Hackernoon, DZone, Towards Data Science and more.
Demo Talk | In-person | MLOps & Data Engineering | All Levels
What does it take to get the best model into production? We’ve seen industry-leading ML teams follow some of the same common workflows for dataset management, experimentation, and model management. I’ll share case studies from customers across industries, outline best practices, and dive into tools and solutions for common pain points…more details
Allan’s background covers a broad technology stack in infrastructure and cloud, working across a variety of roles in large enterprises before moving into Data Science and ML in recent years. His last role was working on time series forecasting at a fintech scale-up before joining Weights and Biases as the first member of the Customer Success team in EMEA.
Demo Talk | Virtual | MLOps and Data Engineering | All Levels
In this session, you will learn how to run machine learning workloads with a seamless Azure Machine Learning experience anywhere, including on-premises, in multi-cloud environments, and at the edge. Use any Kubernetes cluster and extend machine learning to run MLOps, model training, real-time inference or batch inference. You can manage all the resources through a single pane with the management, consistency, and reliability…more details
Doris Zhong is a Product Manager in the Azure AI Platform organization at Microsoft, focusing on machine learning in the hybrid cloud. She loves to communicate with customers to gain deep insights and help solve real problems. Early in her career, she worked on building Microsoft’s internal GPU training platform, which managed tens of thousands of GPUs and served thousands of users.
Talk | In-person | MLOps and Data Engineering | Intermediate
In this talk we will deep dive into 3 Enterprise case studies where leading organizations have built automated machine / deep learning pipelines, generating real business value from AI: (1) serving real-time recommendations for retail; (2) scaling NLP pipelines to make thousands of PDFs searchable and indexable for the organization; and (3) deploying 40+ data products at a large airline group to tackle fraud, optimize flight routes to reduce CO2 emissions and improve pilot training. We’ll cover the organizational and technological aspects to consider when building up your MLOps capabilities and practical tips for success…more details
Adi Hirschtein brings 20 years of experience as an executive, product manager and entrepreneur building and driving innovation in technology companies. As the VP of Product at Iguazio, the MLOps platform built for production and real-time use cases, he leads the product roadmap and strategy. His previous roles spanned technology companies such as Dell EMC, Zettapoint and InfraGate, in diverse positions including product management, business development, marketing, sales and execution, with a strong focus on machine learning, database and storage technology. When working with startups and corporates, Adi’s passion lies in taking a team’s ideas from their very first day, through a successful market penetration, all the way to an established business. Adi holds a B.A. in Business Administration and Information Technology from the College of Management Academic Studies.
Talk | Virtual | Responsible AI and Social Good | Intermediate
In this talk, Nuria will describe the work that she did between March 2020 and April 2022, leading a multi-disciplinary team of 20+ volunteer scientists working very closely with the Presidency of the Valencian Government in Spain on 4 large areas: human mobility modeling; computational epidemiological models (both metapopulation, individual and LSTM-based models); predictive models; and a large-scale, online citizen surveys called the COVID19impactsurvey (https://covid19impactsurvey.org) with over 720,000 answers worldwide. This survey has enabled us to shed light on the impact that the pandemic is having on people’s lives…more details
Nuria Oliver is the Commissioner to the President of the Valencian Government on AI Strategy and Data Science against COVID-19; Cofounder and Vice President of ELLIS; Cofounder of the ELLIS Alicante Unit Foundation; Chief Data Scientist at Data-Pop Alliance.
Nuria earned her PhD from MIT; is a Fellow of the IEEE and an ACM Fellow; and is a member of the Spanish Royal Academy of Engineering, SIGCHI Academy and Academia Europaea. She has 25+ years of research experience in human-centric AI and is the author of 160+ widely cited scientific articles as well as an inventor of 40+ patents and a public speaker. Her work is regularly featured in the media and has received numerous recognitions, including the Spanish National Computer Science Award (Angela Ruiz Robles category), the MIT TR100 (today TR35) Young Innovator Award (first Spanish scientist to receive this award); 2019 Data Scientist of the Year in Europe; 2020 Data Scientist of the Year by ESRI. She has recently co-led ValenciaIA4COVID, the winning team of the 500k XPRIZE Pandemic Response Challenge.
Talk | Virtual | Responsible AI | Machine Learning | Beginner
This talk will introduce the audience to challenges in AI for health equity with a particular focus on race and ethnicity data. We will explore real-world ethnicity data collected routinely in healthcare settings in the form of electronic health records. We will examine issues with completeness, correctness, and granularity of these data, implications for healthcare AI, and finally highlight opportunities towards “better data, better models, better healthcare”…more details
Sara is a Senior Research Associate in Biomedical Data Science and University Research Lecturer at the University of Oxford, where she is the Machine Learning Lead in the Centre for Statistics in Medicine. She has 12 years of experience in machine learning, signal processing, and intelligent remote monitoring research, with applications in biomedical and planetary health informatics. Sara has served on the NASA Frontier Development Lab Artificial Intelligence Panel and the NASA Climate Challenge Big Think. She is a National Geographic Society Explorer in Tracking Plastic Pollution with Remote Monitoring and Machine Learning. Sara is also a University of Oxford Ambassador for Women in Data Science.
Talk | Virtual | Machine Learning | All Levels
Z by HP is bringing exceptional technology (both hardware and software) to data scientists, analysts, and creatives that is built for the demands of heavy data processing workloads, right out of the box. With Z laptops, desktops, or rack-mounted data science computers, you get high-performance computers that complement your cloud infrastructure, reduce latency of heavy workloads, make collaboration more efficient, and secure your data end to end. Critical data science software is pre-loaded on the machine (TensorFlow, PyTorch, Python, and Docker, to name a few) with your choice of Linux or Windows Subsystem for Linux 2 (WSL 2) as the operating system. This ensures each program works seamlessly with the right version of your OS and other supporting software so you are productive on day one…more details
Brad leads business development in North America in Data Science for all commercial and enterprise accounts. He brings a unique background in SaaS sales, technology, and leadership to HP’s Advanced Compute Solutions organization.
Hunter Kempf is a Data Scientist working in the cybersecurity industry and a Z by HP Global Data Science Ambassador. In his free time he works on various side projects relating to Data Science, and some of those projects end up as articles for his Medium blog. Previously Hunter worked as a Data Scientist at AT&T on preventing fraud and security incidents. He graduated from the Georgia Institute of Technology (Georgia Tech) with a master’s in Cybersecurity and from the University of Notre Dame with a master’s in Applied and Computational Mathematics and Statistics.
Talk | In-person | NLP | Machine Learning | Beginner-Intermediate
This talk provides a gentle and highly visual overview of some of the main intuitions and real-world applications of large language models. It assumes no prior knowledge of language processing and aims to bring attendees up to date with the fundamental intuitions and applications of large language models…more details
Through his popular machine learning blog, Jay Alammar has helped millions of engineers visually understand machine learning tools and concepts from the basic (ending up in NumPy, pandas docs) to the cutting-edge (The Illustrated Transformer, BERT, GPT-3).
Demo Talk | In-person | MLOps and Data Engineering | Intermediate
In this session, Cloudera will demonstrate how an AMP can be used for structural time series analysis. An AutoML approach will be employed to forecast future cryptocurrency prices. To make the application easy to use, a web-based RESTful endpoint will be exposed to retrieve model predictions…more details
Ade Adewunmi is responsible for Machine Learning Services at Cloudera Fast Forward. She spends her days advising clients on the data-enabled transformation of their organisations with a particular focus on the systematic integration of machine learning into their business operations.
Prior to joining Cloudera Fast Forward Labs, Ade worked as a consultant, advising organizations on the development and delivery of their data strategies. Before that, she led the UK-based Government Digital Service’s Data Infrastructure programme.
Outside of work, Ade’s interests in the application and impact of data are broader – beyond the boundaries of corporate organisations; she volunteers with civil society organisations such as Datakind UK, mySociety and Foxglove Legal.
She blogs about the ways in which data can be made useful for organisations and wider society as well as the leadership and organisational cultures that make this possible. When she’s not advising, blogging or speaking about these things, she’s almost certainly watching too much TV and justifying it on the grounds of maintaining cultural relevance (as if any justification were needed!).
Talk | In-person | Machine Learning | Deep Learning | Beginner-Intermediate
In this talk, I will describe the steps involved in the Explainability by design methodology, and I will explain how the methodology can be applied to an illustrative autonomous system, which makes financial decisions about customers. I will also discuss the methodology qualitatively and quantitatively…more details
Luc Moreau is a Professor of Computer Science and Head of the Department of Informatics at King’s College London. Before joining King’s, Luc was Head of the Web and Internet Science group in the Department of Electronics and Computer Science at the University of Southampton.
Luc was co-chair of the W3C Provenance Working Group, which resulted in four W3C Recommendations and nine W3C Notes, specifying PROV, a conceptual data model for provenance on the Web, and its serializations in various Web languages. Previously, he initiated the successful Provenance Challenge series, which saw the involvement of over 20 institutions investigating provenance inter-operability in 3 successive challenges, and which resulted in the specification of the community Open Provenance Model (OPM). Before that, he led the development of provenance technology in the FP6 Provenance project and the Provenance Aware Service Oriented Architecture (PASOA) project. [For further details, please see https://nms.kcl.ac.uk/luc.moreau/about.html]
He is on the editorial board of “PeerJ Computer Science” and previously he was editor-in-chief of the journal “Concurrency and Computation: Practice and Experience” and on the editorial board of “ACM Transactions on Internet Technology”.
Demo Talk | In-person | Machine Learning | All Levels
WSL 2 (Windows Subsystem for Linux 2) is a layer for running Linux binary executables natively on Windows. What is WSL 2? How does it fit within your workflow? What is its value for data science? How do you set up your machine? How do you run your first code? This introductory session aims to answer these questions, introduce you to WSL 2, and get you started by configuring your machine and running your first code…more details
Akram Dweikat is a computer engineer and entrepreneur, specialized in machine learning & AI. He has been recognized by the UK Government as an Exceptional Talent in computer engineering, innovation, and entrepreneurship. Akram is currently the Engineering Manager for Deliveroo’s Network Economics (ML) team. Also, he is a global data science ambassador for Z by HP. He has been appointed as an AI Expert by the World Economic Forum, serving on their Global Future Council on Artificial Intelligence for Humanity. In his spare time, Akram helps build agricultural gardens for income and food security in his native Palestine. Earlier in his career, Akram helped establish the entrepreneurial community in Nablus and was one of eight youth selected to meet US President Barack Obama on his official visit to Palestine.
Demo Talk | Virtual | Machine Learning | All Levels
Graphs can represent almost any kind of data, from complex supply chains and medical research to customer 360 and fraud detection.
Implemented at production grade in the Neo4j Graph Data Science library, graph embeddings are an advanced AI technology used to translate your connected data (knowledge graphs, customer journeys, and transaction networks) into a predictive signal.
Applications of Graph Embeddings are numerous: finding fraud, entity resolution and disambiguation, improving product recommendations, discovering new drugs and predicting churn…more details
Talk | Virtual | Machine Learning | Deep Learning | NLP | Intermediate
Recent advances in machine learning systems have made it far easier to train ML models given a training set. However, this does not mean that the job of an MLDev or MLOps engineer is any easier. As we sail past the era in which the major goal of ML platforms is to support the building of models, we might have to think about our next-generation ML platforms as something that supports iteration on data. This is a challenging task, which requires us to take a holistic view of data quality, data management, and machine learning altogether. In this talk, I will discuss some of our thoughts in this space, illustrated by several recent results that we have obtained in data debugging and data cleaning for ML models to systematically enforce their quality and trustworthiness…more details
Ce is an Assistant Professor in Computer Science at ETH Zurich. The mission of his research is to make machine learning techniques widely accessible, while being cost-efficient and trustworthy, to everyone who wants to use them to make our world a better place. He believes in a systems approach to enabling this goal, and his current research focuses on building next-generation machine learning platforms and systems that are data-centric, human-centric, and declaratively scalable. Before joining ETH, Ce finished his PhD at the University of Wisconsin-Madison and spent another year as a postdoctoral researcher at Stanford, both advised by Christopher Ré. His work has received recognitions such as the SIGMOD Best Paper Award, SIGMOD Research Highlight Award, Google Focused Research Award, and an ERC Starting Grant, and has been featured and reported by Science, Nature, the Communications of the ACM, and various media outlets such as The Atlantic, WIRED, Quanta Magazine, etc.
Talk | Virtual | MLOps and Data Engineering
This session will talk about the awesome new features the community has built that were recently released in Apache Airflow 2.3…more details
Kaxil is currently working as the Director of the Airflow Engineering team at Astronomer. He is currently one of the top three committers to the Airflow project by number of commits and one of Airflow’s release managers. His most prominent work includes co-authoring DAG Serialization, Scheduler HA, and Secrets Backend.
He earned his Master’s in Data Science & Analytics from Royal Holloway, University of London. He started as a Data Scientist and then gained experience in the Data Engineering, Big Data and DevOps space. He began working on Airflow in 2017 while working at Data Reply as a Big Data consultant, became a PMC member in 2018, and now works full-time at astronomer.io making Airflow better for everyone. He is a huge cricket fan and his favourite cricketers are Rahul Dravid and Virat Kohli.
Talk | Virtual | Deep Learning | Machine Learning | Big Data Analytics | MLOps and Data Engineering | Beginner-Intermediate
Cloud-native applications. Multiple Cloud providers. Hybrid Cloud. 1000s of VMs and containers. Complex network policies. Millions of connections and requests in any given time window. This is the typical situation faced by a Security Operations Center (SOC) Analyst every single day. In this talk, the speaker talks about the highly available and highly scalable data pipelines that he built for the following use cases:
* Denial of Service: A device in the network stops working.
* Data Loss: An example is a rogue agent in the network transmitting IP data outside the network.
* Data Corruption: A device starts sending erroneous data…more details
Tuhin Sharma is a Senior Principal Data Scientist at Red Hat in the Corporate Development and Strategy group. Prior to that he worked at Hypersonix as an AI Architect. He also co-founded and has been CEO of Binaize, a website conversion intelligence product for e-commerce SMBs. He received a master’s degree from the Indian Institute of Technology Roorkee in Computer Science with a specialization in Data Mining, and a bachelor’s degree from the Indian Institute of Engineering Science and Technology Shibpur in Computer Science. He loves to code and collaborate on open source and research projects. He has 4 research papers and 5 patents in the field of AI and NLP. He is a reviewer for the IEEE MASS conference in the AI track. He writes deep learning articles for O’Reilly in collaboration with the AWS MXNet team. He loves to play TT and guitar in his leisure time. His favorite quote is “Life is Beautiful”.
Talk | In-person |