ODSC Europe 2022
Conference Schedule
more sessions added weekly
Workshop | In-person
Abstract Coming Soon!
Cindy Weng is a Senior Cloud Solution Architect at Microsoft in Data & AI. She specializes in architecting MLOps solutions for customers across a variety of industries including retail, financial services, consumer goods, and tech. She is one of the authors of the MLOps V2 unified accelerator by Microsoft.
Manu Kanwarpal is a Senior Specialist in the EMEA AI Global Black Belt team at Microsoft. He specialises in Azure Machine Learning and works with some of Microsoft’s largest customers on establishing their end to end processes for Data Science & MLOps. He is one of the authors of Microsoft’s unified MLOps accelerator called “MLOps v2”.
Workshop | In-person | Intermediate-Advanced
Recently, diffusion models have surpassed Generative Adversarial Networks (GANs) in both the quality and diversity of generated images. In this workshop, we will learn how to build a model for image generation. Then we will look at state-of-the-art models and applications such as the recently released DALL-E 2…more details
Bio Coming Soon!
Bootcamp | Virtual | Kickstarter | Beginner
In this workshop, you will get acquainted with the pandas library, which is the most widely used package for reading, analyzing and exporting datasets in Python. You will also learn how to visualize many kinds of tabular data using the plotnine package, along with some tips and tricks on how to make your visualizations stand out. Lastly, you will have the opportunity to make predictions and take decisions using data, based on basic statistical methods…more details
Leonidas (Leo) is a Senior Data Scientist at AstraZeneca. His work focuses on machine learning in oncology, including clinical and non-clinical applications. He is also enthusiastic about NLP applications in oncology and how they can be used to improve patient treatment. He is a workshop facilitator at the European Leadership University (ELU), NL, and has been a data science educator at DataCamp. He holds a PhD in bioinformatics and ML from the University of Warwick, UK, an MSc in statistics from Imperial College London, UK, and a BSc in Statistics and Insurance Science from the University of Piraeus, GR.
Half-Day Training | In-Person | NLP | Machine Learning | All Levels
Extracting knowledge from text data has always been one of the most researched topics in machine learning, but only recently have we witnessed breakthroughs that put NLP in the spotlight. Many pieces of information are stored in unstructured data, like text, which is extremely important in many different fields, from finance to social media and e-commerce.
In this course we will go through Natural Language Processing fundamentals, such as pre-processing techniques, tf-idf, embeddings, and more. This will be followed by practical coding examples in Python that show how to apply the theory to real use cases…more details
Leonardo De Marchi holds a Master’s in Artificial Intelligence and has worked as a Data Scientist in the sports world, with clients such as the New York Knicks and Manchester United, and with large social networks, like JustGiving. He now works as Head of Data Science and Analytics at Badoo, the largest dating site with over 420 million users. He is also the lead instructor at ideai.io, a company specialized in Reinforcement Learning, Deep Learning, and Machine Learning training.
Workshop | In-person | Machine Learning, Big Data Analytics, MLOps and Data Engineering | Beginner
We will work through various materials and examples to get you started with GPU development in Python using open source libraries…more details
Jacob Tomlinson is a senior Python software engineer at NVIDIA with a focus on deployment tooling for distributed systems. His work involves maintaining open source projects including RAPIDS and Dask. RAPIDS is a suite of GPU accelerated open source Python tools which mimic APIs from the PyData stack including those of NumPy, pandas and scikit-learn. Dask provides advanced parallelism for analytics with out-of-core computation, lazy evaluation and distributed execution of the PyData stack. He also tinkers with the open source chatbot automation framework Opsdroid in his spare time. Jacob volunteers with the local tech community group Tech Exeter and lives in Exeter, UK.
Tutorial | In-Person | Deep Learning | NLP
We’ll spend most of our time with Natural Language Processing and how to train models with NLP techniques. We’ll see how computers break down words into ‘embeddings’ – or higher dimensional vectors, whose ‘direction’ can be used to establish sentiment. With this, you’ll then see how a computer can begin to understand the ‘meaning’ of text – and we’ll see how to train AI models to detect things like whether a movie review was positive or not. The techniques can then be extended to text prediction – which leads to text generation – and you’ll learn how to create a simple AI model that creates its own text…more details
Laurence Moroney leads AI Advocacy at Google, working with the Google AI Research and product development teams. He’s the best-selling author of ‘AI and Machine Learning for Coders,’ as well as the instructor on the Fundamentals of TinyML course at HarvardX, and the popular TensorFlow specializations with deeplearning.ai and Coursera. He’s passionate about empowering software developers to succeed in Machine Learning, democratizing AI as a result. Laurence is based in Washington State in the USA.
Workshop | In-Person | Deep Learning | NLP | All Levels
Over the past few years speech synthesis or text-to-speech (TTS) has seen rapid advances thanks to deep learning. As anyone who owns a voice assistant will know, artificial voices are becoming more and more natural and convincing. The good news is you can recreate this impressive technology yourself, using high quality open-source tools. In this workshop, we’ll learn all about TTS and create a custom speech synthesis system from scratch. We’ll take a look at the development of TTS systems up to the present day, investigate the challenges that researchers are still grappling with, and walk through an end-to-end example of creating a deep learning-based TTS system – including data preparation, training, inference and evaluation. This workshop doesn’t require any prior knowledge of TTS or deep learning…more details
Alex Peattie is the co-founder and CTO of Peg, a technology platform helping multinational brands and agencies to find and work with top YouTubers. Peg is used by over 1500 organisations worldwide including Coca-Cola, L’Oreal and Google.
An experienced digital entrepreneur, Alex spent six years as a developer and consultant for the likes of Grubwithus, Huckberry, UNICEF and Nike, before joining coding bootcamp Makers Academy as senior coach, where he trained hundreds of junior developers. Alex was also a technical judge at this year’s TechCrunch Disrupt conference.
Workshop | In-person | Deep Learning | Machine Learning | Intermediate
This tutorial will introduce the glossary of uncertainty quantification relevant for deep learning and contextualise which aspects are most important in order to ease model deployment. While the identification of difficult samples for collaborative approaches between human and AI can be very successful in-domain, the reliable detection of out-of-distribution samples via uncertainty remains an active field of research. We’ll provide an overview of promising recent developments and end with building a neural network that knows when it does not know – in a simple setting and for illustrative purposes…more details
Christian Leibig is Director of Machine Learning at Vara, leading the development of methods from research to production. He obtained a Ph.D. in Neural Information Processing from the International Max Planck Research School in Tübingen and a diploma in physics from the University of Konstanz. Before joining Vara, he worked as a Postdoctoral Researcher at the University Clinics in Tübingen on the applicability of Bayesian Deep Learning and machine learning applications for the healthcare space for ZEISS and held research and internship positions with Max Planck, LMU Munich and the Natural and Medical Sciences Institute in Reutlingen. The method and software of his PhD work, an unsupervised solution for neural spike sorting from HDCMOS-MEA data, is distributed by Multichannel Systems (Harvard Bioscience). His work on applying and assessing uncertainty methods to large scale medical imaging was among the first in the field and was recognized with keynote speaker invitations. He enjoys all of theory, software engineering, and people management, in particular for applications that have a meaningful impact, such as diagnosing cancer early.
Workshop | In-person | Machine Learning | Deep Learning
Abstract Coming Soon!
Workshop | Virtual
Time-series processing is important in many domains and sets of problems; however, preprocessing and machine learning models for time series come with their own unique challenges. We’ll go through a time-series dataset and look at opportunities for optimisation. After the tutorial, you should be more comfortable with time-series datasets and know about a few modelling approaches…more details
Ben Auffarth is the head of data science at loveholidays. With a background and Ph.D. in computational and cognitive neuroscience, he has designed and conducted wet lab experiments on cell cultures, analysed experiments with terabytes of data, and run brain models on IBM supercomputers with up to 64k cores. More recently, he’s built production systems processing hundreds of thousands of transactions per day, and trained neural networks on millions of text documents. He’s authored two books on machine learning. When he’s not at work, you might find him on a playground with his young son in West London. He co-founded and is the former president of Data Science Speakers, London.
Half-Day Training | Virtual | NLP | Intermediate-Advanced
Abstract Coming Soon!
Dipanjan (DJ) Sarkar is a data science consultant and published author, and was recognized as a Google Developer Expert in Machine Learning by Google in 2019. He currently works as a lead data science consultant at Schaffhausen Institute of Technology Academy, Zurich. Dipanjan has led advanced analytics initiatives working with Fortune 500 companies like Intel, Applied Materials, Red Hat / IBM. He works on leveraging data science, machine learning and deep learning to build large-scale intelligent systems. Dipanjan also works as an independent consultant, mentor and AI advisor in his spare time collaborating with multiple universities, organizations and startups across the globe. His passion includes solving challenging data problems as well as educating and helping people upskill in all things data. Dipanjan has also been recognized as one of the top ten Data Scientists in India in 2020, 40 under 40 Data Scientists, 2021 and Top 50 AI Thought Leaders by Global AI Hub, Switzerland. In his spare time he loves reading, gaming, watching interesting documentaries and football. He is also a strong supporter of open-source and publishes his code and analyses from his books, articles and experience on GitHub at https://github.com/dipanjanS and LinkedIn at https://www.linkedin.com/in/dipanzan
Workshop | Virtual | Deep Learning | Beginner-Intermediate
The main goal of this session is to show you how GANs work: we will start with a simple example using synthetic data (not generated by GANs) to learn about latent spaces and how to use them to generate more synthetic data (using GANs to generate them). We will improve on the model’s architecture, incorporating convolutional layers (DCGAN), different loss functions (WGAN, WGAN-GP) and use them to generate synthetic images of flowers (the roses!)…more details
Daniel has been teaching machine learning and distributed computing technologies at Data Science Retreat, the longest-running Berlin-based bootcamp, for more than three years, helping more than 150 students advance their careers.
He writes regularly for Towards Data Science. His blog post “Understanding PyTorch with an example: a step-by-step tutorial” has reached more than 220,000 views since it was published.
The positive feedback from the readers motivated him to write the book Deep Learning with PyTorch Step-by-Step, which covers a broader range of topics.
Daniel is also the main contributor to two Python packages: HandySpark and DeepReplay.
His professional background includes 20 years of experience working for companies in several industries: banking, government, fintech, retail and mobility.
Tutorial | Virtual | Machine Learning
Abstract Coming Soon!
Gaël Varoquaux is a research director working on data science and health at Inria (the French national research institute for computer science). His research focuses on using data and machine learning for scientific inference, with applications to health and social science, as well as developing tools that make it easier for non-specialists to use machine learning. He has long applied it to brain-imaging data to understand cognition. Years before the NSA, he was hoping to make bleeding-edge data processing available across new fields, and he has been working on a mastermind plan building easy-to-use open-source software in Python. He is a core developer of scikit-learn, joblib, Mayavi and nilearn, a nominated member of the PSF, and often teaches scientific computing with Python using the scipy lecture notes.
Tutorial | Virtual | MLOps & Data Engineering | Machine Learning | Beginner-Intermediate
One of the key questions in modern data science and machine learning, for businesses and practitioners alike, is how to move machine learning projects from prototype and experiment to production as a repeatable process. In this workshop, we present an introduction to the landscape of production-grade tools, techniques, and workflows that bridge the gap between laptop data science and production ML workflows…more details
Hugo Bowne-Anderson is a data scientist, writer, educator & podcaster. His interests include promoting data & AI literacy/fluency, helping to spread data skills through organizations and society and doing amateur stand up comedy in NYC. He does many of these at DataCamp, a data science training company educating over 3 million learners worldwide through interactive courses on the use of Python, R, SQL, Git, Bash and Spreadsheets in a data science context. He has spearheaded the development of over 25 courses in DataCamp’s Python curriculum, impacting over 170,000 learners worldwide through his own courses. He hosts and produces the data science podcast DataFramed, in which he uses long-format interviews with working data scientists to delve into what actually happens in the space and what impact it can and does have. He earned a PhD in Mathematics from the University of New South Wales, Australia and has conducted biomedical research at the Max Planck Institute in Germany and Yale University, New Haven.
Workshop | Virtual | Deep Learning | Machine Learning | Beginner
Learn the basics of building a PyTorch model using a structured, incremental, first-principles approach. Find out why PyTorch is the fastest growing Deep Learning framework and how to make use of its capabilities: autograd, dynamic computation graph, model classes, data loaders and more. The main goal of this session is to show you how PyTorch works: we will start with a simple and familiar example in Numpy and “torch” it! At the end of it, you should be able to understand PyTorch’s key components and how to assemble them into a working model…more details
Daniel has been teaching machine learning and distributed computing technologies at Data Science Retreat, the longest-running Berlin-based bootcamp, for more than three years, helping more than 150 students advance their careers.
He writes regularly for Towards Data Science. His blog post “Understanding PyTorch with an example: a step-by-step tutorial” has reached more than 220,000 views since it was published.
The positive feedback from the readers motivated him to write the book Deep Learning with PyTorch Step-by-Step, which covers a broader range of topics.
Daniel is also the main contributor to two Python packages: HandySpark and DeepReplay.
His professional background includes 20 years of experience working for companies in several industries: banking, government, fintech, retail and mobility.
Workshop | In-person
Learn how to detect common objects in real-time using a TensorFlow.js pre-trained model in your browser to give your next web application superpowers (or to get more eyes on your research). Walk through an end-to-end creation of a smart camera application in this workshop and discover how you can benefit from the reach and scale of the web by performing inference on device in the browser…more details
Jason is the public face of TensorFlow.js, helping web engineers globally take their first steps with machine learning in JavaScript. He also combines his knowledge of the technical and creative worlds to develop innovative prototypes for Google’s largest customers and internal teams, with over 15 years of experience working within web engineering and investigating emerging technologies.
He holds an MEng in Computer Science, is a member of the British Computer Society, and is a certified information privacy technologist. Jason loves sharing knowledge online, which has attracted a global following. In his spare time he can be found walking the wings of flying aircraft, being one of the few people in the world who has been trained in the art of wing walking.
Workshop | In-Person | Deep Learning | Machine Learning | Intermediate
Neural networks are powerful approximators for any function. However, they do not have the slightest idea of common knowledge of the world, which often makes them fail miserably, especially when extrapolating to areas not covered by training data.
We, as human beings, have that knowledge about the world and our domains of expertise, and it can make deep learning models much more robust and even able to extrapolate. But how do we encode this?…more details
Oliver is a software developer and architect from Hamburg, Germany. He has been developing software with different approaches and programming languages for more than 3 decades. Lately, he has been focusing on Machine Learning and its interactions with humans.
Workshop | In-person | MLOps and Data Engineering
Come to this talk to learn how you can add real-time analytics capability to your data pipeline…more details
Karin currently leads developer community programming in the Developer Relations team at StarTree. Karin began her career in entertainment marketing, working with the likes of Eminem and Live Nation. She also launched a successful professional women’s network in two major cities in the U.S., organized events for her local Data Science meetup, and helped lead an ongoing hackathon to put machine learning in the hands of cancer biologists. Her journey working in data eventually led her to a position as Program Manager for Community Development for the leading graph database in the world, Neo4j. Most recently, she was brought on to StarTree to improve the adoption and success of the overall developer community.
Workshop | In-person
Abstract Coming Soon!
Simon is the Director of Engineering for Advancing Analytics, a Microsoft Data Platform MVP and one of the few Databricks Beacons globally. Simon has pioneered Lakehouse Architectures for some of the world’s largest companies, challenging traditional analytical solutions and pushing for the very best for the data industry. Simon runs the Advancing Spark YouTube channel, where he can often be found digging into Spark features, investigating new Microsoft technologies and cheering on the Delta Lake project.
Bootcamp | Virtual | Machine Learning
The first half of the session will include a discussion of both the theory and the implementation of Supervised modeling such as: Linear Regression, Logistic Regression, Decision Trees, Random Forest and XGBoost. We will then shift our focus to Unsupervised models including Clustering and Topic Modeling algorithms. This session will be interactive such that students will be able to develop their own models within the scope of the workshop…more details
Julia Lintern currently works as an instructor for the Metis Data Science Flex Program. Previously, she worked as a Data Scientist for the New York Times. Julia began her career as a structures engineer designing repairs for damaged aircraft. Julia holds an MA in applied math from Hunter College, where she focused on visualizations of various numerical methods and discovered a deep appreciation for the combination of mathematics and visualizations. During certain seasons of her career, she has also worked on creative side projects such as Lia Lintern, her own fashion label.
Bootcamp | Virtual
Abstract Coming Soon!
Hadrien Jean is a machine learning scientist working at My Medical Assistent where he is developing deep learning models in the medical domain. He wrote the book Essential Math for Data Science (https://www.essentialmathfordatascience.com/) aimed at helping people to get the math needed in data science from a coding perspective. He previously worked at Ava on speech diarization. He also worked on a bird detection project using deep learning. He completed his Ph.D. in cognitive science at the École Normale Supérieure (Paris, France) on the topic of auditory perceptual learning with a behavioral and electrophysiological approach. He has published a series of blog articles aiming at building intuition on mathematics through code and visualization (https://hadrienj.github.io/posts/).
Tutorial | Virtual | ML for Finance | Intermediate
This session demonstrates the application of reinforcement learning to create a financial model-free solution to the asset allocation problem, learning to solve the problem using time series and deep neural networks. We demonstrate this on daily data for the top 24 stocks in the US equities universe with daily rebalancing. We use a deep reinforcement model on US stocks using different deep learning architectures. We use Long Short Term Memory networks, Convolutional Neural Networks, and Recurrent Neural Networks and compare them with more traditional portfolio management approaches…more details
Sonam Srivastava is the founder of Wright Research, an India-based Robo-advisor, where she creates data-driven portfolios out of her deep passion for quant finance.
Wright Research is a wealth creator in the digital space that uses scientific data-driven methods to tactically extract opportunities across assets in the public markets to grow clients’ wealth. Wright functions as a SEBI-registered robo-advisor and is among the most popular advisors with millennial investors, with more than 30,000 clients and 125 crore+ in assets. Wright Research has delivered a 90%+ outperformance over the index in the last 2.5 years.
She has 10+ years of experience in investment research and portfolio management, working on systematic strategies, long-short strategies, and algorithmic trading. She started her career in the field with Mumbai-based Forefront Capital, which got acquired by Edelweiss. At Edelweiss, she worked as an algorithm designer at Edelweiss’s institutional equity broking desk. After that, she worked at HSBC Europe as a quant building factor-driven portfolio solutions. Before starting Wright Research, she also worked at Qplum, doing portfolio management at the artificial intelligence-driven Robo-advisor.
She graduated from IIT Kanpur and has a master’s in financial engineering from WorldQuant University. She is a globally recognized researcher and works as visiting faculty at the AI in Finance Institute, New York, and BSE Institute Limited.
Tutorial | In-person
In this short tutorial, we go through an example of writing our own estimator, testing it against scikit-learn’s common tests, and seeing how it behaves inside a pipeline and a grid search. There have also been recent developments related to the general API of the estimators which require slight modifications by third-party developers. In this talk we cover these changes and point you to the activities to watch as well as some of the private utilities which you can use to improve your experience of developing an estimator…more details
He is a computer scientist / bioinformatician who has become a core developer of `scikit-learn` and `fairlearn`, and works as a Machine Learning Engineer at Hugging Face. He is also an organizer of PyData Berlin.
These days he mostly focuses on aspects of machine learning and tools which help with creating more ethical and fair decision-making systems. This trend has influenced him to work on `fairlearn`, and on aspects of `scikit-learn` which would help tools such as `fairlearn` work more fluently with the package; at Hugging Face, his focus is to enable the communities of these libraries to share their models more easily and be more open about their work.
Tutorial | Virtual | Machine Learning | Intermediate-Advanced
Advances in information extraction have enabled the automatic construction of large knowledge graphs (KGs) like DBpedia, YAGO, Wikidata or Google Knowledge Graph. Learning rules from KGs is a crucial task for KG completion, cleaning and curation. This tutorial presents state-of-the-art rule induction methods, recent advances, research opportunities as well as open challenges along this avenue…more details
Daria Stepanova is a research scientist at Bosch Center for Artificial Intelligence. Her research interests include Knowledge Representation and Reasoning with a special focus on automatic acquisition of rules from structured knowledge. Previously Daria was a senior researcher at Max Planck Institute for Informatics (Germany), where she was heading a group on Semantic Data. Daria got her diploma degree in Applied Computer Science from the department of Mathematics and Mechanics of St. Petersburg State University (Russia) in 2010 and a PhD in Computational Logic from Vienna University of Technology (Austria) in 2015. Before starting her PhD she worked as a visiting researcher at the school of Computing Science at Newcastle University (UK) in an industrially-oriented project.
Workshop | Virtual
I will focus on the common use case of anomaly detection in a closed-loop Convolutional Neural Network (CNN) to demonstrate the benefits of adopting the Data Mesh paradigm across a multiplane data platform in Machine Learning operations. With this example we will learn how to make the leap from model experimentation to productisation while adhering to the common affordances of a data product such as observability, life-cycle management and discoverability…more details
Shawn is passionate about harnessing the power of data strategy, engineering and analytics in order to help businesses uncover new opportunities. As an innovative technologist with over 13 years’ experience, Shawn removes technology as a barrier, and broadens the art of the possible for business and product leaders. His holistic view of technology and emphasis on developing and motivating strong engineering talent, with a focus on delivering outcomes whilst minimising outputs, is one of the characteristics which sets him apart from the crowd.
Shawn’s deep technical knowledge includes distributed computing, cloud architecture, data science, machine learning and engineering analytics platforms. He has years of experience working as a consultant practitioner for a variety of prestigious clients ranging from secret clearance level government organizations to Fortune 500 companies.
Training | Virtual | Data Visualization | Data Engineering | Intermediate-Advanced
The human brain excels at finding patterns in visual representations, which is why data visualizations are essential to any analysis. Done right, they bridge the gap between those analyzing the data and those consuming the analysis. However, learning to create impactful, aesthetically-pleasing visualizations can often be challenging. This session will equip you with the skills to make customized visualizations for your data using Python…more details
Stefanie Molin is a data scientist and software engineer at Bloomberg in New York City, where she tackles tough problems in information security, particularly those revolving around anomaly detection, building tools for gathering data, and knowledge sharing. She is also the author of “Hands-On Data Analysis with Pandas,” which is currently in its second edition. She holds a bachelor’s of science degree in operations research from Columbia University’s Fu Foundation School of Engineering and Applied Science. She is currently pursuing a master’s degree in computer science, with a specialization in machine learning, from Georgia Tech. In her free time, she enjoys traveling the world, inventing new recipes, and learning new languages spoken among both people and computers.
Tutorial | In-Person | Deep Learning | Advanced
Sadly, Deep Learning-based generative models are often used to create visual deep fakes, but many use cases exist across diverse industries that unfortunately don’t make the headlines. In this talk the speaker is going to highlight the potential business value of generative models with reference to a real use case scenario in pharma manufacturing…more details
Guglielmo is a Biomedical Engineer with an extensive background in Software Engineering and Data Science applied to different contexts, such as Biotech Manufacturing, Healthcare and DevOps, just to mention the latest, and a lifelong learner.
He is currently busy unlocking business value through Deep Learning projects, mostly in Computer Vision (though not restricted to this field).
He has been recognized as DataOps Champion at the Streamsets DataOps Summit 2019 and awarded as one of the Top 50 Tech Visionaries at the 2019 Dubai Intercon Conference.
He is also an international speaker and author of the following book:
Hands-on Deep Learning with Apache Spark @Packt https://www.packtpub.com/big-data-and-business-intelligence/hands-deep-learning-apache-spark
Tutorial | In-person
In this talk, Gal (Senior Data Scientist, Fiverr) and Itai (CPO, Mona) discuss how Fiverr utilizes advanced tools, both home-grown and bought, to bridge the gap between data science and business, empower data scientists to understand the behavior of their models in production and make sure their AI solutions bring the value they’re expected to deliver…more details
With over 10 years of experience (Google, AI-focused startups) with big data and as the CPO and head of customer success at Mona, the leading AI monitoring intelligence company, Itai has a unique view of the AI industry. Working closely with data science and ML teams applying dozens of solutions in over 10 industries, Itai encounters a wide variety of business use-cases, organizational structures and cultures, and technologies used in today’s AI world.
Gal Naamani has been working as a data scientist for 4 years, with the past 3 years being at Fiverr. As the Senior Data Scientist, Gal works closely with developers, analysts, product managers, and business owners on growth opportunities and new ideas, from research to production. Gal currently has leading roles in projects that are focused around search engine ranking, promoted ads, online bidding optimization, exploration-exploitation problems, monitoring, and more.
Workshop | In-person | Responsible AI
Abstract Coming Soon!
Jo-fai (or Joe) has multiple roles (data scientist / evangelist / community manager) at H2O.ai. Since joining the company in 2016, Joe has delivered H2O talks/workshops in 40+ cities around Europe, US, and Asia. Nowadays, he is best known as the H2O #360Selfie guy. He is also the co-organiser of H2O’s EMEA meetup groups including London Artificial Intelligence & Deep Learning – one of the biggest data science communities in the world with more than 11,000 members (https://www.meetup.com/London-Artificial-Intelligence-Deep-Learning/).
Tutorial | In-person
Our discussion will start with a brief overview of the value of machine learning in economic applications. We will then introduce TensorFlow 2 and discuss its advantages as a tool for solving prediction and modeling problems in economics and finance. The remainder of the presentation will be dedicated to two applications. The first will center on the use of natural language processing methods as a means of extracting text features from central bank communications, such as speeches and policy statements. The second will examine the use of generative adversarial networks (GANs) as a tool for simulating financial data for Monte Carlo experiments. Code will be provided for all worked examples included in the presentation…more details
Isaiah Hull is a senior economist in the research division of Sweden’s Central Bank (Sveriges Riksbank). He holds a PhD in economics from Boston College and conducts research on computational economics, machine learning, and quantum computing. He is also the instructor for DataCamp’s “Introduction to TensorFlow in Python” course and the author of “Machine Learning for Economics and Finance in TensorFlow 2.”
Workshop | Virtual
Abstract Coming Soon!
Brian Lucena is Principal at Numeristical, where he advises companies of all sizes on how to apply modern machine learning techniques to solve real-world problems with data. He is the creator of StructureBoost, ML-Insights, and the SplineCalib calibration tool. In previous roles he has served as Senior VP of Analytics at PCCI, Principal Data Scientist at Clover Health, and Chief Mathematician at Guardian Analytics. He has taught at numerous institutions including UC-Berkeley, Brown, USF, and the Metis Data Science Bootcamp.
Tutorial | In-person
Data preprocessing is an important part of data science, even more so when dealing with time series. Signal processing and the Fourier Transform can move your time series into the frequency domain, where events are easier to detect. Learn how to apply the Fourier Transform to build a classification solution on top of IoT time series data. In this tutorial, Corey will introduce the theory behind the Fourier Transform and demonstrate how to build a classification model in the frequency domain using KNIME Analytics Platform…more details
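As a rough illustration of the frequency-domain idea in this abstract, the sketch below computes a naive discrete Fourier transform in plain Python and uses the dominant frequency as a single classification feature. It is not the KNIME workflow from the tutorial. The function names, the synthetic sine signals, and the 8 Hz threshold are all assumptions made for illustration.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Return normalized DFT magnitudes for bins 0..n/2 (naive O(n^2) version)."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        mags.append(abs(coeff) / n)
    return mags


def dominant_frequency(signal, sample_rate):
    """Frequency (in Hz) of the strongest non-DC component."""
    mags = dft_magnitudes(signal)
    k = max(range(1, len(mags)), key=lambda i: mags[i])  # skip bin 0 (the mean)
    return k * sample_rate / len(signal)


def classify(signal, sample_rate, threshold_hz=8.0):
    """Hypothetical rule: label a signal by its dominant frequency."""
    return "fast" if dominant_frequency(signal, sample_rate) > threshold_hz else "slow"


# Synthetic one-second sensor traces sampled at 64 Hz (illustrative data only).
sample_rate = 64
slow_wave = [math.sin(2 * math.pi * 4 * t / sample_rate) for t in range(sample_rate)]
fast_wave = [math.sin(2 * math.pi * 12 * t / sample_rate) for t in range(sample_rate)]

slow_label = classify(slow_wave, sample_rate)
fast_label = classify(fast_wave, sample_rate)
```

In practice a fast Fourier transform from a numerical library would replace the naive loop, and a trained classifier would replace the fixed threshold.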
Corey studied Mathematics at Michigan State University and works as a Data Scientist with KNIME where he focuses on Time Series Analysis, Forecasting, and Signal Analytics. He is the creator and instructor of the KNIME Time Series Analysis course, author of the e-book: Alteryx to KNIME, creator of the KNIME Time Series Analysis components, and Co-Author of the upcoming Codeless Time Series Analysis Book with Packt.
Workshop | In-person | Machine Learning | NLP
I talk about our journey to build a recommender system at Lyst, the world’s largest global fashion search platform, which brings together eight million products from 17,000 of the world’s leading brands and retailers and is used by over 150 million fashion lovers a year. I will walk you through the ideas, successes, and failures in constructing different types of models, which finally led us to develop a novel neural embedding approach for our recommender system…more details
Workshop | Virtual | NLP | Machine Learning | All Levels
Weakly supervised approaches have gained popularity in the last two years, but there is still a significant amount of overhead in applying these methods to more complex NLP tasks. The performance of weakly supervised systems is contingent on both the quality and quantity of independent sources of weak signal: if a practitioner cannot come up with sufficient sources themselves, then weak supervision is largely impractical…more details
Shayan Mohanty is the CEO and Co-Founder of Watchful, a company that largely automates the process of creating labeled training data. He has spent over a decade leading data engineering teams at various companies including Facebook, where he served as lead for the stream processing team responsible for processing 100% of the ads metrics data for all FB products. He is also a Guest Scientist at Los Alamos National Laboratory and has given talks on topics ranging from Automata Theory to Machine Teaching.
Tutorial | Virtual
Abstract Coming Soon!
Bio Coming Soon!
Workshop | Virtual | Machine Learning, MLOps and Data Engineering
In this talk, Data Scientist Felipe Adachi will talk about different types of data distribution shifts in ML applications, such as covariate shift, label shift, and concept drift, and how these issues can affect your ML application. Furthermore, the speaker will discuss the challenges of enabling distribution shift detection in data in a lightweight and scalable manner by calculating approximate statistics for drift measurements. Finally, the speaker will walk through steps that data scientists and ML engineers can take in order to surface data distribution shift issues in a practical manner, rather than reacting to the impacts of performance degradation reported by their customers…more details
Felipe is a Data Scientist at WhyLabs. He is a core contributor to whylogs, an open-source data logging library, and focuses on writing technical content and expanding the whylogs library in order to make AI more accessible, robust, and responsible. Previously, Felipe was an AI Researcher at WEG, where he researched and deployed Natural Language Processing approaches to extract knowledge from textual information about electric machinery. He also holds a Master’s degree in Electronic Systems Engineering from UFSC (Universidade Federal de Santa Catarina), with research focused on developing and deploying fault detection strategies based on machine learning for unmanned underwater vehicles. Felipe has published a series of blog articles about MLOps, Monitoring, and Natural Language Processing in publications such as Towards Data Science, Analytics Vidhya, and Google Cloud Community.
Workshop | In-person | Deep Learning
Tuning a model is a core element of a data scientist’s work. It is often very difficult, requiring both experience and expertise to do effectively. An important and integral part of the model tuning process is the feature selection process. This is because in many cases, the model itself is a ‘black box’, which makes it hard to understand features’ performance…more details
Training | In-person | Machine Learning for Finance | Intermediate
This half-day trading session covers the most important Python topics and skills to apply AI and Machine Learning (ML) to Algorithmic Trading. The session shows how to make use of the Oanda trading API (via a demo account) to retrieve data, to stream data, to place orders, etc. Building on this, a ML-based trading strategy is formulated and backtested. Finally, the trading strategy is transformed into an online trading algorithm and is deployed for real-time trading on the Oanda trading platform…more details
Dr. Yves J. Hilpisch is founder and CEO of The Python Quants (http://tpq.io), a group focusing on the use of open source technologies for financial data science, artificial intelligence, algorithmic trading, and computational finance. He is also founder and CEO of The AI Machine (http://aimachine.io), a company focused on AI-powered algorithmic trading based on a proprietary strategy execution platform.
Yves has a Diploma in Business Administration, a Ph.D. in Mathematical Finance and is Adjunct Professor for Computational Finance at Miami Herbert Business School.
Workshop | In-person | Responsible AI | Machine Learning
Human In The Loop (HITL) is a process in which, as part of the ML workflow, experts are asked their opinion about predictions made by an ML model in order to tune and improve the model. In this talk we’ll explain how we collaborated with and integrated engineers as a core part of our machine learning process, in order to create a mechanism to automatically predict the best security policies for our customers. We’ll go through the different stages of the project, discuss the challenges we faced along the way and how we overcame them, and show how you can use a similar process for any heuristic/ML project you have…more details
Adam is an experienced Data Scientist at Imperva’s threat research group where he works on creating machine learning algorithms to help protect Imperva’s customers against database attacks. Before joining Imperva, he obtained a PhD in Neuroscience from Ben-Gurion University of the Negev.
Workshop | In-person
This live coding session introduces the ideas behind autodiff and teaches its fundamentals by walking you through a simple example of implementing autodiff using the core Python programming language features, without PyTorch. In the process, you will gain a deeper understanding of the PyTorch autodiff functionality and develop the knowledge that will help you troubleshoot PyTorch model training (for example, using Horovod) in your projects. You will see that while autodiff can be straightforward, it scales to complex applications of the calculus chain rule…more details
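To make the idea in this abstract concrete, here is a minimal sketch of reverse-mode autodiff for scalars using only core Python features. It is not the code from the session. The `Value` class, its attribute names, and the example expression are assumptions chosen for illustration. Each operation records its inputs and local derivatives, and `backward` applies the chain rule in reverse topological order.

```python
class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents          # Values this one was computed from
        self._local_grads = local_grads  # d(self)/d(parent) for each parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    __radd__ = __add__
    __rmul__ = __mul__

    def backward(self):
        """Accumulate d(self)/d(node) into node.grad for every upstream node."""
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        # Walk from the output back to the inputs, applying the chain rule.
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad


# Example: z = x*y + x*x, so dz/dx = y + 2x and dz/dy = x.
x = Value(3.0)
y = Value(4.0)
z = x * y + x * x
z.backward()
```

After `z.backward()`, `x.grad` and `y.grad` hold the partial derivatives of `z`. PyTorch follows the same record-then-replay pattern on tensors rather than scalars.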
Carl implemented his first neural net in 2000. He is a senior director of the AI / ML practice at Cognizant, focusing on communications, technology, and media customers. Previously he worked on deep learning and machine learning at Google and IBM. Carl is an author of over 20 articles in professional, trade, and academic journals, an inventor with 6 patents at USPTO, and holds 3 corporate awards from IBM for his innovative work. His machine learning book, “MLOps Engineering at Scale” continues to receive reader acclaim. You can find out more about Carl from his blog www.cloudswithcarl.com
Talk | In-person
Abstract Coming Soon!
Prathiba is an experienced Data Scientist with a rich background in the Insurance industry. With a Master’s degree in Operational Research with Applied Statistics and Risk, her passion takes form through seeing the varying applications of Machine Learning and AI techniques, and how they propel data scientists to build better models and solutions. Skilled in data analysis and modelling, she utilizes SAS software and Open Source to assess and address problems within enterprise organizations.
Keynote | Virtual | Machine Learning | Big Data Analytics | All Levels
In this session we’ll discuss the current trend of increasingly larger AI models, empowering a wider range of tasks in the language, vision, and multi-modality space, with growing levels of capability. We’ll give an overview of the research and engineering efforts supporting the trend, its product and engineering impact at Microsoft, and the implications for other companies…more details
Luis Vargas is a Partner Technical Advisor to the CTO of Microsoft, responsible for Microsoft’s AI at Scale initiative, coordinating efforts across infrastructure, systems software, models, and products. He bootstrapped the productization of Automated ML and Reinforcement Learning in the Azure AI Platform, worked on the launch of Azure Database Services, and led the high-availability area for SQL Server. Luis has a PhD in Computer Science from Cambridge University.
Talk | In-person | Machine Learning | All Levels
In this talk I will explain how Bayesian modeling addresses these issues by (i) incorporating expert knowledge of the structure as well as about plausible parameter ranges; (ii) connecting multiple different data sets to increase circumstantial evidence of latent user features; and (iii) principled quantification of uncertainty to increase robustness of model fits and interpretation of the results. Inspired by real-world problems we encountered at PyMC Labs, we will look at Media Mix Models for marketing attribution and Customer Lifetime Value models and various hybrids between them…more details
Thomas Wiecki is the Chief Executive Officer at PyMC Labs (www.pymc-labs.io), the world’s first Bayesian consultancy. Prior to that Thomas was the VP of Data Science at Quantopian, where he used probabilistic programming and machine learning to help build the world’s first crowdsourced hedge fund. He is an author of the popular PyMC3 package — a probabilistic programming framework written in Python. He holds a PhD from Brown University.
Talk | In-person | All Levels
In this session, communication consultant and data visualization specialist Alan Rutter will break down common challenges that data teams face in presenting their work to others, and suggest practical solutions – which are often rooted in people and culture, rather than data and code. It is aimed at any data practitioner who is frequently presenting insights to other stakeholders…more details
Alan Rutter is the founder of consultancy Fire Plus Algebra, and is a specialist in communicating complex subjects through data visualisation, writing and design. He has worked as a journalist, product owner and trainer for brands and organisations including Guardian Masterclasses, WIRED, Time Out, the Home Office, the Biotechnology and Biological Sciences Research Council and Liverpool School of Tropical Medicine.
Demo Talk | In-person
What does it take to get the best model into production? We’ve seen industry-leading ML teams follow some of the same common workflows for dataset management, experimentation, and model management. I’ll share case studies from customers across industries, outline best practices, and dive into tools and solutions for common pain points…more details
Allan’s background covers a broad technology stack in infrastructure and cloud, working across a variety of roles in large enterprises before moving into Data Science and ML in recent years. His last role was working on time series forecasting at a fintech scale-up before joining Weights and Biases as the first member of the Customer Success team in EMEA.
Track Keynote | In-person
Abstract Coming Soon!
Yaron Haviv is a serial entrepreneur who has been applying his deep technological experience in data, cloud, AI and networking to leading startups and enterprise companies since the late 1990s. As the co-founder and CTO of Iguazio, Yaron drives the strategy for the company’s MLOps platform and led the shift towards the production-first approach to data science and catering to real-time AI use cases. He also initiated and built Nuclio, a leading open source serverless platform with over 4,000 GitHub stars, and MLRun, Iguazio’s open source MLOps orchestration framework. Prior to co-founding Iguazio in 2014, Yaron was the Vice President of Datacenter Solutions at Mellanox (now NVIDIA), where he led technology innovation, software development and solution integrations. He was also the CTO and Vice President of R&D at Voltaire, a high-performance computing, IO and networking company which floated on the NYSE in 2007. Yaron is an active contributor to the CNCF Working Group and was one of the foundation’s first members. He presents at major industry events and writes tech content for leading publications including TheNewStack, Hackernoon, DZone, Towards Data Science and more.
Demo Talk | Virtual
In this session, you will learn how to run machine learning workloads with a seamless Azure Machine Learning experience anywhere, including on-premises, in multi-cloud environments, and at the edge. Use any Kubernetes cluster and extend machine learning to run MLOps, model training, real-time inference or batch inference. You can manage all the resources through a single pane with the management, consistency, and reliability…more details
Doris Zhong is a Product Manager in the Azure AI Platform organization at Microsoft, where she focuses on machine learning in the hybrid cloud. She loves to communicate with customers to gain deep insights and help solve real problems. Early in her career, she worked on building Microsoft’s internal GPU training platform, which managed tens of thousands of GPUs and served thousands of users.
Demo Talk | In-person
In this session, Jake Bengtson from Cloudera will demonstrate how the AMP Churn Modeling with scikit-learn can be repurposed to create a web application that will predict this year’s NBA champion. From ingesting historical NBA data to altering the existing Flask application to use a newly trained model, we will walk through the entire process of going from AMP to MVP…more details
Jake is currently working as a Senior Product Marketing Manager over ML Lifecycle products at Cloudera. Before joining Cloudera, Jake worked as a Data Scientist and then as a Data Science and Analytics Solution Architect at ExxonMobil. Additionally, he worked as a Senior Data Scientist at FarmersEdge. Before starting his professional career, Jake obtained his bachelor’s and master’s degrees from Brigham Young University. When he isn’t working, Jake enjoys skiing, golfing, and spending time with his family in the mountains.
Talk | Virtual | Responsible AI | Machine Learning | Beginner
Abstract Coming Soon!
Sara is a Senior Research Associate in Biomedical Data Science and University Research Lecturer at the University of Oxford, where she is the Machine Learning Lead in the Centre for Statistics in Medicine. She has 12 years of experience in machine learning, signal processing, and intelligent remote monitoring research, with applications in biomedical and planetary health informatics. Sara has served on the NASA Frontier Development Lab Artificial Intelligence Panel and the NASA Climate Challenge Big Think. She is a National Geographic Society Explorer in Tracking Plastic Pollution with Remote Monitoring and Machine Learning. Sara is also a University of Oxford Ambassador for Women in Data Science.
Talk | In-person | Machine Learning | Deep Learning | Beginner-Intermediate
In this talk, I will describe the steps involved in the Explainability by design methodology, and I will explain how the methodology can be applied to an illustrative autonomous system, which makes financial decisions about customers. I will also discuss the methodology qualitatively and quantitatively…more details
Luc Moreau is a Professor of Computer Science and Head of the department of Informatics, at King’s College London. Before joining King’s, Luc was Head of the Web and Internet Science, in the department of Electronics and Computer Science, at the University of Southampton.
Luc was co-chair of the W3C Provenance Working Group, which resulted in four W3C Recommendations and nine W3C Notes, specifying PROV, a conceptual data model for provenance on the Web, and its serializations in various Web languages. Previously, he initiated the successful Provenance Challenge series, which saw the involvement of over 20 institutions investigating provenance inter-operability in 3 successive challenges, and which resulted in the specification of the community Open Provenance Model (OPM). Before that, he led the development of provenance technology in the FP6 Provenance project and the Provenance Aware Service Oriented Architecture (PASOA) project. [For further details, please see https://nms.kcl.ac.uk/luc.moreau/about.html]
He is on the editorial board of “PeerJ Computer Science” and previously he was editor-in-chief of the journal “Concurrency and Computation: Practice and Experience” and on the editorial board of “ACM Transactions on Internet Technology”.
Demo Talk | Virtual
You’ll leave this demo with essential resources for better predictions and outcomes using the data you already have through graphs and Neo4j…more details
Originally from the USA but now living in the UK, Joe Depeau has over 25 years of varied experience in the IT industry across a number of domains and specialties. For the last 10 years Joe has focused on technical pre-sales and solution architecture in the data and analytics space, and he is a passionate evangelist for the use of graph databases and especially graph data science. When not geeking out over data and technology he enjoys camping, hiking with his dog, reading, tending to his Animal Crossing island, and playing boardgames and RPGs. He also bakes a mean cheesecake.
Talk | Virtual | Deep Learning, Machine Learning, Big Data Analytics, MLOps and Data Engineering | Beginner – Intermediate
Cloud-native applications. Multiple Cloud providers. Hybrid Cloud. 1000s of VMs and containers. Complex network policies. Millions of connections and requests in any given time window. This is the typical situation faced by a Security Operations Control (SOC) Analyst every single day. In this talk, the speaker describes the highly available and highly scalable data pipelines that he built for the following use cases:
* Denial of Service: A device in the network stops working.
* Data Loss: For example, a rogue agent in the network transmits IP data outside the network.
* Data Corruption: A device starts sending erroneous data…more details
Tuhin Sharma is Senior Principal Data Scientist at Red Hat in the Corporate Development and Strategy group. Prior to that he worked at Hypersonix as AI Architect. He also co-founded and has been CEO of Binaize, a website conversion intelligence product for e-commerce SMBs. He received a master’s degree in Computer Science, with a specialization in Data Mining, from the Indian Institute of Technology Roorkee, and a bachelor’s degree in Computer Science from the Indian Institute of Engineering Science and Technology Shibpur. He loves to code and collaborate on open source and research projects. He has 4 research papers and 5 patents in the field of AI and NLP. He is a reviewer for the IEEE MASS conference in the AI track. He writes deep learning articles for O’Reilly in collaboration with the AWS MXNet team. He loves to play TT and guitar in his leisure time. His favorite quote is “Life is Beautiful”.
Talk | Virtual | MLOps and Data Engineering
This session will talk about the awesome new features the community has built that were recently released in Apache Airflow 2.3…more details
Kaxil is the Director of the Airflow Engineering Team at Astronomer. He is currently one of the top three committers of the Airflow project by number of commits, and one of Airflow’s release managers. His most prominent work includes co-authoring DAG Serialization, Scheduler HA, and Secrets Backend.
He completed his Masters in Data Science & Analytics at Royal Holloway, University of London. He started as a Data Scientist and then gained experience in the Data Engineering, Big Data, and DevOps space. He began working on Airflow in 2017 while working at Data Reply as a Big Data consultant, became a PMC member in 2018, and now works full-time at astronomer.io making Airflow better for everyone. He is a huge cricket fan and his favourite cricketers are Rahul Dravid and Virat Kohli.
Talk | In-person
Abstract Coming Soon!
Adi Hirschtein brings 20 years of experience as an executive, product manager and entrepreneur building and driving innovation in technology companies. As the VP of Product at Iguazio, the MLOps platform built for production and real-time use cases, he leads the product roadmap and strategy. His previous roles spanned technology companies such as Dell EMC, Zettapoint and InfraGate, in diverse positions including product management, business development, marketing, sales and execution, with a strong focus on machine learning, database and storage technology. When working with startups and corporates, Adi’s passion lies in taking a team’s ideas from their very first day, through a successful market penetration, all the way to an established business. Adi holds a B.A. in Business Administration and Information Technology from the College of Management Academic Studies.
Demo Talk | In-person
MLRun is an open-source MLOps orchestration framework. It exists to accelerate the integration of AI/ML applications into existing business workflows. MLRun introduces Data Scientists to a simple Python SDK that transforms their code into a production-quality application. It does so by abstracting the many layers involved in the MLOps pipeline. Developers can build, test, and tune their work anywhere and leverage MLRun to integrate with other components of their business workflow…more details
Gilad has over 15 years of experience in product management and a solid R&D background. He combines analytical skills and technical innovation with Data Science market experience. Gilad’s passion is to define a product vision and turn it into reality. As Director of Product Management at Iguazio, Gilad manages both the Enterprise MLOps Platform product as well as MLRun, Iguazio’s open source MLOps orchestration framework. Prior to joining Iguazio, Gilad managed several different products at NICE-Actimize, a leading vendor of financial crime prevention solutions, including coverage of machine-learning based solutions, formation of a marketplace and addressing customer needs across different domains. Gilad holds a B.A. in Computer Science, an M.Sc. in Biomedical Engineering, and an MBA from Tel-Aviv University.
Demo Talk | In-person
You’ll leave this demo with essential resources for better predictions and outcomes using the data you already have through graphs and Neo4j…more details
Originally from the USA but now living in the UK, Joe Depeau has over 25 years of varied experience in the IT industry across a number of domains and specialties. For the last 10 years Joe has focused on technical pre-sales and solution architecture in the data and analytics space, and he is a passionate evangelist for the use of graph databases and especially graph data science. When not geeking out over data and technology he enjoys camping, hiking with his dog, reading, tending to his Animal Crossing island, and playing boardgames and RPGs. He also bakes a mean cheesecake.
Talk | In-person
Abstract Coming Soon!
Rachel is a Product Manager in Appen’s Autonomous Vehicles working group. In that role, she is working to provide high quality data on all levels of autonomy for motor vehicle clients. Prior to joining Appen, Rachel worked on data science tools to enable model interpretability, fairness testing and automated machine learning. Other passions of hers include using AI and technology to act as a catalyst towards solving humanitarian-centered problems for non-profits around the world.
Demo Talk | In-person
In this talk, we’ll tackle the challenge of optimizing the AI infrastructure stack using Kubernetes, NVIDIA GPUs, and Run:ai.
Walking through an example of a well-architected AI Infrastructure stack, we’ll discuss how Kubernetes can be augmented with advanced GPU scheduling to maximize efficiency and speed up data science initiatives…more details
Rob Magno is a Sales Engineer/Solution Architect at Run:ai based in New Jersey. He has been working in the Docker and Kubernetes space for the past five years. He enjoys tackling the diverse customer challenges that come with orchestrating AI/ML workloads through Kubernetes.
Talk | Virtual | NLP | All Levels
In this talk, I will first describe methods developed in the NLP community to detect the types and levels of social biases learnt by large-scale language models. Next, I will present techniques that can be used to mitigate such biases…more details
Danushka Bollegala is a Professor in the Department of Computer Science, University of Liverpool, UK. He obtained his PhD from the University of Tokyo in 2009 and worked as an Assistant Professor before moving to the UK. He has worked on various problems related to Natural Language Processing and Machine Learning. He has received numerous awards for his research excellence such as the IEEE Young Author Award, best paper awards at GECCO and PRICAI. His research has been supported by various research council and industrial grants such as EU, DSTL, Innovate UK, JSPS, Google and MSRA. He is an Amazon Scholar.
Talk | Virtual
This talk will outline the evolution of data annotation starting with the computer vision and deep learning research fueled by the ImageNet competition, particularly the revolutionary year 2012 when the deep learning revolution began…more details
Talk | In-person | VectorCon | Machine Learning | Beginner
This session is all about vector databases. If you are a data scientist or data/software engineer, this session is for you. You will learn how you can easily run your favorite ML models with the vector database Weaviate. You’ll get an overview of what a vector database like Weaviate can offer, such as semantic search, question answering, data classification, named entity recognition, multimodal search, and much more. Vector search will be illustrated with live demos of a real use case! After this session, you will know when and how to use vector search with various ML models…more details
Laura is a Data Scientist at SeMI, where we build the open-source vector search engine Weaviate. She researches new machine learning features for Weaviate and works on everything UX/DX related to Weaviate. For example, she is responsible for the GraphQL API design. She is in close contact with our open source community. Additionally, she likes to solve custom use cases with Weaviate, and introduces Weaviate to other people by means of Meetups, talks and presentations.
Demo Talk | Virtual
Just a few years ago every cutting-edge tech company, like Google, Lyft, Microsoft, and Amazon, rolled their own AI/ML tech stack from scratch. Fast forward to today and we have a Cambrian explosion of new companies building a massive array of software to democratize AI for the rest of us. But how do we make sense of it all? In order for AI apps to become as ubiquitous as the apps on your phone, you need a canonical stack for machine learning that makes it easier for non-tech companies to level up fast…more details
Lee is the General Secretary for the AI Infrastructure Alliance. Based out of the UK, he is responsible for crafting and nurturing relationships with companies to build a canonical stack for AI and ML. When not shuttling his 3 children around, he can most often be found cycling, running and swimming around England’s South Coast.
Demo Talk | In-person
Failures or delays in creating training data and deploying data ops can suffocate good deep learning models, a risk data scientists can’t afford to take. In this session, attendees will learn how iMerit is solving the problem of scaling data pipelines with accuracy using unique technology. Join iMerit’s VP of Product, Glen Ford, as he uncovers the invisible technology building successful data labeling workflows and discovering anomalous and novel classes for customers using iMerit’s Edge Case technology…more details
Glen Ford is VP of Product at iMerit — a leading AI data solutions company — where he leads the product management and design teams. Glen holds more than two decades of product development experience across the technology sector. A Graduate of Texas A&M University—Commerce, Glen began his career as a consultant where he handled full-stack web programming and architecture for clients including Time Warner and AIM Funds. Over the years, he has held senior and director-level product management roles at several companies including Demand Media, WP Engine and Humanify. Most recently, Glen spent four years at Alegion — an ML-powered data annotation platform — where he helped the company grow from eight full-time employees to more than 100 in a challenging, emerging market.
Talk | In-person | Machine Learning | Beginner – Intermediate
Explainable AI, or XAI, is a rapidly expanding field of research that aims to supply methods for understanding model predictions. We will start by providing a general introduction to the field of explainability, introduce the Alibi library and focus on how it helps you to understand trained models. We will then explore the collection of algorithms provided by Alibi and the types of insight they each provide, looking at a broad range of datasets and models, discussing the pros and cons of each. In particular, we’ll look at methods that apply to any model. The focus will be on application to real-world datasets to show the practitioner that XAI can justify, explore and enhance their use of ML…more details
Alex Athorne is a Research Engineer at Seldon, where he works on open-source libraries for explainability and drift detection. He studied mathematics at Warwick and went on to do a PhD at Imperial College London in dynamical systems. He’s passionate about open-source development and writing about his experiences in ML.
Talk | In-person | NLP | Machine Learning | Deep Learning | Responsible AI
This talk will give you the tools you need to make informed decisions around building production-ready ethical models. We’ll cover four practical steps to adopting ethical AI, including awareness, diversity, human in the loop, and frameworks…more details
Matt Beale is one of CloudFactory’s Senior Solutions Consultants. He helps clients around the world overcome their data challenges with AI and ML projects across Autonomous Vehicles, Green Energy and FinTech. Matt joined CloudFactory due to his interest in the ethical issues that impact AI and CloudFactory’s mission to create meaningful work in the developing world. Away from work, Matt has a passion for photography, travelling and unusual cars. In fact, his passion for unusual cars led him to import a Nissan Stagea from Japan to the UK.
Talk | Virtual | Machine Learning | Beginner-Intermediate
The talk will help business leaders to understand the benefits of digital twins while providing technology leaders the key skills, capabilities and tools required to design, build, and deploy agent-based simulations…more details
Dr. Anand S. Rao is a Partner in PwC’s Advisory practice. He is the Global Artificial Intelligence Lead, Cross-vertical Analytics Champion, and the Co-Sponsor for the AI Center of Enablement within PwC. With over 33 years of industry and consulting experience, Anand leads a team of practitioners who work with C-level executives at some of the world’s largest organizations, advising them on a range of topics including global growth strategies, marketing, sales, distribution and digital strategies, behavioral economics and customer experience, and statistical and computational analytics. As the global lead for AI he is responsible for research and commercial relationships with academic institutions and start-ups, research, development and commercialization of innovative AI, big data and analytic techniques. With his PhD and research career in Artificial Intelligence and his subsequent experience in management consulting he brings business domain knowledge, statistical, and computational analytics to generate unique insights into the practice of ‘data science’.
Demo Talk | In-person
In this talk, data scientist Kyriakos Fistos will take you through every step of the analytics life cycle, showing how to prepare and visualize data, how to develop Machine Learning models, how to integrate Open Source models in your projects, and how to deploy and manage all of your analytical assets…more details
Kyriakos is a Data Scientist and works across a broad set of industries contributing to large scale digital transformation projects for the world’s largest organizations. He specializes in data visualization, predictive modelling and model management.
With his academic background in Mathematics and Operational Research, Applied Statistics and Financial Risk, his day-to-day includes helping clients make better business decisions through Data Management, Business Intelligence, Machine Learning and Artificial Intelligence solutions.
Talk | Virtual | MLOps and Data Engineering | Intermediate – Advanced
This presentation argues for the large-scale adoption of data mesh principles to advance data science. Specifically, there is a need for domain-specific data standards including well-defined data structures for key entities in the domain and metadata to support particular use cases. Examples will demonstrate how bioinformaticians create data pipelines that draw from data sources about gene (GenBank) and protein (UniProt) sequences, protein structures (Protein Data Bank), gene expression (Expression Atlas), bioactive molecules (ChEMBL), and metabolic and signaling pathways (KEGG Pathway Database). We will also review an example metadata standard for human pathogen genomic sequences and describe why domain-specific metadata is needed in addition to common metadata standards. The talk will conclude with tips on how to create data products within a data mesh architecture…more details
Dan Sullivan is a Principal Data Architect at 4 Mile Analytics. He has a PhD in Genetics, Bioinformatics and Computational Biology and is a former research scientist in infectious disease genomics. He is the author of Google Cloud Certification Study Guides for the Professional Data Engineer, Professional Architect, and Associate Cloud Engineer certifications and an instructor on LinkedIn Learning and Udemy where he provides courses on data science, data modeling, and cloud computing.
Demo Talk | In-person
Deploying models into production is still the hardest problem in Data Science, but not if you use the right tools. In this session we look at how MLflow pipelines simplify the process of deploying models to production. We look at the challenges of data management and how Delta can be used to ensure models are reproducible, without taking many replicas of your dataset…more details