From Research to Production: Performant Cross-Platform ML/DNN Model Inferencing on Cloud and Edge with ONNX Runtime

Abstract: 

Powerful machine learning models trained in frameworks such as scikit-learn, PyTorch, TensorFlow, and Keras can be challenging to deploy, maintain, and operationalize with the performance that latency-sensitive customer scenarios demand. With the standard Open Neural Network Exchange (ONNX) model format and the open-source, cross-platform ONNX Runtime inference engine, these models can be deployed at scale to cloud solutions on Azure as well as to local devices running Windows, Mac, or Linux and to a variety of IoT hardware. Once converted to the interoperable ONNX format, the same model can be served by ONNX Runtime across a wide variety of technology stacks, providing maximum flexibility and reducing deployment friction.
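To make the conversion step concrete, here is a minimal sketch of converting a scikit-learn classifier to ONNX with the skl2onnx converter (https://github.com/onnx/sklearn-onnx). The iris model, input name, and output file name are illustrative assumptions, not workshop material:

# Convert a scikit-learn classifier to ONNX (illustrative sketch).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=500).fit(X, y)

# Declare the input signature: a float tensor with a dynamic batch
# dimension and the 4 iris features.
initial_type = [("float_input", FloatTensorType([None, 4]))]
onnx_model = convert_sklearn(model, initial_types=initial_type)

with open("logreg_iris.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

The resulting .onnx file is self-contained and can then be served by ONNX Runtime on any supported platform.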

In this workshop, we will explore the versatility of ONNX and ONNX Runtime by converting traditional scikit-learn ML pipelines to ONNX, applying transfer learning techniques, and exporting PyTorch-trained deep neural network models to ONNX. We will then work through deploying these models to Azure as a cloud service using Azure Machine Learning services, and to Windows or Mac devices for on-device inferencing.
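For the PyTorch portion, the usual export path is torch.onnx.export, which traces the model with an example input. A minimal sketch; the tiny network, file name, and axis names below are placeholder assumptions:

# Export a PyTorch model to ONNX (illustrative sketch).
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 3)

    def forward(self, x):
        return self.fc(x)

model = TinyNet().eval()
dummy_input = torch.randn(1, 4)  # example input that fixes tensor shapes at export time

torch.onnx.export(
    model,
    dummy_input,
    "tinynet.onnx",
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},  # keep batch size dynamic
)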

The production-ready ONNX Runtime is already used in many key Microsoft products and services such as Bing, Office, Windows, and Cognitive Services, realizing average performance improvements of more than 2x in high-traffic scenarios.

ONNX Runtime supports inferencing of ONNX format models on Linux, Windows, and Mac, with Python, C, and C# APIs. The extensible architecture supports graph optimizations (node elimination, node fusions, etc.) and partitions models to run efficiently on a wide variety of hardware, leveraging custom accelerators, computation libraries, and runtimes where available. These pluggable "execution providers" work with CPU, GPU, FPGA, and more.
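In the Python API, execution providers are chosen when an inference session is created. A minimal sketch, assuming the logreg_iris.onnx file from the conversion example above; which providers are available depends on the installed onnxruntime build:

# Run an ONNX model with onnxruntime, preferring GPU when available.
import numpy as np
import onnxruntime as ort

providers = ["CPUExecutionProvider"]
if "CUDAExecutionProvider" in ort.get_available_providers():
    providers.insert(0, "CUDAExecutionProvider")

session = ort.InferenceSession("logreg_iris.onnx", providers=providers)

input_name = session.get_inputs()[0].name
sample = np.random.rand(1, 4).astype(np.float32)

# run(None, ...) returns all model outputs; for the classifier sketched
# earlier, the first output is the predicted label.
outputs = session.run(None, {input_name: sample})
print(outputs[0])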

ONNX is a standard format for DNN and traditional ML models, developed by Microsoft, Facebook, and a number of other leading companies in the AI industry. The interoperable format gives data scientists the flexibility to use their framework and tools of choice, accelerating the path from research to production. It also allows hardware partners to design optimizations for deep-learning-focused hardware based on a standard specification that is compatible with many frameworks.

Bio: 

Prabhat Roy is a Data and Applied Scientist at Microsoft, where he is a leading contributor to the scikit-learn to ONNX converter project (https://github.com/onnx/sklearn-onnx). In the past, he worked on ML.NET, an open source ML library for .NET developers, focusing on customer engagements for text and image classification problems.
