ML Inference on Edge with ONNX Runtime


ONNX Runtime is an inference engine for executing ML models across different hardware environments. Applications of AI are everywhere, and this requires ML models trained in the cloud to execute on small devices with low power, low compute, and low memory. Such devices are typically used in IoT scenarios: the data they capture is processed before the telemetry is sent to the cloud for further action by the business application. ONNX Runtime includes enhancements that enable ML models to execute on these edge devices, powering AI-on-the-edge applications. This session walks through the workflow to train an image classification model, package it in a container, and deploy it to an IoT device.

Session Outline
Lesson 1:
Train ML models for IoT applications, e.g. image classification. Start with a pre-trained model and fine-tune it for the specific IoT scenario, store the model in a registry, and convert it to ONNX.
Lesson 2:
Create the IoT application in Python, using the ML model with ONNX Runtime. Package it in a Docker image for the target device and register the image in a container registry.
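A Dockerfile for this step might look like the sketch below. File names, the base image, and the dependency list are illustrative assumptions, not the session's materials; an actual edge target may need an architecture-specific base image.

```dockerfile
# Hypothetical Dockerfile for the IoT inference app.
FROM python:3.10-slim

WORKDIR /app
# requirements.txt would list e.g. onnxruntime and numpy (assumed names).
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# The converted model from Lesson 1 and the inference app.
COPY model.onnx app.py ./
CMD ["python", "app.py"]
```

After building, the image would be tagged and pushed to the container registry that the edge deployment pulls from.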
Lesson 3:
Deploy the container image to the edge device, run inference sessions, and send the processed telemetry to the cloud for the business application.

Background Knowledge
- Python
- Machine learning life cycle


Prabhat Roy is a Data and Applied Scientist at Microsoft, where he is a leading contributor to the scikit-learn to ONNX converter project. In the past, he worked on ML.NET, an open-source ML library for .NET developers, focusing on customer engagements for text and image classification problems.