On-Demand Accelerating Deep Neural Network Inference via Edge Computing


Deep neural networks are both computationally and memory intensive, which makes them difficult to deploy on mobile phones and embedded systems with limited hardware resources and lengthens both inference and training. Mobile-first companies such as Baidu and Facebook ship app updates through various app stores and are therefore very sensitive to the size of their binaries. For example, the App Store has the restriction “apps above 100 MB will not download until you connect to Wi-Fi”. As a result, a feature that increases the binary size by 100 MB receives much more scrutiny than one that increases it by 10 MB. Limited on-device compute likewise makes it challenging to run computation-intensive DNN-based tasks on mobile devices.

This talk introduces algorithms and hardware that can accelerate inference, or reduce the latency, of deep learning workloads. We will discuss how to compress deep neural networks and cover techniques such as graph fusion and kernel auto-tuning for accelerating inference, as well as data and model parallelism, automatic mixed precision, and other techniques for accelerating training. We will also discuss specialized hardware for deep learning such as GPUs, FPGAs, and ASICs, including the Tensor Cores in NVIDIA’s Volta GPUs and Google’s Tensor Processing Units (TPUs). Finally, we will discuss deploying large deep learning models on edge devices such as the NVIDIA Jetson Nano and Google's Edge TPU (Coral).
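To make the compression idea concrete, here is a minimal sketch of one common technique, symmetric 8-bit weight quantization. The abstract does not say which compression methods the talk covers, so treat this as an illustration rather than the speaker's approach. The function names `quantize_int8` and `dequantize` and the random stand-in weights are assumptions made for this example.

```python
from array import array
import random


def quantize_int8(weights):
    """Map float weights to int8 using a single symmetric scale (hypothetical helper)."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range used here.
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale


def dequantize(q, scale):
    """Recover approximate float32 weights from the int8 values."""
    return array("f", (v * scale for v in q))


# Stand-in for one layer's weights: 4096 random float32 values (assumed data).
random.seed(0)
weights = array("f", (random.gauss(0.0, 1.0) for _ in range(4096)))

q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)

# Storage drops from 4 bytes to 1 byte per weight.
float_bytes = len(weights) * weights.itemsize
int8_bytes = len(q_weights) * q_weights.itemsize
size_ratio = float_bytes / int8_bytes

# Rounding error is bounded by about half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(f"size ratio: {size_ratio:.1f}x, max abs error: {max_error:.4f}")
```

Storing each weight in one byte instead of four is the kind of saving that matters under the binary-size limits described above. In practice the accuracy impact of quantization would have to be measured on the real model.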


Deepesh Agrawal is an experienced Machine Learning Engineer with a demonstrated history of working in the information technology and services industry. He was previously a Solution Architect with an NVIDIA partner. He has completed ML and DL projects such as video classification, object detection, and text analysis, and is skilled in Python, C++, data science, and deep learning.

Open Data Science




