
Abstract: Tracking objects is a foundational task in video analysis. It is the engine behind smart cities, autonomous driving, and building management. However, although many new methods for the task are proposed at top-tier research conferences every year, most production systems still rely on techniques that are half a decade old. In this talk, I will explain the reasons behind this gap between research and production, supported by intuition and experimental results. I will then introduce our recent work addressing these issues. Our new methods can easily leverage large-scale datasets and learn to track objects in diverse scenarios. Finally, I will present tools that let industry practitioners build trackers on their own data without diving into complicated parameter tuning and expensive optimization. You will learn to build robust, simple, and performant tracking modules to supercharge your video analysis engines.
Background Knowledge: Deep Learning, Python
Bio: Fisher Yu is an Assistant Professor at ETH Zürich in Switzerland. He obtained his Ph.D. from Princeton University and was a postdoctoral researcher at UC Berkeley. He now leads the Visual Intelligence and Systems (VIS) group at ETH Zürich. His goal is to build perceptual systems capable of performing complex tasks in complex environments, and his research sits at the intersection of machine learning, computer vision, and robotics. He currently works on closing the loop between vision and action. His work on image representation learning and large-scale datasets, especially dilated convolutions and the BDD100K dataset, has become an essential part of computer vision research. More info is available at https://www.yf.io