Vision Transformer and its Applications


Vision transformer is a recent breakthrough in the area of computer vision. While transformer-based models have dominated the field of natural language processing since 2017, CNN-based models are still demonstrating state-of-the-art performances in vision problems. Last year, a group of researchers from Google figured out how to make a transformer work on recognition. They called it "vision transformer". The follow up works by the community demonstrated superior performance of vision transformers not only in recognition but also in other downstream tasks such as detection, segmentation, multi-modal learning and scene text recognition to mention a few.
In this talk, we will go into a deeper understanding of the model architecture of vision transformers. Most importantly, we will focus on the concept of self-attention and its role in vision. Then, we will present different model implementations utilizing the vision transformer as the main backbone.
Since self-attention can be applied beyond transformers, we will also discuss a promising direction in building general-purpose model architectures. In particular, networks that can process a variety of data formats such as text, audio, image and video.

Background Knowledge
Working knowledge of deep learning for computer vision


Rowel Atienza is a Professor and Scientist at the Electrical and Electronics Engineering Institute of the University of the Philippines, Diliman. He holds the Dado and Maria Banatao Institute Professorial Chair in Artificial Intelligence. He received his MEng from the National University of Singapore for his work on an AI-enhanced four-legged robot. He finished his Ph.D. at The Australian National University for his contribution on the field of active gaze tracking for human-robot interaction. Dr. Atienza is the author of Advanced Deep Learning with TensorFlow 2 and Keras. His current research work focuses on robotics, computer vision and AI.

Open Data Science




Open Data Science
One Broadway
Cambridge, MA 02142

Privacy Settings
We use cookies to enhance your experience while using our website. If you are using our Services via a browser you can restrict, block or remove cookies through your web browser settings. We also use content and scripts from third parties that may use tracking technologies. You can selectively provide your consent below to allow such third party embeds. For complete information about the cookies we use, data we collect and how we process them, please check our Privacy Policy
Consent to display content from - Youtube
Consent to display content from - Vimeo
Google Maps
Consent to display content from - Google