Introduction to Transformers for NLP: Where We Are and How We Got Here


Have you ever wondered what technology lies behind the GPT models? In this talk, we will discuss the Transformer neural network architecture, introduced in 2017. Some experience with NLP and deep learning is recommended. We will start by reviewing some of the key concepts and milestones in the application of deep learning to Natural Language Processing: word embeddings, RNNs, sequence-to-sequence models, and the task of language modeling. In the second half of the talk, we will focus on the Attention mechanism, building up to the Transformer architecture and Transformer-based contextualized word embeddings such as BERT.
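As a small taste of the Attention mechanism the talk builds toward, here is a minimal sketch of scaled dot-product attention (the core operation of the Transformer) in NumPy. The function and variable names are illustrative, not taken from any particular library:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
    # Each query attends to all keys; the resulting weights mix the values.
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

# Toy example: 3 queries attending over 5 key/value vectors of dimension 4.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(5, 4))
V = rng.normal(size=(5, 4))
out, w = attention(Q, K, V)
```

Each row of `w` is a probability distribution over the five positions, so the output is a weighted average of the value vectors, one per query.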


A former theoretical physicist turned machine learning engineer, Olga is now building a smart data annotation platform at Scaleway as a technical product manager. On the community side, she enjoys blogging about the latest advancements in AI, both in and out of working hours. Some of her writing can be found on Open Data Science.