
Abstract: Have you ever wondered what technology lies behind the GPT models? In this talk, we will discuss Transformer neural networks, introduced in 2017. Some prior experience with NLP and deep learning is recommended. We will start with an overview of key concepts and milestones in the application of deep learning to Natural Language Processing: word embeddings, RNNs, sequence-to-sequence models, and the task of language modeling. In the second half of the talk, we will focus on the Attention mechanism, building up to the Transformer architecture and Transformer-based contextualized word embeddings such as BERT.
Bio: A former theoretical physicist turned machine learning engineer, Olga now works as a technical product manager at Scaleway, where she is building a smart data annotation platform. On the community side, she enjoys blogging about the latest advancements in AI, both in and out of working hours. Some of her writing can be found at medium.com/@olgapetrova_92798.