Multimodal Retrieval Augmented Generation


Retrieval augmented generation (RAG) soon became established as the reference architecture whenever we want to inject custom knowledge into our LLM-powered applications.

Insofar, RAG has been applied to text data. Nevertheless, with the launch of GPT-4-turbo vision, we can extend the same concept also data different from texts, such as images.

In this workshop, we are going to cover the architecture behind a typical RAG application and how to incorporate images within this architecture, leveraging GPT-4-turbo with vision. To do so, we will see a practical implementation with Python and LangChain, consuming the model API from Azure OpenAI service.

Session Outline:

1. Lesson 1: RAG
Familiarize yourself with the logic behind retrieval augmented generation. By the end of this session, you will have a better understanding of the concepts of embeddings, vector databases, cosine similarity and context augmentation.
1. Lesson 2: Multimodal RAG with GPT-4-turbo vision
Extend the concept of RAG to data different from raw text, such as images. Familiarize yourself with the concept of multimodal embedding and how GPT-4-turbo vision is able to get the context from a vectorized image.
1. Lesson 3: Hands-on
Build your multimodal RAG application in Python using Azure OpenAI and LangChain. By the end of this session, you will be able to develop your own multimodal application from scratch.

Background Knowledge:

By the end of this session, attendees will be familiar with:
- Retrieval Augmented Generation
- Embedding
- Multimodality
- GPT-4-turbo with vision
- Vector databases
- Python and LangChain to build RAG Application


Valentina is a Data Science MSc graduate and Cloud Specialist at Microsoft, focusing on Analytics and AI workloads within the manufacturing and pharmaceutical industry since 2022. She has been working on customers' digital transformations, designing cloud architecture and modern data platforms, including IoT, real-time analytics, Machine Learning, and Generative AI. She is also a tech author, contributing articles on machine learning, AI, and statistics, and recently published a book on Generative AI and Large Language Models.

In her free time, she loves hiking and climbing around the beautiful Italian mountains, running, and enjoying a good book with a cup of coffee.

Open Data Science




Open Data Science
One Broadway
Cambridge, MA 02142

Privacy Settings
We use cookies to enhance your experience while using our website. If you are using our Services via a browser you can restrict, block or remove cookies through your web browser settings. We also use content and scripts from third parties that may use tracking technologies. You can selectively provide your consent below to allow such third party embeds. For complete information about the cookies we use, data we collect and how we process them, please check our Privacy Policy
Consent to display content from - Youtube
Consent to display content from - Vimeo
Google Maps
Consent to display content from - Google