Should I use RAG or Fine-Tuning? What about Agents?

Abstract: 

One question we get a lot as we teach students around the world to build, ship, and share production-grade LLM applications is “Should I use RAG or fine-tuning?”

The answer is yes. You should use RAG AND fine-tuning, especially if you’re aiming at human-level performance in production.

In 2024 you should be thinking about using agents too!

Understanding exactly how and when to use RAG and Supervised Fine-Tuning (a.k.a. SFT, or simply fine-tuning) requires us to consider many nuances!

In this event, we’ll zoom in on prototyping LLM applications and describe how practitioners should think about the patterns of RAG, fine-tuning, and agentic reasoning. We’ll dive into RAG and explore how fine-tuned models and agents are typically leveraged within RAG applications.

Specifically, we will break down Retrieval Augmented Generation into dense vector retrieval plus in-context learning. With this in mind, we’ll articulate in detail the primary forms of fine-tuning you need to know, including task training, constraining the input/output (I/O) schema, and language training. We’ll also demystify the oft-confused terms agent, agent-like, and agentic by describing the simple meta-pattern of reasoning-action and its fundamental roots in if-then thinking.

To tie it all together, we’ll build an end-to-end, domain-adapted RAG application that solves a real use case. All code will be demoed live, including everything necessary to build our RAG application with LangChain v0.1 and to fine-tune an open-source embedding model from Hugging Face!

Session Outline:

Module 1: The Patterns of GenAI
We’ll work through the core patterns introduced in the abstract above: Retrieval Augmented Generation as dense vector retrieval plus in-context learning, the primary forms of fine-tuning (task training, constraining the input/output schema, and language training), and the reasoning-action meta-pattern behind the terms agent, agent-like, and agentic, with its roots in if-then thinking. A conceptual sketch of the RAG decomposition follows below.
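
To make that decomposition concrete, here is a minimal sketch in plain Python (not code from the session; embed and llm are hypothetical stand-ins for any embedding model and any chat model):

import numpy as np

def retrieve(query_vec, doc_vecs, docs, k=3):
    # Dense vector retrieval: rank documents by cosine similarity to the query.
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
    )
    return [docs[i] for i in np.argsort(-sims)[:k]]

def rag_answer(question, embed, llm, doc_vecs, docs):
    # In-context learning: the retrieved context is placed directly in the
    # prompt, so the model picks up the facts it needs at inference time.
    context = "\n\n".join(retrieve(embed(question), doc_vecs, docs))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return llm(prompt)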

Module 2: Building a simple RAG application with LangChain v0.1
Leveraging LangChain Expression Language and LangChain v0.1, we’ll build a simple RAG prototype using OpenAI’s GPT-3.5 Turbo, OpenAI’s text-embedding-3-small, and a FAISS vector store!
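
A hedged sketch of what this prototype might look like (the toy corpus, prompt wording, and question are our assumptions, not the session’s actual materials; requires the langchain, langchain-openai, langchain-community, and faiss-cpu packages plus an OpenAI API key):

from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# Toy corpus; a real application would load and chunk its own documents.
docs = [
    "LCEL composes runnables with the | operator.",
    "FAISS supports efficient similarity search over dense vectors.",
]
vectorstore = FAISS.from_texts(docs, OpenAIEmbeddings(model="text-embedding-3-small"))
retriever = vectorstore.as_retriever()

def format_docs(documents):
    # Flatten the retrieved Documents into a single context string.
    return "\n\n".join(d.page_content for d in documents)

prompt = ChatPromptTemplate.from_template(
    "Answer only from the context.\n\nContext: {context}\n\nQuestion: {question}"
)
chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model="gpt-3.5-turbo")
    | StrOutputParser()
)
print(chain.invoke("What operator does LCEL use to compose runnables?"))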

Module 3: Fine-Tuning an Open-Source Embedding Model
Leveraging quantization via the bitsandbytes library, Low-Rank Adaptation (LoRA) via the Hugging Face PEFT library, and the Massive Text Embedding Benchmark (MTEB) leaderboard, we’ll adapt the embedding space of our off-the-shelf model to a particular domain!
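
The setup might look roughly like this (the base model BAAI/bge-base-en-v1.5, a strong MTEB-leaderboard entry, and all hyperparameters are our assumptions; the contrastive training loop on domain pairs is omitted):

import torch
from transformers import AutoModel, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base embedding model in 4-bit via bitsandbytes.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModel.from_pretrained(
    "BAAI/bge-base-en-v1.5", quantization_config=bnb_config
)

# Attach small trainable LoRA adapters to the attention projections;
# the quantized base weights stay frozen.
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16, lora_alpha=32, target_modules=["query", "value"], lora_dropout=0.05
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a tiny fraction of weights train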

Module 4: Constructing a Domain-Adapted RAG System
In the final module, we’ll assemble our domain-adapted RAG system and discuss where we might leverage agentic reasoning if we were to keep building the system in the future!
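
Assembly could be as simple as swapping the fine-tuned embedder into the Module 2 chain (a sketch under our assumptions; ./fine-tuned-embedder is a hypothetical local path to the Module 3 output, and the sentence-transformers package is required):

from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

# Hypothetical domain corpus; in practice, load your own documents.
domain_docs = ["A placeholder domain-specific document."]

# Point the vector store at the domain-adapted embedding model.
embeddings = HuggingFaceEmbeddings(model_name="./fine-tuned-embedder")
vectorstore = FAISS.from_texts(domain_docs, embeddings)
retriever = vectorstore.as_retriever()
# The LCEL chain from Module 2 can now be reused unchanged with this retriever.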

Background Knowledge:

There is something for everyone in this talk! We ask that attendees have strong foundations in Python (or another programming language), as well as some grasp of the concepts of RAG and fine-tuning, rooted in classic ML.

This session is suitable for Aspiring AI Engineers, Current Senior AI Engineers, and AI Engineering Leaders alike.

Bio: 

Chris Alexiuk is the Head of LLMs at AI Makerspace, where he serves as a programming instructor, curriculum developer, and thought leader for their flagship LLM Ops: LLMs in Production course. During the day, he’s a Founding Machine Learning Engineer at Ox. He is also a solo YouTube creator and Dungeons & Dragons enthusiast, and he is based in Toronto, Canada.
