
Abstract: Most people who call themselves prompt engineers are really just blind prompting: chatting with ChatGPT and manually reviewing the results through crude trial and error. If you're planning to use a prompt at scale – as part of a template, workflow, or product – it's essential you run that prompt 20-30 times via the GPT-4 API to see how often it fails. You also need to be rigorously A/B testing your prompt against alternative approaches to find what really makes a difference to results. Langchain is a development framework for Large Language Models (LLMs) like GPT-4, and it can help you build a consistent system for running, monitoring, and measuring the performance of your prompts, so you can optimize them against your success metric.
Session Outline:
1. Metrics – Establish how you'll measure the quality of the responses from GPT-4. At the end of this lesson you'll be aware of the benefits and limitations of the various evaluation methods available in Langchain.
2. Hypothesis – Design two or more prompts that may work, based on the latest research. You'll learn how to take AI theory and apply it practically to come up with test ideas.
3. Samples – Generate responses for your prompts against multiple test cases. Establish the best practices for working with GPT-4 as well as open-source alternatives using Langchain.
4. Results – Evaluate the performance of your prompts and use the results to inform the next test. See just how important it is to stress-test prompts at scale to guarantee performance.
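The four steps above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the session's actual code: `failure_rate`, `compare_prompts`, `is_valid_label`, and `make_fake_llm` are hypothetical names, the prompts and test cases are made up, and the fake model stands in for a real GPT-4 call (for example one made through Langchain) so that the sketch runs without an API key.

```python
import random
from typing import Callable, Dict, List


def failure_rate(
    prompt_template: str,
    test_cases: List[str],
    generate: Callable[[str], str],
    passes: Callable[[str], bool],
    runs_per_case: int = 20,
) -> float:
    """Run one prompt against every test case several times and return
    the fraction of responses that fail the `passes` check (the metric)."""
    failures = 0
    total = 0
    for case in test_cases:
        for _ in range(runs_per_case):
            response = generate(prompt_template.format(input=case))
            total += 1
            if not passes(response):
                failures += 1
    return failures / total if total else 0.0


def compare_prompts(
    prompts: Dict[str, str],
    test_cases: List[str],
    generate: Callable[[str], str],
    passes: Callable[[str], bool],
    runs_per_case: int = 20,
) -> Dict[str, float]:
    """A/B test: compute the failure rate of each named prompt variant."""
    return {
        name: failure_rate(template, test_cases, generate, passes, runs_per_case)
        for name, template in prompts.items()
    }


def is_valid_label(response: str) -> bool:
    """Assumed success metric: the response is exactly one allowed label."""
    return response.strip().lower() in {"positive", "negative"}


def make_fake_llm(seed: int = 0) -> Callable[[str], str]:
    """Stand-in for a real model call. The failure probabilities are
    invented for illustration and say nothing about GPT-4 itself."""
    rng = random.Random(seed)

    def fake_llm(prompt: str) -> str:
        fail_probability = 0.1 if "one word" in prompt else 0.4
        if rng.random() < fail_probability:
            return "It depends on how you look at it."
        return "positive"

    return fake_llm


if __name__ == "__main__":
    prompts = {
        "A": "Classify the sentiment of this review: {input}",
        "B": "Classify the sentiment of this review. "
             "Answer with one word, positive or negative: {input}",
    }
    cases = ["Great product!", "Arrived broken.", "Does the job."]
    rates = compare_prompts(prompts, cases, make_fake_llm(seed=0), is_valid_label)
    for name, rate in rates.items():
        print(f"Prompt {name}: {rate:.0%} of responses failed")
```

To test against a real model, `generate` would be replaced by a function that sends the prompt to the model and returns the text of the reply; the rest of the harness stays the same, which is also what makes it straightforward to swap in an open-source model and compare.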
Learning objectives: We'll use the open-source Langchain framework to work with OpenAI's GPT-4 model (paid). We'll also learn how Langchain makes it easy to swap GPT-4 for an open-source model and rate the relative performance.
Background Knowledge:
Python experience and access to an OpenAI account are needed. Familiarity with Langchain is a plus, but not a requirement.
Bio: Mike is a data-driven, technical marketer who built a 50-person marketing agency (Ladder) and whose online courses (LinkedIn, Udemy, Vexpower) have been taken by 300k people. He now works freelance on generative AI projects and is writing a book on Prompt Engineering for O’Reilly Media.