Louis Castricato

Research Scientist at EleutherAI

    Louis Castricato is a research scientist at EleutherAI, where he works on large-scale RLAIF infrastructure. He was previously team lead at CarperAI and Head of LLMs/Research Director at Stability AI, where he worked on libraries such as trlX and on various RLHF projects. He is also a PhD student at Brown University, advised by Ellie Pavlick.

    All Sessions by Louis Castricato

    Day 1 04/23/2024
    4:35 pm - 5:35 pm

    Pink Elephants and Direct Principle Feedback

    LLMs

    Tutorial: This tutorial presents Direct Principle Feedback (DPF), a novel approach for fine-tuning large language models (LLMs) to obey new behavioral constraints dynamically at inference time. DPF addresses the Pink Elephant Problem: getting a model to avoid discussing a specified unwanted topic (the "Pink Elephant") while focusing on a desired one (the "Grey Elephant"). By applying DPF with high-quality synthetic data, we teach models to navigate complex content guidelines across multiple contexts, offering a significant advancement over traditional reinforcement learning methods for LLM control. The session targets professionals in fields that require dynamic content control, such as edtech and social media. It covers how to generate synthetic preference data, the mechanics of DPF, and how to apply DPF to make LLMs more controllable. Participants will learn to deploy LLMs that adapt to specific content guidelines, keeping output relevant and compliant across diverse deployment scenarios. Attendees will also see how DPF applies beyond the Pink Elephant Problem to broader challenges in LLM behavior control, a step toward adaptable, context-aware AI systems.

    Open Data Science
    One Broadway
    Cambridge, MA 02142
