Overcoming the Limitations of LLM Safety Parameters with Human Testing and Monitoring


Ensuring safety, fairness, and responsibility has become a critical challenge as Large Language Models (LLMs) evolve rapidly. This talk presents a new approach to these concerns: human testing and monitoring by a diverse global population. We describe a comprehensive strategy that combines crowd-sourced and professional testers from a range of countries, cultures, and life experiences. These testers scrutinize the input and output spaces of both LLMs and LLM applications to support responsible and safe product delivery.

The presentation centers on functional performance, usability, accessibility, and bug testing. We share our research into these approaches, with recommendations for building test plans, running adversarial tests, and covering real-world usage scenarios. This diverse, global, human-based testing approach directly addresses the issues raised in recent papers that highlight the limited effectiveness of RLHF-created safety parameters against fine-tuning and prompt injection. Experts are calling for LLMs with safety parameters injected at the base parameter level, but to date this has resulted in a significant drop in LLM efficacy, and building safety directly into the pre-trained model is prohibitively expensive. Our approach overcomes these technical and financial limitations and can be applied now. Results point to a paradigm shift in LLM safety practices, yielding models and applications that remain helpful and harmless throughout their lifecycle.


Josh Poduska is an AI Leader, Strategist, and Advisor with over 20 years of experience. Presently, he is a Client Partner and AI Strategist at Applause App Quality. Previously, he held the position of Chief Field Data Scientist at Domino Data Lab. Josh has managed top analytical teams and led data science strategy at multiple companies. His primary research interest is in AI and ML validation, observability, and risk management. He graduated from UC Irvine with a Bachelor’s degree in Mathematics and earned his Master’s degree in Applied Statistics from Cornell University.

Open Data Science
One Broadway
Cambridge, MA 02142
