Abstract: "In the past year, Generative AI and Large Language Models (LLMs) have disrupted the machine learning landscape. Using prompts, you can build a prototype ML solution in just a few days. The real question is: are these prototypes ready for production? In an ever-changing world of models powering third-party APIs, and given the uncharted territory of evaluations, is it even prudent to expose customers to LLM-powered applications?
Clearly, the popularity of ChatGPT shows that people are relying on LLMs not just for information but also for composing content such as emails and slides, generating code, and reading papers, to name a few uses. People have even started looking to AI for companionship.
This is the age of Generative AI: innovation is moving faster than development can keep up. There are no rules, yet we have to build applications that are high quality, robust to change, and provide the best possible customer experience.
In this presentation I will talk about building ML applications with Generative AI. There are a few approaches: using external APIs, fine-tuning your own model on domain-specific data, and using Retrieval Augmented Generation (RAG) to provide context to the LLMs. Most applications combine some or all of these methods. A framework for both human and automated evaluation helps balance high-quality results against performance and cost. Building scalable applications with streaming functionality presents its own challenges.
Fast iteration is extremely important to keep up with the pace of innovation; at the same time, establishing best practices that ensure accountability and reproducibility is important for experimentation and for creating an optimal customer experience. These include prompt versioning, model versioning, and monitoring models as they go into production.
Another important question when building an LLM- or Generative AI-assisted application is whether it makes sense to build smaller ML models for classification, summarization, or NER to reduce the load on generative models, so that the application can scale to larger traffic at lower latency and cost. Is the tradeoff longer development cycles, or is it possible to build these models faster using LLM-assisted training data? I will address how to answer these questions as they come up in the lifecycle of a Generative AI driven application."
Bio: Sanghamitra Deb is Engineering Manager, Generative AI and ML, at Chegg Inc. She works on improving student learning with LLMs. Prior to becoming a manager, she was a Lead Staff Data Scientist for several years. Her work involves building conversational interfaces using LLMs, recommendation systems, computer vision, graph modeling, NLP, architectural decision-making, data pipelines, and machine learning. Previously, Sanghamitra was a data scientist at Accenture, where she worked on a wide variety of problems related to data modeling, architecture, and visual storytelling. She is an avid fan of Python and has been programming for more than a decade.
Trained as an astrophysicist (she holds a PhD in physics), she applies her analytical mind not only to a range of domains such as education, healthcare, and recruitment, but also to her leadership style. She mentors junior data scientists at her current organization and coaches students from various fields transitioning into data science. Sanghamitra enjoys addressing technical and non-technical audiences at conferences and encourages women to pursue tech careers. She is passionate about diversity and has organized Women in Data Science meetups.