Fine Tuning Strategies for Language Models and Large Language Models

Abstract: 

Language Models (LMs) and Large Language Models (LLMs) have attracted significant attention in both the public and corporate sectors due to their proficiency in natural language understanding (NLU) and natural language generation (NLG). Recent applications include direct chat services for the general public and natural-language answer generation services for specific business verticals. Despite these capabilities, specialized business settings require more precise control and higher accuracy to satisfy both business use cases and specific organizational requirements.

This presentation explains how LMs and LLMs are fine-tuned, covering the fundamental mechanisms involved as well as the trade-offs that arise in real-world applications.

We start by introducing general machine learning concepts in detail, such as Representation Learning and Transfer Learning, which allow us to define the fine-tuning of Language Models and its computational benefits.
We continue with an overview of actual fine-tuning methods for standard LMs and their different use cases.
Then, we detail the architectures and various implementations of Task Heads used in the fine-tuning process.
The impact of Task Heads is also explained, along with concrete real-world examples.
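
As a rough illustration of the Task Head idea, the toy sketch below keeps a stand-in "pre-trained" encoder frozen and trains only a small binary classification head on top of it. Everything here is an assumption made for illustration and is not taken from the talk: the fixed random token vectors standing in for a language model, the tiny sentiment dataset, and the names `ENCODER`, `encode`, `BinaryHead`, and `mean_loss`. It uses only the Python standard library.

```python
import math
import random

random.seed(0)

EMBED_DIM = 8  # hypothetical size of the frozen encoder's output
VOCAB = ["good", "great", "love", "bad", "awful", "hate", "movie", "film"]

# Frozen "pre-trained" encoder: one fixed vector per token (stand-in for an LM).
ENCODER = {tok: [random.gauss(0, 1) for _ in range(EMBED_DIM)] for tok in VOCAB}


def encode(text):
    """Mean-pool the frozen token vectors; the encoder is never updated."""
    vecs = [ENCODER[t] for t in text.split() if t in ENCODER]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


class BinaryHead:
    """Task head: a single linear layer followed by a sigmoid."""

    def __init__(self, dim):
        self.w = [0.0] * dim
        self.b = 0.0

    def predict_proba(self, x):
        z = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1.0 / (1.0 + math.exp(-z))

    def step(self, x, y, lr):
        # For binary cross-entropy, the gradient w.r.t. the logit is (p - y).
        g = self.predict_proba(x) - y
        self.w = [wi - lr * g * xi for wi, xi in zip(self.w, x)]
        self.b -= lr * g


def mean_loss(head, data):
    """Average binary cross-entropy of the head over a labeled dataset."""
    eps = 1e-12
    total = 0.0
    for text, y in data:
        p = head.predict_proba(encode(text))
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)


# Tiny made-up sentiment dataset: 1 = positive, 0 = negative.
TRAIN = [("good movie", 1), ("great film", 1), ("love movie", 1),
         ("bad movie", 0), ("awful film", 0), ("hate film", 0)]

head = BinaryHead(EMBED_DIM)
loss_before = mean_loss(head, TRAIN)
for _ in range(300):
    for text, y in TRAIN:
        head.step(encode(text), y, lr=0.1)  # only the head's weights change
loss_after = mean_loss(head, TRAIN)

trainable = len(head.w) + 1        # head weights plus bias
frozen = len(VOCAB) * EMBED_DIM    # encoder values left untouched
print(f"loss before: {loss_before:.3f}, loss after: {loss_after:.3f}")
print(f"trainable parameters: {trainable}, frozen parameters: {frozen}")
```

In this sketch the computational benefit comes from updating only the head's handful of parameters while the encoder stays fixed. With a real pre-trained model the same pattern applies, but the frozen part holds millions or billions of parameters rather than a small lookup table.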

Bio: 

Kevin Noel is currently Lead AI/ML at Uzabase Japan/US, developing LM/LLM-based solutions for Speeda Edge business intelligence. He has more than 12 years of experience in Japan across various industries. Previously, he worked in the Ads/Recommendation field on a real-time personalized Ads solution for the Yahoo Japan application.

He also held a principal ML role at the largest Big Data and E-commerce company in Japan, where he worked with large-scale multi-modal data (tabular, time series, Japanese NLP, image) across numerous machine learning projects. He has also delivered various training sessions on Deep Learning and external talks on applied ML (New York, 2019, ...). Prior to this, Kevin, with a background in applied stochastic modeling and data mining from Ecole Centrale, held various quantitative roles at BNP Paribas, Bank of America, and ING in Asia/Japan.

Open Data Science