Abstract: Advances in Natural Language Processing (NLP) over the last year have fundamentally changed the way we work with text-based data. We have seen the release of Google’s BERT and OpenAI’s GPT-2, and Hugging Face’s Transformers library has democratized Natural Language Understanding (NLU) and Natural Language Generation (NLG). While these libraries provide access to powerful capabilities, many users find it challenging to get started and apply them to their own datasets.
To address this challenge, Novetta has developed an intuitive guide, structured as an NLP task framework, that lowers the barrier to entry for developers who want to use these advanced capabilities. This high-level guide shows users how to fine-tune open pre-trained language models for text classification, question answering, entity extraction, and part-of-speech tagging. The sequence of tutorials also provides quick and easy access to a wide variety of embedding schemes for downstream use, such as in recommendation systems. Each of these tasks can then be stood up as a service for integration into existing workflows and applications.
In this hands-on workshop on Flair and Transformers, we will:
- Provide an overview of recent advances in NLP
- Install Flair and Transformers
- Demonstrate our approach to setting up and building these advanced NLP capabilities
- Train a simple text classifier
- Train a simple Question Answering (QA) model
- Combine QA and Named Entity Recognition (NER) to develop a custom application that automatically builds visual timelines from documents by asking predefined questions about the dates, places, and people found in each document
- Leverage Kubernetes to deploy the application as a scalable service in the cloud
Bio: Brian Sacash is a Machine Learning Engineer in Novetta's Machine Learning Center of Excellence. He helps various organizations discover the best ways to extract value from data. His interests are in the areas of Natural Language Processing, Machine Learning, Big Data, and Statistical Methods. Brian holds a Master of Science in Quantitative Analysis from the University of Cincinnati and a Bachelor of Science in Physics from Ohio Northern University.