Workflow Design for Natural Language Annotation


Advances in conversational AI depend on increasingly sophisticated models to handle more complex and context-dependent linguistic structures. To support these advances, high-quality and high-volume labeled ground truth data remains as essential as ever in training and validating dialogue models. Developing a successful pipeline for data annotation depends on an effective partnership with an annotation team. This talk draws on iMerit's experiences as a natural language annotation partner, laying out our approach to five key processes in the annotation pipeline: (i) exchange of expertise, (ii) annotator training, (iii) workflow design, (iv) feedback cycle, and (v) quality evaluation. We discuss how to set priorities, question assumptions, align requirements with priorities, and evaluate outcomes within each of these processes, in order to build a robust annotation pipeline at scale.


Dr. Teresa O’Neill is a Solutions Architect at iMerit specializing in language annotation services. Before joining iMerit, she worked for a decade in academia as an educator and researcher. At iMerit, she leverages her experience as a linguist with both theoretical and applied specializations to build custom human-in-the-loop annotation pipelines for customers with NLP/NLU use cases.

Open Data Science




