Honor Code Violation Detection


Most businesses rely on traditional analytics over structured data for day-to-day business insights, even though they own very rich text data. To leverage text data, companies typically need an in-house machine learning team with natural language processing expertise. This requires a large investment in problem-specific feature engineering and in manual curation of training data by subject matter experts (SMEs). The process becomes too expensive in an agile business environment where problem definitions change frequently.

Chegg has multiple student-centric products: online tutoring, help with answering study questions, ACT/SAT preparation, writing help, and others. Frequently, business questions arise whose answers are hidden in the chats or questions submitted by students.

Many students come to the Chegg Tutors platform and ask tutors to do their graded assignments or quizzes for them, which violates Chegg’s honor code policies. We take text data (questions submitted by students and chats) and apply Snorkel, a dark data extraction tool developed at Stanford, to create an honor code violation detector (HCVD). This process takes inputs from SMEs and business partners and converts them into noisy heuristic rules, which are modeled using generative models to produce high-quality training data. Once training data is available, HCVD detects key phrases (for example, “do my online quiz”) that indicate an honor code violation and indicates the actions the system needs to take, such as issuing warnings, advising tutors, or blocking.
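The idea of turning SME input into noisy heuristic rules can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the rule names, the regular expressions, and the label values are hypothetical, and a simple majority vote stands in for the generative model that the abstract describes. It does not use Snorkel's API and is not Chegg's implementation.

```python
import re
from collections import Counter

# Hypothetical label values; ABSTAIN means a rule has no opinion.
VIOLATION, OK, ABSTAIN = 1, 0, -1


def lf_do_my_graded_work(text):
    """Flag requests to have someone else do graded work."""
    pattern = (r"\b(do|take|complete)\s+my\s+(online\s+)?"
               r"(quiz|exam|test|assignment|homework)\b")
    return VIOLATION if re.search(pattern, text, re.IGNORECASE) else ABSTAIN


def lf_pay_for_answers(text):
    """Flag offers to pay for answers or solutions."""
    pattern = r"\b(pay|paying)\b.*\b(answers?|solutions?)\b"
    return VIOLATION if re.search(pattern, text, re.IGNORECASE) else ABSTAIN


def lf_asks_for_explanation(text):
    """Treat requests for an explanation as legitimate tutoring."""
    pattern = r"\b(explain|understand|how does|why does)\b"
    return OK if re.search(pattern, text, re.IGNORECASE) else ABSTAIN


LABELING_FUNCTIONS = [
    lf_do_my_graded_work,
    lf_pay_for_answers,
    lf_asks_for_explanation,
]


def weak_label(text):
    """Combine the noisy rule votes by majority; abstain on ties or no votes.

    Majority voting is a simplification: a generative label model would
    instead estimate each rule's accuracy and weight its vote accordingly.
    """
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    counts = Counter(v for v in votes if v != ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]
```

Under these assumed rules, `weak_label("Can you do my online quiz for me?")` yields `VIOLATION`, while a message that triggers conflicting rules is left unlabeled rather than guessed. The resulting labels could then serve as training data for a downstream classifier.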


Sanghamitra Deb is a Senior Data Scientist at Chegg Inc. At Chegg, she works on a wide range of projects, including developing a recommendation system for Chegg online tutoring and detecting student and tutor intents using natural language processing, and she is heavily involved with A/B testing machine learning models. In the past, she worked at Accenture Tech Labs developing algorithmic solutions to business problems. Prior to becoming a data scientist, she did her PhD in astrophysics and studied the formation and evolution of the universe by analyzing gravitational lensing by galaxy clusters.
