Measuring Integrity: Emergent Data Science Techniques within the Legal, Regulatory & Compliance Space


How can you quantify someone’s integrity? How can businesses bridge the gap between their stakeholders’ intentions, actions and data?

During this session we will explore how to use data science to positively impact an organization’s legal and regulatory compliance and risk management processes. Leading organizations today are leveraging supervised and unsupervised learning techniques to assess data integrity and security, meet evolving compliance obligations, and mitigate risk. We will provide real-world examples of strategies for integrating multiple, disparate data sources, digestible front-end visualizations and case management tools with machine learning. Data sources include transactional data, ERP systems, HR records, ethics hotline data, exit interviews for terminated employees, investigative outcomes, and email and voice data. Combining these sources is creating a fascinating new frontier with untapped possibilities for data scientists.

We will cover several key questions, including: (a) how do the results of machine learning models contribute to an overall investigative workflow, and (b) how can we fuse and present, in near-real time, the results of a model with high-impact data visualization, text analytics, cross-dataset search and case management? We will share tangible and practical strategies to drive improved organizational culture and a better-functioning business environment. This will include a walk-through of a use case for predictive analytics. The approach combines rules-based testing, natural language processing, unsupervised anomaly detection and supervised learning. High-risk entries are flagged using predictive risk scoring, augmented with unsupervised outlier detection and NLP algorithms to identify rare events. Samples of the flagged entries are shared with teams to review and tag. Finally, after the sampled data is tagged with a specific risk category or labeled as non-responsive, these labels can be used as the seed set for scoring future and past transactions to uncover additional patterns.
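As a rough illustration of the first stages of such a workflow, the sketch below combines a keyword rule with a simple unsupervised outlier score and selects the highest-scoring entries for reviewers. It is not the speaker's implementation: the transactions, the keyword list, the field names, and the 3.5 cutoff are all assumptions made for the example, and a median-based modified z-score stands in for whatever anomaly detection a real deployment would use. The supervised step that learns from reviewer tags is left out.

```python
from statistics import median

# Hypothetical transactions; fields and values are invented for illustration.
TRANSACTIONS = [
    {"id": 1, "amount": 120.0, "memo": "office supplies"},
    {"id": 2, "amount": 135.0, "memo": "team lunch"},
    {"id": 3, "amount": 110.0, "memo": "courier fee"},
    {"id": 4, "amount": 5000.0, "memo": "facilitation payment per request"},
    {"id": 5, "amount": 128.0, "memo": "software licence"},
    {"id": 6, "amount": 140.0, "memo": "gift for official"},
]

# Assumed keyword list for the rules-based test.
RISK_TERMS = ("facilitation", "gift", "cash")


def rule_hits(txn):
    """Rules-based test: count risk keywords that appear in the memo."""
    memo = txn["memo"].lower()
    return sum(term in memo for term in RISK_TERMS)


def robust_z_scores(values):
    """Unsupervised outlier score: modified z-score built on the
    median absolute deviation (MAD), which a single extreme value
    cannot distort the way it distorts a mean and standard deviation."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0 for _ in values]
    return [0.6745 * abs(v - med) / mad for v in values]


def score_transactions(txns, z_cutoff=3.5):
    """Combine rule hits and the outlier flag into one risk score,
    then sort from highest to lowest risk."""
    zs = robust_z_scores([t["amount"] for t in txns])
    scored = []
    for txn, z in zip(txns, zs):
        risk = rule_hits(txn) + (1 if z > z_cutoff else 0)
        scored.append({**txn, "outlier_z": z, "risk": risk})
    return sorted(scored, key=lambda t: t["risk"], reverse=True)


def review_sample(scored, k=2):
    """Pick the top-k non-zero risk entries to send to reviewers."""
    return [t["id"] for t in scored[:k] if t["risk"] > 0]


scored = score_transactions(TRANSACTIONS)
print(review_sample(scored))
```

In a fuller pipeline, the tags reviewers assign to this sample would become training labels for a supervised model that rescores the remaining transactions.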


Jeremy is a senior professional with 13+ years of experience leveraging advanced analytics and data science within the legal, risk and compliance space. He is one of the leaders of EY’s Forensic Data Analytics team. Jeremy assists clients in enhancing organizations’ integrity programs through data-driven approaches. Jeremy has built teams and served clients within the US and globally, with an emphasis on leveraging machine learning to combat fraud, waste and abuse. He focuses on visioning, designing, and implementing automation techniques to manage data integrity, respond to litigation and regulatory inquiries, and manage legal and compliance risks. Jeremy’s clients include multinational conglomerates, financial services, energy, and the public sector. He has assisted clients with complex regulatory matters involving various state and federal agencies. He is also a cofounder of the EY flagship forensic analytics platform, “EY Virtual.” Jeremy is a Certified Fraud Examiner and Certified Anti-Money Laundering Specialist. He is a frequent speaker and guest lecturer on emergent technologies, including advanced text analytics and intelligent automation.

Open Data Science




