Analyzing Legislative Burden Upon Businesses Using NLP and ML


As legislation develops over time, the burden upon businesses can change drastically. Data scientists from Bardess collaborated with a research team within the Government of Ontario to investigate the use of advanced natural language processing (NLP) and machine learning (ML) techniques for analyzing legal documents, including statutes and regulations. Using the Accessibility for Ontarians with Disabilities Act (AODA) as a starting point, we developed a multi-stage analysis. At the first level, the goal was to automatically detect the parts of the legislation that indicate legislative burden and to categorize them as falling primarily on businesses or on government departments. The second level aims at understanding patterns of similarity and difference between classes of burden using data mining and clustering techniques. Finally, the analysis is expanded to other legislative texts, using ML algorithms to detect burdens that are duplicated across multiple statutes and acts. This last stage supports the Government of Ontario in developing leaner legislation more efficiently. Overall, this work shows how NLP and ML techniques can be brought to bear on complex legislative problems, and underlines the growing utility of these techniques in government and industry.
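To make the first stage concrete, here is a minimal rule-based sketch of flagging sentences that may carry a burden and guessing who bears it. It is not the method used in the project, which the abstract does not describe in detail. The cue list, the keyword lists, the function name `classify_sentence`, and the example sentences are all assumptions invented for illustration; the sentences are not quotations from the AODA or any other act.

```python
import re

# Hypothetical cue and keyword lists, chosen for illustration only.
OBLIGATION_CUES = re.compile(
    r"\b(shall|must|is required to|are required to)\b", re.IGNORECASE
)
BUSINESS_TERMS = ("organization", "employer", "business", "provider")
GOVERNMENT_TERMS = ("minister", "ministry", "director", "government")


def classify_sentence(sentence):
    """Label a sentence by who appears to bear an obligation.

    Returns 'business', 'government', or 'unassigned' when an obligation
    cue is present, and None when no cue is found.
    """
    match = OBLIGATION_CUES.search(sentence)
    if match is None:
        return None
    # Inspect only the text before the cue, where the obligated party
    # usually appears in this style of drafting (an assumption).
    subject = sentence[:match.start()].lower()
    if any(term in subject for term in BUSINESS_TERMS):
        return "business"
    if any(term in subject for term in GOVERNMENT_TERMS):
        return "government"
    return "unassigned"


# Invented sentences written in a legislative style.
examples = [
    "Every employer shall provide accessibility training to its staff.",
    "The Minister shall publish an annual report on compliance.",
    "This Part applies to buildings constructed after the effective date.",
]
labels = [classify_sentence(s) for s in examples]
```

A keyword baseline like this is mainly useful as a starting point for building labeled data; a trained classifier over linguistic features would likely be needed to handle real statutory language.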

In this hands-on workshop, we'll first describe the legislative and business context for the initiative, then walk attendees through the technical implementation. The analysis combines techniques from the NLP toolbox, such as entity recognition, part-of-speech tagging, automatic summarization, and topic modeling. Work will be conducted in Python, using the NLP libraries spaCy and NLTK and the ML library scikit-learn. We will also showcase interactive dashboards, built with the BI tool Qlik, that allow exploration of the results of the analysis.
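As one possible illustration of how scikit-learn could support the duplicate-detection stage, the sketch below compares provisions using TF-IDF vectors and cosine similarity. This is an assumed approach, not necessarily the one presented in the workshop. The provision labels and texts, the helper name `find_near_duplicates`, and the 0.5 threshold are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Invented provisions with hypothetical labels; not taken from real acts.
provisions = {
    "Act A, s. 1": "Every employer shall provide accessibility training "
                   "to all employees.",
    "Act B, s. 7": "Every employer shall provide training on accessibility "
                   "to all employees and volunteers.",
    "Act C, s. 3": "The Minister shall publish an annual report on "
                   "licensing fees.",
}


def find_near_duplicates(texts, threshold=0.5):
    """Return (label_a, label_b, similarity) for pairs at or above threshold."""
    keys = list(texts)
    documents = [texts[k] for k in keys]
    # Weight terms by TF-IDF so that wording shared by every provision
    # contributes less than distinctive vocabulary.
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(documents)
    sims = cosine_similarity(tfidf)
    pairs = []
    for i in range(len(keys)):
        for j in range(i + 1, len(keys)):
            if sims[i, j] >= threshold:
                pairs.append((keys[i], keys[j], float(sims[i, j])))
    return pairs


duplicates = find_near_duplicates(provisions)
```

The threshold is arbitrary here; in practice it would need tuning against provisions that reviewers have already judged to be duplicates.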


Serena Peruzzo is a senior data scientist at the analytics consultancy Bardess. Her formal background is in statistics, and she has worked in both industry and academia. She has worked as a consultant in the Australian, British, and Canadian markets, delivering data science solutions across a broad range of industries, and has led several startups through the process of bootstrapping their data science capabilities.

Open Data Science




