Editor’s note: Danushka Bollegala is a speaker for ODSC Europe 2022. Be sure to check out his talk, Social Biases in Text Representations and their Mitigation, there!

How would you feel if the final decision on your job application was made by a natural language processing (NLP)-based system, without the intervention of any humans? 

How would you feel if your job application got rejected and the feedback you got from the system was based on your gender, race, or religious beliefs?

Perhaps you would never trust an NLP system again?


Why are sophisticated, automatically trained, state-of-the-art NLP models so biased?

We train NLP models on ever-increasing datasets, often collected from the Internet, that contain many toxic and biased viewpoints. For example, the GPT-3 model released by OpenAI was trained on 570 GB of text (ca. 400 billion tokens) crawled from the Internet. It is not surprising, then, that some of the unfair, discriminatory biases in the training data creep into the ML models, and at times are even amplified by the training algorithms we use to learn them. Indeed, we have developed increasingly efficient and accurate learning algorithms that can pick up even the slightest signals in the data, be they desirable or harmful.

The biases learned by NLP systems become most apparent in applications that interact with billions of users across the world, such as chatbots, image retrieval systems, surveillance systems, and machine translation systems.

Eating and Having the Cake

However, we cannot simply decide to stop using NLP systems because they can be toxic, as we also want to enjoy the benefits of the applications built on such models. We have come to depend on NLP systems in our day-to-day lives to such an extent that we can hardly imagine life without them. Therefore, we must find ways to remove the social biases that NLP systems have already learned or, better still, prevent them from learning such biases in the first place.

The NLP research community is aware of this pressing issue and is actively engaged in developing solutions. At the University of Liverpool’s NLP Group, we have made several significant efforts to address this problem. 

The first step toward addressing this problem is to develop metrics that objectively evaluate the types of biases learned by textual representations. For this purpose, we proposed a method to evaluate the social biases in masked language models in a paper presented at the AAAI conference earlier this year. We extended social bias evaluation to languages other than English in a paper to be presented at NAACL in June 2022. We also showed that social biases are present not only in word representations but also in sense representations, in an ACL-2022 paper.
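To make the idea of a bias evaluation metric concrete, here is a minimal sketch in the spirit of the classic WEAT-style association test — not the metric from any of the papers mentioned above. A word is scored by the difference between its mean cosine similarity to one attribute set (e.g., male terms) and another (e.g., female terms). The tiny 2-d vectors below are made up purely for illustration; a real evaluation would use learned embeddings or language-model scores.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, attr_a, attr_b):
    # Mean similarity to attribute set A minus mean similarity to set B.
    # Positive values mean w leans toward A; negative, toward B.
    return (np.mean([cosine(w, a) for a in attr_a])
            - np.mean([cosine(w, b) for b in attr_b]))

# Hand-crafted toy "embeddings", purely for illustration.
emb = {
    "nurse":    np.array([0.2, 0.9]),
    "engineer": np.array([0.9, 0.2]),
    "he":       np.array([1.0, 0.0]),
    "she":      np.array([0.0, 1.0]),
}

male, female = [emb["he"]], [emb["she"]]
print(association(emb["nurse"], male, female))     # negative: leans "female"
print(association(emb["engineer"], male, female))  # positive: leans "male"
```

In these toy vectors, "nurse" scores negative (stereotypically female-leaning) and "engineer" positive, which is exactly the kind of unwanted association a bias metric is designed to surface.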

Once we can reliably detect social biases in NLP systems, we can apply various techniques to remove the biases we identify. For this purpose, we have developed methods for mitigating the social biases in text representations, for example, by using dictionaries, by fine-tuning on training data, or by using example words. However, completely removing all types of social biases from NLP systems, while retaining their accuracy, is a challenging task. Until we reach that goal, we must be aware that AI systems will learn and reflect unfair biases, and we must be prepared to face the consequences.
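As one concrete illustration of mitigation, here is a sketch of the classic projection-based debiasing idea (not the dictionary- or fine-tuning-based methods mentioned above): remove the component of a word vector that lies along an estimated bias direction. The vectors and the single "he minus she" direction are toy values for illustration only.

```python
import numpy as np

def debias(w, bias_dir):
    # Project w onto the unit bias direction and subtract that component,
    # leaving w orthogonal to the bias direction.
    b = bias_dir / np.linalg.norm(bias_dir)
    return w - np.dot(w, b) * b

# Toy vectors; a real system would use learned embeddings.
he, she = np.array([1.0, 0.1]), np.array([0.1, 1.0])
nurse = np.array([0.2, 0.9])

gender_dir = he - she            # crude one-pair estimate of a gender direction
nurse_db = debias(nurse, gender_dir)

# After debiasing, "nurse" carries no component along the gender direction.
print(np.dot(nurse_db, gender_dir))  # ~0.0 (up to floating point)
```

One known limitation of this approach, which motivates the more careful methods above: zeroing out a single direction can hide bias from this particular test while residual bias remains recoverable from the rest of the space.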

About the Author/ODSC Europe 2022 Speaker on NLP and Social Bias:

Danushka Bollegala is a Professor in the Department of Computer Science at the University of Liverpool, UK. He obtained his PhD from the University of Tokyo in 2009 and worked there as an Assistant Professor before moving to the UK. He has worked on various problems related to natural language processing and machine learning. He has received numerous awards for his research excellence, such as the IEEE Young Author Award and best paper awards at GECCO and PRICAI. His research has been supported by various research-council and industrial grants, including from the EU, DSTL, Innovate UK, JSPS, Google, and MSRA. He is an Amazon Scholar.