Language Complexity and Volatility in Financial Markets: Using NLP to Further our Understanding of Information Processing


Quarterly corporate earnings reports by publicly traded firms provide an excellent opportunity to investigate the impact of language in financial markets. These regularly scheduled events give top management an opportunity to share with market participants factual quantitative information about the firm’s performance during the past quarter. Ahead of each earnings season, numerous financial analysts produce their own forecasts and estimates of the quantitative disclosures that publicly traded firms will report.

The structure of this reporting process involves two events that produce textual and audio output containing information beyond quantitative data. First, the firm releases an earnings announcement document that contains the realized factual quantitative information (e.g. EPS, revenue growth) together with a short textual narrative, usually prior to the opening of the stock market. Second, an earnings conference call follows the earnings release, either prior to market open or during trading hours, and gives management an opportunity to present a detailed discussion of the realized quarterly results and provide further guidance for the following quarter. A select number of financial analysts and investors are able to participate and ask questions during these conference calls, while other market participants are able to listen in real time. Transcripts of these earnings calls are available to market participants, in some cases in real time and generally within a short time following the completion of the call. Natural Language Processing (NLP) of the rich textual information generated by corporate earnings calls provides an excellent opportunity to isolate the impact of language on financial markets beyond that of the difference between estimated and reported quantitative information.

Our research focuses on the language complexity of earnings calls and its impact on volatility. The hypothesis is that the more complex the language used during an earnings call, the more difficult it will be for market participants to process the additional information conveyed. Management’s choice of words and use of language therefore affect participants’ ability to determine the value of the firm, which in turn leads to higher stock price volatility following the earnings call. With our research design, use of a novel measure of idiosyncratic volatility, and access to a unique dataset produced by using NLP, machine learning (ML) and linguistic text processing (LTP) techniques, we are finding convincing support for our hypothesis.
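
To make the idea of "language complexity" concrete, here is a minimal sketch of one widely used readability proxy, the Gunning Fog index, applied to a snippet of transcript text. The abstract does not state which complexity measure the research uses, so this is an illustrative stand-in and not the authors' method. The function names `count_syllables` and `fog_index` are hypothetical, and the syllable counter is a crude vowel-group heuristic.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: number of consecutive-vowel groups (assumption, not exact)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fog_index(text: str) -> float:
    """Gunning Fog index: 0.4 * (words per sentence + percent of words with 3+ syllables)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0
    complex_words = [w for w in words if count_syllables(w) >= 3]
    avg_sentence_len = len(words) / len(sentences)
    pct_complex = 100.0 * len(complex_words) / len(words)
    return 0.4 * (avg_sentence_len + pct_complex)


# Two made-up snippets (not real transcript data) conveying similar news.
plain = "We sold more. Profit rose."
dense = "Macroeconomic headwinds notwithstanding, profitability improved sequentially."

print(fog_index(plain) < fog_index(dense))
```

Under the hypothesis above, a call whose transcript scores higher on a measure of this kind would be expected to be followed by higher volatility, other things equal. A real study would need a more careful tokenizer and syllable count than this sketch.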

In this talk, Dr. Karagozoglu will present details of our methodology and dataset, and share our findings with the audience. Discussion of further applications of NLP, ML, and data science in financial markets, especially in volatility estimation and trading, as well as risk management will wrap up the talk.


Dr. Ahmet Karagozoglu is the C.V. Starr Distinguished Professor in Finance & Investment Banking at Hofstra University and a visiting scholar at the Volatility and Risk Institute at New York University. He has also been the founding academic director of the Martin B. Greenberg Trading Room at Hofstra’s Zarb School of Business since 2005. He was a visiting research professor at NYU’s Stern School of Business in 2019.
Dr. Karagozoglu’s primary research interests are in the areas of financial derivatives, risk management and market microstructure. His recently published articles focus on idiosyncratic volatility and news; short-sale constraints and information asymmetry; volatility and social media sentiment; credit risk and CDS markets; and stress testing and model validation. Currently, with Dr. Nazli S. Alan and Dr. Robert F. Engle, he is using NLP to investigate the language complexity of earnings call transcripts and volatility, as well as the impact of the COVID-19 pandemic on volatility in global equity markets.

Open Data Science




One Broadway
Cambridge, MA 02142
