Spark NLP for Healthcare: Modular Approach to Solve Problems at Scale in Healthcare NLP


Despite ongoing efforts to apply natural language processing (NLP) to information extraction from electronic health records (EHRs), current solutions require healthcare AI practitioners to make unacceptable trade-offs between delivering state-of-the-art accuracy, generalizing to unseen data points, and preventing the sharing of personal data or intellectual property.

Spark NLP for Healthcare aims to bridge this gap by providing an accurate, scalable, private, and tunable software library that helps healthcare & pharma organizations build longitudinal patient records and knowledge graphs on real-world EHR data.

Spark NLP is a Natural Language Processing (NLP) library built on top of Apache Spark ML. It provides simple, performant, and accurate NLP annotations for machine learning pipelines that scale easily in a distributed environment. Spark NLP comes with 3000+ pretrained pipelines and models in 200+ languages. It supports nearly all NLP tasks, with modules that can be used seamlessly in a cluster. Downloaded more than 1 million times every month and having grown 20x over the past year, Spark NLP is used by 54% of healthcare organisations and is the world’s most widely used NLP library in the enterprise.

In this talk, Veysel will lead a hands-on session covering the library’s healthcare components and teach how to solve healthcare NLP problems with state-of-the-art methods and practices from across the industry. He will also explain best practices for building production-grade solutions around the latest research.

Background Knowledge
Basic understanding of NLP and healthcare data.


Veysel is a well-known thought leader in healthcare NLP and works as a Lead Data Scientist and ML Engineer at John Snow Labs, improving the Spark NLP for Healthcare library and delivering hands-on projects in healthcare and life sciences.
He is a seasoned data scientist with over ten years of experience and a strong background in every aspect of data science, including NLP, machine learning, deep learning, and big data. He is also pursuing his Ph.D. in ML at Leiden University, Netherlands, and delivers graduate-level lectures in AutoML and Distributed Data Processing.
Veysel has provided consulting in Statistics, Data Science, Software Architecture, MLOps, Machine Learning, and AI to several start-ups, boot camps, and companies around the globe. He also speaks at Data Science & AI events, conferences, and workshops, and has delivered more than a hundred talks at international and national conferences and meetups.

Open Data Science
One Broadway
Cambridge, MA 02142
