Patient Level Prediction with Supervised Learning Models in Federated Data Networks


Global federated data networks with patient level data have emerged in recent years thanks to the adoption of data standards such as the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). The Observational Health Data Sciences and Informatics (OHDSI) community has leveraged these data networks and developed a standardized, open-source patient level prediction framework to provide reliable and reproducible methods to generate medical evidence.

In this tutorial we will review the fundamentals of standardized data sources in federated health data networks and describe the data standards that enable open-source software development. We will demonstrate how to use a suite of patient level prediction tools to develop data-driven prediction models using standardized observational health data. The session will include a review of a broad set of supervised learning models, including regularized logistic regression, random forest, gradient boosting machines, and neural networks. The flexibility of the approach will be highlighted by showing how it enables supervised learning models from Python, R, and Java to be integrated into the framework.

Session Outline:

Section 1: Observational data standards for federated data networks
Gain an understanding of how standardized data models enable open-source development of tools for data characterization and evidence generation, including patient level prediction.

Section 2: The open-source Patient Level Prediction package
Review the open-source Patient Level Prediction package to learn how machine learning can be applied to a global network of federated health data sources to build prediction models with a broad set of supervised learning methods.

Background Knowledge:

An understanding of machine learning is required.


Jenna Reps is a Director at Janssen Research and Development, where she focuses on developing novel solutions to personalize risk prediction. Jenna’s areas of expertise include applying machine learning and data mining techniques to develop solutions for various healthcare problems. She is currently working within the OHDSI patient level prediction workgroup with the aim of developing open-source, user-friendly software for building risk models from data sets in the OMOP Common Data Model format. Prior to joining Janssen Research and Development, Jenna was a Senior Research Fellow at the University of Nottingham, where she developed supervised learning techniques to signal adverse drug reactions using UK primary care data and acted as a data consultant to other researchers within the University. Jenna received her BSc in Mathematics and MSc in Mathematical Biology at the University of Bath and her PhD in Computer Science at the University of Nottingham.

Open Data Science




One Broadway
Cambridge, MA 02142
