Data Harmonization for Generalizable Deep Learning Models: from Theory to Hands-on Tutorial


Integrating data from multiple sources, with and without labels, is a fundamental problem in transfer learning, where models must be trained on a source data distribution that differs from one or more target data distributions. For example, in healthcare, models must interoperate flexibly on large-scale medical data gathered across multiple hospitals, each with its own confounding biases. Domain adaptation enables this form of transfer learning by identifying deep feature representations that are invariant across domains (data sources), so that a model trained on one distribution can transfer to unseen data distributions.
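To make "invariant across domains" concrete, the sketch below measures how far apart two domains sit in feature space using the squared distance between their feature means, one of the simplest discrepancy measures. It is an illustrative assumption, not the objective used by scAlign; the toy data, the hospital framing, and the function names are hypothetical. In a deep model, a term like this would typically be minimized with respect to the encoder rather than removed by centering.

```python
from statistics import fmean


def feature_means(features):
    """Return the per-dimension mean of a list of feature vectors."""
    return [fmean(column) for column in zip(*features)]


def mean_discrepancy(source, target):
    """Squared Euclidean distance between source and target feature means."""
    mu_s, mu_t = feature_means(source), feature_means(target)
    return sum((s - t) ** 2 for s, t in zip(mu_s, mu_t))


def center(features):
    """Subtract the domain's own mean from every feature vector."""
    mu = feature_means(features)
    return [[x - m for x, m in zip(row, mu)] for row in features]


# Toy 2-D features from two hypothetical hospitals; the second carries a
# constant offset that stands in for a site-specific bias.
source = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]]
target = [[5.0, 6.0], [6.0, 5.0], [7.0, 7.0]]

before = mean_discrepancy(source, target)
after = mean_discrepancy(center(source), center(target))

print(f"discrepancy before alignment: {before}")
print(f"discrepancy after per-domain centering: {after}")
```

Per-domain centering removes only a constant shift; learned representations are meant to handle richer, nonlinear differences between sources.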

In this workshop, we will teach attendees how to use domain adaptation for machine learning applications in computer vision for healthcare. More specifically, we will introduce scAlign, our recently developed domain adaptation approach that can integrate data from multiple sources in a fully unsupervised, semi-supervised, or fully supervised fashion. Participants will use healthcare datasets to explore how to tune model accuracy and generalizability. Finally, we will explore how domain adaptation can be extended to perform a novel form of differential feature extraction to identify differences between features of data (e.g. images) from different

Learning objectives:
- Introduction to Domain Adaptation.
  - 15-20 mins
  - Theory/motivation (Healthcare)
  - Go over applications in computer vision
  - Introduction to scAlign

- Overview of TensorFlow/Colaboratory. (Interactive)
  - 25 mins
  - Structure of workshop
  - Structure of a TensorFlow model: basics
  - Running an ML model in Colaboratory (example)

- Domain adaptation on melanoma data. (Hands-on)
  - 10 mins
  - Explain the problem: align data from different hospitals/patients/diseases
  - Go over what the model should produce; show visualization.
  - 20 mins
  - Work through alignment of data
  - Explore changes to loss function weights.
  - Hyperparameters for DA/Classifier
  - Expand model structure

- Differential feature analysis with domain adaptation. (Hands-on)
  - 15 mins
  - Explain model structure and how to perform differential feature analysis.
  - Explore how features change with respect to disease.
  - Decode into both healthy and disease states
  - Compute difference
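
The decode-and-compare steps in the last hands-on session can be sketched as follows. This is a minimal illustration under stated assumptions, not the workshop's model: it assumes a shared latent embedding `z` and one decoder per condition, and it uses small linear decoders with made-up weights (`HEALTHY`, `DISEASE`, and `linear_decode` are hypothetical names). The same embedding is decoded into both states and the per-feature difference is reported.

```python
def linear_decode(z, weights, bias):
    """Map a latent vector z to feature space with a linear decoder."""
    return [
        sum(w * zi for w, zi in zip(row, z)) + b
        for row, b in zip(weights, bias)
    ]


# Hypothetical decoders, one per condition. All values are invented for
# illustration; in practice these would be trained neural networks.
HEALTHY = {"weights": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
           "bias": [0.0, 0.0, 0.0]}
DISEASE = {"weights": [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]],
           "bias": [0.0, 0.5, 0.0]}


def differential_features(z):
    """Decode one embedding into both states and return disease minus healthy."""
    healthy = linear_decode(z, **HEALTHY)
    disease = linear_decode(z, **DISEASE)
    return [d - h for d, h in zip(disease, healthy)]


z = [2.0, 3.0]  # a single aligned latent embedding (assumed)
delta = differential_features(z)
print(delta)  # only the second feature differs between the two states
```

Features with a difference near zero behave the same in both states for this embedding, while large values flag features that change with the condition.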


Gerald Quon is an Assistant Professor in the Department of Molecular and Cellular Biology at the University of California at Davis. He obtained his Ph.D. in Computer Science from the University of Toronto, M.Sc. in Biochemistry from the University of Toronto, and B. Math in Computer Science from the University of Waterloo. He also completed postdoctoral research training at MIT. His lab focuses on applications of machine learning to human genetics, genomics and health, and is funded by the National Science Foundation, National Institutes of Health, the Chan Zuckerberg Initiative, and the American Cancer Society.
