Data Efficiency Through Transfer Learning

Abstract: In recent years, supervised machine learning models have demonstrated tremendous success in a variety of application domains. Despite these promising results, such models are data hungry, and their performance relies heavily on the size of the training data. In many real-world applications, it is difficult to collect sufficiently large training datasets, so teams either run under-performing models or delay deployment until a critical mass of data is collected. Transfer learning helps overcome these issues by transferring knowledge from readily available datasets to a new target task. In this talk, you’ll learn how to apply state-of-the-art academic research in transfer learning to real-world situations to solve various business problems, including the cold-start problem. The first method is a hybrid instance-based transfer learning approach that outperforms a set of baselines, including state-of-the-art instance-based transfer learning approaches. It uses a probabilistic weighting strategy to fuse information from the source domain into the model learned in the target domain. This method is generic, applicable to multiple source domains, and robust to negative transfer. The second method is a framework for building differentially private aggregation approaches that enable transferring knowledge from existing models trained on other companies’ datasets to a new company with limited or no labeled data. Applying these methods in your organization can lead to increased customer trust and increased revenue for both you and your customers.

Bio: Eddie Du is an Applied Research Scientist on the Georgian Partners Impact Team. He works on engagements with Georgian's portfolio companies while also identifying new research trends that Georgian can leverage in the future. He has a broad interest in bridging the gap between academic research and industry applications. Prior to joining Georgian, Eddie was a Software Engineer at Bluecore, where he developed infrastructure for data ingestion and data quality monitoring. Before that, he worked as a Software Engineer at Facebook and Google. At Facebook, he worked on the News Feed Data Infrastructure team and contributed to tooling and infrastructure. At Google, he worked on YouTube Closed Captions, the YouTube Flash player, and Google Buzz.