Abstract: ShopRunner is an e-commerce company that receives feeds of product data from many different retailer partners, including large department stores and retailers that specialize in apparel, electronics, nutritional products, and more. This data includes both images and text fields like product name and description. To combine these millions of products into a single catalog for our marketplace, we need to fit them all into one unified taxonomy and determine various attributes for each product. For apparel, which I will be focusing on in this talk, these attributes include color, pattern, sleeve length, season, fabric type, and country of origin. In typical retail settings such classifications are all performed manually, but at ShopRunner we have been developing deep learning models to automate the vast majority of this work.
In this talk I will motivate the benefits of combining image and text data for classification problems, and introduce our newly open-sourced library for building multi-task computer vision and natural language processing deep learning pipelines using PyTorch. Using a variety of real-world examples, I will walk through multiple strategies that we have employed at ShopRunner for combining images and text to classify apparel products. This will include an overview of the basics of transfer learning, a discussion of evaluation metrics and their tradeoffs in a business setting, and the overall design of our production product catalog enrichment system.
Bio: Ali Vanderveld is the Director of Data Science at ShopRunner, where her team leverages data from a network of over 100 retailers to build products for their 6 million members. Prior to ShopRunner, she was a staff data scientist at Civis Analytics, a consulting and software startup that helps companies, nonprofits, and political organizations better utilize their data. She has also worked at Groupon and as a technical mentor for the Data Science for Social Good Fellowship. Ali has a PhD in theoretical astrophysics from Cornell University and got her start as an academic researcher at Caltech, the NASA Jet Propulsion Laboratory, and the University of Chicago, where she worked on the development teams for several space telescope missions, including ESA's Euclid.