Abstract: With more than a million legitimate returns at Wal-Mart Stores every day, identifying fraudulent returns in real time with minimal customer friction is a challenging problem. One reason is that in-store transactions lack an associated customer identity. In addition, there are no confirmed fraud labels in cases where a fraudulent return is suspected. Finally, the customer is present when the decision to accept or deny the return is conveyed, so incorrectly accusing a customer of return fraud typically insults the customer and damages customer relations. An improved store return fraud detection system is therefore desirable. We propose a system that combines intelligent detection of anomalous activity sequences with comprehensive evaluation of the distinct characteristics of fraudulent activity, enabling the generation of high-confidence fraud labels for some activity patterns. Specifically, the system includes a self-evolving network that associates stores, transactions, returns, payment instruments, and customer identification across related activity sequences of store events. Each activity sequence is represented by a signature vector comprising various behavior variables. The system then applies an innovative metric to calculate pairwise similarity between the signature vectors, and an iterative process identifies clusters of signature vectors with common behavior patterns based on that similarity. Finally, human judgment, operational insights, and an analytical technique such as scoring or decision trees are used to label the identified clusters as fraudulent or non-fraudulent behavior. The labelled data in turn enables supervised learning models to predict fraudulent behavior at an early stage of a fraudulent activity sequence, before it is fully developed or terminated.
The models derived from the supervised training may then be used to advise store return desk personnel whether to deny, warn, or accept attempted store returns in real time. The proposed approach is validated with real store returns data.
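As a rough illustration of the similarity and clustering steps described above, the following minimal Python sketch groups signature vectors whose pairwise similarity exceeds a threshold. It is not the system's actual method: cosine similarity stands in for the unspecified metric, the three behavior variables and the 0.95 threshold are hypothetical, and the iterative step is a simple label-merging loop. The labeling and supervised learning stages are not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity, used here only as a stand-in for the talk's metric."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pairwise_similarity(vectors):
    """Full similarity matrix between all signature vectors."""
    n = len(vectors)
    return [[cosine_similarity(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]


def cluster_by_similarity(vectors, threshold=0.95):
    """Iteratively merge cluster labels of vectors that are similar enough.

    Every vector starts in its own cluster. Each pass merges the labels of
    any pair whose similarity meets the (assumed) threshold, and the loop
    repeats until a pass changes nothing.
    """
    sim = pairwise_similarity(vectors)
    labels = list(range(len(vectors)))
    changed = True
    while changed:
        changed = False
        for i in range(len(vectors)):
            for j in range(i + 1, len(vectors)):
                if sim[i][j] >= threshold and labels[i] != labels[j]:
                    merged = min(labels[i], labels[j])
                    labels[i] = labels[j] = merged
                    changed = True
    return labels


# Hypothetical signature vectors with made-up behavior variables:
# [receipted-return share, cross-store return share, no-receipt share]
signatures = [
    [1.0, 0.1, 0.0],
    [1.0, 0.2, 0.0],
    [0.1, 1.0, 0.9],
    [0.2, 1.0, 1.0],
]
cluster_labels = cluster_by_similarity(signatures)
print(cluster_labels)
```

In this toy input the first two vectors end up in one cluster and the last two in another; in the approach described above, such clusters would then be reviewed and labelled before any supervised model is trained.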
Bio: Henry Chen is a Director of Data Science at Walmart Labs. He leads a data science team that combats Walmart store returns fraud, Marketplace seller fraud, and e-commerce payment fraud using machine learning and deep learning techniques. Prior to that, he was a Senior Manager of Data Science at PayPal, the co-principal investigator of several NSF- and DARPA-sponsored research projects, and a technologist for Fortune 100 companies in the San Francisco Bay Area. Henry Chen holds an M.S. and a Ph.D. from the University of California at Berkeley.