Abstract: Identifying anomalous observations has an important business impact across all industries, and nowhere more so than in fraud detection, where some observations are intentionally trying to hide. That sets fraud apart from most rare-event problems in modeling. This talk will highlight some modern approaches to anomaly detection: local outlier factors, isolation forests, and classifier-adjusted density estimation (CADE). All of these techniques have foundations outside of anomaly detection. Local outlier factors are derived from k-nearest neighbors. Isolation forests have their foundation in tree-based algorithms. CADE was originally designed as an improvement on, or variation of, kernel density estimation. However, all of them have been shown to be very effective at finding anomalous observations in a data set. The local outlier factor uses k-nearest neighbors to identify observations that are far away not only from the main group but also from smaller, isolated groups. Isolation forests use tree-based approaches to randomly split the data and measure how quickly an observation can be isolated from the group. CADE helps identify observations that appear anomalous compared to the distribution of the rest of the observations in a data set, in one or many dimensions. The best part about CADE is that it can also highlight the variables that most distinctly separate an observation from the rest, which allows investigators to better explore the anomaly in question. This talk will highlight these approaches and demonstrate them using open source software.
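As a rough illustration of the three approaches named in the abstract, here is a minimal Python sketch using scikit-learn. It is not the code from the talk. The toy data, the planted anomaly at (10, 10), every parameter value, and the variable names are assumptions made for illustration. The CADE part is a simplified reading of the idea (train a classifier to tell real points from uniformly generated fake points, then score each real point by how fake it looks), not a reference implementation.

```python
import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)

# Toy data (assumption): 200 points around the origin plus one planted
# anomaly stored in the last row.
X = np.vstack([rng.normal(size=(200, 2)), [[10.0, 10.0]]])

# Local outlier factor: compares each point's local density with the density
# around its k nearest neighbors. scikit-learn stores the negated factor, so
# flip the sign to get "larger = more anomalous".
lof = LocalOutlierFactor(n_neighbors=20).fit(X)
lof_score = -lof.negative_outlier_factor_

# Isolation forest: points that random splits isolate quickly receive low
# score_samples values, so flip the sign here as well.
iso = IsolationForest(n_estimators=100, random_state=0).fit(X)
iso_score = -iso.score_samples(X)

# Simplified CADE: draw uniform "fake" points over the bounding box of the
# data, label real = 0 and fake = 1, and fit a classifier to separate them.
fake = rng.uniform(X.min(axis=0), X.max(axis=0), size=X.shape)
X_all = np.vstack([X, fake])
y_all = np.r_[np.zeros(len(X), dtype=int), np.ones(len(fake), dtype=int)]

# Out-of-fold probabilities keep the forest from memorizing the real points.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
proba = cross_val_predict(clf, X_all, y_all, cv=5, method="predict_proba")

# Score for each real point: estimated probability that it looks fake.
cade_score = proba[: len(X), 1]

for name, score in [("LOF", lof_score), ("IsolationForest", iso_score), ("CADE", cade_score)]:
    print(f"{name}: most anomalous row = {int(np.argmax(score))}")
```

All three scores are oriented so that larger means more anomalous, which makes them easy to rank side by side. On real data the neighbor count, the number of trees, and the choice of classifier would all need tuning.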
Bio: A Teaching Associate Professor in the Institute for Advanced Analytics, Dr. Aric LaBarr is passionate about helping people solve challenges using their data. At the Institute, home to the nation's first Master of Science in analytics degree program, he helps design the innovative program that prepares a modern workforce to wisely communicate and handle a data-driven future. He teaches courses in predictive modeling, forecasting, simulation, financial analytics, and risk management.
Previously, he was Director and Senior Scientist at Elder Research, where he mentored and led a team of data scientists and software engineers. As director of the Raleigh, NC office, he worked closely with clients and partners to solve problems in the fields of banking, consumer product goods, healthcare, and government.
Dr. LaBarr holds a B.S. in economics, as well as a B.S., M.S., and Ph.D. in statistics — all from NC State University.