Abstract: Many machine learning models are opaque in how they make predictions. Even with common ensemble models such as random forests and gradient boosting, it is difficult to explain why the model made a particular decision. In certain business contexts, this poses a challenge. We faced such a hurdle when working on various machine learning models in the real estate domain. Among the things we were interested in predicting were the rent and price of a property, and how long it would remain without a tenant if it were to become vacant. The business owners who used the models were not satisfied with the standard feature importance plots obtained from tree-based models.
One of the approaches we employed to improve the explainability of the models was the Shapley value (also called interaction-based) method of explanation, an idea that originates in cooperative game theory. The Shapley value explains a feature value's contribution to a prediction as its average marginal contribution across all possible 'coalitions', i.e. combinations of features. This approach succeeded in giving the business users confidence in the models and trust in their predictions.
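The averaging over coalitions can be made concrete with a small sketch. Below is a minimal, illustrative exact Shapley computation for two features of a hypothetical rent model; the feature names and the payoff numbers (the model's prediction when only a given subset of features is "known") are invented for illustration and are not from the talk itself:

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value):
    """Exact Shapley values: for each feature, average its marginal
    contribution value(S + {f}) - value(S) over every coalition S of
    the other features, weighted by |S|! * (n - |S| - 1)! / n!."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi

# Hypothetical payoffs: predicted rent given which features are known.
payoff = {
    frozenset(): 100.0,                        # baseline prediction
    frozenset({"size"}): 140.0,
    frozenset({"location"}): 130.0,
    frozenset({"size", "location"}): 180.0,
}

phi = shapley_values(["size", "location"], lambda s: payoff[frozenset(s)])
# The contributions sum to the gap between the full prediction and the
# baseline (the Shapley efficiency property): 45 + 35 = 180 - 100.
```

Exact enumeration is exponential in the number of features; in practice, libraries such as SHAP approximate these values efficiently for tree-based models.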
In the talk I plan to briefly go over the business problem itself and the approach we took to solve it, explain what the Shapley value is, and show how it can be used in many applications. With the increasing popularity of machine learning models, and the need for transparent and explainable models in certain domains, explainability will become more and more important.
Bio: Violeta works as a data scientist in the Data Innovation and Analytics department of ABN AMRO bank in Amsterdam, the Netherlands. In her daily job, she works on projects with different business lines, applying the latest machine learning and advanced analytics technologies and algorithms. Before that, she worked for about 1.5 years as a data science consultant at Accenture in the Netherlands. Violeta enjoyed helping clients solve their problems with data and data science but wanted to develop more sophisticated tools herself, hence the switch. Before her position at Accenture, she completed her PhD at Erasmus University Rotterdam in the area of Applied Microeconometrics. In her research, she used data to investigate the causal effect of negative experiences on human capital, education, problematic behavior, and involvement in crime.