Explaining XGBoost Models – Tools and Methods
Explaining XGBoost Models – Tools and Methods


There is a widespread belief that the twin modeling goals of prediction and explanation are in conflict. That is, if one desires superior predictive power, then by definition one must pay a price of having little insight into how the model made its predictions. Conversely, if one desires explanations then one must only use """"highly interpretable"""" methods like linear and logistic regression. However, in reality, this tradeoff is by no means a given. In fact, methods with high predictive power, when examined properly with sophisticated tooling, can yield practical insights that could never be realized by high bias methods like linear and logistic regression. Furthermore, the insights gained by carefully examining a model can be used to suggest better features, thereby improving model performance. Thus, the twin goals of prediction and understanding can instead form a virtuous cycle rather than remaining in conflict.

In this workshop, we will work hands-on using XGBoost with real-world data sets to demonstrate how to approach data sets with the twin goals of prediction and understanding in a manner such that improvements in one area yield improvements in the other. Using modern tooling such as Individual Conditional Expectation (ICE) plots and SHAP, as well as a sense of curiosity, we will extract powerful insights that could not be gained from simpler methods. In particular, attention will be placed on how to approach a data set with the goal of understanding as well as prediction.


Brian Lucena is Principal at Lucena Consulting and a consulting Data Scientist at Agentero. An applied mathematician in every sense, he is passionate about applying modern machine learning techniques to understand the world and act upon it. In previous roles he has served as SVP of Analytics at PCCI, Principal Data Scientist at Clover Health, and Chief Mathematician at Guardian Analytics. He has taught at numerous institutions including UC-Berkeley, Brown, USF, and the Metis Data Science Bootcamp.

Open Data Science




Open Data Science
One Broadway
Cambridge, MA 02142

Privacy Settings
We use cookies to enhance your experience while using our website. If you are using our Services via a browser you can restrict, block or remove cookies through your web browser settings. We also use content and scripts from third parties that may use tracking technologies. You can selectively provide your consent below to allow such third party embeds. For complete information about the cookies we use, data we collect and how we process them, please check our Privacy Policy
Consent to display content from - Youtube
Consent to display content from - Vimeo
Google Maps
Consent to display content from - Google