StructureBoost: Gradient Boosting with Categorical Structure


Categorical variables often have a natural structure that is neither linear nor ordinal. The months of the year have a circular structure, while the US states have a structure that can be represented by a graph. StructureBoost uses novel techniques that exploit this known structure to yield better predictions. Recently, StructureBoost has been enhanced to use structure in the target variable (i.e., in multiclass classification) as well as in the predictor variables. This hands-on workshop will demonstrate how to use StructureBoost on different problems involving categorical variables with known structure.
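
To make "known structure" concrete, here is a minimal sketch in plain Python of how the two examples above (circular months, adjacent states) might be written down as graphs. It is an illustration only and does not use StructureBoost's API: the helper names `cycle_graph` and `is_connected`, the dictionary-of-sets representation, and the partial five-state graph are all assumptions made for this example.

```python
# Hypothetical illustration: encode category structure as an undirected graph
# (a plain dict of adjacency sets; this is NOT StructureBoost's API).

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def cycle_graph(labels):
    """Return an adjacency dict linking each label to its two circular neighbors."""
    n = len(labels)
    return {
        label: {labels[(i - 1) % n], labels[(i + 1) % n]}
        for i, label in enumerate(labels)
    }


def is_connected(subset, graph):
    """Check whether `subset` forms one connected piece of `graph` (simple DFS)."""
    subset = set(subset)
    if not subset:
        return False
    start = next(iter(subset))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        # Only walk to neighbors that are themselves inside the subset.
        for nbr in graph[node] & subset:
            if nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)
    return seen == subset


month_graph = cycle_graph(MONTHS)

# A small, deliberately partial adjacency graph for a few US states.
state_graph = {
    "CA": {"OR", "NV", "AZ"},
    "OR": {"CA", "NV", "WA"},
    "WA": {"OR"},
    "NV": {"CA", "OR", "AZ"},
    "AZ": {"CA", "NV"},
}

# December and January are neighbors on the cycle, so "winter" is one
# contiguous group, whereas an ordinal encoding (1..12) would put them far apart.
print(is_connected({"Dec", "Jan", "Feb"}, month_graph))
print(is_connected({"Jan", "Jul"}, month_graph))
```

A structure-aware method could, for instance, favor groupings of categories that are connected in such a graph; whether and how StructureBoost does this is covered in the workshop itself rather than by this sketch.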


Brian Lucena is Principal at Numeristical, where he advises companies of all sizes on how to apply modern machine learning techniques to solve real-world problems with data. He is the creator of three Python packages: StructureBoost, ML-Insights, and SplineCalib. In previous roles he has served as Principal Data Scientist at Clover Health, Senior VP of Analytics at PCCI, and Chief Mathematician at Guardian Analytics. He has taught at numerous institutions including UC-Berkeley, Brown, USF, and the Metis Data Science Bootcamp.

Open Data Science




One Broadway
Cambridge, MA 02142
