Creating and Preprocessing a Design Matrix with Recipes


R has an excellent framework for specifying models using formulas. While elegant and useful, the formula interface was designed at a time when models had small numbers of terms and complex preprocessing of data was not commonplace. As such, it has some limitations. In this talk, a new package called `recipes` is shown, in which the model terms and preprocessing steps are enumerated sequentially. The resulting recipe can be estimated and then applied to any dataset. Current options include simple transformations (log, Box-Cox, interactions, dummy variables, ...), signal extraction (PCA, ICA, MDS), basis functions (splines, polynomials), imputation methods, and others.
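
The sequential workflow described above can be sketched roughly as follows. This is a minimal illustration and not code from the talk: the `mtcars` data, the train/test split, and the particular steps are assumptions chosen for the example, and the function names (`prep()`, `bake()`, `step_normalize()`) follow recent releases of `recipes`, which may differ from the version presented.

```r
library(recipes)

# Illustrative split of a built-in dataset (an assumption, not from the talk)
train <- mtcars[1:24, ]
test  <- mtcars[25:32, ]

# Specify the outcome, the predictors, and an ordered list of preprocessing steps
rec <- recipe(mpg ~ ., data = train) %>%
  step_log(disp, hp) %>%                      # simple transformation
  step_normalize(all_predictors()) %>%        # center and scale every predictor
  step_pca(all_predictors(), num_comp = 3)    # signal extraction

# Estimate the required statistics (means, SDs, PCA loadings) from the training set
rec_prepped <- prep(rec, training = train)

# Apply the estimated recipe to another dataset
test_baked <- bake(rec_prepped, new_data = test)
```

Nothing is computed when the recipe is defined. The quantities each step needs are estimated in `prep()` from the training data only, and `bake()` reuses those estimates on whatever data it is given.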


Max Kuhn works at RStudio developing software for data analysis and modeling. He previously worked in pharmaceutical and molecular diagnostic research for more than 18 years. Max’s interests are in predictive modeling and machine learning, and he is the author of six R packages, including the caret package. He and Kjell Johnson published the bestselling book Applied Predictive Modeling in 2013. Max holds a B.S. in Mathematics and a Ph.D. in Biostatistics.

Open Data Science



