Premium Session: Feature Engineering for Time Series Data

Abstract: Most machine learning algorithms today are not time-aware and are not easily applied to time series and forecasting problems. Leveraging advanced algorithms like XGBoost, or even linear models, typically requires substantial data preparation and feature engineering – for example, creating lagged features, detrending the target, and detecting periodicity. The preprocessing becomes harder still in the common case where the problem requires predicting a window of multiple future time points. As a result, most practitioners fall back on classical methods, such as ARIMA or trend analysis, which are time-aware but less expressive. This session covers best practices for addressing this challenge: it introduces a general framework for developing time series models, shows how to generate features and preprocess the data, and explores the potential to automate this process so that advanced machine learning algorithms can be applied to almost any time series problem.
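
To make the preprocessing described in the abstract concrete, here is a minimal sketch, not the session's own framework. It assumes first-order differencing as the detrending step and a fixed number of lags. The helper names `difference` and `make_supervised` and the sample values are hypothetical. It turns a series into rows of lagged features paired with a window of multiple future targets, which is the tabular shape a non-time-aware learner expects.

```python
from typing import List, Tuple


def difference(series: List[float]) -> List[float]:
    """First-order differencing: one simple way to remove a trend (an assumption here)."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def make_supervised(
    series: List[float], n_lags: int, horizon: int
) -> Tuple[List[List[float]], List[List[float]]]:
    """Build (features, targets) rows from a series.

    Each feature row holds the n_lags values before time t (oldest first);
    each target row holds the next `horizon` values starting at time t.
    """
    if n_lags < 1 or horizon < 1:
        raise ValueError("n_lags and horizon must be at least 1")
    features, targets = [], []
    # Only keep positions with a full lag window behind and a full horizon ahead.
    for t in range(n_lags, len(series) - horizon + 1):
        features.append(series[t - n_lags:t])
        targets.append(series[t:t + horizon])
    return features, targets


# Made-up illustrative series with an upward trend.
series = [10.0, 12.0, 15.0, 19.0, 24.0, 30.0, 37.0]
diffs = difference(series)
X, Y = make_supervised(diffs, n_lags=3, horizon=2)

for lags, future in zip(X, Y):
    print(lags, "->", future)
```

Each row of `X` could then be passed to a regressor such as XGBoost or a linear model, with one model per horizon step or a single multi-output model. Forecasts made on the differenced series must be cumulatively summed back onto the last observed value to recover the original scale.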

Bio: Michael Schmidt is the Chief Scientist at DataRobot and has been featured in the Forbes list of the world’s top 7 data scientists. He has won awards for research in AI, with publications ranking in the 99th percentile of all tracked research. In 2012, Michael founded Nutonian and led the development of Eureqa, a machine learning application and service used by over 80,000 users globally (later acquired by DataRobot). In 2015, he was selected by MIT for the most innovative 35-under-35 award. Michael has also appeared in several media outlets such as the New York Times, NPR’s RadioLab, the Science Channel, and Communications of the ACM. Most recently, his work focuses on automated machine learning, feature engineering, and advanced time series prediction.