General Training Session: Feature Engineering for Time Series Data

Abstract: Most machine learning algorithms today are not time-aware and are not easily applied to time series and forecasting problems. Leveraging algorithms like XGBoost, or even linear models, typically requires substantial data preparation and feature engineering, for example creating lagged features, detrending the target, and detecting periodicity. This preprocessing becomes more difficult in the common case where the problem requires predicting a window of multiple future time points. As a result, most practitioners fall back on classical methods, such as ARIMA or trend analysis, which are time-aware but often less expressive. This talk covers practices for solving this challenge and explores the potential to automate this process in order to apply advanced machine learning algorithms to time series problems.
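
As a rough illustration of the preparation the abstract describes, the sketch below turns a univariate series into lagged features paired with a window of multiple future targets. It is a minimal example in plain Python, not material from the talk: the function name `make_supervised`, its parameters `n_lags` and `horizon`, and the sample series are all hypothetical, and first-order differencing is used only as a simple stand-in for detrending.

```python
from typing import List, Tuple


def make_supervised(
    series: List[float], n_lags: int, horizon: int
) -> Tuple[List[List[float]], List[List[float]]]:
    """Build (lagged features, future window) pairs from a univariate series.

    Hypothetical helper for illustration only.
    """
    if n_lags < 1 or horizon < 1:
        raise ValueError("n_lags and horizon must be positive")

    # First-order differencing: a simple stand-in for detrending the target.
    diffs = [b - a for a, b in zip(series, series[1:])]

    features, targets = [], []
    # Each row uses the n_lags values before time t as inputs and the
    # next `horizon` values starting at t as the multi-step target.
    for t in range(n_lags, len(diffs) - horizon + 1):
        features.append(diffs[t - n_lags:t])
        targets.append(diffs[t:t + horizon])
    return features, targets


# Made-up series with a growing trend; its differences are 1, 2, ..., 6.
series = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0, 22.0]
X, Y = make_supervised(series, n_lags=2, horizon=2)
```

The resulting rows of `X` and `Y` could then be passed to a model that is not time-aware, such as a gradient-boosted tree or a linear model, with one output per future time point.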

Bio: As Data Science Engineering Architect at DataRobot, Mark designs and builds key components of automated machine learning infrastructure. He contributes both by leading large cross-functional project teams and by tackling challenging data science problems. Before joining DataRobot and moving into data science, he was a physicist, doing data analysis and detector work for the Olympus experiment at MIT and DESY.

Open Data Science Conference