Preparing time series data for XGBoost
Applying XGBoost to time series forecasting requires transforming sequential data into a supervised learning format that the model can process. Time series data differs fundamentally from regular tabular data because the order of observations matters. Preparing it involves several key steps: creating lag features, adding date-based and rolling statistical features, and preserving temporal order when splitting the data into training and testing sets. The goal is to give the model features that encode the time-related information in the series. The result is a supervised learning problem much like the housing value and housing price examples you explored in previous chapters. You’ll begin by creating lag features.
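Before going step by step, it may help to preview the whole transformation in one short sketch. The example below is illustrative only: the series is synthetic, and the column names (y, lag_1, lag_2, roll_mean_3), the window size, and the 80/20 split ratio are assumptions chosen for the demonstration, not values prescribed by this chapter.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series; the values are synthetic and for illustration only.
dates = pd.date_range("2024-01-01", periods=10, freq="D")
df = pd.DataFrame({"y": np.arange(10, dtype=float)}, index=dates)

# Lag features: the target value one and two steps earlier.
df["lag_1"] = df["y"].shift(1)
df["lag_2"] = df["y"].shift(2)

# Rolling mean over the three previous values. Shifting first keeps the
# current target out of its own feature, which avoids leakage.
df["roll_mean_3"] = df["y"].shift(1).rolling(window=3).mean()

# Date-based features derived from the DatetimeIndex.
df["dayofweek"] = df.index.dayofweek
df["month"] = df.index.month

# The earliest rows have no history, so their lag and rolling values are NaN.
supervised = df.dropna()

# Chronological split: the first 80% of rows train, the remainder tests.
# No shuffling, so every training row precedes every test row.
split = int(len(supervised) * 0.8)
train, test = supervised.iloc[:split], supervised.iloc[split:]

X_train, y_train = train.drop(columns="y"), train["y"]
X_test, y_test = test.drop(columns="y"), test["y"]
```

Each row of X_train now describes one time step using only information available before that step, which is the shape a tabular model such as XGBoost expects.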
Creating lag features
One way to encode...