Implementing XGBoost’s feature importance
XGBoost provides feature importance scores out of the box. These scores indicate how useful each feature was in constructing the boosted decision trees in the ensemble. XGBoost offers three primary ways to measure feature importance (a short sketch after this list shows how to retrieve each):
- Gain: The average improvement in accuracy (reduction in loss) a feature brings to the splits where it is used, across all trees.
- Weight: The number of times a feature is used to split the data across all trees (sometimes called frequency).
- Cover: The relative number of observations affected by splits on a feature, across all trees.
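To make these three types concrete, here is a minimal sketch of how each can be retrieved from a trained model. This is not the chapter's exact code: the hyperparameters (`n_estimators=50`, `max_depth=4`) are illustrative choices, and the sketch assumes `xgboost` and `scikit-learn` are installed.

```python
import xgboost as xgb
from sklearn.datasets import fetch_california_housing

# Load the California housing data as a DataFrame so that feature
# names carry through to the importance report
data = fetch_california_housing(as_frame=True)
X, y = data.data, data.target

# A small, deliberately untuned regressor for illustration
model = xgb.XGBRegressor(n_estimators=50, max_depth=4, random_state=42)
model.fit(X, y)

# Each importance type is available from the underlying booster
booster = model.get_booster()
for imp_type in ("gain", "weight", "cover"):
    print(imp_type, booster.get_score(importance_type=imp_type))
```

Note that the three rankings often disagree: a feature split on many times (high weight) may still contribute little per split (low gain), which is why gain is usually the more informative measure.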
In this section, you’ll use XGBoost’s feature importance to understand which factors have the most impact on housing value predictions. You’ll do this by loading the dataset, training a model, and visualizing the importance scores with XGBoost’s built-in plotting utilities. You’ll use the same California housing dataset you worked with in Chapter 4. Follow these steps:
...
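Once the steps above are complete, the visualization itself comes down to XGBoost’s built-in `plot_importance` helper. The sketch below shows one way this might look; the model setup mirrors the earlier sketch, and the hyperparameters remain illustrative rather than tuned.

```python
import matplotlib.pyplot as plt
import xgboost as xgb
from sklearn.datasets import fetch_california_housing

# Same illustrative setup as the earlier sketch
data = fetch_california_housing(as_frame=True)
model = xgb.XGBRegressor(n_estimators=50, max_depth=4, random_state=42)
model.fit(data.data, data.target)

# plot_importance draws a horizontal bar chart of the chosen
# importance type; the default is "weight", so pass "gain"
# explicitly if that is the view you want
xgb.plot_importance(model.get_booster(), importance_type="gain",
                    show_values=False)
plt.tight_layout()
plt.show()
```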