Comparing XGBoost to CART
So far, you’ve compared XGBoost to a linear fit. For the housing data, you’ve seen that XGBoost gives better RMSE and R² values, even though it takes longer to execute than linear regression. In the following steps, you will compare it to CART, as implemented in scikit-learn, in terms of fit and performance:
1. Fit a regression tree model using CART: To fit a CART model, use the tree module in scikit-learn, specifically sklearn.tree.DecisionTreeRegressor, since you will be performing regression to predict housing values. First, set up the model by calling DecisionTreeRegressor and store the result in housing_CART. Use the default settings, so no arguments are passed; this call is where you would set the model hyperparameters. Next, fit the model by applying the .fit method:
%%time
from sklearn.tree import DecisionTreeRegressor
housing_CART = DecisionTreeRegressor()
housing_CART_regression = housing_CART.fit(X_train, y_train...
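The step above can be sketched as a self-contained script. This is a minimal illustration, not the book's exact code: the data is a synthetic stand-in produced by make_regression rather than the housing data, and the names demo_tree, fit_seconds, rmse, r2, and train_r2 are hypothetical. Timing uses time.perf_counter so the sketch also runs outside a notebook, where the %%time cell magic is unavailable.

```python
import time

import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Assumption: synthetic regression data stands in for the housing data.
X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Default hyperparameters; random_state is set only to make the run repeatable.
demo_tree = DecisionTreeRegressor(random_state=0)

# Time the fit explicitly instead of relying on the %%time cell magic.
start = time.perf_counter()
demo_tree.fit(X_train, y_train)  # fit returns the fitted estimator itself
fit_seconds = time.perf_counter() - start

# Evaluate on held-out data with the same metrics used for the other models.
pred = demo_tree.predict(X_test)
rmse = np.sqrt(mean_squared_error(y_test, pred))
r2 = r2_score(y_test, pred)

# R² on the training data, for comparison with the held-out score.
train_r2 = demo_tree.score(X_train, y_train)

print(f"fit time: {fit_seconds:.4f} s")
print(f"test RMSE: {rmse:.2f}, test R2: {r2:.3f}, train R2: {train_r2:.3f}")
```

With default settings the tree keeps splitting until its leaves are pure, so the training score is typically close to perfect while the held-out score is lower. Passing arguments such as max_depth or min_samples_leaf to DecisionTreeRegressor is the usual way to limit that growth.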