Understanding the XGBoost algorithm
In this section, you will learn how the XGBoost algorithm addresses the shortcomings of basic gradient-boosted tree algorithms. You will cover the improvements the authors highlight in the paper and how each improvement corrects a specific problem. First, you will learn how the authors addressed problems with data, and then you will learn about the improvements in XGBoost that speed up training.
Addressing problems – sparse data, overfitting
To handle overfitting, the authors changed the standard gradient-boosted tree objective by adding a function (Ω, called omega in the paper) that measures the complexity of the model. This function smooths the final learned weights to avoid overfitting. The omega function does this by penalizing complexity, meaning the algorithm prefers simpler solutions. According to the authors, this formulation also makes the algorithm easier to parallelize, which speeds up computation.
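To see how this complexity penalty shows up in practice, here is a minimal sketch using the xgboost Python package: the gamma and reg_lambda hyperparameters correspond to the per-leaf and L2 weight penalties that make up the omega term. The dataset, parameter values, and variable names are illustrative assumptions, not taken from the paper, and the sketch assumes the xgboost and scikit-learn packages are installed.

```python
# Minimal sketch: the omega-style complexity penalty exposed as hyperparameters
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
import xgboost as xgb

# Synthetic, illustrative data (not from the paper)
X, y = make_classification(n_samples=1_000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = xgb.XGBClassifier(
    n_estimators=200,
    max_depth=4,
    learning_rate=0.1,
    gamma=1.0,        # penalty per leaf: discourages splits that add little gain
    reg_lambda=1.0,   # L2 penalty on leaf weights: smooths the learned weights
    reg_alpha=0.0,    # optional L1 penalty on leaf weights
)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```

Raising gamma or reg_lambda increases the cost of complexity, so the resulting trees tend to have fewer leaves and smaller weights, which is exactly the simpler-model preference described above.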
Two additional techniques to handle overfitting are used:
...