Understanding the math behind causal forests
Causal forests are built on the framework of potential outcomes and, as such, use ensemble methods to estimate treatment effects. Now, what are ensemble methods?
An ensemble method is an approach that leverages the collective wisdom of a large number of models, reducing the impact of individual model biases and errors. It does so by combining different machine learning models, where the goal is to improve predictive performance and robustness.
Many such methods are available to apply such as bagging, boosting, and stacking (random forests use the bagging technique). Bagging involves training multiple models on random subsets of data and averaging their predictions to reduce variance (e.g., random forests). Boosting sequentially trains models, each correcting errors from the previous one, to reduce bias. Stacking combines different models by using their predictions as input for a final model, improving overall predictive power. For...