Performing feature engineering on the house prices dataset
Feature engineering is a key stage in the data preprocessing pipeline. It transforms raw data into informative features that drive predictive modeling. The process involves creating additional input columns, derived from the existing data, with the aim of improving the model’s performance. For example, you may transform numeric values so that they more closely follow a normal distribution, or group continuous values into buckets to convert them into categories. Let’s say that you’re looking at ages. In this case, you can bucket the values into age ranges of 0-10, 11-20, 21-30, and so on.
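The two transformations mentioned above can be sketched in a few lines of pandas. This is a minimal illustration rather than the notebook’s code: the DataFrame, its `age` and `income` columns, and the values in them are made up for the example, and a log transform is just one common way to make a right-skewed column look more normal.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not taken from the house prices dataset).
df = pd.DataFrame({
    "age": [3, 10, 11, 20, 25, 30],
    "income": [20_000.0, 35_000.0, 55_000.0, 80_000.0, 120_000.0, 1_000_000.0],
})

# Transformation: log1p computes log(1 + x), which compresses large values
# and typically reduces right skew in a numeric column.
df["income_log"] = np.log1p(df["income"])

# Bucketing: convert continuous ages into the ranges 0-10, 11-20, 21-30.
# pd.cut uses right-closed intervals by default, so the edges [0, 10, 20, 30]
# give (0, 10], (10, 20], (20, 30]; include_lowest=True also admits age 0.
bin_edges = [0, 10, 20, 30]
bin_labels = ["0-10", "11-20", "21-30"]
df["age_range"] = pd.cut(
    df["age"], bins=bin_edges, labels=bin_labels, include_lowest=True
)

print(df)
```

Ages outside the listed edges (for example, 45) would come back as missing values with these bins, so in practice you would extend `bin_edges` and `bin_labels` to cover the full range of your data.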
In this section, you’ll learn how to perform feature engineering on the house prices dataset. The code for this chapter is provided in a Jupyter notebook in this book’s GitHub repository. It provides a detailed exploration of feature engineering...