Common feature engineering techniques for categorical features
Categorical variables often contain useful information that can’t be easily quantified. In this section, you’ll learn about methods for transforming categorical variables to enhance their utility and interpretability in predictive modeling. Specifically, you’ll learn about mappings based on inherent order or related quality within categorical variables.
Understanding categorical variables
There are two common types of categorical variables: ordinal and nominal. Ordinal variables have a specific order or ranking. Examples include T-shirt sizes, where small comes before medium and medium comes before large, and quality ratings such as the poor, fair, good, and excellent grades from our tomato example.
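To make the idea of an order-based mapping concrete, here is a minimal sketch in plain Python that replaces each ordinal category with an integer rank. The dictionary names, the `encode_ordinal` helper, and the specific integer values are assumptions made for illustration; only the relative order of the categories comes from the examples above.

```python
# Hypothetical rank mappings. The integers are an assumption: any increasing
# sequence would preserve the same order.
SIZE_ORDER = {"small": 0, "medium": 1, "large": 2}
QUALITY_ORDER = {"poor": 0, "fair": 1, "good": 2, "excellent": 3}


def encode_ordinal(values, order):
    """Replace each category with its rank; reject categories not in the mapping."""
    encoded = []
    for value in values:
        if value not in order:
            raise ValueError(f"unknown category: {value!r}")
        encoded.append(order[value])
    return encoded


# Made-up sample data, used only to show the mapping in action.
sizes = ["medium", "small", "large", "small"]
quality = ["good", "poor", "excellent", "fair"]

encoded_sizes = encode_ordinal(sizes, SIZE_ORDER)
encoded_quality = encode_ordinal(quality, QUALITY_ORDER)

print(encoded_sizes)    # [1, 0, 2, 0]
print(encoded_quality)  # [2, 0, 3, 1]
```

Note that integer ranks preserve order but also imply equal spacing between adjacent categories, which may or may not suit a given model.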
Nominal variables, on the other hand, are unordered categories. Examples include the names of defects or the types of parts. Transforming categorical variables involves encoding these attributes...