Weight of Evidence (WoE) encoding
WoE encoding was originally developed for the credit and financial industries, where it is used to build predictive models for assessing loan default risk. WoE measures how well a grouping of a variable (a category or a bin) separates good risks from bad risks. The technique is particularly well suited to logistic regression because it replaces each category with a value on the log-odds scale, which orders the categories so that the encoded variable is monotonically related to the target variable. In this section, you can try it on sample data by following the steps below.
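Before the walkthrough, the quantity being computed can be sketched on a toy example. The snippet below is a minimal illustration, not the code used in the steps that follow. It assumes one common convention, WoE = ln(share of goods / share of bads) with a target value of 1 meaning "bad", and it adds a small smoothing constant so that empty cells do not produce log(0). The function name `woe_table`, the `eps` parameter, and the `grade`/`default` columns are hypothetical names chosen for illustration.

```python
import numpy as np
import pandas as pd


def woe_table(df, feature, target, eps=0.5):
    """Per-category WoE under the assumed convention ln(good share / bad share).

    Assumes `target` is binary with 1 = bad (e.g. default) and 0 = good.
    `eps` is an illustrative additive-smoothing constant, not a standard value.
    """
    counts = df.groupby(feature)[target].agg(bad="sum", total="count")
    counts["good"] = counts["total"] - counts["bad"]
    n_groups = len(counts)
    # Share of all goods / all bads that fall into each category (smoothed).
    good_share = (counts["good"] + eps) / (counts["good"].sum() + eps * n_groups)
    bad_share = (counts["bad"] + eps) / (counts["bad"].sum() + eps * n_groups)
    counts["woe"] = np.log(good_share / bad_share)
    return counts


# Hypothetical toy data: three credit grades with increasing default rates.
toy = pd.DataFrame({
    "grade": ["A"] * 4 + ["B"] * 4 + ["C"] * 4,
    "default": [0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1],
})

table = woe_table(toy, "grade", "default")
# Replace each category by its WoE value to obtain the encoded feature.
toy["grade_woe"] = toy["grade"].map(table["woe"])
print(table)
```

Under this convention, a positive WoE marks a category with proportionally more goods than bads, and a negative WoE marks the opposite. Some references flip the ratio, which only changes the sign.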
- Start by importing pandas and train_test_split from scikit-learn to prepare for applying the encoding. Also import matplotlib.pyplot and numpy, which you will use to plot the relationship between the encoded categorical variables and the target variable:
    import pandas as pd
    from sklearn.model_selection import train_test_split
    import matplotlib.pyplot as plt
    import numpy as np
- Set up the sample data...