Using a Decision Tree to classify credit risks
In this section, we will create a model to classify credit risks. We won't look at its performance yet; we'll evaluate and improve the model in the next chapter.
As before, we'll download a dataset from the UCI Machine Learning Repository for this example: the Statlog (German Credit Data) dataset. Its source is Professor Dr. Hans Hofmann of the Institut für Statistik und Ökonometrie, Universität Hamburg. The dataset classifies people, each described by a set of attributes, as good or bad credit risks.
You can download the dataset from the following link:
https://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29
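Once the file has been saved locally, its rows can be read with a few lines of code. The following is a minimal Python sketch, not the code used in this chapter. It assumes that the raw file is whitespace-separated, that it has no header line, and that each row holds the 20 attributes followed by a class label as a final field. The names NUM_ATTRIBUTES, COLUMN_NAMES, parse_credit_line, and load_credit_data, as well as the generic column names attr_1 to attr_20, are hypothetical and chosen only for illustration.

```python
# Minimal sketch: read header-less credit rows into (attributes, label) pairs.
# Assumption: fields are separated by whitespace and the class label is the
# last field, after the 20 attributes mentioned in the text.

NUM_ATTRIBUTES = 20

# The file has no header line, so we make up generic column names.
COLUMN_NAMES = [f"attr_{i}" for i in range(1, NUM_ATTRIBUTES + 1)]


def parse_credit_line(line):
    """Split one row into a dict of attributes and a class label string."""
    fields = line.split()
    if len(fields) != NUM_ATTRIBUTES + 1:
        raise ValueError(
            f"expected {NUM_ATTRIBUTES + 1} fields, got {len(fields)}"
        )
    attributes = dict(zip(COLUMN_NAMES, fields[:NUM_ATTRIBUTES]))
    label = fields[NUM_ATTRIBUTES]
    return attributes, label


def load_credit_data(lines):
    """Parse every non-blank line from an iterable of strings."""
    return [parse_credit_line(line) for line in lines if line.strip()]


if __name__ == "__main__":
    # Synthetic placeholder row (not real data) with 20 values and a label.
    sample = " ".join(f"v{i}" for i in range(1, NUM_ATTRIBUTES + 1)) + " 1"
    attributes, label = load_credit_data([sample])[0]
    print(len(attributes), label)
```

To read the real file, pass an open file object to load_credit_data, for example inside a with open(...) block that points at wherever you saved the download. All values are kept as strings here; converting the numeric attributes is left to the modeling step.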
The following screenshot shows the first ten lines of the dataset in its original form. The dataset doesn't have a header line. It contains 20 attributes, and the...