Chapter 6: Building Your Own Program
Activity 16: Performing the Preparation and Creation Stages for the Bank Marketing Dataset
For the purpose of this demonstration, a random_state equal to 100 will be used for the following solution:
- Open a Jupyter Notebook to implement this activity and import pandas:
import pandas as pd
- Load the previously downloaded dataset into the notebook:
data = pd.read_csv("../datasets/bank-full.csv")
The first 10 rows of the dataset can be seen using the statement data.head(10):
Figure 6.6: A screenshot showing the first 10 instances of the dataset
The missing values are shown as NaN, as explained previously.
- Select the metric that's the most appropriate for measuring the performance of the model, considering that the purpose of the study is to detect clients who would subscribe to the term deposit.
The metric to evaluate the performance of the model is the precision metric, as it compares the correctly classified positive labels...
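As a rough illustration of how precision is calculated, the following minimal sketch divides the true positives by all instances predicted as positive, that is, TP / (TP + FP). The helper name precision, the "yes"/"no" labels, and the toy label lists are assumptions made for this example only; they are not taken from the activity's solution or from the dataset's results. In practice, scikit-learn's precision_score function in sklearn.metrics computes the same quantity.

```python
def precision(y_true, y_pred, positive="yes"):
    """Return TP / (TP + FP) for the given positive label (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted positive and actually positive
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    # False positives: predicted positive but actually negative
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    if tp + fp == 0:
        # No positive predictions were made; report 0.0 to avoid dividing by zero
        return 0.0
    return tp / (tp + fp)


# Toy labels (assumed): "yes" means the client subscribed to the term deposit
y_true = ["yes", "no", "no", "yes", "no", "yes"]
y_pred = ["yes", "yes", "no", "yes", "no", "no"]

# Three clients are predicted "yes"; two of them actually subscribed
score = precision(y_true, y_pred)
print(score)
```

In this toy case, two of the three clients predicted as subscribers actually subscribed, so the precision is 2/3. The third actual subscriber, who was predicted as "no", does not affect the result, because precision only looks at the instances predicted as positive.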