Chapter 1: Introduction to scikit-learn
Activity 1: Selecting a Target Feature and Creating a Target Matrix
- Load the titanic dataset using the seaborn library. First, import the seaborn library, and then use the load_dataset("titanic") function:
    import seaborn as sns
    titanic = sns.load_dataset('titanic')
    titanic.head(10)
Next, print out the top 10 instances; the output should match the following screenshot:
Figure 1.23: An image showing the first 10 instances of the Titanic dataset
- The preferred target feature could be either survived or alive, because both label whether a person survived the sinking. For the following steps, the variable chosen is survived. However, choosing alive will not affect the final shape of the variables.
- Create a variable, X, to store the features, by using drop(). As explained previously, the selected target feature is survived, which is why it is dropped from the features matrix. Create a variable, Y, to store...
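
As a minimal sketch of the feature/target split described in this activity, the snippet below uses a small hand-made DataFrame in place of the real dataset so that it runs without a download. The rows are invented for illustration; only the column names survived and alive, with their 0/1 and yes/no encodings, mirror the seaborn titanic data. It also assumes that Y holds the survived column, selected by indexing.

```python
import pandas as pd

# Toy stand-in for sns.load_dataset('titanic'); these rows are invented
# for illustration and are NOT taken from the real dataset.
titanic = pd.DataFrame({
    "survived": [0, 1, 1, 0],
    "pclass": [3, 1, 3, 1],
    "age": [22.0, 38.0, 26.0, 35.0],
    "alive": ["no", "yes", "yes", "no"],
})

# Features matrix: every column except the chosen target.
X = titanic.drop("survived", axis=1)

# Target (assumed here to be what Y stores): the survived column.
Y = titanic["survived"]

# Choosing alive as the target instead yields the same shapes.
X_alt = titanic.drop("alive", axis=1)
Y_alt = titanic["alive"]

print(X.shape, Y.shape)
print(X_alt.shape, Y_alt.shape)
```

Swapping the toy DataFrame for the one returned by sns.load_dataset('titanic') should leave the rest of the snippet unchanged, since drop() and column indexing depend only on the column names.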