2. Building Blocks of Neural Networks
Activity 2.01: Performing Data Preparation
Solution
- Import the required libraries:
import pandas as pd
- Using pandas, load the .csv file:
data = pd.read_csv("YearPredictionMSD.csv", nrows=50000)
data.head()
Note
To avoid memory limitations, use the nrows argument when reading the text file in order to read a smaller section of the entire dataset. In the preceding example, we are reading the first 50,000 rows.
The output is as follows:
- Verify whether any qualitative data is present in the dataset:
cols = data.columns
num_cols = data._get_numeric_data().columns
list(set(cols) - set(num_cols))
The output should be an empty list, meaning there are no qualitative features.
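The same check can also be sketched with the public select_dtypes method instead of the underscore-prefixed _get_numeric_data helper. The toy DataFrame and its column names below are hypothetical and are not part of the YearPredictionMSD dataset; they only illustrate what a non-empty result would look like.

```python
import pandas as pd

# Hypothetical toy frame: two numeric columns and one text column.
toy = pd.DataFrame({
    "year": [2001, 2002, 2003],
    "timbre": [0.5, 1.2, -0.3],
    "genre": ["rock", "jazz", "pop"],
})

# Keep only the columns that are NOT numeric and list their names.
non_numeric = toy.select_dtypes(exclude="number").columns.tolist()
print(non_numeric)  # ['genre']
```

On a dataset with only numeric features, as expected in this activity, the resulting list would be empty.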
- Check for missing values.
If you add an additional sum() function to the line of code that was previously used for this purpose, you will get the sum of missing values in the entire dataset, without discriminating by column...
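A minimal sketch of the difference between the per-column count and the dataset-wide total, assuming the line referred to above is data.isnull().sum() (that line is not shown in this solution, so this is an assumption). The toy DataFrame and its column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with two missing values in total.
toy = pd.DataFrame({
    "year": [2001, np.nan, 2003],
    "timbre": [0.5, 1.2, np.nan],
    "loudness": [1.0, 2.0, 3.0],
})

# One sum(): number of missing values in each column.
per_column = toy.isnull().sum()
print(per_column)

# A second sum(): a single total for the whole dataset.
total = toy.isnull().sum().sum()
print(total)  # 2
```

The per-column form is useful for deciding which features need attention, while the single total is a quick way to confirm that nothing is missing anywhere.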