Performing descriptive statistics analysis in Python
As shown in Figure 3.11, the missing values have been removed, the species name has been corrected, and the first row has been removed. This means that the df object is now free of invalid data and can be used for further analysis.
The next step, exploratory data analysis (EDA), is essential. We start with descriptive statistics:
# Perform descriptive statistics on the Iris dataset
df.describe()
Here’s the output:
Out[2]:
       Petal_width  Petal_length  Sepal_width  Sepal_length
count   144.000000    144.000000   144.000000    144.000000
mean      1.239583      3.854167     3.045139      5.881944
std       0.751336      1.736078 ...
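To make the rows of this summary easier to read, here is a minimal sketch of what describe() reports and how individual statistics can be pulled out of it. The small DataFrame is a hypothetical stand-in, not the cleaned Iris data; its values are made up for illustration, and only the column names are borrowed from the output above.

```python
import pandas as pd

# Hypothetical stand-in for the cleaned Iris DataFrame (illustrative values only).
df = pd.DataFrame({
    "Petal_width": [0.2, 0.4, 1.3, 1.5, 2.1],
    "Petal_length": [1.4, 1.7, 4.0, 4.5, 5.9],
})

# describe() returns a DataFrame with one row per statistic
# (count, mean, std, min, 25%, 50%, 75%, max) and one column per numeric column.
summary = df.describe()
print(summary)

# A single statistic can be read by row label and column name.
mean_width = summary.loc["mean", "Petal_width"]

# The std row is the sample standard deviation (ddof=1), the same value df.std() gives.
std_width = df["Petal_width"].std()
```

Because the result is itself a DataFrame, it can be sliced, rounded, or exported like any other, which is handy when only a few of the statistics are needed in a report.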