10. Analyzing Air Quality
Activity 10.01: Checking for Outliers
- Plot a boxplot for the PM25 feature using seaborn:
  pm_25 = sns.boxplot(x=air['PM25'])
The output will be as follows:
- Check how many instances contain PM25 values of 250 or higher:
  (air['PM25'] >= 250).sum()
The output will be as follows:
18668
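The count in Step 2 relies on the comparison returning a boolean Series, whose True values are counted when summed. A minimal sketch of that idea, assuming a small made-up DataFrame (the name `toy` and its readings are hypothetical and are not taken from the air quality dataset):

```python
import pandas as pd

# Hypothetical readings, for illustration only (not the real dataset).
toy = pd.DataFrame({"PM25": [12.0, 250.0, 480.0, 95.0, 251.0]})

# The comparison yields a boolean Series; True counts as 1 when summed.
mask = toy["PM25"] >= 250
high_count = int(mask.sum())

# A strict comparison would leave out readings of exactly 250.
strict_count = int((toy["PM25"] > 250).sum())

print(high_count, strict_count)
```

On this toy data the two counts differ by one, because a single reading sits exactly on the 250 threshold.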
- Store all the instances from Step 2 in a DataFrame called pm25 and print the first five rows:
  pm25 = air.loc[air['PM25'] >= 250]
  pm25.head()
The output will be as follows:
- Print the station names of the instances in pm25 to confirm that they come from multiple stations rather than just one. This reduces the chance that they are incorrectly stored values:
  pm25.station.unique()
The output will be as follows:
array(['Aotizhongxin', 'Changping', 'Dingling', 'Dongsi', ...
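Beyond listing the distinct stations, it can help to see how many high readings each station contributes. The following is a minimal sketch under stated assumptions: the `toy` DataFrame and its PM25 values are invented for illustration, and only the station names are borrowed from the output above. It uses `value_counts()`, which is not part of the original activity.

```python
import pandas as pd

# Hypothetical rows for illustration; the PM25 values are invented.
toy = pd.DataFrame({
    "station": ["Aotizhongxin", "Changping", "Changping",
                "Dingling", "Dongsi", "Aotizhongxin"],
    "PM25": [300.0, 40.0, 260.0, 410.0, 120.0, 255.0],
})

# Keep only the rows at or above the 250 threshold used in the activity.
high = toy.loc[toy["PM25"] >= 250]

# Distinct stations among the high readings, in order of first appearance.
stations = high["station"].unique()

# Number of high readings per station; a single dominant station
# could hint at a sensor or storage problem rather than real pollution.
per_station = high["station"].value_counts()

print(stations)
print(per_station)
```

If the counts were concentrated in one station, that would be a reason to inspect that station's records more closely before treating the values as genuine.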