Data cleaning level I – cleaning up the table
Data cleaning level I involves the shallowest data preprocessing steps. Most of the time, you can get away with not having your data cleaned at level I. However, a dataset that is level I clean is very rewarding, as it makes the rest of the data cleaning process and the data analytics much easier.
We will consider a dataset to be level I clean when it has the following characteristics:
- It is in a standard and preferred data structure.
- It has codable and intuitive column titles.
- Each row has a unique identifier.
For ease of learning, each of the following three examples features one or a combination of the preceding characteristics.
Example 1 – unwise data collection
From time to time, you might come across sources of data that are not collected and recorded in the best possible way. These situations occur when the data collection process has been done by someone or a group of people...