Missing data
One of the biggest problems in real-world data is missing data. In carefully planned experiments on inanimate chemicals, small samples of rats, or highly mechanized factories, missing data may not be much of a problem. However, once a dataset gets large enough, or starts to involve humans, missing data is almost a certainty. Let's begin by pointing out that if you have missing data, you have a missing data problem, and you have to do something about it; the question is what. The answer depends on the kind of bias the missing data introduces.
Computational aspects of missing data in R
Before we delve into the statistical aspects of missing data, we need to review the computational ones. R has at least two distinct ways of representing missing data: NA and NULL. NA is a missing value, but there are multiple types of NA, and R will automatically coerce a missing value to what it thinks is the appropriate type. For example, in the...
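As a rough illustration of the distinction between NA and NULL and of the typed variants of NA, the following minimal sketch uses only base R. The variable names x and y are placeholders chosen for this example and are not taken from the text.

```r
# NA is a placeholder for a missing value and still occupies a slot in a vector.
x <- c(1, NA, 3)
length(x)              # 3: the NA counts as an element
is.na(x)               # FALSE TRUE FALSE

# NA comes in several types; a bare NA is logical until R coerces it.
class(NA)              # "logical"
class(NA_integer_)     # "integer"
class(NA_real_)        # "numeric"
class(NA_character_)   # "character"
class(c("a", NA))      # "character": the NA is coerced to match its neighbour

# NULL is the absence of an object, so it takes up no slot at all.
y <- c(1, NULL, 3)
length(y)              # 2: the NULL is dropped when the vector is built
is.null(NULL)          # TRUE
length(NULL)           # 0
```

One practical consequence is that is.na() is the natural test for missing values inside a vector, while is.null() tests whether an object exists at all.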