Use of simpler data structures
Many R users would agree that data.frame is the workhorse data structure for data analysis in R. It provides an intuitive way to represent a typical structured dataset, with rows and columns representing observations and variables respectively. A data.frame object is also more flexible than a matrix because it allows variables of different types (such as character and numeric variables in a single data.frame). Furthermore, when a data.frame stores only variables of the same type, basic matrix operations can conveniently be applied to it without any explicit coercion. This convenience, however, can come at a cost in performance.
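As a minimal sketch of this convenience, the following R snippet applies a few base R matrix operations directly to an all-numeric data.frame and checks that the results match those computed on the equivalent matrix. The toy data and the names df and m are illustrative assumptions, not taken from the text.

```r
# Toy all-numeric data.frame (illustrative values only)
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

# Explicit matrix version of the same data
m <- as.matrix(df)

# Base R matrix operations accept the data.frame directly
row_totals_df <- rowSums(df)
col_means_df  <- colMeans(df)
transposed_df <- t(df)

# The same operations on the matrix
row_totals_m <- rowSums(m)
col_means_m  <- colMeans(m)

# The values agree (names are dropped before comparing)
same_rows <- isTRUE(all.equal(unname(row_totals_df), unname(row_totals_m)))
same_cols <- isTRUE(all.equal(unname(col_means_df), unname(col_means_m)))

# Transposing a data.frame returns a matrix, so a coercion happened implicitly
result_is_matrix <- is.matrix(transposed_df)
```

In this sketch, result_is_matrix is TRUE even though a data.frame was passed in, which hints at the implicit coercion that takes place behind the scenes.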
Applying a matrix operation to a data.frame is slower than applying it to a matrix. One reason is that most matrix operations first coerce the data.frame into a matrix before performing the computation. For this reason, where possible, one should use a matrix in place of a data.frame. The next code demonstrates...