Using memory-efficient data structures
One of the first things to consider when you work with a large dataset is whether the same information can be stored and processed using more memory-efficient data structures. But first we need to know how data is stored in R. Vectors are the basic building blocks of almost all data types in R. R provides atomic vectors of logical, integer, numeric, complex, character, and raw types. Many other data structures are also built from vectors. Lists, for example, are essentially vectors in R's internal storage structures. They differ from atomic vectors in that they store pointers to other R objects rather than atomic values, which is why lists can contain objects of different types.
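The distinction between atomic vectors and lists can be seen in a short, illustrative sketch. The variable names (`atomic_types`, `mixed_vec`, `mixed_list`) are made up for this example; the functions used are all from base R.

```r
# Each atomic vector holds values of exactly one type.
atomic_types <- c(
  typeof(c(TRUE, FALSE)),  # logical
  typeof(1:3),             # integer
  typeof(c(1.5, 2)),       # double (R's numeric type)
  typeof(1+2i),            # complex
  typeof("a"),             # character
  typeof(as.raw(255))      # raw
)
print(atomic_types)

# Mixing types in an atomic vector coerces everything to one common type.
mixed_vec <- c(1L, 2.5, "three")
print(typeof(mixed_vec))   # "character"

# A list stores references to separate R objects, so each element
# keeps its own type.
mixed_list <- list(1L, 2.5, "three", c(TRUE, FALSE))
print(sapply(mixed_list, typeof))

# A list is still a vector as far as R is concerned.
print(is.vector(mixed_list))  # TRUE
```

Because every list element is a full R object in its own right, a list of many small values can be expected to use more memory than an atomic vector holding the same values.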
Let's examine how much memory is required for each of the atomic data types. To do that, we will create vectors of each type with 1 million elements and measure their memory consumption using object.size()
(for character vectors, we will call rep.int(NA_character_, 1e6...