Summary
In this chapter, we learned how R stores vectors in memory and how to estimate the amount of memory required for different types of data. We also learned how to use more efficient data structures, such as sparse matrices and bit vectors, to store certain types of data so that they can be loaded and processed entirely in memory.
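As a brief reminder of what these data structures look like in practice, here is a minimal sketch using the Matrix and bit packages; the matrix dimensions and density are illustrative:

library(Matrix)
library(bit)

# A 10,000 x 10,000 matrix with ~0.1% nonzero entries, stored sparsely
m <- rsparsematrix(10000, 10000, density = 0.001)
print(object.size(m), units = "MB")  # far smaller than the dense equivalent

# Ten million logical flags packed into a bit vector
b <- as.bit(sample(c(TRUE, FALSE), 1e7, replace = TRUE))
print(object.size(b), units = "MB")  # roughly 1/32 the size of a logical vector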
For datasets that are still too large, we used big.matrix, ff, and ffdf objects to store data on disk using memory-mapped files and processed the data one chunk at a time. The bigmemory and ff packages, along with their companion packages, provide a rich set of functionality for memory-mapped files that cannot be covered fully in this book. We encourage you to look up the documentation for these packages to learn more about how to take advantage of the power of memory-mapped files when you handle large datasets.
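To recap the chunk-by-chunk pattern, here is a minimal sketch using a file-backed big.matrix and an ff vector; the file names and sizes are illustrative:

library(bigmemory)
library(ff)

# A file-backed big.matrix: the data live on disk, not in RAM
bm <- filebacked.big.matrix(nrow = 1e6, ncol = 10, type = "double",
                            backingfile = "example.bin",
                            descriptorfile = "example.desc")
bm[1, ] <- rnorm(10)  # read and write like an ordinary matrix

# An ff vector summed one chunk at a time
x <- ff(rnorm(1e6))
total <- 0
for (idx in chunk(x)) {
    total <- total + sum(x[idx])
}
total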
In the next chapter, we will look beyond running R in a single process or thread, and learn how to run R computations in parallel.