How data moves internally
Before we dive into how to load and save data, let's explore how external storage systems, databases, CPUs, RAM, GPUs, and local storage systems interact to get your data ready for processing.
We'll divide the process of loading data into four categories.
File to RAM
A file is loaded from disk to RAM using the CPU:
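As a rough illustration of this path, the sketch below writes a small CSV file to disk and reads it back into RAM using only Python's standard library. It is not Optimus code; the helper name `load_csv_to_ram` and the sample file are hypothetical and exist only for this example.

```python
import csv
import os
import tempfile


def load_csv_to_ram(path):
    """Read a CSV file from disk into a list of dicts held in RAM.

    Hypothetical helper for illustration; the CPU does the reading and parsing.
    """
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


# Write a small sample file to disk so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.csv")
    with open(sample_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "age"])
        writer.writerow(["Ada", "36"])
        writer.writerow(["Alan", "41"])

    # The file contents now live in RAM as ordinary Python objects.
    rows = load_csv_to_ram(sample_path)
```

Once `load_csv_to_ram` returns, the data no longer depends on the file: the temporary directory is deleted at the end of the `with` block, but `rows` remains available in memory.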
File to GPU memory
A file is loaded from disk to GPU memory using the CPU. To save the file to disk, the data is sent from GPU memory back to the CPU, and then to disk:
Database to RAM
The data is extracted from a database and sent via a JDBC driver to RAM on your local laptop or cluster:
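The same idea can be sketched in plain Python. Note the substitution: instead of a JDBC driver, this example uses the standard library's `sqlite3` module with a file-based SQLite database, simply because it runs without any extra setup. The table, the data, and the helper name `fetch_table_to_ram` are hypothetical.

```python
import os
import sqlite3
import tempfile


def fetch_table_to_ram(db_path, query):
    """Run a query and return every row as a dict held in RAM.

    Hypothetical helper; sqlite3 stands in for the JDBC driver here.
    """
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute(query)
        columns = [d[0] for d in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    # Create a small on-disk database so the example is self-contained.
    db_path = os.path.join(tmp, "example.db")
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
    conn.executemany(
        "INSERT INTO people VALUES (?, ?)",
        [("Ada", 36), ("Alan", 41)],
    )
    conn.commit()
    conn.close()

    # Extract the rows from the database into RAM.
    records = fetch_table_to_ram(db_path, "SELECT name, age FROM people ORDER BY age")
```

The driver's role is the same in both cases: it sends the query to the database and hands the resulting rows to the calling process, which then holds them in memory.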
Database to GPU memory
In Optimus, the data is extracted from...