Data Layers
An AI system consists of multiple data storage layers connected by Extract, Transform, and Load (ETL) or Extract, Load, and Transform (ELT) pipelines. Each storage solution has its own requirements, which depend on the type of data it stores and on how that data is used. The following figure shows this concept:
From a high-level viewpoint, the backend (and thus, the storage systems) of an AI solution is split up into three parts or layers:
- Raw data layer: Contains copies of files from source systems. Also known as the staging area.
- Historical data layer: The core of a data-driven system, containing an overview of data that has been gathered from multiple source systems over time. By stacking the data rather than replacing or updating old values, history is preserved and time travel (being able to query the data as it stood at a point in the past) is...
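
The stacking idea behind the historical data layer can be illustrated with a minimal in-memory sketch. It assumes that every load appends a new row with a load timestamp instead of overwriting the old value; the names `HistoryTable`, `Version`, `append`, and `as_of` are hypothetical and chosen only for this illustration, not taken from any particular product.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Version:
    """One stacked row: a value for a key, plus the moment it was loaded."""
    key: str
    value: str
    loaded_at: datetime


class HistoryTable:
    """Hypothetical append-only store: rows are added, never updated in place."""

    def __init__(self) -> None:
        self._rows: List[Version] = []

    def append(self, key: str, value: str, loaded_at: datetime) -> None:
        # Stack a new version on top of the existing ones.
        self._rows.append(Version(key, value, loaded_at))

    def as_of(self, key: str, moment: datetime) -> Optional[str]:
        # Time travel: return the latest value loaded at or before `moment`.
        candidates = [
            row for row in self._rows
            if row.key == key and row.loaded_at <= moment
        ]
        if not candidates:
            return None
        return max(candidates, key=lambda row: row.loaded_at).value


history = HistoryTable()
history.append("customer-1", "Amsterdam", datetime(2020, 1, 1))
history.append("customer-1", "Berlin", datetime(2021, 6, 1))

city_2020 = history.as_of("customer-1", datetime(2020, 12, 31))
city_now = history.as_of("customer-1", datetime(2022, 1, 1))
city_before = history.as_of("customer-1", datetime(2019, 1, 1))
```

Because the first row is never overwritten, the query for the end of 2020 still sees the old value, while a query for a later date sees the new one. A real historical layer would typically use a database or file format with comparable append-only semantics rather than a Python list.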