Although files are the most primitive way to store data, they are still widely used and come in several formats, such as fixed-width, comma-separated values (CSV), spreadsheets, or even free-format files. Pentaho Data Integration (PDI) can read data from a wide variety of file types. This section shows how to use PDI to get data from these files.
Reading data from files
Reading a simple file
In this section, you will learn to read one of the most common input sources: plain files.
For demonstration purposes, we will use a simplified version of sales_data.csv that comes with the PDI bundle. Our sample file looks as follows:
ORDERDATE,ORDERNUMBER,ORDERLINENUMBER,PRODUCTCODE,PRODUCTLINE,QUANTITYORDERED...
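Before configuring anything in PDI, it can help to check which fields a delimited file exposes. The following is a minimal sketch in plain Python, not part of PDI, that splits the visible part of the header line above into field names. The helper name read_field_names is hypothetical, and because the header is truncated in the sample, only the columns shown are used.

```python
import csv
import io

# Visible part of the header line from the sample above. The original line is
# truncated, so any remaining columns are intentionally left out here.
HEADER_LINE = (
    "ORDERDATE,ORDERNUMBER,ORDERLINENUMBER,"
    "PRODUCTCODE,PRODUCTLINE,QUANTITYORDERED"
)


def read_field_names(text, delimiter=","):
    """Return the column names found on the first line of delimited text.

    Hypothetical helper for illustration only; it is not a PDI function.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return next(reader)


field_names = read_field_names(HEADER_LINE)

# List each field with its 1-based position in the file.
for position, name in enumerate(field_names, start=1):
    print(position, name)
```

To inspect a real file, the same csv.reader call can be applied to an open file object instead of the in-memory string used here.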