Redshift Spectrum
Another topic I want to introduce before we dive into Redshift’s geospatial support is a feature called Spectrum. Spectrum lets Redshift query data sitting in your S3 data lake as if it were stored locally in the cluster. This can save a great deal of money, because you can keep petabytes of data on cheap S3 storage instead of on expensive disk storage attached to your database. Spectrum also lets you write SQL joins that combine data on S3 with data local to the cluster.
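To make this concrete, here is a minimal sketch of what querying S3 through Spectrum can look like. Every name in it is an assumption made up for illustration: the schema `spectrum_lake`, the catalog database `lake_db`, the IAM role ARN, the S3 bucket, and the `trips` and `cities` tables. It also assumes the files on S3 are stored as Parquet and that the role is allowed to read them.

```sql
-- Hypothetical names throughout: spectrum_lake, lake_db, the IAM role ARN,
-- the S3 bucket, and both tables are placeholders for illustration.

-- Register an external schema backed by the AWS Glue Data Catalog.
CREATE EXTERNAL SCHEMA spectrum_lake
FROM DATA CATALOG
DATABASE 'lake_db'
IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleSpectrumRole'
CREATE EXTERNAL DATABASE IF NOT EXISTS;

-- Describe Parquet files that already sit in S3.
-- No data is copied into the cluster by this statement.
CREATE EXTERNAL TABLE spectrum_lake.trips (
    trip_id     BIGINT,
    city_id     INTEGER,
    distance_km DOUBLE PRECISION
)
STORED AS PARQUET
LOCATION 's3://example-bucket/trips/';

-- Join the S3-backed table against a table stored locally in the cluster.
SELECT c.city_name, COUNT(*) AS trip_count
FROM spectrum_lake.trips AS t
JOIN public.cities AS c ON c.city_id = t.city_id
GROUP BY c.city_name;
```

Once the external table is defined, the final query reads like any other Redshift query; the only hint that `trips` lives on S3 is the external schema name in front of it.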
Spectrum can also take advantage of partitioning, similar to what we mentioned in the previous section. This is not the same as the ALL, EVEN, and KEY distribution styles; it is a simpler folder-based partitioning of the files on S3. Like our previous example, where we queried for the city name, you can divide...