Querying large historical data with Redshift Spectrum
Amazon Redshift Spectrum is a feature of Amazon Redshift that lets you query exabytes of data stored in Amazon S3 directly, without first loading it into Redshift tables. This is useful when, for example, a historical dataset has grown to span multiple years, or when multiple Redshift workgroups need to query the same dataset. You can query both scalar data and nested data formats stored in Amazon S3.
Redshift Spectrum queries run on dedicated Amazon Redshift servers that are independent of your existing cluster.
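As a rough illustration of what querying S3 data in place involves, the following Python sketch only assembles the SQL statements that are typically run from an SQL client: an external schema, an external table that points at an S3 location, and a query against it. The schema name, Data Catalog database, IAM role ARN, bucket, table, and columns are all hypothetical placeholders, and Parquet is assumed as the file format. Treat it as a sketch under those assumptions rather than the exact steps of this recipe.

```python
# Minimal sketch: build the SQL for a typical Redshift Spectrum setup.
# Every identifier passed in below (schema, database, role ARN, bucket,
# table, columns) is a hypothetical placeholder, not a value from this recipe.
# Values are interpolated directly, so only pass trusted identifiers.

def external_schema_sql(schema, catalog_db, iam_role_arn):
    """Return a CREATE EXTERNAL SCHEMA statement backed by the data catalog."""
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}\n"
        f"FROM DATA CATALOG\n"
        f"DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )


def external_table_sql(schema, table, columns, s3_location):
    """Return a CREATE EXTERNAL TABLE statement for Parquet files in S3."""
    cols = ",\n".join(f"    {name} {dtype}" for name, dtype in columns)
    return (
        f"CREATE EXTERNAL TABLE {schema}.{table} (\n{cols}\n)\n"
        f"STORED AS PARQUET\n"
        f"LOCATION '{s3_location}';"
    )


if __name__ == "__main__":
    print(external_schema_sql(
        "spectrum",
        "historical_db",
        "arn:aws:iam::123456789012:role/ExampleSpectrumRole",
    ))
    print(external_table_sql(
        "spectrum",
        "sales_history",
        [("sale_year", "INT"), ("amount", "DECIMAL(12,2)")],
        "s3://example-bucket/sales/",
    ))
    # The data stays in S3; the external table is queried like any other table.
    print("SELECT sale_year, SUM(amount) FROM spectrum.sales_history GROUP BY sale_year;")
```

The point of the sketch is that no COPY or INSERT step appears anywhere: the external table only records where the files live, and the data is read from Amazon S3 at query time.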
Getting ready
To use Amazon Redshift Spectrum, you need to have the following:
- An SQL client: Use this to connect to Amazon Redshift and run queries.
- At least three subnets: Each subnet should be in a different Availability Zone (AZ) for your Redshift Serverless workgroup. Make sure you understand how the subnet mask and subnet Classless Inter-Domain Routing (CIDR) block...