Using AWS SDK for pandas, the Redshift Data API, and Lambda to execute SQL statements
The AWS SDK for pandas library (previously known as AWS Data Wrangler) provides a pandas interface to various AWS services, including Glue, Redshift, Athena, and more. It is handy for data scientists and analysts who already use pandas and want to interact with AWS data and analytics services. One of its use cases is simplifying the querying and manipulation of data stored in AWS data stores.
In this recipe, we will demonstrate how to use AWS SDK for pandas with the Redshift Data API inside a Lambda function that performs two steps:
- Execute a SQL statement to retrieve the data as a pandas DataFrame.
- Save the DataFrame to S3.
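The two steps above can be sketched as a single handler. This is a minimal illustration rather than the recipe's final code: the environment variable names (`REDSHIFT_CLUSTER_ID`, `REDSHIFT_DATABASE`, `REDSHIFT_DB_USER`, `OUTPUT_S3_PATH`), the `sql` key in the event, and the choice of Parquet as the output format are all assumptions made for the example. The optional `wr` argument is a hypothetical hook that lets the handler be exercised locally with a stand-in object; Lambda itself only passes `event` and `context`.

```python
import os


def lambda_handler(event, context, wr=None):
    """Run a SQL statement through the Redshift Data API and save the result to S3."""
    if wr is None:
        # Imported lazily so the module loads even where the library is absent
        # (for example, in a local unit test that supplies a stand-in).
        import awswrangler as wr

    # Assumed configuration: these environment variable names are illustrative.
    con = wr.data_api.redshift.connect(
        cluster_id=os.environ["REDSHIFT_CLUSTER_ID"],
        database=os.environ["REDSHIFT_DATABASE"],
        db_user=os.environ["REDSHIFT_DB_USER"],
    )

    # Step 1: execute the SQL statement and get the rows back as a DataFrame.
    # The "sql" event key is an assumption for this sketch.
    df = wr.data_api.redshift.read_sql_query(event["sql"], con=con)

    # Step 2: save the DataFrame to S3 (Parquet chosen here as an example).
    # OUTPUT_S3_PATH is assumed to be a full object path ending in .parquet.
    output_path = os.environ["OUTPUT_S3_PATH"]
    wr.s3.to_parquet(df=df, path=output_path)

    return {"row_count": len(df), "output_path": output_path}
```

Reading the connection details from environment variables keeps cluster-specific values out of the code, so the same function can be pointed at a different cluster or bucket by changing only the Lambda configuration.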
Getting ready
Before starting, ensure you have the following prerequisites:
- AWS Lambda set up with the necessary execution role permissions to interact with Redshift...