Ingesting and Transforming Your Data with AWS Glue
In data engineering, integrating systems to extract, transform, and load (ETL) data is frequently what consumes the most time and cost. AWS Glue is a serverless data integration service that provides different engines and tools for building ETL jobs in a simple and scalable way, and you pay only for what you use.
Glue comprises many components and features that serve many kinds of data products and users, including a Hive-compatible metastore and multiple engines for different needs: single-node Glue Python shell jobs, distributed clusters that auto-scale using Glue for Spark, and the latest addition, Glue for Ray. Each of those engines has connectors for common data storage systems.
Glue also offers on-demand clusters through interactive sessions, which can be used for interactive development and analysis in Jupyter notebooks (either provided by Glue or your own). Finally, Glue Studio offers a visual environment...