Understanding file formats
Now, let’s talk a bit about the different data formats used in these workflows. Much GIS data is still distributed as the ubiquitous shapefile, but there are a lot of up-and-coming big data formats with promising potential. Shapefiles work well on AWS: Amazon Redshift, an AWS-managed data warehouse service, can ingest shapefiles natively, which makes it a powerful way to work with geospatial data stored in that format. You can easily script ETL jobs on AWS so that when a shapefile is uploaded to Amazon S3, our object storage service, S3 emits an event and the shapefile is picked up and ingested into Redshift for processing. Once loaded, the data is queryable and available to other analytic applications and services.
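To make the upload-to-ingest flow more concrete, here is a minimal sketch of what such an ETL hook might look like as an AWS Lambda function written in Python. It assumes the function is subscribed to S3 object-created events and submits a Redshift COPY statement with the FORMAT SHAPEFILE option through the Redshift Data API. The table name, IAM role ARN, cluster identifier, database, and database user below are hypothetical placeholders, and the target table is assumed to already exist with a geometry column. Treat this as one possible approach rather than a reference implementation.

```python
from urllib.parse import unquote_plus

# Hypothetical placeholders: substitute your own table, role, cluster, and database.
TARGET_TABLE = "public.parcels"
IAM_ROLE_ARN = "arn:aws:iam::123456789012:role/example-redshift-copy"
CLUSTER_ID = "example-cluster"
DATABASE = "dev"
DB_USER = "etl_user"


def shapefile_uris(event):
    """Return s3:// URIs for every .shp object named in an S3 event."""
    uris = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.lower().endswith(".shp"):
            uris.append(f"s3://{bucket}/{key}")
    return uris


def build_copy_sql(uri, table=TARGET_TABLE, iam_role=IAM_ROLE_ARN):
    """Build a Redshift COPY statement that loads one shapefile."""
    return f"COPY {table} FROM '{uri}' FORMAT SHAPEFILE IAM_ROLE '{iam_role}';"


def handler(event, context=None, client=None):
    """Lambda entry point: submit one COPY per uploaded shapefile."""
    if client is None:
        import boto3  # imported lazily so the helpers above run without AWS access
        client = boto3.client("redshift-data")
    statement_ids = []
    for uri in shapefile_uris(event):
        response = client.execute_statement(
            ClusterIdentifier=CLUSTER_ID,
            Database=DATABASE,
            DbUser=DB_USER,
            Sql=build_copy_sql(uri),
        )
        statement_ids.append(response["Id"])
    return statement_ids
```

A shapefile is really a small set of files (.shp, .shx, .dbf), and Redshift expects them to sit side by side under the same S3 prefix. Because S3 emits one event per object, a production version of this sketch would likely need to confirm that the companion files have arrived before issuing the COPY.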
After shapefiles, I am seeing JSON and GeoJSON as preferred formats. Many users are familiar with JSON syntax, which is easy to read and write. There has been some standardization done in the...