Geospatial Data Analytics on AWS

You're reading from Geospatial Data Analytics on AWS: Discover how to manage and analyze geospatial data in the cloud

Product type: Paperback
Published in: Jun 2023
Publisher: Packt
ISBN-13: 9781804613825
Length: 276 pages
Edition: 1st Edition
Authors (3): Scott Bateman, Jeff DeMuth, Janahan Gnanachandran


Geospatial data management best practices

The single most important consideration in a data management strategy is a deep understanding of the use cases the data is intended to support. Data ingestion workflows need to eliminate bottlenecks in write performance. Geospatial transformation jobs need access to powerful computational resources and the ability to cache large amounts of data temporarily in memory. Analytics and visualization require fast search and retrieval of geospatial data. These core disciplines of geospatial data management have benefited from decades of work by the geospatial community, which has driven AWS to create pathways for implementing these best practices in the cloud.

Data – it’s about both quantity and quality

A long-standing anti-pattern of data management is to rely primarily on folder structures or table names to infer meaning about datasets. Having naming standards is a good thing, but it is not a substitute for a well-formed data management strategy. Naming conventions invariably change over time and can never fully account for the future evolution of data and the resulting taxonomy. In addition to the physical structure of the data, instrumenting your resources with predefined tags and metadata becomes crucial in cloud architectures. This is because AWS provides built-in capabilities for attaching more information to your geospatial data, and many of its tools and services are built to consume and understand these designations. Enriching your geospatial data with appropriate metadata is a best practice in the cloud, just as it is for any GIS.
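As a minimal sketch of what "predefined tags" can look like in practice, the Python snippet below validates a dataset's tags against a small controlled vocabulary before they are applied. The tag names (`crs`, `data-owner`, `criticality`) and the allowed values are hypothetical and only illustrate the idea; they are not taken from the book. The returned list follows the key/value `TagSet` shape that Amazon S3 uses for object tags.

```python
# Hypothetical tag vocabulary (an assumption for illustration);
# replace it with your organization's own standard.
REQUIRED_TAGS = {"crs", "data-owner", "criticality"}
ALLOWED_CRITICALITY = {"high", "medium", "low"}
MAX_S3_OBJECT_TAGS = 10  # Amazon S3 allows at most 10 tags per object


def build_tag_set(tags):
    """Validate a dict of tags and return it as an S3-style TagSet list."""
    missing = REQUIRED_TAGS - tags.keys()
    if missing:
        raise ValueError(f"missing required tags: {sorted(missing)}")
    if tags["criticality"] not in ALLOWED_CRITICALITY:
        raise ValueError(f"unknown criticality: {tags['criticality']!r}")
    if len(tags) > MAX_S3_OBJECT_TAGS:
        raise ValueError("too many tags for a single S3 object")
    # Sort by key so the output is deterministic and easy to compare.
    return [{"Key": key, "Value": value} for key, value in sorted(tags.items())]


tag_set = build_tag_set(
    {"crs": "EPSG:4326", "data-owner": "gis-team", "criticality": "high"}
)
```

If you use boto3, a list like `tag_set` can be passed as `Tagging={"TagSet": tag_set}` to the S3 client's `put_object_tagging` call. Validating before the call keeps ad hoc or misspelled tags out of the data lake.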

Another best practice is to quantify your data quality. Simply having a hunch that your data is good or bad is not sufficient. Mature organizations not only describe the quality of their data quantitatively with continually assessed scores but also track those scores to ensure that the quality of critical data improves over time. For example, if you have a dataset of addresses, it is important to know what percentage of the addresses are invalid. Hopefully, that percentage is 0, but that is very rarely the case. More important than having 100% accurate data is having confidence in the quality of a given dataset as it stands today. Neighborhoods are being built every day. Separate buildings are torn down to create apartment complexes. Perfect data today may not be perfect data tomorrow, so the most important aspect of data quality is real-time transparency. A threshold should be set to determine the acceptable data quality based on the criticality of the dataset. High-priority geospatial data should be held to a high bar for quality, while infrequently used, low-impact datasets don't require the same focus. Categorizing your data based on importance allows you to establish guidelines by category. This approach allows finite resources to be directed toward the most pressing concerns to maximize value.
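The address example can be sketched in a few lines of Python: compute the fraction of valid records, then compare it with a threshold chosen by the dataset's criticality. The validity rule (a non-empty street and a five-digit postal code), the field names, and the threshold values are all assumptions made for illustration; a real pipeline would use its own rules or an address validation service.

```python
# Hypothetical minimum fraction of valid records per criticality category.
MIN_VALID_FRACTION = {"high": 0.99, "medium": 0.95, "low": 0.80}


def is_valid_address(record):
    """Toy rule (an assumption): non-empty street and a 5-digit postal code."""
    street = (record.get("street") or "").strip()
    postal = (record.get("postal_code") or "").strip()
    return bool(street) and len(postal) == 5 and postal.isascii() and postal.isdigit()


def quality_score(records):
    """Return the fraction of records that pass the validity rule."""
    if not records:
        raise ValueError("cannot score an empty dataset")
    valid = sum(1 for record in records if is_valid_address(record))
    return valid / len(records)


def meets_threshold(score, criticality):
    """Check a score against the bar set for the dataset's criticality."""
    return score >= MIN_VALID_FRACTION[criticality]


addresses = [
    {"street": "100 Main St", "postal_code": "98101"},
    {"street": "22 Oak Ave", "postal_code": "30301"},
    {"street": "", "postal_code": "10001"},  # invalid: missing street
    {"street": "5 Pine Rd", "postal_code": "73301"},
]
score = quality_score(addresses)  # 3 of 4 records are valid
```

Storing each score with the date it was computed makes it possible to track whether the quality of a critical dataset is improving over time, rather than relying on a one-off assessment.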

People, processes, and technology are equally important

Managing geospatial data successfully in the cloud relies on more than just the technology tools offered by AWS. Designating appropriate roles and responsibilities in your organization ensures that your cloud ecosystem will be sustainable. Avoid single points of failure with respect to skills or tribal knowledge of your environment. Having at least a primary and a secondary person to cover each area adds resiliency to your people operations. Not only does this give you more flexibility in coverage and task assignment, but it also creates training opportunities within your team and allows team members to continually learn and improve their skills.

Next, let’s move on to talk about how to stretch your geospatial dollars to do more with less.
