Python GeoSpatial Analysis Essentials

You're reading from Python GeoSpatial Analysis Essentials: Process, analyze, and display geospatial data using Python libraries and related tools

Product type Paperback
Published in Jun 2015
ISBN-13 9781782174516
Length 200 pages
Edition 1st Edition
Author: Erik Westra

Understanding geospatial data

Geospatial data is data that positions things on the Earth's surface. This is a deliberately vague definition that encompasses the ideas of both location and shape. For example, a database of car accidents may include the latitude and longitude coordinates identifying where each accident occurred, and a file of county outlines would include both the position and shape of each county. Similarly, a GPS recording of a journey would include the position of the traveler over time, tracing out the path they took on their travels.

It is important to realize that geospatial data includes more than just the geospatial information itself. For example, the following outlines are not particularly useful by themselves:

[Figure: a set of outlines shown without any metadata]

Once you add appropriate metadata, however, these outlines make a lot more sense:

[Figure: the same outlines with metadata added]

Geospatial data, therefore, includes both spatial information (locations and shapes) and non-spatial information (metadata) about each item being described.

Spatial information is usually represented as a series of coordinates, for example:

location = (-38.136734, 176.252300)
outline = ((-61.686,17.024),(-61.738,16.989),(-61.829,16.996) ...)

These numbers won't mean much to you directly, but once you plot these series of coordinates onto a map, the data suddenly becomes comprehensible:

[Figure: the coordinates plotted on a map]

There are two fundamental types of geospatial data:

  • Raster data: This is geospatial data that divides the world up into cells and associates values with each cell. This is very similar to the way that bitmapped images divide an image up into pixels and associate a color with each pixel; for example:
    [Figure: example of raster data divided into cells]

    The value of each cell might represent the color to use when drawing the raster data on a map—this is often done to provide a raster basemap on which other data is drawn—or it might represent other information such as elevation, moisture levels, or soil type.

  • Vector data: This is geospatial data that consists of a list of features. For example, a shapefile containing countries would have one feature for each country. For each feature, the geospatial dataset will have a geometry, which is the shape associated with that feature, and any number of attributes containing the metadata for that feature.

    A feature's geometry is just a geometric shape that is positioned on the surface of the earth. This geometric shape is made up of points, lines (sometimes referred to as LineStrings), and polygons, or some combination of these three fundamental types:

    [Figure: points, lines, and polygons]
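The difference between the two types can be sketched with plain Python data structures. This is only an illustration, not how any particular library stores its data: the elevation values, the attribute names, and the helper functions cell_value and names_by_geometry_type are all hypothetical, and the coordinates are copied from the example above in the order they were given.

```python
# Raster data: a grid of cells, each holding one value.
# Hypothetical 3x4 elevation grid; the numbers are made up for illustration.
elevation = [
    [12, 15, 21, 30],
    [10, 14, 19, 27],
    [8, 11, 16, 22],
]


def cell_value(grid, row, col):
    """Return the value stored in one raster cell."""
    return grid[row][col]


# Vector data: a list of features, each with a geometry and attributes.
# The coordinates reuse the sample values shown earlier; the attribute
# names and values are hypothetical.
features = [
    {
        "geometry": {"type": "Point",
                     "coordinates": (-38.136734, 176.252300)},
        "attributes": {"name": "Sample location"},
    },
    {
        "geometry": {"type": "LineString",
                     "coordinates": ((-61.686, 17.024),
                                     (-61.738, 16.989),
                                     (-61.829, 16.996))},
        "attributes": {"name": "Sample outline segment"},
    },
]


def names_by_geometry_type(feature_list, geometry_type):
    """Return the 'name' attribute of every feature with the given geometry type."""
    return [f["attributes"]["name"]
            for f in feature_list
            if f["geometry"]["type"] == geometry_type]


top_right = cell_value(elevation, 0, 3)
point_names = names_by_geometry_type(features, "Point")
```

In the raster grid a value is looked up by cell position, while in the vector list each feature carries its own shape and metadata together.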

The typical raster data formats you might encounter include:

  • GeoTIFF files, which are basically just TIFF format image files with georeferencing information added to position the image accurately on the earth's surface.
  • USGS .dem files, which hold a Digital Elevation Model (DEM) in a simple ASCII data format.
  • .png, .bmp, and .jpeg format image files, with associated georeferencing files to position the images on the surface of the earth.
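To make the idea of a georeferencing file more concrete, here is a minimal sketch that assumes the file is a six-line "world file", one common sidecar format for plain images. The text above does not name a specific format, so treat that choice as an assumption. The sample values and the function names parse_world_file and pixel_to_map are hypothetical.

```python
def parse_world_file(text):
    """Parse the six numbers of a world file.

    The values appear in the order A, D, B, E, C, F, where:
      A, E = pixel size in the x and y directions (E is usually negative)
      D, B = rotation terms (usually 0)
      C, F = map coordinates of the center of the upper-left pixel
    """
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError("a world file must contain exactly six numbers")
    a, d, b, e, c, f = values
    return a, d, b, e, c, f


def pixel_to_map(params, col, row):
    """Convert a pixel position (col, row) into map coordinates (x, y)."""
    a, d, b, e, c, f = params
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Hypothetical world file: 0.5-unit pixels, no rotation,
# upper-left pixel centered at (100.0, 200.0).
sample = "0.5\n0.0\n0.0\n-0.5\n100.0\n200.0\n"
params = parse_world_file(sample)
upper_left = pixel_to_map(params, 0, 0)
other_pixel = pixel_to_map(params, 2, 4)
```

The negative E value reflects the fact that image rows count downward while map y coordinates usually increase upward.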

For vector-format data, you may typically encounter the following formats:

  • Shapefile: This is an extremely common file format used to store and share geospatial data.
  • WKT (Well-Known Text): This is a text-based format often used to convert geometries from one library or data source to another. This is also the format commonly used when retrieving features from a database.
  • WKB (Well-Known Binary): This is the binary equivalent of the WKT format, storing geometries as raw binary data rather than text.
  • GML (Geography Markup Language): This is an industry-standard format based on XML, and is often used when communicating with web services.
  • KML (Keyhole Markup Language): This is another XML-based format popularized by Google.
  • GeoJSON: This is a JSON-based format designed to store and transmit geometries along with their attributes.
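As a rough illustration of how three of these formats differ, the following sketch encodes a single point as WKT, WKB, and GeoJSON using only the Python standard library. The helper names are hypothetical, and in practice a geospatial library would normally do this conversion for you. The sketch assumes the sample location shown earlier is in latitude, longitude order, so it is swapped to x (longitude), y (latitude) here, which is the order these formats expect.

```python
import json
import struct


def point_to_wkt(x, y):
    """Encode a point as Well-Known Text."""
    return f"POINT ({x} {y})"


def point_to_wkb(x, y):
    """Encode a point as Well-Known Binary (little-endian).

    Layout: 1 byte for byte order (1 = little-endian), a 4-byte unsigned
    integer for the geometry type (1 = Point), then x and y as
    8-byte floating-point numbers.
    """
    return struct.pack("<BIdd", 1, 1, x, y)


def point_to_geojson(x, y):
    """Encode a point as a GeoJSON geometry string."""
    return json.dumps({"type": "Point", "coordinates": [x, y]})


# Sample location from earlier in this section, as (longitude, latitude).
x, y = 176.2523, -38.136734

wkt = point_to_wkt(x, y)
wkb = point_to_wkb(x, y)
geojson = point_to_geojson(x, y)
```

The same point is human-readable in WKT and GeoJSON, while the WKB form is a compact 21-byte value that is not meant to be read directly.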

Because your analysis can only be as good as the data you are analyzing, obtaining and using good-quality geospatial data is critical. Indeed, one of the big challenges in performing geospatial analysis is getting the right data for the job. Fortunately, there are several websites that provide free, good-quality geospatial data. But if you're looking for a more obscure set of data, you may have trouble finding it. Of course, you always have the option of creating your own data from scratch, though this is an extremely time-consuming process.

We will return to the topic of geospatial data in Chapter 2, Geospatial Data, where we will examine what makes good geospatial data and how to obtain it.
