Programming and running your first example
Now that we have all we need installed, we will go through our first example. In this example, we will test the installation and then see a glimpse of OGR's capabilities.
To do this, we will open a vector file containing the boundaries of all the countries in the world and make a list of country names. The objective of this simple example is to present the logic behind OGR objects and functions and give an understanding of how geospatial files are represented. Here's how:
- First, you need to copy the sample data provided with the book to your project folder. You can do this by dragging and dropping the data folder into the
geopy
folder. Make sure that thedata
folder is named data and that it's inside thegeopy
folder. - Now, create a new directory for this chapter code, inside PyCharm. With your
geopy
project opened, right-click on the project folder and select New | Directory. Name itChapter1
. - Create a new Python file. To do this, right-click on the
Chapter1
folder and select New | Python File. Name itworld_borders.py
, and PyCharm will automatically open the file for editing. - Type the following code in this file:
import ogr # Open the shapefile and get the first layer. datasource = ogr.Open("../data/world_borders_simple.shp") layer = datasource.GetLayerByIndex(0) print("Number of features: {}".format(layer.GetFeatureCount()))
- Now, run the code; in the menu bar, navigate to Run | Run, and in the dialog, choose
world_borders
. An output console will open at the bottom of the screen, and if everything goes fine, you should see this output:C:\Python27\python.exe C:/geopy/world_borders.py Number of features: 246 Process finished with exit code 0
Congratulations! You successfully opened a Shapefile and counted the number of features inside it. Now, let's understand what this code does.
The first line imports the ogr
package. From this point on, all the functions are available as ogr.FunctionName()
. Note that ogr
doesn't follow the Python naming conventions for functions.
The line after the comment opens the OGR datasource (this opens the shapefile containing the data) and assigns the object to the datasource
variable. Note that the path, even on Windows, uses a forward slash (/
) and not a backslash.
The next line gets the first layer of the data source by its index (0
). Some data sources can have many layers, but this is not the case of a Shapefile, which has only one layer. So, when working with Shapefiles, we always know that the layer of interest is layer 0.
In the final line, the print statement prints the number of features returned by layer.GetFeatureCount()
. Here, we will use Python's string formatting, where the curly braces are replaced by the argument passed to format()
.
Now, perform the following steps:
- In the same file, let's type the next part of our program:
# Inspect the fields available in the layer. feature_definition = layer.GetLayerDefn() for field_index in range(feature_definition.GetFieldCount()): field_definition = feature_definition.GetFieldDefn(field_index) print("\t{}\t{}\t{}".format(field_index, field_definition.GetTypeName(), field_definition.GetName()))
- Rerun the code; you can use the Shift + F10 shortcut for this. Now, you should see the number of features as before plus a pretty table showing information on all the fields in the shapefile, as follows:
Number of features: 246 0 String FIPS 1 String ISO2 2 String ISO3 3 Integer UN 4 String NAME 5 Integer POP2005 6 Integer REGION 7 Integer SUBREGION Process finished with exit code 0
What happens in this piece of code is that
feature_definition = layer.GetLayerDefn()
gets the object that contains the definition of the features. This object contains the definition for each field and the type of geometry.In the
for
loop, we will get each field definition and print its index, name, and type. Note that the object returned bylayer.GetLayerDefn()
is not iterable, and we can't usefor
directly with it. So first, we will get the number of fields and use it in therange()
function so that we can iterate through the indexes of the fields: - Now, type the last part, as follows:
# Print a list of country names. layer.ResetReading() for feature in layer: print(feature.GetFieldAsString(4))
- Run the code again and check the results in the output:
Number of features: 246 0 String FIPS 1 String ISO2 2 String ISO3 3 Integer UN 4 String NAME 5 Integer POP2005 6 Integer REGION 7 Integer SUBREGION Antigua and Barbuda Algeria Azerbaijan Albania Armenia Angola ... Saint Barthelemy Guernsey Jersey South Georgia South Sandwich Islands Taiwan Process finished with exit code 0
The layer is iterable, but first, we need to ensure that we are at the beginning of the layer list with layer.ResetReading()
(this is one of OGR's "gotcha" points).
The feature.GetFieldAsString(4)
method returns the value of field 4
as a Python string. There are two ways of knowing whether the country names are in field 4
:
- Looking at the data's DBF file (by opening it with LibreOffice or Excel)
- Looking at the table that we printed in the first part of the code
Your complete code should look similar to the following:
import ogr # Open the shapefile and get the first layer. datasource = ogr.Open("../data/world_borders_simple.shp") layer = datasource.GetLayerByIndex(0) print("Number of features: {}".format(layer.GetFeatureCount())) # Inspect the fields available in the layer. feature_definition = layer.GetLayerDefn() for field_index in range(feature_definition.GetFieldCount()): field_definition = feature_definition.GetFieldDefn(field_index) print("\t{}\t{}\t{}".format(field_index, field_definition.GetTypeName(), field_definition.GetName())) # Print a list of country names. layer.ResetReading() for feature in layer: print(feature.GetFieldAsString(4))