Reading images
Image data is probably the most common kind of non-text data. Images carry their own specific metadata, which can be read to filter values or perform other operations.
The main challenge is dealing with multiple formats and different metadata definitions. We'll show in this recipe how to get information from both a JPEG and a PNG, and how the same information can be encoded differently.
Getting ready
The best general toolkit for dealing with images in Python is, arguably, Pillow. This library allows you to easily read files in the most common formats, as well as perform operations on them. Pillow started as a fork of PIL (Python Imaging Library), a previous module that became stagnant some years ago.
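As a rough sketch of what reading different formats with Pillow looks like, the snippet below builds a tiny JPEG and a tiny PNG in memory and asks Pillow to describe each one. The helper names describe_image and make_sample are hypothetical and used only for illustration; the recipe itself works with real files on disk.

```python
import io

from PIL import Image


def describe_image(data):
    """Return the format, size and mode Pillow detects for raw image bytes."""
    with Image.open(io.BytesIO(data)) as image:
        return image.format, image.size, image.mode


def make_sample(image_format):
    """Build a tiny in-memory image so the example needs no file on disk."""
    buffer = io.BytesIO()
    Image.new('RGB', (4, 2), color='red').save(buffer, format=image_format)
    return buffer.getvalue()


# The same call works for both formats; Pillow detects the type from the data
for image_format in ('JPEG', 'PNG'):
    print(describe_image(make_sample(image_format)))
```

The same Image.open call handles both formats, which is the main reason to rely on a general toolkit rather than format-specific parsers.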
We will also use the xmltodict module to transform some data from XML into a more convenient dictionary. We will add both modules to requirements.txt and reinstall them in the virtual environment:
$ echo "Pillow==7.0.0" ...
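Once the modules are installed, a quick check along the following lines can confirm that xmltodict is available and show the kind of dictionary it produces. The XML fragment here is an invented sample, not real image metadata, and the variable names are illustrative assumptions.

```python
import xmltodict

# Invented sample XML; real image metadata will have a different structure
SAMPLE_XML = """<photo>
  <title>Sunset</title>
  <size width="640" height="480"/>
</photo>"""

metadata = xmltodict.parse(SAMPLE_XML)

# Element text becomes the value; attributes are stored under '@'-prefixed keys
title = metadata['photo']['title']
width = int(metadata['photo']['size']['@width'])

print(title, width)
```

Note that attribute values come back as strings, so numeric fields need an explicit conversion.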