Programming with laziness
Lazy evaluation of a data structure delays the computation of values until they are needed. It comes mostly from functional programming languages, but has been increasingly adopted by Python among other popular languages. Indeed, one of the biggest differences between Python 2 and Python 3 is that Python 3 tends to be lazier than Python 2. It turns out that lazy evaluation allows easier analysis of large datasets, generally requiring much less memory and sometimes performs much less computation.
Here, we will take a very simple example from Chapter 2, Next-generation Sequencing, we will take two paired-end read files and try to read them simultaneously (as order on both files represents the pair).
Getting ready
We will repeat part of the analysis performed on the FASTQ recipe in Chapter 2, Next-generation Sequencing. This will require two FASTQ files (SRR003265_1.filt.fastq.gz
and SRR003265_2.filt.fastq.gz
) that you can retrieve from https://github.com/tiagoantao...