Chapter 3. Iterating and Making Decisions
"Insanity: doing the same thing over and over again and expecting different results."
--Albert Einstein
In the previous chapter, we saw Python's built-in data types. Now that you're familiar with data in its many forms and shapes, it's time to start looking at how a program can use it.
According to Wikipedia:
In computer science, control flow (or alternatively, flow of control) refers to the specification of the order in which the individual statements, instructions or function calls of an imperative program are executed or evaluated.
In order to control the flow of a program, we have two main weapons: conditional programming (also known as branching) and looping. We can use them in many different combinations and variations, but in this chapter, instead of going through all the possible forms of those two constructs in a "documentation" fashion, I'd rather give you the basics and then write a couple of small scripts with you. In the first one, we'll see how to create a rudimentary prime number generator, while in the second one, we'll see how to apply discounts to customers based on coupons. This way you should get a better feeling for how conditional programming and looping can be used.
Conditional programming
Conditional programming, or branching, is something you do every day, every moment. It's about evaluating conditions: if the light is green, then I can cross, if it's raining, then I'm taking the umbrella, and if I'm late for work, then I'll call my manager.
The main tool is the if statement, which comes in different forms and colors, but basically what it does is evaluate an expression and, based on the result, choose which part of the code to execute. As usual, let's see an example:
conditional.1.py
late = True
if late:
    print('I need to call my manager!')
This is possibly the simplest example: when fed to the if statement, late acts as a conditional expression, which is evaluated in a Boolean context (exactly as if we were calling bool(late)). If the result of the evaluation is True, then we enter the body of code immediately after the if statement. Notice that the print instruction is indented: this means it belongs to the block defined by the if clause. Execution of this code yields:
$ python conditional.1.py
I need to call my manager!
Since late is True, the print call was executed. Let's expand on this example:
conditional.2.py
late = False
if late:
    print('I need to call my manager!')  #1
else:
    print('no need to call my manager...')  #2
This time I set late = False, so when I execute the code, the result is different:
$ python conditional.2.py
no need to call my manager...
Depending on the result of evaluating the late expression, we can either enter block #1 or block #2, but not both. Block #1 is executed when late evaluates to True, while block #2 is executed when late evaluates to False. Try assigning False/True values to the late name, and see how the output for this code changes accordingly.
The preceding example also introduces the else clause, which becomes very handy when we want to provide an alternative set of instructions to be executed when an expression evaluates to False within an if clause. The else clause is optional, as is evident from comparing the preceding two examples.
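To make the idea of a "Boolean context" a bit more concrete, here is a small sketch. The describe function is a hypothetical helper introduced only for this illustration; it shows that an if/else clause treats any object the same way bool() would:

```python
def describe(value):
    """Return which branch an if/else would take for the given value."""
    # `if value:` evaluates value in a Boolean context, like bool(value)
    if value:
        return 'truthy'
    else:
        return 'falsy'


for candidate in [True, False, 0, 1, '', 'late', [], [0]]:
    # the last two columns always agree with each other
    print(repr(candidate), bool(candidate), describe(candidate))
```

Empty containers, empty strings, and zero all take the else branch, while their non-empty or non-zero counterparts take the if branch.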
A specialized else: elif
Sometimes all you need is to do something if a condition is met (simple if clause). Other times you need to provide an alternative, in case the condition is False (if/else clause). But there are situations where you may have more than two paths to choose from. Since calling the manager (or not calling them) is a binary kind of example (either you call or you don't), let's change the type of example and keep expanding. This time we decide tax percentages. If my income is less than 10k, I won't pay any taxes. If it is between 10k and 30k, I'll pay 20% taxes. If it is between 30k and 100k, I'll pay 35% taxes, and over 100k, I'll (gladly) pay 45% taxes. Let's put this all down into beautiful Python code:
taxes.py
income = 15000
if income < 10000:
    tax_coefficient = 0.0  #1
elif income < 30000:
    tax_coefficient = 0.2  #2
elif income < 100000:
    tax_coefficient = 0.35  #3
else:
    tax_coefficient = 0.45  #4

print('I will pay:', income * tax_coefficient, 'in taxes')
Executing the preceding code yields:
$ python taxes.py
I will pay: 3000.0 in taxes
Let's go through the example line by line: we start by setting up the income value. In the example, my income is 15k. We enter the if clause. Notice that this time we also introduced the elif clause, which is a contraction of else-if, and it's different from a bare else clause in that it also has its own condition. So, the if expression, income < 10000, evaluates to False, therefore block #1 is not executed. The control passes to the next condition evaluator: elif income < 30000. This one evaluates to True, therefore block #2 is executed, and because of this, Python then resumes execution after the whole if/elif/elif/else clause (which we can just call the if clause from now on). There is only one instruction after the if clause, the print call, which tells us I will pay 3k in taxes this year (15k * 20%). Notice that the order is mandatory: if comes first, then (optionally) as many elif clauses as you need, and then (optionally) an else clause.
Interesting, right? No matter how many lines of code you may have within each block, when one of the conditions evaluates to True, the associated block is executed and then execution resumes after the whole clause. If none of the conditions evaluates to True (for example, income = 200000), then the body of the else clause would be executed (block #4). This example expands our understanding of the behavior of the else clause. Its block of code is executed when none of the preceding if/elif/.../elif expressions has evaluated to True.
Try to modify the value of income until you can comfortably execute all blocks at will (one per execution, of course). And then try the boundaries. This is crucial: whenever you have conditions expressed as equalities or inequalities (==, !=, <, >, <=, >=), those numbers represent boundaries. It is essential to test boundaries thoroughly. Should I allow you to drive at 18 or 17? Am I checking your age with age < 18, or age <= 18? You can't imagine how many times I had to fix subtle bugs that stemmed from using the wrong operator, so go ahead and experiment with the preceding code. Change some < to <= and set income to be one of the boundary values (10k, 30k, 100k) as well as any value in between. See how the result changes, and get a good understanding of it before proceeding.
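If you'd like a quick way to probe those boundaries, one option is to wrap the same conditions in a function and call it with values on either side of each threshold. This is only a sketch: the tax_coefficient function is my own wrapper around the logic of taxes.py, not part of the original script.

```python
def tax_coefficient(income):
    """Return the tax rate for an income, mirroring the logic of taxes.py."""
    if income < 10000:
        return 0.0
    elif income < 30000:
        return 0.2
    elif income < 100000:
        return 0.35
    else:
        return 0.45


# probe each boundary value and the value just below it
for income in (9999, 10000, 29999, 30000, 99999, 100000):
    print(income, tax_coefficient(income))
```

Because every condition uses <, each boundary value (10k, 30k, 100k) falls into the higher bracket. Swap a < for a <= and rerun the loop to see exactly one line of output change.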
Before we move to the next topic, let's see another example that shows us how to nest if clauses. Say your program encounters an error. If the alert system is the console, we print the error. If the alert system is an e-mail, we send it according to the severity of the error. If the alert system is anything other than console or e-mail, we don't know what to do, therefore we do nothing. Let's put this into code:
errorsalert.py
alert_system = 'console'  # other value can be 'email'
error_severity = 'critical'  # other values: 'medium' or 'low'
error_message = 'OMG! Something terrible happened!'

if alert_system == 'console':
    print(error_message)  #1
elif alert_system == 'email':
    if error_severity == 'critical':
        send_email('[email protected]', error_message)  #2
    elif error_severity == 'medium':
        send_email('[email protected]', error_message)  #3
    else:
        send_email('[email protected]', error_message)  #4
The preceding example is quite interesting, in its silliness. It shows us two nested if clauses (outer and inner). It also shows us the outer if clause doesn't have any else, while the inner one does. Notice how indentation is what allows us to nest one clause within another one.
If alert_system == 'console', body #1 is executed, and nothing else happens. On the other hand, if alert_system == 'email', then we enter into another if clause, which we called inner. In the inner if clause, according to error_severity, we send an e-mail to either an admin, first-level support, or second-level support (blocks #2, #3, and #4). The send_email function is not defined in this example, therefore trying to run it with alert_system set to 'email' would give you an error (a NameError). In the source code of the book, which you can download from the website, I included a trick to redirect that call to a regular print function, just so you can experiment on the console without actually sending an e-mail. Try changing the values and see how it all works.
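If you don't have the book's source code at hand, here is one possible way to experiment. Treat it as an assumption on my part rather than the trick used in the downloadable code: send_email is a hypothetical stand-in that only prints, the alert function is my own wrapper around the nested clauses, and the recipient labels are placeholders, not real addresses.

```python
def send_email(recipient, message):
    """Hypothetical stand-in: print instead of sending a real e-mail."""
    print('Sending to', recipient, '->', message)


def alert(alert_system, error_severity, error_message):
    """Same branching as errorsalert.py; returns who was notified."""
    if alert_system == 'console':
        print(error_message)
        return 'console'
    elif alert_system == 'email':
        if error_severity == 'critical':
            recipient = 'admin'  # placeholder label, not a real address
        elif error_severity == 'medium':
            recipient = 'first-level support'
        else:
            recipient = 'second-level support'
        send_email(recipient, error_message)
        return recipient
    # any other alert system: we don't know what to do, so we do nothing
    return None


alert('email', 'critical', 'OMG! Something terrible happened!')
```

Call alert with different combinations of alert system and severity to walk through every branch, including the case where nothing happens at all.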
The ternary operator
One last thing I would like to show you before moving on to the next subject is the ternary operator or, in layman's terms, the short version of an if/else clause. When the value of a name is to be assigned according to some condition, sometimes it's easier and more readable to use the ternary operator instead of a proper if clause. In the following example, the two code blocks do exactly the same thing:
ternary.py
order_total = 247  # GBP

# classic if/else form
if order_total > 100:
    discount = 25  # GBP
else:
    discount = 0  # GBP
print(order_total, discount)

# ternary operator
discount = 25 if order_total > 100 else 0
print(order_total, discount)
For simple cases like this, I find it very nice to be able to express that logic in one line instead of four. Remember, as a coder, you spend much more time reading code than writing it, so Python's conciseness is invaluable.
Are you clear on how the ternary operator works? Basically, it is name = something if condition else something-else. So name is assigned something if condition evaluates to True, and something-else if condition evaluates to False.
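Since the ternary operator is an expression, it isn't limited to assignments. The following sketch (discount_for is a hypothetical helper name I'm introducing here) uses it in a return statement and directly as an argument to print:

```python
def discount_for(order_total):
    """Return the discount in GBP using a conditional expression."""
    return 25 if order_total > 100 else 0


order_total = 247  # GBP
# the ternary operator can appear anywhere an expression is allowed
print('Discount:', discount_for(order_total))
print('discounted' if discount_for(order_total) else 'full price')
```

I'd keep this for simple cases only: once the condition or the two alternatives grow, a regular if/else clause is usually easier to read.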
Now that you know everything about controlling the path of the code, let's move on to the next subject: looping.
Looping
If you have any experience with looping in other programming languages, you will find Python's way of looping a bit different. First of all, what is looping? Looping means being able to repeat the execution of a code block more than once, according to the loop parameters we're given. There are different looping constructs, which serve different purposes, and Python has distilled all of them down to just two, which you can use to achieve everything you need. These are the for and while statements.
While it's definitely possible to do everything you need using either of them, they serve different purposes and therefore they're usually used in different contexts. We'll explore this difference thoroughly through this chapter.
The for loop
The for loop is used when looping over a sequence, like a list, tuple, or a collection of objects. Let's start with a simple example that is more like C++ style, and then let's gradually see how to achieve the same results in Python (you'll love Python's syntax).
simple.for.py
for number in [0, 1, 2, 3, 4]:
    print(number)
This simple snippet of code, when executed, prints all numbers from 0 to 4. The for loop is fed the list [0, 1, 2, 3, 4] and at each iteration, number is given a value from the sequence (which is iterated sequentially, in order), then the body of the loop is executed (the print line). number changes at every iteration, according to which value is coming next from the sequence. When the sequence is exhausted, the for loop terminates, and the execution of the code resumes normally with the code after the loop.
Iterating over a range
Sometimes we need to iterate over a range of numbers, and it would be quite unpleasant to have to do so by hardcoding the list somewhere. In such cases, the range function comes to the rescue. Let's see the equivalent of the previous snippet of code:
simple.for.py
for number in range(5):
    print(number)
The range function is used extensively in Python programs when it comes to creating sequences: you can call it by passing one value, which acts as stop (counting from 0), or you can pass two values (start and stop), or even three (start, stop, and step). Check out the following example:
>>> list(range(10))  # one value: from 0 to value (excluded)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(range(3, 8))  # two values: from start to stop (excluded)
[3, 4, 5, 6, 7]
>>> list(range(-10, 10, 4))  # three values: step is added
[-10, -6, -2, 2, 6]
For the moment, ignore that we need to wrap range(...) within a list. The range object is a little bit special, but in this case we're just interested in understanding what values it will return to us. You see that the deal is the same as with slicing: start is included, stop excluded, and optionally you can add a step parameter, which by default is 1.
Try modifying the parameters of the range() call in our simple.for.py code and see what it prints. Get comfortable with it.
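As a starting point for your experiments, here are a few more calls you could try. This is just a sketch; the variable names are arbitrary.

```python
# counting down: a negative step walks from start towards stop (excluded)
countdown = list(range(5, 0, -1))
print(countdown)

# when start equals stop there is nothing to produce,
# so a for loop over this range would never run its body
empty = list(range(3, 3))
print(empty)

# a range object knows its length and supports membership tests
evens = range(0, 10, 2)
print(len(evens), 4 in evens, 5 in evens)
```

Notice that stop is excluded in the countdown as well: the last value produced is 1, not 0.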
Iterating over a sequence
Now we have all the tools to iterate over a sequence, so let's build on that example:
simple.for.2.py
surnames = ['Rivest', 'Shamir', 'Adleman']
for position in range(len(surnames)):
    print(position, surnames[position])
The preceding code adds a little bit of complexity to the game. Execution will show this result:
$ python simple.for.2.py
0 Rivest
1 Shamir
2 Adleman
Let's use the inside-out technique to break it down, ok? We start from the innermost part of what we're trying to understand, and we expand outwards. So, len(surnames) is the length of the surnames list: 3. Therefore, range(len(surnames)) is actually transformed into range(3). This gives us the range [0, 3), which is basically a sequence (0, 1, 2). This means that the for loop will run three iterations. In the first one, position will take value 0, while in the second one, it will take value 1, and finally value 2 in the third and last iteration. What is (0, 1, 2), if not the possible indexing positions for the surnames list? At position 0 we find 'Rivest', at position 1, 'Shamir', and at position 2, 'Adleman'. If you are curious about what these three men created together, change print(position, surnames[position]) to print(surnames[position][0], end=''), add a final print() outside of the loop, and run the code again.
Now, this style of looping is actually much closer to languages like Java or C++. In Python it's quite rare to see code like this. You can just iterate over any sequence or collection, so there is no need to get the list of positions and retrieve elements out of a sequence at each iteration. It's expensive, needlessly expensive. Let's change the example into a more Pythonic form:
simple.for.3.py
surnames = ['Rivest', 'Shamir', 'Adleman']
for surname in surnames:
    print(surname)
Now that's something! It's practically English. The for loop can iterate over the surnames list, and it gives back each element in order at each iteration. Running this code will print the three surnames, one at a time. It's much easier to read, right?
What if you wanted to print the position as well though? Or what if you actually needed it for any reason? Should you go back to the range(len(...)) form? No. You can use the enumerate built-in function, like this:
simple.for.4.py
surnames = ['Rivest', 'Shamir', 'Adleman']
for position, surname in enumerate(surnames):
    print(position, surname)
This code is very interesting as well. Notice that enumerate gives back a 2-tuple (position, surname) at each iteration, but still, it's much more readable (and more efficient) than the range(len(...)) example. You can call enumerate with a start parameter, like enumerate(iterable, start), and it will start counting from start, rather than 0. Just another little thing that shows you how much thought has gone into designing Python so that it makes your life easy.
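Here is a quick sketch of the start parameter in action. The numbered name is only there so we can inspect the tuples that enumerate produces:

```python
surnames = ['Rivest', 'Shamir', 'Adleman']

# start=1 makes the counter human-friendly; the list itself is unchanged
for position, surname in enumerate(surnames, start=1):
    print(position, surname)

# wrapping enumerate in a list shows the 2-tuples it gives back
numbered = list(enumerate(surnames, start=1))
print(numbered)
```

Keep in mind that start only changes the counter: position 1 here is still surnames[0], so don't use it to index back into the list.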
Using a for loop it is possible to iterate over lists, tuples, and in general anything that in Python is called iterable. This is a very important concept, so let's talk about it a bit more.
Iterators and iterables
According to the Python documentation, an iterable is:
"An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict, file objects, and objects of any classes you define with an __iter__() or __getitem__() method. Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), ...). When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call iter() or deal with iterator objects yourself. The for statement does that automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop."
Simply put, what happens when you write for k in sequence: ... body ..., is that the for loop asks sequence for the next element, it gets something back, it calls that something k, and then executes its body. Then, once again, the for loop asks sequence for the next element, it calls it k again, and executes the body again, and so on and so forth, until the sequence is exhausted. Empty sequences will result in zero executions of the body.
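Some data structures, when iterated over, produce their elements in a predictable order, like lists, tuples, and strings, while some others, like sets, don't. Dictionaries are a special case: since Python 3.7 they are guaranteed to yield their keys in insertion order, whereas older versions made no such promise.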
Python gives us the ability to iterate over iterables, using a type of object called an iterator. According to the official documentation, an iterator is:
"An object representing a stream of data. Repeated calls to the iterator's __next__() method (or passing it to the built-in function next()) return successive items in the stream. When no more data are available a StopIteration exception is raised instead. At this point, the iterator object is exhausted and any further calls to its __next__() method just raise StopIteration again. Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted. One notable exception is code which attempts multiple iteration passes. A container object (such as a list) produces a fresh new iterator each time you pass it to the iter() function or use it in a for loop. Attempting this with an iterator will just return the same exhausted iterator object used in the previous iteration pass, making it appear like an empty container."
Don't worry if you don't fully understand all the preceding legalese, you will in due time. I put it here as a handy reference for the future.
In practice, the whole iterable/iterator mechanism is somewhat hidden behind the code. Unless you need to code your own iterable or iterator for some reason, you won't have to worry about this too much. But it's very important to understand how Python handles this key aspect of control flow because it will shape the way you will write your code.
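If you're curious, the following sketch spells out by hand roughly what a for loop does for you behind the scenes. It relies on try/except and break, which we haven't covered yet, so read it as a preview rather than as code you would normally write:

```python
surnames = ['Rivest', 'Shamir', 'Adleman']

iterator = iter(surnames)  # ask the iterable for an iterator
collected = []
while True:
    try:
        surname = next(iterator)  # ask the iterator for the next element
    except StopIteration:
        break  # the iterator is exhausted: leave the loop
    collected.append(surname)
    print(surname)

# an exhausted iterator stays exhausted...
leftover = list(iterator)
# ...while the list hands out a fresh iterator every time we ask
fresh = list(iter(surnames))
print(leftover, fresh)
```

This is the "multiple iteration passes" caveat from the documentation in action: the second pass over iterator yields nothing, while the list can be iterated over again and again.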
Iterating over multiple sequences
Let's see another example of how to iterate over two sequences of the same length, in order to work on their respective elements in pairs. Say we have a list of people and a list of numbers representing the age of the people in the first list. We want to print a pair person/age on one line for all of them. Let's start with an example and let's refine it gradually.
multiple.sequences.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
for position in range(len(people)):
    person = people[position]
    age = ages[position]
    print(person, age)
By now, this code should be pretty straightforward for you to understand. We need to iterate over the list of positions (0, 1, 2, 3) because we want to retrieve elements from two different lists. Executing it we get the following:
$ python multiple.sequences.py
Jonas 25
Julio 30
Mike 31
Mez 39
This code is both inefficient and not Pythonic. Inefficient because retrieving an element given the position can be an expensive operation, and we're doing it from scratch at each iteration. The mail man doesn't go back to the beginning of the road each time he delivers a letter, right? He moves from house to house. From one to the next one. Let's try to make it better using enumerate:
multiple.sequences.enumerate.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
for position, person in enumerate(people):
    age = ages[position]
    print(person, age)
Better, but still not perfect. And still a bit ugly. We're iterating properly on people, but we're still fetching age using positional indexing, which we want to lose as well. Well, no worries, Python gives you the zip function, remember? Let's use it!
multiple.sequences.zip.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
for person, age in zip(people, ages):
    print(person, age)
Ah! So much better! Once again, compare the preceding code with the first example and admire Python's elegance. The reason I wanted to show this example is twofold. On the one hand, I wanted to give you an idea of how much shorter the code in Python can be compared to other languages where the syntax doesn't allow you to iterate over sequences or collections as easily. And on the other hand, and much more importantly, notice that when the for loop asks zip(sequenceA, sequenceB) for the next element, it gets back a tuple, not just a single object. It gets back a tuple with as many elements as the number of sequences we feed to the zip function. Let's expand a little on the previous example in two ways, using explicit and implicit assignment:
multiple.sequences.explicit.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
nationalities = ['Belgium', 'Spain', 'England', 'Bangladesh']
for person, age, nationality in zip(people, ages, nationalities):
    print(person, age, nationality)
In the preceding code, we added the nationalities list. Now that we feed three sequences to the zip function, the for loop gets back a 3-tuple at each iteration. Notice that the position of the elements in the tuple respects the position of the sequences in the zip call. Executing the code will yield the following result:
$ python multiple.sequences.explicit.py
Jonas 25 Belgium
Julio 30 Spain
Mike 31 England
Mez 39 Bangladesh
Sometimes, for reasons that may not be clear in a simple example like the preceding one, you may want to explode the tuple within the body of the for loop. If that is your desire, it's perfectly possible to do so.
multiple.sequences.implicit.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
nationalities = ['Belgium', 'Spain', 'England', 'Bangladesh']
for data in zip(people, ages, nationalities):
    person, age, nationality = data
    print(person, age, nationality)
It's basically doing what the for loop does automatically for you, but in some cases you may want to do it yourself. Here, the 3-tuple data that comes from zip(...) is exploded within the body of the for loop into three variables: person, age, and nationality.
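All the examples above feed zip sequences of the same length. In case you're wondering what happens when they differ, here is a small sketch. The shortened ages list is made up for the occasion, and zip_longest comes from the itertools module of the standard library:

```python
from itertools import zip_longest

people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31]  # one age is missing on purpose

# zip stops as soon as the shortest sequence is exhausted
pairs = list(zip(people, ages))
print(pairs)

# zip_longest keeps going and pads the gaps with fillvalue
padded = list(zip_longest(people, ages, fillvalue=None))
print(padded)
```

With plain zip, 'Mez' silently disappears from the result, which is worth remembering whenever your sequences come from different sources.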
The while loop
In the preceding pages, we saw the for loop in action. It's incredibly useful when you need to loop over a sequence or a collection. The key point to keep in mind, when you need to be able to discriminate which looping construct to use, is that the for loop rocks when you have to iterate over a finite number of elements. It can be a huge number, but still, something that at some point ends.
There are other cases though, when you just need to loop until some condition is satisfied, or even loop indefinitely until the application is stopped. Cases where we don't really have something to iterate on, and therefore the for loop would be a poor choice. But fear not, for these cases Python provides us with the while loop.
The while loop is similar to the for loop, in that they both loop and at each iteration they execute a body of instructions. What is different between them is that the while loop doesn't loop over a sequence (it can, but you have to manually write the logic and it wouldn't make any sense, you would just want to use a for loop), rather, it loops as long as a certain condition is satisfied. When the condition is no longer satisfied, the loop ends.
As usual, let's see an example which will clarify everything for us. We want to print the binary representation of a positive number. In order to do so, we repeatedly divide the number by two, collecting the remainder, and then produce the inverse of the list of remainders. Let me give you a small example using number 6, which is 110 in binary.
6 / 2 = 3 (remainder: 0)
3 / 2 = 1 (remainder: 1)
1 / 2 = 0 (remainder: 1)
List of remainders: 0, 1, 1.
Inverse is 1, 1, 0, which is also the binary representation of 6: 110
Let's write some code to calculate the binary representation for number 39, which is 100111 in binary.
binary.py
n = 39
remainders = []
while n > 0:
    remainder = n % 2  # remainder of division by 2
    remainders.append(remainder)  # we keep track of remainders
    n //= 2  # we divide n by 2

# reassign the list to its reversed copy and print it
remainders = remainders[::-1]
print(remainders)
In the preceding code, I highlighted two things: n > 0, which is the condition to keep looping, and remainders[::-1], which is a nice and easy way to get the reversed version of a list (missing start and end parameters, step = -1, produces the same list, from end to start, in reverse order). We can make the code a little shorter (and more Pythonic), by using the divmod function, which is called with a number and a divisor, and returns a tuple with the result of the integer division and its remainder. For example, divmod(13, 5) would return (2, 3), and indeed 5 * 2 + 3 = 13.
binary.2.py
n = 39
remainders = []
while n > 0:
    n, remainder = divmod(n, 2)
    remainders.append(remainder)

# reassign the list to its reversed copy and print it
remainders = remainders[::-1]
print(remainders)
In the preceding code, we have reassigned n to the result of the division by 2, and the remainder, in one single line.
Notice that the condition in a while loop is a condition to continue looping. If it evaluates to True, then the body is executed and then another evaluation follows, and so on, until the condition evaluates to False. When that happens, the loop is exited immediately without executing its body.
Note
If the condition never evaluates to False, the loop becomes a so-called infinite loop. Infinite loops are used, for example, when polling from network devices: you ask the socket if there is any data, you do something with it if there is any, then you sleep for a small amount of time, and then you ask the socket again, over and over again, without ever stopping.
Having the ability to loop over a condition, or to loop indefinitely, is the reason why the for loop alone is not enough, and therefore Python provides the while loop.
Tip
By the way, if you need the binary representation of a number, check out the bin function.
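To compare our loop with the built-in, here is a sketch that wraps the divmod version in a function. The to_binary name is my own, and it returns a string of digits rather than a list so the two results are easy to compare:

```python
def to_binary(n):
    """Return the binary digits of a positive integer as a string."""
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)
        remainders.append(remainder)
    # reverse the remainders and join them into a string such as '100111'
    return ''.join(str(digit) for digit in remainders[::-1])


print(to_binary(39))
print(bin(39))  # the built-in adds a '0b' prefix
```

Note that, like the original script, this sketch only handles positive numbers: for n = 0 the loop never runs and the result is an empty string, whereas bin(0) gives '0b0'.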
Just for fun, let's adapt one of the examples (multiple.sequences.py) using the while logic.
multiple.sequences.while.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
position = 0
while position < len(people):
    person = people[position]
    age = ages[position]
    print(person, age)
    position += 1
In the preceding code, I have highlighted the initialization, condition, and update of the variable position, which makes it possible to simulate the equivalent for loop code by handling the iteration variable manually. Everything that can be done with a for loop can also be done with a while loop, even though you can see there's a bit of boilerplate you have to go through in order to achieve the same result. The opposite is also true, but simulating a never-ending while loop using a for loop requires some real trickery, so why would you do that? Use the right tool for the job, and 99.9% of the time you'll be fine.
So, to recap, use a for loop when you need to iterate over one iterable (or a combination of iterables), and a while loop when you need to loop according to a condition being satisfied or not. If you keep in mind the difference between the two purposes, you will never choose the wrong looping construct.
Let's now see how to alter the normal flow of a loop.
The break and continue statements
According to the task at hand, sometimes you will need to alter the regular flow of a loop. You can either skip a single iteration (as many times as you want), or you can break out of the loop entirely. A common use case for skipping iterations is, for example, when you're iterating over a list of items and you need to work on each of them only if some condition is verified. On the other hand, if you're iterating over a collection of items, and you have found one of them that satisfies some need you have, you may decide not to continue the loop entirely and therefore break out of it. There are countless possible scenarios, so it's better to see a couple of examples.
Let's say you want to apply a 20% discount to all products in a basket list that have an expiration date of today. The way you achieve this is to use the continue statement, which tells the looping construct (for or while) to immediately stop execution of the body and go to the next iteration, if any. This example will take us a little deeper down the rabbit hole, so be ready to jump.
discount.py
from datetime import date, timedelta

today = date.today()
tomorrow = today + timedelta(days=1)  # today + 1 day is tomorrow
products = [
    {'sku': '1', 'expiration_date': today, 'price': 100.0},
    {'sku': '2', 'expiration_date': tomorrow, 'price': 50},
    {'sku': '3', 'expiration_date': today, 'price': 20},
]
for product in products:
    if product['expiration_date'] != today:
        continue
    product['price'] *= 0.8  # equivalent to applying 20% discount
    print(
        'Price for sku', product['sku'],
        'is now', product['price'])
You see we start by importing the date and timedelta objects, then we set up our products. Those with sku 1 and 3 have an expiration date of today, which means we want to apply a 20% discount on them. We loop over each product and we inspect the expiration date. If it is not (inequality operator, !=) today, we don't want to execute the rest of the body suite, so we continue.
Notice that it is not important where in the body suite you place the continue statement (you can even use it more than once). When you reach it, execution of the body stops and the loop moves on to the next iteration. If we run the discount.py module, this is the output:
$ python discount.py
Price for sku 1 is now 80.0
Price for sku 3 is now 16.0
This shows you that the last two lines of the body haven't been executed for sku number 2.
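For comparison, the same result can be obtained without continue by inverting the condition and nesting the rest of the body under it. The sketch below uses a fixed, arbitrary date instead of date.today() so that it behaves the same on any day, and the discounted list is an extra I added to record which products were touched:

```python
from datetime import date, timedelta

today = date(2015, 1, 1)  # fixed, arbitrary date so the run is repeatable
tomorrow = today + timedelta(days=1)
products = [
    {'sku': '1', 'expiration_date': today, 'price': 100.0},
    {'sku': '2', 'expiration_date': tomorrow, 'price': 50},
    {'sku': '3', 'expiration_date': today, 'price': 20},
]

discounted = []  # skus that received the discount
for product in products:
    # inverted condition: the body runs only for products expiring today
    if product['expiration_date'] == today:
        product['price'] *= 0.8  # equivalent to applying 20% discount
        discounted.append(product['sku'])
        print('Price for sku', product['sku'], 'is now', product['price'])
```

Which form reads better is a judgment call. With a short body like this one the nested if is fine, while continue tends to pay off when the body is long and you'd rather avoid an extra level of indentation.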
Let's now see an example of breaking out of a loop. Say we want to tell if at least one of the elements in a list evaluates to True when fed to the bool function. Given that we need to know if there is at least one, when we find it we don't need to keep scanning the list any further. In Python code, this translates to using the break statement. Let's write this down into code:
any.py
items = [0, None, 0.0, True, 0, 7]  # True and 7 evaluate to True
found = False  # this is called "flag"
for item in items:
    print('scanning item', item)
    if item:
        found = True  # we update the flag
        break

if found:  # we inspect the flag
    print('At least one item evaluates to True')
else:
    print('All items evaluate to False')
The preceding code is such a common pattern in programming, you will see it a lot. When you inspect items this way, basically what you do is to set up a flag variable, then start the inspection. If you find one element that matches your criteria (in this example, that evaluates to True), then you update the flag and stop iterating. After iteration, you inspect the flag and take action accordingly. Execution yields:
$ python any.py
scanning item 0
scanning item None
scanning item 0.0
scanning item True
At least one item evaluates to True
See how execution stopped after True was found?
The break statement acts exactly like the continue one, in that it stops executing the body of the loop immediately, but it also prevents any other iteration from running, effectively breaking out of the loop.
The continue
and break
statements can be used together with no limitation in their number, both in the for
and while
looping constructs.
Tip
By the way, there is no need to write code to detect whether at least one element in a sequence evaluates to True. Just check out the any built-in function.
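As a minimal sketch of that tip, the flag-and-break loop from any.py can be collapsed into a single call to any, which also stops scanning at the first element that evaluates to True. The items list is the one from the example; reusing the name found is just for illustration.

```python
items = [0, None, 0.0, True, 0, 7]

# any() returns True as soon as it meets a truthy element,
# so it short-circuits just like the loop with break does.
found = any(items)

if found:
    print('At least one item evaluates to True')
else:
    print('All items evaluate to False')

# An empty iterable has no truthy element, so any() returns False.
print(any([]))
```

The trade-off is that any only tells you whether such an element exists. If you also need the element itself, the explicit loop is still the way to go.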
A special else clause
One of the features I've seen only in the Python language is the ability to have else clauses after while and for loops. It's very rarely used, but it's definitely nice to have. In short, you can have an else suite after a for or while loop. If the loop ends normally, because of exhaustion of the iterator (for loop) or because the condition is finally not met (while loop), then the else suite (if present) is executed. If execution is interrupted by a break statement, the else clause is not executed.
Let's take an example of a for loop that iterates over a group of items, looking for one that matches some condition. If we don't find at least one that satisfies the condition, we want to raise an exception. This means we want to arrest the regular execution of the program and signal that there was an error, or exception, that we cannot deal with. Exceptions will be the subject of Chapter 7, Testing, Profiling, and Dealing with Exceptions, so don't worry if you don't fully understand them now. Just bear in mind that they will alter the regular flow of the code.
Let me now show you two examples that do exactly the same thing, but one of them uses the special for ... else syntax. Say that we want to find, among a collection of people, one who could drive a car.
for.no.else.py
class DriverException(Exception):
    pass

people = [('James', 17), ('Kirk', 9), ('Lars', 13), ('Robert', 8)]
driver = None
for person, age in people:
    if age >= 18:
        driver = (person, age)
        break

if driver is None:
    raise DriverException('Driver not found.')
Notice the flag pattern again. We set driver to None, then if we find one we update the driver flag, and at the end of the loop we inspect it to see if one was found. I kind of have the feeling that those kids would drive a very metallic car, but anyway, notice that if a driver is not found, a DriverException is raised, signaling to the program that execution cannot continue (we're lacking the driver).
The same functionality can be rewritten a bit more elegantly using the following code:
for.else.py
class DriverException(Exception):
    pass

people = [('James', 17), ('Kirk', 9), ('Lars', 13), ('Robert', 8)]
for person, age in people:
    if age >= 18:
        driver = (person, age)
        break
else:
    raise DriverException('Driver not found.')
Notice that we aren't forced to use the flag pattern any more. The exception is raised as part of the for loop logic, which makes good sense because the for loop is checking on some condition. All we need is to set up a driver object in case we find one, because the rest of the code is going to use that information somewhere. Notice that the code is shorter and more elegant, because the logic is now correctly grouped together where it belongs.
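The else clause works the same way on a while loop. Here is a small sketch, not taken from the examples above: the number n and the variable names are made up for illustration, and n is assumed to be at least 2. The else suite runs only when the condition becomes False without a break having been hit.

```python
n = 13  # hypothetical number to test; assumed to be >= 2
divisor = 2

while divisor * divisor <= n:
    if n % divisor == 0:
        result = '{} is divisible by {}'.format(n, divisor)
        break
    divisor += 1
else:
    # Reached only when the condition became False without a break.
    result = '{} is prime'.format(n)

print(result)
```

Try changing n to 15: the break fires at divisor 3, the else suite is skipped, and result reports the divisor instead.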
The for loop
The for loop is used when looping over a sequence, like a list, a tuple, or a collection of objects. Let's start with a simple example that is more like C++ style, and then gradually see how to achieve the same results in Python (you'll love Python's syntax).
simple.for.py
for number in [0, 1, 2, 3, 4]:
    print(number)
This simple snippet of code, when executed, prints all numbers from 0 to 4. The for loop is fed the list [0, 1, 2, 3, 4] and at each iteration, number is given a value from the sequence (which is iterated sequentially, in order), then the body of the loop is executed (the print line). number changes at every iteration, according to which value is coming next from the sequence. When the sequence is exhausted, the for loop terminates, and the execution of the code resumes normally with the code after the loop.
Iterating over a range
Sometimes we need to iterate over a range of numbers, and it would be quite unpleasant to have to do so by hardcoding the list somewhere. In such cases, the range function comes to the rescue. Let's see the equivalent of the previous snippet of code:
simple.for.py
for number in range(5):
    print(number)
The range function is used extensively in Python programs when it comes to creating sequences: you can call it by passing one value, which acts as stop (counting from 0), or you can pass two values (start and stop), or even three (start, stop, and step). Check out the following example:
>>> list(range(10))  # one value: from 0 to value (excluded)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(range(3, 8))  # two values: from start to stop (excluded)
[3, 4, 5, 6, 7]
>>> list(range(-10, 10, 4))  # three values: step is added
[-10, -6, -2, 2, 6]
For the moment, ignore that we need to wrap range(...) within a list. The range object is a little bit special, but in this case we're just interested in understanding which values it will return to us. You see that the deal is the same as with slicing: start is included, stop is excluded, and optionally you can add a step parameter, which by default is 1.
Try modifying the parameters of the range() call in our simple.for.py code and see what it prints. Get comfortable with it.
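If you want a starting point for those experiments, here is a short sketch. The variable names are mine, not part of simple.for.py. It shows a negative step, and hints at why the range object is "a little bit special": it can report its length, be indexed, and answer membership tests without first being turned into a list.

```python
# Counting down: a negative step walks from start towards stop (excluded).
countdown = list(range(10, 0, -3))
print(countdown)  # [10, 7, 4, 1]

# A range knows its length and supports indexing and membership tests
# without building a list first.
evens = range(0, 20, 2)
print(len(evens))   # 10
print(evens[3])     # 6
print(14 in evens)  # True
```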
Iterating over a sequence
Now we have all the tools to iterate over a sequence, so let's build on that example:
simple.for.2.py
surnames = ['Rivest', 'Shamir', 'Adleman']
for position in range(len(surnames)):
    print(position, surnames[position])
The preceding code adds a little bit of complexity to the game. Execution will show this result:
$ python simple.for.2.py
0 Rivest
1 Shamir
2 Adleman
Let's use the inside-out technique to break it down, OK? We start from the innermost part of what we're trying to understand, and we expand outwards. So, len(surnames) is the length of the surnames list: 3. Therefore, range(len(surnames)) is actually transformed into range(3). This gives us the range [0, 3), which is basically the sequence (0, 1, 2). This means that the for loop will run three iterations. In the first one, position will take the value 0, in the second one it will take the value 1, and finally the value 2 in the third and last iteration. What is (0, 1, 2), if not the possible indexing positions for the surnames list? At position 0 we find 'Rivest', at position 1, 'Shamir', and at position 2, 'Adleman'. If you are curious about what these three men created together, change print(position, surnames[position]) to print(surnames[position][0], end=''), add a final print() outside of the loop, and run the code again.
Now, this style of looping is actually much closer to languages like Java or C++. In Python it's quite rare to see code like this. You can just iterate over any sequence or collection, so there is no need to get the list of positions and retrieve elements out of a sequence at each iteration. It's expensive, needlessly expensive. Let's change the example into a more Pythonic form:
simple.for.3.py
surnames = ['Rivest', 'Shamir', 'Adleman']
for surname in surnames:
    print(surname)
Now that's something! It's practically English. The for loop can iterate over the surnames list, and it gives back each element in order at each iteration. Running this code will print the three surnames, one at a time. It's much easier to read, right?
What if you wanted to print the position as well though? Or what if you actually needed it for any reason? Should you go back to the range(len(...)) form? No. You can use the enumerate built-in function, like this:
simple.for.4.py
surnames = ['Rivest', 'Shamir', 'Adleman']
for position, surname in enumerate(surnames):
    print(position, surname)
This code is very interesting as well. Notice that enumerate gives back a 2-tuple (position, surname) at each iteration, but still, it's much more readable (and more efficient) than the range(len(...)) example. You can call enumerate with a start parameter, like enumerate(iterable, start), and it will start counting from start, rather than 0. Just another little thing that shows you how much thought has been put into designing Python so that it makes your life easy.
Using a for loop it is possible to iterate over lists, tuples, and in general anything that in Python is called iterable. This is a very important concept, so let's talk about it a bit more.
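To make the start parameter concrete, here is a small sketch that reuses the surnames list and numbers the entries from 1 instead of 0. The numbered name is only there for illustration.

```python
surnames = ['Rivest', 'Shamir', 'Adleman']

# start=1 shifts only the counter; the elements themselves are unchanged.
for position, surname in enumerate(surnames, start=1):
    print(position, surname)

# Wrapping enumerate in a list shows the 2-tuples it produces.
numbered = list(enumerate(surnames, start=1))
print(numbered)
```

This is handy whenever you print positions for humans, who usually expect counting to begin at 1.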
Iterators and iterables
According to the Python documentation, an iterable is:
"An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict, file objects, and objects of any classes you define with an __iter__() or __getitem__() method. Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), ...). When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call iter() or deal with iterator objects yourself. The for statement does that automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop."
Simply put, what happens when you write for k in sequence: ... body ..., is that the for loop asks sequence for the next element, it gets something back, it calls that something k, and then executes its body. Then, once again, the for loop asks sequence for the next element, it calls it k again, and executes the body again, and so on and so forth, until the sequence is exhausted. Empty sequences will result in zero executions of the body.
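Some data structures, when iterated over, produce their elements in a predictable order, like lists, tuples, and strings, while others, like sets, make no such guarantee. Dictionaries preserve insertion order as of Python 3.7; in earlier versions their order was not guaranteed either.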
Python gives us the ability to iterate over iterables, using a type of object called iterator. According to the official documentation, an iterator is:
"An object representing a stream of data. Repeated calls to the iterator's __next__() method (or passing it to the built-in function next()) return successive items in the stream. When no more data are available a StopIteration exception is raised instead. At this point, the iterator object is exhausted and any further calls to its __next__() method just raise StopIteration again. Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted. One notable exception is code which attempts multiple iteration passes. A container object (such as a list) produces a fresh new iterator each time you pass it to the iter() function or use it in a for loop. Attempting this with an iterator will just return the same exhausted iterator object used in the previous iteration pass, making it appear like an empty container."
Don't worry if you don't fully understand all the preceding legalese, you will in due time. I put it here as a handy reference for the future.
In practice, the whole iterable/iterator mechanism is somewhat hidden behind the code. Unless you need to code your own iterable or iterator for some reason, you won't have to worry about this too much. But it's very important to understand how Python handles this key aspect of control flow because it will shape the way you will write your code.
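To peek behind the curtain, here is a rough sketch of what a for loop does on our behalf, written by hand with iter and next. It is a simplification for illustration, not how the interpreter is literally implemented, and it uses try/except, which we haven't covered yet. The names collected, leftover, and again are mine.

```python
surnames = ['Rivest', 'Shamir', 'Adleman']

iterator = iter(surnames)  # ask the iterable for an iterator
collected = []
while True:
    try:
        surname = next(iterator)  # ask for the next element
    except StopIteration:
        break  # the iterator is exhausted, stop looping
    collected.append(surname)
    print(surname)

# The iterator is now exhausted, so it looks like an empty container...
leftover = list(iterator)
# ...but the list happily hands out a fresh iterator.
again = list(iter(surnames))
print(leftover, again)
```

The last two lines illustrate the point made in the documentation: an exhausted iterator yields nothing more, while a container such as a list produces a new iterator each time you ask.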
Iterating over multiple sequences
Let's see another example of how to iterate over two sequences of the same length, in order to work on their respective elements in pairs. Say we have a list of people and a list of numbers representing the age of the people in the first list. We want to print a pair person/age on one line for all of them. Let's start with an example and let's refine it gradually.
multiple.sequences.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
for position in range(len(people)):
    person = people[position]
    age = ages[position]
    print(person, age)
By now, this code should be pretty straightforward for you to understand. We need to iterate over the list of positions (0, 1, 2, 3) because we want to retrieve elements from two different lists. Executing it we get the following:
$ python multiple.sequences.py
Jonas 25
Julio 30
Mike 31
Mez 39
This code is both inefficient and not Pythonic. Inefficient because retrieving an element given the position can be an expensive operation, and we're doing it from scratch at each iteration. The mail man doesn't go back to the beginning of the road each time he delivers a letter, right? He moves from house to house. From one to the next one. Let's try to make it better using enumerate:
multiple.sequences.enumerate.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
for position, person in enumerate(people):
    age = ages[position]
    print(person, age)
Better, but still not perfect. And still a bit ugly. We're iterating properly on people, but we're still fetching age using positional indexing, which we want to lose as well. Well, no worries, Python gives you the zip function, remember? Let's use it!
multiple.sequences.zip.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
for person, age in zip(people, ages):
    print(person, age)
Ah! So much better! Once again, compare the preceding code with the first example and admire Python's elegance. The reason I wanted to show this example is twofold. On the one hand, I wanted to give you an idea of how much shorter the code in Python can be compared to other languages where the syntax doesn't allow you to iterate over sequences or collections as easily. On the other hand, and much more importantly, notice that when the for loop asks zip(sequenceA, sequenceB) for the next element, it gets back a tuple, not just a single object. It gets back a tuple with as many elements as the number of sequences we feed to the zip function. Let's expand a little on the previous example in two ways, using explicit and implicit assignment:
multiple.sequences.explicit.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
nationalities = ['Belgium', 'Spain', 'England', 'Bangladesh']
for person, age, nationality in zip(people, ages, nationalities):
    print(person, age, nationality)
In the preceding code, we added the nationalities list. Now that we feed three sequences to the zip function, the for loop gets back a 3-tuple at each iteration. Notice that the position of the elements in the tuple respects the position of the sequences in the zip call. Executing the code will yield the following result:
$ python multiple.sequences.explicit.py
Jonas 25 Belgium
Julio 30 Spain
Mike 31 England
Mez 39 Bangladesh
Sometimes, for reasons that may not be clear in a simple example like the preceding one, you may want to explode the tuple within the body of the for loop. If that is your desire, it's perfectly possible to do so.
multiple.sequences.implicit.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
nationalities = ['Belgium', 'Spain', 'England', 'Bangladesh']
for data in zip(people, ages, nationalities):
    person, age, nationality = data
    print(person, age, nationality)
It's basically doing what the for loop does automatically for you, but in some cases you may want to do it yourself. Here, the 3-tuple data that comes from zip(...) is exploded within the body of the for loop into three variables: person, age, and nationality.
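All the examples in this section use sequences of the same length. As a cautionary sketch, here is what happens when they are not: zip stops as soon as the shortest input is exhausted, silently dropping the rest. The shortened ages list and the names pairs and same_length are made up for illustration.

```python
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31]  # one age is missing on purpose

# zip stops at the shortest input, so 'Mez' never shows up.
pairs = list(zip(people, ages))
print(pairs)

# A simple guard for when the lists are expected to line up.
same_length = len(people) == len(ages)
print(same_length)
```

If silently losing elements would be a bug in your program, a length check like the one above before zipping is a cheap safeguard.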
The while loop
In the preceding pages, we saw the for loop in action. It's incredibly useful when you need to loop over a sequence or a collection. The key point to keep in mind, when you need to decide which looping construct to use, is that the for loop rocks when you have to iterate over a finite number of elements. It can be a huge number, but still, something that at some point ends.
There are other cases though, when you just need to loop until some condition is satisfied, or even loop indefinitely until the application is stopped. These are cases where we don't really have something to iterate on, and therefore the for loop would be a poor choice. But fear not, for these cases Python provides us with the while loop.
The while loop is similar to the for loop, in that they both loop and at each iteration they execute a body of instructions. What is different between them is that the while loop doesn't loop over a sequence (it can, but you have to write the logic manually and it wouldn't make any sense, you would just want to use a for loop). Rather, it loops as long as a certain condition is satisfied. When the condition is no longer satisfied, the loop ends.
As usual, let's see an example which will clarify everything for us. We want to print the binary representation of a positive number. In order to do so, we repeatedly divide the number by two, collecting the remainder, and then produce the inverse of the list of remainders. Let me give you a small example using number 6, which is 110 in binary.
6 / 2 = 3 (remainder: 0)
3 / 2 = 1 (remainder: 1)
1 / 2 = 0 (remainder: 1)
List of remainders: 0, 1, 1.
Inverse is 1, 1, 0, which is also the binary representation of 6: 110
Let's write some code to calculate the binary representation of the number 39, which is 100111 in base 2.
binary.py
n = 39
remainders = []
while n > 0:
    remainder = n % 2  # remainder of division by 2
    remainders.append(remainder)  # we keep track of remainders
    n //= 2  # we divide n by 2

# reassign the list to its reversed copy and print it
remainders = remainders[::-1]
print(remainders)
In the preceding code, notice two things: n > 0, which is the condition to keep looping, and remainders[::-1], which is a nice and easy way to get the reversed version of a list (omitting the start and end parameters and using step = -1 produces the same list, from end to start, in reverse order). We can make the code a little shorter (and more Pythonic) by using the divmod function, which is called with a number and a divisor, and returns a tuple with the result of the integer division and its remainder. For example, divmod(13, 5) would return (2, 3), and indeed 5 * 2 + 3 = 13.
binary.2.py
n = 39
remainders = []
while n > 0:
    n, remainder = divmod(n, 2)
    remainders.append(remainder)

# reassign the list to its reversed copy and print it
remainders = remainders[::-1]
print(remainders)
In the preceding code, we have reassigned n to the result of the division by 2, and the remainder, in one single line.
Notice that the condition in a while loop is a condition to continue looping. If it evaluates to True, then the body is executed and then another evaluation follows, and so on, until the condition evaluates to False. When that happens, the loop is exited immediately without executing its body.
Note
If the condition never evaluates to False, the loop becomes a so-called infinite loop. Infinite loops are used, for example, when polling from network devices: you ask the socket if there is any data, you do something with it if there is any, then you sleep for a small amount of time, and then you ask the socket again, over and over again, without ever stopping.
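Here is a toy sketch of that polling idea. There is no real socket involved: a deque named incoming stands in for the network device, and a 'quit' message is an assumption of mine to give the example a way to terminate. The loop condition itself never becomes False; only break gets us out.

```python
from collections import deque

# Hypothetical stand-in for a socket: a queue of pending messages.
incoming = deque(['ping', 'data', 'quit'])
handled = []

while True:  # the condition never becomes False
    if incoming:
        message = incoming.popleft()
        if message == 'quit':  # our only way out of the infinite loop
            break
        handled.append(message)
    # a real poller would sleep here for a short while before asking again

print(handled)
```

In a real application the queue would be replaced by the actual read from the device, and the exit condition by whatever signals that the program should stop.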
Having the ability to loop over a condition, or to loop indefinitely, is the reason why the for loop alone is not enough, and therefore Python provides the while loop.
Tip
By the way, if you need the binary representation of a number, check out the bin function.
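As a quick sketch of that tip, here is bin next to the remainders approach from binary.2.py, so you can check that they agree. The names m, digits, and manual are mine. Note that bin returns a string with a 0b prefix.

```python
n = 39
print(bin(n))          # '0b100111'
print(bin(n)[2:])      # '100111', prefix sliced away
print(format(n, 'b'))  # '100111', no prefix to begin with

# Cross-check against the remainders approach.
digits = []
m = n
while m > 0:
    m, remainder = divmod(m, 2)
    digits.append(str(remainder))
manual = ''.join(reversed(digits))
print(manual == bin(n)[2:])
```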
Just for fun, let's adapt one of the examples (multiple.sequences.py) using the while logic.
multiple.sequences.while.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
position = 0
while position < len(people):
    person = people[position]
    age = ages[position]
    print(person, age)
    position += 1
In the preceding code, notice the initialization, condition, and update of the variable position, which make it possible to simulate the equivalent for loop code by handling the iteration variable manually. Everything that can be done with a for loop can also be done with a while loop, even though you can see there's a bit of boilerplate you have to go through in order to achieve the same result. The opposite is also true, but simulating a never-ending while loop using a for loop requires some real trickery, so why would you do that? Use the right tool for the job, and 99.9% of the time you'll be fine.
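For the curious, here is one possible trick, shown only as a sketch: the count function from the standard itertools module produces numbers forever, so a for loop over it never runs out of elements on its own. The break is there just so that the example terminates; the ticks name is mine.

```python
from itertools import count

ticks = []
for tick in count():  # count() yields 0, 1, 2, ... without ever ending
    if tick == 5:     # without this break the loop would run forever
        break
    ticks.append(tick)

print(ticks)
```

It works, but compare it with a plain while True: the while version says what it means, which is exactly the point about choosing the right tool.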
So, to recap, use a for loop when you need to iterate over one iterable (or a combination of iterables), and a while loop when you need to loop according to a condition being satisfied or not. If you keep in mind the difference between the two purposes, you will never choose the wrong looping construct.
Let's now see how to alter the normal flow of a loop.
The break and continue statements
According to the task at hand, sometimes you will need to alter the regular flow of a loop. You can either skip a single iteration (as many times as you want), or you can break out of the loop entirely. A common use case for skipping iterations is when you're iterating over a list of items and you need to work on each of them only if some condition is verified. On the other hand, if you're iterating over a collection of items and you have found one that satisfies some need you have, you may decide not to run the loop to completion and therefore break out of it. There are countless possible scenarios, so it's better to see a couple of examples.
Let's say you want to apply a 20% discount to all products in a basket list that have an expiration date of today. The way you achieve this is to use the continue statement, which tells the looping construct (for or while) to immediately stop execution of the body and go to the next iteration, if any. This example will take us a little deeper down the rabbit hole, so be ready to jump.
discount.py
from datetime import date, timedelta

today = date.today()
tomorrow = today + timedelta(days=1)  # today + 1 day is tomorrow
products = [
    {'sku': '1', 'expiration_date': today, 'price': 100.0},
    {'sku': '2', 'expiration_date': tomorrow, 'price': 50},
    {'sku': '3', 'expiration_date': today, 'price': 20},
]
for product in products:
    if product['expiration_date'] != today:
        continue
    product['price'] *= 0.8  # equivalent to applying 20% discount
    print(
        'Price for sku', product['sku'],
        'is now', product['price'])
You see we start by importing the date and timedelta objects, then we set up our products. Those with sku 1 and 3 have an expiration date of today, which means we want to apply a 20% discount to them. We loop over each product and inspect its expiration date. If it is not today (the != inequality operator), we don't want to execute the rest of the body suite, so we continue.
Iterating over a range
Sometimes we need to iterate over a range of numbers, and it would be quite unpleasant to have to do so by hardcoding the list somewhere. In such cases, the range
function comes to the rescue. Let's see the equivalent of the previous snippet of code:
simple.for.py
for number in range(5): print(number)
The range function is used extensively in Python programs when it comes to creating sequences: you can call it by passing one value, which acts as stop
(counting from 0), or you can pass two values (start
and stop
), or even three (start
, stop
, and step
). Check out the following example:
>>> list(range(10)) # one value: from 0 to value (excluded) [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] >>> list(range(3, 8)) # two values: from start to stop (excluded) [3, 4, 5, 6, 7] >>> list(range(-10, 10, 4)) # three values: step is added [-10, -6, -2, 2, 6]
For the moment, ignore that we need to wrap range(...)
within a list
. The range
object is a little bit special, but in this case we're just interested in understanding what are the values it will return to us. You see that the deal is the same with slicing: start
is included, stop
excluded, and optionally you can add a step
parameter, which by default is 1.
Try modifying the parameters of the range()
call in our simple.for.py
code and see what it prints, get comfortable with it.
Iterating over a sequence
Now we have all the tools to iterate over a sequence, so let's build on that example:
simple.for.2.py
surnames = ['Rivest', 'Shamir', 'Adleman'] for position in range(len(surnames)): print(position, surnames[position])
The preceding code adds a little bit of complexity to the game. Execution will show this result:
$ python simple.for.2.py 0 Rivest 1 Shamir 2 Adleman
Let's use the inside-out technique to break it down, ok? We start from the innermost part of what we're trying to understand, and we expand outwards. So, len(surnames)
is the length of the surnames
list: 3
. Therefore, range(len(surnames))
is actually transformed into range(3)
. This gives us the range [0, 3), which is basically a sequence (0, 1, 2)
. This means that the for
loop will run three iterations. In the first one, position
will take value 0
, while in the second one, it will take value 1
, and finally value 2
in the third and last iteration. What is (0, 1, 2)
, if not the possible indexing positions for the surnames
list? At position 0
we find 'Rivest'
, at position 1
, 'Shamir'
, and at position 2
, 'Adleman'
. If you are curious about what these three men created together, change print(position, surnames[position])
to print(surnames[position][0], end='')
add a final print()
outside of the loop, and run the code again.
Now, this style of looping is actually much closer to languages like Java or C++. In Python it's quite rare to see code like this. You can just iterate over any sequence or collection, so there is no need to get the list of positions and retrieve elements out of a sequence at each iteration. It's expensive, needlessly expensive. Let's change the example into a more Pythonic form:
simple.for.3.py
surnames = ['Rivest', 'Shamir', 'Adleman'] for surname in surnames: print(surname)
Now that's something! It's practically English. The for
loop can iterate over the surnames
list, and it gives back each element in order at each interaction. Running this code will print the three surnames, one at a time. It's much easier to read, right?
What if you wanted to print the position as well though? Or what if you actually needed it for any reason? Should you go back to the range(len(...))
form? No. You can use the enumerate
built-in function, like this:
simple.for.4.py
surnames = ['Rivest', 'Shamir', 'Adleman'] for position, surname in enumerate(surnames): print(position, surname)
This code is very interesting as well. Notice that enumerate gives back a 2-tuple (position, surname)
at each iteration, but still, it's much more readable (and more efficient) than the range(len(...))
example. You can call enumerate
with a start
parameter, like enumerate(iterable, start)
, and it will start from start
, rather than 0
. Just another little thing that shows you how much thought has been given in designing Python so that it makes your life easy.
Using a for
loop it is possible to iterate over lists, tuples, and in general anything that in Python is called iterable. This is a very important concept, so let's talk about it a bit more.
According to the Python documentation, an iterable is:
"An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict, file objects, and objects of any classes you define with an __iter__() or __getitem__() method. Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), ...). When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call iter() or deal with iterator objects yourself. The for statement does that automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop."
Simply put, what happens when you write for k in sequence: ... body ... is that the for loop asks sequence for the next element, it gets something back, it calls that something k, and then executes its body. Then the for loop asks sequence for the next element again, calls it k, and executes the body again, and so on and so forth, until the sequence is exhausted. Empty sequences will result in zero executions of the body.
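To make this description concrete, here is a rough sketch of what a for loop does behind the scenes, written out by hand with the built-in iter() and next() functions. It borrows the while loop, break, and try/except, which are only introduced later, so treat it as a preview rather than code you would normally write. The seen list is an illustrative addition.

```python
surnames = ['Rivest', 'Shamir', 'Adleman']

iterator = iter(surnames)      # ask the iterable for an iterator
seen = []
while True:
    try:
        k = next(iterator)     # ask for the next element and call it k
    except StopIteration:      # nothing left: the loop is over
        break
    seen.append(k)             # these two lines stand in for the loop body
    print(k)
```

The output is the same as that of for k in surnames: print(k), which is exactly the point: the for statement does all of this bookkeeping for you.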
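Some data structures, when iterated over, produce their elements in a well-defined order, like lists, tuples, and strings, while others, like sets, make no such guarantee. Dictionaries preserve insertion order from Python 3.7 onward; in earlier versions their order was not guaranteed either.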
Python gives us the ability to iterate over iterables, using a type of object called an iterator. According to the official documentation, an iterator is:
"An object representing a stream of data. Repeated calls to the iterator's __next__() method (or passing it to the built-in function next()) return successive items in the stream. When no more data are available a StopIteration exception is raised instead. At this point, the iterator object is exhausted and any further calls to its __next__() method just raise StopIteration again. Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted. One notable exception is code which attempts multiple iteration passes. A container object (such as a list) produces a fresh new iterator each time you pass it to the iter() function or use it in a for loop. Attempting this with an iterator will just return the same exhausted iterator object used in the previous iteration pass, making it appear like an empty container."
Don't worry if you don't fully understand all the preceding legalese, you will in due time. I put it here as a handy reference for the future.
In practice, the whole iterable/iterator mechanism is somewhat hidden behind the code. Unless you need to code your own iterable or iterator for some reason, you won't have to worry about this too much. But it's very important to understand how Python handles this key aspect of control flow because it will shape the way you will write your code.
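The difference between a container and an iterator described in the quote can be observed with a few lines of code. This is only a sketch; the variable names are chosen for illustration.

```python
numbers = [1, 2, 3]

# A list hands out a fresh iterator every time, so two passes both work.
first_pass = list(numbers)
second_pass = list(numbers)

# An iterator, on the other hand, can be consumed only once.
iterator = iter(numbers)
consumed = list(iterator)   # takes everything the iterator has to offer
leftover = list(iterator)   # the iterator is exhausted, so nothing is left

# Calling iter() on an iterator gives back the very same object.
same_object = iter(iterator) is iterator

print(first_pass, second_pass, consumed, leftover, same_object)
```

The second pass over the iterator yields an empty list, which is what the documentation means by "making it appear like an empty container".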
Let's see another example of how to iterate over two sequences of the same length, in order to work on their respective elements in pairs. Say we have a list of people and a list of numbers representing the age of the people in the first list. We want to print a pair person/age on one line for all of them. Let's start with an example and let's refine it gradually.
multiple.sequences.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
for position in range(len(people)):
    person = people[position]
    age = ages[position]
    print(person, age)
By now, this code should be pretty straightforward for you to understand. We need to iterate over the list of positions (0, 1, 2, 3) because we want to retrieve elements from two different lists. Executing it we get the following:
$ python multiple.sequences.py
Jonas 25
Julio 30
Mike 31
Mez 39
This code is both inefficient and not Pythonic. Inefficient because retrieving an element given the position can be an expensive operation, and we're doing it from scratch at each iteration. The mailman doesn't go back to the beginning of the road each time he delivers a letter, right? He moves from house to house, from one to the next. Let's try to make it better using enumerate:
multiple.sequences.enumerate.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
for position, person in enumerate(people):
    age = ages[position]
    print(person, age)
Better, but still not perfect. And still a bit ugly. We're iterating properly on people
, but we're still fetching age
using positional indexing, which we want to lose as well. Well, no worries, Python gives you the zip
function, remember? Let's use it!
multiple.sequences.zip.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
for person, age in zip(people, ages):
    print(person, age)
Ah! So much better! Once again, compare the preceding code with the first example and admire Python's elegance. The reason I wanted to show this example is twofold. On the one hand, I wanted to give you an idea of how much shorter the code in Python can be compared to other languages where the syntax doesn't allow you to iterate over sequences or collections as easily. And on the other hand, and much more importantly, notice that when the for loop asks zip(sequenceA, sequenceB) for the next element, it gets back a tuple, not just a single object. It gets back a tuple with as many elements as the number of sequences we feed to the zip function. Let's expand a little on the previous example in two ways, using explicit and implicit assignment:
multiple.sequences.explicit.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
nationalities = ['Belgium', 'Spain', 'England', 'Bangladesh']
for person, age, nationality in zip(people, ages, nationalities):
    print(person, age, nationality)
In the preceding code, we added the nationalities list. Now that we feed three sequences to the zip
function, the for loop gets back a 3-tuple at each iteration. Notice that the position of the elements in the tuple respects the position of the sequences in the zip
call. Executing the code will yield the following result:
$ python multiple.sequences.explicit.py
Jonas 25 Belgium
Julio 30 Spain
Mike 31 England
Mez 39 Bangladesh
Sometimes, for reasons that may not be clear in a simple example like the preceding one, you may want to explode the tuple within the body of the for
loop. If that is your desire, it's perfectly possible to do so.
multiple.sequences.implicit.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
nationalities = ['Belgium', 'Spain', 'England', 'Bangladesh']
for data in zip(people, ages, nationalities):
    person, age, nationality = data
    print(person, age, nationality)
It's basically doing what the for
loop does automatically for you, but in some cases you may want to do it yourself. Here, the 3-tuple data
that comes from zip(...)
, is exploded within the body of the for
loop into three variables: person
, age
, and nationality
.
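If you want to see the tuples that zip produces without any loop in the way, you can collect them into a list. The following is just a quick inspection sketch; pairs and first are names made up for the occasion.

```python
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]

# Materialize everything zip yields so we can look at it.
pairs = list(zip(people, ages))
print(pairs)

# Each element is a tuple with one item per sequence fed to zip.
first = pairs[0]
person, age = first          # the same unpacking the for loop performs
print(type(first), len(first), person, age)
```

Two sequences go in, so 2-tuples come out; with three sequences you would get 3-tuples, as in the nationalities example.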
In the preceding pages, we saw the for loop in action. It's incredibly useful when you need to loop over a sequence or a collection. The key point to keep in mind, when you need to decide which looping construct to use, is that the for loop rocks when you have to iterate over a finite number of elements. It can be a huge number, but still, something that at some point ends.
There are other cases though, when you just need to loop until some condition is satisfied, or even loop indefinitely until the application is stopped. Cases where we don't really have something to iterate on, and therefore the for
loop would be a poor choice. But fear not, for these cases Python provides us with the while
loop.
The while loop is similar to the for loop in that both repeat a body of instructions at each iteration. The difference is that the while loop doesn't iterate over a sequence; rather, it loops as long as a certain condition is satisfied, and ends when the condition is no longer satisfied. (You can loop over a sequence with while, but you have to write the logic manually, and in that case a for loop is the better choice.)
As usual, let's see an example which will clarify everything for us. We want to print the binary representation of a positive number. In order to do so, we repeatedly divide the number by two, collecting the remainders, and then reverse the list of remainders. Let me give you a small example using number 6, which is 110 in binary.
6 / 2 = 3 (remainder: 0)
3 / 2 = 1 (remainder: 1)
1 / 2 = 0 (remainder: 1)
List of remainders: 0, 1, 1.
Reversed, it is 1, 1, 0, which is also the binary representation of 6: 110
Let's write some code to calculate the binary representation for number 39: 100111₂.
binary.py
n = 39
remainders = []
while n > 0:
    remainder = n % 2  # remainder of division by 2
    remainders.append(remainder)  # we keep track of remainders
    n //= 2  # we divide n by 2

# reassign the list to its reversed copy and print it
remainders = remainders[::-1]
print(remainders)
In the preceding code, two things are worth noticing: n > 0, which is the condition to keep looping, and remainders[::-1], which is a nice and easy way to get the reversed version of a list (with the start and end parameters omitted and step = -1, the slice produces the same list from end to start, in reverse order). We can make the code a little shorter (and more Pythonic) by using the divmod function, which is called with a number and a divisor, and returns a tuple with the result of the integer division and its remainder. For example, divmod(13, 5) would return (2, 3), and indeed 5 * 2 + 3 = 13.
binary.2.py
n = 39
remainders = []
while n > 0:
    n, remainder = divmod(n, 2)
    remainders.append(remainder)

# reassign the list to its reversed copy and print it
remainders = remainders[::-1]
print(remainders)
In the preceding code, a single line both reassigns n to the result of the integer division by 2 and stores the remainder.
Notice that the condition in a while
loop is a condition to continue looping. If it evaluates to True
, then the body is executed and then another evaluation follows, and so on, until the condition evaluates to False
. When that happens, the loop is exited immediately without executing its body.
Note
If the condition never evaluates to False, the loop becomes a so-called infinite loop. Infinite loops are used, for example, when polling network devices: you ask the socket if there is any data, you do something with it if there is any, then you sleep for a small amount of time, and then you ask the socket again, over and over again, without ever stopping.
Having the ability to loop over a condition, or to loop indefinitely, is the reason why the for
loop alone is not enough, and therefore Python provides the while
loop.
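The polling pattern described in the note might look roughly like the sketch below. Everything in it is a stand-in: a deque plays the role of the socket, and the 'quit' message is an assumption added purely so that the demonstration terminates; a real polling loop would typically have no such exit.

```python
from collections import deque
import time

# Hypothetical stand-in for a socket: a queue of pending messages.
incoming = deque(['ping', 'data', 'quit'])
handled = []

while True:                      # the condition never becomes False
    if incoming:                 # "is there any data?"
        message = incoming.popleft()
        if message == 'quit':    # demo only: a way out so the script ends
            break
        handled.append(message)  # "do something with it"
    time.sleep(0.01)             # wait a little before asking again

print(handled)
```

Notice that there is nothing to iterate over here: the loop is driven entirely by a condition, which is why a for loop would be an awkward fit.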
Tip
By the way, if you need the binary representation of a number, check out the bin function.
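To compare the two approaches, the sketch below wraps the divmod version in a small helper and prints its result next to that of bin. The to_binary name is made up for this comparison, and functions are only covered properly in a later chapter, so read it as a preview.

```python
def to_binary(n):
    """Return the binary digits of a positive integer as a string."""
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)
        remainders.append(remainder)
    # Join the reversed remainders into a single string of digits.
    return ''.join(str(r) for r in reversed(remainders))

print(to_binary(39))   # digits only
print(bin(39))         # same digits, prefixed with '0b'
```

The built-in function returns a string that starts with the 0b prefix, which is how Python writes binary literals; the digits that follow match what our loop computes.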
Just for fun, let's adapt one of the examples (multiple.sequences.py
) using the while logic.
multiple.sequences.while.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
position = 0
while position < len(people):
    person = people[position]
    age = ages[position]
    print(person, age)
    position += 1
In the preceding code, I have highlighted the initialization, condition, and update of the variable position
, which makes it possible to simulate the equivalent for
loop code by handling the iteration variable manually. Everything that can be done with a for
loop can also be done with a while
loop, even though you can see there's a bit of boilerplate you have to go through in order to achieve the same result. The opposite is also true, but simulating a never-ending while loop using a for loop requires some real trickery, so why would you do that? Use the right tool for the job, and 99.9% of the time you'll be fine.
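For the curious, one form this trickery can take is feeding the for loop an iterator that never runs out, such as itertools.count from the standard library. This is a sketch of one possible approach, not a recommendation; the break is there only so the demonstration ends.

```python
from itertools import count

ticks = []
for tick in count():       # count() yields 0, 1, 2, ... without end
    if tick == 5:          # demo only: stop so the script terminates
        break
    ticks.append(tick)

print(ticks)
```

Without the break, this for loop would run forever, just like while True, but it is far less obvious to someone reading the code.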
So, to recap, use a for
loop when you need to iterate over one (or a combination of) iterable, and a while
loop when you need to loop according to a condition being satisfied or not. If you keep in mind the difference between the two purposes, you will never choose the wrong looping construct.
Let's now see how to alter the normal flow of a loop.
According to the task at hand, sometimes you will need to alter the regular flow of a loop. You can either skip a single iteration (as many times as you want), or you can break out of the loop entirely. A common use case for skipping iterations is when you're iterating over a list of items and you need to work on each of them only if some condition is verified. On the other hand, if you're iterating over a collection of items, and you have found one of them that satisfies some need you have, you may decide not to continue the loop at all and therefore break out of it. There are countless possible scenarios, so it's better to see a couple of examples.
Let's say you want to apply a 20% discount to all the products in a basket list that have an expiration date of today. The way you achieve this is to use the continue statement, which tells the looping construct (for or while) to immediately stop execution of the body and go to the next iteration, if any. This example will take us a little deeper down the rabbit hole, so be ready to jump.
discount.py
from datetime import date, timedelta

today = date.today()
tomorrow = today + timedelta(days=1)  # today + 1 day is tomorrow
products = [
    {'sku': '1', 'expiration_date': today, 'price': 100.0},
    {'sku': '2', 'expiration_date': tomorrow, 'price': 50},
    {'sku': '3', 'expiration_date': today, 'price': 20},
]
for product in products:
    if product['expiration_date'] != today:
        continue
    product['price'] *= 0.8  # equivalent to applying 20% discount
    print(
        'Price for sku', product['sku'],
        'is now', product['price'])
You see we start by importing the date
and timedelta
objects, then we set up our products. Those with sku 1
and 3
have an expiration date of today
, which means we want to apply 20% discount on them. We loop over each product
and we inspect the expiration date. If it is not (inequality operator, !=
) today
, we don't want to execute the rest of the body suite, so we continue
.
Notice that it is not important where in the body suite you place the continue statement (you can even use it more than once). When you reach it, execution of the body stops and the loop moves on to the next iteration. If we run the discount.py module, this is the output:
$ python discount.py
Price for sku 1 is now 80.0
Price for sku 3 is now 16.0
This shows you that the last two lines of the body haven't been executed for sku number 2.
Let's now see an example of breaking out of a loop. Say we want to tell if at least one of the elements in a list evaluates to True when fed to the bool function. Given that we need to know if there is at least one, when we find it we don't need to keep scanning the list any further. In Python code, this translates to using the break statement. Let's write this down into code:
any.py
items = [0, None, 0.0, True, 0, 7]  # True and 7 evaluate to True
found = False  # this is called "flag"
for item in items:
    print('scanning item', item)
    if item:
        found = True  # we update the flag
        break

if found:  # we inspect the flag
    print('At least one item evaluates to True')
else:
    print('All items evaluate to False')
The preceding code is such a common pattern in programming, you will see it a lot. When you inspect items this way, basically what you do is to set up a flag
variable, then start the inspection. If you find one element that matches your criteria (in this example, that evaluates to True
), then you update the flag and stop iterating. After iteration, you inspect the flag and take action accordingly. Execution yields:
$ python any.py
scanning item 0
scanning item None
scanning item 0.0
scanning item True
At least one item evaluates to True
See how execution stopped after True
was found?
The break statement acts exactly like the continue one, in that it stops executing the body of the loop immediately, but it also prevents any further iteration from running, effectively breaking out of the loop.
The continue
and break
statements can be used together with no limitation in their number, both in the for
and while
looping constructs.
Tip
By the way, there is no need to write code to detect if there is at least one element in a sequence that evaluates to True
. Just check out the any
built-in function.
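As a quick sketch of that built-in, here is the same check done with any on the items list from any.py, plus two edge cases added for illustration.

```python
items = [0, None, 0.0, True, 0, 7]

found = any(items)               # at least one element is truthy
nothing = any([0, None, 0.0])    # every element is falsy
empty = any([])                  # no elements at all

print(found, nothing, empty)
```

Like the hand-written loop with break, any stops scanning as soon as it finds an element that evaluates to True.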
One of the features I've seen only in the Python language is the ability to have else
clauses after while
and for
loops. It's very rarely used, but it's definitely nice to have. In short, you can have an else
suite after a for
or while
loop. If the loop ends normally, because of exhaustion of the iterator (for
loop) or because the condition is finally not met (while
loop), then the else
suite (if present) is executed. In case execution is interrupted by a break
statement, the else
clause is not executed. Let's take an example of a for
loop that iterates over a group of items, looking for one that would match some condition. In case we don't find at least one that satisfies the condition, we want to raise an exception. This means we want to arrest the regular execution of the program and signal that there was an error, or exception, that we cannot deal with. Exceptions will be the subject of Chapter 7, Testing, Profiling, and Dealing with Exceptions, so don't worry if you don't fully understand them now. Just bear in mind that they will alter the regular flow of the code. Let me now show you two examples that do exactly the same thing, but one of them is using the special for
... else
syntax. Say that we want to find among a collection of people one that could drive a car.
for.no.else.py
class DriverException(Exception):
    pass

people = [('James', 17), ('Kirk', 9), ('Lars', 13), ('Robert', 8)]
driver = None
for person, age in people:
    if age >= 18:
        driver = (person, age)
        break

if driver is None:
    raise DriverException('Driver not found.')
Notice the flag
pattern again. We set driver to be None
, then if we find one we update the driver
flag, and then, at the end of the loop, we inspect it to see if one was found. I kind of have the feeling that those kids would drive a very metallic car, but anyway, notice that if a driver is not found, a DriverException
is raised, signaling the program that execution cannot continue (we're lacking the driver).
The same functionality can be rewritten a bit more elegantly using the following code:
for.else.py
class DriverException(Exception):
    pass

people = [('James', 17), ('Kirk', 9), ('Lars', 13), ('Robert', 8)]
for person, age in people:
    if age >= 18:
        driver = (person, age)
        break
else:
    raise DriverException('Driver not found.')
Notice that we aren't forced to use the flag
pattern any more. The exception is raised as part of the for
loop logic, which makes good sense because the for
loop is checking on some condition. All we need is to set up a driver
object in case we find one, because the rest of the code is going to use that information somewhere. Notice the code is shorter and more elegant, because the logic is now correctly grouped together where it belongs.
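The else clause works the same way on a while loop. As a minimal sketch, assuming n is an integer greater than 1, the following code looks for a divisor of n and uses the else suite to handle the case in which the loop ends without ever hitting break. The variable names are illustrative only.

```python
n = 13
divisor = 2
while divisor * divisor <= n:
    if n % divisor == 0:
        result = 'not prime'   # a divisor was found: leave the loop
        break
    divisor += 1
else:
    # Reached only when the condition became False without a break.
    result = 'prime'

print(n, 'is', result)
```

Try changing n to 15: the loop breaks when divisor reaches 3, so the else suite is skipped and result stays 'not prime'.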
Iterating over a sequence
Now we have all the tools to iterate over a sequence, so let's build on that example:
simple.for.2.py
surnames = ['Rivest', 'Shamir', 'Adleman']
for position in range(len(surnames)):
    print(position, surnames[position])
The preceding code adds a little bit of complexity to the game. Execution will show this result:
$ python simple.for.2.py
0 Rivest
1 Shamir
2 Adleman
Let's use the inside-out technique to break it down, ok? We start from the innermost part of what we're trying to understand, and we expand outwards. So, len(surnames)
is the length of the surnames
list: 3
. Therefore, range(len(surnames))
is actually transformed into range(3)
. This gives us the range [0, 3), which is basically a sequence (0, 1, 2)
. This means that the for
loop will run three iterations. In the first one, position
will take value 0
, while in the second one, it will take value 1
, and finally value 2
in the third and last iteration. What is (0, 1, 2)
, if not the possible indexing positions for the surnames
list? At position 0
we find 'Rivest'
, at position 1
, 'Shamir'
, and at position 2
, 'Adleman'. If you are curious about what these three men created together, change print(position, surnames[position]) to print(surnames[position][0], end=''), add a final print() outside of the loop, and run the code again.
Now, this style of looping is actually much closer to languages like Java or C++. In Python it's quite rare to see code like this. You can just iterate over any sequence or collection, so there is no need to get the list of positions and retrieve elements out of a sequence at each iteration. It's expensive, needlessly expensive. Let's change the example into a more Pythonic form:
simple.for.3.py
surnames = ['Rivest', 'Shamir', 'Adleman']
for surname in surnames:
    print(surname)
Now that's something! It's practically English. The for loop can iterate over the surnames list, and it gives back each element in order at each iteration. Running this code will print the three surnames, one at a time. It's much easier to read, right?
What if you wanted to print the position as well though? Or what if you actually needed it for any reason? Should you go back to the range(len(...))
form? No. You can use the enumerate
built-in function, like this:
simple.for.4.py
surnames = ['Rivest', 'Shamir', 'Adleman']
for position, surname in enumerate(surnames):
    print(position, surname)
This code is very interesting as well. Notice that enumerate gives back a 2-tuple (position, surname)
at each iteration, but still, it's much more readable (and more efficient) than the range(len(...))
example. You can call enumerate
with a start
parameter, like enumerate(iterable, start)
, and it will start from start
, rather than 0
. Just another little thing that shows you how much thought has been given in designing Python so that it makes your life easy.
Using a for
loop it is possible to iterate over lists, tuples, and in general anything that in Python is called iterable. This is a very important concept, so let's talk about it a bit more.
According to the Python documentation, an iterable is:
"An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict, file objects, and objects of any classes you define with an __iter__() or __getitem__() method. Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), ...). When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call iter() or deal with iterator objects yourself. The for statement does that automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop."
Simply put, what happens when you write for k in sequence: ... body ...
, is that the for
loop asks sequence
for the next element, it gets something back, it calls that something k
, and then executes its body. Then, once again, the for
loop asks sequence
again for the next element, it calls it k
again, and executes the body again, and so on and so forth, until the sequence is exhausted. Empty sequences will result in zero executions of the body.
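Some data structures, when iterated over, produce their elements in a well-defined order, like lists, tuples, and strings, while others, like sets, make no such guarantee. Dictionaries preserve insertion order from Python 3.7 onward; in earlier versions their order was not guaranteed either.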
Python gives us the ability to iterate over iterables, using a type of object called an iterator. According to the official documentation, an iterator is:
"An object representing a stream of data. Repeated calls to the iterator's __next__() method (or passing it to the built-in function next()) return successive items in the stream. When no more data are available a StopIteration exception is raised instead. At this point, the iterator object is exhausted and any further calls to its __next__() method just raise StopIteration again. Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted. One notable exception is code which attempts multiple iteration passes. A container object (such as a list) produces a fresh new iterator each time you pass it to the iter() function or use it in a for loop. Attempting this with an iterator will just return the same exhausted iterator object used in the previous iteration pass, making it appear like an empty container."
Don't worry if you don't fully understand all the preceding legalese, you will in due time. I put it here as a handy reference for the future.
In practice, the whole iterable/iterator mechanism is somewhat hidden behind the code. Unless you need to code your own iterable or iterator for some reason, you won't have to worry about this too much. But it's very important to understand how Python handles this key aspect of control flow because it will shape the way you will write your code.
Let's see another example of how to iterate over two sequences of the same length, in order to work on their respective elements in pairs. Say we have a list of people and a list of numbers representing the age of the people in the first list. We want to print a pair person/age on one line for all of them. Let's start with an example and let's refine it gradually.
multiple.sequences.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
for position in range(len(people)):
    person = people[position]
    age = ages[position]
    print(person, age)
By now, this code should be pretty straightforward for you to understand. We need to iterate over the list of positions (0, 1, 2, 3) because we want to retrieve elements from two different lists. Executing it we get the following:
$ python multiple.sequences.py
Jonas 25
Julio 30
Mike 31
Mez 39
This code is both inefficient and not Pythonic. Inefficient because retrieving an element given the position can be an expensive operation, and we're doing it from scratch at each iteration. The mailman doesn't go back to the beginning of the road each time he delivers a letter, right? He moves from house to house, from one to the next. Let's try to make it better using enumerate:
multiple.sequences.enumerate.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
for position, person in enumerate(people):
    age = ages[position]
    print(person, age)
Better, but still not perfect. And still a bit ugly. We're iterating properly on people
, but we're still fetching age
using positional indexing, which we want to lose as well. Well, no worries, Python gives you the zip
function, remember? Let's use it!
multiple.sequences.zip.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
for person, age in zip(people, ages):
    print(person, age)
Ah! So much better! Once again, compare the preceding code with the first example and admire Python's elegance. The reason I wanted to show this example is twofold. On the one hand, I wanted to give you an idea of how much shorter the code in Python can be compared to other languages where the syntax doesn't allow you to iterate over sequences or collections as easily. And on the other hand, and much more importantly, notice that when the for loop asks zip(sequenceA, sequenceB) for the next element, it gets back a tuple, not just a single object. It gets back a tuple with as many elements as the number of sequences we feed to the zip function. Let's expand a little on the previous example in two ways, using explicit and implicit assignment:
multiple.sequences.explicit.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
nationalities = ['Belgium', 'Spain', 'England', 'Bangladesh']
for person, age, nationality in zip(people, ages, nationalities):
    print(person, age, nationality)
In the preceding code, we added the nationalities list. Now that we feed three sequences to the zip
function, the for loop gets back a 3-tuple at each iteration. Notice that the position of the elements in the tuple respects the position of the sequences in the zip
call. Executing the code will yield the following result:
$ python multiple.sequences.explicit.py
Jonas 25 Belgium
Julio 30 Spain
Mike 31 England
Mez 39 Bangladesh
Sometimes, for reasons that may not be clear in a simple example like the preceding one, you may want to explode the tuple within the body of the for
loop. If that is your desire, it's perfectly possible to do so.
multiple.sequences.implicit.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
nationalities = ['Belgium', 'Spain', 'England', 'Bangladesh']
for data in zip(people, ages, nationalities):
    person, age, nationality = data
    print(person, age, nationality)
It's basically doing what the for
loop does automatically for you, but in some cases you may want to do it yourself. Here, the 3-tuple data
that comes from zip(...)
, is exploded within the body of the for
loop into three variables: person
, age
, and nationality
.
In the preceding pages, we saw the for loop in action. It's incredibly useful when you need to loop over a sequence or a collection. The key point to keep in mind, when you need to decide which looping construct to use, is that the for loop rocks when you have to iterate over a finite number of elements. It can be a huge number, but still, something that at some point ends.
There are other cases though, when you just need to loop until some condition is satisfied, or even loop indefinitely until the application is stopped. Cases where we don't really have something to iterate on, and therefore the for
loop would be a poor choice. But fear not, for these cases Python provides us with the while
loop.
The while
loop is similar to the for
loop, in that they both loop and at each iteration they execute a body of instructions. What is different between them is that the while
loop doesn't loop over a sequence (it can, but you have to manually write the logic and it wouldn't make any sense, you would just want to use a for
loop), rather, it loops as long as a certain condition is satisfied. When the condition is no longer satisfied, the loop ends.
As usual, let's see an example which will clarify everything for us. We want to print the binary representation of a positive number. In order to do so, we repeatedly divide the number by two, collecting the remainders, and then reverse the list of remainders. Let me give you a small example using number 6, which is 110 in binary.
6 / 2 = 3 (remainder: 0)
3 / 2 = 1 (remainder: 1)
1 / 2 = 0 (remainder: 1)
List of remainders: 0, 1, 1.
Reversed, it is 1, 1, 0, which is also the binary representation of 6: 110
Let's write some code to calculate the binary representation for number 39: 100111₂.
binary.py
n = 39
remainders = []
while n > 0:
    remainder = n % 2  # remainder of division by 2
    remainders.append(remainder)  # we keep track of remainders
    n //= 2  # we divide n by 2

# reassign the list to its reversed copy and print it
remainders = remainders[::-1]
print(remainders)
In the preceding code, I highlighted two things: n > 0
, which is the condition to keep looping, and remainders[::-1]
which is a nice and easy way to get the reversed version of a list (missing start
and end
parameters, step = -1
, produces the same list, from end
to start
, in reverse order). We can make the code a little shorter (and more Pythonic), by using the divmod
function, which is called with a number and a divisor, and returns a tuple with the result of the integer division and its remainder. For example, divmod(13, 5)
would return (2, 3)
, and indeed 5 * 2 + 3 = 13.
binary.2.py
n = 39
remainders = []
while n > 0:
n, remainder = divmod(n, 2)
remainders.append(remainder)
# reassign the list to its reversed copy and print it
remainders = remainders[::-1]
print(remainders)
In the preceding code, we have reassigned n to the result of the division by 2, and the remainder, in one single line.
Notice that the condition in a while
loop is a condition to continue looping. If it evaluates to True
, then the body is executed and then another evaluation follows, and so on, until the condition evaluates to False
. When that happens, the loop is exited immediately without executing its body.
Note
If the condition never evaluates to False
, the loop becomes a so called infinite loop. Infinite loops are used for example when polling from network devices: you ask the socket if there is any data, you do something with it if there is any, then you sleep for a small amount of time, and then you ask the socket again, over and over again, without ever stopping.
Having the ability to loop over a condition, or to loop indefinitely, is the reason why the for
loop alone is not enough, and therefore Python provides the while
loop.
Tip
By the way, if you need the binary representation of a number, checkout the bin
function.
Just for fun, let's adapt one of the examples (multiple.sequences.py
) using the while logic.
multiple.sequences.while.py
people = ['Jonas', 'Julio', 'Mike', 'Mez'] ages = [25, 30, 31, 39] position = 0 while position < len(people): person = people[position] age = ages[position] print(person, age) position += 1
In the preceding code, I have highlighted the initialization, condition, and update of the variable position
, which makes it possible to simulate the equivalent for
loop code by handling the iteration variable manually. Everything that can be done with a for
loop can also be done with a while
loop, even though you can see there's a bit of boilerplate you have to go through in order to achieve the same result. The opposite is also true, but simulating a never ending while
loop using a for
loop requires some real trickery, so why would you do that? Use the right tool for the job, and 99.9% of the times you'll be fine.
So, to recap, use a for
loop when you need to iterate over one (or a combination of) iterable, and a while
loop when you need to loop according to a condition being satisfied or not. If you keep in mind the difference between the two purposes, you will never choose the wrong looping construct.
Let's now see how to alter the normal flow of a loop.
According to the task at hand, sometimes you will need to alter the regular flow of a loop. You can either skip a single iteration (as many times you want), or you can break out of the loop entirely. A common use case for skipping iterations is for example when you're iterating over a list of items and you need to work on each of them only if some condition is verified. On the other hand, if you're iterating over a collection of items, and you have found one of them that satisfies some need you have, you may decide not to continue the loop entirely and therefore break out of it. There are countless possible scenarios, so it's better to see a couple of examples.
Let's say you want to apply a 20% discount to all products in a basket list for those which have an expiration date of today. The way you achieve this is to use the continue statement, which tells the looping construct (for
or while
) to immediately stop execution of the body and go to the next iteration, if any. This example will take us a little deeper down the rabbit whole, so be ready to jump.
discount.py
from datetime import date, timedelta today = date.today() tomorrow = today + timedelta(days=1) # today + 1 day is tomorrow products = [ {'sku': '1', 'expiration_date': today, 'price': 100.0}, {'sku': '2', 'expiration_date': tomorrow, 'price': 50}, {'sku': '3', 'expiration_date': today, 'price': 20}, ] for product in products: if product['expiration_date'] != today: continue product['price'] *= 0.8 # equivalent to applying 20% discount print( 'Price for sku', product['sku'], 'is now', product['price'])
You see we start by importing the date
and timedelta
objects, then we set up our products. Those with sku 1
and 3
have an expiration date of today
, which means we want to apply 20% discount on them. We loop over each product
and we inspect the expiration date. If it is not (inequality operator, !=
) today
, we don't want to execute the rest of the body suite, so we continue
.
Notice that is not important where in the body suite you place the continue
statement (you can even use it more than once). When you reach it, execution stops and goes back to the next iteration. If we run the discount.py
module, this is the output:
$ python discount.py Price for sku 1 is now 80.0 Price for sku 3 is now 16.0
Which shows you that the last two lines of the body haven't been executed for sku number 2.
Let's now see an example of breaking out of a loop. Say we want to tell if at least any of the elements in a list evaluates to True
when fed to the bool
function. Given that we need to know if there is at least one, when we find it we don't need to keep scanning the list any further. In Python code, this translates to using the break statement. Let's write this down into code:
any.py
items = [0, None, 0.0, True, 0, 7] # True and 7 evaluate to True found = False # this is called "flag" for item in items: print('scanning item', item) if item: found = True # we update the flag break if found: # we inspect the flag print('At least one item evaluates to True') else: print('All items evaluate to False')
The preceding code is such a common pattern in programming, you will see it a lot. When you inspect items this way, basically what you do is to set up a flag
variable, then start the inspection. If you find one element that matches your criteria (in this example, that evaluates to True
), then you update the flag and stop iterating. After iteration, you inspect the flag and take action accordingly. Execution yields:
$ python any.py scanning item 0 scanning item None scanning item 0.0 scanning item True At least one item evaluates to True
See how execution stopped after True
was found?
The break
statement acts exactly like the continue
one, in that it stops executing the body of the loop immediately, but also, prevents any other iteration to run, effectively breaking out of the loop.
The continue
and break
statements can be used together with no limitation in their number, both in the for
and while
looping constructs.
Tip
By the way, there is no need to write code to detect if there is at least one element in a sequence that evaluates to True
. Just check out the any
built-in function.
One of the features I've seen only in the Python language is the ability to have else
clauses after while
and for
loops. It's very rarely used, but it's definitely nice to have. In short, you can have an else
suite after a for
or while
loop. If the loop ends normally, because of exhaustion of the iterator (for
loop) or because the condition is finally not met (while
loop), then the else
suite (if present) is executed. In case execution is interrupted by a break
statement, the else
clause is not executed. Let's take an example of a for
loop that iterates over a group of items, looking for one that would match some condition. In case we don't find at least one that satisfies the condition, we want to raise an exception. This means we want to arrest the regular execution of the program and signal that there was an error, or exception, that we cannot deal with. Exceptions will be the subject of Chapter 7, Testing, Profiling, and Dealing with Exceptions, so don't worry if you don't fully understand them now. Just bear in mind that they will alter the regular flow of the code. Let me now show you two examples that do exactly the same thing, but one of them is using the special for
... else
syntax. Say that we want to find among a collection of people one that could drive a car.
for.no.else.py
class DriverException(Exception): pass people = [('James', 17), ('Kirk', 9), ('Lars', 13), ('Robert', 8)] driver = None for person, age in people: if age >= 18: driver = (person, age) break if driver is None: raise DriverException('Driver not found.')
Notice the flag
pattern again. We set driver to be None
, then if we find one we update the driver
flag, and then, at the end of the loop, we inspect it to see if one was found. I kind of have the feeling that those kids would drive a very metallic car, but anyway, notice that if a driver is not found, a DriverException
is raised, signaling the program that execution cannot continue (we're lacking the driver).
The same functionality can be rewritten a bit more elegantly using the following code:
for.else.py
class DriverException(Exception): pass people = [('James', 17), ('Kirk', 9), ('Lars', 13), ('Robert', 8)] for person, age in people: if age >= 18: driver = (person, age) break else: raise DriverException('Driver not found.')
Notice that we aren't forced to use the flag
pattern any more. The exception is raised as part of the for
loop logic, which makes good sense because the for
loop is checking on some condition. All we need is to set up a driver
object in case we find one, because the rest of the code is going to use that information somewhere. Notice the code is shorter and more elegant, because the logic is now correctly grouped together where it belongs.
Iterators and iterables
According to the Python documentation, an iterable is:
"An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict, file objects, and objects of any classes you define with an __iter__() or __getitem__() method. Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), ...). When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call iter() or deal with iterator objects yourself. The for statement does that automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop."
Simply put, what happens when you write for k in sequence: ... body ... is that the for loop asks sequence for the next element, it gets something back, it calls that something k, and then executes its body. Then, once again, the for loop asks sequence for the next element, it calls it k again, and executes the body again, and so on and so forth, until the sequence is exhausted. Empty sequences will result in zero executions of the body.
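Some data structures, when iterated over, produce their elements in a predictable order, like lists, tuples, and strings, while others, like sets, don't. Dictionaries used to belong to the second group, but since Python 3.7 they are guaranteed to produce their keys in insertion order.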
Python gives us the ability to iterate over iterables, using a type of object called an iterator. According to the official documentation, an iterator is:
"An object representing a stream of data. Repeated calls to the iterator's __next__() method (or passing it to the built-in function next()) return successive items in the stream. When no more data are available a StopIteration exception is raised instead. At this point, the iterator object is exhausted and any further calls to its __next__() method just raise StopIteration again. Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted. One notable exception is code which attempts multiple iteration passes. A container object (such as a list) produces a fresh new iterator each time you pass it to the iter() function or use it in a for loop. Attempting this with an iterator will just return the same exhausted iterator object used in the previous iteration pass, making it appear like an empty container."
Don't worry if you don't fully understand all the preceding legalese, you will in due time. I put it here as a handy reference for the future.
In practice, the whole iterable/iterator mechanism is somewhat hidden behind the code. Unless you need to code your own iterable or iterator for some reason, you won't have to worry about this too much. But it's very important to understand how Python handles this key aspect of control flow because it will shape the way you will write your code.
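To make the two definitions a bit more tangible, here is a minimal sketch of what a for loop does behind the scenes, written out by hand with iter() and next(). The variable names (collected, second_pass, fresh_pass) are mine and purely illustrative, and the try/except construct is only explained in Chapter 7, so take it on trust for now.

```python
people = ['Jonas', 'Julio', 'Mike']

# Ask the iterable for an iterator, as a for loop does behind the scenes.
iterator = iter(people)

collected = []
while True:
    try:
        item = next(iterator)  # request the next element
    except StopIteration:
        break  # the iterator is exhausted, so we stop looping
    collected.append(item)

# An exhausted iterator stays exhausted: a second pass yields nothing.
second_pass = list(iterator)

# The list itself hands out a fresh iterator every time it is iterated.
fresh_pass = list(people)

print(collected)
print(second_pass)
print(fresh_pass)
```

You would never write a loop like this in real code, since the for statement does all of it for you, but it shows why an iterator is only good for a single pass while a container such as a list can be iterated over and over.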
Iterating over multiple sequences
Let's see another example of how to iterate over two sequences of the same length, in order to work on their respective elements in pairs. Say we have a list of people and a list of numbers representing the age of the people in the first list. We want to print a pair person/age on one line for all of them. Let's start with an example and let's refine it gradually.
multiple.sequences.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
for position in range(len(people)):
    person = people[position]
    age = ages[position]
    print(person, age)
By now, this code should be pretty straightforward for you to understand. We need to iterate over the list of positions (0, 1, 2, 3) because we want to retrieve elements from two different lists. Executing it we get the following:
$ python multiple.sequences.py
Jonas 25
Julio 30
Mike 31
Mez 39
This code is both inefficient and not Pythonic. Inefficient because retrieving an element by its position can be an expensive operation, and we're doing it from scratch at each iteration (for a Python list a single lookup is cheap, but that is not true of every data structure). The mailman doesn't go back to the beginning of the road each time he delivers a letter, right? He moves from house to house, from one to the next. Let's try to make it better using enumerate:
multiple.sequences.enumerate.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
for position, person in enumerate(people):
    age = ages[position]
    print(person, age)
Better, but still not perfect. And still a bit ugly. We're iterating properly on people, but we're still fetching age using positional indexing, which we want to lose as well. Well, no worries, Python gives you the zip function, remember? Let's use it!
multiple.sequences.zip.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
for person, age in zip(people, ages):
    print(person, age)
Ah! So much better! Once again, compare the preceding code with the first example and admire Python's elegance. The reason I wanted to show this example is twofold. On the one hand, I wanted to give you an idea of how much shorter the code in Python can be compared to other languages where the syntax doesn't allow you to iterate over sequences or collections as easily. And on the other hand, and much more importantly, notice that when the for loop asks zip(sequenceA, sequenceB) for the next element, it gets back a tuple, not just a single object. It gets back a tuple with as many elements as the number of sequences we feed to the zip function. Let's expand a little on the previous example in two ways, using explicit and implicit assignment:
multiple.sequences.explicit.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
nationalities = ['Belgium', 'Spain', 'England', 'Bangladesh']
for person, age, nationality in zip(people, ages, nationalities):
    print(person, age, nationality)
In the preceding code, we added the nationalities list. Now that we feed three sequences to the zip function, the for loop gets back a 3-tuple at each iteration. Notice that the position of the elements in the tuple respects the position of the sequences in the zip call. Executing the code will yield the following result:
$ python multiple.sequences.explicit.py
Jonas 25 Belgium
Julio 30 Spain
Mike 31 England
Mez 39 Bangladesh
Sometimes, for reasons that may not be clear in a simple example like the preceding one, you may want to explode the tuple within the body of the for loop. If that is your desire, it's perfectly possible to do so.
multiple.sequences.implicit.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
nationalities = ['Belgium', 'Spain', 'England', 'Bangladesh']
for data in zip(people, ages, nationalities):
    person, age, nationality = data
    print(person, age, nationality)
It's basically doing what the for loop does automatically for you, but in some cases you may want to do it yourself. Here, the 3-tuple data that comes from zip(...) is exploded within the body of the for loop into three variables: person, age, and nationality.
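If you want to see those tuples with your own eyes, a quick way is to wrap the zip call in list() and print the result. This is just a small sketch to inspect what the for loop receives. The names pairs and first are mine.

```python
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]

# zip is lazy; wrapping it in list() lets us look at what it yields.
pairs = list(zip(people, ages))
print(pairs)

# Each element is a tuple with one item per sequence passed to zip.
first = pairs[0]
print(type(first).__name__, len(first))
```

Each item in pairs is exactly what gets unpacked into person and age at every iteration of the loop.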
The while loop
In the preceding pages, we saw the for loop in action. It's incredibly useful when you need to loop over a sequence or a collection. The key point to keep in mind, when you need to decide which looping construct to use, is that the for loop rocks when you have to iterate over a finite number of elements. It can be a huge number, but still, something that at some point ends.
There are other cases though, when you just need to loop until some condition is satisfied, or even loop indefinitely until the application is stopped. These are cases where we don't really have something to iterate on, and therefore the for loop would be a poor choice. But fear not, for these cases Python provides us with the while loop.
The while loop is similar to the for loop, in that they both loop and at each iteration they execute a body of instructions. What is different between them is that the while loop doesn't loop over a sequence. It can, but you have to write the logic manually and it wouldn't make any sense: you would just want to use a for loop. Rather, it loops as long as a certain condition is satisfied. When the condition is no longer satisfied, the loop ends.
As usual, let's see an example which will clarify everything for us. We want to print the binary representation of a positive number. In order to do so, we repeatedly divide the number by two, collecting the remainder, and then produce the reverse of the list of remainders. Let me give you a small example using number 6, which is 110 in binary.
6 / 2 = 3 (remainder: 0)
3 / 2 = 1 (remainder: 1)
1 / 2 = 0 (remainder: 1)
List of remainders: 0, 1, 1.
Reversed, it is 1, 1, 0, which is also the binary representation of 6: 110
Let's write some code to calculate the binary representation for number 39, which is 100111 in base 2.
binary.py
n = 39
remainders = []
while n > 0:
    remainder = n % 2  # remainder of division by 2
    remainders.append(remainder)  # we keep track of remainders
    n //= 2  # we divide n by 2

# reassign the list to its reversed copy and print it
remainders = remainders[::-1]
print(remainders)
In the preceding code, I highlighted two things: n > 0, which is the condition to keep looping, and remainders[::-1], which is a nice and easy way to get the reversed version of a list (missing start and end parameters, step = -1, produces the same list, from end to start, in reverse order). We can make the code a little shorter (and more Pythonic) by using the divmod function, which is called with a number and a divisor, and returns a tuple with the result of the integer division and its remainder. For example, divmod(13, 5) would return (2, 3), and indeed 5 * 2 + 3 = 13.
binary.2.py
n = 39
remainders = []
while n > 0:
    n, remainder = divmod(n, 2)
    remainders.append(remainder)

# reassign the list to its reversed copy and print it
remainders = remainders[::-1]
print(remainders)
In the preceding code, we have reassigned n to the result of the division by 2 and captured the remainder, both in one single line.
Notice that the condition in a while loop is a condition to continue looping. If it evaluates to True, then the body is executed and then another evaluation follows, and so on, until the condition evaluates to False. When that happens, the loop is exited immediately without executing its body.
Note
If the condition never evaluates to False, the loop becomes a so-called infinite loop. Infinite loops are used, for example, when polling from network devices: you ask the socket if there is any data, you do something with it if there is any, then you sleep for a small amount of time, and then you ask the socket again, over and over, without ever stopping.
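The polling pattern can be sketched roughly as follows. Everything here is a stand-in I made up for illustration: the incoming list plays the role of the network device, poll() is a hypothetical helper, and 'STOP' is an invented shutdown signal that lets the example terminate. A real polling loop would talk to an actual socket. The break statement used to leave the loop is explained in the next section.

```python
import time

# Hypothetical stand-in for a network device: a queue of pending messages.
# None means "no data right now"; 'STOP' is a made-up shutdown signal.
incoming = ['ping', None, 'pong', None, 'STOP']

def poll():
    """Return the next pending message, or None when nothing is waiting."""
    return incoming.pop(0) if incoming else None

received = []
while True:  # the condition never becomes False
    data = poll()
    if data == 'STOP':
        break  # the only way out of this loop
    if data is not None:
        received.append(data)  # do something with the data
    time.sleep(0.01)  # wait a little before asking again

print(received)
```

The while True: line is the idiomatic way to spell an infinite loop in Python: the condition is trivially always true, so the only exits are a break, an exception, or the program being stopped.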
Having the ability to loop over a condition, or to loop indefinitely, is the reason why the for loop alone is not enough, and therefore Python provides the while loop.
Tip
By the way, if you need the binary representation of a number, check out the bin function.
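As a quick sanity check, here is a sketch that wraps the divmod version of our loop in a function and compares it with the built-in bin. The function name to_binary is my own invention, and it assumes a positive integer (for 0 it would return an empty string).

```python
def to_binary(n):
    """Return the binary digits of a positive integer n as a string."""
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)
        remainders.append(remainder)
    # reverse the remainders and join them into a string of digits
    return ''.join(str(digit) for digit in remainders[::-1])

print(to_binary(39))
print(bin(39))
```

The two agree on the digits. The only difference is that bin prefixes its result with '0b' to mark it as a binary literal.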
Just for fun, let's adapt one of the examples (multiple.sequences.py) using the while logic.
multiple.sequences.while.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
position = 0
while position < len(people):
    person = people[position]
    age = ages[position]
    print(person, age)
    position += 1
In the preceding code, I have highlighted the initialization, condition, and update of the variable position, which makes it possible to simulate the equivalent for loop code by handling the iteration variable manually. Everything that can be done with a for loop can also be done with a while loop, even though you can see there's a bit of boilerplate you have to go through in order to achieve the same result. The opposite is also true, but simulating a never-ending while loop using a for loop requires some real trickery, so why would you do that? Use the right tool for the job, and 99.9% of the time you'll be fine.
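In case you are curious about what that trickery might look like, here is one possible sketch. It feeds the for loop an iterable that never ends, itertools.count from the standard library, and checks the while condition by hand. This is only meant to show that it can be done, not that you should do it.

```python
from itertools import count

n = 39
remainders = []
for _ in count():  # count() yields 0, 1, 2, ... without ever ending
    if n <= 0:
        break  # the while condition, checked by hand
    n, remainder = divmod(n, 2)
    remainders.append(remainder)

print(remainders[::-1])
```

Compare it with binary.2.py: the result is the same, but the intent is much harder to read, which is exactly the point.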
So, to recap, use a for loop when you need to iterate over one iterable (or a combination of them), and a while loop when you need to loop according to a condition being satisfied or not. If you keep in mind the difference between the two purposes, you will never choose the wrong looping construct.
Let's now see how to alter the normal flow of a loop.
The break and continue statements
According to the task at hand, sometimes you will need to alter the regular flow of a loop. You can either skip a single iteration (as many times as you want), or you can break out of the loop entirely. A common use case for skipping iterations is when you're iterating over a list of items and you need to work on each of them only if some condition is verified. On the other hand, if you're iterating over a collection of items, and you have found one of them that satisfies some need you have, you may decide not to continue the loop and therefore break out of it. There are countless possible scenarios, so it's better to see a couple of examples.
Let's say you want to apply a 20% discount to all the products in a basket list that have an expiration date of today. The way you achieve this is to use the continue statement, which tells the looping construct (for or while) to immediately stop execution of the body and go to the next iteration, if any. This example will take us a little deeper down the rabbit hole, so be ready to jump.
discount.py
from datetime import date, timedelta

today = date.today()
tomorrow = today + timedelta(days=1)  # today + 1 day is tomorrow
products = [
    {'sku': '1', 'expiration_date': today, 'price': 100.0},
    {'sku': '2', 'expiration_date': tomorrow, 'price': 50},
    {'sku': '3', 'expiration_date': today, 'price': 20},
]
for product in products:
    if product['expiration_date'] != today:
        continue
    product['price'] *= 0.8  # equivalent to applying 20% discount
    print(
        'Price for sku', product['sku'],
        'is now', product['price'])
You see we start by importing the date and timedelta objects, then we set up our products. Those with sku 1 and 3 have an expiration date of today, which means we want to apply a 20% discount on them. We loop over each product and we inspect the expiration date. If it is not (inequality operator, !=) today, we don't want to execute the rest of the body suite, so we continue.
Notice that it is not important where in the body suite you place the continue statement (you can even use it more than once). When you reach it, execution of the body stops and moves on to the next iteration. If we run the discount.py module, this is the output:
$ python discount.py
Price for sku 1 is now 80.0
Price for sku 3 is now 16.0
This shows you that the last two lines of the body haven't been executed for sku number 2.
Let's now see an example of breaking out of a loop. Say we want to tell if at least one of the elements in a list evaluates to True when fed to the bool function. Given that we need to know if there is at least one, when we find it we don't need to keep scanning the list any further. In Python code, this translates to using the break statement. Let's write this down into code:
any.py
items = [0, None, 0.0, True, 0, 7]  # True and 7 evaluate to True
found = False  # this is called "flag"
for item in items:
    print('scanning item', item)
    if item:
        found = True  # we update the flag
        break

if found:  # we inspect the flag
    print('At least one item evaluates to True')
else:
    print('All items evaluate to False')
The preceding code is such a common pattern in programming, you will see it a lot. When you inspect items this way, basically what you do is to set up a flag variable, then start the inspection. If you find one element that matches your criteria (in this example, that evaluates to True), then you update the flag and stop iterating. After iteration, you inspect the flag and take action accordingly. Execution yields:
$ python any.py
scanning item 0
scanning item None
scanning item 0.0
scanning item True
At least one item evaluates to True
See how execution stopped after True was found?
The break statement acts like the continue one, in that it stops executing the body of the loop immediately, but it also prevents any further iteration from running, effectively breaking out of the loop.
The continue and break statements can be used together with no limitation in their number, both in the for and while looping constructs.
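Here is a small sketch with both statements living in the same loop. The data and the rules (skip negative numbers, stop at the first number above 20) are made up purely for illustration.

```python
numbers = [4, -1, 9, 0, 16, -5, 25, 36]

processed = []
for number in numbers:
    if number < 0:
        continue  # skip negatives, move on to the next number
    if number > 20:
        break  # stop at the first number above 20
    processed.append(number)

print(processed)
```

The negative numbers are skipped one by one, while 25 ends the loop for good, so 36 is never even looked at.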
Tip
By the way, there is no need to write code to detect if there is at least one element in a sequence that evaluates to True. Just check out the any built-in function.
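For comparison, this is roughly what any.py boils down to when you let the built-in do the scanning. The variable names are mine.

```python
items = [0, None, 0.0, True, 0, 7]

found = any(items)  # True as soon as one truthy element is met
print(found)

all_falsy = any([0, None, 0.0, ''])  # no element evaluates to True
print(all_falsy)

empty_result = any([])  # no elements, so nothing can be truthy
print(empty_result)
```

Like our loop with break, any stops scanning as soon as it finds the first element that evaluates to True.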
A special else clause
One of the features I've seen only in the Python language is the ability to have else clauses after while and for loops. It's very rarely used, but it's definitely nice to have. In short, you can have an else suite after a for or while loop. If the loop ends normally, because of exhaustion of the iterator (for loop) or because the condition is finally not met (while loop), then the else suite (if present) is executed. In case execution is interrupted by a break statement, the else clause is not executed.
Let's take an example of a for loop that iterates over a group of items, looking for one that would match some condition. In case we don't find at least one that satisfies the condition, we want to raise an exception. This means we want to arrest the regular execution of the program and signal that there was an error, or exception, that we cannot deal with. Exceptions will be the subject of Chapter 7, Testing, Profiling, and Dealing with Exceptions, so don't worry if you don't fully understand them now. Just bear in mind that they will alter the regular flow of the code. Let me now show you two examples that do exactly the same thing, but one of them is using the special for ... else syntax. Say that we want to find among a collection of people one that could drive a car.
for.no.else.py
class DriverException(Exception):
    pass

people = [('James', 17), ('Kirk', 9), ('Lars', 13), ('Robert', 8)]
driver = None
for person, age in people:
    if age >= 18:
        driver = (person, age)
        break

if driver is None:
    raise DriverException('Driver not found.')
Notice the flag pattern again. We set driver to be None, then if we find one we update the driver flag, and then, at the end of the loop, we inspect it to see if one was found. I kind of have the feeling that those kids would drive a very metallic car, but anyway, notice that if a driver is not found, a DriverException is raised, signaling the program that execution cannot continue (we're lacking the driver).
The same functionality can be rewritten a bit more elegantly using the following code:
for.else.py
class DriverException(Exception):
    pass

people = [('James', 17), ('Kirk', 9), ('Lars', 13), ('Robert', 8)]
for person, age in people:
    if age >= 18:
        driver = (person, age)
        break
else:
    raise DriverException('Driver not found.')
Notice that we aren't forced to use the flag pattern any more. The exception is raised as part of the for loop logic, which makes good sense because the for loop is checking on some condition. All we need is to set up a driver object in case we find one, because the rest of the code is going to use that information somewhere. Notice the code is shorter and more elegant, because the logic is now correctly grouped together where it belongs.
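The else clause works the same way on a while loop. The following sketch reuses the same list of people but handles the position manually. To keep it easy to run, it stores a message instead of raising an exception. The message variable is my own addition.

```python
people = [('James', 17), ('Kirk', 9), ('Lars', 13), ('Robert', 8)]

position = 0
while position < len(people):
    person, age = people[position]
    if age >= 18:
        message = person + ' can drive'
        break
    position += 1
else:
    # runs only because the condition became False without a break
    message = 'Driver not found.'

print(message)
```

Since nobody in the list is 18 or older, the loop ends because its condition becomes False, and the else suite runs. Change one of the ages to 18 or more and the break will skip it.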
Iterating over multiple sequences
Let's see another example of how to iterate over two sequences of the same length, in order to work on their respective elements in pairs. Say we have a list of people and a list of numbers representing the age of the people in the first list. We want to print a pair person/age on one line for all of them. Let's start with an example and let's refine it gradually.
multiple.sequences.py
people = ['Jonas', 'Julio', 'Mike', 'Mez'] ages = [25, 30, 31, 39] for position in range(len(people)): person = people[position] age = ages[position] print(person, age)
By now, this code should be pretty straightforward for you to understand. We need to iterate over the list of positions (0, 1, 2, 3) because we want to retrieve elements from two different lists. Executing it we get the following:
$ python multiple.sequences.py Jonas 25 Julio 30 Mike 31 Mez 39
This code is both inefficient and not Pythonic. Inefficient because retrieving an element given the position can be an expensive operation, and we're doing it from scratch at each iteration. The mail man doesn't go back to the beginning of the road each time he delivers a letter, right? He moves from house to house. From one to the next one. Let's try to make it better using enumerate:
multiple.sequences.enumerate.py
people = ['Jonas', 'Julio', 'Mike', 'Mez'] ages = [25, 30, 31, 39] for position, person in enumerate(people): age = ages[position] print(person, age)
Better, but still not perfect. And still a bit ugly. We're iterating properly on people
, but we're still fetching age
using positional indexing, which we want to lose as well. Well, no worries, Python gives you the zip
function, remember? Let's use it!
multiple.sequences.zip.py
people = ['Jonas', 'Julio', 'Mike', 'Mez'] ages = [25, 30, 31, 39] for person, age in zip(people, ages): print(person, age)
Ah! So much better! Once again, compare the preceding code with the first example and admire Python's elegance. The reason I wanted to show this example is twofold. On the one hand, I wanted to give you an idea of how shorter the code in Python can be compared to other languages where the syntax doesn't allow you to iterate over sequences or collections as easily. And on the other hand, and much more importantly, notice that when the for
loop asks zip(sequenceA, sequenceB)
for the next element, it gets back a tuple
, not just a single object. It gets back a tuple
with as many elements as the number of sequences we feed to the zip
function. Let's expand a little on the previous example in two ways: using explicit and implicit assignment:
multiple.sequences.explicit.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
nationalities = ['Belgium', 'Spain', 'England', 'Bangladesh']
for person, age, nationality in zip(people, ages, nationalities):
    print(person, age, nationality)
In the preceding code, we added the nationalities list. Now that we feed three sequences to the zip function, the for loop gets back a 3-tuple at each iteration. Notice that the position of the elements in the tuple respects the position of the sequences in the zip call. Executing the code will yield the following result:
$ python multiple.sequences.explicit.py
Jonas 25 Belgium
Julio 30 Spain
Mike 31 England
Mez 39 Bangladesh
Sometimes, for reasons that may not be clear in a simple example like the preceding one, you may want to explode the tuple within the body of the for loop. If that is your desire, it's perfectly possible to do so.
multiple.sequences.implicit.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
nationalities = ['Belgium', 'Spain', 'England', 'Bangladesh']
for data in zip(people, ages, nationalities):
    person, age, nationality = data
    print(person, age, nationality)
It's basically doing what the for loop does automatically for you, but in some cases you may want to do it yourself. Here, the 3-tuple data that comes from zip(...) is exploded within the body of the for loop into three variables: person, age, and nationality.
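To look at the tuples that zip hands over without a for loop in the way, one option is to materialize them with list. The following is only a small sketch; the lists are shortened versions of the ones used above, and the name pairs is made up for illustration.

```python
people = ['Jonas', 'Julio', 'Mike']
ages = [25, 30, 31]

# zip is lazy: wrapping it in list() forces it to produce all its tuples
pairs = list(zip(people, ages))
print(pairs)  # [('Jonas', 25), ('Julio', 30), ('Mike', 31)]

# unpacking one tuple by hand mirrors what the for loop does at each iteration
person, age = pairs[0]
print(person, age)  # Jonas 25
```

Keep in mind that zip stops as soon as the shortest of the sequences it was given is exhausted, so sequences of different lengths are silently truncated.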
The while loop
In the preceding pages, we saw the for loop in action. It's incredibly useful when you need to loop over a sequence or a collection. The key point to keep in mind, when you need to be able to discriminate which looping construct to use, is that the for loop rocks when you have to iterate over a finite number of elements. It can be a huge number, but still, something that at some point ends.
There are other cases though, when you just need to loop until some condition is satisfied, or even loop indefinitely until the application is stopped. Cases where we don't really have something to iterate on, and therefore the for loop would be a poor choice. But fear not, for these cases Python provides us with the while loop.
The while loop is similar to the for loop, in that they both loop and at each iteration they execute a body of instructions. What is different between them is that the while loop doesn't loop over a sequence (it can, but you have to manually write the logic and it wouldn't make any sense, you would just want to use a for loop); rather, it loops as long as a certain condition is satisfied. When the condition is no longer satisfied, the loop ends.
As usual, let's see an example which will clarify everything for us. We want to print the binary representation of a positive number. In order to do so, we repeatedly divide the number by two, collecting the remainder, and then reverse the list of remainders. Let me give you a small example using number 6, which is 110 in binary.
6 / 2 = 3 (remainder: 0)
3 / 2 = 1 (remainder: 1)
1 / 2 = 0 (remainder: 1)
List of remainders: 0, 1, 1.
Reversed, it is 1, 1, 0, which is also the binary representation of 6: 110
Let's write some code to calculate the binary representation for number 39, which is 100111 in base 2.
binary.py
n = 39
remainders = []
while n > 0:
    remainder = n % 2  # remainder of division by 2
    remainders.append(remainder)  # we keep track of remainders
    n //= 2  # we divide n by 2

# reassign the list to its reversed copy and print it
remainders = remainders[::-1]
print(remainders)
In the preceding code, I highlighted two things: n > 0, which is the condition to keep looping, and remainders[::-1], which is a nice and easy way to get the reversed version of a list (missing start and end parameters, step = -1, produces the same list, from end to start, in reverse order). We can make the code a little shorter (and more Pythonic) by using the divmod function, which is called with a number and a divisor, and returns a tuple with the result of the integer division and its remainder. For example, divmod(13, 5) would return (2, 3), and indeed 5 * 2 + 3 = 13.
binary.2.py
n = 39
remainders = []
while n > 0:
    n, remainder = divmod(n, 2)
    remainders.append(remainder)
# reassign the list to its reversed copy and print it
remainders = remainders[::-1]
print(remainders)
In the preceding code, we have reassigned n to the result of the division by 2, and the remainder, in one single line.
Notice that the condition in a while loop is a condition to continue looping. If it evaluates to True, then the body is executed and then another evaluation follows, and so on, until the condition evaluates to False. When that happens, the loop is exited immediately without executing its body.
Note
If the condition never evaluates to False, the loop becomes a so-called infinite loop. Infinite loops are used, for example, when polling from network devices: you ask the socket if there is any data, you do something with it if there is any, then you sleep for a small amount of time, and then you ask the socket again, over and over again, without ever stopping.
Having the ability to loop over a condition, or to loop indefinitely, is the reason why the for loop alone is not enough, and therefore Python provides the while loop.
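A minimal sketch of the polling pattern described in the note might look like the following. To keep it runnable on its own, the network is replaced by a hypothetical in-memory queue (the names incoming and received are made up for illustration), and a 'stop' message ends the loop through a break statement, which is covered in the next section. A real poller would typically have no such exit.

```python
import time
from collections import deque

# hypothetical stand-in for data arriving from a socket
incoming = deque(['ping', None, 'pong', 'stop'])
received = []

while True:  # the condition never becomes False on its own
    message = incoming.popleft()  # "ask the socket" for data
    if message == 'stop':
        break  # leave the loop; break is covered in the next section
    if message is not None:
        received.append(message)  # do something with the data
    time.sleep(0.01)  # wait a little before polling again

print(received)  # ['ping', 'pong']
```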
Tip
By the way, if you need the binary representation of a number, check out the bin function.
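As a quick comparison, here is a sketch that wraps the divmod-based loop in a function and puts its result next to the one from bin. The function name to_binary is an assumption made for this example; it is not part of the original scripts.

```python
def to_binary(n):
    """Return the binary digits of a positive integer as a string."""
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)
        remainders.append(remainder)
    # reverse the remainders and join them into a single string
    return ''.join(str(r) for r in reversed(remainders))


print(to_binary(39))  # 100111
print(bin(39))        # 0b100111
```

Notice that bin returns a string prefixed with 0b, so slicing it with bin(39)[2:] gives the same digits as the hand-written loop.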
Just for fun, let's adapt one of the examples (multiple.sequences.py) using the while logic.
multiple.sequences.while.py
people = ['Jonas', 'Julio', 'Mike', 'Mez']
ages = [25, 30, 31, 39]
position = 0
while position < len(people):
    person = people[position]
    age = ages[position]
    print(person, age)
    position += 1
In the preceding code, I have highlighted the initialization, condition, and update of the variable position, which makes it possible to simulate the equivalent for loop code by handling the iteration variable manually. Everything that can be done with a for loop can also be done with a while loop, even though you can see there's a bit of boilerplate you have to go through in order to achieve the same result. The opposite is also true, but simulating a never-ending while loop using a for loop requires some real trickery, so why would you do that? Use the right tool for the job, and 99.9% of the time you'll be fine.
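For the curious, one possible form of that trickery is to feed the for loop an iterator that never runs out, such as count from the standard library's itertools module. This is only a sketch of the idea, not something you would normally write; it relies on the break statement, covered in the next section, to get out of the loop.

```python
from itertools import count

total = 0
for number in count(1):  # count(1) yields 1, 2, 3, ... without ever ending
    total += number
    if total > 20:
        break  # without this, the loop would never stop

print(number, total)  # 6 21
```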
So, to recap, use a for loop when you need to iterate over one iterable (or a combination of iterables), and a while loop when you need to loop according to a condition being satisfied or not. If you keep in mind the difference between the two purposes, you will never choose the wrong looping construct.
Let's now see how to alter the normal flow of a loop.
The break and continue statements
According to the task at hand, sometimes you will need to alter the regular flow of a loop. You can either skip a single iteration (as many times as you want), or you can break out of the loop entirely. A common use case for skipping iterations is, for example, when you're iterating over a list of items and you need to work on each of them only if some condition is verified. On the other hand, if you're iterating over a collection of items, and you have found one of them that satisfies some need you have, you may decide not to continue the loop entirely and therefore break out of it. There are countless possible scenarios, so it's better to see a couple of examples.
Let's say you want to apply a 20% discount to all the products in a basket list that have an expiration date of today. The way you achieve this is to use the continue statement, which tells the looping construct (for or while) to immediately stop execution of the body and go to the next iteration, if any. This example will take us a little deeper down the rabbit hole, so be ready to jump.
discount.py
from datetime import date, timedelta

today = date.today()
tomorrow = today + timedelta(days=1)  # today + 1 day is tomorrow
products = [
    {'sku': '1', 'expiration_date': today, 'price': 100.0},
    {'sku': '2', 'expiration_date': tomorrow, 'price': 50},
    {'sku': '3', 'expiration_date': today, 'price': 20},
]
for product in products:
    if product['expiration_date'] != today:
        continue
    product['price'] *= 0.8  # equivalent to applying 20% discount
    print(
        'Price for sku', product['sku'],
        'is now', product['price'])
You see we start by importing the date and timedelta objects, then we set up our products. Those with sku 1 and 3 have an expiration date of today, which means we want to apply a 20% discount on them. We loop over each product and we inspect the expiration date. If it is not (inequality operator, !=) today, we don't want to execute the rest of the body suite, so we continue.
Notice that it is not important where in the body suite you place the continue statement (you can even use it more than once). When you reach it, execution of the body stops and the loop moves on to the next iteration. If we run the discount.py module, this is the output:
$ python discount.py
Price for sku 1 is now 80.0
Price for sku 3 is now 16.0
Which shows you that the last two lines of the body haven't been executed for sku number 2.
Let's now see an example of breaking out of a loop. Say we want to tell if at least one of the elements in a list evaluates to True when fed to the bool function. Given that we need to know if there is at least one, when we find it we don't need to keep scanning the list any further. In Python code, this translates to using the break statement. Let's write this down into code:
any.py
items = [0, None, 0.0, True, 0, 7]  # True and 7 evaluate to True
found = False  # this is called "flag"
for item in items:
    print('scanning item', item)
    if item:
        found = True  # we update the flag
        break

if found:  # we inspect the flag
    print('At least one item evaluates to True')
else:
    print('All items evaluate to False')
The preceding code is such a common pattern in programming, you will see it a lot. When you inspect items this way, basically what you do is to set up a flag variable, then start the inspection. If you find one element that matches your criteria (in this example, that evaluates to True), then you update the flag and stop iterating. After iteration, you inspect the flag and take action accordingly. Execution yields:
$ python any.py
scanning item 0
scanning item None
scanning item 0.0
scanning item True
At least one item evaluates to True
See how execution stopped after True was found?
The break statement acts exactly like the continue one, in that it stops executing the body of the loop immediately, but it also prevents any other iteration from running, effectively breaking out of the loop.
The continue and break statements can be used together with no limitation in their number, both in the for and while looping constructs.
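To illustrate the point, here is a small sketch that uses two continue statements and one break in the same while loop. The numbers and the threshold of 20 are arbitrary values chosen for the example.

```python
numbers = [4, -1, 7, 0, 12, -3, 5]
position = 0
total = 0

while position < len(numbers):
    number = numbers[position]
    position += 1  # update first, so that continue cannot skip it
    if number < 0:
        continue  # ignore negative numbers
    if number == 0:
        continue  # a second continue: ignore zeros as well
    if total + number > 20:
        break  # stop as soon as the running total would exceed 20
    total += number

print(total)  # 11
```

One detail is worth noticing: in a while loop, a continue placed before the update of the loop variable would skip that update, and the loop would then keep inspecting the same element forever. That is why position is incremented at the top of the body here.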
Tip
By the way, there is no need to write code to detect if there is at least one element in a sequence that evaluates to True. Just check out the any built-in function.
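As a short sketch, this is how any behaves on the list from any.py and on two other inputs. The variable names are chosen just for this example.

```python
items = [0, None, 0.0, True, 0, 7]

has_truthy = any(items)            # True: the scan stops at the first truthy item
all_falsy = any([0, None, 0.0])    # False: no item evaluates to True
empty = any([])                    # False: an empty iterable has no truthy item

print(has_truthy, all_falsy, empty)  # True False False
```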
A special else clause
One of the features I've seen only in the Python language is the ability to have else clauses after while and for loops. It's very rarely used, but it's definitely nice to have. In short, you can have an else suite after a for or while loop. If the loop ends normally, because of exhaustion of the iterator (for loop) or because the condition is finally not met (while loop), then the else suite (if present) is executed. In case execution is interrupted by a break statement, the else clause is not executed.
Let's take an example of a for loop that iterates over a group of items, looking for one that would match some condition. In case we don't find at least one that satisfies the condition, we want to raise an exception. This means we want to arrest the regular execution of the program and signal that there was an error, or exception, that we cannot deal with. Exceptions will be the subject of Chapter 7, Testing, Profiling, and Dealing with Exceptions, so don't worry if you don't fully understand them now. Just bear in mind that they will alter the regular flow of the code. Let me now show you two examples that do exactly the same thing, but one of them is using the special for ... else syntax. Say that we want to find among a collection of people one that could drive a car.
for.no.else.py
class DriverException(Exception):
    pass

people = [('James', 17), ('Kirk', 9), ('Lars', 13), ('Robert', 8)]
driver = None
for person, age in people:
    if age >= 18:
        driver = (person, age)
        break

if driver is None:
    raise DriverException('Driver not found.')
Notice the flag pattern again. We set driver to be None, then if we find one we update the driver flag, and then, at the end of the loop, we inspect it to see if one was found. I kind of have the feeling that those kids would drive a very metallic car, but anyway, notice that if a driver is not found, a DriverException is raised, signaling the program that execution cannot continue (we're lacking the driver).
The same functionality can be rewritten a bit more elegantly using the following code:
for.else.py
class DriverException(Exception):
    pass

people = [('James', 17), ('Kirk', 9), ('Lars', 13), ('Robert', 8)]
for person, age in people:
    if age >= 18:
        driver = (person, age)
        break
else:
    raise DriverException('Driver not found.')
Notice that we aren't forced to use the flag pattern any more. The exception is raised as part of the for loop logic, which makes good sense because the for loop is checking on some condition. All we need is to set up a driver object in case we find one, because the rest of the code is going to use that information somewhere. Notice the code is shorter and more elegant, because the logic is now correctly grouped together where it belongs.
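The else clause works in the same way after a while loop. The following sketch, with made-up data and an illustrative variable named outcome, searches a list for an even number; since there is none, the condition eventually becomes False, break is never reached, and the else suite runs.

```python
candidates = [9, 15, 21, 25]
position = 0

while position < len(candidates):
    if candidates[position] % 2 == 0:
        outcome = 'found ' + str(candidates[position])
        break  # leaving through break skips the else suite
    position += 1
else:
    # runs only because the loop ended without hitting break
    outcome = 'no even number found'

print(outcome)  # no even number found
```

If you add an even number to candidates, the break is executed and the else suite is skipped.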
The while loop
In the preceding pages, we saw the for
loop in action. It's incredibly useful when you need to loop over a sequence or a collection. The key point to keep in mind, when you need to be able to discriminate which looping construct to use, is that the for
loop rocks when you have to iterate over a finite amount of elements. It can be a huge amount, but still, something that at some point ends.
There are other cases though, when you just need to loop until some condition is satisfied, or even loop indefinitely until the application is stopped. Cases where we don't really have something to iterate on, and therefore the for
loop would be a poor choice. But fear not, for these cases Python provides us with the while
loop.
The while
loop is similar to the for
loop, in that they both loop and at each iteration they execute a body of instructions. What is different between them is that the while
loop doesn't loop over a sequence (it can, but you have to manually write the logic and it wouldn't make any sense, you would just want to use a for
loop), rather, it loops as long as a certain condition is satisfied. When the condition is no longer satisfied, the loop ends.
As usual, let's see an example which will clarify everything for us. We want to print the binary representation of a positive number. In order to do so, we repeatedly divide the number by two, collecting the remainder, and then produce the inverse of the list of remainders. Let me give you a small example using number 6, which is 110 in binary.
6 / 2 = 3 (remainder: 0) 3 / 2 = 1 (remainder: 1) 1 / 2 = 0 (remainder: 1) List of remainders: 0, 1, 1. Inverse is 1, 1, 0, which is also the binary representation of 6: 110
Let's write some code to calculate the binary representation for number 39: 1001112.
binary.py
n = 39 remainders = [] while n > 0: remainder = n % 2 # remainder of division by 2 remainders.append(remainder) # we keep track of remainders n //= 2 # we divide n by 2 # reassign the list to its reversed copy and print it remainders = remainders[::-1] print(remainders)
In the preceding code, I highlighted two things: n > 0
, which is the condition to keep looping, and remainders[::-1]
which is a nice and easy way to get the reversed version of a list (missing start
and end
parameters, step = -1
, produces the same list, from end
to start
, in reverse order). We can make the code a little shorter (and more Pythonic), by using the divmod
function, which is called with a number and a divisor, and returns a tuple with the result of the integer division and its remainder. For example, divmod(13, 5)
would return (2, 3)
, and indeed 5 * 2 + 3 = 13.
binary.2.py
n = 39
remainders = []
while n > 0:
n, remainder = divmod(n, 2)
remainders.append(remainder)
# reassign the list to its reversed copy and print it
remainders = remainders[::-1]
print(remainders)
In the preceding code, we have reassigned n to the result of the division by 2, and the remainder, in one single line.
Notice that the condition in a while
loop is a condition to continue looping. If it evaluates to True
, then the body is executed and then another evaluation follows, and so on, until the condition evaluates to False
. When that happens, the loop is exited immediately without executing its body.
Note
If the condition never evaluates to False
, the loop becomes a so called infinite loop. Infinite loops are used for example when polling from network devices: you ask the socket if there is any data, you do something with it if there is any, then you sleep for a small amount of time, and then you ask the socket again, over and over again, without ever stopping.
Having the ability to loop over a condition, or to loop indefinitely, is the reason why the for
loop alone is not enough, and therefore Python provides the while
loop.
Tip
By the way, if you need the binary representation of a number, checkout the bin
function.
Just for fun, let's adapt one of the examples (multiple.sequences.py
) using the while logic.
multiple.sequences.while.py
people = ['Jonas', 'Julio', 'Mike', 'Mez'] ages = [25, 30, 31, 39] position = 0 while position < len(people): person = people[position] age = ages[position] print(person, age) position += 1
In the preceding code, I have highlighted the initialization, condition, and update of the variable position
, which makes it possible to simulate the equivalent for
loop code by handling the iteration variable manually. Everything that can be done with a for
loop can also be done with a while
loop, even though you can see there's a bit of boilerplate you have to go through in order to achieve the same result. The opposite is also true, but simulating a never ending while
loop using a for
loop requires some real trickery, so why would you do that? Use the right tool for the job, and 99.9% of the times you'll be fine.
So, to recap, use a for
loop when you need to iterate over one (or a combination of) iterable, and a while
loop when you need to loop according to a condition being satisfied or not. If you keep in mind the difference between the two purposes, you will never choose the wrong looping construct.
Let's now see how to alter the normal flow of a loop.
The break and continue statements
According to the task at hand, sometimes you will need to alter the regular flow of a loop. You can either skip a single iteration (as many times you want), or you can break out of the loop entirely. A common use case for skipping iterations is for example when you're iterating over a list of items and you need to work on each of them only if some condition is verified. On the other hand, if you're iterating over a collection of items, and you have found one of them that satisfies some need you have, you may decide not to continue the loop entirely and therefore break out of it. There are countless possible scenarios, so it's better to see a couple of examples.
Let's say you want to apply a 20% discount to all products in a basket list for those which have an expiration date of today. The way you achieve this is to use the continue statement, which tells the looping construct (for
or while
) to immediately stop execution of the body and go to the next iteration, if any. This example will take us a little deeper down the rabbit whole, so be ready to jump.
discount.py
from datetime import date, timedelta today = date.today() tomorrow = today + timedelta(days=1) # today + 1 day is tomorrow products = [ {'sku': '1', 'expiration_date': today, 'price': 100.0}, {'sku': '2', 'expiration_date': tomorrow, 'price': 50}, {'sku': '3', 'expiration_date': today, 'price': 20}, ] for product in products: if product['expiration_date'] != today: continue product['price'] *= 0.8 # equivalent to applying 20% discount print( 'Price for sku', product['sku'], 'is now', product['price'])
You see we start by importing the date
and timedelta
objects, then we set up our products. Those with sku 1
and 3
have an expiration date of today
, which means we want to apply 20% discount on them. We loop over each product
and we inspect the expiration date. If it is not (inequality operator, !=
) today
, we don't want to execute the rest of the body suite, so we continue
.
Notice that is not important where in the body suite you place the continue
statement (you can even use it more than once). When you reach it, execution stops and goes back to the next iteration. If we run the discount.py
module, this is the output:
$ python discount.py Price for sku 1 is now 80.0 Price for sku 3 is now 16.0
Which shows you that the last two lines of the body haven't been executed for sku number 2.
Let's now see an example of breaking out of a loop. Say we want to tell if at least any of the elements in a list evaluates to True
when fed to the bool
function. Given that we need to know if there is at least one, when we find it we don't need to keep scanning the list any further. In Python code, this translates to using the break statement. Let's write this down into code:
any.py
items = [0, None, 0.0, True, 0, 7] # True and 7 evaluate to True found = False # this is called "flag" for item in items: print('scanning item', item) if item: found = True # we update the flag break if found: # we inspect the flag print('At least one item evaluates to True') else: print('All items evaluate to False')
The preceding code is such a common pattern in programming, you will see it a lot. When you inspect items this way, basically what you do is to set up a flag
variable, then start the inspection. If you find one element that matches your criteria (in this example, that evaluates to True
), then you update the flag and stop iterating. After iteration, you inspect the flag and take action accordingly. Execution yields:
$ python any.py scanning item 0 scanning item None scanning item 0.0 scanning item True At least one item evaluates to True
See how execution stopped after True
was found?
The break
statement acts exactly like the continue
one, in that it stops executing the body of the loop immediately, but also, prevents any other iteration to run, effectively breaking out of the loop.
The continue
and break
statements can be used together with no limitation in their number, both in the for
and while
looping constructs.
Tip
By the way, there is no need to write code to detect if there is at least one element in a sequence that evaluates to True
. Just check out the any
built-in function.
A special else clause
One of the features I've seen only in the Python language is the ability to have else
clauses after while
and for
loops. It's very rarely used, but it's definitely nice to have. In short, you can have an else
suite after a for
or while
loop. If the loop ends normally, because of exhaustion of the iterator (for
loop) or because the condition is finally not met (while
loop), then the else
suite (if present) is executed. In case execution is interrupted by a break
statement, the else
clause is not executed. Let's take an example of a for
loop that iterates over a group of items, looking for one that would match some condition. In case we don't find at least one that satisfies the condition, we want to raise an exception. This means we want to arrest the regular execution of the program and signal that there was an error, or exception, that we cannot deal with. Exceptions will be the subject of Chapter 7, Testing, Profiling, and Dealing with Exceptions, so don't worry if you don't fully understand them now. Just bear in mind that they will alter the regular flow of the code. Let me now show you two examples that do exactly the same thing, but one of them is using the special for
... else
syntax. Say that we want to find among a collection of people one that could drive a car.
for.no.else.py
class DriverException(Exception): pass people = [('James', 17), ('Kirk', 9), ('Lars', 13), ('Robert', 8)] driver = None for person, age in people: if age >= 18: driver = (person, age) break if driver is None: raise DriverException('Driver not found.')
Notice the flag
pattern again. We set driver to be None
, then if we find one we update the driver
flag, and then, at the end of the loop, we inspect it to see if one was found. I kind of have the feeling that those kids would drive a very metallic car, but anyway, notice that if a driver is not found, a DriverException
is raised, signaling the program that execution cannot continue (we're lacking the driver).
The same functionality can be rewritten a bit more elegantly using the following code:
for.else.py
class DriverException(Exception): pass people = [('James', 17), ('Kirk', 9), ('Lars', 13), ('Robert', 8)] for person, age in people: if age >= 18: driver = (person, age) break else: raise DriverException('Driver not found.')
Notice that we aren't forced to use the flag
pattern any more. The exception is raised as part of the for
loop logic, which makes good sense because the for
loop is checking on some condition. All we need is to set up a driver
object in case we find one, because the rest of the code is going to use that information somewhere. Notice the code is shorter and more elegant, because the logic is now correctly grouped together where it belongs.
The break and continue statements
According to the task at hand, sometimes you will need to alter the regular flow of a loop. You can either skip a single iteration (as many times you want), or you can break out of the loop entirely. A common use case for skipping iterations is for example when you're iterating over a list of items and you need to work on each of them only if some condition is verified. On the other hand, if you're iterating over a collection of items, and you have found one of them that satisfies some need you have, you may decide not to continue the loop entirely and therefore break out of it. There are countless possible scenarios, so it's better to see a couple of examples.
Let's say you want to apply a 20% discount to all products in a basket list for those which have an expiration date of today. The way you achieve this is to use the continue statement, which tells the looping construct (for
or while
) to immediately stop execution of the body and go to the next iteration, if any. This example will take us a little deeper down the rabbit whole, so be ready to jump.
discount.py
from datetime import date, timedelta today = date.today() tomorrow = today + timedelta(days=1) # today + 1 day is tomorrow products = [ {'sku': '1', 'expiration_date': today, 'price': 100.0}, {'sku': '2', 'expiration_date': tomorrow, 'price': 50}, {'sku': '3', 'expiration_date': today, 'price': 20}, ] for product in products: if product['expiration_date'] != today: continue product['price'] *= 0.8 # equivalent to applying 20% discount print( 'Price for sku', product['sku'], 'is now', product['price'])
You see we start by importing the date
and timedelta
objects, then we set up our products. Those with sku 1
and 3
have an expiration date of today
, which means we want to apply 20% discount on them. We loop over each product
and we inspect the expiration date. If it is not (inequality operator, !=
) today
, we don't want to execute the rest of the body suite, so we continue
.
Notice that is not important where in the body suite you place the continue
statement (you can even use it more than once). When you reach it, execution stops and goes back to the next iteration. If we run the discount.py
module, this is the output:
$ python discount.py Price for sku 1 is now 80.0 Price for sku 3 is now 16.0
Which shows you that the last two lines of the body haven't been executed for sku number 2.
Let's now see an example of breaking out of a loop. Say we want to tell if at least any of the elements in a list evaluates to True
when fed to the bool
function. Given that we need to know if there is at least one, when we find it we don't need to keep scanning the list any further. In Python code, this translates to using the break statement. Let's write this down into code:
any.py
items = [0, None, 0.0, True, 0, 7] # True and 7 evaluate to True found = False # this is called "flag" for item in items: print('scanning item', item) if item: found = True # we update the flag break if found: # we inspect the flag print('At least one item evaluates to True') else: print('All items evaluate to False')
The preceding code is such a common pattern in programming, you will see it a lot. When you inspect items this way, basically what you do is to set up a flag
variable, then start the inspection. If you find one element that matches your criteria (in this example, that evaluates to True
), then you update the flag and stop iterating. After iteration, you inspect the flag and take action accordingly. Execution yields:
$ python any.py scanning item 0 scanning item None scanning item 0.0 scanning item True At least one item evaluates to True
See how execution stopped after True
was found?
The break
statement acts exactly like the continue
one, in that it stops executing the body of the loop immediately, but also, prevents any other iteration to run, effectively breaking out of the loop.
The continue
and break
statements can be used together with no limitation in their number, both in the for
and while
looping constructs.
Tip
By the way, there is no need to write code to detect if there is at least one element in a sequence that evaluates to True
. Just check out the any
built-in function.
A special else clause
One of the features I've seen only in the Python language is the ability to have else
clauses after while
and for
loops. It's very rarely used, but it's definitely nice to have. In short, you can have an else
suite after a for
or while
loop. If the loop ends normally, because of exhaustion of the iterator (for
loop) or because the condition is finally not met (while
loop), then the else
suite (if present) is executed. In case execution is interrupted by a break
statement, the else
clause is not executed. Let's take an example of a for
loop that iterates over a group of items, looking for one that would match some condition. In case we don't find at least one that satisfies the condition, we want to raise an exception. This means we want to arrest the regular execution of the program and signal that there was an error, or exception, that we cannot deal with. Exceptions will be the subject of Chapter 7, Testing, Profiling, and Dealing with Exceptions, so don't worry if you don't fully understand them now. Just bear in mind that they will alter the regular flow of the code. Let me now show you two examples that do exactly the same thing, but one of them is using the special for
... else
syntax. Say that we want to find among a collection of people one that could drive a car.
for.no.else.py
class DriverException(Exception): pass people = [('James', 17), ('Kirk', 9), ('Lars', 13), ('Robert', 8)] driver = None for person, age in people: if age >= 18: driver = (person, age) break if driver is None: raise DriverException('Driver not found.')
Notice the flag
pattern again. We set driver to be None
, then if we find one we update the driver
flag, and then, at the end of the loop, we inspect it to see if one was found. I kind of have the feeling that those kids would drive a very metallic car, but anyway, notice that if a driver is not found, a DriverException
is raised, signaling the program that execution cannot continue (we're lacking the driver).
The same functionality can be rewritten a bit more elegantly using the following code:
for.else.py
class DriverException(Exception): pass people = [('James', 17), ('Kirk', 9), ('Lars', 13), ('Robert', 8)] for person, age in people: if age >= 18: driver = (person, age) break else: raise DriverException('Driver not found.')
Notice that we aren't forced to use the flag pattern any more. The exception is raised as part of the for loop logic, which makes good sense because the for loop is checking on some condition. All we need is to set up a driver object in case we find one, because the rest of the code is going to use that information somewhere. Notice the code is shorter and more elegant, because the logic is now correctly grouped together where it belongs.
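The else suite behaves the same way after a while loop. What follows is only a minimal sketch of that case, not one of the chapter's scripts: the find_driver function and its index variable are made up here for illustration.

```python
class DriverException(Exception):
    pass


def find_driver(people):
    """Return the first (name, age) pair aged 18 or over (hypothetical helper)."""
    index = 0
    while index < len(people):
        person, age = people[index]
        if age >= 18:
            driver = (person, age)
            break  # leaving through break skips the else suite
        index += 1
    else:
        # reached only when the while condition has become False
        raise DriverException('Driver not found.')
    return driver


print(find_driver([('James', 17), ('Cliff', 24)]))  # ('Cliff', 24)
```

Calling find_driver with the list of people used above would raise DriverException, because the loop runs out of candidates without ever reaching break.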
Putting this all together
Now that you have seen all there is to see about conditionals and loops, it's time to spice things up a little, and see those two examples I anticipated at the beginning of this chapter. We'll mix and match here, so you can see how one can use all these concepts together. Let's start by writing some code to generate a list of prime numbers up to some limit. Please bear in mind that I'm going to write a very inefficient and rudimentary algorithm to detect primes. The important thing for you is to concentrate on those bits in the code that belong to this chapter's subject.
Example 1 – a prime generator
According to Wikipedia:
"A prime number (or a prime) is a natural number greater than 1 that has no positive divisors other than 1 and itself. A natural number greater than 1 that is not a prime number is called a composite number."
Based on this definition, if we consider the first 10 natural numbers, we can see that 2, 3, 5, and 7 are primes, while 1, 4, 6, 8, 9, and 10 are not. In order to have a computer tell you if a number N is prime, you can divide that number by all natural numbers in the range [2, N). If any of those divisions yields zero as a remainder, then the number is not a prime. Enough chatter, let's get down to business. I'll write two versions of this, the second of which will exploit the for ... else syntax.
primes.py
primes = []  # this will contain the primes in the end
upto = 100  # the limit, inclusive
for n in range(2, upto + 1):
    is_prime = True  # flag, new at each iteration of outer for
    for divisor in range(2, n):
        if n % divisor == 0:
            is_prime = False
            break
    if is_prime:  # check on flag
        primes.append(n)
print(primes)
Lots of things to notice in the preceding code. First of all, we set up an empty list, primes, which will contain the primes at the end. The limit is 100, and you can see it's inclusive in the way we call range() in the outer loop. If we wrote range(2, upto) that would be [2, upto), right? Therefore range(2, upto + 1) gives us [2, upto + 1) == [2, upto].
So, two for loops. In the outer one we loop over the candidate primes, that is, all natural numbers from 2 to upto. Inside each iteration of this outer loop we set up a flag (which is set to True at each iteration), and then start dividing the current n by all numbers from 2 to n - 1. If we find a proper divisor for n, it means n is composite, and therefore we set the flag to False and break the loop. Notice that when we break the inner one, the outer one keeps on going normally. The reason why we break after having found a proper divisor for n is that we don't need any further information to be able to tell that n is not a prime.
When we check on the is_prime flag, if it is still True, it means we couldn't find any number in [2, n) that is a proper divisor for n, therefore n is a prime. We append n to the primes list, and hop! Another iteration, until n equals 100.
Running this code yields:
$ python primes.py
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
Before we proceed, one question: of all iterations of the outer loop, one of them is different from all the others. Could you tell which one, and why? Think about it for a second, go back to the code and try to figure it out for yourself, and then keep reading.
Did you figure it out? If not, don't feel bad, it's perfectly normal. I asked you to do it as a small exercise because it's what coders do all the time. The skill to understand what the code does by simply looking at it is something you build over time. It's very important, so try to exercise it whenever you can. I'll tell you the answer now: the iteration that behaves differently from all others is the first one. The reason is that in the first iteration, n is 2. Therefore the innermost for loop won't even run, because it's a for loop which iterates over range(2, 2), and what is that if not [2, 2)? Try it out for yourself: write a simple for loop with that iterable, put a print in the body suite, and see if anything happens (it won't...).
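Here is one way that small experiment might look. It is just a sketch: the names empty and body_runs are not part of primes.py and are introduced only to make the result visible.

```python
empty = range(2, 2)  # the interval [2, 2) contains no integers
print(len(empty))    # 0
print(list(empty))   # []

body_runs = 0
for divisor in empty:
    body_runs += 1   # never reached, because there is nothing to iterate over
    print(divisor)

print(body_runs)     # 0
```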
Now, from an algorithmic point of view this code is inefficient so let's at least make it more beautiful:
primes.else.py
primes = []
upto = 100
for n in range(2, upto + 1):
    for divisor in range(2, n):
        if n % divisor == 0:
            break
    else:
        primes.append(n)
print(primes)
Much nicer, right? The is_prime flag is completely gone, and we append n to the primes list when we know the inner for loop hasn't encountered any break statements. See how the code looks cleaner and reads better?
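If you are curious about the efficiency side, one common improvement (not part of this chapter's scripts, and only sketched here) is to stop testing divisors at the integer square root of n, since a composite number always has a divisor no larger than that. The primes_upto function name is made up for this sketch, and math.isqrt assumes Python 3.8 or later.

```python
from math import isqrt  # available from Python 3.8


def primes_upto(upto):
    """Return the primes in [2, upto] by trial division (illustrative helper)."""
    primes = []
    for n in range(2, upto + 1):
        # a composite n always has a divisor no larger than its square root
        for divisor in range(2, isqrt(n) + 1):
            if n % divisor == 0:
                break
        else:
            primes.append(n)
    return primes


print(primes_upto(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The for ... else structure is unchanged; only the upper bound of the inner range() is different.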
Example 2 – applying discounts
In this example, I want to show you a technique I like a lot. In many programming languages, besides the if/elif/else constructs, in whatever form or syntax they may come, you can find another statement, usually called switch/case, that Python has traditionally lacked (a match statement was only added in Python 3.10). It is the equivalent of a cascade of if/elif/.../elif/else clauses, with a syntax similar to this (warning! JavaScript code!):
switch.js
switch (day_number) {
    case 1:
    case 2:
    case 3:
    case 4:
    case 5:
        day = "Weekday";
        break;
    case 6:
        day = "Saturday";
        break;
    case 0:
        day = "Sunday";
        break;
    default:
        day = "";
        alert(day_number + ' is not a valid day number.')
}
In the preceding code, we switch on a variable called day_number. This means we get its value and then decide which case it fits in (if any). Cases 1 to 5 form a cascade, which means that any number in [1, 5] falls through to the bit of logic that sets day to "Weekday". Then we have single cases for 0 and 6 and a default case to prevent errors, which alerts the user that day_number is not a valid day number, that is, not in [0, 6]. Python is perfectly capable of realizing such logic using if/elif/else statements:
switch.py
if 1 <= day_number <= 5:
    day = 'Weekday'
elif day_number == 6:
    day = 'Saturday'
elif day_number == 0:
    day = 'Sunday'
else:
    day = ''
    raise ValueError(
        str(day_number) + ' is not a valid day number.')
In the preceding code, we reproduce the logic of the JavaScript snippet in Python, using if/elif/else statements. I raise a ValueError exception at the end, just as an example, if day_number is not in [0, 6]. This is one possible way of translating the switch/case logic, but there is also another one, sometimes called dispatching, which I will show you in the last version of the next example.
Tip
By the way, did you notice the first line of the previous snippet? Have you noticed that Python can make double (actually, even multiple) comparisons? It's just wonderful!
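A quick sketch of what such a chained comparison means, using an arbitrary example value for day_number. The names chained and spelled_out are introduced here only for the comparison.

```python
day_number = 3  # arbitrary example value

chained = 1 <= day_number <= 5
spelled_out = (1 <= day_number) and (day_number <= 5)
print(chained, spelled_out)  # True True

# chains can be longer than two comparisons
print(0 < day_number < 5 < 10)  # True
print(1 <= 7 <= 5)              # False
```

The chained form reads like the mathematical notation, and the middle operand is evaluated only once.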
Let's start the new example by simply writing some code that assigns a discount to customers based on their coupon value. I'll keep the logic down to a minimum here; remember that all we really care about is conditionals and loops.
coupons.py
customers = [
    dict(id=1, total=200, coupon_code='F20'),  # F20: fixed, £20
    dict(id=2, total=150, coupon_code='P30'),  # P30: percent, 30%
    dict(id=3, total=100, coupon_code='P50'),  # P50: percent, 50%
    dict(id=4, total=110, coupon_code='F15'),  # F15: fixed, £15
]
for customer in customers:
    code = customer['coupon_code']
    if code == 'F20':
        customer['discount'] = 20.0
    elif code == 'F15':
        customer['discount'] = 15.0
    elif code == 'P30':
        customer['discount'] = customer['total'] * 0.3
    elif code == 'P50':
        customer['discount'] = customer['total'] * 0.5
    else:
        customer['discount'] = 0.0

for customer in customers:
    print(customer['id'], customer['total'], customer['discount'])
We start by setting up some customers. They have an order total, a coupon code, and an id. I made up four different types of coupon: two are fixed and two are percentage based. You can see that in the if/elif/else cascade I apply the discount accordingly, and I set it as a 'discount' key in the customer dict.
At the end I just print out part of the data to see if my code is working properly.
$ python coupons.py
1 200 20.0
2 150 45.0
3 100 50.0
4 110 15.0
This code is simple to understand, but all those clauses are kind of cluttering the logic. It's not easy to see what's going on at a first glance, and I don't like it. In cases like this, you can exploit a dictionary to your advantage, like this:
coupons.dict.py
customers = [
    dict(id=1, total=200, coupon_code='F20'),  # F20: fixed, £20
    dict(id=2, total=150, coupon_code='P30'),  # P30: percent, 30%
    dict(id=3, total=100, coupon_code='P50'),  # P50: percent, 50%
    dict(id=4, total=110, coupon_code='F15'),  # F15: fixed, £15
]
discounts = {
    'F20': (0.0, 20.0),  # each value is (percent, fixed)
    'P30': (0.3, 0.0),
    'P50': (0.5, 0.0),
    'F15': (0.0, 15.0),
}
for customer in customers:
    code = customer['coupon_code']
    percent, fixed = discounts.get(code, (0.0, 0.0))
    customer['discount'] = percent * customer['total'] + fixed

for customer in customers:
    print(customer['id'], customer['total'], customer['discount'])
Running the preceding code yields exactly the same result we had from the snippet before it. We saved two lines, but more importantly, we gained a lot in readability, as the body of the for loop is now just three lines long, and very easy to understand. The concept here is to use a dictionary as a dispatcher. In other words, we try to fetch something from the dictionary based on a code (our coupon_code), and by using dict.get(key, default), we make sure we also cater for the case when the code is not in the dictionary and we need a default value.
Notice that I had to apply some very simple linear algebra in order to calculate the discount properly. Each discount has a percentage and fixed part in the dictionary, represented by a 2-tuple. By applying percent * total + fixed, we get the correct discount. When percent is 0, the formula just gives the fixed amount, and it gives percent * total when fixed is 0. Simple but effective.
This technique is important because it is also used in other contexts, with functions, where it actually becomes much more powerful than what we've seen in the preceding snippet. If it's not completely clear to you how it works, I suggest you take your time and experiment with it. Change values and add print statements to see what's going on while the program is running.
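To give a rough idea of what dispatching with functions can look like, here is a minimal sketch that reuses the coupon codes from the example. Everything else in it (the fixed, percent, no_discount, and discount_for helpers and the discount_rules dictionary) is hypothetical and introduced only for illustration.

```python
def fixed(amount):
    """Build a rule that ignores the total and returns a fixed discount."""
    def rule(total):
        return amount
    return rule


def percent(rate):
    """Build a rule that returns a fraction of the total."""
    def rule(total):
        return total * rate
    return rule


def no_discount(total):
    """Fallback rule for unknown coupon codes."""
    return 0.0


# coupon code -> function that computes the discount from the order total
discount_rules = {
    'F20': fixed(20.0),
    'P30': percent(0.3),
    'P50': percent(0.5),
    'F15': fixed(15.0),
}


def discount_for(code, total):
    """Look up the rule for a code and apply it to the total."""
    rule = discount_rules.get(code, no_discount)
    return rule(total)


print(discount_for('P30', 150))  # 45.0
print(discount_for('F15', 110))  # 15.0
print(discount_for('???', 100))  # 0.0
```

Because each value in the dictionary is now a function, a new kind of coupon only needs a new rule function and a new entry, while the lookup code stays the same.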
A quick peek at the itertools module
A chapter about iterables, iterators, conditional logic, and looping wouldn't be complete without spending a few words about the itertools module. If you are into iterating, this is a kind of heaven.
According to the Python official documentation, the itertools module is:
"A module which implements a number of iterator building blocks inspired by constructs from APL, Haskell, and SML. Each has been recast in a form suitable for Python. The module standardizes a core set of fast, memory efficient tools that are useful by themselves or in combination. Together, they form an "iterator algebra" making it possible to construct specialized tools succinctly and efficiently in pure Python."
By no means do I have the room here to show you all the goodies you can find in this module, so I encourage you to go and check it out for yourself. I promise you'll enjoy it.
In a nutshell, it provides you with three broad categories of iterators. I will give you a very small example of one iterator taken from each one of them, just to make your mouth water a little.
Infinite iterators
Infinite iterators allow you to work with a for loop in a different fashion, as if it were a while loop.
infinite.py
from itertools import count
for n in count(5, 3):
    if n > 20:
        break
    print(n, end=', ')  # instead of newline, comma and space
Running the code gives this:
$ python infinite.py
5, 8, 11, 14, 17, 20,
The count factory class makes an iterator that just goes on and on counting. It starts from 5 and keeps adding 3 to it. We need to manually break it if we don't want to get stuck in an infinite loop.
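A manual break is not the only way to put a bound on an infinite iterator. As a small sketch (not part of infinite.py), the same module offers takewhile and islice, which wrap count in an iterator that ends on its own. The names up_to_20 and first_four are made up for this example.

```python
from itertools import count, islice, takewhile

# stop on a condition: keep values for as long as they are <= 20
up_to_20 = list(takewhile(lambda n: n <= 20, count(5, 3)))
print(up_to_20)    # [5, 8, 11, 14, 17, 20]

# stop after a fixed number of items
first_four = list(islice(count(5, 3), 4))
print(first_four)  # [5, 8, 11, 14]
```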
Iterators terminating on the shortest input sequence
This category is very interesting. It allows you to create an iterator based on multiple iterators, combining their values according to some logic. The key point here is that if any of those iterators is shorter than the rest, the resulting iterator won't break; it will simply stop as soon as the shortest iterator is exhausted. This is very theoretical, I know, so let me give you an example using compress. This iterator gives you back the data according to a corresponding item in a selector being True or False: compress('ABC', (1, 0, 1)) would give back 'A' and 'C', because they correspond to the 1's. Let's see a simple example:
compress.py
from itertools import compress
data = range(10)
even_selector = [1, 0] * 10
odd_selector = [0, 1] * 10

even_numbers = list(compress(data, even_selector))
odd_numbers = list(compress(data, odd_selector))

print(odd_selector)
print(list(data))
print(even_numbers)
print(odd_numbers)
Notice that odd_selector and even_selector are 20 elements long, while data is just 10 elements long. compress will stop as soon as data has yielded its last element. Running this code produces the following:
$ python compress.py
[0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[0, 2, 4, 6, 8]
[1, 3, 5, 7, 9]
It's a very fast and nice way of selecting elements out of an iterable. The code is very simple, just notice that instead of using a for loop to iterate over each value that is given back by the compress calls, we used list(), which does the same, but instead of executing a body of instructions, puts all the values into a list and returns it.
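The stopping rule works in the other direction too. In this small sketch (the names letters, short_selector, picked, and same are made up for illustration), the selector is the shorter input, so compress stops there and the remaining data is simply ignored.

```python
from itertools import compress

letters = 'ABCDEF'
short_selector = [1, 0, 1]  # shorter than letters

picked = list(compress(letters, short_selector))
print(picked)  # ['A', 'C']

# a roughly equivalent spelling with zip, which also stops at the shorter input
same = [item for item, keep in zip(letters, short_selector) if keep]
print(same)    # ['A', 'C']
```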
Combinatoric generators
Last but not least, combinatoric generators. These are really fun, if you are into this kind of thing. Let's just see a simple example on permutations.
According to Wolfram Mathworld:
"A permutation, also called an "arrangement number" or "order", is a rearrangement of the elements of an ordered list S into a one-to-one correspondence with S itself."
For example, the permutations of ABC are 6: ABC, ACB, BAC, BCA, CAB, and CBA.
If a set has N elements, then the number of permutations of them is N! (N factorial). For the string ABC the permutations are 3! = 3 * 2 * 1 = 6. Let's do it in Python:
permutations.py
from itertools import permutations
print(list(permutations('ABC')))
This very short snippet of code produces the following result:
$ python permutations.py
[('A', 'B', 'C'), ('A', 'C', 'B'), ('B', 'A', 'C'), ('B', 'C', 'A'), ('C', 'A', 'B'), ('C', 'B', 'A')]
Be very careful when you play with permutations. Their number is the factorial of the number of elements you're permuting, and that number can get really big, really fast.
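To get a feeling for how fast that number grows, the following sketch prints a few factorials and then peeks at the first permutations of ten elements without building the full list. The variable names and the sizes chosen are just assumptions for illustration.

```python
from itertools import islice, permutations
from math import factorial

# The number of permutations of n distinct elements is n!
for n in (3, 5, 10, 20):
    print(n, factorial(n))

# Counting the permutations of a small input confirms the formula.
count_abcd = sum(1 for _ in permutations('ABCD'))
print(count_abcd == factorial(4))  # True

# permutations() is lazy, so we can take just the first few tuples
# instead of materializing all 10! = 3,628,800 of them with list().
first_three = list(islice(permutations(range(10)), 3))
print(first_three)
```

Because permutations yields its results one at a time, slicing it with islice is a cheap way to explore the output before committing to a full list.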
Infinite iterators
Infinite iterators allow you to work with a for loop in a different fashion, as if it were a while loop.
infinite.py
from itertools import count
for n in count(5, 3):
    if n > 20:
        break
    print(n, end=', ')  # instead of newline, comma and space
Running the code gives this:
$ python infinite.py
5, 8, 11, 14, 17, 20,
The count factory class makes an iterator that just goes on and on counting. It starts from 5 and keeps adding 3 to it. We need to manually break it if we don't want to get stuck in an infinite loop.
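As a possible alternative to the manual break, an infinite iterator can be bounded with other tools from the same module. This is only a sketch: takewhile and islice are real itertools functions, while the names up_to_20 and first_four are made up for this example.

```python
from itertools import count, islice, takewhile

# Stop as soon as the condition fails: same values as the loop with break.
up_to_20 = list(takewhile(lambda n: n <= 20, count(5, 3)))

# Or stop after a fixed number of items, whatever their values are.
first_four = list(islice(count(5, 3), 4))

print(up_to_20)    # [5, 8, 11, 14, 17, 20]
print(first_four)  # [5, 8, 11, 14]
```

Either way, the stopping rule sits next to the iterator itself rather than inside the body of the loop.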
Summary
In this chapter, we've taken another step forward to expand our coding vocabulary. We've seen how to drive the execution of the code by evaluating conditions, and we've seen how to loop and iterate over sequences and collections of objects. This gives us the power to control what happens when our code is run, which means we are getting an idea of how to shape it so that it does what we want and reacts to data that changes dynamically.
We've also seen how to combine everything together in a couple of simple examples, and in the end we have taken a brief look at the itertools module, which is full of interesting iterators that can enrich our abilities with Python even more.
Now it's time to switch gears, to take another step forward and talk about functions. The next chapter is all about them because they are extremely important. Make sure you're comfortable with what has been done up to now: I want to provide you with interesting examples, so I'll have to go a little faster. Ready? Turn the page.