Functional Python Programming, 3rd edition

Use a functional approach to write succinct, expressive, and efficient Python code

Product type Paperback
Published in Dec 2022
Publisher Packt
ISBN-13 9781803232577
Length 576 pages
Edition 3rd Edition
Author (1):
Steven F. Lott


1.6 Exercises

The exercises in this book are based on code available from Packt Publishing on GitHub. See https://github.com/PacktPublishing/Functional-Python-Programming-3rd-Edition.

In some cases, the reader will notice that the code provided on GitHub includes partial solutions to some of the exercises. These serve as hints, allowing the reader to explore alternative solutions.

In many cases, exercises will need unit test cases to confirm they actually solve the problem. These are often identical to the unit test cases already provided in the GitHub repository. The reader will need to replace the book’s example function name with their own solution to confirm that it works.

1.6.1 Convert an imperative algorithm to functional code

The following algorithm is stated as imperative assignment statements and a while construct to indicate processing something iteratively.

Algorithm 1: Imperative iteration

What does this appear to compute? Given Python built-in functions like sum, can this be simplified?

It helps to write this in Python and refactor the code to be sure that correct answers are created.

A test case is the following:

V ← {7.46,6.77,12.74,7.11,7.81,8.84,6.08,5.39,8.15,6.42,5.73}

The computed value for m is approximately 7.5.
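As a starting point for checking a refactoring, the test case can be written as plain assertions. The sketch below assumes the algorithm accumulates a total and a count in a while loop and then divides them, which would make m an arithmetic mean. The listing is not reproduced here, so that reading, and the function names imperative_m() and functional_m(), are assumptions for illustration only.

```python
import math

V = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]


def imperative_m(values: list[float]) -> float:
    """Assumed imperative form: a running total and an index in a while loop."""
    total = 0.0
    i = 0
    while i != len(values):
        total += values[i]
        i += 1
    return total / i


def functional_m(values: list[float]) -> float:
    """Hypothetical functional rewrite built from sum() and len()."""
    return sum(values) / len(values)


# Both forms should agree with each other and with the stated test value.
assert math.isclose(imperative_m(V), functional_m(V))
assert math.isclose(functional_m(V), 7.5, abs_tol=1e-9)
```

Keeping both versions side by side while refactoring makes it easy to confirm that the simplified form still produces the same answer.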

1.6.2 Convert step-wise computation to functional code

The following algorithm is stated as a long series of single assignment statements. The rad(x) function converts degrees to radians, $\mathrm{rad}(d) = \pi \times \frac{d}{180}$. See the math module for an implementation.

Algorithm 2: Imperative computation

Is this code easy to understand? Can you summarize this computation as a short mathematical-looking formula?

Breaking it down into sections, lines 1 to 8 seem to be focused on some conversions, differences, and mid-point computations. Lines 9 to 12 compute two values, x and y. Can these be summarized or simplified? The final four lines do a relatively direct computation of d. Can this be summarized or simplified? As a hint, look at math.hypot() for a function that might be applicable in this case.

It helps to write this in Python and refactor the code.

A test case is the following:

  lat1 ← 32.82950
  lon1 ← −79.93021
  lat2 ← 32.74412
  lon2 ← −79.85226

The computed value for d is approximately 6.4577.

Refactoring the code can help to confirm your understanding.
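One possible shape for a refactored version is sketched below. The algorithm listing is not reproduced here, so this is an assumption rather than the book's solution: it treats the computation as an equirectangular distance approximation with an Earth radius expressed in nautical miles. The names distance() and R_NM are hypothetical. Under those assumptions the result is consistent with the test case above.

```python
import math

# Assumed Earth radius in nautical miles: one minute of arc is one mile.
R_NM = 360 * 60 / (2 * math.pi)


def distance(
    lat1: float, lon1: float, lat2: float, lon2: float, radius: float = R_NM
) -> float:
    """Equirectangular approximation (an assumption about the algorithm)."""
    # math.radians(d) implements rad(d) = pi * d / 180.
    mid_lat = math.radians((lat1 + lat2) / 2)
    x = radius * math.radians(lon2 - lon1) * math.cos(mid_lat)
    y = radius * math.radians(lat2 - lat1)
    # math.hypot(x, y) computes sqrt(x**2 + y**2).
    return math.hypot(x, y)


d = distance(32.82950, -79.93021, 32.74412, -79.85226)
assert abs(d - 6.4577) < 1e-3
```

Collapsing the conversions into math.radians() and the final steps into math.hypot() leaves three intermediate values instead of a long chain of assignments.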

1.6.3 Revise the sqrt() function

The sqrt() function defined in the "A classic example of functional programming" section has only a single parameter, n. Rewrite it as a more flexible version that uses default parameter values, so the starting approximation and the precision can be changed. An expression such as sqrt(1.0, 0.000_01, 3) will start with an approximation of 1.0 and compute the value to a precision of 0.00001. The final argument value, 3, is the value of n, the number whose square root we need to compute.
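A minimal sketch of how default parameter values could be introduced follows. It assumes Newton's method as the approximation technique, and the parameter names a0 and epsilon are hypothetical. Note that Python requires parameters with defaults to come after those without, so this sketch places n first; that ordering differs from the positional order in the expression above and is a design choice of this sketch, not the exercise's required signature.

```python
def sqrt(n: float, a0: float = 1.0, epsilon: float = 0.000_01) -> float:
    """Approximate the square root of n, starting from the guess a0.

    Recurses until two successive approximations differ by at most epsilon.
    """
    a1 = (a0 + n / a0) / 2  # One Newton's-method refinement step.
    if abs(a1 - a0) <= epsilon:
        return a1
    return sqrt(n, a1, epsilon)


# The defaults can be accepted or overridden by keyword.
assert abs(sqrt(3) - 1.7320508) < 1e-4
assert abs(sqrt(3, a0=2.0, epsilon=1e-9) - 1.7320508) < 1e-6
```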

1.6.4 Data cleansing steps

A file of source data has US ZIP codes in a variety of formats. This problem often arises when spreadsheet software is used to collect or transform data.

  • Some ZIP codes were processed as numbers. This doesn’t work out well for places in New England, where ZIP codes have a leading zero. For example, one of Portsmouth, New Hampshire’s codes should be stated as 03801. In the source file, it is 3801. For the most part, these numbers will have five or nine digits, but some codes in New England will be four or eight digits when a single leading zero was dropped. For Puerto Rico, there may be two leading zeroes.

  • Some ZIP codes are stored as strings, 123450100, where a four-digit extension for a post-office box has been appended to the base five-digit code.

A CSV-format file has only text values. However, when data in the file has been processed by a spreadsheet, problems can arise. Because a ZIP code has only digits, it can be treated as numeric data. This means the original data values will have been converted to a number, and then back to a text representation. These conversions will drop the leading zeroes. There are a number of workarounds in various spreadsheet applications to prevent this problem. If they’re not used, the data can have anomalous values that can be cleansed to restore the original representation.
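The loss of leading zeroes can be reproduced in a few lines. This small illustration uses int() and str() to stand in for the spreadsheet's text-to-number-to-text conversion; the function name round_trip() is hypothetical.

```python
def round_trip(zip_text: str) -> str:
    """Mimic a spreadsheet treating an all-digit ZIP code as a number."""
    return str(int(zip_text))


# Portsmouth, New Hampshire loses one leading zero.
assert round_trip("03801") == "3801"
# A nine-digit code with a leading zero comes back with eight digits.
assert round_trip("038011234") == "38011234"
# A code with two leading zeroes comes back with three digits.
assert round_trip("00601") == "601"
# Codes without a leading zero survive unchanged.
assert round_trip("12345") == "12345"
```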

The objective of the exercise is to compute a histogram of the most popular ZIP codes in the source data file. The data must be cleansed to have the following two ZIP formats:

  • Five characters with no post-office box, for example 03801

  • Ten characters with a hyphen, for example 03899-9876

The essential histogram can be done with a collections.Counter object as follows.

from collections import Counter
import csv
from pathlib import Path

DEFAULT_PATH = Path.cwd() / "address.csv"

def main(source_path: Path = DEFAULT_PATH) -> None:
    frequency: Counter[str] = Counter()
    with source_path.open() as source:
        rdr = csv.DictReader(source)
        for row in rdr:
            if "-" in row['ZIP']:
                text_zip = row['ZIP']
                missing_zeroes = 10 - len(text_zip)
                if missing_zeroes:
                    text_zip = missing_zeroes*'0' + text_zip
            else:
                text_zip = row['ZIP']
                if 5 < len(row['ZIP']) < 9:
                    missing_zeroes = 9 - len(text_zip)
                else:
                    missing_zeroes = 5 - len(text_zip)
                if missing_zeroes:
                    text_zip = missing_zeroes*'0' + text_zip
            frequency[text_zip] += 1
    print(frequency)

if __name__ == "__main__":
    main()

This makes use of imperative processing features to read a file. The overall design, using a for statement to process rows of a file, is an essential Pythonic feature that we can preserve.

On the other hand, the processing of the text_zip and missing_zeroes variables through a number of state changes seems like it’s a potential source for confusion.

This can be refactored through several rewrites:

  1. Decompose the main() function into two parts. A new zip_histogram() function should be written to contain much of the processing detail. This function will process the opened file, and return a Counter object. A suggested signature is the following:

        def zip_histogram( 
                reader: csv.DictReader[str]) -> Counter[str]: 
            pass

    The main() function is left with the responsibility to open the file, create the csv.DictReader instance, evaluate zip_histogram(), and print the histogram.

  2. Once the zip_histogram() function has been defined, the cleansing of the ZIP attribute can be refactored into a separate function, with a name like zip_cleanse(). Rather than setting the value of the text_zip variable, this function can return the cleansed result. This can be tested separately to be sure the various cases are handled gracefully.

  3. The distinction between long ZIP codes with a hyphen and without a hyphen is something that should be fixed. Once the zip_cleanse() works in general, add a new function to inject hyphens into ZIP codes with only digits. This should transform 38011234 to 03801-1234. Note that short, five-digit ZIP codes do not need to have a hyphen added; this additional transformation only applies to nine-digit codes to make them into ten-position strings.

The final zip_histogram() function should look something like the following:

def zip_histogram(
        reader: csv.DictReader[str]) -> Counter[str]:
    return Counter(
        zip_cleanse(
            row['ZIP']
        ) for row in reader
    )

This provides a framework for performing a focused data cleanup in the given column. It allows us to distinguish between CSV and file processing features, and the details of how to clean up a specific column of data.
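The refactoring steps above might come together roughly as follows. This is one sketch, not the book's solution: pad_zip() and add_hyphen() are hypothetical helper names, and the sample rows stand in for the contents of the address.csv file.

```python
from collections import Counter


def pad_zip(text: str) -> str:
    """Restore leading zeroes that a numeric conversion dropped."""
    if "-" in text:
        return text.zfill(10)  # Five digits, a hyphen, four digits.
    # More than five digits implies a nine-digit code with an extension.
    return text.zfill(9 if len(text) > 5 else 5)


def add_hyphen(text: str) -> str:
    """Inject a hyphen into nine-digit codes; leave other forms alone."""
    if len(text) == 9 and text.isdigit():
        return f"{text[:5]}-{text[5:]}"
    return text


def zip_cleanse(raw: str) -> str:
    """Return the cleansed ZIP code instead of mutating a variable."""
    return add_hyphen(pad_zip(raw.strip()))


# Sample rows, shaped like the dictionaries a csv.DictReader yields.
rows = [{"ZIP": "3801"}, {"ZIP": "03801"}, {"ZIP": "38011234"},
        {"ZIP": "3899-9876"}, {"ZIP": "123450100"}]
histogram = Counter(zip_cleanse(row["ZIP"]) for row in rows)

assert histogram["03801"] == 2
assert histogram["03801-1234"] == 1
```

Because each helper returns a new value, every case can be tested on its own without opening a file.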

1.6.5 (Advanced) Optimize this functional code

The following algorithm is stated as a single "step" that has been decomposed into three separate formulae. The decomposition is more a concession to the need to fit the expression into the limits of a printed page than a useful optimization. The rad(x) function converts degrees to radians, $\mathrm{rad}(d) = \pi \times \frac{d}{180}$.

Algorithm 3: Redundant expressions

There are a number of redundant expressions, like rad(lat1) and rad(lat2). If these are assigned to local variables, can the expression be simplified?

The final computation of d does not match the conventional understanding of computing a hypotenuse, $\sqrt{x^2 + y^2}$. Should the code be refactored to match the definition in math.hypot?
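For reference, the conventional hypotenuse and math.hypot() agree, as the short check below shows. The function name manual_hypot() is hypothetical, and the values 3 and 4 are arbitrary illustration data, not part of the exercise.

```python
import math


def manual_hypot(x: float, y: float) -> float:
    """The conventional definition: the square root of x squared plus y squared."""
    return math.sqrt(x**2 + y**2)


# math.hypot() computes the same Euclidean norm in a single call.
assert math.isclose(manual_hypot(3.0, 4.0), math.hypot(3.0, 4.0))
assert math.isclose(math.hypot(3.0, 4.0), 5.0)
```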

It helps to start by writing this in Python and then refactoring the code.

A test case is the following:

  lat1 ← 32.82950
  lon1 ← −79.93021
  lat2 ← 32.74412
  lon2 ← −79.85226

The computed value for d is approximately 6.4577.

Refactoring the code can help to confirm your understanding of what this code really does.
