Functions are like mini programs within our programs that allow us to break code into smaller units. This spares us from having to write duplicate code, which can introduce bugs. But writing effective functions requires making many decisions about naming, size, parameters, and complexity.
This chapter explores the different ways we can write functions and the benefits and drawbacks of different trade-offs. We’ll delve into how to trade off between small and large functions, how the number of parameters affects the function’s complexity, and how to write functions with variable numbers of arguments using the * and ** operators. We’ll also explore the functional programming paradigm and the benefits of writing functions according to this paradigm.
Function names should follow the same convention we use for identifiers in general, as described in Chapter 4. But they should usually include a verb, because functions typically perform some action. You might also include a noun to describe the thing being acted on. For example, the names refreshConnection(), setPassword(), and extract_version() clarify what the function does and to what.
You might not need a noun for methods that are part of a class or module. A reset() method in a SatelliteConnection class or an open() function in the webbrowser module already provides the necessary context. You can tell that a satellite connection is the item being reset and that a web browser is the item being opened.
It’s better to use long, descriptive names rather than an acronym or a name that’s too short. A mathematician might immediately understand that a function named gcd() returns the greatest common divisor of two numbers, but everyone else would find getGreatestCommonDivisor() more informative.
Remember not to use any of Python’s built-in function or module names, such as all, any, date, email, file, format, hash, id, input, list, min, max, object, open, random, set, str, sum, test, and type.
Some programmers say that functions should be as short as possible and no longer than what can fit on a single screen. A function that is only a dozen lines long is relatively easy to understand, at least compared to one that is hundreds of lines long. But making functions shorter by splitting up their code into multiple smaller functions can also have its downsides. Let’s look at some of the advantages of small functions:
But there are also some disadvantages to short functions:
Some people take the guideline “the shorter, the better” to an extreme and claim that all functions should be three or four lines of code at most. This is madness. For example, here’s the getPlayerMove() function from Chapter 14’s Tower of Hanoi game. The specifics of how this code works are unimportant. Just look at the function’s general structure:
def getPlayerMove(towers):
    """Asks the player for a move. Returns (fromTower, toTower)."""

    while True:  # Keep asking player until they enter a valid move.
        print('Enter the letters of "from" and "to" towers, or QUIT.')
        print("(e.g. AB moves a disk from tower A to tower B.)")
        print()
        response = input("> ").upper().strip()

        if response == "QUIT":
            print("Thanks for playing!")
            sys.exit()

        # Make sure the user entered valid tower letters:
        if response not in ("AB", "AC", "BA", "BC", "CA", "CB"):
            print("Enter one of AB, AC, BA, BC, CA, or CB.")
            continue  # Ask player again for their move.

        # Use more descriptive variable names:
        fromTower, toTower = response[0], response[1]

        if len(towers[fromTower]) == 0:
            # The "from" tower cannot be an empty tower:
            print("You selected a tower with no disks.")
            continue  # Ask player again for their move.
        elif len(towers[toTower]) == 0:
            # Any disk can be moved onto an empty "to" tower:
            return fromTower, toTower
        elif towers[toTower][-1] < towers[fromTower][-1]:
            print("Can't put larger disks on top of smaller ones.")
            continue  # Ask player again for their move.
        else:
            # This is a valid move, so return the selected towers:
            return fromTower, toTower
This function is 34 lines long. Although it covers multiple tasks, including allowing the player to enter a move, checking whether this move is valid, and asking the player again to enter a move if the move is invalid, these tasks all fall under the umbrella of getting the player’s move. On the other hand, if we were devoted to writing short functions, we could break the code in getPlayerMove() into smaller functions, like this:
def getPlayerMove(towers):
    """Asks the player for a move. Returns (fromTower, toTower)."""

    while True:  # Keep asking player until they enter a valid move.
        response = askForPlayerMove()
        terminateIfResponseIsQuit(response)
        if not isValidTowerLetters(response):
            continue  # Ask player again for their move.

        # Use more descriptive variable names:
        fromTower, toTower = response[0], response[1]

        if towerWithNoDisksSelected(towers, fromTower):
            continue  # Ask player again for their move.
        elif len(towers[toTower]) == 0:
            # Any disk can be moved onto an empty "to" tower:
            return fromTower, toTower
        elif largerDiskIsOnSmallerDisk(towers, fromTower, toTower):
            continue  # Ask player again for their move.
        else:
            # This is a valid move, so return the selected towers:
            return fromTower, toTower

def askForPlayerMove():
    """Prompt the player, and return which towers they select."""
    print('Enter the letters of "from" and "to" towers, or QUIT.')
    print("(e.g. AB moves a disk from tower A to tower B.)")
    print()
    return input("> ").upper().strip()

def terminateIfResponseIsQuit(response):
    """Terminate the program if response is 'QUIT'"""
    if response == "QUIT":
        print("Thanks for playing!")
        sys.exit()

def isValidTowerLetters(towerLetters):
    """Return True if `towerLetters` is valid."""
    if towerLetters not in ("AB", "AC", "BA", "BC", "CA", "CB"):
        print("Enter one of AB, AC, BA, BC, CA, or CB.")
        return False
    return True

def towerWithNoDisksSelected(towers, selectedTower):
    """Return True if `selectedTower` has no disks."""
    if len(towers[selectedTower]) == 0:
        print("You selected a tower with no disks.")
        return True
    return False

def largerDiskIsOnSmallerDisk(towers, fromTower, toTower):
    """Return True if a larger disk would move on a smaller disk."""
    if towers[toTower][-1] < towers[fromTower][-1]:
        print("Can't put larger disks on top of smaller ones.")
        return True
    return False
These six functions are 56 lines long, about 65 percent longer than the original 34 lines, but they do the same tasks. Although each function is easier to understand than the original getPlayerMove() function, the group of them together represents an increase in complexity. Readers of your code might have trouble understanding how they all fit together. The getPlayerMove() function is the only one called by other parts of the program; the other five functions are called only once, from getPlayerMove(). But the mass of functions doesn’t convey this fact.
I also had to come up with new names and docstrings (the triple-quoted strings under each def statement, further explained in Chapter 11) for each new function. This leads to functions with confusingly similar names, such as getPlayerMove() and askForPlayerMove(). Also, getPlayerMove() is still longer than three or four lines, so if I were following the guideline “the shorter, the better,” I’d need to split it into even smaller functions!
In this case, the policy of allowing only incredibly short functions might have resulted in simpler functions, but the overall complexity of the program increased drastically. In my opinion, functions should be fewer than 30 lines ideally and definitely no longer than 200 lines. Make your functions as short as reasonably possible but not any shorter.
A function’s parameters are the variable names between the parentheses of the function’s def statement, whereas the arguments are the values between a function call’s parentheses. The more parameters a function has, the more configurable and generalized its code can be. But more parameters also mean greater complexity.
A good rule to adhere to is that zero to three parameters is fine, but more than five or six is probably too many. Once functions become overly complicated, it’s best to consider how to split them into smaller functions with fewer parameters.
One way to reduce the complexity of your function’s parameters is by providing default arguments for your parameters. A default argument is a value used as an argument if the function call doesn’t specify one. If the majority of function calls use a particular parameter value, we can make that value a default argument to avoid having to enter it repeatedly in the function call.
We specify a default argument in the def statement, following the parameter name and an equal sign. For example, in this introduction() function, a parameter named greeting has the value 'Hello' if the function call doesn’t specify it:
>>> def introduction(name, greeting='Hello'):
...     print(greeting + ', ' + name)
...
>>> introduction('Alice')
Hello, Alice
>>> introduction('Hiro', 'Ohayo gozaimasu')
Ohayo gozaimasu, Hiro
When the introduction() function is called without a second argument, it uses the string 'Hello' by default. Note that parameters with default arguments must always come after parameters without default arguments.
Recall from Chapter 8 that you should avoid using a mutable object, such as an empty list [] or empty dictionary {}, as the default value. “Don’t Use Mutable Values for Default Arguments” on page 143 explains the problem that this approach causes and its solution.
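As a brief reminder of that pitfall, here is a minimal sketch. The function names addIngredientBuggy() and addIngredient() are hypothetical and exist only for illustration. The first version shares one list across every call that omits the second argument. The second version uses None as the default and creates a fresh list inside the function, which is a common workaround.

```python
def addIngredientBuggy(ingredient, sandwich=[]):
    # The default list is created once, when the def statement runs,
    # so every call that omits `sandwich` appends to the same list.
    sandwich.append(ingredient)
    return sandwich


def addIngredient(ingredient, sandwich=None):
    # None is immutable, so it is safe as a default. A new list is
    # created on each call that omits `sandwich`.
    if sandwich is None:
        sandwich = []
    sandwich.append(ingredient)
    return sandwich


firstBuggy = addIngredientBuggy('avocado')
secondBuggy = addIngredientBuggy('tomato')
print(secondBuggy)  # Both ingredients appear, because the list is shared.

first = addIngredient('avocado')
second = addIngredient('tomato')
print(first, second)  # Each call received its own list.
```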
You can use the * and ** syntax (often pronounced as star and star star) to pass the items of a collection to a function as separate arguments. The * syntax allows you to pass in the items in an iterable object (such as a list or tuple) as individual positional arguments. The ** syntax allows you to pass in the key-value pairs in a mapping object (such as a dictionary) as individual keyword arguments.
For example, the print() function can take multiple arguments. It places a space in between them by default, as the following code shows:
>>> print('cat', 'dog', 'moose')
cat dog moose
These arguments are called positional arguments, because their position in the function call determines which argument is assigned to which parameter. But if you stored these strings in a list and tried to pass the list, the print() function would think you were trying to print the list as a single value:
>>> args = ['cat', 'dog', 'moose']
>>> print(args)
['cat', 'dog', 'moose']
Passing the list to print() displays the list, including brackets, quotes, and comma characters.
One way to print the individual items in the list would be to pass each item to the function separately by its index, resulting in code that is harder to read:
>>> # An example of less readable code:
>>> args = ['cat', 'dog', 'moose']
>>> print(args[0], args[1], args[2])
cat dog moose
There’s an easier way to pass these items to print(). You can use the * syntax to interpret the items in a list (or any other iterable data type) as individual positional arguments. Enter the following example into the interactive shell:
>>> args = ['cat', 'dog', 'moose']
>>> print(*args)
cat dog moose
The * syntax allows you to pass the list items to a function individually, no matter how many items are in the list.
You can use the ** syntax to pass mapping data types (such as dictionaries) as individual keyword arguments. Keyword arguments are preceded by a parameter name and equal sign in the function call. For example, the print() function has a sep keyword argument that specifies a string to put in between the arguments it displays. It’s set to a single space string ' ' by default. You can give a keyword argument a different value either by writing it out in the function call, as in sep='-', or by using the ** syntax. To see how this works, enter the following into the interactive shell:
>>> print('cat', 'dog', 'moose', sep='-')
cat-dog-moose
>>> kwargsForPrint = {'sep': '-'}
>>> print('cat', 'dog', 'moose', **kwargsForPrint)
cat-dog-moose
Notice that these instructions produce identical output. In the example, we used only one line of code to set up the kwargsForPrint dictionary. But for more complex cases, you might need more code to set up a dictionary of keyword arguments. The ** syntax allows you to create a custom dictionary of configuration settings to pass to a function call. This is useful especially for functions and methods that accept a large number of keyword arguments.
By modifying a list or dictionary at runtime, you can supply a variable number of arguments for a function call using the * and ** syntax.
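The following sketch illustrates this idea. The describePet() function and its parameters are hypothetical, made up for this example. The same call expression, describePet(*args, **kwargs), passes one argument the first time and three the second time, because the list and dictionary are changed in between.

```python
def describePet(name, species='cat', sound='meow'):
    """Hypothetical function with one required and two optional parameters."""
    return name + ' the ' + species + ' says ' + sound


args = ['Zophie']
kwargs = {}

# With one list item and an empty dictionary, only `name` is passed.
firstResult = describePet(*args, **kwargs)
print(firstResult)  # Zophie the cat says meow

# Change the collections while the program runs.
args.append('dog')
kwargs['sound'] = 'woof'

# The identical call now passes two positional and one keyword argument.
secondResult = describePet(*args, **kwargs)
print(secondResult)  # Zophie the dog says woof
```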
You can also use the * syntax in def statements to create variadic or varargs functions that receive a varying number of positional arguments. For instance, print() is a variadic function, because you can pass any number of strings to it: print('Hello!') or print('My name is', name), for example. Note that although we used the * syntax in function calls in the previous section, we use the * syntax in function definitions in this section.
Let’s look at an example by creating a product() function that takes any number of arguments and multiplies them together:
>>> def product(*args):
...     result = 1
...     for num in args:
...         result *= num
...     return result
...
>>> product(3, 3)
9
>>> product(2, 1, 2, 3)
12
Inside the function, args is just a regular Python tuple containing all the positional arguments. Technically, you can name this parameter anything, as long as it begins with the star (*), but it’s usually named args by convention.
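To see that args really is a tuple, and how the * syntax in a function call interacts with the * syntax in a def statement, consider this small sketch. The describeArgs() function is a hypothetical helper written only for this demonstration.

```python
def describeArgs(*args):
    """Hypothetical helper: report the type name and length of args."""
    return type(args).__name__, len(args)


noArguments = describeArgs()
threeArguments = describeArgs(3, 4, 5)
oneListArgument = describeArgs([3, 4, 5])  # The list is a single argument.
unpackedList = describeArgs(*[3, 4, 5])    # The * spreads the list items out.

print(noArguments)      # ('tuple', 0)
print(threeArguments)   # ('tuple', 3)
print(oneListArgument)  # ('tuple', 1)
print(unpackedList)     # ('tuple', 3)
```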
Knowing when to use the * takes some thought. After all, the alternative to making a variadic function is to have a single parameter that accepts a list (or other iterable data type), which contains a varying number of items. This is what the built-in sum() function does:
>>> sum([2, 1, 2, 3])
8
The sum() function expects one iterable argument, so passing it multiple arguments results in an exception:
>>> sum(2, 1, 2, 3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sum() takes at most 2 arguments (4 given)
Meanwhile, the built-in min() and max() functions, which find the minimum or maximum value of several values, accept a single iterable argument or multiple separate arguments:
>>> min([2, 1, 3, 5, 8])
1
>>> min(2, 1, 3, 5, 8)
1
>>> max([2, 1, 3, 5, 8])
8
>>> max(2, 1, 3, 5, 8)
8
All of these functions take a varying number of arguments, so why are their parameters designed differently? And when should we design functions to take a single iterable argument or multiple separate arguments using the * syntax?
How we design our parameters depends on how we predict a programmer will use our code. The print() function takes multiple arguments because programmers more often pass a series of strings, or variables that contain strings, to it, as in print('My name is', name). It isn’t as common to collect these strings into a list over several steps and then pass the list to print(). Also, if you passed a list to print(), the function would print that list value in its entirety, so you can’t use it to print the individual values in the list.
There’s no reason to call sum() with separate arguments because Python already uses the + operator for that. Because you can write code like 2 + 4 + 8, you don’t need to be able to write code like sum(2, 4, 8). It makes sense that you must pass the varying number of arguments only as a list to sum().
The min() and max() functions allow both styles. If the programmer passes one argument, the function assumes it’s a list or tuple of values to inspect. If the programmer passes multiple arguments, it assumes these are the values to inspect. These two functions commonly handle lists of values while the program is running, as in the function call min(allExpenses). They also deal with separate arguments the programmer selects while writing the code, such as in max(0, someNumber). Therefore, the functions are designed to accept both kinds of arguments. The following myMinFunction(), which is my own implementation of the min() function, demonstrates this:
def myMinFunction(*args):
    if len(args) == 1:
        values = args[0]  # (1)
    else:
        values = args  # (2)

    if len(values) == 0:
        raise ValueError('myMinFunction() args is an empty sequence')  # (3)

    for i, value in enumerate(values):  # (4)
        if i == 0 or value < smallestValue:
            smallestValue = value
    return smallestValue
The myMinFunction() function uses the * syntax to accept a varying number of arguments as a tuple. If this tuple contains only one value, we assume it’s a sequence of values to inspect (1). Otherwise, we assume that args is a tuple of values to inspect (2). Either way, the values variable will contain a sequence of values for the rest of the code to inspect. Like the actual min() function, we raise ValueError if the caller didn’t pass any arguments or passed an empty sequence (3). The rest of the code loops through values and returns the smallest value found (4). To keep this example simple, myMinFunction() accepts only sequences like lists or tuples rather than any iterable value.
You might wonder why we don’t always write functions to accept both ways of passing a varying number of arguments. The answer is that it’s best to keep your functions as simple as possible. Unless both ways of calling the function are common, choose one over the other. If a function usually deals with a data structure created while the program is running, it’s better to have it accept a single parameter. If a function usually deals with arguments that the programmer specifies while writing the code, it’s better to use the * syntax to accept a varying number of arguments.
Variadic functions can use the ** syntax, too. Although the * syntax in def statements represents a varying number of positional arguments, the ** syntax represents a varying number of optional keyword arguments.
If you define a function that could take numerous optional keyword arguments without using the ** syntax, your def statement could become unwieldy. Consider a hypothetical formMolecule() function, which has parameters for all 118 known elements:
>>> def formMolecule(hydrogen, helium, lithium, beryllium, boron, --snip--
Passing 2 for the hydrogen parameter and 1 for the oxygen parameter to return 'water' would also be burdensome and unreadable, because you’d have to set all of the irrelevant elements to zero:
>>> formMolecule(2, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0 --snip--
'water'
You could make the function more manageable by using named keyword parameters that each have a default argument, freeing you from having to pass that parameter an argument in a function call.
For example, this def statement has default arguments of 0 for each of the keyword parameters:
>>> def formMolecule(hydrogen=0, helium=0, lithium=0, beryllium=0, --snip--
This makes calling formMolecule() easier, because you only need to specify arguments for parameters that have a different value than the default argument. You can also specify the keyword arguments in any order:
>>> formMolecule(hydrogen=2, oxygen=1)
'water'
>>> formMolecule(oxygen=1, hydrogen=2)
'water'
>>> formMolecule(carbon=8, hydrogen=10, nitrogen=4, oxygen=2)
'caffeine'
But you still have an unwieldy def statement with 118 parameter names. And what if new elements were discovered? You’d have to update the function’s def statement along with any documentation of the function’s parameters.
Instead, you can collect all the parameters and their arguments as key-value pairs in a dictionary using the ** syntax for keyword arguments. Technically, you can name the ** parameter anything, but it’s usually named kwargs by convention:
>>> def formMolecules(**kwargs):
...     if (len(kwargs) == 2 and kwargs.get('hydrogen') == 2
...             and kwargs.get('oxygen') == 1):
...         return 'water'
...     # (rest of code for the function goes here)
...
>>> formMolecules(hydrogen=2, oxygen=1)
'water'
The ** syntax indicates that the kwargs parameter can handle all keyword arguments passed in a function call. They’ll be stored as key-value pairs in a dictionary assigned to the kwargs parameter. As new chemical elements are discovered, you’d need to update the function’s code but not its def statement, because all keyword arguments are put into kwargs:
>>> def formMolecules(**kwargs):  # (1)
...     if len(kwargs) == 1 and kwargs.get('unobtanium') == 12:  # (2)
...         return 'aether'
...     # (rest of code for the function goes here)
...
>>> formMolecules(unobtanium=12)
'aether'
As you can see, the def statement (1) is the same as before, and only the function’s code (2) needed updating. When you use the ** syntax, the def statement and the function calls become much simpler to write and still produce readable code.
A common use case for the * and ** syntax in def statements is to create wrapper functions, which pass on arguments to another function and return that function’s return value. You can use the * and ** syntax to forward any and all arguments to the wrapped function. For example, we can create a printLower() function that wraps the built-in print() function. It relies on print() to do the real work but converts the string arguments to lowercase first:
>>> def printLower(*args, **kwargs):  # (1)
...     args = list(args)  # (2)
...     for i, value in enumerate(args):
...         args[i] = str(value).lower()
...     return print(*args, **kwargs)  # (3)
...
>>> name = 'Albert'
>>> printLower('Hello,', name)
hello, albert
>>> printLower('DOG', 'CAT', 'MOOSE', sep=', ')
dog, cat, moose
The printLower() function (1) uses the * syntax to accept a varying number of positional arguments in a tuple assigned to the args parameter, whereas the ** syntax assigns any keyword arguments to a dictionary in the kwargs parameter. If a function uses *args and **kwargs together, the *args parameter must come before the **kwargs parameter. We pass these on to the wrapped print() function, but first our function modifies some of the arguments, so we create a list form of the args tuple (2).
After changing the strings in args to lowercase, we pass the items in args and key-value pairs in kwargs as separate arguments to print() using the * and ** syntax (3). The return value of print() also gets returned as the return value of printLower(). These steps effectively wrap the print() function.
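Because print() always returns None, a wrapper around it doesn’t show the forwarded return value very clearly. As a second sketch, here is a hypothetical sortedIgnoringCase() wrapper (the name and the default it supplies are assumptions made for this example) around the built-in sorted() function. It fills in a default key keyword argument, forwards everything else untouched, and returns whatever sorted() returns.

```python
def sortedIgnoringCase(*args, **kwargs):
    """Hypothetical wrapper around sorted() that ignores case by default."""
    # Supply a default key function unless the caller passed their own.
    kwargs.setdefault('key', str.lower)
    # Forward all arguments and hand back sorted()'s return value.
    return sorted(*args, **kwargs)


names = ['bob', 'alice', 'Carol']
print(sorted(names))                            # ['Carol', 'alice', 'bob']
print(sortedIgnoringCase(names))                # ['alice', 'bob', 'Carol']
print(sortedIgnoringCase(names, reverse=True))  # ['Carol', 'bob', 'alice']
```

The wrapper never needs to know which keyword arguments sorted() supports. Arguments such as reverse pass straight through in kwargs.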
Functional programming is a programming paradigm that emphasizes writing functions that perform calculations without modifying global variables or any external state (such as files on the hard drive, internet connections, or databases). Some programming languages, such as Erlang, Lisp, and Haskell, are heavily designed around functional programming concepts. Although not shackled to the paradigm, Python has some functional programming features. The main ones that Python programs can use are side-effect-free functions, higher-order functions, and lambda functions.
Side effects are any changes a function makes to the parts of the program that exist outside of its own code and local variables. To illustrate this, let’s create a subtract() function that implements Python’s subtraction operator (-):
>>> def subtract(number1, number2):
...     return number1 - number2
...
>>> subtract(123, 987)
-864
This subtract() function has no side effects. That is, it doesn’t affect anything in the program that isn’t a part of its code. There’s no way to tell from the program’s or the computer’s state whether the subtract() function has been called once, twice, or a million times before. A function might modify local variables inside the function, but these changes remain isolated from the rest of the program.
Now consider an addToTotal() function, which adds the numeric argument to a global variable named TOTAL:
>>> TOTAL = 0
>>> def addToTotal(amount):
...     global TOTAL
...     TOTAL += amount
...     return TOTAL
...
>>> addToTotal(10)
10
>>> addToTotal(10)
20
>>> addToTotal(9999)
10019
>>> TOTAL
10019
The addToTotal() function does have a side effect, because it modifies an element that exists outside of the function: the TOTAL global variable. Side effects can be more than changes to global variables. They include updating or deleting files, printing text onscreen, opening a database connection, authenticating to a server, or making any other change outside of the function. Any trace that a function call leaves behind after returning is a side effect.
Side effects can also include making in-place changes to mutable objects referred to outside of the function. For example, the following removeLastCatFromList() function modifies the list argument in-place:
>>> def removeLastCatFromList(petSpecies):
...     if len(petSpecies) > 0 and petSpecies[-1] == 'cat':
...         petSpecies.pop()
...
>>> myPets = ['dog', 'cat', 'bird', 'cat']
>>> removeLastCatFromList(myPets)
>>> myPets
['dog', 'cat', 'bird']
In this example, the myPets variable and petSpecies parameter hold references to the same list. Any in-place modifications made to the list object inside the function would also exist outside the function, making this modification a side effect.
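For comparison, here is one possible side-effect-free variant. The name withoutLastCat() is hypothetical. Instead of changing the list it receives, the function builds and returns a new list, so the caller’s list stays as it was.

```python
def withoutLastCat(petSpecies):
    """Return a new list with a trailing 'cat' removed, if there is one."""
    if len(petSpecies) > 0 and petSpecies[-1] == 'cat':
        return petSpecies[:-1]  # Slicing creates a new list.
    return list(petSpecies)  # Return a copy rather than the original list.


myPets = ['dog', 'cat', 'bird', 'cat']
fewerPets = withoutLastCat(myPets)
print(fewerPets)  # ['dog', 'cat', 'bird']
print(myPets)     # ['dog', 'cat', 'bird', 'cat']
```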
A related concept, a deterministic function, always returns the same return value given the same arguments. The subtract(123, 987) function call always returns -864. Python’s built-in round() function always returns 3 when passed 3.14 as an argument. A nondeterministic function won’t always return the same values when passed the same arguments. For example, calling random.randint(1, 10) returns a random integer between 1 and 10. The time.time() function has no arguments, but it returns a different value depending on what your computer’s clock is set to when the function is called. In the case of time.time(), the clock is an external resource that is effectively an input into the function the same way an argument is. Functions that depend on resources external to the function (including global variables, files on the hard drive, databases, and internet connections) are not considered deterministic.
One benefit of deterministic functions is that you can cache their values. There’s no need for subtract() to calculate the difference of 123 and 987 more than once if it can remember the return value from the first time it’s called with those arguments. Therefore, deterministic functions allow us to make a space-time trade-off, quickening the runtime of a function by using space in memory to cache previous results.
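One way to get this kind of caching, sketched here as an illustration rather than as the only approach, is the lru_cache decorator in the standard library’s functools module. The decorated function’s cache_info() method reports how many calls were answered from the cache (hits) and how many had to run the function’s code (misses).

```python
import functools


@functools.lru_cache(maxsize=None)
def subtract(number1, number2):
    return number1 - number2


first = subtract(123, 987)   # Calculated, then stored in the cache.
second = subtract(123, 987)  # Returned from the cache.

info = subtract.cache_info()
print(first, second)            # -864 -864
print(info.hits, info.misses)   # 1 1
```

Caching like this is only safe because subtract() is deterministic and has no side effects. A cached call skips the function’s code entirely.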
A function that is deterministic and free of side effects is called a pure function. Functional programmers strive to create only pure functions in their programs. In addition to those already noted, pure functions offer several benefits:
You can and should write pure functions in Python whenever possible. Python functions are made pure by convention only; there’s no setting that causes the Python interpreter to enforce purity. The most common way to make your functions pure is to avoid using global variables in them and ensure they don’t interact with files, the internet, the system clock, random numbers, or other external resources.
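As a sketch of that advice, the following hypothetical functions (isExpiredImpure() and isExpired() are names invented for this example) check whether a deadline has passed. The first reads the system clock itself. The second receives the current time as an argument, which leaves the clock access to the caller and makes the function pure.

```python
import time


def isExpiredImpure(expiryTime):
    # Nondeterministic: the result depends on the system clock.
    return time.time() >= expiryTime


def isExpired(expiryTime, currentTime):
    # Pure: the result depends only on the two arguments.
    return currentTime >= expiryTime


# The caller reads the clock once and passes the value in.
print(isExpired(1000.0, time.time()))

# With fixed arguments, the pure version always gives the same answer.
print(isExpired(1000.0, 999.0))   # False
print(isExpired(1000.0, 1000.0))  # True
```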
Higher-order functions can accept other functions as arguments or return functions as return values. For example, let’s define a function named callItTwice() that will call a given function twice:
>>> def callItTwice(func, *args, **kwargs):
...     func(*args, **kwargs)
...     func(*args, **kwargs)
...
>>> callItTwice(print, 'Hello, world!')
Hello, world!
Hello, world!
The callItTwice() function works with any function it’s passed. In Python, functions are first-class objects, meaning they’re like any other object: you can store functions in variables, pass them as arguments, or use them as return values.
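The callItTwice() example accepts a function as an argument. To illustrate the other kind of higher-order function, one that returns a function, here is a small sketch. The makeMultiplier() function is hypothetical. Each call creates and returns a new function that remembers the factor it was created with.

```python
def makeMultiplier(factor):
    """Return a new function that multiplies its argument by factor."""
    def multiply(number):
        return number * factor
    return multiply  # Return the function object itself, without calling it.


double = makeMultiplier(2)
triple = makeMultiplier(3)
print(double(21))  # 42
print(triple(21))  # 63
```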
Lambda functions, also known as anonymous functions or nameless functions, are simplified functions that have no names and whose code consists solely of a single expression, which serves as the return value. We often use lambda functions when passing functions as arguments to other functions.
For example, we could create a normal function that accepts a list containing a rectangle’s width and height and returns the rectangle’s perimeter, shown here with a 4 by 10 rectangle:
>>> def rectanglePerimeter(rect):
...     return (rect[0] * 2) + (rect[1] * 2)
...
>>> myRectangle = [4, 10]
>>> rectanglePerimeter(myRectangle)
28
The equivalent lambda function would look like this:
lambda rect: (rect[0] * 2) + (rect[1] * 2)
To define a Python lambda function, use the lambda keyword, followed by a comma-delimited list of parameters (if any), a colon, and then an expression that acts as the return value. Because functions are first-class objects, you can assign a lambda function to a variable, effectively replicating what a def statement does:
>>> rectanglePerimeter = lambda rect: (rect[0] * 2) + (rect[1] * 2)
>>> rectanglePerimeter([4, 10])
28
We assigned this lambda function to a variable named rectanglePerimeter, essentially giving us a rectanglePerimeter() function. As you can see, functions created by lambda expressions behave the same as functions created by def statements.
The lambda function syntax is helpful for specifying small functions to serve as arguments to other function calls. For example, the sorted() function has a keyword argument named key that lets you specify a function. Instead of sorting items in a list based on the item’s value, it sorts them based on the function’s return value. In the following example, we pass sorted() a lambda function that returns the perimeter of the given rectangle. This makes the sorted() function sort based on the calculated perimeter of its [width, height] list rather than based directly on the [width, height] list:
>>> rects = [[10, 2], [3, 6], [2, 4], [3, 9], [10, 7], [9, 9]]
>>> sorted(rects, key=lambda rect: (rect[0] * 2) + (rect[1] * 2))
[[2, 4], [3, 6], [10, 2], [3, 9], [10, 7], [9, 9]]
Rather than sorting the values [10, 2] or [3, 6], for example, the function now sorts based on the returned perimeter integers 24 and 18. Lambda functions are a convenient syntactic shortcut: you can specify a small one-line lambda function instead of defining a new, named function with a def statement.
In earlier Python versions, the map() and filter() functions were common higher-order functions that could transform and filter lists, often with the help of lambda functions. Mapping could create a list of values based on the values of another list. Filtering could create a list that contained only the values from another list that match some criteria.
For example, if you wanted to create a new list that had strings instead of the integers [8, 16, 18, 19, 12, 1, 6, 7], you could pass that list and lambda n: str(n) to the map() function:
>>> mapObj = map(lambda n: str(n), [8, 16, 18, 19, 12, 1, 6, 7])
>>> list(mapObj)
['8', '16', '18', '19', '12', '1', '6', '7']
The map() function returns a map object, which we can get in list form by passing it to the list() function. The mapped list now contains string values based on the original list’s integer values. The filter() function is similar, but here, the lambda function argument determines which items in the list remain (if the lambda function returns True) or are filtered out (if it returns False). For example, we could pass lambda n: n % 2 == 0 to filter out any odd integers:
>>> filterObj = filter(lambda n: n % 2 == 0, [8, 16, 18, 19, 12, 1, 6, 7])
>>> list(filterObj)
[8, 16, 18, 12, 6]
The filter() function returns a filter object, which we can once again pass to the list() function. Only the even integers remain in the filtered list.
But the map() and filter() functions are outdated ways to create mapped or filtered lists in Python. Instead, you can now create them with list comprehensions. List comprehensions not only free you from writing out a lambda function, but are also often faster than calling map() or filter() with a lambda function.
Here we replicate the map() function example using a list comprehension:
>>> [str(n) for n in [8, 16, 18, 19, 12, 1, 6, 7]]
['8', '16', '18', '19', '12', '1', '6', '7']
Notice that the str(n) part of the list comprehension is similar to lambda n: str(n).
And here we replicate the filter() function example using a list comprehension:
>>> [n for n in [8, 16, 18, 19, 12, 1, 6, 7] if n % 2 == 0]
[8, 16, 18, 12, 6]
Notice that the if n % 2 == 0
part of the list comprehension is similar to lambda n: n % 2 == 0
.
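A single list comprehension can also map and filter at the same time, which would otherwise require nesting a filter() call inside a map() call. A brief sketch with the same list (the variable names are chosen only for this example):

```python
numbers = [8, 16, 18, 19, 12, 1, 6, 7]

# Nested calls: filter() keeps the even values, then map() converts them.
nested = list(map(lambda n: str(n), filter(lambda n: n % 2 == 0, numbers)))

# The same result written as one list comprehension.
combined = [str(n) for n in numbers if n % 2 == 0]

print(nested)    # ['8', '16', '18', '12', '6']
print(combined)  # ['8', '16', '18', '12', '6']
```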
Many languages have a concept of functions as first-class objects, allowing for the existence of higher-order functions, including mapping and filtering functions.
Python is a dynamically typed language, which means that Python functions and methods are free to return values of any data type. But to make your functions more predictable, you should strive to have them return values of only a single data type.
For example, here’s a function that, depending on a random number, returns either an integer value or a string value:
>>> import random
>>> def returnsTwoTypes():
...     if random.randint(1, 2) == 1:
...         return 42
...     else:
...         return 'forty two'
When you’re writing code that calls this function, it can be easy to forget that you must handle several possible data types. To continue this example, say we call returnsTwoTypes()
and want to convert the number that it returns to hexadecimal:
>>> hexNum = hex(returnsTwoTypes())
>>> hexNum
'0x2a'
Python’s built-in hex()
function returns a string of a hexadecimal number of the integer value it was passed. This code works fine as long as returnsTwoTypes()
returns an integer, giving us the impression that this code is bug free. But when returnsTwoTypes()
returns a string, it raises an exception:
>>> hexNum = hex(returnsTwoTypes())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object cannot be interpreted as an integer
Of course, we should always remember to handle every possible data type that the return value could have. But in the real world, it’s easy to forget this. To prevent these bugs, we should always attempt to make functions return values of a single data type. This isn’t a strict requirement, and sometimes there’s no way around having your function return values of different data types. But the closer you get to returning only one type, the simpler and less bug prone your functions will be.
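One way to apply this advice to the earlier example is sketched below. The returnsOneType() name is hypothetical; the point is only that every branch returns an integer, so hex() can’t raise a TypeError no matter which branch runs. The -> int annotation is optional documentation that Python doesn’t enforce at runtime.

```python
import random

def returnsOneType() -> int:
    """Return 42 or 43 at random; both branches return an int."""
    if random.randint(1, 2) == 1:
        return 42
    else:
        return 43

# hex() works on every call because the return type never varies.
for _ in range(5):
    print(hex(returnsOneType()))  # prints either 0x2a or 0x2b
```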
There is one case in particular to be aware of: don’t return None
from your function unless your function always returns None
. The None
value is the only value in the NoneType
data type. It’s tempting to have a function return None
to signify that an error occurred (I discuss this practice in the next section, “Raising Exceptions vs. Returning Error Codes”), but you should reserve returning None
for functions that have no meaningful return value.
The reason is that returning None
to indicate an error is a common source of uncaught 'NoneType' object has no attribute
exceptions:
>>> import random
>>> def sometimesReturnsNone():
...     if random.randint(1, 2) == 1:
...         return 'Hello!'
...     else:
...         return None
...
>>> returnVal = sometimesReturnsNone()
>>> returnVal.upper()
'HELLO!'
>>> returnVal = sometimesReturnsNone()
>>> returnVal.upper()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'upper'
This error message is rather vague, and it could take some effort to trace its cause back to a function that normally returns an expected result but could also return None
when an error happens. The problem occurred because sometimesReturnsNone()
returned None
, which we then assigned to the returnVal
variable. But the error message would lead you to think the problem occurred in the call to the upper()
method.
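If a function can fail, raising an exception at the point of failure keeps the cause and the error message together. The sketch below is a hypothetical variation on the example above; the getGreeting() name and its name parameter are made up for illustration.

```python
def getGreeting(name):
    """Return a greeting string, or raise ValueError for an empty name."""
    if not name:
        # Fail here, where the problem is, instead of returning None.
        raise ValueError('name must be a nonempty string')
    return 'Hello, ' + name + '!'

print(getGreeting('Al').upper())  # HELLO, AL!

try:
    getGreeting('')
except ValueError as exc:
    print('Could not create greeting:', exc)
```

The traceback (or the except clause) now points at getGreeting() itself rather than at a later method call on its return value.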
In a 2009 conference talk, computer scientist Tony Hoare apologized for inventing the null reference (the general analog of Python’s None
value) in 1965, saying “I call it my billion dollar mistake. […] I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last 40 years.” You can view his full talk online at https://autbor.com/billiondollarmistake.
In Python, the meanings of the terms exception and error are roughly the same: an exceptional circumstance in your program that usually indicates a problem. Exceptions became popular as a programming language feature in the 1980s and 1990s with C++ and Java. They replaced the use of error codes, which are values returned from functions to indicate a problem. The benefit of exceptions is that return values are only related to the function’s purpose instead of also indicating the presence of errors.
Error codes can also cause issues in your program. For example, Python’s find()
string method normally returns the index where it found a substring, and if it’s unable to find it, it returns -1
as an error code. But because -1 is also a valid index that refers to the last character of a string, inadvertently using this error code as an index might introduce a bug. Enter the following into the interactive shell to see how this works:
>>> print('Letters after b in "Albert":', 'Albert'['Albert'.find('b') + 1:])
Letters after b in "Albert": ert
>>> print('Letters after x in "Albert":', 'Albert'['Albert'.find('x') + 1:])
Letters after x in "Albert": Albert
The 'Albert'.find('x')
part of the code evaluates to the error code -1
. That makes the expression 'Albert'['Albert'.find('x') + 1:]
evaluate to 'Albert'[-1 + 1:]
, which further evaluates to 'Albert'[0:]
and then to 'Albert'
. Obviously, this isn’t the code’s intended behavior. Calling index()
instead of find()
, as in 'Albert'['Albert'.index('x') + 1:]
, would have raised an exception, making the problem obvious and unignorable.
Unlike find(), the index() string method raises a ValueError
exception if it’s unable to find a substring. If you don’t handle this exception, it will crash the program—behavior that is often preferable to not noticing the error.
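When a missing substring is something your program should handle rather than crash on, you can wrap the index() call in a try statement. A minimal sketch, where lettersAfter() is a hypothetical helper and not a string method:

```python
def lettersAfter(text, letter):
    """Return the part of text after the first occurrence of letter.

    Raises ValueError (from str.index) if letter isn't in text.
    """
    return text[text.index(letter) + 1:]

print(lettersAfter('Albert', 'b'))  # ert

try:
    print(lettersAfter('Albert', 'x'))
except ValueError:
    print('There is no x in "Albert".')
```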
The names of exception classes often end with “Error” when the exception indicates an actual error, such as ValueError
, NameError
, or SyntaxError
. Exception classes that represent exceptional cases that aren’t necessarily errors include StopIteration
, KeyboardInterrupt
, and SystemExit
.
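StopIteration illustrates an exception that isn’t an error: an iterator raises it to signal that it has run out of items, and a for loop handles it behind the scenes. A short sketch using the built-in iter() and next() functions (the variable names are chosen only for this example):

```python
letters = iter(['a', 'b'])
print(next(letters))  # a
print(next(letters))  # b

exhausted = False
try:
    next(letters)  # No items are left, so this raises StopIteration.
except StopIteration:
    exhausted = True
    print('No more items.')
```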
Functions are a common way of grouping our programs’ code together, and they require you to make certain decisions: what to name them, how big to make them, how many parameters they should have, and how many arguments you should pass for those parameters. The *
and **
syntax in def
statements allows functions to receive a varying number of arguments, making them variadic functions.
Although not a functional programming language, Python has many features that functional programming languages use. Functions are first-class objects, meaning you can store them in variables and pass them as arguments to other functions (which are called higher-order functions in this context). Lambda functions offer a short syntax for specifying nameless, anonymous functions as the arguments for higher-order functions. The most common higher-order functions in Python are map()
and filter()
, although list comprehensions provide the same functionality and are often faster.
Whenever possible, your functions should return values of a single data type. You shouldn’t use return values as error codes: exceptions are for indicating errors. The None
value in particular is often mistakenly used as an error code.