Code that causes a program to crash is obviously wrong, but crashes aren’t the only indicator of issues in your programs. Other signs can suggest the presence of more subtle bugs or unreadable code. Just as the smell of gas can indicate a gas leak or the smell of smoke could indicate a fire, a code smell is a source code pattern that signals potential bugs. A code smell doesn’t necessarily mean a problem exists, but it does mean you should investigate your program.
This chapter lists several common code smells. It takes much less time and effort to prevent a bug than to encounter, understand, and fix a bug later. Every programmer has stories of spending hours debugging only to find that the fix involved changing a single line of code. For this reason, even a whiff of a potential bug should give you pause, prompting you to double-check that you aren’t creating future problems.
Of course, a code smell isn’t necessarily a problem. Ultimately, whether to address or ignore a code smell is a judgment call for you to make.
The most common code smell is duplicate code. Duplicate code is any source code that you could have created by copying and pasting some other code into your program. For example, this short program contains duplicate code. Notice that it asks how the user is feeling three times:
print('Good morning!')
print('How are you feeling?')
feeling = input()
print('I am happy to hear that you are feeling ' + feeling + '.')
print('Good afternoon!')
print('How are you feeling?')
feeling = input()
print('I am happy to hear that you are feeling ' + feeling + '.')
print('Good evening!')
print('How are you feeling?')
feeling = input()
print('I am happy to hear that you are feeling ' + feeling + '.')
Duplicate code is a problem because it makes changing the code difficult; a change you make to one copy of the duplicate code must be made to every copy of it in the program. If you forget to make a change somewhere, or if you make different changes to different copies, your program will likely end up with bugs.
The solution to duplicate code is to deduplicate it; that is, make it appear once in your program by placing the code in a function or loop. In the following example, I’ve moved the duplicate code into a function and then repeatedly called that function:
def askFeeling():
print('How are you feeling?')
feeling = input()
print('I am happy to hear that you are feeling ' + feeling + '.')
print('Good morning!')
askFeeling()
print('Good afternoon!')
askFeeling()
print('Good evening!')
askFeeling()
In this next example, I’ve moved the duplicate code into a loop:
for timeOfDay in ['morning', 'afternoon', 'evening']:
print('Good ' + timeOfDay + '!')
print('How are you feeling?')
feeling = input()
print('I am happy to hear that you are feeling ' + feeling + '.')
You could also combine these two techniques and use a function and a loop:
def askFeeling(timeOfDay):
print('Good ' + timeOfDay + '!')
print('How are you feeling?')
feeling = input()
print('I am happy to hear that you are feeling ' + feeling + '.')
for timeOfDay in ['morning', 'afternoon', 'evening']:
askFeeling(timeOfDay)
Notice that the code that produces the “Good morning/afternoon/evening!” messages is similar but not identical. In the third improvement to the program, I parameterized the code to deduplicate the identical parts. Meanwhile, the timeOfDay
parameter and timeOfDay
loop variable replace the parts that differ. Now that I’ve deduplicated this code by removing the extra copies, I only need to make any necessary changes in one place.
As with all code smells, avoiding duplicate code isn’t a hard-and-fast rule you must always follow. In general, the longer the duplicate code section or the more duplicate copies that appear in your program, the stronger the case for deduplicating it. I don’t mind copying and pasting code once or even twice. But I generally start to consider deduplicating code when three or four copies exist in my program.
Sometimes, code is just not worth the trouble of deduplicating. Compare the first code example in this section to the most recent one. Although the duplicate code is longer, it’s simple and straightforward. The deduplicated example does the same thing but involves a loop, a new timeOfDay
loop variable, and a new function with a parameter that is also named timeOfDay
.
Duplicate code is a code smell because it makes your code harder to change consistently. If several duplicates are in your program, the solution is to place code inside a function or loop so it appears only once.
It’s no surprise that programming involves numbers. But some of the numbers that appear in your source code can confuse other programmers (or you a couple weeks after writing them). For example, consider the number 604800
in the following line:
expiration = time.time() + 604800
The time.time()
function returns an integer representing the current time. We can assume that the expiration
variable will represent some point 604,800 seconds into the future. But 604800
is rather mysterious: what’s the significance of this expiration date? A comment can help clarify:
expiration = time.time() + 604800 # Expire in one week.
This is a good solution, but an even better one is to replace these “magic” numbers with constants. Constants are variables whose names are written in uppercase letters to indicate that their values shouldn’t change after their initial assignment. Usually, constants are defined as global variables at the top of the source code file:
# Set up constants for different time amounts:
SECONDS_PER_MINUTE = 60
SECONDS_PER_HOUR = 60 * SECONDS_PER_MINUTE
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR
SECONDS_PER_WEEK = 7 * SECONDS_PER_DAY
--snip--
expiration = time.time() + SECONDS_PER_WEEK # Expire in one week.
You should use separate constants for magic numbers that serve different purposes, even if the magic number is the same. For example, there are 52 cards in a deck of playing cards and 52 weeks in a year. But if you have both amounts in your program, you should do something like the following:
NUM_CARDS_IN_DECK = 52
NUM_WEEKS_IN_YEAR = 52
print('This deck contains', NUM_CARDS_IN_DECK, 'cards.')
print('The 2-year contract lasts for', 2 * NUM_WEEKS_IN_YEAR, 'weeks.')
When you run this code, the output will look like this:
This deck contains 52 cards.
The 2-year contract lasts for 104 weeks.
Using separate constants allows you to change them independently in the future. Note that constant variables should never change values while the program is running. But this doesn’t mean that the programmer can never update them in the source code. For example, if a future version of the code includes a joker card, you can change the cards
constant without affecting the weeks
one:
NUM_CARDS_IN_DECK = 53
NUM_WEEKS_IN_YEAR = 52
The term magic number can also apply to non-numeric values. For example, you might use string values as constants. Consider the following program, which asks the user to enter a direction and displays a warning if the direction is north. A 'nrth'
typo causes a bug that prevents the program from displaying the warning:
while True:
print('Set solar panel direction:')
direction = input().lower()
if direction in ('north', 'south', 'east', 'west'):
break
print('Solar panel heading set to:', direction)
1 if direction == 'nrth':
print('Warning: Facing north is inefficient for this panel.')
This bug can be hard to detect: the typo in the 'nrth'
string 1 is still syntactically correct Python. The program doesn’t crash, and it’s easy to overlook the lack of a warning message. But if we used constants and made this same typo, the typo would cause the program to crash because Python would notice that a NRTH
constant doesn’t exist:
# Set up constants for each cardinal direction:
NORTH = 'north'
SOUTH = 'south'
EAST = 'east'
WEST = 'west'
while True:
print('Set solar panel direction:')
direction = input().lower()
if direction in (NORTH, SOUTH, EAST, WEST):
break
print('Solar panel heading set to:', direction)
1 if direction == NRTH:
print('Warning: Facing north is inefficient for this panel.')
The NameError
exception raised by the code line with the NRTH
typo 1 makes the bug immediately obvious when you run this program:
Set solar panel direction:
west
Solar panel heading set to: west
Traceback (most recent call last):
File "panelset.py", line 14, in <module>
if direction == NRTH:
NameError: name 'NRTH' is not defined
Magic numbers are a code smell because they don’t convey their purpose, making your code less readable, harder to update, and prone to undetectable typos. The solution is to use constant variables instead.
Commenting out code so it doesn’t run is fine as a temporary measure. You might want to skip some lines to test other functionality, and commenting them out makes them easy to add back in later. But if commented-out code remains in place, it’s a complete mystery why it was removed and under what condition it might ever be needed again. Consider the following example:
doSomething()
#doAnotherThing()
doSomeImportantTask()
doAnotherThing()
This code prompts many unanswered questions: Why was doAnotherThing()
commented out? Will we ever include it again? Why wasn’t the second call to doAnotherThing()
commented out? Were there originally two calls to doAnotherThing()
, or was there one call that was moved after doSomeImportantTask()
? Is there a reason we shouldn’t remove the commented-out code? There are no readily available answers to these questions.
Dead code is code that is unreachable or logically can never run. For example, code inside a function but after a return
statement, code in an if
statement block with an always False
condition, or code in a function that is never called is all dead code. To see this in practice, enter the following into the interactive shell:
>>> import random
>>> def coinFlip():
... if random.randint(0, 1):
... return 'Heads!'
... else:
... return 'Tails!'
... return 'The coin landed on its edge!'
...
>>> print(coinFlip())
Tails!
The return 'The coin landed on its edge!'
line is dead code because the code in the if
and else
blocks returns before the execution could ever reach that line. Dead code is misleading because programmers reading it assume that it’s an active part of the program when it’s effectively the same as commented-out code.
Stubs are an exception to these code smell rules. These are placeholders for future code, such as functions or classes that have yet to be implemented. In lieu of real code, a stub contains a pass
statement, which does nothing. (It’s also called a no operation or no-op.) The pass
statement exists so you can create stubs in places where the language syntax requires some code:
>>> def exampleFunction():
... pass
...
When this function is called, it does nothing. Instead, it indicates that code will eventually be added in.
Alternatively, to avoid accidentally calling an unimplemented function, you can stub it with a raise
NotImplementedError
statement. This will immediately indicate that the function isn’t yet ready to be called:
>>> def exampleFunction():
... raise NotImplementedError
...
>>> exampleFunction()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 2, in exampleFunction
NotImplementedError
Raising a NotImplementedError
will warn you whenever your program calls a stub function or method by accident.
Commented-out code and dead code are code smells because they can mislead programmers into thinking that the code is an executable part of the program. Instead, remove them and use a version control system, such as Git or Subversion, to keep track of changes. Version control is covered in Chapter 12. With version control, you can remove the code from your program and, if needed, easily add it back in later.
Print debugging is the practice of placing temporary print()
calls in a program to display the values of variables and then rerunning the program. The process often follows these steps:
print()
calls for some variables to find out what they contain.print()
calls because the earlier ones didn’t show enough information.print()
calls and remove them.Print debugging is deceptively quick and simple. But it often requires multiple iterations of rerunning the program before you display the information you need to fix your bug. The solution is to use a debugger or set up logfiles for your program. By using a debugger, you can run your programs one line of code at a time and inspect any variable. Using a debugger might seem slower than simply inserting a print()
call, but it saves you time in the long run.
Logfiles can record large amounts of information from your program so you can compare one run of it to previous runs. In Python, the built-in logging
module provides the functionality you need to easily create logfiles by using just three lines of code:
import logging
logging.basicConfig(filename='log_filename.txt', level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')
logging.debug('This is a log message.')
After importing the logging
module and setting up its basic configuration, you can call logging.debug()
to write information to a text file, as opposed to using print()
to display it onscreen. Unlike print debugging, calling logging.debug()
makes it obvious what output is debugging information and what output is the result of the program’s normal run. You can find more information about debugging in Chapter 11 of Automate the Boring Stuff with Python, 2nd edition (No Starch, 2019), which you can read online at https://autbor.com/2e/c11/.
When writing programs, you might need multiple variables that store the same kind of data. In those cases, you might be tempted to reuse a variable name by adding a numeric suffix to it. For example, if you’re handling a signup form that asks users to enter their password twice to prevent typos, you might store those password strings in variables named password1
and password2
. These numeric suffixes aren’t good descriptions of what the variables contain or the differences between them. They also don’t indicate how many of these variables there are: is there a password3
or a password4
as well? Try to create distinct names rather than lazily adding numeric suffixes. A better set of names for this password example would be password
and confirm_password
.
Let’s look at another example: if you have a function that deals with start and destination coordinates, you might have the parameters x1
, y1
, x2
, and y2
. But the numeric suffix names don’t convey as much information as the names start_x
, start_y
, end_x
, and end_y
. It’s also clearer that the start_x
and start_y
variables are related to each other, compared to x1
and y1
.
If your numeric suffixes extend past 2, you might want to use a list or set data structure to store your data as a collection. For example, you could store the values of pet1Name
, pet2Name
, pet3Name
, and so on in a list called petNames
.
This code smell doesn’t apply to every variable that simply ends with a number. For example, it’s perfectly fine to have a variable named enableIPv6
, because “6” is part of the “IPv6” proper name, not a numeric suffix. But if you’re using numeric suffixes for a series of variables, consider replacing them with a data structure, such as a list or dictionary.
Programmers who use languages such as Java are used to creating classes to organize their program’s code. For example, let’s look at this example Dice
class, which has a roll()
method:
>>> import random
>>> class Dice:
... def __init__(self, sides=6):
... self.sides = sides
... def roll(self):
... return random.randint(1, self.sides)
...
>>> d = Dice()
>>> print('You rolled a', d.roll())
You rolled a 1
This might seem like well-organized code, but think about what our actual needs are: a random number between 1 and 6. We could replace this entire class with a simple function call:
>>> print('You rolled a', random.randint(1, 6))
You rolled a 6
Compared to other languages, Python uses a casual approach to organizing code, because its code isn’t required to exist in a class or other boilerplate structure. If you find you’re creating objects simply to make a single function call, or if you’re writing classes that contain only static methods, these are code smells that indicate you might be better off writing functions instead.
In Python, we use modules rather than classes to group functions together. Because classes must be in a module anyway, putting this code in classes just adds an unnecessary layer of organization to your code. Chapters 15 through 17 discuss these object-oriented design principles in more detail. Jack Diederich’s PyCon 2012 talk “Stop Writing Classes” covers other ways that you might be overcomplicating your Python code.
List comprehensions are a concise way to create complex list values. For example, to create a list of strings of digits for the numbers 0 through 100, excluding all multiples of 5, you’d typically need a for
loop:
>>> spam = []
>>> for number in range(100):
... if number % 5 != 0:
... spam.append(str(number))
...
>>> spam
['1', '2', '3', '4', '6', '7', '8', '9', '11', '12', '13', '14', '16', '17',
--snip--
'86', '87', '88', '89', '91', '92', '93', '94', '96', '97', '98', '99']
Alternatively, you can create this same list in a single line of code by using the list comprehension syntax:
>>> spam = [str(number) for number in range(100) if number % 5 != 0]
>>> spam
['1', '2', '3', '4', '6', '7', '8', '9', '11', '12', '13', '14', '16', '17',
--snip--
'86', '87', '88', '89', '91', '92', '93', '94', '96', '97', '98', '99']
Python also has syntax for set comprehensions and dictionary comprehensions:
1 >>> spam = {str(number) for number in range(100) if number % 5 != 0}
>>> spam
{'39', '31', '96', '76', '91', '11', '71', '24', '2', '1', '22', '14', '62',
--snip--
'4', '57', '49', '51', '9', '63', '78', '93', '6', '86', '92', '64', '37'}
2 >>> spam = {str(number): number for number in range(100) if number % 5 != 0}
>>> spam
{'1': 1, '2': 2, '3': 3, '4': 4, '6': 6, '7': 7, '8': 8, '9': 9, '11': 11,
--snip--
'92': 92, '93': 93, '94': 94, '96': 96, '97': 97, '98': 98, '99': 99}
A set comprehension 1 uses braces instead of square brackets and produces a set value. A dictionary comprehension 2 produces a dictionary value and uses a colon to separate the key and value in the comprehension.
These comprehensions are concise and can make your code more readable. But notice that the comprehensions produce a list, set, or dictionary based on an iterable object (in this example, the range
object returned by the range(100)
call). Lists, sets, and dictionaries are iterable objects, which means you could have comprehensions nested inside of comprehensions, as in the following example:
>>> nestedIntList = [[0, 1, 2, 3], [4], [5, 6], [7, 8, 9]]
>>> nestedStrList = [[str(i) for i in sublist] for sublist in nestedIntList]
>>> nestedStrList
[['0', '1', '2', '3'], ['4'], ['5', '6'], ['7', '8', '9']]
But nested list comprehensions (or nested set and dictionary comprehensions) cram a lot of complexity into a small amount of code, making your code hard to read. It’s better to expand the list comprehension into one or more for
loops instead:
>>> nestedIntList = [[0, 1, 2, 3], [4], [5, 6], [7, 8, 9]]
>>> nestedStrList = []
>>> for sublist in nestedIntList:
... nestedStrList.append([str(i) for i in sublist])
...
>>> nestedStrList
[['0', '1', '2', '3'], ['4'], ['5', '6'], ['7', '8', '9']]
Comprehensions can also contain multiple for
expressions, although this tends to produce unreadable code as well. For example, the following list comprehension produces a flattened list from a nested list:
>>> nestedList = [[0, 1, 2, 3], [4], [5, 6], [7, 8, 9]]
>>> flatList = [num for sublist in nestedList for num in sublist]
>>> flatList
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
This list comprehension contains two for
expressions, but it’s difficult for even experienced Python developers to understand. The expanded form, which uses two for
loops, creates the same flattened list but is much easier to read:
>>> nestedList = [[0, 1, 2, 3], [4], [5, 6], [7, 8, 9]]
>>> flatList = []
>>> for sublist in nestedList:
... for num in sublist:
... flatList.append(num)
...
>>> flatList
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Comprehensions are syntactic shortcuts that can produce concise code, but don’t go overboard and nest them within each other.
Catching exceptions is one of the primary ways to ensure that your programs will continue to function even when problems arise. When an exception is raised but there is no except
block to handle it, the Python program crashes by immediately stopping. This could result in losing your unsaved work or leaving files in a half-finished state.
You can prevent crashes by supplying an except
block that contains code for handling the error. But it can be difficult to decide how to handle an error, and programmers can be tempted to simply leave the except
block blank with a pass
statement. For example, in the following code we use pass
to create an except
block that does nothing:
>>> try:
... num = input('Enter a number: ')
... num = int(num)
... except ValueError:
... pass
...
Enter a number: forty two
>>> num
'forty two'
This code doesn’t crash when 'forty two'
is passed to int()
because the ValueError
that int()
raises is handled by the except
statement. But doing nothing in response to an error might be worse than a crash. Programs crash so they don’t continue to run with bad data or in incomplete states, which could lead to even worse bugs later on. Our code doesn’t crash when nondigit characters are entered. But now the num
variable contains a string instead of an integer, which could cause issues whenever the num
variable gets used. Our except
statement isn’t handling errors so much as hiding them.
Handling exceptions with poor error messages is another code smell. Look at this example:
>>> try:
... num = input('Enter a number: ')
... num = int(num)
... except ValueError:
... print('An incorrect value was passed to int()')
...
Enter a number: forty two
An incorrect value was passed to int()
This code doesn’t crash, which is good, but it doesn’t give the user enough information to know how to fix the problem. Error messages are meant to be read by users, not programmers. Not only does this error message have technical details that a user wouldn’t understand, such as a reference to the int()
function, but it doesn’t tell the user how to fix the problem. Error messages should explain what happened as well as what the user should do about it.
It’s easier for programmers to quickly write a single, unhelpful description of what happened rather than detailed steps that the user can take to fix the problem. But keep in mind that if your program doesn’t handle all possible exceptions that could be raised, it’s an unfinished program.
Some code smells aren’t really code smells at all. Programming is full of half-remembered bits of bad advice that are taken out of context or stick around long after they’ve outlived their usefulness. I blame tech book authors who try to pass off their subjective opinions as best practices.
You might have been told some of these practices are code smells, but they’re mostly fine. I call them code smell myths: they’re warnings that you can and should ignore. Let’s look at a few of them.
The “one entry, one exit” idea comes from misinterpreted advice from the days of programming in assembly and FORTRAN languages. These languages allowed you to enter a subroutine (a structure similar to a function) at any point, including in the middle of it, making it hard to debug which parts of the subroutine had been executed. Functions don’t have this problem (execution always begins at the start of the function). But the advice lingered, becoming “functions and methods should only have one return
statement, which should be at the end of the function or method.”
Trying to achieve a single return
statement per function or method often requires a convoluted series of if
-else
statements that’s far more confusing than having multiple return
statements. Having more than one return
statement in a function or method is fine.
“Functions and methods should do one thing” is good advice in general. But taking this to mean that exception handling should occur in a separate function goes too far. For example, let’s look at a function that indicates whether a file we want to delete is already nonexistent:
>>> import os
>>> def deleteWithConfirmation(filename):
... try:
... if (input('Delete ' + filename + ', are you sure? Y/N') == 'Y'):
... os.unlink(filename)
... except FileNotFoundError:
... print('That file already did not exist.')
...
Proponents of this code smell myth argue that because functions should always do just one thing, and error handling is one thing, we should split this function into two functions. They argue that if you use a try
-except
statement, it should be the first statement in a function and envelop all of the function’s code to look like this:
>>> import os
>>> def handleErrorForDeleteWithConfirmation(filename):
... try:
... _deleteWithConfirmation(filename)
... except FileNotFoundError:
... print('That file already did not exist.')
...
>>> def _deleteWithConfirmation(filename):
... if (input('Delete ' + filename + ', are you sure? Y/N') == 'Y'):
... os.unlink(filename)
...
This is unnecessarily complicated code. The _deleteWithConfirmation()
function is now marked as private with the _
underscore prefix to clarify that it should never be called directly, only indirectly through a call to handleErrorForDeleteWithConfirmation()
. This new function’s name is awkward, because we call it intending to delete a file, not handle an error to delete a file.
Your functions should be small and simple, but this doesn’t mean they should always be limited to doing “one thing” (however you define that). It’s fine if your functions have more than one try
-except
statement and the statements don’t envelop all of the function’s code.
Boolean arguments to function or method calls are sometimes referred to as flag arguments. In programming, a flag is a value that indicates a binary setting, such as “enabled” or “disabled,” and it’s often represented by a Boolean value. We can describe these settings as set (that is, True
) or cleared (that is, False
).
The false belief that flag arguments to function calls are bad is based on the claim that, depending on the flag value, the function does two entirely different things, such as in the following example:
def someFunction(flagArgument):
if flagArgument:
# Run some code...
else:
# Run some completely different code...
Indeed, if your function looks like this, you should create two separate functions rather than making an argument decide which half of the function’s code to run. But most functions with flag arguments don’t do this. For example, you can pass a Boolean value for the sorted()
function’s reverse
keyword argument to determine the sort order. This code wouldn’t be improved by splitting the function into two functions named sorted()
and reverseSorted()
(while also duplicating the amount of documentation required). So the idea that flag arguments are always bad is a code smell myth.
Functions and methods are like mini-programs within your program: they contain code, including local variables that are forgotten when the function returns. This is similar to how a program’s variables are forgotten after it terminates. Functions are isolated: their code either performs correctly or has a bug depending on the arguments passed when they’re called.
But functions and methods that use global variables lose some of this helpful isolation. Every global variable you use in a function effectively becomes another input to the function, just like the arguments. More arguments mean more complexity, which in turn means a higher likelihood for bugs. If a bug manifests in a function due to a bad value in a global variable, that bad value could have been set anywhere in the program. To search for a likely cause of this bad value, you can’t just analyze the code in the function or the line of code calling the function; you must look at the entire program’s code. For this reason, you should limit your use of global variables.
For example, let’s look at the calculateSlicesPerGuest()
function in a fictional partyPlanner.py program that is thousands of lines long. I’ve included line numbers to give you a sense of the program’s size:
1504. def calculateSlicesPerGuest(numberOfCakeSlices):
1505. global numberOfPartyGuests
1506. return numberOfCakeSlices / numberOfPartyGuests
Let’s say when we run this program, we encounter the following exception:
Traceback (most recent call last):
File "partyPlanner.py", line 1898, in <module>
print(calculateSlicesPerGuest(42))
File "partyPlanner.py", line 1506, in calculateSlicesPerGuest
return numberOfCakeSlices / numberOfPartyGuests
ZeroDivisionError: division by zero
The program has a zero divide error, caused by the line return numberOfCakeSlices / numberOfPartyGuests
. The numberOfPartyGuests
variable must be set to 0
to have caused this, but where did numberOfPartyGuests
get assigned this value? Because it’s a global variable, it could have happened anywhere in the thousands of lines in this program! From the traceback information, we know that calculateSlicesPerGuest()
was called on line 1898 of our fictional program. If we looked at line 1898, we could find out what argument was passed for the numberOfCakeSlices
parameter. But the numberOfPartyGuests
global variable could have been set at any time before the function call.
Note that global constants aren’t considered poor programming practice. Because their values never change, they don’t introduce complexity into the code the way other global variables do. When programmers mention that “global variables are bad,” they aren’t referring to constant variables.
Global variables broaden the amount of debugging needed to find where an exception-causing value could have been set. This makes abundant use of global variables a bad idea. But the idea that all global variables are bad is a code smell myth. Global variables can be useful in smaller programs or for keeping track of settings that apply across the entire program. If you can avoid using a global variable, that’s a sign you probably should avoid using one. But “global variables are bad” is an oversimplified opinion.
Bad comments are indeed worse than no comments at all. A comment with outdated or misleading information creates more work for the programmer instead of a better understanding. But this potential problem is sometimes used to proclaim that all comments are bad. The argument asserts that every comment should be replaced with more readable code, to the point that programs should have no comments at all.
Comments are written in English (or whatever language the programmer speaks), allowing them to convey information to a degree that variable, function, and class names cannot. But writing concise, effective comments is hard. Comments, like code, require rewrites and multiple iterations to get right. We understand the code we write immediately after writing it, so writing comments seems like pointless extra work. As a result, programmers are primed to accept the “comments are unnecessary” viewpoint.
The far more common experience is that programs have too few or no comments rather than too many or misleading comments. Rejecting comments is like saying, “Flying across the Atlantic Ocean in a passenger jet is only 99.999991 percent safe, so I’m going to swim across it instead.”
Chapter 10 has more information about how to write effective comments.
Code smells indicate that there might be a better way to write your code. They don’t necessarily require a change, but they should make you take another look. The most common code smell is duplicate code, which can signal an opportunity to place code inside a function or loop. This ensures future code changes will need to be made in only one place. Other code smells include magic numbers, which are unexplained values in your code that can be replaced by constants with descriptive names. Similarly, commented-out code and dead code are never run by the computer, and might mislead programmers who later read the program’s code. It’s best to remove them and rely on a source control system like Git if you later need to add them back to your program.
Print debugging uses print()
calls to display debugging information. Although this approach to debugging is easy, it’s often faster in the long run to rely on a debugger and logs to diagnose bugs.
Variables with numeric suffixes, such as x1
, x2
, x3
, and so on, are often best replaced with a single variable containing a list. Unlike in languages such as Java, in Python we use modules rather than classes to group functions together. A class that contains a single method or only static methods is a code smell suggesting that you should put that code into a module rather than a class. And although list comprehensions are a concise way to create list values, nested list comprehensions are usually unreadable.
Additionally, any exceptions handled with empty except
blocks are a code smell that you’re simply silencing the error rather than handling it. A short, cryptic error message is just as useless to the user as no error message.
Along with these code smells are the code smell myths: programming advice that is no longer valid or has, over time, proven counterproductive. These include putting only a single return
statement or try
-except
block in each function, never using flag arguments or global variables, and believing that comments are unnecessary.
Of course, as with all programming advice, the code smells described in this chapter might or might not apply to your project or personal preferences. A best practice isn’t an objective measure. As you gain more experience, you’ll come to different conclusions about what code is readable or reliable, but the recommendations in this chapter outline issues to consider.