The following are the answers to the practice questions, along with brief explanations and solutions to the practice projects. The programs created for the practice projects don’t have to exactly match the code given here, as long as they generate the same or similar output.
Answers to the Practice Questions
1. /
2. *
3. -
4. %
5. +
6. **
7. //
8. No. The whitespace in between values and operators is insignificant, so these two expressions are effectively the same.
9. 3, because the // operator is the “floor” division operator, which does division and then rounds down.
10. 2, because the % operator is the modulo operator, which evaluates to the remainder of a division operation.
11. 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10. You can place these integer values in any order, as in 10 + 9 + 8 + 7 + 6 + 5 + 4 + 3 + 2 + 1. The expression should evaluate to 55.
12. (4 + 5), because it’s in parentheses.
13. 2 ** 3, because exponents are evaluated before addition.
14. 2 ** 3, because exponents are evaluated before addition.
15. (1 + 2), because parentheses are evaluated before exponents.
16. 2 + 4, because addition is performed left to right.
17. Error. The + operator needs two values.
18. No error. A value by itself is an expression (which evaluates to itself).
19. No error. This is an expression.
20. Error. It’s missing a closing parenthesis.
21. No error. This expression evaluates to 0.
22. Error. There is no operator connecting the 2 and 3.
23. Int, because it is a number without a decimal point.
24. Int, because it is a number without a decimal point, and ints include negative numbers.
25. Float, because it is a number with a decimal point.
26. String, because it is enclosed with quotes.
27. Float, because it is a number with a decimal point.
28. String, because it is enclosed with quotes. Numeral characters can be a part of strings, just like letter or punctuation characters.
29. String, because it is enclosed with quotes.
30. The 10 is an int, the 10.0 is a float, and the '10' is a string.
31. 'HelloHelloHello', because the + operator can operate on string values, in which case it does string concatenation.
32. 'HelloHelloHello', because the * operator can operate on a string and an int, in which case it does string replication.
33. 'HelloHelloHello', because the order of the string and int doesn’t matter for string replication.
34. 'HelloHelloHelloHello', because the parentheses perform the (2 * 2) multiplication first, evaluating the expression to 4 * 'Hello', which is string replication.
35. '1312', because the values are strings, so the + operator does string concatenation, not mathematical addition.
36. Error, because there is no closing quote at the end of what is supposed to be a string value.
37. Error, because string replication can be done only with integers like 3 and not with floating-point numbers like 3.0.
38. Error, because adding a string and an int is neither string concatenation nor mathematical addition.
39. Error, because Hello without quotes is a variable name and not a string, and we have not assigned a variable named Hello.
40. Error, because multiplying two strings is neither string replication nor mathematical multiplication.
41. Error, because you cannot divide a string by an int.
42. Error, because you cannot divide a string by a string.
43. Jack, which is the value assigned to the nephew variable.
44. nephew, because the print() call prints the string 'nephew'. The nephew variable is ignored in this program.
45. Albert, because although nephew was first assigned the string 'Jack', this value was next overwritten by the string value 'Albert'.
46. Jack, because the nephew variable is assigned the string 'Jack', while the separate Nephew variable is assigned 'Albert'. Variable names are case sensitive in Python, so nephew and Nephew are two separate variables, and the print() call prints the nephew variable.
47. This program causes an error because Jack is not enclosed in quotes, so Python thinks it is a variable named Jack, but our program has not previously created this variable, so we are attempting to assign nephew a nonexistent Jack variable’s value.
48. This program causes an error because, while the nephew variable is assigned the string 'Jack', the print() function attempts to print the value of a nonexistent Jack variable. The nephew variable is ignored in this program.
49. This program causes an error because, while the nephew variable is assigned the string 'Jack', the print() function attempts to print a NEPHEW variable. Variable names are case sensitive in Python, so nephew and NEPHEW are considered two separate variables, and no variable named NEPHEW exists.
50. This program causes an error because it tries to print a nephew variable that was never created.
51. Valid, because variable names can contain underscores.
52. Invalid, because variable names cannot contain dashes.
53. Invalid, because variable names cannot contain spaces.
54. Valid, because variable names can contain lowercase and uppercase letters.
55. Valid, because variable names can contain underscores.
56. Valid, because an underscore (even by itself) can be used in variable names.
57. Invalid, because variable names can’t begin with a number.
58. String, because it has text enclosed in quotes.
59. Variable, because it is not enclosed in quotes and not followed by a set of parentheses, and it follows the rules for variable names.
60. Function call, because it ends with a set of parentheses.
61. String, because it has text enclosed in quotes, even though the text looks like a function call.
62. No error; 42.
63. Error, because a string of a number written in words cannot be converted to an integer.
64. Error, because 'Hello' cannot be converted to an integer.
65. No error; -42.
66. No error; 3.
67. No error; -42.0.
68. No error; '-42'.
69. No error; '3.1415'.
70. No error; 'Hello'.
71. No error; '3.0'.
72. No error; '3'.
73. No error; '3.0'.
74. The value in the number_of_cats variable is the integer 4, not the string '4'. You cannot concatenate a string and an integer with the + operator. You can fix this program by changing number_of_cats = 4 to number_of_cats = '4' or changing print('I have ' + number_of_cats) to print('I have ' + str(number_of_cats)).
75. The integer 5.
76. The abs() function returns the absolute value of the integer or float passed to it.
77. 5
78. 5
79. Because binary is the simplest number system, and it enables cheaper, more economical components for computer hardware.
80. 8
81. 2¹⁰ and 1,024
82. 2²⁰ and 1,048,576
83. 2³⁰ and 1,073,741,824
84. 2⁴⁰ and 1,099,511,627,776
85. 11
86. 1,000
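Many of these answers can be confirmed in the interactive shell. The following is a minimal sketch; the operands (such as 7 // 2 and 14 % 4) are illustrative assumptions and are not necessarily the expressions used in the practice questions.

```python
# Illustrative operands; not necessarily those from the practice questions.
floor_result = 7 // 2     # Floor division divides and then rounds down.
remainder = 14 % 4        # Modulo evaluates to the remainder.
total = 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10

# String concatenation and string replication give the same result here:
concatenated = 'Hello' + 'Hello' + 'Hello'
replicated = 'Hello' * 3  # 3 * 'Hello' works the same way.

# Replication requires an int; a float such as 3.0 raises a TypeError:
try:
    'Hello' * 3.0
    float_replication_failed = False
except TypeError:
    float_replication_failed = True

# Conversions between ints, floats, and strings:
converted = [int('42'), float('3'), str(-42), str(3.0)]

# Powers of 2 written with the ** operator:
powers = [2 ** 10, 2 ** 20, 2 ** 30, 2 ** 40]

print(floor_result, remainder, total)
print(concatenated == replicated)
print(float_replication_failed)
print(converted)
print(powers)
```

Wrapping the failing expression in try and except is only a convenience so the sketch runs to completion; typing 'Hello' * 3.0 directly into the shell shows the error message itself.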
Rectangle Printer
print('Enter the width for the rectangle:')
width = input()
width = int(width)
print('O' * width)
print('O' * width)
print('O' * width)
print('O' * width)
print('O' * width)
Perimeter and Area Calculator
print('Enter the width for the rectangle:')
width = input()
width = int(width)
print('Enter the length for the rectangle:')
length = input()
length = int(length)
print('Area of the rectangle:')
print(width * length)
print('Perimeter of the rectangle:')
print(width + width + length + length)
Alternatively, the perimeter could have been calculated as width * 2 + length * 2.
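As a quick check that the two perimeter formulas agree, here is a small sketch. The function names and the sample dimensions are hypothetical and chosen only for illustration.

```python
# Hypothetical helper functions; the names are not from the project.
def perimeter_by_addition(width, length):
    return width + width + length + length

def perimeter_by_multiplication(width, length):
    return width * 2 + length * 2

# Sample dimensions chosen for illustration:
print(perimeter_by_addition(3, 5))
print(perimeter_by_multiplication(3, 5))
```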
Answers to the Practice Questions
1. Yes.
2. No. The quotes make this a string value.
3. No. The f needs to be capitalized.
4. Yes.
5. No. The quotes make this a string value.
6. No. The t needs to be capitalized.
7. No. A single = (equal sign) is the assignment operator.
8. Yes. This is the “less than” operator.
9. No. The > (greater than) symbol comes first: >=.
10. No. The ! (exclamation mark) comes first: !=.
11. Yes. This is the “not equal to” operator.
12. Yes. This is the “equal to” operator.
13. Yes. This is the “greater than” operator.
14. Yes. This is the “less than or equal to” operator.
15. The < operator checks if one value is less than the other, while the <= operator checks if one value is less than or equal to the other.
16. The = operator is the assignment operator used for assigning values to variables, while the == operator is the “equal to” operator, which evaluates to True if both values are the same.
17. Integer and floating-point values of the same number are considered equal in Python.
18. Strings and integer values are never equal in Python.
19. You will get an error message saying that the < operator cannot compare integer and string values.
20. The truth table for the and operator looks like this (though the rows can be in any order):
| A | and | B | Evaluates to |
|---|---|---|---|
| True | and | True | True |
| True | and | False | False |
| False | and | True | False |
| False | and | False | False |
21. The truth table for the or operator looks like this (though the rows can be in any order):
| A | or | B | Evaluates to |
|---|---|---|---|
| True | or | True | True |
| True | or | False | True |
| False | or | True | True |
| False | or | False | False |
22. The truth table for the not operator looks like this (though the rows can be in any order):
| not | A | Evaluates to |
|---|---|---|
| not | True | False |
| not | False | True |
23. True
24. True
25. True
26. True
27. True
28. False
29. True
30. False
31. True
32. This statement sets the is_raining variable to the opposite Boolean value it currently has. This is called toggling.
33. The expression name == 'Alice' or name == 'Bob' is correct. The expression name == 'Alice' or 'Bob' is a common mistaken form, because the expression always evaluates to True, since 'Bob' is a “truthy” value.
34. A new block begins when the indentation of a line of code increases relative to the previous line. This is expected after any statement that ends with a colon (:).
35. Yes.
36. A new block is expected after instructions that end with a colon (:).
37. A block ends when the indentation decreases to the same amount as a previous line.
38. The program execution is the instruction currently being executed. The execution moves from instruction to instruction as the program executes.
39. There is one block in this program: the single line following the if statement.
40. The first block begins on line 3.
41. The first block also ends on line 3. (The block is only one line long.)
42. No. The condition uses the = assignment operator instead of the == (equal to) operator.
43. No. All if statements require a colon at the end.
44. No. All if statements require a condition.
45. Yes.
46. Yes.
47. Yes. (The condition evaluates to False, but this is still a valid if statement.)
48. Yes.
49. No. An else statement doesn’t have anything after the else keyword.
50. No. This else statement is missing the colon at the end.
51. No. An else statement doesn’t have anything after the else keyword.
52. No. An else statement doesn’t have anything after the else keyword.
53. No. An elif statement must have a condition.
54. Yes.
55. No. This “else if” is not valid Python code.
56. Yes.
57. An elif statement comes after an if or another elif statement, and the else statement must come last.
58. Zero or more elif statements can follow an if statement.
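The truth tables and the comparison answers can be reproduced with a short program. This is a sketch only; the name 'Carol' is a hypothetical value picked so that neither comparison matches.

```python
# Build the rows of the and/or truth tables as (A, B, result) tuples:
and_rows = []
or_rows = []
for a in (True, False):
    for b in (True, False):
        and_rows.append((a, b, a and b))
        or_rows.append((a, b, a or b))

# The correct and the mistaken form of the name check:
name = 'Carol'  # Hypothetical value that is neither 'Alice' nor 'Bob'.
correct = name == 'Alice' or name == 'Bob'
mistaken = name == 'Alice' or 'Bob'  # Evaluates to the truthy string 'Bob'.

# Ints and floats of the same number are equal; strings and ints never are:
int_equals_float = 42 == 42.0
int_equals_str = 42 == '42'

# The < operator cannot compare an int with a string:
try:
    42 < '42'
    comparison_failed = False
except TypeError:
    comparison_failed = True

print(and_rows)
print(or_rows)
print(correct, bool(mistaken))
print(int_equals_float, int_equals_str, comparison_failed)
```

Note that the mistaken form evaluates to the string 'Bob' rather than the Boolean True when the name does not match; because 'Bob' is truthy, an if statement treats the condition as always satisfied.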
Fixing the Safe Temperature Program
The corrected program is as follows:
print('Enter C or F to indicate Celsius or Fahrenheit:')
scale = input()
print('Enter the number of degrees:')
degrees = int(input())
if scale == 'C':
    if degrees >= 16 and degrees <= 38:
        print('Safe')
    else:
        print('Dangerous')
elif scale == 'F':
    if degrees >= 60.8 and degrees <= 100.4:
        print('Safe')
    else:
        print('Dangerous')
Single-Expression Safe Temperature
Change the if statement’s condition to the following (or an equivalent expression):
if (scale == 'C' and degrees >= 16 and degrees <= 38) or (scale == 'F' and degrees >= 60.8 and degrees <= 100.4):
You could also chain the comparison operators, like this:
if (scale == 'C' and 16 <= degrees <= 38) or (scale == 'F' and 60.8 <= degrees <= 100.4):
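To see that the two conditions are equivalent, one option is to wrap each in a function and compare their results. This is a sketch under assumptions: the function names and the sample inputs are hypothetical and not part of the project.

```python
# Hypothetical wrappers around the two forms of the condition:
def is_safe_long(scale, degrees):
    return (scale == 'C' and degrees >= 16 and degrees <= 38) or (scale == 'F' and degrees >= 60.8 and degrees <= 100.4)

def is_safe_chained(scale, degrees):
    return (scale == 'C' and 16 <= degrees <= 38) or (scale == 'F' and 60.8 <= degrees <= 100.4)

# Sample inputs around each boundary, plus an unrecognized scale:
samples = [('C', 15), ('C', 16), ('C', 38), ('C', 39),
           ('F', 60), ('F', 61), ('F', 100), ('F', 101), ('K', 20)]

# Collect any inputs where the two forms disagree (none are expected):
disagreements = []
for scale, degrees in samples:
    if is_safe_long(scale, degrees) != is_safe_chained(scale, degrees):
        disagreements.append((scale, degrees))

print(disagreements)
```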
Fizz Buzz
print('Enter an integer:')
number = input()
number = int(number)
if number % 3 == 0 and number % 5 == 0:
    print('Fizz Buzz')
elif number % 3 == 0:
    print('Fizz')
elif number % 5 == 0:
    print('Buzz')
else:
    print(number)
Answers to the Practice Questions
1. Yes.
2. Yes.
3. No. The condition is missing.
4. No. The colon at the end is missing.
5. Yes.
6. No. The if keyword is extraneous.
7. Yes.
8. Yes. Even though the condition ensures that the loop code never runs, it is still a syntactically valid Python statement.
9. The break statement.
10. The continue statement.
11. No. They don’t have colons at the end because they do not start a new block of code.
12. This code prints 'Hello' six times.
13. This code prints nothing, because the condition is already False.
14. This code prints the numbers 0 to 5.
15. This code prints nothing, because the break statement stops the loop.
16. This code prints only 0, because the break statement stops the loop afterward.
17. This code prints nothing, because the condition of the while loop is False.
18. Falsey, because 0 is the only falsey integer value.
19. Truthy, because all non-empty strings are truthy.
20. Falsey, because the blank string is the only falsey string value.
21. Truthy, because all nonzero numbers are truthy.
22. Truthy, because all non-blank strings are truthy.
23. Truthy, because all non-blank strings are truthy.
24. Truthy, because all nonzero numbers are truthy.
25. No, because the range() function requires at least one integer argument.
26. Yes.
27. No, because the variable and in keyword are missing.
28. No, because the range() function call is missing (and the integer value 10 is not “iterable” in the way range objects are).
29. No, because the colon at the end is missing.
30. A for loop is good for executing code a set number of times. (There are other valid answers to this question, such as executing code once for each item in a list.)
31. Both.
32. Both.
33. No. You can’t use a continue or break statement outside a loop.
34. print(spam)
35. The for and in keywords are used in that statement. The i is a variable and range(10) is a function call.
36. This code prints Hello six times.
37. This code prints Hello six times.
38. This code prints Hello six times (two Hellos for each of the three iterations).
39. This code prints Hello three times. The continue statement skips the second print('Hello').
40. This code prints the numbers 0 to 5 (one per line).
41. This code prints the numbers 0 to 5 (one per line).
42. This code prints the numbers 1 to 6 (one per line).
43. This code prints the numbers 0, 2, and 4 (one per line).
44. Here is code that adds the integers 1 to 100, and then prints the total sum (5050):
total = 0
for i in range(1, 101):
    total = total + i
print(total)
45. An off-by-one error. The code prints the numbers 0 to 9 instead of 1 to 10.
46. These are built-in functions.
47. Yes.
48. No. You don’t import string values.
49. No. A comma is missing between the two module names.
50. No. The comma should only separate multiple module names.
51. Yes.
52. The sys.exit() function immediately stops, or terminates, the program.
53. You must run import sys to import the sys module before you can call the sys.exit() function.
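A few of the loop and truthiness answers can be checked with the sketch below. The loop that collects even numbers is an illustrative assumption and is not necessarily the code from the practice questions.

```python
# Add the integers 1 to 100:
total = 0
for i in range(1, 101):
    total = total + i

# range(10) starts at 0 and stops before 10, a common off-by-one source:
zero_to_nine = list(range(10))
one_to_ten = list(range(1, 11))

# continue skips the rest of the current iteration (illustrative loop):
even_numbers = []
for i in range(6):
    if i % 2 == 1:
        continue
    even_numbers.append(i)

# bool() shows whether a value is truthy or falsey:
falsey_results = [bool(0), bool(0.0), bool('')]
truthy_results = [bool(42), bool(-1), bool('hello'), bool(' ')]

print(total)
print(zero_to_nine)
print(one_to_ten)
print(even_numbers)
print(falsey_results, truthy_results)
```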
Tree Printer
# Ask the user for the tree size:
print('Enter the tree size:')
size = int(input())
# Print the tree top:
for row_num in range(1, size + 1):
    spaces = ' ' * (size - row_num)
    tree = '^' * (row_num * 2 - 1)
    print(spaces + tree)
# Print the tree trunk:
spaces = ' ' * (size - 1)
print(spaces + '#')
print(spaces + '#')
Here is the same program written using a while loop instead of a for loop:
# Ask the user for the tree size:
print('Enter the tree size:')
size = int(input())
# Print the tree top:
row_num = 1
while row_num < size + 1:
    spaces = ' ' * (size - row_num)
    tree = '^' * (row_num * 2 - 1)
    print(spaces + tree)
    row_num = row_num + 1
# Print the tree trunk:
spaces = ' ' * (size - 1)
print(spaces + '#')
print(spaces + '#')
Christmas Tree Printer
import random
# Ask the user for the tree size:
print('Enter the tree size:')
size = int(input())
# Print the tree top:
for row_num in range(1, size + 1):
    spaces = ' ' * (size - row_num)
    tree = ''
    # Create the row from random 'o' and '^' characters:
    for branch_num in range(row_num * 2 - 1):
        if random.randint(1, 4) == 1:
            tree = tree + 'o'
        else:
            tree = tree + '^'
    print(spaces + tree)
# Print the tree trunk:
spaces = ' ' * (size - 1)
print(spaces + '#')
print(spaces + '#')
Here is the same program written using a while loop instead of a for loop:
import random
# Ask the user for the tree size:
print('Enter the tree size:')
size = int(input())
# Print the tree top:
row_num = 1
while row_num < size + 1:
    spaces = ' ' * (size - row_num)
    tree = ''
    # Create the row from random 'o' and '^' characters:
    for branch_num in range(row_num * 2 - 1):
        if random.randint(1, 4) == 1:
            tree = tree + 'o'
        else:
            tree = tree + '^'
    print(spaces + tree)
    row_num = row_num + 1
# Print the tree trunk:
spaces = ' ' * (size - 1)
print(spaces + '#')
print(spaces + '#')
Answers to the Practice Questions
1. No. The parentheses after hello are missing.
2. No. The keyword is def and not define.
3. Yes.
4. No. The def keyword is missing.
5. No. The function name and parentheses are missing.
6. Yes.
7. Yes.
8. This function definition has the def keyword and a colon, which a function call does not have.
9. The two parameters are first_name and last_name.
10. The 'Albert' value is an argument. Parameters are variables, not values.
11. This program prints nothing, because the say_hello() function is defined but never called.
12. This program prints the string Hello three times.
13. This program prints the string Hello six times (three times each for the two function calls).
14. The data type of the return value is a Boolean or bool value.
15. Technically, the password parameter can have a value of any data type, though the code suggests that it should be a string.
16. The data type of the return value is a string, because 'Hello, ' is a string and name is also a string (since the return values of input() are always strings), and concatenating two strings with the + operator always evaluates to a string.
17. False
18. False
19. False
20. True
21. False
22. False
23. False
24. A stack frame object represents a function call.
25. A stack frame object is pushed to the top of the call stack when a function is called.
26. A stack frame object is popped off the top of the call stack when the function call returns.
27. The stack frame object at the top of the call stack represents the function call the execution is currently in.
28. The call stack has a stack frame object for the spam() function call on the bottom, with a stack frame object for the bacon() function call on top of it.
29. A program that has absolutely no function calls in it always has an empty call stack with no stack frame objects for local variables.
30. Local. Function parameters are always local variables.
31. Global. Variables marked with the global statement are always global.
32. No. A variable must be global or local and can’t be both.
33. Local. If a variable in a function is used in an assignment statement in that function and there is no global statement for it, it is a local variable.
34. Global. Variables marked with the global statement are always global.
35. Global. If a variable is used in a function but never appears in an assignment statement in that function, it is a global variable. In this case, it is global even if there is no global statement for it.
36. The code prints the string 'cat' because spam in the func() function is a parameter and parameters are always local.
37. The code prints the string 'dog' because spam in the func() function is not used in an assignment statement. Note that the parameter is eggs, but spam is the variable passed to print(), so the value in eggs doesn’t matter.
38. The code prints the string 'dog' because spam in the func() function is used in an assignment statement, making it a local variable. The spam = 'cat' statement applies to a local spam variable and not to the global spam variable.
39. The code prints the string 'cat' because spam in the func() function is used in a global statement, making it a global variable. The spam = 'cat' statement applies to the global spam variable.
40. The code prints the string 'dog' because spam in the func() function is used in a global statement, making it a global variable. The print() function prints spam before it was changed to 'cat'.
41. The code prints nothing, because it crashes with an error. If a function contains an assignment statement for a variable that is not used in a global statement, the variable is local. However, the print(spam) line tries to print this local variable before it has been assigned a value. This causes an UnboundLocalError error.
42. Yes. The program would crash because the int(input()) call is outside the try block.
43. No. The program would not crash, because the int(input()) call is inside the try block, and the except block catches the ValueError exceptions that int() would raise.
44. Yes. The program would crash, because even though the int(input()) calls are inside the try block, the except block catches only ZeroDivisionError exceptions, and the int() function raises ValueError exceptions.
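The scope and exception-handling rules in these answers can be demonstrated with a short sketch. The function names here (assign_local, assign_global, read_before_assignment, and to_int) are hypothetical and are not the functions from the practice questions.

```python
spam = 'dog'  # Global variable.

def assign_local():
    spam = 'cat'  # Assignment without a global statement: a local spam.

def assign_global():
    global spam
    spam = 'cat'  # The global statement makes this the global spam.

assign_local()
after_local = spam   # The global spam is unchanged.
assign_global()
after_global = spam  # The global spam has been reassigned.

def read_before_assignment():
    print(spam)   # spam is local here because of the assignment below.
    spam = 'cat'

try:
    read_before_assignment()
    unbound_error_raised = False
except UnboundLocalError:
    unbound_error_raised = True

def to_int(text):
    # int() raises ValueError for text that is not a valid integer.
    try:
        return int(text)
    except ValueError:
        return None

print(after_local, after_global, unbound_error_raised)
print(to_int('42'), to_int('forty-two'))
```

If the except clause in to_int() named ZeroDivisionError instead, the ValueError would go uncaught and the program would crash.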
Transaction Tracker
def after_transaction(balance, transaction):
    if balance + transaction < 0:
        return balance
    else:
        return balance + transaction
Arithmetic Functions Without Arithmetic Operators
def plus_one(number):
    return number + 1

def add(number1, number2):
    total_sum = number1
    for i in range(number2):
        total_sum = plus_one(total_sum)
    return total_sum

def multiply(number1, number2):
    total_product = 0
    for i in range(number2):
        total_product = add(total_product, number1)
    return total_product
Tick Tock
This implementation uses a for loop:
import time

def tick_tock(seconds):
    tick_or_tock = 'Tick...'
    for i in range(seconds):
        # Print either "Tick" or "Tock":
        print(tick_or_tock)
        time.sleep(1)
        # Switch between "Tick" and "Tock":
        if tick_or_tock == 'Tick...':
            tick_or_tock = 'Tock...'
        else:
            tick_or_tock = 'Tick...'
Alternatively, this implementation uses a while loop:
import time

def tick_tock(seconds):
    tick_or_tock = 'Tick...'
    while seconds > 0:
        # Print either "Tick" or "Tock":
        print(tick_or_tock)
        time.sleep(1)
        # Switch between "Tick" and "Tock":
        if tick_or_tock == 'Tick...':
            tick_or_tock = 'Tock...'
        else:
            tick_or_tock = 'Tick...'
        # Decrease seconds by one:
        seconds = seconds - 1
Answers to the Practice Questions
1. The program crashes with the exception message "You did not enter a name."
2. raise Exception('An error happened. This error message is vague and unhelpful.')
3. False. A raise statement can be anywhere.
4. The program prints 'Hello, Guido.' This is because the try block catches the raised exception, then runs the code in the except block, which assigns the name variable the value 'Guido'.
5. Assertions are for programmer errors.
6. Failing fast is a good thing because it reduces the time between the true cause of a bug and when the bug is first noticed, making the true cause easier to find and fix.
7. The -O command line argument (an uppercase letter O) suppresses assertion checks.
8. An assert False statement will always raise an assertion error if executed, because the condition is always False.
9. Using print() calls instead of the logging module could later result in accidentally leaving in some of the debugging print() calls or accidentally removing non-debugging print() calls.
10. CRITICAL.
11. DEBUG, or possibly INFO.
12. DEBUG.
13. ERROR, or possibly WARNING.
14. ERROR, or possibly WARNING.
15. INFO, or possibly DEBUG.
16. DEBUG.
17. DEBUG, or possibly INFO.
18. Set a breakpoint on that particular line of code.
19. Step Out.
20. Continue.
21. Stop or Quit (the label varies between debuggers).
22. Step In or Step Into (the label varies between debuggers).
23. Step Over.
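The following sketch illustrates raising and catching an exception, an assertion that always fails, and the ordering of the logging levels. The get_name() function is a hypothetical stand-in and is not the program from the practice questions.

```python
import logging

# Hypothetical function that raises an exception for a blank name:
def get_name(text):
    if text == '':
        raise Exception('You did not enter a name.')
    return text

# The except block runs when the exception is raised in the try block:
try:
    name = get_name('')
except Exception:
    name = 'Guido'
greeting = 'Hello, ' + name + '.'

# assert False always fails when assertions are enabled:
try:
    assert False, 'This assertion always fails.'
    assertion_raised = False
except AssertionError:
    assertion_raised = True

# The logging levels, from least to most severe:
levels = [logging.DEBUG, logging.INFO, logging.WARNING,
          logging.ERROR, logging.CRITICAL]

print(greeting)
print(assertion_raised)
print(levels)
```

Running this file with python -O disables the assert statement, in which case assertion_raised would be False.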
Buggy Grade-Average Calculator
The bug is that the call to calculate_grade_average() passes the number of grades for the first argument and the grade sum for the second argument, which is the opposite of the expected order. You can fix this issue by changing this line
avg = calculate_grade_average(counter, total)
into this line:
avg = calculate_grade_average(total, counter)
Zero Division Error
When a 0 argument is passed for the number_of_grades parameter, the grade_average = int(grade_sum / number_of_grades) line raises a ZeroDivisionError error.
Add this code to the calculate_grade_average() function so that it returns the integer 0 when the user hasn’t entered any grades:
def calculate_grade_average(grade_sum, number_of_grades):
    if number_of_grades == 0:
        return 0
    grade_average = int(grade_sum / number_of_grades)
    return grade_average
Leap Year Calculator
The fixed code is as follows:
def is_leap_year(year):
    if year % 4 == 0:
        if year % 100 == 0:
            if year % 400 == 0:
                return True
            return False
        return True
    return False
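One way to gain confidence in the fixed function is to compare it with calendar.isleap() from the standard library. The sketch below repeats the function so that it runs on its own; the range of years 1 to 3000 is an arbitrary assumption.

```python
import calendar

def is_leap_year(year):
    if year % 4 == 0:
        if year % 100 == 0:
            if year % 400 == 0:
                return True
            return False
        return True
    return False

# Collect every year where the two functions disagree (none are expected):
mismatches = []
for year in range(1, 3001):
    if is_leap_year(year) != calendar.isleap(year):
        mismatches.append(year)

print(mismatches)
print(is_leap_year(1900), is_leap_year(2000), is_leap_year(2024))
```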
Writing Buggy Code on Purpose
The following are examples that produce the asked-for error messages. Your program doesn’t have to exactly match them as long as they produce the error message.
nameError.py:
print(spam)
badInt.py:
int('five')
badEquals.py:
age = 10
if age = 10:
    print('You are ten.')
badString.py:
print('Hello)
badBool.py:
print(true)
missingIfBlock.py:
age = 10
if age == 10:
stringPlusInt.py:
print('Hello' + 5)
intPlusString.py:
print(5 + 'Hello')
Answers to the Practice Questions
1. 0. In Python and most programming languages, 0, not 1, is the first index.
2. 'hat'. If you thought the answer was 'rat', you forgot that the first index is 0, not 1.
3. This is a trick question! spam[4] would raise a “list index out of range” error because 3 is the last index in spam.
4. No. Python lists do not need to contain only values of one data type. This is a restriction in other programming languages, but not in Python.
5. Python raises a “list index out of range” error when evaluating spam[0] and spam is an empty list.
6. No. For spam[3], the [3] is meant to be the index 3 for the list in spam. Even though it uses square brackets, it is not itself a list.
7. The index -1.
8. The index -3.
9. The statement del spam[0] removes the first value in the spam list while del spam deletes the entire list.
10. The program prints the values in the spam list:
cat
dog
moose
11. The program prints the indexes of the spam list:
0
1
2
12. Even though the program loops over the indexes of the spam list, the actual output comprises the values in the spam list:
cat
dog
moose
13. A Boolean True or False value. For example, 'fish' in ['cat', 'dog', 'moose'] evaluates to False.
14. The variable b contains the string 'dog'. This kind of assignment is called unpacking, where the three values in ['cat', 'dog', 'moose'] are assigned to the variables a, b, and c, respectively.
15. The variable b contains the string 'a'. The string 'cat' is also a sequence, so it can be unpacked just like a list.
16. The a variable is assigned each index of the spam list, starting at 0. The b variable is assigned that index’s value.
17. The random.choice() function returns a randomly selected value from the list passed to it. For example, random.choice(['cat', 'dog']) could return either 'cat' or 'dog'.
18. The random.shuffle() function randomizes the order of the values in the list passed to it. For example, if spam = ['cat', 'dog', 'moose'], then random.shuffle(spam) could set spam with the value ['moose', 'cat', 'dog']. The list is modified by the function in place and the function doesn’t return any values.
19. The expression len(spam) evaluates to 3 because only the order of the values in the list has changed, not the number of values.
20. The program prints 105.
21. spam *= 2
22. bacon -= 3
23. eggs += bacon * 5
24. eggs *= bacon + 5
25. spam += 'LastName'
26. sort() is a list method.
27. len() is a function.
28. append() is a list method.
29. index() is a list and string method.
30. print() is a function.
31. input() is a function.
32. reverse() is a list method.
33. The remove() method removes a value by the value (for example, spam.remove('cat')) while the del statement removes a value by the index (for example, del spam[3]).
34. The code sort(spam) causes an error because sort() is a list method and not a function. The correct way to call it is spam.sort().
35. The code spam.sort() arranges the values in spam in “ASCIIbetical” order.
36. The code spam.sort(key=str.lower) sorts a list in alphabetical order.
37. ['cat', 'dog', 'moose'] because the list was already in “ASCIIbetical” order.
38. ['moose', 'dog', 'cat'] because the reverse=True keyword argument makes the sort() method sort in reverse order.
39. [42, 86, 99, 3] because the reverse() list method reverses the order of values in a list.
40. Hello.
41. Nothing.
42. Nothing.
43. Hello.
44. Hello.
45. Hello.
46. Hello.
47. Hello.
48. Sequence data types include lists, strings, and tuples.
49. Because the first index is 0, not 1. The index 1 refers to the second character in the 'Zophie' string, o.
50. The expression 'Zophie'[-1] evaluates to the character at the last index in the string, e.
51. This is a trick question! 'Zophie'[9999] results in an IndexError: string index out of range because there is no index 9999 in a string of only six characters.
52. The code prints each character of the string separately:
c
a
t
53. The code prints the first value, the list ['cat', 'dog'], and then the second value, the string 'moose':
['cat', 'dog']
moose
54. Because 'moose'[0:3] evaluates to the string 'moo', the code prints:
m
o
o
55. The main difference between lists and tuples is that the contents of a tuple cannot be modified but the contents of a list can be.
56. list(('cat', 'dog')) returns a list form of the tuple.
57. tuple(['cat', 'dog']) returns a tuple form of the list.
58. You get the error message TypeError: 'tuple' object does not support item assignment because the contents of tuples cannot be modified.
59. Technically, Python variables always contain references to values rather than the values themselves.
60. In Python, the = assignment operator always copies references to values rather than the values themselves.
61. There is only one list value in the computer’s memory; the a, b, and c variables all contain copies of the same reference to this one list value.
62. There are three separate list values in the computer’s memory, as the copy() method creates a copy of the list instead of just copying a reference to the list.
63. Use deepcopy() to make a copy of [['cat', 'dog'], 'moose'], as it is a list that contains other list values.
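The answers about references, copying, and sorting can be explored with the sketch below. The variable names and the words list are hypothetical examples chosen to show the difference between the two sort orders.

```python
import copy

# Assignment copies the reference, so both names refer to the same list:
a = ['cat', 'dog', 'moose']
b = a
b.append('bird')
same_list = a is b

# copy() makes a new outer list, but the nested list is still shared;
# copy.deepcopy() also copies the nested list:
nested = [['cat', 'dog'], 'moose']
shallow = nested.copy()
deep = copy.deepcopy(nested)
nested[0].append('bird')

# sort() uses "ASCIIbetical" order, where uppercase letters come first;
# key=str.lower gives alphabetical order:
words = ['moose', 'Zebra', 'cat', 'Dog']  # Hypothetical example list.
ascii_order = words.copy()
ascii_order.sort()
alpha_order = words.copy()
alpha_order.sort(key=str.lower)

print(a, same_list)
print(shallow[0], deep[0])
print(ascii_order)
print(alpha_order)
```

The change made through nested[0] shows up in shallow[0] but not in deep[0], which is why deepcopy() is the safer choice for lists that contain other lists.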
Pangram Detector
def is_pangram(sentence):
    EACH_LETTER = []
    for char in sentence:
        char = char.upper()
        if char in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' and char not in EACH_LETTER:
            EACH_LETTER.append(char)
    if len(EACH_LETTER) == 26:
        return True
    else:
        return False
print('Enter a sentence:')
response = input()
if is_pangram(response):
    print('That sentence is a pangram.')
else:
    print('That sentence is not a pangram.')
Coordinate Directions
def get_end_coordinates(directions):
    # Start the x, y coordinates at 0, 0:
    x = 0
    y = 0
    # Increase or decrease the coordinate for each direction:
    for direction in directions:
        if direction == 'N':
            y = y + 1
        elif direction == 'S':
            y = y - 1
        elif direction == 'E':
            x = x + 1
        elif direction == 'W':
            x = x - 1
    return [x, y]
# Hold the user's directions in this list:
directions = []
while True:
    print('Enter N, S, E, or W to enter a direction. Enter nothing to stop.')
    response = input().upper()
    if response == '':
        break  # Stop accepting user directions.
    if response == 'N' or response == 'S' or response == 'E' or response == 'W':
        directions.append(response)
print(get_end_coordinates(directions))
Answers to the Practice Questions
1. 'name' and 42 are the keys of the key-value pairs.
2. 'Alice' and 'answer' are the values of the key-value pairs.
3. A SyntaxError error appears because dictionaries use curly brackets, not square brackets.
4. Change the square brackets to curly brackets: {'name': 'Alice'}.
5. A NameError error appears because, without quotes, Python thinks cat is a variable name instead of a string.
6. Put cat and Zophie in quotes to make them strings: {'cat': 'Zophie'}. Alternatively, the original code would work if there existed variables named cat and Zophie.
7. Yes. {True: True} is a valid dictionary. Boolean values can be used for keys and values in Python dictionaries.
8. Yes. They are the same. Python dictionaries are unordered, so it doesn’t matter in what order you enter the key-value pairs in your code.
9. No. They are different dictionaries. The first has a key 'name' with value 'Alice' and the second has a key 'Alice' with value 'name'.
10. No. They are different dictionaries. The value in the key-value pair for the first dictionary has a string '12345' while the second has an integer 12345.
11. Yes. Python dictionaries can have strings for keys.
12. Yes. Python dictionaries can have integers for keys. However, unlike integer list indexes, these integer keys don’t have anything to do with ordering, as Python dictionaries are unordered.
13. Yes. Python dictionaries can have negative integers for keys. However, unlike integer list indexes, they don’t have anything to do with ordering, as Python dictionaries are unordered.
14. A KeyError error happens if your code tries to access a non-existent key.
15. No. The keys in a Python dictionary must be unique. If you enter code such as {'a': 1, 'a': 2}, Python evaluates this as {'a': 2}.
16. Yes. Multiple key-value pairs can have the same values. For example, {'a': 1, 'b': 1} is a valid Python dictionary in which multiple key-value pairs have a value of 1.
17. Python dictionaries are unordered and therefore have no concept of a “first” or “last” key-value pair.
18. ['name', 'color']
19. ['Alice', 'red']
20. [('name', 'Alice'), ('color', 'red')]
21. This is a trick question! spam[42] causes a KeyError error because there is a key '42' in the dictionary but no key 42.
22. This is a trick question! spam[1] causes a KeyError error because there is no key 1 in the dictionary. Dictionaries are not like lists; just because there are keys 0 and 2 doesn’t mean there is necessarily a key 1.
23. No. If the 'color' key doesn’t exist, the get() dictionary method returns None by default.
24. The get() method returns 'red'. The optional second argument is returned when the requested key doesn’t exist.
25. No. The setdefault() method never results in a KeyError error.
26. {'time': 15, 'temp': 23.2, 'feels_like': 24.0, 'humidity': 91, 'pressure': 1014}. Note that the value for 'feels_like' must be 24.0 because 24 is an integer. Also note that the order of the key-value pairs doesn’t matter.
27. {15: 'Alice', 17: 'Bob', 19: 'Carol'}. Note that the order of the key-value pairs doesn’t matter.
28. No. The keys in Python dictionaries must be unique, so it’s not possible for two people to have a reservation at the same time if the reservation times are the keys.
29. Yes. If the customer names are the keys, it’s possible for two customers to have a reservation at the same time. For example, {'Alice': 15, 'Bob': 15, 'Carol': 15} is a dictionary containing three customers with reservations at 3 pm.
30. [{'name': 'Alice', 'grade': 7}, {'name': 'Bob', 'grade': 7}, {'name': 'Carol', 'grade': 7}, {'name': 'David', 'grade': 6}]. The particular order of the dictionaries in the list doesn’t matter and could differ.
31. spam[1]['name'] evaluates to 'Zophie'.
32. spam[0]['age'] evaluates to 3.
33. spam['pets'][0] evaluates to 'Zophie'.
34. There are multiple correct ways to write this code, but the most straightforward is:
for pet_name in pet_owners['Alice']:
    print(pet_name)
35. {'Home': {1: 0, 2: 0, 3: 1, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0}, 'Visitor': {1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0}}
36. The full program looks like this:
game = {'Home': {}, 'Visitor': {}}
for inning in range(1, 10): # The loop goes up to but doesn't include 10.
    game['Home'][inning] = 0
    game['Visitor'][inning] = 0
game['Home'][3] = 1 # Set the one run made in the third inning.
print(game)
37. The program looks the same as the previous program, except the for loop has changed from range(1, 10) to range(1, 10000):
game = {'Home': {}, 'Visitor': {}}
for inning in range(1, 10000): # Loop from 1 to 9999.
    game['Home'][inning] = 0
    game['Visitor'][inning] = 0
game['Home'][3] = 1 # Set the one run made in the third inning.
print(game)
Random Weather Data Generator
import random
def get_random_weather_generator():
    temp = float(random.randint(-50, 50))
    return {'temp': temp, 'feels_like': temp + random.randint(-10, 10),
            'humidity': random.randint(0, 100), 'pressure': random.randint(990, 1010)}
weather = []
for i in range(100):
    weather.append(get_random_weather_generator())
print(weather)
Average-Temperature Analyzer
import random
def get_random_weather_generator():
    temp = float(random.randint(-50, 50))
    return {'temp': temp, 'feels_like': temp + random.randint(-10, 10),
            'humidity': random.randint(0, 100), 'pressure': random.randint(990, 1010)}
def get_average_temperature(weather_data):
    total = 0
    for weather in weather_data:
        total += weather['temp']
    return total / len(weather_data)
weather = []
for i in range(100):
    weather.append(get_random_weather_generator())
print(weather)
print(get_average_temperature(weather))
Chess Rook Capture Predictor
def white_rook_can_capture(rook, board):
    can_capture = []
    for square in board.keys():
        piece = board[square]
        if piece[0] == 'b' and (square[0] == rook[0] or square[1] == rook[1]):
            can_capture.append(square)
    return can_capture
print(white_rook_can_capture('d3', {'d7': 'bQ', 'd2': 'wB', 'f1': 'bP', 'a3': 'bN'}))
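Several of the dictionary answers in this section (8, 14, 15, and 23 through 25) describe behavior that is easy to confirm in code. The following is a minimal sketch; the spam dictionary and its contents are assumptions chosen for illustration:

```python
spam = {'name': 'Alice', 42: 'answer'}

# The order in which key-value pairs are written doesn't affect equality:
same_regardless_of_order = {'a': 1, 'b': 2} == {'b': 2, 'a': 1}

# Keys must be unique, so the later value for a repeated key wins:
dup = {'a': 1, 'a': 2}

# get() returns None (or the optional second argument) for a missing key:
missing_color = spam.get('color')
default_color = spam.get('color', 'red')

# Accessing a non-existent key with square brackets raises KeyError:
try:
    spam['color']
    key_error_raised = False
except KeyError:
    key_error_raised = True

# setdefault() adds the key if it is missing and never raises KeyError:
spam.setdefault('color', 'blue')

print(dup, missing_color, default_color, key_error_raised, spam['color'])
```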
Answers to the Practice Questions
1. A string literal is the string value literally in the source code, typed as text surrounded by quote characters.
2. They are the same, though single-quoted strings need to escape any single-quote characters that are a part of the string, while double-quoted strings need to escape any double-quote characters that are a part of the string.
3. Multiline strings start and end with a “triple quote,” which is either three single-quote characters or three double-quote characters.
4. Yes. It is valid. String literals that begin and end with double quotes don’t have to escape single-quote characters.
5. Yes. It is valid. String literals that begin and end with double quotes don’t have to escape single-quote characters.
6. Yes. If the string uses both single and double quotes, then one of the quote types must be escaped. (The exception is that multiline strings can contain both kinds of quotes unescaped.)
7. The 'A\'B' string literal is valid because it has an escaped single-quote character \'. The 'A\\\'B' string literal is valid because it has an escaped backslash \\ followed by an escaped single quote \'. But 'A\\'B' is invalid because the first backslash escapes the second backslash, leaving the following quote unescaped. This makes Python think that the string is 'A\\', and the B' that follows is a syntax error.
8. You can mark a string literal as raw with an r prefix: r'Hello'.
9. print('A\\B') will print one backslash because \\ is an escape character for backslashes (just as \n is an escape character for newlines and \t is an escape character for tabs).
10. print(r'A\\B') will print two backslashes because raw string literals (marked with the r prefix) don’t escape any characters and treat all backslashes as literal backslash characters in the string.
11. A multiline string (which begins and ends with three quote characters) can also function as a multiline comment, as a string by itself does nothing in Python.
12. 'Hello'[1] evaluates to 'e' because 0 is the index of the first character in a string and 1 is the index of the second character.
13. 'Hello'[-1] evaluates to 'o' because negative indexes count from the end of the string, with -1 being the index of the last character.
14. 'Hello'[4:5] evaluates to 'o' because 4:5 is a slice substring starting at index 4 and going up to, but not including, index 5.
15. 'Hello'[4:4] evaluates to the blank string '' because the slice substring starts at index 4 and goes up to, but does not include, index 4. A slice substring with the same starting and ending indexes always translates to an empty string.
16. 'Hello'[9999] causes an IndexError because the largest index in the string 'Hello' is 4, and 9999 is larger than 4.
17. 'Hello'[1:9999] does not cause an IndexError, because slices don’t cause errors in Python. Instead, because 9999 is larger than the largest index, the slice just goes all the way to the end of the string. 'Hello'[1:9999] evaluates to 'ello'.
18. 'H' in 'Hello' evaluates to True, because 'H' does appear in 'Hello'. You can consider this code to be the same as the expression 'H' in ['H', 'e', 'l', 'l', 'o'].
19. 'H' in ['Hello', 'Goodbye'] evaluates to False. Only 'Hello' and 'Goodbye' exist in this list.
20. 'Hello' in ['Hello', 'Goodbye'] evaluates to True because 'Hello' is one of the two values in that list.
21. 'Hello' in ['Hi', ['Hello', 'Goodbye']] evaluates to False because 'Hello' is neither the string value 'Hi' nor the list value ['Hello', 'Goodbye'].
22. ['Hello', 'Goodbye'] in ['Hi', ['Hello', 'Goodbye']] evaluates to True because the list ['Hello', 'Goodbye'] is the second value (after 'Hi') in the other list.
23. 'I am number ' + 42 causes an error because you are trying to add a string and an integer. 'I am number ' + str(42) doesn’t cause an error, because str(42) evaluates to the string '42', making the expression 'I am number ' + '42', and concatenating two string values is allowed.
24. No. The f-string literal f'I am number {42}' doesn’t cause an error, because even though 42 is an integer, the f-string syntax automatically converts it to a string value.
25. No. The f-string literal f'I am number {str(42)}' doesn’t cause an error when it puts the string that str(42) returns inside the f-string. While having str() is unnecessary, it doesn’t cause an error. The curly brackets inside an f-string can contain not just individual variables but entire expressions.
26. While beard_length just evaluates to the value in the variable, the f-string f'{beard_length=}' evaluates to 'beard_length=\'' + str(beard_length) + '\'' for strings. It’s a common way to have the program print the name and contents of a variable for the purposes of debugging. (Technically, f'{beard_length=}' evaluates to 'beard_length=' + repr(beard_length), but the repr() function is beyond the scope of this book.)
27. You still need to learn about string interpolation and the format() string method because you may read them in Python code written by other people. (A second reason is that f-strings were introduced in Python 3.6 and aren’t available in earlier versions.)
28. No. Calling the upper() method on a string ensures that it will never have lowercase letters. This expression can only evaluate to False no matter what string is in spam.
29. '42'.isupper() evaluates to False. The string must have at least one uppercase letter for the isupper() method to return True.
30. 'X42'.isupper() evaluates to True. If the string has at least one uppercase letter and no lowercase letters, the isupper() method returns True.
31. The lower() method returns a string of the original string’s letters but in lowercase. The islower() method returns a True or False Boolean value depending on the string’s casing.
32. It returns False, because at least one word in that string doesn’t begin with an uppercase letter.
33. It returns 'This Sentence Is Capitalized.'
34. spam.isdecimal() is the expression that evaluates to True if spam contains only numeric digits. (The expression spam.isdigit() also works; there are slight differences between these two methods that are beyond the scope of this book, however, and the isdecimal() method is almost certainly the one you want to use.)
35. '1,000,000'.isdecimal() returns False because the commas are not decimal digits.
36. '-5'.isdecimal() returns False because the negative sign is not a decimal digit.
37. str(float(42)) returns the string '42.0' because float(42) returns the floating-point value 42.0, and str(42.0) returns '42.0'.
38. str(float(42)).isdecimal() returns False because the period in '42.0' is not a decimal digit.
39. 'headache'.startswith('he') and 'headache'.endswith('he') evaluate to True and True, respectively, which further evaluates to True. On the other hand, 'headache'.startswith('he').endswith('he') evaluates to True.endswith('he'), which causes an error, because endswith() is a string method and cannot be called on Boolean values like True.
40. The join() string method returns string values.
41. The split() string method returns a list value (specifically, a list of string values).
42. 'cat,dog,moose'
43. 'c,a,t,,,d,o,g,,,m,o,o,s,e' because the join() method expects a sequence of values. If you accidentally pass a string instead of a list of strings, the joining string is put in between the characters of the string.
44. Calling 'Hello!'.rjust(10) returns '    Hello!' (four spaces followed by Hello!).
45. Calling 'Hello!'.ljust(10) returns 'Hello!    ' (Hello! followed by four spaces).
46. A Unicode code point is a number that identifies a text character.
47. The UTF-8 encoding is almost certainly the Unicode encoding you want to use in every case.
48. chr() returns a text character string of the Unicode code point integer you pass it.
49. ord() returns the Unicode code point integer of the text character string you pass it.
50. The ! comes first before the A in ASCIIbetical order because the exclamation point’s code point integer is less than the uppercase letter A’s code point integer.
51. No. pyperclip is a third-party module and doesn’t come with Python.
52. The pyperclip.paste() function returns a string of the text contents on the clipboard.
53. The pyperclip.copy() function takes a string argument of text to place on the clipboard.
54. This pyperclip.paste() call would return 'Goodbye', because 'Goodbye' replaced 'Hello' as the text stored on the clipboard.
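Many of the string answers above (indexing and slicing, isupper(), isdecimal(), join(), and rjust()/ljust()) can be verified directly. This is a small sketch; the variable names are hypothetical and exist only to hold each result:

```python
greeting = 'Hello'

second_char = greeting[1]        # Index 1 is the second character.
last_char = greeting[-1]         # Negative indexes count from the end.
one_char_slice = greeting[4:5]   # Up to, but not including, index 5.
empty_slice = greeting[4:4]      # Same start and end gives ''.
long_slice = greeting[1:9999]    # Slices don't raise IndexError.

# An out-of-range index (unlike a slice) raises IndexError:
try:
    greeting[9999]
    index_error_raised = False
except IndexError:
    index_error_raised = True

digits_only_upper = '42'.isupper()        # No uppercase letters at all.
mixed_upper = 'X42'.isupper()             # One uppercase, no lowercase.
comma_decimal = '1,000,000'.isdecimal()   # Commas aren't decimal digits.

joined_list = ','.join(['cat', 'dog', 'moose'])
joined_string = ','.join('cat,dog,moose')  # Joins individual characters.

right_justified = 'Hello!'.rjust(10)
left_justified = 'Hello!'.ljust(10)
print(repr(right_justified), repr(left_justified))
```

Printing with repr() makes the padding spaces added by rjust() and ljust() visible.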
Word Match Game
def get_word_hint(secret_word, guess_word):
    hint = ''
    for i in range(5):
        if guess_word[i] == secret_word[i]:
            hint += 'O'
        elif guess_word[i] in secret_word:
            hint += 'o'
        else:
            hint += 'x'
    return hint
import random
secret = random.choice(('MITTS FLOAT BRICK LIKED DWARF COMMA GNASH ROOMS UNITE '
                        'BEARS SPOOL ARMOR').split())
print('Guess the secret five-letter word:')
for i in range(6):
    guess = input().upper()
    print(get_word_hint(secret, guess))
    print()
    if guess == secret:
        break
if guess == secret:
    print('You guessed the secret word!')
else:
    print('The secret word was ' + secret + '. Better luck next time.')
Diagonal Stripe Scroll Animation
import time
while True:
    for i in range(50):
        print('O' * i + '.' * (50 - i))
        time.sleep(0.01)
    for i in range(50):
        print('.' * i + 'O' * (50 - i))
        time.sleep(0.01)
mOcKiNg SpOnGeBoB mEmE
def spongecase(text):
    use_upper = False
    sponge_text = ''
    for character in text:
        if character.isalpha():
            if use_upper:
                sponge_text += character.upper()
            else:
                sponge_text += character.lower()
            use_upper = not use_upper
        else:
            sponge_text += character
    return sponge_text
print('Enter a sentence:')
response = input()
print(spongecase(response))
Answers to the Practice Questions
1. The re.compile() function creates a Pattern object from a regex string, while the search() method finds regex matches in a given string to search.
2. Three groups, from the three sets of parentheses.
3. Three groups, from the three sets of parentheses (even though two sets are nested).
4. r'\(\d{3}\)-(\d{3})-(\d{4})'
5. The following characters have special meaning in regex strings and must be escaped if you want to literally match them (the question asks for four of them): # $ & ( ) * + - . ? [ \ ] ^ { | } ~.
6. Either clutter|clue|club or clu(|tter|e|b).
7. It matches the two-letter strings AA, AB, BA, and BB.
8. The search() method returns the first match, while findall() returns all matches.
9. ['415-555-9999'], because the regex doesn’t have any groups, so findall() returns a list of strings.
10. [('415', '555', '9999')], because the regex has groups, so findall() returns a list of tuples of strings, one string for the text matching each group.
11. Both [abcd] and [a-d] are equivalent to a|b|c|d, so either is an acceptable answer.
12. The regex \w\d\w will match strings like a1z, B3x, and L0L. The regex [a-zA-Z]\d[a-zA-Z] will also match those strings.
13. No. [a-z] won’t match é, because it matches only the 26 letters from a to z and doesn’t match letters with accent marks.
14. Yes. \w will match é, because it matches alphanumeric characters.
15. No. \W won’t match é, because it matches everything that \w won’t match.
16. No. [A-Z] won’t match the lowercase string z, because it matches only the uppercase letters A to Z.
17. Yes. The . will match é, because the period is a special character that matches any single character.
18. No. r'\.' won’t match é, because it matches only literal period characters.
19. The \d, \w, and \S shorthand character classes. The \d class will match 5 because it matches digits, while the \w class will match 5 because it matches all letters and digits. The \S class matches all characters that are not whitespace characters, which includes digits like 5.
20. The regex will match all of these: A, B, AA, AB, BA, and BB.
21. The regex Cheese\?? matches “Cheese” optionally followed by a literal question mark, matching Cheese or Cheese?. Adding a group for clarity also works: r'Cheese(\?)?'.
22. X? and X* will match the blank string '', which has zero X characters. The regex X+ can match only strings with at least one X.
23. The regex X+ matches the same thing as X{1,}: one or more X characters.
24. Yes. All three of the regexes match the same strings.
25. The regex Ha{3} matches Haaa, while the regex (Ha){3} matches HaHaHa.
26. The regex is https://(www\.)?\w+\.com. Note that the periods must be escaped as \., and there must be at least one letter in the domain name.
27. (1|I){3}-(1|I){4} or (1|I)(1|I)(1|I)-(1|I)(1|I)(1|I)(1|I) but not (1|I)+-(1|I)+, because the pattern requires a specific number of characters. Alternatively, you can use character classes: [1I]{3}-[1I]{4}.
28. Greedy matching is the default behavior.
29. It’s a feature of quantifier syntax because it determines how many characters are matched rather than which characters are matched.
30. The regex .* means “do a greedy match of zero or more characters.” It effectively matches the largest amount of any text.
31. The regex .*? means “do a non-greedy match of zero or more characters.” It effectively matches the smallest amount of any text.
32. The Pattern object returned by re.compile('.*') matches all characters except the newline character, while the Pattern object returned by re.compile('.*', re.DOTALL) matches all characters.
33. ^spam$ is the only regex in that list that matches the exact string spam.
34. \B matches everything that is not a word boundary. For example, while re.search(r'e.*', 'An elephant') would match elephant starting from the first e, re.search(r'\Be.*', 'An elephant') would start matching only from the second e, which is not at the boundary of a word: ephant.
35. No. It doesn’t do case-insensitive matching by default.
36. The re.I and re.IGNORECASE arguments.
37. No. It will not match, because the é in 'Sinéad' won’t match [A-Z], even in case-insensitive mode.
38. No. It has no effect, because r'\d+' matches one or more digit characters, not letters.
39. \1, \2, and \3 are back references (in this case, for the first, second, and third groups in the regex string).
40. No. Instead, the sub() method returns a string.
41. The sub() method of a Pattern object takes two arguments: a string that replaces any matches and the string to search for matches.
42. The re.VERBOSE flag enables verbose mode. (The re.X flag also enables verbose mode but isn’t covered in this book.)
43. Verbose mode makes regular expressions more readable by allowing whitespace (including newlines) and comments within the regex string.
44. Verbose mode comments begin with a # and continue to the end of the line. Unlike Python comments, they are written inside multiline strings.
45. Humre’s functions return strings.
46. A{3}
47. \\. or r'\.'
48. A{3}|B{2}
49. Benefits include the ability to indent Python code rather than regex strings in verbose mode, more helpful error messages, the ability to use Python comments, automatic handling of raw strings and escaping, and compatibility with the code editor’s parentheses matching, syntax highlighting, linting, and autocomplete features.
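A few of the regex answers above (9 and 10 on findall(), 30 and 31 on greedy matching, 34 on \B, and 39 through 41 on sub()) can be checked with a short sketch. The sample strings other than 'An elephant' and the phone number are assumptions made for illustration:

```python
import re

text = 'Cell: 415-555-9999'

# Without groups, findall() returns a list of strings:
no_groups = re.findall(r'\d{3}-\d{3}-\d{4}', text)

# With groups, findall() returns a list of tuples of strings:
with_groups = re.findall(r'(\d{3})-(\d{3})-(\d{4})', text)

# Greedy .* matches as much as possible; non-greedy .*? as little as possible:
greedy = re.search(r'<.*>', '<a><b>').group()
non_greedy = re.search(r'<.*?>', '<a><b>').group()

# \B requires that the match NOT start at a word boundary:
not_at_boundary = re.search(r'\Be.*', 'An elephant').group()

# sub() returns a new string; \1, \2, and \3 refer back to the groups:
pattern = re.compile(r'(\d{3})-(\d{3})-(\d{4})')
swapped = pattern.sub(r'\3-\2-\1', '415-555-9999')

print(no_groups, with_groups, greedy, non_greedy, not_at_boundary, swapped)
```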
Hashtag-Finding Regex
import re
def get_hashtags(sentence):
    pattern = re.compile(r'#\w*')
    return pattern.findall(sentence)
print('Enter a sentence:')
response = input()
for hashtag in get_hashtags(response):
    print(hashtag)
Price-Finding Regex
import re
def get_price(sentence):
    pattern = re.compile(r'(\$\d+(\.\d\d)?)')
    prices = []
    for price in pattern.findall(sentence):
        prices.append(price[0])
    return prices
print('Enter a sentence:')
response = input()
for price in get_price(response):
    print(price)
Creating a CSV File of PyCon Speakers
import re
speakers = """ A Bessas 1
A Bingham 1
A Cuni 3
A. Garassino 1
--snip--
Žygimantas Medelis 1""".splitlines()
speaker_count = re.compile(r'^ (.*)\s(\d+)')
with open('speakers.csv', 'w', encoding='utf-8') as file_obj:
    for speaker in speakers:
        line = speaker_count.sub(r'\1,\2\n', speaker)
        file_obj.write(line)
Laugh Score
import re
def laugh_score(laugh):
    pattern = re.compile('ha(h|a)*', re.IGNORECASE)
    match = pattern.search(laugh)
    if match is None:
        return 0
    return len(match.group(0))
assert laugh_score('abcdefg') == 0
assert laugh_score('h') == 0
assert laugh_score('ha') == 2
assert laugh_score('HA') == 2
assert laugh_score('hahaha') == 6
assert laugh_score('ha ha ha') == 2
assert laugh_score('haaaaa') == 6
assert laugh_score('ahaha') == 4
assert laugh_score('Harry said Hahaha') == 2
Word Twister—ordW wisterT
import re
pattern = re.compile(r'\b(\w)(\w*)\b')
print(pattern.sub(r'\2\1', 'Hello world! How are you? I am fine.'))
Answers to the Practice Questions
1. Directory is another term for folder.
2. The backslash (\) separates folders and filenames on Windows.
3. The forward slash (/) separates folders and filenames on macOS and Linux.
4. The root folder is the topmost folder in a filesystem that contains all other folders.
5. An absolute filepath begins with the root folder, which is either / on macOS and Linux or C:\ on Windows.
6. A relative filepath is relative to the current working directory.
7. from pathlib import Path
8. It evaluates to Path('spam/bacon/eggs') and is a relative path.
9. Like the Path() call in the previous question, it evaluates to Path('spam/bacon/eggs') and is a relative path.
10. Like the Path() call in the previous two questions, it evaluates to Path('spam/bacon/eggs') and is a relative path.
11. No. You cannot get a Path object from only string values. This expression causes an error.
12. Path(r'C:\spam\eggs.txt'), also written as Path('C:/spam/eggs.txt').
13. os.chdir() changes the Python program’s current working directory.
14. Path(r'C:\eggs.txt'), also written as Path('C:/eggs.txt').
15. Path.cwd() returns Path(r'C:\spam'), also written as Path('C:/spam').
16. C:\spam
17. Path.cwd().parent
18. The st_atime, st_ctime, and st_mtime attributes refer to the last access time, creation time, and last modification time of a file, respectively.
19. The * in a glob pattern means any number of any characters.
20. The ? in a glob pattern means one of any character.
21. The returned Boolean value indicates whether the path exists as a file or folder on your computer.
22. They both return False.
23. No. Unlike PDFs or Word documents, plaintext files contain text only, and have no font, size, or color information.
24. PDFs and spreadsheet files are binary files and aren’t human readable in a text editor.
25. Path('eggs.txt').read_text()
26. The eggs.txt file will contain 'Goodbye' because the original content gets overwritten.
27. The UTF-8 encoding.
28. Read mode.
29. file_obj.read()
30. file_obj.readlines()
31. Path('eggs.txt').write_text(contents)
32.
with open('eggs.txt', 'w', encoding='utf-8') as file_obj:
    file_obj.write(contents)
33. A with statement creates context managers.
34. A context manager will automatically close a file when the execution leaves the with statement’s block.
35. No. You don’t need to specify the file extension of a shelf file.
36. A dictionary is similar to the structure of a shelf file.
37. The keys() and values() methods.
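The answers about joining paths (8 through 11) and about reading and writing text files (25 through 34) can be tried without touching your real files. This sketch is an illustration that assumes a temporary folder created with the standard tempfile module, which the questions themselves don't mention:

```python
import tempfile
from pathlib import Path

# The / operator joins path parts as long as one operand is a Path object:
joined = Path('spam') / 'bacon' / 'eggs'
is_relative = not joined.is_absolute()

# Using / with only strings fails, because strings don't support division:
try:
    'spam' / 'bacon'
    str_division_failed = False
except TypeError:
    str_division_failed = True

# Work inside a temporary folder (an assumption made for this example):
with tempfile.TemporaryDirectory() as folder:
    eggs = Path(folder) / 'eggs.txt'
    eggs.write_text('Hello', encoding='utf-8')
    eggs.write_text('Goodbye', encoding='utf-8')   # Overwrites 'Hello'.
    contents = eggs.read_text(encoding='utf-8')

    # The with statement closes the file when execution leaves the block:
    with open(eggs, encoding='utf-8') as file_obj:
        lines = file_obj.readlines()

print(joined, contents, lines)
```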
Text File Combiner
def combine_two_text_files(filename1, filename2, output_filename):
    with open(output_filename, 'w', encoding='UTF-8') as out_file_obj:
        # Write the contents of the first file:
        with open(filename1, encoding='UTF-8') as in_file_obj:
            out_file_obj.write(in_file_obj.read())
        # Write the contents of the second file:
        with open(filename2, encoding='UTF-8') as in_file_obj:
            out_file_obj.write(in_file_obj.read())
combine_two_text_files('spam.txt', 'eggs.txt', 'output.txt')
Zigzag File
import sys
def write_zigzag():
    indent = 0 # How many spaces to indent
    indentIncreasing = True # Whether the indentation is increasing or not
    with open('zigzag.txt', 'w', encoding='utf-8') as file_obj:
        for i in range(1000):
            file_obj.write(' ' * indent + '********\n')
            if indentIncreasing:
                # Increase the number of spaces:
                indent = indent + 1
                if indent == 20:
                    # Change direction:
                    indentIncreasing = False
            else:
                # Decrease the number of spaces:
                indent = indent - 1
                if indent == 0:
                    # Change direction:
                    indentIncreasing = True
write_zigzag()
Rock, Paper, Scissors with Saved Games
import random, sys, shelve
print('ROCK, PAPER, SCISSORS')
shelf_file = shelve.open('rpsSaved')
# These variables keep track of the number of wins, losses, and ties.
if 'wins' not in shelf_file and 'losses' not in shelf_file and 'ties' not in shelf_file:
    shelf_file['wins'] = 0
    shelf_file['losses'] = 0
    shelf_file['ties'] = 0
while True: # The main game loop
    print(shelf_file['wins'], 'Wins')
    print(shelf_file['losses'], 'Losses')
    print(shelf_file['ties'], 'Ties')
    while True: # The player input loop
        print('Enter your move: (r)ock (p)aper (s)cissors or (q)uit')
        player_move = input()
        if player_move == 'q':
            shelf_file.close() # Close the shelf file before quitting.
            sys.exit() # Quit the program.
        if player_move == 'r' or player_move == 'p' or player_move == 's':
            break # Break out of the player input loop.
        print('Type one of r, p, s, or q.')
    # Display what the player chose:
    if player_move == 'r':
        print('ROCK versus...')
    elif player_move == 'p':
        print('PAPER versus...')
    elif player_move == 's':
        print('SCISSORS versus...')
    # Display what the computer chose:
    random_number = random.randint(1, 3)
    if random_number == 1:
        computer_move = 'r'
        print('ROCK')
    elif random_number == 2:
        computer_move = 'p'
        print('PAPER')
    elif random_number == 3:
        computer_move = 's'
        print('SCISSORS')
    # Display and record the win/loss/tie:
    if player_move == computer_move:
        print('It is a tie!')
        shelf_file['ties'] = shelf_file['ties'] + 1
    elif player_move == 'r' and computer_move == 's':
        print('You win!')
        shelf_file['wins'] = shelf_file['wins'] + 1
    elif player_move == 'p' and computer_move == 'r':
        print('You win!')
        shelf_file['wins'] = shelf_file['wins'] + 1
    elif player_move == 's' and computer_move == 'p':
        print('You win!')
        shelf_file['wins'] = shelf_file['wins'] + 1
    elif player_move == 'r' and computer_move == 'p':
        print('You lose!')
        shelf_file['losses'] = shelf_file['losses'] + 1
    elif player_move == 'p' and computer_move == 's':
        print('You lose!')
        shelf_file['losses'] = shelf_file['losses'] + 1
    elif player_move == 's' and computer_move == 'r':
        print('You lose!')
        shelf_file['losses'] = shelf_file['losses'] + 1
Answers to the Practice Questions
1. The shutil module name stands for shell utilities. Shell in this case refers to terminal or command line interfaces.
2. The backslash (\) separates folders in Windows filepaths.
3. The forward slash (/) separates folders in macOS and Linux filepaths.
4. shutil.copy(), shutil.copyfile(), and shutil.copytree() are real functions, while shutil.filecopy() is not.
5. Both files and folders can be moved by shutil.move().
6. The os module contains the makedirs() function.
7. No. There is no difference between os.makedirs('eggs') and os.makedirs(Path('eggs')).
8. The exist_ok=True keyword argument.
9. A dry run can help you verify that your code will delete the files you intend.
10. The os.unlink() and os.remove() functions delete files.
11. The shutil.rmtree() function deletes an entire folder and its contents.
12. The deletion functions in the os and shutil modules delete files and folders permanently.
13. The folder name, a list of subfolders, and a list of filenames.
14. Pass . (or os.getcwd() or Path.cwd()) to start from the current working directory.
15. Yes. The code deletes every file. Note that it doesn’t delete folders.
16. This program prints every subfolder, including the name of the folder it resides in:
import os
for folder_name, subfolders, filenames in os.walk('eggs'):
    for subfolder in subfolders:
        print(f"{folder_name}/{subfolder}")
17. Compressed files and folders are contained in a .zip file.
18. import zipfile (it’s case sensitive).
19. zipfile.ZipFile('example.zip')
20. Without the compress_type=zipfile.ZIP_DEFLATED keyword argument, files are written without compression.
21. As the compression level goes up, the compressed ZIP file becomes smaller in size but slower to read and write.
22. The namelist() method gives you a list of the content in a ZIP file.
23. Yes. ZIP files can contain folders as well as files.
24. While file_size is the original size of the file, compress_size is the compressed size of the file.
25. The extractall() method extracts the entire contents of a ZIP file to the current working directory.
26. The extract() method extracts a single file from a ZIP file.
27. The variable name doesn’t have to be contents_zip:
import zipfile
contents_zip = zipfile.ZipFile('contents.zip', 'w',
    compression=zipfile.ZIP_DEFLATED, compresslevel=9)
contents_zip.write('contents.txt')
contents_zip.close()
Duplicate Filename Finder
import os
from pathlib import Path
def find_dup_filenames(folder):
    # Dictionary to store filenames as keys and absolute paths as values (in a list)
    files = {}
    # Walk through the directory tree starting from 'folder'
    for folder_name, subfolders, filenames in os.walk(folder):
        for filename in filenames:
            # Add a new key for the filename if not already present:
            files.setdefault(filename, [])
            # Append the full path of the file to the list for that filename:
            files[filename].append(Path(folder_name) / filename)
    # Prepare to remove keys (filenames) that are not duplicates:
    keys_to_delete = []
    for file in list(files.keys()): # Iterate through all files
        # Duplicate filenames have two or more items in the list
        if len(files[file]) < 2:
            keys_to_delete.append(file)
    # Remove all non-duplicate filenames from the dictionary
    for file in keys_to_delete:
        del files[file]
    return files # Return a dictionary of duplicate filenames and their locations.
# Call the function and print the results
for filename, absolute_filepaths in find_dup_filenames(Path.home()).items():
    print(filename) # Print the duplicate filename
    for absolute_filepath in absolute_filepaths:
        # Print each location of the duplicate file, indented for readability:
        print(' ' + str(absolute_filepath))
Alphabetized Folders
import os
from pathlib import Path
def make_alpha_folders(folder):
    # Outer loop: Iterate over the first level of letters (A-Z)
    for level1 in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ':
        # Inner loop: Iterate over the second level of letters (A-Z)
        for level2 in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ':
            # Create the directory's path:
            folder_path = Path(folder) / level1 / (level1 + level2)
            # Create the directory, including any necessary parent folders:
            os.makedirs(folder_path)
make_alpha_folders(Path.home() / 'alpha_folders')
ZIP File Folder Extractor
import zipfile
def extract_in_folder(zip_filename, folder):
    zip_file = zipfile.ZipFile(zip_filename)
    # Loop through all filenames in the ZIP archive
    for name in zip_file.namelist():
        # Check if the filename starts with the specified folder path
        # and is not the folder itself
        if name.startswith(folder + '/') and name != folder + '/':
            # Extract the file to the current directory:
            zip_file.extract(name, '.')
extract_in_folder('test.zip', 'spam')
Answers to the Practice Questions
1. A program is a general term for a piece of software, while a command is a program that is designed to run from a text-based terminal and doesn’t have a graphical user interface.
2. An application is a program that often has a graphical user interface, while a command does not. Also, applications are often larger than commands, with multiple features. Applications must be installed and uninstalled with installer and uninstaller programs.
3. An interactive command is a command that asks for text input from the user while running, such as an “Are you sure? Y/N” confirmation.
4. Yes. Scripts, commands, applications, and web apps are all types of programs.
5. The terminal application on Windows is called Command Prompt (or PowerShell or Terminal, if you have them installed). On macOS, it is Terminal, and on Linux, it is also Terminal.
6. The user’s home folder is represented by the tilde character (~).
7. The Python interpreter on Windows is in a file called python.exe.
8. The pwd command prints the current working directory.
9. The current working directory is part of the command line prompt on Windows.
10. The dir (“directory”) command on Windows and the ls (“list”) command on macOS and Linux display the contents of the current working directory.
11. The dir *.exe command displays all the executable files in the current working directory on Windows.
12. The file * | grep executable command displays all the executable files in the current working directory on macOS and Linux.
13. The start example.txt command opens the example.txt file with the default text editor app on Windows.
14. The open example.txt command opens the example.txt file with the default text editor app on macOS.
15. You can enter the name of the program to run it; enter eggs or eggs.exe on Windows and ./eggs on macOS and Linux.
16. If the eggs program is in a folder listed in the PATH environment variable, it will run. If not, you will get an error message saying that the terminal cannot find a program with that name.
17. The echo %PATH% command on Windows and the echo $PATH command on macOS and Linux show the contents of the PATH environment variable.
18. The semicolon character (;) on Windows and the colon character (:) on macOS and Linux separate the folder names in the PATH environment variable.
19. No. Entering spam.exe would not run C:\Users\al\Scripts\subfolder\spam.exe, because C:\Users\al\Scripts\subfolder is not in the PATH environment variable.
20. Edit the .zshrc file in your home folder to edit the PATH environment variable on macOS.
21. Edit the .bashrc file in your home folder to edit the PATH environment variable on Linux.
22. The where command on Windows and which command on macOS and Linux would tell you the folder location of the program.
23. Yes. You can use virtual environments to have multiple different versions of the same package installed at the same time.
24. The built-in venv module creates virtual environments.
25. The conventional name for virtual environment folders is .venv.
26. Running where python on Windows or which python3 on macOS and Linux can let you verify that you are using the virtual environment’s Python interpreter and not the system’s Python interpreter.
27. The command python -m pip list (or pip list or pip3 list) shows all of the third-party packages that are currently installed.
28. The pip or pip3 program (or running python -m pip) installs third-party Python packages.
29. The pip tool downloads third-party Python packages from the Python Package Index (PyPI) at https://pypi.org.
30. The command pip install automateboringstuff3 installs the automateboringstuff3 package, a package that includes all of the third-party packages in Automate the Boring Stuff with Python.
31. The __file__ variable stores a string of the full path of the current .py file being run.
32. Using __file__ in the interactive shell results in a NameError because this variable is created only when the Python interpreter runs a .py file.
33. The sys.executable variable holds the filepath of the Python interpreter program.
34. The sys.version variable is a string. It looks like this: '3.13.0 (tags/v3.13.0:60403a5, Oct 7 2024, 09:38:07) [MSC v.1941 64 bit (AMD64)]'.
35. The sys.version_info.major and sys.version_info.minor variables are integers.
36. if sys.version_info.major >= 3: checks whether the Python program is being run by a Python interpreter version 3 or later.
37. The sys.platform variable contains 'win32' on Windows, 'darwin' on macOS, and 'linux' on Linux.
38. The ModuleNotFoundError exception is raised if you try to import a module that isn’t installed.
39. Commands should have short names because they are typed often, while variables should have long, descriptive names because they are read often.
40. The sys.argv variable contains ['yourScript.py', 'download', 'confirm'].
41. The sys.argv variable contains ['yourScript.py', 'download_confirm'].
42. Yes. The order of command line arguments matters for the item order in the sys.argv list.
43. pyperclip.paste() returns the text that is currently on the clipboard.
44. pyperclip.copy() puts text on the clipboard.
45. clear is the command to clear the terminal window of text on macOS and Linux.
46. cls is the command to clear the terminal window of text on Windows.
47. playsound.playsound('hello.mp3') will play the audio in a file named hello.mp3.
48. A function call blocking means it won’t return from the function call until the function has finished executing.
49. The term quiet mode means the program won’t print any output, and verbose mode means it will print extra information.
50. No. PyMsgBox’s dialog boxes don’t appear in the terminal window.
51. pymsgbox.prompt() and pymsgbox.password() allow the user to enter text into a dialog box.
52. pymsgbox.alert() displays a text message in a dialog box to the user.
53. pymsgbox.confirm() presents the user with OK and Cancel buttons.
54. Yes. You can create an entire program that uses PyMsgBox functions instead of print() and input(). This kind of program has a simple graphical user interface.
55. Windows uses batch files.
56. The pause command prints the text Press any key to continue... and blocks until the user presses a key.
57. The macOS operating system uses .command files.
58. A command file or shell script named yourScript becomes executable on macOS and Linux after you’ve run the chmod u+x yourScript command.
59. Yes. The virtual environment must be activated by the batch file, command file, or shell script before it runs your program.
60. Python programs are mostly run by interpreters.
61. Compiled Python programs don’t require the user to have Python installed to run.
62. python -m PyInstaller --onefile yourScript.py
63. The dist folder contains the compiled program.
64. No. PyInstaller can only compile programs for the operating system that PyInstaller is run on. If you run PyInstaller on Windows, it can compile the Python program to run on Windows only.
65. The smallest compiled Python programs are several megabytes in size.
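As a rough illustration of answers 35 through 42, the following sketch gathers the interpreter and command line details into a dictionary. The describe_runtime() function and its dictionary keys are hypothetical names made up for this example, and the function takes an argv-style list as a parameter (rather than reading sys.argv directly) so that you can try it without passing real command line arguments.

```python
import sys

def describe_runtime(argv):
    """Summarize the interpreter and an argv-style list of arguments."""
    return {
        'script': argv[0],          # The first item is the script's name.
        'arguments': argv[1:],      # The rest keep the order they were typed in.
        # major and minor are integers, so they can be compared with >=:
        'is_python3': sys.version_info.major >= 3,
        'platform': sys.platform,   # For example 'win32', 'darwin', or 'linux'
    }

# Simulate running: python yourScript.py download confirm
info = describe_runtime(['yourScript.py', 'download', 'confirm'])
print(info['script'])
print(info['arguments'])
print(info['is_python3'])
```

In a real command you would pass sys.argv to the function. Swapping the order of download and confirm on the command line would swap their order in the list as well.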
Guess the Number with PyMsgBox
# This is a guess the number game.
import random, pymsgbox
secret_number = random.randint(1, 20)
pymsgbox.alert('I am thinking of a number between 1 and 20.')
# Ask the player to guess 6 times.
for guesses_taken in range(1, 7):
    guess = int(pymsgbox.prompt('Take a guess.'))
    if guess < secret_number:
        pymsgbox.alert('Your guess is too low.')
    elif guess > secret_number:
        pymsgbox.alert('Your guess is too high.')
    else:
        break  # This condition is the correct guess!
if guess == secret_number:
    pymsgbox.alert('Good job! You got it in ' + str(guesses_taken) + ' guesses!')
else:
    pymsgbox.alert('Nope. The number was ' + str(secret_number))
Timer with PyMsgBox
import pymsgbox, time
delay = int(pymsgbox.prompt('Enter number of seconds to wait:'))
time.sleep(delay)
pymsgbox.alert("Time's up!")
Compiling the Timer and Guess the Number Programs
You can compile these programs by running python -m PyInstaller --onefile msgBoxGuess.py (or msgBoxTimer.py) on Windows. On macOS and Linux, run python3 instead of python.
Answers to the Practice Questions
1. No, because HTTPS encrypts the content passed between the web server and your browser, including form data such as credit cards and passwords.
2. Yes, because HTTPS doesn’t prevent an eavesdropper from knowing which websites you access.
3. While a VPN prevents eavesdroppers from knowing what websites or URLs you access, the VPN provider will be able to record this information.
4. This code uses Python’s webbrowser module to open a web browser to a given URL:
import webbrowser
webbrowser.open('https://docs.python.org/3')
5. This code retrieves the home page of https://nostarch.com:
import requests
response = requests.get('https://nostarch.com')
6. This code downloads the file at https://autbor.com/hello.mp3 and saves it as hi.mp3:
import requests
response = requests.get("https://autbor.com/hello.mp3")
with open("hi.mp3", "wb") as file:
    file.write(response.content)
7. 404.
8. 200.
9. The raise_for_status() method of Response objects.
10. HTTP (or HTTPS).
11. No. While some APIs are free to use (or have free plans), some require paid subscriptions.
12. JSON is the most popular format used for response data from API calls.
13. Unauthorized users can use your API key, either exhausting your free plan or charging credits to your API account.
14. The json.loads() function.
15. The scheme is https and the domain is openweathermap.org; the /api that follows the domain is the path.
16. HTML stands for Hypertext Markup Language and CSS stands for Cascading Style Sheets.
17. The <b> and </b> are tags, while <b>Hello</b> is an element.
18. You see the HTML source for the current page in the web browser.
19. The browser’s developer tools show you the HTML of the current page (along with the Document Object Model, JavaScript Console, network traffic, and other features that are beyond the scope of the chapter).
20. No. You should use an HTML parser rather than regular expressions to find text in HTML source code.
21. CSS selectors.
22. Beautiful Soup parses HTML source code.
23. beautifulsoup4 is Beautiful Soup’s package name on PyPI.
24. bs4 is the name you use to import the Beautiful Soup module.
25. Python.
26. It returns an HTML element with an ID of 'p'.
27. Make the function call bs4.BeautifulSoup(source_html, 'html.parser').
28. The select() method returns a ResultSet object containing the Tag objects that match a CSS selector.
29. A ResultSet object contains Tag objects.
30. The attrs Python attribute.
31. The getText() method of Tag objects returns a string of the inner text between the opening and closing tags.
32. elem.get('href')
33. A browser’s user agent.
34. from selenium import webdriver (and not just import selenium).
35. A WebDriver object represents a browser in Selenium.
36. browser.get('https://nostarch.com')
37. The back() and forward() methods simulate pressing the Back and Forward buttons.
38. The quit() method closes the browser.
39. The find_element() method returns the first matching element, while find_elements() returns all matching elements in a list.
40. from selenium.webdriver.common.by import By
41. The By.LINK_TEXT setting matches the full link text, and By.PARTIAL_LINK_TEXT matches partial link text.
42. browser.find_element(By.NAME, 'bday')
43. browser.find_element(By.TAG_NAME, 'input')
44. intro_paragraph.get_property("innerHTML")
45. first_name_field.send_keys('Albert')
46. first_name_field.submit()
47. This code finds and clicks a link with the text “Click here”:
link = browser.find_element(By.LINK_TEXT, "Click here")
link.click()
48. Keys.HOME
49. Headless mode means running a browser without displaying the browser window on the screen.
50. python -m playwright install
51. from playwright.sync_api import sync_playwright
52. The new_page() method opens a new browser tab.
53. page.goto('https://nostarch.com') (assuming that the Page object is stored in a variable named page).
54. The close() method closes the browser.
55. The go_back() and go_forward() methods simulate pressing the Back and Forward buttons in the browser.
56. page.get_by_text('Click here')
57. page.locator('#author')
58. The inner_text() method.
59. The inner_html() method.
60. The check() and uncheck() methods will check and uncheck the checkbox (no matter its current state), while the click() method will set it to the opposite state, and the set_checked() method will set it based on the Boolean argument passed.
61. The click() method.
62. page.locator('html').press('Home')
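Answers 14 and 15 can be checked with the standard library alone. The following is a minimal sketch, assuming the URL in question is https://openweathermap.org/api; the JSON text is a made-up stand-in for the body of an API response, not real data from any service.

```python
import json
from urllib.parse import urlsplit

# Split a URL into its parts:
parts = urlsplit('https://openweathermap.org/api')
print(parts.scheme)  # The scheme comes before the ://
print(parts.netloc)  # The domain (network location)
print(parts.path)    # Everything after the domain

# json.loads() converts JSON text into Python data structures.
# This string is a hypothetical example of response text:
response_text = '{"name": "Zophie", "age": 7, "tags": ["cat", "gray"]}'
data = json.loads(response_text)
print(data['name'])
print(data['tags'][0])
```

With the Requests module, the text passed to json.loads() would typically come from the text attribute of a Response object.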
Headline Downloader
This example solution uses Requests and Beautiful Soup to download headlines from Slashdot:
import requests
import bs4
# Download the Slashdot home page:
response = requests.get('https://slashdot.org')
response.raise_for_status()
# Create a parser and find all the headline elements.
soup = bs4.BeautifulSoup(response.text, 'html.parser')
elems = soup.select('.story-title')
# Print the headlines:
for elem in elems:
    print(elem.text)
This example solution uses Playwright to download headlines from Slashdot:
from playwright.sync_api import sync_playwright
playwright = sync_playwright().start()
browser = playwright.firefox.launch(headless=True, slow_mo=50)
page = browser.new_page()
# Open the Slashdot home page:
page.goto('https://slashdot.org')
# Find all of the headline elements:
locator = page.locator('.story-title')
# Print the headlines:
for i in range(locator.count()):
    print(locator.nth(i).inner_text())
browser.close()
Image Downloader
import requests
import bs4
def download_images_from(website):
    # Download the web page:
    response = requests.get(website)
    response.raise_for_status()
    # Parse the web page for all <img> tags:
    soup = bs4.BeautifulSoup(response.text, 'html.parser')
    elems = soup.select('img')
    # Download the image for each <img> tag:
    for elem in elems:
        src = elem.attrs['src']
        if src.startswith('http'):
            img_url = src
        else:
            img_url = website + '/' + src
        print(f'Downloading {img_url}...')
        # Get the filename for the image:
        img_filename = img_url[img_url.rfind('/') + 1:]
        # Download the image:
        response = requests.get(img_url)
        response.raise_for_status()
        with open(img_filename, 'wb') as img_file:
            for chunk in response.iter_content(10000):
                img_file.write(chunk)
download_images_from('https://inventwithpython.com/')
Breadcrumb Follower
This example solution uses Requests and Beautiful Soup to follow the trail of web pages:
import requests
import bs4
# Set the initial web page to download:
base = 'https://autbor.com/breadcrumbs/'
page = 'index.html'
while True:
    # Download the web page:
    print(f'Downloading {base + page}...')
    response = requests.get(base + page)
    response.raise_for_status()
    # Parse the web page for the "Go to" text:
    soup = bs4.BeautifulSoup(response.text, 'html.parser')
    page_text = soup.select('#hello')[0].text
    if 'Go to' in page_text:
        # If "Go to" is found, get the next page name:
        page = page_text[len('Go to '):]
        print(f'Next page is {page}')
    else:
        # Otherwise, this is the last page:
        print(page_text)
        break  # Stop looping.
HTML Chessboard
import random
def get_random_chessboard():
    pieces = 'bP bN bR bB bQ bK wP wN wR wB wQ wK'.split()
    board = {}
    for board_rank in '87654321':
        for board_file in 'abcdefgh':
            if pieces == []:
                break
            if random.randint(1, 6) == 1:
                board[board_file + board_rank] = random.choice(pieces)
    return board
def write_html_chessboard(board):
    # Open an html file for writing the chessboard html
    with open('chessboard.html', 'w', encoding='utf-8') as file_obj:
        # Start the table element:
        file_obj.write('<table>\n')
        write_white_square = True  # Start with a white square.
        # Loop over all the rows ("ranks") on the board:
        for board_rank in '87654321':
            # Start the table row element:
            file_obj.write(' <tr>\n')
            # Loop over all the columns ("files") on the board:
            for board_file in 'abcdefgh':
                # Start the table data cell element:
                file_obj.write(' <td style="background: ')
                # Give it a white or black background:
                if write_white_square:
                    file_obj.write('white')
                else:
                    file_obj.write('black')
                # Switch square color:
                write_white_square = not write_white_square
                file_obj.write('; width: 60px; height: 60px;">')
                # Write the html for a chess piece image:
                square = board_file + board_rank
                if square in board:
                    file_obj.write('<center><img src="' + board[square] + '.png"></center>')
                # Finish the table data cell element:
                file_obj.write('</td>\n')
            # Finish the table row element:
            file_obj.write(' </tr>\n')
            # Switch square color for the next row:
            write_white_square = not write_white_square
        # Finish the table element:
        file_obj.write('</table>')
'''
# Here is the dictionary for a starting chess board layout:
write_html_chessboard({'a8': 'bR', 'b8': 'bN', 'c8': 'bB', 'd8': 'bQ',
'e8': 'bK', 'f8': 'bB', 'g8': 'bN', 'h8': 'bR', 'a7': 'bP', 'b7': 'bP',
'c7': 'bP', 'd7': 'bP', 'e7': 'bP', 'f7': 'bP', 'g7': 'bP', 'h7': 'bP',
'a1': 'wR', 'b1': 'wN', 'c1': 'wB', 'd1': 'wQ', 'e1': 'wK', 'f1': 'wB',
'g1': 'wN', 'h1': 'wR', 'a2': 'wP', 'b2': 'wP', 'c2': 'wP', 'd2': 'wP',
'e2': 'wP', 'f2': 'wP', 'g2': 'wP', 'h2': 'wP'})
'''
write_html_chessboard(get_random_chessboard())
Answers to the Practice Questions
1. A Workbook object.
2. The active worksheet is the worksheet that is selected and viewable when the Excel file is opened.
3. Worksheets have titles; workbooks do not. The .xlsx filename acts as the “title” for a workbook.
4. wb.sheetnames is a list of strings of all Worksheet titles.
5. An int. Even though Excel labels columns with letters, the column attribute for a Cell object is an int that begins at 1 for the first column.
6. B1, because cell A1 is cell(row=1, column=1), not cell(row=0, column=0).
7. max_row and max_column.
8. After running from openpyxl.utils import get_column_letter, the function call get_column_letter(900) returns the column letters 'AHP'.
9. After running from openpyxl.utils import column_index_from_string, the function call column_index_from_string('ZZ') returns the column number 702.
10. Nine cells: A1, A2, A3, B1, B2, B3, C1, C2, and C3.
11. list(sheet.columns)[2]
12. openpyxl.Workbook()
13. The wb.create_sheet() method creates a new, empty Worksheet object.
14. With the del operator; for example, del wb['Sheet1'].
15. It renames the Worksheet object’s title to 'New Title'.
16. Save the Workbook object with a new filename. Changes to the Workbook object do not change the original file unless you save it with the original filename.
17. The code creates a new Worksheet object, positioned at the end of the existing Worksheet objects.
18. sheet['A3'] = 'Hello'
19. It changes the value in cell A1, because the rows and columns are 1-based, not 0-based like Python list indexes.
20. from openpyxl.styles import Font
21. name, size, bold, and italic. There are other keyword arguments you can find out about by running help(Font).
22. Font(size=24, italic=True)
23. Font(name='Times New Roman', bold=True)
24. sheet['B3'].font = font
25. No. You must separately increase the row height.
26. Passing strike=True adds a horizontal strikethrough line to the text.
27. Cells with Excel formulas begin with an equal sign (=).
28. Excel formulas include SUM, IFERROR, TRIM, IF, LEN, VLOOKUP, SUBSTITUTE, and many others.
29. No. Excel formulas are different from Python functions.
30. No. Excel formulas are evaluated by the Excel application or another spreadsheet application, not by Python or OpenPyXL.
31. Pass the data_only=True keyword argument to openpyxl.load_workbook() to have OpenPyXL return the resulting calculation of an Excel formula in a cell, instead of the text of the formula itself.
32. You don’t need to do anything to have OpenPyXL return the text of the formula in a cell, instead of the resulting calculation. This is OpenPyXL’s default behavior. (Alternatively, you could pass data_only=False to openpyxl.load_workbook().)
33. sheet.row_dimensions[3].height = 100
34. sheet.column_dimensions['D'].width = 200 and sheet.row_dimensions[2].height = 200.
35. Rows do not have a width attribute; they have a height attribute only.
36. sheet.merge_cells('A10:A22')
37. sheet.unmerge_cells('A10:A22')
38. Row 1 is frozen.
39. No rows are frozen; there are no freeze panes at all.
40. sheet.freeze_panes = None
41. BarChart, LineChart, PieChart, and ScatterChart.
42. The Worksheet object, the column and row of the top-left cell, and the column and row of the bottom-right cell.
43. The Reference object and a string title.
44. The Chart object and a string of the cell coordinate location for the top-left corner of the chart.
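To see where the values in answers 8 and 9 come from, here is a small sketch of the conversion between column numbers and column letters written in plain Python. The column_letter() and column_index() functions are hypothetical stand-ins written for this example; they are not OpenPyXL's own implementation, only one way to do the same arithmetic.

```python
def column_letter(index):
    """Convert a 1-based column number to Excel-style column letters."""
    letters = ''
    while index > 0:
        # Subtract 1 because there is no "zero" letter: A is 1 and Z is 26.
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord('A') + remainder) + letters
    return letters

def column_index(letters):
    """Convert Excel-style column letters to a 1-based column number."""
    index = 0
    for char in letters.upper():
        index = index * 26 + (ord(char) - ord('A') + 1)
    return index

print(column_letter(900))
print(column_index('ZZ'))
```

The two functions are inverses of each other, so column_index(column_letter(n)) gives back n for any positive integer n.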
Search Term Finder
import openpyxl, os
from openpyxl.utils import get_column_letter, column_index_from_string
def find_in_excel(search_text):
    # Lowercase the search text to do a case-insensitive search:
    search_text = search_text.lower()
    # The found results begin as a blank dictionary:
    results = {}
    # Loop through all files in the current working directory:
    for filename in os.listdir('.'):
        if not filename.endswith('.xlsx'):
            # Skip non-Excel files:
            continue
        wb = openpyxl.load_workbook(filename, data_only=True)
        sheet = wb.active
        # Loop through every row and column in the worksheet:
        for row in sheet['A1:' + get_column_letter(sheet.max_column) + str(sheet.max_row)]:
            for cell in row:
                # Check for a match:
                if search_text in str(cell.value).lower():
                    # Append the match to results:
                    results.setdefault(filename, [])
                    results[filename].append(cell.coordinate)
    return results
print(find_in_excel('name'))  # Test this function.
Excel Home Folder Report
import openpyxl, os
from pathlib import Path
def get_home_folder_size():
    filenames_and_sizes = []
    # Loop over everything in the home folder:
    for filename in os.listdir(Path.home()):
        absolute_file_path = Path.home() / filename
        # Skip folders/directories:
        if absolute_file_path.is_dir():
            continue
        # Get file size:
        try:
            file_size = absolute_file_path.stat().st_size
        except OSError:
            # Skip files with permissions errors:
            continue
        # Record filename and size:
        filenames_and_sizes.append((filename, file_size))
    return filenames_and_sizes
def make_excel_report(filenames_and_sizes):
    # Create a new Workbook object:
    wb = openpyxl.Workbook()
    sheet = wb.active
    row = 1  # Start writing data at row 1.
    for filename_and_size in filenames_and_sizes:
        sheet.cell(row=row, column=1).value = filename_and_size[0]
        sheet.cell(row=row, column=2).value = filename_and_size[1]
        row += 1  # Increment to the next row.
    # Save the Workbook as an Excel file:
    wb.save('homeFilesReport.xlsx')
# Uncomment to print the filenames and sizes:
#print(get_home_folder_size())
make_excel_report(get_home_folder_size())
Answers to the Practice Questions
1. Yes. You need a Google account to use Google Sheets, but Google accounts are free. You may need to submit a phone number to sign up for an account. Google also offers paid tiers that enable you to make heavy, commercial-level use of its APIs.
2. No. Your Google account password should never appear in the source code of your Python programs. This is a security risk, as you may accidentally share these files, and anyone can read them. Don’t rely on “clever” obfuscation techniques like writing the password backward or using base64 encoding; these do not prevent people from recovering the password.
3. You can download the credentials JSON file for your Google account from the Google Cloud Console at https://console.cloud.google.com.
4. The Google Sheets API and the Google Drive API.
5. After you run import ezsheets for the first time, token-sheets.pickle and token-drive.pickle files are created.
6. You should revoke the credentials from the Google Cloud Console website.
7. No. The project name and ID are not visible.
8. The ezsheets.Spreadsheet() function creates a new spreadsheet.
9. The ezsheets.upload() function uploads a .xlsx spreadsheet to Google Sheets.
10. Spreadsheet objects have titles as a string in their title attribute.
11. If you’re logged in to a Google account with the appropriate permissions, you’ll load the Google Sheets web app and see the spreadsheet.
12. The downloadAsExcel() method of Spreadsheet objects converts a spreadsheet to an Excel file and downloads it.
13. The six spreadsheet formats in which you can download your Google Sheets spreadsheet are Excel, OpenOffice, CSV, TSV, PDF, and a ZIP of HTML files.
14. The ezsheets.listSpreadsheets() function lists all spreadsheets in your Google account.
15. No. You must also call the delete() method with permanent=True to permanently delete it.
16. Call the refresh() method to update the local Spreadsheet object in your program.
17. The Sheet() method for Spreadsheet objects creates a new, blank Sheet object.
18. The sheets attribute accesses the Sheet objects of a Spreadsheet object.
19. Yes. Sheet objects have a title, stored as a string in their title attribute.
20. sheet['C5'] = 'Hello'
21. It returns a string, even if the value is a number, such as 30.
22. Call the copyTo() method for Sheet objects to copy the entire sheet to a different Google Sheets spreadsheet.
23. 'AHP'
24. 702
25. It returns 'B3'.
26. It returns (1, 2).
27. The getRow() and getColumn() methods for Sheet objects can return an entire row or an entire column of cells at once.
28. Google Forms can add data from submitted forms to a spreadsheet in Google Sheets.
29. Call the refresh() method to update the local Spreadsheet object in your program.
30. Yes. As of this writing, the limit is 250 new spreadsheets per day.
31. Exceeding the activity limit of the Google Sheets API results in EZSheets raising the googleapiclient.errors.HttpError "Quota exceeded for quota group" exception.
32. Yes. You can monitor your API usage on the Google Cloud Console web page, in the APIs & Services section.
Uploading All Files in a Folder
import ezsheets, os
def upload_all_spreadsheets():
    # Loop through all files in the current working directory:
    for filename in os.listdir('.'):
        if not (filename.endswith('.xlsx') or filename.endswith('.csv')):
            # Skip non-Excel and non-CSV files:
            continue
        # Upload the spreadsheet file:
        print(f'Uploading {filename}...')
        ezsheets.upload(filename)
upload_all_spreadsheets()
Google Sheets Home Folder Report
import ezsheets, os
from pathlib import Path
def get_home_folder_size():
    filenames_and_sizes = []
    # Loop over everything in the home folder:
    for filename in os.listdir(Path.home()):
        absolute_file_path = Path.home() / filename
        # Skip folders/directories:
        if absolute_file_path.is_dir():
            continue
        # Get file size:
        try:
            file_size = absolute_file_path.stat().st_size
        except OSError:
            # Skip files with permissions errors:
            continue
        # Record filename and size:
        filenames_and_sizes.append((filename, file_size))
    return filenames_and_sizes
def make_google_sheets_report(filenames_and_sizes):
    # Create a new spreadsheet:
    ss = ezsheets.Spreadsheet()
    ss.title = 'Home Files Report'
    sheet = ss.sheets[0]
    row = 1  # Start writing data at row 1.
    for filename_and_size in filenames_and_sizes:
        sheet['A' + str(row)] = filename_and_size[0]
        sheet['B' + str(row)] = filename_and_size[1]
        row += 1  # Increment to the next row.
make_google_sheets_report(get_home_folder_size())
Answers to the Practice Questions
1. There are several common pronunciations, including “sequel-ite,” “es-cue-lite,” and “es-cue-el-ite”; any of these answers is acceptable.
2. Tables are to databases what sheets are to spreadsheets.
3. A database can have one or more tables.
4. A unique, unchanging identifier for a row in a table.
5. In SQLite, the primary key column is often named rowid.
6. No. Records never change their primary key value.
7. No. Database tables are best suited for spreadsheets that have a variable number of rows of data and repeated columns.
8. No. Using SQLite doesn’t require running separate database software.
9. No. SQLite doesn’t strictly enforce the data types of its columns.
10. SQLite does not have built-in permissions or roles.
11. No. SQLite is public domain software and free to use for any purpose.
12. Import sqlite3 to use SQLite in Python.
13. conn = sqlite3.connect('example.db')
14. It enables autocommit mode, so your queries will be immediately committed.
15. This query creates a new table in the database, but only if the table doesn’t already exist.
16. The SQLite data types NULL, INTEGER, REAL, TEXT, and BLOB are analogous to Python’s NoneType, int, float, str, and bytes data types.
17. SQLite’s type affinity feature will convert the TEXT value '42' to the INTEGER value 42.
18. Because 'Hello' cannot be converted to an INTEGER, SQLite inserts it as a TEXT value even though the column has INTEGER affinity. There is no error.
19. You can enable strict mode on a per-table basis by including the STRICT keyword at the end of the CREATE TABLE statement.
20. No. SQLite doesn’t have a data type for times and dates. Use the TEXT type instead.
21. No. SQLite doesn’t have a data type for Boolean values. Use the INT type instead, with 0 meaning false and 1 meaning true.
22. conn.execute('PRAGMA TABLE_INFO (cats)').fetchall()
23. conn.execute('SELECT name FROM sqlite_schema WHERE type="table"').fetchall()
24. An INSERT query creates data.
25. A SELECT query reads data.
26. An UPDATE query updates data.
27. A DELETE query deletes data.
28. The query is missing the VALUES keyword. It should read: 'INSERT INTO cats VALUES ("Zophie", "2021-01-24", "black", 5.6)'.
29. No. SQLite’s INSERT queries are atomic: they either succeed in inserting all of the data or fail and insert none of it.
30. No. SQLite is isolated, and transactions cannot affect other transactions.
31. Using ? placeholders instead of f-strings prevents SQL injection attacks.
32. The * means “all columns except for rowid.”
33. The SELECT query is missing the columns in the table to select.
34. A WHERE clause allows a SELECT query to filter the data it returns by having it match the search parameters in the WHERE clause.
35. The 10 operators you can use in a WHERE clause are =, !=, <, >, <=, >=, AND, OR, NOT, and LIKE. Note that, in SQLite, the equality operator is = and not ==.
36. The LIKE operator does pattern matching, using % in a manner similar to * in glob pattern matching.
37. Add an ORDER BY clause to the query so that it looks as follows: 'SELECT rowid, name FROM cats ORDER BY rowid'.
38. The LIMIT clause limits the number of rows returned by a query.
39. A column index speeds up the process of reading data.
40. A column index slows down the process of inserting or updating data.
41. The Python code conn.execute("SELECT name FROM sqlite_schema WHERE type = 'index' AND tbl_name = 'cats'").fetchall() returns a list of all the indexes for a table named cats.
42. This query will update the fur column in every row of the table to "black", which is unlikely to have been the user’s intention.
43. This query will delete every row in the table, which is unlikely to have been the user’s intention.
44. The second instruction returns an empty list, because no record with a rowid of 42 exists after that record was deleted.
45. Yes. This code raises an exception with an OperationalError error message.
46. No. It just returns an empty list.
47. It begins a transaction.
48. All of the data will be inserted into the database when you finish the transaction. Until then, none of the data is inserted.
49. conn.commit()
50. conn.rollback()
51. No. You can’t roll back a transaction after it has been committed.
52. Copying the database file is the easiest way to make a backup if no program is currently connected to it.
53. The backup() method of Connection objects.
54. for line in conn.iterdump(): print(line) will print the text of SQLite queries needed to re-create a database and its data.
55. 'ALTER TABLE spam RENAME TO eggs'
56. 'ALTER TABLE spam RENAME COLUMN foo TO bar'
57. 'ALTER TABLE spam ADD COLUMN price INTEGER DEFAULT 42'
58. 'DROP TABLE spam'
59. You can assume the values in the cat_id column match the rowid values in a table named cats or cat, but this is just a database convention; the column name can be anything.
60. The orders table should have a foreign key column named customer_id, as each customer could have made multiple orders.
61. FOREIGN KEY(cat_id) REFERENCES cats(rowid)
62. In-memory SQLite databases are faster than disk-based SQLite database files.
63. The data in an in-memory SQLite database is lost when the program terminates if it isn’t saved to a file-based database.
64. memory_db_conn.backup(file_db_conn)
65. You can’t recover the data.
66. The data is saved, because the backup() method call in the exception handler copies the database to the cats.db file.
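The transaction and backup answers above can be tried out with a short sketch. It assumes a throwaway in-memory database and a hypothetical cats table with name and fur columns; neither comes from the original questions.

```python
import sqlite3

# Hypothetical in-memory database and table, used only for this demonstration.
conn = sqlite3.connect(':memory:', isolation_level=None)
conn.execute('CREATE TABLE cats (name TEXT, fur TEXT)')

# A rolled-back transaction leaves no trace in the table:
conn.execute('BEGIN')
conn.execute('INSERT INTO cats VALUES (?, ?)', ['Zophie', 'gray'])
conn.rollback()
rows_after_rollback = conn.execute('SELECT * FROM cats').fetchall()

# A committed transaction makes the inserted row permanent:
conn.execute('BEGIN')
conn.execute('INSERT INTO cats VALUES (?, ?)', ['Zophie', 'gray'])
conn.commit()
rows_after_commit = conn.execute('SELECT * FROM cats').fetchall()

# backup() copies the whole database into another connection. The target
# here is a second in-memory database; a file-based connection works the same way.
backup_conn = sqlite3.connect(':memory:', isolation_level=None)
conn.backup(backup_conn)
rows_in_backup = backup_conn.execute('SELECT * FROM cats').fetchall()

print(rows_after_rollback)
print(rows_after_commit)
print(rows_in_backup)
```

Passing a connection to a file such as cats.db as the backup target would save the in-memory data to disk instead.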
Monitoring Free Disk Space Levels
import sqlite3, psutil, time, datetime
# Set up database connection and table:
conn = sqlite3.connect('monitorFreeSpace.db', isolation_level=None)
conn.execute('CREATE TABLE IF NOT EXISTS freespace (free INT, timestamp TEXT) STRICT')
print('Monitoring disk free space. Press Ctrl-C to quit.')
try:
    while True:
        free = psutil.disk_usage('C:\\').free  # Windows
        #free = psutil.disk_usage('/').free  # macOS and Linux
        timestamp = str(datetime.datetime.now())
        conn.execute('INSERT INTO freespace VALUES (?, ?)', [free, timestamp])
        print(free, timestamp)
        time.sleep(1)
except KeyboardInterrupt:
    pass
Database-to-String Converter
import sqlite3
def db_to_txt(db_filename):
    conn = sqlite3.connect(db_filename, isolation_level=None)
    # Get the name of the first table in the database:
    table_name = conn.execute("SELECT name FROM sqlite_schema WHERE type = 'table'").fetchall()[0][0]
    # Get the names of all the columns in the table:
    column_names = ['rowid']
    for column in conn.execute(f'PRAGMA TABLE_INFO({table_name})').fetchall():
        column_names.append(column[1])
    #print(column_names) # Print column names.
    with open(db_filename + '.txt', 'w', encoding='utf-8') as text_file:
        text_file.write(','.join(column_names) + '\n')
        # Read in the table data (including the rowid, to match the
        # header line) and write it to the text file:
        for row in conn.execute(f'SELECT rowid, * FROM {table_name}'):
            # Convert all the items in the row to strings:
            row_as_strings = []
            for item in row:
                row_as_strings.append(str(item))
            # Write data to text file:
            text_file.write(','.join(row_as_strings) + '\n')
            #print(row) # Print row data.
db_to_txt('example.db')
Answers to the Practice Questions
1. Pass a string filename to the pypdf.PdfReader() function to open a PDF file.
2. The pages attribute of PdfReader objects stores the individual Page objects.
3. Here is a function that returns the number of pages in a PDF:
def get_num_pages(filename):
    reader = pypdf.PdfReader(filename)
    return len(reader.pages)
4. The extract_text() method of Page objects extracts text from a PDF.
5. The code reader.pages[1].extract_text() extracts the text of page 2 of a PDF file.
6. The pdfminer.high_level.extract_text() function extracts text from a PDF when you pass it the PDF’s filename as a string.
7. LLMs such as ChatGPT can automatically clean up the extracted text strings from a PDF.
8. The pypdf.PdfWriter() function lets you create new PDF files.
9. No. The PyPDF package cannot write arbitrary text to a PDF file the way Python can write arbitrary text to a .txt file.
10. No. You cannot rotate a page by 45 degrees using pypdf or pdfminer. These modules can rotate pages in 90-degree increments only.
11. The following code rotates all pages in example.pdf by 90 degrees and saves them in rotated.pdf. Your code may differ but can be considered correct as long as it produces a file named rotated.pdf with the correctly rotated pages:
import pypdf
writer = pypdf.PdfWriter()
writer.append('example.pdf')
for i in range(len(writer.pages)):
    writer.pages[i].rotate(90)
with open('rotated.pdf', 'wb') as file:
    writer.write(file)
12. The merge_page() method of Page objects allows you to add a watermark to a page.
13. The add_blank_page() method adds a blank page to the end of a PDF.
14. The code writer.insert_blank_page(index=2) inserts a blank page as the new page 3.
15. Use the 'AES-256' encryption algorithm to encrypt your PDF files.
16. The password 'elephant' is a poor choice because it is an English word and can easily be brute-forced.
17. PDF files support a user password for viewing the PDF and an owner password for printing, commenting, extracting text, and taking other actions.
18. The code doc = docx.Document('demo.docx') opens a file named demo.docx and stores the Document object in a variable named doc.
19. The code doc.paragraphs[1].text contains the string value of the text in the second paragraph.
20. The code len(doc.paragraphs) returns the number of paragraphs in a Document object.
21. True. Document objects contain Paragraph objects, which in turn contain Run objects.
22. False. Setting the Paragraph object’s bold and italic attributes to True will set the entire text of the paragraph to bold and italic. To italicize some text in a paragraph and bold some other text, you must set the bold and italic attributes of the Run objects in the Paragraph object to True.
23. Paragraph objects and Run objects have text attributes, while Document objects do not.
24. You can set bold, italic, strike, and other attributes to True (meaning always enabled), False (meaning always disabled), or None (meaning enabled or disabled based on the Run object’s existing style).
25. The code doc.add_paragraph('Hello, world!', 'Title') adds a paragraph with the text “Hello, world!” in the built-in Title style.
26. Document objects have the add_paragraph() method.
27. Paragraph objects have the add_run() method.
28. A blank .docx document has one Paragraph object that contains zero Run objects.
29. Here is the code for creating a millionstars.docx file with exactly one million * asterisk characters:
import docx
doc = docx.Document()
doc.add_paragraph('*' * 1000000)
doc.save('millionstars.docx')
30. Here is the code for creating a countdown.docx file that counts down from 1,000 to 0:
import docx
doc = docx.Document()
for number in range(1000, -1, -1):
    doc.add_paragraph(str(number))
doc.save('countdown.docx')
PDF Document Word Counter
import pypdf
def pdf_word_count(pdf_filename):
    reader = pypdf.PdfReader(pdf_filename)
    text = ''
    for page in reader.pages:
        text += page.extract_text()
    return len(text.split())
print(pdf_word_count('example.pdf'))
Searching All PDFs in a Folder
import pypdf, os
def search_all_PDFs(text, folder='.', case_sensitive=False):
    matches = []
    for filename in os.listdir(folder):
        if not filename.lower().endswith('.pdf'):
            # Skip non-PDF files:
            continue
        # Join the folder and filename so that folders other than the
        # current working directory also work:
        reader = pypdf.PdfReader(os.path.join(folder, filename))
        # Number the pages starting from 1, as a PDF viewer would:
        for page_number, page_obj in enumerate(reader.pages, start=1):
            page_text = page_obj.extract_text()
            if not case_sensitive and text.lower() in page_text.lower():
                matches.append(f'In {filename} on page {page_number}')
            elif case_sensitive and text in page_text:
                matches.append(f'In {filename} on page {page_number}')
    return matches
print('\n'.join(search_all_PDFs('hello', '.')))
Word Document Logger for Guess the Number
# This is a guess the number game.
import random, docx
from pathlib import Path
# Open the existing or create a new Word document:
if Path('guessWordLog.docx').exists():
    word_log = docx.Document('guessWordLog.docx')
else:
    word_log = docx.Document()
secret_number = random.randint(1, 20)
print('I am thinking of a number between 1 and 20.')
word_log.add_paragraph('I am thinking of a number between 1 and 20.')
# Ask the player to guess 6 times.
for guesses_taken in range(1, 7):
    guess = int(input('Take a guess. '))
    word_log.add_paragraph(f'Take a guess. {guess}')
    if guess < secret_number:
        print('Your guess is too low.')
        word_log.add_paragraph('Your guess is too low.')
    elif guess > secret_number:
        print('Your guess is too high.')
        word_log.add_paragraph('Your guess is too high.')
    else:
        break  # This condition is the correct guess!
# Show the game results:
if guess == secret_number:
    print('Good job! You got it in ' + str(guesses_taken) + ' guesses!')
    word_log.add_paragraph('Good job! You got it in ' + str(guesses_taken) + ' guesses!')
else:
    print('Nope. The number was ' + str(secret_number))
    word_log.add_paragraph('Nope. The number was ' + str(secret_number))
# Save the Word log file:
word_log.save('guessWordLog.docx')
Converting Text Files to Word Documents
import docx, os
from pathlib import Path
def str_to_docx(text, word_filename):
    doc = docx.Document()
    for line in text.splitlines():
        doc.add_paragraph(line)
    doc.save(word_filename)
for filename in os.listdir('.'):
    if not filename.endswith('.txt'):
        # Skip non-text files:
        continue
    print(f'Converting {filename}...')
    with open(filename, encoding='utf-8') as text_file:
        # Get the contents of the text file:
        content = text_file.read()
    # Create a Word document from the text:
    str_to_docx(content, filename + '.docx')
Bolding Words in a Word Document
import docx
def bold_words(filename, word):
    # Open the original document and the new document:
    original_doc = docx.Document(filename)
    bold_doc = docx.Document()
    # Loop through all the paragraphs in the original document:
    for original_paragraph in original_doc.paragraphs:
        text = original_paragraph.text
        # Create the paragraph for the new document:
        para_with_bold_words = bold_doc.add_paragraph()
        while word in text:
            # Get the text that appears before the word:
            before_word = text[0:text.find(word)]
            # Add the text before the word as an unbold run:
            para_with_bold_words.add_run(before_word)
            # Add the bold word as a bold run:
            run_with_bold_words = para_with_bold_words.add_run(word)
            run_with_bold_words.bold = True
            # Remove this from text and continue looping:
            text = text[text.find(word) + len(word):]
        # Add any remaining text to the new paragraph:
        para_with_bold_words.add_run(text)
    # After processing all the paragraphs, save the new document:
    bold_doc.save(filename + '.bold.docx')
bold_words('demo.docx', 'hello')
Answers to the Practice Questions
1. The CSV format most closely resembles a spreadsheet.
2. CSV stands for comma-separated values. It’s called that, oddly enough, because the values in the file are separated by commas.
3. Yes. CSV files are plaintext files, and you can view them in a text editor such as Notepad or TextEdit.
4. Text strings are the only data type CSV files support.
5. The main advantage of CSV files compared to Excel spreadsheets is their simplicity.
6. False. CSV files can include commas in their data, so long as the values that contain them are enclosed in double quotes. The csv module handles this quoting for you automatically.
7. You must pass a File object (like the one returned by open()) to csv.reader() and csv.writer(), not a filename string.
8. Pass the reader object to the list() function, as in list(reader_obj).
9. Use a for loop to iterate over the reader object, as in for row in reader_obj:.
10. example_data[6][1] accesses the data in the seventh row and second column.
11. The writerow() method accepts a list of strings.
12. A DictReader object represents each row as a dictionary.
13. The keys in the dictionaries of a DictReader object come from the first row of the CSV file.
14. The writerow() method of DictWriter objects accepts a dictionary.
15. The File object passed to csv.writer() was opened in read mode instead of write mode.
16. TSV stands for tab-separated values.
17. Pass the delimiter='\t' keyword argument to csv.reader() or csv.writer().
18. In other programming languages, dictionary-like data structures are called mappings, hash maps, hash tables, or associative arrays.
19. In other programming languages, list-like data structures are called arrays.
20. Plaintext formats like JSON and XML have the benefit of being easy to read by humans using text editors.
21. JSON stands for JavaScript Object Notation.
22. XML stands for eXtensible Markup Language.
23. JSON resembles Python syntax more closely than XML.
24. ['cat', 'dog',] is Python syntax and ["cat", "dog"] is JSON syntax, because trailing commas aren’t allowed in JSON, and JSON requires double quotes for strings.
25. [True, False] is Python syntax because Python capitalizes Boolean values, while [true, false] is JSON syntax because JavaScript uses lowercase Boolean values.
26. APIs usually deliver JSON data instead of XML data.
27. The s stands for string. That’s because the json.loads() function loads a string of JSON-formatted text, and the json.dumps() function returns a string of JSON-formatted text representing the Python dictionary or list passed to it.
28. json.dumps({'temperature': 72})
29. XML syntax resembles HTML syntax.
30. </spam> is the closing XML tag for <spam>.
31. The XML syntax is invalid because the </person> and </name> closing tags are in the wrong order.
32. The JSON syntax is the same as the Python syntax in this case: {"address": {"street": "100 Larkin St.", "city": "San Francisco", "zip": "94102"}}.
33. The XML syntax is <address><street>100 Larkin St.</street><city>San Francisco</city><zip>94102</zip></address>.
34. The XML attribute names are street, city, and zip.
35. The root element is the first element in an XML document that contains all other elements.
36. DOM stands for Document Object Model.
37. SAX stands for Simple API for XML.
38. SAX is the approach of reading XML documents one element at a time.
39. The benefit of the DOM approach of reading XML documents entirely into memory is convenience; you can access any XML element at any time.
40. The ET.fromstring() function returns the Element object for the root element in the XML string passed to it.
41. The tostring() function doesn’t take string arguments like '<person>Albert</person>'. Instead, you must pass it an ET.Element object: ET.tostring(ET.Element('person'), encoding='utf-8').
42. ET.tostring(person, encoding='utf-8').decode('utf-8') decodes the bytes object and returns a string.
43. This code iterates over every element in the XML, printing the element name and the text in between the element’s opening and closing tags.
44. This code iterates over every <number> element in the XML, printing the element name ('number') and the text in between the <number> and </number> tags.
45. The text attribute is set to None.
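Several of the CSV and JSON answers above can be checked with a short experiment. The following is a minimal sketch that uses in-memory io.StringIO objects in place of real files; the sample values ('Smith, Alice', 42, and the sunny key) are made up for illustration.

```python
import csv, io, json

# Write one row whose first value contains a comma to an in-memory "file":
buffer = io.StringIO(newline='')
csv.writer(buffer).writerow(['Smith, Alice', 42])
csv_text = buffer.getvalue()

# Reading the text back yields strings only, so the integer 42 returns as '42':
rows = list(csv.reader(io.StringIO(csv_text, newline='')))

# json.dumps() returns a string of JSON text, and json.loads() parses
# a string of JSON text back into Python values:
json_text = json.dumps({'temperature': 72, 'sunny': True})
data = json.loads(json_text)

print(repr(csv_text))
print(rows)
print(json_text)
print(data)
```

The printed CSV text shows the comma-containing value wrapped in double quotes, and the printed JSON text shows Python's True written as lowercase true.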
Fizz Buzz (CSV)
import csv
csv_file = csv.writer(open('fizzBuzz.csv', 'w', newline=''))
row = []
for number in range(1, 10001):
    # Figure out what the Fizz Buzz entry for this number is:
    if number % 15 == 0:
        row.append('Fizz Buzz')
    elif number % 3 == 0:
        row.append('Fizz')
    elif number % 5 == 0:
        row.append('Buzz')
    else:
        row.append(number)
    if len(row) == 10:
        # Once we have 10 entries, write them to a row:
        csv_file.writerow(row)
        row = []  # Reset row to empty.
Guess the Number Statistics (CSV)
# This is a guess the number game.
import random, csv
from pathlib import Path
if not Path('guessStats.csv').exists():
    # The guessStats.csv file doesn't exist, so create it:
    csv_file = csv.writer(open('guessStats.csv', 'w', newline=''))
    csv_file.writerow(['Secret Number', 'Won', 'Attempts', 'Guess 1',
                       'Guess 2', 'Guess 3', 'Guess 4', 'Guess 5', 'Guess 6'])
else:
    # The guessStats.csv file already exists, so just open it.
    csv_file = csv.writer(open('guessStats.csv', 'a', newline=''))
# Record all the guesses in this list:
all_guesses = []
secret_number = random.randint(1, 20)
print('I am thinking of a number between 1 and 20.')
# Ask the player to guess 6 times.
for guesses_taken in range(1, 7):
    print('Take a guess.')
    guess = int(input('>'))
    # Record the guess:
    all_guesses.append(guess)
    if guess < secret_number:
        print('Your guess is too low.')
    elif guess > secret_number:
        print('Your guess is too high.')
    else:
        break  # This condition is the correct guess!
if guess == secret_number:
    print('Good job! You got it in ' + str(guesses_taken) + ' guesses!')
    # Create the list of values we'll store in the CSV file:
    csv_row = [secret_number, True, guesses_taken] + all_guesses
    csv_file.writerow(csv_row)
else:
    print('Nope. The number was ' + str(secret_number))
    csv_file.writerow([secret_number, False, guesses_taken] + all_guesses)
Guess the Number Statistics (JSON)
# This is a guess the number game.
import random, json
from pathlib import Path
if not Path('guessStats.json').exists():
    # Start with a blank stats list:
    stats = []
else:
    # guessStats.json already exists, so read in its data:
    with open('guessStats.json', encoding='utf-8') as file_obj:
        stats = json.loads(file_obj.read())
secret_number = random.randint(1, 20)
print('I am thinking of a number between 1 and 20.')
game_stat = {'Secret Number': secret_number, 'Guesses': []}
# Ask the player to guess 6 times.
for guesses_taken in range(1, 7):
    print('Take a guess.')
    guess = int(input('>'))
    # Record the guess:
    game_stat['Guesses'].append(guess)
    if guess < secret_number:
        print('Your guess is too low.')
    elif guess > secret_number:
        print('Your guess is too high.')
    else:
        break  # This condition is the correct guess!
if guess == secret_number:
    print('Good job! You got it in ' + str(guesses_taken) + ' guesses!')
    game_stat['Won'] = True
else:
    print('Nope. The number was ' + str(secret_number))
    game_stat['Won'] = False
stats.append(game_stat)
with open('guessStats.json', 'w', encoding='utf-8') as file_obj:
    file_obj.write(json.dumps(stats))
Guess the Number Statistics (XML)
# This is a guess the number game.
import random
import xml.etree.ElementTree as ET
from pathlib import Path
if not Path('guessStats.xml').exists():
    # Create a new guessStats.xml file:
    stats_element = ET.Element('stats')
else:
    # guessStats.xml already exists, so read in its data:
    with open('guessStats.xml', encoding='utf-8') as file_obj:
        stats_element = ET.fromstring(file_obj.read())
game_element = ET.SubElement(stats_element, 'game')
secret_number = random.randint(1, 20)
print('I am thinking of a number between 1 and 20.')
# Record the secret number:
game_element.set('secret_number', str(secret_number))
# Ask the player to guess 6 times.
for guesses_taken in range(1, 7):
    print('Take a guess.')
    guess = int(input('>'))
    # Record the guess:
    guess_element = ET.SubElement(game_element, 'guess')
    guess_element.text = str(guess)
    if guess < secret_number:
        print('Your guess is too low.')
    elif guess > secret_number:
        print('Your guess is too high.')
    else:
        break  # This condition is the correct guess!
if guess == secret_number:
    print('Good job! You got it in ' + str(guesses_taken) + ' guesses!')
    game_element.set('won', 'True')
else:
    print('Nope. The number was ' + str(secret_number))
    game_element.set('won', 'False')
# Write the root <stats> element out to guessStats.xml:
with open('guessStats.xml', 'w', encoding='utf-8') as file_obj:
    file_obj.write(ET.tostring(stats_element, encoding='utf-8').decode('utf-8'))
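To see how statistics stored in this XML layout could be read back, here is a minimal sketch. It assumes a hand-written sample string in the same shape the program above produces (a <stats> root, <game> elements with secret_number and won attributes, and <guess> child elements); the sample values are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data in the same shape as guessStats.xml:
xml_text = ('<stats>'
            '<game secret_number="7" won="True">'
            '<guess>10</guess><guess>5</guess><guess>7</guess>'
            '</game>'
            '<game secret_number="12" won="False">'
            '<guess>1</guess>'
            '</game>'
            '</stats>')

stats_element = ET.fromstring(xml_text)
summary = []
for game_element in stats_element.iter('game'):
    # Attribute values and element text are always strings in XML,
    # so convert them back to Python integers and Booleans:
    secret = int(game_element.get('secret_number'))
    won = game_element.get('won') == 'True'
    guesses = [int(guess_element.text) for guess_element in game_element.iter('guess')]
    summary.append((secret, won, guesses))

print(summary)
```

Because everything in XML is text, the conversion back to integers and Booleans has to be done by hand, unlike in the JSON version.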
Answers to the Practice Questions
1. The Unix epoch timestamp is in the UTC time zone.
2. The time.ctime() function returns a string of the current time, such as 'Tue Mar 17 11:05:45 2026'.
3. The time.time() function returns a float of the current time, such as 1773813875.3518236.
4. time is a module, and time.time() is a function call to the time() function inside the time module. (It’s also correct to refer to the time.time() function.)
5. The expression time.time() + 10 evaluates to a timestamp 10 seconds after the current time.
6. The expression time.ctime(time.time() - 10) evaluates to a string of the time 10 seconds ago.
7. Profiling code means measuring how long it takes to run. This can be helpful for finding performance bottlenecks or estimating how long processing a larger amount of data might take.
8. Calling time.sleep(0.5) will make your program pause its execution for one-half of a second.
9. The code round(time.time()) returns the current time rounded to the nearest second.
10. The correct function call is datetime.datetime.now(), not datetime.now(). This is a bit confusing: The first datetime is a module, and the second is a data type in the datetime module, also named datetime.
11. The expression current_time.year evaluates to an integer of the current year.
12. January 1, 1970, at 12:00 am. This timestamp is also known as the Unix epoch.
13. datetime.datetime objects represent a moment in time.
14. datetime.timedelta objects represent a duration of time.
15. Because different months and years contain different numbers of days, hours, minutes, and seconds.
16. datetime.timedelta(days=1000)
17. datetime.timedelta(days=1000) * 2 or datetime.timedelta(days=1000) + datetime.timedelta(days=1000)
18. Adding a datetime.datetime object to a datetime.timedelta object evaluates to a datetime.datetime object.
19. Adding a datetime.timedelta object to a datetime.timedelta object evaluates to a datetime.timedelta object.
20. The expression timedelta(seconds=15) - timedelta(seconds=5) == timedelta(seconds=10) evaluates to True.
21. You cannot add a timedelta object to an integer, so this expression raises an exception.
22. The expression datetime.timedelta(seconds=60) == datetime.timedelta(minutes=1) evaluates to True.
23. Format.
24. Parse.
25. strptime() takes a human-readable string of a date and time along with a string to parse it, then returns a datetime.datetime object.
26. strftime() returns a human-readable string of a date and time.
27. The code datetime.datetime.strptime('26', '%y') returns a datetime object with the year 2026.
28. The code datetime.datetime.strptime('76', '%y') returns a datetime object with the year 1976.
29. The code datetime.datetime.strptime("October of '26", "%B of '%y") returns datetime.datetime(2026, 10, 1, 0, 0).
30. The code datetime.timedelta(days=0, hours=0, minutes=1, seconds=5).total_seconds() returns 65.0.
31. A process is a running instance of a program. If you have a calculator program on your computer running multiple times, each of those instances is a separate process of the same program.
32. The subprocess.run() function takes a list of strings representing the program to run and its command line arguments.
33. Unlike on Windows and Linux, on macOS you must run the open program followed by the program’s filepath; you can’t specify the filepath alone.
34. The subprocess.Popen() function launches a program and then immediately returns, while the subprocess.run() function launches a program and doesn’t return until the launched program quits.
35. The poll() method returns None if the launched program is still running.
36. The poll() method returns the integer exit code if the launched program has quit.
37. A program that quits without error has an exit code of 0.
38. The kill() method quits the launched program.
39. The wait() method blocks and won’t return until the launched program has quit.
40. Run the 'start' program and pass it 'hello.txt' as a command line argument (and pass shell=True): subprocess.run(['start', 'hello.txt'], shell=True).
41. Run the 'open' program and pass it 'hello.txt' as a command line argument: subprocess.run(['open', 'hello.txt']).
42. subprocess.run(['python', 'spam.py']) (or 'python3' on macOS and Linux)
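The difference between subprocess.Popen() and subprocess.run() described in the answers above can be observed with a small experiment. This sketch assumes it is acceptable to launch a second Python interpreter as the child program; sys.executable is used so that the example doesn't depend on a 'python' or 'python3' command being available.

```python
import subprocess, sys

# Popen() returns immediately while the child program is still sleeping:
proc = subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(0.5)'])
status_while_running = proc.poll()  # None means "still running".

# wait() blocks until the child program quits, then returns its exit code:
exit_code = proc.wait()
status_after_exit = proc.poll()  # Now an integer exit code.

# run() doesn't return until the launched program has quit. This child
# deliberately quits with the nonzero exit code 3:
completed = subprocess.run([sys.executable, '-c', 'import sys; sys.exit(3)'])

print(status_while_running, exit_code, status_after_exit, completed.returncode)
```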
Alarm with Sound
import subprocess, datetime, time
from pathlib import Path
def alarm_with_audio(alarm_time, audio_filename):
    # Check to make sure that the audio file exists:
    if not Path(audio_filename).exists():
        raise Exception('Cannot find file ' + str(audio_filename))
    while datetime.datetime.now() < alarm_time:
        time.sleep(0.1)  # Pause a little before checking again.
    # Windows version to play audio file:
    subprocess.run(['start', audio_filename], shell=True)
    # macOS/Linux version to play audio file:
    #subprocess.run(['open', audio_filename])
# Set the alarm for 5 seconds from now:
alarm_with_audio(datetime.datetime.now() + datetime.timedelta(seconds=5), 'hello.wav')
Image Opener
import subprocess, os
from pathlib import Path
def open_images_by_name(image_folder, name_match):
    for filename in os.listdir(image_folder):
        # The parentheses let the condition span two lines:
        if (filename.endswith('.jpg') or filename.endswith('.png') or
                filename.endswith('.webp')):
            if name_match.lower() in filename.lower():
                # Windows version, open the image:
                subprocess.run(['start', Path(image_folder) / filename],
                               shell=True)
                # macOS/Linux version, open the image:
                #subprocess.run(['open', Path(image_folder) / filename])
# Open images that contain 'cat' in the filename:
#open_images_by_name(r'C:\memes', 'cat')
“Next Holiday” Reporter
import datetime
# This constant is the format for how the holiday data is given to us:
HOLIDAYS = {'October 31': 'Halloween', 'February 14': "Valentine's Day",
            'April 1': "April Fool's Day", 'May 1': 'May Day',
            'May 5': 'Cinco de Mayo'}
def next_holiday(from_date):
    one_day = datetime.timedelta(days=1)
    while True:
        # Convert the datetime object to a string.
        # The day of the month must NOT have a leading 0:
        month = from_date.strftime('%B')
        day = from_date.strftime('%d').lstrip('0')
        month_day = month + ' ' + day
        # See if from_date is a holiday and return it:
        if month_day in HOLIDAYS:
            return HOLIDAYS[month_day]
        # Move from_date forward by one day:
        from_date += one_day
print('The next holiday starting from', datetime.datetime.now(), 'is:')
print(next_holiday(datetime.datetime.now()))
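The parse, add, and format steps that this chapter's answers describe can be combined in a few lines. The following is a minimal sketch using fixed example dates (chosen for illustration, not taken from the questions) so that the results don't depend on the current time; it assumes the default English month names.

```python
import datetime

# Parse: a human-readable string becomes a datetime object.
moment = datetime.datetime.strptime('October 31, 2026', '%B %d, %Y')

# Arithmetic: a datetime plus a timedelta is another datetime.
later = moment + datetime.timedelta(days=1000)

# Format: a datetime object becomes a human-readable string.
formatted = later.strftime('%B %d, %Y')

# Two-digit years are interpreted as 1969 to 1999 or 2000 to 2068:
year_26 = datetime.datetime.strptime('26', '%y').year
year_76 = datetime.datetime.strptime('76', '%y').year

# total_seconds() converts a duration to a float number of seconds:
duration = datetime.timedelta(minutes=1, seconds=5).total_seconds()

print(formatted, year_26, year_76, duration)
```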
Answers to the Practice Questions
1. You should revoke your credentials at the Google Cloud Console website if the credentials or token files for the Gmail API are inadvertently shared.
2. ezgmail.EMAIL_ADDRESS contains the email address you are sending email from.
3. This function call sends an email to [email protected] with the subject “Hello!” and body “Here is that graduation photo.” It also attaches the file grad.jpg to the email.
4. A single received email is represented as a GmailMessage object.
5. A series of back-and-forth emails is represented as a GmailThread object.
6. GmailMessage objects have the attributes subject, body, timestamp, sender, and recipient.
7. The function call ezgmail.recent(maxResults=50) returns the 50 most recent email threads. Without maxResults=50, the ezgmail.recent() function returns up to 25 emails by default.
8. The function call ezgmail.search('cake recipes') returns email messages that mention “cake recipes.”
9. This function call returns email sent from the email address [email protected].
10. This function call prints a summary of the unread email in the Gmail account, including the sender, subject, and timestamp information.
11. spam.downloadAllAttachments() downloads all of the file attachments in that email.
12. The downloaded attachment overwrites any existing files with the same name.
13. SMS stands for Short Message Service.
14. MMS stands for Multimedia Messaging Service.
15. Yes. However, this isn’t a reliable way to send text messages.
16. No. You cannot receive text messages this way.
17. You must know the recipient’s cellular service provider in addition to the recipient’s phone number.
18. SMS email gateways are free to use, but they are an unreliable way to send text messages.
19. Disadvantages of using SMS email gateways include requiring you to know the recipient’s cellular service provider, not knowing if a text message has been sent, not knowing if a text message has been delayed, not knowing how many text messages can be sent before your program is blocked, and not knowing if the gateway will still work in the future.
20. The book uses requests for interacting with the ntfy service, but any module that can make HTTP requests will work.
21. You can receive ntfy notifications on your smartphone using the free ntfy Android or iOS app.
22. Yes, you can receive ntfy notifications on your laptop by going to the https://ntfy.sh website.
23. The ntfy service uses HTTPS to send and receive push notifications.
24. The requests.post() function sends a push notification to the ntfy service.
25. The ntfy service is free to use for low volumes of notifications. You can also run your own ntfy server for unlimited notifications.
26. The lowest priority level is 1 and the highest priority level is 5.
27. This code posts a notification for the topic 'hello' with the message 'goodbye' to the public ntfy server at https://ntfy.sh.
28. Keyword arguments for the headers parameter in requests.post() can set the notification title, priority level, and tags.
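Since any module that can make HTTP requests will work with ntfy, the same kind of notification can be prepared with the standard library's urllib.request module instead of requests. The sketch below is one possible approach, not the book's code: build_ntfy_request() is a hypothetical helper name, and it relies on ntfy reading the title, priority, and tags from the Title, Priority, and Tags HTTP headers. The request is only built here; nothing is sent unless the final line is uncommented.

```python
import urllib.request

def build_ntfy_request(topic, message, title=None, priority=None, tags=None):
    """Return an unsent POST request for the public ntfy server.

    This is a hypothetical helper written for illustration."""
    headers = {}
    if title is not None:
        headers['Title'] = title
    if priority is not None:
        headers['Priority'] = str(priority)  # 1 is lowest, 5 is highest.
    if tags is not None:
        headers['Tags'] = ','.join(tags)
    return urllib.request.Request(f'https://ntfy.sh/{topic}',
                                  data=message.encode('utf-8'),
                                  headers=headers, method='POST')

request = build_ntfy_request('hello', 'goodbye', title='Greeting',
                             priority=5, tags=['wave'])
print(request.get_method(), request.full_url, request.header_items())

# Uncomment the next line to actually send the notification:
#urllib.request.urlopen(request)
```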
“Quote of the Day” Email
import ezgmail, datetime, sys, random
from pathlib import Path
RECIPIENT = '[email protected]'
# If qotdLastSent.txt doesn't exist, make a blank file:
if not Path('qotdLastSent.txt').exists():
    with open('qotdLastSent.txt', 'w', encoding='utf-8') as file_obj:
        pass  # Do nothing, just create a blank file.
# Open qotdLastSent.txt to see if an email was already sent today:
with open('qotdLastSent.txt', encoding='utf-8') as file_obj:
    contents = file_obj.read()
if contents == str(datetime.date.today()):
    # Don't send an email and exit the program:
    print('Email already sent today. Exiting...')
    sys.exit()
# Open the quote-of-the-day file and randomly select a quote:
with open('qotd.txt', encoding='utf-8') as file_obj:
    quotes = file_obj.readlines()
quote = random.choice(quotes)
ezgmail.send(RECIPIENT, 'Quote of the day', quote)
print(f'Email sent to {RECIPIENT}: {quote}')
# Open qotdLastSent.txt and write today's date:
with open('qotdLastSent.txt', 'w', encoding='utf-8') as file_obj:
    file_obj.write(str(datetime.date.today()))
“Quote of the Day” Push Notification
import requests, datetime, sys, random
from pathlib import Path
TOPIC = 'qotd-test-py'
# If qotdLastSent.txt doesn't exist, make a blank file:
if not Path('qotdLastSent.txt').exists():
    with open('qotdLastSent.txt', 'w', encoding='utf-8') as file_obj:
        pass  # Do nothing, just create a blank file.
# Open qotdLastSent.txt to see if a notification was already sent today:
with open('qotdLastSent.txt', encoding='utf-8') as file_obj:
    contents = file_obj.read()
if contents == str(datetime.date.today()):
    # Don't send a notification and exit the program:
    print('Notification already sent today. Exiting...')
    sys.exit()
# Open the quote-of-the-day file and randomly select a quote:
with open('qotd.txt', encoding='utf-8') as file_obj:
    quotes = file_obj.readlines()
quote = random.choice(quotes)
requests.post(f'https://ntfy.sh/{TOPIC}', quote)
print(f'Notification sent to {TOPIC}: {quote}')
# Open qotdLastSent.txt and write today's date:
with open('qotdLastSent.txt', 'w', encoding='utf-8') as file_obj:
    file_obj.write(str(datetime.date.today()))
Answers to the Practice Questions
1. The red-green-blue-alpha tuple (0, 255, 0, 0) represents the color green.
2. The A in RGBA, which stands for alpha, represents how transparent the color is.
3. The function call ImageColor.getcolor('chocolate', 'RGBA') returns the RGBA value of the color “chocolate.”
4. list(ImageColor.colormap) returns a list of all the color names in the Pillow library.
5. 0, 0 are the x- and y-coordinates of the origin.
6. The origin is in the top-left corner of an image.
7. (128, 0, 128, 255) is the RGBA tuple for the color purple.
8. The color white is represented by the tuple (255, 255, 255, 255).
9. The color black is represented by the tuple (0, 0, 0, 255).
10. 99, 99 is the coordinate for the pixel in the bottom-right corner. It is one less than the width and height, because the topmost and leftmost coordinates are 0, not 1.
11. A box tuple is a tuple of four integers corresponding to the left, top, right, and bottom edges of a rectangular area. The right and bottom coordinates are one pixel greater than the right and bottom edges of the area (similar to how range(10) goes up to but does not include 10).
12. Subtract the left edge from the right edge in a box tuple to calculate the width.
13. The width is 5 and the height is 10.
14. While the package name is “Pillow,” the module name is PIL, so you must run from PIL import Image.
15. Yes. Pillow can read images in the .png, .jpg, and .gif formats.
16. The Image.open() function returns an Image object of an image file on your computer.
17. The show() method opens an Image object in the default image-viewing app on your computer.
18. The function call Image.new('RGBA', (100,100), 'purple').show() creates a 100×100 image with a purple background and then displays it in the default image-viewing app on your computer.
19. The code im = Image.open('zophie.png') and im.size gives you the width and height of the image in pixels.
20. im.crop((345, 355, 565, 560)) is correct; the method expects a box tuple argument and not four separate integer arguments.
21. No. The crop() method returns a new Image object. It doesn’t change the Image object it was called on.
22. The copy() method returns a copy of the image as a new Image object.
23. No. The copy() method takes no arguments at all. To get a copy of just a portion of the image, call the crop() method on the Image object that copy() returned.
24. This method call returns a new Image object with the face_im image in the top-left corner of the cat_im image.
25. No. You must pass face_im as a third argument to paste(), or else the transparent pixels will show up as opaque white pixels.
26. The correct method call is im.resize((im.width * 2, im.height * 2)), which passes the new dimensions as an exact number of pixels, not as an amount proportional to the current size.
27. No. The resize() method returns a new Image object. It doesn’t change the Image object it was called on.
28. im.transpose(Image.FLIP_LEFT_RIGHT)
29. im.transpose(Image.FLIP_TOP_BOTTOM)
30. No. They do not produce the same image. The rotated image will be upside down.
31. The image rotated 90 degrees will be 100×200. If you didn’t pass expand=True in the method call, however, the rotated image will be the original size, 200×100.
32. The image rotated 180 degrees will be 200×100.
33. The image rotated 45 degrees will be 200×100, because the expand=True keyword argument wasn’t passed to the method call, so the image maintains its original size.
34. The rotated image will be larger, because the expand=True keyword argument was passed to the method call.
35. The getpixel() method returns the color of a single pixel.
36. No. The putpixel() method modifies the Image object in place and returns None.
37. This code draws a 100×100 image of a blue square with a red border.
38. The following code draws a 100×100 image that is green on the top half and yellow on the bottom half by first drawing a green square and then filling in the bottom half with yellow:
from PIL import Image
# Make a green square:
im = Image.new('RGBA', (100, 100), 'green')
for x in range(100):
    # Make the bottom half yellow:
    for y in range(50, 100):
        im.putpixel((x, y), (255, 255, 0, 255))
im.show()
And this program does the same thing by first drawing a yellow square, then filling in the top half with green:
from PIL import Image
# Make a yellow square:
im = Image.new('RGBA', (100, 100), (255, 255, 0, 255))
for x in range(100):
    # Make the top half green:
    for y in range(50):
        im.putpixel((x, y), (0, 255, 0, 255))
im.show()
39. from PIL import ImageDraw
40. ImageDraw.Draw(im) returns an ImageDraw object for im.
41. The ellipse() method draws a circle if you pass it a box tuple representing a square, with an equal width and height.
42. The rectangle() method draws a square if you pass it a box tuple representing a square, with an equal width and height.
43. That method call draws two lines: one from 10, 10 to 20, 20 and a second one from 20, 20 to 40, 60.
44. The polygon() method can draw a diamond shape. For example, draw.polygon([50, 0, 100, 50, 50, 100, 0, 50], (0, 0, 0, 255)) draws a black diamond on an Image object that is at least 100×100 in size.
45. The (20, 150) argument is the coordinate pair for the top-left coordinates of the text, the 'Hello' is the text to draw, and the fill='purple' is the color to use for the text.
46. The method call uses the default font, as it doesn’t specify a font keyword argument.
47. The function call ImageFont.truetype('no_such_font.ttf', 32) raises an OSError: cannot open resource error.
48. This code creates a 1,000×1,000 image that says “Hello” in black text on a white background:
from PIL import Image, ImageDraw
im = Image.new('RGBA', (100, 100), 'white')
draw = ImageDraw.Draw(im)
draw.text((0, 0), 'Hello', fill='black')
im = im.resize((1000, 1000))
Your code may be slightly different but is considered correct so long as it produces a similar Image object.
49. Yes. pyperclipimg requires Pillow to be installed. (Installing the pyperclipimg package will automatically install Pillow too.)
50. The code pyperclipimg.paste().show() displays the image currently on the clipboard.
51. The pyperclipimg.paste() function returns a Pillow Image object.
52. The pyperclipimg.paste() function returns the None value if the clipboard contains text and not an image.
53. The code pyperclipimg.copy(pyperclipimg.paste().resize((100, 100))) resizes the image on the clipboard to 100×100 and then copies it as the new image on the clipboard. You can run pyperclipimg.paste().show() to view this new image.
54. The code pyperclipimg.paste().save('contents.png') saves the image currently on the clipboard as a file named contents.png.
55. import matplotlib.pyplot as plt
56. Three points.
57. The code plt.plot([10, 20, 30], [10, 5, 40]) creates a plot with points connected by lines.
58. A scatter plot.
59. plt.savefig('plot.png') saves the plot as an image file.
60. The two arguments to plt.bar() are a list of string categories and a list of integer or floating-point values. The lists must have the same number of items.
61. The plt.xlabel() and plt.ylabel() functions add labels to the x-axis and y-axis, respectively.
62. plt.grid(True) adds grid lines to the background of a plot.
63. The plt.show() function displays an interactive preview window of a plot.
64. This code displays an interactive preview window of a pie chart with two sections, one labeled “Part that looks like Pacman” and the other labeled “Part that does not look like Pacman.” The pie chart does kind of look like Pacman.
65. The code plt.title('The plot thickens.') adds a title to the top of the plot.
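The box tuple arithmetic from answers 11 through 13 can be sketched in plain Python. The helper name box_size and the sample box are assumptions made for illustration; neither is part of Pillow.

```python
def box_size(box):
    """Return (width, height) for a (left, top, right, bottom) box tuple."""
    left, top, right, bottom = box
    # The right and bottom values are exclusive, like the end of range(),
    # so plain subtraction gives the size in pixels.
    return (right - left, bottom - top)


# A hypothetical box that starts at 10, 20 and is 5 pixels wide, 10 tall:
example_box = (10, 20, 15, 30)
width, height = box_size(example_box)
```

Because the right and bottom values are one past the last pixel, no "+ 1" correction is needed.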
Snowpal Image
from PIL import Image, ImageDraw
# Start with a 1000 x 1000 white image:
im = Image.new('RGBA', (1000, 1000), 'white')
draw = ImageDraw.Draw(im)
# Draw a blue sky:
SKY_COLOR = (150, 240, 255, 255)
draw.rectangle((0, 0, 1000, 700), SKY_COLOR)
# Draw three ellipses for the body:
SNOWPAL_FILL = (255, 255, 255, 255)
SNOWPAL_OUTLINE = (0, 0, 0, 255)
draw.ellipse((400, 300, 650, 500), SNOWPAL_FILL, SNOWPAL_OUTLINE, 10)
draw.ellipse((380, 450, 670, 700), SNOWPAL_FILL, SNOWPAL_OUTLINE, 10)
draw.ellipse((360, 650, 690, 900), SNOWPAL_FILL, SNOWPAL_OUTLINE, 10)
# Draw two rectangles for the top hat:
TOP_HAT_COLOR = (0, 0, 0, 255)
draw.rectangle((450, 150, 600, 350), TOP_HAT_COLOR)
draw.rectangle((350, 320, 700, 350), TOP_HAT_COLOR)
# Draw two lines for the arms:
ARM_COLOR = (160, 82, 45, 255)
draw.line([400, 550, 300, 400], ARM_COLOR, 10)
draw.line([650, 520, 850, 400], ARM_COLOR, 10)
im.show()
Rainbow Flag Image Generator
from PIL import Image, ImageDraw
def create_rainbow_flag(width, height):
    # Create a blank image with the given size:
    flag_im = Image.new('RGBA', (width, height))
    # Create the ImageDraw object of the blank image:
    flag_draw = ImageDraw.Draw(flag_im)
    # Calculate the height of each stripe:
    stripe_height = int(height / 6)
    # Draw the red stripe at the top of the image:
    flag_draw.rectangle((0, 0, width, stripe_height), 'red')
    # Draw the orange stripe below the red stripe:
    flag_draw.rectangle((0, stripe_height, width, stripe_height * 2), 'orange')
    # Draw the yellow stripe below the orange stripe:
    flag_draw.rectangle((0, stripe_height * 2, width, stripe_height * 3), 'yellow')
    # Draw the green stripe below the yellow stripe:
    flag_draw.rectangle((0, stripe_height * 3, width, stripe_height * 4), 'green')
    # Draw the blue stripe below the green stripe:
    flag_draw.rectangle((0, stripe_height * 4, width, stripe_height * 5), 'blue')
    # Draw the purple stripe below the blue stripe:
    flag_draw.rectangle((0, stripe_height * 5, width, height), 'purple')
    return flag_im
im = create_rainbow_flag(640, 480)
im.show()
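The six rectangle() calls above follow one pattern, so the stripe boxes could also be computed in a loop. This is only a sketch: the stripe_boxes helper and the COLORS list are hypothetical names, and the function returns box tuples without drawing anything.

```python
def stripe_boxes(width, height, stripe_count=6):
    """Return one (left, top, right, bottom) box tuple per stripe."""
    stripe_height = height // stripe_count
    boxes = []
    for i in range(stripe_count):
        top = stripe_height * i
        # The last stripe runs to the bottom edge so that no rows are
        # left blank when height isn't evenly divisible by stripe_count.
        if i == stripe_count - 1:
            bottom = height
        else:
            bottom = stripe_height * (i + 1)
        boxes.append((0, top, width, bottom))
    return boxes


COLORS = ['red', 'orange', 'yellow', 'green', 'blue', 'purple']
boxes = stripe_boxes(640, 480)
```

Pairing the results with zip(boxes, COLORS) would give each box and color to pass to flag_draw.rectangle().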
Clipboard Image Recorder
import pyperclipimg, time, datetime
print('Recording clipboard images... (Ctrl-C to stop)')
previous_content = None
try:
    while True:
        content = pyperclipimg.paste() # Get clipboard contents.
        if content != previous_content and content is not None:
            # If it's different from the previous, save it:
            filename = f'clipboard-{str(datetime.datetime.now()).replace(":", "_")}.png'
            content.save(filename)
            print(f'Saved {filename}')
            previous_content = content
        time.sleep(0.01) # Pause to avoid hogging the CPU.
except KeyboardInterrupt:
    pass
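The filename expression in the recorder can be looked at on its own. The sketch below uses a fixed datetime instead of datetime.datetime.now() so the result is predictable; the clipboard_filename helper is a hypothetical name.

```python
import datetime


def clipboard_filename(moment):
    """Build a .png filename from a datetime, without colons."""
    # str() of a datetime looks like '2024-01-02 03:04:05.000678'.
    # Colons aren't allowed in Windows filenames, so swap them out.
    timestamp = str(moment).replace(':', '_')
    return f'clipboard-{timestamp}.png'


sample = datetime.datetime(2024, 1, 2, 3, 4, 5, 678)
name = clipboard_filename(sample)
```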
Answers to the Practice Questions
1. Tesseract is the actual OCR engine, and PyTesseract is the Python package that works with Tesseract.
2. No. Your Python scripts require both PyTesseract and Tesseract to be installed.
3. The eng.traineddata and jpn.traineddata files contain the language packs for English and Japanese, respectively.
4. No. Tesseract has different installation steps for Windows, macOS, and Linux.
5. Yes. PyTesseract requires Pillow to be installed.
6. The image_to_string() function takes an Image object argument and returns a string of the text in that image.
7. No. PyTesseract cannot identify fonts, font sizes, and font colors. Tesseract returns only plaintext strings.
8. Yes. PyTesseract can extract text from a scanned document of typed text.
9. No. PyTesseract cannot extract text from a scanned document of handwritten text (or, at least, not with any degree of reliability). That’s because the language pack models have been trained on typed text.
10. No. PyTesseract cannot extract the text of a license plate (or any other text) from photographs. That’s because the language pack models have been trained on typed text.
11. Yes. PyTesseract preserves the layout of text such as hyphenated words broken across lines.
12. LLMs are somewhat reliable at “cleaning up” the extracted text from PyTesseract, but this output always requires human review to confirm its accuracy.
13. Yes and no. While the spellchecker can identify incorrectly extracted words from PyTesseract, the spellchecker will also identify correctly extracted words misspelled in the original image.
14. No. The spellchecker does not identify correctly or incorrectly extracted numbers.
15. Tesseract identifies English characters by default.
16. The get_languages() function returns a list of all the languages that Tesseract supports.
17. The lang='jpn' keyword argument makes the image_to_string() function recognize Japanese characters.
18. The image_to_string() function returns nonsense garbage text because it tries to interpret the Japanese characters as English characters.
19. The lang='eng+jpn' keyword argument makes the image_to_string() function recognize English and Japanese characters in the same document.
20. The NAPS2 app is free.
21. The NAPS2 app runs on Windows, macOS, and Linux.
22. The subprocess module allows your Python programs to run the NAPS2 app.
23. The command line flag -i followed by frankenstein.png specifies the frankenstein.png image as the input to NAPS2.
24. The command line flag -o followed by output.pdf causes the NAPS2 app to output the PDF to a file named output.pdf.
25. If you already have the English language pack installed for NAPS2, these command line flags do nothing. You should include them in case NAPS2 doesn’t already have the English language pack installed.
26. The command line flags --install followed by ocr-jpn installs the Japanese language pack for NAPS2.
27. The command line flag -n followed by 0 runs NAPS2 without requiring a flatbed scanner to perform any scans.
28. The command line flag -i followed by page1.png;page2.png causes NAPS2 to use the page1.png and page2.png image files as the inputs on which to perform OCR.
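The flags described in answers 23 through 28 could be assembled into an argument list for subprocess.run() along these lines. This is a rough sketch: build_naps2_args is a hypothetical helper, 'naps2' is a placeholder for wherever the program lives on your computer, and a real invocation may need further arguments depending on your platform. The code only builds the list; it doesn't run NAPS2.

```python
def build_naps2_args(naps2_path, input_files, output_file, language_pack=None):
    """Return an argument list suitable for subprocess.run()."""
    args = [naps2_path]
    # Several input images are joined with semicolons after -i:
    args += ['-i', ';'.join(input_files)]
    # -o names the output file:
    args += ['-o', output_file]
    if language_pack is not None:
        # --install fetches a language pack such as ocr-jpn:
        args += ['--install', language_pack]
    # -n 0 asks for zero scans, so no flatbed scanner is needed:
    args += ['-n', '0']
    return args


# 'naps2' is a placeholder; the real path depends on your installation.
args = build_naps2_args('naps2', ['page1.png', 'page2.png'], 'output.pdf', 'ocr-jpn')
```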
Searchable Web Comics
import pytesseract as tess
import os, json
from PIL import Image
image_text = {}
# Go through all the files in the current working directory:
for filename in os.listdir('.'):
    if not filename.endswith('.png'):
        # Skip non-png files:
        continue
    # Extract the text from the image:
    print(f'{filename}...')
    text = tess.image_to_string(Image.open(filename))
    #print(text) # Uncomment to preview the text.
    image_text[filename] = text
# Write the dictionary to a JSON file:
with open('imageText.json', 'w', encoding='utf-8') as file_obj:
    file_obj.write(json.dumps(image_text, indent=2))
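Once the dictionary of filenames and extracted text exists, searching it takes only a few lines. The following is a minimal sketch, assuming data in the same shape the program writes to the JSON file; the search_image_text helper and the sample filenames are made up for illustration.

```python
import json


def search_image_text(image_text, search_word):
    """Return the filenames whose extracted text contains search_word."""
    search_word = search_word.lower()
    matches = []
    for filename, text in image_text.items():
        if search_word in text.lower():
            matches.append(filename)
    return sorted(matches)


# Hypothetical data in the same shape the program writes to the JSON file:
sample_json = json.dumps({'comic1.png': 'HELLO WORLD', 'comic2.png': 'Goodbye'})
sample = json.loads(sample_json)
found = search_image_text(sample, 'hello')
```

Lowercasing both strings makes the match case-insensitive, which helps because comic text is often in all capitals.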
Enhancing Text in Web Comics
import pytesseract as tess
import os, json
from PIL import Image
image_text = {}
# Go through all the files in the current working directory:
for filename in os.listdir('.'):
    if not filename.endswith('.png'):
        # Skip non-png files:
        continue
    # Enlarge the image and extract the text from it:
    print(f'{filename}...')
    im = Image.open(filename)
    im = im.resize((im.width * 2, im.height * 2))
    text = tess.image_to_string(im)
    #print(text) # Uncomment to preview the text.
    image_text[filename] = text
# Write the dictionary to a JSON file:
with open('imageTextEnlarged.json', 'w', encoding='utf-8') as file_obj:
    file_obj.write(json.dumps(image_text, indent=2))
Answers to the Practice Questions
1. While controlling the mouse, your program may have clicked other windows to put them into focus, keeping you from pressing ctrl-C in the Python program’s window.
2. Slide the mouse cursor into one of the four corners of the screen to stop the Python program by raising the pyautogui.FailSafeException exception.
3. Setting pyautogui.PAUSE to 0.1 adds a 0.1-second delay after PyAutoGUI function calls.
4. The pyautogui.FailSafeException exception is raised if the mouse pointer is in one of the four corners of the screen.
5. The origin’s x- and y-coordinates are 0, 0.
6. The origin is the top-left corner of your screen.
7. The letter x represents the horizontal coordinate.
8. The letter y represents the vertical coordinate.
9. The y-coordinate increases as you move the mouse down the screen.
10. The x-coordinate doesn’t change as you move the mouse down the screen, because it is the horizontal coordinate.
11. The coordinate of the lower-right corner is 1919, 1079. (Coordinates behave similarly to the list indexes; the indexes of a list with five items are 0 through 4.)
12. The pyautogui.size() function returns the screen resolution as a Size named tuple value.
13. screen_size[1] evaluates to 1080.
14. screen_size.height evaluates to 1080, the same as screen_size[1]. You can access the height by name or by integer index.
15. Call pyautogui.moveTo(10, 20) to move the mouse cursor to the coordinates 10, 20.
16. Call pyautogui.moveTo(110, 220) to move the mouse cursor to the coordinates 10, 20 within the window (and 110, 220 on the screen).
17. The mouse cursor doesn’t move left or right at all, because this function call moves the mouse cursor zero pixels horizontally.
18. The function call pyautogui.move(-100, 0) moves the mouse cursor left by 100 pixels because horizontal x-coordinates decrease as they move left.
19. The pyautogui.moveTo() function moves the cursor to the absolute coordinates on the screen, while the pyautogui.move() function moves the cursor relative to its current position.
20. Adding the duration=0.25 keyword argument makes the mouse cursor move to its destination in one-quarter of a second, rather than instantly.
21. It would return Point(310, 220) because pyautogui.move(10, 20) moves the mouse cursor 10 pixels to the right and 20 pixels down relative to its current position.
22. It evaluates to True, because the index 0 and attribute x of the Point named tuple returned by pyautogui.position() are the same thing, as are the index 1 and attribute y. (This answer assumes you didn’t move the mouse cursor in between the two instructions.)
23. The difference is that pyautogui.click() clicks the mouse at its current location, whereas pyautogui.click(10, 20) moves the mouse cursor to the coordinates 10, 20 and then clicks.
24. Pass the button='right' keyword argument to pyautogui.click().
25. The pyautogui.drag() and pyautogui.dragTo() functions hold down the left mouse button as they drag the mouse.
26. The active, or focused, window is the window that accepts keyboard input.
27. Calling pyautogui.scroll(10) scrolls the mouse wheel 10 units up. (What counts as one “unit” depends on your operating system.)
28. Pass a negative integer to pyautogui.scroll() to scroll the mouse wheel down.
29. The MouseInfo app gives you the x- and y-coordinates and the RGB color information of the pixel where the mouse cursor is.
30. Call pyautogui.mouseInfo() to launch the MouseInfo app.
31. Launch the MouseInfo app and leave it in focus. Then, move the mouse over each of the 20 buttons, pressing F6 to record the x, y coordinate for each button in the MouseInfo app’s text field.
32. A screenshot is an image that contains the exact, pixel-perfect contents of the screen.
33. The Pillow module (covered in Chapter 21 of Automate the Boring Stuff with Python) provides the Image data type and handles screenshots for PyAutoGUI.
34. The pyautogui.pixel() function returns a tuple of three integers, ranging from 0 to 255, for the red, green, blue (RGB) value of the pixel at the coordinates given.
35. No, because screenshots are fully opaque and never contain any alpha (transparency) values. The pyautogui.pixel() function returns RGB tuples, not RGBA tuples.
36. If your program is supposed to click, say, a button that has a known color, you can check that pixel’s color before clicking to have some assurance that the button is actually there. This can ensure that your program is doing what you intended it to do.
37. Call the save() method on the Image object with the string argument 'screenshot.png'.
38. You can call pyautogui.screenshot() while the button is visible on the screen to obtain an image of the screen, then crop the image in an image editing program.
39. Call pyautogui.locateOnScreen('submit.png').
40. The pyautogui.locateOnScreen() function returns Box named tuples that contain four integers representing an area on the screen: the x- and y-coordinates of the top-left corner, the width, and the height.
41. No. The entire image passed to pyautogui.locateOnScreen() must appear on the screen, because the function does a pixel-perfect match.
42. If the image isn’t found, the function raises the pyautogui.ImageNotFoundException exception.
43. If the image appears multiple times on the screen, pyautogui.locateOnScreen() returns the first one it finds.
44. The code prints 'Found submit button on screen.' three times.
45. win.title would evaluate to a string of the window’s title bar text.
46. Calling pyautogui.getAllWindows() returns Window objects for all windows.
47. Calling pyautogui.getWindowsWithTitle('Notepad') would return Window objects for instances of the Notepad app (as well as any other windows that coincidentally had “Notepad” in their title).
48. Calling pyautogui.getWindowsAt(100, 200) would return Window objects for every window underneath the mouse cursor at coordinates 100, 200.
49. Changing the top or left attributes will move the window around the screen.
50. Changing the width or height attributes will resize the window.
51. The active or focused window receives the keyboard key presses.
52. pyautogui.write('x' * 1000) would simulate pressing the X key 1,000 times.
53. pyautogui.write('leftleft') enters the eight letters in the string 'leftleft', while pyautogui.write(['left', 'left']) does two presses of the left arrow key.
54. pyautogui.KEYBOARD_KEYS contains a list of strings of all the keyboard keys in PyAutoGUI.
55. While the pyautogui.write('left') function call enters the four letters in the word left, the pyautogui.press('left') function call simulates pressing the left arrow key once.
56. pyautogui.hotkey('ctrl', 'c')
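The coordinate arithmetic behind answers 11 and 16 can be written out in plain Python. This sketch assumes, as answer 16 implies, a window whose top-left corner sits at 100, 200 on the screen; both helper names are hypothetical and are not PyAutoGUI functions.

```python
def window_to_screen(window_left, window_top, x, y):
    """Convert window-relative coordinates to screen coordinates."""
    return (window_left + x, window_top + y)


def bottom_right_corner(screen_width, screen_height):
    """Return the coordinates of the last pixel on the screen."""
    # Coordinates start at 0, so the last one is the size minus 1.
    return (screen_width - 1, screen_height - 1)


target = window_to_screen(100, 200, 10, 20)
corner = bottom_right_corner(1920, 1080)
```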
Jackson Pollock Bot
import pyautogui
import random
print('Hover the mouse cursor at the top-left corner of the canvas...')
pyautogui.countdown(5)
left, top = pyautogui.position()
print('Top-left corner recorded as', left, top)
print('Hover the mouse cursor at the bottom-right corner of the canvas...')
pyautogui.countdown(5)
right, bottom = pyautogui.position()
print('Bottom-right corner recorded as', right, bottom)
for i in range(1, 31):
    print('Stroke', i)
    pyautogui.moveTo(random.randint(left, right), random.randint(top, bottom))
    pyautogui.dragTo(random.randint(left, right), random.randint(top, bottom))
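One thing to watch for in this program: random.randint(a, b) raises a ValueError when a is greater than b, which happens if the two corners are recorded in the opposite order. A possible guard, sketched here with hypothetical helper names and made-up coordinates, is to sort the corner values before using them.

```python
import random


def normalize_corners(x1, y1, x2, y2):
    """Return (left, top, right, bottom) no matter which corner came first."""
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


def random_point(left, top, right, bottom):
    """Pick a random point inside the canvas, edges included."""
    return (random.randint(left, right), random.randint(top, bottom))


# Hypothetical readings where the second hover was up and to the left:
left, top, right, bottom = normalize_corners(800, 600, 200, 150)
x, y = random_point(left, top, right, bottom)
```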
Mouse Movement Recorder
import pyautogui, json
# Begin recording XY mouse positions to the positions list:
print('Recording mouse positions. Press Ctrl-C to quit.')
positions = []
try:
    # Record the position of the mouse every one-tenth of a second:
    while True:
        positions.append(pyautogui.position())
        pyautogui.sleep(0.1)
except KeyboardInterrupt:
    # When the user presses Ctrl-C, save the positions to a JSON file:
    with open('mousePositions.json', 'w', encoding='utf-8') as file_obj:
        file_obj.write(json.dumps(positions))
    print(f'Done. {len(positions)} positions recorded.')
Mouse Movement Playback
import pyautogui, json
# Read the mouse positions from a JSON file:
with open('mousePositions.json', encoding='utf-8') as file_obj:
    positions = json.loads(file_obj.read())
# Move the mouse cursor to each position:
for pos in positions:
    pyautogui.moveTo(pos[0], pos[1])
    pyautogui.sleep(0.1)
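The playback program reads pos[0] and pos[1] rather than pos.x and pos.y because the positions come back from JSON as plain lists. The sketch below shows the round trip using a stand-in Point named tuple built with collections.namedtuple; it is an assumption for illustration and not PyAutoGUI's own class.

```python
import collections
import json

# A stand-in for the named tuple that pyautogui.position() returns:
Point = collections.namedtuple('Point', ['x', 'y'])

recorded = [Point(10, 20), Point(15, 25)]
# json.dumps() writes each named tuple as a plain JSON array...
text = json.dumps(recorded)
# ...so json.loads() gives back lists, and the x and y names are gone.
loaded = json.loads(text)
first_x, first_y = loaded[0][0], loaded[0][1]
```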
Answers to the Practice Questions
1. The acronym tts stands for text-to-speech.
2. No. pyttsx3 uses your operating system’s built-in text-to-speech engine.
3. The pyttsx3 module uses the operating system’s built-in text-to-speech system to play speech.
4. Call pyttsx3.init() to initialize the text-to-speech system.
5. No. The computer doesn’t say anything. You must also call engine.runAndWait().
6. The WAV or .wav format.
7. The 'volume', 'rate', and 'voices' properties.
8. This code sets the speech rate to 300 words per minute.
9. This code sets the volume to double the normal level.
10. engine.save_to_file('Is it raining today?', 'raining.wav')
11. engine.save_to_file('Hello. How are you doing?', 'hello.wav')
12. No. The voice used differs for each of the operating systems that pyttsx3 supports.
13. openai-whisper
14. The load_model() function loads the model you will use to transcribe the audio. You must call it before calling the transcribe() method.
15. The models are tiny, base, small, medium, and large-v3.
16. The tiny model.
17. The tiny model.
18. The large-v3 model.
19. The base model.
20. model.transcribe('input.mp3')
21. model.transcribe('input.mp3', language='Spanish')
22. Yes. Whisper inserts punctuation into the text it transcribes, but it may not be perfectly accurate.
23. Whisper can produce SubRip Subtitle files (with the .srt extension) and Video Text Tracks files (with the .vtt extension).
24. write_function = whisper.utils.get_writer('srt', '.') and write_function(result, 'podcast').
25. No. Currently, Whisper can use only NVIDIA-brand GPUs to do speech recognition.
26. whisper.load_model('base', device='cuda') loads the “base” model and uses the GPU to do speech recognition.
27. The .srt and .vtt files contain timestamp information in addition to spoken words.
28. SRT stands for SubRip Subtitle. The name comes from SubRip, a free Windows program that could extract the subtitle text from the bottom of video files.
29. VTT stands for Video Text Tracks. The name was coined by the W3C standards committee, which created the format.
30. Whisper can also produce .json and .tsv files of subtitles.
31. This code produces a subtitle file named subtitles.srt:
write_function = whisper.utils.get_writer('srt', '.')
write_function(result, 'subtitles')
32. The column headings are “start,” “end,” and “text.”
33. The module name is yt_dlp, with an underscore instead of a dash.
34. This code downloads the video:
with yt_dlp.YoutubeDL() as ydl:
    ydl.download(['https://www.youtube.com/watch?v=kSrnLbioN6w'])
35. The downloaded video’s filename is based on the title.
36. A .m4a file contains audio data.
37. The extract_info() method returns video metadata.
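A file with the “start,” “end,” and “text” columns from answer 32 can be read with Python's csv module. This is a small sketch using an in-memory string instead of a real file; the rows are invented placeholders, not actual Whisper output.

```python
import csv
import io

# Hypothetical contents of a .tsv subtitle file; the values are made up.
sample_tsv = 'start\tend\ttext\n0\t2500\tHello there.\n2500\t5000\tHow are you?\n'

# Passing delimiter='\t' tells csv to split on tabs instead of commas:
reader = csv.DictReader(io.StringIO(sample_tsv), delimiter='\t')
rows = list(reader)
headings = reader.fieldnames
```

Each row is a dictionary keyed by the column headings, so rows[0]['text'] is the first line of spoken text.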
Knock-Knock Jokes
import pyttsx3
engine = pyttsx3.init()
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[0].id)
engine.say('Knock knock.')
engine.runAndWait()
engine.setProperty('voice', voices[1].id)
engine.say('Who\'s there?')
engine.runAndWait()
engine.setProperty('voice', voices[0].id)
engine.say('Lettuce.')
engine.runAndWait()
engine.setProperty('voice', voices[1].id)
engine.say('Lettuce who?')
engine.runAndWait()
engine.setProperty('voice', voices[0].id)
engine.say('Lettuce in. It\'s cold out here.')
engine.runAndWait()
12 Days of Christmas
import pyttsx3, time
engine = pyttsx3.init()
# Create the data structures for the verses:
days = ['first', 'second', 'third', 'fourth', 'fifth', 'sixth',
        'seventh', 'eighth', 'ninth', 'tenth', 'eleventh', 'twelfth']
verses = ['And a partridge in a pear tree.', 'Two turtle doves,',
          'Three French hens,', 'Four calling birds,', 'Five gold rings,',
          'Six geese a-laying,', 'Seven swans a-swimming,',
          'Eight maids a-milking,', 'Nine ladies dancing,',
          'Ten lords a-leaping,', 'Eleven pipers piping,',
          'Twelve drummers drumming,']
# Hardcode the lyrics for the first day, since we don't want
# to say "And" before "a partridge in a pear tree":
print('On the first day of Christmas, my true love gave to me:')
engine.say('On the first day of Christmas, my true love gave to me:')
engine.runAndWait()
print('A partridge in a pear tree.')
engine.say('A partridge in a pear tree.')
engine.runAndWait()
time.sleep(2)
# Loop from days 2 through 12:
for final_day_index in range(1, 12):
    # Say the start of the day's verses:
    print(f'On the {days[final_day_index]} day of Christmas, my true love gave to me:')
    engine.say(f'On the {days[final_day_index]} day of Christmas, my true love gave to me:')
    engine.runAndWait()
    # Run through all of the verses for the current day:
    for day_index in range(final_day_index, -1, -1):
        print(verses[day_index])
        engine.say(verses[day_index])
        engine.runAndWait()
    # Pause a bit before moving on to the next day's verses:
    time.sleep(2)
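The countdown in range(final_day_index, -1, -1) is easier to check without the speech engine. The following sketch builds the lines for one day as a list of strings; lines_for_day is a hypothetical helper, and the lists are cut down to three days to keep the example short.

```python
days = ['first', 'second', 'third']
verses = ['And a partridge in a pear tree.', 'Two turtle doves,',
          'Three French hens,']


def lines_for_day(final_day_index):
    """Return the lines for one day, newest gift first."""
    lines = [f'On the {days[final_day_index]} day of Christmas, '
             'my true love gave to me:']
    if final_day_index == 0:
        # The first day has no "And" before the partridge.
        lines.append('A partridge in a pear tree.')
        return lines
    # Count down from the current day's gift to index 0:
    for day_index in range(final_day_index, -1, -1):
        lines.append(verses[day_index])
    return lines


third_day = lines_for_day(2)
```

Each returned line could then be passed to print() and engine.say() in turn.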
Podcast Word Search
import whisper, srt, os
def find_in_audio(audio_filename, search_word):
    # Convert search_word to lowercase for case-insensitive matching:
    search_word = search_word.lower()
    # Check if the subtitle file already exists:
    if not os.path.exists(audio_filename[:-4] + '.srt'):
        # Transcribe the audio file:
        model = whisper.load_model('base')
        result = model.transcribe(audio_filename)
        # Create the subtitle file:
        write_function = whisper.utils.get_writer('srt', '.')
        write_function(result, audio_filename)
    # Read in the text contents of the subtitle file:
    with open(audio_filename[:-4] + '.srt', encoding='utf-8') as file_obj:
        content = file_obj.read()
    # Go through each subtitle and collect timestamps of matches:
    found_timestamps = []
    for subtitle in srt.parse(content):
        if search_word in subtitle.content.lower():
            found_timestamps.append(str(subtitle.start))
    # Return the list of timestamps:
    return found_timestamps
print(find_in_audio('DNA_lecture.mp3', 'amino'))
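The audio_filename[:-4] slice in this program assumes a three-letter extension such as .mp3. An alternative, sketched here with a hypothetical subtitle_path helper, is to let os.path.splitext() find the extension so that names like .flac also work.

```python
import os


def subtitle_path(audio_filename):
    """Return the .srt filename that sits next to an audio file."""
    base, _extension = os.path.splitext(audio_filename)
    return base + '.srt'


mp3_subtitles = subtitle_path('DNA_lecture.mp3')
flac_subtitles = subtitle_path('DNA_lecture.flac')
```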