8 STRINGS AND TEXT EDITING

Python lets you process massive amounts of text far faster than any human file clerk could, but first you have to know which text editing operations Python makes available. Learning Python’s string operations saves you from having to reinvent this text editing code yourself.

LEARNING OBJECTIVES

  • Know how to write string literals and use string values in your programs.
  • Be able to write f-strings as a shortcut for concatenation.
  • Become familiar with the wide variety of string methods and how they manipulate capitalization, add or remove whitespace, and describe features of string values.
  • Understand how text is encoded as numbers on your computer and how the ord() and chr() functions convert between text characters and numeric code points.
  • Know how to use the clipboard as a system of input and output for your program with the Pyperclip third-party package.

Practice Questions

These questions test your understanding of the string data type and its methods.

Working with Strings

Strings are how programs represent text data. There are several ways to write and use them; for example, you can enclose them in either single quotes or double quotes, and they have features similar to lists, such as indexes and the in and not in operators.
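As a quick illustration of these features, here is a minimal sketch. The variable names and the string values are arbitrary and chosen only for this example.

```python
# Single quotes and double quotes both mark the start and end of a string.
greeting = 'Good morning'
same_greeting = "Good morning"
print(greeting == same_greeting)

# Indexes and slices work on strings much as they do on lists.
print(greeting[0])
print(greeting[5:12])

# The in and not in operators check whether one string appears in another.
print('morning' in greeting)
print('evening' not in greeting)
```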

  1. What is a string literal?

  2. What is the difference between string literals using single quotes and strings using double quotes?

  3. How do you mark the start and end of a multiline string?

  4. Is "Zophie's scratching post" valid Python code for a string?

  5. What about "Zophie\'s scratching post"?

  6. Are escape characters needed when a string contains both single- and double-quote characters?

  7. Why are the string literals 'A\'B' and 'A\\\'B' valid, but not the string literal 'A\\'B'?

  8. How do you mark a string literal as a raw string literal?

  9. How many backslashes appear when you run the code print('A\\B')?

  10. How about when you run the code print(r'A\\B')?

  11. How can you create multiline comments without using a # character at the start of each line?

Strings use indexes and slices the same way lists do. For questions 12 through 15, determine what the code evaluates to.

  12. 'Hello'[1]

  13. 'Hello'[-1]

  14. 'Hello'[4:5]

  15. 'Hello'[4:4]

  16. Does 'Hello'[9999] cause an IndexError?

  17. What about 'Hello'[1:9999]?

An expression with two strings joined using in or not in will evaluate to a Boolean True or False. For the following questions, determine what the expression evaluates to.

  18. 'H' in 'Hello'

  19. 'H' in ['Hello', 'Goodbye']

  20. 'Hello' in ['Hello', 'Goodbye']

  21. 'Hello' in ['Hi', ['Hello', 'Goodbye']]

  22. ['Hello', 'Goodbye'] in ['Hi', ['Hello', 'Goodbye']]

F-Strings

Python’s f-strings let you place variable names or entire expressions within a string. Like the r prefix in raw strings, f-strings have an f prefix before the starting quotation mark. Everything between the curly brackets ({}) is interpreted as if it were passed to str() and concatenated with the + operator in the middle of the string. Answer the following questions about f-strings.
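To make the comparison with concatenation concrete, here is a minimal sketch. The variables name and age and their values are assumptions made up for this example.

```python
name = 'Zophie'
age = 7

# With concatenation, the integer expression must be converted by str() first.
with_concatenation = 'My cat ' + name + ' is ' + str(age + 1) + ' next year.'

# The f-string evaluates each expression in curly brackets and converts it for you.
with_f_string = f'My cat {name} is {age + 1} next year.'

print(with_f_string)
print(with_concatenation == with_f_string)
```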

  23. Why does 'I am number ' + 42 cause an error while 'I am number ' + str(42) does not?

  24. Does f'I am number {42}' cause an error?

  25. What about f'I am number {str(42)}'?

  26. Describe the difference between what print(beard_length) and print(f'{beard_length=}') display on the screen.

  27. If f-strings are the preferred way of putting strings inside other strings, why do you need to learn about string interpolation and the format() string method?

Useful String Methods

Several string methods analyze strings or create transformed string values, including by changing the case of letters, checking for certain types of characters, and joining or splitting them. Answer the following questions about string methods.
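The three kinds of methods mentioned above can be sketched as follows. The variable names and the sample string are arbitrary choices for this example, and the methods shown are only a small selection.

```python
title = 'The Cat in the Hat'

# Case-changing methods produce a transformed copy; title itself is unchanged.
shouted = title.upper()
whispered = title.lower()
print(shouted)
print(whispered)
print(title)

# Methods whose names begin with "is" describe a feature of the string.
print(title.isalpha())  # The spaces in title are not letters.
print('Cat'.isalpha())

# split() breaks a string into parts, and join() combines parts with a separator.
words = title.split()
hyphenated = '-'.join(words)
print(words)
print(hyphenated)
```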

  28. Can the expression spam.upper() == 'hello' ever evaluate to True?

  29. What does '42'.isupper() evaluate to?

  30. What does 'X42'.isupper() evaluate to?

  31. What are the data types of the return values of the lower() and islower() methods?

  32. What does 'This sentence is capitalized.'.istitle() return?

  33. What about 'This sentence is capitalized.'.title()?

  34. Write an expression that determines whether the string in spam contains only numeric digits.

For questions 35 through 38, determine what the expression evaluates to.

  35. '1,000,000'.isdecimal()

  36. '-5'.isdecimal()

  37. str(float(42))

  38. str(float(42)).isdecimal()

  39. What is the difference between the expression 'headache'.startswith('he') and 'headache'.endswith('he') and the expression 'headache'.startswith('he').endswith('he')?

  40. What is the data type of the join() string method’s return value?

  41. What is the data type of the split() string method’s return value?

  42. What does ','.join(['cat', 'dog', 'moose']) evaluate to?

  43. What about ','.join('cat,dog,moose')?

  44. What string method should you call on the string 'Hello!' to return the 10-character string padded with spaces '    Hello!'?

  45. What string method should you call on the string 'Hello!' to return the 10-character string padded with spaces 'Hello!    '?

Numeric Code Points of Characters

Computers store information as bytes (strings of binary numbers), which means we need to be able to convert text to numbers. Because of this requirement, every text character has a corresponding numeric value called a Unicode code point. Answer the following questions about Unicode and the ord() and chr() functions.
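The round trip between a character and its number can be sketched like this. The variable names and the sample character are arbitrary choices for this example.

```python
# Look up the numeric code point of a single character.
code_point = ord('Z')
print(code_point)

# Convert the number back into the character it stands for.
same_character = chr(code_point)
print(same_character)

# Because code points are integers, you can do arithmetic with them.
next_character = chr(code_point + 1)
print(next_character)
```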

  46. What is a text character’s Unicode code point?

  47. What Unicode encoding should you almost certainly use when writing programs?

  48. What function returns a text character string, given a Unicode code point integer?

  49. What function returns a Unicode code point integer, given a text character string?

  50. Given that the expression ord('!') < ord('A') evaluates to True, which comes first in “ASCIIbetical” order, ! or A?

Copying and Pasting Strings

The pyperclip module has copy() and paste() functions that can send text to and receive text from your computer’s clipboard. Sending the output of your program to the clipboard will make it easy to paste it into an email, a word processor, or some other software. Answer the following questions about the pyperclip module.

  51. Is pyperclip a built-in package that comes with Python?

  52. Which function returns a string: pyperclip.copy() or pyperclip.paste()?

  53. Which function takes a string argument: pyperclip.copy() or pyperclip.paste()?

  54. If you call pyperclip.copy('Hello') and then call pyperclip.copy('Goodbye'), what does pyperclip.paste() return?

Practice Projects

You’ll now create some short programs that incorporate strings and text editing.

Word Match Game

The word game Jotto was created in 1955, and the 1980s game show Lingo later repurposed its concept (which you might recognize as another, more recent, game). You can make your own version of this game in Python.

Create a program that has the user guess a five-letter word. Your code should include a function named get_word_hint(secret_word, guess_word) that returns a five-character string of hints. The hints are an uppercase O for a correct letter in the same place in the secret word, a lowercase o for a correct letter in a different place in the secret word, and x for letters that are not in the secret word. If the guessed word is the same as the secret word, the function should return OOOOO.

For example, if the secret word is CRANE and the guess word is CANDY, get_word_hint('CRANE', 'CANDY') should return Oooxx because the first letter in CANDY matches the first letter in the secret word, CRANE. The next two hint characters are oo because the A and N characters in CANDY exist in CRANE but at different indexes. The last two hint characters are xx because the D and Y in CANDY don’t appear in CRANE at all.
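One way to read the hint rules is as a decision made separately for each letter position. The following is a minimal sketch of that decision only, not a complete solution: the helper name hint_for_position() is hypothetical and not part of the exercise, and assembling the five-character hint string in get_word_hint() is left to you.

```python
def hint_for_position(secret_word, guess_word, index):
    """Return the hint character for one position of the guess.

    This helper is a hypothetical illustration, not part of the exercise.
    """
    secret_word = secret_word.upper()
    guess_word = guess_word.upper()
    letter = guess_word[index]

    if letter == secret_word[index]:
        return 'O'  # Correct letter in the same place
    elif letter in secret_word:
        return 'o'  # Correct letter in a different place
    else:
        return 'x'  # Letter is not in the secret word


print(hint_for_position('CRANE', 'CANDY', 0))  # C is also at index 0 in CRANE
print(hint_for_position('CRANE', 'CANDY', 1))  # A is in CRANE, but at index 2
print(hint_for_position('CRANE', 'CANDY', 3))  # D is not in CRANE
```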

The rest of the program should randomly choose a secret word from a list of five-character words and then give the user six tries to guess it. You can use this list of words:

'MITTS FLOAT BRICK LIKED DWARF COMMA GNASH ROOMS UNITE BEARS SPOOL ARMOR'.split()

The get_word_hint() function should convert the secret_word and guess_word arguments to uppercase. For simplicity, you don’t need to check that the user’s guess is a real word. When you run the program, it should look something like this:

Guess the secret five-letter word:
candy
Ooxxx

light
xxxxx

power
xOxxx

coals
OOoxx

cobra
OOxxO

cocoa
OOooO

The secret word was COMMA. Better luck next time.

Put the get_word_hint() function and the rest of the code in a program named wordMatchGame.py.

Diagonal Stripe Scroll Animation

Let’s create a scrolling text animation of a diagonal stripe. We don’t need advanced graphics to create animation; we can just use print() with strings to repeat the following pattern:

......
O.....
OO....
OOO...
OOOO..
OOOOO.
.OOOOO
..OOOO
...OOO
....OO
.....O

The pattern here is only 6 characters wide, but your program can be 50 characters wide by carrying out the following steps in an infinite loop:

  • Print a string that is zero O characters followed by 50 . characters.
  • Print a string that is one O character followed by 49 . characters.
  • Print a string that is two O characters followed by 48 . characters.
  • Continue this pattern until the program prints a string of 49 O characters followed by one . character.
  • Print a string that is one . character followed by 49 O characters.
  • Print a string that is two . characters followed by 48 O characters.
  • Repeat from the start.

You can use two for loops to print the two sets of patterns. To easily create the strings you’ll need, use the * operator for string replication. For example, if the variable i contains 2, the expression 'O' * i should create a string of two O characters, and the expression '.' * (50 - i) should create a string of 48 . characters.
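To see string replication at work, here is a minimal sketch that builds only the first half of the small 6-character pattern shown earlier. The WIDTH constant and the rows list are assumptions made for this illustration; the actual project uses a width of 50, a second loop, and an infinite loop around both.

```python
WIDTH = 6  # Hypothetical small width for illustration; the project uses 50.

rows = []
for i in range(WIDTH):
    # i O characters, followed by enough . characters to fill out the row
    rows.append('O' * i + '.' * (WIDTH - i))

for row in rows:
    print(row)
```

Every row has the same total length because the two replication counts always add up to WIDTH.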

This program may run too fast for you to enjoy the animation, so import the time module and, after each print() call, add a time.sleep(0.01) call. The resulting animation is rather enchanting, considering that the whole program is no more than about 10 lines long.

Save this program in a file named diagStripe.py.

mOcKiNg SpOnGeBoB mEmE

You may have seen the “Mocking Spongebob” meme format, which renders a statement in alternating uppercase and lowercase letters. Write a function named spongecase(text) that takes a string argument and returns the string in this format. Apply the following rules:

  • Leave non-letters unmodified.
  • Make the first letter lowercase.
  • For every letter, set the next letter to the opposite case. (Non-letter characters don’t change the case used for the next letter.)

The program should ask the user for a sentence and then display that sentence with “Mocking Spongebob” casing:

Enter a sentence:
Hello. It is nice to meet you.
hElLo. It Is NiCe To MeEt YoU.

Save this function in a file named mockingSpongebob.py.