Previous Chapter - A Modular Arithmetic Module for the Affine Cipher | Next Chapter - Hacking the Affine Cipher

PROGRAMMING THE AFFINE CIPHER

*“I should be able to whisper something in your ear, even if your ear is 1000 miles away, and the government disagrees with that.”—Philip Zimmermann, creator of Pretty Good Privacy (PGP), the most widely used email encryption software in the world*

In Chapter 13, you learned that the affine cipher is actually the multiplicative cipher combined with the Caesar cipher (Chapter 5), and the multiplicative cipher is similar to the Caesar cipher except it uses multiplication instead of addition to encrypt messages. In this chapter, you’ll build and run programs to implement the affine cipher. Because the affine cipher uses two different ciphers as part of its encryption process, it needs two keys: one for the multiplicative cipher and another for the Caesar cipher. For the affine cipher program, we’ll split a single integer into two keys.

Open a new file editor window by selecting **File**▸**New File**. Enter the following code into the file editor and then save it as *affineCipher.py*. Make sure the *pyperclip.py* module and the *cryptomath.py* module you made in Chapter 13 are in the same folder as the *affineCipher.py* file.

*affineCipher.py*

1. # Affine Cipher

2. # https://www.nostarch.com/crackingcodes/ (BSD Licensed)

3.

4. import sys, pyperclip, cryptomath, random

5. SYMBOLS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz12345

67890 !?.'

6.

7.

8. def main():

9. myMessage = """"A computer would deserve to be called intelligent

if it could deceive a human into believing that it was human."

-Alan Turing"""

10. myKey = 2894

11. myMode = 'encrypt' # Set to either 'encrypt' or 'decrypt'.

12.

13. if myMode == 'encrypt':

14. translated = encryptMessage(myKey, myMessage)

15. elif myMode == 'decrypt':

16. translated = decryptMessage(myKey, myMessage)

17. print('Key: %s' % (myKey))

18. print('%sed text:' % (myMode.title()))

19. print(translated)

20. pyperclip.copy(translated)

21. print('Full %sed text copied to clipboard.' % (myMode))

22.

23.

24. def getKeyParts(key):

25. keyA = key // len(SYMBOLS)

26. keyB = key % len(SYMBOLS)

27. return (keyA, keyB)

28.

29.

30. def checkKeys(keyA, keyB, mode):

31. if keyA == 1 and mode == 'encrypt':

32. sys.exit('Cipher is weak if key A is 1. Choose a different key.')

33. if keyB == 0 and mode == 'encrypt':

34. sys.exit('Cipher is weak if key B is 0. Choose a different key.')

35. if keyA < 0 or keyB < 0 or keyB > len(SYMBOLS) - 1:

36. sys.exit('Key A must be greater than 0 and Key B must be

between 0 and %s.' % (len(SYMBOLS) - 1))

37. if cryptomath.gcd(keyA, len(SYMBOLS)) != 1:

38. sys.exit('Key A (%s) and the symbol set size (%s) are not

relatively prime. Choose a different key.' % (keyA,

len(SYMBOLS)))

39.

40.

41. def encryptMessage(key, message):

42. keyA, keyB = getKeyParts(key)

43. checkKeys(keyA, keyB, 'encrypt')

44. ciphertext = ''

45. for symbol in message:

46. if symbol in SYMBOLS:

47. # Encrypt the symbol:

48. symbolIndex = SYMBOLS.find(symbol)

49. ciphertext += SYMBOLS[(symbolIndex * keyA + keyB) %

len(SYMBOLS)]

50. else:

51. ciphertext += symbol # Append the symbol without encrypting.

52. return ciphertext

53.

54.

55. def decryptMessage(key, message):

56. keyA, keyB = getKeyParts(key)

57. checkKeys(keyA, keyB, 'decrypt')

58. plaintext = ''

59. modInverseOfKeyA = cryptomath.findModInverse(keyA, len(SYMBOLS))

60.

61. for symbol in message:

62. if symbol in SYMBOLS:

63. # Decrypt the symbol:

64. symbolIndex = SYMBOLS.find(symbol)

65. plaintext += SYMBOLS[(symbolIndex - keyB) * modInverseOfKeyA %

len(SYMBOLS)]

66. else:

67. plaintext += symbol # Append the symbol without decrypting.

68. return plaintext

69.

70.

71. def getRandomKey():

72. while True:

73. keyA = random.randint(2, len(SYMBOLS))

74. keyB = random.randint(2, len(SYMBOLS))

75. if cryptomath.gcd(keyA, len(SYMBOLS)) == 1:

76. return keyA * len(SYMBOLS) + keyB

77.

78.

79. # If affineCipher.py is run (instead of imported as a module), call

80. # the main() function:

81. if __name__ == '__main__':

82. main()

From the file editor, press F5 to run the *affineCipher.py* program; the output should look like this:

Key: 2894

Encrypted text:

"5QG9ol3La6QI93!xQxaia6faQL9QdaQG1!!axQARLa!!AuaRLQADQALQG93!xQxaGaAfaQ1QX3o1R

QARL9Qda!AafARuQLX1LQALQI1iQX3o1RN"Q-5!1RQP36ARuFull encrypted text copied to

clipboard.

In the affine cipher program, the message, "A computer would deserve to be called intelligent if it could deceive a human into believing that it was human." -Alan Turing, gets encrypted with the key 2894 into ciphertext. To decrypt this ciphertext, you can copy and paste it as the new value to be stored in myMessage on line 9 and change myMode on line 13 to the string 'decrypt'.

Lines 1 and 2 of the program are comments describing what the program is. There’s also an import statement for the modules used in this program:

1. # Affine Cipher

2. # https://www.nostarch.com/crackingcodes/ (BSD Licensed)

3.

4. import sys, pyperclip, cryptomath, random

The four modules imported in this program serve the following functions:

The sys module is imported for the exit() function.

The pyperclip module is imported for the copy() clipboard function.

The cryptomath module that you created in Chapter 13 is imported for the gcd() and findModInverse() functions.

The random module is imported for the random.randint() function to generate random keys.

The string stored in the SYMBOLS variable is the symbol set, which is the list of all characters that can be encrypted:

5. SYMBOLS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz12345

67890 !?.'

Any characters in the message that don’t appear in SYMBOLS remain unencrypted in the ciphertext. For example, in the sample run of *affineCipher.py*, the quotation marks and the hyphen (-) don’t get encrypted in the ciphertext because they don’t belong in the symbol set.

Line 8 calls the main() function, which is almost exactly the same as the one in the transposition cipher programs. Lines 9, 10, and 11 store the message, key, and mode in variables, respectively:

8. def main():

9. myMessage = """"A computer would deserve to be called intelligent

if it could deceive a human into believing that it was human."

-Alan Turing"""

10. myKey = 2894

11. myMode = 'encrypt' # Set to either 'encrypt' or 'decrypt'.

The value stored in myMode determines whether the program encrypts or decrypts the message:

13. if myMode == 'encrypt':

14. translated = encryptMessage(myKey, myMessage)

15. elif myMode == 'decrypt':

16. translated = decryptMessage(myKey, myMessage)

If myMode is set to 'encrypt', line 14 executes and the return value of encryptMessage() is stored in translated. But if myMode is set to 'decrypt', decryptMessage() is called on line 16 and the return value is stored in translated. I’ll cover how the encryptMessage() and decryptMessage() functions work when we define them later in the chapter.

After the execution passes line 16, the translated variable has the encrypted or decrypted version of the message in myMessage.

Line 17 displays the key used for the cipher using the %s placeholder, and line 18 tells the user whether the output is encrypted or decrypted text:

17. print('Key: %s' % (myKey))

18. print('%sed text:' % (myMode.title()))

19. print(translated)

20. pyperclip.copy(translated)

21. print('Full %sed text copied to clipboard.' % (myMode))

Line 19 prints the string in translated, which is the encrypted or decrypted version of the string in myMessage, and line 20 copies it to the clipboard. Line 21 notifies the user that it is on the clipboard.

Unlike the Caesar cipher, which uses addition with only one key, the affine cipher uses multiplication and addition with two integer keys, which we’ll call Key A and Key B. Because it’s easier to remember just one number, we’ll use a mathematical trick to convert two keys into one key. Let’s look at how this works in *affineCipher.py*.

The getKeyParts() function on line 24 splits a single integer key into two integers for Key A and Key B:

24. def getKeyParts(key):

25. keyA = key // len(SYMBOLS)

26. keyB = key % len(SYMBOLS)

27. return (keyA, keyB)

The key to split is passed to the key parameter. On line 25, Key A is calculated by using integer division to divide key by len(SYMBOLS), the size of the symbol set. Integer division (//) returns the quotient without a remainder. The mod operator (%) on line 26 calculates the remainder, which we’ll use for Key B.

For example, with 2894 as the key parameter and a SYMBOLS string of 66 characters, Key A would be 2894 // 66 = 43 and Key B would be 2894 % 66 = 56.

To combine Key A and Key B back into a single key, multiply Key A by the size of the symbol set and add Key B to the product: (43 * 66) + 56 evaluates to 2894, which is the integer key we started with.

**NOTE**

*Keep in mind that according to Shannon’s Maxim (“The enemy knows the system!”) we must assume hackers know everything about the encryption algorithm, including the symbol set and the size of the symbol set. We assume that the only piece a hacker doesn’t know is the key that was used. The security of our cipher program should depend only on the secrecy of the key, not the secrecy of the symbol set or the program’s source code.*

Line 27 looks like it returns a list value, except parentheses are used instead of square brackets. This is a *tuple* value.

27. return (keyA, keyB)

A tuple value is similar to a list value in that it can store other values, which can be accessed with indexes or slices. However, unlike list values, tuple values cannot be modified. There’s no append() method for tuple values.

Because *affineCipher.py* doesn’t need to modify the value returned by getKeyParts(), using a tuple is more appropriate than a list.

Encrypting with the affine cipher involves a character’s index in SYMBOLS being multiplied by Key A and added to Key B. But if keyA is 1, the encrypted text is very weak because multiplying the index by 1 results in the same index. In fact, as defined by the multiplicative identity property, the product of any number and 1 is that number. Similarly, if keyB is 0, the encrypted text is weak because adding 0 to the index doesn’t change it. If keyA is 1 and keyB is 0 at the same time, the “encrypted” output would be identical to the original message. In other words, it wouldn’t be encrypted at all!

We check for weak keys using the checkKeys() function on line 30. The if statements on lines 31 and 33 check whether keyA is 1 or keyB is 0.

30. def checkKeys(keyA, keyB, mode):

31. if keyA == 1 and mode == 'encrypt':

32. sys.exit('Cipher is weak if key A is 1. Choose a different key.')

33. if keyB == 0 and mode == 'encrypt':

34. sys.exit('Cipher is weak if key B is 0. Choose a different key.')

If these conditions are met, the program exits with a message indicating what went wrong. Lines 32 and 34 each pass a string to the sys.exit() call. The sys.exit() function has an optional parameter that lets you print a string to the screen before terminating the program. You can use this function to display an error message on the screen before the program quits.

These checks prevent you from encrypting with weak keys, but if your mode is set to 'decrypt', the checks on lines 31 and 33 don’t apply.

The condition on line 35 checks whether keyA is a negative number (that is, whether it’s less than 0) *or* whether keyB is greater than 0 *or* less than the size of the symbol set minus one:

35. if keyA < 0 or keyB < 0 or keyB > len(SYMBOLS) - 1:

36. sys.exit('Key A must be greater than 0 and Key B must be

between 0 and %s.' % (len(SYMBOLS) - 1))

The reason the keys are in these ranges is described in the next section. If any of these conditions is True, the keys are invalid and the program exits.

Additionally, Key A must be relatively prime to the symbol set size. This means that the greatest common divisor (GCD) of keyA and len(SYMBOLS) must be equal to 1. Line 37 checks for this using an if statement, and line 38 exits the program if the two values are not relatively prime:

37. if cryptomath.gcd(keyA, len(SYMBOLS)) != 1:

38. sys.exit('Key A (%s) and the symbol set size (%s) are not

relatively prime. Choose a different key.' % (keyA,

len(SYMBOLS)))

If all the conditions in the checkKeys() function return False, nothing is wrong with the key, and the program doesn’t exit. Program execution returns to the line that originally called checkKeys().

Let’s try to calculate the number of possible keys the affine cipher has. The affine cipher’s Key B is limited to the size of the symbol set, where len(SYMBOLS) is 66. At first glance, it seems like Key A could be as large as you want it to be as long as it’s relatively prime to the symbol set size. Therefore, you might think that the affine cipher has an infinite number of keys and cannot be brute-forced.

But this is not the case. Recall how large keys in the Caesar cipher ended up being the same as smaller keys due to the wraparound effect. With a symbol set size of 66, the key 67 in the Caesar cipher would produce the same encrypted text as the key 1. The affine cipher also wraps around in this way.

Because the Key B part of the affine cipher is the same as the Caesar cipher, its range is limited from 1 to the size of the symbol set. To determine whether the affine cipher’s Key A is also limited, we’ll write a short program to encrypt a message using several different integers for Key A and see what the ciphertext looks like.

Open a new file editor window and enter the following source code. Save this file as *affineKeyTest.py* in the same folder as *affineCipher.py* and *cryptomath.py*. Then press F5 to run it.

*affineKeyTest.py*

1. # This program proves that the keyspace of the affine cipher is limited

2. # to less than len(SYMBOLS) ^ 2.

3.

4. import affineCipher, cryptomath

5.

6. message = 'Make things as simple as possible, but not simpler.'

7. for keyA in range(2, 80):

8. key = keyA * len(affineCipher.SYMBOLS) + 1

9.

10. if cryptomath.gcd(keyA, len(affineCipher.SYMBOLS)) == 1:

11. print(keyA, affineCipher.encryptMessage(key, message))

This program imports the affineCipher module for its encryptMessage() function and the cryptomath module for its gcd() function. We’ll always encrypt the string stored in the message variable. The for loop remains in a range between 2 and 80, because 0 and 1 are not allowed as valid Key A integers, as explained earlier.

On each iteration of the loop, line 8 calculates the key from the current keyA value and always uses 1 for Key B, which is why 1 is added at the end of line 8. Keep in mind that Key A must be relatively prime with the symbol set size to be valid. Key A is relatively prime with the symbol set size if the GCD of the key and the symbol set size is equal to 1. So if the GCD of the key and the symbol set size is not equal to 1, the if statement on line 10 will skip the call to encryptMessage() on line 11.

In short, this program prints the same message encrypted with several different integers for Key A. The output of this program looks like this:

5 0.xTvcin?dXv.XvXn8I3Tv.XvIDXXnE3T,vEhcv?DcvXn8I3TS

7 Tz4Nn1ipKbtnztntpDY NnztnYRttp7 N,n781nKR1ntpDY Nm9

13 ZJH0P7ivuVtPJtPtvhGU0PJtPG8ttvWU0,PWF7Pu87PtvhGU0g3

17 HvTx.oizERX.vX.Xz2mkx.vX.mVXXz?kx,.?6o.EVo.Xz2mkxGy

--snip--

67 Nblf!uijoht!bt!tjnqmf!bt!qpttjcmf,!cvu!opu!tjnqmfsA

71 0.xTvcin?dXv.XvXn8I3Tv.XvIDXXnE3T,vEhcv?DcvXn8I3TS

73 Tz4Nn1ipKbtnztntpDY NnztnYRttp7 N,n781nKR1ntpDY Nm9

79 ZJH0P7ivuVtPJtPtvhGU0PJtPG8ttvWU0,PWF7Pu87PtvhGU0g3

Look carefully at the output, and you’ll notice that the ciphertext for Key A of 5 is the same as the ciphertext for Key A of 71! In fact, the ciphertext from keys 7 and 73 are the same, as are the ciphertext from keys 13 and 79!

Notice also that subtracting 5 from 71 results in 66, the size of our symbol set. This is why a Key A of 71 does the same thing as a Key A of 5: the encrypted output repeats itself, or wraps around, every 66 keys. As you can see, the affine cipher has the same wraparound effect for Key A as it does for Key B. In sum, Key A is also limited to the symbol set size.

When you multiply 66 possible Key A keys by 66 possible Key B keys, the result is 4356 possible combinations. Then when you subtract the integers that can’t be used for Key A because they’re not relatively prime with 66, the total number of possible key combinations for the affine cipher drops to 1320.

To encrypt the message in *affineCipher.py*, we first need the key and the message to encrypt, which the encryptMessage() function takes as parameters:

41. def encryptMessage(key, message):

42. keyA, keyB = getKeyParts(key)

43. checkKeys(keyA, keyB, 'encrypt')

Then we need to get the integer values for Key A and Key B from the getKeyParts() function by passing it key on line 42. Next, we check whether these values are valid keys by passing them to the checkKeys() function. If the checkKeys() function doesn’t cause the program to exit, the keys are valid and the rest of the code in the encryptMessage() function after line 43 can proceed.

On line 44, the ciphertext variable starts as a blank string but will eventually hold the encrypted string. The for loop that begins on line 45 iterates through each of the characters in message and then adds the encrypted character to ciphertext:

44. ciphertext = ''

45. for symbol in message:

By the time the for loop is done looping, the ciphertext variable will contain the complete string of the encrypted message.

On each iteration of the loop, the symbol variable is assigned a single character from message. If this character exists in SYMBOLS, which is our symbol set, the index in SYMBOLS is found and assigned to symbolIndex on line 48:

46. if symbol in SYMBOLS:

47. # Encrypt the symbol:

48. symbolIndex = SYMBOLS.find(symbol)

49. ciphertext += SYMBOLS[(symbolIndex * keyA + keyB) %

len(SYMBOLS)]

50. else:

51. ciphertext += symbol # Append the symbol without encrypting.

To encrypt the text, we need to calculate the index of the encrypted letter. Line 49 multiplies this symbolIndex by keyA and adds keyB to the product. Then it mods the result by the size of the symbol set, represented by the expression len(SYMBOLS). Modding by len(SYMBOLS) handles the wraparound by ensuring the calculated index is always between 0 and up to, but not including, len(SYMBOLS). The resulting number will be the index in SYMBOLS of the encrypted character, which is concatenated to the end of the string in ciphertext.

Everything in the previous paragraph is done on line 49, using a single line of code!

If symbol isn’t in our symbol set, symbol is concatenated to the end of the ciphertext string on line 51. For example, the quotation marks and hyphen in the original message are not in the symbol set and therefore are concatenated to the string.

After the code has iterated through each character in the message string, the ciphertext variable should contain the full encrypted string. Line 52 returns the encrypted string from encryptMessage():

52. return ciphertext

The decryptMessage() function that decrypts the text is almost the same as encryptMessage(). Lines 56 to 58 are equivalent to lines 42 to 44.

55. def decryptMessage(key, message):

56. keyA, keyB = getKeyParts(key)

57. checkKeys(keyA, keyB, 'decrypt')

58. plaintext = ''

59. modInverseOfKeyA = cryptomath.findModInverse(keyA, len(SYMBOLS))

However, instead of multiplying by Key A, the decryption process multiplies by the modular inverse of Key A. The mod inverse is calculated by calling cryptomath.findModInverse(), as explained in Chapter 13.

Lines 61 to 68 are almost identical to the encryptMessage() function’s lines 45 to 52. The only difference is on line 65.

61. for symbol in message:

62. if symbol in SYMBOLS:

63. # Decrypt the symbol:

64. symbolIndex = SYMBOLS.find(symbol)

65. plaintext += SYMBOLS[(symbolIndex - keyB) * modInverseOfKeyA %

len(SYMBOLS)]

66. else:

67. plaintext += symbol # Append the symbol without decrypting.

68. return plaintext

In the encryptMessage() function, the symbol index was multiplied by Key A and then Key B was added to it. In the decryptMessage() function’s line 65, the symbol index first subtracts Key B from the symbol index and then multiplies it by the modular inverse. Then it mods this number by the size of the symbol set, len(SYMBOLS).

This is how the decryption process in *affineCipher.py* undoes the encryption. Now let’s look at how we can change *affineCipher.py* so that it randomly selects valid keys for the affine cipher.

It can be difficult to come up with a valid key for the affine cipher, so you can instead use the getRandomKey() function to generate a random but valid key. To do this, simply change line 10 to store the return value of getRandomKey() in the myKey variable:

10. myKey = getRandomKey()

--snip--

17. print('Key: %s' % (myKey))

Now the program randomly selects the key and prints it to the screen when line 17 executes. Let’s look at how the getRandomKey() function works.

The code on line 72 enters a while loop where the condition is True. This *infinite loop* will loop forever until it is told to return or the user terminates the program. If your program gets stuck in an infinite loop, you can terminate the program by pressing ctrl-C (ctrl-D on Linux or macOS). The getRandomKey() function will eventually exit the infinite loop with a return statement.

71. def getRandomKey():

72. while True:

73. keyA = random.randint(2, len(SYMBOLS))

74. keyB = random.randint(2, len(SYMBOLS))

Lines 73 and 74 determine random numbers between 2 and the size of the symbol set for keyA and for keyB. This code ensures that there’s no chance that Key A or Key B will be equal to the invalid values 0 or 1.

The if statement on line 75 checks to make sure that keyA is relatively prime with the size of the symbol set by calling the gcd() function in the cryptomath module.

75. if cryptomath.gcd(keyA, len(SYMBOLS)) == 1:

76. return keyA * len(SYMBOLS) + keyB

If keyA is relatively prime with the size of the symbol set, these two randomly selected keys are combined into a single key by multiplying keyA by the symbol set size and adding keyB to the product. (Note that this is the opposite of the getKeyParts() function, which splits a single integer key into two integers.) Line 76 returns this value from the getRandomKey() function.

If the condition on line 75 returns False, the code loops back to the start of the while loop on line 73 and picks random numbers for keyA and keyB again. The infinite loop ensures that the program continues looping until it finds random numbers that are valid keys.

Lines 81 and 82 call the main() function if this program was run by itself rather than being imported by another program:

79. # If affineCipher.py is run (instead of imported as a module), call

80. # the main() function:

81. if __name__ == '__main__':

82. main()

This ensures that the main() function runs when the program is run but not when the program is imported as a module.

Just as we did in Chapter 9, in this chapter we wrote a program (*affineKeyTest.py*) that can test our cipher program. Using this test program, you learned that the affine cipher has approximately 1320 possible keys, which is a number you can easily hack using brute-force. This means that we’ll have to toss the affine cipher onto the heap of easily hackable weak ciphers.

So the affine cipher isn’t much more secure than the previous ciphers we’ve looked at. The transposition cipher can have more possible keys, but the number of possible keys is limited to the size of the message. For a message with only 20 characters, the transposition cipher can have at most 18 keys, with keys ranging from 2 to 19. You can use the affine cipher to encrypt short messages with more security than the Caesar cipher provides, because its number of possible keys is based on the symbol set.

In Chapter 15, we’ll write a brute-force program that can break affine cipher–encrypted messages!