Note: The second edition of this book is available under the title Cracking Codes with Python
“I should be able to whisper something in your ear, even if your ear is 1000 miles away, and the government disagrees with that.”
Philip Zimmermann, creator of Pretty Good Privacy (PGP), the most widely used email encryption software in the world.
This chapter’s programs implement the multiplicative and affine ciphers. The multiplicative cipher is like the Caesar cipher from Chapter 6, except it uses multiplication instead of addition. The affine cipher is the multiplicative cipher, which is then encrypted by the Caesar cipher on top of that. The affine cipher needs two keys: one for the multiplicative cipher multiplication and the other for the Caesar cipher addition.
For the affine cipher program, we will use a single integer for the key. We will use some simple math to split this key into the two keys, which we will call Key A and Key B.
How the affine cipher works was covered in the last chapter. Here is the source code for a Python program that implements the affine cipher. Open a new file editor window by clicking on File ► New Window. Type in the following code into the file editor, and then save it as affineCipher.py. Press F5 to run the program. Note that first you will need to download the pyperclip.py module and place this file in the same directory as the affineCipher.py file. You can download this file from http://invpy.com/pyperclip.py.
When you press F5 from the file editor to run this program, the output will look like this:
The message “"A computer would deserve to be called intelligent if it could deceive a human into believing that it was human." -Alan Turing” gets encrypted with the key 2023 into the above ciphertext.
To decrypt, paste this text as the new value to be stored in myMessage and change myMode to the string 'decrypt'.
Practice exercises can be found at http://invpy.com/hackingpractice15A.
Lines 1 and 2 are the usual comments describing what the program is. There is also an import statement for the modules used in this program.
· The sys module is imported for the exit() function.
· The pyperclip module is imported for the copy() clipboard function.
· The cryptomath module that we created in the last chapter is imported for the gcd() and findModInverse() function.
In our program, the string stored in the SYMBOLS variable is the symbol set. The symbol set is the list of all characters that can be encrypted. Any characters in the message to be encrypted that don’t appear in SYMBOLS will be added to the ciphertext unencrypted.
The main() function is almost exactly the same as the one from the transposition cipher programs. The message, key, and mode are stored in variables on lines 9, 10, and 11.
If myMode is set to 'encrypt', then line 14 will be executed and the return value of encryptMessage() is stored in translated. Or else, if myMode is set to 'decrypt', then decryptMessage() is called on line 16 and the return value is stored in translated.
Either way, after the execution has passed line 16, the translated variable will have the encrypted or decrypted version of the message in myMessage.
The string in translated (which is the encrypted or decrypted version of the string in myMessage) is displayed on the screen on line 19 and copied to the clipboard on line 20.
The affine cipher is like the Caesar cipher, except that it uses multiplication and addition (with two integer keys, which we called Key A and Key B) instead of just addition (with one key). It’s easier to remember just one number, so we will use a mathematical trick to convert between two keys and one key.
The getKeyParts() function splits a single integer key into two integers for Key A and Key B. The single key (which is in the parameter key) is divided by the size of the symbol set, and Key A is the quotient and Key B is the remainder. The quotient part (without any remainder) can be calculated using the // integer division operator, which is what line 25 does. The remainder part (without the quotient) can be calculated using the % mod operator, which is what line 26 does.
It is assumed that the symbol set, as well as the size of the symbol set, is publicly known along with the rest of the source code.
For example, with 2023 as the key parameter and a SYMBOLS string of 95 characters, Key A would be 2023 // 95 or 21 and Key B would be 2023 % 95 or 28.
To combine Key A and Key B back into the single key, multiply Key A by the size of the symbol set and add Key B: (21 * 95) + 28 evaluates to 2023.
A tuple value is similar to a list: it is a value that can store other values, which can be accessed with indexes or slices. However, the values in a tuple cannot be modified. There is no append() method for tuple values. A tuple is written using parentheses instead of square brackets. The value returned on line 27 is a tuple.
For technical reasons beyond the scope of this book, the Python interpreter can execute code faster if it uses tuples compared to code that uses lists.
Encrypting with the affine cipher involves a character’s index in SYMBOLS being multiplied by Key A and added to Key B. But if keyA is 1, the encrypted text will be very weak because multiplying the index by 1 does not change it. Similarly, if keyB is 0, the encrypted text will be weak because adding the index to 0 does not change it. And if both keyA was 1 and keyB was 0, the “encrypted” message would be the exact same as the original message. It wouldn’t be encrypted at all!
The if statements on line 31 and 33 check for these “weak key” conditions, and exit the program with a message telling the user what was wrong. Notice on lines 32 and 34, a string is being passed to the sys.exit() call. The sys.exit() function has an optional parameter of a string that will be printed to the screen before terminating the program. This can be used to display an error message on the screen before the program quits.
Of course, these checks only apply to prevent you from encrypting with weak keys. If mode is set to 'decrypt', then the checks on lines 31 and 33 don’t apply.
The condition on line 35 checks if keyA is a negative number (that is, it is less than 0) or if keyB is greater than 0 or less than the size of the symbol set minus one. (The reason the Key B check has this range is described later in the “How Many Keys Does the Affine Cipher Have?” section.) If any of these things are True, the keys are invalid and the program exits.
Finally, Key A must be relatively prime with the symbol set size. This means that the greatest common divisor of keyA and len(SYMBOLS) must be equal to 1. Line 37’s if statement checks for this and exits the program if they are not relatively prime.
If all of the conditions in the checkKeys() function were False, there is nothing wrong with the key and the program will not exit. Line 38 is the last line in the function, so the program execution next returns to the line that originally called checkKeys().
First we get the integer values for Key A and Key B from the getKeyParts() function. These values are checked if they are valid keys or not by passing them to the checkKeys() function. If the checkKeys() function does not cause the program to exit, then the rest of the code in the encryptMessage() function after line 43 can assume that the keys are valid.
The ciphertext variable will eventually hold the encrypted string, but starts off as a blank string. The for loop that begins on line 45 will iterate through each of the characters in message, and then add the encrypted character to ciphertext. By the time the for loop is done looping, the ciphertext variable will have the complete string of the encrypted message.
On each iteration of the loop, the symbol variable is assigned the single character from message. If this character exists in SYMBOLS (that is, our symbol set), then the index in SYMBOLS is found and assigned to symIndex. The value in symIndex is the “number” version of the character.
To encrypt it, we need to calculate the index of the encrypted letter. We multiply this symIndex by keyA and add keyB, and mod the number by the size of the symbol set (that is, the expression len(SYMBOLS)). We mod by len(SYMBOLS) because the affine cipher has a similar “wrap-around” issue that the Caesar cipher had. Modding by len(SYMBOLS) handles the “wrap-around” by ensuring the calculated index is always between 0 up to (but not including) len(SYMBOLS). The number that we calculate will be the index in SYMBOLS of the encrypted character, which is concatenated to the end of the string in ciphertext.
Everything that happens in the above paragraph was done on line 49.
If symbol was not in our symbol set, then symbol is concatenated to the end of the ciphertext string on line 51.
Once we have iterated through each character in the message string, the ciphertext variable should contain the full encrypted string. This string is returned from encryptMessage().
The decryptMessage() function is almost the same as the encryptMessage(). Lines 56 to 58 are equivalent to lines 44 to 46.
However, instead of multiplying by Key A, the decryption process needs to multiply by the modular inverse of Key A. The mod inverse can be calculated by calling cryptomath.findModInverse(). This function was explained in the previous chapter.
Lines 61 to 68 are almost identical to the encryptMessage() function’s lines 45 to 52. The only difference is on line 65. In the encryptMessage() function, the symbol index was multiplied by Key A and then had Key B added to it. In decryptMessage()’s line 65, the symbol index first has Key B subtracted from it, and then is multiplied by the modular inverse. Then this number is modded by the size of the symbol set, len(SYMBOLS). This is how the decryption process undoes the encryption.
It can be difficult to come up with a valid key for the affine cipher, so we will create a getRandomKey() function that generates a random (but valid) key for the user to use. To use this, the user simply has to change line 10 to store the return value of getRandomKey() in the myKey variable:
Now the key that is used to encrypt is randomly selected for us. It will be printed to the screen when line 17 is executed.
The code in getRandomKey()enters a while loop on line 72 where the condition is True. This is called an infinite loop, because the loop’s condition is never False. If your program gets stuck in an infinite loop, you can terminate it by pressing Ctrl-C or Ctrl-D.
The code on lines 73 and 74 determine random numbers between 2 and the size of the symbol set for keyA and for keyB. This way there is no chance that Key A or Key B are equal to the invalid values 0 or 1.
The if statement on line 75 checks to make sure that keyA is relatively prime with the size of the symbol set by calling the gcd() function in the cryptomath module. If it is, then these two keys are combined into a single key by multiplying keyA by the symbol set size and adding keyB. (This is the opposite of what the getKeyParts() function does.) This value is returned from the getRandomKey() function.
If the condition on line 75 was False, then the code loops back to the start of the while loop on line 73 and picks random numbers for keyA and keyB again. The infinite loop ensures that the program keeps looping again and again until it finds random numbers that are valid keys.
Lines 81 and 82 call the main() function if this program was run by itself, rather than imported by another program.
Key B of the affine cipher is limited to the size of the symbol set (in the case of affineCipher.py, len(SYMBOLS) is 95). But it seems like Key A could be as large as we want it to be (as long as it is relatively prime to the symbol set size). Therefore the affine cipher should have an infinite number of keys and therefore cannot be brute-forced.
As it turns out, no. Remember how large keys in the Caesar cipher ended up being the same as smaller keys due to the “wrap-around” effect. With a symbol set size of 26, the key 27 in the Caesar cipher would produce the same encrypted text as the key 1. The affine cipher also “wraps around”.
Since the Key B part of the affine cipher is the same as the Caesar cipher, we know it is limited from 1 to the size of the symbol set. But to find out if the affine cipher’s Key A is also limited, we can write a small program to encrypt a message with several different integers for Key A and see what the ciphertext looks like.
Open a new file editor window and type the following source code. Save this file as affineKeyTest.py, and then press F5 to run it.
This is a fairly simple program. It imports the affineCipher module for its encryptMessage() function and the cryptomath module for its gcd() function. We will always encrypt the string stored in the message variable. The for loop will range between 2 (since 0 and 1 are not allowed as valid Key A integers) and 100.
On each iteration of the loop, we calculate the key from the current keyA value and always use 1 for Key B (this is why 1 is added on line 8). Remember that it is not valid to use a Key A that is not relatively prime with the symbol set size. So if the greatest common divisor of the key and the symbol set size is not equal to 1, the if statement on line 10 will skip the call to encryptMessage() on line 11.
Basically, this program will print out the same message encrypted with several different integers for Key A. The output of this program will look like this:
Look carefully at the output. You’ll notice that the ciphertext for Key A of 2 is the exact same as the ciphertext for Key A of 97! In fact, the ciphertext from keys 3 and 98 are the same, as are the ciphertext from keys 4 and 99!
Notice that 97 - 95 is 2. This is why a Key A of 97 does the same thing as a Key A of 2: the encrypted output repeats itself (that is, “wraps around”) every 95 keys. The affine cipher has the same “wrap-around” for the Key A as it does for Key B! It seems like it is limited to the symbol set size.
95 possible Key A keys multiplied by 95 possible Key B keys means there are 9,025 possible combinations. If you subtract the integers that can’t be used for Key A (because they are not relatively prime with 95), this number drops to 7,125 possible keys.
7,125 is about the same number of keys that’s possible with most transposition cipher messages, and we’ve already learned how to program a computer to hack that number of keys with brute-force. This means that we’ll have to toss the affine cipher onto the heap of weak ciphers that are easily hacked.
The affine cipher isn’t any more secure than the previous ciphers we’ve looked at. The transposition cipher can have more possible keys, but the number of possible keys is limited to the size of the message. For a message with only 20 characters, the transposition cipher can only have at most 18 keys (the keys 2 to 19). The affine cipher can be used to encrypt short messages with more security than the Caesar cipher provided, since its number of possible keys is based on the symbol set.
But we did learn some new mathematical concepts that we will use later on. The concepts of modular arithmetic, greatest common divisor, and modular inverses will help us in the RSA cipher at the end of this book.
But enough about how the affine cipher is weak in theory. Let’s write a brute-force program that can actually break affine cipher encrypted messages!