Note: The second edition of this book is available under the title Cracking Codes with Python
To hack the transposition cipher, we will use a brute-force approach. Of the thousands of keys, the correct key is most likely the only one that will result in readable English. We developed English-detection code in the last chapter so the program can recognize when it has found the correct key.
Open a new file editor window by clicking on File ► New Window. Type the following code into the file editor, and then save it as transpositionHacker.py. Press F5 to run the program. Note that first you will need to download the pyperclip.py module and place this file in the same directory as the transpositionHacker.py file. You can download this file from http://invpy.com/pyperclip.py.
When you run this program, the output will look like this:
When the hacker program has found a likely correct decryption, it will pause and wait for the user to press “D” and then Enter. If the decryption is a false positive, the user can just press Enter and the program will continue to try other keys.
Run the program again and skip the correct decryption by just pressing Enter. The program assumes that it was not a correct decryption and continues brute-forcing through the other possible keys. Eventually the program runs through all the possible keys and then gives up, telling the user that it was unable to hack the ciphertext:
The transposition hacker program is under 50 lines of code because much of it exists in other programs. Several modules are imported on line 4.
The ciphertext to be hacked is stored in the myMessage variable. Line 9 has a string value that begins and ends with triple quotes. These strings do not have to have literal single and double quotes escaped inside of them. Triple quote strings are also called multi-line strings, because they can also contain actual newlines within them. Try typing the following into the interactive shell:
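A minimal sketch of such a string, written as a short script rather than at the >>> prompt (the variable name note and the message text are invented for this illustration):

```python
# The wording of this message is made up for illustration only.
# Both kinds of quote characters appear inside it without backslashes,
# and the string contains real newlines.
note = """Dear Alice,
The word "key" and the word 'cipher' need no backslashes here.
Sincerely,
Bob"""

print(note)

# Four lines of text are separated by three newline characters.
print(note.count('\n'))
```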
Notice that this string value can span over multiple lines. Everything after the opening triple quotes will be interpreted as part of the string until it reaches triple quotes ending it. Multi-line strings can either use three double quote characters or three single quote characters.
Multi-line strings are useful for putting very large strings into the source code for a program, which is why one is used on line 9 to store the ciphertext to be broken.
The ciphertext hacking code exists inside the hackTransposition() function. This function takes one string argument: the encrypted ciphertext message to be broken. If the function can hack the ciphertext, it returns a string of the decrypted text. Otherwise, it returns the None value. This value is stored in the hackedMessage variable.
If None was stored in hackedMessage, the program prints that it was unable to break the encryption on the message.
Otherwise, the text of the decrypted message is printed to the screen on line 17 and also copied to the clipboard on line 18.
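The pattern of returning None on failure and checking for it in the caller can be sketched as follows. This is only an illustration: hack_transposition() here is a hypothetical stub that "succeeds" for one hard-coded input, describe_result() and the message wording are assumptions, and the clipboard copy is left out so the sketch needs no extra modules.

```python
def hack_transposition(ciphertext):
    # Hypothetical stub standing in for hackTransposition(): it returns a
    # string for one hard-coded input and None for everything else, so that
    # both outcomes can be demonstrated.
    if ciphertext == 'ehlol':
        return 'hello'
    return None


def describe_result(hacked_message):
    # None signals that no key produced an accepted decryption.
    if hacked_message is None:
        return 'Failed to hack encryption.'
    return 'Hacked message: ' + hacked_message


print(describe_result(hack_transposition('ehlol')))
print(describe_result(hack_transposition('zzzzz')))
```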
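Because there are many keys the program can go through, the program displays a message to the user telling her that the hacking has started. The print() call on line 26 also tells her that she can press Ctrl-C (on Windows) or Ctrl-D (on OS X and Linux) to exit the program at any point. (Ctrl-C interrupts a running Python program by raising KeyboardInterrupt, which ends the program unless the code catches it. On OS X and Linux, Ctrl-D signals end-of-input, so it takes effect while the program is waiting at an input() prompt.)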
The possible keys for the transposition cipher are the integers between 1 and the length of the message. The for loop on line 29 will run the hacking part of the function with each of these keys.
To provide feedback to the user, the key that is being tested is printed to the screen on line 30, using string interpolation to place the integer in key inside the 'Trying key #%s...' % (key) string.
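As a quick illustration of that interpolation (the key value 6 and the variable name status are arbitrary choices for this example):

```python
key = 6  # arbitrary example value

# %s is replaced by the string form of key. The parentheses around key are
# ordinary grouping parentheses; they do not create a tuple.
status = 'Trying key #%s...' % (key)

print(status)
```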
Using the decryptMessage() function in the transpositionDecrypt.py program that we’ve already written, line 32 gets the decrypted output from the current key being tested and stores it in the decryptedText variable.
The decrypted output in decryptedText will most likely only be English if the correct key was used (otherwise, it will appear to be random garbage). The string in decryptedText is passed to the detectEnglish.isEnglish() function we wrote in the last chapter.
But just because detectEnglish.isEnglish() returns True (making the program execution enter the block following the if statement on line 34) doesn’t mean the program has found the correct key. It could be a “false positive”. To be sure, line 38 prints out the first 100 characters of the decryptedText string (by using the slice decryptedText[:100]) on the screen for the user to look at.
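The slice is safe even when the string is shorter than 100 characters, as this small sketch suggests (the stand-in strings and variable names are made up for the example):

```python
# 150 characters of stand-in text playing the role of decryptedText.
decrypted_text = 'abcdefghij' * 15

# [:100] keeps the characters at indexes 0 through 99.
preview = decrypted_text[:100]
print(len(preview))

# Slicing past the end of a short string does not raise an error;
# it simply returns the whole string.
short_text = 'only a few characters'
short_preview = short_text[:100]
print(short_preview == short_text)
```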
The program pauses when line 41 executes, waiting for the user to type in either D or nothing before pressing Enter. This input is stored as a string in response.
The strip() string method returns a copy of the string with any whitespace removed from the start and end. The whitespace characters include the space character, the tab character, and the newline character. The strip() method can also have a string argument passed to it that tells the method which characters should be removed from the start and end of the string instead of whitespace. Try typing the following into the interactive shell:
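A few calls along these lines, written as a script rather than at the >>> prompt (the sample strings and variable names are invented for this illustration), show both forms of strip():

```python
# With no argument, strip() removes whitespace from both ends only.
trimmed = '   Hello \t\n'.strip()
print(repr(trimmed))

# Whitespace in the middle of the string is left alone.
inner_kept = '  Hello   World  '.strip()
print(repr(inner_kept))

# With a string argument, those characters are removed instead of whitespace.
no_x = 'xxHelloxx'.strip('x')
print(repr(no_x))

# The argument is treated as a set of characters, not as a fixed prefix
# or suffix, so any mix of 'a' and 'b' at either end is removed.
no_ab = 'ababaHELLObaba'.strip('ab')
print(repr(no_ab))
```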
The expression on line 43 used for the if statement’s condition lets the user have some flexibility with what has to be typed in. If the condition were response == 'D', then the user would have to type in exactly “D” and nothing else in order to end the program.
If the user typed in 'd' or ' D' or 'Done' then the condition would be False and the program would continue. To avoid this, the string in response has any whitespace removed from the start or end with the call to strip(). Then the string that response.strip() evaluates to has the upper() method called on it. If the user typed in either “d” or “D”, the string returned from upper() will be 'D'. Little things like this make our programs easier for the user to use.
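The difference between the strict comparison and the more forgiving one can be sketched like this. The helper names are hypothetical, and the use of startswith('D') is an assumption about how the condition accepts input such as 'Done'; only strip() and upper() are described above.

```python
def is_done_strict(response):
    # Accepts exactly 'D' and nothing else.
    return response == 'D'


def is_done_flexible(response):
    # Trim surrounding whitespace, ignore case, and accept anything that
    # begins with the letter D (assumed behavior for this sketch).
    return response.strip().upper().startswith('D')


for answer in ['D', 'd', ' D', 'Done', '', 'no']:
    print(repr(answer), is_done_strict(answer), is_done_flexible(answer))
```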
If the user has indicated that the decrypted string is correct, the decrypted text is returned from hackTransposition() on line 44.
Line 46 is the first line after the for loop that began on line 29. If the program execution reaches this point, it’s because the return statement on line 44 was never reached. That would only happen if the correctly decrypted text was never found for any of the keys that were tried.
In that case, line 46 returns the None value to indicate that the hacking has failed.
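Putting the steps of this walkthrough together, the overall shape of the brute-force loop might look like the sketch below. It is not the book's listing: encrypt() and decrypt() are simplified stand-ins for the transposition cipher programs, looks_english() is a toy replacement for detectEnglish.isEnglish() with a made-up word list, and the confirm parameter is a hypothetical callback used in place of the input() prompt so the example can run unattended.

```python
import math

# Toy stand-in for the detectEnglish module. A real detector would use a
# full dictionary file rather than this handful of words.
COMMON_WORDS = {'the', 'and', 'of', 'to', 'a', 'in', 'is', 'it',
                'that', 'was', 'end', 'road'}


def looks_english(text, threshold=0.6):
    # True if at least `threshold` of the space-separated words are known.
    words = text.lower().split()
    if not words:
        return False
    matches = sum(1 for word in words if word in COMMON_WORDS)
    return matches / len(words) >= threshold


def encrypt(key, message):
    # Columnar transposition: each column takes every key-th character.
    return ''.join(message[col::key] for col in range(key))


def decrypt(key, message):
    num_columns = math.ceil(len(message) / key)
    num_rows = key
    num_unused = num_columns * num_rows - len(message)
    plaintext = [''] * num_columns
    col = 0
    row = 0
    for symbol in message:
        plaintext[col] += symbol
        col += 1
        # Move to the next row at the end of the grid, or one column early
        # on rows whose last box is unused.
        if col == num_columns or (col == num_columns - 1
                                  and row >= num_rows - num_unused):
            col = 0
            row += 1
    return ''.join(plaintext)


def hack_transposition(ciphertext, confirm):
    # Try every key from 1 up to, but not including, the message length.
    for key in range(1, len(ciphertext)):
        candidate = decrypt(key, ciphertext)
        # confirm() receives the first 100 characters, much as the user
        # is shown a preview before answering.
        if looks_english(candidate) and confirm(candidate[:100]):
            return candidate
    # No candidate was accepted for any key.
    return None


secret = 'it is the end of the road and that was that'
ciphertext = encrypt(7, secret)
print(hack_transposition(ciphertext, lambda preview: preview == secret))
```

Passing the confirmation step in as a function is purely a convenience for this sketch. A confirm callback that always returns False plays the part of a user who presses Enter at every prompt, and the function then falls through to return None.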
Lines 48 and 49 call the main() function if this program was run by itself, rather than imported by another program that wants to use its hackTransposition() function.
Practice exercises can be found at http://invpy.com/hackingpractice13A.
This chapter was short like the “Breaking the Caesar Cipher with the Brute-Force Technique” chapter because (also like that chapter) most of the code was already written in other programs. Our hacking program can import functions from these other programs by importing them as modules.
The strip() string method is useful for removing whitespace (or other) characters from the beginning or end of a string. If we use triple quotes, then a string value can span across multiple lines in our source code.
The detectEnglish.py program removes a lot of the work of inspecting the decrypted output to see if it’s English. This allows the brute-force technique to be applied to a cipher that can have thousands of keys.
Our programs are becoming more sophisticated. Before we learn the next cipher, we should learn how to use Python’s debugger tool to help us find bugs in our programs.