Note: The second edition of this book is available under the title Cracking Codes with Python
Topics Covered In This Chapter:
· The Unbreakable One-Time Pad Cipher
· The Two-Time Pad is the Vigenère Cipher
“I’ve been over it a thousand times,” Waterhouse says, “and the only explanation I can think of is that they are converting their messages into large binary numbers and then combining them with other large binary numbers —one-time pads, most likely —to produce the ciphertext.”
“In which case your project is doomed," Alan says, "because you can't break a one-time pad.”
“Cryptonomicon” by Neal Stephenson
There is one cipher that is impossible to crack, no matter how powerful your computer is, how much time you have to crack it, or how clever of a hacker you are. We won’t have to write a new program to use it either. Our Vigenère program can implement this cipher without any changes. But this cipher is so inconvenient to use on a regular basis that it is often only used for the most top-secret of messages.
The one-time pad cipher is an unbreakable cipher. It is a Vigenère cipher where:
1. The key is exactly as long as the message that is encrypted.
2. The key is made up of truly random symbols.
3. The key is used one time only, and never used again for any other message.
By following these three rules, your encrypted message will be invulnerable to any cryptanalyst’s attack. Even with literally an infinite amount of computing power, the cipher cannot be broken.
The key for the one-time pad cipher is called a pad because they were printed on pads of paper. The top sheet of paper would be torn off the pad after it was used to reveal the next key to use.
To see why the one-time pad (OTP) cipher is unbreakable, let’s think about why the regular Vigenère cipher is vulnerable to breaking. Our Vigenère cipher hacking program works by doing frequency analysis. But if the key is the same length as the message, then every possible ciphertext letter is equally probable to be for the same plaintext letter.
Say that we want to encrypt the message, “If you want to survive out here, you've got to know where your towel is.” If we remove the spaces and punctuation, this message has 55 letters. So to encrypt it with a one-time pad, we need a key that is also 55 letters long. Let’s use the key “kcqyzhepxautiqekxejmoretzhztrwwqdylbttvejmedbsanybpxqik”. Encrypting the string looks like this:
Plaintext |
ifyouwanttosurviveouthereyouvegottoknowwhereyourtowelis |
Key |
kcqyzhepxautiqekxejmoretzhztrwwqdylbttvejmedbsanybpxqik |
Ciphertext |
shomtdecqtilchzssixghyikdfnnmacewrzlghraqqvhzguerplbbqc |
Now imagine a cryptanalyst got a hold of the ciphertext (“shomtdec...”). How could she attack the cipher? Brute-forcing through the keys would not work, because there are too many even for a computer. The number of keys is 26 ^ (number of letters in the message), so if the message has 55 letters, there would be a total of 26 ^ 55, or 666,091,878,431,395,624,153,823,182,526,730,590, 376,250,379,52 8,249,805,353,030,484,209,594,192,101,376 possible keys.
But it turns out that even if she had a computer that was powerful enough to try all the keys, it still would not break the one-time pad cipher. This is because for any ciphertext, all possible plaintext messages are equally likely.
For example, given the ciphertext “shomtdec...”, we could easily say the original plaintext was “The myth of Osiris was of importance in ancient Egyptian religion.” encrypted with the key “zakavkxolfqdlzhwsqjbzmtwmmnakwurwexdcuywksgorghnnedvtcp”:
Plaintext |
themythofosiriswasofimportanceinancientegyptianreligion |
Key |
zakavkxolfqdlzhwsqjbzmtwmmnakwurwexdcuywksgorghnnedvtcp |
Ciphertext |
shomtdecqtilchzssixghyikdfnnmacewrzlghraqqvhzguerplbbqc |
The way we are able to hack encryption is because there is usually only one key that can be used to decrypt the message to sensible English. But we’ve just shown that the same ciphertext could have been made from two very different plaintext messages. For the one-time pad, the cryptanalyst has no way of telling which was the original message. In fact, any readable English plaintext message that is exactly 55 letters long is just as likely to be the original plaintext. Just because a certain key can decrypt the ciphertext to readable English does not mean it was the original encryption key.
Since any English plaintext could have been used to create a ciphertext with equal likelihood, it is completely impossible to hack a message encrypted with a one-time pad.
The random module that comes with Python does not generate truly random numbers. They are computed from an algorithm that creates numbers that only appear random (which is often good enough). If the pad is not generated from a truly random source, then it loses its mathematically-perfect secrecy.
The os.urandom() function can provide truly random numbers but is a bit more difficult to use. For more information about this function, see http://invpy.com/random.
If you do use the same one-time pad key to encrypt two different messages, you have introduced a weakness into your encryption. Using the one-time pad cipher this way is sometimes called a “two-time pad cipher”. It’s a joke name though, the two-time pad cipher is really just using the one-time pad cipher incorrectly.
Just because a key decrypts the one-time pad ciphertext to readable English does not mean it is the correct key. However, if you use the same key for two different messages, now the hacker can know that if a key decrypts the first ciphertext to readable English, but that same key decrypts the second message to random garbage text, it must not be the original key. In fact, it is highly likely that there is only one key that will decrypt both messages to English.
If the hacker only had one of the two messages, then it is still perfectly encrypted. But, you must always assume that all of your encrypted messages are being intercepted by hackers and/or governments (otherwise, you wouldn’t need to bother encrypting your messages.) Remember Shannon’s Maxim: The enemy knows the system! This includes knowing the ciphertext.
To see why the two-time pad is hackable just like the Vigenère Cipher, let’s think about how the Vigenère cipher works when it encrypts a message that is longer than the key. Once we run out of characters in the key to encrypt with, we go back to the first character of the key and continue encrypting. So to encrypt a 20-character message like “AABBCCDDEEVVWWXXYYZZ” with a 10-character long key such as “PRECOCIOUS”, the first ten characters (AABBCCDDEE) are encrypted with “PRECOCIOUS” and then the next ten characters (VVWWXXYYZZ) are also encrypted with “PRECOCIOUS”.
Plaintext |
AABBCCDDEEVVWWXXYYZZ |
Vigenère Key |
PRECOCIOUSPRECOCIOUS |
Vigenère Ciphertext |
PRFDQELRYWKMAYLZGMTR |
We have already learned how to break Vigenère ciphers. If we can show that a two-time pad cipher is the same thing as a Vigenère cipher, then we can prove that it is breakable using the same techniques used to break Vigenère cipher.
Using the one-time pad cipher, let’s say the 10-character message “AABBCCDDEE” was encrypted with the one-time pad key “PRECOCIOUS”. Then the cryptographer makes the mistake of encrypting a second 10-character message “VVWWXXYYZZ” with the same one-time pad key, “PRECOCIOUS”.
|
Message 1 Message 2 |
Plaintext |
AABBCCDDEE VVWWXXYYZZ |
One-Time Pad Key |
PRECOCIOUS PRECOCIOUS |
One-Time Pad Ciphertext |
PRFDQELRYW KMAYLZGMTR |
If we compare the ciphertext of the Vigenère cipher and the ciphertexts of the one-time pad cipher, we can see they are the exact same. The two-time pad cipher has the same properties as the Vigenère cipher, which means that the same techniques could be used to hack it!
This also tells us that if we do the Vigenère cipher but use a key that is as long as the message it is encrypting (and only use this key once for this message), then it will be perfectly unbreakable. This is why we don’t need to write a separate one-time pad cipher program. Our Vigenère cipher program already does it!
Practice exercises can be found at http://invpy.com/hackingpractice22A.
In short, a one-time pad is just the Vigenère cipher with a key that is the same length as the message and is only used once. As long as these two conditions are followed, it is literally impossible to break the one-time pad. However, it is so inconvenient to use the one-time pad that it is not generally used except for the most top-secret of secrets. Usually a large list of one-time pad keys are generated and shared in person, with the keys marked for specific dates. This way, if you receive a message from your collaborator on October 31st, you can just look through the list of one-time pads to find the one for that day. But be sure this list doesn’t fall into the wrong hands!