“I’ve been over it a thousand times,” Waterhouse says, “and the only explanation I can think of is that they are converting their messages into large binary numbers and then combining them with other large binary numbers—one-time pads, most likely.”
“In which case your project is doomed,” Alan says, “because you can’t break a one-time pad.”
—Neal Stephenson, Cryptonomicon
In this chapter, you’ll learn about a cipher that is impossible to crack, no matter how powerful your computer is, how much time you spend trying to crack it, or how clever a hacker you are. It’s called the one-time pad cipher, and the good news is that we don’t have to write a new program to use it! The Vigenère cipher program you wrote in Chapter 18 can implement this cipher without any changes. But the one-time pad cipher is so inconvenient to use on a regular basis that it’s often reserved for the most top-secret of messages.
The one-time pad cipher is a Vigenère cipher that becomes unbreakable when the key meets the following criteria:
It is exactly as long as the encrypted message.
It is made up of truly random symbols.
It is used only once and never again for any other message.
By following these three rules, you can make your encrypted message invulnerable to any cryptanalyst’s attack. Even with infinite computing power, the cipher cannot be broken.
The key for the one-time pad cipher is called a pad because the keys used to be printed on pads of paper. After the top sheet of paper was used, it would be torn off the pad to reveal the next key to use. Usually, a large list of one-time pad keys is generated and shared in person, and the keys are marked for specific dates. For example, if we received a message from our collaborator on October 31, we would just look through the list of one-time pads to find the corresponding key for that day.
To understand why the one-time pad cipher is unbreakable, let’s think about what makes the regular Vigenère cipher vulnerable to attack. Recall that the Vigenère cipher–hacking program works by using frequency analysis. But if the key is the same length as the message, each plaintext letter’s subkey is unique, meaning that each plaintext letter could be encrypted to any ciphertext letter with equal probability.
For example, to encrypt the message IF YOU WANT TO SURVIVE OUT HERE, YOU’VE GOT TO KNOW WHERE YOUR TOWEL IS, we remove the spaces and punctuation to get a message that has 55 letters. To encrypt this message using a one-time pad, we need a key that is also 55 letters long. Using the example key KCQYZHEPXAUTIQEKXEJMORETZHZTRWWQDYLBTTVEJMEDBSANYBPXQIK to encrypt the string would result in the ciphertext SHOMTDECQTILCHZSSIXGHYIKDFNNMACEWRZLGHRAQQVHZGUERPLBBQC, as shown in Figure 21-1.
Figure 21-1: Encrypting an example message using a one-time pad
Now imagine that a cryptanalyst gets hold of the ciphertext (SHOM TDEC...). How could they attack the cipher? Brute-forcing through the keys wouldn’t work, because there are too many even for a computer. The number of keys would be equal to 26 raised to the power of the total number of letters in the message. So if the message has 55 letters as in our example, there would be a total of 2655, or 666,091,878,431,395,624,153,823,182, 526,730,590,376,250,379,528,249,805,353,030,484,209,594,192,101,376 possible keys.
Even if the cryptanalyst had a computer powerful enough to try all the keys, it still wouldn’t be able to break the one-time pad cipher because for any ciphertext, all possible plaintext messages are equally likely.
For example, the ciphertext SHOMTDEC... could have easily resulted from a completely different plaintext the same number of letters in length, such as THE MYTH OF OSIRIS WAS OF IMPORTANCE IN ANCIENT EGYPTIAN RELIGION encrypted using the key ZAKAVKXOLFQDLZHWSQJBZMTWMMNAKWURWEXDCUYWKSGORGHNNEDVTCP, as shown in Figure 21-2.
Figure 21-2: Encrypting a different example message using a different key but producing the same ciphertext as before
The reason we can hack any encryption at all is that we know there is usually only one key that decrypts the message to sensible English. But we’ve just seen in the preceding example that the same ciphertext could have been made using two very different plaintext messages. When we use the one-time pad, the cryptanalyst has no way of telling which is the original message. In fact, any readable English plaintext message that is exactly 55 letters long is just as likely to be the original plaintext. Just because a certain key can decrypt the ciphertext to readable English doesn’t mean it is the original encryption key.
Because any English plaintext could have been used to create a ciphertext with equal likelihood, it’s impossible to hack a message encrypted using a one-time pad.
As you learned in Chapter 9, the random module built into Python doesn’t generate truly random numbers. They are computed using an algorithm that creates numbers that only appear random, which is good enough in most cases. However, for the one-time pad to work, the pad must be generated from a truly random source; otherwise, it loses its mathematically perfect secrecy.
Python 3.6 and later have the secrets module, which uses the operating system’s source of truly random numbers (often gathered from random events, such as the time between the user’s keystrokes). The function secrets.randbelow() can return truly random numbers between 0 and up to but not including the argument passed to it, as in this example:
>>> import secrets
>>> secrets.randbelow(10)
2
>>> secrets.randbelow(10)
0
>>> secrets.randbelow(10)
6
The functions in secrets are slower than the functions in random, so the functions in random are preferred when true randomness is not needed. You can also use the secrets.choice() function, which returns a randomly chosen value from the string or list passed to it, as in this example:
>>> import secrets
>>> secrets.choice('ABCDEFGHIJKLMNOPQRSTUVWXYZ')
'R'
>>> secrets.choice(['cat', 'dog', 'mouse'])
'dog'
To create a truly random one-time pad that is 55 characters long, for example, use the following code:
>>> import secrets
>>> otp = ''
>>> for i in range(55):
otp += secrets.choice('ABCDEFGHIJKLMNOPQRSTUVWXYZ')
>>> otp
'MVOVAAYDPELIRNRUZNNQHDNSOUWWNWPJUPIUAIMKFKNHQANIIYCHHDC'
There’s one more detail we must keep in mind when using the one-time pad. Let’s examine why we need to avoid using the same one-time pad key more than once.
A two-time pad cipher refers to using the same one-time pad key to encrypt two different messages. This creates a weakness in an encryption.
As mentioned earlier, just because a key decrypts the one-time pad ciphertext to readable English doesn’t mean it is the correct key. However, when you use the same key for two different messages, you’re giving crucial information to the hacker. If you encrypt two messages using the same key and a hacker finds a key that decrypts the first ciphertext to readable English but decrypts the second message to random garbage text, the hacker will know that the key they found must not be the original key. In fact, it is highly likely that there is only one key that decrypts both messages to English, as you’ll see in the next section.
If the hacker has only one of the two messages, that message is still perfectly encrypted. But we must always assume that all of our encrypted messages are being intercepted by hackers and governments. Otherwise, we wouldn’t bother encrypting messages in the first place. Shannon’s maxim is important to keep in mind: the enemy knows the system! This includes all of your ciphertext.
You’ve already learned how to break Vigenère ciphers. If we can show that a two-time pad cipher is the same as a Vigenère cipher, we can prove that it’s breakable using the same techniques used to break the Vigenère cipher.
To explain why the two-time pad is hackable just like the Vigenère cipher, let’s review how the Vigenère cipher works when it encrypts a message that is longer than the key. When we run out of letters in the key to encrypt with, we go back to the first letter of the key and continue encrypting. For example, to encrypt a 20-letter message like BLUE IODINE INBOUND CAT with a 10-letter key such as YZNMPZXYXY, the first 10 letters (BLUE IODINE) are encrypted with YZNMPZXYXY, and then the next 10 letters (INBOUND CAT) are also encrypted with YZNMPZXYXY. Figure 21-3 shows this wraparound effect.
Figure 21-3: The Vigenère cipher’s wraparound effect
Using the one-time pad cipher, let’s say the 10-letter message BLUE IODINE is encrypted using the one-time pad key YZNMPZXYXY. Then the cryptographer makes the mistake of encrypting a second 10-letter message, INBOUND CAT, with the same one-time pad key, YZNMPZXYXY, as shown in Figure 21-4.
Figure 21-4: Encrypting plaintext using a one-time pad produces the same ciphertext as the Vigenère cipher.
When we compare the ciphertext of the Vigenère cipher shown in Figure 21-3 (ZKHQXNAGKCGMOAJMAAXR) to the ciphertexts of the one-time pad cipher shown in Figure 21-4 (ZKHQXNAGKC GMOAJMAAXR), we can see they are exactly the same. This means that because the two-time pad cipher has the same properties as the Vigenère cipher, we can use the same techniques to hack it!
In short, a one-time pad is a way to make Vigenère cipher encryptions invulnerable to hacking by using a key that has the same length as the message, is truly random, and is used only once. When these three conditions are met, it’s impossible to break the one-time pad. However, because it’s so inconvenient to use, it’s not used for everyday encryption. Usually, the one-time pad is distributed in person and contains a list of keys. But be sure this list doesn’t fall into the wrong hands!