# Thread: Decrypting OTP ciphertext

1. ## Decrypting OTP ciphertext

(See Stroustrup, The C++ Programming Language (3rd ed.), p. 164).

How would you help decrypt messages encrypted using the OTP algorithm?

I'm able to encrypt messages alright using XOR:

Code:
```while there are characters in message:
message XOR key = ciphertext;```
The key ("pad") has the same number of characters as the message, and is generated using a truly random RNG; true randomness is required for perfect security.

Suppose you're presented with an OTP-encoded message. How would you generate a list of plausable messages and keys?

My first idea was to generate message-key pairs by exhaustion and apply grammar rules for indo-european languages to each generated message, but that would have to be done ~10^21 times for a 20-character-long message.

Is there a quicker way to do this? I don't know where to start : (

Thanks.

2. I think that's the hole point with OTP generated encryption. It is literally unbreakable other than by brute force, and that takes a very long time...

--
Mats

3. Originally Posted by matsp
I think that's the hole point with OTP generated encryption. It is literally unbreakable other than by brute force, and that takes a very long time...
It's not breakable period, by brute force or any other method.

4. What about just generating messages that are grammatically correct? A combination like

"97xj34m725"

probably wouldn't occur in a plaintext message....

Also:

If the key is shorter than the message, the encryption is not secure. Why is that?

5. In practical terms, for messages with any reasonable content, yes, I agree - the only possible way to break One Time Pad encryption is by having a copy of the One Time Pad used for the message.

Of course you CAN break it by brute force - ANY encryption can be broken by that method. Just try every permutation in every position, and you will sooner or later find the right combination [1]. Of course, there are 26 combinations for each position, so the number of possible permuations is 26 to the power of message length, which quickly gets to a HUGE number [that assumes that you only use A-Z keys too, which may not be the case - in which case it takes off at even more rapid rate].

It is obviously also necessary to have some mechanism to recognise "words" in some way.

[1] Of course, you can probably use some heuristics for "cutting the attempt short" when a partiucular letter doesn't make any meaning in that position. Although there is of course no guarantee that the message doesn't starts or ends with "%%A!"H##11211NNNAZZmmAAAE" just to confuse people, and that there are some extra random letters thrown in for good measure here and there. .

--
Mats

6. Originally Posted by slippy
What about just generating messages that are grammatically correct? A combination like

"97xj34m725"

probably wouldn't occur in a plaintext message....

Also:

If the key is shorter than the message, the encryption is not secure. Why is that?
If the key is (noticably) shorter than the message, it is more likely that the breaker can "guess" the key based on the frequency of certain letters in "average text" (in English, S and E are amongst the most common).

As I stated in the other message, there is nothing saying that there isn't nonsense "filler" stuff in the message, but you can of course expect SOME of the message to have valid content. It helps if you know which language it is supposed to be, as for example KK doesn't occur very often in English words [if at all], but isn't that unlikely in Finish.

--
Mats

7. > Of course you CAN break it by brute force - ANY encryption can be broken by that method.

With a random OTP, all possible output messages are equally likely, so in particular all possible output messages that satisfy any criterion you want are equally likely. Sure, with brute force you will eventually generate the right message, but you won't have any way of recognizing it. The only way to break OTP is if the pad isn't truly random, as can happen if one is lazy and decides to reuse a pad.

Edit: Actually, that's not right - plausible-looking output messages are more likely, of course, but since there is a one-to-one correspondence between output messages and keys, if the key is truly random and unknown then the encrypted message can't provide any information that one didn't have to begin with (except for the message's length, and even that's just an upper bound, since the message could have been padded with random garbage). In other words, instead of trying random keys and looking for plausible-looking output messages, one may as well just generate every possible output message of the given length and sort through those instead.

8. >Just try every permutation in every position, and you will sooner or later find the right combination

But the problem is distinguishing the correct plaintext from the wrong ones, isn't it? Assuming the key is perfectly random, the only way you could find the correct plaintext is if there is only one possible plaintext of the same length of the ciphertext.

On the other hand, if there is any kind of pattern in the key, it might be possible to recognize when that pattern is apparent in a guessed key and also produces a reasonable decryption.

9. Originally Posted by slippy
What about just generating messages that are grammatically correct? A combination like

"97xj34m725"

probably wouldn't occur in a plaintext message....
Usually, as an extra precaution, an key based encryption is use in conjunction with a simple cypher, so that even if the codebreaker finds the key, he won't immediately be able to recognize that the key was correct.

10. Originally Posted by matsp
Of course you CAN break it by brute force - ANY encryption can be broken by that method.
Untrue. There are a huge number of plausible messages the ciphertext could decrypt to, each with a different key. It is impossible in principle to determine which is which.

Seriously, Claude Shannon didn't get this wrong.

Here's a 16 character ciphertext:

67c6697351ff4aec29cdbaabf2fbe346

This was encrypted by XOR'ing the plaintext with a 16 bit one time pad.

Tell me what this decrypts to, using pad 13ae0000719639cc48edc9ce91898632.