one time pad breakable debate

**laserlight** · 03-17-2010

Originally Posted by MK27

Randomness isn't, IMO. I'm a determinist like Einstein. Randomness is an illusion.

Entropy is I see more like an effect of an "arbitrary" event or juxtaposition, as in thermodynamics or quantum physics. On which I am no expert

The point being, there are no random sources, only entropic wells to tap.

That's a fine philosophical view of randomness (of the lack thereof). It has no bearing on the concept on randomness with respect to cryptography, which works even if say, God does not play dice with the universe.

**NeonBlack** · 03-17-2010

Originally Posted by laserlight

God does not play dice with the universe.

God can do whatever the hell he wants to.

**Mario F.** · 03-17-2010

Originally Posted by whiteflags

If you choose a source that maps to 47 zeros amd 53 ones, that's not going to be random

I hope you are talking exclusively of deterministic PRNGs.

An RNG input will condition the results, admittedly mapping to 47 zeros and 53 ones. But you will still get a random result if you cannot measure the quality of the RNG in any possible way. And here too there is no form of rigging the output (like running an extractor to produce a uniform distribution) that will change the fact the original sequence was random, despite not being uniformly distributed.

**whiteflags** · 03-17-2010

I am talking exclusively about the generator that MK says is random.

If I understand what you mean, then a generator can be random even if the probability of an outcome is not constant. (Again, uniformly distributed stuff... there is a wiki page.)

In probability theory and statistics, the discrete uniform distribution is a probability distribution whereby a finite number of values are equally likely to be observed; every one of n values has equal probability 1/n.

Maybe that's clearer.... Otherwise, frankly I give up then with sincere apologies. I understood this as a fundamental part of unpredictability. It might be me, but I'd need some help learning this correctly, if I am wrong.

**Mario F.** · 03-18-2010

You are correct, as far as I know. But only where the definition is applicable. And this is why they run extractors on RNGs. To achieve that characteristic dear to fields like cryptography where randomness needs to be mathematically defined.

However a uniform distribution, merely defines a characteristic of a known system. It does not by itself determine the condition of being random. It's defines the quality of that randomness. What has been defining randomness for us so far is our inability to determine the next outcome without any margin of error whatsoever (shocking!). That's why we look for generators outside deterministic origins. Reading noise levels from rain is bound do produce random data, data we have no way of predicting or duplicating and that is fully non deterministic to the best of our current scientific knowledge and technological abilities. Yet we know full well, before we start reading it, that the output of those noise levels will contain undesirable patterns and bias. We do not question their random condition, but because we need a uniform distribution we merely call them Weak Random Data.

I agree that under such fields as cryptanalysis, it doesn't matter. Laserlight also repeatedly tried to put this point across. And I agree. We cannot just extrapolate a concept and trample over centuries of well defined and mathematically established rules concerning random data generation. But I felt I needed to insist that the original data is still random, despite not being very useful for some purposes under its present condition.

It's not that cryptanalysis has a definition for Random. It's that it has requirements for Random. I think this is an important distinction.

**kryptkat** · 03-18-2010

Unicity distance - Wikipedia, the free encyclopedia

true randomness can not be mathematically defined. that is the problem. as programmers we can only try to simulate true randomness.

http://www.cacr.math.uwaterloo.ca/hac/about/chap5.pdf
is that the whole chapter or just part of the chapter ?

this is more fascinating than a ball of string.

exactly how many numbers do you really need to find the seed ? do you have any handy code you could paste that would find the seed ?

i provide you with a senerio where your criteria is met. a user recognizing the message as the one that was encrypted. the user lost the key and had to brute force the file to recover the data. how is that not considered breaking a one time pad ?

**Mario F.** · 03-18-2010

Originally Posted by kryptkat

i provide you with a senerio where your criteria is met. a user recognizing the message as the one that was encrypted. the user lost the key and had to brute force the file to recover the data. how is that not considered breaking a one time pad ?

As someone who knows the contents of the plaintext, you have a determining advantage over an attacker who is exactly trying to establish what the contents of the plaintext are. You are not attacking the cipher because a one-time pad is exactly defined by its ability to hide the real plaintext behind a seemingly infinite number of possible matches. And you cannot test this ability if you know what the plaintest is. In other words, you are not attacking the cipher because the one thing that defines it is never put to the test.

Your example can serve other type of ciphers, mind you. But just not a One-Time Pad.

**Sebastiani** · 03-18-2010

You know, I just realized a very easy way to implement a "repetitive" OTP (eg: wherein the key is shorter than the data and so is repeated for the length of the data) that would be sufficiently secure (provided that the key is "reasonably strong", of course): Just compress the data before applying the key. The only requirement there is that the compression format couldn't contain any "signature" bytes that could be used to reveal the key. I actually have a Huffman compressor that uses such a convention (it relies on sanity checks to verify the file rather than signatures). The attacker would have to resort to trying to guess the key without the benefit of analytic methods.

**laserlight** · 03-18-2010

Originally Posted by kryptkat

i provide you with a senerio where your criteria is met. a user recognizing the message as the one that was encrypted. the user lost the key and had to brute force the file to recover the data. how is that not considered breaking a one time pad ?

It is obviously not breaking a one time pad, since the user would have known the plaintext even without cryptanalysis. If the user does not know the plaintext, then he/she would be unable to recognise the message.

Originally Posted by Sebastiani

You know, I just realized a very easy way to implement a "repetitive" OTP (eg: wherein the key is shorter than the data and so is repeated for the length of the data) that would be sufficiently secure (provided that the key is "reasonably strong", of course): Just compress the data before applying the key.

Yes, that is a good idea, and I believe that it is already in use when implementing ciphers in practice (including those not equivalent to a one time pad). Of course, in the end it must the case that the key is not repeated, otherwise it is not a one time pad.

Originally Posted by Sebastiani

The only requirement there is that the compression format couldn't contain any "signature" bytes that could be used to reveal the key.

For a one time pad, that does not matter. The portion of the key revealed has nothing to do with the rest of the data.

**Sebastiani** · 03-18-2010

Originally Posted by laserlight

Yes, that is a good idea, and I believe that it is already in use when implementing ciphers in practice (including those not equivalent to a one time pad). Of course, in the end it must the case that the key is not repeated, otherwise it is not a one time pad.

Right, and no doubt a true OTP would be much more secure.

Originally Posted by laserlight

For a one time pad, that does not matter. The portion of the key revealed has nothing to do with the rest of the data.

Yes, for a true OTP that is indeed the case, but if the key is actually being repeated then revealing a portion of the key would necessarily expose each and every cell that corresponds to that index of the key. So for example, let's say the attacker knows that the data is a text file. He starts with a guess that the key block is N bytes long, and then xor's the Xth index of each block with the value V. If every byte altered yields a character, it's marked as a possible candidate, and the process continues. Then all of the permutations of candidate combinations are tried and compared with a dictionary. The value of N is incremented and the process repeats. Even though it's just a brute force technique, the rejection criteria may well speed things up quite a bit, making it an ugly but notheless feasable approach.

**laserlight** · 03-18-2010

Originally Posted by kryptkat

i provide you with a senerio where your criteria is met. a user recognizing the message as the one that was encrypted. the user lost the key and had to brute force the file to recover the data. how is that not considered breaking a one time pad ?

How can that be considered as breaking a one time pad? Let's make your scenario harder: not only has the user lost the key, but the user has lost the encrypted file as well. It is obvious that using your idea, the user can still recover the message, simply by brute force. In other words, it is not even necessary to have the ciphertext to retrieve the plaintext. Therefore, there is no cryptanalysis involved; you're only testing the user's ability to recognise the message when it is shown to him/her.

**pianorain** · 03-19-2010

Originally Posted by MK27

I'm gonna assert the best key generator will have no predictable distribution at all.

This is an interesting idea. I want to make sure I understand what you're saying. For all two-bit strings, which of the following scenarios best exemplifies what you mean by 'no predictable distribution'?

There are four two-bit strings: 00, 01, 10, 11. Each of these strings has a 25% chance of being generated by the RNG.
There are three bit ratios: 100/0, 50/50, 0/100. Each of these bit ratios has a 33% chance of being generated by the RNG.

**laserlight** · 03-19-2010

Originally Posted by MK27

Let's go back to the idea that a 1000 bits represents 2^1000 possibilities. If I use a RNG that guarantees a 50/50 distribution, or is based on an algorithm admired for such, then that is no longer true...

IF I, AS THE CRACKER, KNOW THAT YOU ARE OBSESSED WITH PERFECT DISTRIBUTION, GUESS WHAT JUST HAPPENED? HA HA HA!

With a uniform distribution, each bit is equally likely to be a 0 or a 1, by definition. Therefore, a RNG with a uniform distribution has a probability of 1/(2^1000) to generate a sequence that consists entirely of 0s, and indeed this probability is the same for any particular sequence of 1000 bits. So, the adversary is not better off with this knowledge.

EDIT:
Ah, I see that whiteflags has already mentioned this, although I would not go so far as to say that distribution has nothing to do with output since it is the distribution that determines characteristics of the output (more than just the proportion of 0s and 1s). I believe that this is an idea behind the use of statistical tests of randomness, i.e., by testing a sufficiently large sample of output, you can draw some conclusions about how "random" is the random sequence generated, and thus how good is the quality of the PRNG or RNG for the intended purpose.

**kryptkat** · 03-19-2010

How can that be considered as breaking a one time pad? Let's make your scenario harder: not only has the user lost the key, but the user has lost the encrypted file as well. It is obvious that using your idea, the user can still recover the message, simply by brute force. In other words, it is not even necessary to have the ciphertext to retrieve the plaintext. Therefore, there is no cryptanalysis involved; you're only testing the user's ability to recognise the message when it is shown to him/her.

nice. that makes sense.

but what about video ? say a video stream or pictures. you know i am correct about the focus thing. you or the user or the attacker may recognize pictures or normal movies or even porn. you think that would be considered breaking a one time pad then ?

if a user xors the most popular letters say 's' 'a' and 't' they might get them in the correct spot. then guess some of the correct words cryptanalysis is just testing and guessing anyway. ie one line of 't' then one run on the file with 's' then one run on the file with the other letters to see what shows up.

**laserlight** · 03-19-2010

Originally Posted by kryptkat

but what about video ? say a video stream or pictures. you know i am correct about the focus thing. you or the user or the attacker may recognize pictures or normal movies or even porn. you think that would be considered breaking a one time pad then ?

No, because even if the attacker has external information that tells him/her that the plaintext is video of a particular format, the attacker would still be faced with all possible videos of that format that can be encoded and encrypted such that it produces a ciphertext of that length. Since all those videos are equally likely, recognising one of them does not mean anything... further down the road of brute force, the attacker will recognise another, and another, and another, etc. Eventually, one of them will be the plaintext, but with just this information, the attacker will never be able to honestly say that aha!, among all those that I recognise, this video is most likely to be the plaintext.

Originally Posted by kryptkat

if a user xors the most popular letters say 's' 'a' and 't' they might get them in the correct spot. then guess some of the correct words cryptanalysis is just testing and guessing anyway. ie one line of 't' then one run on the file with 's' then one run on the file with the other letters to see what shows up.

You seem to be talking about frequency analysis. Again, this does not work against a one time pad, because the attacker has no way to verify which of the guesses is correct. The idea that because the ciphertext depends entirely on the key, as long as you do not know the key, you can derive no information about the plaintext, regardless of your cryptanalysis technique.

We only have to be concerned with brute force. If brute force does not cut it, nothing else will, including frequency analysis.