one time pad breakable debate

**Mario F.** · 03-17-2010

Yah, but i still had no one correct my spelling there. Help?

**whiteflags** · 03-17-2010

I think where everyone has a cerebral disconnect is where distribution applies. Some of us seem to think that uniform distribution means that out of all generated data, the data will appear in equal quantities, such that you can guess 1/nth of it by brute force. Not at all the case. It just means that you've selected something that may "seem right".

In cryptography especially, seeming right is tantamount to disaster. The output of an rng has nothing to do with its randomness, it's how random it is going to be. If a generator favors 1 as opposed to 0 by whatever amount... it may produce things that are very random... but without an equal probability, I just cannot define something as random. There's always some other factor.

The whole point of a period in a random system is after x tries, there is no choice, it's going to repeat the period, unless the prng is self modifying (and that would be interesting).

I just start to wonder if I make sense to anyone at all.

**MK27** · 03-17-2010

Originally Posted by whiteflags

Some of us seem to think that uniform distribution means that out of all generated data, the data will appear in equal quantities, such that you can guess 1/nth of it by brute force. Not at all the case.

So you could get a stream of all ones with a uniform distribution, is what you are saying. I guess so!

If a generator favors 1 as opposed to 0 by whatever amount... it may produce things that are very random... but without an equal probability, I just cannot define something as random. There's always some other factor.

You are still hung-up on this REIFIED idea of randomness. You need to read some critiques of Platonism my friend!

If a generator does not favor 1 or 0 zero by any amount, then it must favor a uniform distribution. That is not random.

A perfect random bitstream generator would have the same odds of producing a stream of continuous zeros as a stream of continuous ones as a stream consisting of exactly 50% ones and exactly 50% zeros. There should be no ratio more likely than any other.

Given infinite time and computing power, what kind of distribution would you assign to such a generator at the end? Uniform?

Do you see some paradoxical problems starting to form? There are a few routes to a uniform distribution. And the one that makes it there first is not the winner! Therefore: you can never declare a distribution uniform until you have all the data the generator will ever produce! You are just guesstimating the truth!

It's all because of that reification fallacy. I'm starting to understand where this came from...

Originally Posted by whiteflags

I just start to wonder if I make sense to anyone at all.

**whiteflags** · 03-17-2010

In a word, yes!

I want you to fully understand my point, so I am going to use the same quote.

"A perfect random bitstream generator would have the same odds of producing a stream of continuous zeros as a stream of continuous ones as a stream consisting of exactly 50% ones and exactly 50% zeros. There should be no ratio more likely than any other."

Except, with dice, 1/6 is not greater than 1/6. With bits, a system that has an equal number of ones as to choose from as outcomes, versus an equal number of zeros, is going to be perfectly random. If you choose a source that maps to 47 zeros amd 53 ones, that's not going to be random, according to your own damn quote, because a one is more favorable than zero.

This is why I kept telling you I don't care how you nigger rig the results so that there are 50 zeros and 50 ones, or whatever. You cannot determine randomness through output. A stream of ones is as likely as a stream of zeros if the outcomes are equally likely. This is distribution. Distribution has nothing to do with output. Bold, capitals, color. I have been trying to explain this to you the whole time. The keyboard is not a random source because you have not shown me a keyboard that maps to an equal number of ones and zeros.

If you don't want to believe that, you misunderstand your own quote.

**Mario F.** · 03-17-2010

Originally Posted by whiteflags

If you choose a source that maps to 47 zeros amd 53 ones, that's not going to be random

I hope you are talking exclusively of deterministic PRNGs.

An RNG input will condition the results, admittedly mapping to 47 zeros and 53 ones. But you will still get a random result if you cannot measure the quality of the RNG in any possible way. And here too there is no form of rigging the output (like running an extractor to produce a uniform distribution) that will change the fact the original sequence was random, despite not being uniformly distributed.

**laserlight** · 03-17-2010

Originally Posted by MK27

Randomness isn't, IMO. I'm a determinist like Einstein. Randomness is an illusion.

Entropy is I see more like an effect of an "arbitrary" event or juxtaposition, as in thermodynamics or quantum physics. On which I am no expert

The point being, there are no random sources, only entropic wells to tap.

That's a fine philosophical view of randomness (of the lack thereof). It has no bearing on the concept on randomness with respect to cryptography, which works even if say, God does not play dice with the universe.

**NeonBlack** · 03-17-2010

Originally Posted by laserlight

God does not play dice with the universe.

God can do whatever the hell he wants to.

**whiteflags** · 03-17-2010

I am talking exclusively about the generator that MK says is random.

If I understand what you mean, then a generator can be random even if the probability of an outcome is not constant. (Again, uniformly distributed stuff... there is a wiki page.)

In probability theory and statistics, the discrete uniform distribution is a probability distribution whereby a finite number of values are equally likely to be observed; every one of n values has equal probability 1/n.

Maybe that's clearer.... Otherwise, frankly I give up then with sincere apologies. I understood this as a fundamental part of unpredictability. It might be me, but I'd need some help learning this correctly, if I am wrong.

**Mario F.** · 03-18-2010

You are correct, as far as I know. But only where the definition is applicable. And this is why they run extractors on RNGs. To achieve that characteristic dear to fields like cryptography where randomness needs to be mathematically defined.

However a uniform distribution, merely defines a characteristic of a known system. It does not by itself determine the condition of being random. It's defines the quality of that randomness. What has been defining randomness for us so far is our inability to determine the next outcome without any margin of error whatsoever (shocking!). That's why we look for generators outside deterministic origins. Reading noise levels from rain is bound do produce random data, data we have no way of predicting or duplicating and that is fully non deterministic to the best of our current scientific knowledge and technological abilities. Yet we know full well, before we start reading it, that the output of those noise levels will contain undesirable patterns and bias. We do not question their random condition, but because we need a uniform distribution we merely call them Weak Random Data.

I agree that under such fields as cryptanalysis, it doesn't matter. Laserlight also repeatedly tried to put this point across. And I agree. We cannot just extrapolate a concept and trample over centuries of well defined and mathematically established rules concerning random data generation. But I felt I needed to insist that the original data is still random, despite not being very useful for some purposes under its present condition.

It's not that cryptanalysis has a definition for Random. It's that it has requirements for Random. I think this is an important distinction.

**kryptkat** · 03-18-2010

Unicity distance - Wikipedia, the free encyclopedia

true randomness can not be mathematically defined. that is the problem. as programmers we can only try to simulate true randomness.

http://www.cacr.math.uwaterloo.ca/hac/about/chap5.pdf
is that the whole chapter or just part of the chapter ?

this is more fascinating than a ball of string.

exactly how many numbers do you really need to find the seed ? do you have any handy code you could paste that would find the seed ?

i provide you with a senerio where your criteria is met. a user recognizing the message as the one that was encrypted. the user lost the key and had to brute force the file to recover the data. how is that not considered breaking a one time pad ?

**Mario F.** · 03-18-2010

Originally Posted by kryptkat

i provide you with a senerio where your criteria is met. a user recognizing the message as the one that was encrypted. the user lost the key and had to brute force the file to recover the data. how is that not considered breaking a one time pad ?

As someone who knows the contents of the plaintext, you have a determining advantage over an attacker who is exactly trying to establish what the contents of the plaintext are. You are not attacking the cipher because a one-time pad is exactly defined by its ability to hide the real plaintext behind a seemingly infinite number of possible matches. And you cannot test this ability if you know what the plaintest is. In other words, you are not attacking the cipher because the one thing that defines it is never put to the test.

Your example can serve other type of ciphers, mind you. But just not a One-Time Pad.

**Sebastiani** · 03-18-2010

You know, I just realized a very easy way to implement a "repetitive" OTP (eg: wherein the key is shorter than the data and so is repeated for the length of the data) that would be sufficiently secure (provided that the key is "reasonably strong", of course): Just compress the data before applying the key. The only requirement there is that the compression format couldn't contain any "signature" bytes that could be used to reveal the key. I actually have a Huffman compressor that uses such a convention (it relies on sanity checks to verify the file rather than signatures). The attacker would have to resort to trying to guess the key without the benefit of analytic methods.

**laserlight** · 03-18-2010

Originally Posted by kryptkat

i provide you with a senerio where your criteria is met. a user recognizing the message as the one that was encrypted. the user lost the key and had to brute force the file to recover the data. how is that not considered breaking a one time pad ?

It is obviously not breaking a one time pad, since the user would have known the plaintext even without cryptanalysis. If the user does not know the plaintext, then he/she would be unable to recognise the message.

Originally Posted by Sebastiani

You know, I just realized a very easy way to implement a "repetitive" OTP (eg: wherein the key is shorter than the data and so is repeated for the length of the data) that would be sufficiently secure (provided that the key is "reasonably strong", of course): Just compress the data before applying the key.

Yes, that is a good idea, and I believe that it is already in use when implementing ciphers in practice (including those not equivalent to a one time pad). Of course, in the end it must the case that the key is not repeated, otherwise it is not a one time pad.

Originally Posted by Sebastiani

The only requirement there is that the compression format couldn't contain any "signature" bytes that could be used to reveal the key.

For a one time pad, that does not matter. The portion of the key revealed has nothing to do with the rest of the data.

**Sebastiani** · 03-18-2010

Originally Posted by laserlight

Yes, that is a good idea, and I believe that it is already in use when implementing ciphers in practice (including those not equivalent to a one time pad). Of course, in the end it must the case that the key is not repeated, otherwise it is not a one time pad.

Right, and no doubt a true OTP would be much more secure.

Originally Posted by laserlight

For a one time pad, that does not matter. The portion of the key revealed has nothing to do with the rest of the data.

Yes, for a true OTP that is indeed the case, but if the key is actually being repeated then revealing a portion of the key would necessarily expose each and every cell that corresponds to that index of the key. So for example, let's say the attacker knows that the data is a text file. He starts with a guess that the key block is N bytes long, and then xor's the Xth index of each block with the value V. If every byte altered yields a character, it's marked as a possible candidate, and the process continues. Then all of the permutations of candidate combinations are tried and compared with a dictionary. The value of N is incremented and the process repeats. Even though it's just a brute force technique, the rejection criteria may well speed things up quite a bit, making it an ugly but notheless feasable approach.

**laserlight** · 03-18-2010

Originally Posted by kryptkat

i provide you with a senerio where your criteria is met. a user recognizing the message as the one that was encrypted. the user lost the key and had to brute force the file to recover the data. how is that not considered breaking a one time pad ?

How can that be considered as breaking a one time pad? Let's make your scenario harder: not only has the user lost the key, but the user has lost the encrypted file as well. It is obvious that using your idea, the user can still recover the message, simply by brute force. In other words, it is not even necessary to have the ciphertext to retrieve the plaintext. Therefore, there is no cryptanalysis involved; you're only testing the user's ability to recognise the message when it is shown to him/her.