Thread: One-time pad breakable debate


  1. #1
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446
    Quote Originally Posted by laserlight
    The prediction has to do with the proportion of atoms in the sample that will decay over time, but RNGs that use it are usually based on a measurement concerned with when the next atom decays.

    If you read Asimov's Foundation series, you might notice that he used this kind of concept in psychohistory: Seldon's science allows him to predict events of humanity for centuries to come, but tells him nothing about what the person next to him will do next.
    Noted. Will study it later. Thanks
    Originally Posted by brewbuck:
    Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.

  2. #2
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    51.4% may be more than 50%, but it sounds like another version of the same bad joke: that a random sample must yield 50/50 to be random.

    You miss the forest for the trees. You don't need to do any sort of analysis at all to get a 50% success rate if the data tends toward 50%: just guess the same thing every time. The next bit is either 0 or 1. The criterion is completely meaningless.

    You already know the baseline here is 47%, which is why you can come up with a way "to guess" that will be correct slightly more than 50% of the time -- just guess 1. If (because of some extremely silly misapplication of principles) you are determined to get 50%, then, like I said, adjust the algorithm to reflect the English-language distribution. Then all your samples will be 50/50, and you will be unable to guess better than exactly 50%.

    That will get your analysis more in line with what you want, but it does not take away from the fact that this is just a grotesque distortion of rational principles. If the reason you can get 51.4% is that the ultimate baseline is 47% (which it is), then you will never get a better rate than 53%, no matter how big the sample. And 53% will never enable you to break the cypher.
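    To make that concrete, here's a quick C sketch (my own illustration, with the 53/47 split from this thread plugged in as an assumption): against a source that emits 1 with probability 0.53, the zero-analysis strategy of always guessing 1 is already right about 53% of the time.
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Emit 1 with probability p, 0 otherwise. */
    static int biased_bit(double p)
    {
        return ((double)rand() / RAND_MAX) < p;
    }

    int main(void)
    {
        const double p_one = 0.53;   /* assumed baseline from this thread */
        const long   n     = 1000000;
        long hits = 0;

        srand((unsigned)time(NULL));

        for (long i = 0; i < n; i++) {
            int bit   = biased_bit(p_one);
            int guess = 1;   /* no analysis at all: always guess the majority bit */
            if (guess == bit)
                hits++;
        }

        printf("guessed correctly %.2f%% of the time\n", 100.0 * hits / n);
        return 0;
    }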
    Last edited by MK27; 03-17-2010 at 10:24 AM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  3. #3
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446
    Let me tell you what I believe the problem is here:

    There's this vague notion written by someone that "randomly typing at a keyboard does not constitute a random event". You guys took this at face value and are stating it as law, trying to prove something that the author himself couldn't.

    I suggest you stop.
    Originally Posted by brewbuck:
    Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.

  4. #4
    Anti-Poster
    Join Date
    Feb 2002
    Posts
    1,401
    Quote Originally Posted by Mario F.
    Let me tell you what I believe the problem is here:

    There's this vague notion written by someone that "randomly typing at a keyboard does not constitute a random event". You guys took this at face value and are stating it as law, trying to prove something that the author himself couldn't.

    I suggest you stop.
    Or just quote a source. This paper was extremely interesting. Even after de-skewing, the typing test failed.
    If I did your homework for you, then you might pass your class without learning how to write a program like this. Then you might graduate and get your degree without learning how to write a program like this. You might become a professional programmer without knowing how to write a program like this. Someday you might work on a project with me without knowing how to write a program like this. Then I would have to do you serious bodily harm. - Jack Klein

  5. #5
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by pianorain
    Or just quote a source. This paper was extremely interesting. Even after de-skewing, the typing test failed.
    All that indicates is that the typing test was poorly designed (the "de-skewing" criterion is even worse). You could easily keep adjusting the mechanism until it passed the test -- which implies to me that the person who wrote this paper is chasing their own tail.

    It would be simple to modify the input algorithm to pass the de-skewing test as described (if two bits are the same, throw them away; if not, keep the first one). You don't even have to redo the test: just take the data and reassign the characters to yield a completely uniform distribution.
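    For reference, that throw-away-equal-pairs rule is the classic von Neumann de-skewing trick. Here's a minimal sketch; the biased next_raw_bit() source is my own stand-in, not anything from the paper:
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* A deliberately biased raw source: 1 with probability ~0.53. */
    static int next_raw_bit(void)
    {
        return ((double)rand() / RAND_MAX) < 0.53;
    }

    /* von Neumann de-skewing: read raw bits in pairs; if the pair differs,
     * output its first bit; if the bits match, discard both. P(01) and
     * P(10) are both p*(1-p), so the output is unbiased for any fixed
     * bias p -- but only if the raw bits really are independent. */
    static int deskewed_bit(void)
    {
        for (;;) {
            int a = next_raw_bit();
            int b = next_raw_bit();
            if (a != b)
                return a;   /* 01 -> 0, 10 -> 1, equally likely */
        }
    }

    int main(void)
    {
        const long n = 1000000;
        long ones = 0;

        srand((unsigned)time(NULL));
        for (long i = 0; i < n; i++)
            ones += deskewed_bit();

        printf("P(1) after de-skewing ~ %.3f\n", (double)ones / n);
        return 0;
    }
    Note that this removes bias but says nothing about correlation between successive key presses, which is exactly the property in dispute here.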

    Pure circular reasoning. It is all based on the idea that "a sequence of random bits is a sequence where all the bits are independent from one another and uniformly distributed". That is a bad definition.

    That the bits are independent of one another is sufficient. If you want to test the algorithm, and not just a sample of data, you have to demonstrate that you can guess the next bit at a rate better than the bit's overall distribution would allow. The exact distribution is irrelevant.

    You still need to think for yourself.
    Last edited by MK27; 03-17-2010 at 10:42 AM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  6. #6
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    Quote Originally Posted by MK27
    Pure circular reasoning. It is all based on the idea that "a sequence of random bits is a sequence where all the bits are independent from one another and uniformly distributed". That is a bad definition.

    That the bits are independent of one another is sufficient. If you want to test the algorithm, and not just a sample of data, you have to demonstrate that you can guess the next bit at a rate better than the bit's overall distribution would allow. The exact distribution is irrelevant.

    You still need to think for yourself.
    Can I ask another question then? Doesn't uniform distribution (and fair distribution, for that matter) mean equal probability? Natural methods of randomness have this property. Correct me if I am wrong, but you are saying that this simple method of yours is suitable for a one-time pad. That can't be true if it has a bias toward specific bits, no matter how small.

    I'm not saying it's completely nonrandom (and neither is pianorain), but on the assumption that I can believe the figures you and pianorain have, it's not going to be good enough for a one-time pad, even if it manages to fool everyone here. I mean, the key has to be randomly picked for the OTP to work; that's precisely why the OTP does work. Otherwise, the message will be vulnerable to enemy cryptanalysts.

  7. #7
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by whiteflags
    Can I ask another question then? Doesn't uniform distribution (and fair distribution, for that matter) mean equal probability? Natural methods of randomness have this property. Correct me if I am wrong, but you are saying that this simple method of yours is suitable for a one-time pad. That can't be true if it has a bias toward specific bits, no matter how small.
    Look: this is stupid.

    If the distribution of characters in the English language means that using the mod 2 algorithm will, on average, yield 47% zeros and 53% ones, then by changing the algorithm you could achieve exactly 50%. Just pick the characters according to the average distribution.

    Presto! Now you have an algorithm that, when applied to English-language texts, should ideally (on average, with a greater degree of accuracy in a larger sample, just like any other kind of random data) yield exactly 50% ones and exactly 50% zeros!

    Hooray, now we have perfectly uniform distribution and perfectly equal probability! From the exact same source!

    Now, explain how this makes the data "more" or "less" random. Otherwise this criterion is a pure red herring (and I don't care whether some grad student somewhere has fallen for it or not).

    I believe the concern with equal probability is for the purpose of evaluating "unknown" sources. If a source appears skewed, then that skew will just keep growing with the volume of data. In this case, don't bother -- we already have all the data, in the form of the statistical distribution of letters in the English language.

    In the case of the "random typing" to bitstream test: if you want to use the keyboard for random bits, adjust the method until you get, on average, equal probability for each bit. The criticism is that people will not "evenly distribute" their key presses, so there must be statistical trends. If you have fifty keys, take your data for those fifty keys, then assign bits to keys so that you end up with the distribution you want.
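    As a sketch of that reassignment (the key frequencies below are made-up placeholders, not measurements): sort the keys by observed frequency and greedily hand each one to whichever bit currently has the smaller probability mass, so the two totals come out close to 50/50.
    Code:
    #include <stdio.h>

    #define NKEYS 8   /* fifty keys in the post; eight here to keep the table short */

    int main(void)
    {
        /* hypothetical per-key frequencies, sorted descending, summing to 1.0 */
        double freq[NKEYS] = { 0.22, 0.18, 0.15, 0.13, 0.12, 0.09, 0.07, 0.04 };
        int    bit_of[NKEYS];
        double mass[2] = { 0.0, 0.0 };

        /* greedy partition: give each key to the bit whose accumulated
         * probability mass is currently smaller */
        for (int k = 0; k < NKEYS; k++) {
            int b = (mass[0] <= mass[1]) ? 0 : 1;
            bit_of[k] = b;
            mass[b] += freq[k];
        }

        for (int k = 0; k < NKEYS; k++)
            printf("key %d -> bit %d\n", k, bit_of[k]);
        printf("P(0) = %.3f, P(1) = %.3f\n", mass[0], mass[1]);
        return 0;
    }
    With these placeholder figures the split lands at 0.51/0.49; with fifty keys the granularity is finer and the totals get correspondingly closer to 50/50.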

    The whole premise of the evaluation is bogus. I am sure it will be impossible to eliminate recurrent patterns of characters. But if you cannot get an even distribution of bits, that's simply because you did not try to do so.

    Which is like turning the Geiger counter up too high and saying: well, the distribution is not even, so I guess radioactive decay is no good.

    I'm coming to agree with Mario F. here, viz., this is a purely academic debate in pursuit of a definition of "random" which has little or no real value.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  8. #8
    Anti-Poster
    Join Date
    Feb 2002
    Posts
    1,401
    Quote Originally Posted by whiteflags
    Can I ask another question then? Doesn't uniform distribution (and fair distribution, for that matter) mean equal probability? Natural methods of randomness have this property. Correct me if I am wrong, but you are saying that this simple method of yours is suitable for a one-time pad. That can't be true if it has a bias toward specific bits, no matter how small.
    Yes, this seems correct. For example, suppose I have a hat with four slips of paper: two with '1' and two with '0'. I could draw a slip out of the hat, write down the binary digit, and return the slip. This method of generating binary digits has both properties of bit independence and uniform distribution. It makes sense.

    However, if the hat contained one '1' and three '0', despite bit independence in how the bits are selected, the distribution is wildly skewed. I don't think anyone here would argue that this method of generating random binary digits is good. This illustrates the need for a generator to generate bits with as close to equal probability as possible.
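    A quick simulation of the two hats (my own sketch) makes the skew visible: the draws are independent in both cases, but only the two-and-two hat gives equal probability.
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Draw one slip from a hat holding `ones` slips marked '1' out of
     * `total` slips; the slip goes back in the hat after each draw. */
    static int draw(int ones, int total)
    {
        return (rand() % total) < ones;
    }

    int main(void)
    {
        const long n = 1000000;
        long fair = 0, skewed = 0;

        srand((unsigned)time(NULL));

        for (long i = 0; i < n; i++) {
            fair   += draw(2, 4);   /* two '1' slips, two '0' slips  */
            skewed += draw(1, 4);   /* one '1' slip, three '0' slips */
        }

        printf("fair hat:   P(1) ~ %.3f\n", (double)fair / n);
        printf("skewed hat: P(1) ~ %.3f\n", (double)skewed / n);
        return 0;
    }
    The first line comes out near 0.500 and the second near 0.250: independence alone doesn't buy you uniformity.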
    Last edited by pianorain; 03-17-2010 at 12:50 PM.
    If I did your homework for you, then you might pass your class without learning how to write a program like this. Then you might graduate and get your degree without learning how to write a program like this. You might become a professional programmer without knowing how to write a program like this. Someday you might work on a project with me without knowing how to write a program like this. Then I would have to do you serious bodily harm. - Jack Klein

  9. #9
    Registered User rogster001's Avatar
    Join Date
    Aug 2006
    Location
    Liverpool UK
    Posts
    1,472
    There's this vague notion written by someone that "randomly typing at a keyboard does not constitute a random event". You guys took this at face value and are stating it as law, trying to prove something that the author himself couldn't.
    For my own part, I am merely saying that historically, cryptographers have not used simple random typing as a secure source of randomness. Whether through empirical evidence or theoretical study, it has been found not to be a usable method. The fact that its use has been shunned in real-world applications, by people who genuinely need to guarantee security as far as they can, is evidence enough for me.

  10. #10
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446
    Quote Originally Posted by rogster001
    For my own part, I am merely saying that historically, cryptographers have not used simple random typing as a secure source of randomness. Whether through empirical evidence or theoretical study, it has been found not to be a usable method. The fact that its use has been shunned in real-world applications, by people who genuinely need to guarantee security as far as they can, is evidence enough for me.
    It's a good argument. But then,

    It probably hasn't been used because it really isn't practical to type hundreds of random characters on a keyboard, and it does tend to look silly and rather unprofessional for a cryptographer to do it.

    But more seriously (despite the above paragraph being more serious than I'm making it look), it says nothing about the randomness approach to a one-time pad. Why? Because your only hope of finding the plaintext is to brute-force the cypher and hope for the best, not to study the properties of the pad.

    Quote Originally Posted by pianorain
    Or just quote a source. This paper was extremely interesting. Even after de-skewing, the typing test failed.
    It probably did. Just as this sequence I just generated from a Mersenne Twister might fail: 195433953634
    Last edited by Mario F.; 03-17-2010 at 10:31 AM.
    Originally Posted by brewbuck:
    Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.

  11. #11
    Anti-Poster
    Join Date
    Feb 2002
    Posts
    1,401
    Quote Originally Posted by Mario F.
    It probably did. Just as this sequence I just generated from a Mersenne Twister might fail: 195433953634
    This is a non-response. Either explain the faults of the experiment's procedure or accept its conclusions. Keep in mind: you cannot prove or disprove the randomness of any generator. You can only run tests that suggest that it was random or non-random in the past. In this case, random human typing was found to be non-random. I have seen no case in which a serious effort has shown that random human typing is, in fact, random.
    If I did your homework for you, then you might pass your class without learning how to write a program like this. Then you might graduate and get your degree without learning how to write a program like this. You might become a professional programmer without knowing how to write a program like this. Someday you might work on a project with me without knowing how to write a program like this. Then I would have to do you serious bodily harm. - Jack Klein

  12. #12
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Mario F.
    There's this vague notion written by someone that "randomly typing at a keyboard does not constitute a random event". You guys took this at face value and are stating it as law, trying to prove something that the author himself couldn't.
    That is a gross misrepresentation of what I am trying to say. If you actually read the material that I linked to, you would notice that it affirms my claim that even well-recognised sources of randomness must be tested to check that they are suitable for use as an RNG in cryptography. If they are not suitable as implemented, then it may be possible to account for the bias, but failure to do so means that the resulting RNG's output cannot be used for a one-time pad.

    But there's more: I am also concerned with the practical application of this RNG, given that it depends on source text that might have a tendency to be repeated. It would be more difficult to test for this, though, since it depends on the implementation and on patterns of human behaviour, should humans be involved in source text selection. But if humans are not involved, then how can this be correctly automated?
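    For what it's worth, the sort of check being asked for can start very simply. Here is a sketch of the monobit frequency test in the style of FIPS 140-2 (the 20,000-bit sample size and the 9725..10275 acceptance window are the FIPS figures; the biased stand-in source is obviously a placeholder for the RNG under test):
    Code:
    #include <stdio.h>

    #define NBITS 20000   /* sample size used by the FIPS 140-2 monobit test */

    /* Count the ones in the sample and check the count against the
     * FIPS 140-2 acceptance window: pass if 9725 < ones < 10275. */
    static int monobit_pass(const unsigned char *bits, int n)
    {
        int ones = 0;
        for (int i = 0; i < n; i++)
            ones += bits[i] & 1;
        printf("ones: %d / %d\n", ones, n);
        return ones > 9725 && ones < 10275;
    }

    int main(void)
    {
        static unsigned char sample[NBITS];

        /* deliberately biased stand-in source: roughly 53% ones, so it fails */
        for (int i = 0; i < NBITS; i++)
            sample[i] = (i % 100) < 53;

        printf("monobit test: %s\n", monobit_pass(sample, NBITS) ? "pass" : "FAIL");
        return 0;
    }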
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  13. #13
    Registered User rogster001's Avatar
    Join Date
    Aug 2006
    Location
    Liverpool UK
    Posts
    1,472
    It probably hasn't been used because it really isn't practical to type hundreds of random characters on a keyboard, and it does tend to look silly and rather unprofessional for a cryptographer to do it.
    that of course is very true, a key generation method has to be ideally easily replicated, but then your mad professor might be the jackson pollock of the crytography world and get a kick out of such key generation

  14. #14
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446
    Actually, it wasn't directed at you, laserlight. Your argument stands firm, I believe. You question, but do not dismiss entirely. You suggest traditional methods to test this, and I accept that. But where are these tests, so we can put this debate to rest?

    Instead, the statement was directed at attempts to dismiss my argument entirely. I find them slightly annoying because there's still no evidence either way. And given that we have historically grounded our RNGs on the ability to produce unpredictable, unduplicable sequences, so far there's only a proposal.
    Originally Posted by brewbuck:
    Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.

  15. #15
    Anti-Poster
    Join Date
    Feb 2002
    Posts
    1,401
    Quote Originally Posted by Mario F.
    Instead, the statement was directed at attempts to dismiss my argument entirely. I find them slightly annoying because there's still no evidence either way. And given that we have historically grounded our RNGs on the ability to produce unpredictable, unduplicable sequences, so far there's only a proposal.
    It's okay, Mario. You can say, "pianorain is annoying me." I'm a big boy; I can take it.

    I apologize if I sound like I'm trying to dismiss your argument. I simply don't know of and have never heard of any source citing what you're saying. Meanwhile, I know I've heard of sources saying exactly the opposite: that random human typing is non-random. It's beyond me to try to test definitively either way. I'm just trying to present my thoughts. Once again, apologies for any disrespect to your argument.
    If I did your homework for you, then you might pass your class without learning how to write a program like this. Then you might graduate and get your degree without learning how to write a program like this. You might become a professional programmer without knowing how to write a program like this. Someday you might work on a project with me without knowing how to write a program like this. Then I would have to do you serious bodily harm. - Jack Klein

