Thread: one time pad breakable debate

  1. #151
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Mario F.
    There's this vague notion written by someone that "randomly typing at a keyboard does not constitute a random event". You guys picked this at face value and are stating it as law. Trying to prove something that the author himself couldn't.
    That is a gross misrepresentation of what I am trying to say. If you actually read the material that I linked to, you would notice that it affirms my claim that even well recognised sources of randomness must be tested to check that they are suitable for use as a RNG in cryptography. If they are not suitable as implemented, then it may be possible to account for bias, but failure to do so means that the resulting RNG's output cannot be used for a one time pad.

    But there's more: I am also concerned with a practical application of this RNG, given that it depends on source text that might have a tendency to be repeated. It would be more difficult to test for this though, since this depends on the implementation and patterns of human behaviour should humans be involved in source text selection. But if humans are not involved, then how can this be correctly automated?
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  2. #152
    Registered User rogster001's Avatar
    Join Date
    Aug 2006
    Location
    Liverpool UK
    Posts
    1,472
    It probably hasn't been used because it really isn't practical to type hundreds of random characters in a keyboard and it does tend to look silly and rather unprofessional for a cryptanalyst to do it
That of course is very true; ideally, a key generation method has to be easily replicated. But then again, your mad professor might be the Jackson Pollock of the cryptography world and get a kick out of such key generation.

  3. #153
    Anti-Poster
    Join Date
    Feb 2002
    Posts
    1,401
    Quote Originally Posted by Mario F. View Post
It probably did. As this sequence I just generated from a Mersenne Twister may fail: 195433953634
    This is a non-response. Either explain the faults of the experiment's procedure or accept its conclusions. Keep in mind: you cannot prove or disprove the randomness of any generator. You can only run tests that suggest that it was random or non-random in the past. In this case, random human typing was found to be non-random. I have seen no case in which a serious effort has been exerted that shows that random human typing is, in fact, random.
    If I did your homework for you, then you might pass your class without learning how to write a program like this. Then you might graduate and get your degree without learning how to write a program like this. You might become a professional programmer without knowing how to write a program like this. Someday you might work on a project with me without knowing how to write a program like this. Then I would have to do you serious bodily harm. - Jack Klein

  4. #154
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by pianorain View Post
    Or just quote a source. This paper was extremely interesting. Even after de-skewing, the typing test failed.
All that indicates is that the typing test was poorly designed (the "de-skewing" criterion is even worse). You could easily keep adjusting the mechanism until it passed the test -- which suggests to me that the person who wrote this paper is chasing their own tail.

    It would be simple to modify the input algorithm to pass the de-skewing test as described (if two bits are the same, throw them away, if not keep the first one). You don't even have to redo the test: just take the data and reassign the characters to yield a completely uniform distribution.

    Pure circular reasoning. It is all based on the idea that "a sequence of random bits is a sequence where all the bits are independent from one another and uniformly distributed". That is a bad definition.

    That the bits are independent of one another is sufficient. If you want to test the algorithm, and not just a sample of data, you have to demonstrate that you can guess the next bit at a rate better than the distribution of the bit. The exact distribution is irrelevant.

    You still need to think for yourself.
    Last edited by MK27; 03-17-2010 at 10:42 AM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  5. #155
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446
Actually it wasn't directed at you, laserlight. Your argument stands firm, I believe. You question, but do not dismiss entirely. You suggest traditional methods to test this, and I accept that. But where are these tests, so we can put this debate to rest?

Instead the statement was directed at attempts to dismiss my argument entirely. I find them slightly annoying because there's still no evidence either way. And given that we have historically justified our RNGs by their ability to produce unpredictable and non-duplicable sequences, so far there's only a proposal.
    Originally Posted by brewbuck:
    Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.

  6. #156
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by pianorain
    This paper was extremely interesting.
    Indeed, particularly because the researchers "assign either 1 or 0 to each key on the keyboard". This is a hash function that is pretty much equivalent to what MK27 proposed.

    By the way:
    Ideally, we then open a text editor and get a monkey to hit the keyboard. In practice, we were forced to use computer science students instead.
Clearly, computer science students are good substitutes for monkeys.

    Quote Originally Posted by Mario F.
But more seriously (despite the above paragraph being more serious than what I'm making it look), it says nothing of the randomness approach to a one-time pad. Why? Because your only hope to find the plaintext is to attack the cipher by brute force and hope for the best, not to study the properties of the pad.
    The reason why even brute force fails against a one time pad is that the information contained in the key is no less than the information contained in the plaintext. But if the key comes from a source that is not sufficiently random, possibly due to bias, then the key would contain less information than you think it does, and thus the stream cipher would not be equivalent to a one time pad. So, it makes sense to study the properties of the pad if the randomness of the generator that produced the pad is suspect.

    Quote Originally Posted by Mario F.
It probably did. As this sequence I just generated from a Mersenne Twister may fail: 195433953634
    Yes, that is an important consideration mentioned in the paper: "both the number of sequences and the number of bits per sequence are too low to be statistically significant".

    Quote Originally Posted by Mario F.
But where are these tests so we can put this debate to rest?
    The Diehard tests are quite famous, and then I believe NIST also provides a test suite. The problem that I am facing right now is how to select source text. You need a great deal of it to be statistically significant.
    Last edited by laserlight; 03-17-2010 at 10:46 AM.

  7. #157
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by laserlight View Post
    Indeed, particularly because the researchers "assign either 1 or 0 to each key on the keyboard". This is a hash function that is pretty much equivalent to what MK27 proposed.
Yeah, and I have read the paper and hopefully made clear that, in my opinion, it is not a good paper -- simply quoting someone else's shoddy work does not legitimize either the work or the criticism it is meant to support.

The entire assessment was based not on predictability, but on whether a 50/50 distribution is produced. That's no good.

    It also misses the point that over time, the keyboard test MUST tend toward 50%. If it does not, you have designed the test badly. It cannot be any other way. One test with a few students does not change that.
    Last edited by MK27; 03-17-2010 at 10:51 AM.

  8. #158
    Registered User kryptkat's Avatar
    Join Date
    Dec 2002
    Posts
    638
Let's say you have the porn.bmp file encrypted using 4 different one time pads. The first pad uses computer-generated random numbers. The second pad uses a true random number generator. The third pad uses the numbers you put into a file key.txt using 30-sided dice, so that the numbers would only be from 1 to 30 per character. The fourth pad would use a sequential number generator, 1 2 3 4 5 up to 255, a staircase pattern: the control file or pad. Do you really think it matters what it was encrypted with if a brute force on all the pads would open all the files? Trying every possible key.

Would you not recognize your own data, the porn.bmp file, in the above example? If it was your data or file, you would know when you have the correct key, after the fact of opening and looking through a LOT of data to get possible contents of the encrypted file. If you were not the one that encrypted the file in the one time pad, that could be verified later through torture or trickery or other means. The difference being that if it is yours, you would know right away which one is the correct data, or else external verification after the fact <or not> would establish which is the correct data.

That would make a good Twilight Zone or Outer Limits show, with the porn generator accidentally discovered by trying to decrypt a one time pad by brute force. But you know what would really happen in real life? Some corrupt district attorney would charge you with 9,999,999 counts of illegal porn instead of just the one count for porn.bmp.

Do you think one time pads should have a master key or skeleton key or back door in case you lose your key file or a hard drive crashes, say with the key on one drive and the file on another? Some encryption algorithms do contain a master key.

Those of you that put up short encryptions: would you not recognize your own data if the encryption was brute forced? Even if you lost your key?

No, frequency analysis is useless against a one time pad.
Maybe, but it is a fact that certain characters are used more than others, therefore an inverse frequency analysis could be used.

Example 2: a random repeated-key encryption and a Vernam one time pad. If you do not have the key to either file at the beginning of the cryptanalysis, it does not matter whether it is a one time pad or not. Only after you have the key does that matter, as to whether you can use the key again or not.

Back to brute forcing your own file: if you recognize your own data, then can you agree that a one time pad was cracked? Even if you have to add the stipulation that only someone in the "know" can say "yes, that is the data", or that it is verified externally after the fact by torture or trickery or other means? Or that the file is long enough to see what comes into focus <analogy>, or that the rest of the data makes sense?

Please do not say I am wrong. Please respect me for having different opinions or beliefs.

theoretic perfect secrecy
Only in theory.

Schrodingers Kat
Schrodinger's cat is dead; you "geniuses" forgot to feed her. A biological entity can only live for three days without water. A biological entity can only live for three weeks without food. Oh gawd, did you even provide oxygen? A biological entity can only live for three minutes without oxygen. If so, then the "geniuses" are dead too, because of the radiation leak. "Geniuses"....

It is not like starving a computer of electricity to crack RSA.

laserlight, have you ever really used a real nitrocellulose one time pad?

  9. #159
    Anti-Poster
    Join Date
    Feb 2002
    Posts
    1,401
    Quote Originally Posted by Mario F. View Post
    Instead the statement was directed at attempts to entirely dismiss my argument. I find them slightly annoying because there's still no evidence either way. And being the case we have historically supported our RNGs on our ability to produce unpredictable and not duplicateable (sp?) sequences, so far there's only a proposal.
    It's okay, Mario. You can say, "pianorain is annoying me." I'm a big boy; I can take it.

    I apologize if I sound like I'm trying to dismiss your argument. I simply don't know of and have never heard of any source citing what you're saying. Meanwhile, I know I've heard of sources saying exactly the opposite: that random human typing is non-random. It's beyond me to try to test definitively either way. I'm just trying to present my thoughts. Once again, apologies for any disrespect to your argument.

  10. #160
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    Quote Originally Posted by MK27 View Post
    Pure circular reasoning. It is all based on the idea that "a sequence of random bits is a sequence where all the bits are independent from one another and uniformly distributed". That is a bad definition.

    That the bits are independent of one another is sufficient. If you want to test the algorithm, and not just a sample of data, you have to demonstrate that you can guess the next bit at a rate better than the distribution of the bit. The exact distribution is irrelevant.

    You still need to think for yourself.
Can I ask another question then? Doesn't uniform distribution (and fair distribution, for that matter) mean equal probability? Natural methods of randomness have this property. Correct me if I am wrong, but you are saying that this simple method of yours is suitable for a one time pad. That can't be true if it has a bias toward specific bits, no matter how small.

I'm not saying it's completely nonrandom (and neither is pianorain), but on the assumption that I can believe the figures you and pianorain have, it's not going to be good enough for a one time pad, even if it manages to fool everyone here. I mean, the key has to be randomly picked for the OTP to work; that's precisely why the OTP does work. Otherwise, the plaintext will be vulnerable to enemy cryptanalysts.

  11. #161
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by whiteflags View Post
Can I ask another question then? Doesn't uniform distribution (and fair distribution, for that matter) mean equal probability? Natural methods of randomness have this property. Correct me if I am wrong but you are saying that this simple method of yours is suitable for a one time pad. That can't be true if it has a bias toward specific bits, no matter how small.
    Look: this is stupid.

If the distribution of characters in the English language means that using the mod 2 algorithm will on average yield 47% zeros and 53% ones, then by changing the algorithm you could achieve exactly 50%. Just pick the character-to-bit assignment according to the average distribution.

Presto! Now you have an algorithm that, when applied to English-language texts, should ideally (on average, with a greater degree of accuracy in a larger sample, just like any other kind of random data) yield exactly 50% ones and exactly 50% zeros!

    Hooray, now we have perfectly uniform distribution and perfectly equal probability! From the exact same source!

Now, explain how this makes the data "more" or "less" random. Otherwise this criterion is a pure red herring (and I don't care if some grad student somewhere has fallen for it or not).

I believe the concern with equal probability is for the purpose of evaluating "unknown" sources. If it appears skewed, then that skew will just keep increasing with the volume of data. In this case, don't bother -- we already have all the data in the form of the statistical distribution of letters in the English language.

In the case of the "random typing" to bitstream test, if you want to use the keyboard for random bits, adjust the method until you get, on average, equal probability for each bit. The criticism is that people will not "evenly distribute" their key presses, so there must be statistical trends. If you have fifty keys, take your data for those fifty keys, then assign them so that you end up with the distribution you want.

    The whole premise of the evaluation is bogus. I am sure it will be impossible to eliminate recurrent patterns of characters. But if you cannot get an even distribution of bits, that's simply because you did not try to do so.

    Which is like turning the Geiger counter up too high and saying, well, the distribution is not even, I guess radioactive decay is no good.

I'm coming to agree with Mario F. here, viz., this is a purely academic debate in pursuit of a definition of "random" which has little or no real value.

  12. #162
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by MK27
    That the bits are independent of one another is sufficient. If you want to test the algorithm, and not just a sample of data, you have to demonstrate that you can guess the next bit at a rate better than the distribution of the bit. The exact distribution is irrelevant.
    Sufficient for what? If the random distribution is not uniform, then using the bits for a one time pad results in less information being contained in the key than expected. Basically, a non-uniform distribution has bias when you want to use it for keys that are equally likely.

    Quote Originally Posted by MK27
    Yeah, and I have read the paper and hopefully made clear that IMO it is not a good paper
    I cannot say if the paper is a good paper after just a cursory glance, but after reading about the amount of data that was considered, it was obvious that the conclusions of that experiment are more or less useless except as a reference for future work.

    Quote Originally Posted by kryptkat
    if you recognize your own data then you can agree that a one time pad was cracked ? even if you have to add the stipulations that only someone in the "know" can say "yes" that is the data or verified through external verification after the fact by torture or trickery or other means ?
Torture and trickery (and even bribery) are valid methods of cryptanalysis (although torture is immoral and may be illegal). But these are not considered when we talk about the strength of an encryption algorithm, because they go against Kerckhoffs' assumption that only the key is secret.

    Quote Originally Posted by kryptkat
    or if the file is long enough to see what comes in to focus <analogy> or the rest of the data makes sense.
    You might want to search the Web for "unicity distance". A one time pad has infinite unicity distance.

    Quote Originally Posted by kryptkat
    please do not say i am wrong. please respect me for having a different opinions or beliefs.
    As I said, within the bounds of your own definitions, you are not wrong. But when we use the definitions that are accepted by cryptographers, you are wrong.

    Quote Originally Posted by kryptkat
    only in theory.
    Yes, in practice, one time pads are rarely used because it is so difficult to get them right. Thus...

    Quote Originally Posted by kryptkat
    laserlight have you ever really used a real nitrocellulose one time pad ?
    ... no. I have had no reason to use one time pads anyway.

  13. #163
    Anti-Poster
    Join Date
    Feb 2002
    Posts
    1,401
    Quote Originally Posted by whiteflags View Post
Can I ask another question then? Doesn't uniform distribution (and fair distribution, for that matter) mean equal probability? Natural methods of randomness have this property. Correct me if I am wrong but you are saying that this simple method of yours is suitable for a one time pad. That can't be true if it has a bias toward specific bits, no matter how small.
    Yes, this seems correct. For example, suppose I have a hat with four slips of paper: two with '1' and two with '0'. I could draw a slip out of the hat, write down the binary digit, and return the slip. This method of generating binary digits has both properties of bit independence and uniform distribution. It makes sense.

    However, if the hat contained one '1' and three '0', despite bit independence in how the bits are selected, the distribution is wildly skewed. I don't think anyone here would argue that this method of generating random binary digits is good. This illustrates the need for a generator to generate bits with as close to equal probability as possible.
    Last edited by pianorain; 03-17-2010 at 12:50 PM.

  14. #164
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446
    Quote Originally Posted by laserlight View Post
    The Diehard tests are quite famous, and then I believe NIST also provides a test suite. The problem that I am facing right now is how to select source text. You need a great deal of it to be statistically significant.
I believe they should be a function of the number of discrete values, no? And quite a large amount of text at that, I suppose.

I don't know what to say at this point. Certainly I would not wish for you to start typing millions of random characters just to produce something relevant for these tests to chew on. I certainly can't be bothered to do it myself just to try to prove my point.

    Quote Originally Posted by pianorain
    I simply don't know of and have never heard of any source citing what you're saying. Meanwhile, I know I've heard of sources saying exactly the opposite: that random human typing is non-random. It's beyond me to try to test definitively either way. I'm just trying to present my thoughts. Once again, apologies for any disrespect to your argument.
No, I felt no disrespect. Just annoyance.

It's that the author himself cannot prove this. And yet it is being debated as veritable truth. It's a logical observation, no doubt based on some form of testing. But where? How? What? With whom? These are left unanswered by the author and any other references I've looked up on the internet (and oh boy, did I try to search for them!). I cannot simply accept that when every cell in my body tells me otherwise. When I'm completely incapable of duplicating or predicting a series of characters I myself typed randomly on the keyboard. When, in fact, I even maintain that it is logical to assume we will tend toward an acceptable distribution if we are instructed to type randomly at a keyboard.

Do you agree tiredness may affect what characters you type? Do you agree that body temperature, weather conditions, your vigilance levels, your concentration, your emotional state, your health, can all affect which keys your fingers move to? That over a long period of time these factors interact in a manner resembling the factors behind tossing a bag of dice into the air, rendering any hope of predicting or duplicating the pad virtually useless?

There may be some truth to what you support over a relatively small sequence. But this cannot hold true for any real-life application of a one-time pad, where the plaintext could be the size of a small book, or even a page. And especially if the plaintext contains information that can lead to easy confusion, like dates, values or formulas.
    Last edited by Mario F.; 03-17-2010 at 11:32 AM.

  15. #165
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by laserlight View Post
    Sufficient for what? If the random distribution is not uniform, then using the bits for a one time pad results in less information being contained in the key than expected. Basically, a non-uniform distribution has bias when you want to use it for keys that are equally likely.
Sure -- so in reality this would mean what, that you start guessing all the keys starting with the 51%-more-likely bit? Horseshoes and hand grenades! What is that going to help?

Although I did describe how you could achieve a 50/50 distribution by tweaking the algorithm, I still think you would have to be insane to consider that a meaningful improvement.
