1. Originally Posted by MK27
If the distribution of characters in the English language means that using the mod 2 algorithm will on average yield 47% zero and 53% one, that means that by changing the algorithm, you could achieve exactly 50%. Just pick the characters according to the average distribution.
Not necessarily, e.g., it may be the case that there is no configuration that achieves exactly 50% on average. But yes, this is why I decided to describe your proposed RNG in terms of an unspecified hash function instead of ASCII value modulo 2: I felt that in a practical implementation it may be necessary to use a different hash function.

I do not think that this is necessarily enough though. My impression is that statistical randomness involves more than just a uniform distribution (after all, a uniform distribution can be converted to other random distributions, and vice versa).

Originally Posted by MK27
Sure -- so in reality this would mean what, you start guessing all the keys starting with the 51% more likely bit? Horseshoes and handgrenades! What is that going to help?
I am not sure: my cryptanalysis knowledge is really bad. But remember, we're talking about the information theoretic perfect secrecy of a one time pad. This is uh, different from the reality of cryptography and cryptanalysis in practice.

2. Originally Posted by laserlight
But remember, we're talking about the information theoretic perfect secrecy of a one time pad.
That is horseshoes and hand-grenades. Even if you get 60% of the bits correct, the secrecy of the one time pad is still perfect -- it has not been weakened at all.

To weaken the value of the pad generator, you would have to demonstrate that it does not have a 100% unbreakable success rate. So whereas a valid pad generator will generate keys that are never broken, a "weaker" generator might produce keys that could be broken 1 in a million times.

By saying, well, I know the generator is skewed to produce 52% 1, so I can get better than 50% of the bits correct -- that is meaningless. Getting 50 or 60 or even 75% of the bits correct will not translate to an improvement in breaking the pad. You will still never break a key, even once.

I think it is good to have an interest in research and (apparently) authoritative criteria and sources, but as I said a while ago, many of you are engaged in naive reification.

Reification is the fallacy of regarding a concept as a real thing. This is a fallacy in so far you are now out to find the real thing (which does not exist) instead of refining your conceptualization. Someone says "the real thing of randomness means even distribution" -- rather than trying to comprehend the significance or value of even distribution with regard to randomness, you now ascribe "even distribution" to be a "real property" of "true randomness". Eventually you end up with hypotheses that make no sense -- eg, this key generator does not achieve perfectly even distribution, therefore it is not truly random, therefore someone could break a key it generates.

That last step is pure faith, and what's more, it's wrong -- no one will ever break a key this way. You have replaced your brain with a set of false idols.

I was interested to learn that "In statistics, reification is the use of an idealized model of a statistical process. The model is then used to make inferences connecting model results, which imperfectly represent the actual process, with experimental observations."

3. Originally Posted by MK27
Now, explain how this makes the data "more" or "less" random.
Remember those contests M&M had, where you had to find the bag with the red M&Ms? The winner of the contest was a random one, because every bag could have been the winner. There is no other observable factor in the winning choice. Now, there was probably more than one person who won the contest, but whether it was five or 100 people, you could not reduce the favorable outcome.

You may have a bag of M&Ms that appears to win the contest for a long time. (M&Ms, being delicious, you decide you don't care about the contest.) But then you come across a blue one, or whatever. If you tried for the longest permutation of red M&Ms in a bag, your results would be skewed by the number of M&Ms in the bag, for all bags.

But the selection of colors that goes into M&Ms bags could be truly random, because there are only 5 (IIRC) and the probability for each is equal. How Mars chooses its colors for each bag is something I do not know. But if the distribution were equal, then you eating candy and guessing the color would be truly random because an endless string of red candies is as possible as any other permutation.

It may be that an imperfect solution is good enough most of the time, but it comes down to there being no other observable factor in a try in a permutation. We may not live in a truly random world, but equal probability is not hard to strive for in nature. You could simply have five of each color in your hand.

And they wont melt. HTH.

4. Originally Posted by whiteflags
But if the distribution were equal, then you eating candy and guessing the color would be truly random because an endless string of red candies is as possible as any other permutation.
This is a terrible example. Even if the distribution were 1:100:1000:10000:100000 it would be make exactly no difference to determining which M&M ended up in which bag.

Maybe if you could apply this to the case at hand (the one time pad generator), you might make some sense.

5. Originally Posted by Mario F.
Do you agree tiredness may affect what what characters you type? Do you agree body temperature, weather conditions, your vigil levels, your concentration, your emotional levels, your health, all can affect what keys your fingers will move on? That over a long period of time these factors interact in such a manner resembling the factors behind tossing a bag of die onto the air, rendering any hope of predicting or duplicating the pad virtually useless?
I definitely agree that all of those factors could play a part in what characters are typed in flail-text. Indeed, I'm sure that as my fingers got tired generating 4 680-character flail-text strings, I hit keys differently than I would if I'd given myself a break. However, I can't agree that those factors make the pad generation "random".

The "tossing a bag of dice" is not a good example. I think tossing a bag of dice is a fine example of randomness. An example I think more closely illustrates what I'm trying to convey would be tossing a bag of dice, where each die has been slightly weighted to land on 6 more often than the others. Tossing the bag is fine; the problem is with the dice. Similarly, flail-text is fine; the problem is with the input device. When flailing, I know that my fingers hit some keys significantly more often than others. No matter how hard I seem to try, I end up with most of my keys being closer to the middle of the keyboard, only rarely hitting keys near the corners. As the author of the paper pointed out, flail-text may contain statistically significant two-letter sequences due to a finger pressing both keys at once; I know mine do. Of course, this is a personal observation; your flail-text may be more random than my own.

The reduction of flail-text to the binary representation may be able adjust for these deficiencies, as laserlight pointed out. Now that I understand her suggestion to use an unspecified hash function instead of just mod 2, it seems more reasonable.
Originally Posted by Mario F.
There may be some truth to what you support over a relatively small sequence. But this cannot hold true for on any real live application of a one-time pad where the plaintext could be the size of a small book, or even a page. And especially if the plaintext contains information that can lead to easy confusion, like dates, values or formulas.
I'm not suggesting any sort of practical attack. I'm just looking at things logically. If the pad used in a OTP is not generated randomly, then an attacker can use the information about how a pad is generated to develop an attack. (Remember, the user gets only one secret, and it's not how the pad is generated.) If flail-text is not sufficiently random, it should not be used to generate a pad. In this case, it seems that the standards used to test a CS-PRNG would be strict enough for a definition of 'sufficiently random'. Because a CS-PRNG is tested for "next-bit" vulnerabilities, flail-text should also be tested in this manner. The only part I'm a bit shaky on is whether or not the "next-bit" test implies that all bit strings of a single length show up with equal probability. Now, if we're defining 'sufficiently random' as 'it's good enough for me', then obviously we don't have to worry about such things.

6. If the random number generator cannot select an output from a source where all possible outputs could be outputs, how is it truly random?

It doesn't matter how random the output looks. It doesn't matter if you can nigger rig the output to lie with statistics, the crucial point is that if your going to make a truly random coin flipper, you can't make the algorithm prefer heads. You have to make it so that it could choose heads or tails equally. Otherwise you get patterns, which means that there is a seed somewhere that could reproduce the outcome in a reliable way.

Or rather I should use the word period: the term for the length of outputs a generator gives, before it repeats. That's deterministic and random is by definition not deterministic.

7. Originally Posted by MK27
Even if you get 60% of the bits correct, the secrecy of the one time pad is still perfect -- it has not been weakened at all.
Actually, no. If you get just one of the bits correct, the one time pad has been broken, for that bit. Each bit of the keystream is a separate key in its own right.

But I think that this is just due to inaccurate terminology: you probably mean to say that even if you can statistically predict that 60% of the bits will be 0 (or 1, as the case may be), you still have no information as to which bit is which.

Originally Posted by MK27
rather than trying to comprehend the significance or value of even distribution with regard to randomness, you now ascribe "even distribution" to be a "real property" of "true randomness".
"Randomness" has different meanings in different fields. There is no single universally accepted definition of randomness.

Originally Posted by MK27
That last step is pure faith, and what's more, it's wrong -- no one will ever break a key this way.
Yes, it is faith: I trust that the experts know what they are talking about because I do not have the requisite knowledge, and have not had the opportunity to examine what they would reply to such objections.

There is the saying that a little knowledge can be a dangerous thing. Your arguments go beyond my knowledge, but then I am not convinced that you actually have more knowledge than I do: you are reasoning with less knowledge than experts, so it may be the case that your reasoning is flawed because you have not considered something.

On the other hand, newcomers to a field do make breakthroughs by insights that have been ignored because of unwarranted unassumptions. However, such cases are rare.

Originally Posted by MK27
You have replaced your brain with a set of false idols.
I suggest that you avoid these kind of ad hominem comments. It may be the way you like to express yourself, but frankly, it is offensive.

8. Originally Posted by pianorain
I'm not suggesting any sort of practical attack. I'm just looking at things logically. If the pad used in a OTP is not generated randomly, then an attacker can use the information about how a pad is generated to develop an attack. (Remember, the user gets only one secret, and it's not how the pad is generated.) If flail-text is not sufficiently random, it should not be used to generate a pad.
Well, there's nothing to argue there.

But let me end this way: If one day it is indeed proven that by randomly typing at a keyboard we can not produce good enough random sequences, I'll be very dissapointed with mankind.

9. Originally Posted by laserlight
Actually, no. If you get just one of the bits correct, the one time pad has been broken, for that bit. Each bit of the keystream is a separate key in its own right.
So no "properly generated" one time pad can be broken less than 50%. By simply setting all bits to one, I now can rest assured that 50% of the characters in my deciphered message are correct.

Yes, it is faith: I trust that the experts know what they are talking about because I do not have the requisite knowledge, and have not had the opportunity to examine what they would reply to such objections.
I'll assume they would say that if no one can break any key you have generated, then your message is still safe.

There is the saying that a little knowledge can be a dangerous thing.
No doubt.

you are reasoning with less knowledge than experts, so it may be the case that your reasoning is flawed because you have not considered something.
We probably have a somewhat different understanding of who experts are and what they are quite often about. Mine is cynical

I suggest that you avoid these kind of ad hominem comments. It may be the way you like to express yourself, but frankly, it is offensive.
Yeah. I apologize. I've gotten a little over the top obsessive in this thread and there is nothing left for me to prove. I'll stop now.

Originally Posted by Mario F.
But let me end this way: If one day it is indeed proven that by randomly typing at a keyboard we can not produce good enough random sequences, I'll be very dissapointed with mankind.
Reality is a tough nut to crack.

10. Originally Posted by MK27
So no "properly generated" one time pad can be broken less than 50%. By simply setting all bits to one, I now can rest assured that 50% of the characters in my deciphered message are correct.
Using your definition of what it means to correctly guess a bit, yes, that is true.

Originally Posted by MK27
I'll assume they would say that if no one can break any key you have generated, then your message is still safe.

But the test here is not whether no one can break any key that I have generated; the test is whether, given infinite storage, instant retrieval, processing power that can perform all computations instantly, and with sufficient ciphertext, it is possible to obtain any information at all about the plaintext (other than possibly its length). If that can be done, then the cipher is not equivalent to a one time pad.

That said, my lack of knowledge of information theory is not a dead end; we can always pose questions to one or more experts.

So, could you restate your objection to the requirement of uniform distribution for a RNG whose output is suitable for use as a one time pad?

Originally Posted by MK27
We probably have a somewhat different understanding of who experts are and what they are quite often about. Mine is cynical
That is the kind irrational paranoia that is a good quality in security professionals

11. Originally Posted by laserlight
Using your definition of what it means to correctly guess a bit, yes, that is true.
My definition? It's not...

Originally Posted by laserlight
If you get just one of the bits correct, the one time pad has been broken, for that bit. Each bit of the keystream is a separate key in its own right.
the test is whether, given infinite storage, instant retrieval, processing power that can perform all computations instantly, and with sufficient ciphertext, it is possible to obtain any information at all about the plaintext (other than possibly its length). If that can be done, then the cipher is not equivalent to a one time pad.
If you know the generator is "a proper one" with 50% distribution, then you can get half the bits correct by just guessing one all the time. If you know that the distribution (by the algorithm) is 53/47, then you should be able to get 53%. The problem is, in both cases you will not know which ones are correct, so I am not sure if this constitutes "any information at all".

So, could you restate your objection to the requirement of uniform distribution for a RNG whose output is suitable for use as a one time pad?
Because it is neither a necessary nor a sufficient criteria in and of itself. I think it is a decent guiding principle, but that is sort of self evident: no one is going to mistakenly believe that a generator that does this: 1111111111111 is effective. This is just fiddling with a gauge on the Geiger counter, and believing that because you have not tuned your inputs correctly that radioactive decay is no good.

By treating it as a necessary criteria, you fall into the trap whiteflags did: believing that an uneven probability indicates less randomness. If have a box with 10 holes in the bottom, 6 of them blue, 3 of them green, and 1 of them red, if I drop a ball in the box and shake it until it goes into a hole, there is an even probability that it will land in any particular hole, but only a 10% chance that hole will be red: is that random?

That is the kind irrational paranoia that is a good quality in security professionals
It is not paranoia if they really are out to get you -- then it's wisdom.

12. Originally Posted by MK27
The problem is, in both cases you will not know which ones are correct, so I am not sure if this constitutes "any information at all".
Yes, that is why I think that you are not actually getting the bits correct. You are just getting the distribution correct.

Originally Posted by MK27
By treating it as a necessary criteria, you fall into the trap whiteflags did: believing that an uneven probability indicates less randomness. If have a box with 10 holes in the bottom, 6 of them blue, 3 of them green, and 1 of them red, if I drop a ball in the box and shake it until it goes into a hole, there is an even probability that it will land in any particular hole, but only a 10% chance that hole will be red: is that random?
It is. But it this randomness sufficient for a one time pad? Basically, the aim is to generate as much random information as there is plaintext. This is related to the concept of entropy. If I should link to Wikipedia: Entropy (information theory).

Notice that Schneier prefaces his statements with "for our purposes" and "from our point of view". In fact, I have left out the part where he briefly discusses some philosophical ideas about randomness which are not used in cryptography.

13. Originally Posted by laserlight
Notice that Schneier prefaces his statements with "for our purposes" and "from our point of view".
Well, to put it bluntly: As far as I am concerned, anyone who thinks that plain language text arbitrarily converted (with mod 2) to a bitstream does not represent the very essence of entropy has failed to understand life, nature, the universe, and mostly everything in it.

I would consider any circuitous attempt to demonstrate this is not clearly so misguided or babylonic (ie, non-sensical), not matter how much material you want to reference (here not referring to anything in particular, I haven't read that article yet), and most of this thread bares that out.

That may seem arrogant or ignorant, at first glance, but I assure you it isn't.

14. Originally Posted by MK27
Well, to put it bluntly: As far as I am concerned, anyone who thinks that plain language text arbitrarily converted (with mod 2) to a bitstream does not represent the very essence of entropy has failed to understand life, nature, the universe, and mostly everything in it.

I would consider any circuitous attempt to demonstrate this is not clearly so misguided or babylonic (ie, non-sensical), not matter how much material you want to reference.
Well... so you are saying that you will never change your mind, even if you can be shown to be wrong?

I would think that a more sensible approach is to hold to your convictions unless you are shown evidence that they are untenable.

Originally Posted by MK27
That may seem arrogant or ignorant, at first glance, but I assure you it isn't.
I find it rather ironic that you claim to be cynical of experts and yet so easily give assurances as if you are an expert

15. Originally Posted by laserlight
Well... so you are saying that you will never change your mind, even if you can be shown to be wrong?
Not at all, I do that here occasionally. Some days I seem to end up doing it a lot...

However, that does not mean that I need to give credence to every possible objection, or that I need to exhaustively research the color of the sky to be very confident it is blue.

I would think that a more sensible approach is to hold to your convictions unless you are shown evidence that they are untenable.
Definitely.

I find it rather ironic that you claim to be cynical of experts and yet so easily give assurances as if you are an expert
I never claim to be an expert on anything, because I am not. But I don't think you need to be an expert in anything in order to understand something. There are a lot of "somethings" that I am confident I understand and maybe I seem to give "assurances" about them.

I will admit that sometimes I'll assert something I'm uncertain of, just to see if someone will contradict or correct me, in a rational manner (in which case I acknowledge I was mistaken).

WRT experts, I don't trust authority: so I will not take someone's word for something because they claim expertise (including credentials) if I think they are wrong -- they also need to make sense, unfortunately. In my experience, that does not seem to be difficult for genuinely knowledgeable people to do if they are so inclined.

Conversely, many people in all walks of life seem to consider it their raison d'etre to habitually introduce irrational doubts into the world. That is not necessarily bad (qv, some Sufis) but it is also not necessarily good (qv, some Sufis). Either way, it does go on.