Quote:
Originally Posted by
Mario F.
Nah, nah, nah. There's no place here for conjectures. You are cheating with that argument.
So any hypothesis is cheating? That must make it difficult to get anywhere. My conjecture is based on the assumption that humans act within reason. That, of course, is debatable, but it is supported in this field by the fact that we can publish lists of the most popular passwords. If humans were better at being random, this tiny dictionary of 500 plain-text words wouldn't be able to successfully attack 11% of the population.
Quote:
Originally Posted by
Mario F.
Please check post #132. I give you there the real challenge. And even at the face of what you say here, you will still not be able to predict or duplicate the conditions. You achieved randomness. Dare to contradict me?
Not contradict, merely suggest an alternative view.
The ability or inability to duplicate a result is not a requirement of a PRNG. For an example in the computer science field, consider the seed of a PRNG. If you seed a PRNG with the same number every time, you will generate the same numbers every time. This does not mean that the PRNG is broken; it is simply being used poorly. Whether an event is repeatable has no bearing on whether a PRNG is suitably random.
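To make the seeding point concrete, here is a minimal sketch in Python (the language is my choice; any deterministic PRNG behaves the same way):

```python
import random

# Two generators seeded with the same value produce identical output.
a = random.Random(42)
b = random.Random(42)

seq_a = [a.randint(0, 9) for _ in range(10)]
seq_b = [b.randint(0, 9) for _ in range(10)]

print(seq_a == seq_b)  # True: repeatable, but the PRNG is not "broken"
```

Repeatability here is a property of how the generator was seeded, not of the generator itself.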
The lack of predictability, on the other hand, is a major requirement in determining the suitability of a PRNG. This is where my entire argument lies.
First, an explanation of the "next-bit" test. Straight from Wikipedia:
Quote:
Given the first k bits of a random sequence, there is no polynomial-time algorithm that can predict the (k+1)th bit with probability of success better than 50%.
Why do I think simple typing on the keyboard fails this test? Because I can describe an algorithm that I think will predict the next bit (with greater than 50% success) given only a single bit.
To illustrate, I took four samples of text from this thread:
Quote:
01001100010011110010001010110001010010110110001010 10100110010010100111010001010010011001010100110000 10101111000100011100001001010101000101101001111001 00101101001010011101000101110010100010000110101100 10010110011111101100100010101001111001010011010110 10010100101100001000010110000010100000101011010100 01010101010100111100101100001101001001010100101010 11001000010010111000100011001001010101110011000101 11001001100010010111000101010010110010001010110111 01010010101111010011010000101000101010101111100100 010011001010010110001011010
Quote:
Note that no generator can be proved random or non-random. It can only be tested to suggest randomness. In this case, I'm suggesting that this method of producing random bits may fail a 'next-bit' test because of plain-text statistical distribution problems. Obviously my sample size of flail-text is tiny, but it also suggests distribution problems. Seeing as passing the 'next-bit' test is one of the requirements for a cryptographically secure PRNG, this is important. The only way to know definitively is to run the tests and see what happens.
Quote:
If you have a key that generates a meaningful message, it would be easy to take that bit pattern and generate a corresponding text for it (you have 13 choices each time -- actually with punctuation and capitals, lets say you have 30 choices each time). Chance are you could do that with a message of any length. Going back to the five letter word, using modulus how many 30 character phrases could you come up with that would produce the "key" to produce a word from 5 scrambled bytes? The same number of phrases as there are possible keys, and most likely, even more than that, since just one such bit sequence could be produced by a whole bunch of possible word combinations.
Quote:
The problem is that the sets of text are not all equally likely to be chosen by a given user. Recall why brute force can be effective in practice against hashed passwords (where a salt is not used): users tend to choose weak passwords for convenience, and these passwords are thus susceptible to a dictionary attack. Now, in practice these text are more like passphrases rather than passwords (though the distinction is largely just a matter of style, in my opinion), thus brute force is likely to fail anyway. Yet, just as users reuse passwords, they may reuse source text. Training them not to do so is easier said than done; even trained spies blunder.
I then converted each sample to binary using mod 2 arithmetic and computed the distribution of two-bit sequences:
Quote:
00: 0.233396584440228
01: 0.311195445920304
10: 0.311195445920304
11: 0.142314990512334
00: 0.226691042047532
01: 0.297989031078611
10: 0.297989031078611
11: 0.175502742230347
00: 0.223042836041359
01: 0.280649926144756
10: 0.282127031019202
11: 0.212703101920236
00: 0.242748091603053
01: 0.274809160305344
10: 0.274809160305344
11: 0.206106870229008
I noticed that 01 and 10 are always more frequent than 00 or 11. (They are also always nearly equal, which is no accident: in any bit string, the counts of 01 and 10 pairs can differ by at most one, since every rise from 0 to 1 must eventually be followed by a fall back to 0.) This suggests a simple algorithm for guessing the next bit with greater than 50% accuracy: guess the other bit. If you are given a 1, guess a 0 (and vice versa).
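The conversion and the proposed predictor can be sketched as follows. This is my Python reading of the method: I'm assuming "mod 2 arithmetic" means taking each character code mod 2, and I normalize by the number of pairs, so this sketch won't reproduce the figures above digit for digit.

```python
from collections import Counter

def to_bits(text):
    # One bit per character: the character's code point mod 2.
    return [ord(c) % 2 for c in text]

def pair_distribution(bits):
    # Relative frequency of each overlapping two-bit sequence.
    pairs = Counter(zip(bits, bits[1:]))
    total = sum(pairs.values())
    return {f"{a}{b}": n / total for (a, b), n in pairs.items()}

def predict_next(bit):
    # The proposed predictor: always guess the opposite bit.
    return 1 - bit

sample = to_bits("Nah, nah, nah. There's no place here for conjectures.")
print(pair_distribution(sample))
```

Running this over a body of typed text and comparing the predictor's guesses against the actual next bits is all the "next-bit" check amounts to here.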
I then created four strings of flail-text (each 680 characters in length). Here were the distribution results:
Quote:
00: 0.222058823529412
01: 0.260294117647059
10: 0.261764705882353
11: 0.254411764705882
00: 0.254411764705882
01: 0.239705882352941
10: 0.239705882352941
11: 0.264705882352941
00: 0.222058823529412
01: 0.282352941176471
10: 0.283823529411765
11: 0.210294117647059
00: 0.263235294117647
01: 0.245588235294118
10: 0.245588235294118
11: 0.244117647058824
This means that if, at any given position, I was shown a bit and asked to guess the next one, I would have been correct 51.7% of the time when shown a 0, and 51.4% of the time when shown a 1. In other words, it fails the "next-bit" test.
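The two accuracy figures can be recovered from the pair frequencies above. Since the four samples are the same length (680 characters), pooling them is a plain sum; Python again, as a sketch:

```python
# Pair frequencies from the four 680-character flail-text samples above.
samples = [
    {"00": 0.222058823529412, "01": 0.260294117647059,
     "10": 0.261764705882353, "11": 0.254411764705882},
    {"00": 0.254411764705882, "01": 0.239705882352941,
     "10": 0.239705882352941, "11": 0.264705882352941},
    {"00": 0.222058823529412, "01": 0.282352941176471,
     "10": 0.283823529411765, "11": 0.210294117647059},
    {"00": 0.263235294117647, "01": 0.245588235294118,
     "10": 0.245588235294118, "11": 0.244117647058824},
]

pooled = {k: sum(s[k] for s in samples) for k in "00 01 10 11".split()}

# Guess the opposite bit: given a 0 we guess 1, so we are correct
# exactly when the pair turns out to be 01 (and likewise 10 after a 1).
acc_given_0 = pooled["01"] / (pooled["00"] + pooled["01"])
acc_given_1 = pooled["10"] / (pooled["10"] + pooled["11"])

print(round(acc_given_0 * 100, 1), round(acc_given_1 * 100, 1))  # 51.7 51.4
```

Both conditional accuracies come out above 50%, which is what the opposite-bit predictor needs in order to beat the next-bit test's threshold.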
Obviously, my sample size is small, and I understand that I'm not proving anything. I'm merely saying that this limited evidence points to statistical distribution being a problem for this method.