1. Ha, I had quite a post written up before I noticed the bug in my experimentation code. Fixing it made my results significantly more random. Still, I'll try to explain my thoughts.
Originally Posted by Mario F.
Again, what are the only two requirements for the pad? Be the same size or larger than the cipher and be entirely random.
I am disputing that this method of generating a pad is random.

First, to my understanding, the keyboard is not the key. An attacker would know the complete details of the algorithm, including how text is reduced to binary. The only secret would be the binary representation of the text provided by the user.

What bothers me about this method of generating a pad is (as EVOEx pointed out) statistical distributions. In a perfectly random environment, the odds of encountering a 0 or a 1 should be 50%. In the case where a user enters plaintext (passages out of a book, etc.), this does not happen. For example, a user has about a 47% chance to enter a character mapping to a 1. Not hugely bad, but still non-random. I'd love to analyze the chart for second-order groupings (if my first character is a 1, how likely is it that the second character is a 1?), but I don't feel like doing it by hand. My source is Statistical Distributions of English Text.

In the case where a user 'bangs on the keyboard', I've got to say that I'm not convinced. In my 'random' strings, I had about a 57% chance to enter a 0. In my binary strings, I had the greatest chance (about 30%) to have a "00", while the least chance (about 13%) to have a "11". Perhaps a different keyboard mapping would improve this.

Of course, that's sort of my point. Let's say I knew that a 'random' string had a given distribution. For sake of example, let's say it has the distribution I found above. If I simply use a key of all 0's, I've already got more than half the bits correct. Of course, this wouldn't produce anything from the encrypted message, so unless I was already able to do some guessing, that's not good enough. If I use all the pieces of the distribution, surely I can do a bit better than "all 0's". Whether or not it's enough better, I'm not sure.

2. Yes, but I think you are failing to see one fundamental aspect:

A pad, to be effective, is to be used only once. So, there's really no "For example, a user has about a 47% chance to enter a character mapping to a 1"(sic). That's what you get on one key. But it may be something else entirely on another key. That is, the keyboard mapping you did is only valid once. It will be something entirely different the next time you encrypt something.

Ultimately, you can generate the 1s and 0s from an entirely random set of alphanumeric characters, which essentially means you cannot ever decrypt the pad, given that, in fact, you don't even have a cipher anymore. Since this algorithm irrecoverably destroys information, you can no longer reverse the cipher even with the algorithm and the key.

I think that MK's suggestion has a real application in its ability to reduce the physical size of the pad for very large plaintext targets. It concerns me, however, what kind of algorithm is meant to be run on this pad that can produce more than 2 possibilities for each character in the cipher. This is where I don't see how it can be of any use. But as for its ability to sustain random values, I don't question that.

First, to my understanding, the keyboard is not the key. An attacker would know the complete details of the algorithm, including how text is reduced to binary.
Well, think of it this way: what will constitute the key you will try to find when decrypting his "cipher"? Is it not the relationship between characters and the bit value?

3. Even if MK27's method is random and not vulnerable to frequency analysis, (I'm not going to argue whether it is or isn't) it seems highly impractical. To encrypt a 1 megabyte file, you would need 8 megabytes of text. How do you get that? (You would need to type the Bible twice)

4. Originally Posted by Mario F.
He said randomly type on the keyboard.

Here's a challenge:
Type randomly on your keyboard for 30 seconds fast and furiously with all your fingers from both hands... Done?
Now, do it again. Wake me up in 20 or 30 years when you finally match the exact same sequence of characters.
If you were not aware, it is accepted in cryptography that human attempts at random keyboard input have non-random characteristics that make it unsuitable as a random source for cryptographic use.

But the typical usage of such "random" input differs from what MK27 is suggesting, which is why I am open to the possibility that it may actually be an RNG that is suitable for cryptography, including use as a one time pad.

Originally Posted by Mario F.
Exactly how do you plan to convince me a pad made up of a random set of bits is not going to serve our purposes?
How do you know that the pad is "made up of a random set of bits", with the keys generated in a uniform distribution? I suspect that it is, but I prefer to remain skeptical until more evidence in the form of statistical tests of randomness shows that the hypothesis is probably not flawed.

Originally Posted by Mario F.
A pad, to be effective, is to be used only once.
Yes, I made a mistake here when talking about the probable tendency for users to reuse source text. This is not a good point, because users who might make the mistake of reusing source text might also make the general mistake of reusing keys for a one time pad, regardless of how the keys were obtained.

Originally Posted by Mario F.
So, there's really no "For example, a user has about a 47% chance to enter a character mapping to a 1"(sic). That's what you get on one key. But it may be something else entirely on another key. That is, the keyboard mapping you did is only valid once. It will be something entirely different the next time you encrypt something.
However, my other point is still a concern, at least for a practical implementation: different users may have the tendency to select the same source text. This kind of bias becomes problematic when you want to use the RNG output for a keystream, especially as a one time pad.

Originally Posted by Mario F.
Ultimately, you can generate the 1s and 0s from an entirely random set of alphanumeric characters, which essentially means you cannot ever decrypt the pad, given that, in fact, you don't even have a cipher anymore. Since this algorithm irrecoverably destroys information, you can no longer reverse the cipher even with the algorithm and the key.
This is not a cipher. MK27's use of the term "one time pad" is rather unfortunate in this respect because it has the potential to confuse the cipher and the keystream. This is an RNG that utilises a hash function, and whose output might be suitable for use as a one time pad, i.e., as the keystream of a one time pad cipher.

Originally Posted by NeonBlack
To encrypt a 1 megabyte file, you would need 8 megabytes of text. How do you get that?
This might just be the key management problem of a one time pad, but yes, it is a practical implementation problem related to my concern of bias in source text selection.

5. Originally Posted by Mario F.
Yes, but I think you are failing to see one fundamental aspect:

A pad, to be effective, is to be used only once. So, there's really no "For example, a user has about a 47% chance to enter a character mapping to a 1"(sic).
The pad is the binary representation of the text, since this binary representation of the text is what will be used to XOR the message. The pad is not the mapping of the keyboard to ones or zeros. The mapping of the keyboard is part of the algorithm being used to produce the pad.

Consider this in relation to a real OTP. The pad is the pad; it has the keys to be used to encrypt the message. The method used to produce the pad only matters insofar as it produces completely random distributions.

So, if using plain-text to produce the pad, then yes, there will be statistical distribution problems. It doesn't matter if I'm writing an original composition or copying from War and Peace; English letter distribution is reasonably consistent, regardless of the source. If using flail-text, I can imagine that the distribution is skewed by striking keys towards the center of the input device far more often than striking keys towards the outside. Because of these two distribution problems, the pad generated by this method is not completely random.
Originally Posted by Mario F.
Well, think of it this way: what will constitute the key you will try to find when decrypting his "cipher"? Is it not the relationship between characters and the bit value?
My understanding of cryptography may be flawed, but I was under the impression that the user gets exactly one secret (excluding the message), with the attacker getting everything else. If the keyboard mapping is the secret you're choosing to keep, then the attacker gets the binary representation of the text. In that case, I don't need the keyboard mapping at all. I just use the binary representation and XOR the encrypted message, yielding the correct result.
Originally Posted by laserlight
If you were not aware, it is accepted in cryptography that human attempts at random keyboard input have non-random characteristics that make it unsuitable as a random source for cryptographic use.
I also think this is the case (in fact, I thought I read it in a thread not long ago), but I could not find a source. Can you cite one?

6. I also think this is the case (in fact, I thought I read it in a thread not long ago), but I could not find a source. Can you cite one?
I did mention this earlier in the thread; 'The Code Book' mentions this when discussing random key generation.

7. Originally Posted by pianorain
Wait, did you just say "use a human as a PRNG"? Humans are not random.
You know what you are? You are an "answer bot". You are the kind of dumb animal that goes straight to the last post in a thread, finds some phrase that makes sense to them, and starts spamming a lot of bland, obvious non-thoughts absorbed from reading some 101 level textbook. The goal seems to be to do as little thinking and as much regurgitating of decontextualized, inapplicable crap as possible.

If you had even glanced thru the thread, you would have noticed this point has already been raised and debated and that you have fundamentally misunderstood what is going on.

LEARN TO READ BEFORE YOU POST.

Also: in order to be truly random, a binary sequence DOES NOT need to consist of exactly 50% 0 and exactly 50% 1. That is not random, that is contrived.

8. Originally Posted by MK27
You know what you are? You are an "answer bot". You are the kind of dumb animal that goes straight to the last post in a thread, finds some phrase that makes sense to them, and starts spamming a lot of bland, obvious non-thoughts absorbed from reading some 101 level textbook. The goal seems to be to do as little thinking and as much regurgitating of decontextualized, inapplicable crap as possible.

If you had even glanced thru the thread, you would have noticed this point has already been raised and debated and that you have fundamentally misunderstood what is going on.

LEARN TO READ BEFORE YOU POST.

I think the best part was when you finally realized what was going on: rather than just acknowledging it, you resorted to greater and greater forms of absurdity -- e.g., that in order to be truly random, a binary sequence needs to consist of exactly 50% 0 and exactly 50% 1. That is NOT random, that is contrived.
This will be the only time I respond to you, since you appear to be unable to defend your idea beyond flaming.

A random statistical distribution does not say that a single generated binary sequence will consist of exactly 50% zeros and ones. It says that over the long run, the generation of binary sequences will tend towards 50% zeros and ones.

If you want anyone to take you seriously, try responding to points raised rather than resorting to ad hominem attacks.
Originally Posted by rogster001
I did mention this earlier in the thread; 'The Code Book' mentions this when discussing random key generation.
Thank you. I knew I wasn't going crazy. I just didn't find it when I looked back through the thread.

9. Originally Posted by rogster001
I did mention this earlier in the thread; 'The Code Book' mentions this when discussing random key generation.
Here we go again -- why can you not understand the difference between the context to which this applies, and the context to which it does not?

I already explained how completely non-random, human-readable text (a paragraph, etc.) could be used here as well. Since that is obviously true, how on earth do you believe that pseudo-randomesque input -- hitting the keyboard -- would not also do?

If you wanted to produce random patterns of characters, typing at the keyboard is not the best method. That is the sense in which "it is accepted in cryptography that human attempts at random keyboard input have non-random characteristics that make it unsuitable as a random source for cryptographic use". No one, including me, has claimed anything else.

But producing random patterns of characters is not what was being discussed. The reduction to a random binary pattern DOES NOT require random patterns of characters.

Originally Posted by pianorain
Thank you. I knew I wasn't going crazy. I just didn't find it when I looked back through the thread.
No, your problem is you want to quote "an authority" out of context because you cannot think for yourself properly.

10. Originally Posted by pianorain
Thank you. I knew I wasn't going crazy. I just didn't find it when I looked back through the thread.
It still bothers me, though. How can this be in any way applicable? How does someone intend to prove and properly quantify someone else's tendency for a given distribution when they are typing on a keyboard?

It does put a dent in the notion of randomness; I concede that (if this is indeed something that has been proved and not just academic discussion). But what to say of the fact that atomic decay itself has yet to be proved random?

If we are going beyond practical barriers to the generation of random numbers, insisting on some unattainable notion of randomness, then it becomes evident that, no matter what method you choose, it will never be truly random.

What use is this then? Insist on raising the bar and you get nothing.

EDIT:
More,

It may be true if the subject is typing mindlessly at the keyboard without a purpose. I still question how anyone can take advantage of that in order to come up with a practical cryptanalysis algorithm, but whatever. Now, what if the subject is told to type on the keyboard randomly and to try to keep a fair distribution? How can anyone maintain that the result has no practical randomness?

11. At the risk of being called an answer bot, I don't really understand this part. Note that I'm not saying you are wrong; I just don't understand it.
Originally Posted by MK27
If you wanted to produce random patterns of characters, typing at the keyboard is not the best method. That is the sense in which "it is accepted in cryptography that human attempts at random keyboard input have non-random characteristics that make it unsuitable as a random source for cryptographic use". No one, including me, has claimed anything else.

But producing random patterns of characters is not what was being discussed. The reduction to a random binary pattern DOES NOT require random patterns of characters.
With the algorithm you posted (character modulo 2), every character has a static mapping to either 1 or 0. So if the input character string isn't considered random, then how can the output binary string be?

12. I actually question that quote from MK. Entirely. I maintain it is as random as random can get. Which is pretty random, given that not even you can repeat or predict the sequence of characters you are going to type.

Two problems:

- (1) Radioactive decay: it is predictable. So, by the requirements being defended in this thread, it does not constitute a good source for an RNG.

- (2) Throwing a bag of 1000 dice into the air: it follows Newtonian physics. Under the same conditions it can be duplicated. It does not constitute a good source for an RNG.

What's wrong with this picture? The fact that you guys are overestimating randomness at the expense of predictability. We have yet to prove randomness; we simply don't know it exists. But what we do know is our capacity to predict certain scenarios or not, and to duplicate them or not. And the two above are very hard to predict or duplicate. Incidentally, in lab conditions, the first one is conceivably a lot easier than the second one, and yet everyone insists that radioactive decay is the best method.

And as such, we have so far been achieving randomness supported by our inability to predict the result (example 1) or to create the conditions to duplicate the result (example 2). Now, I understand I'm not a scholar. I'm just the guy around the corner. But I dare you to try to prove that me randomly typing on a keyboard is in any way less random than either of the two methods above. It is both unpredictable and non-duplicable. Not even by the system that generated it in the first place (you).

13. Originally Posted by Mario F.
It does put a dent in the notion of randomness; I concede that (if this is indeed something that has been proved and not just academic discussion). But what to say of the fact that atomic decay itself has yet to be proved random?
Note that no generator can be proved random or non-random. It can only be tested to suggest randomness. In this case, I'm suggesting that this method of producing random bits may fail a 'next-bit' test because of plain-text statistical distribution problems. Obviously my sample size of flail-text is tiny, but it also suggests distribution problems. Seeing as passing the 'next-bit' test is one of the requirements for a cryptographically secure PRNG, this is important. The only way to know definitively is to run the tests and see what happens.
Originally Posted by Mario F.
It may be true if the subject is typing mindlessly at the keyboard without a purpose. I still question how anyone can take advantage of that in order to come up with a practical cryptanalysis algorithm, but whatever. Now, what if the subject is told to type on the keyboard randomly and to try to keep a fair distribution? How can anyone maintain that the result has no practical randomness?
Type randomly, but be fair? Of course I have no data, but I'd imagine that the human mind (whether consciously or not) will cling far more heavily to the ordered concept of 'be fair' than to the chaotic concept of 'be random', so the result would likely be far more ordered than it should be to be sufficiently random.

I could see a possible scenario in which multiple people are typing at the same time, with characters being added to the key as they come in. This seems much more random because it adds the timing of each keystroke to the "randomness" equation.
Originally Posted by _Mike
So if the input character string isn't considered random, then how can the output binary string be?
Creating randomness from a non-random source is exactly what PRNGs attempt to do. A PRNG is totally deterministic if you know how it works and its internal state; it's just math. So, methods exist to create the appearance of randomness from a non-random source. Whether or not MK27's method is one of them is uncertain.

14. Originally Posted by _Mike
At the risk of being called an answer bot, I don't really understand this part. Note that I'm not saying you are wrong; I just don't understand it.

With the algorithm you posted (character modulo 2), every character has a static mapping to either 1 or 0. So if the input character string isn't considered random, then how can the output binary string be?
Because of the reduction. The reason text like this:
For our purposes, entropy is the information delivered within a stream of bits. It is defined as:
H = - sum_x p_x log_2( p_x )
where (x) is a possible value in a stream of values (e.g., in this example, a contiguous set of bits of some fixed size -- a byte, a word, 100 bytes, ...) and (p_x) is its probability of occurrence (from an infinite population of (x) values, not just a finite sample). Typically, as the values (x) over which entropy is computed increase in size, the entropy increases but not as rapidly as the size.
is not random input is because it is easily predictable -- it's English language text. If I convert it with something like this:
Code:
`for (size_t i = 0; i < strlen(str); i++) printf("%d", str[i] % 2);`
You get:
01001100010011110010001010110001010010110110001010 10100110010010100111010001010010011001010100110000 10101111000100011100001001010101000101101001111001 00101101001010011101000101110010100010000110101100 10010110011111101100100010101001111001010011010110 10010100101100001000010110000010100000101011010100 01010101010100111100101100001101001001010100101010 11001000010010111000100011001001010101110011000101 11001001100010010111000101010010110010001010110111 01010010101111010011010000101000101010101111100100 010011001010010110001011010
288 ones and 239 zeros. Now, it does not take too many brains to realize that this is irreversible (that's part of what "reduction" means -- e.g., converting Celsius to Fahrenheit is non-reductive, while converting base 26 to base 2 is very reductive).

That means that, while it is simple to produce the bitstream from the text, it is impossible to determine the text which produced the bitstream. The text is not random, but the bitstream produced is: it contains no recognizable patterns.* This is what I meant by the comparison to radioactive decay:

To clarify: even if the entire sequence of data is based on a 100% predictable phenomenon, without the phenomenon you could not say (based on a few seconds of decay data) what happened during the rest of the time, because you do not have the matter (which decayed, producing the data) to examine. In the same way, given part of the bit sequence created from a text, you could not say what the rest of the text was, so you certainly could not say what the rest of the bit sequence was. The same few seconds of decay data could probably be produced by quite different decay events, just as the same sequence of bits could be produced by very different texts. The conversion is highly reductive and therefore irreversible.
* and it is the bitstream which is being used in our cryptography algorithm, not the text -- just as it is not the fissile material that you would use in some other algorithm, it is output from a device. Considering what the device is for, that data is not at all random either: it is completely determined by what it is measuring/detecting. Without the causative event (which cannot be wholly reconstructed with only some portion of the data) -- the text, as it were -- the data (a product of reduction) appears random, that is, highly entropic and unpredictable.

15. Here we go again -- why can you not understand the difference between the context to which this applies, and the context to which it does not?
MK, at the time I was not referring to your example at all, I'm afraid. I was just saying that, in general, keyboard bashing ALONE is not considered random enough. It just so happened that you mentioned it in a PART of your idea; a bit lazy of me, really, not to point that out at the time.