Of course you can't know. Just like frequency analysis is useless...
In small texts.
But larger texts make possible what I - and apparently kryptcat - said.
Almost akin to arguing about whether the sky is blue then finally going "Oh, you mean that sky"...what do you think the rules/odds are? In general? More silliness laserlight ;)
I'll take your word for that -- I don't know.Quote:
It depends on what you mean by "pragmatically", but the undergraduate physics that my university teaches at cross-faculty level holds that there is "real evidence".
And I have to take the word of my lecturers as I don't know what they really teach at higher levels :)Quote:
Originally Posted by MK27
No, frequency analysis is useless against a one time pad.Quote:
Originally Posted by MK27
But now the argument becomes a little clearer: if your argument rests on the assertion that "the text is a one time pad key to the bit sequence", then key management automatically becomes a practical problem, e.g., if users have a tendency to enter or otherwise pick common text, then the method fails.
So tell me why my argument is wrong for long texts. I'm not going to take your word on this.
I'm not saying it's easy. It's going to take a lot of brute forcing and analysis to decide whether texts make sense... But don't you agree that if two texts make sense you know you've cracked the encryption?
I cannot believe this is what is blinding you here. You are totally out of context; I have already explained that 3-4 times.
If I dissolved your body in acid, allowed the remains to solidify, and used radioactive decay, would that data be "less random" because a human produced it?
Think again about what I am talking about. Here is part of this post converted with:Code:
for (i=0;i<strlen(str);i++) printf("%d", str[i]%2);
10110010001011010001101101010011000100101011100101 00011101010010100101100100110010000010010011110010 01110001010100111000100111000010011110100101101010 10010001110001011110101001111001110000010001010010 00111001001100011111101001110000101110010100010101 1110010110010100101100010111010*
Now, there are "non random" patterns in the event -- in fact it is not random at all, it is actual meaningful text. This is what I meant about the radioactive decay event -- which could be something that is totally predictable and full of meaningful, information rich patterns, but which can also produce random data of a given sort.
Why this is so
It may be hard for humans to avoid repetition in conscious acts and thus we are bad at producing random data. However, it would be very hard EVEN INTENTIONALLY to produce repeated patterns in a bit sequence converted one to one from characters in some text you type, while still using a meaningful sequence of words (ie, you can't go "word word word word" -- you must construct a sentence). Since it will be very difficult just to produce any pattern at all after conversion, I promise: it will be literally impossible to do it by accident. This is what I mean by "out of context".
*for some reason this appeared with spaces in it...
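(If you want to try it yourself, here is a self-contained version of that one-liner -- the wrapper and the sample string are mine, only the loop is from above:)Code:
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* any source text will do; this string is just an example */
    const char str[] = "Think again about what I am talking about.";
    size_t i;

    for (i = 0; i < strlen(str); i++)
        printf("%d", str[i] % 2);
    printf("\n");
    return 0;
}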
No. This is the one time pad debate, and as far as we have been able to determine, kryptcat is the only person on Earth who thinks it is breakable. So you don't have to take my word on that -- google.
Long or short text -- it does not matter. The key is the same length as the text. It contains no determinate repeating patterns (the text does). All you can do is assert that some part of it, eg:
1
Must be (assuming lower case ascii) one of acegikmoqsuwy, and not one of bdfhjlnprtvxz. And so on. With say 500 digits, you now have 13^500 possibilities. Punch "13^500" into your calculator. Trying to determine the input data for the key is going to be even harder than just trying all possible keys (2^500* -- still a ridiculously huge number). Even when your computer is done brute forcing the possibilities (in a few millennia), you will still not be much further along than where you started, viz., reducing them.
*since the bit key needs to be 8X as long as the message text, 500 is not a valid length, but you could have a 504 bit key for a 63 byte message.
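(For a rough sense of scale -- my arithmetic, not in the original post: 13^500 = 2^(500 * log2 13) ≈ 2^1850, so guessing the source text is about 2^1350 times harder than brute forcing the 2^500 keys directly.)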
Just answer this question:
If we take a meaningful text, apply your algorithm to convert it to your key, decrypt an encrypted text with this, and we get another meaningful text.... Do you agree that you can be pretty much sure that the meaningful text you started with was what the user entered?
This is actually a pretty good idea -- you do not have to keep the algorithm secret, and you do not need an RNG or a Geiger counter. And you do not have to use real text, any set of character strings at all will do. Just make sure it is not related to the message text. And you will have a truly random key generator.
If you have the key, yeah, that is the whole point. But if you have the key you don't need to "break" the text which created the key -- you can skip straight to decrypting the message.
If you don't have the key, it would be more or less totally insane to start trying to come up with the random text which produced it, and test that to see if it produces a key which will decrypt the message itself.
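To make the whole scheme concrete, here is a minimal sketch of it as I read it -- one key bit per source character, eight source characters per message byte, then XOR. The packing details and names are mine, a sketch rather than a definitive implementation:Code:
#include <stdio.h>
#include <string.h>

/* pack the low bit of each of 8 source characters into one key byte */
static unsigned char key_byte(const char *src)
{
    unsigned char b = 0;
    int i;
    for (i = 0; i < 8; i++)
        b = (unsigned char)((b << 1) | (src[i] & 1));
    return b;
}

int main(void)
{
    const char *msg = "hello";   /* 5 byte message */
    /* key source text: must be 8 * 5 = 40 characters, unrelated to msg,
       and thrown away after use */
    const char *src = "forty characters of unrelated source tex";
    unsigned char cipher[5];
    size_t i, n = strlen(msg);

    for (i = 0; i < n; i++)
        cipher[i] = (unsigned char)(msg[i] ^ key_byte(src + 8 * i));

    for (i = 0; i < n; i++)
        printf("%u ", (unsigned)cipher[i]);
    printf("\n");   /* decryption is the same XOR with the same key bytes */
    return 0;
}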
Now you just gave me exactly what I needed to prove that your encryption is weaker than using a proper one time pad.
Imagine the following algorithm:
For each possible string of characters (2^(8*n) possibilities, where n is the number of encrypted characters), apply your algorithm to convert this to a key. Then using this key, decrypt the text. If the decrypted text and the initial string of characters both were sensible in some way, we cracked the code.
Agreed that this would work? It's a slow algorithm, 2^(8*n) tries, and practically impossible. But not theoretically impossible.
I think you'll have to agree with this.
However, a real one-time pad DOES NOT suffer from this flaw. It is theoretically, not just practically, impossible to break. This is because you only have one string of characters -- the decrypted text -- to check for sense.
So, which part of this did you not agree with?
That the algorithm is assumed to be known to the adversary is one of the basic principles of cryptography.Quote:
Originally Posted by MK27
Isn't what you are proposing supposed to be a RNG?Quote:
Originally Posted by MK27
So how would you address the problem of key management, i.e., how should the sequence of characters be chosen and not reused? A fully automated approach does not seem feasible since it requires randomness to select a sequence, yet it is this very sequence that is supposed to provide the randomness. Relying on the human opens up the possibility of human related predictability, e.g., humans might tend to quote Shakespeare.Quote:
Originally Posted by MK27
Sort of -- but it uses a potentially very very long "seed", which makes it unbreakable (vs a PRNG, which can only produce LONG_INT_MAX possible sequences). It's also just one line of very simple code.
Take some text, unrelated to the message, 8X the message length. Generate a key. Throw the key generating text away.Quote:
So how would you address the problem of key management, i.e., how should the sequence of characters be chosen and not reused?
Yes, they might also do this "asdf asdf asdf" on purpose -- again, the stupid possible case is just that. It would be handy for key management in this sense -- you could just use a specific point in a specific available text (a certain bible edition, eg). As long as that is secret, there will be no guessing the key. You don't even need to keep the "actual" key, you just need to know those details.Quote:
Relying on the human opens up the possibility of human related predictability, e.g., humans might tend to quote Shakespeare.
That makes it better than radioactive decay data.
No, apparently I have given you just what you need to feed your delusions further :p
Yes. It is more than slow, I am afraid. A 140 character twitter message would mean 2^1120 possibilities. You may think that is not such a big number, but you are wrong -- neither of us even knows a word describing this number, and you probably do not have any hardware or software available to you right now to tell you just how big a number it is.Quote:
Imagine the following algorithm:
For each possible string of characters (2^(8*n) possibilities, where n is the number of encrypted characters), apply your algorithm to convert this to a key. Then using this key, decrypt the text. If the decrypted text and the initial string of characters both were sensible in some way, we cracked the code.
Agreed that this would work? It's a slow algorithm, 2^(8*n) tries.
Your grandchildren will be long dead before your (now antique) computer has finished calculating the possibilities. Then you will need several Earths' worth of staff to consider them. This is the reality of decimal numbers. If a container can hold any number of liters expressible in 10 digits (a big container!), an 11 digit value (just one more digit!) might potentially require 10 such containers, and a 12 digit value (just two more digits!) is guaranteed to do so.
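(Back-of-envelope, not in the original post: log10(2^1120) = 1120 * log10 2 ≈ 337, so 2^1120 is roughly 10^337. For comparison, the number of atoms in the observable universe is usually estimated at around 10^80.)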
I completely agree! Maybe the algorithm can be optimized some to be actually feasible. But let's say it can't. I'm fine with that.
However, there IS such an algorithm. While there IS no such algorithm for a proper one time pad. Meaning that even if your idea is strong (I'm not saying it is), it is NOT as strong as a real one time pad.
It's like saying 65536 bits RSA is as strong as 131072 bits RSA. We won't be able to crack either within the lifetime of the universe probably (using our current hardware), but that doesn't mean that one isn't stronger than the other.
So a real one time pad is still stronger than your idea.
Can you agree with this?
Sorry for the confusion: when I say RNG, I mean random number generator, not pseudo-random number generator (PRNG).Quote:
Originally Posted by MK27
Unfortunately, that does not adequately answer the question because "take some text, unrelated to the message, 8X the message length" is not precisely specified; it seems to me that it is easier said than done if you really want randomness. This is related to:Quote:
Originally Posted by MK27
Quote:
Originally Posted by MK27
My whole point is that reducing a set of base 27 numbers down to binary will always produce a random result. Depending on how you do it, eg with hex:
0-7 = 0
8-f = 1
If the numbers are sequential:
0 1 2 3 4 5 6 7 8 9 a b c d e f
You will get:
0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
From which you can deduce something, if you know the algorithm (0-7 = 0, 8-f = 1). But because the "pattern" in sequential numbers has nothing to do with the "patterns" in meaningful language:
d e a d b e e f
1 1 1 1 1 1 1 1
Since you might have sequential numbers, the modulus method is better (because it also eliminates the "pattern" in sequential numbers); you will just get stuff like
1 0 0 1 1 0 0 1
Now, since this all pertains to the key source text and not the message -- well, it is unbreakable, as I said.
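(A throwaway demo of the two reductions, if you want to check the rows above yourself -- the code is mine; it reproduces the 1 1 1 1 1 1 1 1 and 1 0 0 1 1 0 0 1 rows for "deadbeef":)Code:
#include <stdio.h>

int main(void)
{
    const char hex[] = "deadbeef";
    int i, v;

    for (i = 0; hex[i] != '\0'; i++) {
        v = (hex[i] >= 'a') ? hex[i] - 'a' + 10 : hex[i] - '0';
        /* first reduction: 0-7 = 0, 8-f = 1; second: modulus 2 */
        printf("%c: halves=%d mod2=%d\n", hex[i], v >= 8, v % 2);
    }
    return 0;
}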
I am still awaiting your reply to my previous post, as I consider it a key element to us getting to an understanding on this subject.
But this quote... Didn't you just agree with me that it was breakable? Albeit only theoretically and extremely difficult and slow?
No, it does not: that mapping is deterministic; it is a hash function. The way I see it is that your point requires that 'the "pattern" in sequential numbers has nothing to do with the "patterns" in meaningful language'.Quote:
Originally Posted by MK27
Unless it is feasible to guess the key source text.Quote:
Originally Posted by MK27
Did you mean this?
No, it is a one time pad. I know you said, "However, there IS such an algorithm" -- yes, in the same sense there is "an algorithm" to produce all the possible keys for a one time pad, and since this is a one time pad, I guess so. It is not "easier". How could it be? Because you can attempt to guess that text too -- which is eight times longer than the message?
I agreed that the one time pad is limited to 2^(8n) possible keys, and since that set is finite you could generate them "theoretically" and it would be "extremely difficult and slow". The text that produced the key actually has 26^(8n) possibilities. The wrong tree to bark up, in other words.Quote:
But this quote... Didn't you just agree with me that it was breakable? Albeit only theoretically and extremely difficult and slow?
But as a bunch of people tried quite a bit to demonstrate to kryptcat, that won't do you much good beyond getting a lot of practice counting things. To somewhat rephrase laserlight: if you buy all the tickets to a lottery, you can say in advance that you have won. But if the winning ticket remains a secret, and no one can acknowledge which one it is, in reality you can never win until you choose a ticket. If the person who wrote the message is patient and honest, you could show them all the possibilities you have successfully generated in the past few centuries and they would have to admit, one of them is correct.
More realistically: having all the possibilities won't help. And there is NOTHING else you can do. That is why it is unbreakable. The key does not contain a pattern. The text that generated the key did -- but like the fissile material with the Geiger counter, that is irreversibly gone, and it has nothing to do with the message. It is truly random natural data.
I lost track of what is being debated here. The basis of the OTP is that the quantity of information in the key is at least as large as the quantity of information in the plaintext. If that holds, then the cipher is unbreakable.
Note that "quantity of information" refers to the true information of the key -- if you start with some fixed number of bits (a random seed) and expand that using a PRNG into some larger number of bits, you have NOT increased the quantity of information by that much (it does increase a TINY amount because the details of the PRNG algorithm are encoded into its output sequence), and you certainly do not increase it arbitrarily.
Any method of generating the key bits that does not involve a truly random source, is not a true OTP.
Well I fulfilled both criteria:
1) the key generated will be at least as large as the data.
2) the key was generated using a truly random source (to understand why and how this is so, read the thread*).
*paraphrased: human language text, taken character by character modulo 2 of its ascii values, does not contain any pattern; it is a random sequence of 0s and 1s.
MK27 proposed a method of generating the key bits: given a sequence of characters as input, compress every character with a hash function such that it maps to a bit. He posits that the resulting bitstream is a truly random sequence suitable for use as a one time pad, even if the sequence of characters is say, the text of a novel.Quote:
Originally Posted by brewbuck
The trick is that you can actually KNOW when you have the proper key, as both the input text and the decrypted text make sense. If that is true, you can be fairly sure you have the right combination, even though it is fairly unlikely you will find this using this algorithm.
However, if you find a valid key using brute force with a real random pad, you don't know whether you found the right key. You can find a key for any plain text of the same length, and the key will never make sense. In your case the text will make sense, making it obvious you got the right key.
So you don't know you "got the winning ticket" because the key is completely meaningless. In your algorithm you can be fairly certain you got the right key as only a few keys unlock a box. In the original situation, there are a whole lot of keys that unlock a box and you can't know you unlocked the right box as you don't know what's inside.
Laserlight, however, is afraid that the ancient space faring civilization that invented humans may have been controlling the evolution of language in such a way that there are secret satanic patterns which will emerge when we statistically analyse the result of the conversion. ;)
Another possibility: MK27, in a naive quest for a source of "random noise", has hit upon what could be proof of God's hand at play in the works of man. So this may only be true of "God's chosen languages" and not just any particular one.
EVOEx: when you talk about cryptanalysis, are you talking about the use of MK27's proposed RNG to generate a keystream, and then cryptanalysing the resulting ciphertext to reason whether the keystream constitutes a one time pad?
I'm talking about brute forcing a text that generates a keystream to decode a captured message. If both the initial text used to get the random numbers and the decryption using this key make sense, you know you've broken the encryption.
Which, of course, is impossible for actual random pads.
Alright, here's what I see as your point, which at first glance is sort of feasible. This is the scrambled text, byte values for a 5 letter word:
41 211 90 7 146
All you have to do is find a 40 character "sensible" text that could generate a 40 bit key according to the known algorithm, %2, whereby that key will produce a five letter word and bingo -- that's gotta be it.
Wrong.
If you have a key that generates a meaningful message, it would be easy to take that bit pattern and generate a corresponding text for it (you have 13 choices each time -- actually with punctuation and capitals, let's say you have 30 choices each time). Chances are you could do that with a message of any length. Going back to the five letter word, using modulus, how many 40 character phrases could you come up with that would produce the "key" to produce a word from 5 scrambled bytes? The same number of phrases as there are possible keys, and most likely even more than that, since just one such bit sequence could be produced by a whole bunch of possible word combinations.
So it will not help -- very likely all 2^(8n) possible keys could be turned into meaningful key generating texts. This is because of the "irreversible reduction" involved.
What's more, using text was only a suggestion -- the algo would work with garbage data too.
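(To put a number on it -- my arithmetic, not in the original post: each key bit can be produced by any of 13 odd-valued or 13 even-valued lowercase letters, so every one of the 2^(8n) possible keys corresponds to exactly 13^(8n) lowercase key-generating texts. Knowing that the key came from text therefore narrows nothing down.)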
It is not cheating if you can determine that certain sets of text are much more likely to be used than others. It is only a one time pad if the pad is used once.Quote:
Originally Posted by MK27
The pad would only be used once.* One of the "statistics" thrown about Queens, where I live, is that there are more than 200 languages actively used by the residents. I presume these all have written forms and potentially ascii-ish forms too (if not, it could be accommodated); maybe we can rule out the ones that use pictograms.
That's a lot of available text. I don't have to read or understand it -- my suggestion about using it was just to save time typing, etc. Plus, you can now take your (effectively) randomly chosen chunk of text and randomly choose a language to google translate it into. That will be even better, since the key text is less likely to make sense.
* someone else could by coincidence use the same one, but the same is true no matter how you generate it (it could have been used before).
Unicode.Quote:
Originally Posted by MK27
The problem is that the sets of text are not all equally likely to be chosen by a given user. Recall why brute force can be effective in practice against hashed passwords (where a salt is not used): users tend to choose weak passwords for convenience, and these passwords are thus susceptible to a dictionary attack.Quote:
Originally Posted by MK27
Now, in practice these texts are more like passphrases rather than passwords (though the distinction is largely just a matter of style, in my opinion), thus brute force is likely to fail anyway. Yet, just as users reuse passwords, they may reuse source text. Training them not to do so is easier said than done; even trained spies blunder.
Actually, very few combinations of letters actually form something meaningful. Especially if the texts become longer. Try it; type 10 random characters and see how many make sense. The longer the text, the fewer combinations will make sense.
Especially if it takes context into account.
Given a long text, how many combinations make sense? Not a lot. And in how many cases will both the decrypted and the input text makes sense? My bet: 1.
Arrgggh -- EVOEx, are you the kind of person that when you reach a T junction with this sign, you habitually turn left and are baffled by the oncoming traffic? :p
You are not dealing with 10 random characters, you are dealing with ten random bits:
1011101100
Each of which could be any one of thirteen letters (let's ditch punctuation and spaces) -- I think you will find there is more than one possibility for any bit sequence. So there will always be a "meaningful" correspondence. You cannot reduce the number of possible keys this way...you will cause an "accident".
I'm still not sure what there is to debate about MK's idea. It may not be your traditional approach to randomness. But I cannot see how it is any less efficient.
I think we are rapidly approaching a concept of randomness no one here is in any position to prove or disprove. Patterns inside a randomly chosen key do not remove the random nature of the key. Is a randomly picked phrase in War & Peace less random because we can actually read the text? Should a PRNG's previously calculated distribution be reviewed because we once drew 7 numbers and they came up as 1234567?
A pad's only requirement is that it be truly random. This means what it means. An effective key can indeed be Chapter II of War & Peace in the Ukrainian language and with gross translation errors, unless this choice can be traced. A system that renders text to a completely unrelated one-way binary representation is not better or worse.
The problem for anyone trying to break a one-time pad is not finding the key. It is finding the relevant original text among the near infinite number of possible matches.
No, this is wrong. You're relying on a human to be random. If humans were adequately random, brute force dictionary attacks would not work.
This is irrelevant. We're not concerned with finding the text used to produce the ten random bits. We're only concerned with finding the ten bits. In this case, see above.
Humans are not random.
He said randomly type on the keyboard.
Here's a challenge:
Type randomly on your keyboard for 30 seconds fast and furiously with all your fingers from both hands... Done?
Now, do it again. Wake me up in 20 or 30 years when you finally match the exact same sequence of characters.
Nope. You are interested in finding the solution to the cipher, not in finding the key. Or better: your only hope to break a one-time pad is trying to decipher the cipher, not trying to find the key.Quote:
This is irrelevant. We're not concerned with finding the text used to produce the ten random bits. We're only concerned with finding the ten bits. In this case, see above.
Wait, did you just say "use a human as a PRNG"? Humans are not random.
Fortunately, I don't have to match the exact same sequence of characters. I only have to match the last bit of each character.
Take a look at the ASCII keyboard under this scenario:Code:
1 1 1 0 0 1 1 1 1 0
1 1 0 0 1 0 0 1 0
0 0 1 0 0 0 1
I hope I don't "randomly" draw all my characters from the Q row. That's a lot of 1's. Using the Z row isn't a good idea either - lots of 0's.
The theoretical argument is simple; other people have been covering that. So, I wrote a quick program to compare a single "key" against my other random tries. So far, an average of 52 out of 59 bits are the same; my range of "same bits" is 50-53. This is with me furiously pounding keys on the keyboard. Of course, this is a small sample but I think it illustrates the point. Humans are not random.
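(Roughly what my quick program did -- a simplified reconstruction, not the original code; the mash strings here are stand-ins for live keyboard input:)Code:
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* two "random" keyboard mashes; replace with your own attempts */
    const char *a = "asdkfjhaweifuhzxcvkjhsdfgkuhweriuyxcmvnbq";
    const char *b = "qwpeorizxnvmbsdkfjghwoeiruytmznxbcvlaksjd";
    size_t i, n, same = 0;

    n = strlen(a) < strlen(b) ? strlen(a) : strlen(b);
    for (i = 0; i < n; i++)
        if ((a[i] & 1) == (b[i] & 1))   /* compare the low (key) bits */
            same++;
    printf("%u of %u bits the same\n", (unsigned)same, (unsigned)n);
    return 0;
}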
Really? Prove it. If you can, you just did science a big favor and answered one of the fundamental questions of humanity. The Nobel Prize will be too little for you. Meanwhile, you may as well have proven we live in an entirely deterministic universe and that the quest for randomness is forever a wild goose chase.
Erm... I don't think you understood MK's algorithm.Quote:
Fortunately, I don't have to match the exact same sequence of characters. I only have to match the last bit of each character.
Take a look at the ASCII keyboard under this scenario:
Code:
1 1 1 0 0 1 1 1 1 0
1 1 0 0 1 0 0 1 0
0 0 1 0 0 0 1
It seems to be emphatically stated many times.
Indeed, I doubt even MK27 understands his algorithm, considering in one place he says "random text" and yet:
No. Not that type of understanding. But what his algorithm means.
Again, what are the only two requirements for the pad? Be the same size or larger than the cipher and be entirely random. Exactly how do you plan to convince me a pad made up of a random set of bits is not going to serve our purposes?
You can argue it is, because those bits are being constructed from a binary alphabet representation of well-known characters? More, you can then follow on with "and the pad represents some unknown text". So what you have, in fact, is a pad that is itself a cipher showing 1s and 0s created from some unknown text that may have been randomly generated, or not.
So how do you propose to solve the cipher that will give you the correct pad?
Edit: Oh, and remember: that keyboard you drew... that's the key. But you don't know the key. So you can't draw it as you did. So, again, how do you plan to decipher the pad?
Ha, I had quite a post written up before I noticed the bug in my experimentation code. That changed my results to be significantly more random. Still, I'll try to explain my thoughts.
I am disputing that this method of generating a pad is random.
First, to my understanding, the keyboard is not the key. An attacker would know the complete details of the algorithm, including how text is reduced to binary. The only secret would be the binary representation of the text provided by the user.
What bothers me about this method of generating a pad is (as EVOEx pointed out) statistical distributions. In a perfectly random environment, the odds of encountering a 0 or a 1 should be 50%. In the case where a user enters plaintext (passages out of a book, etc.), this does not happen. For example, a user has about a 47% chance to enter a character mapping to a 1. Not hugely bad, but still non-random. I'd love to analyze the chart for second-order groupings (if my first character is a 1, how likely is it that the second character is a 1?), but I don't feel like doing it by hand. My source is Statistical Distributions of English Text.
In the case where a user 'bangs on the keyboard', I've got to say that I'm not convinced. In my 'random' strings, I had about a 57% chance to enter a 0. In my binary strings, I had the greatest chance (about 30%) to have a "00", while the least chance (about 13%) to have a "11". Perhaps a different keyboard mapping would improve this.
Of course, that's sort of my point. Let's say I knew that a 'random' string had a given distribution. For sake of example, let's say it has the distribution I found above. If I simply use a key of all 0's, I've already got more than half the bits correct. Of course, this wouldn't produce anything from the encrypted message, so unless I was already able to do some guessing, that's not good enough. If I use all the pieces of the distribution, surely I can do a bit better than "all 0's". Whether or not it's enough better, I'm not sure.
Yes, but I think you are failing to see one fundamental aspect:
A pad, to be effective, is to be used only once. So, there's really no "For example, a user has about a 47% chance to enter a character mapping to a 1" (sic). That's what you get with one key. But it may be something else entirely with another key. That is, the keyboard mapping you did is only valid once. It will be something entirely different the next time you encrypt something.
Ultimately, you can generate the 1s and 0s from an entirely random set of alphanumeric characters, which essentially means you cannot ever decrypt the pad, given that, in fact, you don't even have a cipher anymore. Since this algorithm destroys information that can never be recovered, you can no longer reverse the cipher even with the algorithm and the key.
I think that MK's suggestion has a real application in its ability to reduce the physical size of the pad for very large plaintext targets. It concerns me, however, what kind of algorithm is meant to be run on this pad that can produce more than 2 possibilities for each character in the cipher. This is where I don't see how it can be of any use. But as for its ability to sustain random values, I don't question that.
Well, think of it this way: what will constitute the key you will try to find when decrypting his "cipher"? Is it not the relationship between characters and the bit value?Quote:
First, to my understanding, the keyboard is not the key. An attacker would know the complete details of the algorithm, including how text is reduced to binary.
Even if MK27's method is random and not vulnerable to frequency analysis, (I'm not going to argue whether it is or isn't) it seems highly impractical. To encrypt a 1 megabyte file, you would need 8 megabytes of text. How do you get that? (You would need to type the Bible twice)
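(The arithmetic behind that, as I figure it: 1 MB = 2^20 bytes ≈ 8.4 million bits, so about 8.4 million characters of key text; common estimates put the Bible at roughly four million characters, hence "twice".)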
If you were not aware, it is accepted in cryptography that human attempts at random keyboard input have non-random characteristics that make it unsuitable as a random source for cryptographic use.Quote:
Originally Posted by Mario F.
But the typical usage of such "random" input differs from what MK27 is suggesting, which is why I am open to the possibility that it may actually be a RNG that is suitable for cryptography, including use as a one time pad.
How do you know that the pad is "made up of random set of bits", with the keys generated in a uniform distribution? I suspect that it is, but I prefer to remain skeptical until more evidence in the form of statistical tests of randomness shows that the hypothesis is probably not flawed.Quote:
Originally Posted by Mario F.
Yes, I made a mistake here when talking about the probable tendency for users to reuse source text. This is not a good point, because users who might make the mistake of reusing source text might also make the general mistake of reusing keys for a one time pad, regardless of how the keys were obtained.Quote:
Originally Posted by Mario F.
However, my other point is still a concern, at least for a practical implementation: different users may have the tendency to select the same source text. This kind of bias becomes problematic when you want to use the RNG output for a keystream, especially as a one time pad.Quote:
Originally Posted by Mario F.
This is not a cipher. MK27's use of the term "one time pad" is rather unfortunate in this respect because it has the potential to confuse the cipher and the keystream. This is a RNG that utilises a hash function, and whose output might be suitable for use as a one time pad, i.e., as the keystream of a one time pad cipher.Quote:
Originally Posted by Mario F.
This might just be the key management problem of a one time pad, but yes, it is a practical implementation problem related to my concern of bias in source text selection.Quote:
Originally Posted by NeonBlack
The pad is the binary representation of the text, since this binary representation of the text is what will be used to XOR the message. The pad is not the mapping of the keyboard to ones or zeros. The mapping of the keyboard is part of the algorithm being used to produce the pad.
Consider this in relation to a real OTP. The pad is the pad; it has the keys to be used to encrypt the message. The method used to produce the pad only matters in that it produces completely random distributions.
So, if using plain-text to produce the pad, then yes, there will be statistical distribution problems. It doesn't matter if I'm writing an original composition or copying from War and Peace; English letter distribution is reasonably consistent, regardless of the source. If using flail-text, I can imagine that the distribution is skewed by striking keys towards the center of the input device far more often than striking keys towards the outside. Because of these two distribution problems, the pad generated by this method is not completely random.
My understanding of cryptography may be flawed, but I was under the impression that the user gets exactly one secret (excluding the message), with the attacker getting everything else. If the keyboard mapping is the secret you're choosing to keep, then the attacker gets the binary representation of the text. In that case, I don't need the keyboard mapping at all. I just use the binary representation and XOR the encrypted message, yielding the correct result.
I also think this is the case (in fact, I thought I read it in a thread not long ago), but I could not find a source. Can you cite one?
I did mention this in this thread earlier; 'The Code Book' mentions this when discussing random key generation.Quote:
I also think this is the case (in fact, I thought I read it in a thread not long ago), but I could not find a source. Can you cite one?
You know what you are? You are an "answer bot". You are the kind of dumb animal that goes straight to the last post in a thread, finds some phrase that makes sense to them, and starts spamming a lot of bland, obvious non-thoughts absorbed from reading some 101 level textbook. The goal seems to be to do as little thinking and as much regurgitating of decontextualized, inapplicable crap as possible.
If you had even glanced thru the thread, you would have noticed this point has already been raised and debated and that you have fundamentally misunderstood what is going on.
LEARN TO READ BEFORE YOU POST.
Also: in order to be truly random, a binary sequence DOES NOT need to consist of exactly 50% 0 and exactly 50% 1. That is not random, that is contrived.
This will be the only time I respond to you, since you appear to be unable to defend your idea beyond flaming.
I have already read the entire thread.
A random statistical distribution does not say that a single generated binary sequence will consist of exactly 50% zeros and ones. It says that over the long run, the generation of binary sequences will tend towards 50% zeros and ones.
If you want anyone to take you seriously, try responding to points raised rather than resorting to ad hominem attacks.
Thank you. I knew I wasn't going crazy. I just didn't find it when I looked back through the thread.
Here we go again -- why can you not understand the difference between the context to which this applies, and the context to which it does not?
I already explained how completely non-random human readable text (a paragraph, etc) could be used here as well. Since that is obviously true, how on earth do you believe that pseudo-randomesque input -- hitting the keyboard -- would not also do?
If you wanted to produce random patterns of characters, typing at the keyboard is not the best method. That is the sense in which "it is accepted in cryptography that human attempts at random keyboard input have non-random characteristics that make it unsuitable as a random source for cryptographic use". No one, including me, has claimed anything else.
But producing random patterns of characters is not what was being discussed. The reduction to a random binary pattern DOES NOT require random patterns of characters.
No, your problem is you want to quote "an authority" out of context because you cannot think for yourself properly.
It still bothers me, though. How can this be in any way applicable? How does someone intend to prove and properly quantify someone else's tendency for a given distribution when they are typing on a keyboard?Quote:
Originally Posted by pianorain
It does put a dent in the notion of randomness, I concede that (if this is indeed something that has been proved and not just academic discussion). But what to say of the fact that atomic decay itself has yet to be proved random?
If we are going beyond practical barriers to the generation of random numbers, insisting on some unattainable notion of randomness, then it becomes evident no matter what method you choose, it will never be truly random.
What use is this then? Insist on raising the bar and you get nothing.
EDIT:
More,
It may be true if the subject is typing mindlessly at the keyboard without a purpose. I still question how anyone can take advantage of that in order to come up with a practical cryptanalysis algorithm, but whatever. Now, what if the subject is told to type on the keyboard randomly and to try to keep a fair distribution? How can anyone maintain that the result has no practical randomness?
At the risk of being called an answer bot :) I don't really understand this part... Note that I'm not saying you are wrong, I just don't understand it.
With the algorithm you posted (character modulo 2) every character has a static mapping to either 1 or 0. So if the inputted character string isn't considered random then how can the outputted binary string be?
I actually question that quote from MK. Entirely. I submit it is as random as you can get. Which is pretty random. Given that not even you can repeat or predict the sequence of characters you are going to type.
Two problems:
- (1) Radioactive decay: It is predictable. So, by the requirements being defended on this thread, it does not constitute a good source for a RNG.
- (2) Throwing a bag of 1000 dice into the air: It follows Newtonian physics. Under the same conditions it can be duplicated. It does not constitute a good source for a RNG.
What's wrong with this picture? The fact that you guys are overestimating randomness at the expense of predictability. We have yet to prove randomness. We simply don't know it exists. But what we do know is our capacity to predict certain scenarios or not, and to duplicate them or not. And the two above are very hard to predict or duplicate. Incidentally, in lab conditions, the first one is conceivably a lot easier than the second one, and yet everyone insists that radioactive decay is the best method.
And as such, we have so far been achieving randomness supported by our inability to predict the result (example 1) or to create the conditions to duplicate the result (example 2). Now, I understand I'm not a scholar. I'm just the guy around the corner. But I dare you to try and prove to me that randomly typing on a keyboard is in any way less random than either of the two methods above. It is both unpredictable and has no duplicability. Not even by the system that generated it in the first place (you).
Note that no generator can be proved random or non-random. It can only be tested to suggest randomness. In this case, I'm suggesting that this method of producing random bits may fail a 'next-bit' test because of plain-text statistical distribution problems. Obviously my sample size of flail-text is tiny, but it also suggests distribution problems. Seeing as passing the 'next-bit' test is one of the requirements for a cryptographically secure PRNG, this is important. The only way to know definitively is to run the tests and see what happens.
Type randomly, but be fair? Of course I have no data, but I'd imagine that the human mind (whether consciously or not) will cling far heavier to the ordered concept of 'be fair' than it would to the chaotic concept of 'be random', so that result would likely be far more ordered than it should be to be sufficiently random.
I could see a possible scenario in which multiple people are typing at the same time, with characters being added to the key as they come in. This seems much more random because it adds the timing of each keystroke to the "randomness" equation.
Creating randomness from a non-random source is exactly what PRNG's attempt to do. A PRNG is totally deterministic if you know how it works and the internal state of the PRNG. It's just math. So, methods exist to create the appearance of randomness from a non-random source. Whether or not MK27's method is one of them is uncertain.
Because of the reduction. The reason text like this:Quote:
For our purposes, entropy is the information delivered within a stream of bits. It is defined as:
H = - sum_x p_x log_2( p_x )
where (x) is a possible value in a stream of values (e.g., in this example, a contiguous set of bits of some fixed size -- a byte, a word, 100 bytes, ...) and (p_x) is its probability of occurrence (from an infinite population of (x) values, not just a finite sample). Typically, as the values (x) over which entropy is computed increase in size, the entropy increases but not as rapidly as the size.
is not random input is because it is easily predictable -- it's English language text. If I convert it with something like this:Code:
for (i=0;i<strlen(str);i++) printf("%d", str[i]%2);
You get:Quote:
01001100010011110010001010110001010010110110001010 10100110010010100111010001010010011001010100110000 10101111000100011100001001010101000101101001111001 00101101001010011101000101110010100010000110101100 10010110011111101100100010101001111001010011010110 10010100101100001000010110000010100000101011010100 01010101010100111100101100001101001001010100101010 11001000010010111000100011001001010101110011000101 11001001100010010111000101010010110010001010110111 01010010101111010011010000101000101010101111100100 010011001010010110001011010
288 ones and 239 zeros. Now, it does not take too many brains here to realize that this is irreversible (that's part of what "reduction" means -- eg, converting Celsius to Fahrenheit is non-reductive, converting base 26 to base 2 is very reductive).
That means, while it is simple to produce the bitstream from the text, it is impossible to determine the text which produced the bitstream. The text is not random, but the bitstream produced is: it contains no recognizable patterns.* This is what I meant with the comparison to radioactive decay:
* and it is the bitstream which is being used in our cryptography algorithm, not the text -- just as it is not the fissile material that you would use in some other algorithm, it is output from a device. Considering what the device is for, that data is not at all random either: it is completely determined by what it is measuring/detecting. Without the causative event (which cannot be wholly reconstructed with only some portion of the data) -- the text, as it were -- the data (a product of reduction) appears random, that is, highly entropic and unpredictable.Quote:
To clarify: even if the entire sequence of data is based on a 100% predictable phenomenon, without the phenomenon, you could not say (based on a few seconds of decay data) what happened during the rest of the time because you do not have the matter (which decayed, producing the data) to examine. In the same way, given part of the bit sequence created from a text, you could not say what the rest of the text was, so you certainly could not say what the rest of the bit sequence was. The same few seconds of decay data could probably be produced by quite different decay events, just like the same sequence of bits could be produced by very different texts. The conversion is highly reductive and therefore irreversible.
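(As a sanity check -- my own arithmetic, applying the entropy definition quoted above to the single-bit stream with its 288 ones and 239 zeros; the result comes out near the 1 bit-per-bit maximum:)Code:
#include <stdio.h>
#include <math.h>

int main(void)
{
    double ones = 288.0, zeros = 239.0, n = ones + zeros;
    double p0 = zeros / n, p1 = ones / n;
    /* H = - sum_x p_x log_2(p_x), as defined in the quote above */
    double h = -(p0 * log2(p0) + p1 * log2(p1));

    printf("H = %.4f bits per bit\n", h);   /* about 0.994 */
    return 0;
}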
MK, at the time I was not referring to your example at all, I'm afraid; I was just saying that, in general, keyboard bashing ALONE is not considered random enough. It just so happened that you mentioned it in a PART of your idea. Bit lazy of me really not to point that out at the time.Quote:
Here we go again -- why can you not understand the difference between the context to which this applies, and the context to which it does not?
Nah, nah, nah. There's no place here for conjectures. You are cheating with that argument. Please check post #132. I give you the real challenge there. And even in the face of what you say here, you will still not be able to predict or duplicate the conditions. You achieved randomness. Dare to contradict me?
@MK27
I think I understand what you mean now, and your method does seem to be more secure.
For example: if you had access to parts of the output stream from a PRNG like rand() and knew the exact implementation, you could theoretically deduce what the seed was, and therefore get the rest of the keystream. But with your algorithm it would be impossible to figure out the entire key based on only having a part of it.
The input is a sequence of bytes. Once you can form numbers, you can apply statistical tests of randomness.Quote:
Originally Posted by Mario F.
It would probably be feasible to use frequency analysis, if you know the keyboard in use and have information on the characteristics of sequences generated by people "typing mindlessly at the keyboard without a purpose".Quote:
Originally Posted by Mario F.
I think that the problem is that it is easier said than done.Quote:
Originally Posted by Mario F.
Rather unfortunately, it seems that the books I have at hand mainly talk about PRNGs. Schneier does talk about RNGs, so I will quote him here along with two earlier points about PRNGs:
This property is puzzling; I am not sure what is the input that Schneier mentions here because I never considered the possibility of input to a RNG. I would have thought that the defining property is something like a notion of unpredictability that is stronger than being computationally infeasible to predict. (I concede that "cannot be reliably reproduced" is also a stronger notion than "computationally infeasible to predict", but it seems stronger than I expected.)Quote:
For our purposes, a sequence generator is pseudo-random if it has this property:
1. It looks random. This means that it passes all the statistical tests of randomness that we can find.
For a sequence to be cryptographically secure pseudo-random, it must also have this property:
2. It is unpredictable. It must be computationally infeasible to predict what the next random bit will be, given complete knowledge of the algorithm or hardware generating the sequence and all of the previous bits in the stream.
From our point of view a sequence generator is real random if it has this additional third property:
3. It cannot be reliably reproduced. If you run the sequence generator twice with the exact same input (at least as exact as humanly possible), you will get two completely unrelated random sequences.
The output of a generator satisfying these three properties will be good enough for a one time pad, key generation, and any other cryptographic applications that require a truly random sequence generator. The difficulty is in determining whether a sequence is really random. If I repeatedly encrypt a string with DES and a given key, I will get a nice, random-looking output; you won't be able to tell that it's non-random unless you rent time on the NSA's DES cracker.
I would not be so bold as to say that it is predictable when many of the experts in the field currently believe otherwise. But since you have a second example, you're still okay :)Quote:
Originally Posted by Mario F.
Okay.
If I had 1000 bytes, and it turned out there were exactly 4000 1's and 4000 0's, then you would take this as evidence that the data is random? No -- but if it were 47% one and 53% the other, then that would also mean it's not random? Your thinking is flawed here, because you fail to grasp the meaning of your own words (the bold part is correct).
I'm sure you could tweak Geiger input to yield more ones than zeros. Would that make it less random? If so, then the "truest" random setting would be when you tweak the machine to yield exactly the same number of each? No -- but it is a goal to aim for, because, obviously, ending up with all one or the other is undesirable.
The key here is "over the long run, the generation of binary sequences will tend towards 50% zeros and ones". Arriving at 47% based on Statistical Distributions of English Text is an example of that. It means that the larger your sample of text, the more likely you are to end up with 47%. You could have a small sample of text that is exactly 50% -- that does not mean it is a "more random sample". You would expect some subset of a text that is 47% to be 50% -- that reflects the tendency. If every subsample were exactly 50% (or 47%), you DEFINITELY would not have random data, you would have 1010101010.
If you want exactly 50% "in the long run", you could easily take the statistical distribution and alter the algorithm to aim at specific characters rather than just using modulus. It would be simple to achieve this. But why bother? 47% is far enough away from 100 and 0 -- it is evidence that, in general, using text as an input will tend toward 50% as the size of the input increases. It is the tendency toward 50% and away from 100 (or 0) that is conceptually important. You do not have to arrive at 50%. Indeed, doing that too easily might indicate a lack of randomness. The purpose of this is to avoid things which do the opposite: for example, if you have a sample of data that is 50%, but the more data you add the more it tends away from 50%, your method is not a good one, because you are more likely to get occasional samples that are 100% one way or the other. That is very unlikely with English language text.
You cannot examine a 2 digit sequence for statistical distribution as a randomness criterion, n.b.
This is what I meant by the mindless regurgitation of "textbook 101 principles". What I meant by "learning to think for yourself" is that you need to comprehend the proper use of those principles so that you can apply them intelligently, rather than like a piece of spam software.
Ah, infinite are the arguments of mages. Menezes, van Oorschot and Vanstone present a somewhat different definition of a RNG, in terms of a random bit generator:
You may also want to read section 2 of that chapter, which is available as a PDF document: Handbook of Applied Cryptography, Chapter 5Quote:
A random bit generator is a device or algorithm which outputs a sequence of statistically independent and unbiased binary digits.
Unpredictability is not mentioned in that definition itself, but it should be obvious that it is a requirement, though unfortunately this also means that it is not very well defined, at least in that chapter.
So any hypothesis is cheating? That must make it difficult to get anywhere. My conjecture is based on the assumption that humans act within reason. That, of course, is debatable, but it is supported in this field by the fact that we can publish lists of the most popular passwords. If humans were better at being random, this tiny dictionary of 500 plain-text words wouldn't be able to successfully attack 11% of the population.
Not contradict, merely suggest an alternative view.
The ability or inability to duplicate a result is not a requirement of a PRNG. For an example in the computer science field, consider the seed of a PRNG. If you seed a PRNG with the same number every time, you will generate the same numbers every time. This does not mean that the PRNG is broken; it's just being utilized poorly. That an event is repeatable does not factor either way into whether or not a PRNG is suitably random.
The lack of prediction, on the other hand, is a major requirement into determining the suitability of a PRNG. This is where my entire argument lies.
First, an explanation of the "next-bit" test. Straight from Wikipedia:Quote:
Given the first k bits of a random sequence, there is no polynomial-time algorithm that can predict the (k+1)th bit with probability of success better than 50%.
Why do I think simple typing on the keyboard fails this test? Because I can tell you the algorithm I think will predict the next bit (with greater than 50% success) given only a single bit.
To illustrate, I took four samples of text from this thread:
Quote:
01001100010011110010001010110001010010110110001010 10100110010010100111010001010010011001010100110000 10101111000100011100001001010101000101101001111001 00101101001010011101000101110010100010000110101100 10010110011111101100100010101001111001010011010110 10010100101100001000010110000010100000101011010100 01010101010100111100101100001101001001010100101010 11001000010010111000100011001001010101110011000101 11001001100010010111000101010010110010001010110111 01010010101111010011010000101000101010101111100100 010011001010010110001011010
Quote:
Note that no generator can be proved random or non-random. It can only be tested to suggest randomness. In this case, I'm suggesting that this method of producing random bits may fail a 'next-bit' test because of plain-text statistical distribution problems. Obviously my sample size of flail-text is tiny, but it also suggests distribution problems. Seeing as passing the 'next-bit' test is one of the requirements for a cryptographically secure PRNG, this is important. The only way to know definitively is to run the tests and see what happens.
Quote:
If you have a key that generates a meaningful message, it would be easy to take that bit pattern and generate a corresponding text for it (you have 13 choices each time -- actually with punctuation and capitals, lets say you have 30 choices each time). Chance are you could do that with a message of any length. Going back to the five letter word, using modulus how many 30 character phrases could you come up with that would produce the "key" to produce a word from 5 scrambled bytes? The same number of phrases as there are possible keys, and most likely, even more than that, since just one such bit sequence could be produced by a whole bunch of possible word combinations.
Quote:
The problem is that the sets of text are not all equally likely to be chosen by a given user. Recall why brute force can be effective in practice against hashed passwords (where a salt is not used): users tend to choose weak passwords for convenience, and these passwords are thus susceptible to a dictionary attack. Now, in practice these texts are more like passphrases rather than passwords (though the distinction is largely just a matter of style, in my opinion), thus brute force is likely to fail anyway. Yet, just as users reuse passwords, they may reuse source text. Training them not to do so is easier said than done; even trained spies blunder.
I then converted the text to binary using mod 2 arithmetic and found the distributions of two-character sequences.Quote:
00: 0.233396584440228
01: 0.311195445920304
10: 0.311195445920304
11: 0.142314990512334
00: 0.226691042047532
01: 0.297989031078611
10: 0.297989031078611
11: 0.175502742230347
00: 0.223042836041359
01: 0.280649926144756
10: 0.282127031019202
11: 0.212703101920236
00: 0.242748091603053
01: 0.274809160305344
10: 0.274809160305344
11: 0.206106870229008
I noticed that 01 and 10 are always higher than 00 or 11. (They're also always very close to having the same probability, though I wasn't sure what to do with this.) This suggests a simple algorithm for guessing the next bit with greater than 50% accuracy would be to simply guess the other bit: if you are given a 1, guess a 0 (and vice versa).
I then created four strings of flail-text (each 680 characters in length). Here were the distribution results:Quote:
00: 0.222058823529412
01: 0.260294117647059
10: 0.261764705882353
11: 0.254411764705882
00: 0.254411764705882
01: 0.239705882352941
10: 0.239705882352941
11: 0.264705882352941
00: 0.222058823529412
01: 0.282352941176471
10: 0.283823529411765
11: 0.210294117647059
00: 0.263235294117647
01: 0.245588235294118
10: 0.245588235294118
11: 0.244117647058824
This means that if on any given bit, I was given the bit and told to guess the next one, I would have been correct 51.7% of the time if given a 0, and I would have been correct 51.4% of the time if given a 1. In other words, it fails the "next-bit" test.
Obviously, my sample size is small. I understand that I'm not proving anything. I'm merely suggesting that this limited evidence suggests that statistical distribution is a problem for this method.
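(For reference, a minimal sketch of that guessing rule -- reconstructed, not my original test code, and the sample string is arbitrary; on plain English text the "guess the opposite bit" rule should land a little above 50%:)Code:
#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *text = "I then converted the text to binary using mod 2 "
                       "arithmetic and found the distributions.";
    size_t i, n = strlen(text), hits = 0;

    /* predict that bit k+1 is the complement of bit k */
    for (i = 0; i + 1 < n; i++)
        if ((text[i] & 1) != (text[i + 1] & 1))
            hits++;
    printf("%u of %u predictions correct\n",
           (unsigned)hits, (unsigned)(n - 1));
    return 0;
}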
Thanks for the references, laserlight.
I do think, however, they support my claim. Although I understand there can be a debate around "statistically independent and unbiased binary digits", exactly because, as you say, this is open to interpretation... or suggestive of a certain unmeasurable property.
One note about radioactive decay, though. I was under the impression that, given a large number of similar atoms, radioactive decay can theoretically, at least partially, be predicted. This is not to say it is not random. But exactly that it is as random as our lack of ability to predict or duplicate it. Which is why I find Bruce Schneier's description more complete.
Another note about RNG inputs. As I understand it, an RNG input is the combination of forces that activate and condition the system: the strength, vector, temperature, wind, rotation of the earth, atmospheric pressure, etc. that activate and condition the throwing of the dice, or the electrostatic and nuclear forces that activate and condition the radioactive decay.
I took a look at your attempt, and I cannot see how you proved you can predict the next character I'm going to type after this sequence "çlksjrpwqjrkasnf,ndb sd" with a probability of success better than 50%. (Note: I really typed that on the fly.)Quote:
Originally Posted by pianorain
Edit: Also note that by typing more characters (millions, billions, trillions of them) I'm not necessarily increasing your ability to predict the next character, any more than studying millions, billions, trillions of similar atoms would help predict a window in time for atomic decay, or tossing a die millions, billions, and trillions of times in a similar fashion would help predict the next number.
The prediction has to do with the proportion of atoms in the sample that will decay over time, but RNGs that use radioactive decay are usually based on measuring when the next atom decays.Quote:
Originally Posted by Mario F.
If you read Asimov's Foundation series, you might notice that he used this kind of concept in psychohistory: Seldon's science allows him to predict events of humanity for centuries to come, but tells him nothing about what the person next to him will do next.
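For what it's worth, hardware generators built on decay typically compare the lengths of successive inter-decay intervals; as I understand it, the HotBits generator works roughly this way. A toy C sketch, with rand() standing in for the Geiger counter since this is only an illustration:
Code:
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for a decay detector: time until the next event.
   A real generator would measure this in hardware. */
static double next_interval(void)
{
    return (rand() + 1.0) / ((double)RAND_MAX + 2.0);
}

int main(void)
{
    /* Compare pairs of successive intervals: a shorter-then-longer
       pair yields one bit value, longer-then-shorter the other,
       and equal pairs are discarded. The bulk decay rate (the
       half-life) is predictable; the timing of any single decay
       is not, and that is the part being harvested. */
    for (int bits = 0; bits < 64; ) {
        double t1 = next_interval();
        double t2 = next_interval();
        if (t1 == t2)
            continue;   /* discard ties */
        printf("%d", t1 < t2 ? 0 : 1);
        bits++;
    }
    printf("\n");
    return 0;
}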
51.4% may be more than 50%, but it sounds like another version of the same bad joke: a random sample must yield 50/50 to be random.
You miss the forest for the trees. You don't need to do any sort of analysis at all to get a 50% success rate if the data tends toward 50%: just guess the same thing every time. The next bit is either 0 or 1. The criterion is completely meaningless.
You already know the baseline here is 47%, which is why you can come up with a way "to guess" that will be correct slightly more than 50% of the time: just guess 1. If (because of some extremely silly misapplication of principles) you are determined to get 50%, then, like I said, adjust the algorithm to reflect English-language letter distribution. Then all your samples will be 50/50, and you will be unable to guess better than exactly 50%.
That will get your analysis more in line with what you want, but it does not take away from the fact that this is just a grotesque distortion of rational principles. If the reason you can get 51.4% is that the ultimate baseline is 47% (which it is), then you will never get a better rate than 53%, no matter how big the sample. And 53% will never enable you to break the cypher.
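A toy C sketch of that "adjust the algorithm" idea (the letter-frequency table and the greedy split are my own assumptions, not something proposed verbatim in this thread):
Code:
#include <stdio.h>

/* Approximate relative frequencies (%) of a..z in English text,
   from the usual published tables. */
static const double freq[26] = {
    8.2, 1.5, 2.8, 4.3, 12.7, 2.2, 2.0, 6.1, 7.0, 0.15, 0.77, 4.0,
    2.4, 6.7, 7.5, 1.9, 0.095, 6.0, 6.3, 9.1, 2.8, 0.98, 2.4, 0.15,
    2.0, 0.074
};

int main(void)
{
    int assign[26];
    double sum0 = 0.0, sum1 = 0.0;

    /* Greedy balancing: give each letter to whichever bit currently
       has the smaller total frequency, so that on English text the
       output tends toward 50/50 -- unlike the plain c % 2 rule. */
    for (int i = 0; i < 26; i++) {
        if (sum0 <= sum1) { assign[i] = 0; sum0 += freq[i]; }
        else              { assign[i] = 1; sum1 += freq[i]; }
    }

    for (int i = 0; i < 26; i++)
        printf("%c -> %d\n", 'a' + i, assign[i]);
    printf("P(0) ~ %.1f%%, P(1) ~ %.1f%%\n", sum0, sum1);
    return 0;
}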
Let me tell you what I believe the problem is here:
There's this vague notion, written by someone, that "randomly typing at a keyboard does not constitute a random event". You guys took it at face value and are stating it as law, trying to prove something that the author himself couldn't.
I suggest you stop.
For my own part, I am merely saying that historically cryptographers have not used simple random typing as a secure source of randomness. Whether through empirical evidence or theoretical study, it has been found that it is not a usable method. The fact that its use has been shunned in real-world applications, by people who genuinely need to guarantee security as far as they can, is evidence enough for me.Quote:
There's this vague notion, written by someone, that "randomly typing at a keyboard does not constitute a random event". You guys took it at face value and are stating it as law, trying to prove something that the author himself couldn't.
Or just quote a source. This paper was extremely interesting. Even after de-skewing, the typing test failed.
It's a good argument. But then,
It probably hasn't been used because it really isn't practical to type hundreds of random characters on a keyboard, and it does tend to look silly and rather unprofessional for a cryptanalyst to do it ;)
But more seriously (despite the above paragraph being more serious than I'm making it look), it says nothing about the randomness approach to a one time pad. Why? Because your only hope of finding the plaintext is to brute-force the cypher and hope for the best, not to study the properties of the pad.
It probably did. Just as this sequence I just generated from a Mersenne Twister may fail: 195433953634
That is a gross misrepresentation of what I am trying to say. If you actually read the material that I linked to, you would notice that it affirms my claim that even well-recognised sources of randomness must be tested to check that they are suitable for use as an RNG in cryptography. If they are not suitable as implemented, then it may be possible to account for bias, but failure to do so means that the resulting RNG's output cannot be used for a one time pad.Quote:
Originally Posted by Mario F.
But there's more: I am also concerned with a practical application of this RNG, given that it depends on source text that might have a tendency to be repeated. It would be more difficult to test for this, though, since it depends on the implementation and on patterns of human behaviour, should humans be involved in source text selection. But if humans are not involved, then how can this be correctly automated?
That, of course, is very true: ideally, a key generation method has to be easily replicated. But then your mad professor might be the Jackson Pollock of the cryptography world and get a kick out of such key generation ;)Quote:
It probably hasn't been used because it really isn't practical to type hundreds of random characters on a keyboard, and it does tend to look silly and rather unprofessional for a cryptanalyst to do it
This is a non-response. Either explain the faults of the experiment's procedure or accept its conclusions. Keep in mind: you cannot prove or disprove the randomness of any generator. You can only run tests that suggest that it was random or non-random in the past. In this case, random human typing was found to be non-random. I have seen no case in which a serious effort has been exerted that shows that random human typing is, in fact, random.
All that indicates is that the typing test was poorly designed (the "de-skewing" criterion is even worse). You could easily keep adjusting the mechanism until it passed the test, which implies to me that the person who wrote this paper is chasing their own tail.
It would be simple to modify the input algorithm to pass the de-skewing test as described (if two bits are the same, throw them away; if not, keep the first one). You don't even have to redo the test: just take the data and reassign the characters to yield a completely uniform distribution.
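For reference, the de-skewing rule just described fits in a few lines of C (the biased sample input is made up):
Code:
#include <stdio.h>
#include <string.h>

/* Von Neumann de-skewing: read bits in pairs, discard equal pairs,
   keep the first bit of each unequal pair. */
static void deskew(const char *bits)
{
    size_t len = strlen(bits);
    for (size_t i = 0; i + 1 < len; i += 2) {
        if (bits[i] != bits[i + 1])
            putchar(bits[i]);   /* "00" and "11" pairs are dropped */
    }
    putchar('\n');
}

int main(void)
{
    /* Heavily biased toward 1 going in; the surviving bits come
       out unbiased -- but only if the input bits are independent,
       which is the very property in dispute for typed text. */
    deskew("1101111011101111011101");
    return 0;
}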
Pure circular reasoning. It is all based on the idea that "a sequence of random bits is a sequence where all the bits are independent from one another and uniformly distributed". That is a bad definition.
That the bits are independent of one another is sufficient. If you want to test the algorithm, and not just a sample of data, you have to demonstrate that you can guess the next bit at a rate better than the distribution of the bit. The exact distribution is irrelevant.
You still need to think for yourself.
Actually, it wasn't directed at you, laserlight. Your argument stands firm, I believe. You question, but do not dismiss entirely. You suggest traditional methods to test this, and I accept that. But where are these tests, so we can put this debate to rest?
Instead, the statement was directed at attempts to dismiss my argument entirely. I find them slightly annoying because there's still no evidence either way. And given that we have historically based our RNGs on our ability to produce unpredictable and non-duplicable sequences, so far there's only a proposal.
Indeed, particularly because the researchers "assign either 1 or 0 to each key on the keyboard". This is a hash function that is pretty much equivalent to what MK27 proposed.Quote:
Originally Posted by pianorain
By the way:
Clearly, computer science students are good substitutes for monkeys :DQuote:
Ideally, we then open a text editor and get a monkey to hit the keyboard. In practice, we were forced to use computer science students instead.
The reason why even brute force fails against a one time pad is that the information contained in the key is no less than the information contained in the plaintext. But if the key comes from a source that is not sufficiently random, possibly due to bias, then the key would contain less information than you think it does, and thus the stream cipher would not be equivalent to a one time pad. So, it makes sense to study the properties of the pad if the randomness of the generator that produced the pad is suspect.Quote:
Originally Posted by Mario F.
Yes, that is an important consideration mentioned in the paper: "both the number of sequences and the number of bits per sequence are too low to be statistically significant".Quote:
Originally Posted by Mario F.
The Diehard tests are quite famous, and then I believe NIST also provides a test suite. The problem that I am facing right now is how to select source text. You need a great deal of it to be statistically significant.Quote:
Originally Posted by Mario F.
Yeah, and I have read the paper and hopefully made clear that, IMO, it is not a good paper -- simply quoting someone else's shoddy work does not legitimize either the work or the criticism it is meant to support.
The entire assessment was based not on predictability, but on whether a 50/50 distribution is produced. That's no good.
It also misses the point that over time, the keyboard test MUST tend toward 50%. If it does not, you have designed the test badly. It cannot be any other way. One test with a few students does not change that.
Let's say you have the porn.bmp file encrypted using 4 different one time pads. The first pad uses computer-generated random numbers. The second pad uses a true random number generator. The third pad uses the numbers you put into a file key.txt using 30-sided dice, so the numbers would only be from 1 to 30 per character. The fourth pad would use a sequential number generator (1 2 3 4 5 up to 255, a staircase pattern) as the control file or pad. Do you really think it matters what it was encrypted with, if a brute force on all the pads -- trying every possible key -- would open all the files?
Would you not recognize your own data -- the porn.bmp file -- in the above example? If it was your data or file, you would know when you have the correct key, after opening and looking through a LOT of data to get possible contents of the encrypted file. If you were not the one who encrypted the file with the one time pad, it could be verified later through torture or trickery or other means. The difference being that if it was yours, you would know right away which one is the correct data; otherwise, external verification after the fact (or not) would establish which is the correct data.
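For reference, here is a five-byte C sketch of the XOR arithmetic at stake (the ciphertext bytes are arbitrary): for any ciphertext and any same-length candidate plaintext, there is a key that produces it, which is why brute force yields every possible file of that length and recognition alone cannot certify a crack.
Code:
#include <stdio.h>

int main(void)
{
    /* ciphertext = plaintext XOR key, so key = ciphertext XOR
       plaintext. Any candidate plaintext is reachable. */
    const unsigned char ct[] = { 0x8f, 0x12, 0xa9, 0x4c, 0x3e };
    const char *guess1 = "hello";
    const char *guess2 = "porno";   /* any other 5-byte guess */

    for (int i = 0; i < 5; i++)
        printf("%02x ", (unsigned)(ct[i] ^ (unsigned char)guess1[i]));
    printf(" <- key that \"decrypts\" ct to \"hello\"\n");

    for (int i = 0; i < 5; i++)
        printf("%02x ", (unsigned)(ct[i] ^ (unsigned char)guess2[i]));
    printf(" <- key that \"decrypts\" ct to \"porno\"\n");
    return 0;
}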
That would make a good Twilight Zone or Outer Limits show: the porn generator, accidentally discovered by trying to decrypt a one time pad by brute force. But you know what would really happen in real life? Some corrupt district attorney would charge you with 9,999,999 counts of illegal porn, instead of just the one count for porn.bmp.
Do you think one time pads should have a master key, skeleton key, or back door in case you lose your key file or a hard drive crashes (say, key on one drive and file on another)? Some encryption algorithms do contain a master key.
Those of you who put up short encryptions: would you not recognize your own data if the encryption was brute forced, even if you lost your key?
Maybe, but it is a fact that certain characters are used more than others; therefore an inverse frequency analysis could be used.Quote:
No, frequency analysis is useless against a one time pad.
Example 2: a random running repeated-key encryption and a Vernam one time pad. If you do not have the key to either file at the beginning of the cryptanalysis, does it matter whether it is a one time pad or not? Only after you have the key does it matter, as to whether you can use the key again or not.
Back to brute forcing your own file: if you recognize your own data, then can you agree that a one time pad was cracked? Even if you have to add the stipulation that only someone in the "know" can say "yes, that is the data", or that it is verified through external verification after the fact by torture or trickery or other means? Or if the file is long enough to see what comes into focus (analogy), or the rest of the data makes sense?
Please do not say I am wrong. Please respect me for having different opinions or beliefs.
Only in theory.Quote:
theoretic perfect secrecy
Schrodinger's cat is dead -- you "geniuses" forgot to feed her. A biological entity can only live for three days without water. A biological entity can only live for three weeks without food. Oh gawd, did you even provide oxygen? A biological entity can only live for three minutes without oxygen. If so, then the "geniuses" are dead too, because of the radiation leak. "Geniuses"...Quote:
Schrodingers Kat
It is not like starving a computer of electricity to crack RSA.
laserlight, have you ever really used a real nitrocellulose one time pad?
It's okay, Mario. You can say, "pianorain is annoying me." I'm a big boy; I can take it. ;)
I apologize if I sound like I'm trying to dismiss your argument. I simply don't know of and have never heard of any source citing what you're saying. Meanwhile, I know I've heard of sources saying exactly the opposite: that random human typing is non-random. It's beyond me to try to test definitively either way. I'm just trying to present my thoughts. Once again, apologies for any disrespect to your argument.
Can I ask another question, then? Doesn't uniform distribution (and fair distribution, for that matter) mean equal probability? Natural methods of randomness have this property. Correct me if I am wrong, but you are saying that this simple method of yours is suitable for a one time pad. That can't be true if it has a bias toward specific bits, no matter how small.
I'm not saying it's completely non-random (and neither is pianorain), but on the assumption that I can believe the figures you and pianorain have posted, it's not going to be good enough for a one time pad, even if it manages to fool all the people here. I mean, the key has to be randomly picked for the OTP to work; that is precisely why the OTP works at all. Otherwise, the messages will be vulnerable to enemy cryptanalysts.