If you presume the absolute stupidest case, that is a good point. In that case, nothing will work -- I am sure that given time I could demonstrate that /dev/random could generate the same dumb-ass sequence, and the same will be true of *any* random natural phenomenon.
The re-occurrence of 0011, non-aligned, in (e.g.) an 8000-bit sequence would not amount to a predictable pattern: there are only 16 possible 4-bit sequences, so you will find all of them plentifully. Please meditate on this truth before you make any more silly observations which do not apply. Unless you are looking for a bit pattern that could contain no 1's and no 0's (since a 1 more than once might count as "pattern") -- but good luck with that.
I presume this is a book about codes which has little or nothing to do with a computer. If you were paying attention, you would notice I was not talking about using this:
aiudnbqensddfdkncudiw834wmcs
as a character to character pad, which I am 100% certain is what your objection from this book concerns (if you had taken the time to explain yourself, you would have probably noticed this and not posted). I am talking about taking each key as either 0 or 1 -- so there can be no pattern
"asdf" or "lkjh" or "dumbass"
By doing that, you would turn 8000 keypresses into truly random natural data. Unless you intentionally type asdf over and over again, I guarantee no form of pattern analysis will find anything. The best idea would be to just type words (the space character is even, but all the vowels are odd). An average word is considered 5 characters, and even if you repeat the same word a lot a 5-bit sequence is not a significant binary pattern. A five digit sequence in decimal, hexadecimal, base 26, or base 256 will be a significant pattern because of the number of possibilities.
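The odd/even mapping described above can be sketched in C. This is only an illustration, assuming ASCII character codes; the function names parity_bit and text_to_bits are made up for this example and do not come from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map one character to a bit: 1 if its character code is odd, 0 if even.
   Assumes ASCII, where ' ' is 32 (even) and the vowels are all odd. */
int parity_bit(char c)
{
    return (unsigned char)c & 1;
}

/*
 * Write the parity bit of every character in text to out as '0'/'1'
 * characters and NUL-terminate the result.
 * out must have room for strlen(text) + 1 bytes.
 * Returns the number of bits written.
 */
size_t text_to_bits(const char *text, char *out)
{
    size_t n = 0;

    assert(text != NULL && out != NULL);
    for (; text[n] != '\0'; n++)
        out[n] = parity_bit(text[n]) ? '1' : '0';
    out[n] = '\0';
    return n;
}
```

Calling text_to_bits on a typed paragraph gives one bit per keypress. Whether the resulting stream is actually unpredictable is a separate question that this sketch does not answer.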
Last edited by MK27; 03-16-2010 at 09:09 AM.
C programming resources:
GNU C Function and Macro Index -- glibc reference manual
The C Book -- nice online learner guide
Current ISO draft standard
CCAN -- new CPAN like open source library repository
3 (different) GNU debugger tutorials: #1 -- #2 -- #3
cpwiki -- our wiki on sourceforge
I have not read the book, but a quick search of the Web shows that your presumption is unwarranted.
I think that's the problem: even when they try to be random, humans have a bad habit of being predictable. I am not sure if your suggestion really would avoid this problem, but your guarantee carries no weight when you have not yet performed any tests, studied the literature etc, to verify that your proposed method is actually reliable for its intended purpose.
Look up a C++ Reference and learn How To Ask Questions The Smart Way
Malarky.
Quote:
I think that's the problem: even when they try to be random, humans have a bad habit of being predictable. I am not sure if your suggestion really would avoid this problem, but your guarantee carries no weight when you have not yet performed any tests, studied the literature etc, to verify that your proposed method is actually reliable for its intended purpose.
You do not even have to try to be random. Let me repeat myself:
Originally Posted by MK27:
The best idea would be to just type words (the space character is even, but all the vowels are odd). An average word is considered 5 characters, and even if you repeat the same word a lot a 5-bit sequence is not a significant binary pattern. A five digit sequence in decimal, hexadecimal, base 26, or base 256 will be a significant pattern because of the number of possibilities.
Look at a paragraph of text as an odd-even (binary) sequence based on ascii values. Every character transformed to 0 or 1. You are now looking at truly random natural data.
I.e., data that is just as random as radioactive decay would be. The fact that it would be hard to avoid patterns in base 26 is irrelevant, so the base 26 patterns inherent in language and in how you type are also now irrelevant. You have a source of naturally random data in exactly the same sense as /dev/random or radioactive decay. This does not require analysis to prove -- you would have to prove that there is some pattern inherent in English (or whatever language) when you convert the characters to 0 or 1 using ASCII modulus, which would be very, very astonishing, and only a fool would think there is.
Last edited by MK27; 03-16-2010 at 09:21 AM.
Pardon?
How do you know that it is "truly random natural data"?
Let me be clear: I am not saying that your idea will not work. I am saying that I am not yet convinced that it is guaranteed to work. Have you actually checked how this fares with the various statistical tests for randomness?
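As one small example of the kind of check being asked about, here is a sketch of a balance (frequency) test on the parity bits of a text. It assumes ASCII input, and the function name parity_balance_chi2 is hypothetical. It covers only one simple property; passing it would not show that a source is random.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Chi-square statistic (1 degree of freedom) for the balance of ones
 * versus zeros in the parity-bit stream of text:
 *     chi2 = (ones - zeros)^2 / n
 * For unbiased independent bits, values above about 6.635 would be
 * expected in only about 1% of samples.
 * Returns 0.0 for an empty string.
 */
double parity_balance_chi2(const char *text)
{
    size_t ones = 0, n = 0;

    assert(text != NULL);
    for (; text[n] != '\0'; n++)
        ones += (unsigned char)text[n] & 1u;
    if (n == 0)
        return 0.0;

    double diff = (double)ones - (double)(n - ones);
    return diff * diff / (double)n;
}
```

A fuller evaluation would run a whole battery of tests (runs, serial correlation, and so on) over a large sample rather than relying on this single statistic.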
Quote:
Personally, I like the use of radioactive decay to generate random numbers.
It seems the only guaranteed way to do it is to turn to nature, yes.
Actually, /dev/random wouldn't fall for anything like that, because it uses a lot more information for randomness. Keystrokes, probably speed of keystrokes, probably the time the keys are pressed, mouse movement, probably also processor temperature and network traffic and network packet loss.
But your idea is flawed when using, for instance, the "asdf" string I mentioned. Also, in plain text, certain words or combinations of characters are more common than others. While the more characters you use the less noticeable this will be, I doubt this is a very good method.
Yes. Converting a paragraph of text to 1's and 0's, where each character is either one or the other, is truly natural random data. Here is the alphabet:
10101010101010101010101010
The alphabet is historical in origin -- it is not based on frequency of occurrence, or grouped any other way (since those origins predate contemporary languages). That is a COMPLETELY RANDOM NATURAL SEQUENCE OF 26 CHARACTERS. You will notice that patterns of any length from the alphabet -- even just two characters -- are unusual in natural language. Count the number on this page.
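The odd/even string for the alphabet can be generated rather than typed by hand. A minimal sketch, assuming ASCII (where 'a' through 'z' have consecutive codes starting at 97); the function name alphabet_parity is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Fill out with the parity ('1' for odd, '0' for even) of the character
 * codes of 'a'..'z', followed by a terminating NUL.
 * Assumes an ASCII-compatible character set with contiguous letters.
 */
void alphabet_parity(char out[27])
{
    assert(out != NULL);
    for (int i = 0; i < 26; i++)
        out[i] = (('a' + i) & 1) ? '1' : '0';
    out[26] = '\0';
}
```

Under ASCII this produces 26 strictly alternating digits starting with 1, since consecutive character codes alternate between odd and even.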
I am not saying this is "like naturally random data" or "as good as naturally random data"; I am saying this is naturally random data, just like radioactive decay (in this context) is naturally random data.
Asking me to prove that is like asking you to prove the same thing of radioactive decay. Considered one way, radioactive decay is not random at all -- it occurs at a fixed rate. So if I wanted to be silly and argue a false point, I could say that radioactive decay therefore cannot be considered random.
Last edited by MK27; 03-16-2010 at 09:34 AM.
My worry is that commonly used language structure itself may contain patterns that are present even when the text is converted to bits using your proposed method.
If you were not aware, even radioactive decay based RNGs are checked with various statistical tests for randomness, just in case. Just because it involves nature does not make it suitable for use as an RNG for cryptography.
That is a rather dumb idea coming from a bright person. Think harder about why that is logically impossible. I'll prime the pump a bit: try and explain how it could have such patterns in it.
It is easy to see how there are patterns of letters, and even meta-patterns of such patterns in phrases, etc, and it is exactly for that reason that there can be no patterns in the sequence reduced to binary. Logic. You could do this with a lot of natural phenomena -- and nb, I am again asserting that language amounts to a natural phenomenon (like radiation) and considered this way specific instances of it are reduced to true chaos -- they can no longer be meaningful, predictable, or patterned, because what made them meaningful/predictable/patterned (26 characters) has been reduced out. The resulting data is as random as random could be. The context which gave them pattern (the bigger picture) is out of focus, and you are examining the data on a level where it is now meaningless; you can turn a string of characters into a string of 1 and 0, but you will never be able to turn those ones and zeros back into the meaningful sequence of characters which, in fact, determined their order, because each element has 13 totally indeterminate possibilities. The exact same is true of radioactive decay data.
Given sufficient apparatus, context, and knowledge (knowledge and apparatus we probably do not possess, and it would be absurd) the decay you observe could easily be 100% predictable down to the individual atom. But without that, all you will see is random chaos. It also will not matter, because even given predictability, the event which generated the sequence (an actual decay event) cannot be "deduced" or assumed like: if I have a segment of the data, I can tell you the rest of it without witnessing the event. Nope! The matter is gone! The same thing applies here: given a segment of the 1 and 0 from a string, I can tell you the rest of the string. Nope! The "matter" is gone!
Last edited by MK27; 03-16-2010 at 10:01 AM.
Well, following this logic I could certainly achieve proper randomness by using a PRNG to, say, alter the position of characters in a text of my choosing in order to produce the pad.
Originally Posted by brewbuck:
Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.
@MK27:
Frequency analysis - Wikipedia, the free encyclopedia
Some letters just appear more often than others. And some letter combinations appear more often. So some bit combinations will probably appear more often.
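Whether some bit combinations really do appear more often can be checked empirically by counting them. The sketch below tallies every overlapping k-bit window in the odd/even stream of a text. It assumes ASCII input, and the function name count_bit_windows is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count overlapping k-bit windows in the parity-bit stream of text
 * (bit = character code modulo 2).
 * counts must have room for 1 << k entries; entry v receives the number
 * of windows whose bits, read left to right, form the binary number v.
 * Returns the total number of windows counted.
 */
size_t count_bit_windows(const char *text, unsigned k, size_t *counts)
{
    assert(text != NULL && counts != NULL && k >= 1 && k <= 15);

    size_t len = strlen(text), total = 0;
    unsigned mask = (1u << k) - 1u, window = 0;

    memset(counts, 0, ((size_t)1 << k) * sizeof counts[0]);
    for (size_t i = 0; i < len; i++) {
        /* Shift in the next parity bit and keep only the last k bits. */
        window = ((window << 1) | ((unsigned char)text[i] & 1u)) & mask;
        if (i + 1 >= k) {   /* the window is full from the k-th bit on */
            counts[window]++;
            total++;
        }
    }
    return total;
}
```

With k = 5 and a long sample of typed text, the 32 counts can be compared against total / 32. A large, consistent imbalance would support the frequency-analysis concern; roughly equal counts would not by themselves prove the stream is random.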
I don't see how that is following this logic at all -- you still have to protect the identity of the PRNG which will be orders of magnitude harder than
1) protecting the identity of a radioactive decay event (impossible to identify, unless you somehow have kept the event on tape somewhere)
2) protecting the identity of a paragraph of English (or whatever) text (also impossible to identify if you write the text then throw it away)
And please please please: if you are just reading this, do not jump in and repeat the criticisms already raised by rogster and laserlight; read the thread if you need to understand why they are invalid.
No. I'm sorry. But you are the one not noticing the relevance of my argument. I'm essentially doing the same thing as you are suggesting should be done. Take a closer look.
And because the secrecy of the key is maintained, I'm in fact being a lot more effective than you are. Your pad can be deduced. Mine can't so easily.
Last edited by Mario F.; 03-16-2010 at 10:45 AM.
CAN YOU READ DUDE? Stop being ridiculous. Honestly:
Originally Posted by MK27:
The best idea would be to just type words (the space character is even, but all the vowels are odd). An average word is considered 5 characters, and even if you repeat the same word a lot a 5-bit sequence is not a significant binary pattern. A five digit sequence in decimal, hexadecimal, base 26, or base 256 will be a significant pattern because of the number of possibilities.
I have repeated this three times now: a 4 or 5 digit sequence in binary IS NOT THE SAME as a 4 or 5 digit sequence from the alphabet.
With the alphabet, you have 26^5 possibilities, so the same pattern twice is unlikely and would constitute something unusual. With binary, you have 2^5 possibilities. Let me say this again a few more times in the hopes it sinks in:
The repetition of a 5 digit sequence in binary IS NOT a feasible pattern.
The repetition of a 5 digit sequence in binary IS NOT a feasible pattern.
The repetition of a 5 digit sequence in binary IS NOT a feasible pattern.
There are 32 possible 5-digit combinations of 1 and 0. There are more than 11 million possible 5 character combinations of abcdefghijklmnopqrstuvwxyz. Do you understand the difference and what that means?
In a 10000 digit sequence of binary data, all 32 possible combinations will repeat many times no matter how random it is, therefore: noticing the repetition of a 5-digit sequence will be pointless. It won't mean anything, it will happen often just by chance. In a 10000 character sequence of alphabetic data, having even one repetition 5 characters long is very unlikely if the data is random, therefore: noticing the same repetition will be very meaningful. This makes the analysis of "pattern" in the data very very different indeed.
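The difference in scale can be put into numbers. The sketch below computes how often one particular pattern is expected to appear in a sequence, under the assumption (mine, for the sake of the arithmetic) that every symbol is uniform and independent. The function name expected_occurrences is hypothetical.

```c
#include <assert.h>

/*
 * Expected number of occurrences of one specific pattern of length
 * pattern_len in a sequence of seq_len symbols, where each symbol is
 * drawn uniformly and independently from alphabet_size possibilities:
 *     (seq_len - pattern_len + 1) / alphabet_size^pattern_len
 */
double expected_occurrences(unsigned alphabet_size, unsigned pattern_len,
                            unsigned long seq_len)
{
    assert(alphabet_size >= 1 && pattern_len >= 1);
    if (seq_len < pattern_len)
        return 0.0;

    double combos = 1.0;    /* alphabet_size raised to pattern_len */
    for (unsigned i = 0; i < pattern_len; i++)
        combos *= (double)alphabet_size;

    return (double)(seq_len - pattern_len + 1) / combos;
}
```

For a 10000-symbol sequence this gives 9996 / 32, about 312 expected occurrences of any given 5-bit pattern, against 9996 / 11881376, under 0.001, for any given 5-letter pattern. Note that this counts one chosen pattern; it is not the chance that some pattern or other repeats somewhere in the sequence, and it says nothing about text whose symbols are not independent.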
That some words or letter combinations are more frequent is irrelevant. You would have to show me how "some combinations of 50 or 100 characters" are more frequent than others -- good luck.
Last edited by MK27; 03-16-2010 at 10:59 AM.