1. Originally Posted by MK27
If you have the key, yeah, that is the whole point. But if you have the key you don't need to "break" the text which created the key -- you can skip straight to decrypting the message.
Now you just gave me exactly what I needed to prove that your encryption is weaker than using a proper one time pad.

Imagine the following algorithm:
For each possible string of characters (2^(8*n) possibilities, where n is the number of encrypted characters), apply your algorithm to convert this to a key. Then using this key, decrypt the text. If the decrypted text and the initial string of characters both were sensible in some way, we cracked the code.
Agreed that this would work? It's a slow algorithm, 2^(8*n) tries, and practically impossible. But not theoretically impossible.
I think you'll have to agree with this.

However, having a real one-time pad DOES NOT suffer from this flaw. It is theoretically, not just practically impossible to break. This is because you only have one string of characters to find out if it makes sense.

So, which part of this did you not agree with?

2. Originally Posted by MK27
you do not have to keep the algorithm secret
That the algorithm is assumed to be known to the adversary is one of the basic principles of cryptography.

Originally Posted by MK27
and you do not need an RNG
Isn't what you are proposing supposed to be a RNG?

Originally Posted by MK27
And you do not have to use real text, any set of character strings at all will do. Just make sure it is not related to the message text. And you will have a truly random key generator.
So how would you address the problem of key management, i.e., how should the sequence of characters be chosen and not reused? A fully automated approach does not seem feasible since it requires randomness to select a sequence, yet it is this very sequence that is supposed to provide the randomness. Relying on the human opens up the possibility of human related predictability, e.g., humans might tend to quote Shakespeare.

3. Originally Posted by laserlight
Isn't what you are proposing supposed to be a RNG?
Sort of -- but it uses a potentially very very long "seed", which makes it unbreakable (vs a RNG, which just produces LONG_INT_MAX possible sequences). It's also just one line of very simple code.

So how would you address the problem of key management, i.e., how should the sequence of characters be chosen and not reused?
Take some text, unrelated to the message, 8X the message length. Generate a key. Throw the key generating text away.

Relying on the human opens up the possibility of human related predictability, e.g., humans might tend to quote Shakespeare.
Yes, they might also do this "asdf asdf asdf" on purpose -- again, the stupid possible case is just that. It would be handy for key management in this sense -- you could just use a specific point in a specific available text (a certain bible edition, eg). As long as that is secret, there will be no guessing the key. You don't even need to keep the "actual" key, you just need to know those details.

That makes it better than radioactive decay data.

Originally Posted by EVOEx
Now you just gave me exactly what I needed to prove that your encryption is weaker than using a proper one time pad.
No, apparently I have given you just want you need to feed your delusions further

Imagine the following algorithm:
For each possible string of characters (2^(8*n) possibilities, where n is the number of encrypted characters), apply your algorithm to convert this to a key. Then using this key, decrypt the text. If the decrypted text and the initial string of characters both were sensible in some way, we cracked the code.
Agreed that this would work? It's a slow algorithm, 2^(8*n) tries.
Yes. It is more than slow, I am afraid. A 140 character twitter message would be 2^1120. You may think that is not such a big number, but you are wrong -- neither of us even knows a word describing this number, and probably you do not have any hardware or software available to you right now to tell you just how big a number that is.

Your grandchildren will be long dead before your (now antique) computer has finished calculating the possibilities. Then you will need several Earth planet's worth of staff to consider them. This is the reality of decimal numbers. If a container can hold any number of liters with 10 digits in it (a big container!), an 11 digit value (just one more digit!) might potentially require 10 such containers, and a 12 digit value (just 2 more numbers!) is guaranteed to do so.

4. Originally Posted by MK27
Yes. It is more than slow, I am afraid. A 140 character twitter message would be 2^1120. You may think that is not such a big number, but I am afraid it is way too big -- neither of us even knows a word describing this number, and probably you do not have any hardware or software available to you right now to tell you just how big a number that is.

Your grandchildren will be long dead before your (now antique) computer has finished calculating the possibilities. Then you will need several Earth planet's worth of staff to consider them.
I completely agree! Maybe the algorithm can be optimized some to be actually feasible. But let's say it can't. I'm fine with that.
However, there IS such an algorithm. While there IS no such algorithm for a proper one time pad. Meaning that even if your idea is strong (I'm not saying it is), it is NOT as strong as a real one time pad.

It's like saying 65536 bits RSA is as strong as 131072 bits RSA. We won't be able to crack either within the lifetime of the universe probably (using our current hardware), but that doesn't mean that one isn't stronger than the other.

So a real one time pad is still stronger than your idea.

Can you agree with this?

5. Originally Posted by MK27
vs a RNG, which just produces LONG_INT_MAX possible sequences
Sorry for the confusion: when I say RNG, I mean random number generator, not pseudo-random number generator (PRNG).

Originally Posted by MK27
Take some text, unrelated to the message, 8X the message length. Generate a key. Throw the key generating text away.
Unfortunately, that does not adequately answer the question because "take some text, unrelated to the message, 8X the message length" is not precisely specified; it seems to me that it is easier said than done if you really want randomness. This is related to:
Originally Posted by MK27
Yes, they might also do this "asdf asdf asdf" on purpose -- again, the stupid possible case is just that.

6. Originally Posted by laserlight
Unfortunately, that does not adequately answer the question because "take some text, unrelated to the message, 8X the message length" is not precisely specified; it seems to me that it is easier said than done if you really want randomness.
My whole point is that reducing a set of base 27 numbers down to binary will always produce a random result. Depending how you do it, eg with hex:

0-7 = 0
8-f = 1

If the numbers are sequential:

0 1 2 3 4 5 6 7 8 9 a b c d e f

You will get:

0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1

Which you can deduce something there if you know the algoritm (0-7 = 0, 8-f = 1). But because the "pattern" in sequential numbers has nothing to do with the "patterns" in meaningful language:

d e a d b e e f

1 1 1 1 1 1 1 1

Since you might have sequential numbers, the modulus method is better (because it also elimate the "pattern" in sequential numbers), you will just get stuff like

1 0 0 1 1 0 0 1

Now, since this all pertains to the key source text and not the message -- well, it is unbreakable, as I said.

7. Originally Posted by MK27
Now, since this all pertains to the key source text and not the message -- well, it is unbreakable, as I said.
I am still awaiting your reply to your reply to my previous post, as I consider it a key element to us getting to an understanding on this subject.

But this quote... Didn't you just agree with me that it was breakable? Albeit only theoretically and extremely difficult and slow?

8. Originally Posted by MK27
My whole point is that reducing a set of base 27 numbers down to binary will always produce a random result.
No, it does not: that mapping is deterministic; it is a hash function. The way I see it is that your point requires that 'the "pattern" in sequential numbers has nothing to do with the "patterns" in meaningful language'.

Originally Posted by MK27
Now, since this all pertains to the key source text and not the message -- well, it is unbreakable
Unless it is feasible to guess the key source text.

9. Originally Posted by EVOEx
I am still awaiting your reply to your reply to my previous post, as I consider it a key element to us getting to an understanding on this subject.
Did you mean this?

Originally Posted by EVOEx
Meaning that even if your idea is strong (I'm not saying it is), it is NOT as strong as a real one time pad.

So a real one time pad is still stronger than your idea.

Can you agree with this?
No, it is a one time pad. I know you said, "However, there IS such an algorithm" -- yes, in the same sense there is "an algorithm" to produce all the possible keys for a one time pad, and since this is a one time pad, I guess so. It is not "easier". How could it be? Because you can attempt to guess that text too -- which is eight times longer than the message?

But this quote... Didn't you just agree with me that it was breakable? Albeit only theoretically and extremely difficult and slow?
I agreed that the one time pad is limited to 2^(8n) possible keys, and since that set is finite you could generate them "theoretically" and it would be "extremely difficult and slow". The text that produced the key actually has 26^(8n) possibilities. The wrong tree to bark up, in other words.

But as a bunch of people tried quite a bit to demonstrate to kryptcat, that won't do you much good beyond getting a lot of practice counting things. To somewhat rephrase laserlight, if you buy all the tickets to a lottery, you can say in advance that you have won. But if the winning ticket remains a secret, and no can acknowledge which one it is, in reality you can never win until you choose a ticket. If the person who wrote the message is patient and honest, you could show them all the possibilities you have successfully generated in the past few centuries and they would have to admit, one of them is correct.

More realistically: having all the possibilities won't help. And there is NOTHING else you can do. That is why it is unbreakable. The key does not contain a pattern. The text that generated the key did -- but like the fissile material with the geiger counter, that is irreversibly gone, and it has nothing to do with the message. It is truly random natural data.

10. I lost track of what is being debated here. The basis of the OTP is that the quantity of information in the key is at least as large as the quantity of information in the plaintext. If that holds, then the cipher is unbreakable.

Note that "quantity of information" refers to the true information of the key -- if you start with some fixed number of bits (a random seed) and expand that using a PRNG into some larger number of bits, you have NOT increased the quantity of information by that much (it does increase a TINY amount because the details of the PRNG algorithm are encoded into its output sequence), and you certainly do not increase it arbitrarily.

Any method of generating the key bits that does not involve a truly random source, is not a true OTP.

11. Originally Posted by brewbuck
I lost track of what is being debated here. The basis of the OTP is that the quantity of information in the key is at least as large as the quantity of information in the plaintext. If that holds, then the cipher is unbreakable.
Any method of generating the key bits that does not involve a truly random source, is not a true OTP.
Well I fulfilled both criteria:

1) the key generated will be at least as large as the data.
2) the key was generated using a truly random source (to understand why and how this is so, read the thread*).

*paraphrased: human language texts considered as a modulus 2 of ascii values does not contain any pattern; it is a random sequence of 0 and 1.

12. Originally Posted by brewbuck
Any method of generating the key bits that does not involve a truly random source, is not a true OTP.
MK27 proposed a method of generating the key bits: given a sequence of characters as input, compress every character with a hash function such that it maps to a bit. He posits that the resulting bitstream is a truly random sequence suitable for use as a one time pad, even if the sequence of characters is say, the text of a novel.

13. Originally Posted by MK27
I agreed that the one time pad is limited to 2^(8n) possible keys, and since that set is finite you could generate them "theoretically" and it would be "extremely difficult and slow". The text that produced the key actually has 26^(8n) possibilities. The wrong tree to bark up, in other words.
The trick is that you can actually KNOW when you have the proper key as both the input text as the decrypted text make sense. If that is true, you can be fairly sure you have the right combination, even though it is fairly unlikely you will find this using this algorithm.

However, if you find the valid key using brute force with a real random pad, you don't know if you found a right key. You can find any key for any plain text of the same length, and the key will never make sense. In your case the text will make sense, making it obvious you got the right key.

So you don't know you "got the winning ticket" because the key is completely meaningless. In your algorithm you can be fairly certain you got the right key as only a few keys unlock a box. In the original situation, there are a whole lot of keys that unlock a box and you can't know you unlocked the right box as you don't know what's inside.

14. Originally Posted by laserlight
MK27 proposed a method of generating the key bits: given a sequence of characters as input, compress every character with a hash function such that it maps to a bit. He posits that the resulting bitstream is a truly random sequence suitable for use as a one time pad, even if the sequence of characters is say, the text of a novel.
Laserlight, however, is afraid that the ancient space faring civilization that invented humans may have been controlling the evolution of language in such a way that there are secret satanic patterns which will emerge when we statistically analyse the result of the conversion.

Another possibility: MK27, in a naive quest for a source of "random noise", has hit upon what could be proof of God's hand at play in the works of man. So this may only be true of "God's chosen languages" and not just any particular one.

15. EVOEx: when you talk about cryptanalysis, are you are talking about the use of MK27's proposed RNG to generate a keystream, and then to cryptanalyse the resulting ciphertext to reason if the keystream constitutes a one time pad?