Thread: how to create hashes

  1. #1

    how to create hashes

    (I know this is programming related, but it's more of a concept then a direct programming question, so it didn't seem to fit anywhere else nicely....)


    I'd like to create a couple of my own custom hashes, basically just for the sheer fun/learning proccess of it....

    I don't need anything fancy like MD5, just a simple function that will take a string of characters and spit out a value between 0 and 255 (it needs to be only 1 byte big)....

    what's the proccess that I'd have to go through for this? basically all I'm doing is creating a value, in the end, from which the input string is uncalcuable, right? would something simple like below suffice?

    Code:
    unsigned char hash1(char pwd[])
    {
       int idx,total=0;
       for (idx=0; idx < strlen(pwd); idx++)
             total += pwd[idx] + (42 - idx);
       return(total % 256);
    }
    but with that you can kinda make loose guesses as to what the input string was because each character's value directly influences the resulting value.... so would something like below be a bit more, er, "random"?
    Code:
    unsigned char hash2(char pwd[])
    {
       unsigned int idx,total=1;
       for (idx=0; idx < strlen(pwd); idx++)
          total += (pwd[idx] >> (idx+1)) * (pwd[idx] << (idx+1));
       for (idx=0; idx < strlen(pwd); idx++)
          total *= pwd[idx] % (pwd[idx] >> (idx+1));
       return(total % 256);
    }
    am I running down the right road here, or just a complete, hopeless moron?

    any input or tutorials are the subject would be appreciated.....
    "Cryptographically secure linear feedback shift register based stream ciphers" -- a phrase that'll get any party started.

  2. #2
    Registered User sean345's Avatar
    Join Date
    Mar 2002
    Posts
    346
    I found http://www.gamedev.net/reference/art...rticle2066.asp to be helpful when I was first starting to learn about hashes.

    - Sean
    If cities were built like software is built, the first woodpecker to come along would level civilization.
    Black Frog Studios

  3. #3
    Anti-Poster
    Join Date
    Feb 2002
    Posts
    1,401
    It depends on what you want your hash to do. Most hashes are used to preserve data integrity. Basically, any change to the input should result in a different output, while two different inputs should not result in the same output.
    Code:
    unsigned char hash1(char pwd[])
    {
       int idx,total=0;
       for (idx=0; idx < strlen(pwd); idx++)
             total += pwd[idx] + (42 - idx);
       return(total % 256);
    }
    This kind of hash has the unfortunate property of commutivity. Try running the strings "welcome" and "leewcom" through it. They both yield the same value. (Not to mention that strlen is used in the for loop condition. You should assign the size to a temporary variable so that you don't recalc the length every loop.)
    Code:
    unsigned char hash2(char pwd[])
    {
       unsigned int idx,total=1;
       for (idx=0; idx < strlen(pwd); idx++)
          total += (pwd[idx] >> (idx+1)) * (pwd[idx] << (idx+1));
       for (idx=0; idx < strlen(pwd); idx++)
          total *= pwd[idx] % (pwd[idx] >> (idx+1));
       return(total % 256);
    }
    Interesting, but unusable. The % operator may (and does with an input of "welcome") hit a divide by 0.

    The overall goal of a hash is to create a number that depends on both the order and the value of the data passed in. This is most easily accomplished using bitwise operators. Consider the following "hash":
    Code:
    unsigned char hash3(char pwd[])
    {
    	unsigned int idx,total=0;
    	unsigned int size = strlen(pwd);
    	for (idx=0; idx < size; idx++)
    		total = (total << 1) ^ pwd[idx];
    	return total;
    }
    This is both order- and value-dependent. With inputs "Welcome" and "eWlcome", the output is 187 and 123. Note that with an 8-bit hash, there will be a significant number of collisions (two inputs that result in the same output).

    I recommend the following link for a more indepth discussion: http://www.geocities.com/SiliconVall...s/8659/crc.htm.
    If I did your homework for you, then you might pass your class without learning how to write a program like this. Then you might graduate and get your degree without learning how to write a program like this. You might become a professional programmer without knowing how to write a program like this. Someday you might work on a project with me without knowing how to write a program like this. Then I would have to do you serious bodily harm. - Jack Klein

  4. #4
    Thanx for the links, I'll be sure to read through them

    I hadn't thought about accounting for the order of the chars in the string influencing the outcome, thx for mentioning that

    and in what compiler does the '^' operator exist? I've always had to write my own function to do that....

    I know that with a mere 256 possibilities there going to be a heckofa lotta collisions, but my aim is to create about 5 different hashes that will be used, that way there will be 256^5 possible combinations of values from the 5 hashes, meaning that there should be very few collisions for an input pwd that would generate all 5 hashes perfectly, just like the actual origonal pwd..... btw, by doing that I wouldn't be comprimising the origonal string, would I? if they have 5 different hashes of the string, would that make it easier to try and guess what it was?
    Last edited by B-Con; 02-25-2005 at 12:33 AM.
    "Cryptographically secure linear feedback shift register based stream ciphers" -- a phrase that'll get any party started.

  5. #5
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    The ^ operator exists in every C compiler, but it's a bit-wise XOR, not a power operator.

    Oh, and don't put strlen in the loop condition. Take the value once and save it. We had an interesting discussion about this a while ago.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  6. #6
    where was the discussion? got a link, please?
    "Cryptographically secure linear feedback shift register based stream ciphers" -- a phrase that'll get any party started.

  7. #7
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  8. #8
    Anti-Poster
    Join Date
    Feb 2002
    Posts
    1,401
    Quote Originally Posted by B-Con
    btw, by doing that I wouldn't be comprimising the origonal string, would I? if they have 5 different hashes of the string, would that make it easier to try and guess what it was?
    Well, easier? I guess so. After all, the more data someone can get, the "easier" it is. However, a good hash is typically a one-way street. By a hash's very nature, you will lose a certain amount of data about the original string. Recreating that data is practically impossible.

    I suggest using a single 32-bit hash instead of 5 8-bit hashes. The page that I originally posted has some good information on hashes, along with some code that you can download and use.
    If I did your homework for you, then you might pass your class without learning how to write a program like this. Then you might graduate and get your degree without learning how to write a program like this. You might become a professional programmer without knowing how to write a program like this. Someday you might work on a project with me without knowing how to write a program like this. Then I would have to do you serious bodily harm. - Jack Klein

  9. #9
    I suggest using a single 32-bit hash instead of 5 8-bit hashes. The page that I originally posted has some good information on hashes, along with some code that you can download and use.
    I'm using it to offset bytes, so any value over 256 is pointless.... thus I chose to create several hashes.....
    "Cryptographically secure linear feedback shift register based stream ciphers" -- a phrase that'll get any party started.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Can't create child windows
    By OnionKnight in forum Windows Programming
    Replies: 4
    Last Post: 04-10-2011, 04:13 PM
  2. Cannot create shared memory
    By Phoenix_Rebirth in forum C Programming
    Replies: 3
    Last Post: 11-07-2008, 11:32 AM
  3. Create new combo boxes based on items selected in preview combo box
    By RealityFusion in forum Windows Programming
    Replies: 2
    Last Post: 01-10-2007, 09:50 AM
  4. Create a file from c++ program
    By Dan17 in forum C++ Programming
    Replies: 2
    Last Post: 05-08-2006, 04:25 PM
  5. How to Create reference to an array in C++
    By shiv_tech_quest in forum C++ Programming
    Replies: 2
    Last Post: 12-20-2002, 10:01 AM