Thread: HashTable help?

  1. #16
    Technical Lead QuantumPete's Avatar
    Join Date
    Aug 2007
    Location
    London, UK
    Posts
    894
    Quote Originally Posted by -EquinoX- View Post
    so the key is the one that hash unique values??
    Ok, perhaps we should clean up the terminology. Say you want a hash-table of names->phone numbers. The name is your key, the phone number your data.
    You apply the hashing function to the key and you get your hashkey. The keys are all unique, the hashkeys aren't necessarily. That's why you get collisions. You use the hashkey as an index into your hashtable, where you proceed to store the key and value together. The hashkey is only ever used to access the hashtable.

    Hope that clears things up a bit.

    QuantumPete
    "No-one else has reported this problem, you're either crazy or a liar" - Dogbert Technical Support
    "Have you tried turning it off and on again?" - The IT Crowd

  2. #17
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    yes I got it now.. thank you!

  3. #18
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,195
    It doesnt necessarily have to be used as the index into the table, but they are used to speed up searching. The hashkey is generally much smaller that the key. Imagine you had a list of names, say the average name is 20 characters long, thats 160 bytes. So you make a hash of the 160 bytes into a 16 bit hashkey, now you can just compare the 16 bit values with the search parameters to find potential matches, which you can then verify by doing a full comparison of the 160+/- bits of actual data. It gets even more interestign when you have enormous databases (like google) and multi-level hashkeys running in a multiprocessor system. The speed increase can be huge vs physically comparing each database entry.

  4. #19
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    anyone care to answer my little question:

    okay I got it now.. this may be a simple question but is this function doing according to what it's suppose to do? I am just doing this as a helper function for the hash table later on:

    Code:
    /* Create a brand new copy of the given string on the heap */
    char* StringConstruct(const char *s)
    {
      char* result = (char*) malloc(strlen(s) + 1);
      strcpy(result, s);
      return result;
    }

  5. #20
    Technical Lead QuantumPete's Avatar
    Join Date
    Aug 2007
    Location
    London, UK
    Posts
    894
    Quote Originally Posted by -EquinoX- View Post
    is this function doing according to what it's suppose to do?
    As far as I can see, yes. Though you should check the return value from malloc to make sure its not 0 and don't cast the return value either.
    I'd also reccomend using strncpy, just because getting used to using it will make your code safer.

    QuantumPete
    "No-one else has reported this problem, you're either crazy or a liar" - Dogbert Technical Support
    "Have you tried turning it off and on again?" - The IT Crowd

  6. #21
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    what is strncpy different from strcpy?? why can't I case it, it has to return a char* so I have to cast it unless it's a void pointer..

    and I don't understand this method parameter:

    /* Take a string constructed with the above, and give it back to memory */
    void StringDestruct (char* s);

    do I just do a :

    free(s) here??

  7. #22
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    what is strncpy different from strcpy??
    it gets the buffer length and does not writes more characters than the buffer can accept. no need in your case - you already know that the buffer is big enough

    why can't I case it, it has to return a char*
    What do you mean?


    I have to cast it unless it's a void pointer..
    In C you do not need to cast a void pointer - have you read a FAQ?
    Do you compile as C or C++?

    do I just do a
    yes
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler

  8. #23
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by -EquinoX- View Post
    what is strncpy different from strcpy?? why can't I case it, it has to return a char* so I have to cast it unless it's a void pointer..
    But malloc() returns a void pointer, so it's fine to use that to assign to char *. [Unless you are using a C++ compiler to compile C-code, in which case you need the cast].

    and I don't understand this method parameter:

    /* Take a string constructed with the above, and give it back to memory */
    void StringDestruct (char* s);

    do I just do a :

    free(s) here??
    I would think so.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  9. #24
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    I am using C here not C++

  10. #25
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by -EquinoX- View Post
    I am using C here not C++
    So no cast will be fine - void * can be automatically converted to char *.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  11. #26
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    one more thing that I don't understand and this has to do with the hashing I guess, I am asked to define in the code if the code is going to be run in 64 bit machine or 32 bit machine as you can see below the hash value returns an unsigned int .. what's the effect of this on a 32 bit machine and 64 bit?

  12. #27
    Technical Lead QuantumPete's Avatar
    Join Date
    Aug 2007
    Location
    London, UK
    Posts
    894
    generally and int is defined to be of WORD length. Where WORD is the size of your databus. In this case either 32 or 64 bit. Hence on a 32 bit system int is likely 4 bytes, on a 64 bit system 8 bytes.

    QuantumPete
    "No-one else has reported this problem, you're either crazy or a liar" - Dogbert Technical Support
    "Have you tried turning it off and on again?" - The IT Crowd

  13. #28
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    so what will 4 byte and 8 byte effect on my code?

  14. #29
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Well, for one thing, an int is normally 32 bit on both 32- and 64-bit machines, so unless you are using a very unusual processor/OS combination, if you are using "int", it probably will not make ANY difference.

    If you use the type "long int", then that is more likely to be 64-bit, but far from guaranteed. [Linux for 64-bit processors use long as 64-bit, but Windows leaves even long int as 32-bit].

    The place where it almost always matters is pointers. A pointer in a 32-bit machines 32-bit and a pointer in a 64-bit machine is 64-bits.

    For most things, it makes no visible difference to you - if you add two 32-bit numbers into a long, you will only see a difference if the 33rd bit gets set by the addition - in a 32-bit machine, you will get a smaller number than you expect, whilst if long is 64-bit you will get a bigger number.

    If you use unsigned int for your hash-value, then you should be seeing no difference.

    But presumably, you do not use the entire 32 bits, as that would necessitate a table of 4G times the size of the hashtable entry - that would be rather huge [and require a 64-bit machine with plenty of memory to run well - at least if you have a good spread of hashvalues and large number of entries]. The normal method is to divide or mask the number to a suitable size, e.g. if your hashtable is 100 long, use hashvalue % 100 to "limit" the hashvalue index.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  15. #30
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    what do you guys think of this add function in the HashTable I created?

    Code:
    boolean HashTableInsert(HashTable *this, const char* key, const char* value)
    {
      list_t *new_list;
      list_t *current_list;
    
      int_u4 hashval = StringHash(key) % this->length;
      /* Attempt to allocate memory for list */
      if ((new_list = malloc(sizeof(list_t))) == NULL)
        outOfMemory(__LINE__, __FILE__, "struct node creation");
    
      /* Does item already exist? */
      if (HashTableContains(this, key)){
        char* val = HashTableLookUp(this, key);
        StringDestruct(val);
        val = StringConstruct(value);
        return false;
      }
      else{
      /* Insert into list */
        new_list->key = StringConstruct(key);
        new_list->value = StringConstruct(value);
        new_list->next = this->table[hashval];
        this->table[hashval] = new_list;
        return true;
      }
    }
    it basically just frees the old value and copies the new value in if the key already exists.. this is the part where I was sure to get.. otherwise it just adds in the key value to the hash table
    HashTableLookUp funtion here just looks up the given key and return a "reference" to the given value.
    Last edited by -EquinoX-; 03-28-2008 at 03:15 PM.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Hashtable v.s. Dictionary
    By George2 in forum C# Programming
    Replies: 1
    Last Post: 06-06-2008, 02:09 AM
  2. Missing Entries in Hashtable
    By wuzzo87 in forum C Programming
    Replies: 3
    Last Post: 05-13-2007, 12:46 AM
  3. Replies: 12
    Last Post: 10-22-2006, 08:37 PM
  4. game hashtable
    By disks86 in forum C++ Programming
    Replies: 1
    Last Post: 12-16-2005, 01:42 PM
  5. oh me oh my hash maps up the wazoo
    By DarkDays in forum C++ Programming
    Replies: 5
    Last Post: 11-30-2001, 12:54 PM