Thread: Hash table with C structures

  1. #1
    Registered User
    Join Date
    Jul 2012
    Posts
    45

    Cool Hash table with C structures

    Hello everyone,

    While I was reading about hash tables in all examples adding new element in the table is done with function with parametars key and value. In my example I have more than 2 parametars , for example I have structure with three parametars:
    Code:
    typdef struct {
    unsigned short int id; // key
    unsigned short int age;
    char buffer[MAX_SIZE]  // value
    }packet;
    I'm receiving those kind of packets with udp recvfrom functions and I want to store packets one by one in hash table, to store the whole packet not just key and value. Also I want to be able to search the packet in hash table by id and to delete it. Is there any implementations alredy in C? If not please advice me how to do this.

    Thx
    Best regards

  2. #2
    Registered User migf1's Avatar
    Join Date
    May 2013
    Location
    Athens, Greece
    Posts
    385
    I am a bit confused. Aren't your hash-table elements defined as pointers to packet structs, with their id field playing the role of "value" when hasing them? (perhaps I have misinterpreted your question)
    Last edited by migf1; 02-04-2014 at 09:00 AM.

  3. #3
    Registered User
    Join Date
    Jul 2012
    Posts
    45
    Yes , the point is that I store key->value to hash table using some hashing function. But I have one more element in my structure, that is int age, and I'm confused how to store this value to hash table ?

  4. #4
    Registered User MutantJohn's Avatar
    Join Date
    Feb 2013
    Posts
    2,665
    It doesn't matter how many struct members you have just so long as you continue to hash using a unique key.

  5. #5
    Registered User migf1's Avatar
    Join Date
    May 2013
    Location
    Athens, Greece
    Posts
    385
    Hmm, a typical hash-table with say linked-list chaining is an array of linked lists, with each linked-list consisting of colliding elements (I guess in your case, an element is a struct packet).

    Then you use a hash function to map the key of your element (in your case, this translates to packet->id) into on of the array slots. If the array slot is not empty, it contains the head of a linked-list with elements whose key was already mapped to the same array-slot, so you just insert the new element to that linked-list.

    When you need to retrieve an element (packet) for which you already know its key (packet->id), you call your hash-function passing to it the key of the desired element (packet->id). This gives you the array-slot that the desired element has been hashed into. You then traverse the linked-list of that array-slot until you find the node that contains the exact key (packet->id) you are looking for.

    That's the general idea and it doesn't really care what values have been assigned to the inner fields of each element (well, except for the id of course).

    Now, If I've got it right, you are receiving your packets with their inner fields already assigned their correct values (id included). So all you have to do is to hash each packet into your hash-table by using each packet->id as the hashing key (in other words, you pass the packet->id to your hash function in order to obtain at which array-slot that particular packet will be hashed into).

    Accordingly, to retrieve an already hashed element with id say 4, you first obtain the array-slot it has been hashed into, and then you traverse the linked-list of that array-slot until you find the element whose id equals to 4.

    Something like this...

    Code:
    int i = hash_function( 4, hashTable );
    if ( !hashTable[i] ) {
        puts( "The requested packet has not been hashed" );
    }
    else {
        struct packet *node = llist_lookup( hashTable[i], 4 );
        if ( !node ) {
            puts( "The requested packet has not been hashed" );
        }
        else {
            printf( "The requested packet has been hashed in slot %i\n", i );
            printf( "age = %u, buffer = ", node->age );
            fflush( stdout );
            for (int i=0; i < MAX_SIZE; i++) {
                printf( "%x ", node->buffer[i] );
            }
            putchar( '\n' );
        }
    }
    Last edited by migf1; 02-04-2014 at 10:42 AM. Reason: Changed buffer[i] to node->buffer[i]

  6. #6
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    Quote Originally Posted by migf1 View Post
    Hmm, a typical hash-table with say linked-list chaining is an array of linked lists.
    Another option is for the hash table to be an array of pointers to structures, indexed by the hashed or rehashed structure id's. (The array is initialized to all NULL's). The issue with this scheme is that hash collisions can occur for different id's, so when looking for all instances of a specific id, the array needs to be probed until a NULL is found. The hash table array should be 43% or larger than the maximum number of structures (< 70% load factor) for reasonable collision overhead.
    Last edited by rcgldr; 02-04-2014 at 11:45 AM.

  7. #7
    Registered User migf1's Avatar
    Join Date
    May 2013
    Location
    Athens, Greece
    Posts
    385
    Quote Originally Posted by rcgldr View Post
    Another option is for the hash table to be an array of pointers to structures, indexed by the hashed or rehashed structure id's. (The array is initialized to all NULL's). The issue with this scheme is that hash collisions can occur for different id's, so when looking for all instances of a specific id, the array needs to be probed until a NULL is found. The hash table array should be 43% or larger than the maximum number of structures (< 70% load factor) for reasonable collision overhead.
    As far as the array implementation is concerned, I think we both said the same thing. An array of linked-lists is indeed an array of pointers to struct element, initiallized to all NULLs, no?

  8. #8
    Registered User
    Join Date
    Jul 2012
    Posts
    45
    As I will have always different id's I suposse that I don't need to have linked list because I will not have collissions ? So the better options will be an array of pointers to structures ? What would be the proper logic of hash function If I have always the same id's?

  9. #9
    Registered User migf1's Avatar
    Join Date
    May 2013
    Location
    Athens, Greece
    Posts
    385
    Quote Originally Posted by anya View Post
    As I will have always different id's I suposse that I don't need to have linked list because I will not have collissions ? So the better options will be an array of pointers to structures ? What would be the proper logic of hash function If I have always the same id's?
    You keep confusing me! What's the value range of your ids? If it is reasonably small, then yes I guess you could store just one packet per array-slot. But usually the data sets are unreasonably big, so we want to hash them down to a much smaller array, thus collisions are inevitable (for example, if you can have up to say 4,294,967,295 different ids -a 32bit unsigned entity- then perhaps it's not a good idea to keep an array of 4,294,967,295 whatever data-type in memory, where most of its slots will be unused).
    Last edited by migf1; 02-05-2014 at 10:26 AM. Reason: typos

  10. #10
    Registered User
    Join Date
    Jul 2012
    Posts
    45
    Sorry , hash tables are also confusing for me. The id's range won't be big,up to 200 ... but I need to use hash tables because at the end it will be application in real time and I need add/find operations in constant time ....So I don't need linked list? What should I use for this case

  11. #11
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by anya
    The id's range won't be big,up to 200 ... but I need to use hash tables because at the end it will be application in real time and I need add/find operations in constant time
    Hmm... but if the range of ids will be that small, and given that each id is unique, an array of 201 elements would effectively provide you with a perfect hash table where the perfect hash function is the id itself.
    Last edited by laserlight; 02-05-2014 at 10:48 AM.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  12. #12
    Registered User migf1's Avatar
    Join Date
    May 2013
    Location
    Athens, Greece
    Posts
    385
    Exactly what laserlight just pointed out.

    In your case it sounds like you don't even need a hash-function. Just allocate an array of 201 pointers to struct packet (so you can keep track of ids ranging from 0 to 200) and directly index it with your packet.id. If you don't care about memory you can even skip the pointers to struct packet part, and directly allocate an array of 200 struct packets (this will spare you the need for mallocing/freeing the hashed packets, at the cost of risking to preallocate much more memory than you may need).

  13. #13
    Registered User
    Join Date
    Jul 2012
    Posts
    45
    I was thinking to use uthash for that , it is well documented I think that fits in this description ....I just must find there how to change hash function because as you said the perfect hash is the id itself

  14. #14
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by anya
    I just must find there how to change hash function because as you said the perfect hash is the id itself
    Well, if you really want:
    Code:
    unsigned short int the_most_perfect_hash_function_ever(const packet *x)
    {
        return x->id;
    }
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  15. #15
    Registered User
    Join Date
    Jul 2012
    Posts
    45
    Hi laserlight, thank you for the_most_perfect_hash_function_ever But my thoughts was how to change hash_function in uthash header , but that is something different from this topic ...Thanks to all that helped me to figure out how hash works !

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. hash table in c++
    By baxy in forum C++ Programming
    Replies: 7
    Last Post: 12-07-2012, 12:53 PM
  2. Hash table help
    By lemon-ice-tea in forum C++ Programming
    Replies: 1
    Last Post: 05-18-2011, 09:31 AM
  3. What is a Hash Table?
    By rip1968 in forum C++ Programming
    Replies: 3
    Last Post: 06-18-2002, 08:57 PM
  4. Hash Table
    By ihsir in forum C++ Programming
    Replies: 0
    Last Post: 04-13-2002, 12:08 AM
  5. Hash table
    By Unregistered in forum C Programming
    Replies: 3
    Last Post: 01-14-2002, 07:06 PM

Tags for this Thread