Hash table with C structures

**anya** · 02-04-2014

Hello everyone,

While I was reading about hash tables in all examples adding new element in the table is done with function with parametars key and value. In my example I have more than 2 parametars , for example I have structure with three parametars:

Code:

typdef struct {
unsigned short int id; // key
unsigned short int age;
char buffer[MAX_SIZE]  // value
}packet;

I'm receiving those kind of packets with udp recvfrom functions and I want to store packets one by one in hash table, to store the whole packet not just key and value. Also I want to be able to search the packet in hash table by id and to delete it. Is there any implementations alredy in C? If not please advice me how to do this.

Thx
Best regards

**migf1** · 02-04-2014

I am a bit confused. Aren't your hash-table elements defined as pointers to packet structs, with their id field playing the role of "value" when hasing them? (perhaps I have misinterpreted your question)

**anya** · 02-04-2014

Yes , the point is that I store key->value to hash table using some hashing function. But I have one more element in my structure, that is int age, and I'm confused how to store this value to hash table ?

**MutantJohn** · 02-04-2014

It doesn't matter how many struct members you have just so long as you continue to hash using a unique key.

**migf1** · 02-04-2014

Hmm, a typical hash-table with say linked-list chaining is an array of linked lists, with each linked-list consisting of colliding elements (I guess in your case, an element is a struct packet).

Then you use a hash function to map the key of your element (in your case, this translates to packet->id) into on of the array slots. If the array slot is not empty, it contains the head of a linked-list with elements whose key was already mapped to the same array-slot, so you just insert the new element to that linked-list.

When you need to retrieve an element (packet) for which you already know its key (packet->id), you call your hash-function passing to it the key of the desired element (packet->id). This gives you the array-slot that the desired element has been hashed into. You then traverse the linked-list of that array-slot until you find the node that contains the exact key (packet->id) you are looking for.

That's the general idea and it doesn't really care what values have been assigned to the inner fields of each element (well, except for the id of course).

Now, If I've got it right, you are receiving your packets with their inner fields already assigned their correct values (id included). So all you have to do is to hash each packet into your hash-table by using each packet->id as the hashing key (in other words, you pass the packet->id to your hash function in order to obtain at which array-slot that particular packet will be hashed into).

Accordingly, to retrieve an already hashed element with id say 4, you first obtain the array-slot it has been hashed into, and then you traverse the linked-list of that array-slot until you find the element whose id equals to 4.

Something like this...

Code:

int i = hash_function( 4, hashTable );
if ( !hashTable[i] ) {
    puts( "The requested packet has not been hashed" );
}
else {
    struct packet *node = llist_lookup( hashTable[i], 4 );
    if ( !node ) {
        puts( "The requested packet has not been hashed" );
    }
    else {
        printf( "The requested packet has been hashed in slot %i\n", i );
        printf( "age = %u, buffer = ", node->age );
        fflush( stdout );
        for (int i=0; i < MAX_SIZE; i++) {
            printf( "%x ", node->buffer[i] );
        }
        putchar( '\n' );
    }
}

**rcgldr** · 02-04-2014

Originally Posted by migf1

Hmm, a typical hash-table with say linked-list chaining is an array of linked lists.

Another option is for the hash table to be an array of pointers to structures, indexed by the hashed or rehashed structure id's. (The array is initialized to all NULL's). The issue with this scheme is that hash collisions can occur for different id's, so when looking for all instances of a specific id, the array needs to be probed until a NULL is found. The hash table array should be 43% or larger than the maximum number of structures (< 70% load factor) for reasonable collision overhead.

**migf1** · 02-05-2014

Originally Posted by rcgldr

Another option is for the hash table to be an array of pointers to structures, indexed by the hashed or rehashed structure id's. (The array is initialized to all NULL's). The issue with this scheme is that hash collisions can occur for different id's, so when looking for all instances of a specific id, the array needs to be probed until a NULL is found. The hash table array should be 43% or larger than the maximum number of structures (< 70% load factor) for reasonable collision overhead.

As far as the array implementation is concerned, I think we both said the same thing. An array of linked-lists is indeed an array of pointers to struct element, initiallized to all NULLs, no?

**anya** · 02-05-2014

As I will have always different id's I suposse that I don't need to have linked list because I will not have collissions ? So the better options will be an array of pointers to structures ? What would be the proper logic of hash function If I have always the same id's?

**migf1** · 02-05-2014

Originally Posted by anya

As I will have always different id's I suposse that I don't need to have linked list because I will not have collissions ? So the better options will be an array of pointers to structures ? What would be the proper logic of hash function If I have always the same id's?

You keep confusing me! What's the value range of your ids? If it is reasonably small, then yes I guess you could store just one packet per array-slot. But usually the data sets are unreasonably big, so we want to hash them down to a much smaller array, thus collisions are inevitable (for example, if you can have up to say 4,294,967,295 different ids -a 32bit unsigned entity- then perhaps it's not a good idea to keep an array of 4,294,967,295 whatever data-type in memory, where most of its slots will be unused).

**anya** · 02-05-2014

Sorry , hash tables are also confusing for me. The id's range won't be big,up to 200 ... but I need to use hash tables because at the end it will be application in real time and I need add/find operations in constant time ....So I don't need linked list? What should I use for this case

**laserlight** · 02-05-2014

Originally Posted by anya

The id's range won't be big,up to 200 ... but I need to use hash tables because at the end it will be application in real time and I need add/find operations in constant time

Hmm... but if the range of ids will be that small, and given that each id is unique, an array of 201 elements would effectively provide you with a perfect hash table where the perfect hash function is the id itself.

**migf1** · 02-05-2014

Exactly what laserlight just pointed out.

In your case it sounds like you don't even need a hash-function. Just allocate an array of 201 pointers to struct packet (so you can keep track of ids ranging from 0 to 200) and directly index it with your packet.id. If you don't care about memory you can even skip the pointers to struct packet part, and directly allocate an array of 200 struct packets (this will spare you the need for mallocing/freeing the hashed packets, at the cost of risking to preallocate much more memory than you may need).

**anya** · 02-05-2014

I was thinking to use uthash for that , it is well documented I think that fits in this description ....I just must find there how to change hash function because as you said the perfect hash is the id itself

**laserlight** · 02-05-2014

Originally Posted by anya

I just must find there how to change hash function because as you said the perfect hash is the id itself

Well, if you really want:

Code:

unsigned short int the_most_perfect_hash_function_ever(const packet *x)
{
    return x->id;
}

**anya** · 02-05-2014

Hi laserlight, thank you for the_most_perfect_hash_function_ever

But my thoughts was how to change hash_function in uthash header , but that is something different from this topic ...Thanks to all that helped me to figure out how hash works !

Thread: Hash table with C structures

Thread Tools

Search Thread

Display

Hash table with C structures

Similar Threads

hash table in c++

Hash table help

What is a Hash Table?

Hash Table

Hash table

Tags for this Thread