Thread: Make a hash table array of linked Lists, separate chaining technique.

  1. #1
    Registered User
    Join Date
    Jul 2013
    Posts
    10

    Make a hash table array of linked Lists, separate chaining technique.

    Hi guys I was hoping for some examples or a tutorial if possible on how to create a hash table array and use linked lists inside the elements.

    Thanks in advance!

  2. #2
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    So each element within the hash table array is some form of linked list structure? What value would a search of the hash table array be looking for if each element is a linked list structure?

  3. #3
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Welcome to the forum.

    You are having a problem with one of the hundreds of such examples on the net already? You need to Google it, and post your code and your problem, (if it should arise and you're unable to solve it).

    Then we can assist.

  4. #4
    Registered User
    Join Date
    Jul 2013
    Posts
    10
    Okay so this is my issue. I have a linked List structure like so:

    Code:
    struct listNode{
    
    	char *phone;
    	char *first;
    	char *last;
    	struct listNode * next;
    
    
    };
    
    
    typedef struct listNode List;
    Now I have already got a program that has taken in a file of contacts, loaded them into the linked list and sorted them.

    Now what I would like to do is create a hash Array for the phone numbers that uses separate chaining to deal with collisions. Separate chaining meaning that each element in the hashArray has a linkedlist in it (or pointed to it whatever) and I can add multiple nodes to each element of the hash Array if I get a collision.

    So I am looking to make a dynamically allocated hashArray that is only 2/3 the size of my linkedList. Can you guys help me with this please?

    Thanks!

  5. #5
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Quote Originally Posted by rcgldr View Post
    So each element within the hash table array is some form of linked list structure?
    I assume the OP is trying to find examples on how to implement separate chaining. [Ah yes, noted in a later post, after I started writing this one.]

    The implementation is straightforward. My preferred one is using a structure with a copy of the hash value (especially when the key is a string), so that values do not need to be rehashed when resizing the hash table. If the hash values are strings, I also include the length of the string, it comes in handy that often. In all, a typical implementation for hashed string might look like
    Code:
    struct hash_entry {
        struct hash_entry  *next;
        size_t              size;
        unsigned long       hash;
        char               *data;
    };
    
    typedef struct {
        size_t              size;   /* Number of entries in 'entry' array */
        size_t              used;   /* Number of hashed items */
        struct hash_entry **entry;
    } hash_table_t;
    If the hash table size is fixed, and at most one thread at a time removes entries from the table, you can use compiler-provided atomic built-ins for the operations to be lockless. (Multiple concurrent removes must be ordered, so usually you have a mutex that is only taken for removes.)

    For single-threaded applications, or when the hash table itself is protected by a mutex (but remember that means all multithreaded accesses will be sequential, no concurrency), you can use the used value to grow the table when necessary.

    The index in the hash table is always (hash % table->size). For ASCII data, a fast and simple hash function I like is DJB2 XOR-variant hash,
    Code:
    unsigned long hash(const void *const data, const size_t size)
    {
        const unsigned char *p = (const unsigned char *)data;
        const unsigned char *const q = (const unsigned char *)data + size;
        unsigned long h = 5381UL;
        while (p < q)
            h = (h * 33UL) ^ (unsigned long)(*(p++));
        return h;
    }
    There are many who complain about it, saying e.g. how one should hash four characters at a time. Sure, but strings are rarely aligned to suitable boundaries with zero paading, and on many architectures, unaligned accesses are slow. A hash that yields good results fast has to be more complicated. The above is simple and good and fast enough for many use cases.

    You do not need to save the hash value itself in the hash entry structure, but you can use it to make searches much faster in the cases when there are multiple elements in the chain. For example, ignoring the atomic accesses needed for multithreaded lockless operation, a find operation is just:
    Code:
    struct hash_entry *find_hashed(const hash_table_t *const table, const void *const data, const size_t size)
    {
        const unsigned long h = hash(data, size);
        struct hash_entry *entry = table->entry[h % table->size];
    
        while (entry)
            if (entry->hash != h || entry->size != size ||
                memcmp(entry->data, data, size))
                entry = entry->next;
            else
                return entry;
    
        return NULL;
    }
    DJB2 hash produces small hashes for very short strings, but if the table is relatively small, say 1000 entries, and the strings at least dozen characters (for 64-bit unsigned longs), then it is very unlikely you ever have to compare any non-matching strings, because most entries typically have an unique hash. (Their modulus to current hash table size just happens to be the same, that's why they're in the same hash table entry chain.)

    As you can see, the data key is first hashed, and the slot in the hash table probed. If the slot is not empty, it's treated as a chain. The chain is scanned, until an entry that matches the hash, size, and the data key is found. Remember that C uses short-circuit evaluation, so the memcmp() is only done if both hash and size already match.

    If you use a high load factor, i.e. relatively small table size compared to number of elements used (or if there are lots of strings of the same length that hash the same, and you for some reason cannot change the hash function), then you can sort the entries in the chain. The simplest way is to just insert new entries in the sorted position in the chain. However, normally this is not worth the effort, as most chains are at most a few entries long.

    If elements are not in sorted order, then I personally tend to just prepend new hash entries to their respective chains.

    Questions?

  6. #6
    Registered User
    Join Date
    Jul 2013
    Posts
    10
    Thank you for the detailed post! I will try to get mine working!

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. trying to make a hash table......trouble w/ memory?
    By cuizy in forum C Programming
    Replies: 3
    Last Post: 05-11-2009, 04:47 PM
  2. Array of Linked Lists - Hash table
    By FortinsM3 in forum C Programming
    Replies: 12
    Last Post: 02-25-2009, 10:20 PM
  3. How can I make a hash table for my sequences?
    By advancedk in forum C Programming
    Replies: 6
    Last Post: 08-01-2008, 12:25 PM
  4. Hash Table with Chaining
    By osxd00d in forum C Programming
    Replies: 12
    Last Post: 12-04-2007, 09:01 PM
  5. Replies: 5
    Last Post: 08-01-2007, 06:17 AM