Thread: trouble with hash function implementing

  1. #1
    Registered User
    Join Date
    Jul 2012
    Posts
    87

    trouble with hash function implementing

    Hi, I'm trying to build a trivial hash function using division method with divisor as table size (e.g. key % size ) with separate chaining using doubly linked list. I have gone over the code over and over and cannot see why my logic to create an array of pointers to type hockey_player doesn't work. I have checked around, do I have to use a struct as container to hold my nodes? I don't see why my way doesn't work.

    Code:
    #include <iostream>
    
    using namespace std;
    
    struct hockey_player
    {
        hockey_player* next;
        hockey_player* back;
        int num;//NB: this is key (jersey number)
        string name;
        string team;
    };
    
    bool is_duplicate(hockey_league* bucket, int cur_bucket, int size, int num_, string name_, string team_)
    {
        if ( bucket->front == NULL )//if list is empty, of course not duplicate
        return false;
            
        for ( hockey_player* cur = bucket->front; cur != NULL; cur = cur->next )/**Else traverse bucket to find if itm to insert is a duplicate or not*/
         if ( cur->num == num_ && cur->name == name_ && cur->team == team_ )
         return true;
                
         return false;
    }
    
    int main()
    {
        hockey_player* p_1_front = NULL;//pointer to the items in each bucket (ie. the address in each array index to a list)
        hockey_player* p_2_front = NULL;
        hockey_player* p_3_front = NULL;
        hockey_player* p_4_front = NULL;
        hockey_player* p_5_front = NULL;
        hockey_player* p_6_front = NULL;
        hockey_player* p_7_front = NULL;
        hockey_player* p_8_front = NULL;
        hockey_player* p_9_front = NULL;
        hockey_player* p_10_front = NULL;
        hockey_player* p_11_front = NULL;
        hockey_player* p_12_front = NULL;
        hockey_player* p_13_front = NULL;
        hockey_player* p_14_front = NULL;
        hockey_player* p_15_front = NULL;
        
        hockey_league* bucket[15] = {p_1_front, p_2_front, p_3_front, p_4_front, p_5_front, p_6_front, p_7_front, p_8_front, p_9_front, p_10_front,
            p_11_front, p_12_front, p_13_front, p_14_front, p_15_front};
        
        int size_ = 15;
        
        hockey_player* newGuy = new hockey_player;
        newGuy->num = 99;
        newGuy->name = "Wayne Gretzky";
        newGuy->team = "EDM";
        newGuy->next = NULL;
        newGuy->back = NULL;
        
        int i = newGuy->num % size_;
        cout << i << endl;
        bucket[i]->front = newGuy;
    
      return 0;
    }
    I was trying to use a container to hold a front pointer in each bucket (index) to a list, but is it really necessary.
    I mean:
    Code:
    struct hockey_league
    {
        hockey_player* front;//to keep track of our list
    };
    Anyways, I just need help for why my bool is_duplicate function that seems so trivial doesn't work.

  2. #2
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    The first place I would look is the place where you insert nodes into a bucket. The easiest way for is_duplicate to break is for the linked lists to be broken.

  3. #3
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Unfortunately for you, lines 28 to 45 amount to just doing this:
    Code:
    hockey_league* bucket[15] = {};
    Both just initialise an array of 15 pointers to NULLs.

    Besides, when you find yourself using variables like that, you should be thinking of using an array - hooray!
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  4. #4
    Registered User
    Join Date
    Jul 2012
    Posts
    87
    I was aiming for that b/c I thought it was an array of pointers. So initially all are NULL, so then I would insert a new node into slot hash(key) where my key is for e.g. newGuy->num % size (trivial but just to get going w/ hash tables implementation). So since newGuy has key (i.e. newGuy->num) 99, it would be inserted into index: 9 so now the front is updated to reflect this:
    Code:
    bucket[i]->front = newGuy;
    And as I am typing this late at night, I realized if all are NULL, NULL->front doesn't make sense so I'll need to fix this up.

    Also, does it matter to have a container to hold each nodes so I mean:
    Code:
    struct hockey_player
    {
     hockey_player* next;
     hockey_player* back;
     int num;//the key 
     string team;
     string name;
    };
    
    struct hockey_league
    {
     struct hockey_player* front;
    };
    And also, I need to insert keys into all slots, right, not just use the key to computer the slot to insert to I mean.

  5. #5
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    >> And also, I need to insert keys into all slots, right, not just use the key to computer the slot to insert to I mean.

    Well, after you hash either to retrieve a node or to put something in the table you're supposed to use the linked list to store all of the values in the bucket. You will want to match up the key element with something in the list.

    A good example hash table is this one even though the hash is very poor. http://en.literateprograms.org/Hash_table_(C)

  6. #6
    Registered User
    Join Date
    Jul 2012
    Posts
    87
    Hi whiteflags, so I do have to have a struct to hold all the individual pointers to each bucket and related issue, is there something I'm doing wrong w/ passing the pointer of the array declared in main(called bucket) since the code seems so simple but it doesn't run.

    Code:
    bool is_duplicate(hockey_player* bucket, int cur_bucket, int size, int num_, string name_, string team_)
    {
        if ( bucket[cur_bucket] == NULL )//if list is empty, of course not duplicate
            return false;
            
        for ( hockey_player* cur = bucket[cur_bucket]; cur != NULL; cur =  cur->next )/**Else traverse bucket to find if itm to insert is a  duplicate or not*/
            if ( cur->num == num_ && cur->name == name_ && cur->team == team_ )
                return true;
                
        return false;
    }

  7. #7
    Registered User
    Join Date
    Jul 2012
    Posts
    87
    I put the same code in main and it works, could it do w/ pointers to pointers that I need to pass if I decide to not use another struct to hold the individual head pointers to each bucket's list. This is the first time I have implemented a hash table via separate chaining (with doubly linked lists I'm using). I understand pointers enough and linked lists, but what could I be doing wrong. The code below copied from body of my above bool is_duplicate function runs that's why I'm concerned it's the parameter I'm passing. I'm not sure...

    I had looked at the provided link http://en.literateprograms.org/Hash_...20use:mystrdup and it seems pretty straightforward, but it uses 2 structs, one as container to hold each individual node, I mean do I really need to have 2 structs.

    Another implementation that uses 2 structs but slightly different I found online was:
    Code:
    struct IntListNode {
        int element;
        IntListNode* next;
    };
    
    struct IntList {//NB: the author's use of this struct is really just storing head and tail pointers for a given list so is this struct really necessary?
        IntListNode* first;
        IntListNode* last;
    };
    
    IntList* createList() {
        IntList* list = new IntList;
        list->first = NULL;
        list->last  = NULL;
        return list;
    }
    Here is my code copied directly from my bool is_duplicate function. I hope someone can help me fig out what I'm doing wrong.
    Code:
    int main(int argc, char** argv)
    {
        hockey_player* p_1_front = NULL;//front pointer to the items in each bucket (ie. the address in each array index to a list)
        hockey_player* p_2_front = NULL;
        hockey_player* p_3_front = NULL;
        hockey_player* p_4_front = NULL;
        hockey_player* p_5_front = NULL;
        hockey_player* p_6_front = NULL;
        hockey_player* p_7_front = NULL;
        hockey_player* p_8_front = NULL;
        hockey_player* p_9_front = NULL;
        hockey_player* p_10_front = NULL;
        hockey_player* p_11_front = NULL;
        hockey_player* p_12_front = NULL;
        hockey_player* p_13_front = NULL;
        hockey_player* p_14_front = NULL;
        hockey_player* p_15_front = NULL;
        
        hockey_player* bucket[15] = {p_1_front, p_2_front};
        
        int size = 15;
        
        hockey_player* newGuy = new hockey_player;
        newGuy->num = 1;
        newGuy->name = "RNH";
        newGuy->team = "EDM";
        newGuy->back = NULL;
        newGuy->next = NULL;
        p_2_front = newGuy;
        
        int cur_bucket = newGuy->num % size;
        
        if ( bucket[cur_bucket] == NULL )//if list is empty, of course not duplicate
            cout << false << endl;
        //else    
        for ( hockey_player* cur = bucket[cur_bucket]; cur != NULL; cur = cur->next )/**Else traverse bucket to find if itm to insert is a duplicate or not*/
            if ( cur->num == newGuy->num && cur->name == newGuy->name && cur->team == newGuy->team )
                cout << true << endl;
    
        return 0;
    }

  8. #8
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    is there something I'm doing wrong w/ passing the pointer of the array declared in main(called bucket) since the code seems so simple but it doesn't run.
    Yes there is. It's as you figured out: a pointer to pointer will work. The reasoning, if you remember, an array decays to a pointer to the first element. The type of the first element is hockey_player* not hockey_player, so if we keep that in mind, it explains why it didn't work before.

    do I really need to have 2 structs.
    You do not need to use two structures when you implement a hash table. A hash table is just an array of nodes. There are some organizational benefits to having two structs though. In particular I think the literate programming example makes the most sense when you do that. Since the hash function is a member of the hash table struct, you won't accidentally use the wrong hash function on the wrong table, and the size of the table is also right there when you go to resize any tables.

  9. #9
    Registered User
    Join Date
    Jul 2012
    Posts
    87
    Apologize for late reply, but I tried passing a pointer to a pointer, but it still doesn't compile. here is my code snippet

    Code:
    //DLL nodes
    struct hockey_player
    {
     hockey_player* next;
     hockey_player* back;
     int num;//NB: this is key (jersey number)
     string name;
     string team;
    };
    //helper fcn to check if an itm to insert is unique (diff number(key) AND diff data)
    bool is_duplicate(hockey_player** bucket, int cur_bucket, int size, int num_, string name_, string team_)
    {
     if ( *bucket[cur_bucket] == NULL )//if list is empty, of course not duplicate
      return false;
      
     for ( hockey_player* cur = *bucket[cur_bucket]; cur != NULL; cur = cur->next )/**Else traverse bucket to find if itm to insert is a duplicate or not*/
      if ( cur->num == num_ && cur->name == name_ && cur->team == team_ )
       return true;
       
     return false;
    }

  10. #10
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    When I compiled your source, I #included <string> and added the "using namespace std;" statement. After that, compiling returned this:
    Code:
    main.cpp: In function 'bool is_duplicate(hockey_player**, int, int, int, std::string, std::string)':
    main.cpp:15: error: no match for 'operator==' in '* *(bucket + ((unsigned int)(((unsigned int)cur_bucket) * 4u))) == 0'
    main.cpp:18: error: cannot convert 'hockey_player' to 'hockey_player*' in initialization
    Process terminated with status 1 (0 minutes, 0 seconds)
    2 errors, 0 warnings
    The first error was on this line:
    Code:
    if ( *bucket[cur_bucket] == NULL )//if list is empty, of course not duplicate
    Now there should be a syntax error, because even if you don't understand the full error, it says that there is no equality operator to use in this line yet you should be able to compare pointers.

    The second error in this case is a big hint to what's going on in both cases. It is caused by this line, by the same expression:
    Code:
    for ( hockey_player* cur = *bucket[cur_bucket]; ...
    What's on the right hand side of the assignment is clearly not a pointer.

    Remember that using the subscript operator will dereference the pointer once. Using the subscript operator is correct here, but remember if the type is hockey_player** and you want to get hockey_player*, you only need to dereference once. Extra stars do not make pointers in expressions.
    Last edited by whiteflags; 07-21-2012 at 03:47 PM.

  11. #11
    Registered User
    Join Date
    Jul 2012
    Posts
    87
    You're a life saver, whiteflags. I've been tearing my hair out trying to figure that out, that passing by ptrs to ptrs and also forgetting that subscript notation is just syntactic sugar for pointer arithmetic.

  12. #12
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Just a reminder: you can pass references to pointers, you know...
    Best would be to use a container so you don't have to use pointers.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  13. #13
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    Quote Originally Posted by Elysia View Post
    Just a reminder: you can pass references to pointers, you know...
    Best would be to use a container so you don't have to use pointers.
    Actually, he can't use a pointer reference. An array of pointers can't be flattened to a pointer reference.

  14. #14
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    You can pass an array by reference, too, but I was merely throwing out the suggestion there, not necessarily implying that it can and should be used. Just a little FYI, if you will.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  15. #15
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    Quote Originally Posted by Elysia View Post
    You can pass an array by reference, too
    That is and always has been a horrible idea, as it forces the size of the array to be fixed and any code that uses such array references is forever incompatible with the STL, and several threads have been spent telling you such things.
    I was merely throwing out the suggestion there, not necessarily implying that it can and should be used. Just a little FYI, if you will.
    A lot of my bad advice starts with good intentions, too.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Hash function to hash key into geographic coordinate
    By dominic_tran201 in forum C++ Programming
    Replies: 2
    Last Post: 02-11-2011, 10:03 AM
  2. trying to make a hash table......trouble w/ memory?
    By cuizy in forum C Programming
    Replies: 3
    Last Post: 05-11-2009, 04:47 PM
  3. A little trouble implementing a blur filter
    By omishompi in forum C++ Programming
    Replies: 2
    Last Post: 04-16-2006, 08:57 AM
  4. Having trouble implementing enumeration.
    By RP319 in forum C++ Programming
    Replies: 8
    Last Post: 03-02-2005, 07:03 PM