Thread: typedef question

  1. #16
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    I understand that this will not be a sorted order and that's why I want to traverse the HashTable and put the elements into an array and then sort it using the build in qsort that C has in library.

    If you read my problem above, about optimization .. what would you suggest? I understand that using a binary tree would solve this, but inserting into a binary takes O(n log n) and I need something faster

  2. #17
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by tabstop View Post
    Unless your array of lists is somehow exponentially larger than it needs to be, traversal of a hash table is just as efficient of traversal of anything else; most every traversal is O(n) since you have to visit every item once.
    It doesn't actually have to be that much bigger than needed to start making a difference. I recently worked on a project (CornedBee's game of life challenge, actually), where I discovered that using a better hash function actually led to worse iteration performance.

    I could not believe it myself at first. Some investigation with a profiler revealed that the "smart" hash function was spreading things out better (as it was supposed to), but this caused performance to DECREASE when iterating over all the elements in the table, since so much time was spent looking in empty buckets. A good spread makes lookup/insertion more efficient but can actually hurt iteration. So there you go.

    This was a normal set of affairs, not a pathological case. So I reverted back to the stupid hash function which didn't spread things out very well, and got the performance back up.

  3. #18
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,334
    Quote Originally Posted by brewbuck View Post
    It doesn't actually have to be that much bigger than needed to start making a difference. I recently worked on a project (CornedBee's game of life challenge, actually), where I discovered that using a better hash function actually led to worse iteration performance.

    I could not believe it myself at first. Some investigation with a profiler revealed that the "smart" hash function was spreading things out better (as it was supposed to), but this caused performance to DECREASE when iterating over all the elements in the table, since so much time was spent looking in empty buckets. A good spread makes lookup/insertion more efficient but can actually hurt iteration. So there you go.

    This was a normal set of affairs, not a pathological case. So I reverted back to the stupid hash function which didn't spread things out very well, and got the performance back up.
    Hmmm. I can't see how spreading things out would cause more empty buckets? Or do you mean that empty buckets were so fast to search (i.e., you don't search them at all) that searching three longer lists was faster than searching twenty shorter lists?

  4. #19
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    tabstop can you answer my question?

  5. #20
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,334
    My suggestion would be, if you expect a large number of "Words" in your dictionary, and you have to at any point list a set of them in sorted order, to keep a bst around. If you're only going to do "this word was not found" and look up a single word at a time, a hashtable would be fine.

  6. #21
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    the thing is that after looking at single words and see it they exist or not and store them in a hash table and until that part it's find.. BUT I will have to write that hash table out into a file and it has to BE in a sorted order.. how can I do this with a hash table?

  7. #22
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,334
    ? Insertion into a binary tree is O(log n), or O(n) if you forget to scramble before you build the tree and end up with a linked list.

  8. #23
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    Average case is O(log2n).

    Worst case is O(n).

  9. #24
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,334
    Quote Originally Posted by -EquinoX- View Post
    the thing is that after looking at single words and see it they exist or not and store them in a hash table and until that part it's find.. BUT I will have to write that hash table out into a file and it has to BE in a sorted order.. how can I do this with a hash table?
    Mergesort? (IOW: sort each linked list, and then compare the top of each list, write out the first alphabetically, throw it away, and repeat.)

  10. #25
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    Quote Originally Posted by tabstop View Post
    Mergesort? (IOW: sort each linked list, and then compare the top of each list, write out the first alphabetically, throw it away, and repeat.)
    so I have to write a mergesort to do this? you mean compare each of the front of the list?

  11. #26
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,334
    Quote Originally Posted by -EquinoX- View Post
    Average case is O(log2n).

    Worst case is O(n).
    Right. So the question comes down to, when do you want to pay the piper? Insertion into a bst costs O(log n), perhaps O(n), but output in sorted order comes free with any O(n) traversal. Insertion into a hashtable is more or less free (O(1) certainly), but now you have (number-of-linked-lists + 1) sorting problems, and that will cost O(n log n) at the end of the day.

    Edit to add: So n*log(n) for the bst, provided you have even a little bit of randomness in your original word list, vs O(n log n) at the end for the hashtable, which looks more or less like a draw, unless you find yourself sorting often.
    Last edited by tabstop; 05-02-2008 at 05:09 PM.

  12. #27
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,334
    Quote Originally Posted by -EquinoX- View Post
    so I have to write a mergesort to do this? you mean compare each of the front of the list?
    Mergesort seems like the obvious choice to me. If your number of linked lists is a power of 2 (or even if it isn't), it will actually make life better to merge them two at a time in the "normal" way rather than try to merge however-many different lists into one.

  13. #28
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    the thing is that I have to have an iterator.. I have to have a next method and that will give me the next word in the sorted list

    so say my list is

    apple
    orange
    giraffe
    mail

    and I create an iterator and it goes apple, orange, etc...

    do you mean I do a merge sort and put it inside an array?

  14. #29
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,334
    Quote Originally Posted by -EquinoX- View Post
    the thing is that I have to have an iterator..
    Ook. Honestly, if it were me? I would add an extra spot in my struct Word or whatever it's called for a link to the next word in line. Read in all the words, put them in a bst, traverse the tree with an extra static pointer keeping track of the previous entry, and set a link from previous to next every step. You could do the same with a hashtable, of course; sort each linked list, and then set links across the lists in the mergesort style.

  15. #30
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    tabstop,

    I can't change this, I must have an iterator.. there are parts of the code that I can't change, it's like the interface..

    hmm..but would it be much faster if I am doing this in a Hash Table?

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Help with ZwQueryDirectoryFile for my Driver.
    By taDo in forum Windows Programming
    Replies: 9
    Last Post: 11-27-2008, 08:54 AM
  2. Error: redefinition of typedef xxxx
    By Rocha in forum C Programming
    Replies: 2
    Last Post: 11-24-2008, 09:19 AM
  3. Replies: 48
    Last Post: 09-26-2008, 03:45 AM
  4. Help...typedef struct...confusing...
    By darkchild in forum C Programming
    Replies: 1
    Last Post: 01-23-2007, 08:03 AM
  5. question about typedef
    By volk in forum C++ Programming
    Replies: 8
    Last Post: 05-30-2003, 10:53 PM