Thread: Hashing Vs Binary Search Trees

  1. #1
    Registered User
    Join Date
    Jun 2010
    Posts
    103

    Hashing Vs Binary Search Trees

    Hi

    I need a suggestion from you guys regarding an implementation.

    I havea requirement of maintaining data as Key-Value pair.On searching, I found that it can be done with the help of either hashing or through binary Search tree.

    What I have found is that hashing use index based searching which is must faster than tree traversing.

    Could any one suggest me which will be better?


    Records count can be great number.

    Thanks in Advance.

    Thanks
    Nickman

  2. #2
    Registered User
    Join Date
    May 2010
    Location
    Naypyidaw
    Posts
    1,314
    Depends on what you're doing.
    Is ordering important?
    memory usage of hash table,collision, etc..

  3. #3
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Can you tolerate the possibility that searching will be slow? If you can, the hash table would likely be a better choice. If you cannot, the binary search tree would likely be a better choice.

    Must the data be sorted in order of key? If yes, the binary search tree would likely be a better choice.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  4. #4
    Registered User
    Join Date
    Jun 2010
    Posts
    103
    Thanks for reply...
    Didn't get what you mean by ordering.?

    While I want is speed,accurracy. Memory consumption is not a problem.

    Thanks

  5. #5
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by nickman
    Didn't get what you mean by ordering.?
    It means that the elements can be accessed in sorted order by the keys.

    Quote Originally Posted by nickman
    While I want is speed,accurracy. Memory consumption is not a problem.
    Both approaches are accurate. They are also fast: although finding an element in a hash table is a constant time operation in the average case, there is time taken to compute the hash.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  6. #6
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    Will you have to implement these data structures yourself? If you can use 3rd party libraries, I wouldn't stress too much over your choice of data structure. As others have said, both provide good insert and lookup speed in the average case, but, depending on implementation, they can both provide atrocious performance in the worst case. Here is a quick overview of the worst-case scenarios:

    Binary trees can basically degenerate into a linked list, in what is often called a "degenerate tree". This is when every node has only one child, and usually happens when the data is inserted in sorted order (lowest->highest or highest->lowest). It can be solved by using balanced binary trees, such as AVL or red-black trees, though their implementation is a good bit more complex than a plain binary search tree.

    As laserlight mentioned, hashes do give constant performance once you have computed the hash value. Generally, this is a relatively quick function, but you can still run into problems if you encounter any collisions, i.e. two distinct keys that hash to the same value. Once you start getting collisions, expect the performance of inserts into and lookups on your hash table to drop significantly. The solution is to make your table an appropriate size and to choose a good hash function for the data you're working with. Choosing the right size and hash function can be something of an art however, and I don't know much about it (though methods like MD5 and SHA come to mind).

  7. #7
    Registered User
    Join Date
    Jun 2010
    Posts
    103
    Thanks for all responses.

    What are the creteria for choosing hash function?
    Any open source C library for this?

    Also, if I have unique key then I think that problem of collision will not occur . Am I write.

    I appreactiate if some one can give me link of PDF's which demonstrate implementation of hashing and trees. I tried, but didn't get more and in comprehensive manner.

    In nutshell, I just want the best way to implement, rather than doing changes again & again.
    That's why troubling you guys.

    Thanks once again.

  8. #8
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by anduril462
    Binary trees can basically degenerate into a linked list, in what is often called a "degenerate tree". This is when every node has only one child, and usually happens when the data is inserted in sorted order (lowest->highest or highest->lowest). It can be solved by using balanced binary trees, such as AVL or red-black trees, though their implementation is a good bit more complex than a plain binary search tree.
    To clarify: my statements earlier in this thread assume that whenever binary trees are mentioned, it refers to balanced binary trees.

    Quote Originally Posted by anduril462
    Choosing the right size and hash function can be something of an art however, and I don't know much about it (though methods like MD5 and SHA come to mind).
    Unfortunately, cryptographic hash functions tend to be relatively expensive to compute.

    Quote Originally Posted by nickman
    What are the creteria for choosing hash function?
    The basic criteria is that it should be fast and result in few collisions. I suggest that you search the Web for relevant tutorials and articles.

    Quote Originally Posted by nickman
    Any open source C library for this?
    Try searching the Web.

    Quote Originally Posted by nickman
    Also, if I have unique key then I think that problem of collision will not occur . Am I write.
    No, because the hash function maps keys to hash values. Unless you have a perfect hash function (which is only feasible in special cases), a set of unique keys could be mapped to a set of hash values with collisions.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  9. #9
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    Quote Originally Posted by nickman View Post
    Any open source C library for this?
    You can look into using glib, which actually has complete implementations of hash functions and balanced binary trees, along with many other data structures and tools to play with.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Binary Search of Array
    By pantherman34 in forum C Programming
    Replies: 21
    Last Post: 05-04-2010, 09:39 AM
  2. Binary Search Trees
    By bleuz in forum C++ Programming
    Replies: 2
    Last Post: 04-30-2010, 01:44 AM
  3. Binary Search Trees
    By lorannex in forum C Programming
    Replies: 3
    Last Post: 04-25-2009, 06:24 AM
  4. Newbie questions regarding binary search trees
    By Ham in forum C++ Programming
    Replies: 1
    Last Post: 11-04-2001, 07:48 AM