Thread: hash table's performance

  1. #1
    Registered User
    Join Date
    May 2004
    Posts
    73

    hash table's performance

    Hi everyone, I did a little benchmark, and I'm trying to figure out why std::unordered_set's (Visual Studio 2010) hash table performance is so much worse than std::set's.


    Code:
    Time is measured in seconds. Disregard SOHTBST,
    that's just another implementation of a tree.
    
    adding to a container stats follow.
    the first row is 2^11 randomly generated integers,
    doubling each time until the last row,
    which is 2^18 randomly generated integers.
    std::set    SOHTBST     HashTable1     HashTable2
      0.001210    0.000504    0.005610    0.003793
      0.002509    0.000998    0.017613    0.015412
      0.005010    0.002214    0.034532    0.014783
      0.010273    0.004754    0.048999    0.029705
      0.020482    0.009841    0.079733    0.060976
      0.042823    0.021546    0.669926    0.140907
      0.087980    0.044820    0.793633    0.238754
      0.197135    0.103221    1.048396    0.480977
    (HashTable1 is a new container, HashTable2 is
    that same container after having all elements
    removed. The effect is that HashTable2 doesn't need any resizing)
    Adding uses:
    start = timeSinceEpoch();
    for (int i = 0; i < testSize; i++)
    	container.insert(testData[i]);
    end= timeSinceEpoch();
    
    
    iterating through all the elements:
    std::set    SOHTBST     HashTable
      0.000547    0.000017    0.000685
      0.001091    0.000032    0.001361
      0.002334    0.000059    0.002653
      0.004378    0.000126    0.005325
      0.008599    0.000226    0.010637
      0.017478    0.000467    0.021505
      0.034945    0.001034    0.042649
      0.071910    0.003394    0.087486
    Iterating uses:
    start = timeSinceEpoch();
    typename Algorithm::iterator iter = container.begin();
    while (iter != container.end())
    	iter++;
    end = timeSinceEpoch();
    
    finding each of the elements:
    std::set    SOHTBST     HashTable
      0.000929    0.000118    0.001475
      0.001918    0.000285    0.003258
      0.003887    0.000637    0.005811
      0.008105    0.001380    0.012648
      0.016566    0.003361    0.027964
      0.034278    0.006734    0.042473
      0.071597    0.016240    0.087051
      0.161150    0.040815    0.209862
    Finding uses:
    start = timeSinceEpoch();
    for (int i = 0; i < testSize; i++)
    	container.find(testData[i]);
    end = timeSinceEpoch();
    
    removing all of the elements, one by one:
    std::set    SOHTBST     HashTable
      0.001538    0.000268    0.010862
      0.003081    0.000532    0.027662
      0.006087    0.001043    0.215371
      0.012096    0.002074    0.481068
      0.024105    0.004128    1.014633
      0.048194    0.008248   13.332217
      0.096763    0.016540   30.043754
      0.196106    0.034952   62.446652
    Removing uses:
    start = timeSinceEpoch();
    for (iter = container.begin(); iter != container.end(); ) {
    	typename Algorithm::iterator nextIter = iter;
    	nextIter++;
    	container.erase(iter);
    	iter = nextIter;
    }
    end = timeSinceEpoch();
    
    
    
    
    Hash function:
    struct myhash {
        unsigned int operator() (int key) const {
            // Robert Jenkins' 32-bit integer hash
            // (unsigned arithmetic, so the right-shifts are well defined)
            unsigned int a = (unsigned int)key;
            a = (a+0x7ed55d16) + (a<<12);
            a = (a^0xc761c23c) ^ (a>>19);
            a = (a+0x165667b1) + (a<<5);
            a = (a+0xd3a2646c) ^ (a<<9);
            a = (a+0xfd7046c5) + (a<<3);
            a = (a^0xb55a4f09) ^ (a>>16);
            return a;
        }
    };
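    The HashTable2 effect described above (no resizing during the inserts) can be reproduced with std::unordered_set by pre-sizing the bucket array. A minimal sketch, assuming a C++11-conforming `rehash()` (the helper name `makePresized` is made up for illustration):

```cpp
#include <unordered_set>
#include <cassert>

// Pre-size the bucket array so that inserting n elements never triggers
// a rehash, mimicking the "HashTable2" case above. rehash(n) guarantees
// at least n buckets, keeping the load factor <= 1 for n elements.
std::unordered_set<int> makePresized(int n) {
    std::unordered_set<int> s;
    s.rehash(n);            // allocate >= n buckets up front
    for (int i = 0; i < n; i++)
        s.insert(i);
    return s;
}
```

With the default max_load_factor of 1.0, no insert in that loop should ever grow the table.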


    My main question is: why is the hash table so much slower in these use cases than std::set?

    And why is the hash table's erase() so terribly slow (60 seconds!)?

    Please give me some insight into how they may have implemented their hash table that would lead to this. Or maybe it's an error in my benchmarking?

    - Evan Ovadia

    An extra note: a few other hash tables I've measured against show the same behavior: worse than std::set, and terrible erase().
    Last edited by Verdagon; 12-08-2010 at 02:46 AM. Reason: using code blocks

  2. #2
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Perhaps you should post the actual benchmark program so that we can run it to verify your results, as well as check what exactly you are doing.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  3. #3
    Registered User
    Join Date
    May 2004
    Posts
    73
    I lost the code, but I managed to whip up another benchmark to show the same weirdness.

    Output with TEST_SIZE = 1000
    stl set add 0.017129 iterate 0.008975 find 0.013035 erase 0.019165
    unordered add 0.093343 iterate 0.006815 find 0.013745 erase 0.683252

    Output with TEST_SIZE = 10000
    stl set add 0.200412 iterate 0.056483 find 0.144453 erase 0.208875
    unordered add 1.09719 iterate 0.066118 find 0.137345 erase 52.7872

    Running on a Core 2 Duo @ 1.7 GHz, throttled to 30% by power settings, so you may see much faster absolute times, but the pattern should be similar.

    Code:
    #include <iostream>
    using namespace std;
    
    #include <Windows.h> // for QueryPerformanceCounter
    #include <set>
    #include <unordered_set>
    #include <cstdlib> // srand, rand
    
    #define TEST_SIZE 1000
    
    double getTime() {
    	static __int64 ticksPerSecond = 0;
    	if (ticksPerSecond == 0)
    		QueryPerformanceFrequency((LARGE_INTEGER *)&ticksPerSecond);
    
    	__int64 ticks = 0;
    	QueryPerformanceCounter((LARGE_INTEGER *)&ticks);
    
    	return (ticks * 1000000 / ticksPerSecond) / 1000000.0;
    }
    
    template<typename SetType>
    void testSet(int testData[TEST_SIZE], const char *name) {
    	double start = 0, end = 0;
    
    	SetType testSet;
    	cout << name;
    
    	start = getTime();
    	for (int i = 0; i < TEST_SIZE; i++)
    		testSet.insert(testData[i]);
    	end = getTime();
    	cout << " add " << end - start;
    
    	start = getTime();
    	typename SetType::iterator iter = testSet.begin();
    	while (iter != testSet.end())
    		iter++;
    	end = getTime();
    	cout << " iterate " << end - start;
    
    	start = getTime();
    	for (int i = 0; i < TEST_SIZE; i++)
    		testSet.find(testData[i]);
    	end = getTime();
    	cout << " find " << end - start;
    
    	start = getTime();
    	for (int i = 0; i < TEST_SIZE; i++)
    		testSet.erase(testSet.find(testData[i]));
    	end = getTime();
    	cout << " erase " << end - start;
    
    	cout << endl;
    }
    
    int main(int argc, char **argv) {
    	srand(0);
    
    	int testData[TEST_SIZE] = { 0 };
    	for (int i = 0; i < TEST_SIZE; i++)
    		testData[i] = i;
    
    	// a bunch of random swaps
    	for (int i = 0; i < TEST_SIZE; i++) {
    		int swapIndexA = rand() % TEST_SIZE;
    		int swapIndexB = rand() % TEST_SIZE;
    		int temp = testData[swapIndexA];
    		testData[swapIndexA] = testData[swapIndexB];
    		testData[swapIndexB] = temp;
    	}
    	
    	testSet< set<int> >(testData, "stl set");
    
    	testSet< unordered_set<int> >(testData, "unordered");
    }

  4. #4
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    find() is O(n) in the worst case for hash tables. I believe you've been told this in other threads.

    Have you tried a code profiler to find the bottleneck? Just timing from the beginning to the end of the program is often not enough for a good benchmark.
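    One cheap improvement short of a full profiler is to repeat each measurement and keep the fastest run, which filters out scheduler noise and one-off cache misses. A sketch using std::chrono (an assumption on my part; the original benchmark uses QueryPerformanceCounter):

```cpp
#include <algorithm>
#include <chrono>

// Run the operation several times and report the fastest run.
// The minimum is usually the most repeatable number, since all the
// noise (context switches, cold caches) only ever adds time.
template <typename Fn>
double bestOfRuns(Fn fn, int runs) {
    double best = 1e300;
    for (int r = 0; r < runs; r++) {
        std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
        fn();
        std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
        double secs = std::chrono::duration<double>(end - start).count();
        best = std::min(best, secs);
    }
    return best;
}
```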

  5. #5
    Registered User
    Join Date
    May 2004
    Posts
    73
    Yeah, I have been told that... it's hard for me to accept it because it goes against everything I've believed. My world has been shattered. I guess I have to believe now that hash tables are only fast in theory, not in practice.

    Does this output not surprise you too? That erasing is so costly?

  6. #6
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    Not surprising to me. Searching a hash table and the lookup operation are different things.

  7. #7
    Registered User
    Join Date
    May 2004
    Posts
    73
    But isn't erase() in a hash table usually just finding the node and setting its "dead bit" to 1? I would think that would be at least as fast as adding.
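    The "dead bit" idea describes tombstone deletion in an open-addressing table, which is not how MSVC's node-based (separately chained) unordered_set works. A toy sketch of the technique being described (assuming the table never fills completely, so probing terminates):

```cpp
#include <vector>

// Toy open-addressing hash table illustrating "dead bit" (tombstone)
// erase: erase finds the slot, then just flips its state to DEAD.
// Probes must skip over DEAD slots, or later keys become unreachable.
struct ToyTable {
    enum State { EMPTY, FULL, DEAD };
    std::vector<int> keys;
    std::vector<State> state;
    explicit ToyTable(int cap) : keys(cap), state(cap, EMPTY) {}

    int probe(int k) const {                 // linear probing
        int i = (int)((unsigned)k % keys.size());
        while (state[i] != EMPTY) {          // DEAD slots don't stop the probe
            if (state[i] == FULL && keys[i] == k) return i;
            i = (i + 1) % (int)keys.size();
        }
        return -1;                           // hit an EMPTY slot: not present
    }
    void insert(int k) {
        int i = (int)((unsigned)k % keys.size());
        while (state[i] == FULL) i = (i + 1) % (int)keys.size();
        keys[i] = k; state[i] = FULL;
    }
    bool erase(int k) {                      // find, then flip the "dead bit"
        int i = probe(k);
        if (i < 0) return false;
        state[i] = DEAD;                     // slot stays behind as a tombstone
        return true;
    }
};
```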

  8. #8
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    The call for erase() that you tested is

    testSet.erase(testSet.find(testData[i]));

    so your erase() measurement includes a full find() as well. You didn't really need a benchmark to know this.
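    For what it's worth, both std::set and std::unordered_set also offer erase-by-key, which folds the lookup and the removal into one call. A sketch of the same loop (the helper name `eraseAll` is made up for illustration):

```cpp
#include <unordered_set>
#include <cstddef>

// Erasing by key instead of erase(find(key)): one call does both the
// lookup and the removal, and returns how many elements were erased
// (0 or 1 for a set, since keys are unique).
std::size_t eraseAll(std::unordered_set<int>& s, const int* data, int n) {
    std::size_t erased = 0;
    for (int i = 0; i < n; i++)
        erased += s.erase(data[i]);
    return erased;
}
```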

  9. #9
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Verdagon
    I lost the code, but I managed to whip up another benchmark to show the same weirdness.

    Output with TEST_SIZE = 1000
    stl set add 0.017129 iterate 0.008975 find 0.013035 erase 0.019165
    unordered add 0.093343 iterate 0.006815 find 0.013745 erase 0.683252

    Output with TEST_SIZE = 10000
    stl set add 0.200412 iterate 0.056483 find 0.144453 erase 0.208875
    unordered add 1.09719 iterate 0.066118 find 0.137345 erase 52.7872

    Running on a Core 2 Duo @ 1.7 GHz, throttled to 30% by power settings, so you may see much faster absolute times, but the pattern should be similar.
    These are my results after compiling your example code from post #3:
    Output with TEST_SIZE = 1000
    stl set add 0.000203 iterate 1.2e-005 find 6.1e-005 erase 0.000175
    unordered add 0.000174 iterate 3e-006 find 1.6e-005 erase 9.8e-005

    Output with TEST_SIZE = 10000
    stl set add 0.003122 iterate 0.000315 find 0.001419 erase 0.002949
    unordered add 0.001949 iterate 2.6e-005 find 0.000153 erase 0.001246

    Output with TEST_SIZE = 200000
    stl set add 0.044286 iterate 0.003525 find 0.018918 erase 0.030041
    unordered add 0.02817 iterate 0.001536 find 0.004315 erase 0.022138
    I compiled with the same compiler as you in the default release configuration.

    It is apparent that my results are very different from yours. std::unordered_set beat std::set in each case.

    EDIT:
    Of course, this does not mean that std::unordered_set is necessarily better than std::set. As was mentioned earlier, std::unordered_set's worst case for find is still O(n), compared to O(log n) for std::set.
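    That O(n) worst case is easy to provoke: a hash that sends every key to the same bucket degenerates the table into one long chain. A minimal sketch (the functor name `ConstantHash` is made up for illustration):

```cpp
#include <unordered_set>
#include <cstddef>

// A deliberately terrible hash: every key lands in the same bucket,
// so the table degenerates into one long chain and find() becomes O(n).
struct ConstantHash {
    std::size_t operator()(int) const { return 0; }
};

typedef std::unordered_set<int, ConstantHash> DegenerateSet;

// Insert n keys and count how many ended up in key 0's bucket:
// with a constant hash, that is all of them.
int countInOneBucket(int n) {
    DegenerateSet s;
    for (int i = 0; i < n; i++) s.insert(i);
    return (int)s.bucket_size(s.bucket(0));
}
```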

    Quote Originally Posted by Verdagon
    Does this output not surprise you too? That erasing is so costly?
    That does surprise me. The numbers are clearly an anomaly (worst-case performance for a single erase should be proportional to the number of elements in the hash table), but considering that you have reproduced it with a second benchmark, that is rather strange.
    Last edited by laserlight; 12-08-2010 at 02:22 PM.

  10. #10
    Registered User
    Join Date
    May 2004
    Posts
    73
    You have no idea how glad I am to hear that. Another project I'm working on requires that a hash map be faster than a binary tree. Thanks for running that benchmark!

    I would have run it on another computer myself, but the timer requires Windows and I only have one Windows machine >.<
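    The Windows-only getTime() could be swapped for std::chrono::steady_clock on any C++11 compiler, so the benchmark runs unchanged elsewhere apart from this one function. A sketch:

```cpp
#include <chrono>

// Portable replacement for the QueryPerformanceCounter-based getTime():
// steady_clock is monotonic and high-resolution on mainstream platforms,
// and the result is seconds as a double, like the original.
double getTime() {
    return std::chrono::duration<double>(
        std::chrono::steady_clock::now().time_since_epoch()).count();
}
```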

  11. #11
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Verdagon
    Another project I'm working on requires that hash map be faster than a binary tree.
    That sounds like a strange requirement.

    Furthermore, faster for what operations? In the average case, or in the worst case, or both?
