hash table's performance

**Verdagon** · 12-08-2010

Hi everyone, I did a little benchmark, and I'm trying to figure out why the <unordered_set> (visual studio 2010) hash table performance is so much worse than std::set.

Code:

Time is measured in seconds. Disregard SOHTBST,
that's just another implementation of a tree.

adding to a container stats follow.
the first row is 2^11 randomly generated integers,
doubling each time until the last row,
which is 2^18 randomly generated integers.
std::set    SOHTBST     HashTable1     HashTable2
  0.001210    0.000504    0.005610    0.003793
  0.002509    0.000998    0.017613    0.015412
  0.005010    0.002214    0.034532    0.014783
  0.010273    0.004754    0.048999    0.029705
  0.020482    0.009841    0.079733    0.060976
  0.042823    0.021546    0.669926    0.140907
  0.087980    0.044820    0.793633    0.238754
  0.197135    0.103221    1.048396    0.480977
(HashTable1 is a new container, HashTable2 is
that same container after having all elements
removed. The effect is that HashTable2 doesn't need any resizing)
Adding uses:
start = timeSinceEpoch();
for (int i = 0; i < testSize; i++)
	container.insert(testData[i]);
end= timeSinceEpoch();


iterating through all the elements:
std::set    SOHTBST     HashTable
  0.000547    0.000017    0.000685
  0.001091    0.000032    0.001361
  0.002334    0.000059    0.002653
  0.004378    0.000126    0.005325
  0.008599    0.000226    0.010637
  0.017478    0.000467    0.021505
  0.034945    0.001034    0.042649
  0.071910    0.003394    0.087486
Iterating uses:
start = timeSinceEpoch();
typename Algorithm::iterator iter = container.begin();
while (iter != container.end())
	iter++;
end = timeSinceEpoch();

finding each of the elements:
std::set    SOHTBST     HashTable
  0.000929    0.000118    0.001475
  0.001918    0.000285    0.003258
  0.003887    0.000637    0.005811
  0.008105    0.001380    0.012648
  0.016566    0.003361    0.027964
  0.034278    0.006734    0.042473
  0.071597    0.016240    0.087051
  0.161150    0.040815    0.209862
Finding uses:
start = timeSinceEpoch();
for (int i = 0; i < testSize; i++)
	container.find(testData[i]);
end = timeSinceEpoch();

removing all of the elements, one by one:
std::set    SOHTBST     HashTable
  0.001538    0.000268    0.010862
  0.003081    0.000532    0.027662
  0.006087    0.001043    0.215371
  0.012096    0.002074    0.481068
  0.024105    0.004128    1.014633
  0.048194    0.008248   13.332217
  0.096763    0.016540   30.043754
  0.196106    0.034952   62.446652
Removing uses:
start = timeSinceEpoch();
for (iter = container.begin(); iter != container.end(); ) {
	typename Algorithm::iterator nextIter = iter;
	nextIter++;
	container.erase(iter);
	iter = nextIter;
}
end = timeSinceEpoch();




Hash function:
struct myhash {
    unsigned int operator() (int a) const {
        // Robert Jenkins
       a = (a+0x7ed55d16) + (a<<12);
       a = (a^0xc761c23c) ^ (a>>19);
       a = (a+0x165667b1) + (a<<5);
       a = (a+0xd3a2646c) ^ (a<<9);
       a = (a+0xfd7046c5) + (a<<3);
       a = (a^0xb55a4f09) ^ (a>>16);
       return a;
    }
};

My main question is, why is the hash table so much slower in these use cases than stl::set?

And why is hash table's erase() function so terribly slow (60 seconds!)?

Please, give me some insight into how they may have implemented their hash table, that would lead to this. Or maybe its an error in my benchmarking?

- Evan Ovadia

An extra note, a few other hash tables I've measured against show the same behavior: worse than stl::set, and terrible erase().

**laserlight** · 12-08-2010

Perhaps you should post the actual benchmark program so that we can run it to verify your results, as well as check what exactly you are doing.

**Verdagon** · 12-08-2010

I lost the code, but I managed to whip up another benchmark to show the same weirdness.

Output with TEST_SIZE = 1000
stl set add 0.017129 iterate 0.008975 find 0.013035 erase 0.019165
unordered add 0.093343 iterate 0.006815 find 0.013745 erase 0.683252

Output with TEST_SIZE = 10000
stl set add 0.200412 iterate 0.056483 find 0.144453 erase 0.208875
unordered add 1.09719 iterate 0.066118 find 0.137345 erase 52.7872

Running on a core 2 duo @ 1.7ghz, at 30% processor (due to power settings), so you may get much faster results, but the results should be similar.

Code:

#include <iostream>
using namespace std;

#include <Windows.h> // for QueryPerformanceCounter
#include <set>
#include <unordered_set>

#define TEST_SIZE 1000

double getTime() {
	static __int64 ticksPerSecond = 0;
	if (ticksPerSecond == 0)
		QueryPerformanceFrequency((LARGE_INTEGER *)&ticksPerSecond);

	__int64 ticks = 0;
	QueryPerformanceCounter((LARGE_INTEGER *)&ticks);

	return (ticks * 1000000 / ticksPerSecond) / 1000000.0;
}

template<typename SetType>
void testSet(int testData[TEST_SIZE], char *name) {
	double start = 0, end = 0;

	SetType testSet;
	cout << name;

	start = getTime();
	for (int i = 0; i < TEST_SIZE; i++)
		testSet.insert(testData[i]);
	end = getTime();
	cout << " add " << end - start;

	start = getTime();
	typename SetType::iterator iter = testSet.begin();
	while (iter != testSet.end())
		iter++;
	end = getTime();
	cout << " iterate " << end - start;

	start = getTime();
	for (int i = 0; i < TEST_SIZE; i++)
		testSet.find(testData[i]);
	end = getTime();
	cout << " find " << end - start;

	start = getTime();
	for (int i = 0; i < TEST_SIZE; i++)
		testSet.erase(testSet.find(testData[i]));
	end = getTime();
	cout << " erase " << end - start;

	cout << endl;
}

int main(int argc, char **argv) {
	srand(0);

	int testData[TEST_SIZE] = { 0 };
	for (int i = 0; i < TEST_SIZE; i++)
		testData[i] = i;

	// a bunch of random swaps
	for (int i = 0; i < TEST_SIZE; i++) {
		int swapIndexA = rand() % TEST_SIZE;
		int swapIndexB = rand() % TEST_SIZE;
		int temp = testData[swapIndexA];
		testData[swapIndexA] = testData[swapIndexB];
		testData[swapIndexB] = temp;
	}
	
	testSet< set<int> >(testData, "stl set");

	testSet< unordered_set<int> >(testData, "unordered");
}

**whiteflags** · 12-08-2010

find() is O(n) for hash tables. I believe you've been told this in other threads.

Have you tried looking at a code profiler to find the bottleneck? Just timing the beginning to end of the program is often not enough for a good benchmark.

**Verdagon** · 12-08-2010

Yeah, I have been told that... it's hard for me to accept it because it goes against everything I've believed. My world has been shattered. I guess I have to believe now that hash tables are only fast in theory, not in practice.

Does this output not surprise you too? That erasing is so costly?

**whiteflags** · 12-08-2010

Not surprising to me. Searching a hash table and the look up operation are different things.

**Verdagon** · 12-08-2010

But isn't erase() in a hash table usually just finding the node, and setting it's "dead bit" to 1? I would think that would be at least as fast as adding.

**whiteflags** · 12-08-2010

The call for erase() that you tested is

testSet.erase(testSet.find(testData[i]));

thus erase() is now as bad as search. You did not need to benchmark to know this really.

**laserlight** · 12-08-2010

Originally Posted by Verdagon

I lost the code, but I managed to whip up another benchmark to show the same weirdness.

Output with TEST_SIZE = 1000
stl set add 0.017129 iterate 0.008975 find 0.013035 erase 0.019165
unordered add 0.093343 iterate 0.006815 find 0.013745 erase 0.683252

Output with TEST_SIZE = 10000
stl set add 0.200412 iterate 0.056483 find 0.144453 erase 0.208875
unordered add 1.09719 iterate 0.066118 find 0.137345 erase 52.7872

Running on a core 2 duo @ 1.7ghz, at 30% processor (due to power settings), so you may get much faster results, but the results should be similar.

These are my results after compiling your example code from post #3:

Output with TEST_SIZE = 1000
stl set add 0.000203 iterate 1.2e-005 find 6.1e-005 erase 0.000175
unordered add 0.000174 iterate 3e-006 find 1.6e-005 erase 9.8e-005

Output with TEST_SIZE = 10000
stl set add 0.003122 iterate 0.000315 find 0.001419 erase 0.002949
unordered add 0.001949 iterate 2.6e-005 find 0.000153 erase 0.001246

Output with TEST_SIZE = 200000
stl set add 0.044286 iterate 0.003525 find 0.018918 erase 0.030041
unordered add 0.02817 iterate 0.001536 find 0.004315 erase 0.022138

I compiled with the same compiler as you in the default release configuration.

It is apparent that my results are very different from yours. std::unordered_set beat std::set in each case.

EDIT:
Of course, this does not mean that std::unordered_set is necessarily better than std::set. As was mentioned earlier, std::unordered_set's worst case for find is still O(n), compared to O(log n) for std::set.

Originally Posted by Verdagon

Does this output not surprise you too? That erasing is so costly?

That does surprise me. The numbers are clearly an anomaly (worst case performance for erase should be proportional to the number of elements in the hash table), but considering that it has been repeated by you, that is rather strange.

**Verdagon** · 12-08-2010

You have no idea how glad I am to hear that. Another project I'm working on requires that hash map be faster than a binary tree. Thanks for running that benchmark!

I would have run it on another computer myself, but the timer requires windows and I only have one windows machine >.<

**laserlight** · 12-08-2010

Originally Posted by Verdagon

Another project I'm working on requires that hash map be faster than a binary tree.

That sounds like a strange requirement

Furthermore, faster for what operations? In the average case, or in the worst case, or both?

Thread: hash table's performance

Thread Tools

Search Thread

Display

hash table's performance

Similar Threads

Hash tables

Iterable hash tables

need help with hash tables with direct chaining

Group Project Help/Volunteer

Problems with Hash Tables