Thread: Vector vs. array.

Hybrid View

Previous Post Previous Post   Next Post Next Post
  1. #1
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677

    Vector vs. array.

    The discussion a couple of days ago relating to vectors and arrays got me thinking: If we have a 2D array, how does that compare to a vector of vectors when it comes to performance.

    The answer is that in non-optimized debug it's horrible - vector of vector is many times slower than a 2D array. Fortunately, the compiler magic in optimized mode makes the difference much smaller.

    My test-code is this:
    Code:
    #include <iostream>
    #include <iomanip>
    #include <vector>
    #include <ctime>
    
    const int size = 100;
    
    class B
    {
    public:
    	virtual int sum() = 0;
    };
    
    class V : public B
    {
    private:
    	std::vector < std::vector <int> > vec;
    public:
    	V() 
    	{
    		for(int i = 0; i < size; i++)
    		{
    			vec.push_back(std::vector<int>(0));
    			for(int j = 0; j < size; j++)
    			{
    				vec[i].push_back(i * j);
    			}
    		}
    	}
    	virtual int sum() {
    		int s = 0;
    		for(int i = 0; i < size; i++) 
    			for(int j = 0; j < size; j++)
    				s += vec[i][j];
    		return s;
    	}
    };
    
    
    class A : public B
    {
    private:
    	int arr[size][size];
    
    public:
    	A()
    	{
    		for(int i = 0; i < size; i++)
    		{
    			for(int j = 0; j < size; j++)
    			{
    				arr[i][j] = i * j;
    			}
    		}
    	}
    	int sum() {
    		int s = 0;
    		for(int i = 0; i < size; i++) 
    			for(int j = 0; j < size; j++)
    				s += arr[i][j];
    		return s;
    	}
    };
    
    
    int func(B *b)
    {
    	return b->sum();
    }
    
    
    void timeIt(char *name, int (*f)(B *b), B * b)
    {
    	int x = 0;
    	clock_t t = clock();
    	for(int i = 0; i < 1000000; i++)
    	{
    		x += f(b);
    	}
    	t = clock() - t;
    	std::cout << "x = " << x << std::endl;
    	std::cout << name << ": " << std::setprecision(6) << static_cast<double>(t) / CLOCKS_PER_SEC << std::endl;
    }
    
    
    #define TIME_IT(f, b) do { timeIt(#b, f, &b); } while(0)
    
    int main()
    {
    	A a;
    	V v;
    	TIME_IT(func, a);
    	TIME_IT(func, v);
    }
    And the result is about 4.3s (4.265-4.281) and 5.9s (5.906-5.931) over 6 runs-take away the extremes. That makes 5.9/4.3 -> 1.37 -> about 37% performance loss.

    Of course, this is simply testing the performance of accessing an array/vector rather intensely. A lot of the time, we do other things as well, and there are certainly benefits to using vectors if we need to grow the array dynamically - if nothing else because it's easier to write the code.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  2. #2
    Super Moderater.
    Join Date
    Jan 2005
    Posts
    374
    Seems logical.

  3. #3
    The larch
    Join Date
    May 2006
    Posts
    3,573
    Strangely, the output that I got with MingW was:
    x = -288423680
    a: 19.703
    x = -288423680
    v: 16.157
    Which means that the vector version was faster?! (Does the compiler do that bad with plain arrays or is it beneficial to store the subvectors in separate chunks of memory, seeing that it doesn't perform "real" random access.

    But with VC++ 2005:
    x = -288423680
    a: 7.812
    x = -288423680
    v: 73.453
    I might be wrong.

    Thank you, anon. You sure know how to recognize different types of trees from quite a long way away.
    Quoted more than 1000 times (I hope).

  4. #4
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Quote Originally Posted by anon View Post
    But with VC++ 2005:
    Code:
    x = -288423680
    a: 7.812
    x = -288423680
    v: 73.453
    Holy crap!
    Did you enable full optimization, or was that in Debug mode or something?

  5. #5
    Registered User
    Join Date
    Oct 2001
    Posts
    2,934
    Quote Originally Posted by anon View Post
    Which means that the vector version was faster?!
    That doesn't surprise me, because I've heard vectors are pretty fast. Although you'd have thought the plain array would have a slight advantage. I wonder if it's because the array is on the stack.

  6. #6
    The larch
    Join Date
    May 2006
    Posts
    3,573
    Release, with Code::Blocks selected best optimization (no other needed).
    I might be wrong.

    Thank you, anon. You sure know how to recognize different types of trees from quite a long way away.
    Quoted more than 1000 times (I hope).

  7. #7
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Try adding
    #define _SECURE_SCL 0
    #define _SECURE_SCL_THROWS 0
    too.
    To remove range checking and other stuff. This will make it more comparable to other compilers.
    Also, mats, your example wouldn't work correctly for me and frankly, I don't see why you're making it so complex. I tested with:

    Code:
    const int iterations = 1000000; // 1 million
    const int arraysize = iterations / 1000;
    int array[arraysize][arraysize];
    std::vector< std::vector<int> > varray;
    
    int main()
    {
    	int sum = 0;
    	
    	DWORD dwTick = timeGetTime();
    	for (int i = 0; i < arraysize; i++)
    		for (int j = 0; j < arraysize; j++)
    			array[i][j] = i * j + j;
    	for (int i = 0; i < arraysize; i++)
    		for (int j = 0; j < arraysize; j++)
    			sum += array[i][j];
    	cout << "Took " << timeGetTime() - dwTick << " ms.\n";
    
    	dwTick = timeGetTime();
    	for (int i = 0; i < arraysize; i++)
    	{
    		varray.push_back( std::vector<int>() );
    		for (int j = 0; j < arraysize; j++)
    			varray[i].push_back(i * j + j);
    	}
    	for (int i = 0; i < arraysize; i++)
    		for (int j = 0; j < arraysize; j++)
    			sum += varray[i][j];
    	cout << "Took " << timeGetTime() - dwTick << " ms.\n";
    }
    Results are (in release) - 4-5 ms for native and 17-21 ms for vector of vector.
    In debug, it's 15 ms / 938 ms.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  8. #8
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Code:
    x = -288423680
    a: 6.75
    x = -288423680
    v: 5.36
    64-bit Linux. gcc 4.2.3. Core 2 Duo @ 3.2ghz. Compiled with just a simple "-O3".

    EDIT
    Interestingly, it's faster (5.23s, 4.78s) if I compile it to a 32-bit binary (-m32).
    Last edited by cyberfish; 06-20-2008 at 07:24 PM.

  9. #9
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    That doesn't surprise me, because I've heard vectors are pretty fast. Although you'd have thought the plain array would have a slight advantage. I wonder if it's because the array is on the stack.
    I doubt so. The only difference would be in time taken to allocate the memory, which is not included in the timing (in the constructor).

  10. #10
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    I can't run Elysia's version, since apparently DWORD and timeGetTime are Windows-only.

    After doing some hacking, including defining timeGetTime using the standard gettimeofday -
    Code:
    #include <vector>
    #include <iostream>
    #include <sys/time.h>
    
    typedef int DWORD;
    
    const int iterations = 1000000; // 1 million
    const int arraysize = iterations / 1000;
    int array[arraysize][arraysize];
    std::vector< std::vector<int> > varray(arraysize, std::vector<int>(arraysize));
    
    DWORD timeGetTime() {
    	timeval tv;
    	gettimeofday(&tv, 0);
    	DWORD rtn = tv.tv_usec/1000;
    	rtn += tv.tv_sec*1000;
    	return rtn;
    }
    
    int main()
    {
    	int sum = 0;
    	
    	DWORD dwTick = timeGetTime();
    	for (int i = 0; i < arraysize; i++)
    		for (int j = 0; j < arraysize; j++)
    			array[i][j] = i * j + j;
    	for (int i = 0; i < arraysize; i++)
    		for (int j = 0; j < arraysize; j++)
    			sum += array[i][j];
    	std::cout << "Took " << timeGetTime() - dwTick << " ms.\n";
    
    	dwTick = timeGetTime();
    	for (int i = 0; i < arraysize; i++)
    	{
    		for (int j = 0; j < arraysize; j++)
    			varray[i][i] = i * j + j;
    	}
    	for (int i = 0; i < arraysize; i++)
    		for (int j = 0; j < arraysize; j++)
    			sum += varray[i][j];
    	std::cout << "Took " << timeGetTime() - dwTick << " ms.\n";
    	
    	std::cout << sum << std::endl;
    }
    The results -
    Code:
    Took 2 ms.
    Took 10 ms.
    On a second thought, you are including the memory allocation/moving time in the vector version (in push_back). This is not fair comparison.

    *edit*
    I editted the code to fix that. The cout statement at the end is so the compiler won't optimize everything out. That, I think is why matsp has it in his version, too.

    the results -
    Code:
    Took 3 ms.
    Took 1 ms.
    1391646332
    The number of significant figures also seem a bit too few... blah too lazy to fix that .
    Last edited by cyberfish; 06-20-2008 at 08:16 PM.

  11. #11
    The larch
    Join Date
    May 2006
    Posts
    3,573
    Try adding
    #define _SECURE_SCL 0
    #define _SECURE_SCL_THROWS 0
    Ok, that brought vector's time down to 8 seconds.

    I wouldn't have thought that they'd kill STL performace completely with default settings.
    I might be wrong.

    Thank you, anon. You sure know how to recognize different types of trees from quite a long way away.
    Quoted more than 1000 times (I hope).

  12. #12
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by cyberfish View Post
    On a second thought, you are including the memory allocation/moving time in the vector version (in push_back). This is not fair comparison.
    It may not be, but I pushed for a more general scenario. We're more likely to just let the vector manage the memory instead of pre-allocating, so I used push_back instead of resize.

    Quote Originally Posted by anon View Post
    I wouldn't have thought that they'd kill STL performace completely with default settings.
    They do range checking by default, and I can't really say it's a bad thing, since it's more secure. But it leaves you the option to disable that, which is a very fine thing, indeed.
    Actually, in release, if you try to access an out-of-bounds element in the vector, with range checking OFF, the compiler will completely optimize away that line of code.

    I tried a test by preallocating vector and the result was that they were pretty equal, or vector was faster!
    4-6 ms for vector, and 4-7 ms for native array.
    Last edited by Elysia; 06-21-2008 at 04:48 AM.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  13. #13
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Quote Originally Posted by Elysia View Post
    They do range checking by default, and I can't really say it's a bad thing, since it's more secure. But it leaves you the option to disable that, which is a very fine thing, indeed.
    It's a horrible thing. I've heard some serious horror stories about the compatibility problems when a binary-only library links against the "secure" STL and the client code wants the unsafe version because secure version is just too damn slow. Well, no can do, sorry. No mixing of the two. You get to beg the library vendor for a version linking against the unsafe version.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  14. #14
    Deathray Engineer MacGyver's Avatar
    Join Date
    Mar 2007
    Posts
    3,210
    MinGW on a 32-bit XP machine...

    x = -288423680
    a: 38.515
    x = -288423680
    v: 477.266
    -O3

    x = -288423680
    a: 10.266
    x = -288423680
    v: 7.719
    Might I suggest this machine could be a little faster in general?

  15. #15
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    After doing some hacking, including defining timeGetTime using the standard gettimeofday -
    gettimeofday() isn't ANSI standard, of course, but it's BSD and POSIX according to my man page, which I guess is what you were referring to . . . .

    [edit] matsp's original program on an AMD64 processor under 64-bit Debian GNU/Linux Lenny:
    Code:
    x = -288423680
    a: 68.82
    x = -288423680
    v: 145.68
    That was compiled with
    Code:
    g++ vecarray.cpp -o vecarray
    I was too lazy to test with -O2 or -g. [/edit]
    Last edited by dwks; 06-21-2008 at 01:24 AM.
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Replies: 16
    Last Post: 05-29-2009, 07:25 PM
  2. Replies: 6
    Last Post: 11-09-2006, 03:28 AM
  3. [question]Analyzing data in a two-dimensional array
    By burbose in forum C Programming
    Replies: 2
    Last Post: 06-13-2005, 07:31 AM
  4. Unknown Memory Leak in Init() Function
    By CodeHacker in forum Windows Programming
    Replies: 3
    Last Post: 07-09-2004, 09:54 AM
  5. Quick question about SIGSEGV
    By Cikotic in forum C Programming
    Replies: 30
    Last Post: 07-01-2004, 07:48 PM