What did I miss about boosts sparse matrix library?

**pheres** · 04-13-2008

Hi,

I want to store 2D terrain information in memory (to be able to process it online) which can become very large. But because the terrain abstractions I'm using are sparsely populated this should not be a problem.
I took a look at the boost sparse matrix adapter classes:

http://www.boost.org/doc/libs/1_35_0...rix_sparse.htm

and tried to play with their little demo program:

Code:

#include <boost/numeric/ublas/matrix_sparse.hpp>
#include <boost/numeric/ublas/io.hpp>


int main ()
{
    using namespace boost::numeric::ublas;
    
    mapped_matrix<char> m (3000, 3000);
    
    for (unsigned i = 0; i < m.size1 (); ++ i)
    {
        for (unsigned j = 0; j < m.size2 (); ++ j)
        {
            m (i, j) = rand() + 1;
        }
    }
    sleep(8);
}

This should fill a matrix with all but zeros of size 3000x3000.
So it is not sparse and should take up 9 megabytes in memory.
If I instead use a vector<vector<char> > it does.
But if I run the test program I see top showing the VmSize going up to about 280 megabytes. Why is that?

Now even more weird: If I fill the matrix completely with zeros it should be very very sparse and should take up far below 9 megabytes in memory. But Top shows exactly the same as if I fill it with non-zero values.

The other sparse matrix adapters the boost library offers (I did not get the difference between them yet) behave about the same.

What did I miss?

Thank you in advance!

**CornedBee** · 04-13-2008

But if I run the test program I see top showing the VmSize going up to about 280 megabytes. Why is that?

The sparse matrix needs more memory per non-zero element than a dense matrix. That's the price you pay for the sparseness. mapped_matrix uses a std::map<size_t, T> internally - depending on the tree's implementation, that means that every non-zero node requires 5 or more machine words - that's 40 bytes on a 64-bit system! (As opposed to 1 byte in a dense char-matrix.) So you see, a matrix has to be very sparse indeed before this is worth it.
compressed_matrix uses run-length encoding, if I understand the documentation correctly, so it's good when you have many elements with the same value.

But Top shows exactly the same as if I fill it with non-zero values.

I don't think the matrix is meant to be filled with zeros. Those are there by default, but sparse. I can't find anything in the code that indicates that the matrix frees elements when you assign zero to them. Basically, you're meant to leave the map alone, except for setting the few elements that you care about.

The example in the documentation is poor - notice that it is the same for every matrix type. It doesn't demonstrate anything specific to the sparse matrices.

**pheres** · 04-13-2008

I see, thanks for the nice explanation!

Originally Posted by CornedBee

I don't think the matrix is meant to be filled with zeros. Those are there by default, but sparse.

That seems to be right. Access to elements which are not written explicitly returns zero (or better the value the default ctor sets I think).

Is it known why they are using trees for element storage?
If I look at the old Yale storage format for sparse matrices, for an NxN matrix with NZ non-zero values, it's size requirements are NZ * sizeof(T) + NZ + N +1 (vector of non-zero values, column index vector, row index vector).
So for the 3000x3000 matrix of char with no zero value in my example this would make 3000 x 3000 * 1 + 3000 x 3000 + 3001 ~ 18 megabytes, which is much lower than the 280 megabytes I saw in top for the boost implementation using trees.

Is their reason to allow faster insertions of rows/columns that it would be possible using vectors as in the Yale format?

If so I better implementing the Yale format, because I have no need for resizing/insertion/erase.

**CornedBee** · 04-13-2008

I have no idea how the Yale format works. The std::map approach has the advantage of being simple.

**pheres** · 04-13-2008

It is pretty easy:

Vector of non-zero-values holds all non-zero values read row by row from the matrix.
Row-index-vector points into the value-vector at the positions every row starts
column-index-vector tells the coliumn of every entry inside the value vector.

for example the 3x3 matrix

0,0,2
0,4,5
6,0,8

would be stored as

value-vector: 2,4,5,6,8
row-index-vector: 0,1,3,5 //rows start at index 0, 1, 3 and 5(imaginary 4th row)
col-index-vector: 2,1,2,0,2 // column of every value

But insertions of elements would be slow. Maybe thats why they use the maps with that big storage overhead.

Thread: What did I miss about boosts sparse matrix library?

Thread Tools

Search Thread

Display

What did I miss about boosts sparse matrix library?

Similar Threads

getting orientation and position in opengl

Genrate sparse matrix - C99

Very handy matrix functions - part 1

C++ Matrix library needed

Library for matrix math/ linear algebra?