To tree or not to tree (alternatives to std::map)

This is a discussion on To tree or not to tree (alternatives to std::map) within the C++ Programming forums, part of the General Programming Boards category; I noticed that when working with maps containing a great number of entries (1,000,000+) and requiring a large volume of ...

  1. #1
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,600

    To tree or not to tree (alternatives to std::map)

    I noticed that when working with maps containing a great number of entries (1,000,000+) and requiring a large volume of insertions/deletions, a severe degradation of performance developed ( presumably because of frequent tree-rebalancing). So I switched to (sorted) arrays of pointers - just hoping to eke out a little faster processing time - and as it turns out, the bottleneck completely disappeared! Apparently, even though every single insert/delete required a memcpy/memmove of several magabytes, it was still several orders of magnitude faster than the (relatively) occasional task of rebalancing a large binary tree. So I wonder: why use binary trees at all? Or am I missing something here?
    Code:
    if( numeric_limits< byte >::digits != bits_per_byte )
        error( "program requires bits_per_byte-bit bytes" );
    24bbs.cpp

  2. #2
    Registered User whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    7,522
    map, like a few other containers, does include memory management instructions in the insert and delete operations. The overhead from those is most likely the source of your performance problems, meanwhile, an array (or deque) wouldn't necessarily have the same overhead for millions of elements.

    So rather than ask why use binary trees, maybe a better question is why you couldn't have used a differently designed one. A tree that used a pool of nodes would probably be much faster than map.

  3. #3
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,600
    >> So rather than ask why use binary trees, maybe a better question is why you couldn't have used a differently designed one.

    That is exactly my point! If an array performs better than a tree (and it appears that it does both in search and insertion/deletion efficiency), then why not use the former exclusively when implementing std::maps/sets? Again, I'm not sure if I've examined every angle here (still euphoric ), but from what I can tell there is no other conclusion. Every binary tree has to be rebalanced (expensive) and only a perfectly balanced tree (rare) delivers close to optimal serching. On the other hand, sorted arrays don't suffer from these defects...
    Code:
    if( numeric_limits< byte >::digits != bits_per_byte )
        error( "program requires bits_per_byte-bit bytes" );
    24bbs.cpp

  4. #4
    Registered User whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    7,522
    It's different from a memory pool. A memory pool would still offer you some of the benefits though because you could get a million nodes, link and unlink them at will, and free the pool when your finally done.

    If you implemented a binary tree using an underlying array of hypotheticals, then you inherit the problems that arrays have in becoming elastic. Arrays can only be resized from the ends, so there is substantial shuffling in any event if you needed to insert or delete an item. Which might not incur overhead, but it leads into a completely different problem:

    Hypotheticals might not be a pointer of some sort and probably don't have a convenient way to mark themselves dead so that you can eschew resizing the array to another point in time. In any event, it would be nice if memory management was the responsibility of the class and you would need an algorithm to determine the optimum time to reallocate the array.

    Assuming that you never came up with anything and left it to the programmer to tell his tree to "resize now," actually freeing the element is the expected behavior. I suspect that many would be surprised that they have to routinely do something like this after several "deletions" to get rid of dead weight.

    special::map foo.swap(special::map(foo));

    There is nothing wrong with this but it strikes me as a brittle implementation intended for everyday use, which the STL is meant to be.
    Last edited by whiteflags; 07-25-2008 at 10:51 PM.

  5. #5
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,261
    This was discussed fairly recently. There is an excellent replacement for a map that uses a vector internally in the Loki library, called AssocVector.
    There are downsides though. From the header file:
    Code:
    ////////////////////////////////////////////////////////////////////////////////
    // class template AssocVector
    // An associative vector built as a syntactic drop-in replacement for std::map
    // BEWARE: AssocVector doesn't respect all map's guarantees, the most important
    //     being:
    // * iterators are invalidated by insert and erase operations
    // * the complexity of insert/erase is O(N) not O(log N)
    // * value_type is std::pair<K, V> not std::pair<const K, V>
    // * iterators are random
    ////////////////////////////////////////////////////////////////////////////////
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  6. #6
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,600
    Alas, the speedup was actually due to a miscalculation in the element-shifting routine. Fixing that, the std::map was *much* faster at insertions/deletions.

    <bows to binary tree>
    Code:
    if( numeric_limits< byte >::digits != bits_per_byte )
        error( "program requires bits_per_byte-bit bytes" );
    24bbs.cpp

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Interpreter.c
    By moussa in forum C Programming
    Replies: 4
    Last Post: 05-28-2008, 05:59 PM
  2. Binary Tree, couple questions
    By scoobasean in forum C Programming
    Replies: 3
    Last Post: 03-12-2005, 08:09 PM
  3. Binary Search Trees Part III
    By Prelude in forum A Brief History of Cprogramming.com
    Replies: 16
    Last Post: 10-02-2004, 03:00 PM
  4. Request for comments
    By Prelude in forum A Brief History of Cprogramming.com
    Replies: 15
    Last Post: 01-02-2004, 09:33 AM
  5. BST/Red and Black Tree
    By ghettoman in forum C++ Programming
    Replies: 0
    Last Post: 10-24-2001, 10:45 PM

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21