Profiling

This is a discussion on Profiling within the Tech Board forums, part of the Community Boards category; I've written a bayer tree implementation in C for storing key-value pairs (etc). It works pretty well -- eg. it's ...

  1. #1
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300

    Profiling

    I've written a bayer tree implementation in C for storing key-value pairs (etc). It works pretty well -- eg. it's faster than std::map.

    My intention was to then take this and use it in a hash table, such that the hash table would be an array of bayer trees. This allows you to tweak a couple of elements to suit the size of the data set (the array size and the m-level of the trees). When I was writing that, it also occurred to me that since the trees are independent of one another, operations on them could be threaded. So I sort of doubled hashed the table into "realms", where each realm would represent a portion of the whole, and operations in that realm would be done with a single thread. In other words, if you initialized the total table size to 200 and with 200 realms, you'd have one thread for each table (but, nb, the "double hashing" would be irrelevant). Alternately, you could set the total size to 200 with only 10 realms -- as long as the total can be evenly divided into the number of realms, it's all good. While obviously each thread would not get it's own processor, the fewer trees per thread, the better the chance that "consecutive" operations could occur simultaneously, the next one starting before the last one finishes -- whereas with fewer threads, you have more trees per thread and a higher chance that two consecutive operations will take place in the same realm (with no chance of concurrency).

    That seemed pretty clever and I got it all done up to play around and see what settings worked best. Varying the number of realms seems to have little effect (unlike the m-level of the trees). Even worse, the whole table scheme, despite seeming fairly simple, is orders of magnitude slower than just the single bayer tree. So I applied gprof, and here's where I'm stuck:

    Gprof says 95% of the time is spent in the bayer tree functions, not the hash table. For example, building a structure of 80,000 items requires 80,000 bayer tree "add" calls, whether that's one tree or a whole table of them. But using a single tree, those 80,000 adds take < 0.2 seconds. Using a hash table of trees, there are still only 80,000 "add" calls, but they take 1.5 seconds. The hash table "add" function encapsulates the bayer tree add, so I had figured there was a bottleneck there, but apparently not -- it only takes 5% of the time. I had assumed the slight overhead of the encapsulation would be compensated for because the tree adds would be on much smaller simpler trees, thus faster, but instead it seems they are taking longer???

    Something doesn't make sense there. Anyone got any gprof tips or other ideas about how to get to the bottom of this? Like alternative profiling tools to confirm or deny this strange conclusion?
    Last edited by MK27; 06-04-2010 at 07:42 AM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  2. #2
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Looks like this was my somewhat absurd threading scheme, altho I'm still upset that the grof output doesn't reflect the problem -- the threads are all created in the hash table code, removing them solved the speed problem.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. profiling
    By zaac in forum C++ Programming
    Replies: 3
    Last Post: 01-30-2009, 09:06 AM
  2. Profiling for c code in eclipse
    By hai12345 in forum C Programming
    Replies: 2
    Last Post: 10-10-2008, 04:46 PM
  3. Profiling with GPROF
    By RoshanX in forum C Programming
    Replies: 2
    Last Post: 03-30-2007, 02:38 PM
  4. I can't get profiling to work with Visual Studio 6
    By Darkness in forum C++ Programming
    Replies: 7
    Last Post: 12-30-2004, 09:37 PM
  5. Enabling Profiling in VS.NET
    By xds4lx in forum A Brief History of Cprogramming.com
    Replies: 2
    Last Post: 04-17-2002, 01:10 PM

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21