Thread: benchmarking?

  1. #1
    Registered User
    Join Date
    Jan 2008
    Posts
    69

    benchmarking?

    I'm interested in doing some benchmarking, but am new to the concept of doing so. Any pointers? What do you guys use to measure the running time of your algorithms/functions in C? Do you use tools?

  2. #2
    Woof, woof! zacs7's Avatar
    Join Date
    Mar 2007
    Location
    Australia
    Posts
    3,459
I'm certainly no expert on the matter.

    However, I will put my 2 cents in on how I do it (I only really need an approximation).

On Linux I use time to measure the running time of my program (usually I 'benchmark' algorithms in one program by themselves -- that eliminates the need for 'internal timing'). Otherwise I'd use time() or a more accurate time function such as gettimeofday(). However, be aware of caching: if you time an application once and then do it again, the second run is probably going to be quicker because things are cached.
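
    If you do need 'internal timing', something along these lines is a rough sketch of what I mean (this one uses clock_gettime() with CLOCK_MONOTONIC, one of the more accurate options on Linux; assumes POSIX, and on older glibc you may need to link with -lrt):
    Code:
    #include <stdio.h>
    #include <time.h>

    /* Rough sketch of internal timing: wall-clock time around a section of
       code, using clock_gettime() with CLOCK_MONOTONIC. */
    int main(void)
    {
        struct timespec start, end;
        volatile long sum = 0;
        long i;

        clock_gettime(CLOCK_MONOTONIC, &start);

        for (i = 0; i < 10000000L; i++)   /* the work being measured */
            sum += i;

        clock_gettime(CLOCK_MONOTONIC, &end);

        printf("elapsed: %.6f seconds\n",
               (end.tv_sec - start.tv_sec) +
               (end.tv_nsec - start.tv_nsec) / 1e9);
        return 0;
    }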

    I'm sure someone will reinforce what I've said, or shoot it out of the sky

  3. #3
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    The simple question leads to a long answer

Benchmarking and performance improvement is an iterative process; you have to go through the same steps many times.

    The first thing to consider is:
    * What are you trying to achieve?
Is the goal to improve the performance of your own code, or to produce something you can give to customers to measure the performance of their systems? What you need from the benchmark depends on which goal you choose.

    * Micro or macro benchmark?
    Do you want to measure the performance of the WHOLE application, or individual functions?
    There are benefits in both. In the former, you probably need some sort of script or similar to "run" the application, or a suitable "typical" input file, perhaps.
In the latter case, you will have to write some code/script to exercise specific functions (e.g. draw the same picture hundreds of times on the screen, output 100MB to a file, or whatever you want to measure). It's also worth noting that some forms of micro-benchmarks can lead to "over-optimization" - a waste of time, and sometimes an overall reduction in performance. Over-inlining is a typical example: a particular function gets too large and throws other bits of code out of the cache, which harms the overall performance.

    After all, your "customers" will never run your code function by function, they will run the overall product - and that's what they require to run sufficiently fast.


    Tools:
First of all, you need something to tell you what the current performance is. In its simplest form, that's just a stopwatch and timing the application from start to finish. To depend less on the finger-speed of the benchmarker, adding some code to the application to show the time taken, and perhaps "loops per second" or "bytes per second" or such, will help a lot.
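
    For example, a crude "loops per second" report can look something like this (just a sketch, using clock() and a placeholder workload):
    Code:
    #include <stdio.h>
    #include <time.h>

    /* Crude "loops per second" report: run a fixed number of iterations of
       the workload and print the rate. Only a sketch of the idea. */
    int main(void)
    {
        const long loops = 50000000L;
        volatile double x = 0.0;       /* volatile: stop the compiler removing the loop */
        long i;
        clock_t start = clock();

        for (i = 0; i < loops; i++)
            x += i * 0.5;              /* placeholder for the real work */

        double seconds = (double)(clock() - start) / CLOCKS_PER_SEC;
        printf("%ld loops in %.3f s = %.0f loops/second\n",
               loops, seconds, loops / seconds);
        return 0;
    }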

Once you have the "current performance data", you obviously will want to do something about it. The first step is probably to make sure you have the right switches for the compiler. A debug build with no optimization can easily be 2-3x slower than a release build with full optimization. For large apps, it's also worth considering "optimize for size" instead of "optimize for speed" - but measure on YOUR application, because there is no set rule here. Note also that once you have changed the code sufficiently, a different set of compiler options may turn out to be best, compared to the first step.

    During my experiments to improve things, I use Excel to record what I've done, what the result was and compare improvements.

    The second step is to figure out where the code is spending time. Here is where a profiler comes in. There are two basic kinds:
    1. Sampling profiler.
Interrupts in the system are instrumented to record where in the code execution was when the interrupt was taken. Given sufficient runtime of the application, you should get a pretty good idea of where the app spent its time. VTune, CodeAnalyst and oprofile are tools in this range, as well as "writing your own" if you have an embedded system.

    2. Compiler helped profiler.
This involves the compiler adding extra code to the application that records the execution time, and the number of times it's hit, for each portion of code. I'm sure MS has such a feature, but I've never used it; with gcc it's "-pg" to generate the extra code, and "gprof" to get the data presented in a readable form.
    You can of course do this yourself too, and for "only some" functions - just create an array with entries for as many functions as you think you need, and use a macro to indicate start and end of the function call, adding up the time taken for the function.
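
    A bare-bones sketch of that idea (the slot names and functions here are only for illustration):
    Code:
    #include <stdio.h>
    #include <time.h>

    /* Do-it-yourself profiling: one slot per function of interest, a macro
       pair marking entry and exit, CPU time added up per slot. */
    enum { FUNC_SORT, FUNC_PARSE, FUNC_COUNT };

    static clock_t total[FUNC_COUNT];

    #define PROF_ENTER(id)  clock_t prof_start_ = clock()
    #define PROF_LEAVE(id)  total[(id)] += clock() - prof_start_

    static void do_sort(void)
    {
        PROF_ENTER(FUNC_SORT);
        /* ... the real work would go here ... */
        PROF_LEAVE(FUNC_SORT);
    }

    int main(void)
    {
        int i;
        for (i = 0; i < 1000; i++)
            do_sort();
        printf("do_sort: %.3f s total\n",
               (double)total[FUNC_SORT] / CLOCKS_PER_SEC);
        return 0;
    }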

    Just be aware that adding this sort of code also slows the function down quite a bit.

    So we now have some sort of histogram or "top-ten" list of where the time is spent. Take a look at the top one - it's quite often a LARGE proportion of time spent in one function, so it's usually easy. Now look at that function - how can it be improved for speed? Is it doing a linear search, where keeping a list sorted and using a binary search would make it 8, 20 or 100x faster?
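
    As a sketch of that kind of change: a linear scan versus the standard bsearch() on a sorted array (illustrative only):
    Code:
    #include <stdlib.h>

    /* The classic change: linear scan versus binary search on a sorted
       array. Both return a pointer to the element, or NULL if not found. */
    static int cmp_int(const void *a, const void *b)
    {
        int x = *(const int *)a, y = *(const int *)b;
        return (x > y) - (x < y);
    }

    /* O(n): looks at elements one by one */
    int *find_linear(int *arr, size_t n, int key)
    {
        size_t i;
        for (i = 0; i < n; i++)
            if (arr[i] == key)
                return &arr[i];
        return NULL;
    }

    /* O(log n): same result on a sorted array */
    int *find_binary(int *arr, size_t n, int key)
    {
        return bsearch(&key, arr, n, sizeof *arr, cmp_int);
    }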

    Sometimes you find that the code isn't spending a lot of time anywhere, just small bits in lots and lots of functions. This makes it hard. It can be a hint that perhaps you have too many really small functions. Or that it's hard to optimize this code in general - perhaps it's already pretty well optimized?

    Finally, you have to "stop somewhere". For large projects, it's almost always possible to continue "forever" to improve the performance, but you have to set a goal, or set a time-limit for your improvements.

    I'm sure others have other ideas.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  4. #4
    Fountain of knowledge.
    Join Date
    May 2006
    Posts
    794
    Quote Originally Posted by cs32 View Post
    I'm interested in doing some benchmarking, but am new to the concept of doing so. Any pointers? What do you guys use to measure the running time of your algorithms/functions in C? Do you use tools?

I would say the easiest way is to write a simple test 'stub' program which calls the
algorithms/functions a set number of times and then simply time it with a clock or
wrist watch or whatever.
Then change the number of times it is called from, say, 10 to 100 to 1000 or 1,000,000
or whatever, so you get a run you can easily measure - one that takes, for example, about
10 minutes or so. Then just divide the time by the number of iterations.
Obviously there may be a slight overhead in loading the program, but that will be constant
for every run. So you might get results like

    10 runs + constant = D1 seconds
    100 runs + constant = D2 seconds
    1000 runs + constant = D3 seconds
    100,000 runs + constant = D4 seconds etc..

So it will be fairly simple to work out what the constant value is, and hence the run time,
accurately. Anyhow, after 10 minutes the load time will be negligible anyway.


Also, this will help smooth out the results, as the run time will be affected by other
activities your computer has to perform, such as processing spurious network activity, etc.
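
    For example, something along these lines (just a sketch, with a do-nothing placeholder standing in for the real function; the iteration count comes from the command line):
    Code:
    #include <stdio.h>
    #include <stdlib.h>

    /* Test 'stub': call the function under test N times (N from the command
       line), then time the whole run from outside and divide by N. */
    static double function_under_test(double x)
    {
        return x * x + 1.0;            /* stands in for the real algorithm */
    }

    int main(int argc, char *argv[])
    {
        long n = (argc > 1) ? atol(argv[1]) : 1000000L;
        volatile double sink = 0.0;    /* keep the loop from being optimized away */
        long i;

        for (i = 0; i < n; i++)
            sink += function_under_test((double)i);

        printf("did %ld iterations\n", n);
        return 0;
    }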

  5. #5
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
matsp, thanks for the overview. I've been meaning to profile my C++ code in Xcode on a Mac. I know it can be done, but haven't dug into it yet. I'm again motivated. Thanks!

    Todd
    Mainframe assembler programmer by trade. C coder when I can.

  6. #6
    uint64_t...think positive xuftugulus's Avatar
    Join Date
    Feb 2008
    Location
    Pacem
    Posts
    355
    On i686 and later:
Code:
    #include <stdint.h>

    /* Read the CPU's time-stamp counter. */
    uint64_t rdtsc(void)
    {
        uint32_t lo, hi;

        /* We cannot use "=A", since this would use %rax on x86_64 */
        __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
        return (uint64_t)hi << 32 | lo;
    }
This measures the clock cycles elapsed since the computer booted.
Simply subtract two consecutive readings and divide by your CPU frequency.
It doesn't get more accurate than that.
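
    For example (a sketch only - the 2.0 GHz figure is a made-up clock speed, substitute your CPU's):
    Code:
    #include <stdio.h>
    #include <stdint.h>

    /* rdtsc() as in the post above */
    uint64_t rdtsc(void)
    {
        uint32_t lo, hi;
        __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
        return (uint64_t)hi << 32 | lo;
    }

    int main(void)
    {
        const double cpu_hz = 2.0e9;   /* made-up clock speed - use your CPU's */
        volatile long sum = 0;
        uint64_t before, after;
        long i;

        before = rdtsc();
        for (i = 0; i < 1000000L; i++)
            sum += i;                  /* the code being measured */
        after = rdtsc();

        printf("%llu cycles = %.6f seconds at %.1f GHz\n",
               (unsigned long long)(after - before),
               (after - before) / cpu_hz, cpu_hz / 1e9);
        return 0;
    }
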
    Code:
    ...
        goto johny_walker_red_label;
    johny_walker_blue_label: exit(-149$);
    johny_walker_red_label : exit( -22$);
    A typical example of ...cheap programming practices.

