Thread: Profiling my code

  1. #1
    Registered User
    Join Date
    Feb 2008
    Posts
    147

    Profiling my code

    Hi,
    I use MinGW, and I have written a program that includes my own profiling system. I can't use gcc's profiler because my program has threads and gprof doesn't support threads, although of the two threads I have, I only want to measure one.

    My system works like this: the first instruction in each function is a call that starts that function's time counter. When the function exits, a call is made to stop the counter. If another profiled function is called in between, the caller's counter is stopped and pushed onto a stack, and a new counter is started. When the callee exits, its counter is stopped and the caller's counter is switched back on.
    Something like this:

    Code:
    void a2(int x);  /* forward declaration, since a1 calls a2 */

    void a1(int x) {
        PROFILE_START("a1");
        if (x == 1) {
            a2(x); /* a1's counter is stopped here and restarted when a2 returns */
            PROFILE_STOP("a1");
            return;
        } else {
            x += 1;
        }
        PROFILE_STOP("a1");
    }

    void a2(int x) {
        PROFILE_START("a2");
        x += 5;
        PROFILE_STOP("a2");
    }
    I use clock() to measure all this. Do you see anything wrong with this approach? Do you think the results are reasonable?
    My tests show coherent results, but I am not sure whether clock() is a fine-grained enough timer. Do you think small functions are measured correctly?
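    For readers following along, here is a minimal sketch of how a stack-based scheme like the one described above could be put together. It is not the poster's actual implementation: the names profile_start, profile_stop, lookup, charge_top, MAX_DEPTH and MAX_FUNCS are all hypothetical, and the fixed-size tables are an assumption made to keep it short. It uses clock(), as in the post, and is single-threaded only.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

#define MAX_DEPTH 64   /* hypothetical limit on nested profiled calls */
#define MAX_FUNCS 32   /* hypothetical limit on distinct function names */

typedef struct { const char *name; clock_t total; } Entry;

static Entry   entries[MAX_FUNCS];  /* one accumulator per function name */
static int     entry_count;
static Entry  *stack[MAX_DEPTH];    /* active functions, innermost last */
static int     depth;
static clock_t last_mark;           /* time of the most recent start/stop */

/* Find or create the accumulator for a function name. */
static Entry *lookup(const char *name) {
    for (int i = 0; i < entry_count; i++)
        if (strcmp(entries[i].name, name) == 0) return &entries[i];
    assert(entry_count < MAX_FUNCS);
    entries[entry_count].name = name;
    entries[entry_count].total = 0;
    return &entries[entry_count++];
}

/* Charge the time since the last event to the function on top of the stack. */
static void charge_top(void) {
    clock_t now = clock();
    if (depth > 0) stack[depth - 1]->total += now - last_mark;
    last_mark = now;
}

static void profile_start(const char *name) {
    charge_top();                   /* pauses the caller's counter */
    assert(depth < MAX_DEPTH);
    stack[depth++] = lookup(name);
}

static void profile_stop(const char *name) {
    (void)name;                     /* kept only to mirror the macros in the post */
    assert(depth > 0);
    charge_top();                   /* closes the callee's interval */
    depth--;                        /* caller is on top again, so it resumes */
}

#define PROFILE_START(n) profile_start(n)
#define PROFILE_STOP(n)  profile_stop(n)

static void a2(int x) { PROFILE_START("a2"); x += 5; PROFILE_STOP("a2"); }

static void a1(int x) {
    PROFILE_START("a1");
    if (x == 1) a2(x);
    PROFILE_STOP("a1");
}
```

    After calling a1(1), entries[i].total holds the ticks charged to each function, excluding time spent in profiled callees. With a timer as coarse as clock(), very short functions will mostly accumulate zero ticks.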

    thx all.

  2. #2
    Woof, woof! zacs7's Avatar
    Join Date
    Mar 2007
    Location
    Australia
    Posts
    3,459
    clock() only has second resolution*, that's a bit small in most cases for profiling. A system specific high-res timer will give better results.

    Well I was wrong

    Still, clock() doesn't have to be accurate, and may even have a resolution of 1 second, e.g. on a 1 Hz CPU
    Last edited by zacs7; 07-08-2008 at 04:51 AM.

  3. #3
    Ugly C Lover audinue's Avatar
    Join Date
    Jun 2008
    Location
    Indonesia
    Posts
    489
    See:
    http://cboard.cprogramming.com/showthread.php?t=104873

    Btw, what does profiling with gprof (or a similar tool) actually mean?

  4. #4
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by zacs7 View Post
    clock() only has second resolution, that's a bit small in most cases for profiling. A system specific high-res timer will give better results.
    Actually, clock() is better than 1s, generally in the 1-10ms range.

    It will probably not be good enough for this sort of measurement.
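    One way to check what clock() actually delivers on a given system is to spin until its value changes and record the smallest step seen. The sketch below does that; the function name clock_granularity_seconds is hypothetical, and the result depends on the C library and OS, so treat it as a rough measurement rather than a guarantee.

```c
#include <assert.h>
#include <time.h>

/* Smallest step observed between two distinct clock() values, in seconds.
   Busy-waits until clock() changes, so it burns CPU for a short while. */
static double clock_granularity_seconds(int samples) {
    assert(samples > 0);
    clock_t smallest = 0;
    for (int i = 0; i < samples; i++) {
        clock_t start = clock();
        clock_t next;
        do {
            next = clock();
        } while (next == start);        /* spin until the value changes */
        clock_t step = next - start;
        if (i == 0 || step < smallest) smallest = step;
    }
    return (double)smallest / CLOCKS_PER_SEC;
}
```

    If the reported step is much larger than the run time of the functions being profiled, most of their individual measurements will come out as zero.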

    If we restrict the usage to gcc/x86, then you could use the RDTSC instruction as inline assembler. See: http://en.wikipedia.org/wiki/RDTSC
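    As a rough illustration of the inline assembler approach, assuming gcc (or MinGW) targeting x86 or x86-64: the helper names read_tsc, cycles_for and busy_work are hypothetical, and no serializing instruction is used, so the CPU may reorder work around the reads.

```c
#include <assert.h>
#include <stdint.h>

#if !defined(__GNUC__) || !(defined(__i386__) || defined(__x86_64__))
#error "This sketch assumes gcc-style inline assembly on x86"
#endif

/* Read the 64-bit time-stamp counter. RDTSC returns the low half in EAX
   and the high half in EDX. */
static inline uint64_t read_tsc(void) {
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}

/* Cycles spent in fn(), including the overhead of the two reads. */
static uint64_t cycles_for(void (*fn)(void)) {
    assert(fn != 0);
    uint64_t begin = read_tsc();
    fn();
    return read_tsc() - begin;
}

/* Hypothetical workload, only here so there is something to measure. */
static void busy_work(void) {
    volatile int sum = 0;
    for (int i = 0; i < 1000; i++) sum += i;
}
```

    A call such as cycles_for(busy_work) returns a raw cycle count; it still has to be converted to seconds using the counter's frequency.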

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  5. #5
    Registered User
    Join Date
    Feb 2008
    Posts
    147
    Quote Originally Posted by matsp View Post
    Actually, clock() is better than 1s, generally in the 1-10ms range.

    It will probably not be good enough for this sort of measurement.

    If we restrict the usage to gcc/x86, then you could use the RDTSC instruction as inline assembler. See: http://en.wikipedia.org/wiki/RDTSC

    --
    Mats
    This looks like a precise system, but to convert the results to seconds I need to know the processor frequency. That is not a problem: I can measure the cycles counted across a known interval (using sleep). But what if the machine has more than one processor? How does this work then?
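    The calibration idea can be sketched as follows: read the counter, wait a known interval, read it again, and divide. This is only an illustration under several assumptions. It busy-waits on clock() instead of calling sleep so that it stays within standard C, the names tick_source, estimate_ticks_per_second, ticks_to_seconds and fake_ticks are hypothetical, and fake_ticks is a stand-in counter (1000 ticks per clock() unit) where a real program would pass an RDTSC reader.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

typedef uint64_t (*tick_source)(void);

/* Estimate how many ticks `read_ticks` advances per second by sampling it
   at both ends of an interval timed with clock(). */
static double estimate_ticks_per_second(tick_source read_ticks, double interval_s) {
    assert(read_ticks != 0 && interval_s > 0.0);
    clock_t wait = (clock_t)(interval_s * CLOCKS_PER_SEC);
    assert(wait > 0);
    clock_t c0 = clock();
    uint64_t t0 = read_ticks();
    clock_t c1;
    do {
        c1 = clock();
    } while (c1 - c0 < wait);           /* busy-wait for the interval */
    uint64_t t1 = read_ticks();
    double seconds = (double)(c1 - c0) / CLOCKS_PER_SEC;
    return (double)(t1 - t0) / seconds;
}

/* Convert a tick difference to seconds using a calibrated rate. */
static double ticks_to_seconds(uint64_t ticks, double ticks_per_second) {
    assert(ticks_per_second > 0.0);
    return (double)ticks / ticks_per_second;
}

/* Stand-in counter for illustration: 1000 ticks per clock() unit. */
static uint64_t fake_ticks(void) {
    return (uint64_t)clock() * 1000u;
}
```

    A longer interval gives a steadier estimate. Note that on CPUs that change clock speed, a calibration done once may not hold for the whole run.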

  6. #6
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    RDTSC is not recommended, due to multi-core issues.
    Use an appropriate API instead (such as QueryTickCounter on Windows).
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  7. #7
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by Kempelen View Post
    This looks like a precise system, but to convert the results to seconds I need to know the processor frequency. That is not a problem: I can measure the cycles counted across a known interval (using sleep). But what if the machine has more than one processor? How does this work then?
    Each processor (core) has its own TSC, so if you have more than one processor (core), they will (potentially) have different TSC values. For AMD processors on Windows, you can use the "dual core optimizer" to ensure that they are reasonably in sync. The worst-case scenario is a thread switching from one processor (core) to another between two readings. This can lead to the time being off by a rather large amount. There is no easy solution to this problem.

    Another option, though with noticeably higher overhead, is the Windows QueryPerformanceCounter call. It is not as fine-grained, and it may still use the TSC internally, but I believe it is less likely to fluctuate depending on, for example, which processor you are running on at the time of the call. There is a sister function, QueryPerformanceFrequency, that reports the counter's frequency in counts per second.
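    A minimal sketch of the QueryPerformanceCounter route, for illustration only. The wrapper name now_seconds is hypothetical. The non-Windows branch is an added assumption so the sketch also builds elsewhere; it uses POSIX clock_gettime and is not something discussed in this thread.

```c
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime on POSIX systems */
#endif
#include <assert.h>

#ifdef _WIN32
#include <windows.h>

/* Seconds since an arbitrary starting point, via the performance counter. */
static double now_seconds(void) {
    LARGE_INTEGER freq, count;
    BOOL ok = QueryPerformanceFrequency(&freq);   /* counts per second */
    assert(ok && freq.QuadPart > 0);
    (void)ok;
    QueryPerformanceCounter(&count);
    return (double)count.QuadPart / (double)freq.QuadPart;
}
#else
#include <time.h>

/* Non-Windows stand-in (an assumption for portability). */
static double now_seconds(void) {
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}
#endif
```

    Taking now_seconds() before and after a region and subtracting gives the elapsed time directly in seconds, with no separate calibration step.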

    --
    Mats

  8. #8
    Woof, woof! zacs7's Avatar
    Join Date
    Mar 2007
    Location
    Australia
    Posts
    3,459
    > such as QueryTickCounter on Windows
    Making up functions now are we?

  9. #9
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by Elysia View Post
    RDTSC is not recommended, due to multi-core issues.
    Use an appropriate API instead (such as QueryTickCounter on Windows).
    I wrote a slightly longer answer. Both techniques are valid and have their advantages. RDTSC is MUCH faster, as it completes in a couple of clock cycles (not counting synchronization instructions). However, it is also less dependable, since readings can jump when the thread moves between cores. It's a balance. If profiling single functions with a few lines of code is the main target, then RDTSC would be the preferred solution, even if it "misses" sometimes (you can most likely filter those few occurrences out and still get valid statistics).
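    One possible way to filter out the occasional "miss" is to discard samples that are far above the median before averaging. The sketch below is just one such scheme, not something prescribed in this thread; the names compare_u64 and filtered_mean and the factor parameter are hypothetical. A backwards jump in the counter shows up as a huge unsigned difference, so it is dropped by the same test.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Ordering function for qsort on uint64_t values. */
static int compare_u64(const void *a, const void *b) {
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Mean of the samples that are at most `factor` times the median.
   Sorts `samples` in place. Assumes median * factor does not overflow. */
static double filtered_mean(uint64_t *samples, size_t count, uint64_t factor) {
    assert(samples != 0 && count > 0 && factor > 0);
    qsort(samples, count, sizeof samples[0], compare_u64);
    uint64_t median = samples[count / 2];
    uint64_t limit = median * factor;
    double sum = 0.0;
    size_t kept = 0;
    /* The array is sorted, so stop at the first sample above the limit. */
    for (size_t i = 0; i < count && samples[i] <= limit; i++) {
        sum += (double)samples[i];
        kept++;
    }
    return sum / (double)kept;
}
```

    For example, with cycle counts {100, 102, 98, 101, 99, 5000000} and a factor of 4, the last sample is discarded and the mean of the remaining five is 100.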

    For functions that take longer (several microseconds or more), including system calls that may block (which obviously fall into the "microseconds or more" category when they do block), QueryPerformanceCounter is definitely the preferred method.

    --
    Mats

  10. #10
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Create a separate test which allows you to profile one thread at a time (or just the thread in question).

    gprof in general will tell you a lot more than your homebrew approach, which will riddle the code with your profiling inserts.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  11. #11
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by zacs7 View Post
    > such as QueryTickCounter on Windows
    Making up functions now are we?
    Yup.
    Couldn't remember the name off the top of my head and didn't want to bother looking it up, so QueryTickCounter it is
