Thread: Which is faster?

  1. #1
    Unregistered User Yarin's Avatar
    Join Date
    Jul 2007
    Posts
    2,158

    Which is faster?

    Which one is faster?
    Code:
    int a;
    for(a = 0; a < 100; a++)
       memory[a] = input[a];
    Code:
    CopyMemory(memory, input, 100 * sizeof(input[0]));

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Probably the latter.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  3. #3
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    The latter. The compiler might have some tricks up its sleeve to optimize it.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  4. #4
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Why not try both and find out?

  5. #5
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,195
    Definitely the latter. Most compilers generate string instructions for block memory copies. The former solution makes the program compute the index into each array on every iteration, which costs clock cycles. If you have some control over the variables, there are inline assembly routines that are as fast as possible.

    Code:
     
    DWORD ByteCount = 100 * sizeof(input[0]);
     
    __asm {
     
    mov esi, input
    mov edi, memory
    mov ecx, ByteCount
     
    rep movsb
     
    }

  6. #6
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by abachler
    Definitely the latter. Most compilers generate string instructions for block memory copies. The former solution makes the program compute the index into each array on every iteration, which costs clock cycles. If you have some control over the variables, there are inline assembly routines that are as fast as possible.

    Code:
     
    DWORD ByteCount = 100 * sizeof(input[0]);
     
    __asm {
     
    MOV esi, input
    MOV edi, memory
    MOV ecx, ByteCount
     
    REP MOVSB
     
    }
    Surely you would want to use MOVSD at the very least. Something like this is what the compiler usually comes up with:
    Code:
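    mov  esi, input       ; source pointer
    mov  edi, memory      ; destination pointer
    mov  ecx, ByteCount   ; total number of bytes to copy
    mov  edx, ecx
    and  edx, 3           ; edx = leftover bytes (ByteCount mod 4)
    shr  ecx, 2           ; ecx = number of whole dwords
    rep  movsd            ; copy four bytes at a time
    mov  ecx, edx
    rep  movsb            ; copy the remaining 0-3 bytes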
    That would probably execute roughly four times faster than abachler's code for any copy of more than a dozen bytes or so.
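    For anyone who does not read x86 assembly, here is a rough C rendering of the same split: whole 4-byte chunks first, then the leftover bytes. The function name copy_dwords_then_bytes is made up for illustration, and this is only a sketch of the idea, not what any particular compiler emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy in 4-byte chunks, then the leftover bytes,
   mirroring the "rep movsd" + "rep movsb" split shown above. */
static void copy_dwords_then_bytes(void *dst, const void *src, size_t byte_count)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    size_t dwords = byte_count >> 2;   /* shr ecx, 2 */
    size_t tail   = byte_count & 3;    /* and edx, 3 */

    assert(dst != NULL && src != NULL);

    while (dwords--) {                 /* rep movsd */
        uint32_t chunk;
        /* Fixed-size memcpy sidesteps alignment and aliasing problems. */
        memcpy(&chunk, s, sizeof chunk);
        memcpy(d, &chunk, sizeof chunk);
        s += 4;
        d += 4;
    }
    while (tail--)                     /* rep movsb */
        *d++ = *s++;
}
```

    In practice a plain memcpy call is still the better choice; the sketch is only meant to show where the shift and the mask come from.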

    But let the compiler deal with it; that is by far the best option. If you REALLY want a fast memcpy, you need much more advanced techniques to make the most of the CPU: uncacheable writes (only if the memory area is large, not on small copies, but we know the size here, so that is easy to decide) and SSE registers (except in kernel mode, where saving and restoring the SSE registers makes a nuisance of itself).
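    To give a feel for what "uncacheable writes" and "SSE registers" can look like in C, here is a minimal sketch using the SSE2 intrinsics from emmintrin.h. The name copy_streaming and the HAVE_SSE2_COPY macro are made up for this example, it falls back to memcpy when SSE2 is not available, and whether it beats memcpy at all is something you would have to measure on large buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>   /* SSE2 intrinsics */
#define HAVE_SSE2_COPY 1
#else
#define HAVE_SSE2_COPY 0
#endif

/* Hypothetical helper intended for LARGE buffers only. */
static void copy_streaming(void *dst, const void *src, size_t byte_count)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    assert(dst != NULL && src != NULL);

#if HAVE_SSE2_COPY
    /* Streaming stores need a 16-byte-aligned destination,
       so copy single bytes until we reach one. */
    while (byte_count > 0 && ((uintptr_t)d & 15) != 0) {
        *d++ = *s++;
        byte_count--;
    }
    while (byte_count >= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)s); /* unaligned load */
        _mm_stream_si128((__m128i *)d, v);               /* store that bypasses the cache */
        s += 16;
        d += 16;
        byte_count -= 16;
    }
    _mm_sfence();  /* order the streaming stores before later stores */
#endif
    memcpy(d, s, byte_count);  /* tail bytes, or the whole copy without SSE2 */
}
```

    For a 100-element array like the one in the original question this is almost certainly slower than memcpy; bypassing the cache only starts to make sense when the destination is too big to stay in cache anyway.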

  7. #7
    Unregistered User Yarin's Avatar
    Join Date
    Jul 2007
    Posts
    2,158
    Okay, good to know.

  8. #8
    Super unModrator
    Join Date
    Dec 2007
    Posts
    321
    Quote Originally Posted by cpjust
    Why not try both and find out?
    How exactly do I run two pieces of code and find out which one is faster?

  9. #9
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by abk
    How exactly do I run two pieces of code and find out which one is faster?
    Write a set of functions, each using different methods for solving the same problem.
    Then make a loop that runs each method [1, 2, 3, etc.] for X amount of time (or for X iterations), and calculate the "number of loops per second". You probably want to use clock() to get reasonably precise timing, and CLOCKS_PER_SEC to convert the result into a useful measure. It's a good idea to run each method for at least a couple of seconds.

    The one that runs the most loops per second is the fastest one.

    In this case, I would also run variations with small and larger amounts of data.
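    A minimal sketch of such a harness might look like this. The names copy_loop, copy_memcpy, loops_per_second and sink are made up for the example, and memcpy stands in for CopyMemory so that it builds outside Windows too.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

enum { COUNT = 100 };                 /* element count from the original question */
static int input[COUNT], memory[COUNT];
static volatile int sink;             /* read the result so the copies are not dead code */

/* Method 1: the hand-written loop. */
static void copy_loop(void)
{
    int a;
    for (a = 0; a < COUNT; a++)
        memory[a] = input[a];
}

/* Method 2: the library block copy. */
static void copy_memcpy(void)
{
    memcpy(memory, input, sizeof input);
}

/* Run method() for roughly `seconds` of CPU time; return calls per second. */
static double loops_per_second(void (*method)(void), double seconds)
{
    unsigned long loops = 0;
    clock_t start = clock(), now;

    assert(method != NULL && seconds > 0.0);
    do {
        int i;
        for (i = 0; i < 1000; i++)    /* batch the calls so clock() is not the bottleneck */
            method();
        loops += 1000;
        sink = memory[COUNT - 1];
        now = clock();
    } while ((double)(now - start) / CLOCKS_PER_SEC < seconds);

    return loops / ((double)(now - start) / CLOCKS_PER_SEC);
}
```

    Call loops_per_second(copy_loop, 2.0) and loops_per_second(copy_memcpy, 2.0) from main and print both numbers. Treat the results with some suspicion: an optimizing compiler may still simplify a benchmark this small, so check the generated code if the numbers look too good.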

  10. #10
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    And with compile-time-known sizes and runtime-known sizes. Also, see if VC++ supports profile-driven optimization.
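    The two size variants could be set up roughly like this. The names copy_fixed and copy_runtime are made up for the example, and what the compiler actually does with each one depends on the compiler and its flags, so this is only a starting point for measuring.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { FIXED_COUNT = 100 };

/* Size is a compile-time constant, so the compiler is free to
   expand the copy inline instead of calling the library routine. */
static void copy_fixed(int *dst, const int *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, FIXED_COUNT * sizeof *src);
}

/* Size is only known at run time, so the compiler usually has to
   emit code (or a call) that handles any length. */
static void copy_runtime(int *dst, const int *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, count * sizeof *src);
}
```

    When timing copy_runtime, take count from something the compiler cannot see through, such as a command-line argument. Otherwise it may inline the call and treat the size as a constant after all.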

  11. #11
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    And do not forget to profile the optimized build; otherwise the measurements are of no use.
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler
