Hello,
I'm copying some memory from one region to another. As I'm dealing with 32-bit colour data, the size of each memory region is a multiple of 4 bytes, so I was wondering whether memcpy(...), which as far as I know copies a single byte at a time, is really the best choice for the task at hand.
Might I be better off using a bit of inline asm involving MOVSD (Move a doubleword string)? I gather there's been a bit of argument about this on the net, as some people say that some CPUs don't always do it faster.
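
To make the question concrete, here is a rough sketch in plain C of the two approaches I'm comparing. The function names are made up for illustration, the loop is only my approximation of what a REP MOVSD copy would do (one doubleword per step), and I haven't timed either version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy `count` 32-bit pixels with the library memcpy.
   How memcpy moves the bytes internally is up to the implementation. */
void copy_pixels_memcpy(uint32_t *dst, const uint32_t *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, count * sizeof *dst);
}

/* Hypothetical helper: copy one doubleword (4 bytes) per iteration,
   which is roughly the work a REP MOVSD loop would perform.
   Like memcpy, this assumes the two regions do not overlap. */
void copy_pixels_dword(uint32_t *dst, const uint32_t *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < count; ++i)
        dst[i] = src[i];
}
```

Both take a pixel count rather than a byte count, so copying a 640x480 image would be a call like copy_pixels_dword(dst, src, 640 * 480).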