
Originally Posted by Salem
There are a multitude of reasons why.
1. Your program is a userland process, so all your clocks are to some extent an approximation.
2. Calling malloc may involve a very expensive trap into the OS to get more memory. This seems quite likely on the very first call.
But you don't know which call is the first: the C startup code that runs before you get to main may already have called malloc.
3. Calling the same code repeatedly may benefit from the code already being in the instruction cache, as opposed to, say, main memory (or even on disk).
So:
- Try it with a user-supplied buffer rather than calling malloc.
- Call it many times in a loop, then divide the total elapsed time by the number of loop iterations to get an average.
The only way to get instruction-accurate measurements is on a bare machine with no OS and no interrupts.
The more stuff you have going on in the background, the fuzzier your results are going to be.