I only got warnings for using %d for time_t, so I ran it as is.
I'm not sure about MSVC, but gcc has a compiler flag that sets how aggressively it inlines.
64-bit doesn't make any difference. I got similar results in 32-bit. It's the compiler.
AMD Athlon II X2 250 (3 GHz).
I will try yours shortly.
EDIT: Your code: 39 seconds.
EDIT2: My code with your integer: 29 seconds.
o_O
Code:
Address      Line  Source              Code Bytes  Timer samples
0x13f3f11e1  181   remain = num % 10;              48.99
One optimization I would do is to precompute a mapping of integers 0-999 to their 3 characters strings.
That would take 1000*3 = ~3KB of memory. No null terminators needed because you know they are exactly 3 characters long.
Then you can do % 1000 instead of % 10. I'm guessing that will make it at least twice as fast.
It will also make the logic simpler (since you want a comma every 3 digits) and eliminate a bunch of branches, which are also slow on modern processors.
And switching to a modern compiler will make this all not matter.
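For anyone who wants to try the 0-999 table idea, here is a rough sketch of how it could look. It is not the code from this thread: `digit_table`, `init_digit_table` and `format_commas` are made-up names, and using `int64_t` for the number is an assumption. The table holds 1000 entries of exactly 3 characters with no null terminators, and the loop does one `% 1000` per group instead of three `% 10` steps.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 1000 entries x 3 chars, no null terminators: "000", "001", ..., "999". */
static char digit_table[1000][3];
static int table_ready = 0;

/* Fill the table once before formatting anything. */
static void init_digit_table(void)
{
    for (int i = 0; i < 1000; i++) {
        digit_table[i][0] = (char)('0' + i / 100);
        digit_table[i][1] = (char)('0' + (i / 10) % 10);
        digit_table[i][2] = (char)('0' + i % 10);
    }
    table_ready = 1;
}

/* Writes num with thousands separators into out and returns out.
   out needs at least 27 bytes for a 64-bit value.
   Hypothetical helper, not the function benchmarked in this thread. */
static char *format_commas(int64_t num, char *out)
{
    assert(out != NULL);
    if (!table_ready)
        init_digit_table();

    /* Take the magnitude as unsigned so INT64_MIN does not overflow. */
    uint64_t mag = num < 0 ? 0 - (uint64_t)num : (uint64_t)num;

    char tmp[32];
    char *p = tmp + sizeof tmp;   /* fill the buffer backwards */
    *--p = '\0';

    /* Copy one full 3-digit group per iteration, preceded by a comma. */
    while (mag >= 1000) {
        p -= 3;
        memcpy(p, digit_table[mag % 1000], 3);
        *--p = ',';
        mag /= 1000;
    }

    /* Leading group: copy only its significant digits (1 to 3). */
    int skip = mag >= 100 ? 0 : (mag >= 10 ? 1 : 2);
    int len = 3 - skip;
    p -= len;
    memcpy(p, digit_table[mag] + skip, (size_t)len);

    if (num < 0)
        *--p = '-';
    strcpy(out, p);
    return out;
}
```

Calling `format_commas(-1234567890, buf)` with a 32-byte `buf` gives `-1,234,567,890`. The only branch left per group is the loop condition, and the leading group is trimmed so no zero padding appears.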
Branching? Slow? I don't buy that argument so much.
Today's processors are pretty clever, and since the loop is pretty deterministic, it should easily be able to avoid branch misprediction.
I'm just speculating, though.
But that idea would probably see some speed gains.
I'm going to try it.
EDIT: 14 seconds vs 29 on x64.
EDIT2: With a slightly larger table and x64, it is possible to reduce the runtime to ~8 seconds (if you are willing to sacrifice ~8 MB memory).
The only problem is that you have to correct the numbers:
Code:
The value of num is: -1234567890
init_time = 1278283073
The formatted value of num is: -001,234,567,890
end_time = 1278283081
The routine test has taken about 8 seconds to perform +000,500,000,000 cycles of the formatting function
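If "correcting the numbers" means getting rid of the zero padding in output like -001,234,567,890 (that is my reading of the output above, not something the post spells out), one possible fix-up pass is sketched below. `strip_zero_padding` is a hypothetical helper and assumes the string is an optional sign followed only by digits and commas.

```c
#include <assert.h>
#include <string.h>

/* Removes leading zero padding from a string such as "-001,234,567,890",
   editing it in place. Hypothetical helper: assumes an optional sign
   followed only by digits and commas. */
static char *strip_zero_padding(char *s)
{
    assert(s != NULL);
    char *dst = s;
    const char *src = s;

    /* Keep the sign where it is and start stripping after it. */
    if (*src == '+' || *src == '-') {
        src++;
        dst++;
    }

    /* Skip padding zeros and commas, but never the last character,
       so an all-zero input still leaves a single "0". */
    while ((*src == '0' || *src == ',') && src[1] != '\0')
        src++;

    /* Shift the remaining characters (and the terminator) left. */
    memmove(dst, src, strlen(src) + 1);
    return s;
}
```

This adds a scan over at most a few characters per call, so it costs a little of the time the bigger table saved.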
Hmm, when you go to 8 MB of memory, I don't see how it can be faster.
8 MB is larger than the L2 cache of most/all CPUs, and the values are pretty much random. Cache misses should be much more expensive than doing the calculations.
The loop is fine. The compiler will probably at least partially unroll it anyways.
I was talking about this one:
Code:
if (count == 3)
It's true once every 3 iterations in each function call (not globally). I'm not sure if the branch predictor is that smart.
But the biggest difference would be the elimination of division (%).