If you have a Pentium-class or better processor, you can use the RDTSC assembly instruction.
Here's some example code:
Code:
typedef struct _BigInt32
{
__int32 i32[2];
} BigInt32;
typedef struct _BigInt64
{
__int64 i64;
} BigInt64;
typedef union _bigInt
{
BigInt32 int32val;
BigInt64 int64val;
} BigInt;
long long benchmark()
{
BigInt start_ticks, end_ticks;
_asm
{
CPUID
RDTSC
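mov start_ticks.int32val.i32[0], eax // low 32 bits of the counter
mov start_ticks.int32val.i32[4], edx // high 32 bits; in MSVC inline asm [4] is a byte offset, i.e. element i32[1]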
}//force serialization, start the tick count
myfunc();//replace this with your function name
_asm
{
CPUID
RDTSC
mov end_ticks.int32val.i32[0], eax
mov end_ticks.int32val.i32[4], edx
}//force serialization, end the tick count
return(end_ticks.int64val.i64-start_ticks.int64val.i64);
//return the total tick count
}
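If your compiler doesn't accept MSVC-style _asm blocks, roughly the same measurement can be sketched with the __rdtsc() intrinsic, which MSVC declares in <intrin.h> and GCC/Clang declare in <x86intrin.h>. The names HAVE_RDTSC, read_tsc, benchmark_ticks and sample_work below are made up for illustration and are not part of the code above. This sketch also skips the CPUID serialization, so treat the numbers as approximate.

```c
#include <stdint.h>

/* Assumption: only x86/x86-64 targets provide __rdtsc(). */
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>      /* MSVC declaration of __rdtsc() */
#define HAVE_RDTSC 1
#elif defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h>   /* GCC/Clang declaration of __rdtsc() */
#define HAVE_RDTSC 1
#else
#define HAVE_RDTSC 0
#endif

/* Hypothetical helper: read the 64-bit time-stamp counter. */
static uint64_t read_tsc(void)
{
#if HAVE_RDTSC
    return (uint64_t)__rdtsc();
#else
    return 0; /* no RDTSC on this target; results are meaningless here */
#endif
}

/* Hypothetical stand-in for myfunc(): a small loop to time. */
static void sample_work(void)
{
    volatile unsigned sum = 0;
    for (unsigned i = 0; i < 1000; ++i)
        sum += i;
}

/* Return the tick difference around one call to fn. */
static uint64_t benchmark_ticks(void (*fn)(void))
{
    uint64_t start = read_tsc();
    fn();
    uint64_t end = read_tsc();
    return end - start;
}
```

Calling benchmark_ticks(sample_work) gives a tick count for one run. Averaging several runs is usually a good idea, since a single reading can be thrown off by interrupts or by the thread moving to another core.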
Please note that this code isn't entirely mine.
I took the BigInt part from some website and the serialization from the Intel manual. I've found it quite useful, though, and thought I'd share it. Keep in mind that RDTSC isn't available on every processor: any Pentium or later, or any Athlon or later, will have it (I'm not sure about K6-2s).
EDIT: If your compiler doesn't recognize __int64 or __int32, then I'd say just assume a long has 32 bits and a long long has 64 bits. That assumption might not hold when building for AMD64 and AMD FX processors, though, because some 64-bit compilers make long 64 bits wide.
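One way to avoid guessing at type widths, assuming your compiler ships the C99 header <stdint.h>, is to use the fixed-width types uint32_t and uint64_t and to join the two halves with a shift instead of a union. The function name combine_halves is made up for this sketch.

```c
#include <stdint.h>

/* Hypothetical helper: build the 64-bit tick count from the two
   32-bit halves that RDTSC leaves in EAX (low) and EDX (high). */
static uint64_t combine_halves(uint32_t low, uint32_t high)
{
    /* Widen before shifting so the high half is not shifted out. */
    return ((uint64_t)high << 32) | (uint64_t)low;
}
```

Unlike the union, the shift doesn't depend on how the two 32-bit values are laid out in memory.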
EDIT2: I'd say it will take quite a bit longer than 60 cycles. Please note that the width of the processors' pipelines that this code will work on is 32 bits (and maybe 64 bits for AMD 64s and AMD FXs). You would need quite a few cycles to fit that whole number.