If you have a Pentium-class or better processor, you can use the RDTSC assembly instruction.
Here's some example code (please note that it isn't entirely mine):
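//32-bit MSVC-style inline assembly
typedef struct _BigInt32 { __int32 i32[2]; } BigInt32;
typedef struct _BigInt64 { __int64 i64; } BigInt64;
typedef union _bigInt { BigInt32 int32val; BigInt64 int64val; } BigInt;
long long benchmark()
{
    BigInt start_ticks, end_ticks;
    __asm {
        cpuid
        rdtsc
        mov start_ticks.int32val.i32[0], eax //low 32 bits
        mov start_ticks.int32val.i32[4], edx //high 32 bits (byte offset 4; inline asm doesn't scale the index)
    }//force serialization, start the tick count
    myfunc();//replace this with your function name
    __asm {
        cpuid
        rdtsc
        mov end_ticks.int32val.i32[0], eax //low 32 bits
        mov end_ticks.int32val.i32[4], edx //high 32 bits
    }//force serialization, end the tick count
    //return the total tick count
    return end_ticks.int64val.i64 - start_ticks.int64val.i64;
}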
I took the BigInt part from some website and the serialization from the Intel manual, among other places. I've found it quite useful and thought I'd share it. Keep in mind that it only works on processors that have the instruction: any Pentium or later, or any Athlon or later (I'm not sure about K6-2s).
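Since the instruction isn't on every processor, it may be worth checking for it first. CPUID leaf 1 returns feature flags in EDX, and bit 4 is the TSC flag. Here's a minimal sketch of just the bit test; has_tsc is a made-up helper name, and how you get the EDX value (inline asm, a compiler intrinsic, etc.) is left out and depends on your compiler.

```c
#include <stdint.h>

/* CPUID leaf 1 reports feature flags in EDX; bit 4 is the TSC flag. */
#define CPUID_EDX_TSC_BIT 4u

/* Hypothetical helper: takes the EDX value returned by CPUID leaf 1 and
   returns 1 if it says RDTSC is available, 0 otherwise. */
int has_tsc(uint32_t cpuid1_edx)
{
    return (int)((cpuid1_edx >> CPUID_EDX_TSC_BIT) & 1u);
}
```

You'd call has_tsc with the EDX word once at startup and fall back to some other timer if it returns 0.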
EDIT: If your compiler doesn't recognize __int64 or __int32, I'd just assume that long has 32 bits and long long has 64 bits (that assumption might not hold on AMD64 and AMD FX processors, though).
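If your compiler has C99's <stdint.h>, you can skip the guessing about type widths and use the fixed-width types instead. This is only a sketch: combine_ticks and elapsed_ticks are names I made up, and it assumes you already have the two 32-bit halves (EAX is the low half, EDX the high half) from RDTSC.

```c
#include <stdint.h>

/* Hypothetical helper: joins the two 32-bit halves that RDTSC leaves in
   EAX (low) and EDX (high) into one 64-bit tick count. */
uint64_t combine_ticks(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical helper: ticks elapsed between two readings. Unsigned
   subtraction still gives the right difference if the counter wrapped once. */
uint64_t elapsed_ticks(uint64_t start, uint64_t end)
{
    return end - start;
}
```

This does the same job as the union trick, but doesn't depend on how the compiler lays out the union or on how wide long happens to be.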
EDIT2: I'd say it will take quite a bit longer than 60 cycles. Please note that the processors this code works on handle 32 bits at a time (maybe 64 bits on AMD64s and AMD FXs), so it takes quite a few cycles to deal with the whole 64-bit number.
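Rather than guessing how many cycles the measurement itself costs, you can estimate it by reading the counter twice in a row with nothing in between and taking the smallest difference. Here's a rough sketch of that idea. All the names are made up, and read_ticks is a hook where you'd plug in your own RDTSC wrapper; fake_counter is only a stand-in so the sketch runs anywhere.

```c
#include <stdint.h>

/* Hypothetical hook: any function that returns the current tick count. */
typedef uint64_t (*tick_source_fn)(void);

/* Stand-in tick source for trying the code out: advances 7 "ticks" per call.
   Replace it with a real RDTSC wrapper. */
uint64_t fake_counter(void)
{
    static uint64_t t = 0;
    t += 7;
    return t;
}

/* Reads the counter back to back 'runs' times and returns the smallest
   difference seen, as an estimate of the measurement overhead.
   Returns UINT64_MAX if runs <= 0. */
uint64_t measure_overhead(tick_source_fn read_ticks, int runs)
{
    uint64_t best = UINT64_MAX;
    int i;
    for (i = 0; i < runs; i++) {
        uint64_t start = read_ticks();
        uint64_t end = read_ticks();
        uint64_t delta = end - start;
        if (delta < best)
            best = delta;
    }
    return best;
}

/* Subtracts the overhead from a measurement, clamping at zero. */
uint64_t net_ticks(uint64_t measured, uint64_t overhead)
{
    return measured > overhead ? measured - overhead : 0;
}
```

Taking the minimum instead of the average is a judgment call: interrupts and cache misses only ever make a reading bigger, so the smallest one is probably the closest to the real cost.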