Profiling and benchmarking
I've just been learning about the image convolution process. Not the math, but from a programmer's perspective. So far I've only covered the basic concept.
Then I found that this algorithm is very slow for large images, taking around a second per run.
This has had me screaming all day.
Now I need to do some optimizations, including the use of linear pointer arithmetic.
So, what do I need to compare to decide which algorithm is the right one to use (including algorithm specialization), before cheating by using multiple threads?
Currently I have:
- Mean (average): the test is run N times, and the sum of the elapsed times is divided by N.
- Fastest: The fastest time required to do the process.
- Slowest: The slowest time required to do the process.
Maybe standard deviation or variance (I don't know what they are useful for, though)?
How many times do I need to run the process to get a reliable result?
I currently use between 100 and 1000 runs.
Thanks in advance.
I tend to write portable code, so there is no ASM.