My function (to replace instances of "find" with "rep" and copy) works, but I feel like it could be sped up with some sort of loop unrolling or by comparing in larger blocks. Right now it is very linear and spends a significant amount of time in the first loop, as each iteration has to check two things per byte.

Code:
void new_str_rep(char *dst, const char *src, const char *find, const char *rep)
{
    if (!src || !find)
        return;
    size_t i;
    while (*src) {
        /* count how many leading bytes of src match find */
        for (i = 0; find[i] == src[i] && find[i]; ) {
            i++;
        }
        /* reached the end of find: full match, so emit rep instead */
        if (!find[i]) {
            src += i;
            for (i = 0; rep[i]; i++)
                *dst++ = rep[i];
            continue;
        }
        /* no match here: copy one byte and move on */
        *dst++ = *src++;
    }
    *dst = '\0';
}
A simple rewrite in assembly actually speeds it up, but only because it uses registers. However, compiler optimization makes the same changes with the same results, producing relatively similar code.
Does anyone know of any tricks I could use to make the current code faster, or replace slower segments with ones that are more efficient?



). The cost of calculating find_len and rep_len up front is likely to be relatively small. Big-O analysis would throw that out. The nice thing about using memcpy with a known length is, again, that it can work word-by-word, very quickly, until it's all done. If you have to find that zero byte, as with a string function, you lose the word-by-word bonus. Even if you can copy word-by-word, you have to check each word for a zero byte somewhere in it and copy only a partial word if you find one. That's probably costing you a few more cycles than it's saving you.
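
To make that concrete, here is a minimal sketch of how the known-length idea might look. It is not the original poster's code: the name new_str_rep_memcpy is hypothetical, using strstr to locate each match is my own assumption (the text above only talks about find_len, rep_len and memcpy), and the empty-"find" guard is an addition. Like the original, it assumes dst is large enough for the result and does not overlap src.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical variant of new_str_rep: lengths are computed once up
 * front, and every copy is a memcpy with a known length. */
void new_str_rep_memcpy(char *dst, const char *src,
                        const char *find, const char *rep)
{
    if (!dst || !src || !find || !rep)
        return;

    size_t find_len = strlen(find);
    size_t rep_len = strlen(rep);

    /* Assumption: an empty "find" means "copy src unchanged". */
    if (find_len == 0) {
        memcpy(dst, src, strlen(src) + 1);
        return;
    }

    const char *hit;
    while ((hit = strstr(src, find)) != NULL) {
        /* copy the unmatched stretch before the hit in one call */
        size_t gap = (size_t)(hit - src);
        memcpy(dst, src, gap);
        dst += gap;

        /* copy the replacement; its length is already known */
        memcpy(dst, rep, rep_len);
        dst += rep_len;

        /* skip past the matched text in the source */
        src = hit + find_len;
    }

    /* copy the tail, including the terminating zero byte */
    memcpy(dst, src, strlen(src) + 1);
}
```

Whether this actually beats the byte-by-byte loop depends on your libc and your inputs (short strings with many matches may not gain much), so it is worth timing both on real data before committing to either.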