Greetings. This is actually a problem with a C++ program, but it's about C memory management so I felt it belongs here.
I'm working on a graphics hardware emulation lib and am in need of a general purpose realloc replacement that enforces proper alignment.
I currently have an array of texture objects that are exactly 64 bytes in size. I do not search through this array (I have an array of stubs for that), so I only access the objects one at a time, possibly selecting a new one once per triangle. Aligning these structures to 64-byte boundaries (which coincides with my CPU's cache line size) gave me a healthy performance boost I don't want to lose.
The problem is that I'm dynamically expanding that array. The theoretical maximum size for the array is 524288 objects, i.e. 32 MB, so statically allocating that amount is out of the question. In addition, the array is unlikely to hold more than a few thousand objects, so the memory would go to waste most of the time. But in the rare cases where it does fill up, I don't want to fail disgracefully.
I'm allocating memory in this fashion:
Code:
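// Needs <stdint.h> for uintptr_t. array_alloc is the raw realloc pointer, array the aligned one.
uint old_align_offset=array-array_alloc;  // take this before realloc invalidates the old pointers
// realloc must get the raw pointer, not the aligned one; +63 leaves room to align.
ubyte* new_block_alloc=(ubyte*)realloc(array_alloc,new_size+63);
// (a NULL return must be handled here before going on)
// Round up to the next 64-byte boundary; & is not defined on pointers, so go through uintptr_t.
ubyte* new_block=(ubyte*)(((uintptr_t)new_block_alloc+63)&~(uintptr_t)63);
uint new_align_offset=new_block-new_block_alloc;
// realloc kept the data at the old offset; shift it if the new block needs a different one.
if (new_align_offset!=old_align_offset)
    memmove(new_block,new_block_alloc+old_align_offset,new_size);
array_alloc=new_block_alloc;
array=new_block;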
I'm doing it this way because realloc copies the contents of the array to a new place whenever the base address of the bigger block differs from the old one. The data then sits at the old alignment offset from the new base, but the new base may need a different offset to reach a 64-byte boundary, which forces me to do a memmove to get it right again.
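For illustration, here is a minimal self-contained sketch of the scheme described above, wrapped into a helper. It is not the code from the actual program: the `AlignedBuf` struct and the `aligned_grow` function are hypothetical names, and the sketch assumes a fixed 64-byte alignment and that only the previously valid bytes need to be preserved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical bookkeeping struct, not taken from the original program.
struct AlignedBuf {
    unsigned char* alloc = nullptr;  // raw pointer as returned by realloc
    unsigned char* data  = nullptr;  // 64-byte-aligned pointer inside alloc
    std::size_t    size  = 0;        // usable bytes starting at data
};

// Resizes buf to new_size usable bytes, keeping data 64-byte aligned.
// Returns false and leaves buf untouched if realloc fails.
bool aligned_grow(AlignedBuf& buf, std::size_t new_size) {
    const std::size_t align = 64;
    // Record the old offset before realloc invalidates the old pointers.
    const std::size_t old_off =
        buf.alloc ? static_cast<std::size_t>(buf.data - buf.alloc) : 0;
    // Over-allocate by align-1 bytes so an aligned address always fits.
    unsigned char* p = static_cast<unsigned char*>(
        std::realloc(buf.alloc, new_size + align - 1));
    if (!p) return false;
    // Distance from p to the next 64-byte boundary (0 if already aligned).
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    const std::size_t new_off =
        static_cast<std::size_t>((align - addr % align) % align);
    // realloc preserved the contents at old_off; shift them only if the
    // new block needs a different offset. Only the old bytes are moved.
    const std::size_t keep = buf.size < new_size ? buf.size : new_size;
    if (new_off != old_off && keep > 0)
        std::memmove(p + new_off, p + old_off, keep);
    buf.alloc = p;
    buf.data  = p + new_off;
    buf.size  = new_size;
    assert(reinterpret_cast<std::uintptr_t>(buf.data) % align == 0);
    return true;
}
```

The block is released with `std::free(buf.alloc)`, never with the aligned `data` pointer. Moving only the old contents (`keep`) rather than the full new size trims the memmove a little, but it does not remove it.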
One parameter I've been playing with is granularity. If I choose to expand the array by big amounts at a time, the memmove is rarely done but then on quite large blocks. This causes noticeable stutters. If I tune down granularity, stuttering goes away, but overall performance suffers as the memmove gets executed quite frequently. This also causes performance degradation over time, as the array itself grows bigger and the amount of work done per memmove does so as well.
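To put rough numbers on this trade-off, the following sketch counts the worst-case copying work for a given granularity. It is only a model under stated assumptions: 64-byte objects, growth in fixed steps, and every growth after the first relocating or re-aligning the whole array. The names `GrowthCost` and `fixed_step_cost` are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type for this illustration.
struct GrowthCost {
    std::uint64_t moves;        // growth steps that may trigger a memmove
    std::uint64_t bytes_moved;  // total bytes moved if every step does
};

// Worst-case cost of growing from empty to final_objects in fixed steps of
// step_objects, assuming each growth has to move all existing contents.
GrowthCost fixed_step_cost(std::uint64_t final_objects,
                           std::uint64_t step_objects) {
    assert(step_objects > 0);
    const std::uint64_t object_size = 64;  // bytes per texture object
    GrowthCost c{0, 0};
    // 'have' is the number of objects stored just before each growth.
    for (std::uint64_t have = step_objects; have < final_objects;
         have += step_objects) {
        ++c.moves;
        c.bytes_moved += have * object_size;
    }
    return c;
}
```

Under these assumptions, reaching 4096 objects in steps of 1024 gives 3 moves totalling 393216 bytes, while steps of 64 give 63 moves totalling 8257536 bytes. The total grows roughly quadratically as the step shrinks, which matches the degradation over time described above.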
I've also spent a bit of time thinking about doing my own heap management. But if I want to keep it aligned and dynamically growing, I'd not only run into the same realloc issues, but the overhead and memory footprint of pointer tracking would start killing me.
So, good people, here's the question:
Do you know of any fairly portable realloc replacements that take care of alignment automatically? I must admit I haven't looked into Win32 heap management, and I'd prefer not to resort to it, but would it help?
Any input greatly appreciated.
-zeckensack