Your code is leaking memory -- you have calls to malloc but no corresponding calls to free. You should free the memory once you no longer need the temporary expansion.


Anyway - your question --
It looks like you can't know how long the buffer needs to be until you've done the expansion. One option would be to do a 'dry run' of the expansion that just counts the number of characters in the result, then malloc that amount (plus one for the terminating null) and expand into it. To be honest I haven't looked at what you're doing closely enough to know if you could calculate the needed size upfront somehow, or at least make a conservative estimate. I assume if you could do this you would have done it already.
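To make the 'dry run' idea concrete, here is a minimal sketch. I don't know what your expansion actually does, so expand_tilde below is a made-up stand-in that replaces every '~' with a replacement string; both function names and the NULL-means-count-only convention are my own assumptions, not anything from your code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical expansion: every '~' in input is replaced by repl.
 * If out is NULL this is a dry run that only counts; otherwise it also writes.
 * Returns the expanded length, not counting the terminating null. */
size_t expand_tilde(const char *input, const char *repl, char *out)
{
    size_t repl_len = strlen(repl);
    size_t n = 0;

    for (const char *p = input; *p != '\0'; p++) {
        if (*p == '~') {
            if (out != NULL)
                memcpy(out + n, repl, repl_len);
            n += repl_len;
        } else {
            if (out != NULL)
                out[n] = *p;
            n++;
        }
    }
    if (out != NULL)
        out[n] = '\0';
    return n;
}

/* Pass 1 measures, pass 2 fills an exactly sized buffer.
 * Returns NULL if malloc fails. The caller must free() the result. */
char *expand_alloc(const char *input, const char *repl)
{
    assert(input != NULL && repl != NULL);

    size_t len = expand_tilde(input, repl, NULL);
    char *out = malloc(len + 1);            /* +1 for the terminating null */
    if (out == NULL)
        return NULL;
    expand_tilde(input, repl, out);
    return out;
}
```

The cost is that the expansion logic runs twice, which only works if a counting pass is cheap and gives the same answer as the real pass.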

I'd probably go for dynamically resizing your buffer with realloc (realloc - C++ Reference).
First decide on an initial buffer size -- for testing, best pick a pathetically small size to make sure your reallocation is working. But after testing, just go for a buffer size that feels like it'll probably fit most cases.

I'd suggest writing another helper function for concatenating the expansion into the main buffer. Before you copy anything into the buffer, you will need to check that there is enough space remaining for the string. Unfortunately I think this means you'll have to pass around a buffer_size variable.
If there isn't enough space in the buffer, increase it by some amount and try again until you have enough space.

Code:
    while (not enough space in buffer)
        increase buffer size var
        realloc buffer with bigger size

    concat string to buffer
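Read the docs about realloc -- in particular, beware that it might return a different block of memory, so you'll need to return the pointer from the function. Also note that realloc returns NULL if it fails and leaves the original block untouched, so assign its result to a temporary pointer first; otherwise you lose (and leak) the old buffer.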
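The loop above could be sketched in C roughly like this. The name buf_append, the doubling policy and the 16-byte starting size are all my own choices for illustration. Instead of returning the pointer, this version takes the buffer by address and updates it in place, which is just another way of handing the possibly moved block back to the caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INITIAL_CAP 16   /* arbitrary; keep it tiny while testing */

/* Hypothetical helper: appends src to *buf, growing the buffer as needed.
 * *buf must be NULL (with *cap == 0) or a malloc'd, null-terminated string
 * whose allocated size in bytes is *cap.
 * Returns 0 on success, -1 if realloc fails (*buf is then still valid). */
int buf_append(char **buf, size_t *cap, const char *src)
{
    assert(buf != NULL && cap != NULL && src != NULL);

    size_t used = (*buf != NULL) ? strlen(*buf) : 0;
    size_t src_len = strlen(src);
    size_t need = used + src_len + 1;       /* +1 for the terminating null */

    if (need > *cap) {
        size_t new_cap = (*cap != 0) ? *cap : INITIAL_CAP;
        while (new_cap < need)              /* not enough space: grow */
            new_cap *= 2;

        /* Use a temporary so the old block is not lost if realloc fails. */
        char *tmp = realloc(*buf, new_cap);
        if (tmp == NULL)
            return -1;
        *buf = tmp;                         /* the block may have moved */
        *cap = new_cap;
    }

    memcpy(*buf + used, src, src_len + 1);  /* copies the null as well */
    return 0;
}
```

You'd start with `char *buf = NULL; size_t cap = 0;`, call `buf_append(&buf, &cap, piece)` for each expansion, and free(buf) when you're done with the result.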

You could of course realloc for every single expansion, just allocating as much as you need, but that would impact performance, especially if realloc often needs to copy the string to a new place. On your average desktop machine I doubt you'd ever observe any actual slowdown, but it's good to be aware of the tradeoff you're making (size of upfront memory allocation vs frequency of calls to realloc).
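To put rough numbers on that tradeoff, here is a small sketch that just counts how many times the buffer would have to grow, without allocating anything. count_grows is a made-up function, and doubling is only one possible "increase by some amount" policy. Each grow corresponds to one realloc call, which may or may not actually copy the data.

```c
#include <stddef.h>

/* Counts how many times a buffer would have to grow while appending
 * `pieces` strings of `piece_len` bytes each.
 * doubling == 0: grow to exactly the needed size every time.
 * doubling != 0: start at `initial` bytes and double until it fits. */
size_t count_grows(size_t pieces, size_t piece_len, size_t initial, int doubling)
{
    size_t cap = doubling ? initial : 0;
    size_t used = 1;                  /* the terminating null */
    size_t grows = 0;

    if (doubling && cap == 0)
        cap = 1;                      /* avoid doubling zero forever */

    for (size_t i = 0; i < pieces; i++) {
        size_t need = used + piece_len;
        if (need > cap) {
            if (doubling) {
                while (cap < need)
                    cap *= 2;
            } else {
                cap = need;
            }
            grows++;
        }
        used = need;
    }
    return grows;
}
```

For 1000 pieces of 10 bytes each, count_grows(1000, 10, 16, 0) gives 1000 (one grow per append), while count_grows(1000, 10, 16, 1) gives 10, at the price of ending up with a 16384-byte buffer for about 10001 bytes of data.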

You could also do it with some dynamic data structure like a linked list of strings. You could traverse it at the end and turn it back into a string. I really don't think that's the right solution though, overkill.

Not sure if other people will suggest better ideas -- I've not used realloc a lot in real code, so not sure if it's actually a good way. Better than allocating a huge fixed-length buffer and hoping for the best... which I see all the time in production code!