I'm reviewing some code I wrote some time ago, and I'm taking this opportunity to dust off my C.
I'm using uint32_t arrays to store UTF8-encoded codepoints. They are treated as arrays, not strings. Every cell is assumed to contain one codepoint and one codepoint only.
Occasionally I access a cell as a uint8_t array. We can safely assume uint8_t to be a typedef for unsigned char.
Code:
uint8_t (*t)[4];
t = (uint8_t (*)[4])&chars[i]; //chars is an array of uint32_t
(*t)[0] = utf8String[i++];
(*t)[1] = utf8String[i++];
//...
I've read that compilers assume that char* aliases other types, but not the other way around.
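To make the byte-wise access above concrete, here is a minimal, self-contained sketch of writing one UTF-8 sequence into a uint32_t cell through an unsigned char pointer, which is the direction of aliasing the language permits. The names utf8_len and store_cell are made up for illustration and are not part of my actual code; the sketch also assumes the input is reasonably well-formed UTF-8.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: length of a UTF-8 sequence, judged from its lead byte. */
static size_t utf8_len(unsigned char lead)
{
    if (lead < 0x80) return 1;           /* 0xxxxxxx */
    if ((lead & 0xE0) == 0xC0) return 2; /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3; /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4; /* 11110xxx */
    return 1;                            /* invalid lead byte: consume one byte */
}

/* Hypothetical helper: store one UTF-8 sequence in a cell, byte by byte.
 * Returns the number of bytes consumed from utf8. */
static size_t store_cell(uint32_t *cell, const char *utf8)
{
    /* Accessing any object through unsigned char* is allowed. */
    unsigned char *bytes = (unsigned char *)cell;
    size_t n = utf8_len((unsigned char)utf8[0]);
    size_t k;

    *cell = 0; /* unused trailing bytes stay zero */
    for (k = 0; k < n && utf8[k] != '\0'; k++)
        bytes[k] = (unsigned char)utf8[k];
    return k;
}
```

Using unsigned char directly sidesteps the question of whether uint8_t really is a character type on a given platform.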
I've seen there's a lot of discussion on the subject, still going strong to this day, but despite that, I'd like to get my code to abide by strict aliasing rules.
I never attempt to read/write uint8_t arrays as uint32_t, since that would break strict aliasing.
But one thing I do is provide a helper macro, LTCHAR, to cast a "single Unicode glyph literal string" (like "↺") to a uint32_t.
Code:
typedef uint32_t VTChar;
#define LTCHAR *(VTChar *const) //strict aliasing is broken
void vtFill(VTChar fillChar); //the function works with "uint32_t chars", but the macro helps create one on the fly
vtFill(LTCHAR"↺");
One approach could be to move the data through a char array, or to use memcpy(), inside a helper function. But I was wondering:
this is only a shortcut to pass a literal UTF8 codepoint to the function. If we assumed the string to be discarded afterwards (as if it was an rvalue), and assume it to be something like const char[4], would strict aliasing be maintained?
Can strict aliasing be considered enforced, when the aliased pointer points to constant data?
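For reference, the memcpy() route I mentioned could look roughly like the sketch below. The helper name vtCharFromUtf8 is hypothetical, and LTCHAR becomes a function-like macro here, so the call site changes to vtFill(LTCHAR("↺")). It assumes the literal holds at most sizeof(VTChar) bytes of UTF-8; anything longer is truncated.

```c
#include <stdint.h>
#include <string.h>

typedef uint32_t VTChar;

/* Hypothetical helper: copy the bytes of a short UTF-8 string into a VTChar.
 * Copies at most sizeof(VTChar) bytes and stops at the terminating NUL,
 * so short literals such as "a" are zero-padded rather than over-read. */
static inline VTChar vtCharFromUtf8(const char *s)
{
    VTChar c = 0;
    size_t n = 0;

    while (n < sizeof c && s[n] != '\0')
        n++;
    memcpy(&c, s, n); /* byte copy: no uint32_t lvalue ever touches the char data */
    return c;
}

/* Function-like replacement for the cast-based macro (an assumption, not my current code). */
#define LTCHAR(s) vtCharFromUtf8(s)
```

Besides aliasing, this version also avoids relying on the string literal being suitably aligned for a uint32_t read.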