I don't think it's possible to do this very well with the C preprocessor. I'd be happy to be proven wrong though!
The first answer to this question:
windows - C++: How to encrypt strings at compile time? - Stack Overflow
seems like a decent sort of approach. With a small tweak you can build up to larger sizes fairly quickly:
Code:
#define CRYPT16(str) { CRYPT8_(str), CRYPT8_(str+8), '\0' }
#define CRYPT8(str) { CRYPT8_(str), '\0' }
#define CRYPT8_(str) (str)[0] + 1, (str)[1] + 2, (str)[2] + 3, (str)[3] + 4, (str)[4] + 5, (str)[5] + 6, (str)[6] + 7, (str)[7] + 8
However, the range of places where this can be used is pretty small. It's also not legal C to use this to initialise a global array, because in C the initialisers of a global must be constant expressions and (str)[0] isn't one (it is legal C++ though).
Code:
char arr[] = CRYPT8("hello"); // works, but "hello" is shorter than 8 chars, so CRYPT8_ indexes past its end
char *str = CRYPT8("hello"); // doesn't work, can't initialise char* from braced list of multiple elements
void foo(void)
{
    printf(CRYPT8("hello")); // doesn't work, can't convert from braced list to char*
}
Another reason this isn't a great idea: because you need to decrypt the strings at runtime, you'll end up with both encryption and decryption of the same strings all over the place in the code. It'd probably be pretty horrible to read, depending on what your program looks like.
The ideal solution would be to extend a compiler to support this. Outputting ciphered strings should be easy. Having the compiler automatically generate calls to decryption functions for runtime decryption would probably be harder. That's not really a recommendation though, as it's a bit over the top.
I'd recommend having a separate preprocessing tool go over your code and cipher the strings in the C source, before the compiler or preprocessor has run. How you do this depends on what your code looks like. If you have loads of global strings, it might be easiest to put them all in a separate file and process that. If you have lots of calls to printf with string literals, it'd probably be better to write a tool that can pick out strings from code. I think in Perl it'd go something like:
Code:
for each line in file
    split line on " characters  # gives back the whole line if there are no " chars, otherwise alternating code and literals
    leave the code before the " unchanged
    cipher the next piece, which should be a string literal
    join pieces with "
    print joined pieces
You might be able to automatically insert some calls to a decipher function too, but that won't work in all cases. It mostly only works in C++, not C, and it won't work for initialising an array.
Actually, I just noticed something:
Code:
#define noCRYPT8(str) { noCRYPT8_(str "\0\0\0\0\0\0\0\0"), '\0' }
#define noCRYPT8_(str) (str)[0], (str)[1], (str)[2], (str)[3], (str)[4], (str)[5], (str)[6], (str)[7]
extern const char str[] = noCRYPT8("ABCDefghijkl");
The effect of this is to change the initialisation from a string literal to an array. With GCC on x86-64 this stops the string from being visible in a hex dump. Instead the array is constructed in code -- one movb instruction for each array element. I'm not sure whether that's reliable across platforms, but it's quite convenient!