Having a brain fart on basic bitwise conversion

**awsdert** · 03-09-2023

To make my library both portable & future proof (in the event more than 32 bits are ever needed), I'm defining my own character encoding to be used at runtime only, not in files or network protocols; for those I'm making converters for the current common standards of UTF-8, UTF-16 & UTF-32. At the moment I'm doing the UTF-8 conversion and am struggling to get the gears in my head moving on working out the last bitwise shifting part of the character.

My format is defined as follows:
1x... means read the next character as the bottom part of this character, while x... always means a Unicode code point. So 2 of my 16-bit characters converted to a 28-bit UTF-32 character would wind up looking like:

Code:

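/* assumes vc points at the two characters being combined */
char32_t c32 = vc[0] & PAWVC_BOTTOM; /* top part: first character, flag bit masked off */
c32 <<= PAWVC_WIDTH;
c32 |= (vc[1] & PAWVC_BOTTOM); /* bottom part: the following character */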
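
For anyone following along, here is a minimal sketch of that combining step as a standalone function. The flag name PAWVC_MORE, the function name pawvc_decode, the 15-bit payload width and the use of uint16_t/uint32_t (instead of char32_t) are all assumptions made for illustration; the real definitions are not shown in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values: 16-bit units, top bit = "more follows", 15 payload bits. */
#define PAWVC_WIDTH  15u
#define PAWVC_BOTTOM ((1u << PAWVC_WIDTH) - 1u) /* 0x7FFF: payload mask */
#define PAWVC_MORE   (1u << PAWVC_WIDTH)        /* 0x8000: continuation flag (assumed name) */

/* Reads one code point starting at src and stores it in *out.
 * Returns the number of 16-bit units consumed (1 or 2). */
static size_t pawvc_decode(const uint16_t *src, uint32_t *out)
{
    uint32_t c32 = src[0] & PAWVC_BOTTOM;

    if (!(src[0] & PAWVC_MORE)) {
        /* Flag clear: the unit is a complete code point on its own. */
        *out = c32;
        return 1;
    }

    /* Flag set: this unit is the top part, the next one is the bottom part. */
    c32 <<= PAWVC_WIDTH;
    c32 |= src[1] & PAWVC_BOTTOM;
    *out = c32;
    return 2;
}
```

With these assumed values, the pair 0x8003, 0x7A00 would decode to (3 << 15) | 0x7A00, while a lone 0x0041 would decode to itself.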

The character type is defined to always be at least 16 bits wide via short or long (on a non-conforming system with CHAR_BIT = 4, long would still be 16 bits wide), which ensures L"" is the only valid way to assign a string literal to it. Would someone mind taking a look at the code below and helping me fix the last part, which extracts the last applicable bits?

Code:

		one = *src & PAWVC_BOTTOM;
		two = src[1];
		C = dst + n;
		if ( bits > 18 )
		{
			i += 2;
			if ( bits <= PAWL16D_WIDTH - 1 )
			{
				C[3] = 0x80 | (one & 0x3F);
				C[2] = 0x80 | ((one >> 6) & 0x3F);
				C[1] = 0x80 | ((one >> 12) & 0x3F);
				C[0] = 0xF0 | ((one >> 18) & 0x07);
				continue;
			}
			C[3] = 0x80 | (two & 0x3F);
			C[2] = 0x80 | ((two >> 6) & 0x3F);
			C[1] = 0x80 | ((two >> 12) & 0x3F);
			C[0] = 0xF0 | (two >> 18);
#if PAWL16D_WIDTH - 12 >= 6
			/* ~0u instead of -1: left-shifting a negative int is undefined */
			C[0] |= one & ~(~0u << (PAWL16D_WIDTH - 18));
#else
			C[1] |= one & ~(~0u << (PAWL16D_WIDTH - 12));
#endif
		}
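
One way to sidestep the split-bit arithmetic, offered as a sketch rather than a fix for the code above: combine the two units into a single 32-bit value first, then run an ordinary UTF-8 encoder over it. The function name utf8_encode is hypothetical, and this version only covers the standard range up to U+10FFFF (at most 4 bytes).

```c
#include <stddef.h>
#include <stdint.h>

/* Writes the UTF-8 form of cp into dst (which must have room for 4 bytes).
 * Returns the number of bytes written, or 0 if cp is not encodable. */
static size_t utf8_encode(uint32_t cp, unsigned char *dst)
{
    if (cp < 0x80) {
        dst[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        dst[0] = (unsigned char)(0xC0 | (cp >> 6));
        dst[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0; /* surrogates are not valid code points in UTF-8 */
        dst[0] = (unsigned char)(0xE0 | (cp >> 12));
        dst[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        dst[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        dst[0] = (unsigned char)(0xF0 | (cp >> 18));
        dst[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        dst[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        dst[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0; /* beyond the range UTF-8 is defined for */
}
```

For example, U+20AC comes out as the bytes E2 82 AC and U+1F600 as F0 9F 98 80. The trade-off is one extra 32-bit temporary per character in exchange for not having to track which bits live in which unit.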

**awsdert** · 03-09-2023

Never mind, I think I finally got those gears turning again:

Code:

#define LEFT (18 - PAWL16D_WIDTH)
#define LAST (21 - PAWL16D_WIDTH)
#define LENG (7 - LAST)
#if LENG <= 3
			C[0] |= one & ~(~0u << LENG);
#else
			C[1] |= one & ~(~0u << LEFT);
			/* >> binds tighter than &, so it is the mask that gets shifted here */
			C[0] |= one & (~(~0u << LENG) >> (LENG - 3));
#endif
#undef LENG
#undef LAST
#undef LEFT
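
A side note on the masks, as a hedged sketch: an expression like ~(-1 << n) left-shifts a negative int, which is undefined behavior in C, and mixing & with >> on one line is easy to misread because >> binds tighter than &. Two small helpers can make both points explicit. The names low_mask and bit_field are hypothetical and not part of the library discussed here.

```c
#include <stdint.h>

/* Returns a value with the lowest n bits set (n may be 0..32). */
static inline uint32_t low_mask(unsigned n)
{
    /* Shifting a 32-bit value by 32 is undefined, so handle it separately. */
    return n >= 32 ? UINT32_MAX : (UINT32_C(1) << n) - 1u;
}

/* Extracts `count` bits of `value`, starting at bit `shift` (shift must be < 32). */
static inline uint32_t bit_field(uint32_t value, unsigned shift, unsigned count)
{
    return (value >> shift) & low_mask(count);
}
```

With these, the second continuation byte of a 4-byte sequence would read as 0x80 | bit_field(cp, 12, 6), which keeps the shift and the mask width visible side by side.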

Still gotta compile and test it once I finish moving header code, but this will do until then unless someone spots an issue first.
