Hello,
This may be more appropriate in the Game Programming forum, but anyway...
I've written some code that reads in bits from 4 separate bitplanes and multiplexes them into bytes, two 4-bit pixels per byte: the upper half of the first byte contains the first bit from each of the bitplanes, the lower half the second bit from each, and so on.
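To make the layout concrete, here is a minimal sketch for a single output byte. The helper name pack_pixel_pair and the example function are made up for this illustration and are not part of my real code; it assumes plane j supplies bit j of each 4-bit colour and that bit 7 of a plane byte is the leftmost pixel, which is what the code further down does.

```c
#include <assert.h>

/* Hypothetical helper, for illustration only: packs the two pixels at bit
   positions `bit` and `bit + 1` (0 = most significant bit, `bit` must be
   0, 2, 4 or 6) of one byte from each of the four planes into a single
   output byte. Plane j contributes bit j of each 4-bit colour. */
static unsigned char pack_pixel_pair(const unsigned char planes[4], int bit)
{
    unsigned char out = 0;
    int j;

    for (j = 0; j < 4; j++) {
        unsigned char first  = (planes[j] >> (7 - bit)) & 1;
        unsigned char second = (planes[j] >> (6 - bit)) & 1;

        out |= (unsigned char)(first << (j + 4)); /* upper nibble: first pixel  */
        out |= (unsigned char)(second << j);      /* lower nibble: second pixel */
    }
    return out;
}

/* Example: the first pixel has colour 5 (planes 0 and 2 set) and the second
   has colour 6 (planes 1 and 2 set), so the packed byte is 0x56. */
void pack_pixel_pair_example(void)
{
    const unsigned char planes[4] = { 0x80, 0x40, 0xC0, 0x00 };
    assert(pack_pixel_pair(planes, 0) == 0x56);
}
```

So each byte read from the four planes ends up as four output bytes.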
The thing is, it's not terribly fast. Running this on a 2 GHz Pentium 4, CPU usage of up to 98% is observed when bmpBitmap describes a set of 640x480 bitplanes and the code is executed 60 times a second.

Code:
    // inputs: unsigned char *lpBuffer, bitmap_t *bmpBitmap
    int i, j, iMaskShift = 7, iShift = 4, iSize;
    unsigned char *lpPointers[4], ucColour, ucSourceMask = 0x80, ucDestMask = 0xF;

    // setup pointers
    for (i = 0; i < 4; i++)
        lpPointers[i] = bmpBitmap->lpPlanes[i];

    iSize = bmpBitmap->sWidth * bmpBitmap->sHeight;
    for (i = 0; i < iSize; i++) {
        ucColour = 0;
        for (j = 0; j < 4; j++)
            ucColour |= ((*lpPointers[j] & ucSourceMask) >> iMaskShift) << (j + iShift);
        *lpBuffer = (*lpBuffer & ucDestMask) | ucColour;
        ucDestMask = ~ucDestMask;
        if (iShift)
            iShift = 0;
        else {
            iShift = 4;
            lpBuffer++;
        }
        if (iMaskShift--)
            ucSourceMask >>= 1;
        else {
            iMaskShift = 7;
            ucSourceMask = 0x80;
            for (j = 0; j < 4; j++)
                lpPointers[j]++;
        }
    }

Could anyone suggest some performance enhancements?