Thread: Multiplexing bitplanes into bytes

  1. #1
    Registered /usr
    Join Date
    Aug 2001
    Location
    Newport, South Wales, UK
    Posts
    1,273

    Multiplexing bitplanes into bytes

    Hello,

    This may be more appropriate in the Game Programming forum, but anyway...

    I've written some code that reads in bits from 4 separate bitplanes and multiplexes them into bytes, i.e. the upper half of the first byte contains the first bits from the bitplanes, the lower half the second bits, etc.
    Code:
    // inputs: unsigned char *lpBuffer, bitmap_t *bmpBitmap
    int i, j, iMaskShift = 7, iShift = 4, iSize;
    unsigned char *lpPointers[4], ucColour, ucSourceMask = 0x80, ucDestMask = 0xF;
    
    // setup pointers
    for (i=0;i<4;i++)
    	lpPointers[i] = bmpBitmap->lpPlanes[i];
    
    iSize = bmpBitmap->sWidth * bmpBitmap->sHeight;
    for (i=0;i<iSize;i++)
    {
    	ucColour = 0;
    	for (j=0;j<4;j++)
    		ucColour |= ((*lpPointers[j] & ucSourceMask) >> iMaskShift) << (j + iShift);
    
    	*lpBuffer = (*lpBuffer & ucDestMask) | ucColour;
    	ucDestMask = ~ucDestMask;
    	if (iShift)
    		iShift = 0;
    	else
    	{
    		iShift = 4;
    		lpBuffer++;
    	}
    
    	if (iMaskShift--)
    		ucSourceMask >>= 1;
    	else
    	{
    		iMaskShift = 7;
    		ucSourceMask = 0x80;
    		for (j=0;j<4;j++)
    			lpPointers[j]++;
    
    	}
    
    }
    The thing is, it's not terribly fast. Running this on a 2 GHz Pentium 4, CPU usage of up to 98% is observed when bmpBitmap describes a set of 640x480 bitplanes and the code is executed 60 times a second.

    Could anyone suggest some performance enhancements?

  2. #2
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Yes, this should probably have gone in the Game Programming Forum.
    Code:
    // setup pointers
    for (i=0;i<4;i++)
    	lpPointers[i] = bmpBitmap->lpPlanes[i];
    You could initialize those pointers to those values.

    > The thing is, it's not terribly fast. Running this on a 2 GHz Pentium 4, CPU usage of up to 98% is observed when bmpBitmap describes a set of 640x480 bitplanes and the code is executed 60 times a second.
    Your code looks pretty good. What delay function are you using to keep the frame rate at 60 FPS? Most delay functions that I know of run the CPU at 100%. The DOS delay(), the SDL function, etc. all run my processor up, too.

    To see how fast the code really is, run it on a slower computer (like mine), or remove the synchronizing code (if you can) and see what the FPS is then.
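One way to take the timer out of the picture is to call the routine in a tight loop and divide the elapsed processor time by the number of calls. What follows is only a minimal sketch, assuming the multiplexing code has been wrapped in a function. The names `frame_fn`, `time_per_call` and `dummy_frame` are hypothetical and do not come from the thread. It relies on the standard C `clock()`, which can have coarse resolution, so it helps to use many iterations.

```c
#include <time.h>

/* Hypothetical callback type: one call converts one frame. */
typedef void (*frame_fn)(void *ctx);

/* Stand-in workload for illustration only: counts how often it is called. */
void dummy_frame(void *ctx)
{
    (*(int *)ctx)++;
}

/* Returns the average processor time per call in seconds,
   or -1.0 if iterations <= 0 or clock() is unavailable. */
double time_per_call(frame_fn fn, void *ctx, int iterations)
{
    clock_t start, end;
    int n;

    if (iterations <= 0)
        return -1.0;

    start = clock();
    if (start == (clock_t)-1)
        return -1.0;

    for (n = 0; n < iterations; n++)
        fn(ctx);

    end = clock();
    if (end == (clock_t)-1)
        return -1.0;

    return ((double)(end - start) / CLOCKS_PER_SEC) / iterations;
}
```

At 60 frames a second the budget is 1/60 s, roughly 16.7 ms per frame, so the figure returned here can be compared directly against that.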
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  3. #3
    Registered /usr
    Join Date
    Aug 2001
    Location
    Newport, South Wales, UK
    Posts
    1,273
    Not quite sure that I understand your first point, but as for the second, it's actually using Windows WM_TIMER messages clocked at 16ms (1000/60 = 16.6r). If I comment out the call to my code, CPU usage stays at zero, so that's where all the action is.

  4. #4
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Exactly. Don't worry about the CPU usage. Try running it on a slower computer and it should be fine.

    (What I meant by my first point was this:
    Code:
    unsigned char *lpPointers[4];
    
    // setup pointers
    for (i=0;i<4;i++)
    	lpPointers[i] = bmpBitmap->lpPlanes[i];
    ->
    Code:
    unsigned char *lpPointers[4] = {bmpBitmap->lpPlanes[0], ...};
    which may be a bit faster.)
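For reference, the two setups can be written out side by side. This is a sketch only: the `bitmap_t` layout below is an assumption (the thread shows the field names `lpPlanes`, `sWidth` and `sHeight` but not their types), and `plane_pointers_agree` is a hypothetical helper. Note that a brace-enclosed initializer with non-constant expressions for a local array needs C99 or later; C89 requires constant expressions there.

```c
/* Hypothetical layout: only the field names appear in the thread;
   the types are assumptions. */
typedef struct {
    unsigned char *lpPlanes[4];
    short sWidth;
    short sHeight;
} bitmap_t;

/* Returns 1 if both ways of setting up the pointers agree, 0 otherwise. */
int plane_pointers_agree(const bitmap_t *bmpBitmap)
{
    int i;
    unsigned char *byLoop[4];
    /* Initializer form: non-constant initializers need C99 or later. */
    unsigned char *byInit[4] = {
        bmpBitmap->lpPlanes[0], bmpBitmap->lpPlanes[1],
        bmpBitmap->lpPlanes[2], bmpBitmap->lpPlanes[3]
    };

    /* Loop form, as in the original post. */
    for (i = 0; i < 4; i++)
        byLoop[i] = bmpBitmap->lpPlanes[i];

    for (i = 0; i < 4; i++)
        if (byLoop[i] != byInit[i])
            return 0;
    return 1;
}
```

Either way the setup runs once per frame and copies four pointers, while the main loop runs 640 x 480 = 307,200 times, so any difference between the two forms is unlikely to be measurable.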

  5. #5
    Registered User
    Join Date
    Oct 2001
    Posts
    2,934
    Code:
    >unsigned char *lpPointers[4], ucColour, ucSourceMask = 0x80, ucDestMask = 0xF;
    >
    >// setup pointers
    >for (i=0;i<4;i++)
    >	lpPointers[i] = bmpBitmap->lpPlanes[i];
    You could change this to:
    Code:
    unsigned char lpPointers[4], ucColour, ucSourceMask = 0x80, ucDestMask = 0xF;
    
    // setup pointers
    for (i=0;i<4;i++)
    	lpPointers[i] = *(bmpBitmap->lpPlanes[i]);
    And then later on it would just be lpPointers[j] without the dereference.

  6. #6
    Registered User
    Join Date
    Oct 2001
    Posts
    2,934
    > lpPointers[j]++;
    Actually you're incrementing your pointers here, so my last suggestion is no good.

  7. #7
    Registered /usr
    Join Date
    Aug 2001
    Location
    Newport, South Wales, UK
    Posts
    1,273
    Plus you're both ignoring the possibility that if bmpBitmap is NULL there'll be a big explosion and I'll get radioactive goop on my face.

    But cheers for trying.

  8. #8
    Registered User
    Join Date
    Oct 2001
    Posts
    2,934
    Ok, here's another try.
    Code:
    int i, j, k, iMaskShift = 7, iShift = 4, iSize;
    unsigned char *lpPointers[4], ucColour, ucSourceMask = 0x80, ucDestMask = 0xF;
    
    // setup pointers
    for (i=0;i<4;i++)
    	lpPointers[i] = bmpBitmap->lpPlanes[i];
    
    iSize = bmpBitmap->sWidth * bmpBitmap->sHeight;
    for (i=0;i<iSize;i+=8)
    {
       for (k=0; k<8; k++)
       {
          
          ucColour = 0;
          for (j=0;j<4;j++)
             ucColour |= ((*lpPointers[j] & ucSourceMask) >> iMaskShift) << (j + iShift);
    
          *lpBuffer = (*lpBuffer & ucDestMask) | ucColour;
          ucDestMask = ~ucDestMask;
          if (iShift)
             iShift = 0;
          else
          {
             iShift = 4;
             lpBuffer++;
          }
    
          iMaskShift--;
          ucSourceMask >>= 1;
       }
    
       iMaskShift = 7;
       ucSourceMask = 0x80;
       for (j=0;j<4;j++)
          lpPointers[j]++;
    
    }

  9. #9
    Registered /usr
    Join Date
    Aug 2001
    Location
    Newport, South Wales, UK
    Posts
    1,273
    Hmm, I kinda see where you're going there, effectively replacing a conditional with a loop. I don't think that would be any faster, though.

    What about expanding the source and destination operands so that it processes a doubleword (four bytes) at a time instead of one, would that help?

  10. #10
    Registered User
    Join Date
    Oct 2001
    Posts
    2,934
    > What about expanding the source and destination operands so that it processes a doubleword (four bytes) at a time instead of one, would that help?
    That would probably help.
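One possible way to handle a whole source byte per plane in each iteration, producing four output bytes (a doubleword) at a time, is a 256-entry lookup table. To be clear, the table is a swapped-in technique that nobody in the thread proposed, and `spread_table`, `init_spread_table` and `multiplex_planes` are hypothetical names. The sketch assumes the same output layout as the original post (first pixel in the upper nibble, plane j in bit j of the nibble) and a pixel count that is a multiple of 8. Whether it is actually faster would need measuring.

```c
#include <stddef.h>

/* spread_table[b] moves bit (7 - k) of b to bit (28 - 4*k) of the result,
   so pixel k's bit lands in the lowest bit of output nibble k. */
static unsigned long spread_table[256];

/* Fill the table once before the first call to multiplex_planes(). */
void init_spread_table(void)
{
    int b, k;

    for (b = 0; b < 256; b++) {
        unsigned long v = 0;
        for (k = 0; k < 8; k++)
            if (b & (0x80 >> k))
                v |= 1UL << (28 - 4 * k);
        spread_table[b] = v;
    }
}

/* Packs four bitplanes into 4-bit pixels, two pixels per output byte.
   plane_bytes is the length of each plane in bytes;
   out must have room for 4 * plane_bytes bytes. */
void multiplex_planes(unsigned char *out,
                      const unsigned char *p0, const unsigned char *p1,
                      const unsigned char *p2, const unsigned char *p3,
                      size_t plane_bytes)
{
    size_t n;

    for (n = 0; n < plane_bytes; n++) {
        /* One lookup per plane yields all eight pixels at once;
           shifting by the plane index selects the bit within each nibble. */
        unsigned long w = spread_table[p0[n]]
                        | (spread_table[p1[n]] << 1)
                        | (spread_table[p2[n]] << 2)
                        | (spread_table[p3[n]] << 3);

        /* Store byte by byte so the result does not depend on endianness. */
        out[0] = (unsigned char)(w >> 24);
        out[1] = (unsigned char)(w >> 16);
        out[2] = (unsigned char)(w >> 8);
        out[3] = (unsigned char)w;
        out += 4;
    }
}
```

For a 640x480 frame each plane is 640 * 480 / 8 = 38,400 bytes, so the loop body runs 38,400 times instead of once per pixel.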

    Actually from what I've read your best bet is to check the optimization flags of your compiler, and try out each different one, something you may have already done. Compilers these days are very good at optimizing code.
