Right, I think we have covered almost everything that needs covering here.
There are, however, a couple of points I'd like to make:
1. There are situations when a certain set of data items HAS TO FIT in a certain space. Typically, hardware registers that have multiple fields within a fixed width (e.g. a 32-bit register that holds more than one item of content). We have two choices as to how to handle that: either define masks and shift values to manipulate the fields, or use bitfields. Yes, bitfields are potentially flawed, but they are a neat way to define it IF IT WORKS OK.
One such case is 16-bit colour using the RGB565 format:
Code:
struct rgb
{
    unsigned short r:5;   /* all three fields share one 16-bit unit */
    unsigned short g:6;
    unsigned short b:5;
};
2. It is rarely useful to use bitfields in software-only situations. There, it's nearly always better to use fields of char, short, int, long and long long to represent the data. It gives better performance, and it's rarely meaningful to try to squeeze the data into a smaller space.
I wrote a little program to benchmark the difference in filling in such an RGB value with a couple of different solutions.
Code:
#include <time.h>
#include <stdio.h>

#define SIZE 100000

void rgb1(void)
{
    struct rgb
    {
        unsigned short r:5;
        unsigned short g:6;
        unsigned short b:5;
    };
    static struct rgb arr[SIZE];
    unsigned int i;

    for(i = 0; i < SIZE; i++)
    {
        arr[i].r = i & 31;
        arr[i].g = i & 63;
        arr[i].b = (i >> 3) & 31;
    }
}
/* Note: ~ binds tighter than <<, so the whole shifted mask must be
   parenthesised before the ~ is applied. */
#define RED(x, r)   ((x) &= ~(((1 << 5)-1) << 11), (x) |= ((r) << 11))
#define GREEN(x, g) ((x) &= ~(((1 << 6)-1) << 5),  (x) |= ((g) << 5))
#define BLUE(x, b)  ((x) &= ~((1 << 5)-1),         (x) |= (b))
void rgb2(void)
{
    static unsigned short arr[SIZE];
    unsigned int i;

    for(i = 0; i < SIZE; i++)
    {
        RED(arr[i], i & 31);
        GREEN(arr[i], i & 63);
        BLUE(arr[i], (i >> 3) & 31);
    }
}
void rgb3(void)
{
    struct rgb
    {
        unsigned char r;
        unsigned char g;
        unsigned char b;
    };
    static struct rgb arr[SIZE];
    unsigned int i;

    for(i = 0; i < SIZE; i++)
    {
        arr[i].r = i & 31;
        arr[i].g = i & 63;
        arr[i].b = (i >> 3) & 31;
    }
}
void timeIt(const char *name, void (*f)(void))
{
    int i;
    clock_t t = clock();

    for(i = 0; i < 5000; i++)
    {
        f();
    }
    t = clock() - t;
    printf("%s: %8.6f\n", name, (double)t / CLOCKS_PER_SEC);
}

#define TIME_IT(f) do { timeIt(#f, f); } while(0)
int main(void)
{
    TIME_IT(rgb1);
    TIME_IT(rgb2);
    TIME_IT(rgb3);
    return 0;
}
With MS Visual Studio .Net:
Code:
rgb1: 1.515000
rgb2: 1.313000
rgb3: 1.281000
So, not a whole lot of difference between the solutions, with the winner being the one using whole bytes - but only by a little bit. rgb1 and rgb2 differ only in minor details in the assembler code: the rgb1 version uses a partial-load sequence where a 32-bit register is cleared to zero before loading the lowest byte, whilst rgb2 uses a full 32-bit load in the first place.
Using gcc-mingw:
Code:
rgb1: 7.781000
rgb2: 1.234000
rgb3: 1.281000
Clearly, gcc-mingw misses out on a few "tricks" in the optimisation game here. Maybe someone who has access to gcc 4.x can try it out and see if that version has done a better job.
--
Mats