Representing floats with color?

**Elysia** · 12-21-2008

As I see it, it is better to waste memory than do the conversion every time.

**DrSnuggles** · 12-21-2008

Originally Posted by Elysia

As I see it, it is better to waste memory than do the conversion every time.

The thing is, this code might be working with bitmaps as large as 4096x4096 pixels, so that is a lot of memory: an array of over 16 million floats. Plus, the bitmap is already laid out in a good format for checking the neighbors of a value, which will be useful.

I will test both ways though, to see what kind of speed hit the conversion produces. It might be negligible compared to the other calculations I do.
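To put a number on the memory concern, here is a minimal C sketch. The function name is made up for illustration, and it assumes a 4-byte `float`, which is typical but not guaranteed by the language.

```c
#include <stddef.h>

/* Bytes needed for one float per pixel of a width x height bitmap.
   Hypothetical helper, not code from the thread. */
size_t float_buffer_bytes(size_t width, size_t height)
{
    return width * height * sizeof(float);
}
```

For a 4096x4096 bitmap that is 16,777,216 floats, or 64 MiB with 4-byte floats. A 32-bit RGBA bitmap of the same dimensions takes the same 64 MiB, so keeping a separate float array roughly doubles the footprint.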

**C_ntua** · 12-21-2008

You have to test what is better. I recently ran tests on this kind of thing, and I have to tell you that it really depends. There is a trade-off between calculation and memory, and it can even vary from system to system. Without the whole code we can't say anything more.

Code:

float r = (float) ((int) (val & 0x000000FF) >> 0) * div;

Too much casting.
val is int. div is float. The result is float. So no casting is needed at all. When you have int * float, the int is implicitly converted to a float. In any case you wanted this:

Code:

float r = (float)((float) ((val & 0x000000FF) >> 0) * div);

or simpler

Code:

float r = ((val & 0x000000FF) >> 0) * div;

An obvious optimization is:

Code:

float r = ((val & 0x000000FF) >> 0) / 255.0f;

The f suffix makes this a float division.

Also, the & 0x000000FF means that you keep the lowest 8 bits, i.e. 1 byte. So yeah, since you already have a byte it makes no sense. Just omit it.
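The point about implicit conversion can be checked with a small C sketch. Both function names are invented for illustration, and `uint32_t` stands in for the thread's `val`, which is an assumption.

```c
#include <stdint.h>

/* Lowest byte of val scaled by div, with no explicit casts.
   (val & 0xFFu) is an integer; multiplying it by a float converts
   it to float implicitly. */
float low_byte_scaled(uint32_t val, float div)
{
    return (val & 0xFFu) * div;
}

/* The same computation with the conversions spelled out, for comparison. */
float low_byte_scaled_with_casts(uint32_t val, float div)
{
    return (float)((uint32_t)(val & 0xFFu)) * div;
}
```

Both functions should return the same value for any input, so the casts only add noise.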

**DrSnuggles** · 12-21-2008

Thanks, yes all the casting may not be necessary.

I don't agree with the division optimization though. Division is a lot slower than multiplication, so since the division will happen often with the same value, it is better to calculate 1.0f / 255.0f once (outside the loop) and reuse that for multiplication.
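A minimal C sketch of that idea, assuming the pixels sit in a flat array of 32-bit values with red in the low byte. The function name and the array layout are assumptions, not something stated in the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert the low byte of each pixel to a float in [0, 1].
   The reciprocal is computed once, outside the loop, so the loop
   body only multiplies. */
void red_channel_to_float(const uint32_t *pixels, size_t count, float *out)
{
    const float inv255 = 1.0f / 255.0f;  /* the only division */

    for (size_t i = 0; i < count; ++i) {
        out[i] = (pixels[i] & 0xFFu) * inv255;
    }
}
```

Note that multiplying by the rounded reciprocal can differ from a true division in the last bit, which does not matter for color channels but may matter elsewhere.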

**C_ntua** · 12-21-2008

You are probably right on that.

The other thing to try is treating the int as an array of bytes rather than using bitwise operations. Like:

Code:

unsigned char *ptr = (unsigned char *)&val; // view the int's bytes in memory order
float r = ptr[0] * div; // ptr[0] is the low byte only on little-endian machines
float g = ptr[1] * div;
float b = ptr[2] * div;
float a = ptr[3] * div;

This might be faster, might be slower. I would expect it to be a little faster. Try it.

**Magos** · 12-21-2008

Any decent compiler would transform a division-by-constant to a multiplication-by-inverse-constant anyway.

**R.Stiltskin** · 12-21-2008

Originally Posted by DrSnuggles

I don't want to store an array in memory with possibly several million floats. Working with such an array would also be slow. With a bitmap I have the benefit of being able to easially check neighbors of the values as well.

It's still not clear why you want to do so much arithmetic on several million pixels.

Why would working with floats be slower than working with quadruples of 4 chars? And why would checking neighbors be easier with 4 chars? If I understand your solution, you will have to look at all 4 chars for each comparison, since (for example) [255, 0, 0, 0] will be closer to [0,255,255,255] than [0,0,0,0] is to [0,0,0,10]. Correct?

If you are starting out with float values to begin with, and are primarily trying to save space, why is EVOEx's solution not preferable? The bitmap can be an array of unions instead of an array of 4-tuples of chars. Initially you would store the float values there. Do whatever processing you need to do on them, and do only a single conversion at the end, overwriting the float in each pixel with the 4 chars needed to display the image.

What's wrong with that?
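A rough C sketch of the union idea, under stated assumptions: the type and function names are made up, a `float` is assumed to be 4 bytes, and the final conversion (a clamped grayscale mapping) is purely hypothetical, since the thread does not say how the value should be displayed.

```c
#include <stdint.h>

/* One bitmap cell: a float while processing, four color bytes afterwards.
   Both members share the same storage. */
typedef union {
    float   value;
    uint8_t rgba[4];
} Pixel;

/* Hypothetical final step: map value in [0, max_value] to opaque gray.
   After this call the float is gone and only rgba is meaningful. */
void pixel_finalize_gray(Pixel *p, float max_value)
{
    float t = p->value / max_value;  /* read the float before overwriting it */
    if (t < 0.0f) t = 0.0f;
    if (t > 1.0f) t = 1.0f;

    uint8_t gray = (uint8_t)(t * 255.0f + 0.5f);  /* round to nearest */
    p->rgba[0] = gray;
    p->rgba[1] = gray;
    p->rgba[2] = gray;
    p->rgba[3] = 255;
}
```

The bitmap then costs no more memory than the plain 4-byte pixels, and neighbor checks during processing compare `value` directly.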

**m37h0d** · 12-21-2008

Originally Posted by DrSnuggles

Hey, I appreciate all the input. I have found a solution to this that should be fast enough to use, so it is a win situation. I save memory and can work with a single structure (the bitmap) to do all I need. It is similar to some of the latest posts. I split an integer into its bytes. Here is the code:

Converting to color:

Code:

float thickness = 678460.545f;
int val = (int) (thickness * 1000.0f); //Gives me three decimals which is enough
float div = 1.0f / 255.0f;
float r = (float) ((int) (val & 0x000000FF) >> 0) * div;
float g = (float) ((int) (val & 0x0000FF00) >> 8) * div;
float b = (float) ((int) (val & 0x00FF0000) >> 16) * div;
float a = (float) ((int) (val & 0xFF000000) >> 24) * div;

Converting back to float:

Code:

byte b1 = (byte) ((int)(r * 255.0f));
byte b2 = (byte) ((int)(g * 255.0f));
byte b3 = (byte) ((int)(b * 255.0f));
byte b4 = (byte) ((int)(a * 255.0f));
int vv = 0;
vv += (int) ((b1 & 0x000000FF) << 0);
vv += (int) ((b2 & 0x000000FF) << 8);
vv += (int) ((b3 & 0x000000FF) << 16);
vv += (int) ((b4 & 0x000000FF) << 24);
float thickness2 = vv * 0.001f;

If you see any optimization that can be done let me know. As an example removing the "& 0x000000FF" in the converting back makes it still work. Not sure if that is safe though...?

Also the talk about thickness is not really relevant here, that is a separate calculation I do on a 3d object that is already covered. Only in the final step will the bitmap display something worth looking at.

Will check out your solution as well iMalc, thanks.

the one i posted here on page 3 is probably faster

http://cboard.cprogramming.com/showp...6&postcount=39

**m37h0d** · 12-21-2008

this still seems like madness though. why can't you just cast it??

**iMalc** · 12-21-2008

Originally Posted by Magos

Any decent compiler would transform a division-by-constant to a multiplication-by-inverse-constant anyway.

Indeed, and in fact I was relying on this for the code I provided. Interestingly enough though, whether VS2005 and up do this sometimes depends on the floating point consistency model selected for the project. I mean for powers of two it'll surely convert to a multiplication, but for something like dividing by 3 it probably won't use a multiplication by 1/3rd under the precise model, because 1/3 can't be represented exactly, whereas 3 can, and the multiplication result might differ by a couple of least-significant bits in the significand.
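The difference is easy to observe. This is a small C sketch (the function name is made up) that compares a true division against multiplication by the rounded reciprocal. It assumes IEEE 754 single precision and a compiler that is not in a fast-math mode.

```c
/* Returns 1 if x / d and x * (1.0f / d) round to the same float, else 0.
   The volatile locals force each intermediate to be rounded to float. */
int div_and_mul_agree(float x, float d)
{
    volatile float quotient   = x / d;
    volatile float reciprocal = 1.0f / d;
    volatile float product    = x * reciprocal;

    return quotient == product;
}
```

With `d = 2.0f` the two agree, because 0.5 is exactly representable. With `d = 3.0f` they agree for some inputs (such as `x = 3.0f`) and differ by one unit in the last place for others (such as `x = 5.0f`), which is why a compiler in a strict floating point mode has to keep the division.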

Originally Posted by DrSnuggles

Thanks, yes all the casting may not be necessary.

I don't agree with the division optimization though. Division is a lot slower than multiplying so since division will happen often using the same value it is better to calculate it as 1.0f / 255.0f once(outside the loop) and resuse that for multiplication.

Yes division is certainly a lot slower than even 3 multiplications. Any reduction in the number of divisions performed has got to be a win. There's no harm in explicitly doing the optimisation of reducing the divides by hand here.

Note that with the method I posted, you don't need any premultiplication step. It should be able to operate on whatever range of values your float initially contains, and it maintains accuracy for small and large values. By all means use whatever turns out to be fastest though.

Minor optimisation, always do the bitmasks after a right shift:

Code:

float thickness = 678460.545f;
unsigned int val = (unsigned int) (thickness * 1000.0f); //Gives me three decimals which is enough
float div = 1.0f / 255.0f;
float r = (val & 0xFF) * div;
float g = ((val >> 8) & 0xFF) * div;
float b = ((val >> 16) & 0xFF) * div;
float a = (val >> 24) * div;

It takes fewer bytes of machine code to represent smaller integer constants. And leaving out the last bitmask is safe so long as val is positive, hence I've made it unsigned to be sure. I'd also leave out the zero-shift even though it doesn't look as pretty and the compiler would have generated the same code anyway, but that's just me.
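Why the unsigned type matters for the top byte can be shown with a short C sketch. The function names are invented for illustration, and fixed-width types are used as an assumption about the intended 32-bit layout.

```c
#include <stdint.h>

/* Unsigned: the shift fills with zeros, so no mask is needed. */
uint32_t top_byte_unsigned(uint32_t val)
{
    return val >> 24;
}

/* Signed: right-shifting a negative value is implementation-defined
   and usually copies the sign bit, so the mask is required to get
   a clean byte back. */
uint32_t top_byte_signed_masked(int32_t val)
{
    return (uint32_t)(val >> 24) & 0xFFu;
}
```

For a value whose top byte is 0xFF, the unsigned version yields 0xFF directly, while the signed version relies on the mask to discard whatever the shift brought in.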

**m37h0d** · 12-21-2008

couldn't you also bitshift by 8 instead of dividing by 256?

**Magos** · 12-21-2008

Originally Posted by m37h0d

couldn't you also bitshift by 8 instead of dividing by 256?

And you think the compiler makers didn't think of this, why?

**m37h0d** · 12-21-2008

because they've made it floating point multiplication? i guess the real answer is that they can't do that because that would shift all the data off the end, and so they'd get nothing.

this whole thing seems odd to me tho. seems like a lot of trouble to somewhat optimize an inherently inefficient system.

**VirtualAce** · 12-21-2008

byte b1 = (byte) ((int)(r * 255.0f));
byte b2 = (byte) ((int)(g * 255.0f));
byte b3 = (byte) ((int)(b * 255.0f));
byte b4 = (byte) ((int)(a * 255.0f));
int vv = 0;
vv += (int) ((b1 & 0x000000FF) << 0);
vv += (int) ((b2 & 0x000000FF) << 8);
vv += (int) ((b3 & 0x000000FF) << 16);
vv += (int) ((b4 & 0x000000FF) << 24);
float thickness2 = vv * 0.001f;

You have effectively removed over 2 billion colors by selecting int for vv. Colors are unsigned int, which gives you the full range of over 4 billion values in 32-bit color. Negative r, g, b values do not make sense and will most likely result in some color inversion. Your version will overflow the data type you have selected.
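A minimal C sketch of the reverse conversion using an unsigned 32-bit value throughout. The function names are hypothetical. It assumes each channel lies in [0, 1], and the `+ 0.5f` rounding step is an addition of this sketch, not something from the thread, to guard against a channel landing just below a whole number.

```c
#include <stdint.h>

/* Convert one channel in [0, 1] back to a byte value 0..255. */
static uint32_t channel_to_byte(float c)
{
    return (uint32_t)(c * 255.0f + 0.5f) & 0xFFu;
}

/* Reassemble the four channels into one unsigned 32-bit value.
   Unsigned shifts cannot overflow into a sign bit. */
uint32_t join_channels(float r, float g, float b, float a)
{
    return  channel_to_byte(r)
         | (channel_to_byte(g) << 8)
         | (channel_to_byte(b) << 16)
         | (channel_to_byte(a) << 24);
}
```

With all four channels at 1.0f this yields 0xFFFFFFFF, which does not fit in a signed 32-bit int.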

**CodeMonkey** · 12-22-2008

Wow. Long thread.

As it seems to have been repeated quite a lot, it's memory overhead of using two arrays vs. processing overhead of casting (even if it's some custom cast). Well? If you've chosen to let the CPU take the heat, then there are many solutions here (i.e. last six pages). However, I suggest you use two arrays. What's a factor of two among programmers?