# Thread: Representing floats with color?

1. Originally Posted by phantomotap
I want to know what he is trying to solve.
I interpret it as:

1) He has a bitmap, a 2D array of RGB values (pixels)
2) He performs some operation on the bitmap resulting in a "thickness" value for each pixel
3) This "thickness" is a float value
4) He wants to store this "thickness" map, but refuses to save it as a new array and instead wants to reuse the bitmap
5) He converts the "thickness" into RGBA (float to 4 bytes, for each pixel) and places it in the bitmap
6) When all pixels are converted he traverses all pixels to find the min and max "thickness" value
7) Using min/max he normalizes all "thickness" values in the bitmap (into range 0.0 -> 1.0)
8) He then converts this new normalized "thickness" into a grayscale color
9) his bitmap, whatever it is, is now done
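
Steps 6 to 8 of the list above can be sketched roughly as follows. This is only an illustration: the function name `thicknessToGray` is made up, and the thickness values sit in their own `std::vector` for clarity, whereas the original poster wants them inside the bitmap itself.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Map each thickness to a gray level 0..255 after min/max normalization.
// thicknessToGray is a hypothetical name, not something from the thread.
std::vector<std::uint8_t> thicknessToGray(const std::vector<float>& thickness)
{
    std::vector<std::uint8_t> gray(thickness.size(), 0);
    if (thickness.empty()) return gray;

    // Step 6: one pass to find the smallest and largest thickness.
    const auto mm = std::minmax_element(thickness.begin(), thickness.end());
    const float minT = *mm.first;
    const float range = *mm.second - minT;

    for (std::size_t i = 0; i < thickness.size(); ++i) {
        // Step 7: normalize into 0.0 .. 1.0 (a flat map has range 0, so use 0).
        const float n = range > 0.0f ? (thickness[i] - minT) / range : 0.0f;
        // Step 8: scale to 0..255, rounding to the nearest gray level.
        gray[i] = static_cast<std::uint8_t>(n * 255.0f + 0.5f);
    }
    return gray;
}
```

For example, thickness values 2, 4 and 6 come out as gray levels 0, 128 and 255.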

2. An example:

Here is one way to store a value up to 99999.999f as a color if you would do it manually.

Color col;
col.r = 99;
col.g = 99;
col.b = 99;
col.a = 99;

While calculating you would first multiply the number by 1000 to get three of the decimals as whole numbers and then store sets of 2 digits in each color channel.

That is not a high enough number though, but since each color channel could also have an additional 1 before the 99 that could be used in some way to increase the number.

This is just a simple example, there is probably a better way, but something like this is what I want to do.
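
A rough sketch of the two-digits-per-channel idea described above, assuming 8-bit channels. `Color`, `encodeDigits` and `decodeDigits` are hypothetical names. One thing worth knowing: a 32-bit float holds only about 7 significant decimal digits, so the literal `99999.999f` is actually stored as `100000.0f`, which is why the sketch clamps to eight digits.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

struct Color { std::uint8_t r, g, b, a; };  // hypothetical 8-bit RGBA

// Store value * 1000 as four pairs of decimal digits, r being the most significant.
Color encodeDigits(float value)
{
    assert(value >= 0.0f);
    // Scale in double, then clamp so the result never needs more than 8 digits.
    const double scaledD = std::min(value * 1000.0, 99999999.0);
    long scaled = std::lround(scaledD);
    Color c;
    c.a = static_cast<std::uint8_t>(scaled % 100); scaled /= 100;
    c.b = static_cast<std::uint8_t>(scaled % 100); scaled /= 100;
    c.g = static_cast<std::uint8_t>(scaled % 100); scaled /= 100;
    c.r = static_cast<std::uint8_t>(scaled % 100);
    return c;
}

// Rebuild the value from the four digit pairs.
float decodeDigits(const Color& c)
{
    const long scaled = ((c.r * 100L + c.g) * 100L + c.b) * 100L + c.a;
    return static_cast<float>(scaled / 1000.0);
}
```

Each channel only ever uses 0..99 of its 0..255 range, so this layout wastes more than half of every byte.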

3. but you keep saying the floats are bounded from 0-255. you aren't being very consistent. here's how you could get floats in and out of a bitmap.

Code:
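```
float input = 99999.999f;
int x = 0;
int y = 0;
bitmap b;
// store: reinterpret the float's 32 bits as the pixel's int
b.pixel[x][y] = *(int *)&input;

// load: take the pixel's address (&, not *) and read it back as a float
float output = *(float *)&b.pixel[x][y];
```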
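
Reading a `float` through an `int` pointer (or the reverse) breaks the strict aliasing rules of C and C++, so an optimizing compiler is allowed to miscompile it. A `std::memcpy` version does the same bit copy in a well-defined way. This is a sketch assuming a 32-bit pixel; `floatToPixel` and `pixelToFloat` are hypothetical names.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch expects a 32-bit float");

// Copy the float's bits into a 32-bit pixel without any pointer casts.
std::uint32_t floatToPixel(float value)
{
    std::uint32_t pixel;
    std::memcpy(&pixel, &value, sizeof pixel);
    return pixel;
}

// Copy the pixel's bits back into a float; this recovers the value exactly.
float pixelToFloat(std::uint32_t pixel)
{
    float value;
    std::memcpy(&value, &pixel, sizeof value);
    return value;
}
```

Compilers typically turn these small fixed-size copies into a single register move, so there is usually no speed penalty compared with the cast.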

4. One possible method, although it might be too computational, would be to take a slice of the HSV color space across constant V, which is a circle, then define a spiral which begins at the center of this circle and spirals to the edge -- the value of the float is then the absolute distance along this spiral to the given color.

I don't know if that's even feasible for your application though.
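
A very rough sketch of the spiral idea, with several simplifications that are assumptions and not part of the post: the input is already normalized to 0..1, the position is measured as a fraction of the spiral parameter and not as true arc length, and the spiral makes `kTurns` revolutions. Saturation plays the role of the radius and hue the role of the angle. All names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// hue is expressed in turns [0, 1), sat in [0, 1]; V is held constant.
struct HueSat { float hue; float sat; };

const int kTurns = 8;  // number of revolutions, an arbitrary choice

// Walk a fraction t of the way along the spiral from the center to the rim.
HueSat encodeSpiral(float t)
{
    assert(t >= 0.0f && t <= 1.0f);
    const float u = t * kTurns;          // revolutions completed so far
    return { u - std::floor(u), t };     // angle within the turn, radius
}

// Saturation says which revolution we are on, hue gives the fine position.
float decodeSpiral(HueSat c)
{
    const float k = std::round(c.sat * kTurns - c.hue);
    return (k + c.hue) / kTurns;
}
```

Because hue carries the fine detail, the saturation only has to be accurate enough to pick the right revolution, so it can be stored coarsely without changing the decoded value.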

5. Originally Posted by Magos
I interpret it as:

1) He has a bitmap, a 2D array of RGB values (pixels)
2) He performs some operation on the bitmap resulting in a "thickness" value for each pixel
3) This "thickness" is a float value
4) He wants to store this "thickness" map, but refuses to save it as a new array and instead wants to reuse the bitmap
5) He converts the "thickness" into RGBA (float to 4 bytes, for each pixel) and places it in the bitmap
6) When all pixels are converted he traverses all pixels to find the min and max "thickness" value
7) Using min/max he normalizes all "thickness" values in the bitmap (into range 0.0 -> 1.0)
8) He then converts this new normalized "thickness" into a grayscale color
9) his bitmap, whatever it is, is now done
Yeah that is pretty much the scenario but the operations I do to get the thickness value is on a 3d object. Of course step 5 needs to be very fast. After this there is much joy all around and some fireworks go off in the distance...

6. I haven't rejected anything but haven't seen a complete solution so far.
You did, but that's neither here nor there.

I also said that the [...] thing you just did.
I want to store a [...] by some logic.
Okay, so you say you want to store a single 'float' value that represents... something, a "thickness" you say, in a class containing a quartet of 'float' values ranging from 0-255 representing an RGBA color reference of the associated pixel in a bitmap?

What is there to not understand?
I'll tell you what I don't understand: why you'd want to pollute one representation with another, why you think that it is going to be faster relying on this pollution, why you alter the requirements, why we can't get a reasonable set of requirements, why you didn't tell us beforehand the latest offered requirements... the list goes on.

While calculating you would [...] then store sets of 2 digits in each color channel.
Okay, so something like the code below?

That is not a high enough [...] in some way to increase the number.
What do you mean it isn't a high enough number? You need a big number? What for? Earlier you said three decimals is fine, what happened to that?

This is just a simple example, there is probably a better way, but something like this is what I want to do.
Something like this, or exactly this? What happened to the storage requirement?

Yeah that is pretty much [...] value is on a 3d object.
So all you want to do is store a single 32 bit floating point value in a class with four floating point values each capable of representing values corresponding to the 8 bit range? Easy, and you have been offered the solution multiple times! All that needs to change: passing the four bytes you get to the class which stores each byte as a separate channel of the RGBA value.

Soma

Code:
```#include <iostream>
#include <cmath>

int main()
{
float alpha(255.0f);
float thickness(18.0f);
float joined_value((255.0f * 1000.0f) + thickness);
float restored_alpha(std::floor(std::fmod(joined_value / 1000.0f, 1000.0f)));
float restored_thickness(std::floor(std::fmod(joined_value, 1000.0f)));
std::cout << restored_alpha << '\n';
std::cout << restored_thickness << '\n';
return(0);
}```

7. Originally Posted by phantomotap
Okay, so you say you want to store a single 'float' value that represents... something, a "thickness" you say, in a class containing a quartet of 'float' values ranging from 0-255 representing an RGBA color reference of the associated pixel in a bitmap?
Yes.

Originally Posted by phantomotap
I'll tell you what I don't understand: why you'd want to pollute one representation with another, why you think that it is going to be faster relying on this pollution, why you alter the requirements, why we can't get a reasonable set of requirements, why you didn't tell us beforehand the latest offered requirements... the list goes on.
Polluting? I would rather call it killing two flies with one hit. At least that is the saying in Swedish.

Originally Posted by phantomotap
What do you mean it isn't a high enough number? You need a big number? What for? Earlier you said three decimals is fine, what happened to that?
Three decimals is fine but the way I did it in the example there is not room for any more digits. With "high" I mean like 2.0 is higher than 1.0.

Originally Posted by phantomotap
Something like this, or exactly this? What happened to the storage requirement?
Something like that.

Originally Posted by phantomotap
So all you want to do is store a single 32 bit floating point value in a class with four floating point values each capable of representing values corresponding to the 8 bit range? Easy, and you have been offered the solution multiple times! All that needs to change: passing the four bytes you get to the class which stores each byte as a separate channel of the RGBA value.

Soma

Code:
```#include <iostream>
#include <cmath>

int main()
{
float alpha(255.0f);
float thickness(18.0f);
float joined_value((255.0f * 1000.0f) + thickness);
float restored_alpha(std::floor(std::fmod(joined_value / 1000.0f, 1000.0f)));
float restored_thickness(std::floor(std::fmod(joined_value, 1000.0f)));
std::cout << restored_alpha << '\n';
std::cout << restored_thickness << '\n';
return(0);
}```
I don't get what you are doing here, what is the joined_value used for and in what way does the thickness value get mapped into the 0 to 255 range? Are you saying an additional step is needed to split up the float into its bytes? If so why not do that directly to the thickness value? How would that be done and how would you later reassemble the bytes to a float?
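
As for splitting the float into its bytes and reassembling it, one possible sketch is below. It assumes the color class holds four float channels that each carry a whole number from 0 to 255, as described earlier in the thread. `ColorF`, `floatToChannels` and `channelsToFloat` are hypothetical names.

```cpp
#include <cstring>

static_assert(sizeof(float) == 4, "sketch expects a 4-byte float");

struct ColorF { float r, g, b, a; };  // hypothetical: each channel holds 0..255

// Take the four raw bytes of the float and park one in each channel.
ColorF floatToChannels(float thickness)
{
    unsigned char bytes[4];
    std::memcpy(bytes, &thickness, sizeof bytes);
    return { static_cast<float>(bytes[0]), static_cast<float>(bytes[1]),
             static_cast<float>(bytes[2]), static_cast<float>(bytes[3]) };
}

// Collect the four bytes again and copy them back over a float.
float channelsToFloat(const ColorF& c)
{
    const unsigned char bytes[4] = {
        static_cast<unsigned char>(c.r), static_cast<unsigned char>(c.g),
        static_cast<unsigned char>(c.b), static_cast<unsigned char>(c.a) };
    float thickness;
    std::memcpy(&thickness, bytes, sizeof thickness);
    return thickness;
}
```

The round trip is exact, but the stored bytes are not a meaningful color, and anything that blends or rescales the channels in between will corrupt the value.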

8. >> Yeah that is pretty much the scenario but the operations I do to get the thickness value is on a 3d object. Of course step 5 needs to be very fast. After this there is much joy all around and some fireworks go off in the distance...

So if that's all you need, why not just cast? eg:

Code:
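```
assert( sizeof( float ) == 4 && sizeof( color ) == 4 );
// first pass: replace each color with its thickness, in place
{
    color& clr = *seq++;
    // reinterpret_cast is needed here; static_cast cannot rebind color to float
    float& flt = reinterpret_cast< float& >( clr );
    flt = ColorToThickness( clr );
}
// second pass: replace each stored thickness with its final color
{
    color& clr = *seq++;
    float& flt = reinterpret_cast< float& >( clr );
    clr = ThicknessToColor( flt );
}
```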

9. ...or you aren't actually reading my posts. The information you need is simple. We have 4 values that go from 0 to 255. How do we best make them represent a float value that is as big as possible with as many decimals as possible? (3 decimals is minimum)
Well if you have 4 values that range from 0 to 255 and you want to represent them as a float then each value needs to represent some portion of the whole. In normal 32-bit color operations the values are represented as:

unsigned int color = (a << 24) + (r << 16) + (g << 8) + b;

The maximum value of this is MAX_INT (meaning here the largest 32-bit unsigned value, 0xFFFFFFFF). The floating point value for this color is a simple normalization.

float color_coef = static_cast<float>(color) / static_cast<float>(MAX_INT);

However this is not a good way to represent color since it requires you to find the value for your color and then extract the RGB.
To find the color you would have to do this:

Code:
```unsigned int color = color_coef * MAX_INT;

unsigned char alpha = (color & 0xFF000000) >> 24;
unsigned char red =    (color & 0x00FF0000) >> 16;
unsigned char green = (color & 0x0000FF00) >> 8;
unsigned char blue = (color & 0x000000FF);```
'Thickness' has me completely stumped. Normally if you want to use algorithms and structures for color you would use the widely known, widely accepted formats and algorithms instead of your own custom one. To me the problem you are trying to solve does not fit the data structure you have chosen.

Trying to make 4 chars in range 0 to 255 represent a float without any information about how these 4 values are used to compute the final float is impossible. Whether it's 4 chars, floats, doubles, etc, it is just data and meaningless unless you know how to interpret and use it.

10. Whilst it would be possible to do what Bubba says, if you do that in a float, data will be lost - there are only 23 stored bits of mantissa in a float (24 counting the implicit leading bit) - the remaining bits are sign and exponent, which will take the appropriate value according to the size and sign of the resulting number (so the amount of lost data will depend on the exact value of the float being used).

--
Mats
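
The loss can be seen by pushing a packed 32-bit color through a float and back. This is a small sketch with hypothetical names (`packARGB32`, `survivesFloat`); the comparison goes through a 64-bit integer because 0xFFFFFFFF rounds up to 2^32 as a float, which no longer fits in 32 bits.

```cpp
#include <cstdint>

// Pack four 8-bit channels into one 32-bit value, alpha in the top byte.
std::uint32_t packARGB32(std::uint8_t a, std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    return (static_cast<std::uint32_t>(a) << 24) |
           (static_cast<std::uint32_t>(r) << 16) |
           (static_cast<std::uint32_t>(g) << 8)  |
            static_cast<std::uint32_t>(b);
}

// Convert the packed color to float and back; report whether it came back intact.
bool survivesFloat(std::uint32_t color)
{
    const float asFloat = static_cast<float>(color);
    return static_cast<std::uint64_t>(asFloat) == color;
}
```

With alpha at 0 the value fits in 24 bits and always survives. Once alpha is non-zero the low bits of blue start to be rounded away: `packARGB32(1, 2, 3, 5)` does not survive, while `packARGB32(1, 2, 3, 4)` happens to.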

11. Hehe. I didn't say it was a good idea.

12. If you don't want to union it you can pack RGB values into a float using something like this, which avoids encountering NaNs etc:
Code:
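```
#include <cmath>

float packARGB(int a, int r, int g, int b)
{
    // significand: leading 0.5, then 8 bits of r, 8 of g and the top 7 of b
    // (a float has 24 significand bits, so the lowest bit of b does not fit)
    return std::ldexp(0.5f + r/512.f + g/131072.f + (b >> 1)/16777216.f, a >> 2);
}

void unpackARGB(float f, int &a, int &r, int &g, int &b)
{
    f = std::frexp(f, &a);   // frexp takes a pointer to the exponent
    a <<= 2;
    r = (int)(f*512.f) & 0xFF;
    g = (int)(f*131072.f) & 0xFF;
    b = ((int)(f*16777216.f) & 0x7F) << 1;
}
```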
Assuming I got it right (untested code above) it packs the alpha into the exponent, and the rgb into the significand. It drops a few least significant bits here and there (1 of blue and 2 of alpha), but this could be lessened by using the sign bit as well etc.

Edit: It seems that I lost sight of the fact that you want to start with a float, chop that up into ARGB values, and then reconstruct the float out of that. The above code still works great for that, and you can use a huge range of values for the initial float value. It does however mean that you can probably ignore the comment about the sign bit. If your numbers aren't negative then that obviously needn't come into it. You can probably drop the shift left and right on a as well.
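
For the direction described in the edit (start from a float, chop it into ARGB, rebuild the float), a variant of the same `frexp`/`ldexp` idea could look like the sketch below. It assumes a positive, normalized 32-bit float and 8-bit channels; `Argb`, `splitFloat` and `joinFloat` are hypothetical names. The exponent goes into alpha and the 23 stored significand bits are spread over r, g and b.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct Argb { std::uint8_t a, r, g, b; };  // hypothetical 8-bit channels

Argb splitFloat(float f)
{
    assert(std::isnormal(f) && f > 0.0f);
    int exponent = 0;
    const float m = std::frexp(f, &exponent);   // f == m * 2^exponent, m in [0.5, 1)
    // m has at most 24 significant bits, so m * 2^24 is an exact integer
    // in [2^23, 2^24); bit 23 is always set and need not be stored.
    const std::uint32_t bits = static_cast<std::uint32_t>(std::ldexp(m, 24));
    Argb c;
    c.a = static_cast<std::uint8_t>(exponent + 127);       // -125..128 becomes 2..255
    c.r = static_cast<std::uint8_t>((bits >> 15) & 0xFF);  // significand bits 22..15
    c.g = static_cast<std::uint8_t>((bits >> 7) & 0xFF);   // significand bits 14..7
    c.b = static_cast<std::uint8_t>(bits & 0x7F);          // significand bits 6..0
    return c;
}

float joinFloat(const Argb& c)
{
    // Restore the implicit top bit, then undo both scalings in one ldexp call.
    const std::uint32_t bits = 0x800000u | (std::uint32_t(c.r) << 15) |
                               (std::uint32_t(c.g) << 7) | (c.b & 0x7Fu);
    return std::ldexp(static_cast<float>(bits), int(c.a) - 127 - 24);
}
```

Under those assumptions the round trip is exact, and nearby float values tend to produce nearby alpha and red values, unlike a raw byte copy on a little-endian machine where the low-order byte lands in the first channel.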

13. Originally Posted by DrSnuggles
What is there to not understand? It is the logic for how the conversion is done that is the challenge here.
This thread is fascinating -- especially in that it's so difficult to understand the question. It is clear that you want to store thickness data in 4 bytes - unsigned chars, presumably, since you want to store them in a bitmap array. I understand that you want to take a float value and "somehow" store it in the array. And I understand that you want to take the data from the array and recover the float. What there is not to understand is this: Are you wanting to store the floats in the bmp array only to avoid wasting space on another array? OR do you also require the ability to take the array while it has the floats encoded in it, and SEE it as an actual visual image?

14. I have to say I am lost here.
Especially from the fact that you have 4 floats for the colors and you want to store one float and that is not possible for whatever reason.
Second, you say you don't want to use a union. I understand that you don't want to use a union because you have already a class. Then just use the float as an array of 4 chars as suggested. Or use a temporary union.

In any case just post your code. Or a pseudo code. There is really no sense proposing solutions when we don't know EXACTLY what you are doing.

I understand what you want to do from the beginning I think. Because your example is like splitting the digits of the floats and storing them in a char array. But there are more "clever" solutions. That is a really general solution, not taking into account how things are stored in the memory.

15. Hey, I appreciate all the input. I have found a solution to this that should be fast enough to use, so it is a win situation. I save memory and can work with a single structure (the bitmap) to do all I need. It is similar to some of the latest posts. I split an integer into its bytes. Here is the code:

Converting to color:

Code:
```float thickness = 678460.545f;
int val = (int) (thickness * 1000.0f); //Gives me three decimals which is enough
float div = 1.0f / 255.0f;
float r = (float) ((int) (val & 0x000000FF) >> 0) * div;
float g = (float) ((int) (val & 0x0000FF00) >> 8) * div;
float b = (float) ((int) (val & 0x00FF0000) >> 16) * div;
float a = (float) ((int) (val & 0xFF000000) >> 24) * div;```
Converting back to float:

Code:
```byte b1 = (byte) ((int)(r * 255.0f));
byte b2 = (byte) ((int)(g * 255.0f));
byte b3 = (byte) ((int)(b * 255.0f));
byte b4 = (byte) ((int)(a * 255.0f));
int vv = 0;
vv += (int) ((b1 & 0x000000FF) << 0);
vv += (int) ((b2 & 0x000000FF) << 8);
vv += (int) ((b3 & 0x000000FF) << 16);
vv += (int) ((b4 & 0x000000FF) << 24);
float thickness2 = vv * 0.001f;```
If you see any optimization that can be done let me know. As an example removing the "& 0x000000FF" in the converting back makes it still work. Not sure if that is safe though...?
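
On the masking question: once a value has been narrowed to an unsigned 8-bit byte it is already in 0..255, so the `& 0x000000FF` on the way back changes nothing as long as the byte type is unsigned. A bigger risk is the plain `(int)` truncation, which would land one too low if a channel ever comes back fractionally below its whole number. The sketch below shows the same fixed-point scheme with rounding. It assumes channels normalized to 0.0..1.0 and a non-negative thickness; `ColorF`, `channelToByte`, `encodeThickness` and `decodeThickness` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct ColorF { float r, g, b, a; };  // hypothetical: channels in 0.0 .. 1.0

// Turn a normalized channel back into its byte, rounding instead of truncating.
std::uint32_t channelToByte(float channel)
{
    return static_cast<std::uint32_t>(channel * 255.0f + 0.5f);
}

ColorF encodeThickness(float thickness)
{
    // Three decimals as fixed point; the scaled value has to fit in 32 bits.
    assert(thickness >= 0.0f && thickness < 4294967.0f);
    const std::uint32_t val =
        static_cast<std::uint32_t>(std::llround(thickness * 1000.0));
    const float div = 1.0f / 255.0f;
    return { ((val >>  0) & 0xFF) * div, ((val >>  8) & 0xFF) * div,
             ((val >> 16) & 0xFF) * div, ((val >> 24) & 0xFF) * div };
}

float decodeThickness(const ColorF& c)
{
    // No masks needed: channelToByte already returns 0..255 for valid channels.
    const std::uint32_t val = channelToByte(c.r)       | channelToByte(c.g) << 8 |
                              channelToByte(c.b) << 16 | channelToByte(c.a) << 24;
    return static_cast<float>(val * 0.001);
}
```

Keep in mind that a float such as `678460.545f` only has about 7 significant digits to begin with, so at that magnitude the three decimals are already approximate before any packing happens.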

Also the talk about thickness is not really relevant here, that is a separate calculation I do on a 3d object that is already covered. Only in the final step will the bitmap display something worth looking at.

Will check out your solution as well iMalc, thanks.