# Thread: Representing floats with color?

1. To get rid of rounding errors I'm going to do this:

Code:
`val = floor(val * 1000.0f);`
Which gives me no decimals, but three decimals to work with later when the value is multiplied by 0.001f.
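
For reference, here is a minimal sketch of that scaling idea, assuming the goal is to carry three decimal places in an integer. The helper names `to_millis` and `from_millis` are made up for illustration. Two caveats: `floor` rounds toward negative infinity, and a value like 1.234f is not exactly representable, so the product can land just below the integer you expect.

Code:
```cpp
#include <cmath>

// Hypothetical helpers, not from the original post.
// Scale by 1000 and drop the fraction: three decimals kept as an integer.
long to_millis(float val) {
    return static_cast<long>(std::floor(val * 1000.0f));
}

// Undo the scaling; the result is only as exact as float allows.
float from_millis(long millis) {
    return static_cast<float>(millis) * 0.001f;
}
```

If round-to-nearest is what you actually want, `std::lround(val * 1000.0f)` is one alternative to `floor`.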

2. O_o

Color me confused...

So, what you want is a way to freely and arbitrarily convert between a set of four integer values of unknown range and a single floating point value?

If you have infinite precision libraries available this isn't a problem.

If you want to freely and arbitrarily convert between a set of four eight bit integer values and a single 32 bit IEEE floating point value you do have problems. (This is a problem of representable information having nothing to do with rounding errors.) A 64 bit IEEE floating point value should manage the conversion fine, but even then you'll have problems with rounding errors.
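
A rough sketch of that representability limit, assuming IEEE 754 `float` (24-bit significand) and `double` (53-bit significand), and assuming the conversion meant is by numeric value. The function names are hypothetical.

Code:
```cpp
#include <cstdint>

// Hypothetical helper: combine four 8-bit channels into one 32-bit integer.
std::uint32_t pack_rgba(std::uint8_t r, std::uint8_t g, std::uint8_t b, std::uint8_t a) {
    return (std::uint32_t(r) << 24) | (std::uint32_t(g) << 16) |
           (std::uint32_t(b) << 8) | std::uint32_t(a);
}

// True if the packed value survives a trip through float unchanged.
// The comparison is done in double to avoid converting an out-of-range
// float back to an integer.
bool survives_float(std::uint32_t packed) {
    float f = static_cast<float>(packed);
    return static_cast<double>(f) == static_cast<double>(packed);
}

// True if the packed value survives a trip through double unchanged.
bool survives_double(std::uint32_t packed) {
    double d = static_cast<double>(packed);
    return static_cast<std::uint32_t>(d) == packed;
}
```

Under those assumptions, any packed value above 2^24 that needs more than 24 significant bits is rounded by `float`, while `double` holds every 32-bit integer exactly.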

What I'm asking is: what operations are you performing that impose the requirement you seem to be holding? Or, if I'm wrong about what you want, can you try to rephrase it for me?

Soma

3. I also have to say I really don't get what you want to do. You have a float (4 bytes) and you say you want to store colors there. Then you can just do that. A float is nothing more than 4 bytes in memory. You can use bitwise operators or indexing if you want. Like:
Code:
```
#include <iostream>

int main() {
    float f = 12.1f;
    // View the float's storage as raw bytes
    unsigned char* ptr = (unsigned char*)&f;
    ptr[0] = 255;
    ptr[1] = 0;
    ptr[2] = 3;
    ptr[3] = 2;
    for (int i = 0; i < 4; ++i)
        std::cout << (unsigned int)ptr[i] << std::endl;
}
```
I am guessing that you need the colors and then you want to assign a float number to it?
Like:
Code:
```
typedef unsigned char uchar;

// Overwrite the float's four bytes with the color channels
void putColor(float* data, uchar R, uchar G, uchar B, uchar A)
{
    uchar* ptr = (uchar*)data;
    ptr[0] = R; ptr[1] = G; ptr[2] = B; ptr[3] = A;
}

// Store an ordinary float in the same slot
void putThickness(float* data, float thickness)
{
    *data = thickness;
}
```
so you use one bitmap that can store colors AND thickness, but not both at a given time??

4. Originally Posted by C_ntua
I am guessing that you need the colors and then you want to assign a float number to it?
Like:
Code:
```
typedef unsigned char uchar;

// Overwrite the float's four bytes with the color channels
void putColor(float* data, uchar R, uchar G, uchar B, uchar A)
{
    uchar* ptr = (uchar*)data;
    ptr[0] = R; ptr[1] = G; ptr[2] = B; ptr[3] = A;
}

// Store an ordinary float in the same slot
void putThickness(float* data, float thickness)
{
    *data = thickness;
}
```
so you use one bitmap that can store colors AND thickness, but not both at a given time??
If that's true, he can easily use a union for that:
union {
COLOR clr;
float thickness;
};

About my union-post, that was merely a joke. I still fail to grasp why one would do such a thing. I hoped that my post would make it clear that the sizes of 4 bytes and floats are usually the same, so you wouldn't actually use less space. And there are no calculations you can't do on 4 bytes that you can do on a float.

So, sorry, I still don't get it :P

5. A float is nothing more than 4 bytes in the memory.
That depends on the floating point model and what value an instance may have.

I think this is exactly what Elysia was talking about. There is no problem storing and retrieving the correct number of bytes as a floating point value. It just may happen that the value you retrieve is not a valid binary representation of a floating point instance of the relevant model.

For example, if the bits in the four bytes just happen to be an IEEE floating point representation for 'NaN', you will not be able to do anything with the value you retrieve, including, depending on the hardware, storing it again.
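
To make the NaN case concrete, here is a small sketch, assuming a 32-bit IEEE 754 `float` that shares byte order with 32-bit integers. `float_from_bits` is a hypothetical helper; the bit pattern `0x7FC00000` is a quiet NaN in that format, and `0x42480000` is 50.0f.

Code:
```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");

// Hypothetical helper: reinterpret a 32-bit pattern as a float via memcpy.
float float_from_bits(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

A value built this way from arbitrary color bytes can fail `x == x` and make `std::isnan(x)` return true, so any arithmetic or comparison on it is meaningless even though the bytes themselves are intact.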

Soma

6. Indeed. Floating points are more complicated than just "4 bytes in the memory." That might hold true with integers, but not floating point.
As my example clearly illustrated, assigning the individual bytes of the float provides a nonsensical floating point value.

7. The question was about assigning a float to a COLOR class. Elysia, didn't your union example show that you can assign the 4 chars and get the value you want? You might end up with a nonsensical float value or possibly a NaN, but why would that matter, since you won't use the floating point value? You will use the 4 chars only. Before you use the float value again you will assign it first.
So either you assign colors and read colors, or you assign a float and read a float.

So wouldn't this also work? (Don't have a C++ compiler from where I am)
Code:
```
#include <iostream>

union something
{
    float fv;
    unsigned char sv[4];
};

int main()
{
    something e;
    e.fv = 50.0f;
    std::cout << "Float: " << e.fv << std::endl;
    e.sv[0] = 0x00;
    e.sv[1] = 0x00;
    e.sv[2] = 0x48;
    e.sv[3] = 0x42;
    std::cout << "Chars: " << std::hex << (int)e.sv[0] << ", " << (int)e.sv[1] << ", " << (int)e.sv[2] << ", " << (int)e.sv[3] << std::endl;

    //run test again
    e.fv = 1231.231f;
    std::cout << "Float: " << e.fv << std::endl;
    e.sv[0] = 0x00;
    e.sv[1] = 0x00;
    e.sv[2] = 0x48;
    e.sv[3] = 0x42;
    std::cout << "Chars: " << std::hex << (int)e.sv[0] << ", " << (int)e.sv[1] << ", " << (int)e.sv[2] << ", " << (int)e.sv[3] << std::endl;
}
```
If it does I see no problem.

And I know the union idea was just a joke, but it seemed kind of OK to me. What I posted is worse, of course; I just wanted to say that you can do it "even with that".

8. It's wrong because it's a C way.
I clearly showed a better way using C++ arrays to store values in an array.
Plus there is no guarantee that a float will be 4 bytes.

And since the float part is not used, you can throw it away and you basically have a struct of 4 chars.
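
On the point that a float is not guaranteed to be 4 bytes: one way to make that assumption explicit is a compile-time check. This is only a sketch; `float_is_ieee_binary32` is a made-up name, while `std::numeric_limits<float>::is_iec559` and `CHAR_BIT` are standard.

Code:
```cpp
#include <climits>
#include <limits>

// Hypothetical helper: true when float is a 32-bit IEEE 754 type on this platform.
constexpr bool float_is_ieee_binary32() {
    return std::numeric_limits<float>::is_iec559 &&
           sizeof(float) * CHAR_BIT == 32;
}

// Refuse to compile on platforms where the byte-level tricks would not apply.
static_assert(float_is_ieee_binary32(), "code assumes a 32-bit IEEE 754 float");
```

With a check like this in place, code that pokes at the four bytes at least fails loudly on a platform where the assumption does not hold.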

9. Code:
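```
float f = 1.234f;
int i = *(int *)&f;  // assumes a 32-bit int; formally breaks strict aliasing

char a = i>>24;
char b = i>>16;
char g = i>>8;
char r = i;

// the masks strip the sign extension that promoting a negative char produces
int result = ((a<<24)&0xFF000000) | ((b<<16)&0xFF0000) | ((g<<8)&0xFF00) | (r&0xFF);
float floatResult = *(float *)&result;
```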
the union makes more sense though.

not sure what all this hubbub is about 'how it depends on the floating point model'. the floating point model will be the same on the computer running it, no? the question was, essentially, how to convert 4 chars (possibly aggregated as an int) to a float.

if your RGBA values are a contiguous 4 char array, you can do it just by casting.

Code:
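```
unsigned char RGBA[4]={0xaa,0xbb,0xcc,0xdd};  // plain char would narrow values above 0x7f
float f = *(float *)RGBA;  // relies on the array being suitably aligned for a float
```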
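
The pointer casts above work on most compilers, but formally they run into strict aliasing and alignment rules. A sketch of the same byte-for-byte conversion using `std::memcpy`, assuming a 4-byte float; the helper names are hypothetical.

Code:
```cpp
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 4-byte float");

// Hypothetical helpers: copy the object representation instead of casting pointers.
void float_to_bytes(float f, unsigned char out[4]) {
    std::memcpy(out, &f, 4);
}

float bytes_to_float(const unsigned char in[4]) {
    float f;
    std::memcpy(&f, in, 4);
    return f;
}
```

Which byte ends up in which channel still depends on the machine's byte order, exactly as with the casts.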

10. Originally Posted by Elysia
It's wrong because it's a C way.
I clearly showed a better way using C++ arrays to store values in an array.
Plus there is no guarantee that a float will be 4 bytes.

And since the float part is not used, you can throw it away and you basically have a struct of 4 chars.
But the array won't be able to store any float number as requested. Well, actually it would if you use two elements for the decimal part and two for the rest, but that would be slower.

I don't see a C++ way to do this, since the obvious solution is to make another Bitmap with floats in order to do your job. But then you can have a reduction in performance, because your color Bitmap would (possibly) already be in the cache, but if you needed to load another Bitmap then you would lose time.

So why not do it the C way if it provides a performance gain? You want to use one thing both as chars and as a float. The union seems perfect for this job, since that is what it does. Bitwise operators will do the job too, but they might be slower.

11. 1) Do not make premature optimizations. Don't talk about cache and assume. Run it first, then see if it can be optimized.
2) What use is there for the float, really? You cannot directly convert between chars and floats using this way.
3) It is better to use 4 chars and 1 float separately, unless memory is really a problem.
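
Point 3 could look something like the sketch below. `Pixel` and `make_pixel` are hypothetical names, and the layout is an assumption about what the original poster needs.

Code:
```cpp
#include <cstdint>

// Hypothetical pixel type: color and thickness live side by side,
// so reading one never depends on the bits of the other.
struct Pixel {
    std::uint8_t r, g, b, a;
    float thickness;
};

Pixel make_pixel(std::uint8_t r, std::uint8_t g, std::uint8_t b,
                 std::uint8_t a, float thickness) {
    return Pixel{r, g, b, a, thickness};
}
```

The trade-off is size: on common platforms this is typically 8 bytes per pixel instead of 4, which is the memory cost being argued about.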

12. Originally Posted by Elysia
2) What use is there for the float, really? You cannot directly convert between chars and floats using this way.
the equipment and servers that run the factories that produce all your electronics do it all the time.

13. I mean either you use the chars or the float and not both, because you simply cannot convert between them this way, and so they should be separate. I still don't know what it's all about, but I digress. Not my area of expertise.

14. Even if C_ntua is right, considering his interpretation of the question, thickness and RGB values are separate properties of colors that deserve valid representations in any structured/compound type. So Elysia is still right: you should not assume that several integer values stored contiguously form a unique, valid floating-point representation.

If memory is an issue then an algorithm is required to output a valid float with the RGB as input.
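
One possible algorithm of that kind, offered only as a sketch: treat the three 8-bit channels as a 24-bit integer and store that integer's value in the float. It assumes an IEEE 754 float, whose 24-bit significand represents every integer up to 2^24 - 1 exactly. The function names are made up, and an alpha channel would not fit.

Code:
```cpp
#include <cstdint>

// Hypothetical helpers. Encode 24 bits of RGB as an integer-valued float.
// Values up to 2^24 - 1 are exactly representable in an IEEE 754 float.
float rgb_to_float(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    std::uint32_t packed = (std::uint32_t(r) << 16) | (std::uint32_t(g) << 8) | b;
    return static_cast<float>(packed);
}

// Recover the channels from a float produced by rgb_to_float.
void float_to_rgb(float value, std::uint8_t& r, std::uint8_t& g, std::uint8_t& b) {
    std::uint32_t packed = static_cast<std::uint32_t>(value);
    r = static_cast<std::uint8_t>(packed >> 16);
    g = static_cast<std::uint8_t>((packed >> 8) & 0xFF);
    b = static_cast<std::uint8_t>(packed & 0xFF);
}
```

Every float produced this way is an ordinary finite number, so it can be copied, compared, and stored without the NaN concerns raised earlier in the thread.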

We also need more clarification from the original poster, so I don't understand why everyone keeps posting when they only know half the real question.

15. I've pretty much kept saying that this union approach is not the right one. It stinks, even for C.
