RGB to Black&White

**Rustik_** · 08-20-2009

I think guys, i'll do my job in to ways

Anyway, thanx to all for helping me.
Best regards, Rustam.

**VirtualAce** · 08-20-2009

1. The OP asked about black and white, and everybody is talking about grayscale. To get black and white, dithering will be required. There are multiple methods.

For true black and white yes this is true and it's not a simple task since you have black or white or dithered black to make the various grey's. However most people are referring to greyscale when they say black and white.

On a side note it is interesting to see that the human eye actually perceives shades of orange better than reds. Seems that stoplights should have orange for stop instead of red since we are less likely to see red than orange.

**abachler** · 08-21-2009

Originally Posted by Bubba

Seems that stoplights should have orange for stop instead of red since we are less likely to see red than orange.

And more likely to confuse it with yellow which would cause a lot of accidents. The primary colors Red Green Blue are all generally perceived the same, however, intermediate colors like orange (which is seen easier because it stimulates both the red and green receptors) can be perceived differently by different individuals due to small variations in the genes that code for color perception. So the same shade of orange may look more red to me and more yellow to you. So I would stop and you would get ........ed off and fly around me giving me the one finger salute just before a 40 ton truck plastered you all over the road.

**VirtualAce** · 08-21-2009

can be perceived differently by different individuals due to small variations in the genes that code for color perception.

There is a huge difference between orange construction lights and yellow stop lights. They aren't even the same tint. If your color receptors are that far off then you probably shouldn't be driving in the first place.

**brewbuck** · 08-21-2009

Originally Posted by Bubba

There is a huge difference between orange construction lights and yellow stop lights. They aren't even the same tint. If your color receptors are that far off then you probably shouldn't be driving in the first place.

Red-green color blindness occurs at such a high rate that banning people from driving because of color vision deficits would be rather draconian

**VirtualAce** · 08-21-2009

Red-green color blindness occurs at such a high rate that banning people from driving because of color vision deficits would be rather draconian

But it would mean less slow traffic in the left lane for me to deal with.

**brewbuck** · 08-23-2009

Originally Posted by abachler

Actually brew, those coefficients are specifically for conversion of a color NTSC signal to a B&W signal.

That's true. My issue, though, is the level of precision. You quote three digits of precision -- that's 1 part in 1000. A 24-bit RGB image has only 256 levels per channel, not 1000. And it's probably unlikely that any random image will have a color profile which is known correct to 1 part in 1000. So the use of 3 digits of precision in these coefficients is not justifiable.

The pure average is, in a sense, just the same coefficients, but to only (about) a single digit of precision. It's simply acquiescing to the fact that you do not know the original color profile in the first place. Your goal is simply to reduce the number of channels by throwing away color information.

I wager that if we selected 10 random images, and converted them using the standard perceptual coeffs vs. simple averaging and tried a blind quiz, that we would not be very good at all telling which was which. The natural variability is too high.

**abachler** · 08-24-2009

Originally Posted by brewbuck

That's true. My issue, though, is the level of precision. You quote three digits of precision -- that's 1 part in 1000. A 24-bit RGB image has only 256 levels per channel, not 1000. And it's probably unlikely that any random image will have a color profile which is known correct to 1 part in 1000. So the use of 3 digits of precision in these coefficients is not justifiable.

That depends on what the downstream processing looks like. If you are only converting it for human consumption then you could probably get away with 2 or even 1 digit of precision (i.e. 10% 60% 30%), but if you are converting it for input into say a neural network, you will want that extra ratio preserved. Remember, when you take a small percentage from one color its going to another color, so it may over-represent that colors effect in the overall image, and ultimately you may end up processing just a dirty green image etc. If you keep the ratios correct the neural network can more easily discern details in the photograph.

The pure average is, in a sense, just the same coefficients, but to only (about) a single digit of precision. It's simply acquiescing to the fact that you do not know the original color profile in the first place. Your goal is simply to reduce the number of channels by throwing away color information.

I wager that if we selected 10 random images, and converted them using the standard perceptual coeffs vs. simple averaging and tried a blind quiz, that we would not be very good at all telling which was which. The natural variability is too high.

Yes but if you chose 10 similar pics of similar scenes with different colors and asked people to math their black and white pictures withe the scene ti most likely matched, i think you would find a higher rate of error in the 1 or 2 digit of precision photos.

**brewbuck** · 08-24-2009

Originally Posted by abachler

That depends on what the downstream processing looks like. If you are only converting it for human consumption then you could probably get away with 2 or even 1 digit of precision (i.e. 10% 60% 30%), but if you are converting it for input into say a neural network, you will want that extra ratio preserved. Remember, when you take a small percentage from one color its going to another color, so it may over-represent that colors effect in the overall image, and ultimately you may end up processing just a dirty green image etc. If you keep the ratios correct the neural network can more easily discern details in the photograph.

If I was feeding the image into a neural network I would just hand it the RGB values directly and let the network figure out the best coefficients for itself. I would use an input layer of three units, fully linked to a hidden layer of one unit, and after that the rest of the network. An extra three links per pixel will not impact network run time significantly, and allow the network to use the color information better.

**abachler** · 08-24-2009

Originally Posted by brewbuck

If I was feeding the image into a neural network I would just hand it the RGB values directly and let the network figure out the best coefficients for itself. I would use an input layer of three units, fully linked to a hidden layer of one unit, and after that the rest of the network. An extra three links per pixel will not impact network run time significantly, and allow the network to use the color information better.

real time whole image processing through complex neural networks is not currently feasible even using GPU's

3 times as many links per pixel will triple the runtime of the network. On modern hardware the calculations aren't what take so much time, its the memory transfer requirements. Memory bandwidth is the bottleneck.

So unless you are talking very simple neural networks, fewer than about 64k connections total, which means about a 320x240 image for greyscale and 1/3 that for full color. Hence why most applications reduce the image to greyscale for the neural network.

**brewbuck** · 08-24-2009

Originally Posted by abachler

3 times as many links per pixel will triple the runtime of the network. On modern hardware the calculations aren't what take so much time, its the memory transfer requirements. Memory bandwidth is the bottleneck.

It is not three times as many links per pixel. It is an additional three links per pixel. That is only "three times" if the original number of links per pixel was 1. But that would imply that each input pixel connects to only a single unit in the hidden layer. That would be a pointless topology, since the only purpose of such a configuration would be to apply the transfer function to the pixel value -- you could achieve that by simply pre-processing the input data.

In pictures, here's what I mean:

Code:

Red
     \                     /
Green  -- CCUnit -- (Fanout)
     /                     \
Blue

This increases the number of links in the network by precisely 3*N where N is the number of pixels. If the network already had much greater than 3 links per pixel, this is negligible.

(The sweet thing about this is that each individual pixel can get its own set of coefficients. If that's not desirable, a special update rule can be used which causes all the CC links to share the same weights)

**abachler** · 08-24-2009

Originally Posted by brewbuck

It is not three times as many links per pixel. It is an additional three links per pixel.

Its 3 links per pixel per hidden layer node. Not 3 links per pixel. So the end result is 3 times as many links.

e.g. a 32x240 image

320x240 = 76800 pixels

BW = 76800 inputs x 2 hidden nodes = 153600 connections
RGB = 230400 inputs x 2 hidden nodes = 460800 connections

460800 / 153600 = 3 times as many connections

3 times as many connections means 3 times as long to compute

**brewbuck** · 08-24-2009

Originally Posted by abachler

Its 3 links per pixel per hidden layer node. Not 3 links per pixel. So the end result is 3 times as many links.

I think you're misunderstanding the diagram I drew. The R, G, B units feed into a single color-conversion unit. This unit is what produces the gray level. This unit then fans back out into the hidden layer. It is not a direct connection between R, G, B and the hidden layer. I'm inserting another (very sparsely linked) hidden layer in between.

To use your example of a 320x240 image, the input layer is 320x240x3 units. Each triple of units is connected to a hidden color conversion unit. That hidden unit then fully connects to the subsequent hidden layer. That's 320x240x3+320x240x2 = 384,000 links, not 460,800 links. And you're being unfair by assuming only two hidden units. I'm imaging thousands of hidden units. So let's take 1000 hidden units. For gray you have 320x240x1000 = 76800000 links. My network has 320x240x3+320x240x1000 = 77030400, an increase of only 0.3%.

That's really the point I'm getting at. The color conversion layer is sparsely connected.

**abachler** · 08-24-2009

Originally Posted by brewbuck

That's really the point I'm getting at. The color conversion layer is sparsely connected.

Then you are essentially not doing anything except increasing the time it takes to train the network. At this point brew I have to discontinue the discussion as it's bordering on violating a non-disclosure agreement. Hopefully someone can pick up from here.

**VirtualAce** · 08-24-2009

Most of us stopped discussing it long ago. Somehow I do not think the OP is writing this for any type of advanced applications like the ones you two are discussing. And if he is and he is coming here to find out how to do it then he probably wouldn't be on a dev team for that type of application in the first place.