Thread: which data type in image program?

  1. #1
    Algorithm engineer
    Join Date
    Jun 2006
    Posts
    286

    which data type in image program?

    Hi!

    I am about to start writing a simple image processing program with functions such as blurring and other image manipulations. But I am not sure whether I should use one array of unsigned ints, each holding a whole RGB value, or three arrays of signed ints, one each for R, G and B. I believe that one array for the whole RGB value will be faster, especially since with SDL I can write straight into the buffer and display it directly, but one array per channel will be much more precise. Do you think there will be visible losses in quality after a few manipulations if I choose the less precise type? Which of these methods do you think I should use? What do other raster graphics editors, like Photoshop and GIMP, use?

    I also intended to build a simple computer vision library with a few functions, maybe these are not as picky as a human.
    Last edited by TriKri; 06-13-2008 at 04:06 PM.
    Come on, you can do it! b( ~_')

  2. #2
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    It depends on what you want to do. If you're frequently dealing with just one of the three colours, then you might want separate arrays. Most of the time, though, I think you'd want to represent one pixel as one element or one structure. It's more convenient that way.

    Note that you don't have to represent the red, green and blue channels of an image in one integer if you don't want to. You could always just use a 3D array.
    Code:
    Uint32 image[width][height][3];
    > I also intended to build a simple computer vision library with a few functions, maybe these are not as picky as a human.
    Do you mean something like image recognition? It's not as easy as it sounds . . . .
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  3. #3
    Algorithm engineer
    Join Date
    Jun 2006
    Posts
    286
    Thanks for the answer. I'm not sure yet, but if I do end up using three int32 values per pixel, I will probably want them in separate arrays. I believe that could come in handy at some point.

    Quote Originally Posted by dwks View Post
    Do you mean something like image recognition? It's not as easy as it sounds . . . .
    Ok, have you had any bad experience? No, I really think image recognition would be very hard, at least I haven't got a clue about how to do it. I was thinking more along the lines of merging two images, for example to make a panoramic image from a few images. I'm also wondering whether there is any way to make a 3D model out of a bunch of images taken of the same object from different angles; that, on the other hand, might be a little bit more difficult. :P
    Last edited by TriKri; 06-13-2008 at 04:53 PM.

  4. #4
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Quote Originally Posted by TriKri View Post
    Thanks for the answer. I'm not sure yet, if I still am going to use 3 int32 per pixel I will probably want to have them in separate arrays. It could become handy sometime I believe.
    Well, maybe, but I can't really see that happening. For one thing, SDL_Surfaces have the pixel channels in one place, and I'm sure the designers of the SDL know more about the topic than either you or I.

    For another, if you ever needed to go red[x] it would be just as easy to go pixel[x][RED], wouldn't it? . . . .
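    A minimal sketch of that interleaved indexing, assuming 8-bit channels. The names `Channel`, `Pixel3` and `channel_at` are made up for illustration and are not part of SDL.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Channel indices within one interleaved pixel (hypothetical names).
enum Channel : std::size_t { RED = 0, GREEN = 1, BLUE = 2 };

// One pixel = three 8-bit samples stored next to each other.
using Pixel3 = std::array<std::uint8_t, 3>;

// Interleaved equivalent of red[x] in a planar layout:
// pick channel c of pixel x.
inline std::uint8_t channel_at(const std::vector<Pixel3>& pixels,
                               std::size_t x, Channel c) {
    assert(x < pixels.size());
    return pixels[x][c];
}
```

    With this, `channel_at(pixels, x, RED)` reads what `red[x]` would read if the channels were kept in separate arrays.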

    > Ok, have you had any bad experience? No, I really think image recognition would be very hard, at least I haven't got a clue about how to do it.
    Yes, pretty hard indeed. I didn't think that was what you were referring to.

    > I thought more in the style of merging two images, maybe one wants to make a panoramic image from a few images.
    Hmm, that might be difficult too. Images taken with a camera never match exactly, so you'd have to use some algorithms to detect similarities in the images . . . which I have no idea about whatsoever.

    > I'm also thinking if there is any way to make a 3d-model out of a bunch of images taken at the same object from different angles, that on the other hand, might be a little bit more difficult. :P
    This would be "a little bit more difficult", for sure.

    There was actually something about this online somewhere, on TED I think. I'll see if I can find it.
    dwk


  5. #5
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    The most common formats are 8/8/8 RGB organized with 3 bytes per pixel (unsigned char), and 8/8/8 RGB packed into 32-bit unsigned integers, with 8 bits of each integer being unused. Which format you choose depends on the application.

    The 33% extra memory usage of the 32-bit representation can be important in memory-intensive applications. On the other hand, it is faster to address pixels because you simply shift the index left two bits. You also have to deal with masking in order to extract the components, whereas with the 3-byte representation you just dereference a byte.
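    To make the trade-off concrete, here is a rough sketch of both layouts. The 0x00RRGGBB channel order and all function names are assumptions for illustration; a real surface may order its channels differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Packed layout: 0x00RRGGBB in one 32-bit word (top 8 bits unused).
// The channel order is an assumption, not a fixed standard.
inline std::uint32_t pack_rgb(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return (std::uint32_t{r} << 16) | (std::uint32_t{g} << 8) | std::uint32_t{b};
}

// Extracting a component from the packed word needs a shift and a mask.
inline std::uint8_t red_of(std::uint32_t p) {
    return static_cast<std::uint8_t>((p >> 16) & 0xFFu);
}
inline std::uint8_t green_of(std::uint32_t p) {
    return static_cast<std::uint8_t>((p >> 8) & 0xFFu);
}
inline std::uint8_t blue_of(std::uint32_t p) {
    return static_cast<std::uint8_t>(p & 0xFFu);
}

// Byte offset of pixel i: a shift for 4-byte pixels, a multiply for 3-byte pixels.
inline std::size_t offset_packed32(std::size_t i) { return i << 2; }
inline std::size_t offset_rgb24(std::size_t i) { return i * 3; }

// 3-byte layout (R, G, B order assumed): a component is a plain byte read.
inline std::uint8_t green_rgb24(const unsigned char* buf, std::size_t i) {
    assert(buf != nullptr);
    return buf[offset_rgb24(i) + 1];
}
```

    The memory cost follows directly from the offsets: 4 bytes per pixel instead of 3 is one extra byte for every three, hence the 33%.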

    It really depends on your application.

  6. #6
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    It could also be faster to store the values of the same pixel consecutively in memory, considering locality of reference (CPU cache).

  7. #7
    Registered User
    Join Date
    Apr 2006
    Posts
    2,149
    Why not just have a POD struct for each pixel?

    This is better than an array, because multidimensional arrays on the heap require either extra memory or fancy casting. Also, it is possible to pass the pixel with a single pointer instead of three to masks that apply to all three colors and possibly alpha.
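    A possible shape for this, as a sketch only: `RgbPixel`, `RgbImage` and `invert` are illustrative names, and 8-bit channels are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Plain-old-data pixel; the field names are illustrative.
struct RgbPixel {
    std::uint8_t r, g, b;
};

// A whole image in one contiguous heap allocation, indexed as y * width + x,
// so no array-of-pointers or casting tricks are needed.
struct RgbImage {
    std::size_t width, height;
    std::vector<RgbPixel> data;

    RgbImage(std::size_t w, std::size_t h) : width(w), height(h), data(w * h) {}

    RgbPixel* at(std::size_t x, std::size_t y) {
        assert(x < width && y < height);
        return &data[y * width + x];
    }
};

// One pointer is enough to hand all three channels to an operation.
inline void invert(RgbPixel* p) {
    assert(p != nullptr);
    p->r = static_cast<std::uint8_t>(255 - p->r);
    p->g = static_cast<std::uint8_t>(255 - p->g);
    p->b = static_cast<std::uint8_t>(255 - p->b);
}
```

    Usage would look like `invert(img.at(x, y));` for an `RgbImage img(width, height);`.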
    It is too clear and so it is hard to see.
    A dunce once searched for fire with a lighted lantern.
    Had he known what fire was,
    He could have cooked his rice much sooner.

  8. #8
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by King Mir View Post
    Why not just have a POD struct for each pixel.
    It's nice in theory, but you have no control over the compiler's padding and alignment. In most cases things work out okay, but in the general case of portable code it's a pain.

    For passing single pixels around it is often convenient to pack/unpack them into a pixel structure. But as far as holding large images in memory, most code I've written or worked with just uses arrays of unsigned char or unsigned int.
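    One way to read this, sketched with hypothetical names: the image stays a plain `unsigned char` array in R, G, B order (an assumed order), and a struct is only used for the single pixel being passed around. Because the copies go byte by byte, any padding the compiler adds to the struct never touches the buffer layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Convenience struct for passing one pixel around; never used for storage.
struct PixelRGB {
    std::uint8_t r, g, b;
};

// Unpack pixel i from a tightly packed R,G,B byte buffer.
inline PixelRGB load_pixel(const unsigned char* buf, std::size_t i) {
    assert(buf != nullptr);
    const unsigned char* p = buf + i * 3;
    return PixelRGB{p[0], p[1], p[2]};
}

// Pack a pixel back into position i of the byte buffer.
inline void store_pixel(unsigned char* buf, std::size_t i, PixelRGB px) {
    assert(buf != nullptr);
    unsigned char* p = buf + i * 3;
    p[0] = px.r;
    p[1] = px.g;
    p[2] = px.b;
}
```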

    Using masks is hard when you first start out. After getting through the first couple thousand lines of it, though, you start to internalize it and it's not so bad anymore. It's becoming less common for people to hammer their own image code, instead relying on libraries, but among those of us who do, we all pretty much speak the same language.

  9. #9
    Algorithm engineer
    Join Date
    Jun 2006
    Posts
    286
    Quote Originally Posted by brewbuck View Post
    The most common formats are 8/8/8 RGB organized with 3 bytes per pixel (unsigned char), and 8/8/8 RGB packed into 32-bit unsigned integers, with 8 bits of each integer being unused. Which format you choose depends on the application.
    So you don't think I will notice any loss in quality of the picture, after I have modified it a bit with different functions, if I choose the 8/8/8 model over the 32/32/32 model? Just a question, how much slower do you think it will be if I choose to use 3*32 bits/pixel instead of 8/8/8 and 32 bits per pixel?

  10. #10
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    http://pippin.gimp.org/image_processing/chap_dir.html
    floats for R/G/B/A, if you're going to be doing a lot of transformations.

    With 8/8/8, every intermediate step is going to get clipped at 255 (or worse, lost altogether through modulo truncation), and any fractional part will be lost as well.
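    A small sketch of both failure modes next to the float version. The helper names are invented for illustration, and the 0..255 float range is just a convenience so the numbers compare directly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Scale an 8-bit sample, clipping at 255; the fractional part is truncated.
inline std::uint8_t scale_u8(std::uint8_t v, float k) {
    assert(k >= 0.0f);
    float r = static_cast<float>(v) * k;
    return static_cast<std::uint8_t>(std::min(r, 255.0f));
}

// The same scaling without a clamp: the cast wraps modulo 256.
inline std::uint8_t scale_u8_wrapping(std::uint8_t v, unsigned k) {
    return static_cast<std::uint8_t>(v * k);
}

// Float samples keep both the fraction and values above 255 between steps.
inline float scale_f(float v, float k) { return v * k; }
```

    Halving and then doubling 101 in 8 bits gives 50 and then 100, so one level is lost. Doubling and then halving 200 gives 255 and then 127, because the intermediate 400 was clipped. Without the clamp, 200 * 2 wraps to 144. The float version returns 101 and 200 unchanged in both cases.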
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  11. #11
    Algorithm engineer
    Join Date
    Jun 2006
    Posts
    286
    I think that float sounds like a good idea, better than unsigned int too. Thanks for the link, by the way; it's a good article. I realized int32 may be a little hard to use, since whenever you multiply by another integer > 1, you can't guarantee there won't be an overflow.

    Maybe I will make a basic image template class, so that I can create more image classes later if I would ever need to have an image of another type.
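    Such a template might look roughly like this. It is only a sketch: the class name, the interleaved storage and the default of three channels are assumptions, not a settled design.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Image whose sample type T (e.g. unsigned char or float) is a template
// parameter. Samples are interleaved: all channels of a pixel are adjacent.
template <typename T>
class Image {
public:
    Image(std::size_t width, std::size_t height, std::size_t channels = 3)
        : width_(width), height_(height), channels_(channels),
          samples_(width * height * channels) {}

    // Access channel c of the pixel at (x, y).
    T& at(std::size_t x, std::size_t y, std::size_t c) {
        assert(x < width_ && y < height_ && c < channels_);
        return samples_[(y * width_ + x) * channels_ + c];
    }

    std::size_t width() const { return width_; }
    std::size_t height() const { return height_; }
    std::size_t channels() const { return channels_; }

private:
    std::size_t width_, height_, channels_;
    std::vector<T> samples_;  // zero-initialized, one allocation
};
```

    `Image<float>` and `Image<unsigned char>` then share the same indexing code, and only the sample type differs.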

    In the case of using floats, I think it would probably be better to keep the R, G and B values in separate arrays, but with the arrays next to each other in the same allocation. Or are there any advantages to having one big array where every third element is R, every third is G and every third is B?

  12. #12
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    > Or are there any advantages with having a big array, with each third R, each third G and each third B?
    If you have separate arrays, you'll need 3 cache lines because each array will be larger than your cache.
    Consecutive RGB values will always end up adjacent in the cache.

    Whether either of these is good or bad remains to be seen, but it's something to think about.

    But optimising programs for locality of reference seems to be a good idea on the whole.
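    The two layouts differ only in the index formula. A sketch with hypothetical function names, assuming three channels in a single allocation either way:

```cpp
#include <cassert>
#include <cstddef>

// Interleaved: R0 G0 B0 R1 G1 B1 ...
inline std::size_t interleaved_index(std::size_t pixel, std::size_t channel) {
    assert(channel < 3);
    return pixel * 3 + channel;
}

// Planar in one allocation: all R samples, then all G, then all B.
inline std::size_t planar_index(std::size_t pixel, std::size_t channel,
                                std::size_t pixel_count) {
    assert(channel < 3 && pixel < pixel_count);
    return channel * pixel_count + pixel;
}
```

    In the interleaved layout the R and G samples of one pixel are one element apart. In the planar layout they are `pixel_count` elements apart, so for a large image the three samples of a pixel sit in three different cache lines.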

  13. #13
    Algorithm engineer
    Join Date
    Jun 2006
    Posts
    286
    Quote Originally Posted by Salem View Post
    If you have separate arrays, you'll need 3 cache lines because each array will be larger than your cache.
    Consecutive RGB values will always end up adjacent in the cache.
    Why do I need 3 cache lines? Can you please explain? I don't know much about CPU caches...

    You're probably right about the locality of reference. Thanks, I shall keep that in mind.

  14. #14
    Algorithm engineer
    Join Date
    Jun 2006
    Posts
    286
    By the way, since I'm using floats, is there any library that can read a JPEG image directly into floats, skipping the intermediate conversion of the values to 8/8/8-bit format? That step would just be unnecessary in my case.
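    If the decoder only hands back 8-bit samples, the conversion itself is short. A sketch, assuming the samples arrive as a flat byte array and that floats are normalized to [0, 1]; `to_float` and `to_byte` are hypothetical helpers, not part of any JPEG library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert decoded 8-bit samples to floats in [0, 1].
inline std::vector<float> to_float(const std::vector<std::uint8_t>& bytes) {
    std::vector<float> out(bytes.size());
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        out[i] = bytes[i] / 255.0f;
    }
    return out;
}

// Convert back for display: clamp to [0, 1], then round to nearest.
inline std::uint8_t to_byte(float v) {
    assert(!std::isnan(v));
    float c = v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
    return static_cast<std::uint8_t>(std::lround(c * 255.0f));
}
```

    Clamping happens only once, at the final conversion back to bytes, so intermediate results are free to leave the [0, 1] range.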

    Last Post: 12-18-2002, 02:05 PM