Thread: Bitmaps I/O operations - bitmap data order

  1. #1
    Registered User
    Join Date
    Sep 2014
    Posts
    235

    Bitmaps I/O operations - bitmap data order

    Why do bitmaps have such a strange format? The order of the pixels is upside down: when you are going to read or write a bitmap, you must start from the end of the buffer you want to write data to or read data from. That would not be so strange, but another strange thing is that you must read from right to left instead of from left to right. This is confusing and I'd like to know why they did not use the intuitive way:

    To start from the beginning of the buffer and then go from left to right, or possibly from the end of the buffer and then go from right to left (but preferably the former). Does it have some relation to the binary format and compression?

  2. #2
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    You're talking about the BMP file format. The short answer is:

    Because Microsoft.

    The detailed reason harkens back to the very first Microsoft Windows versions, and a political push for using a right-hand coordinate system (origin at the lower left corner of the display device, instead of the top left corner). Boring stuff.

    You might wish to note, though, that if the height is negative in a BMP file header, the scan order is the normal top-to-bottom one.
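    For example, a sketch of checking that (assuming the common 54-byte BITMAPFILEHEADER + BITMAPINFOHEADER layout; the function name is mine):
    Code:
    #include <stdint.h>
    
    /* biHeight is a signed 32-bit little-endian field at byte offset 22
       of the file. Negative means a "top-down" DIB: rows are stored in
       the normal top-to-bottom scan order. */
    static int32_t bmp_height(const unsigned char *header)
    {
        uint32_t u = (uint32_t)header[22]
                   | (uint32_t)header[23] << 8
                   | (uint32_t)header[24] << 16
                   | (uint32_t)header[25] << 24;
        return (int32_t)u;  /* < 0: top-down; > 0: bottom-up */
    }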

  3. #3
    Registered User
    Join Date
    Sep 2014
    Posts
    235
    I have a question about the bytes_per_row calculation. Is the gap per row necessary only for BMP images, or also for PNG and JPEG images? If I remember correctly, when I read the manual for libpng (or maybe libjpeg), it mentioned that bytes per row is not necessarily width*count_of_channels. But it did not mention how to calculate the bytes_per_row. I now need to fix my algorithm to work with BMPs whose width is not a multiple of 4, so I would like to know if I will need to do similar things for the other libraries.

  4. #4
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    I do not recall the details wrt. libpng and jpeg-turbo, but in general you do need to mind the "gap".

    More commonly, it is called the stride from one row to the next. Some libraries do need each row of pixels to start at some aligned address, so the stride is a multiple of 2, 4, 8, 16, or even 32 bytes. It typically does not depend on the image format, but is more about how the library was implemented.

    I'd personally use
    Code:
        unsigned char *origin;
        long row_stride;
    in my image structure. The start of row y is then origin + row_stride * y.
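    As a minimal sketch (the struct and function names are mine, and the accessor assumes 24-bit R8G8B8 pixels):
    Code:
    struct image {
        unsigned char *origin;      /* first byte of row 0 */
        long           row_stride;  /* bytes from one row to the next */
        int            width, height;
    };
    
    /* Address of the first byte of pixel (x, y). */
    static unsigned char *pixel_at(const struct image *img, int x, int y)
    {
        return img->origin + img->row_stride * y + 3 * x;
    }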

  5. #5
    Officially An Architect brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,398
    Quote Originally Posted by barracuda View Post
    Why do bitmaps have such a strange format? The order of the pixels is upside down: when you are going to read or write a bitmap, you must start from the end of the buffer you want to write data to or read data from. That would not be so strange, but another strange thing is that you must read from right to left instead of from left to right. This is confusing and I'd like to know why they did not use the intuitive way:
    What is intuitive about the first byte of the image corresponding to the upper-leftmost pixel? It's nothing more than convention. As far as why Microsoft chose a convention that goes against the grain of many other image formats, who cares? It's just a coordinate system.

    Be careful about this "intuitive" stuff. The reason you think it is intuitive is because you probably come from a culture where written language begins at the upper-left, and progresses rightward and downward. Don't confuse your peculiar cultural biases with what is appropriate or intuitive.

    EDIT: As far as handling the padding between rows, that's a fundamental part of image representation within computer systems, and not optional. Leaving it out just tells me you don't know what you're doing.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  6. #6
    Registered User
    Join Date
    Sep 2014
    Posts
    235

    @Nominal animal:
    Originally, when I wanted to do bitmap operations, I wanted to work with the rows in the format rgbrgbrgb... etc., because it is simple to calculate and you need only one loop for it. Any additional calculations and incrementing could have a noticeable effect on speed when working with large images, e.g. 3000x3000 or 4096x4096.

    I have an idea (not sure if it is good): I could copy the data from the rows array (libpng, libjpeg) or from the BMP buffer to a buffer in the format rgbrgb... without a gap. I think this would be simple to calculate and operate on.

    But do you think that using a rows array (pointers to the positions in the buffer where every row begins) is faster than using such a buffer? I already wanted to create such an array but was not successful, because I couldn't figure out how to declare it correctly.

    Also, I am now having a strange problem with the BMP. I created an RGB bitmap, 16x16 with 24-bit depth; it has 824 bytes total, the header is 54 bytes, the image size is 768, and there is a 2-byte gap of zeros at the end. This gives a raster size of 770. I converted the image data with rgb2hsv and saved it to a new BMP file of exactly the same size. In a hex editor I checked the values, the header, and the gap. I found no difference in the header or the gap, and the RGB values are correct. However, I cannot open the new image in any program. This is what confuses me. I already opened such an "hsv" image at 4x4 and there was no problem. I cannot see any differences except the colors themselves.

    @Brewbuck:
    Leaving it out just tells me you don't know what you're doing.
    I didn't leave it out; I don't know where you read that. I asked whether other libraries use gaps too. Good to know that it is standard, but I need to know if the gap should be on the left or on the right. I expect it should be on the left, because the data are written to the image from right to left. Also, shouldn't it be faster to use a buffer without gaps and only one loop to process the pixels (any operations with colors)? It can be converted back at any time. Is there any performance advantage to using gaps? Or a performance loss?
    Last edited by barracuda; 02-20-2015 at 05:33 PM.

  7. #7
    Citizen of Awesometown the_jackass
    Join Date
    Oct 2014
    Location
    Awesometown
    Posts
    269
    >"but another strange thing is that you must read from right to left"
    Idk what kind of bmps you are using for testing. I read and render row data from left to right perfectly fine. Though the order of the colors in the pixel was BGR instead of RGB.

    >"I have idea (not sure if it is good?) I could copy the data from rows array (libpng,libjpeg) a from bmp buffer to the buffer in format rgbrgb... without using a gap."
    Without DWORD alignment of rows most software would either show the bitmap as blank or "stretched". It's essential to have appropriate padding for rows, whatever method of copying you use.
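    The usual rule, as a sketch (the function name is mine): each row is rounded up to a whole number of DWORDs.
    Code:
    #include <stddef.h>
    
    /* Each BMP scan line is padded up to the next multiple of 4 bytes. */
    static size_t bmp_row_size(size_t width_px, size_t bits_per_pixel)
    {
        return ((width_px * bits_per_pixel + 31) / 32) * 4;
    }
    /* e.g. 16 px at 24 bpp -> 48 bytes (no padding);
            15 px at 24 bpp -> 45 bytes -> padded to 48. */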
    "Highbrow philosophical truth: Everybody is an ape in monkeytown" --Oscar Wilde

  8. #8
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Quote Originally Posted by brewbuck View Post
    What is intuitive about the first byte of the image corresponding to the upper-leftmost pixel?
    At the time BMP format was conceived, all display hardware it could be used on, used top-to-bottom scan order.

    Quote Originally Posted by brewbuck View Post
    Be careful about this "intuitive" stuff.
    Agreed. Be suspicious of any assumptions you make.

    Quote Originally Posted by barracuda View Post
    Originally, when I wanted to do bitmap operations, I wanted to work with the rows in the format rgbrgbrgb... etc.
    It is actually not the most efficient format, because most architectures currently work best in larger units -- 32 bits (four bytes) or larger. Pure 8-bit RGBRGBRGB... has three bytes or 24 bits per pixel, and that means unaligned accesses, which even on x86 are slow.

    On current hardware, I prefer to use 32-bit pixels, with either 10 bits per component (two highest bits unused or temporary masking bits), or 8-bit ARGB format (where A is typically alpha/opacity, or unused). That way each pixel is aligned to 32 bits, and pixel access is very fast. I recommend using
    Code:
    typedef uint32_t  pixel;
    for portability.

    Edit: I wouldn't use row pointers. I'd just use the pixel type above, and a stride (>= width in pixels), so that origin + y * stride gives me the start of row y.

    If each scan line is aligned to 16 bytes (and padded to a multiple of 16 bytes), you can use C compiler built-in extensions to manipulate the pixels in sets of four -- for example, do colorspace remapping, or blend two images together at different opacities.
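    For example, here is a minimal sketch using GCC/Clang vector extensions (the type and function names are mine, and it assumes 8-bit ARGB pixels in rows padded to a multiple of four pixels); it does an equal-weight blend of two rows using a carry-free per-byte average:
    Code:
    #include <stddef.h>
    #include <stdint.h>
    
    typedef uint32_t pix4 __attribute__((vector_size(16)));  /* four 32-bit pixels */
    
    /* dst = (a + b) / 2 per 8-bit channel, four pixels per iteration.
       (x & y) + (((x ^ y) >> 1) & 0x7F7F7F7F) averages each byte without
       letting carries cross channel boundaries. */
    static void blend_rows_half(pix4 *dst, const pix4 *a, const pix4 *b, size_t quads)
    {
        const pix4 low7 = { 0x7F7F7F7FU, 0x7F7F7F7FU, 0x7F7F7F7FU, 0x7F7F7F7FU };
        for (size_t i = 0; i < quads; i++)
            dst[i] = (a[i] & b[i]) + (((a[i] ^ b[i]) >> 1) & low7);
    }
    On x86-64 this typically compiles down to plain SSE2 loads, logic operations, and stores.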

    Although this means that your library will use 4 bytes or 32 bits per pixel no matter what the input format is, it is in my opinion an acceptable tradeoff. A 4000x4000 image then takes 4000*4000*4 bytes or 64 megabytes, which is not that much. A 16384x16384 image would take a gigabyte of RAM, still pretty acceptable in my opinion.

    For truly huge images, like rendering humongous images in High Dynamic Range for something like a detailed wallpaper print, you can use memory mapping techniques to greatly exceed your actual RAM available. Here is my old example for manipulating a terabyte (1,099,511,627,776-byte) structure. In this case, you can use much more bits per pixel. Just make sure accesses are consecutive, or the kernel will have to move data from disk to RAM too often, making it too slow to be useful.

    If you do not need an alpha channel, the x2:R10:G10:B10 format is very interesting. You have two extra bits of precision per color component (10 bits instead of 8), and that helps if you do stuff like colorspace conversion back and forth -- it helps with the rounding errors that may otherwise cause "banding" in smooth color transitions. The two extra bits can be used for some other purpose. (I do not recommend using R11:G11:B10 or such; it's easiest to have all three components the same number of bits.)

    I do have some very interesting functions to do A8R8G8B8 and x2R10G10B10 color effects, like blending from one color to another, really efficiently. I'm pretty sure the same approach could be used to make the colorspace changes really fast, too.
    Last edited by Nominal Animal; 02-20-2015 at 06:20 PM.

  9. #9
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Quote Originally Posted by barracuda View Post
    Also, I am now having a strange problem with the BMP.
    Sorry, but BMP is a format I never use, nor care about at all.

    For image generation, I normally have my programs output in PPM (RGB), PGM (grayscale), or PBM (bitmap) formats, then use the NetPBM toolkit or any image manipulation program to convert to whatever format I happen to need. It turns out that this tends to yield, for example, the best 256-color PNG images (as I like pnmquant's color quantization better than e.g. Gimp's).
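    Writing these formats is also nearly trivial, which is part of the appeal. A minimal binary PPM (P6) writer, as a sketch (the function name is mine; it assumes tightly packed 8-bit RGBRGB... data, top-to-bottom):
    Code:
    #include <stdint.h>
    #include <stdio.h>
    
    static int write_ppm(const char *path, const uint8_t *rgb, int width, int height)
    {
        FILE *out = fopen(path, "wb");
        if (!out)
            return -1;
        /* P6 = binary RGB; a plain-text header, then raw pixel bytes. */
        fprintf(out, "P6\n%d %d\n255\n", width, height);
        if (fwrite(rgb, 3 * (size_t)width * (size_t)height, 1, out) != 1) {
            fclose(out);
            return -1;
        }
        return fclose(out);  /* 0 on success */
    }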

    For non-RGB formats I'd use PAM. I don't know what color models e.g. Gimp supports, or if it supports PAM at all, but for "untouched originals" it's a perfectly fine format; it allows any number of components with up to 16 bits per component. I compress the files with xz if they take too much disk space. I can always write an sRGB conversion utility (to PPM), and even include an automatic xz/gz/bzip2 decompressor, if I wanted to. (The decompression library interfaces are really easy to use.)

    These are not Linux-specific formats, and are supported in most image processing programs. As far as I know, you can find installers for the conversion utilities at the first link in this message, so you should be able to use the NetPBM formats everywhere.

  10. #10
    Registered User
    Join Date
    Sep 2014
    Posts
    235
    the_jackass: The purpose of my program is not to display the image with any other application. To work with the images, I will move them to a database which accepts images of the same dimensions and the same bit depth. I can then work with the images from the database, many files in one program run. When the images are ready, they would be converted back to the normal format. It is also possible to import them into the database in the standard order; the database header should contain the information about the format. Also, my image dimensions should be a multiple of 4, or at least a multiple of 2. This is also because of FFT requirements.

    The reason why I think I need this is that I am going to use some algorithms like FFT and DCT for image analysis, and also another kind of analysis which I call pre-read analysis, which would skip every 4th pixel while scanning.

    I can paste the images here for download if anybody wants to check them.
    The original:
    Download Auto-TS from SourceForge.net
    the converted hsv version:
    Download Auto-TS from SourceForge.net

    @Brewbuck:
    What is intuitive about the first byte of the image corresponding to the upper-leftmost pixel?
    When I am going to analyze characters in an image, I think it is more intuitive to read it non-flipped. Can you read characters when they are upside down, flipped on the x and y axes?

    @Nominal animal
    Just reading http://www.alexonlinux.com/aligned-v...-memory-access now; it looks pretty interesting. Thanks for your advice. I'll need to do some measuring later, once my program is bug-free.
    Last edited by barracuda; 02-21-2015 at 03:21 AM.

  11. #11
    Registered User
    Join Date
    Sep 2014
    Posts
    235
    @Nominal animal:
    I have no idea how I could read the unsigned char string 4 bytes at a time. I also have no idea how I would perform the HSV conversion, or how to operate on a pixel if I wanted to correct the color, etc., because I can work only with the separated components.

    Using 10 bits to preserve precision sounds like a good idea, but I have no idea how to do it. Would the complete program need to work with 10 bits per component?

    Edit: Should I also use an A channel for JPEG images? Will it consume any additional space if the images are saved as RGBA? I mean a black channel, no mask.

    Edit2:
    Someone told me I should not use floating point numbers in the conversion function but rather integers; this would also prevent the inaccuracy and banding, don't you think?
    Last edited by barracuda; 02-21-2015 at 05:03 AM.

  12. #12
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Quote Originally Posted by barracuda View Post
    I have no idea how I could read the unsigned char string 4 bytes at a time.
    You wouldn't. Instead of storing the pixel data as unsigned chars, you store them as uint32_t.

    Here is an example function that converts R, G, B as floats (0.0f to 1.0f) to a R10G10B10-format uint32_t:
    Code:
    #include <stdint.h>
    
    typedef uint32_t  color;
    
    color colorf(const float red, const float green, const float blue)
    {
        color result;
    
        if (red >= 1.0f)
            result = 1023U << 20;
        else
        if (red > 0.0f)
            result = (color)(1024.0f * red) << 20;
        else
            result = 0U;
    
        if (green >= 1.0f)
            result |= 1023U << 10;
        else
        if (green > 0.0f)
            result |= (color)(1024.0f * green) << 10;
    
        if (blue >= 1.0f)
            result |= 1023U;
        else
        if (blue > 0.0f)
            result |= (color)(1024.0f * blue);
    
        return result;
    }
    In other words, each uint32_t describes the color of the entire pixel.

    Quote Originally Posted by barracuda View Post
    I also have no idea how I would perform the HSV conversion, or how to operate on a pixel if I wanted to correct the color, etc., because I can work only with the separated components.
    Well, if c holds the packed pixel value, you separate the components first:
    Code:
    red   = (c >> 20) & 1023U;
    green = (c >> 10) & 1023U;
    blue  = c & 1023U;
    If you know that all components are within the valid range (0 to 1023, inclusive), then the inverse is just
    Code:
    c = ((uint32_t)red << 20) | ((uint32_t)green << 10) | (uint32_t)blue;
    If you use 8 bits per color, then the shifts are 16 and 8, masks 255U, and the float factor 256.0f, instead. Otherwise the R10G10B10 and R8G8B8 work the exact same way.

    Quote Originally Posted by barracuda View Post
    Should I also use an A channel for JPEG images?
    An A (alpha) channel is only useful if you composite images; i.e. have some kind of transparency, partially blend images together, and so on. If you only do image analysis, I don't think it'll help.

    If you do use uint32_t per pixel, and only 8 bits per color component, the eight highest bits are unused. Like I said, this is a bit of waste (33% more memory used compared to 24-bit color, R8G8B8), but even so, uint32_t is faster and easier to work with, and worth the memory waste.

    Quote Originally Posted by barracuda View Post
    Someone told me I should not use floating point numbers in the conversion function but rather integers; this would also prevent the inaccuracy and banding, don't you think?
    No, not really. Using integer math you avoid the cost of integer-to-float conversion, twice per color component, that's all.

    The banding is a rounding or quantization effect. It happens whenever you do colorspace conversions, and there are colors that are not exactly preserved.

    For example, if you have two RGB color tuples that happen to map to a single HSV tuple, one of the original RGB colors will be mapped to the other RGB color after the RGB-to-HSV-to-RGB conversion. If this occurs relatively often, the number of RGB tuples in a smooth color fade is reduced, and you get the banding effect.

    Having the extra two bits per component, internally, means you can use 10-bit HSV components, and you reduce the likelihood of the rounding errors (mapping two different RGB tuples to the same result after the R8G8B8-to-H10S10V10-to-R8G8B8 conversion).

    It is possible to do the RGB/HSV colorspace conversion directly, using uint32_t colors, making it very, very fast. However, I would only do that after I have the "slow" conversion -- doing it component by component, using the splitting as I showed above.

    (When I have written such optimized conversion routines, I've ALWAYS verified that both ways produce the EXACT same results, using a loop over all possible colors in the original colorspace. Even 32-bit colorspaces (say, C8M8Y8K8) take just 4 gigabytes of RAM, and a second or so to check completely.
    But if you don't have the slow and correct version written first, you cannot check the fast version at all. That's why I ALWAYS write the slow version first, even if I don't include it in the final project at all.)
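    A sketch of such a check loop, with hypothetical function pointers standing in for the slow and fast conversions (the names are placeholders, not from any library):
    Code:
    #include <stdint.h>
    #include <stdio.h>
    
    /* Compare a fast packed conversion against the slow reference over
       every 24-bit RGB value. 'slow' and 'fast' are whatever routines
       you are verifying. */
    static long verify(uint32_t (*slow)(uint32_t), uint32_t (*fast)(uint32_t))
    {
        long mismatches = 0;
        for (uint32_t c = 0; ; c++) {
            if (slow(c) != fast(c) && ++mismatches <= 10)
                fprintf(stderr, "Mismatch at 0x%06X\n", (unsigned)c);
            if (c == 0xFFFFFFU)
                break;
        }
        return mismatches;
    }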

  13. #13
    Registered User
    Join Date
    Sep 2014
    Posts
    235
    @Nominal animal:
    You wouldn't. Instead of storing the pixel data as unsigned chars, you store them as uint32_t.
    But first of all, I must read the images, and this happens with fread. I always used an unsigned char *ptr as the argument for fread, but should I pass a uint32_t *ptr into fread? I never thought that could be possible.

  14. #14
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Quote Originally Posted by barracuda View Post
    But first of all, I must read the images, and this happens with fread.
    Yes; that part does use an unsigned char buffer.

    What I would do, and did in one of the early examples wrt. jpeg-turbo, is read each scan line as an unsigned char buffer, and immediately convert it to the "standard" pixel format you use. This includes any colorspace conversion, too. You just need to be able to convert any pixel formats -- R8G8B8, bitmap, grayscale, etc. -- to your standard format, one scan line at a time.

    In other words, you do not read to the image structure, but to a temporary buffer, scan line by scan line, which you copy -- doing any conversions at the same time -- to the image structure. Writing an image is done the exact same way.
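    A sketch of the reading side (the names are mine, and it assumes a packed 24-bit R8G8B8 source; 'stride' is the file's padded row size in bytes):
    Code:
    #include <stdint.h>
    #include <stdio.h>
    
    typedef uint32_t pixel;  /* 8-bit xRGB; the highest byte is unused */
    
    /* Read one scan line into a small temporary buffer, then widen it into
       the image's standard pixel format in the same pass. */
    static int read_scanline(FILE *in, size_t width, size_t stride,
                             unsigned char *tmp, pixel *row)
    {
        size_t x;
        if (fread(tmp, stride, 1, in) != 1)
            return -1;
        for (x = 0; x < width; x++)
            row[x] = ((pixel)tmp[3*x + 0] << 16)  /* R */
                   | ((pixel)tmp[3*x + 1] << 8)   /* G */
                   |  (pixel)tmp[3*x + 2];        /* B */
        return 0;
    }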

    This approach has another very nice property. Reading and writing to storage is relatively slow. If you can do the colorspace or format conversion during reading or writing the data, it often occurs during the time you'd otherwise be waiting for the storage I/O to complete. The end result is that even with relatively slow conversion routines, this will usually take less wall clock time than first reading the entire image to memory, then using super-fast routines to convert it.

    Note: this approach works really well with ALL I/O, not just images. (Technically, this is about reducing overall running time by avoiding unnecessary latencies.)

    For example, if your program has to read some text file, and sort it alphabetically, the way to minimize the real time (wall clock) used, is to do the sorting while reading the file, instead of afterwards. One can do this easily, by using e.g. a binary tree (and balancing it if it gets too skewed), or one of the self-balancing trees, to sort the lines as they are read into memory. The difference in wall clock time needed, especially for large files, tends to surprise programmers.
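    For instance, here is a sketch of sort-while-reading with a plain, unbalanced binary search tree (a skewed input degrades it to O(n^2), so a real program would balance the tree as mentioned):
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    struct node { char *line; struct node *left, *right; };
    
    /* Insert while reading: each line goes into sort order immediately,
       so the sorting work overlaps the file I/O. */
    static void insert(struct node **root, char *line)
    {
        while (*root)
            root = (strcmp(line, (*root)->line) < 0) ? &(*root)->left
                                                     : &(*root)->right;
        *root = calloc(1, sizeof **root);
        if (*root)
            (*root)->line = line;
    }
    
    static void print_sorted(const struct node *n)
    {
        if (!n) return;
        print_sorted(n->left);
        fputs(n->line, stdout);
        print_sorted(n->right);
    }
    
    int main(void)
    {
        struct node *root = NULL;
        char buf[4096];
        while (fgets(buf, sizeof buf, stdin)) {
            char *copy = strdup(buf);  /* strdup is POSIX (and C23) */
            if (copy)
                insert(&root, copy);
        }
        print_sorted(root);
        return 0;
    }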

    In short, the idea is to have your program do as much work as possible, during the time it would otherwise spend waiting for I/O to complete.

  15. #15
    Registered User
    Join Date
    Sep 2014
    Posts
    235
    @Nominal animal
    In other words, you do not read to the image structure, but to a temporary buffer, scan line by scan line, which you copy -- doing any conversions at the same time -- to the image structure. Writing an image is done the exact same way.
    I don't understand what you mean. What image structure? Which buffer? The full image buffer, or the pointer array (the 2D rows array)?

    My current rgb2hsv conversion method (last measured) took 0.6 s for a 4096x4096 24-bit RGB image, using separated RGB components and an increment of 3. The tests were performed on an i7 with a 2.9 GHz CPU frequency. I will do some updates and post fresh measurements.

    So do you think it is better to read an image in smaller blocks (I will need to work with 256x256 images)? Or is it faster to read the complete file? I see there is a difference depending on whether we are talking about 256, 1024, or 4096 RGB.

    Did you try my output.bmp? I see it is possible to open it, but yesterday I had problems. I would be glad if anybody could check it and confirm whether the conversion to HSV was successful. I think it was, but I would be happier if somebody more experienced could check the two images. Thanks
    Last edited by barracuda; 02-21-2015 at 01:27 PM.
