Thread: Bitmap implementation in memory

  1. #1
    TEIAM - problem solved
    Join Date
    Apr 2012
    Location
    Melbourne Australia
    Posts
    1,907

    Bitmap implementation in memory

    Hey all

    I want to make a very large bitmap as a true/false array.

    How would the following be implemented in memory? From what I am seeing, it appears to have spread the bits across several bytes. Maybe forcing it to be a union with a char array with n/8+1 bytes might help

    I'm using code:blocks v10.05

    The code that I'm using to declare this is below - Please note that I am using 8 as a test number - It will change to 1000000.

    Code:
        struct
        {
            int val : 1;
        } sievebound[8];
    Any insight would be greatly appreciated.

  2. #2
    - - - - - - - - oogabooga's Avatar
    Join Date
    Jan 2008
    Posts
    2,808
    Structures are padded so they can be used in arrays. Your struct is probably padded to 4 or 8 bytes, so your 8-element array is probably between 32 and 64 bytes in size.

    It's best to use an array of unsigned ints and write some bit-fiddling helper routines to set, reset, flip and test any particular bit.
    Code:
    #define WORD_SIZE (CHAR_BIT * sizeof(int))
    #define TOTAL_BITS 1000000
    #define SETBIT(b,n) ((b)[(n)/WORD_SIZE] |= (1 << ((n) % WORD_SIZE)))
    
    // in main
        unsigned bits[TOTAL_BITS / WORD_SIZE + 1] = { 0 };
        SETBIT(bits, 125000);
    Last edited by oogabooga; 08-17-2012 at 09:56 PM. Reason: added WORD_SIZE
    The cost of software maintenance increases with the square of the programmer's creativity. - Robert D. Bliss

  3. #3
    TEIAM - problem solved
    Join Date
    Apr 2012
    Location
    Melbourne Australia
    Posts
    1,907
    The bits are 4 bytes apart -> when I tried to put in 1000000, I ended up with a "computer says no" situation.

  4. #4
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    You cant make an array of anything smaller than a byte. It's just not possible in C.
    If you want to access the individual bits, write a function that gets the bit at position x, y for you, and another that sets it.

    Of course if you want it to be anything but dead-slow, then you need to start doing things in terms of groups of pixels, not a-pixel-at-a-time. In other words, what you currently want to do is to do it the wrong way.
    Last edited by iMalc; 08-17-2012 at 10:06 PM.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  5. #5
    - - - - - - - - oogabooga's Avatar
    Join Date
    Jan 2008
    Posts
    2,808
    Quote Originally Posted by iMalc View Post
    Of course if you want it to be anything but dead-slow, then you need to start doing things in terms of groups of pixels, not a-pixel-at-a-time. In other words, what you currently want to do is to do it the wrong way.
    I don't think he wants the "bitmap" for pixels but for a bitmapped data structure to use in the Sieve of Eratosthenes alg. (I infer that from the name "sievebound".) Some clarification is in order.
    The cost of software maintenance increases with the square of the programmer's creativity. - Robert D. Bliss

  6. #6
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Oh, yes you're probably right.

    In that case, it's possibly not worth it, at least in terms of speed. I've been there and profiled such a sieve that uses one bit per flag vs one byte per flag, and the byte version wins by a fair margin.
    Much of what I said earlier still applies though, if you mask out 4, 8, or 16 multiples of 2 at once, then you may get back some of that speed loss. You also need to do it by writing a function (or macro) to set a bit. oogabooga has already given you what you need for that.

    Any further questions?
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  7. #7
    TEIAM - problem solved
    Join Date
    Apr 2012
    Location
    Melbourne Australia
    Posts
    1,907
    My goal was to reduce the amount of memory required - I knew that there might be a problem as soon as I looked at what was happening with the memory dump

    Basically (for clarification), yes it was a "Sieve of Eratosthenes alg" that I was trying to implement. It was suggested as an alternative to add up all the prime numbers from 0 to 2,000,000. I have a fairly fast function to calculate if a number is a prime, so my original approach used a for() loop to and that function to get the sum - I'm just curious of the other approach... Send me a message if you want my "is_prime()" function... My fastest time to add up all the primes from 0 to 2,000,000 was 1.57sec -> See if you can beat it :P

    I played around with this code a bit...
    Code:
    int main(void)
    {
        union
        {
            struct
            {
                int val1 : 1;
            } sievebound_bit[8];
    
    
            char sievebound_byte;
    
    
        } data;
    
    
        data.sievebound_byte = 0;
    
    
        printf("%x ",data.sievebound_byte);
    
    
        data.sievebound_bit[0].val1 = 1;
        printf("%x ",data.sievebound_byte);
    
    
        data.sievebound_bit[1].val1 = 1;
        printf("%x ",data.sievebound_byte);
    
    
        data.sievebound_bit[0].val1 = 0;
        printf("%x ",data.sievebound_byte);
    
    
        data.sievebound_bit[3].val1 = 1;
        printf("%x ",data.sievebound_byte);
    
    
        return EXIT_SUCCESS;
    
    
    }
    It clearly didn't work for anything else than [0] - I think that the only think appropriate for this code is to dig a large hole, put it in there, and fill the hole as fast a possible... Actually, I think that the delete key will do the job - and that way I don't have to put my computer in a hole in the ground...

    Bottom line is that this approach does not work

  8. #8
    Registered User
    Join Date
    Jan 2009
    Posts
    1,485
    Why don't you create a regular array of unsigned char or even unsigned int? You can get the index in the array and the bit at that index for any given bit by:

    Code:
        size_t index  = nbit / (sizeof(bitmap_data_t) * 8);
        size_t bitpos = nbit % (sizeof(bitmap_data_t) * 8);
    Where bitmap_data_t is the data type you have picked for your array, uint32_t or other.

  9. #9
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Yes, to be perfectly clear, that approach cannot work. There is no way to force C to do any kind of native bit-array. Something like what oogabooga or Subsonics wrote is the way to do that.

    Mine calculated that sum (142913828922) in 28 milliseconds in a release build.
    Note that you need to use a 64-bit data type to hold the result, otherwise your program suffers from integer overflow and the result wont be correct.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  10. #10
    TEIAM - problem solved
    Join Date
    Apr 2012
    Location
    Melbourne Australia
    Posts
    1,907
    iMalc "Mine calculated that sum (142913828922) in 28 milliseconds in a release build"
    How did you measure the time of your program on a released code? Did you build the time functions in your code?

  11. #11
    Registered User
    Join Date
    May 2012
    Location
    Arizona, USA
    Posts
    948
    Quote Originally Posted by Click_here View Post
    How did you measure the time of your program on a released code? Did you build the time functions in your code?
    Stopwatch and fast reflexes.

    Unix systems have a "time" command that reports how much user time, system time, and real time a program takes to execute. He probably used that or something like it.

    Edit: I shouldn't say what someone else "probably" uses. But that's what I would use.

  12. #12
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    I ran the function inside my own unit test framework, which uses QueryPerformanceCounter and QueryPerformanceFrequency (For Windows obviously).
    I assumed you used the same technique I was using, just as you assumed others would use what you did.

    My guess would be that in yours, you are calling sqrt inside the loop or something like that.

    Do you get the same result?
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  13. #13
    - - - - - - - - oogabooga's Avatar
    Join Date
    Jan 2008
    Posts
    2,808
    Quote Originally Posted by iMalc View Post
    I ran the function inside my own unit test framework, which uses QueryPerformanceCounter and QueryPerformanceFrequency (For Windows obviously).
    I assumed you used the same technique I was using, just as you assumed others would use what you did.

    My guess would be that in yours, you are calling sqrt inside the loop or something like that.

    Do you get the same result?
    If I remember correctly, Code::Blocks displays the runtime, which is handy.

    As for the result, I got 1,179,908,154 ! I put in a test to see if the 32-bit int I used for the sum ever goes negative and it never does so .... (I also tried a long long and got the same answser.)

    Anyway, about using a char per bit instead of "bitmapping" (and bit-fiddling) being faster, that seems reasonable since it's not a cache-friendly algorithm. The only way the bit-fiddling could compete is if it allowed more data to be kept in the cache, but this algorithm constantly scans over and over again from the beginning of the array, totally thrashing the cache.

    However, if a person were more interested in range than speed a bit-mapped version would be good, giving 8 times the range. That can be important since you definitely don't want the algorithm to start thrashing the disk.
    The cost of software maintenance increases with the square of the programmer's creativity. - Robert D. Bliss

  14. #14
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Nope, I have three very different algorithms that calculate the list of primes up to N, and they all give the same result.
    1179908154 just happens to be the value that you get if you take the lowest 32-bits of 142913828922. So one way or another you had a bug.
    Code:
    142913828922 = 0x214653F83A
      1179908154 =   0x4653F83A
    True, you can get to a higher number if you use one bit per flag. When going that far, it's also worth only using half as many flags for just the odd numbers, for double the savings again.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  15. #15
    - - - - - - - - oogabooga's Avatar
    Join Date
    Jan 2008
    Posts
    2,808
    Yep, you're right. I did the "does it go negative" test again and it goes negative tens of thousands of times. I don't know what was wrong with the first time I tested that. And using long long (and turning up the warnings!) I get "unknown conversion type character 'L' in format" so I guess it was just printing half the number.

    As for only representing the odd numbers, I always do that. If you need to save even more space it's not that hard to leave out all numbers divisible by 3 (and even 5 and 7 if you're really trying to squeeze space).
    The cost of software maintenance increases with the square of the programmer's creativity. - Robert D. Bliss

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Replies: 0
    Last Post: 03-16-2012, 09:59 AM
  2. Replies: 7
    Last Post: 10-01-2008, 07:45 PM
  3. Shared memory implementation using thread
    By kumars in forum C Programming
    Replies: 5
    Last Post: 06-18-2008, 04:24 AM
  4. Create 24 bits bitmap in memory
    By pronecracker in forum Windows Programming
    Replies: 5
    Last Post: 05-05-2007, 10:51 AM
  5. implementation?
    By calQlaterb in forum C++ Programming
    Replies: 1
    Last Post: 12-11-2001, 10:25 AM