Bitmap implementation in memory

**Click_here** · 08-17-2012

Hey all

I want to make a very large bitmap as a true/false array.

How would the following be implemented in memory? From what I am seeing, it appears to have spread the bits across several bytes. Maybe forcing it to be a union with a char array with n/8+1 bytes might help

I'm using code:blocks v10.05

The code that I'm using to declare this is below - Please note that I am using 8 as a test number - It will change to 1000000.

Code:

    struct
    {
        int val : 1;
    } sievebound[8];

Any insight would be greatly appreciated.

**oogabooga** · 08-17-2012

Structures are padded so they can be used in arrays. Your struct is probably padded to 4 or 8 bytes, so your 8-element array is probably between 32 and 64 bytes in size.

It's best to use an array of unsigned ints and write some bit-fiddling helper routines to set, reset, flip and test any particular bit.

Code:

#define WORD_SIZE (CHAR_BIT * sizeof(int))
#define TOTAL_BITS 1000000
#define SETBIT(b,n) ((b)[(n)/WORD_SIZE] |= (1 << ((n) % WORD_SIZE)))

// in main
    unsigned bits[TOTAL_BITS / WORD_SIZE + 1] = { 0 };
    SETBIT(bits, 125000);

**Click_here** · 08-17-2012

The bits are 4 bytes apart -> when I tried to put in 1000000, I ended up with a "computer says no" situation.

**iMalc** · 08-17-2012

You cant make an array of anything smaller than a byte. It's just not possible in C.
If you want to access the individual bits, write a function that gets the bit at position x, y for you, and another that sets it.

Of course if you want it to be anything but dead-slow, then you need to start doing things in terms of groups of pixels, not a-pixel-at-a-time. In other words, what you currently want to do is to do it the wrong way.

**oogabooga** · 08-17-2012

Originally Posted by iMalc

Of course if you want it to be anything but dead-slow, then you need to start doing things in terms of groups of pixels, not a-pixel-at-a-time. In other words, what you currently want to do is to do it the wrong way.

I don't think he wants the "bitmap" for pixels but for a bitmapped data structure to use in the Sieve of Eratosthenes alg. (I infer that from the name "sievebound".) Some clarification is in order.

**iMalc** · 08-17-2012

Oh, yes you're probably right.

In that case, it's possibly not worth it, at least in terms of speed. I've been there and profiled such a sieve that uses one bit per flag vs one byte per flag, and the byte version wins by a fair margin.
Much of what I said earlier still applies though, if you mask out 4, 8, or 16 multiples of 2 at once, then you may get back some of that speed loss. You also need to do it by writing a function (or macro) to set a bit. oogabooga has already given you what you need for that.

Any further questions?

**Click_here** · 08-18-2012

My goal was to reduce the amount of memory required - I knew that there might be a problem as soon as I looked at what was happening with the memory dump

Basically (for clarification), yes it was a "Sieve of Eratosthenes alg" that I was trying to implement. It was suggested as an alternative to add up all the prime numbers from 0 to 2,000,000. I have a fairly fast function to calculate if a number is a prime, so my original approach used a for() loop to and that function to get the sum - I'm just curious of the other approach... Send me a message if you want my "is_prime()" function... My fastest time to add up all the primes from 0 to 2,000,000 was 1.57sec -> See if you can beat it :P

I played around with this code a bit...

Code:

int main(void)
{
    union
    {
        struct
        {
            int val1 : 1;
        } sievebound_bit[8];


        char sievebound_byte;


    } data;


    data.sievebound_byte = 0;


    printf("%x ",data.sievebound_byte);


    data.sievebound_bit[0].val1 = 1;
    printf("%x ",data.sievebound_byte);


    data.sievebound_bit[1].val1 = 1;
    printf("%x ",data.sievebound_byte);


    data.sievebound_bit[0].val1 = 0;
    printf("%x ",data.sievebound_byte);


    data.sievebound_bit[3].val1 = 1;
    printf("%x ",data.sievebound_byte);


    return EXIT_SUCCESS;


}

It clearly didn't work for anything else than [0] - I think that the only think appropriate for this code is to dig a large hole, put it in there, and fill the hole as fast a possible... Actually, I think that the delete key will do the job - and that way I don't have to put my computer in a hole in the ground...

Bottom line is that this approach does not work

**Subsonics** · 08-18-2012

Why don't you create a regular array of unsigned char or even unsigned int? You can get the index in the array and the bit at that index for any given bit by:

Code:

    size_t index  = nbit / (sizeof(bitmap_data_t) * 8);
    size_t bitpos = nbit % (sizeof(bitmap_data_t) * 8);

Where bitmap_data_t is the data type you have picked for your array, uint32_t or other.

**iMalc** · 08-18-2012

Yes, to be perfectly clear, that approach cannot work. There is no way to force C to do any kind of native bit-array. Something like what oogabooga or Subsonics wrote is the way to do that.

Mine calculated that sum (142913828922) in 28 milliseconds in a release build.
Note that you need to use a 64-bit data type to hold the result, otherwise your program suffers from integer overflow and the result wont be correct.

**Click_here** · 08-18-2012

iMalc "Mine calculated that sum (142913828922) in 28 milliseconds in a release build"

How did you measure the time of your program on a released code? Did you build the time functions in your code?

**christop** · 08-18-2012

Originally Posted by Click_here

How did you measure the time of your program on a released code? Did you build the time functions in your code?

Stopwatch and fast reflexes.

Unix systems have a "time" command that reports how much user time, system time, and real time a program takes to execute. He probably used that or something like it.

Edit: I shouldn't say what someone else "probably" uses. But that's what I would use.

**iMalc** · 08-18-2012

I ran the function inside my own unit test framework, which uses QueryPerformanceCounter and QueryPerformanceFrequency (For Windows obviously).
I assumed you used the same technique I was using, just as you assumed others would use what you did.

My guess would be that in yours, you are calling sqrt inside the loop or something like that.

Do you get the same result?

**oogabooga** · 08-18-2012

Originally Posted by iMalc

I ran the function inside my own unit test framework, which uses QueryPerformanceCounter and QueryPerformanceFrequency (For Windows obviously).
I assumed you used the same technique I was using, just as you assumed others would use what you did.

My guess would be that in yours, you are calling sqrt inside the loop or something like that.

Do you get the same result?

If I remember correctly, Code::Blocks displays the runtime, which is handy.

As for the result, I got 1,179,908,154 ! I put in a test to see if the 32-bit int I used for the sum ever goes negative and it never does so .... (I also tried a long long and got the same answser.)

Anyway, about using a char per bit instead of "bitmapping" (and bit-fiddling) being faster, that seems reasonable since it's not a cache-friendly algorithm. The only way the bit-fiddling could compete is if it allowed more data to be kept in the cache, but this algorithm constantly scans over and over again from the beginning of the array, totally thrashing the cache.

However, if a person were more interested in range than speed a bit-mapped version would be good, giving 8 times the range. That can be important since you definitely don't want the algorithm to start thrashing the disk.

**iMalc** · 08-19-2012

Nope, I have three very different algorithms that calculate the list of primes up to N, and they all give the same result.
1179908154 just happens to be the value that you get if you take the lowest 32-bits of 142913828922. So one way or another you had a bug.

Code:

142913828922 = 0x214653F83A
  1179908154 =   0x4653F83A

True, you can get to a higher number if you use one bit per flag. When going that far, it's also worth only using half as many flags for just the odd numbers, for double the savings again.

**oogabooga** · 08-19-2012

Yep, you're right.

I did the "does it go negative" test again and it goes negative tens of thousands of times. I don't know what was wrong with the first time I tested that. And using long long (and turning up the warnings!) I get "unknown conversion type character 'L' in format" so I guess it was just printing half the number.

As for only representing the odd numbers, I always do that. If you need to save even more space it's not that hard to leave out all numbers divisible by 3 (and even 5 and 7 if you're really trying to squeeze space).

Thread: Bitmap implementation in memory

Thread Tools

Search Thread

Display

Bitmap implementation in memory

Similar Threads

how to program median-filter processing on 24 bitmap instead of 8 bitmap?

implementation our own malloc implementation with out using predefined malloc routine

Shared memory implementation using thread

Create 24 bits bitmap in memory

implementation?