Thread: How to bitset in C?

  1. #1
    Registered User
    Join Date
    May 2015
    Posts
    65

    How to bitset in C?

    To dive even deeper into C, I had an idea to test my self, which basically involve porting some code I wrote at long ago, except to do so as efficiently as possible while also making it portable.

    The problem is that I'm going to need some sort of bitset solution, like the std::bitset from the work of C++, and while I may be able to come up with a solution, it most certainly won't be either safe nor efficient, and it needs to be, as I need to mess about with quite a lot of bits, 2^32 potentially, or 512mb of data.

    So, I was wondering if anyone would be able to help.

    Also, I don't want to involve some 3rd party library or something. I'm sure there are some great ones out there, but I won't learn anything from using them.


    Best regards.

  2. #2
    Programming Wraith GReaper's Avatar
    Join Date
    Apr 2009
    Location
    Greece
    Posts
    2,738
    Sure. Say "bitPlace" is the bit you want to set, and "bits" is an array of unsigned ints, representing your enormous bitfield. Then:
    Code:
    bits[bitPlace / sizeof(unsigned)] |= 0x1 << (bitPlace % sizeof(unsigned));
    it's really easy to expand on this for other operations too.
    Devoted my life to programming...

  3. #3
    Registered User
    Join Date
    May 2015
    Posts
    65
    Quote Originally Posted by GReaper View Post
    Sure. Say "bitPlace" is the bit you want to set, and "bits" is an array of unsigned ints, representing your enormous bitfield. Then:
    Code:
    bits[bitPlace / sizeof(unsigned)] |= 0x1 << (bitPlace % sizeof(unsigned));
    it's really easy to expand on this for other operations too.
    That was about what I had in mind, just seemed too simple.

    Anyway, thanks.

  4. #4
    Registered User
    Join Date
    May 2015
    Posts
    65
    I find my self in a predicament, as I get a segmentation error trying to declare an array taking up that much memory.

    I actually had much the same problem while using std::bitset in c++, and had to allocate the memory manually.

    Is there really no other solution?

  5. #5
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Zacariaz
    I find my self in a predicament, as I get a segmentation error trying to declare an array taking up that much memory.
    You can try using malloc or declaring the array to have static storage location (by declaring it at file scope or declaring it static).

    Quote Originally Posted by Zacariaz
    I actually had much the same problem while using std::bitset in c++, and had to allocate the memory manually.
    std::bitset does not appear to have an allocator template parameter, but you could always wrap a dynamically allocated one using std::unique_ptr, or perhaps try Boost.DynamicBitset (but then the performance may be different).
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  6. #6
    Registered User
    Join Date
    May 2015
    Posts
    65
    The below might not work, but should illustrate my intentions:

    Code:
    uint_fast8_t* test = calloc( (1 << ( sizeof(void*) * 4 ) ) / sizeof( uint_fast8_t ), sizeof( uint_fast8_t ) );
    But of course it is almost always desirable to leave the memory management to the compiler / OS. For instance, if I'm not much mistaken, the above code run alone and on a 64 bit system, with no call to free, will result in one heck of a memory leak of 512mb. Obviously there are cases where you need the ability to dynamically allocate and deallocate memory, but when you basically know upfront how much you need, it does seem silly. Still, there's apparently no way around it.

  7. #7
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by laserlight
    declaring the array to have static storage location (by declaring it at file scope or declaring it static).
    I note a typo error: "static storage location" should have been "static storage duration".

    Quote Originally Posted by Zacariaz
    But of course it is almost always desirable to leave the memory management to the compiler / OS.
    C++ has an advantage due to RAII, but that is a negligible advantage here given that such a huge array will almost certainly have special care given to its lifetime.

    Quote Originally Posted by Zacariaz
    For instance, if I'm not much mistaken, the above code run alone and on a 64 bit system, with no call to free, will result in one heck of a memory leak of 512mb.
    Yes, but surely you would call free, and if you don't, on a modern OS the memory leak will not last beyond the termination of the process.

    Quote Originally Posted by Zacariaz
    when you basically know upfront how much you need, it does seem silly. Still, there's apparently no way around it.
    Then take my suggestion of declaring the array to have static storage duration as that is indeed a "way around it".
    Last edited by laserlight; 05-28-2015 at 11:05 AM.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  8. #8
    Registered User
    Join Date
    May 2015
    Posts
    65
    Quote Originally Posted by laserlight View Post
    I note a typo error: "static storage location" should have been "static storage duration".


    C++ has an advantage due to RAII, but that is a negligible advantage here given that such a huge array will almost certainly have special care given to its lifetime.


    Yes, but surely you would call free, and if you don't, on a modern OS the memory leak will not last beyond the termination of the process.


    Then take my suggestion of declaring the array to have static storage duration as that is indeed a "way around it".
    Regarding the memory leak:
    The program could crash before the call to free, or mistakes could be made, not least in larger projects.


    Are you saying that if declared static, no manual intervention is required for deallocating memory?

  9. #9
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Zacariaz
    Regarding the memory leak:
    The program could crash before the call to free
    As I noted, on a modern OS, the memory would be released anyway. Besides, such a crash would indicate that you have a serious bug that needs to be fixed, so this concern of a memory leak is secondary.

    Quote Originally Posted by Zacariaz
    or mistakes could be made, not least in larger projects.
    That is true, but if that is truly a concern to you, then you should use C++ instead of C so as to take advantage of RAII, not just in this scenario, but in more general cases that would otherwise require manual resource management.

    Quote Originally Posted by Zacariaz
    Are you saying that if declared static, no manual intervention is required for deallocating memory?
    Yes. The lifetime of an object with static storage duration is the entire execution of the program. The good part is that this means that you are guaranteed that at the termination of the program, storage that has been reserved for such an object will be released. The bad part is that the storage is reserved the whole time, which can be as bad as a memory leak for a long running process.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  10. #10
    Registered User
    Join Date
    May 2015
    Posts
    65
    That is very good to know, only I must be doing something wrong, as it's the compiler complains when static is applied:

    error: initializer element is not constant
    Last edited by Zacariaz; 05-28-2015 at 11:57 AM.

  11. #11
    Registered User
    Join Date
    May 2015
    Posts
    65
    Seems like I have figured out most of the issues. I didn't quite get the result I expected, referring to the reason I needed a bitset in the first place, but it worked just the same. Since then I've spend a little time a generic implementation. Haven't actually tested it, but it should work.

    Code:
    #include <stdlib.h>
    #include "bitset.h"
    
    
    static uint_fast8_t* bitset = NULL;
    static uint_fast32_t limit = 0;
    static uint_fast8_t typeWidth = sizeof(uint_fast8_t)*8;
    
    
    void Init(uint_fast32_t n) {
        limit = (n%typeWidth) ? n*typeWidth : (n+1)*typeWidth;
        bitset = calloc(limit,typeWidth/8);
    }
    
    
    void Destroy() {
        free(bitset);
        bitset = NULL;
        limit = 0;
    }
    
    
    int_fast8_t SetBit(uint_fast32_t n) {
        return (n < limit) ? bitset[n/typeWidth] |= 1 << n%typeWidth & 1 : -1;
    }
    
    
    int_fast8_t ResetBit(uint_fast32_t n) {
        return (n < limit) ? bitset[n/typeWidth] &= ~(1 << n%typeWidth) & 1 : -1;
    }
    
    
    int_fast8_t GetBit(uint_fast32_t n) {
        return (n < limit) ? (bitset[n/typeWidth] >> (n%typeWidth)) & 1 : -1;
    }
    Code:
    #ifndef BITSET_H
    #define BITSET_H
    #include <stdint.h>
    
    void Init(uint_fast32_t);
    void Destroy();
    void SetBit(uint_fast32_t);
    void ResetBit(uint_fast32_t);
    int_fast8_t GetBit(uint_fast32_t);
    
    #endif
    It looks right to me, except to say that set/reset bit should probably have some sort of return value.

    Anyway it's past my bedtime, so any testing will have to wait.


    Best regards
    Last edited by Zacariaz; 05-28-2015 at 05:48 PM.

  12. #12
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    If you want to go the calloc/free route, then it would be better to write a reusable library, e.g.,
    Code:
    #ifndef ZACARIAZ_BITSET_H
    #define ZACARIAZ_BITSET_H
    
    /* bitset.h */
    
    #include <stdint.h>
    
    typedef struct Bitset_ Bitset;
    
    void BitsetInit(Bitset *bitset, uint_fast32_t n);
    void BitsetDestroy(Bitset *bitset);
    int_fast8_t BitsetSetBit(Bitset *bitset, uint_fast32_t n);
    int_fast8_t BitsetResetBit(Bitset *bitset, uint_fast32_t n);
    int_fast8_t BitsetGetBit(Bitset *bitset, uint_fast32_t n);
    
    #endif
    Code:
    #include <stdlib.h>
    #include "bitset.h"
    
    
    struct Bitset_ {
        uint_fast8_t *data;
        uint_fast32_t limit;
    };
    
    static const uint_fast8_t BitsetTypeWidth = sizeof(uint_fast8_t) * CHAR_BIT;
    
    
    void BitsetInit(Bitset *bitset, uint_fast32_t n) {
        bitset->limit = ((n % BitsetTypeWidth) ? n : (n + 1)) * BitsetTypeWidth;
        bitset->data = calloc(bitset->limit, sizeof(bitset->data[0]));
    }
    
    
    void BitsetDestroy(Bitset *bitset) {
        free(bitset->data);
        bitset->data = NULL;
        bitset->limit = 0;
    }
    
    /* ... */
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  13. #13
    Registered User
    Join Date
    May 2015
    Posts
    65
    Thanks for the code, and your of course right, though not that far into my "education" if you will, and I have found it very difficult to find comprehensible tutorials and/documentation, explaining how to do such things.

    At any rate, Except for the keyword 'typedef', which for some odd reason has always confused me, though I'm uncertain how exactly to use it. You obviously have declare a type Bitset on your own, in order to be able to pass it to the various functions, but that seems rather unsafe and convoluted.

    Furthermore is was never my intention to go the free() route, but as only one bitset could exist, I thought Destroy might come in handy. Now however, looking at your code, it would seem that you can't do without free, as 'data' is not static.

    Suffice to say I have to consider the pros and cons.

    At any rate, I appreciate your efforts very much and I have indeed learned something.


    Best regards and thanks.

  14. #14
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Zacariaz
    At any rate, Except for the keyword 'typedef', which for some odd reason has always confused me, though I'm uncertain how exactly to use it.
    In this case, typedef is used for convenience: unlike C++, the tag name by itself cannot be used to refer to a struct type, i.e., by declaring struct Bitset_, we cannot just use the name Bitset_, but rather have to use struct Bitset_. This can be useful at times for the reader to be explicitly aware that a struct type is in use, but here it is just verbose, hence I made Bitset an alias for struct Bitset_, thus from the point of the typedef onwards Bitset can be used instead of struct Bitset_.

    Quote Originally Posted by Zacariaz
    You obviously have declare a type Bitset on your own, in order to be able to pass it to the various functions, but that seems rather unsafe and convoluted.
    That is not true. The Bitset type has already been declared. What the user of the library must do is to create a Bitset object in order to be able to pass a pointer to it to the various function. There is nothing unsafe or convoluted about that.

    That said, I noticed that there is a flaw in my example code: I would like the implementation of struct Bitset_ to be hidden in the source file, but if so, the user cannot declare a Bitset object, i.e., this will be invalid:
    Code:
    Bitset bitset;
    BitsetInit(&bitset, 20);
    A solution is to change BitsetInit to take a pointer to a pointer instead:
    Code:
    void BitsetInit(Bitset **bitset, uint_fast32_t n) {
        Bitset *result = malloc(sizeof(*result));
        if (result) {
            result->limit = ((n % BitsetTypeWidth) ? n : (n + 1)) * BitsetTypeWidth;
            result->data = calloc(result->limit, sizeof(result->data[0]));
        }
        *bitset = result;
    }
    Now the user can create a pointer to Bitset instead:
    Code:
    Bitset *bitset = NULL;
    BitsetInit(&bitset, 20);
    Of course, BitsetDestroy would have to be changed accordingly.

    Quote Originally Posted by Zacariaz
    Furthermore is was never my intention to go the free() route, but as only one bitset could exist, I thought Destroy might come in handy. Now however, looking at your code, it would seem that you can't do without free, as 'data' is not static.
    Even if data was static, you cannot do without free. If you want an array with static storage duration so as to do without calloc/free, then it must really be an array, not a pointer:
    Code:
    uint_fast8_t bitset[1234];
    Note that you do not have to actually use the static keyword: just defining at file scope in a source file would suffice. In the header file, you would declare it extern. If you do use the static keyword, then the array becomes local to the source file (or rather the translation unit), in which case declaring it in the header usually does not make sense.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  15. #15
    Registered User
    Join Date
    May 2015
    Posts
    65
    I will refrain from commenting on most of you post, and simple say thanks, again.

    However, on the last point, my original problem was not being able to declare and array large enough, due to (if I'm not mistaken) the stack not being large enough, thus resulting in a segmentation fault.

    In my case this can be solves with a simple command from the terminal, but that is rather platform specific, thus I ask you now.

    Are there any alternatives, where you don't need to manually deallocate memory?

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Bitset
    By Kristyy_mariee in forum C++ Programming
    Replies: 2
    Last Post: 03-03-2012, 09:00 AM
  2. Bitset help
    By Kristyy_mariee in forum C++ Programming
    Replies: 1
    Last Post: 03-02-2012, 10:33 PM
  3. Bitset questions
    By serge in forum C++ Programming
    Replies: 17
    Last Post: 05-07-2008, 02:00 PM
  4. possible to have pointers in bitset?
    By franziss in forum C++ Programming
    Replies: 23
    Last Post: 04-07-2007, 10:19 PM
  5. errors with bitset
    By Marksman in forum C++ Programming
    Replies: 11
    Last Post: 07-12-2006, 12:15 PM