Thread: Can you explain these bitwise operations?

  1. #1
    Registered User
    Join Date
    Nov 2006
    Posts
    184

    Can you explain these bitwise operations?

    I am working with a file that has a header with the size of the file encoded into 4 bytes like:

    The ID3v2 tag size is encoded with four bytes where the most significant bit (bit 7) is set to zero in every byte, making a total of 28 bits. The zeroed bits are ignored, so a 257 bytes long tag is represented as $00 00 02 01.
    I have found someone's code that assembles these bytes into the integer value, but I don't understand why they do each step. Can anyone here explain this code to me?

    (I know the code's not C++, but I'm hoping you can explain what it does so that I can convert it to C++.)

    Code:
    //Read in the bytes (why do they read char[] instead of byte[]?)
    char[] tagSize = br.ReadChars(4);    // I use this to read the bytes in from the file
     
    //Store the shifted bytes (why is it int[], not byte[]?)
    int[] bytes = new int[4];      // for bit shifting
     
    UInt64 size = 0;    // for the final number (UInt64 to match the casts below)
     
    /*
     * Why are they combining these bytes in this way if they're
     * going to again combine them below (in the line setting "size")?
     */
     
    //how do they know they only care about the rightmost bit on the 3rd byte?
    //how do they know to shift it 7 to the left?
    bytes[3] =  tagSize[3] | ((tagSize[2] & 1) << 7) ;
     
    //Why do they use 63 here (I know it's 111111)?
    //how do they know they only want the 3 rightmost of the 2nd byte?
    //And how know to shift it 6 to the left?
    bytes[2] = ((tagSize[2] >> 1) & 63) | ((tagSize[1] & 3) << 6) ;
    bytes[1] = ((tagSize[1] >> 2) & 31) | ((tagSize[0] & 7) << 5) ;
    bytes[0] = ((tagSize[0] >> 3) & 15) ;
     
    //how do they know to shift these bytes the amount that they do to the left?
    size  = ((UInt64)bytes[3] | ((UInt64)bytes[2] << 8)  | ((UInt64)bytes[1] << 16) | ((UInt64)bytes[0] << 24)) ;
    Last edited by 6tr6tr; 10-29-2008 at 12:20 PM.
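
    The size encoding quoted in the question can be checked by running it in the opposite direction. This is only a minimal C++ sketch: the function name `encode_size` is hypothetical, and it assumes the value fits in 28 bits, as the quoted description implies.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Split a value of at most 28 bits into four bytes that carry 7 bits each,
// most significant group first. Bit 7 of every output byte stays zero.
// (encode_size is an illustrative name, not part of any library.)
std::array<std::uint8_t, 4> encode_size(std::uint32_t size) {
    assert(size < (1u << 28));  // assumption: only 28 bits are available
    return {{
        static_cast<std::uint8_t>((size >> 21) & 127u),
        static_cast<std::uint8_t>((size >> 14) & 127u),
        static_cast<std::uint8_t>((size >> 7) & 127u),
        static_cast<std::uint8_t>(size & 127u),
    }};
}
```

    For 257 this yields the bytes 00 00 02 01, because 257 = 2 * 128 + 1.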

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    First of all, this is perfectly valid C++ code, I see no reason why you would want/need to change it.

    Second, the original format uses 7 bits out of each byte. To make that into a 32-bit (actually 28-bit) number, it is first converted to a set of 8-bit bytes.

    bytes[3] is the lowest byte, so it holds 7 bits from tagSize[3] and 1 bit from tagSize[2].
    bytes[2] is the second-lowest byte, so it holds the remaining 6 bits from tagSize[2] and 2 bits from tagSize[1].
    bytes[1] holds the remaining 5 bits of tagSize[1] and the low 3 bits of tagSize[0].
    bytes[0] holds the last 4 bits of tagSize[0].

    The number of bits corresponds to the mask used in the & operation: for example, 1 bit -> & 1, 2 bits -> & 3, and 6 bits -> & 63.

    Once we have the byte values, we can shuffle them all into a 32-bit integer. As each byte is 8 bits, we need to shift by 0, 8, 16 and 24 bits to form the 32-bit number.
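
    As a hedged illustration of the two steps described in this post, here is a minimal C++ sketch. The function names `decode_two_step` and `decode_direct` are made up for this example, and it assumes the four header bytes are already held as unsigned 8-bit values. It also masks every input byte with 127, which the posted code does not do for tagSize[3]. The second function shows that the same value can be obtained in one step by shifting each 7-bit group by 21, 14, 7 and 0.

```cpp
#include <cassert>
#include <cstdint>

// Two-step version, mirroring the posted code: repack the four 7-bit groups
// into four ordinary 8-bit bytes, then combine them with shifts 0/8/16/24.
// b[0] is the first (most significant) header byte.
std::uint32_t decode_two_step(const std::uint8_t b[4]) {
    std::uint32_t bytes[4];
    bytes[3] = (b[3] & 127u)       | ((b[2] & 1u) << 7);  // 7 bits + 1 bit
    bytes[2] = ((b[2] >> 1) & 63u) | ((b[1] & 3u) << 6);  // 6 bits + 2 bits
    bytes[1] = ((b[1] >> 2) & 31u) | ((b[0] & 7u) << 5);  // 5 bits + 3 bits
    bytes[0] = (b[0] >> 3) & 15u;                         // last 4 bits
    return bytes[3] | (bytes[2] << 8) | (bytes[1] << 16) | (bytes[0] << 24);
}

// One-step version: every byte contributes 7 bits, so shift by 21/14/7/0.
std::uint32_t decode_direct(const std::uint8_t b[4]) {
    return ((b[0] & 127u) << 21) | ((b[1] & 127u) << 14) |
           ((b[2] & 127u) << 7)  |  (b[3] & 127u);
}

// Example from the format description quoted in post #1: 00 00 02 01 is 257.
void check_spec_example() {
    const std::uint8_t header[4] = {0x00, 0x00, 0x02, 0x01};
    assert(decode_two_step(header) == 257u);
    assert(decode_direct(header) == 257u);
}
```

    Both functions agree on every well-formed input. The two-step form is useful mainly because it makes the intermediate 8-bit bytes visible.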

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  3. #3
    Registered User
    Join Date
    Nov 2006
    Posts
    184
    THANK YOU! That was a fantastic explanation!

    I was pretty close to figuring it out but what threw me off was the second line where the byte is "and"-ed with 3: (tagSize[1] & 3). I was reading that as:
    xxxxxxxx &
    - - - - - xxx
    But obviously 3 means: 00000011 not 00000111.

    Thanks again!
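
    The mask values discussed in this thread follow one rule: keeping the n lowest bits means AND-ing with 2^n - 1. A small C++ sketch makes the bit patterns visible. The helper names `low_bits_mask` and `to_binary8` are hypothetical and do not come from the original code.

```cpp
#include <cassert>
#include <string>

// Mask that keeps the n lowest bits. Valid for 0 <= n < 32.
unsigned low_bits_mask(unsigned n) {
    assert(n < 32);
    return (1u << n) - 1u;
}

// Render the low 8 bits of v as '0'/'1' characters, most significant first.
std::string to_binary8(unsigned v) {
    std::string s;
    for (int bit = 7; bit >= 0; --bit) {
        s += ((v >> bit) & 1u) ? '1' : '0';
    }
    return s;
}
```

    With these helpers, low_bits_mask(2) is 3, which prints as 00000011, while low_bits_mask(3) is 7, which prints as 00000111.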

  4. #4
    C++ Witch (laserlight)
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,412
    Quote Originally Posted by matsp
    First of all, this is perfectly valid C++ code, I see no reason why you would want/need to change it.
    With at least one small caveat: in C++, the brackets used to denote an array when declaring an array come after the array name.
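
    To illustrate the caveat, here is a short C++ sketch of the declaration syntax. The function names are hypothetical, and `std::array` is shown only as one possible alternative, not as what the original author would have used.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// The posted line `int[] bytes = new int[4];` is not valid C++.
// With a built-in array the brackets follow the name, and no `new` is needed.
std::size_t builtin_array_length() {
    int bytes[4] = {};  // four ints, zero-initialised
    return sizeof(bytes) / sizeof(bytes[0]);
}

// std::array is a fixed-size alternative that carries its own size.
std::size_t std_array_length() {
    std::array<int, 4> bytes{};
    return bytes.size();
}

void check_lengths() {
    assert(builtin_array_length() == 4);
    assert(std_array_length() == 4);
}
```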
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  5. #5
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by laserlight
    With at least one small caveat: in C++, the brackets used to denote an array when declaring an array come after the array name.
    Yes. I originally thought it was C code. (I didn't look very carefully, of course, as there is a "new" in there, as well as brackets in "weird" places.)

    --
    Mats

  6. #6
    Registered User
    Join Date
    Nov 2006
    Posts
    184
    One question:

    When constructing the last byte, why is it "& 15" and not "& 255"? Aren't we constructing an 8-bit byte? Or does it not matter, since everything shifted in from the left is 0 anyway? Is there any reason that "& 15" is better or quicker than "& 255"?

    Referring to:
    Code:
    bytes[0] = ((tagSize[0] >> 3) & 15) ;

  7. #7
    and the Hat of Guessing (tabstop)
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by 6tr6tr
    One question:

    when constructing the last byte, why is it "& 15" and not "& 255"? Aren't we constructing an 8-bit byte?

    Referring to:
    Code:
    bytes[0] = ((tagSize[0] >> 3) & 15) ;
    Because you just want the last four bits of tagSize[0]>>3? (In other words, if the sign bit gets set somehow, and the machine fills in with the sign bit when shifting right, you don't want all that in your number.)
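
    The sign-bit point can be sketched in C++, under the assumption that the byte is held in a `signed char` (the posted code uses a different language and type, so this is an analogy, not a translation). The function names are hypothetical. Right-shifting a negative value fills with the sign bit on mainstream compilers; this is guaranteed from C++20 and implementation-defined before that.

```cpp
#include <cassert>

// Extract bits 3..6 of a byte held in a signed char.
// The & 15 mask discards any bits filled in by the right shift.
int shift_with_nibble_mask(signed char c) {
    return (c >> 3) & 15;
}

// Same shift with a full-byte mask: filled-in bits are kept.
int shift_with_byte_mask(signed char c) {
    return (c >> 3) & 255;
}

void check_masks() {
    const signed char valid = 0x7F;    // bit 7 clear, as the format requires
    assert(shift_with_nibble_mask(valid) == 15);
    assert(shift_with_byte_mask(valid) == 15);   // no difference here

    const signed char malformed = -8;  // bit pattern 11111000, bit 7 set
    assert(shift_with_nibble_mask(malformed) == 15);
    assert(shift_with_byte_mask(malformed) != 15);  // extra high bits leak in
}
```

    For well-formed input, where bit 7 is zero, both masks give the same result, so "& 15" is not faster. It is simply the narrowest mask that matches the four bits being extracted.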
