Thread: Structure Padding, pragma pack...

  1. #1
    P.Phant
    Guest

    Structure Padding, pragma pack...

    Hi,
    This is a question from a beginner.
    I've run into some problems with structure padding while reading data from mapped files and UDP packets through pointers to structures. For example:
    Code:
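    typedef struct{
    	char title[4];
    	unsigned short port;       /* 2 bytes; with 4-byte alignment, 2 padding bytes follow */
    	unsigned long null;
    	unsigned long desc_length;
    }BNDR_HEADER;
    
    typedef union{
    	BNDR_HEADER hdr;           /* named member: a bare "BNDR_HEADER;" is not standard C */
    	char msg[MAX_UDP_PACKET];  /* MAX_UDP_PACKET is defined elsewhere */
    }BNDR, *PBNDR;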
    The BNDR_HEADER structure was padded after the port member, so that null starts at offset 8 instead of 6 (4-byte alignment). As a result, when I read the data through a PBNDR variable I didn't get what I expected in the null and desc_length members.
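    A minimal sketch of how the hole can be made visible with offsetof. The names bndr_header_natural and padding_after_port are made up for illustration, and fixed-width types stand in for unsigned short/unsigned long (an assumption, so that the sizes do not depend on the platform):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same member order as BNDR_HEADER, but with fixed-width types
   (assumption: 2-byte port, 4-byte null and desc_length). */
struct bndr_header_natural {
    char     title[4];
    uint16_t port;
    uint32_t null_field;
    uint32_t desc_length;
};

/* The first member of a struct always starts at offset 0. */
static_assert(offsetof(struct bndr_header_natural, title) == 0,
              "first member must sit at offset 0");

/* Number of padding bytes the compiler inserted between port and null_field. */
size_t padding_after_port(void)
{
    size_t end_of_port = offsetof(struct bndr_header_natural, port)
                       + sizeof(uint16_t);
    return offsetof(struct bndr_header_natural, null_field) - end_of_port;
}
```

    On a typical target where a 4-byte integer is 4-byte aligned, padding_after_port() should report 2, and sizeof the struct should be 16 rather than the 14 bytes present in the packet.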

    I'm compiling with gcc (MinGW, to be precise), and #pragma pack(2) fixes this. My question is: why shouldn't I do it? Is it a bad idea? If someone could briefly explain the reason behind structure padding... I see only one alternative: working with explicit offsets (#defined) and casts, depending on the data I want to read. After using this method in another part of my code, I came to the conclusion that it wasn't very readable, so if you have a better suggestion, I'd be glad to hear it.
    Thanks.
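    For reference, a hedged sketch of the pragma approach mentioned above, using the push/pop form so the packing only applies to one declaration. The struct name bndr_header_packed and the fixed-width types are assumptions for illustration. This only addresses layout: members such as null_field end up misaligned, so pointers to them may not be safe to dereference on CPUs that require aligned accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 2)   /* cap member alignment at 2 bytes from here on */
struct bndr_header_packed {
    char     title[4];
    uint16_t port;
    uint32_t null_field;   /* now at offset 6, which is not 4-byte aligned */
    uint32_t desc_length;
};
#pragma pack(pop)       /* restore the previous packing for later declarations */

/* Fail the build if the packing did not take effect. */
static_assert(sizeof(struct bndr_header_packed) == 14,
              "packed header should match the 14-byte wire layout");
static_assert(offsetof(struct bndr_header_packed, null_field) == 6,
              "null_field should directly follow port");
```

    The compile-time checks are there so that a compiler which ignores the pragma is caught at build time instead of silently misreading packets.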

  2. #2
    Registered /usr
    Join Date
    Aug 2001
    Location
    Newport, South Wales, UK
    Posts
    1,273
    AFAIK, packing and structure alignment exist for performance reasons.

    By default, most compilers for 32-bit Intel targets align each structure member on a boundary that matches its size (up to 8 bytes with some compilers) so that (I believe) the CPU can fetch it in a single aligned memory access, which is the fastest case.

    You can align structures however you want, but a packed layout creates more work: a member that straddles a boundary may cost extra memory accesses on x86, and on CPUs that do not allow misaligned accesses the compiler has to emit extra instructions to assemble the value piece by piece.

    I'm sure my specific reasoning is bogus, anyone care to enlighten?
    Originally posted by P.Phant
    I see only one alternative to this: it is to work with explicit offsets (#defined) and casts depending on the data I want to read.
    Explicit offsets are a no-no, unless you're working with the BIOS or hardware, and if you want to do that, you'd probably find straight assembly language more direct.
    Last edited by SMurf; 06-03-2003 at 05:46 PM.

  3. #3
    P_Phant
    Guest
    Thanks.

    As for explicit offsets: I mean them as a way to describe the format of the data in a file.

    For example, a (hypothetical) file with a header:
    <type of the file> 4 bytes
    <version> 4 bytes
    <Language ID> 2 bytes
    <offset to name section> 4 bytes
    <offset to description section> 4 bytes

    then an array of entry structures:
    each entry:
    <offset to its name relative to the name section> 4 bytes
    <offset to its description in the description section> 4 bytes

    then the name section and description section.


    For the moment I see only two "ways" of parsing the file, supposing it is mapped in memory:
    Code:
    typedef struct{
    	char type[4];
    	char version[4];
    	unsigned short lang_id;
    	unsigned long name_sec;
    	unsigned long desc_sec;
    }*PHEADER;
    
    typedef struct{
    	unsigned long name;
    	unsigned long desc;
    }*PENTRY;

    or:
    Code:
    #define HEADER_LENGTH 18
    #define HE_TYPE 0
    #define HE_VERSION 4
    #define HE_LANGID 8
    #define HE_NAMESEC 10
    #define HE_DESCSEC 14
    
    #define ENTRY_LENGTH 8
    #define EN_NAME 0
    #define EN_DESC 4
    and the code gets rather unpleasant, with things like:
    Code:
    *((unsigned long*)(pTemp + EN_DESC ))...
    
    pTemp += ENTRY_LENGTH;
    But without the pragma, because of the 2-byte language ID member in this case, it is the only way out I see... This all assumes I cannot modify the file format.
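    One way to keep the offsets but hide the casts is to decode each field byte by byte into an ordinary struct. The sketch below assumes the hypothetical 18-byte header described above and little-endian fields; read_le16, read_le32, struct file_header and parse_header are illustrative names, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HEADER_LENGTH 18
#define HE_TYPE       0
#define HE_VERSION    4
#define HE_LANGID     8
#define HE_NAMESEC    10
#define HE_DESCSEC    14

/* Assemble little-endian values from individual bytes: no alignment
   requirement on p, and the result is the same on any host byte order. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* In-memory copy of the header; the compiler may pad it freely because
   it is never overlaid on the raw file bytes. */
struct file_header {
    char     type[4];
    char     version[4];
    uint16_t lang_id;
    uint32_t name_sec;
    uint32_t desc_sec;
};

/* Returns 0 on success, -1 if the buffer is too short for a header. */
int parse_header(const unsigned char *buf, size_t len, struct file_header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < HEADER_LENGTH)
        return -1;
    memcpy(out->type,    buf + HE_TYPE,    4);
    memcpy(out->version, buf + HE_VERSION, 4);
    out->lang_id  = read_le16(buf + HE_LANGID);
    out->name_sec = read_le32(buf + HE_NAMESEC);
    out->desc_sec = read_le32(buf + HE_DESCSEC);
    return 0;
}
```

    Only the 18 header bytes are copied, so the rest of the mapped file is still read in place.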

  4. #4
    PPhant
    Guest
    Thanks a lot. I'm afraid I still have one or two questions:
    > You mention #pragma pack(), and you also mention gcc as being your compiler. As far as I know, gcc uses __attribute__((packed)) to indicate packing.
    Yes, I just read that #pragma pack isn't supported on all versions of gcc, whereas __attribute__((packed)) is. Anyway, considering what you and SMurf told me, I think I'll avoid both, since what I am doing is rather standard, and I'll try to keep away from bad programming practice for now.

    I was aware of the little-endian/big-endian issue, but I'm not sure whether it concerns me. The data is little-endian. For some reason I am not sure of, the Windows world seems to be little-endian, correct? What about Mac OS X (I doubt it, but you never know) or GNU/Linux: will I find diversity depending on the machine there? If you have a link that draws a quick picture of the current situation, it would be much appreciated. The only things I can find when searching for big-endian are general explanations, descriptions of the situation in 1994, and the infamous big-endian UK internet addresses.
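    Byte order is a property of the CPU rather than of the operating system (x86 is little-endian, PowerPC Macs are big-endian), so one option is to check it at run time. A small sketch follows; host_is_little_endian, swap32 and le32_to_host are hypothetical helper names.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

static_assert(CHAR_BIT == 8, "the byte swap below assumes 8-bit bytes");

/* Returns 1 if the least significant byte of an integer is stored first. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* look at the byte at the lowest address */
    return first == 1;
}

/* Reverses the order of the four bytes in v. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24)
         | ((v & 0x0000FF00u) << 8)
         | ((v & 0x00FF0000u) >> 8)
         | ((v & 0xFF000000u) >> 24);
}

/* Converts a 32-bit value that was copied verbatim from little-endian
   data into the host's native representation. */
uint32_t le32_to_host(uint32_t raw)
{
    return host_is_little_endian() ? raw : swap32(raw);
}
```

    On a little-endian machine le32_to_host returns its argument unchanged, so the check only costs something on big-endian hosts.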

    > *((unsigned long*)(pTemp + EN_DESC ))
    > This might not actually work. You may generate an alignment exception. Some machines require some data types to be stored on specifically aligned addresses (which is why your structure has holes in it in the first place).
    I am not sure I follow you there: this would be a way to access a file mapped in memory, as raw data. So how could it have been aligned? By mapped files I mean using CreateFileMapping/MapViewOfFile on Windows, and mmap under UNIX.
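    To make the alignment point concrete: the start of a mapping is page-aligned, but a field at offset 10 inside it is not on a 4-byte boundary, so casting that address to unsigned long* produces a misaligned pointer. A sketch of reading such a field with memcpy instead (is_aligned and load_u32 are illustrative names):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if address p is a multiple of align. */
int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p % align) == 0;
}

/* Reads 4 bytes at base + offset without requiring any alignment.
   The result is in host byte order; no endian conversion is done here. */
uint32_t load_u32(const unsigned char *base, size_t offset)
{
    uint32_t value;
    assert(base != NULL);
    memcpy(&value, base + offset, sizeof value);
    return value;
}
```

    Compilers usually turn a fixed-size memcpy like this into a plain load where the CPU allows it, so this should not amount to copying the file a second time.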
    I realise I will have a big problem with endianness if I access the mapped file that way and perform arithmetic operations directly on it...
    What I would like to avoid is copying the file in memory a second time; it seems to me that would defeat the purpose of mapping it in the first place.
    The kind of operation I do is merging two files into a third one, for example. This involves arithmetic operations (such as adding up the number of entries in the two source files). And yes, I do put the result directly into a mapped file for the destination (hum) via pointers. This was maybe flawed from the beginning, but since I am doing this to learn, and it is just a hobby for me, at least for now, I'd like to get it as right as can be... The files are not enormous, but not small either, and speed is an issue, because the manipulation must finish before another program starts.
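    For the writing side, a sketch of storing an entry straight into the destination mapping one byte at a time, which sidesteps both the alignment and the byte-order question. It assumes the 8-byte little-endian entry layout from the earlier post; store_le32 and write_entry are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRY_LENGTH 8
#define EN_NAME      0
#define EN_DESC      4

/* Stores v at p as four little-endian bytes; p needs no alignment. */
static void store_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFFu);
    p[1] = (unsigned char)((v >> 8) & 0xFFu);
    p[2] = (unsigned char)((v >> 16) & 0xFFu);
    p[3] = (unsigned char)((v >> 24) & 0xFFu);
}

/* Writes entry number `index` into the entry array that starts at
   `entries` (for instance, a pointer into the destination mapping). */
void write_entry(unsigned char *entries, size_t index,
                 uint32_t name_off, uint32_t desc_off)
{
    unsigned char *p;
    assert(entries != NULL);
    p = entries + index * ENTRY_LENGTH;
    store_le32(p + EN_NAME, name_off);
    store_le32(p + EN_DESC, desc_off);
}
```

    Any arithmetic (adjusting offsets, adding counts) is then done on ordinary integers, and only the final values are written to the mapped file.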

    For the UDP packets, on the other hand, I'll do it your way: since there is no size/performance issue there, duplicating the information is not a problem. It is more or less what I was doing anyway, except I got the order wrong (I was copying the data I needed only after having read the raw data through a pointer to a structure).

    Thanks again

  5. #5
    P.Phant
    Guest
    Thanks Salem, I got it this time.
