Thread: Parsing buffered binary data

  1. #1
    Registered User
    Join Date
    May 2008
    Posts
    4

    Parsing buffered binary data

    I'm trying to parse a binary file.
    The file consists of multiple header->record pairs.
    The record length and type are stored in the header.
    I want to be able to read the entire header into a buffer and then split it into the variables.
    Unfortunately I can't figure out how to do it.

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #define HEADER_SIZE 4
    
    typedef struct rHead
    {
        unsigned short int REC_LEN;
        unsigned char REC_TYP;
        unsigned char REC_SUB;
    } recHead;
    
    
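    recHead readHeader(char *buffer)
    {
        recHead hdr = {0, 0, 0}; /* zeroed so the stub returns defined values */
        (void)buffer;
        /* no idea how to parse it */
        return hdr;
    }
    
    int main(void)
    {
        FILE *fp;
        char *buffer;
        recHead header;
    
        fp = fopen("1.std", "rb");
        if (fp == NULL)
            return 1;
        buffer = malloc(HEADER_SIZE);
        /* stop if the allocation fails or the file is shorter than one header */
        if (buffer == NULL || fread(buffer, HEADER_SIZE, 1, fp) != 1)
        {
            free(buffer);
            fclose(fp);
            return 1;
        }
        header = readHeader(buffer);
        (void)header; /* not used yet */
        free(buffer);
        fclose(fp);
        return 0;
    }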

  2. #2
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Why not read directly into the struct? You can do that, too. Or is there a reason you don't want to do that?
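    For reference, reading straight into the struct could look roughly like the sketch below. It assumes the compiler adds no padding to recHead and that the file uses the same byte order as the machine reading it. The helper name read_header_direct is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

#define HEADER_SIZE 4

typedef struct rHead
{
    unsigned short int REC_LEN;
    unsigned char REC_TYP;
    unsigned char REC_SUB;
} recHead;

/* Hypothetical helper: reads one header straight into the struct.
   Returns 0 on success, -1 on failure. */
int read_header_direct(FILE *fp, recHead *out)
{
    assert(fp != NULL && out != NULL);

    /* Assumption: no padding, so the struct is exactly as large as
       the header stored in the file. Refuse to read otherwise. */
    if (sizeof *out != HEADER_SIZE)
        return -1;

    /* fread returns the number of complete items read (here 0 or 1). */
    if (fread(out, sizeof *out, 1, fp) != 1)
        return -1;

    return 0;
}
```

    REC_LEN comes out in host byte order here, so this only gives the right length when the file was written on a machine with the same endianness.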
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  3. #3
    Registered User
    Join Date
    May 2008
    Posts
    4
    Reading directly into the structure is OK for the header, but it will cause problems for the record itself, as it contains some optional fields, so I will need to parse the buffer anyway.
    The second reason is, I started using C yesterday (I'm a Perl programmer) and I want to learn.
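    To illustrate the optional-field problem: one possible approach is a small cursor that tracks how many bytes are left, so a missing trailing field is detected instead of being read past the end of the buffer. This is only a sketch. The record layout (one mandatory byte followed by one optional byte) and the names cursor, take_u8 and parse_record are hypothetical, not taken from the real file format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor over a record buffer. */
typedef struct
{
    const unsigned char *p; /* next unread byte */
    size_t left;            /* bytes remaining */
} cursor;

/* Reads one byte if available; returns 1 on success, 0 if exhausted. */
static int take_u8(cursor *c, unsigned char *out)
{
    assert(c != NULL && out != NULL);
    if (c->left < 1)
        return 0;
    *out = *c->p;
    c->p += 1;
    c->left -= 1;
    return 1;
}

/* Hypothetical record: a mandatory field, then an optional one. */
typedef struct
{
    unsigned char mandatory;
    unsigned char optional;
    int has_optional; /* 1 if the optional field was present */
} record;

/* Returns 0 on success, -1 if even the mandatory field is missing. */
int parse_record(const unsigned char *buf, size_t len, record *rec)
{
    cursor c = { buf, len };

    assert(rec != NULL);
    if (!take_u8(&c, &rec->mandatory))
        return -1;

    /* The optional field is present only if bytes remain. */
    rec->has_optional = take_u8(&c, &rec->optional);
    if (!rec->has_optional)
        rec->optional = 0;
    return 0;
}
```

    Here len would be the REC_LEN value taken from the header, so the record length decides which optional fields are present.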

  4. #4
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Then I think the easiest way is to walk through the buffer with pointers. (sscanf parses text, so it won't help much with binary data.)
    One way might look like this:

    Code:
    typedef struct
    {
        char c;
        short s;
        int i;
        long long l;
    } mystruct;
    
    void foo(const char* pBuffer)
    {
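        mystruct mys;
        /* Caveat: the casts below assume the CPU tolerates unaligned reads;
           memcpy is the portable alternative. */
        mys.c = *(char*)pBuffer;         /* read 1 byte as a char */
        pBuffer += sizeof(mys.c);        /* step past it */
        mys.s = *(short*)pBuffer;        /* read the next bytes as a short */
        pBuffer += sizeof(mys.s);
        mys.i = *(int*)pBuffer;          /* ...then as an int */
        pBuffer += sizeof(mys.i);
        mys.l = *(long long*)pBuffer;    /* ...then as a long long */
        pBuffer += sizeof(mys.l);
        /* To skip a field, leave out its assignment but still advance pBuffer past it */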
    }
    Example (not tested).
    Just watch out for endianness problems. The data must be read in the same byte order it was written in, so you can't move the file to a machine with a different endianness without writing a new file first.

  5. #5
    Registered User
    Join Date
    May 2008
    Posts
    4
    Thanks, it works great.

    But I can't understand how this line works.
    mys.c = *(char*)pBuffer;
    Can you please explain it or point me to where I can read about this?

    I'm not worried about endianness at the moment; it's just a project to learn
    general algorithms and coding style in C.

  6. #6
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Although this
    Code:
    mys.c = *(char*)pBuffer;
    ...could just be simplified as this...
    Code:
    mys.c = *pBuffer;
    ...the secret behind these lines is that each one first casts the char pointer to a pointer to another type and then dereferences that pointer. Memory itself contains no type information; the compiler treats the bytes according to the type of the expression used to access them. So we're just telling the compiler to treat the data differently.
    To walk forward in the buffer, we simply add the size of the variable to the pointer. Because pBuffer is a char pointer, that skips forward exactly that many bytes, past the data you just assigned. Then we repeat the process.
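    The same read-then-advance walk can also be written with memcpy in place of the cast-and-dereference, which avoids trouble on machines that can't read a short or int from an unaligned address. A sketch under the same assumptions as before (fields packed back to back in the buffer, host byte order); parse_packed is just an illustrative name:

```c
#include <assert.h>
#include <string.h>

typedef struct
{
    char c;
    short s;
    int i;
    long long l;
} mystruct;

/* Copies each field out of a tightly packed buffer, advancing the
   pointer by the size of the field just read. */
mystruct parse_packed(const char *pBuffer)
{
    mystruct mys;

    assert(pBuffer != NULL);
    memcpy(&mys.c, pBuffer, sizeof mys.c);
    pBuffer += sizeof mys.c;
    memcpy(&mys.s, pBuffer, sizeof mys.s);
    pBuffer += sizeof mys.s;
    memcpy(&mys.i, pBuffer, sizeof mys.i);
    pBuffer += sizeof mys.i;
    memcpy(&mys.l, pBuffer, sizeof mys.l);
    return mys;
}
```

    memcpy copies raw bytes, so like the cast it attaches no meaning to them; the type of the destination decides how they are interpreted.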

  7. #7
    Registered User
    Join Date
    May 2008
    Posts
    4
    Thank you very much for your help.
