Thread: Save File Organization Question

  1. #16
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by tabstop View Post
    The main point I was hoping you'd get to is that, so far as I can see, with all these questions about endianness and packing etc, the native format (i.e. how the struct is already stored) is The Right Answer >95% of the time. If you're in a situation where you actually have to choose one endianness over another, you'll know it well ahead of time. So why not start there and change it only if necessary?
    How many corporations actually compile their own software on multiple compilers... Most are buying "prepackaged" systems because of the liability issues (it gives them someone to sue) or they are heavily standardized on a single build system... for the very reasons cited.

    It's only the dumazz little skript kiddies that mess with multiple compilers or care a rat's backside about open source or multi-platform...

  2. #17
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Quote Originally Posted by phantomotap View Post
    The point of my concern is dishing out such horrible advice as "just dump the memory out to file" without at least explaining the absolute, undeniable fact that you then have no control over the binary layout, its stability, or its portability.
    I see you are still trying to win the argument you lost on the C board.

    List of file formats - Wikipedia, the free encyclopedia

    Pick one. I will be absolutely stunned if they don't say "this is the header file, and you write that out to disk". See, the real world does in fact work like that. You want to make a .bmp? What do you do? You copy their header structure and you write it straight out to disk. You don't screw with worrying about padding, because that is never an issue.


    Quzah.
    Hope is the first step on the road to disappointment.

  3. #18
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by Zafaron View Post
    I did a little bit more research on the topic after reading the comments and found a link that is a good place to start for anybody else who's having the same sort of issues. [36] Serialization and Unserialization ..Updated!.., C++ FAQ

    Thanks again for the insight, just having a place to start searching was a great help.
    Nice article... but! Be very cautious not to confuse "data storage" with "data exchange"... It doesn't matter how a program stores its data as long as it can get it back when it needs it. It does however matter how it *communicates* that data to other machines. These are two very different and separate issues.

    Take a simple struct...
    Code:
    #pragma pack(1)   // no padding!
    
    typedef struct tPhoneList
      { int  count;
        char FirstName[16];
        char LastName[16];
        char Street[48];
        char City[20];
        char Zip[26];
        char Phone[16];
        char Email[32]; }
        PhoneList, *pPhoneList;
    
    #pragma pack()  // allow padding again
    Look closely... not only do you know what size the struct is... you know what is stored to disk and in what order... It's not "whatever the compiler hands you"... it is *exactly* what you hand the compiler.

    Ok... now tap up a quicky bit of code to put dummy values into that struct...
    Write 1 million of them to your hard disk using something like fwrite() incrementing count as you go...
    Now turn around and read them back in --fread()--...

    How many did it get wrong?
    Believe me it will work like that until you manage to fry your hard disk... years from now.
    Upgrade your computers, get the next version OS, buy a new compiler... and you'll still have access to that data.

    Now open the file in a hex editor, see the raw content... it's pretty obvious how it works.

    Don't let these guys shine you on, because they don't like me... test it yourself with real world data.
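
The experiment described above can be sketched roughly as follows. This is only an illustration, assuming a 4-byte `int` and a compiler that honors `#pragma pack(1)`; the helper names `fill_record` and `roundtrip_records` and the dummy values are made up for this example and do not come from the post. It only checks that one build can read back what it wrote itself; it says nothing about other compilers, compiler options, or byte orders.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#pragma pack(1)   /* no padding, as in the struct from the post */
typedef struct tPhoneList
  { int  count;
    char FirstName[16];
    char LastName[16];
    char Street[48];
    char City[20];
    char Zip[26];
    char Phone[16];
    char Email[32]; }
    PhoneList;
#pragma pack()    /* back to default packing */

/* Hypothetical helper: fill record number i with predictable dummy values. */
void fill_record(PhoneList *p, int i)
{
    assert(p != NULL);
    memset(p, 0, sizeof *p);   /* zero every byte so memcmp() is meaningful */
    p->count = i;
    snprintf(p->FirstName, sizeof p->FirstName, "Joe%d", i);
    snprintf(p->LastName, sizeof p->LastName, "Doe%d", i);
    snprintf(p->Phone, sizeof p->Phone, "555-%07d", i % 10000000);
}

/* Write n records with fwrite(), read them back with fread(), and
   return the number of records that differ, or -1 on an I/O error. */
long roundtrip_records(const char *path, int n)
{
    PhoneList expected, got;
    long mismatches = 0;
    FILE *fp;
    int i;

    fp = fopen(path, "wb");
    if (fp == NULL) return -1;
    for (i = 0; i < n; i++) {
        fill_record(&expected, i);
        if (fwrite(&expected, sizeof expected, 1, fp) != 1) {
            fclose(fp);
            return -1;
        }
    }
    if (fclose(fp) != 0) return -1;

    fp = fopen(path, "rb");
    if (fp == NULL) return -1;
    for (i = 0; i < n; i++) {
        fill_record(&expected, i);   /* regenerate what should be on disk */
        if (fread(&got, sizeof got, 1, fp) != 1) {
            fclose(fp);
            return -1;
        }
        if (memcmp(&got, &expected, sizeof got) != 0) mismatches++;
    }
    fclose(fp);
    return mismatches;
}
```

Calling `roundtrip_records("phonelist.bin", 1000000)` would mirror the one-million-record test; a return value of 0 means every record read back identical to what was written.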
    Last edited by CommonTater; 05-11-2011 at 09:56 PM.

  4. #19
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    The main point I was hoping you'd get to is that, so far as I can see, with all these questions about endianness and packing etc, the native format (i.e. how the struct is already stored) is The Right Answer >95% of the time.
    Okay. Let's say I agreed 100%. Now, which native format? The version from "GCC" on "GNU/LINUX" or the one from "MSVC2K3" on "Windows"? The version compiled using 80 bit long doubles or the version using 64 bit long doubles?

    The problem remains if in your code you simply dump whatever binary chunk the compiler produces into a file. The problem remains because you aren't actually in control of the binary format and therefore don't even know if it follows your given understanding of native.

    Now instead if in your code you say "dump this 32 bit little endian value at bit 0", "dump this 64 bit IEEE LSB value at bit 32" and so forth and so on, you still have a very easy code path, a simple flat binary file, still using a "native" file format, and by Yog Sothoth look at that, simply changing a simple parameter during compilation doesn't break support for that data. And guess what, `fwrite' and `fread' don't do any of that.
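
A minimal sketch of what that explicit approach might look like in C. The helper names (`put_u32_le`, `put_u64_le`, `encode_record`, `decode_record`) and the two-field record are hypothetical, chosen only to match the "32 bit value at bit 0, 64 bit value at bit 32" wording. It assumes the platform's `double` is a 64-bit IEEE 754 value with the same byte order as its integers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { RECORD_SIZE = 12 };   /* 4 bytes for the integer + 8 for the double */

/* Store v least significant byte first, whatever the host byte order is. */
void put_u32_le(unsigned char *buf, uint32_t v)
{
    int i;
    assert(buf != NULL);
    for (i = 0; i < 4; i++)
        buf[i] = (unsigned char)((v >> (8 * i)) & 0xFF);
}

void put_u64_le(unsigned char *buf, uint64_t v)
{
    int i;
    assert(buf != NULL);
    for (i = 0; i < 8; i++)
        buf[i] = (unsigned char)((v >> (8 * i)) & 0xFF);
}

uint32_t get_u32_le(const unsigned char *buf)
{
    uint32_t v = 0;
    int i;
    for (i = 0; i < 4; i++)
        v |= (uint32_t)buf[i] << (8 * i);
    return v;
}

uint64_t get_u64_le(const unsigned char *buf)
{
    uint64_t v = 0;
    int i;
    for (i = 0; i < 8; i++)
        v |= (uint64_t)buf[i] << (8 * i);
    return v;
}

/* Hypothetical record: a 32-bit id at byte 0, a 64-bit double at byte 4. */
void encode_record(unsigned char out[RECORD_SIZE], uint32_t id, double value)
{
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);   /* assumes sizeof(double) == 8 */
    put_u32_le(out, id);
    put_u64_le(out + 4, bits);
}

void decode_record(const unsigned char in[RECORD_SIZE], uint32_t *id, double *value)
{
    uint64_t bits = get_u64_le(in + 4);
    *id = get_u32_le(in);
    memcpy(value, &bits, sizeof bits);
}
```

The encoded buffer can then be handed to `fwrite(rec, 1, RECORD_SIZE, fp)`; struct padding and host byte order no longer affect what lands in the file.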

    I will be absolutely stunned if they don't say "this is the header file, and you write that out to disk".
    I'll be delighted to play that game.

    Let's say we are creating a tool to generate that "BMP" file you keep bringing up on a 32 bit big-endian platform.

    So, here we are having copied the header for the "BMP" file format verbatim into our code. We set the size field to 53119 bytes because that's the correct value. We've made sure that we've thrown in the right sizes and the right flags, and we are positive we've used the relevant "pack this structure" command when building.

    Now, here we are, we call `fwrite' with that structure and its size as the parameter. We have failed to produce a valid "BMP" file because the size field is required to be little-endian and our compiler handed us a big-endian value.

    If only someone had told us about these very real world issues we might have known better than to just throw that binary chunk at a file; if only someone told us we might have to do some manual serialization of our data, but alas Quzah and CommonTater just don't know anything about these real world issues.
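
For the BMP case specifically, a hedged sketch: the 14-byte BMP file header stores its multi-byte fields in little-endian order, so building it byte by byte gives the same output on any host. The names `store_u32_le` and `build_bmp_file_header` are hypothetical, and the pixel-data offset of 54 used in the note below is an assumption for illustration (it corresponds to the common case of a 40-byte info header with no palette).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { BMP_FILE_HEADER_SIZE = 14 };

/* Store v least significant byte first, independent of host byte order. */
static void store_u32_le(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Fill out[] with the BMP file header:
     bytes 0-1   signature "BM"
     bytes 2-5   total file size, little-endian
     bytes 6-9   two reserved 16-bit fields, zero
     bytes 10-13 offset of the pixel data, little-endian */
void build_bmp_file_header(unsigned char out[BMP_FILE_HEADER_SIZE],
                           uint32_t file_size, uint32_t pixel_offset)
{
    assert(out != NULL);
    memset(out, 0, BMP_FILE_HEADER_SIZE);   /* also clears the reserved fields */
    out[0] = 'B';
    out[1] = 'M';
    store_u32_le(out + 2, file_size);
    store_u32_le(out + 10, pixel_offset);
}
```

With the size from the post, `build_bmp_file_header(h, 53119, 54)` stores the size field as the bytes 7F CF 00 00. A raw `fwrite` of a native 32-bit integer on a big-endian host would emit 00 00 CF 7F instead.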

    Soma

  5. #20
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Quote Originally Posted by phantomotap View Post
    Okay. Let's say I agreed 100%. Now, which native format? The version from "GCC" on "GNU/LINUX" or the one from "MSVC2K3" on "Windows"? The version compiled using 80 bit long doubles or the version using 64 bit long doubles?
    All of that is completely irrelevant, because in the specs for the header, they tell you what each field's size is. For that matter...
    Quote Originally Posted by phantomotap View Post
    Now instead if in your code you say "dump this 32 bit little endian value at bit 0", "dump this 64 bit IEEE LSB value at bit 32" and so forth and so on, you still have a very easy code path, a simple flat binary file, still using a "native" file format, and by Yog Sothoth look at that, simply changing a simple parameter during compilation doesn't break support for that data. And guess what, `fwrite' and `fread' don't do any of that.
    Again, completely irrelevant. You are telling people not to write structures to disk, that no one does it, and that it's bad.

    Everyone does it. It's not bad, and the format can be read by anyone who you describe the file to, with a simple fread that will work because everyone has been doing it for forty years!
    Quote Originally Posted by phantomotap View Post
    Now, here we are, we call `fwrite' with that structure and its size as the parameter. We have failed to produce a valid "BMP" file because the size field is required to be little-endian and our compiler handed us a big-endian value.
    Completely irrelevant, and ... wrong. EVERYONE reads .bmp files the same way, they even write them the same way. Everyone. You fill the structure correctly and you write it out in one shot. That is what you are preaching against, and you are wrong to do so, because everyone reads .bmp files the same way.

    You are advocating not writing structures out, and you are wrong to do so, because as I have said everyone does it that way. Your fictitious endian story doesn't exist. Which makes it all the more hilarious when you say:
    Quote Originally Posted by phantomotap View Post
    but alas Quzah and CommonTater just don't know anything about these real world issues.
    You can't possibly be this stupid.


    Quzah.
    Hope is the first step on the road to disappointment.

  6. #21
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    You think padding and alignment are irrelevant?

    You think that reading a little-endian value when you have written a big-endian value is irrelevant?

    Wow. You may be the most incompetent programmer I've ever met.

    In any event, you keep on being wrong; I'll keep on being right, and when I see you and others tell people that `fread' and `fwrite' are somehow magically going to solve all portability issues related to alignment, packing, endianness, and all those other real world problems I've pointed out, I'll keep correcting you. Don't worry though, you are so invested in being wrong now I have no doubt that you'll keep being wrong for a long time to come.

    Soma

  7. #22
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    You already lost this argument before. You saying writing an entire structure out to disk is wrong doesn't make it so. Fix it in memory, write it in one shot. People do it all the time. You wanting it to be wrong just so you can chime in and say "YOU ARE WRONG!" doesn't make you right. I can absolutely pick any random file format in that link, look at the header they give me, and make a program to read and write whatever that format is, on any computer with a compiler in no time flat, yes, block reading/writing the whole thing, and it will absolutely work. Why? Because that's what file formats are for! So you can write programs to read that data file wherever the hell you want!

    You crying wolf on flat file reads/writes doesn't make it wrong, it makes you stupid.


    Quzah.
    Hope is the first step on the road to disappointment.

  8. #23
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    Fix it in memory, write it in one shot.
    Oh, so all of this time I've been wrong telling people of those real world issues and how a given programmer would have to manually serialize data if it is to be in any specific form and how you two are flat out wrong to suggest that `fread' and `fwrite' with whatever chunk of data the compiler may produce will work even in the face of all those considerations, and now all of a sudden you agree, saying "Soma is wrong, but as it turns out a programmer will have to do some manual work after all."?

    Priceless. You are really desperate to be right all of a sudden.

    What happened, did you try that "BMP" on a big-endian platform example I posted and find out that, sure enough, exactly as I said, and have been saying all along, the programmer would have to manually serialize the data?

    Eh, don't worry; I'm sure you'll find something else to be wrong about later.

    Soma

  9. #24
    the hat of redundancy hat nvoigt's Avatar
    Join Date
    Aug 2001
    Location
    Hannover, Germany
    Posts
    3,130
    Oh, so all of this time I've been wrong telling people of those real world issues
    Those are issues, but not everyone encounters them. What people are telling you is that the solution is fine for the majority of people. Yes, changing compilers more frequently than your underwear has its own problems. Yes, switching systems every day has its own problems. But most people don't do this. For most people, the simple solution is the best solution.
    hth
    -nv

    She was so Blonde, she spent 20 minutes looking at the orange juice can because it said "Concentrate."

    When in doubt, read the FAQ.
    Then ask a smart question.

  10. #25
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    You apparently missed the part from earlier where it was said that those real world problems weren't issues at all. Actually, it has been said several times that even noting that those issues exist is idiotic. So, no, that's not what people were trying to tell me.

    In any event, you don't even need to change compilers to need to deal with those issues. Simply changing a compiler option is enough for some compilers.
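
As a rough illustration of how a packing setting alone changes the bytes a raw `fwrite` would emit, the sketch below declares the same two fields twice, once with default alignment and once under `#pragma pack(push, 1)`. The pragma stands in here for a build-wide packing option such as GCC's `-fpack-struct` or MSVC's `/Zp`; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Same fields, default alignment: the compiler may insert padding after tag. */
typedef struct { char tag; int32_t value; } DefaultRec;

/* Same fields again, but with padding disabled for this one declaration. */
#pragma pack(push, 1)
typedef struct { char tag; int32_t value; } PackedRec;
#pragma pack(pop)

/* Returns 1 if the two layouts are not byte-for-byte compatible. */
int layouts_differ(void)
{
    return sizeof(DefaultRec) != sizeof(PackedRec)
        || offsetof(DefaultRec, value) != offsetof(PackedRec, value);
}

/* Print both layouts so they can be compared on the current build. */
void print_layouts(FILE *out)
{
    assert(out != NULL);
    fprintf(out, "default: size %zu, value at offset %zu\n",
            sizeof(DefaultRec), offsetof(DefaultRec, value));
    fprintf(out, "packed:  size %zu, value at offset %zu\n",
            sizeof(PackedRec), offsetof(PackedRec, value));
}
```

On a typical x86-64 build the default struct occupies 8 bytes with `value` at offset 4, while the packed one occupies 5 bytes with `value` at offset 1, so a file written by one build cannot be read directly by the other.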

    Even then, if you ever work with an outside stream or an outside file format, you will have to deal with those issues.

    Telling newbies to simply dump whatever binary chunk the compiler produces without at least noting those issues is flat out lying. Further, telling newbies that they aren't issues after having the issues explained to them is incredibly stupid.

    Soma

  11. #26
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Quote Originally Posted by phantomotap View Post
    In any event, you don't even need to change compilers to need to deal with those issues. Simply changing a compiler option is enough for some compilers.
    Saying this is just as much of a "lie" as this statement is:
    Quote Originally Posted by phantomotap View Post
    Telling newbies to simply dump whatever binary chunk the compiler produces without at least noting those issues is flat out lying. Further, telling newbies that they aren't issues after having the issues explained to them is incredibly stupid.

    Soma
    Anyone looking into #pragma pack (or something similar) already knows that they are changing how things are padded, and anyone NOT looking into it isn't going to accidentally change their compiler options enough so that anything you are trying to say here actually matters.


    Quzah.
    Last edited by quzah; 05-12-2011 at 03:50 AM.
    Hope is the first step on the road to disappointment.
