Thread: Save File Organization Question

  1. #1
    Registered User
    Join Date
    May 2011
    Posts
    3

    Save File Organization Question

    I've recently started to play with the idea of coding for more than just a passing hobby and one of the first problems that I ran into was what to do with save files. I understand how to write information to disk, but where I get lost is on how that information is typically organized and identified for retrieval later. Is every line given some sort of a tag so it can be searched and assigned to a variable when it's read after the program starts up? Does the program just read everything on line 1 and assign that to some variable?

    I guess what I'm looking for here is whether there are any standardized ways to handle reading and writing from disk in an organized and efficient way. I did search the forums, but most of what I was able to find dealt with specific problems. I am looking for more general knowledge. If someone could point me to a link or give me an idea of what to search for, I would be grateful.

    Also, in case it helps, the file I am playing with would have to store a very large number of different variables from many different objects. (Many thousands of objects)

    Thanks for any help you can give.

  2. #2
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    A lot of games, and not a few businesses, use record (struct) based filing systems...
    The basic idea is that you organize blocks of data as structs and then just write the struct itself to disk... memory to disk... disk to memory. No translation or intermediate steps needed. In larger storage systems files are made up of lots and lots of structs (millions in some cases) and you can access them very rapidly by using two common techniques, "Random Access" and "Binary Search". With Random Access you know the size of the struct, so if you want the 10,000th one you can go straight to it and read it in faster than this... blink of an eye, really. When you don't know which struct you need, Binary Search (which needs the records sorted by key) lets you find 1 of a thousand in only 10 tries, one of 2000 in only 11... even if it's the last record in the file... not quite blink of an eye but surprisingly fast.

    Do some googling on these techniques, it's a fascinating read.
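
For anyone following along, here is a minimal sketch of the record-file idea described above: fixed-size structs written back to back, random access by index, and a binary search over the file. It assumes the records are sorted by `id` before they are written. `ItemRecord`, `write_records`, `read_record` and `find_by_id` are made-up names for illustration, not anything from a library, and the resulting file is only meaningful to a program built with the same struct layout.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical fixed-size record; the fields are illustrative only.
struct ItemRecord {
    std::int32_t id;        // sort key used by the binary search
    std::int32_t quantity;
    char name[24];          // fixed width, NUL padded
};
static_assert(std::is_trivially_copyable<ItemRecord>::value,
              "raw byte copies are only valid for trivially copyable types");

// Write all records back to back. Binary search needs them sorted by id.
bool write_records(const std::string& path, const std::vector<ItemRecord>& recs) {
    assert(std::is_sorted(recs.begin(), recs.end(),
                          [](const ItemRecord& a, const ItemRecord& b) { return a.id < b.id; }));
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    for (const ItemRecord& r : recs)
        out.write(reinterpret_cast<const char*>(&r), sizeof r);
    return static_cast<bool>(out);
}

// Random access: record number `index` starts at byte index * sizeof(ItemRecord).
bool read_record(std::ifstream& in, std::uint64_t index, ItemRecord& out) {
    in.clear();  // forget any earlier end-of-file state
    in.seekg(static_cast<std::streamoff>(index * sizeof(ItemRecord)), std::ios::beg);
    in.read(reinterpret_cast<char*>(&out), sizeof out);
    return static_cast<bool>(in);
}

// Binary search on disk: every read halves the remaining window [lo, hi).
bool find_by_id(std::ifstream& in, std::uint64_t count, std::int32_t id, ItemRecord& out) {
    std::uint64_t lo = 0, hi = count;
    while (lo < hi) {
        const std::uint64_t mid = lo + (hi - lo) / 2;
        if (!read_record(in, mid, out)) return false;
        if (out.id == id) return true;
        if (out.id < id) lo = mid + 1; else hi = mid;
    }
    return false;
}
```

To try it, fill a `std::vector<ItemRecord>` in ascending `id` order, call `write_records`, then open the same path with `std::ifstream` in `std::ios::binary` mode and pass it to `read_record` or `find_by_id` along with the record count.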

  3. #3
    Registered User
    Join Date
    May 2011
    Posts
    3

    Thanks

    Thanks for the info, this gives me a good place to start. I had an inkling that adding tags to every line was a terribly inefficient way to go about saving and loading stuff.

  4. #4
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by Zafaron View Post
    Thanks for the info, this gives me a good place to start. I had an inkling that adding tags to every line was a terribly inefficient way to go about saving and loading stuff.
    There are situations where it's the right way to do it...
    For example... the Random Access method works perfectly when you have the exact same information in large numbers of structs (i.e. they're all the same) ... For example, an inventory program tracking a 100,000-item inventory.

    However, this system ain't worth crap if you can't standardize the records. When blocks are differing sizes or you have huge amounts of non-repeating data to store, Random Access ain't gonna do it for ya... That's when you break out the "Formatted Text File" method (look at any Windows .ini file) and start reading sequentially.
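
A rough sketch of the "Formatted Text File" approach, assuming a simplified .ini-like layout: `[section]` headers, `key=value` lines, and comment lines starting with `;` or `#`. Real .ini parsers handle more cases (quoting, duplicate keys and so on), and `trim` and `parse_ini` are hypothetical helper names, not a standard API.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Strip leading and trailing spaces, tabs and carriage returns.
std::string trim(const std::string& s) {
    const char* blanks = " \t\r";
    const std::size_t b = s.find_first_not_of(blanks);
    if (b == std::string::npos) return "";
    const std::size_t e = s.find_last_not_of(blanks);
    assert(e >= b);  // both indices point at non-blank characters
    return s.substr(b, e - b + 1);
}

// Read the stream line by line and collect values under "section.key".
std::map<std::string, std::string> parse_ini(std::istream& in) {
    std::map<std::string, std::string> values;
    std::string line, section;
    while (std::getline(in, line)) {
        line = trim(line);
        if (line.empty() || line[0] == ';' || line[0] == '#') continue;  // blank or comment
        if (line.front() == '[' && line.back() == ']') {                 // section header
            section = trim(line.substr(1, line.size() - 2));
            continue;
        }
        const std::size_t eq = line.find('=');
        if (eq == std::string::npos) continue;                           // skip malformed lines
        const std::string key = trim(line.substr(0, eq));
        values[section.empty() ? key : section + "." + key] = trim(line.substr(eq + 1));
    }
    return values;
}
```

Any `std::istream` works as input, so the same function can read from a `std::ifstream` on disk or from a `std::istringstream` while experimenting. Every value comes back as text; converting to numbers is left to the caller.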

    It really depends on what you're storing, how standardized it is and how many repetitions you have... You may end up with a sequential file of 30,000 variables that are otherwise unrelated or a Random Access file of 10 identical structs that are clearly related...

    It's about analysing your needs, doing the research and deciding the best course...
    No professional would settle for less.

  5. #5
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    A lot of games, and not a few businesses, use record (struct) based filing systems...The basic idea is that you organize blocks of data as structs and then just write the struct itself to disk... memory to disk... disk to memory. No translation or intermediate steps needed.
    Boy do I get tired of hearing that garbage repeated.

    No, dude, as I've said, in the real world, it is almost always more complicated than dumping whatever binary chunk the compiler gives you to a file; even when a simple flat binary is used in the real world, a specific endianness, packing, alignment, size, and format (such as IEEE floating points or some other form), is expected. No two compilers, or even one compiler with different options used during compile time, will agree on all of those details.

    [Edit]
    Oh, and as this is the C++ side of things, the number of things that compiler vendors can and do change between their different products to make the "simply dump all structures to a disk" route fail is significantly larger.
    [/Edit]

    It can work otherwise as long as you limit yourself to a single platform, provide tools to convert the file, or use the generated file only as a temporary cache, but otherwise you are setting yourself up for failure.
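
To make the "specific endianness, packing, size" point concrete, here is one possible sketch of manual serialization: each field is written byte by byte in a fixed little-endian order, so the bytes on disk do not depend on the compiler's struct layout or on the host's byte order. The `Player` struct, the 10-byte layout and all function names are assumptions made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory record; its layout in memory is irrelevant to the file.
struct Player {
    std::uint32_t id;
    std::int32_t score;
    std::uint16_t level;
};

// On-disk size: 4 + 4 + 2 bytes, with no padding.
constexpr std::size_t kPlayerBytes = 10;

// Append v least significant byte first.
void put_u32_le(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFFu));
}

void put_u16_le(std::vector<std::uint8_t>& out, std::uint16_t v) {
    out.push_back(static_cast<std::uint8_t>(v & 0xFFu));
    out.push_back(static_cast<std::uint8_t>((v >> 8) & 0xFFu));
}

std::uint32_t get_u32_le(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

std::uint16_t get_u16_le(const std::uint8_t* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Field-by-field encoding in a fixed order.
void serialize(const Player& p, std::vector<std::uint8_t>& out) {
    put_u32_le(out, p.id);
    put_u32_le(out, static_cast<std::uint32_t>(p.score));
    put_u16_le(out, p.level);
}

// Decode one record starting at `offset`.
Player deserialize(const std::vector<std::uint8_t>& in, std::size_t offset = 0) {
    assert(offset + kPlayerBytes <= in.size());
    const std::uint8_t* p = in.data() + offset;
    Player r;
    r.id = get_u32_le(p);
    // Assumes two's complement conversion (guaranteed from C++20, universal in practice).
    r.score = static_cast<std::int32_t>(get_u32_le(p + 4));
    r.level = get_u16_le(p + 8);
    return r;
}
```

The byte vector can then be written with a single `write` call and read back the same way. The cost compared with dumping the struct is a few lines of code per record type.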

    [Edit]
    Oh, I should have made a mention of this:

    It's about analysing your needs, doing the research and deciding the best course...
    No professional would settle for less.
    This bit though, is simply excellent advice.

    There is no universal standard that will match every need.

    If you want advice on any particular data, you'll have to discuss what you have.

    [/Edit]

    Soma
    Last edited by phantomotap; 05-11-2011 at 07:29 PM. Reason: none of your business

  6. #6
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by phantomotap View Post
    even when a simple flat binary is used in the real world, a specific endianness, packing, alignment, size, and format (such as IEEE floating points or some other form), is expected. No two compilers, or even one compiler with different options used during compile time, will agree on all of those details.
    I think I'm confused. Do companies often change compilers between printing copy number 50000 and 50001 of a production run?

  7. #7
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by phantomotap View Post
    Boy do I get tired of hearing that garbage repeated.
    And boy do I ever get tired of you repeatedly harping along about how it's somehow a bad thing.

    It's done all the time!

    Give it a rest, ok?

  8. #8
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    You already lost this argument before. You saying writing an entire structure out to disk is wrong doesn't make it so. Fix it in memory, write it in one shot. People do it all the time. You wanting it to be wrong just so you can chime in and say "YOU ARE WRONG!" doesn't make you right. I can absolutely pick any random file format in that link, look at the header they give me, and make a program to read and write whatever that format is, on any computer with a compiler in no time flat, yes, block reading/writing the whole thing, and it will absolutely work. Why? Because that's what file formats are for! So you can write programs to read that data file wherever the hell you want!

    You crying wolf on flat file reads/writes doesn't make it wrong, it makes you stupid.


    Quzah.
    Hope is the first step on the road to disappointment.

  9. #9
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    Fix it in memory, write it in one shot.
    Oh, so all of this time I've been wrong telling people of those real world issues, and how a given programmer would have to manually serialize data if it is to be in any specific form, and how you two are flat out wrong to suggest that `fread' and `fwrite' with whatever chunk of data the compiler may produce will work even in the face of all those considerations, and now all of a sudden you agree, saying "Soma is wrong, but as it turns out a programmer will have to do some manual work after all."?

    Priceless. You are really desperate to be right all of a sudden.

    What happened, did you try that "BMP" on a big-endian platform example I posted and find out that, sure enough, exactly as I said, and have been saying all along, the programmer would have to manually serialize the data?

    Eh, don't worry; I'm sure you'll find something else to be wrong about later.

    Soma

  10. #10
    the hat of redundancy hat nvoigt's Avatar
    Join Date
    Aug 2001
    Location
    Hannover, Germany
    Posts
    3,130
    Oh, so all of this time I've been wrong telling people of those real world issues
    Those are issues, but not everyone encounters them. What people are telling you is that the solution is fine for the majority of people. Yes, changing compilers more frequently than your underwear has its own problems. Yes, switching systems every day has its own problems. But most people don't do this. For most people, the simple solution is the best solution.
    hth
    -nv

    She was so Blonde, she spent 20 minutes looking at the orange juice can because it said "Concentrate."

    When in doubt, read the FAQ.
    Then ask a smart question.

  11. #11
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    You apparently missed the part from earlier where it was said that those real world problems weren't issues at all. Actually, it has been said several times that even noting that those issues exist is idiotic. So, no, that's not what people were trying to tell me.

    In any event, you don't even need to change compilers to need to deal with those issues. Simply changing a compiler option is enough for some compilers.

    Even then, if you ever work with an outside stream or an outside file format, you will have to deal with those issues.

    Telling newbies to simply dump whatever binary chunk the compiler produces without at least noting those issues is flat out lying. Further, telling newbies that they aren't issues after having the issues explained to them is incredibly stupid.

    Soma

  12. #12
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Quote Originally Posted by phantomotap View Post
    In any event, you don't even need to change compilers to need to deal with those issues. Simply changing a compiler option is enough for some compilers.
    Saying this is just as much of a "lie" as this statement is:
    Quote Originally Posted by phantomotap View Post
    Telling newbies to simply dump whatever binary chunk the compiler produces without at least noting those issues is flat out lying. Further, telling newbies that they aren't issues after having the issues explained to them is incredibly stupid.

    Soma
    Anyone looking into #pragma pack (or something similar) already knows that they are changing how things are padded, and anyone NOT looking into it isn't going to accidentally change their compiler options enough so that anything you are trying to say here actually matters.
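
For readers who have not met the padding issue being argued about, a small sketch: the same two fields declared with default alignment and with `#pragma pack(push, 1)`, a pragma accepted by MSVC, GCC and Clang. The struct and function names are hypothetical. On common desktop compilers `sizeof(Padded)` is typically 8, but that number is an assumption about the platform, not a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Default alignment: the compiler may insert padding after `tag`
// so that `value` starts on a 4-byte boundary.
struct Padded {
    char tag;
    std::int32_t value;
};

// Packed to 1-byte alignment: no padding between the members.
#pragma pack(push, 1)
struct Packed {
    char tag;
    std::int32_t value;
};
#pragma pack(pop)

static_assert(sizeof(Packed) == 5, "1 + 4 bytes, no padding expected");

// Extra bytes per record that a raw dump of Padded would write to disk.
std::size_t padding_bytes() {
    assert(sizeof(Padded) >= sizeof(Packed));
    return sizeof(Padded) - sizeof(Packed);
}
```

A file written by dumping `Padded` and read back as `Packed` (or the other way around) would misplace every field after `tag` whenever `padding_bytes()` is non-zero, which is the failure mode both sides are describing.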


    Quzah.
    Last edited by quzah; 05-12-2011 at 03:50 AM.
