Thread: Mass Data Storage, Files, Howto?

  1. #1
    #junkie
    Join Date
    Oct 2004
    Posts
    240

    Question Mass Data Storage, Files, Howto?

    Could someone give me the best way to hold lots of data that needs to be scanned through and found quickly?

    Before I started C++ I mostly used INI files to hold the data. It is fairly efficient (to me at least, lol) and easy to maintain. However, I have no idea how to do such a thing in C++. I am just starting to move beyond references and head into advanced functions, more in-depth OOP, templates, etc. (in my 21 Days book), and I was really hoping to get some practical coding in, but I need to be able to store data.
    Creating all my data in the .cpp file as big-ass variables just doesn't work for me, nor is it the way to do it, lol.

    So can someone explain a bit and give me some links if possible to the best data control methodology? Like INI, or whatever.

    Also if possible a way to find specific characters in a string would be cool too, but by far the above subject is more important. Thanks!
    01110111011000010110110001100100011011110010000001 11000101110101011010010111010000100000011011000110 10010110011001100101001000000111100101101111011101 0100100000011011100111010101100010
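    On the side question about finding specific characters in a string: a minimal sketch using std::string::find could look like the following. The helper names first_index_of and all_indices_of are made up for illustration; only std::string::find and std::string::npos come from the standard library.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: index of the first occurrence of `ch` in `text`,
// or std::string::npos if `ch` does not occur.
std::size_t first_index_of(const std::string& text, char ch) {
    return text.find(ch);
}

// Hypothetical helper: indices of every occurrence of `ch` in `text`.
std::vector<std::size_t> all_indices_of(const std::string& text, char ch) {
    std::vector<std::size_t> hits;
    for (std::size_t pos = text.find(ch); pos != std::string::npos;
         pos = text.find(ch, pos + 1)) {
        hits.push_back(pos);  // remember this hit, then search past it
    }
    return hits;
}
```

    For a line such as Name=Room 56, first_index_of(line, '=') gives the position of the equals sign, so line.substr(0, pos) is the key and line.substr(pos + 1) is the value.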

  2. #2
    #junkie
    Join Date
    Oct 2004
    Posts
    240
    Side note, as I'm sure you know from reading that: I can't use anything that clears the file when you open it if data already exists. Just had to make sure that is known, hehe.

  3. #3
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,897
    >Could someone give me the best way to hold lots of data that needs to be scanned through and found quickly?
    An external hash table or a B-tree. But seriously, how you structure your file reading depends on how you can process the data. If you can read the file in chunks and process each chunk individually then you can minimize device reads as opposed to handling one record from the file at a time.
    My best code is written with the delete key.

  4. #4
    #junkie
    Join Date
    Oct 2004
    Posts
    240
    I don't know even half of what you just said.

    But! Hash tables (in mIRC, lmfao) I am familiar with; if the format is similar to mIRC's I'll be in love. Any ideas? Hash format works fine, INI works fine (might be slow though, dunno in C++). No idea what a B-tree is. Can you give me examples, links, something to go by, lol?

  5. #5
    Registered User
    Join Date
    Jun 2004
    Posts
    722
    A Binary Tree is...
    Advantages? A binary search tree is always sorted, and searching for or inserting an element in a tree with N elements takes O(log N) time as long as the tree stays balanced (which is great). An unbalanced tree can degrade to O(N) in the worst case.
    I've already implemented this data structure.
    If you want I can post the code or mail it to you.
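    For readers who want something concrete in the meantime, below is a minimal sketch of an unbalanced binary search tree of strings. It is not the implementation offered above; the names Node, bst_insert, bst_contains, and bst_in_order are invented for this example, and no balancing is done, so the O(log N) behaviour only holds when insertions happen to keep the tree reasonably even.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type for an unbalanced binary search tree of strings.
struct Node {
    std::string key;
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
    explicit Node(std::string k) : key(std::move(k)) {}
};

// Insert `key` into the tree rooted at `root`; duplicates are ignored.
void bst_insert(std::unique_ptr<Node>& root, const std::string& key) {
    std::unique_ptr<Node>* slot = &root;
    while (*slot) {
        if (key < (*slot)->key)      slot = &(*slot)->left;   // go left
        else if ((*slot)->key < key) slot = &(*slot)->right;  // go right
        else return;                                          // already present
    }
    slot->reset(new Node(key));  // attach the new leaf at the empty slot
}

// Return true if `key` is stored in the tree.
bool bst_contains(const std::unique_ptr<Node>& root, const std::string& key) {
    const Node* cur = root.get();
    while (cur) {
        if (key < cur->key)      cur = cur->left.get();
        else if (cur->key < key) cur = cur->right.get();
        else return true;
    }
    return false;
}

// Append all keys to `out` in sorted order (left, node, right).
void bst_in_order(const Node* node, std::vector<std::string>& out) {
    if (!node) return;
    bst_in_order(node->left.get(), out);
    out.push_back(node->key);
    bst_in_order(node->right.get(), out);
}
```

    The in-order walk is what makes the "always sorted" property visible: whatever order the keys were inserted in, they come back out in ascending order.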

  6. #6
    #junkie
    Join Date
    Oct 2004
    Posts
    240


    [email protected]

    Post or mail, doesn't matter to me. So this B-tree is a good way of keeping data? And is it as efficient as (or better than) INI and whatnot?

    I'll be using this to create, like, strings of data, etc.
    For example, for INI I might go:
    Code:
    [XcYcZ]
    X=5
    Y=6
    Z=0
    Name=Room 56
    Town=Blank
    People=Jake Lee John Sarah
    That's just a simple example of a little XYZ coordinate set, keeping the data for rooms. Could I reproduce such things in a B-tree?

    What about something like this little INI quiz format I did?
    Code:
    [QuestionGroupOne]
    1.1.T=Topic1
    1.1.Q=Question\nLine2.
    1.1.A=Answer
    1.2.T=Topic1
    1.2.Q=Question2\nLine2.
    1.2.A=Answer
    2.1.T=Topic2
    2.1.Q=Question\nLine2.
    2.1.A=Answer
    [QuestionGroupTwo]
    1.1.T=Topic1
    1.1.Q=Question\nLine2.
    1.1.A=Answer
    2.1.T=Topic2
    2.1.Q=Question1\nLine2\nLine3.
    2.1.A=Answer
    2.2.T=Topic2
    2.2.Q=Question2\nLine2.
    2.2.A=Answer
    Apparently the items in the INI format use a separating character of 46 (.). Anyway, just curious if a B-tree is suited for such things, but still, give me an example please.
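    As a small illustration of that separator, a dotted key such as 2.1.Q can be split into its parts with std::string::find and substr. The function name split_key is an assumption made for this sketch, and treating '.' (character 46) as the separator simply follows the quiz format above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split a dotted key such as "2.1.Q" into
// {"2", "1", "Q"}. The separator defaults to '.' (character code 46).
std::vector<std::string> split_key(const std::string& key, char sep = '.') {
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (true) {
        const std::size_t end = key.find(sep, start);
        if (end == std::string::npos) {
            parts.push_back(key.substr(start));  // last part, no separator left
            break;
        }
        parts.push_back(key.substr(start, end - start));
        start = end + 1;  // continue just past the separator
    }
    return parts;
}
```

    With the quiz keys, the first part would be the topic number, the second the question number, and the third the field (T, Q, or A).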

  7. #7
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,897
    >A Binary Tree is...
    not a B-tree. A B-tree is a data structure designed for huge amounts of data that can't possibly be stored in main memory all at once, so it's a way to store data on disk in a way that can be quickly accessed. An external hash table follows the same principles as a regular hash table, but on a much larger scale. Both are very complex, which is why I was joking.

    Though if you're interested in binary search trees, I have three informative articles about them in the FAQ. They aren't complete by any stretch of the imagination, but I'll probably do rewrites and additions when I get the inclination.

  8. #8
    #junkie
    Join Date
    Oct 2004
    Posts
    240
    just SOMETHING! Tell me and I shall try. Anyway, thanks.

  9. #9
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,897
    >just SOMETHING!
    See my first post and then tell us how you plan on accessing the data. It also couldn't hurt to give an average estimate of how big the files will be. It's possible that what you think is a big file is actually tiny and can be stored in main memory in, say, a binary search tree. In that case I would direct you to here because I haven't yet written a tutorial on hash tables.

  10. #10
    Registered User
    Join Date
    Jun 2004
    Posts
    722
    Oops... I know what a B-tree is. But storing the data on disk and reading it back, which is the B-tree's purpose, will be laborious. And, I didn't mention, a binary tree is for keeping data in memory. You could then try to store it directly in a file, but that would be a bit hard.

  11. #11
    #junkie
    Join Date
    Oct 2004
    Posts
    240
    Well, I guess you would say it's tiny, given the sheer size of the data. I am looking more for efficient tree-like storage, like INI storage, which has a mini three-level tree if you will: file, group, item.

    As for file size, it ranges anywhere from 0 KB to, um, well, on average 0-1 MB, but another might be 1-10 MB I suppose. Nonetheless very small.

    I am more concerned about the way the data is stored, gathered, etc.

  12. #12
    Registered User
    Join Date
    Jun 2004
    Posts
    722
    Well, I advise the B-tree like Prelude mentioned. The B-tree's main purpose is to store data in files, allowing any random piece of information to be accessed in very little time. In class, my teachers used the example of storing info on some million people, like a civil registry. Google for info on B-trees. If you find an implementation, even better. I think I have an implementation in Java...
