Thread: file system vs sql database

  1. #1
    Registered User
    Join Date
    Jan 2005
    Posts
    108

    file system vs sql database

    Hello, I figure there's no other place to ask this thing around (general development forums? doesn't seem to be here), so I guess since I'm doing this in c++ this is the best place to ask..

    I will have millions of record, and each have relations to around 6ish other records. Each of these records would be around 200 bytes or less.. and they'll only ever be accessed by an index number (which is a long).

    The question is, would it be better to store these under sql, or would it be better to just store these under the file system, with the file names as their indexes? If I use SQL, all I'll be doing is just "select entry where index = X" or something like that..

    And then there's the caching thing.. does sql (e.g mysql) provide caching for the most retrieved files, or anything like that?

    Thanks in advance.. its hard to find info on this (for a beginner like me that is

  2. #2
    Computer guy
    Join Date
    Sep 2005
    Location
    I'm lost!!!
    Posts
    200
    well, database is not really my strong field. But, as i understand ( i might be wrong ), system would be faster compare to using database, because of all the connection process with the database.

    Good database technique is: Open the connection, get data, store in a buffer, then close the connection. That would take more time. Trust me, i did that to my final project, and the project runs extremely slow everytime I process a function that involving getting data from database.

    The rest of your question, i'll save it for some other expertises.
    Hello, testing testing. Everthing is running perfectly...for now

  3. #3
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Well what else are you going to use the program for?

    If it's just a simple homework, with a console/GUI interface, then perhaps a file approach.

    But if someone might ask later on, "can we have a web interface", then having it all in SQL will make that change exceedingly easy.

    > all I'll be doing is just "select entry where index = X" or something like that..
    If there is any inclination to expand on that in the future, then perhaps it's a good choice now.

    We need more background info to make better suggestions.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  4. #4
    Registered User
    Join Date
    Jan 2005
    Posts
    108
    Good database technique is: Open the connection, get data, store in a buffer, then close the connection. That would take more time. Trust me, i did that to my final project, and the project runs extremely slow everytime I process a function that involving getting data from database.
    Hmm, was the performance hit that bad? Which reminds me, doesn't mysql run things off the disk?

    We need more background info to make better suggestions.
    My bad, guess it wasn't detailed enough. It's for a website I'm developing, for storing lots of user-entered geographical data.. it'll be small in the beginning I guess, but just a bit wary of complexity issues if things get crowded. The data will be accessed by a c++ server process, so anything will have to go through c++ first..

    My inclination with file system is that mysql (and the likes) seems to only have "lock table" for concurrency control.. whereas file system, you can have individual file locks? This means multiple processes can look up and change entries independently if they're not related, as opposed to locking the whole table. But then mysql is much more convenient than using the file system..

  5. #5
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Personally, I think I'd go the mysql route.

    It's all ready set up with things like being able to cope with distributed web access, and has already solved the concurrency problems.

    Adding more data, more users, more inquiries etc will be a lot easier in future. All of which would potentially require lots of code on your part.

    Also, for a web-facing application, the security requirements are not to be under estimated.
    http://en.wikipedia.org/wiki/Sql_injection

    Making your C++ application proof against buffer overrun, memory leaks, etc etc will be no small feat in itself.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  6. #6
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by underthesun View Post
    I will have millions of record, and each have relations to around 6ish other records. Each of these records would be around 200 bytes or less.. and they'll only ever be accessed by an index number (which is a long).

    The question is, would it be better to store these under sql, or would it be better to just store these under the file system, with the file names as their indexes? If I use SQL, all I'll be doing is just "select entry where index = X" or something like that..
    What, you're telling us that you have millions of records and you never plan to insert more or update existing records?

    Oh, you do plan to do that? Well, do you ever plan on maybe, just maybe, having more than one person to be able to add or edit records at once, or one person updating and another person reading? Might you also want to someday be able to give some people different access such as read-only to the data? Is it possible that you might at some point want to perform some updates in a transactional manner? Do you want to be sure that you'll never have records that refer to invalid record indexes? What about other one-off values that you might otherwise have to store seperately in the registry or an ini file (requiring seperate backing up), as opposed to in a seperate table of special values?

    It seems to me like you should future proof this now and use a database like everybody else.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  7. #7
    Registered User
    Join Date
    Sep 2004
    Location
    California
    Posts
    3,268
    I agree that a database is the way to go. If your database is only accessed through your c++ application and not directly by the webserver, then I suggest you look into using SQLite instead of mysql.

  8. #8
    Registered User
    Join Date
    Dec 2007
    Posts
    37
    I dont know much about sql but yeah go with it because in the long run you would probably later want other people to be able to access it and if it is in test files then they cant or you cnat give certain people certain amounts of access. You could turn the text files into html files so that people could acces their files and then have a user and pass for each so that each individual person could change their own file but hat would take tons of work on your part where it would be so much easier for you to just make a sql file and there are some problems but most would be solved.

    hope I helped,
    h3ckf1r3

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Need Help Fixing My C Program. Deals with File I/O
    By Matus in forum C Programming
    Replies: 7
    Last Post: 04-29-2008, 07:51 PM
  2. Batch file programming
    By year2038bug in forum Tech Board
    Replies: 10
    Last Post: 09-05-2005, 03:30 PM
  3. Unknown Memory Leak in Init() Function
    By CodeHacker in forum Windows Programming
    Replies: 3
    Last Post: 07-09-2004, 09:54 AM
  4. simulate Grep command in Unix using C
    By laxmi in forum C Programming
    Replies: 6
    Last Post: 05-10-2002, 04:10 PM
  5. File Database & Data Structure :: C++
    By kuphryn in forum C++ Programming
    Replies: 0
    Last Post: 02-24-2002, 11:47 AM