Thread: Optimize file writing

  1. #1
    Registered User
    Join Date
    Oct 2008
    Posts
    77

    Optimize file writing

    Hi,
    I would like your view on the best way to perform write operations on a file under Windows. It may seem a basic question, but I'm experiencing some unpleasant performance problems, as follows.

    I have a complex application written in C which has worked fine so far. It handles binary files in create, write and read modes. Now that I'm bringing it to XP, Vista and Windows Mobile, I'm having problems with flushing the write buffers. If I don't flush, I get high speed but memory isn't released and I have to close and then re-open the file. If I flush the buffer and then continue, execution becomes very slow. I execute a large volume of write operations one after the other.
    My application - which I'm currently "refurbishing" for the operating systems mentioned above - has so far used basic DOS and C library functions.
    My question is: in your experience, what method would you recommend (when creating the file, should I set some parameter for straight writing, without buffering?) and which functions (DOS, Win32 API, other?) should I consider to get the best performance, in terms of speed and memory usage, for just writing binary files to disk?
    Many thanks in advance!

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Obviously, in a battle between performance and correctness, correctness wins every time.

    However, it may well be that you can do flushing in a different way than you are doing now - only flush when you switch from write to read - have a flag that indicates whether the last operation was a write or a read. In the reading code, check the flag, and flush if the last operation was a write, and reset the flag to "read" state. That way, you are only flushing when you actually NEED to do it.
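
    The flag idea above could look roughly like the sketch below. It is only an illustration in standard C stdio: the `tracked_file` type and the `tf_*` function names are made up for this example, and it assumes writes always append at the end of the file while reads go to an explicit offset. The `flush_count` field exists only to show how rarely the flush happens.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical wrapper (not from the thread): a stream plus the last operation. */
typedef enum { OP_NONE, OP_READ, OP_WRITE } last_op;

typedef struct {
    FILE   *fp;
    last_op last;         /* what the previous call did */
    long    flush_count;  /* how often we really flushed, for illustration */
} tracked_file;

/* Create (or truncate) a binary file for both writing and reading. */
int tf_open(tracked_file *tf, const char *path)
{
    tf->fp = fopen(path, "w+b");
    tf->last = OP_NONE;
    tf->flush_count = 0;
    return tf->fp != NULL;
}

/* Append bytes at the end of the file; never flushes. */
size_t tf_append(tracked_file *tf, const void *data, size_t len)
{
    assert(tf->fp != NULL);
    /* After a read, reposition to the end before writing again. */
    if (tf->last == OP_READ && fseek(tf->fp, 0L, SEEK_END) != 0)
        return 0;
    tf->last = OP_WRITE;
    return fwrite(data, 1, len, tf->fp);
}

/* Read bytes at an absolute offset; flushes only if the last call wrote. */
size_t tf_read_at(tracked_file *tf, long offset, void *out, size_t len)
{
    assert(tf->fp != NULL);
    if (tf->last == OP_WRITE) {
        if (fflush(tf->fp) != 0)
            return 0;
        tf->flush_count++;
    }
    tf->last = OP_READ;
    if (fseek(tf->fp, offset, SEEK_SET) != 0)
        return 0;
    return fread(out, 1, len, tf->fp);
}

/* Close the stream; returns whatever fclose returns. */
int tf_close(tracked_file *tf)
{
    int rc = fclose(tf->fp);
    tf->fp = NULL;
    return rc;
}
```

    With this, a long run of `tf_append` calls costs no flush at all, and only the first `tf_read_at` after a write pays for one.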

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  3. #3
    Registered User
    Join Date
    Oct 2008
    Posts
    77
    I keep track in my applications - even at run time! - of all allocations - RAM, handles, disk ... - and everything seems under control. But when I run the Process Manager in parallel, I see that the system's available memory drops sharply. It's not a matter of flushing (I've tried that, as well as dup()); there is something in the system's buffers that I can't get access to. Very strange and very annoying ...

  4. #4
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Sounds like the OS is caching your file IO and that is taking up memory. Windows (and many other OSes) will use free memory for DISK/FILE caching, since there's no point in having unused memory in the machine.

    If you look in task manager or some similar tool, does your application's memory usage actually go up?

    [By the way, I'm far from convinced this should be in the same thread - is it not a completely different subject than the one about "Optimize file writing"? If so, please make a comment here, and one of the moderators can split it off into a separate discussion].

    --
    Mats

  5. #5
    Registered User
    Join Date
    Oct 2008
    Posts
    77
    Thanks Mats!
    You are right that caching is an important part of OS performance tuning, but as far as I know it is done algorithmically in connection with the peripheral's data structure: for NTFS it is the sector, for TCP communication it is the window size, etc. When I opened the thread I was looking for something precise related to XP and Vista, or .NET if you want, since I'm missing real facts, although I know the principles. Using C gives us a lot of problems when trying to make applications perform well in these framework/managed code environments.
    Thanks again for your help!

  6. #6
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    I have no idea EXACTLY what XP/NT does, but I do know that "if there is spare memory, it is used to cache filesystem data".

    Edit: as to how you optimize your file-writing, I would say "try to write as large chunks as possible", and avoid flushing if there is no need to flush. Of course, if you have two applications both reading and writing to the same file, you will NEED to flush at every write (or every read, whichever is statistically fewer). [Although flushing on read won't work for the other application writing - only the one writing can flush in that case].
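
    As a rough sketch of the "large chunks, no needless flushing" advice in plain C: the stream can be given a big, fully buffered stdio buffer with `setvbuf`, so that many small `fwrite` calls reach the OS as a few large writes. The function names, the 16-byte record and the 64 KB buffer size are assumptions made up for this example, not values from the thread; the right size would have to be measured.

```c
#include <assert.h>
#include <stdio.h>

enum { RECORD_SIZE = 16, IO_BUFFER_SIZE = 64 * 1024 };  /* illustrative sizes */

/* Write `count` small records through one large stdio buffer.
   Returns the number of records written, or -1 on open/close failure. */
long write_records(const char *path, long count)
{
    unsigned char record[RECORD_SIZE];
    long written = 0;
    FILE *fp = fopen(path, "wb");

    assert(count >= 0);
    if (fp == NULL)
        return -1;

    /* Must be called before any other operation on the stream.
       Passing NULL lets the C library allocate the buffer itself. */
    if (setvbuf(fp, NULL, _IOFBF, IO_BUFFER_SIZE) != 0) {
        fclose(fp);
        return -1;
    }

    for (long i = 0; i < count; i++) {
        for (int j = 0; j < RECORD_SIZE; j++)
            record[j] = (unsigned char)(i + j);   /* dummy payload */
        if (fwrite(record, RECORD_SIZE, 1, fp) != 1)
            break;                                /* no fflush in the loop */
        written++;
    }

    /* fclose flushes whatever is still buffered. */
    if (fclose(fp) != 0)
        return -1;
    return written;
}

/* Helper for checking the result: size of a file in bytes, or -1.
   Seeking to the end of a binary stream works on common platforms,
   although the C standard does not strictly require it. */
long file_size(const char *path)
{
    long size = -1;
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    if (fseek(fp, 0L, SEEK_END) == 0)
        size = ftell(fp);
    fclose(fp);
    return size;
}
```

    Calling `write_records("out.bin", 1000)` should leave a 16000-byte file; the only flush is the implicit one in `fclose`.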

    --
    Mats
    Last edited by matsp; 10-23-2008 at 04:33 AM.

  7. #7
    Registered User
    Join Date
    Oct 2008
    Posts
    77
    Thanks Mats for the clarifications, really useful!

  8. #8
    Algorithm Dissector iMalc
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    If you're using Flush to reduce memory usage then you're simply using it for the wrong reason. Just let the OS use as much memory for caching your files as it thinks is worthwhile. After all, there's a reason your app goes so fast without flushing all the time.
    Flush only when it is critical that the file on disk be up to date immediately, e.g. if it were a database system and you had just committed a transaction.
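
    A minimal sketch of that "flush at the commit point only" pattern, in standard C. The `journal` type and its functions are hypothetical names for this example. Note that `fflush` only hands the data from the C library's buffer to the OS; if the bytes must physically reach the disk, a further platform call would be needed after it (for example `_commit` or `FlushFileBuffers` on Windows, `fsync` on POSIX), which is left out here to keep the sketch portable.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical append-only log with explicit commit points. */
typedef struct {
    FILE *fp;
    long  pending;   /* records written since the last commit */
    long  commits;   /* number of commits (= number of flushes) */
} journal;

/* Create (or truncate) the log file. Returns 1 on success. */
int journal_open(journal *j, const char *path)
{
    j->fp = fopen(path, "wb");
    j->pending = 0;
    j->commits = 0;
    return j->fp != NULL;
}

/* Buffer one record; deliberately does not flush. Returns 1 on success. */
int journal_append(journal *j, const void *rec, size_t len)
{
    assert(j->fp != NULL);
    if (fwrite(rec, 1, len, j->fp) != len)
        return 0;
    j->pending++;
    return 1;
}

/* Commit point: the only place where the buffer is flushed. */
int journal_commit(journal *j)
{
    assert(j->fp != NULL);
    if (fflush(j->fp) != 0)
        return 0;
    j->pending = 0;
    j->commits++;
    return 1;
}

/* Close the log; returns whatever fclose returns. */
int journal_close(journal *j)
{
    int rc = fclose(j->fp);
    j->fp = NULL;
    return rc;
}
```

    A transaction of many records then costs a single flush: call `journal_append` for each record and `journal_commit` once at the end.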
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

    Last Post: 12-26-2004, 10:46 AM