Thread: Help me understand this ridiculous disk performance on a few systems

  1. #1
    Registered User
    Join Date
    Mar 2007
    Posts
    142

    Help me understand this ridiculous disk performance on a few systems

    I have a Mac/Win application that performs well on the Mac, relatively well on some Windows systems, and absurdly poorly on several other Windows machines.

    I'll start with the results: the time it takes to generate a report. To get the final result, my application reads records from disk and caches some intermediary results back to disk, so it comes down to read and write operations.

    MacBook Air (SSD disk) - Mac version - 5 seconds.
    MacBook Air - MacOS in VMWare (single core used) - 12 seconds
    MacBook Air - WinXP in VMWare (single core used) - 35 seconds

    NoName Box, cheap c2d, Win7 Home - 10 minutes
    HP Intel Pentium 2.8 GHz, Win 8 Pro - cca 45 minutes

    My friend's Win7, some Intel with 4 cores, normal HDD - 25 seconds

    This looks crazy. Insane time differences. It's the same code; only the platform-specific calls are different. On Windows it comes down to CreateFile(), ReadFile(), WriteFile(), SetFilePointer(), etc.

    On my system I downloaded the FileMon utility by Russinovich, and when I look at the activity I can see READ, WRITE, LOCK and UNLOCK calls, all marked SUCCESS. Reads are in shared files, that's why there are locks. The cache file (used for writing) is exclusive, so there are no locks on it.
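
    Roughly, the pattern in my code looks like this - just a sketch, the file names and record size here are made up, not the real code:

    Code:
    #include <windows.h>

    #define REC_SIZE 512   /* made-up record size, for illustration only */

    /* Shared data file: other processes may have it open, hence the share
       flags and the byte-range lock around the read. */
    int ReadSharedRecord(DWORD index, void *buffer)
    {
        HANDLE h = CreateFileA("records.dat", GENERIC_READ,
                               FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
                               OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
        DWORD  offset = index * REC_SIZE, got = 0;
        BOOL   ok = FALSE;

        if (h == INVALID_HANDLE_VALUE)
            return 0;

        if (LockFile(h, offset, 0, REC_SIZE, 0)) {           /* LOCK   */
            SetFilePointer(h, offset, NULL, FILE_BEGIN);
            ok = ReadFile(h, buffer, REC_SIZE, &got, NULL);  /* READ   */
            UnlockFile(h, offset, 0, REC_SIZE);              /* UNLOCK */
        }
        CloseHandle(h);
        return ok && got == REC_SIZE;
    }

    /* The cache file is opened exclusively (share mode 0), so no locks. */
    HANDLE OpenCacheFile(void)
    {
        return CreateFileA("report.cache", GENERIC_READ | GENERIC_WRITE, 0,
                           NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    }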

    Next time I have access to the Win8 computer I'll look at it with FileMon, but I'd like to know if there's any other tool that could tell me what is going on.

    That computer was bought this Friday. They had a slightly older Win7 machine (the 10-minute machine in the tests above), and I was telling them it's a slow computer and they should buy something better. Ha!

    This Friday they called me, we tried the report, and I was perplexed and ashamed at the same time, since it is my application and I don't have the slightest idea why it runs so slowly. This new Win8 box is again a cheap box (cca 400 EUR + TAX), like I said above, some HP with Intel Pentium at 2.8 GHz, not sure about other specs.

    That's why I'd like to test it somehow and see why my application is so painfully slow. How can performance on modern hardware differ this much? Five or 25 seconds vs. 45 minutes. How can this be?

    Any suggestions for other tools or maybe some system calls I can put in my own code? Any good disk benchmarking tool that can monitor live disk performance in other applications?
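
    If nothing else, I suppose I could wrap the calls myself and time them with QueryPerformanceCounter - a crude sketch of the idea (the 1 ms threshold is an arbitrary guess):

    Code:
    #include <windows.h>
    #include <stdio.h>

    /* Time each ReadFile call and log the slow ones, so I can see where
       the time goes without an external tool. */
    static BOOL TimedReadFile(HANDLE h, void *buf, DWORD toRead, DWORD *got)
    {
        LARGE_INTEGER freq, t0, t1;
        double ms;
        BOOL   ok;

        QueryPerformanceFrequency(&freq);
        QueryPerformanceCounter(&t0);
        ok = ReadFile(h, buf, toRead, got, NULL);
        QueryPerformanceCounter(&t1);

        ms = (t1.QuadPart - t0.QuadPart) * 1000.0 / (double)freq.QuadPart;
        if (ms > 1.0)
            fprintf(stderr, "ReadFile: %lu bytes, %.2f ms\n",
                    (unsigned long)toRead, ms);
        return ok;
    }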

    Edit - It's a rather old application that I still maintain, but I don't change the basics at all, so it's a 32-bit process on both platforms.
    Last edited by idelovski; 02-17-2013 at 05:27 PM.

  2. #2
    Salem
    and the hat of int overfl
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Xperf, a new tool in the Windows SDK - Pigs Can Fly - MSDN Blogs

    > This new Win8 box is again a cheap box (cca 400 EUR + TAX), like I said above, some HP with Intel Pentium at 2.8 GHz, not sure about other specs.
    Yeah, this seems too cheap for running Win8 (I assume part of that price was the price for the OS as well).
    What else did it come with - keyboard, mouse, monitor?
    My guess is that it has only enough memory to just boot Windows and surf the web, and the slowest (aka cheapest) hard disk the manufacturer could find.

    Quote Originally Posted by idelovski View Post
    On my system I downloaded the FileMon utility by Russinovich, and when I look at the activity I can see READ, WRITE, LOCK and UNLOCK calls, all marked SUCCESS. Reads are in shared files, that's why there are locks. The cache file (used for writing) is exclusive, so there are no locks on it.
    As well as looking at the time each call takes, also look at the time between calls.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    Mar 2007
    Posts
    142
    Quote Originally Posted by Salem View Post
    > This new Win8 box is again a cheap box (cca 400 EUR + TAX), ...
    What else did it come with - keyboard, mouse, monitor?
    All-inclusive: monitor, Windows, keyboard... They were quite proud of it and repeated the price several times. That's why I remember it.

    Quote Originally Posted by Salem View Post
    My guess is that it has only enough memory to just boot windows and surf the web, and the slowest (aka cheapest) hard disk that the manufacturer could find.
    As well as looking at the time each call takes, also look at the time between calls.
    But it's a new computer, and it has a brand name on the box. My app was written when the average computer had 8 or 16 MB of RAM, so now when I look at it I can see it uses between 8 and 20 MB as it runs. That's why it was designed to keep its cache on disk. If I were writing it in this century, the cached data would have been kept in memory. Well, maybe this is the first thing I'll do this week - move the cache from disk to memory.
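
    Something along these lines should do for the in-memory cache - a quick sketch assuming fixed-size records appended one after another; the names are made up:

    Code:
    #include <stdlib.h>
    #include <string.h>

    /* Growable in-memory replacement for the on-disk cache file. */
    typedef struct {
        unsigned char *data;
        size_t         recSize;    /* size of one cached record       */
        size_t         count;      /* records currently in the cache  */
        size_t         capacity;   /* allocated space, in records     */
    } MemCache;

    int CacheAppend(MemCache *c, const void *rec)
    {
        if (c->count == c->capacity) {
            size_t newCap = c->capacity ? c->capacity * 2 : 256;
            unsigned char *p = realloc(c->data, newCap * c->recSize);
            if (!p)
                return 0;              /* out of memory, caller decides */
            c->data = p;
            c->capacity = newCap;
        }
        memcpy(c->data + c->count * c->recSize, rec, c->recSize);
        c->count++;
        return 1;
    }

    const void *CacheGet(const MemCache *c, size_t index)
    {
        return index < c->count ? c->data + index * c->recSize : NULL;
    }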

    But this is the only report that recreates the cache from scratch on each run. I have other reports where the cache is only updated on each run; those will continue to suffer when a lot of cached records are invalid for some reason. I would really like to somehow recognize computers that are this incredibly bad.
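
    Maybe the simplest way to recognize them is to time a small write/read loop at startup and flag the machine if it takes ages - rough sketch, the 1 MB size and the 2-second threshold are just guesses:

    Code:
    #include <windows.h>

    /* Write and read back 1 MB in a temp file; if that takes more than a
       couple of seconds, treat the disk as suspiciously slow. */
    static int DiskLooksSlow(void)
    {
        char   path[MAX_PATH], name[MAX_PATH];
        BYTE   chunk[4096] = {0};
        DWORD  written, got, start, elapsed;
        HANDLE h;
        int    i;

        GetTempPathA(MAX_PATH, path);
        GetTempFileNameA(path, "spd", 0, name);

        h = CreateFileA(name, GENERIC_READ | GENERIC_WRITE, 0, NULL,
                        CREATE_ALWAYS, FILE_FLAG_WRITE_THROUGH, NULL);
        if (h == INVALID_HANDLE_VALUE)
            return 0;

        start = GetTickCount();
        for (i = 0; i < 256; i++)                    /* 256 * 4 KB = 1 MB */
            WriteFile(h, chunk, sizeof(chunk), &written, NULL);
        SetFilePointer(h, 0, NULL, FILE_BEGIN);
        for (i = 0; i < 256; i++)
            ReadFile(h, chunk, sizeof(chunk), &got, NULL);
        elapsed = GetTickCount() - start;

        CloseHandle(h);
        DeleteFileA(name);
        return elapsed > 2000;
    }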

  4. #4
    Salem
    and the hat of int overfl
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Is it one of these perhaps?
    HP's current crop of <$400 machines.
    It certainly shouldn't be an issue.

    Is the Win8 machine running 64-bit Windows? What about the other machines - are they running 64-bit OSes, or 32-bit?
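
    If you can't check them by hand, your 32-bit process can at least detect 64-bit Windows itself - rough sketch using IsWow64Process, looked up dynamically because very old systems don't export it:

    Code:
    #include <windows.h>

    /* Returns 1 if this (32-bit) process runs under WOW64, i.e. on a
       64-bit version of Windows; 0 otherwise (or if it can't tell). */
    static int RunningOn64BitWindows(void)
    {
        typedef BOOL (WINAPI *IsWow64Fn)(HANDLE, PBOOL);
        IsWow64Fn fn = (IsWow64Fn)GetProcAddress(
                           GetModuleHandleA("kernel32"), "IsWow64Process");
        BOOL isWow64 = FALSE;

        if (fn && fn(GetCurrentProcess(), &isWow64))
            return isWow64 ? 1 : 0;
        return 0;
    }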

    > Reads are in shared files, that's why there are locks.
    Shared from where?
    A busy LAN with large files is a different environment to your MacBook tests with smaller local files.

    How many degrees of variability are there?
    Can you test local (unshared) files on local disks for example?
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  5. #5
    Registered User
    Join Date
    Mar 2007
    Posts
    142
    I don't remember what was on the box, but within Windows it was reported as an Intel Pentium at 2.8 GHz. I was surprised the Pentium still exists. The files were on a local drive; they were just opened with the share flag.

    Other machines? I suppose my friend's box was 64-bit Win7. Native OS X on the Air, as well as the one in VMWare, are both 64-bit. XP Service Pack 3 in VMWare on my Air is most likely 32-bit. My customers' machines are what they are: Win7 Home and Win8 Pro. I can't check anything else for the moment. The older computer is less than a year old; the Win8 computer is 3 days old.
    Last edited by idelovski; 02-18-2013 at 06:25 AM.

  6. #6
    Registered User
    Join Date
    Mar 2007
    Posts
    142
    I did a quick-and-dirty in-memory cache implementation just to see what happens, and on my XP in VMWare things improved roughly threefold: from 45 seconds to 16 seconds.

    The Mac version on the MacBook Air is slightly worse now, from 5 seconds to 6, but when I tested it on my 2007 C2D MacBook with a regular laptop drive, things improved from 1 minute 20 seconds to 18 seconds.

    I can hardly wait to test it on my customer's Win8 machine.

  7. #7
    Registered User
    Join Date
    Mar 2007
    Posts
    142
    Well, things improved somewhat with the cache-in-memory version. Their Win7 machine that was taking 10 minutes now shows the report in 15 seconds. The new Win8 machine now takes 6 minutes or so; it was 45 minutes before, but that is still incredibly slow compared to anything else I have access to.

    I think I'll write a test application that opens one file for writing and another for reading and copies data randomly from one to the other. I'll report the time, then increase the number of files used at the same time and print the time for each iteration: from two files open at the same time, to three, four, five, and so on. I think I have something like 20 to 25 files in use for the report that takes anywhere from 6 seconds on my MacBook Air to 6 minutes on that crazy Intel Pentium.

    With the test application in hand I will see if time increases linearly or exponentially on that particular computer. This whole thing makes no sense to me.
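
    Something along these lines is what I have in mind - the file names, chunk size and iteration counts are just placeholders:

    Code:
    #include <windows.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define CHUNK 512

    /* With nFiles destination files open at once, copy random chunks from
       one source file into them and report the elapsed time. */
    static void RunTest(int nFiles, int nChunks)
    {
        HANDLE src, dst[32];
        BYTE   buf[CHUNK];
        DWORD  got, put, start;
        char   name[64];
        int    i;

        src = CreateFileA("source.dat", GENERIC_READ, FILE_SHARE_READ, NULL,
                          OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
        for (i = 0; i < nFiles; i++) {
            sprintf(name, "copy%02d.dat", i);
            dst[i] = CreateFileA(name, GENERIC_WRITE, 0, NULL,
                                 CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
        }

        start = GetTickCount();
        for (i = 0; i < nChunks; i++) {
            LONG offset = (rand() % 1000) * CHUNK;   /* random source chunk */
            SetFilePointer(src, offset, NULL, FILE_BEGIN);
            ReadFile(src, buf, CHUNK, &got, NULL);
            WriteFile(dst[i % nFiles], buf, got, &put, NULL);
        }
        printf("%2d files: %lu ms\n", nFiles,
               (unsigned long)(GetTickCount() - start));

        for (i = 0; i < nFiles; i++)
            CloseHandle(dst[i]);
        CloseHandle(src);
    }

    int main(void)
    {
        int n;
        for (n = 2; n <= 25; n++)       /* two files at once, then three... */
            RunTest(n, 10000);
        return 0;
    }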
    Last edited by idelovski; 03-07-2013 at 09:37 AM.
