Thread: Found the source of my unstable system

  1. #16
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    One of the problems with memory and its compatibility with various systems is that when it is run at high speed - such as 400MHz DDR or 800MHz DDR2 - the margins are wafer-thin. And the variance isn't just in the memory sticks, but also in the connector, the motherboard, the memory controller and the processor itself. If you have a particularly slow pad on one pin of the memory controller, the memory can be well within margins and still not work in that system.

    I had a memory problem with a Xen virtual machine - the memtest86 that comes with most Linux distros would run just fine on its own [that is, when booting directly into memtest86], but when running a virtual machine in Xen, it would fail to read that memory - just one bit was wrong, and I even wrote my own little version of the particular test in memtest86. Changing the memory to another two sticks of the same brand fixed the problem. Obviously, running with virtualization enabled changed some timing in accessing the memory - perhaps it did some page-table fetching that didn't happen during the normal run, for example.
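
    A minimal sketch of what a home-grown pattern test of this kind could look like, assuming a walking-ones pattern (the post does not say which memtest86 test was involved). The names walking_ones_test and bit_diff are made up for illustration. Run as an ordinary user-space program on a small buffer, it will largely be served from the CPU caches, so treat it as an illustration of the idea rather than a replacement for memtest86.

```c
#include <stddef.h>
#include <stdint.h>

/* Return a mask of the bits that differ between what was written and
 * what was read back. A single-bit fault shows up as one set bit. */
uint32_t bit_diff(uint32_t expected, uint32_t got)
{
    return expected ^ got;
}

/* Walking-ones test (hypothetical helper, not taken from memtest86):
 * for each of the 32 bit positions, fill the buffer with a word that
 * has only that bit set, then read every word back and compare.
 * Returns the number of words that did not read back correctly. */
size_t walking_ones_test(volatile uint32_t *buf, size_t words)
{
    size_t errors = 0;

    for (unsigned bit = 0; bit < 32; bit++) {
        uint32_t pattern = (uint32_t)1 << bit;

        /* Write pass: store the pattern in every word. */
        for (size_t i = 0; i < words; i++)
            buf[i] = pattern;

        /* Read pass: volatile forces an actual load for each word. */
        for (size_t i = 0; i < words; i++) {
            if (bit_diff(pattern, buf[i]) != 0)
                errors++;
        }
    }
    return errors;
}
```

    Calling walking_ones_test(buf, n) on a buffer of n 32-bit words returns 0 when every word reads back as written; a non-zero count together with bit_diff points at the bit position that is misbehaving.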

    And yes, it can be difficult to find memory errors, because other interference can cause problems to come and go.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  2. #17
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    That did, of course, remind me that motherboards are very complex pieces of hardware and are, in fact, still created by hand and not by a manufacturing process!
    So it may very well be the motherboard at fault and not the memory, or a combination of the two.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  3. #18
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by Elysia View Post
    That did, of course, remind me that motherboards are very complex pieces of hardware and are, in fact, still created by hand and not by a manufacturing process!
    Huh? The motherboards are designed by humans, but the manufacturing and assembly of millions of boards is not done by human hands to any large extent. Ok, some through-hole mounted connectors [such as the power connector] may be hand-soldered, but certainly the memory sockets and all the main components are machine placed (mounted) and wave-soldered, like any other mass-produced electronics.

    So it may very well be the motherboard at fault and not the memory, or a combination of the two.
    I've seen a fair share of "doesn't work well together" cases, where one processor or one stick of memory works fine in one model of motherboard, but move the same component to another model of motherboard and it fails. Swap in another similar part on that motherboard, and it works just fine. That's what's called a "marginal issue", where the timing parameters of one component straddle the limits of the other component - and too fast is just as bad as too slow in this case.
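
    The "too fast is just as bad as too slow" point can be sketched as a window check: the range of timings a part may actually produce has to sit entirely inside the range the other side accepts. This is only an illustration; the struct, the function names and the picosecond figures in the usage note are all invented, not taken from any datasheet.

```c
#include <stdbool.h>

/* A timing range in picoseconds (hypothetical type for illustration). */
struct window {
    int min_ps;
    int max_ps;
};

/* True when everything the sender may produce is acceptable to the
 * receiver, i.e. the sender's window lies inside the receiver's. */
bool window_fits(struct window sender, struct window receiver)
{
    return sender.min_ps >= receiver.min_ps &&
           sender.max_ps <= receiver.max_ps;
}

/* Smallest remaining margin on either edge. A negative result means
 * the sender straddles one of the receiver's limits. */
int slack_ps(struct window sender, struct window receiver)
{
    int early = sender.min_ps - receiver.min_ps;  /* margin on the fast side */
    int late  = receiver.max_ps - sender.max_ps;  /* margin on the slow side */
    return early < late ? early : late;
}
```

    With a made-up receiver window of 100 to 500 ps, a part spanning 150 to 450 ps fits with 50 ps of slack, while one spanning 80 to 300 ps (too fast) and one spanning 300 to 520 ps (too slow) both fail, each by 20 ps.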


  4. #19
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by matsp View Post
    Huh? The motherboards are designed by humans, but the manufacturing and assembly of millions of boards is not done by human hands to any large extent. Ok, some through-hole mounted connectors [such as the power connector] may be hand-soldered, but certainly the memory sockets and all the main components are machine placed (mounted) and wave-soldered, like any other mass-produced electronics.
    Sure, the individual pieces are all manufactured, but typically everything (or at least some of it) is put together by human hands due to the complexity. I saw this process once in a demonstration video.
    Anyway, it just leaves room for even more errors.

    Complex stuff...

  5. #20
    Woof, woof! zacs7
    Join Date
    Mar 2007
    Location
    Australia
    Posts
    3,459
    Just out of curiosity, would you say quality control is being relaxed and warranties increased?

    Recently I found out first-hand that a few hundred thousand BenQ monitors had failed 'quality control' -- dry solder (the solder machine completely missed). Luckily I got my monitors fixed before the warranty expired. Is that quality control's responsibility?

    Also, hypothetically, could a crate of RAM not be stacked next to some crate of 'damaging material'?

  6. #21
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Well, a dry solder joint may not be detected until the equipment starts coming back after it's failed [because the solder is holding up well enough not to fail during factory tests]. Sure, someone should visually inspect the joints from time to time, and it's certainly a "QC failure" if something makes it out of the factory with faults like that - but it's also understandable that you expect machines to do what they are supposed to most of the time.

    In all industries, there is a balance between how much effort you put into checking the product before it gets to the customer and the cost of having to replace shipped product. The ideal balance is such that you "don't upset too many customers, don't lose too much money on replacements", but also "don't spend TOO much on QA". Some companies haven't found the right balance all the time - I remember Seagate having problems with their hard disks some 15-20 years back - not sure about them now, if they still exist...


  7. #22
    Registered User VirtualAce
    Join Date
    Aug 2001
    Posts
    9,607
    They do and they recently purchased Maxtor.
