>> I guess nothing will prevent that...
There's ECC memory - see the Wikipedia article on dynamic random-access memory.
gg
Yes, I hope to get hardware configured to run ECC memory. It will cause a performance hit, however. Also, ECC memory is not perfect; some soft errors will probably slip through. Standard ECC can correct a single flipped bit and detect (but not correct) a double-bit flip; if a cosmic ray particle flips three or more bits (unlikely but possible), the error can even go completely unnoticed. That could be enough to throw such huge calculations completely off.
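One cheap software-side mitigation, independent of ECC, is to checksum critical data between computation phases so that silent corruption at least gets detected. A minimal Python sketch (purely illustrative - the data and the flipped bit are made up, not tied to any particular program):

```python
import struct
import zlib

def checksum(values):
    """CRC32 over the raw bytes of a list of floats.

    Recomputing this over data that should not have changed, and comparing
    against the stored value, detects silent corruption such as a bit flip
    (any single-bit change alters a CRC32).
    """
    raw = struct.pack(f"{len(values)}d", *values)
    return zlib.crc32(raw)

data = [1.0, 2.5, -3.125, 4.0]
reference = checksum(data)

# Simulate a single flipped bit in one value's mantissa.
raw = bytearray(struct.pack(f"{len(data)}d", *data))
raw[5] ^= 0x01
corrupted = list(struct.unpack(f"{len(data)}d", bytes(raw)))

assert checksum(data) == reference       # untouched data still verifies
assert checksum(corrupted) != reference  # the bit flip is detected
```

This only detects corruption, it cannot repair it, but for a long-running calculation that may be enough to know a run must be redone rather than trusted.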
My concern about using a single processor is that these programs are HUGE. One program will easily use up all memory resources (assuming I can find an op system that allows a large process size). So they will be running one program at a time. Also, the initial request they made to me was for speed, which is out the window if they use 1 out of 4 or 1 out of 8 cores.
The other cores will help because the op system won't interrupt the main program, but not a lot. However, again, they said that is what they want.
darsunt, I think you're getting dazzled by one aspect of the problem: the hardware.
The hardware is actually the cheapest factor in this whole setting. If the problems can be split across four cores, they can be split across four computers. And the cost of four computers is probably less than the cost of rearchitecting the software for parallelism. And make no mistake: rearchitecting a working software design so it can run in parallel is a massive rearchitecture. The incidence of systematic errors (i.e. programmer errors) increases substantially with the complexity of the system design and with the size of the code base - and parallelism increases both design complexity and code size. One practical rule of thumb: system rearchitecture is an extremely effective method of producing an expensive and unreliable system from a working one.
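To make the "split across four cores" idea concrete: even the simplest embarrassingly parallel case needs partitioning and coordination code that the single-core version doesn't have. A toy Python sketch (the problem - summing a range in four chunks - is made up for illustration; real workloads rarely split this cleanly):

```python
from multiprocessing import Pool

def partial_sum(bounds):
    """Worker: sum one contiguous chunk of the range."""
    lo, hi = bounds
    return sum(range(lo, hi))

if __name__ == "__main__":
    n = 1_000_000
    # Partition [0, n) into four equal chunks, one per core.
    chunks = [(i * n // 4, (i + 1) * n // 4) for i in range(4)]
    with Pool(4) as pool:
        total = sum(pool.map(partial_sum, chunks))
    # The combined result must match the closed form n*(n-1)/2.
    assert total == n * (n - 1) // 2
```

Even here, the partitioning, the worker pool, and the recombination are all extra code - and this example has no shared state at all. The same split works across four machines (e.g. with a job queue instead of `Pool`), which is the point: once the work can be partitioned, buying more boxes is often cheaper than rewriting the software.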
The hardware will also be routinely replaced. Parallelism, on the other hand, increases both the initial and maintenance cost of the software - because it increases the complexity of the software, making it harder to maintain. "Bang for buck" of hardware will keep going up (even if the software is not written specifically to exploit multiple cores) but that will not necessarily be true with a new software design.
I consider the concerns raised in this thread about floating point precision irrelevant. The mathematics of floating point is not trivial, but neither is it a show stopper for large-scale numerical calculations, provided it is accounted for in the software design - specifically the algorithm design - and when interpreting the program outputs. That's true whether the software is parallelised or not.
Similarly, the point about soft errors is irrelevant - even ignoring the fact that cosmic rays are only one contributor. Soft errors are a probabilistic effect that needs to be detected (if possible) and accounted for, but - if anything - parallel software is more likely to be adversely affected, because parallel software uses additional hardware resources (e.g. memory) for synchronisation. That means more machinery - software and hardware overhead - to detect the impacts following on from soft errors.
What I suspect these guys are looking for from you is optimisation of the code they have - i.e. fine tuning the algorithms - rather than a complete system rearchitecture. In terms of return on investment, I suspect they're right.
How can I merge two .exe files together, and how can I run both simultaneously?
Please reply as soon as possible.
Thanks...
"as soon as possible"