Regarding the precision, couldn't the doubles just be replaced with long doubles for 80 bit precision in a lookup table?
Regarding the precision, couldn't the doubles just be replaced with long doubles for 80 bit precision in a lookup table?
They take 10 bytes each. That means, any meaningful table would eat a huge amount of resources. f.e., a sin table with 0.01 precision would take (PI * 2 / 0.01) * 10 = about 6 kilobytes. Also apply that to cos and tan and you'll quickly eat some resources and risk cache misses. And this is a very low precision if you're using long double.
Comparison: floats (precision 0.01) = 2.5 Kilobytes with a meaningful precision, doubles 5 Kilobytes.
There isn't anything yet to suggest that a lookup table is remotely useful here. The question raised is about efficiency:How can one even entertain the notion of introducing a lookup table when there's a glaringly obvious optimisation that makes it redundant immediately?! If I can see it then perhaps the compiler can too. The whole question of the code efficiency hinges directly on what is stopping the compiler from making this optimisation itself. That's where the answer to this efficiency question lies. It's not as important whether it actually does that optimisation or not, it's more about how the presence of such other code completely alters the efficiency of this entire snippet. The code posted must be as close to the real code as possible, and the mock example we have so far just doesn't tell half of the real story.Would it be more efficient to calculate all the a's first in som other loop, and simply accessing them in memory?
As for number 2, one way or another it just knows, and it is likely different from one compiler, CRT, or OS to another. If you can imagine a way for the CRT to know that, then there's a reasonable chance that you're not far off for how at least one system does it.
My homepage
Advice: Take only as directed - If symptoms persist, please see your debugger
Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"
There is no COMPILER option to do this, that I am aware of. However, your idea is very insightful, and something that I have implemented manually in the past.
I call it "memscatter," and the idea is to allocate thousands of blocks of memory of random sizes at program startup, then randomly select some fraction of these blocks to free up. The resulting heap fragmentation changes the behavior of the program and makes it easier to shake out memory corruption bugs (i.e. it crashes sooner and therefore makes it easier to find the problem). However, this is a last resort technique which is useful only when no other option is available.
You may want to investigate a tool called "electric fence" which overrides the default memory allocator to align all allocations to the beginning, or end, of a VM page, with invalid pages before and after the allocation. If the code goes outside the array, it will segfault and you can immediately see the bug.
Also, tools like Valgrind, which execute your program in a virtual machine and monitor memory accesses at the level of single bits, can easily find such bugs.
At work, we use Purify, which is an on-line instrumenter which changes the code in magical ways to automatically detect such problems.
Code://try //{ if (a) do { f( b); } while(1); else do { f(!b); } while(1); //}
Internally AFAIK the processor just treats them as 80 bit-doubles. If they take 16 bytes, then this is far less efficient.
Thank you very much for your answers! They are greatly appreciated.
I have noticed something strange. My program seems to run much faster in Ubuntu than in Windows for some reason. Using the same settings my laptop with Ubuntu runs the program in 100 seconds, while my windows machine takes 189 seconds. To further test it I tried running it in Ubuntu in a virtual machine on my Windows machine, here it took 76 seconds. It is really strange since my Windows machine is by far the most powerful computer. The program is just a number-crunching console application, so it seems very weird to me. Any ideas for why this is the case?
Great information! It seems Electric Fence and Valgrind are Linux only however, and I would like it to run on Windows as well. Purify is way out of my budget. Do you know any tools that do the same as these but run on both Windows and Linux?
what compiler are you using on each OS? What command line options are used for each? Even if you're using the same compiler (i.e. gcc) on both, it's possible the defaults are set differently for each. That could mean that optimization is enabled in one case but not in the other.
gcc on both, -Wall and -O3 on both
Actually I did find one called DUMA. It's based on Electric Fence. Couldn't get it to compile though. I did try Valgrind on my Linux machine, and I must say this is a VERY nice tool. It helped me correct quite a few memory leaks. I think I will just memory debug in my Linux environment from now on.
Yeah, valgrind is cool, it catches leaks and access errors and is dead simple to use. At least to do that much -- I haven't tried to do anything else with it.
Be nice if it reported errors by line, like a debugger, instead of by function tho. Actually a debugger that integrated something like valgrind would be pretty nice.
==2419== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 1)
==2419== malloc/free: in use at exit: 0 bytes in 0 blocks.
==2419== malloc/free: 5,999,977 allocs, 5,999,977 frees, 155,999,984 bytes allocated.
==2419== For counts of detected errors, rerun with: -v
==2419== All heap blocks were freed -- no leaks are possible.
Joy....
C programming resources:
GNU C Function and Macro Index -- glibc reference manual
The C Book -- nice online learner guide
Current ISO draft standard
CCAN -- new CPAN like open source library repository
3 (different) GNU debugger tutorials: #1 -- #2 -- #3
cpwiki -- our wiki on sourceforge
By golly yer right!
==2483== Invalid read of size 8
==2483== at 0x401E19: cleartree (bayer-demo.c:78)
==2483== by 0x401F5D: main (bayer-demo.c:132)
==2483== Address 0x519d198 is 8 bytes inside a block of size 48 free'd
==2483== at 0x4C2509F: free (vg_replace_malloc.c:323)
==2483== by 0x401E14: cleartree (bayer-demo.c:77)
==2483== by 0x401F5D: main (bayer-demo.c:132)
C programming resources:
GNU C Function and Macro Index -- glibc reference manual
The C Book -- nice online learner guide
Current ISO draft standard
CCAN -- new CPAN like open source library repository
3 (different) GNU debugger tutorials: #1 -- #2 -- #3
cpwiki -- our wiki on sourceforge