Thread: A question about performance and free()

  1. #1
    Registered User
    Join Date
    Apr 2010
    Posts
    12

    A question about performance and free()

    While writing some code that is very heavy in calculations I came to think of 2 things today:

    1: Performance of raw calculation vs. table lookup
    Suppose I have a nested for loop:
    Code:
    for(i=0;i<i_max;i++)
    {
    for(j=0;j<j_max;j++)
    {
    a=some_array[j]*2;
    b=some_other_array[i];
    c[i]=a+b;
    }
    
    }
    Would it be more efficient to calculate all the a's first in som other loop, and simply accessing them in memory? This does require that I malloc some space for another array to contain the a's. I guess it is a waste to calculate the a's over and over again. How much faster is it to access values in memory rather than doing explicit calculations (assuming all numbers are doubles)? Of course there are several unknowns in this question (most notably i_max and j_max), but can anything be said in general? I'd just like a rule of thumb since I try to write my programs as efficient as possible.

    2. My second question is about the inner workings of free(). Suppose I have an array I don't need any more. Let's call it arr. The equivalence of arrays and pointers says that an array is the same as a pointer to the first element. Now when I call free(arr), how does free know when to stop? What prevents it from simply continuing to free up memory until it segfaults? Somehow free() ignores the equivalence between pointers to the first element and arrays. It always knows when the array ends.

    I hope I have made myself clear enough, if not just ask and I will try to elaborate.
    Last edited by The Physicist; 05-21-2010 at 11:14 AM.

  2. #2
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by The Physicist View Post
    Would it be more efficient to calculate all the a's first in som other loop, and simply accessing them in memory? This does require that I malloc some space for another array to contain the a's. I guess it is a waste to calculate the a's over and over again.
    Makes sense and is generally considered true, yep. This is also why if the middle condition of a for loop involves a function call (eg, strlen) but the result is not expected to change, you should put the result of the call into a variable first and use that value instead, since otherwise strlen is called for every iteration. Obviously the same applies to function calls inside the loop. Some beginners get it in their head that "optimization" means using less code and packing lots of ops into one clever line of code, but that is a bad rule.

    How much faster is it to access values in memory rather than doing explicit calculations (assuming all numbers are doubles)?
    Going by literally that particular loop, the calculations are very minor and probably not much more expensive (or even less) than a lookup (I'm guessing, I dunno any assembly). Doing an arithmetic operation on values already in processor registers is faster than fetching something from memory, so it depends on how much arithmetic you are doing. Maybe some one knows of a ratio here (eg, beyond two or three add/subtracts, the fetch is faster).

    2. My second question is about the inner workings of free(). Suppose I have an array I don't need any more. Let's call it arr. The equivalence of arrays and pointers says that an array is the same as a pointer to the first element. Now when I call free(arr), how does free know when to stop?
    Stop right there. You cannot free an array of that sort (try it). A pointer which is malloc'd memory for an array is a pointer, not "an array". The equivalence of arrays and pointers is meant to refer to this kind of array:
    Code:
    int array[32];
    This is stack memory and cannot be freed.
    Last edited by MK27; 05-21-2010 at 11:20 AM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  3. #3
    Making mistakes
    Join Date
    Dec 2008
    Posts
    476
    1. Table lookup is in general faster, but usually less precise (for floating-point tables) and requires more memory. If you want to calculate the table at runtime, only do this if you know that there will be more lookups than entries in the table.

    2. If you free a malloc'd array, free probably looks at some header before the array where the information on your array is stored (hidden). But if you free an actual (C) array, behavior is undefined. Free can't free arrays, only pointers (which may be arrays).

    the array and the pointer are no longer strictly equivalent
    For free, an array is a pointer outside of the heap, so any other pointer would result in similar behavior.

  4. #4
    Registered User
    Join Date
    Apr 2010
    Posts
    12
    Thank you for your replies! They have really been helpful.

    It seems I wasn't quite clear in my terminology in the second question. I know I cannot free an array such as int a[32], but what I really meant was an array like int *a=calloc(32,sizeof(int)). It still seems magical to me that once free reaches int[31] it simply stops freeing up memory instead of continuing. It is as if there is a table hidden somewhere that keeps account of the blocks that have been malloced during execution.

  5. #5
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by The Physicist View Post
    It is as if there is a table hidden somewhere that keeps account of the blocks that have been malloced during execution.
    Well, yeah. The runtime keeps track, for every call of malloc, how much memory was requested by that call so that free stops in the right place. (It has to not give you the same memory twice, it has to not give more memory than it has*, etc.)

    --
    * Actually my understanding of glibc is that it will happily give you more memory than exists on your system without complaint, until you try to actually use it all.

  6. #6
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by tabstop View Post
    * Actually my understanding of glibc is that it will happily give you more memory than exists on your system without complaint, until you try to actually use it all.
    This is a linux kernel option, you can control it with a switch at boot time (can't remember what it is tho).
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  7. #7
    Registered User
    Join Date
    Apr 2010
    Posts
    12
    Quote Originally Posted by Brafil View Post
    1. Table lookup is in general faster, but usually less precise (for floating-point tables) and requires more memory. If you want to calculate the table at runtime, only do this if you know that there will be more lookups than entries in the table.
    How can it be less precise if it is a lookup in a table? A double is still a double, regardless of calculation or table lookup?



    Is it possible to access this table of malloc calls during runtime? Seems like a neat way to find out how big your arrays are. The information is there, it just has to be found.

  8. #8
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by The Physicist View Post
    Is it possible to access this table of malloc calls during runtime? Seems like a neat way to find out how big your arrays are. The information is there, it just has to be found.
    Good point, but not as far as I know. You can do the same thing yourself fairly easily, of course (keep a count).
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  9. #9
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by The Physicist View Post
    Code:
    for (i=0; i<i_max; i++)
    {
        for (j=0; j<j_max; j++)
        {
            a = some_array[j] * 2;
            b = some_other_array[i];
            c[i] = a+b;
        }
    }
    FTFY!
    Unfortunately we cannot help you optimise your real code from that mock example. It's calculating each entry for the c array multiple times needlessly, and your real code almost certainly is not doing that. To show you what I mean, assuming that the final value of a and b are not required, this snippet has the same effect as the above one:
    Code:
    for (i=0; i<i_max; i++)
    {
        c[i] = some_array[j_max-1] * 2 + some_other_array[i];
    }
    Now if you cant perform the exact same optimisation on your real code, then you need to post the real code, so we can do some actual useful optimisations. Otherwise this discussion is moot.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  10. #10
    Making mistakes
    Join Date
    Dec 2008
    Posts
    476
    How can it be less precise if it is a lookup in a table? A double is still a double, regardless of calculation or table lookup?
    AFAIK double tables are mostly used for math calculation, so of course nobody can store all possible results in a table. I correct myself: for math applications, tables may be less precise if used for floating-point operations.

  11. #11
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by The Physicist View Post
    How can it be less precise if it is a lookup in a table? A double is still a double, regardless of calculation or table lookup?
    Nope. On many processors, including the x86, the internal precision of the wide floating point types is greater than the precision of a double. For example, a double in memory is 64 bits, but a double in the FPU is 80 bits. By storing intermediate results to memory you lose the 80 bit precision.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  12. #12
    Registered User
    Join Date
    Apr 2010
    Posts
    12
    Quote Originally Posted by MK27 View Post
    Good point, but not as far as I know. You can do the same thing yourself fairly easily, of course (keep a count).
    Yes it is fairly simple to keep track of the sizes of your arrays manually. However it would be neat, and perhaps also less error prone, to simply access the table. It's not a big thing, but I find it interesting nevertheless

    I guess a nice way to easily keep track of the sizes of your arrays would be to simply declare them as structs containing a pointer which has all the data and then constant values such as rows, columns... and so on to store the dimensions of the array. In this way you could also think of these structs as vectors, matrices or tensors in a physical terminology depending on the dimensions.

    Quote Originally Posted by brewbuck View Post
    Nope. On many processors, including the x86, the internal precision of the wide floating point types is greater than the precision of a double. For example, a double in memory is 64 bits, but a double in the FPU is 80 bits. By storing intermediate results to memory you lose the 80 bit precision.
    I didn't know this. This is a good thing to know, I will keep it in mind. Thank you very much.

    I have noticed that if I make an array like char* arr=calloc(8,1) it is some times possible to write to arr[8] without getting a segfault. I assume this is because the program has reserved other memory from an other calloc call, and that this memory lies at the end of the array. Is it possible to tell the compiler, for debugging purposes, to spread out all the malloc'ed memory, thus forcing the program to segfault when accidently writing to a value that is beyond the end of the array? Or is there an other clever way to get the compiler to notice whenever you write beyond the end of an array? I'm using gcc, but I assume that the answer will be the same for all the most common compilers.

  13. #13
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by The Physicist View Post
    I have noticed that if I make an array like char* arr=calloc(8,1) it is some times possible to write to arr[8] without getting a segfault. I assume this is because the program has reserved other memory from an other calloc call, and that this memory lies at the end of the array. Is it possible to tell the compiler, for debugging purposes, to spread out all the malloc'ed memory, thus forcing the program to segfault when accidently writing to a value that is beyond the end of the array? Or is there an other clever way to get the compiler to notice whenever you write beyond the end of an array? I'm using gcc, but I assume that the answer will be the same for all the most common compilers.
    Actually this is more like "there is no way the system is only going to give you eight bytes", assuming you have some sort of normal-ish system (not, say, a chip controlling a wristwatch). You probably get 1K or so at a time, although you really should be nice and not go over what you claimed you needed.

  14. #14
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by The Physicist View Post
    Or is there an other clever way to get the compiler to notice whenever you write beyond the end of an array?
    There are ways available in other languages, and you could implement one for C I suppose, just it is not part of the standard. C is usually considered an "unsafe" language in the sense that if you don't know what you are doing, there's not much to stop you from making a real mess. However, once you get used to it, the boundaries here make sense -- nothing is left impossible, just very few things are automatic. The assumption is that you can take care of yourself.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  15. #15
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by tabstop View Post
    Actually this is more like "there is no way the system is only going to give you eight bytes", assuming you have some sort of normal-ish system (not, say, a chip controlling a wristwatch). You probably get 1K or so at a time, although you really should be nice and not go over what you claimed you needed.
    The CRT doesn't waste that much memory. Typically the CRT gets memory from the system in large chunks (perhaps 4K) and then malloc further allocates smaller chunks from within this if you make requests for smaller allocations. The smallest chunk the CRT will give you is basically 8 bytes on Windows.
    This does not mean that you can assume that you have 8 bytes if you only asked for 2 say, of course.

    Strictly speaking, if you go past the size you asked for it is undefined behaviour. It might not crash for quite a while after the end, or it might crash if you go one over. There are ways of configuring Windows place allocations within your program in a place where overrun will typically be detected the moment going over the end (provided your element size isn't too small e.g. one byte), but they waste a ton of memory to do so. It's called "Page heap allocation", and it usually involves using gflags. Linux probably has something similar.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. free 64-bit C++ compiler on Windows
    By cyberfish in forum C++ Programming
    Replies: 20
    Last Post: 11-04-2008, 12:14 AM
  2. AVG Free 8.0 and false positives
    By VirtualAce in forum A Brief History of Cprogramming.com
    Replies: 10
    Last Post: 08-14-2008, 07:33 AM
  3. Best C compiler (free)
    By esbo in forum C Programming
    Replies: 11
    Last Post: 06-11-2007, 10:38 PM
  4. Custom Allocation
    By SevenThunders in forum C Programming
    Replies: 17
    Last Post: 04-09-2007, 04:10 PM
  5. Binary Search Trees Part III
    By Prelude in forum A Brief History of Cprogramming.com
    Replies: 16
    Last Post: 10-02-2004, 03:00 PM