I need an array that must be all zeros at the start of each iteration. Is it better to free it at the end of each iteration and call calloc at the start of the next one, or is it better to run a for loop at the end to set it all to zero?
If calloc() gives you a correct starting state, then memset() will work just as well when you need to reset the array. In any other case, a loop is the best alternative.
There are some types for which setting all the bits to 0 is not appropriate, so in those cases, neither calloc() nor memset() would work.
I would be shocked if calloc is faster than even a hand-rolled loop. Allocation functions are among the heaviest of functions in the standard library. That's not to say you shouldn't use them, just that your inner loop may be less efficient than it could be if you're reallocating rather than reusing the block of memory.
"There are some types for which setting all the bits to 0 is not appropriate, so in those cases, neither calloc() nor memset() would work."
Yes. And those types are anything other than integers (which includes character types).
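To make the point above concrete, here is a minimal sketch of a portable reset for an array whose elements are not plain integers. The struct `sample` and the function `reset_samples` are hypothetical names invented for this illustration. The C standard only guarantees that all-bits-zero means 0 for integer types; for floating-point members and pointers it makes no such promise, so the loop assigns 0.0 and NULL explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element type: it has members whose zero value is not
   guaranteed by the C standard to be represented as all-bits-zero. */
struct sample {
    double value;
    const char *label;
};

/* Portable reset: assign the zero values explicitly instead of
   relying on memset() or calloc(). */
void reset_samples(struct sample *a, size_t n)
{
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        a[i].value = 0.0;
        a[i].label = NULL;
    }
}
```

On most common platforms memset() would produce the same result for this struct, but that is a property of the platform rather than something the language guarantees.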
Really? Considering that the integer value supplied to memset is first converted to unsigned character I find that kind of surprising.
and (from C11)
5.2.1 "A byte with all bits set to 0, called the null character, shall exist in the basic execution character set; it is used to terminate a character string"
5.2.1.2 "A byte with all bits zero shall be interpreted as a null character independent of shift state. Such a byte shall not occur as part of any other multibyte character."
7.1.9 "the null character shall have the code value zero"
Edit: The only thing I can find that might not be appropriate is the case of floating point numbers, where all bits zero is positive 0 and negative 0 requires the sign bit to be set. I hardly think this makes using memset() to set everything to all zeroes inappropriate, though. Looking through the standard, even aggregate types are OK, so I'm not sure what the "inappropriate" cases are.
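As a small illustration of the conversion to unsigned char mentioned in this post, the sketch below fills an array of unsigned int through memset() and returns the first element. The function name `fill_and_peek` is made up for this example. Because memset() writes the same byte into every position, a fill value of 0 gives elements equal to 0, while a fill value of 1 does not give elements equal to 1 whenever unsigned int is wider than one byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fill n elements of a with the byte byte_value (memset() converts it
   to unsigned char first) and return the first element afterwards. */
unsigned int fill_and_peek(unsigned int *a, size_t n, int byte_value)
{
    assert(a != NULL && n > 0);
    memset(a, byte_value, n * sizeof *a);
    return a[0];
}
```

This is why memset() is a convenient way to zero integer arrays but not a general way to set every element to some other value.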
Last edited by Hodor; 01-17-2016 at 08:04 AM.
I think you parsed Prelude's sentence in an unintended way. The integer types include the character types (well, technically char is the character type that is separate from the integer types, but given that it boils down to signed char or unsigned char despite being a different type, that is not important for this case).
Last edited by laserlight; 01-17-2016 at 08:39 AM.
In reply to telmo_d: For very large arrays calloc is probably faster. Systems often keep pools of zero-initialized memory in RAM in case you want to start using zero-initialized pages quickly. memset will usually have to write zeros over the whole array in memory, so it can't really be faster in the general case.
For a small array, memset is probably faster. For smaller requests, calloc probably just allocates a chunk and memsets it (glibc, for example, does this), and there is some extra work associated with allocating the memory in the first place.
In reply to prelude: I disagree; for very large arrays calloc is certainly faster. If you're not convinced, try it yourself. At least on my implementation it's faster.
The dynamic memory allocation functions have an overhead, but calling them the heaviest functions of the standard library is spurious.
Last edited by Veltas; 01-17-2016 at 10:01 AM.
I have yet to see a version of memset() that allocates on the heap at all. Individual implementations don't mean much, but I imagine glibc is one of the most widely used:
glibc: string/memset.c Source File - doxygen documentation | Fossies Dox
In the implementations that I have seen, memset() will try to fill the array in the biggest chunks it can, before filling the rest of it in byte-sized chunks. It is not that different from a loop.
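The chunked filling described above can be sketched roughly as follows. This is an illustration only, not the glibc code: `zero_fill_sketch` is a hypothetical name, the word size is assumed to be that of `uintptr_t`, and real memset() implementations are considerably more elaborate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical zero-fill that stores word-sized chunks where it can
   and falls back to single bytes at the unaligned head and the tail. */
void zero_fill_sketch(void *dst, size_t n)
{
    unsigned char *p = dst;
    const uintptr_t zero_word = 0;

    assert(dst != NULL || n == 0);

    /* Leading bytes, until p is aligned for a word-sized store. */
    while (n > 0 && ((uintptr_t)p % sizeof zero_word) != 0) {
        *p++ = 0;
        n--;
    }
    /* Bulk of the buffer, one word at a time. memcpy() of a fixed
       small size avoids aliasing problems and is typically compiled
       down to a single store. */
    while (n >= sizeof zero_word) {
        memcpy(p, &zero_word, sizeof zero_word);
        p += sizeof zero_word;
        n -= sizeof zero_word;
    }
    /* Trailing bytes that do not fill a whole word. */
    while (n > 0) {
        *p++ = 0;
        n--;
    }
}
```

Either way, every byte of the array is still written, which is the cost that a fresh allocation of already-zeroed pages can sometimes avoid.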
A large array may be better to reallocate, but I would say that it would have to be on the order of megabytes big in order to reap the benefits from smart OSes.
In reply to whiteflags: Who said that it did?
In reply to whiteflags: Yeah, well, I did say "very large". Although I'd say closer to kilobytes rather than megabytes, which isn't exactly a crazy size of array. I guess I'll test it.
The malloc/memset version and the calloc/free version below run in about the same time on my computer (64-bit i5/Linux). With the given settings:
calloc/free: 23.889s
malloc/memset: 23.877s
Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define SIZE 100000
#define REPS 100000

void proc(int *p) {
    int i;
    for (i = 0; i < SIZE; i++)
        p[i] = i;
}

int main() {
    int i;
    clock_t clk = clock();
#if 1 // 1 for this version; 0 for the else version
    for (i = 0; i < REPS; i++) {
        int *p = calloc(SIZE, sizeof *p);
        proc(p);
        free(p);
    }
#else
    int *p = malloc(SIZE * sizeof *p);
    for (i = 0; i < REPS; i++) {
        proc(p);
        memset(p, 0, SIZE * sizeof *p);
    }
    free(p);
#endif
    printf("%.6fs\n", (double)(clock() - clk)/CLOCKS_PER_SEC);
    return 0;
}
I did some tests, and the sizes at which calloc shows a performance gain start at about 30MB, so this really is for very large arrays only, but still well within the scope of consideration. EDIT: mind you, I was checking for mapping virtual memory, so this will depend slightly on the platform, and more importantly your implementation may have a zero pool near where the data segment is loaded anyway, so calloc may still work significantly better.
I was talking about calloc in that sentence, I've edited it to make that more clear, my bad.
Last edited by Veltas; 01-17-2016 at 10:10 AM.
Wow. Changing the parameters to make a quite big (about 40 meg) array shows calloc to be the clear winner on my machine.
calloc: 20.335s
memset: 43.124s
Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define SIZE 10000000
#define REPS 10000
#define STEP 10000   // "process" array by STEP
#define REPS_MOD 100 // print rep number every REPS_MOD reps

void proc(int *p) {
    int i;
    for (i = 0; i < SIZE; i += STEP)
        p[i] = i;
}

int main() {
    int i;
    clock_t clk = clock();
#if 0 // 1 for this version; 0 for the else version
    for (i = 0; i < REPS; i++) {
        // if (i % REPS_MOD == 0) printf("%d\n", i);
        int *p = calloc(SIZE, sizeof *p);
        proc(p);
        free(p);
    }
#else
    int *p = malloc(SIZE * sizeof *p);
    for (i = 0; i < REPS; i++) {
        // if (i % REPS_MOD == 0) printf("%d\n", i);
        proc(p);
        memset(p, 0, SIZE * sizeof *p);
    }
    free(p);
#endif
    printf("%.6fs\n", (double)(clock() - clk)/CLOCKS_PER_SEC);
    return 0;
}
Well, algorism's tests have done a much better job of demonstrating calloc's advantage than I managed to do, thanks!