C Board  

Go Back   C Board > Platform Specific Boards > Linux Programming

Reply
 
LinkBack Thread Tools Display Modes
Old 09-04-2009, 09:09 AM   #1
Kung Fu Kitty
 
Angus's Avatar
 
Join Date: Oct 2008
Location: Montreal, Canada
Posts: 107
Lazy allocation and *alloc()

Recently I read somewhere that calloc() does lazy allocation, in that it doesn't allocate any real memory, but just virtual memory. Real memory pages are only allocated as writes to it are performed, and any reads performed on uninitialized memory returns 0, just like a sparse file. This made a lot of sense to me, because it seemed strange that ANSI would come up w/a whole new function to do the same thing that malloc() followed by bzero() could do, and besides, I've never found a use for calloc() in my entire career, so far. So I thought that this would be great, because if I have a method that writes a blob to a db, and if the blob column in that db can be up to 1Gb, then my method can lazily allocate a whole Gb, and even if my app only ever handles 10-byte blobs, the most memory that would be wasted is the better part of a page.

Since then I've tried to read up on this, and what I've read is that malloc() and calloc() allocate memory in the same way, and calloc() does actively initialize the memory. Furthermore, that lazy allocation I'm talking about is system-dependent, and if calloc() on a particular system does it, then so does malloc(). Can anyone confirm any of this for me? I'm mostly interested in the Linux kernel, since that seems to be all I program in these days.
Angus is offline   Reply With Quote
Old 09-04-2009, 09:16 AM   #2
subminimalist
 
MK27's Avatar
 
Join Date: Jul 2008
Location: NYC
Posts: 3,944
Quote:
Originally Posted by Angus View Post
This made a lot of sense to me, because it seemed strange that ANSI would come up w/a whole new function to do the same thing that malloc() followed by bzero() could do
Several ways to skin a cat. I do not think it is within the scope of the standard to define whether memory allocated should be "real" or "virtual", so unless you find an explicit statement to the contrary, that cannot be the reason for the two functions; calloc() is just a convenience.
__________________

Accuracy and integrity mean nothing if you don't make it past the censors...PYTHAGORAS
MK27 is offline   Reply With Quote
Old 09-04-2009, 09:24 AM   #3
Registered User
 
Join Date: Sep 2004
Location: California
Posts: 2,845
Linux uses an optimistic memory allocation method. It works the way you described in that malloc will return a valid address, but pages will only be allocated as they are used. This allows the OS to use the memory in other applications and I/O paging while it waits for the app that allocated the memory to use it.

There are some downsides to this approach though. Mainly, you cannot ever be sure that your memory allocation succeeded since malloc returns an address before it actually allocates the memory. If you attempt to use the memory when there is no space available, the OOM killer will simply kill your application (or another one on the system).

Note that this behavior is configurable, and can be turned off by running:
# echo 2 > /proc/sys/vm/overcommit_memory
In which case memory allocations will behave the way they do on Windows.
__________________
bit∙hub [bit-huhb] n. A source and destination for information.
bithub is offline   Reply With Quote
Old 09-04-2009, 09:36 AM   #4
and the hat of vanishing
 
Salem's Avatar
 
Join Date: Aug 2001
Location: The edge of the known universe
Posts: 21,214
> in that it doesn't allocate any real memory, but just virtual memory.
All modern OS's allocate virtual memory.

> Real memory pages are only allocated as writes to it are performed
The virtual pages become mapped if that's what you mean.

> This made a lot of sense to me, because it seemed strange that ANSI would come up w/a whole new function to do the same thing that malloc() followed by bzero() could do
calloc pre-dates ANSI.


> because if I have a method that writes a blob to a db, and if the blob column in that db can be up to 1Gb, then my method can lazily allocate a whole Gb,
All this presumes a particular OS (or kind of OS).

Since you mention Linux, you should be aware of the "OOM" issue. Linux will over-commit VM pages. So whilst your malloc may succeed, the kernel may subsequently fail to map a page of your allocation due to being out of memory.

So you shouldn't lazily alloc 1GB because you can't be bothered to figure out a better answer. The kernel still has to account for your request in some way, even if it doesn't immediately cause a great deal of activity (it isn't free).
__________________
If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
Up to 8Mb PlusNet broadband from only £5.99 a month!
Salem is offline   Reply With Quote
Old 09-08-2009, 07:21 AM   #5
Kung Fu Kitty
 
Angus's Avatar
 
Join Date: Oct 2008
Location: Montreal, Canada
Posts: 107
Quote:
Originally Posted by Salem View Post
>
So you shouldn't lazily alloc 1GB because you can't be bothered to figure out a better answer. The kernel still has to account for your request in some way, even if it doesn't immediately cause a great deal of activity (it isn't free).
Account for my request anymore than it would just, say, 10 bytes?
Angus is offline   Reply With Quote
Old 09-08-2009, 09:24 AM   #6
and the hat of vanishing
 
Salem's Avatar
 
Join Date: Aug 2001
Location: The edge of the known universe
Posts: 21,214
Well if you ask for 1GB, then you've used up an awful lot (of a finite number) of VM page table entries for no good reason.

If you use them, then that's fine.
But if you don't, nobody will be impressed by your code.
__________________
If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
Up to 8Mb PlusNet broadband from only £5.99 a month!
Salem is offline   Reply With Quote
Old 09-08-2009, 09:33 AM   #7
Kung Fu Kitty
 
Angus's Avatar
 
Join Date: Oct 2008
Location: Montreal, Canada
Posts: 107
Quote:
Originally Posted by Salem View Post
But if you don't, nobody will be impressed by your code.
Is that the only drawback? If I was in the racket to impress folk I probably would have chosen pop music instead.
Angus is offline   Reply With Quote
Old 09-08-2009, 09:51 AM   #8
Registered User
 
Join Date: Sep 2004
Location: California
Posts: 2,845
Quote:
Originally Posted by Angus View Post
Is that the only drawback? If I was in the racket to impress folk I probably would have chosen pop music instead.
The other drawback is that you are relying on the OS behavior instead of proper coding techniques. What happens if you ever want to port your application to a non-Linux platform? What if a Linux user has the overcommit behavior turned off?
__________________
bit∙hub [bit-huhb] n. A source and destination for information.
bithub is offline   Reply With Quote
Old 09-08-2009, 10:04 AM   #9
Kung Fu Kitty
 
Angus's Avatar
 
Join Date: Oct 2008
Location: Montreal, Canada
Posts: 107
Then
  1. The use of macros and #if #endif block should cure that
  2. A quick reading of /proc/sys/vm/overcommit_memory should tell me all I need to know
Angus is offline   Reply With Quote
Old 09-08-2009, 10:17 AM   #10
Registered User
 
Join Date: Sep 2004
Location: California
Posts: 2,845
Quote:
Originally Posted by Angus View Post
Then
  1. The use of macros and #if #endif block should cure that
  2. A quick reading of /proc/sys/vm/overcommit_memory should tell me all I need to know
Why don't you just allocate the amount of memory that you are using? This way you don't need to worry about this issue at all.
__________________
bit∙hub [bit-huhb] n. A source and destination for information.
bithub is offline   Reply With Quote
Old 09-08-2009, 10:45 AM   #11
and the hat of vanishing
 
Salem's Avatar
 
Join Date: Aug 2001
Location: The edge of the known universe
Posts: 21,214
I'm done wasting effort - doesn't want to learn, stuck with a single idea and can't change it.
__________________
If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
Up to 8Mb PlusNet broadband from only £5.99 a month!
Salem is offline   Reply With Quote
Old 09-08-2009, 10:53 AM   #12
Kung Fu Kitty
 
Angus's Avatar
 
Join Date: Oct 2008
Location: Montreal, Canada
Posts: 107
Quote:
Originally Posted by bithub View Post
Why don't you just allocate the amount of memory that you are using? This way you don't need to worry about this issue at all.
Because I don't always know how much that is, and I risk calling realloc() a lot

That's a good philosophy, Salem. You should go by it more often

Last edited by Angus; 09-08-2009 at 10:57 AM.
Angus is offline   Reply With Quote
Old 09-08-2009, 11:09 AM   #13
Registered User
 
Join Date: Sep 2004
Location: California
Posts: 2,845
Quote:
Because I don't always know how much that is, and I risk calling realloc() a lot
And what's wrong with calling realloc() a lot? Calling realloc() is exactly what you want to do if your memory usage is growing over time.
__________________
bit∙hub [bit-huhb] n. A source and destination for information.
bithub is offline   Reply With Quote
Old 09-08-2009, 11:35 AM   #14
Kung Fu Kitty
 
Angus's Avatar
 
Join Date: Oct 2008
Location: Montreal, Canada
Posts: 107
Quote:
Originally Posted by bithub View Post
And what's wrong with calling realloc() a lot? Calling realloc() is exactly what you want to do if your memory usage is growing over time.
Well, every time you call realloc() to increase the size of the block you run the risk of having to copy all the data in that block. If you can rely on lazy allocation, though, you'd only have to deal with the expense of the odd page fault.
Angus is offline   Reply With Quote
Old 09-08-2009, 12:15 PM   #15
Registered User
 
Join Date: Sep 2004
Location: California
Posts: 2,845
Quote:
Originally Posted by Angus View Post
Well, every time you call realloc() to increase the size of the block you run the risk of having to copy all the data in that block. If you can rely on lazy allocation, though, you'd only have to deal with the expense of the odd page fault.
Well if it helps you sleep better at night, the glibc implementation of realloc() on Linux uses the mremap() system call. It rarely has to copy the existing memory, and will usually just allocate new virtual pages that are contiguous with the existing memory.
__________________
bit∙hub [bit-huhb] n. A source and destination for information.
bithub is offline   Reply With Quote
Reply

Tags
calloc, lazy allocation, malloc

Thread Tools
Display Modes

Forum Jump

Similar Threads
Thread Thread Starter Forum Replies Last Post
Dynamic array Allocation deepak_j C Programming 3 08-17-2009 07:18 AM
dynamic allocation from 1 instead of zero cfdprogrammer C Programming 27 04-28-2009 08:21 AM
pointer to array with dynamic allocation cfdprogrammer C Programming 22 04-07-2009 09:56 AM
redundant allocation George2 C++ Programming 22 03-06-2008 06:43 PM
Dynamic memory allocation. HAssan C Programming 3 09-07-2006 05:04 PM


All times are GMT -6. The time now is 10:57 PM.


Powered by vBulletin® Version 3.8.1
Copyright ©2000 - 2009, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.3.0 RC2

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22