
Expert opinions wanted!



MK27
10-07-2011, 10:31 AM
This was inspired by a recent thread:

http://cboard.cprogramming.com/linux-programming/141549-trying-use-libgtop-examples.html

Cross-posted to linuxquestions.org: http://www.linuxquestions.org/questions/showthread.php?p=4492563#post4492563

In the spring I was involved with a web-based project involving some technology new to me and everyone else concerned, and our memory usage turned out to be significantly more than was predicted, so we needed to scrutinize what was responsible for what. To that end, I wrote a simple process monitor that could be used to log statistics for individual processes over a period of time (days or weeks). I.e., it is based on the proc filesystem, but it is not a re-invented "top".

Since then I've refined the tool and it seems to me worth sharing, so I sat down to write a manual page. One of the things I wanted to do with that was explain some of the familiar memory measurements (RSS, VSZ, etc) from proc in a concise and concrete way not available elsewhere (although there are many discussions of these things available via online searching). While the tool serves its purpose for me based on my current understanding of these things, I'm a stickler for accuracy and do not want to spread erroneous information, so I'm looking for feedback about the following draft:



VirtualSz
This is the total of all the memory regions (RAM and swap) used or
partially used by the process, including all the shared libraries,
which (generally) account for most of it. Hence, this is not a
measure of the load added by the process, since (generally) much of
those shared libraries are already in play.

ResidentSz
Resident Set Size (RSS). This is much closer to the load added by
the process, however, it still includes some memory shared (or
sharable) by other processes spawned from the same executable. If
this is the only instance running, it is an accurate measure of
how much memory the process has consumed.

Shared_Mem
This is not the total of all the shared libraries used, ie,
subtracting it from the virtual size will not give you the resident
size. It is a total of all the actual shared mappings (eg, those
parts of shared libraries actually in use).

Data+Stack
This is the bulk of the process's private, dynamically changing
memory.

Priv&Write
This is all the memory used by the process marked as private and
writable. The significance of this figure is that it is the
actual load added by the individual process if there are multiple
instances; pmap(1) will also report this figure.


The first 4 fields are directly from /proc/[pid]/statm, the fifth is calculated by parsing /proc/[pid]/maps. I've highlighted the bits that I'm most concerned about, since these are "facts" inferred by me from use, experimentation, existing documentation, etc, and for which I have been unable to find direct confirmation.
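As a rough illustration of where those statm figures come from, here is a minimal sketch that parses a statm-format file. The struct and function names (`struct statm`, `read_statm`, `pages_to_kib`) are made up for this example and are not taken from the tool; the field order follows proc(5), and all values are in pages.

```c
#include <stdio.h>
#include <unistd.h>

/* Fields of /proc/[pid]/statm, all measured in pages (see proc(5)).
 * The struct and function names are hypothetical, for illustration only. */
struct statm {
    unsigned long size;      /* total virtual size     (VirtualSz)  */
    unsigned long resident;  /* resident set size      (ResidentSz) */
    unsigned long shared;    /* resident shared pages  (Shared_Mem) */
    unsigned long text;      /* text (code)                         */
    unsigned long lib;       /* unused since Linux 2.6, always 0    */
    unsigned long data;      /* data + stack           (Data+Stack) */
    unsigned long dt;        /* unused since Linux 2.6, always 0    */
};

/* Read a statm-format file; returns 0 on success, -1 on failure. */
int read_statm(const char *path, struct statm *s)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    int n = fscanf(f, "%lu %lu %lu %lu %lu %lu %lu",
                   &s->size, &s->resident, &s->shared,
                   &s->text, &s->lib, &s->data, &s->dt);
    fclose(f);
    return n == 7 ? 0 : -1;
}

/* Convert a page count to KiB using the system page size. */
unsigned long pages_to_kib(unsigned long pages)
{
    return pages * ((unsigned long)sysconf(_SC_PAGESIZE) / 1024);
}
```

Calling `read_statm("/proc/self/statm", &s)` and then `pages_to_kib(s.resident)` would give the current process's RSS in KiB; substituting another pid in the path reads a different process, permissions allowing.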

I'm also curious as to whether Data+Stack includes heap memory; it seems to me it does. If so, I could say "including stack and heap".

anduril462
10-07-2011, 01:02 PM
>> I'm also curious as to whether Data+Stack includes heap memory; it seems to me it does. If so, I could say "including stack and heap".
I'm by no means an expert, but I think it does include the heap, as far as your stats are concerned. A quick overview of Linux memory layout:


+------------+
| Stack |
| | |
| V |
| |
|Shared libs |
| |
| ^ |
| | |
| Heap |
| Data |
| Text |
+------------+


I can't recall if the BSS section is included in the Data portion or not, but I think it is (it's just an efficient way to store uninitialized globals, which go in the data section).

The text and data sections are fixed at compile time, but the stack and heap and shared libs change during runtime. AFAIK, when the OS/loader allocates space for the heap in a process, it basically does so by setting the start of the heap to the first piece of memory after the data section, then extending the end of the data section, to give the allocator a place to play. I think that's why it ends up included in your stats for data+stack.

Salem
10-07-2011, 01:51 PM
Perhaps this site can help -> LinuxMM - linux-mm.org Wiki (http://linux-mm.org/LinuxMM)

Codeplug
10-07-2011, 03:35 PM
VirtualSz
>> or partially used by the process
I'm not sure that adds much to the definition. Perhaps re-word it all to use "allocated" instead of "used"?

ResidentSz
>> includes ... memory shared ... by other processes spawned from the same executable. If this is the only instance running, it is an accurate measure of how much memory the process has consumed.
Don't think that's accurate. RSS = resident in ram, or virtual pages mapped to physical pages. That may include shared libraries etc... Perhaps you meant: "if this is the only process in the entire OS". To me, it read like "if the only instance is this particular process".

Shared_Mem
Suggest something like: "virtual memory pages that are shared between this process and other processes: including code from shared libraries, pages still shared from a forked (but not exec'ed) process, and manually shared memory". You may want to also specify if this is rss only shared pages, or if it includes shared but not resident pages. Or if it includes shareable pages that currently aren't referenced by any other processes.

Data+Stack
You sure statm\data doesn't include shared data? Or that it's all rss pages? For example, the default stack size for a new thread may be 8MB, but the OS will only make resident those pages which are actually touched.

Priv&Write
>> The significance of this figure is that it is the actual load added by the individual process [even] if there are multiple instances
Does adding that "even" convey your original intention?

Have you considered hitting smaps, then falling back on maps if it isn't there?
Exploring: Memory Usage with smaps (http://bmaurer.blogspot.com/2006/03/memory-usage-with-smaps.html)

gg

MK27
10-08-2011, 09:15 AM
ResidentSz
>> includes ... memory shared ... by other processes spawned from the same executable. If this is the only instance running, it is an accurate measure of how much memory the process has consumed.
>> Don't think that's accurate. RSS = resident in ram, or virtual pages mapped to physical pages. That may include shared libraries etc...

You're right; that is inaccurate. Much of the figure reported in "shared" is included in the RSS.

This made me realize I should incorporate the distinction between virtual and physical addressing, since these are two different "totals" that the process contributes to. To be honest, I had thought all the stats were virtual, and that the kernel did not report anything to do with physical memory. Which made it impossible to understand "RSS" properly, lol.
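The virtual/physical distinction can be observed directly. The sketch below is only an illustration under stated assumptions: the names `self_size_and_rss`, `struct growth` and `measure_growth` are invented here, and it assumes a Linux system with /proc mounted. It maps an anonymous private region, then writes to it, and records how the first two statm fields (virtual size and RSS) move at each step.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std modes */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Read the first two statm fields (virtual size and RSS, in pages)
 * for the calling process. Returns 0 on success, -1 on failure. */
static int self_size_and_rss(unsigned long *size, unsigned long *rss)
{
    FILE *f = fopen("/proc/self/statm", "r");
    if (f == NULL)
        return -1;
    int n = fscanf(f, "%lu %lu", size, rss);
    fclose(f);
    return n == 2 ? 0 : -1;
}

/* Changes relative to the figures read before mapping, in pages. */
struct growth {
    long size_after_map;   /* virtual size growth right after mmap */
    long rss_after_map;    /* RSS growth right after mmap          */
    long rss_after_touch;  /* RSS growth after writing every page  */
};

/* Map `bytes` of anonymous private memory, write to all of it, and
 * record how the statm figures change. Returns 0 on success. */
int measure_growth(size_t bytes, struct growth *g)
{
    unsigned long s0, r0, s1, r1, s2, r2;
    if (self_size_and_rss(&s0, &r0) != 0)
        return -1;

    char *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int rc = self_size_and_rss(&s1, &r1);
    if (rc == 0) {
        memset(p, 1, bytes);   /* writing to the pages makes them resident */
        rc = self_size_and_rss(&s2, &r2);
    }
    munmap(p, bytes);
    if (rc != 0)
        return -1;

    g->size_after_map  = (long)s1 - (long)s0;
    g->rss_after_map   = (long)r1 - (long)r0;
    g->rss_after_touch = (long)r2 - (long)r0;
    return 0;
}
```

On a typical Linux system one would expect the virtual size to grow by the whole mapping immediately, while RSS only catches up once the pages are written to. The exact numbers depend on the kernel and allocator, so treat any single run as indicative only.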



Data+Stack
>> You sure statm\data doesn't include shared data? Or that it's all rss pages? For example, the default stack size for a new thread may be 8MB, but the OS will only make resident those pages which are actually touched.

Right again; it includes shared data, and it is a measure of virtual space. I think I may actually drop the data+stack figure and add something from smaps.



Priv&Write
>> The significance of this figure is that it is the actual load added by the individual process [even] if there are multiple instances
>> Does adding that "even" convey your original intention?

No, what I meant here is that this is memory which cannot be shared between processes, so if you have two instances running and kill one, this is the amount of virtual space you will free up.

The "priv" bit is maybe irrelevant; AFAICT the only memory that is not marked private has to do with hardware interfaces.
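For anyone wanting to reproduce a private-and-writable total, here is a minimal sketch of one way to compute it from a maps-format file. It is not the tool's actual code: the function name `private_writable_bytes` is hypothetical, and it assumes the usual line layout of `start-end perms offset dev inode pathname`, where the second permission character is `w` for writable and the fourth is `p` for private.

```c
#include <stdio.h>

/* Sum the sizes (in bytes) of all mappings in a maps-format file whose
 * permission string marks them writable ('w') and private ('p').
 * Hypothetical helper for illustration; returns -1 if the file cannot
 * be opened. Lines longer than the buffer are not handled specially. */
long long private_writable_bytes(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    char line[1024];
    long long total = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        unsigned long start, end;
        char perms[5];
        /* Each line begins "start-end perms", e.g. "00600000-00602000 rw-p". */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) == 3
            && perms[1] == 'w' && perms[3] == 'p')
            total += (long long)(end - start);
    }
    fclose(f);
    return total;
}
```

Passing "/proc/self/maps" (or the path for another pid) gives the total in bytes, which should be comparable to the writeable/private figure printed by pmap(1).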



>> Have you considered hitting smaps, then falling back on maps if it isn't there?
>> Exploring: Memory Usage with smaps (http://bmaurer.blogspot.com/2006/03/memory-usage-with-smaps.html)


That's a good article.

I'm going to keep the Priv&Write because this is the one that turned out to be most useful in the spring; the project was running under OpenVZ, and it was the best measure WRT the total system memory usage/limit there (OpenVZ is like a hypervisor).

But smaps also brought to my attention PSS (proportional set size), which is a newer metric: the process's private pages plus its share of each shared page, where a page shared by N processes counts as 1/N. Obviously, this fluctuates more than anything else (for idle or stable processes), since it changes whenever another process maps or unmaps a shared page. So I may do a "Proportional Resident" figure; that seems like the most accurate measure of how much real, physical memory use a process represents.
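A "Proportional Resident" figure could be obtained by summing the `Pss:` lines in smaps. The following is only a sketch of that idea, with a hypothetical function name (`smaps_total_kb`); it assumes each per-mapping field appears at the start of a line as `Name:   value kB`, which is how smaps is laid out.

```c
#include <stdio.h>
#include <string.h>

/* Sum one per-mapping field (e.g. "Pss" or "Rss") over an smaps-format
 * file. Hypothetical helper for illustration.
 * Returns the total in kB, or -1 if the file cannot be opened. */
long smaps_total_kb(const char *path, const char *field)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    size_t flen = strlen(field);
    char line[1024];
    long total = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        long kb;
        /* Match "<field>:" exactly at the start of the line, so that
         * "Pss" does not also pick up lines such as "SwapPss:". */
        if (strncmp(line, field, flen) == 0 && line[flen] == ':'
            && sscanf(line + flen + 1, "%ld", &kb) == 1)
            total += kb;
    }
    fclose(f);
    return total;
}
```

A return of -1 for "/proc/[pid]/smaps" is the point at which a caller could fall back on parsing maps, as suggested earlier in the thread.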



VirtualSz
This is the virtual address space available to the process; ie, it
is the maximum amount of memory that could be consumed. If the
process does not use all of this most of the time, it is not a
very good measure of the real load incurred.

ResidentSz
Resident Set Size (RSS). This is the amount of real physical
memory used by the process. However, it includes space shared by
other processes.

Shared_Mem
This is the total of all the shared virtual mappings.

Priv&Write
This is all the virtual memory marked as private and writable.
The significance of this figure is that it is the load added by
the individual process if it is one of multiple instances of the
same program; pmap(1) will also report this.


Thanks Codeplug :)

MK27
10-08-2011, 10:25 AM
>> Shared_Mem
>> This is the total of all the shared virtual mappings.



Couldn't edit because of the new one hour limit, but I believe this should be shared physical mappings. I'll do some stuff with smaps to verify that.

I should probably remove VirtualSz too since AFAICT that is immutable anyway.

MK27
10-08-2011, 12:29 PM
>> Couldn't edit because of the new one hour limit, but I believe this should be shared physical mappings.

They are.



>> I should probably remove VirtualSz too since AFAICT that is immutable anyway.

No. What am I thinking???

This one hour limit is really going to beef up my post count, lol.

MK27
10-08-2011, 12:49 PM
>> I'm by no means an expert, but I think it does include the heap, as far as your stats are concerned.

>> The text and data sections are fixed at compile time, but the stack and heap and shared libs change during runtime. AFAIK, when the OS/loader allocates space for the heap in a process, it basically does so by setting the start of the heap to the first piece of memory after the data section, then extending the end of the data section, to give the allocator a place to play. I think that's why it ends up included in your stats for data+stack.

For sure it is included, and thanks for the "why". This was easy to confirm:



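#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>   /* getpid() */

int main(void) {
    /* Print the PID so /proc/[pid]/statm can be read from another shell. */
    printf("%d\n", (int)getpid());
    getchar();   /* pause: note the statm figures before allocating */
    char *x = calloc(10 << 10, 1);   /* 10 KiB of zeroed heap memory */
    if (x == NULL)
        return 1;
    getchar();   /* pause: read statm again and compare */
    free(x);
    return 0;
}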


Some good stuff over at LQ (such as me redefining VirtualSz for the 3rd time), if anyone is interested in all this:

Opinions wanted for documentation regarding process memory usage (http://www.linuxquestions.org/questions/linux-general-1/opinions-wanted-for-documentation-regarding-process-memory-usage-906966/#post4493115)

MK27
11-01-2011, 10:14 AM
If anyone is interested in this, it's online now:

Plog (http://cognitivedissonance.ca/cogware/plog)