Tuesday, March 21, 2006

Memory Usage with smaps

As most developers should know by now, the memory statistics given on Linux are mostly meaningless. We have the vmsize, which counts the total address space used by a process. However, this is not accurate because it counts pages that are not mapped into memory. We also have rss, which measures pages mapped into memory. However, it counts shared pages multiple times: every process gets X kb of rss due to libgtk, even though the majority of the pages in libgtk are shared across those processes.

What we really care about is the private rss: the number of pages our process maps into memory that are only used by our process. We'd also like to know the rss per mapping so that we can point-fingers/find-where-to-fix-the-bug.

Up until this point, such statistics have been hard to come by. No longer! The 2.6.16 kernel adds support for smaps, per-mapping data, including data on each mapping's rss usage. This data lives in /proc/$pid/smaps. However, the format of the smaps file is hard to digest. I've written a quick perl script which parses this into something more useful. It uses the Linux::Smaps module on CPAN.
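
For readers who would rather not install the Perl module, here is a minimal Python sketch of the same idea. It is not the script mentioned above. It assumes the smaps layout of a mapping header line followed by "Name:  N kB" lines, with the field names Size, Rss, Shared_Clean, Shared_Dirty, Private_Clean and Private_Dirty. The function names parse_smaps and summarize and the SAMPLE text are made up for illustration.

```python
import re

# A mapping header looks like:
#   08048000-080bc000 r-xp 00000000 03:02 13130      /bin/bash
HEADER = re.compile(
    r"^([0-9a-f]+)-([0-9a-f]+)\s+(\S+)\s+[0-9a-f]+\s+\S+\s+\d+\s*(.*)$"
)
# A statistics line looks like:
#   Private_Dirty:       24 kB
FIELD = re.compile(r"^(\w+):\s+(\d+)\s+kB$")


def parse_smaps(text):
    """Return one dict per mapping: permissions, file name, and kB counters."""
    mappings = []
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            mappings.append(
                {"perms": header.group(3), "file": header.group(4), "kb": {}}
            )
            continue
        field = FIELD.match(line.strip())
        if field and mappings:
            # Attach the counter to the most recent mapping header.
            mappings[-1]["kb"][field.group(1)] = int(field.group(2))
    return mappings


def summarize(mappings):
    """Add up the per-mapping counters into process-wide totals (in kB)."""
    def total(name):
        return sum(m["kb"].get(name, 0) for m in mappings)

    return {
        "vmsize": total("Size"),
        "rss": total("Rss"),
        "shared": total("Shared_Clean") + total("Shared_Dirty"),
        "private_clean": total("Private_Clean"),
        "private_dirty": total("Private_Dirty"),
    }


# Hypothetical two-mapping excerpt, used only to exercise the parser.
SAMPLE = """\
08048000-080bc000 r-xp 00000000 03:02 13130      /bin/bash
Size:               464 kB
Rss:                424 kB
Shared_Clean:       424 kB
Shared_Dirty:         0 kB
Private_Clean:        0 kB
Private_Dirty:        0 kB
080bc000-080c2000 rw-p 00074000 03:02 13130      /bin/bash
Size:                24 kB
Rss:                 24 kB
Shared_Clean:         0 kB
Shared_Dirty:         0 kB
Private_Clean:        0 kB
Private_Dirty:       24 kB
"""

print(summarize(parse_smaps(SAMPLE)))
```

To try it on a live process, pass the contents of /proc/$pid/smaps to parse_smaps instead of SAMPLE. Lines that are not in the "N kB" form are simply skipped.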

An example of the data generated by this script:

VMSIZE:      41132 kb
RSS:         23052 kb total
              9212 kb shared
                 0 kb private clean
             13840 kb private dirty
PRIVATE MAPPINGS
  vmsize   rss clean   rss dirty   file
12768 kb        0 kb    12616 kb   [heap]
  196 kb        0 kb      196 kb
  120 kb        0 kb       92 kb   /usr/lib/libaudiofile.so.0.0.2
  132 kb        0 kb       80 kb
   80 kb        0 kb       60 kb   [stack]
   48 kb        0 kb       48 kb   /usr/lib/libORBit-2.so.0.1.0
   40 kb        0 kb       40 kb   /usr/lib/libbonobo-2.so.0.0.0
   36 kb        0 kb       36 kb   /usr/lib/libgtk-x11-2.0.so.0.800.16
...
SHARED MAPPINGS
  vmsize   rss clean   rss dirty   file
 2848 kb     1596 kb        0 kb   /usr/lib/libgtk-x11-2.0.so.0.800.16
 1172 kb      624 kb        0 kb   /lib/tls/i686/cmov/libc-2.3.6.so
  488 kb      400 kb        0 kb   /usr/lib/libgdk-x11-2.0.so.0.800.16
  900 kb      396 kb        0 kb   /usr/lib/libX11.so.6.2.0
  524 kb      360 kb        0 kb   /usr/lib/libglib-2.0.so.0.1000.1

The vmsize and total rss size are the statistics that everyone is used to. The rss size is split into private and shared. The private rss is what could best be called the process's memory usage.

Below this, we give the per-mapping statistics for private mappings. Most of these are either from the heap (e.g., malloc'd data) or writable mappings in .so files (from the .data section). This output is especially helpful for diagnosing the second kind. After that, we give the same data for shared mappings (most of which are .so files, the executable code, etc.).
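
The two tables can be sketched roughly as follows. This is only a guess at the grouping, not the rule the script above uses: it assumes a mapping belongs in the private table when it has any private resident pages, and in the shared table otherwise. The names row and split_mappings are hypothetical, and the sample figures are borrowed from the output above with the unlisted counters assumed to be zero.

```python
def row(file, vmsize, shared_clean, shared_dirty, private_clean, private_dirty):
    """Build one per-mapping record; all sizes are in kB."""
    return {
        "file": file,
        "vmsize": vmsize,
        "shared_clean": shared_clean,
        "shared_dirty": shared_dirty,
        "private_clean": private_clean,
        "private_dirty": private_dirty,
    }


def private_kb(m):
    return m["private_clean"] + m["private_dirty"]


def shared_kb(m):
    return m["shared_clean"] + m["shared_dirty"]


def split_mappings(mappings):
    """Return (private, shared), each sorted by resident size, largest first.

    Assumption: any private resident page puts the mapping in the private table.
    """
    private = [m for m in mappings if private_kb(m) > 0]
    shared = [m for m in mappings if private_kb(m) == 0]
    private.sort(key=private_kb, reverse=True)
    shared.sort(key=shared_kb, reverse=True)
    return private, shared


# Illustrative records based on the example output; zeros are assumed.
SAMPLE = [
    row("/usr/lib/libaudiofile.so.0.0.2", 120, 0, 0, 0, 92),
    row("/lib/tls/i686/cmov/libc-2.3.6.so", 1172, 624, 0, 0, 0),
    row("[heap]", 12768, 0, 0, 0, 12616),
    row("/usr/lib/libgtk-x11-2.0.so.0.800.16", 2848, 1596, 0, 0, 0),
]

private, shared = split_mappings(SAMPLE)
for title, table, clean, dirty in (
    ("PRIVATE MAPPINGS", private, "private_clean", "private_dirty"),
    ("SHARED MAPPINGS", shared, "shared_clean", "shared_dirty"),
):
    print(title)
    for m in table:
        print(f"{m['vmsize']:>5} kb {m[clean]:>8} kb {m[dirty]:>8} kb   {m['file']}")
```

A writable .data mapping of a library lands in the private table while the same library's code lands in the shared table, which is why a library such as libgtk can show up in both.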

Call to action: I would love to see the following done:

  • Check out some of the libraries with large writable segments. An extreme example of this is libaudiofile. This library has 92 kb of dirty, private rss (isn't that naughty!). To make matters worse, the library is used by 22 processes on my GNOME 2.14 desktop. This is about 2 MB of memory! Let's figure out why this is happening and make the data const. Also, it might be wise to see if we could reduce the number of programs using this library. We should try to find any other instances of this.
  • Let's get some smaps-based data in gnome-system-monitor, and possibly more low-level tools as well.
  • We should look at tools like exmap to get per-page level rss info. This is useful for finding out things like which pages evolution uses from libgtk and why. We can use this tool to figure out what we can do for low-memory users: we can simulate high levels of swapping by allocating memory in a dummy process. We should then see what pages must be loaded from disk to use the desktop, evolution, etc.
  • It'd be great to set a community goal for 2.16. We will reduce the private rss used by all GNOME processes in setup X by Y MB. We should also take statistics from 2.14 to make sure that there are no memory usage regressions.
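
To hunt for other libraries like libaudiofile, one could add up the private dirty rss of each library across every process that maps it. The sketch below assumes the per-process data has already been collected as (path, private dirty kB) pairs. The function name dirty_by_library, the process ids, and the sample data are all hypothetical. The sample simply assumes 22 processes each paying the 92 kb mentioned above.

```python
from collections import defaultdict


def dirty_by_library(per_process):
    """Rank shared libraries by total private dirty rss across processes.

    per_process maps a pid to a list of (path, private_dirty_kb) pairs.
    Returns (path, total_kb, process_count) tuples, largest total first.
    """
    totals = defaultdict(int)
    users = defaultdict(set)
    for pid, mappings in per_process.items():
        for path, private_dirty_kb in mappings:
            # Only look at library mappings that actually cost private memory.
            if ".so" in path and private_dirty_kb > 0:
                totals[path] += private_dirty_kb
                users[path].add(pid)
    rows = [(path, totals[path], len(users[path])) for path in totals]
    return sorted(rows, key=lambda r: r[1], reverse=True)


# Hypothetical data: 22 processes, each with 92 kb of dirty libaudiofile
# pages; the first two also carry 36 kb from libgtk's writable segment.
SAMPLE = {
    pid: [("/usr/lib/libaudiofile.so.0.0.2", 92), ("[heap]", 1000)]
    for pid in range(1000, 1022)
}
SAMPLE[1000].append(("/usr/lib/libgtk-x11-2.0.so.0.800.16", 36))
SAMPLE[1001].append(("/usr/lib/libgtk-x11-2.0.so.0.800.16", 36))

for path, total_kb, count in dirty_by_library(SAMPLE):
    print(f"{total_kb:>6} kb in {count:>2} processes   {path}")
```

With these numbers libaudiofile comes out at 22 x 92 = 2024 kb, which is the "about 2 MB" figure from the first item.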

In other performance related news, somebody has finally gotten good statistics on Firefox's memory usage. It looks like Mozilla is leaking pixmaps when browsing with tabs. I think there are many people who would be made much happier if this could be fixed. This is really great data gathering, and I'd like to see more of it.

18 comments:

Pádraig Brady said...

Note smaps is only required since
the rmap VM was introduced in later 2.4 redhat kernels.
I pushed last year for smaps to be integrated so one could determine how much RAM was actually being used by a process.
At that time I also created a script to list processes by memory usage using the more accurate smaps method when available.

Eric Windisch said...

My question is how does this work for multiuser systems? Say, I have 50 users on a system running various applications, but generally Apache with FastCGI or Mono. I'd like to know how much memory each user is using.

What I've done is take ps_mem.py and changed the arguments to 'ps' such that it only shows the current user's processes. I've also written two "less than exact" scripts, using different methods, giving me different results.

What I'd like to know is, is memory being shared between users? If not, is ps_mem.py (modified only to show the current user), going to be accurate for my needs?

Diego Calleja said...

/proc/$PID/smaps has been there since 2.6.14 (November 2005)

Jason Allen said...

I threw together this patch last night. Got it down to 12kb of dirty rss

Pádraig Brady said...

ps_mem.py returns for each process
RSS + max shared mem in any instance

Now I'm pretty sure the kernel will account
for other users' shared pages in this.
So really ps_mem.py is only applicable for
looking at all processes on the system.

For example you can't add all the
results for a particular process
together from all users, as this will
count the shared mem multiple times and
result in overestimation.

Pádraig Brady said...

Oops the first paragraph of the previous post should have read:

ps_mem.py returns for each program:

sum(all RSS for process instances) +
max(shared mem for process instances)

Anonymous said...

http://bugzilla.gnome.org/show_bug.cgi?id=325288

Anonymous said...

http://bugzilla.gnome.org/show_bug.cgi?id=325288

Anonymous said...

It would also be very interesting to reduce the number of applications that link to libaudiofile (and to other libraries). A lot of applications do not actually use these libraries at all...

S. Chauveau said...

Considering that all shared memories are ok is too simple: 100KB shared by 20 processes is not the same as 100KB shared by 2 processes. I even suspect that a shared lib used by only one application will be listed as shared.

It would be nice if your script could indicate how many times each memory area is shared. That information could be used to weight each area while computing the resident memory usage per process.

Anonymous said...

what about applications that now load the full python stack only for plugins ?

Alex L said...

I agree with S. Chauveau. The private/shared distinction isn't ideal.

What if you count each page as pagesize / number of users of that page?

Anonymous said...

As far as I know, this is already supported by gnome-system-monitor

tf said...

BTW, are you aware of the exmap tool? It gives you shed-loads of details on how process memory is used (what is shared, what is not, how shared, what is writable, what is mapped, details for individual elf sections and symbols). See http://www.berthels.co.uk/exmap for the original application and http://projects.o-hand.com/#misc for a command line client for it.

pixelbeat said...

I've just updated the ps_mem.py script referenced in the first comment, to use the PSS kernel metric
provided since kernel 2.6.23.

www.pixelbeat.org/scripts/ps_mem.py

Indro said...

/proc/$PID/smaps on my kernel is empty. What kernel option do I need to enable to get these entries?
Thanks

vinayan said...

Eric, Could you share the ps_mem.py (modified only to show the current user) script ?

I do have a similar requirement and am in need of a script which finds the exact memory usage per user. Thanks.

Anonymous said...

The perl script does just what I need. Good job. Thanks,
marc magrans de abril