I'm not sure what we can do to make sure every distro takes the actions necessary to get the best performance out of GNOME. I guess there's always the process of filing bugs.
Friday, March 31, 2006
Thursday, March 30, 2006
==24632== 74,477 bytes in 2,864 blocks are still reachable in loss record 4,829 of 4,830
==24632==    at 0x401C422: malloc (vg_replace_malloc.c:149)
==24632==    by 0x479FF36: g_malloc (in /usr/lib/libglib-2.0.so.0.1000.1)
==24632==    by 0x47AFC65: g_strdup (in /usr/lib/libglib-2.0.so.0.1000.1)
==24632==    by 0x42BAD04: ??? (gtkicontheme.c:2161)
==24632==    by 0x42BAF11: ??? (gtkicontheme.c:1057)
==24632==    by 0x42BBC5F: gtk_icon_theme_lookup_icon (gtkicontheme.c:1244)
==24632==    by 0x42BC108: gtk_icon_theme_load_icon (gtkicontheme.c:1388)
==24632==    by 0x42B74C1: gtk_icon_set_render_icon (gtkiconfactory.c:1748)
==24632==    by 0x43D0CAB: gtk_widget_render_icon (gtkwidget.c:5337)
There were other stack traces here, accounting for over 200 kb of memory. I tried regenerating my cache, but this had no effect. I'll have to see what is going on. Also, this stack trace was very curious:
==24632== 337,656 bytes in 94 blocks are still reachable in loss record 4,830 of 4,830
==24632==    at 0x401C422: malloc (vg_replace_malloc.c:149)
==24632==    by 0x494FC0C: (within /usr/lib/libfreetype.so.6.3.8)
==24632==    by 0x41B0E9A: (within /usr/lib/libpangoft2-1.0.so.0.1200.0)
==24632==    by 0x41B6974: (within /usr/lib/libpangoft2-1.0.so.0.1200.0)
==24632==    by 0x41B237A: (within /usr/lib/libpangoft2-1.0.so.0.1200.0)
==24632==    by 0x41B4DA1: (within /usr/lib/libpangoft2-1.0.so.0.1200.0)
==24632==    by 0x41B02D1: pango_ot_info_get_gpos (in /usr/lib/libpangoft2-1.0.so.0.1200.0)
==24632==    by 0x41B0357: (within /usr/lib/libpangoft2-1.0.so.0.1200.0)
==24632==    by 0x41B0407: pango_ot_info_find_script (in /usr/lib/libpangoft2-1.0.so.0.1200.0)
==24632==    by 0x4C162E9: (within /usr/lib/pango/1.5.0/modules/pango-basic-fc.so)
==24632==    by 0x45E4D12: (within /usr/lib/libpango-1.0.so.0.1200.0)
==24632==    by 0x45F3FB3: pango_shape (in /usr/lib/libpango-1.0.so.0.1200.0)
==24632==    by 0x45E7BCE: (within /usr/lib/libpango-1.0.so.0.1200.0)
==24632==    by 0x45EAD1A: (within /usr/lib/libpango-1.0.so.0.1200.0)
==24632==    by 0x45EB274: (within /usr/lib/libpango-1.0.so.0.1200.0)
==24632==    by 0x45EBBCB: (within /usr/lib/libpango-1.0.so.0.1200.0)
==24632==    by 0x42DA876: ??? (gtklabel.c:2027)
I wonder exactly what this is that needs 300 kb of memory.
Wednesday, March 22, 2006
const to lots of data tables. It trims the 92 kb of dirty private rss down to 8 kb, saving just under 2 MB desktop wide! Jason: this patch is great. Let's get it upstream, and also try to get the next round of distros (SUSE 10.1, Dapper, FC5) to include this.
There are quite a few other libraries that are used by almost every GNOME process that could benefit from such constification. Some I saw from gnome-terminal:
vmsize   rss clean   rss dirty   file
 48 kb        0 kb       48 kb   /usr/lib/libORBit-2.so.0.1.0
 40 kb        0 kb       40 kb   /usr/lib/libbonobo-2.so.0.0.0
 36 kb        0 kb       36 kb   /usr/lib/libgtk-x11-2.0.so.0.800.16
 36 kb        0 kb       28 kb   /usr/lib/libxml2.so.2.6.23
 24 kb        0 kb       24 kb   /usr/lib/libgnutls.so.12.3.6
 20 kb        0 kb       20 kb   /usr/lib/libasound.so.2.0.0
 20 kb        0 kb       20 kb   /usr/lib/libfontconfig.so.1.0.4
 20 kb        0 kb       20 kb   /usr/lib/libgnomevfs-2.so.0.1400.0
 20 kb        0 kb       16 kb   /usr/lib/libgcrypt.so.11.2.1
 16 kb        0 kb       16 kb   /usr/lib/libX11.so.6.2.0
 16 kb        0 kb       16 kb   /usr/lib/libgnomeui-2.so.0.1400.0
Since almost every GNOME process loads these libraries, fixing one of them multiplies the benefit by about 20.
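To make the multiplication concrete, here is a small Python sketch of the arithmetic. The libaudiofile figures (92 kb trimmed to 8 kb, 22 processes) are the ones quoted in these posts. The second calculation is only a ceiling: it assumes every dirty kb in the list above could be made const and that each library is loaded by 20 processes, neither of which I have verified. The function name is made up for illustration.

```python
def desktop_saving_kb(dirty_before_kb, dirty_after_kb, n_processes):
    """Private dirty rss saved across the whole desktop, in kb."""
    return (dirty_before_kb - dirty_after_kb) * n_processes


# libaudiofile: 92 kb -> 8 kb per process, loaded by 22 processes.
audiofile = desktop_saving_kb(92, 8, 22)
print(f"libaudiofile: {audiofile} kb = {audiofile / 1024:.2f} MB")

# Dirty private rss per library, copied from the gnome-terminal list.
dirty_kb = {
    "libORBit-2": 48, "libbonobo-2": 40, "libgtk-x11-2.0": 36,
    "libxml2": 28, "libgnutls": 24, "libasound": 20,
    "libfontconfig": 20, "libgnomevfs-2": 20, "libgcrypt": 16,
    "libX11": 16, "libgnomeui-2": 16,
}

# ASSUMPTION: all of it can be constified and 20 processes load each library.
ceiling = sum(desktop_saving_kb(kb, 0, 20) for kb in dirty_kb.values())
print(f"ceiling for the list: {ceiling} kb = {ceiling / 1024:.2f} MB")
```

The first line reproduces the "just under 2 MB" figure for libaudiofile. The second number is an upper bound, not a prediction: some of that dirty data (relocated pointers, genuinely mutable state) cannot simply be marked const.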
We should also consider reducing the number of processes on the desktop. For example, clock-applet takes up 2.7 MB of private dirty rss. 1.7 MB of this is the heap and stack; the other MB is the .data section of .so files. For the most part, these are constant costs we are going to experience with any process. Reducing the number of processes will reduce this problem.
Tuesday, March 21, 2006
What we really care about is the private rss: the pages our process has mapped into memory that are used only by our process. We'd also like to know the rss per mapping so that we can point-fingers/find-where-to-fix-the-bug.
Up until this point, such statistics have been hard to come by. No longer! The 2.6.16 kernel adds support for smaps, per-mapping data, including data on each mapping's rss usage. This data lives in /proc/$pid/smaps. However, the format of the smaps file is hard to digest. I've written a quick perl script which parses this into something more useful. It uses the Linux::Smaps module on CPAN.
An example of the data generated by this script:

VMSIZE:      41132 kb
RSS:         23052 kb total
              9212 kb shared
                 0 kb private clean
             13840 kb private dirty

PRIVATE MAPPINGS
    vmsize   rss clean   rss dirty   file
  12768 kb        0 kb    12616 kb   [heap]
    196 kb        0 kb      196 kb
    120 kb        0 kb       92 kb   /usr/lib/libaudiofile.so.0.0.2
    132 kb        0 kb       80 kb
     80 kb        0 kb       60 kb   [stack]
     48 kb        0 kb       48 kb   /usr/lib/libORBit-2.so.0.1.0
     40 kb        0 kb       40 kb   /usr/lib/libbonobo-2.so.0.0.0
     36 kb        0 kb       36 kb   /usr/lib/libgtk-x11-2.0.so.0.800.16
...

SHARED MAPPINGS
    vmsize   rss clean   rss dirty   file
   2848 kb     1596 kb        0 kb   /usr/lib/libgtk-x11-2.0.so.0.800.16
   1172 kb      624 kb        0 kb   /lib/tls/i686/cmov/libc-2.3.6.so
    488 kb      400 kb        0 kb   /usr/lib/libgdk-x11-2.0.so.0.800.16
    900 kb      396 kb        0 kb   /usr/lib/libX11.so.6.2.0
    524 kb      360 kb        0 kb   /usr/lib/libglib-2.0.so.0.1000.1

The vmsize and total rss size are the statistics that everyone is used to. The rss size is split into private and shared. The private rss is what could best be called the process' memory usage.
Below this, we give the per-mapping statistics for private mappings. Most of these are either from the heap (e.g., malloc'd data), or writable mappings in .so files (from the .data section). This output is especially helpful for diagnosing the second kind. After that, we give the same data for shared mappings (most of which are .so files, the executable code, etc).
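For anyone who wants to poke at smaps without Perl, here is a rough Python sketch of the same idea. It is not the script described above and does not use Linux::Smaps; the function names and the SAMPLE text are made up for illustration. It assumes the Size, Rss, Shared_Clean, Shared_Dirty, Private_Clean and Private_Dirty counters that smaps reports for each mapping.

```python
import re
from collections import defaultdict

# Mapping header, e.g. "08048000-08050000 r-xp 00000000 03:01 100 /usr/bin/demo"
HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s+\S+\s+\S+\s+\S+\s+\S+\s*(.*)$")
# Counter line, e.g. "Private_Dirty:        8 kB"
FIELD = re.compile(r"^(\w+):\s+(\d+) kB$")


def parse_smaps(text):
    """Return one (name, counters) pair per mapping; counters are in kB."""
    mappings = []
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            name = header.group(1).strip() or "[anon]"
            mappings.append((name, defaultdict(int)))
            continue
        field = FIELD.match(line.strip())
        if field and mappings:
            mappings[-1][1][field.group(1)] = int(field.group(2))
    return mappings


def summarize(mappings):
    """Split the total rss into shared, private clean and private dirty."""
    totals = {"vmsize": 0, "rss": 0, "shared": 0,
              "private_clean": 0, "private_dirty": 0}
    for _, c in mappings:
        totals["vmsize"] += c["Size"]
        totals["rss"] += c["Rss"]
        totals["shared"] += c["Shared_Clean"] + c["Shared_Dirty"]
        totals["private_clean"] += c["Private_Clean"]
        totals["private_dirty"] += c["Private_Dirty"]
    return totals


def dirtiest(mappings):
    """Mappings that have private dirty pages, largest first."""
    rows = [(c["Private_Dirty"], name) for name, c in mappings
            if c["Private_Dirty"]]
    return sorted(rows, reverse=True)


# Hand-written sample in smaps format (made-up numbers, not real output).
SAMPLE = """\
08048000-08050000 r-xp 00000000 03:01 100 /usr/bin/demo
Size:                32 kB
Rss:                 24 kB
Shared_Clean:        24 kB
Shared_Dirty:         0 kB
Private_Clean:        0 kB
Private_Dirty:        0 kB
08050000-08052000 rw-p 00008000 03:01 100 /usr/bin/demo
Size:                 8 kB
Rss:                  8 kB
Shared_Clean:         0 kB
Shared_Dirty:         0 kB
Private_Clean:        0 kB
Private_Dirty:        8 kB
09000000-09021000 rw-p 09000000 00:00 0 [heap]
Size:               132 kB
Rss:                100 kB
Shared_Clean:         0 kB
Shared_Dirty:         0 kB
Private_Clean:        0 kB
Private_Dirty:      100 kB
"""

maps = parse_smaps(SAMPLE)
print(summarize(maps))
for kb, name in dirtiest(maps):
    print(f"{kb:6d} kb  {name}")
```

To run it against a live process, read the text from /proc/$pid/smaps instead of SAMPLE. The dirtiest() list is the part that points at the .so files worth a closer look.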
Call to action: I would love to see the following done:
- Check out some of the libraries with large writable segments. An extreme example of this is libaudiofile. This library has 92 kb of dirty, private rss (isn't that naughty!). To make matters worse, the library is used by 22 processes on my GNOME 2.14 desktop. This is about 2 MB of memory! Let's figure out why this is happening and make the data const. Also, it might be wise to see if we could reduce the number of programs using this library. We should try to find any other instances of this.
- Let's get some smaps-based data in gnome-system-monitor, and possibly more low-level tools as well.
- We should look at tools like exmap to get per-page level rss info. This is useful for finding out things like what pages evolution uses from libgtk, and why. We can use this tool to figure out what we can do for low memory users: we can simulate high levels of swapping by allocating memory in a dummy process. We should then see what pages must be loaded from disk to use the desktop, evolution, etc.
- It'd be great to set a community goal for 2.16: "We will reduce the private rss used by all GNOME processes in setup X by Y MB." We should also take statistics from 2.14 to make sure that there are no memory usage regressions.
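The dummy process mentioned in the exmap item could be very small. The following Python sketch is one way to do it, not something I've measured with: it allocates a buffer and writes one byte to every page so the kernel has to back the whole thing with real memory. The names hog and hold are made up, and how much you need to allocate before the desktop starts swapping depends entirely on the machine.

```python
import mmap
import time

PAGE = mmap.PAGESIZE  # size of one page on this machine, in bytes


def hog(megabytes):
    """Allocate `megabytes` MB and dirty every page so it becomes resident."""
    buf = bytearray(megabytes * 1024 * 1024)
    for offset in range(0, len(buf), PAGE):
        buf[offset] = 1  # one write per page is enough to make it dirty
    return buf


def hold(megabytes, seconds):
    """Keep the memory allocated for `seconds`, then let it go."""
    buf = hog(megabytes)
    time.sleep(seconds)
    return len(buf)
```

Calling something like hold(512, 600) from a spare terminal keeps the memory pinned while you click around the desktop. The sizes here are placeholders; pick a value close to the machine's free memory.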
In other performance related news, somebody has finally gotten good statistics on Firefox's memory usage. It looks like Mozilla is leaking pixmaps when browsing with tabs. I think there are many people who would be made much happier if this could be fixed. This is really great data gathering, and I'd like to see more of it.