Wednesday, March 22, 2006

libaudiofile patch

Jason Allen commented on my last blog post with a patch to fix libaudiofile. This simple patch just adds const to lots of data tables. It trims the 92 kb of dirty private rss down to 8 kb, saving just under 2 mb desktop-wide! Jason: this patch is great. Let's get it upstream, and also try to get the next round of distros (suse 10.1, dapper, fc5) to include this.
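To illustrate the kind of change involved, here is a minimal sketch. The table name and contents are hypothetical and not taken from libaudiofile or from Jason's patch. An initialized global without const goes into the writable .data section, whose pages can end up as private dirty memory in each process; the const version contains no pointers, so the toolchain can place it in read-only .rodata, which is shared.

```c
#include <stddef.h>

/* Hypothetical lookup table (not from libaudiofile).
 * Without const, an initialized global like this lives in the
 * writable .data section of the shared library. */
short step_table_rw[] = { 7, 8, 9, 10, 11, 12, 13, 14 };

/* Same data marked const. It holds no pointers, so it needs no
 * relocation and can be placed in read-only .rodata. */
const short step_table_ro[] = { 7, 8, 9, 10, 11, 12, 13, 14 };

/* Code that only reads the table just needs a pointer-to-const. */
int table_sum(const short *table, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += table[i];
    return sum;
}
```

Running `objdump -h` on the compiled object should show the two tables in different sections, though the exact section names depend on the toolchain.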

There are quite a few other libraries that are used by almost every GNOME process that could benefit from such constification. Some I saw from gnome-terminal:

      48 kb        0 kb       48 kb   /usr/lib/libORBit-2.so.0.1.0
      40 kb        0 kb       40 kb   /usr/lib/libbonobo-2.so.0.0.0
      36 kb        0 kb       36 kb   /usr/lib/libgtk-x11-2.0.so.0.800.16
      36 kb        0 kb       28 kb   /usr/lib/libxml2.so.2.6.23
      24 kb        0 kb       24 kb   /usr/lib/libgnutls.so.12.3.6
      20 kb        0 kb       20 kb   /usr/lib/libasound.so.2.0.0
      20 kb        0 kb       20 kb   /usr/lib/libfontconfig.so.1.0.4
      20 kb        0 kb       20 kb   /usr/lib/libgnomevfs-2.so.0.1400.0
      20 kb        0 kb       16 kb   /usr/lib/libgcrypt.so.11.2.1
      16 kb        0 kb       16 kb   /usr/lib/libX11.so.6.2.0
      16 kb        0 kb       16 kb   /usr/lib/libgnomeui-2.so.0.1400.0

Fixing one of these libraries will have the benefit multiplied by about 20.

We should also consider reducing the number of processes on the desktop. For example, clock-applet takes up 2.7 MB of private dirty rss. 1.7 MB of this is the heap and stack, the other MB is the .data section of .so files. For the most part, these are constant costs we are going to experience with any process. Reducing the number of processes will reduce this problem.

10 comments:

Diego "Flameeyes" Pettenò said...

Although not named in the post ( ;) ), I've applied the patch to Gentoo's audiofile, hopefully there shouldn't be problems, but at least it's a good test.

I was trying to understand this whole thing better, though, and I'm actually failing to grok it completely... for example, fontconfig has some small-to-big tables not marked const, but the dirty rss doesn't seem to change at all even after recompiling them with const (although such a change should be preferred anyway, at least for the sake of doing things right ;) ).

Also, what's the difference between clean and dirty rss? Is there some definition, or is it just heuristics?

Federico Mena Quintero said...

Read-only mappings have only clean RSS. Dirty RSS means the part of the RSS that has been written to in a read-write mapping; it's the "still needs to be written to disk" part of copy-on-write pages.
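On Linux kernels that provide /proc/PID/smaps, the per-mapping Private_Dirty fields report this number directly. A minimal sketch that totals them follows; the function name is made up for illustration, and it assumes the usual "Private_Dirty: N kB" line format.

```c
#include <stdio.h>

/* Sum every "Private_Dirty: N kB" line in smaps-formatted text.
 * Returns the total in kB. The function name is hypothetical. */
long sum_private_dirty_kb(FILE *f)
{
    char line[512];
    long total = 0;
    long kb;

    while (fgets(line, sizeof line, f) != NULL) {
        /* Whitespace in the format matches any run of blanks. */
        if (sscanf(line, "Private_Dirty: %ld kB", &kb) == 1)
            total += kb;
    }
    return total;
}
```

To inspect a process, open its /proc/PID/smaps file with fopen() and pass the stream to this function.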

Stephane Chauveau said...

I studied that problem a while ago and I am afraid that the patch may not save as much as it could.

The problem is with data structures containing pointers to strings or to other objects.

For example, consider a global variable declared like this:

my_struct_t foo = {
1, 2 , 3 , 4 ,
"bar"
} ;

Making that variable const won't really help because, in a shared lib, "bar" is a relocatable address which is by definition non-constant, so the variable still has to live in memory that is written to when the library is loaded.

A possible solution would be to collect all constant strings in a single array and to refer to them by their constant index:

#define STRING_bar 2

const char * STRING_TABLE[] = {
"hello" ,
"gnome" ,
"bar" ,
"error"
} ;

my_struct_t foo = {
1, 2 , 3 , 4 ,
STRING_bar,
} ;

Or even better, concatenate all strings and produce symbolic constants giving the offset of each substring:

const char STRING[] = "hello\0gnome\0bar\0error\0" ;

#define STRING_hello 0
#define STRING_gnome 6
#define STRING_bar 12
#define STRING_error 16

And to access the string "bar", the program would have to do:

&STRING[STRING_bar]

This is painful to do manually, but writing a small utility to automate it should be trivial.
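Put together, the offset approach could look like the sketch below. The layout of my_struct_t and the helper my_struct_name() are assumptions made for illustration; the comment above does not define them. Because the struct stores an offset rather than a pointer, it needs no relocation and can be fully const.

```c
#include <string.h>

/* All strings packed into one read-only array. Adjacent literals are
 * concatenated and the compiler appends the final NUL. */
const char STRING[] =
    "hello\0"
    "gnome\0"
    "bar\0"
    "error";

/* Offset of each substring within STRING. */
enum {
    STRING_hello = 0,
    STRING_gnome = 6,
    STRING_bar   = 12,
    STRING_error = 16
};

/* Hypothetical layout: the name is stored as an offset, not a pointer. */
typedef struct {
    int a, b, c, d;
    unsigned short name;
} my_struct_t;

const my_struct_t foo = { 1, 2, 3, 4, STRING_bar };

/* Hypothetical helper: turn the stored offset back into a C string. */
const char *my_struct_name(const my_struct_t *s)
{
    return STRING + s->name;
}
```

The cost is one addition per string access, plus the need to keep the offsets in sync with the array, which is the part a generator script would handle.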

Stephane Chauveau said...

Are you sure about those 92K going down to 8K? I just applied the patch and I see that the rw segment (obtained with objdump -p libaudiofile.so) goes down by only 736 bytes.
By the way, my original version of libaudiofile 0.2.6 is only using 12K of non-constant data.

I wonder if you did not hit a bug I discovered a few months ago on debian. Some versions of gcc are putting constant data in the non-constant segment. I remember filing a bug (in debian) but I can't find it again.

Ben Maurer said...

I'm pretty sure about the writable section getting trimmed.

The issue with structs is true. However, there is still a lot of low hanging fruit...

Stephane Chauveau said...

Humm... Did you try to compile the lib with and without the patch using the same compiler?

Ben Maurer said...

Yeah, I rebuilt on Ubuntu using the deb source, so it should be the exact same as the distro version. This is gcc 4.

Maybe I'm using preload and this is changing things? Not sure.

Stephane Chauveau said...

The version of gcc you have locally may be different from the one used to build the official release.
In fact, this is exactly what happened when I discovered the problem a few months ago.
I was using one of the most recent gcc versions available on Debian but the server used to build the packages was using an older version. Those build servers have to be very stable so they are not upgraded very often.
I did not bother to insist because it was just a matter of time before the build server would be upgraded.

I am not so sure anymore that the problem was with gcc. It may have been with binutils or libtool.

Could you do a 'objdump -p /usr/lib//libaudiofile.so' and post the content of the 'Program Header' section?

This is what I get with the current debian version (0.2.6 i386):

Program Header:
    LOAD off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**12
         filesz 0x000204a6 memsz 0x000204a6 flags r-x
    LOAD off    0x000204c0 vaddr 0x000214c0 paddr 0x000214c0 align 2**12
         filesz 0x00002378 memsz 0x0000237c flags rw-
 DYNAMIC off    0x000225d4 vaddr 0x000235d4 paddr 0x000235d4 align 2**2
         filesz 0x000000d0 memsz 0x000000d0 flags rw-
   STACK off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**2
         filesz 0x00000000 memsz 0x00000000 flags rw-

As you can see, the rw LOAD segment has a memory size of only 0x0000237c (~9K).

Stephane Chauveau said...

Tracking the origin of each non-constant global variable is difficult from the .so file because most symbols are not exported.

This is a lot easier from the object files used to build the lib.

If you want I can make some scripts to extract that information from any build tree.

Ben Maurer said...

I'd really appreciate the scripts to get data on this (email me at bmaurer@andrew.cmu.edu about this...)

I'll try to gather more data on this over the weekend. For now, it's midnight and I have "real work" ;-)