Monday, December 04, 2006

Performance Tip of the Day: Script Tags are Blocking

Today I downloaded the fantastic Firebug extension. It has a mode that shows network activity.

I learned that when a page includes an external JavaScript file, the browser blocks rendering of the page until the request is done. I saved about 100ms on a few sites I run by moving the Google Analytics tracker to the bottom of the page (I'm not sure why it wasn't being cached; probably because I am on an SSL site).

Tuesday, November 28, 2006

Posting sensitive data in JSON

If you are using JSON in AJAX, make sure not to put sensitive data in the JSON feed. Because script tags don't follow the same-origin policy, a third-party site can include your feed with a script tag and read the data.
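A sketch of why this works (the callback name and feed contents here are hypothetical; in a browser the "simulated" line would be a <script src> tag pointing at the victim's feed):

```javascript
// Simulation of JSON hijacking. The private feed's response body is
// executable JavaScript like: gotFeed({"email": "victim@example.com"});
// Loaded via a <script> tag, it runs in the attacker's page, because
// script src requests are exempt from the same-origin policy.
let stolen = null;
function gotFeed(data) {   // attacker defines the callback first
  stolen = data;           // private feed contents, now in attacker hands
}
// In a real attack this would be something like:
//   <script src="https://victim.example/feed?alt=json-in-script&callback=gotFeed">
// Here we simulate the browser evaluating that cross-origin response:
const feedResponse = 'gotFeed({"email": "victim@example.com"})';
eval(feedResponse);
console.log(stolen.email); // "victim@example.com"
```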

Google's GData JSON feeds (which I blogged about earlier) had just such an issue: Google let you request your private calendar feed as a script-includable URL. If you use Google Calendar, take a look at that feed with the alt= part taken off. It likely has your email address, your full name, and possibly some sensitive events in it. Any site you visited could have requested that URL and scraped the data. Note that with more advanced techniques, it's possible to steal data even when the feed doesn't use a callback, i.e., bare array literals. See Jeremiah Grossman's blog.

Luckily, this was fixed relatively quickly after I reported it.

Saturday, November 25, 2006

DomBuilder + Functional Programming == Awesome

The DOM sucks. It's painfully slow to type document.createElement and document.createTextNode. One nice solution is DomBuilder, which lets you write:

 DIV({ id : "el_" + times, 'onclick' : 'alert("sdsdsd")' },
  STRONG({ 'class' : 'test' }, "Lovely"), " nodes! #" + times)
When using DomBuilder in a project of mine, I found that it couldn't handle data very well. I had a list of items and wanted to make a table; there's no easy way to do that with DomBuilder.

However, a bit of functional programming can save the day. Using Prototype, adding the following line of code to tagFunc gets a lot of mileage:

arguments = $A(arguments).flatten ().compact ();

What is this doing? First, we turn arguments into a real array so that we can handle it cleanly. Then we flatten any nested arrays (turning [a, [b, c]] into [a, b, c]) and compact away any null entries ([a, null, b, c] into [a, b, c]). What's the win? Now the library can handle data very elegantly:
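For readers without Prototype handy, here's roughly what those two calls do (a plain-JS sketch, not Prototype's actual implementation):

```javascript
// Rough stand-ins for Prototype's $A(arguments).flatten().compact():
function flatten(arr) {
  // [a, [b, c]] -> [a, b, c], recursively
  return arr.reduce(function (out, x) {
    return out.concat(Array.isArray(x) ? flatten(x) : [x]);
  }, []);
}
function compact(arr) {
  // [a, null, b] -> [a, b]
  return arr.filter(function (x) { return x !== null && x !== undefined; });
}
console.log(compact(flatten(["a", ["b", null, "c"]])).join(",")); // "a,b,c"
```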

var stocks = [{ name : "NOVL", price : 6.28 }, { name : "GOOG", price : 505.00 }];
document.body.appendChild($table (
   $tr ($th ("Name"), $th ("Price")),
   $ (function (stock) {
       return $tr ($td (, $td (stock.price.toString ()));
   })));

Note the use of map to handle each of the stocks. Without the flatten, this would not have worked. It's pretty easy to build up HTML from data this way.

Wednesday, November 22, 2006

Using GCal JSON to make a free/busy schedule

Lately, I seem to be getting lots of emails of the form "When are you free this week, I'd like to meet with you sometime". Each time I get this email, I have to go to my calendar, copy my appointments for the next week, and send it in a reply.

In an ideal world, I could just paste a link to my calendar in iCal format. Sadly, not enough people use a calendaring client for this to be reliable (and worse, many of the people I interact with use the horror that is Oracle Calendar, which doesn't really handle external iCal).

This week, Google added JSON output to their Google Calendar feeds. This allows me to make a pure-JavaScript solution to this problem. I created a bit of JavaScript code (here) which loads my calendar in JSON format and tells the other person when I'm busy.

It's nice to be able to show only a free/busy projection of my calendar (I don't want the world to know who I'm meeting with or where I am at every moment, and I also use the calendar as a place to dump event-related data, such as airline confirmation numbers). I also like that I only have to host a small static HTML page to do this. No figuring out where to put a PHP script, no SQL, just a bit of JavaScript.


Things left on my TODO list:

  • Handle multi-day events
  • Better date formatting (use day of week, month names, etc)
  • Combine events (If I'm busy from 10:30-11:30 and 11:30-12:30, I can just be busy between 10:30 and 12:30)
  • Not depend on prototype (or only take what I need)
  • Make it pretty
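The heart of the script can be sketched like this (a rough sketch: the feed/entry/gd$when field names are my assumptions about the GData JSON shape, and the real code also needs the script-tag loading and date handling):

```javascript
// Turn a calendar feed object into human-readable "busy" lines, showing
// only times -- no titles or locations. Field names are assumptions.
function freeBusy(feed) {
  var lines = [];
  (feed.feed.entry || []).forEach(function (entry) {
    (entry.gd$when || []).forEach(function (when) {
      lines.push("Busy from " + when.startTime + " to " + when.endTime);
    });
  });
  return lines;
}
// Illustration with a hand-made feed object:
var fake = { feed: { entry: [
  { gd$when: [{ startTime: "2006-11-22T10:30", endTime: "2006-11-22T11:30" }] }
] } };
console.log(freeBusy(fake)[0]); // "Busy from 2006-11-22T10:30 to 2006-11-22T11:30"
```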

Monday, November 13, 2006

Now that javac is open source...

Maybe somebody (me?) can finally make a patch for this issue:

[bmaurer@omega ~]$ cat
public class x {
        public static void main (String[] args) {
                System.out.println ("hello world");
        }
}
[bmaurer@omega ~]$ time javac

real    0m0.766s
user    0m0.604s
sys     0m0.040s

For the record, mcs has a time of:

[bmaurer@omega ~]$ time mcs x.cs

real    0m0.483s
user    0m0.440s
sys     0m0.024s

But Java is using a form of ahead-of-time compilation (they call it class data sharing) while my mcs is not.

Wednesday, November 01, 2006

Don't echo back plain text passwords

Today I found two nice little security issues on an e-commerce site I use. First, the site has a page that allows you to change passwords. The code on the page is of the form <input type="password" name="password" value="MY PASSWORD IN PLAIN TEXT">. Second, the site has some cross-site scripting issues. At the end of the day, it was drop-dead easy to phish for people's passwords. Yikes.

Never, ever, ever echo sensitive data back to the user. It makes an XSS attack really damaging (and is also bad if somebody leaves their computer unlocked).

Saturday, October 28, 2006

Finding Social Security Numbers with Google

Google has a helpful syntax, x..y, for searching web pages with numbers between x and y. This feature, combined with the stupidity of the general public, makes social security numbers findable: inurl:resume ssn * 10000..99999999.

Note that Google proactively protects users by forbidding the query 100000000..999999999 (the full SSN range). It also bans a variety of searches for credit card numbers. However, using a modified range still turns up socials, granted with some false positives.

Wednesday, October 18, 2006

Math.abs Returns a Negative Number

Math.abs (Math.Abs in C# :-) is one of those methods that should be pretty simple, right? Just take the value; if it's negative, return a positive version. Duh! Well, it's not that simple.

There are 2^32 possible values of int, each of which is positive, negative, or zero. There's only one bit pattern that is zero, namely 0x00000000. That means that either one bit pattern of ints doesn't represent an integer, or there is not a 1-1 mapping between positive and negative numbers. It turns out the second is the case. The odd one out is int.MinValue, which is 0x80000000 in hex.

What happens when you negate this number? In two's complement, -x = ~x + 1. Here ~x = 0x7fffffff, so ~x + 1 = 0x80000000. That means -int.MinValue == int.MinValue. Uh oh!

Enter Math.abs. What's the function to do when passed int.MinValue? Well, in C#, Math.Abs throws an OverflowException. Java, on the other hand, happily returns int.MinValue. Either way, the caller of the function probably wasn't expecting this.

I discovered this oddity when reading Effective Java (kindly provided to all Google engineers). The book talks about the code Math.abs (new Random ().nextInt ()) % n as a way to generate random numbers between 0 and n - 1. This code usually works, except when Random happens to return int.MinValue. In that case, abs returns a negative value, causing the modulus operator to return a negative number. Eeeek!

At this point, I realized that for the data structures course I TA, many students used a similar piece of code in a hashtable they wrote. I wrote a quick script to check how many people screwed up on int.MinValue. Most students did.

I wondered how much this occurred in production code. Luckily, I had one of the best test cases at my fingertips: Google's internal "grep" tool (the inspiration for the new Code Search). I made a quick regex to find where this occurred. There were many instances.

Now that Google has released an external version of this tool, you can see some of the places this anti-pattern is used in the real world:

Each of these snippets is a time bomb. Roughly one in 2^32 executions, the function will do something strange.

Having found this issue in Google's source code, I emailed somebody inside Google who had been running FindBugs internally. He in turn talked to Bill Pugh, who added a detector. I added a test case to the test suite for the homework assignment at CMU, so students will have to handle this case. All in all, a very obscure bug.

So, what should you do rather than Math.abs? I've seen primarily two buggy patterns:

  • Hashtables: People want to find which bucket to put an object with hashcode x in. The best way to do this is (o.GetHashCode () & 0x7fffffff) % table.Length. This has the advantage of being faster than Math.abs (no branches).
  • Random numbers: People want to say Math.Abs (new Random ().Next ()) % N. Not only is this buggy in the rare case, it can also produce a bad distribution of numbers for large N. Use the Next (int, int) overload, which lets you specify bounds.
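The wraparound and the hashtable fix can both be demonstrated in JavaScript, whose bitwise operators work on 32-bit two's-complement ints (this mimics Java/C# int behavior; it's an illustration, not either library's implementation):

```javascript
// JS bitwise ops truncate to int32, so |0 reproduces the overflow.
var INT_MIN = -2147483648;                  // 0x80000000
console.log(((-INT_MIN) | 0) === INT_MIN);  // true: -INT_MIN wraps to INT_MIN

// Buggy bucket index: negative one time in 2^32.
function badBucket(hash, n)  { return (Math.abs(hash) | 0) % n; }
// Safe bucket index: clear the sign bit instead of taking abs.
function goodBucket(hash, n) { return (hash & 0x7fffffff) % n; }

console.log(badBucket(INT_MIN, 7));  // -2 -- a negative bucket!
console.log(goodBucket(INT_MIN, 7)); // 0 -- never negative
```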

Saturday, October 07, 2006

Edgy Memory Usage

I just installed the Ubuntu Edgy beta on my laptop. I'm impressed by the memory usage. On startup, it's using 87mb of ram. WOW.

In system monitor, there's one sore point: this system-tools-backends thingy launches perl at startup, which requires 9 mb of writable memory. What the hell is this thing?

Whatever the program does, it clearly does not have a reputation for performance. Ryan "kill wakeups" Lortie reported that the program made 20 wakeups per second.

Filed a bug

Friday, October 06, 2006

Mono Summit; New Google Products

Mono Summit: I'm going to be at the Mono summit in Boston (thanks Mig!). I look forward to seeing lots of Mono hackers (and having some very good ice cream :-)

New Google Products: In the past month or so, Google has had a pretty crazy schedule of product releases.

  • Blogger Beta: Google finally made lots of improvements to Blogger, giving it tags, etc. Also, it was moved over to Google Accounts, killing yet another password. Most importantly: spell checking!
  • Google Reader: Completely new version. This thing is amazing, it really works the way I'd want a blog reader to work. The old Google Reader had really poor performance on Linux, and this one seems to fix it. I only have two issues with the program. First, it does not auto refresh feeds, so I need to remember to hit "r" to refresh things. Also, the server side seems to have less than stellar performance. I'm betting they did such a fantastic job that they got more users than they had server power.
  • Google Transit: Being a college student, I use public transportation (it's free in the sense that I am forced to pay a flat fee for the entire year). Pittsburgh's public transit does not publish schedules in a usable format. Transit completely fixed this for me.
  • Groups Beta: Google did a refresh of groups. The UI is much better, and they now have a wiki-like functionality and the ability to upload files (100mb of storage -- that's not too bad). Like Google Reader, the servers aren't quite as fast as they could be. I assume this will be fixed when it's out of beta
  • Code Search: global grep. No horrible interface like koders or krugle.
  • Image Labeler: This is a version of a game by Luis von Ahn (a professor here at CMU) that lets people label images. Google needs to do serious work on the game play (their images are far too small, likely due to copyright paranoia, and they aren't using taboo words correctly). However, if you look at the high scores, there are people who have been playing this thing for days. Imagine what would happen if Google were to improve the game play experience and put a link on the home page for one day.

I have to say, this is a pretty amazing array of products.

Friday, July 28, 2006

Constructive Finger Pointing

Ryan Lortie made a very good post on the waste of power due to timer use. In the past, I have made many similar posts about how memory is wasted. I think these types of blogs provide good insight to the developer community. It's a way to say "wow, this is really a problem". I'd like to see the process automated.

On the Google Intranet, there is a status dashboard. The dashboard is sorted by latency, slowest first. So, if your service is slow, you get highlighted on the home page, with a big red (!) next to your project. I think this is a good incentive to make services fast

I wonder if the same thing can be done for GNOME. We could gather data about which components are sucking and put it in a high profile place (the planet would be a good one). Some metrics are easy to get (modules with the most unreviewed bugs). In time, I think GNOME and distributions should build tools to get other types of data: "what applications are taking up memory", "which apps are segfaulting", "which apps are abusing timers".

Luis von Ahn Talks at Google

Luis von Ahn, inventor of CAPTCHAs and the ESP Game, gave a talk at Google yesterday. This talk is worth listening to (would I blog it otherwise?). Luis describes how, by using people's spare time, all images on the web could be labeled in months (or even weeks). This talk is non-theory-person friendly (no math). It's also quite funny (Luis' lectures are even better). (direct link if you can't see the embedded version below)

Thursday, July 27, 2006

Google Code Hosting

Google Code hosting is very interesting. While, at first glance, the site really doesn't compare to SourceForge, I think it's a compelling offering:

  • The Google Code issue tracker is very elegant. The tagging system is much, much smarter than what Bugzilla does.
  • It will scale. Period. It's interesting to note that some of the biggest investment went into making Subversion very scalable (see this interview). Anyone who has ever used SourceForge knows that the performance of their version control is... less than stellar :-).
  • Possibly the most interesting thing is how little infrastructure will be needed to make this service usable; it can just connect to what Google already has. Google Pages, GMail, Google Groups, Blogger, and Writely all provide services that elsewhere would just be one-off hacks. The future of online interaction seems to be combining tools that each do one thing and do it right into a powerful system. For example, I'm currently trying to convince the people running 15-211 at CMU (the Data Structures course which I TA) to use the combination of Blogger + Google Groups + Google Calendar rather than Blackboard, which is a content management / community system gone wrong. To create an integrated experience, I wrote a "portal" (read: 200-line hack) using the ATOM feeds provided by all three services. It's going to be really interesting to see what a company with so many offerings can do for open source.

Wednesday, July 19, 2006

Yahoo Portal

I saw that Yahoo released an ajaxish portal today. I tried it, to see if they've improved at all. Let me say, I'm shocked these guys are still around. The site is horrible. First, the home page is filled with ads. I mean the moving, flashing, distracting ads that are so 1999. In addition, the home page has some text ads offering me a range of services I really don't need (Vonage -- no thanks, I don't use the phone that much, domain name registration -- maybe, but does everyone need to see this, "Degrees in as fast as 1 year" -- thanks, I go to a school with reputation, "What’s your credit score 560? 678? 720? - See it free." -- this might as well be in my spam folder).

The featured items on the page are completely irrelevant to most people. In the prime location on the page, I'm offered a contest to "Design Janet Jackson's new album cover". The Yahoo Pulse tells me that the number one "Top Guilty Pleasures Ringtones" is "PYT Pretty Young..." by Michael Jackson.

Well, at least the page has a search box. The text box is focused by default (good!), so I can get going right away. Let's give Yahoo a hard search: linq, the C# 3.0 object/relational mapping technology. A typical search? Not really. But I want a search engine that finds things that are hard to find.

It turns out that Yahoo and Google have about the same search results. However, the difference is in the ads.

Each page has more ads, but for advertisers the top three are the most important spots, so I'm just going to look at those. The Yahoo ads in positions 1 and 3 offer me some obscure products that happen to be named "linq". Yahoo ad 2 links to a site with no connection whatsoever to linq. Compare this to the Google ads, all of which might be relevant to somebody looking at O/R mapping in .NET. Let's just say Google is getting lots more revenue from its ads.

Ok, let's give Yahoo a break. I'll try an easy query: "restaurant". Yahoo highlights restaurant results in Pittsburgh (right now, I'm in CA). Now, I know I've used Yahoo's FareChase to find flights from PIT to SFO, but my IP address should very clearly tell Yahoo where I am. All of Yahoo's ads are for restaurant supplies. Google doesn't try to highlight local restaurants (to do that I have to say "restaurant near mountain view ca"), but its ads are geo-targeted, giving local restaurants (not restaurant suppliers).

Well, I don't think I'll be changing my home page any time soon. There are some things I did like about Yahoo's design (I really like that you can change from Web to Image search without the page being refreshed. Google should totally copy that idea). However, it's pretty clear why Google has so many users.

Tuesday, July 04, 2006

Today in performance

I started off today by using massif and the traditional memcheck to see where memory was allocated in the libgnomeui stack

  • We use libgnutls to handle SSL in gnome-vfs. This library mallocs 65 kb of memory in its initializer (which is called from gnome-vfs's initializer). I sent off an email to the address their website told me to use for bugs (no bugzilla!). If you are interested in fixing this on the gnutls side, the code to look at is gnutls_global_init, specifically the two calls to asn1_array2tree. In the meantime, I think we should fix this in GNOME by lazily initializing the TLS library; SSL is a pretty rare use case. On my (basically empty) desktop, 18 processes are using gnome-vfs. That makes 1.1 MB (probably more, counting malloc overhead).
  • Noticed lots of allocations from inside glibc when calling setlocale (which the gnome option parser does). Turns out there is a glibc cache for this stuff, but Ubuntu wasn't packaging it. Filed a bug. I have 40 processes using the cache right now (even bash uses it!). Not using the cache costs about 70kb. That's 2.7 mb.

I then went on to look a bit at gedit, which I noticed was taking up a pretty high amount of memory

  • First sign of trouble: gedit loads Python. Why? By default, a plugin called "modelines" is enabled, which describes itself as "Emacs, Kate and Vim-style modelines support for gedit". I'm not sure exactly what that does, or how I'd use it. Disabling just that plugin gives me back 3.2 mb of ram. It also made gedit feel faster to start up. Filed a bug suggesting that it either be disabled or rewritten in C.

    It looks like python+gtk could use some memory optimization. For startup, a hello world in Python has 4 mb of private dirty RSS. Compare this to Mono and Gtk#, which take only 2.7 mb. Granted, even Mono is large compared to 608 kb for a C-based GTK app. I'll see what I can do about that some weekend :-).

  • Ubuntu's launchpad-integration library was taking up quite a bit of memory by allocating pixbufs. strace -eopen gave the issue away:

    open("/usr/share/pixmaps/lpi-help.png", O_RDONLY|O_LARGEFILE) = 17
    open("/usr/share/pixmaps/lpi-translate.png", O_RDONLY|O_LARGEFILE) = 17
    open("/usr/share/pixmaps/lpi-help.png", O_RDONLY|O_LARGEFILE) = 17
    open("/usr/share/pixmaps/lpi-translate.png", O_RDONLY|O_LARGEFILE) = 17
    ... (78 lines of this)

    Whoops :-). Filed a bug. Not sure how much this saves, as it's not easy to count how many times this code is used.

So, that's about 7 mb of memory from all these issues. My estimates are fairly conservative (I'm rounding things down, not counting malloc overhead, and looking at my desktop with just xchat, gaim, firefox, gedit, and a few terminals), so I'd actually expect the total effect of these to be 8-10 mb. Even with 1 gb of ram, that's pretty large.

Monday, July 03, 2006

gnome cups icon leak

The printer status icon (gnome-cups-icon) leaks quite a bit. There is a Debian and a GNOME bug. It looks like there has been no maintainer attention to the bug since it was filed. Now that there's a patch, it'd be great to see some action. It looks like this is a fairly visible leak.

Sunday, July 02, 2006

Why does libgnomeui cost so much

So, after measuring the benefit of removing libgnomeui from a program, I thought I might dig deeper into the cause of the bloat. I first made three hello world style applications: one used plain GTK, another used GTK but also initialized GNOME VFS, the last used libgnomeui (which also uses vfs). Each of these three programs loads a superset of the libraries of those before it. I used smaps to gather data about the heap space allocated (used by malloc) and the writable mappings due to shared libraries.

Some observations to make here:

  • Malloced memory causes the most trouble at the gtk level. However, gnome-vfs and libgnomeui are still responsible for quite a bit of mallocing.
  • libgnomevfs is the worst offender with respect to loading libraries.
  • libgnomevfs is a much larger jump in memory than libgnomeui

I then dug further into what libraries were being loaded by vfs and gnomeui. To get useful data here, I excluded the libraries loaded by gtk from consideration when looking at vfs and similarly excluded libraries loaded by vfs when looking at gnomeui.

  • Bonobo, ORBit...ugh
  • libgnutls, libxml2, and libgcrypt have quite a bit of writable memory. If they could be cut to 4 kb each, we'd save 50kb for each process with gnomevfs.
  • The "Other" category has all the 4 kb libs. A few worth special mention: avahi loads three .so files. First, having avahi here at all seems a bit silly; second, it's three separate libraries. Also, libpopt is used; isn't there something in glib for this now?
  • Maybe all those sound related libs should be dynamically loaded. Not many apps use sound!

Investigation to do

  • Look at the malloced memory. Valgrind is a good tool here
  • Look at the size of the writable memory in libraries mentioned here

Saturday, July 01, 2006

Kill libgnomeui

libgnomeui would be better named "libkitchensink". It brings in all kinds of libraries, from avahi to zlib. How much effect would removing this dependency have on memory? I decided to try it out on gnome-volume-manager (which handles volumes as in mounts, not as in sound). Hackishly, I commented out the session-management code that requires libgnomeui. The results were pretty good.

That's 800kb of memory (I'm using the private dirty RSS number, aka "the number that matters"). There are 17 processes on my desktop using libgnomeui right now. If we can remove the dependency from all of those, it would get us 13 MB of savings. Besides the memory savings, this would likely improve startup speed quite a bit. FYI, the list of processes using libgnomeui right now is:


I'm quite sure that we can kill the library from many of those. If one looks for GNOME VFS (which is responsible for quite a bit of the bloat), there are even more processes that could use dependency pruning.

Thursday, June 29, 2006


notification-daemon displays those "you've got 10 minutes of battery left" dialogs. To launch it, you send a D-Bus signal; a file in /usr/share/dbus-1/services launches the service when the D-Bus interface is called.

An interesting thing I noticed today: if I kill the notification-daemon process, I can still get messages. The process just gets relaunched by dbus. Why the hell doesn't the process quit when no notifications are active? This would save 3.1 MB of memory on my system.

Filed here.

Google Checkout

Google Checkout was released today. In the great Slashdot tradition, the story is sold as "Google is releasing an X killer", where X is a product that does something similar but has a totally different goal (Google Spreadsheets as the "Excel killer", etc.).

Two things about Google checkout are very interesting to me. First, the ability for Google to provide the user with some sort of confidence in the checkout process. I never really liked Froogle all that much because it'd point you at some random store that just happened to have a website.

The second thing I like is that Checkout offers an anonymous email service. The ability to shut off email from a seller if they get out of hand is very, very nice. (as I understand it, it also offers the ability to forward email for the inevitable email address change. However, it seems the main functionality is being anonymous).

Of course, there's always a down side to things. I'm quite sure this will cause an increase in phishing emails for Google accounts. With that said, it looks like Google is using smart measures to make sure a stolen password won't be of much use. Also, I'd think that gmail users would be especially safe from phishing (when GMail team gets a phishing report, I'd assume they can retroactively filter it).

The checkout support site also has some (basic) information on what the anti-fraud team does. I can assure you, the process is much more complex than that site makes it sound :-).

Saturday, June 17, 2006

Powered by Google Maps

While Google can't find you a date, with the combined power of Maps API and somebody who has way too much time on his hands, you can tell the world what happened (and where it happened) after you got one.

Friday, June 16, 2006

onmousedown and performance

While checking my email on my corporate GMail account today, I noticed that when I click a button in GMail (e.g., Inbox, or a message), the action starts when I press the mouse button down, rather than when I release it to complete the click. This is not the typical behavior of a site (try it on any other hyperlink).

After pondering for a while, I realized that this must be a performance thing. It takes maybe 50-100ms for a person to complete a click, and pinging from here (home, on a wireless connection) takes about 70ms. The head start gained by capturing onmousedown must be enough to hide most of the delay in fetching stuff.

With a bit of further investigation, I found that Google Search uses the same trick. If you look at the HTML for a typical query, you will see onmousedown="return rwt(.... rwt appears to be a function that sends a request to a tracking URL with a query string recording which result was clicked. This would make sense for collecting data about whether the best results really are in the top position. It's genius that they slip this tracking into onmousedown so that it doesn't affect perceived performance.
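The general pattern can be sketched like this (attachEarlyTracking and the /log endpoint are made-up names for illustration; rwt is Google's own, unpublished function):

```javascript
// Attach tracking on mousedown so the request overlaps the 50-100ms it
// takes a person to finish the click. sendBeacon here is just a
// caller-supplied function (a real page might set an Image's src to
// fire the request without blocking anything).
function attachEarlyTracking(link, sendBeacon) {
  link.onmousedown = function () {
    sendBeacon("/log?href=" + encodeURIComponent(link.href));
    return true; // don't interfere with the normal click
  };
}
// Illustration with plain objects instead of real DOM nodes:
var sent = [];
var link = { href: "http://example.com/result" };
attachEarlyTracking(link, function (url) { sent.push(url); });
link.onmousedown();
console.log(sent[0]); // "/log?href=http%3A%2F%2Fexample.com%2Fresult"
```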

Monday, June 12, 2006

Bad situation #153

You print out the design for something you need to use. The first line is:
Status: CURRENT (as of November 19, 2003)

It's time to start grepping the source code for what I need.

Monday, May 22, 2006

Please don't me-too on bugs I blog

On one of the Launchpad bugs I blogged, there have been a few me-too posts saying "Yeah, this problem sucks. Let's fix it." Please do not do this. When I blog about a bug, I have two goals. First, it allows people reading my blog to get an idea of what issues are being looked into on the performance front. If you are interested in our progress, you should feel free to subscribe to the bugs. Subscribing can be viewed as a vote for the bug. Second, I'd like to highlight some of the easy wins to the developer community. Commenting on a bug saying "I wish you would fix this" helps nobody and increases the time people have to spend on bugzilla. Please don't do it, or else I'll have to stop blogging bugs.

Sunday, May 21, 2006

Performance: The Good, The Bad, And The Ugly

The Good

  • The latest Dapper betas are starting up GNOME with less than 95 MB of ram. This is really impressive. Everyone involved in this should feel really good about it.
  • I think we're making good progress in terms of startup memory usage in general. In the next round of distros, I'm willing to bet we can bring the GNOME startup cost to about 85 MB. This will result in startup time reductions for everyone (even those with 1 GB of ram) and will be very important for low-memory users.

The Bad

  • My box wastes 3.4 mb of private dirty ram to load hpssd, some sort of HP printing thing. I don't have an HP printer. Even if I did, why does 3.4 mb of stuff need to be loaded? This piece of software simply needs fixing. There's already a bug on this in Launchpad.
  • Evolution sees fit to launch evolution-exchange-storage just because the Exchange plugin is installed. This wastes 1.6 mb of private dirty ram. I personally think distributions should remove evolution-exchange from the default install set until this is fixed. Exchange users are likely using managed distros; those people can install the plugin along with their other configurations. I filed a bug for Ubuntu to remove this from the default install. I also filed an Evolution bug.
  • The way Ubuntu uses gettext to translate gnome-panel items causes 1.2 MB of memory to be used.

The Ugly

  • We load way too many processes at startup: nm-applet, gnome-volume-manager, gnome-power-manager, etc. All of these load much of the same stuff into their address space. I think we need some shared infrastructure for desktop plugins that listen for some sort of event and then take action on it. These are all small, simple tasks that should live in one process.
  • Firefox, Evolution, and OpenOffice are still taking much more memory than they should. We may be reaching the point of diminishing returns in desktop startup memory. We will need to turn our focus to these apps.

Saturday, May 20, 2006

Life as a Noogler

It's the end of my first week at Google. It was a fairly mundane week. The first day, I set a password, got a Google badge, and filled out forms (they have a very nice automated system that made filling out 16 forms a quick job). The food is, simply put, fantastic. A typical Google lunch would be a $20 dinner elsewhere. The menu at Charlie's, Google's primary eating establishment, has a wide variety of selections. The snacks are just as good. It's like I'm living in Whole Foods.

There are a few things I'm getting used to at Google; it's really different from the world I'm used to. The oddest thing is how hush-hush everything is. Every meeting starts out with "this is confidential". Sure, at Novell I had to deal with some things that were under NDA; at Google, you presume all knowledge is under NDA by default. Second, everyone at Google is new, or so it seems. Whenever there's a social event, a common exercise is for people to raise their hand if they are new, and a large number of people do. The introductory training sessions are huge. Google is growing at a rapid rate, and it's really exciting.

One of the most exciting parts of being at Google is the fun stuff to explore. Interns have access to almost all of the Google code base (interns fall into a bucket called nonconfs who have a few restrictions, such as not being able to see the pagerank code). It's hard to resist the urge to explore code when there is work to be done.

In other news: GStreamer was loading too many plugins in each process. This caused a few extra MB of private dirty memory for gstreamer using programs, like the settings daemon and mixer_applet2. This should be fixed now.

ps noogler == new googler

Friday, May 12, 2006

Google Trends

Google Trends is a really cool service. In case you don't read Slashdot, it gives you stats on who searches for what. Some are rather interesting, for example, you can find out what the popular Linux distros are in different parts of the world.

[Google Trends charts comparing searches for suse, fedora, ubuntu, and debian: around the world, in the US, and in Germany]

Google Trends also has great tastes in education.

[Google Trends chart comparing searches for Carnegie Mellon University and Massachusetts Institute of Technology]

Tuesday, May 09, 2006

Track down your first leak!

Now that gnome-system-monitor gives useful information, I've been keeping it open quite a while (in the process view). There is pretty clearly a slow memory leak: the heap grows just from keeping it open. You need it open over a period of a few hours to notice the leak; I first saw it after leaving it on overnight.

Want an easy way to get your hands dirty with performance? This is your chance. I've filed the bug as #341175. Here is the plan of action:

  1. Reproduce the bug, make sure you can see it. I'm using GNOME 2.14, as it is packaged on Dapper.
  2. Run g-s-m under valgrind. I like to use the following flags: valgrind --num-callers=20 --leak-check=yes --show-reachable=yes --leak-resolution=high gnome-system-monitor. This will generate a large dump file, because it shows stuff that's still reachable. However, this will help you if g-s-m retains pointers to memory but just doesn't use it.
  3. Figure out who's to blame
  4. Patch it

If you complete any of these steps, please update your progress in the bug report. Again, this is a great chance for somebody new to performance work to get involved. If you need help, go to #performance on IRC, and one of the performance hackers will help you.

Speaking of performance masters, Ubuntu users should thank Sebastien Bacher. The fair seb128 checked in a patch that puts many of the gnome-panel applets in process. This saves a good amount of RAM. Thanks Sebastien!

USB Wireless Help

If you have experience using a USB wireless adapter on Linux, I'd really appreciate an email. I've been having quite a bit of trouble finding something supported. Many thanks!

Thanks to those who responded. Responses I got, by chipset: prism, atmel (reported to require some hacking to get working), Ralink rt2570. One person recommended the SWEEX USB WLAN LC100010, saying it worked out of the box. Again, thanks for the quick help!

Monday, May 08, 2006

Breaking news: g-s-m now gives useful information

Tired of guessing what 20 MB of RSS really means? Want statistics that aren't lies? Wait no longer. With a simple patch, gnome-system-monitor version 2.14.2 gives useful information when combined with a modern kernel (in FC5, Ubuntu Dapper, or SUSE 10.1).

This new statistic is the Writable Memory column in gnome-system-monitor. You may have to modify your preferences to expose it. This column gives you the private dirty RSS, the memory statistic I've talked about in previous posts: the amount of memory that is private to your process, modified from the on-disk version, and loaded into memory. The number is a very good indication of how much memory you are using.

Note how the Writable Memory column is less than the traditionally used RSS. This is because shared RSS is not taken into account.

I'd like to propose that we make One True Statistic for g-s-m: writable memory + X memory. This is a very accurate accounting of memory usage. I think it's as good as we can get without further kernel patches.

Sunday, April 09, 2006

Fighting Daemons

One of the things I love about GNOME, and Linux in general, is the philosophy of "do one thing, and do it right". However, we may have taken this to an extreme in the GNOME community. The GNOME desktop has separate daemons for a plethora of simple tasks. nm-applet sits around and waits for the network connection status to change. Great, I love nm-applet. But does this task warrant 2.7 MB of memory? According to my smem script, that's the amount of private dirty RSS the task takes.

To initialize the GTK+ framework (among others), a process must allocate a certain amount of memory on the heap. Depending on which parts of the stack (dbus, libgnomeui, etc.) one loads, this ranges from 1-3 MB. While it's important to work to fix this (e.g., by mmap()ing things), there's only so much we can do.

I believe it would be beneficial to have a host process for daemons like this. The system would need to be set up in such a way so that mini-daemons could be put into a different process via configuration (eg, for debugging).

One type of daemon on the desktop is the panel applet. We have a process to display the clock, the volume switcher, etc. The panel already has a framework for running these in process. What would be the benefit of using it? To find out, I put two applets in process: wnck-applet, which handles the workspace switcher and task list, and notification-area-applet, which handles the notification tray.

I actually already had the patch for this. openSUSE already uses a similar patch thanks to Federico. I decided to modify the patch so that it would not put clock-applet in process. Some people have complained that because this links to evo, it might be at risk for crashing. Fine, whatever. Let's start with some low hanging fruit.

Here are the results for private dirty rss, before and after the patch.

                             before     after
gnome-panel                  4008 kb    4248 kb
notification-area-applet     1368 kb    --
wnck-applet                  3056 kb    --
TOTAL                        8432 kb    4248 kb

This is over 4 MB saved, just for putting two things in process! There are three other applets on my Dapper computer that could benefit just as easily: trash-applet (trivial), mixer applet (see below), clock applet (links to evo stuff. maybe?). I hope other distros follow the lead of openSUSE in this area.

One thing to look at: the mixer applet loads every single part of gstreamer. This causes extra memory usage (gstreamer plugins badly need constification), as well as extra spinning of the disk. Can this be fixed?

Saturday, April 08, 2006


I will be interning at Google this summer in the AdWords anti-fraud team. This is a really exciting opportunity for me.

I'd like to publicly thank my friends and colleagues on the Mono project for the fantastic learning experience I've had over the past three years. There are a few people I'd like to give an extra-special thanks:

  • Sebastien, for guiding me through my first project, BigInteger.
  • Atsushi, for guidance on the project I'm probably most proud of, the XSLT engine.
  • Paolo, for code reviews that have begun to teach me the meaning of "good style" (and I admit, I'm still learning).
  • Miguel, for believing in me and being a great mentor/friend/boss.
I'll still work with Mono on the side and stay in touch via IRC and email.

If you are a college (or high school) student reading this, I would like to give you one piece of advice: get involved in an open source project. Even the world's best computer science schools are very bad at teaching you what you really need to know to get work done. CS classes teach theory, not how to write good code. Hacking on open source software is the best way to become a strong programmer, and open source projects are more than happy to take on a dedicated hacker, even one who makes mistakes at first. It will be one of the best decisions you ever make, and your chances of getting an internship are greatly increased by working on OSS.

As a side note: for those of you who missed my previous blog because pgo wasn't picking up my feed, I'm going to GUADEC. Be prepared to kill bloat.

(I promise performance blogs are coming soon. Really)

Thursday, April 06, 2006


  • I'm going to GUADEC, now that I am being sponsored. Be prepared to kill bloat with the Ben, Federico, Behdad team. If you're interested in memory reduction, it'd be great to have you.
  • With leadership from Daniel Holbach, Ubuntu will be using the icon cache. Yay.
  • I promise for some more juicy memory reduction content soon, after I finish 15-251 (Discrete Math) homework.

Sunday, April 02, 2006

More Icon Cache

It's too bad that we are at a standstill on using the GTK+ icon cache in Ubuntu/Debian. Clearly, both Fedora and SUSE have implemented the policy defined in the standard and have (to my knowledge) not encountered major issues.

While the impressive number of over 300 packages quoted on the Ubuntu bug seems like a very hard task, distributed across many contributors it shouldn't be a blocking issue. As for third-party apps: sure, the change could cause a third-party app to break. But the same app would already be broken on Fedora and SUSE.

Even if the Debian folks absolutely do not want to risk breaking something, how about at least using the shared-memory aspect of the cache? You can stat every directory in /usr/share/icons to make sure the cache is valid. If it is, you'll save 300 kb per process by using the cache.

Anyways, it looks like the spec is staying as is. The folks upstream have given (IMHO) a well reasoned argument that the spec is implementable and reasonable.

Friday, March 31, 2006

Make sure to use the Icon Cache

It's sad when people take great efforts to optimize the memory usage of a program, only to have the optimization not used. It seems like, due to a packaging bug, Ubuntu isn't taking advantage of GTK's icon cache, wasting 300 kb per process.

I'm not sure what we can do to make sure every distro takes the actions necessary to get the best performance out of GNOME; I guess there's always the process of filing bugs.

Thursday, March 30, 2006

Zenity Memory Usage: Valgrind

So, on the latest Dapper beta, I tried running valgrind on an instance of Zenity, a fairly good example of a minimal GTK+ program. I noticed that stuff was getting allocated for the icon data:

==24632== 74,477 bytes in 2,864 blocks are still reachable in loss record 4,829 of 4,830
==24632==    at 0x401C422: malloc (vg_replace_malloc.c:149)
==24632==    by 0x479FF36: g_malloc (in /usr/lib/
==24632==    by 0x47AFC65: g_strdup (in /usr/lib/
==24632==    by 0x42BAD04: ??? (gtkicontheme.c:2161)
==24632==    by 0x42BAF11: ??? (gtkicontheme.c:1057)
==24632==    by 0x42BBC5F: gtk_icon_theme_lookup_icon (gtkicontheme.c:1244)
==24632==    by 0x42BC108: gtk_icon_theme_load_icon (gtkicontheme.c:1388)
==24632==    by 0x42B74C1: gtk_icon_set_render_icon (gtkiconfactory.c:1748)
==24632==    by 0x43D0CAB: gtk_widget_render_icon (gtkwidget.c:5337)

There were other stack traces here, accounting for over 200 kb of memory. I tried regenerating my cache, but this had no effect; I'll have to see what is going on. Also, this stack trace was very curious:

==24632== 337,656 bytes in 94 blocks are still reachable in loss record 4,830 of 4,830
==24632==    at 0x401C422: malloc (vg_replace_malloc.c:149)
==24632==    by 0x494FC0C: (within /usr/lib/
==24632==    by 0x41B0E9A: (within /usr/lib/
==24632==    by 0x41B6974: (within /usr/lib/
==24632==    by 0x41B237A: (within /usr/lib/
==24632==    by 0x41B4DA1: (within /usr/lib/
==24632==    by 0x41B02D1: pango_ot_info_get_gpos (in /usr/lib/
==24632==    by 0x41B0357: (within /usr/lib/
==24632==    by 0x41B0407: pango_ot_info_find_script (in /usr/lib/
==24632==    by 0x4C162E9: (within /usr/lib/pango/1.5.0/modules/
==24632==    by 0x45E4D12: (within /usr/lib/
==24632==    by 0x45F3FB3: pango_shape (in /usr/lib/
==24632==    by 0x45E7BCE: (within /usr/lib/
==24632==    by 0x45EAD1A: (within /usr/lib/
==24632==    by 0x45EB274: (within /usr/lib/
==24632==    by 0x45EBBCB: (within /usr/lib/
==24632==    by 0x42DA876: ??? (gtklabel.c:2027)

I wonder exactly what this is that needs 300 kb of memory.

Wednesday, March 22, 2006

libaudiofile patch

Jason Allen commented on my last blog with a patch to fix libaudiofile. This simple patch just adds const to lots of data tables. It trims the 92 kb of dirty private RSS down to 8 kb, saving just under 2 MB desktop-wide! Jason: this patch is great. Let's get it upstream, and also try to get the next round of distros (SUSE 10.1, Dapper, FC5) to include it.

There are quite a few other libraries that are used by almost every GNOME process that could benefit from such constification. Some I saw from gnome-terminal:

      48 kb        0 kb       48 kb   /usr/lib/
      40 kb        0 kb       40 kb   /usr/lib/
      36 kb        0 kb       36 kb   /usr/lib/
      36 kb        0 kb       28 kb   /usr/lib/
      24 kb        0 kb       24 kb   /usr/lib/
      20 kb        0 kb       20 kb   /usr/lib/
      20 kb        0 kb       20 kb   /usr/lib/
      20 kb        0 kb       20 kb   /usr/lib/
      20 kb        0 kb       16 kb   /usr/lib/
      16 kb        0 kb       16 kb   /usr/lib/
      16 kb        0 kb       16 kb   /usr/lib/

Fixing one of these libraries will have the benefit multiplied by about 20.

We should also consider reducing the number of processes on the desktop. For example, clock-applet takes up 2.7 MB of private dirty RSS. 1.7 MB of this is the heap and stack; the other MB is the .data sections of .so files. For the most part, these are constant costs we are going to pay for any process. Reducing the number of processes will reduce this problem.

Tuesday, March 21, 2006

Memory Usage with smaps

As most developers should know by now, the memory statistics given on Linux are mostly meaningless. We have the vmsize, which counts the total address space used by a process. However, this is not accurate because it counts pages that are not mapped into memory. We also have RSS, which measures pages mapped into memory. However, it multi-counts shared pages: every process gets X kb of RSS due to libgtk, even though the majority of libgtk's pages are shared across processes.

What we really care about is the private RSS: the pages our process maps into memory that are used only by our process. We'd also like to know the RSS per mapping, so that we can point fingers and find where to fix the bug.

Up until this point, such statistics have been hard to come by. No longer! The 2.6.16 kernel adds support for smaps: per-mapping data, including each mapping's RSS usage. This data lives in /proc/$pid/smaps. However, the format of the smaps file is hard to digest, so I've written a quick Perl script that parses it into something more useful. It uses the Linux::Smaps module from CPAN.

An example of the data generated by this script:

VMSIZE:      41132 kb
RSS:         23052 kb total
         9212 kb shared
            0 kb private clean
        13840 kb private dirty
vmsize   rss clean   rss dirty   file
12768 kb        0 kb    12616 kb   [heap]
196 kb        0 kb      196 kb
120 kb        0 kb       92 kb   /usr/lib/
132 kb        0 kb       80 kb
 80 kb        0 kb       60 kb   [stack]
 48 kb        0 kb       48 kb   /usr/lib/
 40 kb        0 kb       40 kb   /usr/lib/
 36 kb        0 kb       36 kb   /usr/lib/
vmsize   rss clean   rss dirty   file
2848 kb     1596 kb        0 kb   /usr/lib/
1172 kb      624 kb        0 kb   /lib/tls/i686/cmov/
488 kb      400 kb        0 kb   /usr/lib/
900 kb      396 kb        0 kb   /usr/lib/
524 kb      360 kb        0 kb   /usr/lib/

The vmsize and total rss size are the statistics that everyone is used to. The rss size is split into private and shared. The private rss is what could be best called the process' memory usage.

Below this, we give the per-mapping statistics for private mappings. Most of these are either heap (e.g., malloc'd data) or writable mappings in .so files (from the .data section); the output is especially helpful for diagnosing the second kind. After that, we give the same data for shared mappings (most of which are .so files: the executable code, etc.).

Call to action: I would love to see the following done:

  • Check out some of the libraries with large writable segments. An extreme example of this is libaudiofile. This library has 92 kb of dirty, private rss (isn't that naughty!). To make matters worse, the library is used by 22 processes on my GNOME 2.14 desktop. This is about 2 MB of memory! Let's figure out why this is happening and make the data const. Also, it might be wise to see if we could reduce the number of programs using this library. We should try to find any other instances of this.
  • Let's get some smaps based data in gnome-system-monitor, and possibly more low level tools as well.
  • We should look at tools like exmap to get per-page RSS info. This is useful for finding out things like which pages evolution uses from libgtk and why. We can also use it to figure out what we can do for low-memory users: we can simulate high levels of swapping by allocating memory in a dummy process, then see which pages must be loaded from disk to use the desktop, evolution, etc.
  • It'd be great to set a community goal for 2.16. We will reduce the private rss used by all GNOME processes in setup X by Y MB. We should also take statistics from 2.14 to make sure that there are no memory usage regressions.

In other performance related news, somebody has finally gotten good statistics on Firefox's memory usage. It looks like Mozilla is leaking pixmaps when browsing with tabs. I think there are many people who would be made much happier if this could be fixed. This is really great data gathering, and I'd like to see more of it.