Saturday, August 06, 2005
Whidbey has a very cute feature in it's ngen (it's ahead-of-time compile solution). Traditionally, string literals in .NET assemblies are stored in a string heap. When they are loaded from the assembly (using the IL ldstr opcode), the runtime allocates a new object and does a memcpy of the data (on some platforms chars have to be byteswapped, but same idea). Why does this copy need to happen? The first 8 bytes of an object consist of a pointer to a VTable (which holds the object's type's identity, including the data needed to make virtual calls) and a (normally null) pointer to a "monitor" that can be used to lock the object. Now, this is a fairly large waste of memory. First, we have to allocate the string in the GC heap in each process. Second, by reading the string from metadata, we have paged it into memory there (but that is shared between processes). The traditional solution to this (as is implemented in C programs) is to store strings in the .rodata section of shared libraries. This way, they can be shared between processes. (C also does a cute hack taking advantage of the fact that strings are \0 terminated. If you have "123" and "abc123" as literal strings in the same program, it stores "abc123\0" and the string "123" is just "abc123" + 3). Doing something like this should be simple in the aot format. The string data just needs to be copied to a readonly section. However, there are two challenges. First, the VTable for string is allocated in dynamic memory. How can we know its address at AOT time. I can think of two possible solutions: Mono statically links the JIT and Runtime into a single executable. Static executables never get relocated, so we can pre-allocate the .data section in the mono runtime. We know that this pointer won't change (except if mono is recompiled; some sanity check would be needed). However, this has a very large disadvantage: applications embedding mono could never take advantage of this. A second, harder, but much better (IMHO) solution is to implement a system like the Windows dynamic loader. Jason Zander explains how this works here. To give a sort summary, in Windows, you need to say that your .dll should be loaded at a specfic address in virtual memory. If that address can't be found, the operating system goes around and changes all references that depended on this location. If we use this solution, the AOTd mscorlib.dll would simply hold the string vtable. This solution is alot better because it provides us with advantages beyond string freezing. We avoid the need for a GOT/PLT like structure. Also, we could precomputate vtables, and possibly other metadata structures, and store them in readonly memory; this would greatly reduce the cost of managed code. The disadvantage is that we need to help developers find available virtual address space. Also, if there are virtual address space conflicts, we probably can't use the aot'd file. This solution also takes a fair bit of work. For some more information about Whidbey improvements in Microsoft's runtime see ricom's blog. Also, Jeroen Frijters wrote about string freezing Idea I had right after I published: string literal bloat seems most likely to matter in a GUI application. However, in that case, we also have the issue that we need to convert the .NET style string to a C style string when we call Gtk+. Would it make sense for us to also pre-cache converted strings so that they too could be shared?