Sunday, August 28, 2005

Linux for College Students

It is pretty interesting how many CMU students want to try out Linux but don't know how. And these are not just male CS students -- a diverse range of people is interested in what else is out there. Firefox has huge exposure at CMU and on other college campuses. People know about Linux but don't know how to take the plunge. I think it is important that we try to incubate new Linux users at universities like CMU. These are our future engineers, scientists, and programmers.

So, I think the best way to start thinking about this is to ask, "What do students do with their computers?" With a corporate desktop (like NLD) this part of the game is pretty easy: employees have well-defined tasks. College students are a much more interesting crowd. I don't think we will be able to replicate 100% of the programs that students are used to. But let's take a look at what is important:

  • Being able to play every video and music format on Earth.
  • P2P sharing (i2hub!)
  • Using Flash
  • Fully functional Java environment (Eclipse + tools + your choice of gcj or Sun Java)
  • Being able to read every PDF thrown at them (that means acroread -- I love evince but there are some documents it does not want to read).
  • 0-hassle integration with their campus network (at CMU, AFS should be mounted)
  • Allow easy installation of university-licensed software (Mathematica!)
  • iPods must work with no hassle.
  • All wireless cards must work (I had to jump through some hoops to get my Dell laptop's wireless to work. Not cool).

The most important part is taking a stock distro and making it do the tasks I listed above extremely easily. We can't expect people to jump through 100 hoops just to play a DVD. There are some obvious legal issues here. But given that we college students have found ways to host terabytes of copyrighted music and video, I don't see why it should be such a challenge for us to host a small amount of software, whatever issues it may have.

Once this step is done, we would need to consider how to get Linux out to users. I think this has two aspects. First, we need to have "Linux Heroes" who can help people out. Facebook would be a great place to start this. The second aspect is making it easy to actually install Linux. On a campus network, bandwidth is virtually free (at least within the university and to other Internet2 campuses). Therefore, the install media should come from the campus network. I think a PXE boot system would be very powerful: with the help of a person familiar with the BIOS, Linux can be installed with no physical media. Also, "kickstart" disks could be distributed that would boot to the PXE system, for people whose network cards don't support PXE or who don't know how to use the BIOS.

I guess the way to start making this work would be to get a list of RPMs that can be installed to make the task list above work. If packages don't exist, we need to create them. Where possible, we should prefer open source, patent-free stuff, but where that does not exist, or does not have the functionality students need, other solutions are needed.

Late breaking ideas

  • What if we make a Windows program that modifies the MBR so that on the next boot, it will go to the PXE server? This allows people to use PXE without touching the BIOS and without needing any physical media.
  • Presentations / Demos of Linux
  • Form a community. Do some non-geeky stuff. Ice cream social for Linux. Could be integrated with presentations (hook them with ice cream, get them to watch the demo of Linux)
  • Give "Linux Heroes" free t-shirts. If we have enough heroes, it won't be hard to find a person wearing their t-shirt on any given day. These people can answer questions.

Later Breaking:

  • See the GRUB docs for ideas about booting from the network. We would give the user a GRUB config file with the PXE server specified.
  • Install GRUB on Windows without touching the MBR. If I understand this correctly, it would mean a two-stage boot: first it goes into the NTLDR menu, then the GRUB menu. But it sounds a bit less risky.

Laterer Breaking

Saturday, August 27, 2005

At CMU

Am at the CMU campus:

In the center is Hamerschlag Hall. The large tower on the right is Pitt's Cathedral of Learning. The ugly-looking slab on the right is Wean Hall, the CS building.

Abstract art at Carnegie Museum of Art (cmoa):

Still CMOA, but now much more fun. Who said adults can't enjoy the children's section?

Saturday, August 20, 2005

Off to college

On Sunday, I'll be going off to college:

Carnegie Mellon

One week of orientation (the schedule is literally 40 pages long. Hopefully, there is a list of important things :-), and then classes. I'll have to start programming in Java. Given that I convinced the CS department to let me into a fairly advanced class, it would probably be a Good Thing to know how to write a "hello world" in Java by the first day of class. How long will it take me to get used to javaCase? And then how long before I leak javaCase into CSharpCase? I wonder if I could get away with using Mainsoft's stuff and claiming that my C# code is actually Java.

Thursday, August 18, 2005

MSDN Browser

Browse MSDN without an insane treeview and without an insane browser. Source is here. It's only 220 lines, so you can read it really quickly!

Left to do:

  • Integrate into monodoc
  • Make it handle T:System.String style links in monodoc (would allow it to replace the default class ref).
  • Handle clicking links inside the MSDN content area
  • Greasemonkey-like modifications of MSDN's content
    • Remove syntax for VB, etc.
    • Remove the MSDN footer
  • Use a cache for the XML
    • Would be cool to use Mozilla's cache here. That means I can avoid writing my own policy for caching.

This shows off some cool Gtk# 2.0 features (the tree node stuff) and some cool C# 2.0 features (the delegate stuff). Also, I am using a very cute hack, the XML serializer, to read the data.

Saturday, August 06, 2005

String Freezing

Whidbey has a very cute feature in its ngen (its ahead-of-time compile solution). Traditionally, string literals in .NET assemblies are stored in a string heap. When they are loaded from the assembly (using the IL ldstr opcode), the runtime allocates a new object and does a memcpy of the data (on some platforms chars have to be byteswapped, but same idea). Why does this copy need to happen? The first 8 bytes of an object (on a 32-bit platform) consist of a pointer to a VTable (which holds the object's type's identity, including the data needed to make virtual calls) and a (normally null) pointer to a "monitor" that can be used to lock the object. The raw characters in the metadata have no such header, so they cannot be used as an object in place.

Now, this is a fairly large waste of memory. First, we have to allocate the string in the GC heap in each process. Second, by reading the string from metadata, we have paged it into memory there (but that is shared between processes). The traditional solution to this (as is implemented in C programs) is to store strings in the .rodata section of shared libraries. This way, they can be shared between processes. (C also does a cute hack taking advantage of the fact that strings are \0 terminated. If you have "123" and "abc123" as literal strings in the same program, it stores "abc123\0" and the string "123" is just "abc123" + 3).

Doing something like this should be simple in the AOT format. The string data just needs to be copied to a read-only section. However, there are two challenges. First, the VTable for string is allocated in dynamic memory. How can we know its address at AOT time? I can think of two possible solutions.

First, Mono statically links the JIT and runtime into a single executable. Static executables never get relocated, so we can pre-allocate the string VTable in the .data section of the Mono runtime. We know that this pointer won't change (except if Mono is recompiled; some sanity check would be needed). However, this has a very large disadvantage: applications embedding Mono could never take advantage of this.

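To make the C suffix-sharing trick mentioned above concrete, here is a minimal sketch in Python. It is only a simulation of the idea, not how any particular compiler or linker lays out its read-only data. The names build_pool and read_literal are made up for this illustration, and the sketch assumes ASCII literals that contain no embedded \0.

```python
def build_pool(literals):
    """Lay out NUL-terminated literals in one blob, sharing suffixes.

    Hypothetical helper for illustration; assumes ASCII text with no
    embedded NUL bytes.
    """
    blob = bytearray()
    offsets = {}
    # Place longer literals first so that a shorter literal can reuse
    # the tail of a longer one that is already in the blob.
    for lit in sorted(set(literals), key=len, reverse=True):
        encoded = lit.encode("ascii") + b"\0"
        # Because the search key includes the terminator, a hit can only
        # be the tail end of a literal that is already stored.
        pos = bytes(blob).find(encoded)
        if pos < 0:
            pos = len(blob)
            blob += encoded
        offsets[lit] = pos
    return bytes(blob), offsets


def read_literal(blob, offset):
    """Read a C-style string starting at offset, up to the next NUL."""
    end = blob.index(b"\0", offset)
    return blob[offset:end].decode("ascii")


blob, offsets = build_pool(["123", "abc123"])
print(blob, offsets["abc123"], offsets["123"])
print(read_literal(blob, offsets["123"]))
```

With these two literals the pool holds a single copy of the bytes, and "123" is simply the address of "abc123" plus 3, as described above.
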
A second, harder, but much better (IMHO) solution is to implement a system like the Windows dynamic loader. Jason Zander explains how this works here. To give a short summary: in Windows, you specify the address in virtual memory at which your .dll should be loaded. If that address isn't available, the operating system goes around and changes all references that depended on this location. If we use this solution, the AOTd mscorlib.dll would simply hold the string VTable.

This solution is a lot better because it provides us with advantages beyond string freezing. We avoid the need for a GOT/PLT-like structure. Also, we could precompute VTables, and possibly other metadata structures, and store them in read-only memory; this would greatly reduce the cost of managed code. The disadvantage is that we need to help developers find available virtual address space. Also, if there are virtual address space conflicts, we probably can't use the AOTd file. This solution also takes a fair bit of work.

For some more information about Whidbey improvements in Microsoft's runtime, see ricom's blog. Also, Jeroen Frijters wrote about string freezing.

An idea I had right after I published: string literal bloat seems most likely to matter in a GUI application. However, in that case, we also have the issue that we need to convert the .NET-style string to a C-style string when we call Gtk+. Would it make sense for us to also pre-cache converted strings so that they too could be shared?
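The preferred-address scheme described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the actual Windows loader or the Mono AOT format: the image is a byte string, absolute addresses are 32-bit little-endian words, and the function name load_image and its parameters are hypothetical.

```python
import struct


def load_image(image, preferred_base, reloc_offsets, actual_base):
    """Return (image bytes, was_patched) for a load at actual_base.

    Toy model: 'image' holds 32-bit little-endian absolute addresses at
    each offset in 'reloc_offsets', computed for 'preferred_base'.
    """
    delta = actual_base - preferred_base
    if delta == 0:
        # Loaded where it asked to be: nothing to fix up, so the pages
        # could stay read-only and be shared between processes.
        return image, False
    patched = bytearray(image)
    for off in reloc_offsets:
        (addr,) = struct.unpack_from("<I", patched, off)
        # Shift every recorded absolute address by the load delta.
        struct.pack_into("<I", patched, off, (addr + delta) & 0xFFFFFFFF)
    return bytes(patched), True


# One address that needs fixing up, followed by one plain data word.
image = struct.pack("<II", 0x10000010, 0xDEADBEEF)
same, patched = load_image(image, 0x10000000, [0], 0x10000000)
moved, patched2 = load_image(image, 0x10000000, [0], 0x20000000)
print(patched, patched2, hex(struct.unpack_from("<I", moved, 0)[0]))
```

The interesting case is the first one: when the image lands at its preferred address, nothing is written, which is what would let a frozen string pointing at the string VTable live in shared read-only memory. The second case is the conflict mentioned above, where every dependent reference has to be rewritten in a private copy.
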