Ahead of the Release Curve III: Virtually There
The How-Does-This-Affect-Me? Version
Various Tinderboxen will be down next week, in cycles, so we can migrate them into virtual machines.
These migrations will start on Tuesday, 9 May, and will be performed in three rounds, with about four tinderboxen per round.
During each round, the machines in that round will be unavailable for a 24-hour period. The migrations will not affect the Bon Echo Alpha 2 release plans.
The Short Version
Step 1. Move all Tinderboxen to VMs
Step 2. ????
Step 3. PROFIT!
The Long(er) Version
Over the past five months, starting with planning and help from Chase, we've been working towards migrating all of Mozilla's Tinderboxen into virtual machines.
For those not familiar with the technology (VMware and Xen are players in the space with a lot of name recognition; Microsoft has an offering too, but it makes me giggle), virtualization offers the ability to run multiple instances of a full-blown operating system, each with its associated workload, on the same piece of hardware. These OS instances are isolated from each other (conceptually, at least).
We've already migrated certain branches to virtualized Tinderboxen. Currently, the Firefox and Thunderbird maintenance branches are built using virtual machines. The 1.5.0.1, 1.5.0.2, and 1.5.0.3 releases have all come from VMs.
The major benefits for Mozilla, in addition to the marketing hype, include:
- the ability to "deep-freeze" machine instances in their entirety, so we can go back and build previous releases, if necessary
- the ability to provision entirely new machines in a couple of hours
- the ability to "branch" machine configurations, so new software dependencies won't disturb other builds running on the same machine (because they won't be running on the same machine anymore)
- being able to move bulky old desktop machines out of the colo, where the space costs are higher
- being able to migrate bulky, old, unreliable PC hardware into RAIDed VMs, with little or no change
Virtualization is, of course, not free. There's a performance hit when six or seven builds run on the same machine. But we've been using big behemoth machines (dual-core, dual-CPU monsters with 4-8 gigabytes of RAM), and have found that the performance hit isn't as bad as we had worried it might be, and is well worth it in terms of the configuration management/administration/provisioning wins.
The biggest outstanding question is: "Can we continue to run performance tests in VMs?" The short answer is "We don't yet know." There's been some discussion on mozilla.dev.builds about using the resource limitation features of VMware ESX to set strict CPU and I/O service-level requirements for each VM that executes performance testing.
We'll do that with the VMs we migrate, but to ensure the numbers are good, we'll continue to run certain tinderboxen after the migration, for comparison (currently, that list includes argo, gaius, prometheus, pacifica, btek, creature, and beast). It will be interesting to see if the performance numbers settle down, as it's been suggested that they might.
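For the curious, here is a rough sketch of what that comparison might look like. It is not the tooling we actually use; the function names and the choice of statistics (mean slowdown and coefficient of variation as a stand-in for how "settled" the numbers are) are assumptions made purely for illustration.

```python
from statistics import mean, stdev


def summarize(samples):
    """Return (mean, coefficient of variation) for a list of test timings.

    The coefficient of variation (stdev / mean) is used here as an
    assumed, simple measure of how noisy a machine's numbers are.
    """
    m = mean(samples)
    return m, stdev(samples) / m


def compare(physical, virtual):
    """Compare timings from a physical tinderbox against its VM counterpart.

    Both arguments are hypothetical lists of timings (same test, same units),
    each with at least two samples.
    """
    p_mean, p_cv = summarize(physical)
    v_mean, v_cv = summarize(virtual)
    return {
        # Fractional slowdown of the VM relative to the physical box.
        "slowdown": (v_mean - p_mean) / p_mean,
        # Noise on each side; a lower value means steadier numbers.
        "physical_cv": p_cv,
        "virtual_cv": v_cv,
    }
```

If the VM's coefficient of variation ends up at or below the physical machine's, that would be one sign the numbers have settled down; a consistent, small slowdown is much easier to live with than a noisy one.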
The ultimate goal with this rollout is to basically put every machine possible into a VM (*cough* are you listening, Apple?), and then work on defining reference platforms that are VMs and can be effectively versioned. When we're done, provisioning a new tinderbox should be a trivial task, involving cloning a VM and getting an IP address, as opposed to today, where it starts with a call to IT, involves installation CDs (ew!), and ends a few weeks later with a build engineer whining that they can't get to it for a few more weeks because they're busy doing releases.
Comments
> The ultimate goal with this rollout is to basically put every machine possible into a VM (*cough* are you listening, Apple?)
http://www.kberg.ch/q/
Posted by: Anonymous | May 6, 2006 7:17 AM
http://www.kberg.ch/q/
"At the present state, QEMU is still considered ALPHA software."
Posted by: Preed | May 7, 2006 7:44 AM