Oh, Snap! 2.0
« April 2006 | Main | June 2006 »
Overheard today:
I don't want to date him-and-his-f@!*#ing-blog.
rhelmer applied the first release tag to the tree—FIREFOX_1_5_0_4_RC3—since the CVS server upgrade.
It took just under four minutes to complete.
For comparison, the old CVS server?
Forty minutes. On a good morning, at 4 am.
(This is the point in the story where we all go hug justdave.)
Went to the Flock presentation this morning on the Browser Technology track. I've never used Flock, so it was interesting to see the demo and look at some of the features they have.
I asked what their automated update story was, since there have been two or three releases of Firefox since the release they're using (which, as I understand it, is 1.5 still).
They said "We couldn't find the code for the automatic updates stuff." Which we know about and are working on fixing. (In fact, I've been working on it at XTech!)
So, how did they solve this problem?
"We wrote our own replacement."
I found Myk's XTech talk on microsummaries very interesting.
Last night at dinner, I believe it was Axel who was suggesting that publishing tinderbox performance data as an RSS feed might offer some interesting possibilities.
At first, I didn't see the point of doing that exactly, but with microsummaries, a tree sheriff could put all the branches they're supposed to be watching in their toolbar, so they wouldn't have to scan a huge tinderbox page all day. Could maybe even whip up some XSLT (was it?) to make them change colors if the performance numbers jump outside of some pre-defined range.
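To make the color-changing idea a bit more concrete, here's a minimal sketch of the range check in Python. To be clear, this is not how microsummaries or tinderbox actually work; the names (`PerfRange`, `status_color`, `summarize`) and the numbers are made up purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class PerfRange:
    """Hypothetical 'acceptable' window for one perf test, in milliseconds."""
    low_ms: float
    high_ms: float


def status_color(value_ms, allowed):
    """Return 'red' if the number falls outside the pre-defined range, else 'green'."""
    if value_ms < allowed.low_ms or value_ms > allowed.high_ms:
        return "red"
    return "green"


def summarize(branch, test, value_ms, allowed):
    """Build a short, toolbar-sized summary line for one branch and one test."""
    color = status_color(value_ms, allowed)
    return f"{branch} {test}: {value_ms:.0f} ms [{color}]"


if __name__ == "__main__":
    # Illustrative values only, not real tinderbox data.
    tp_range = PerfRange(low_ms=400, high_ms=450)
    print(summarize("trunk", "Tp", 420, tp_range))
    print(summarize("trunk", "Tp", 500, tp_range))
```

A sheriff would only need one such line per branch, which is roughly the amount of text that fits in a toolbar entry.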
Of course, the cool thing about microsummaries is you don't necessarily need the RSS feed, it sounds like.
It's an interesting idea, though... is there a more consumable format for perf data than we currently offer/publish?
One of the largest hurdles with the virtualization migration plan was this huge unknown question of whether or not the tinderboxen performing tests could be virtualized.
Now that we have one (somewhat modern) tinderbox, argo, cloned in a VM and also running on physical hardware, we do have some data to look at.
"argo's" data is actually a bit confusing, because the machine instance was cloned rather than migrated, and the machine's identity was cloned along with it. That is, "argo" on May 10th was a physical machine, and "argo" after May 10th was a virtual machine. Then "argo" once again became a physical machine on May 16th, with the virtual machine copy appearing on the tinderbox page as "argo-vm."
The Good News


Looks to be about 20 ms jitter in physical vs. about 50 ms in virtual.
The Not-As-Good News

You can really see the (obviously unacceptable) jitter here. Physical jitter looks to be about 50 ms, while virtualized jitter is an order of magnitude larger. The virtualized jitter on Ts below, with the minimum service levels, seems somewhat better.
The Encouraging News

The smaller jitter is still around 10 ms, a slight improvement, but the outlying jitter is less encouraging (25 ms at the beginning and 15 ms on the right).

The Ts jitter still looks pretty bad, but it's down by more than half, to about 200 ms.

At its worst, we dropped about 10 ms of Tdhtml jitter in the virtualized case. There's definitely room for improvement here.
Stay tuned!
The upshot: the jitter was much lower than I expected for Tp, but much higher than I would've expected for Ts, especially given Tp's jitter.
This data was from the VM before it was set up to have a guaranteed minimum scheduling level, as provided by the virtualization layer. It's currently set at a minimum of 33% of the (V)CPU and a maximum of 100% of the (V)CPU, along with some network priority settings.
If this doesn't bring the jitter down, there are some other tricks I plan on trying, including constraining the scheduling level more (so, a window of 33% to 50%, or even a smaller window of 33% to 33%), and turning off VMware's Guest Tools' time synchronization, which has a tendency to clobber the clock at unknown intervals. Normally this is done in such a way that it's not noticed, but when using high-resolution timing, that method of syncing the clock could produce odd results.
Overall, I'm pretty encouraged. I thought the Tp times were relatively good, given the VM had no custom configuration settings. I'm actually surprised that the Ts data is so much more skewed than the Tp data, which relies on more resources (network, mostly).
We'll be getting more data on physical vs. virtualized performance testing as the last set of performance tinderboxen get migrated, so a clearer solution should begin to present itself once we have and sift through that data.
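The jitter numbers in these posts were eyeballed off the tinderbox graphs. For anyone who'd rather compute them, here's a rough sketch in Python that treats "jitter" as the peak-to-peak spread of a run of samples. That definition is my assumption for the sketch, not something tinderbox reports, and the function names and sample values are invented for illustration.

```python
from statistics import median


def peak_to_peak(samples_ms):
    """Spread between the slowest and fastest sample, in milliseconds."""
    return max(samples_ms) - min(samples_ms)


def jitter_report(samples_ms):
    """Summarize one machine's run of perf samples (e.g. Tp or Ts times)."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    return {
        "median_ms": median(samples_ms),
        "jitter_ms": peak_to_peak(samples_ms),
    }


def compare(physical_ms, virtual_ms):
    """Ratio of virtual jitter to physical jitter (higher means the VM is noisier)."""
    phys = peak_to_peak(physical_ms)
    virt = peak_to_peak(virtual_ms)
    if phys == 0:
        raise ValueError("physical samples show no jitter; ratio is undefined")
    return virt / phys


if __name__ == "__main__":
    # Made-up sample values, not real argo or argo-vm data.
    physical = [500, 510, 495, 520]
    virtual = [505, 540, 490, 560]
    print(jitter_report(physical))
    print(jitter_report(virtual))
    print(compare(physical, virtual))
```

Peak-to-peak is sensitive to a single outlier, so a standard deviation or percentile range might be a better fit once there's more data to sift through.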
After almost 16 hours of travel, I finally made it into Amsterdam, and have gotten settled. I even got a good night's sleep last night, so I'm mostly over the jet lag (although, my laptop's clock says it's currently 5:52 am PDT, a reminder that is really not helping matters...)
This trip has been particularly interesting thus far, since it's the first time I've been out of the country. Evar.
The flight was interesting, especially for a flyboy like myself. The transatlantic aircraft-separation-and-communication bit was of particular interest to me... and, of course, trying to figure out what air traffic controllers in Germany were referring to, despite the fact that they speak English on the frequency.
The most amusing part of the 9+ hour flight was being seated next to this old couple; they started having a conversation—in only the way an old married couple can have a conversation—about how much alcohol they should drink, and whether or not they were supposed to take it with the sleeping pills, or not take it with the sleeping pills, and whether or not the "jet lag" pills were the same as the sleeping pills (and whether or not they could take the jetlag pills with alcohol... and... etc., etc., etc.)
A few minutes after that, the husband said to his wife: "Smell this. [Holds up a travel bag.] The bourbon's leaking."
Then they proceeded to open the bag and take out sports bottles full of... "apple cider." Then the wife said "Wait, is it the scotch leaking or the bourbon leaking?"
Then they started arguing about what scotch and bourbon smell like.
I don't know what they eventually decided, but about an hour after this, they both fell asleep... and were completely out of it for the rest of the flight.
The How-Does-This-Affect-Me? Version
Various Tinderboxen will be down next week, in cycles, so we can migrate them into virtual machines.
These migrations will start on Tuesday, 9 May, and will be performed in three rounds, with about four tinderboxen per round.
During each round, these machines will be unavailable for a 24-hour period. The migrations will not affect the Bon Echo Alpha 2 release plans.
The Short Version
Step 1. Move all Tinderboxen to VMs
Step 2. ????
Step 3. PROFIT!
The Long(er) Version
Starting with planning and help from Chase, over the past five months, we've been working towards migrating all of Mozilla's Tinderboxen into virtual machines.
For those not familiar with the technology—VMware and Xen are players in the space with a lot of name recognition; Microsoft has an offering too, but it makes me giggle—virtualization offers the ability to run multiple instances of a full-blown operating system and an associated work load on the same piece of hardware. These OS instances are isolated from each other (conceptually, at least).
We've already migrated certain branches to virtualized Tinderboxen. Currently, the Firefox and Thunderbird maintenance branches are built using virtual machines. The 1.5.0.1, 1.5.0.2, and 1.5.0.3 releases have all come from VMs.
The major benefits for Mozilla, in addition to the marketing hype, include:
Virtualization is, of course, not free. There's a performance hit to allow six to seven builds to run on the same machine. But we've been using big behemoth machines that are dual-core, dual-CPU monsters with 4-8 gigabytes of RAM, and have found that the performance hit isn't as bad as we had worried it might be, and well worth it in terms of the configuration management/administration/provisioning wins.
The biggest outstanding question is: "Can we continue to run performance tests in VMs?" The short answer is "We don't yet know." There's been some discussion on mozilla.dev.builds about using the resource limitation features of VMware ESX to give strict CPU and I/O service-level requirements to each VM that executes performance testing.
We'll do that with the VMs we migrate, but to ensure the numbers are good, we'll continue to run certain tinderboxen after the migration, for comparison (currently, that list includes argo, gaius, prometheus, pacifica, btek, creature, and beast). It will be interesting to see if the performance numbers settle down, as it's been suggested that they might.
The ultimate goal with this rollout is to basically put every machine possible into a VM (*cough* are you listening, Apple?), and then work on defining reference platforms that are VMs and can be effectively versioned. When we're done, provisioning a new tinderbox should be a trivial task, involving cloning a VM and getting an IP address, as opposed to today, where it starts with a call to IT, involves installation CDs (ew!) and ends a few weeks later, with a build engineer whining that they can't get to it for a few more weeks, because they're busy doing releases.