Three Monkeys, Three Typewriters, Two Days

May 14, 2009

Reverse-engineering Peacekeeper

There's been some noise recently about the Peacekeeper benchmark. It's got some issues (like not actually measuring quite what it claims to measure), but more importantly it doesn't publish its tests, its methodology, or how it computes the overall score.

I would be very interested if someone could take this thing and break it up so that it's possible to run the tests individually, so that it's clear what each test is measuring, and so that it's clear how those results are put together into the final score. The first of these is particularly interesting to me: once the tests can be run individually, we can look into whether the performance issues we have there (which I will assume are real, though it's easy to mess up a benchmark's measurements) are systemic or whether a particular set of tests tickles particular problems.
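To make the ask a bit more concrete, here is a rough sketch of the shape I have in mind: each test runnable on its own, a per-test score, and an explicit, visible step that combines them. It is written in Python purely for illustration (the real tests are JavaScript running in a browser). The test names and bodies are made-up stand-ins, and the geometric mean is only an assumed way of combining scores, since how Peacekeeper actually does it is not published.

```python
import math
import time
from typing import Callable, Dict


def run_test(fn: Callable[[], object], min_seconds: float = 0.05) -> float:
    """Run one test repeatedly for at least min_seconds; return runs per second."""
    runs = 0
    elapsed = 0.0
    start = time.perf_counter()
    while elapsed < min_seconds or runs == 0:
        fn()
        runs += 1
        elapsed = time.perf_counter() - start
    return runs / elapsed


def combine(scores: Dict[str, float]) -> float:
    """Geometric mean of per-test scores.

    ASSUMPTION: this aggregation is a guess for illustration only,
    not Peacekeeper's actual (unpublished) formula.
    """
    return math.exp(sum(math.log(s) for s in scores.values()) / len(scores))


# Hypothetical stand-ins for the real benchmark tests.
TESTS: Dict[str, Callable[[], object]] = {
    "string_concat": lambda: "".join(str(i) for i in range(1000)),
    "array_sort": lambda: sorted(range(1000, 0, -1)),
}


def run_all(tests: Dict[str, Callable[[], object]]) -> Dict[str, float]:
    """Run every test individually and keep the per-test scores."""
    return {name: run_test(fn) for name, fn in tests.items()}


if __name__ == "__main__":
    results = run_all(TESTS)
    for name, score in results.items():
        print(f"{name}: {score:.0f} runs/s")
    print(f"overall: {combine(results):.0f}")
```

With something of this shape, a single slow test shows up in the per-test numbers instead of being hidden inside one overall score, and a single test can be run by passing a one-entry dictionary to `run_all`.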

Any help on this would be very much appreciated. I gave it a shot a few weeks back, but gave up about a day later; I just didn't have more time to spend on it.

Posted by bzbarsky at 9:39 PM