Three Monkeys, Three Typewriters, Two Days

June 30, 2009

Performance testing pitfalls

Someone recently asked why SFX (e.g. in Safari 4) scores 1000runs/s on the Dromaeo charAt test while SpiderMonkey (e.g. in Firefox 3.5) scores something closer to 80runs/s. Naturally, I pulled out my profiler and looked. I discovered some interesting things:

  1. Only about 2-3% of the time the Dromaeo harness measures for the charAt test is actually spent running the str_charAt function in SpiderMonkey. The rest is spent inside the harness itself.
  2. If I pull the test out of the harness and, instead of using the harness's runs/second measurement, directly measure how long it takes to run the test 1000 times, I get 430ms or so in Firefox and 750ms or so in Safari. That corresponds to scores of about 2325runs/s and 1333runs/s for Firefox and Safari respectively.
  3. If I rerun the same standalone test in Firefox with the JIT disabled, I get numbers closer to 7300ms.
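
The standalone measurement in item 2 can be sketched roughly as follows. This is a minimal sketch, not the actual Dromaeo test: charAtTest is a hypothetical stand-in for the test body, and timeRuns and runsPerSecond are names made up for illustration.

```javascript
// Hypothetical stand-in for the body of the charAt test; the real Dromaeo
// test body is not reproduced here.
function charAtTest() {
  var str = "abcdefghijklmnopqrstuvwxyz";
  var last = "";
  for (var i = 0; i < str.length; i++) {
    last = str.charAt(i);
  }
  return last;
}

// Run `test` `runs` times and return the elapsed wall-clock time in ms.
function timeRuns(test, runs) {
  var start = Date.now();
  for (var i = 0; i < runs; i++) {
    test();
  }
  return Date.now() - start;
}

// Convert an elapsed time for `runs` runs into a runs/second score.
function runsPerSecond(elapsedMs, runs) {
  return runs / (elapsedMs / 1000);
}

var elapsed = timeRuns(charAtTest, 1000);
console.log("1000 runs took " + elapsed + "ms");

// The conversion applied to the timings quoted above:
console.log(runsPerSecond(430, 1000)); // about 2325.6 (Firefox)
console.log(runsPerSecond(750, 1000)); // about 1333.3 (Safari)
```

Timing the whole batch with one pair of clock reads keeps the per-run measurement overhead out of the result, which is the point of pulling the test out of the harness.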

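Conclusions: The harness is mostly measuring its own JS-execution overhead in both browsers, not actual charAt performance. Something about the harness also keeps us from tracing the test when it runs inside it: the standalone time with the JIT disabled (7300ms, or about 137runs/s) is far closer to our in-harness score than the traced standalone time is.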

Update: Filed a bug on this.

Posted by bzbarsky at 10:13 AM

June 15, 2009

Lessons (re)learned today, and performance

Lesson relearned: working on airplanes is very productive. I chalk it up to the lack of IRC and e-mail.

Lesson learned: working on performance bugs drains your battery very quickly. This is because the work largely consists of compiling, running the performance testcases, running the profiler, and then compiling some more. These are all rather CPU-intensive activities.

I spent my flight this morning finally digging into something that's bothered me for years: the performance of setting inline style. I focused specifically on the set itself, not on the things we have to do lazily (restyling, reflow, etc.) to handle the sets; even the set alone was making up something like 20% of the time on some testcases.

The first step was to write a microbenchmark for setting style.top (which is the same code as setting style.left, so I figured I'd just profile one of them). On this microbenchmark, on my machine, we were taking somewhere around 400ms on each part on a current m-c build. Looking at the profile, the usual culprits jumped out at me. Speeding up parsing of numbers in CSS got me into something like the 350ms range. Creating a fast path for modifying a single non-important already-set style property via inline style put me into about the 250ms range.
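
A microbenchmark of this kind might look roughly like the following. This is a minimal sketch, not the testcase I actually used: the function name timeStyleTopSets, the iteration count, and the plain-object fallback are all assumptions made for illustration.

```javascript
// Time `iterations` assignments to `style.top`. `style` can be any object
// with a writable `top` property; in a browser, pass element.style.
function timeStyleTopSets(style, iterations) {
  var start = Date.now();
  for (var i = 0; i < iterations; i++) {
    // Vary the value so each set actually changes the property.
    style.top = (i % 100) + "px";
  }
  return Date.now() - start;
}

// In a browser, use a real positioned element. Elsewhere, fall back to a
// plain object so the sketch still runs (that measures only JS overhead).
var target;
if (typeof document !== "undefined") {
  var div = document.createElement("div");
  div.style.position = "absolute";
  document.body.appendChild(div);
  target = div.style;
} else {
  target = { top: "0px" };
}

var elapsedMs = timeStyleTopSets(target, 100000);
console.log("100000 style.top sets took " + elapsedMs + "ms");
```

The loop never reads layout-dependent properties back, so it should not force a restyle or reflow inside the timed region; that keeps the measurement on the set itself.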

At this point, looking at the profile, I realized that about half the time was being spent in the JS engine, not in the DOM/style code. I figured this was due to us not tracing DOM setters yet, but the fact that we were hitting the resolve hook all the time and filling the property cache a lot looked suspicious. Eventually, I decided that this looked like a possible bug in the JS property cache. I tried a hack to make us hit the cache in the cases where we currently miss, and times dropped to about 110ms. I'm not sure that's where we'll end up, because I'm not sure that the property cache is really misbehaving here, but it looked pretty encouraging!

All of which raises the question of what this means for "real" performance. As an example, on the testcase in bug 229391, on which inline style sets had shown up in profiles before, times look about like this:

  • mozilla-central: 2850ms total, 1270ms in loop.
  • With CSS parser patches: 2200ms total, 900ms in loop.
  • With my totally-unsafe propcache hack: 2100ms total, 830ms in loop.
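
To put those numbers in relative terms, here is a small sketch that computes each build's reduction against the mozilla-central baseline. The timings are copied from the list above; the helper names reduction and summarize are made up for illustration.

```javascript
// Fractional reduction going from `before` to `after` (same units).
function reduction(before, after) {
  return (before - after) / before;
}

// Timings in ms, copied from the list above.
var baseline = { total: 2850, loop: 1270 };
var builds = [
  { name: "CSS parser patches", total: 2200, loop: 900 },
  { name: "propcache hack", total: 2100, loop: 830 }
];

// Format one build's savings relative to the baseline as percentages.
function summarize(base, build) {
  return build.name + ": total down " +
    (100 * reduction(base.total, build.total)).toFixed(1) + "%, loop down " +
    (100 * reduction(base.loop, build.loop)).toFixed(1) + "%";
}

builds.forEach(function (b) {
  console.log(summarize(baseline, b));
});
// CSS parser patches: total down 22.8%, loop down 29.1%
// propcache hack: total down 26.3%, loop down 34.6%
```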

Tracing setters might also help some; it's hard to tell.

There's more obvious work that would help these pages (e.g. David Baron's proposal to not rerun selector matching on inline style sets, or what looks like a Mac painting/invalidation issue that I'm still investigating), so I hope that Gecko 1.9.2 will be a good bit faster on this sort of thing than Gecko 1.9.1.

Posted by bzbarsky at 11:27 PM