July 3, 2009

Faces Of The Web Video Revolution

There's a lot of press these days about the HTML5 <video> tag and the struggle for universal unencumbered video and audio codecs --- much of it associated with the Firefox 3.5 launch. I wonder how many people know that the Firefox video implementation is almost entirely due to just a few people in the Mozilla office in Newmarket --- Chris Double, Matthew Gregan, Chris Pearce, and to a lesser extent, me. (Justin Dolske did the controls UI, but I'm not sure where he lives!) I'm proud that we managed to get considerably more done for the 3.5 release than I expected.

It's a great privilege to have the opportunity to really shake up the world for the better, with a very small team, in a relatively small amount of time, and do it right here in Auckland. I'm very thankful.

Chris Pearce, Chris Double and Matthew Gregan

Posted by roc at 10:21 PM | Comments (10)

July 1, 2009

Progress

I've submitted all of my compositor phase 1 patches for review. The patches are also published here. 111 files changed, 2573 insertions, 1448 deletions, divided into 39 separate patches over multiple Bugzilla bugs. In theory every one of those steps should build and pass tests, although I haven't actually verified that for all the patches. I managed to break things up pretty well --- the largest patch is only 536 insertions and 75 deletions --- so hopefully that will make reviewing easier.

MQ has been working pretty well for me. I get into a routine of applying all the patches, doing some testing, fixing a number of bugs, and then redistributing the changes across the patches that they logically belong to. I'm not 100% sure this is the most efficient way to work --- sometimes I burn quite a bit of time putting all the changes in just the right places --- but at least it's now possible.

Now it's time to start working on something else. My immediate next task is to restructure the media tests so we can generalize tests across file types and backends; for example, right now we have one set of seeking tests for Ogg and another for Wave, but we should just have a single set of tests parametrized by test files of different types.

After that, I plan to do some cleanup that's enabled by compositor phase 1. In particular, we can move scrolling out of the view system and integrate it all directly into the scrollframes in layout.

After that I plan to work on compositor phase 2. Right now in Gecko whenever something needs to be repainted we make platform-level invalidation requests, the platforrm dispatches paint events and we paint. This often leads to over-frequent painting. For example if there's a script changing the DOM 100 times a second, we'll try to paint 100 times a second if we can keep up, which is a waste of time since most screens only refresh at 60Hz or so. Even worse, if you have that script, and an animated image painting 20 times a second, and a video playing at 25 frames per second, we invalidate them all independently and if your computer is fast enough we'll paint 145 times a second. We need to fix this, and I have a plan.

Part of that plan is to create an internal animation API that various Gecko components (animated images, video, smooth scrolling, SMIL, CSS Transitions, etc) can plug into. But we also recognize that declarative animations will never be expressive enough for all use cases, and there's a lot of existing scripted animation libraries out there, so I have an idea for tying in scripted animations as well. Basically, I would expose the following API:

  1. window.mozRequestAnimationFrame(): Signals that an animation is in progress, and requests that the browser schedule a repaint of the window for the next animation frame, if the window is visible.
  2. The browser will fire a mozBeforePaint event at the window before we repaint it. Animation libraries should register an event handler that checks the current time, and updates the DOM/CSS state to show that point in the animation. If the animation has not ended, the event handler should call window.mozRequestAnimationFrame() again to ensure another frame will be scheduled in a timely manner.
That's it! That API gives the browser control over the frame rate, while allowing JS to do anything it wants in each frame.

Posted by roc at 5:46 PM | Comments (5)

June 24, 2009

Native Widgets Begone

Recently I've been working removing the use of native widgets (Windows HWNDs, Cocoa NSViews, GdkWindows) in various places in Gecko. Native widgets get in the way of painting and event handling, and break things due to various platform limitations.

The key difficulty of getting rid of native widgets is ensuring we still handle windowed-mode plugins adequately. For example, currently if a windowed-mode plugin is contained in an overflow:auto element we rely on a native widget associated with the element to scroll and clip the plugin. In situations where we should clip but there is no native widget present, like 'clip' on an absolutely positioned element, we actually don't currently clip the plugin, which is pretty bad.

My new strategy for clipping plugins is to explicitly compute a clip region for each plugin and use native platform APIs to clip the plugin's widget to that region. On Windows we can use SetWindowRgn, on X we can use the XShape extension via gdk_window_shape_combine_region. On Mac we don't have to do either of these things, because plugins work a bit differently there. One issue of course is figuring out what the clip region should be. For that, I'm reusing our display list machinery. We build a display list for the region of the page that contains plugins, and then traverse the display list to locate each visible windowed-mode plugin --- its position, and what clipping is affecting it. Plugins that aren't in the display list aren't visible so we hide them. This works really well, and means that everywhere we clip Web content, plugins are automatically affected as well.

A nice bonus is that display lists contain information about which elements are opaque. We already have the ability to determine that part of an element is covered by opaque content and thus doesn't need to be drawn. We can leverage this for plugins too: areas of a plugin we know are covered by opaque content (e.g. a DIV with a solid background color) are clipped out. This means that all the crazy hacks people have been using to get content to appear above a windowed-mode plugin today (e.g., positioned overflow:auto elements and IFRAMEs) will no longer be needed with Gecko; you can just position a DIV or table with a solid background color. The hacks still work of course, as long as the hacked element has a solid background. (Credit where it's due: IIRC clipping holes for opaque Web content out of windowed plugins was first suggested by a KHTML developer long ago.)

This also fixes relative z-ordering of plugins with other plugins and Web content. We don't have to touch native widget z-ordering because windowed-mode plugins are opaque. So if plugin A is supposed to be drawn in front of plugin B, but its widget is behind plugin B's widget, then we effectively punch a hole in plugin B's widget so you can see plugin A through it. It sounds a little crazy but it works just fine.

Perhaps the greatest challenge here for us is getting scrolling to work smoothly on all platforms. In general we may have an IFRAME or overflow:auto element in the page, which contains one or more plugins which need to move smoothly with the scrolled content. A naive implementation would just bitblit the Web content and then move and clip the plugin widgets, creating a variety of exciting and unwanted visual artifacts. I designed a new cross-platform scrolling API which is quite rich; you pass in the rectangle you want to scroll, the amount you want to scroll it, and a list of commands to reconfigure child widgets that should be performed with the scroll "as atomically as possible". These commands let the caller change the position, size and clip region of some or all child widgets. This gives the platform-specific code great freedom to reorder operations. I described some of the implementation issues on X in a previous post.

I think I've got things working pretty well. My patches pass tests on all platforms on the try servers. There are 24 patches in my patch queue, many of them small preparatory cleanup patches. The big important patches enable region-based plugin clipping, modify a ton of code to remove the assumption that every document has a native widget as its root, actually remove the native widgets associated with content IFRAMEs, and remove the native widgets used for scrolling and implement the new scrolling system.

In order to not change too much at once, I've left some things undone. In particular "chrome documents" like the Firefox UI still use native widgets in quite a few places. There's a lot of followup simplifications that we can do, like eliminate the "MozDrawingArea" abstraction that we currently use to support guffaw scrolling. In fact, apart from fixing a ton of plugin and visual bugs, the main point of this work is to enable further changes.

For people who are interested, I've got some try-server builds available for testing. I'm particularly interested in IME and accessibility testing, since although the code passes tests I suspect external IME and AT code may be making assumptions about our native widget hierarchy that I'm breaking. General scrolling and plugin performance is also worth looking at.

Posted by roc at 6:12 PM | Comments (6)

Whitespace Begone

I just landed on trunk a small improvement to page-load performance. Typical HTML documents contain a lot of markup like

<body>
  <div>
    <b>Hello World</b>
  </div>
</body>

In the DOM, there are four text nodes which contain only whitespace. Up until now, in Gecko we created one "text frame" to render each text node. However, this is a waste, since whitespace-only text nodes before or after a block boundary don't affect the rendering; they are collapsed away (unless CSS white-space:pre or certain other unusual conditions apply).

So I did some measurements of our page-load-performance test suite (basically, versions of the front pages of popular Web sites) with an instrumented Gecko build to see how much could be saved if we avoid creating frames for those whitespace DOM nodes. It turns out that in those pages, 65% of all whitespace-only text frames were adjacent to a block boundary (i.e., the first child of a block, the last child of a block, or the next or previous sibling is not inline) and don't need a frame. That's 30% of all text frames, and 11% of all layout frames!

Actually suppressing the frames was pretty easy to implement --- it helped that Boris Zbarsky has recently done some magnificent hacking to make nsCSSFrameConstructor more sane. Handling dynamic changes correctly was a little tricky and actually took most of the development time. Our test suite, which is pretty huge now, caught some subtle bugs. But the patch is on trunk and seems to be a 1-2% improvement in average page load time. It's also a small reduction in memory usage. It also makes dumps of the layout frame tree much easier to read because a lot of pointless frames are gone.

What's probably more important than all that is that this makes it easier to separate our implementation of blocks into "blocks that contain only blocks" and "blocks (possibly anonymous) that contain only inlines" without regressing performance, because we won't have to create anonymous blocks just to contain irrelevant whitespace.

Posted by roc at 5:32 PM | Comments (14)

June 23, 2009

DirectShow And Platform Media Frameworks

People keep asking why we don't integrate support for Windows' DirectShow into Firefox so the <video> element can play any media that the user has a DirectShow codec for. Even if a volunteer produces a patch, I would not want to ship it in Firefox in the near future; let me try to explain why.

  1. Probably most important: we want to focus our energy on promoting open unencumbered codecs at this time.
  2. Only a very small fraction of Windows users have a DirectShow codec for the most important encumbered codec, H.264. Windows 7 will be the first version of Windows to ship with H.264 by default. Even if millions of people have downloaded H.264 codecs themselves, that's a very small fraction of our users.
  3. DirectShow is underspecified and codecs are of highly variable quality. Many codecs probably will not work with Web sites that use all the rich APIs of <video>, and those bugs will be filed against us. We probably will not be able to fix them. (Note that the problem is bad enough that in Windows 7, Microsoft isn't even going to allow unknown third parties to install DirectShow codecs.) We could avoid some of this problem by white-listing codecs, but then a lot of the people who want DirectShow support wouldn't be satisfied.
  4. Many DirectShow codecs are actually malware. ("Download codec XYZ to play free porn!")
  5. DirectShow codecs are quite likely to have security holes. As those holes are uncovered, we will have to track the issues and often our only possible response will be to blacklist insecure codecs, since we can't fix them ourselves. If we blacklist enough codecs, DirectShow support becomes worthless.
  6. Each new video backend creates additional maintenance headaches as we evolve our internal video code.

So even if we didn't care about promoting unencumbered codecs and someone gave us a working patch, shipping DirectShow support in Firefox is of limited value and creates tons of maintenance work for us.

Some (but not all) of the same arguments apply to using Quicktime. But if we're going to ship our own video infrastructure for Windows, we save ourselves a lot of trouble by using that infrastructure across all platforms. It saves authors a lot of trouble too, due to less variability across Firefox versions.

Currently no browser or browser plugin vendor is using codec implementations they don't control. Apple does allow third-party codecs to be used with Quicktime in Safari --- but at least they control the framework and the important codecs, and apparently they have made some changes for Web video.

Posted by roc at 11:39 AM | Comments (17)

June 18, 2009

The Price Of Freedom

With the imminent release of Firefox 3.5 and the big step forward for unencumbered video and audio that this represents, there's been a lot of discussion about the merits of the free Ogg codecs vs the flagship encumbered codecs, especially H.264. Proponents of the encumbered codecs tend to focus on the technical advantages of patented techniques used in H.264 but not in Theora. But the real question that matters is this: at comparable bit rates, in real-world situations, do normal people perceive a significant quality advantage for H.264 over Theora? Because if they don't, theoretical technical advantages are worthless.

(It's not helpful to ask a video compression expert if they can see quality differences. Of course they can, they're trained to. That doesn't tell you what matters for the other 99.9999% of the population.)

One issue that has made a comparison difficult is that encoders are so tunable. People always say things like "oh, your H.264 would look much better if you set the J-frame quantization flux capacitor rank to 8". So Greg Maxwell had a stroke of genius and did a comparison using Youtube to do the encoding. Now, people who think the H.264 encoding is sub-par are placed in the difficult position of arguing that Youtube's engineers don't know how to encode H.264 properly.

In fact Youtube deliberately doesn't squeeze the most out of H.264 encoding. That's because in real life there are tradeoffs to consider beyond just quality and bit-rate, such as encoding cost and bit-rate smoothing. But that's fine, because these are the real-life situations we care about.

Maik Merten recently repeated Greg's experiment with different video in larger formats. In these tests, it seems pretty clear that there is no real advantage for H.264, or even that Theora is doing better.

We really need someone to do a scientific blind test with a wide pool of subjects and video clips (hello academic researchers!). But H.264 proponents have to demonstrate real-world advantages that justify surrendering software freedoms and submitting to MPEG-LA client and server license fees (not to mention the hassle of actually negotiating licenses and ensuring ongoing compliance).

Posted by roc at 12:21 PM | Comments (34)

June 15, 2009

Stupid X Tricks

Platforms like X and Win32 support trees of native windows. For example, an application might have a "top-level" window containing child windows representing controls such as buttons. In browsers, child windows are required for many kinds of plugins (e.g., most uses of Flash). Managing these child windows is quite tricky; they add performance overhead and the native platform's support for event handling, z-ordering, clipping, graphical effects, etc, often is not a good match for the behaviours required for Web content, especially as the Web platform evolves. So I'm working hard to get rid of the use of child windows and I've made a lot of progress --- more about that later. However, we're stuck with using child windows for plugins, and recently I've been grappling with one of the hardest problems involving child windows: scrolling.

It's very hard to make scrolling smooth in the presence of child windows for plugins, when all other child windows have been eliminated. In general you want to scroll (using accelerated blitting) some sub-rectangle of the document's window (e.g., an overflow:auto element), and some of the plugin windows will move while others won't. You may want to change the clip region of some of the plugins to reflect the fact that they're being clipped to the bounds of the overflow:auto element. The big problem is that to avoid flicker you want to move the widgets, change their clip regions, and blit the contents of the scrolled area in one atomic operation. If these things happen separately the window is likely to display artifacts as it passes through undesired intermediate states. For example, if you blit the scrolled area first, then move the plugins, the plugin will appear to slightly lag behind as you scroll the window.

On Windows, ScrollWindowEx with the SW_SCROLLCHILDREN flag gets you a long way. But on X, the situation is dire. X simply has no API to scroll the contents of a window including child windows! Toolkits like GTK resort to heroic efforts such as guffaw scrolling to achieve the effect, but those techniques don't let you scroll a sub-rectangle of a window, only the entire window. So for Gecko I have had to come up with something new.

Basically I want to use XCopyArea and have it copy the contents of the window *and* the contents of some of the child windows in the scrolled area, and then move the child windows into their new locations and change their clip regions (I'm using XShape for this) without doing any extra repainting. I tried a few things that didn't work, before I hit a solution:

  1. Hide (unmap) all child windows that are going to be moved during the scroll operation
  2. Do the XCopyArea
  3. Set the clip region and position of all relevant child windows
  4. Show (map) all child windows that we hid in step 1

By hiding the child windows during the XCopyArea, we trick the X server into treating the pixels currently on-screen for them as part of the parent window, so they are copied. By moving and setting the clip region while each child window is hidden, we also avoid artifacts due to a child window briefly appearing outside the area it's supposed to be clipped to. It *does* mean that when we scroll a window containing plugins, expose events are generated to repaint the contents of the plugins' child windows, so scrolling is slower than it could be. We might be able to avoid that by enabling backing store for the plugin windows.

It's somewhat tricky to implement this. We have to compute the desired child window locations and clip regions, store the results in a collection, and pass that collection into our platform window abstraction's Scroll() API so that the scrolling and widget configuration can happen as a unified operation. But I've done it, and it works great on my Ubuntu VM. I'm not 100% sure that it will work well on all X servers and configurations; that's something we'll need to get good testing of.

I do wonder why X has never felt the need to grow a genuinely useful scrolling API. Oh well.

Posted by roc at 3:12 PM | Comments (8)