June 16, 2009

theora video vs. h264

Some of you may have seen some of this on Slashdot or elsewhere online, but I'll bet that many haven't and I think it's pretty important as we get ready to unleash <video> and our unencumbered audio and video codecs and the Ogg container format onto the Web.

Let me preface the links by noting that yes, there's still a long way to go before we can say that Open Video will succeed, but the pieces are coming together and at least some of the chicken-and-egg problems are starting to be solved.

I've been working with Mozilla's Open Video technologies (including the Ogg container format, Theora video encoding, Vorbis audio, and the HTML 5 <video> and <audio> tags and their respective DOM APIs) for about 10 months and I've watched not only Firefox's implementations improve, but also the tools and the codecs themselves make great progress.

Firefox now has a pretty nice set of audio and video controls that are certainly ready for an initial launch. The DOM APIs, while not complete, are good enough that we built our browser controls using them and people are able to do some pretty cool demos and even final implementations with them. The Xiph QT Component, a tool that brings Ogg/Theora+Vorbis to QuickTime-capable apps, has just had another release that brings it up to speed with the Theora 1.0 release and fixes some key bugs. (And another release is expected soon.) Theora 1.1 alpha 2 has made some really big gains in video encoding quality.

And, we're about to ship these capabilities to 300 million Internet users.

So, what about quality? First, it's important to note that when we talk about video on the Web, quality has to be paired with size. So, what kind of quality can you get from Theora in a comparable file size to H.264 -- the latest and greatest of the not-Open video codecs?
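To make "comparable file size" a bit more concrete, here is a minimal Python sketch of the arithmetic involved: turning a file size and a running time into an average bitrate, and checking that two encodes are close enough in size for a quality comparison to be fair. The function names, the 5% tolerance, and the sample numbers are illustrative assumptions only; they are not taken from Greg's or Maik's comparisons.

```python
def average_bitrate_kbps(file_size_bytes: int, duration_seconds: float) -> float:
    """Average bitrate of a clip in kilobits per second (1 kbit = 1000 bits)."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return file_size_bytes * 8 / duration_seconds / 1000


def is_comparable(size_a_bytes: int, size_b_bytes: int, tolerance: float = 0.05) -> bool:
    """True if the two file sizes differ by at most `tolerance` of the larger one.

    The 5% default is an arbitrary, hypothetical threshold chosen for this sketch.
    """
    larger = max(size_a_bytes, size_b_bytes)
    if larger <= 0:
        raise ValueError("file sizes must be positive")
    return abs(size_a_bytes - size_b_bytes) / larger <= tolerance


if __name__ == "__main__":
    # Hypothetical numbers: two 30-second encodes of the same source clip.
    theora_bytes = 1_500_000
    h264_bytes = 1_450_000
    duration = 30.0

    print(average_bitrate_kbps(theora_bytes, duration))
    print(average_bitrate_kbps(h264_bytes, duration))
    print(is_comparable(theora_bytes, h264_bytes))
```

If the two files fail the size check, any difference you see in quality may just be one encoder being handed more bits, so it makes sense to re-encode at a matching size before judging.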

In recent days, Greg Maxwell and Maik Merten have both put up some real-world comparisons that go beyond the geeky sort of synthetic and objective benchmarks that made the rounds last month.

Greg's comparison is here and I think it makes a pretty good case that Theora+Vorbis is solidly ahead of YouTube's H.263 (Sorenson Spark)+MP3 and quite competitive with YouTube's high quality H.264+AAC at resolutions of 400x226 and 480x270 respectively.

I'm a stickler for audio and video quality and even have some formal background in both and I can certainly see some differences.

I think that Theora+Vorbis absolutely trounces H.263+MP3 and I don't think there's even a question of which kind of artifacts you prefer. Theora+Vorbis is just plain better than the majority of what YouTube and many other Flash video sites have been serving to users for years.

When it comes to the H.264+AAC comparison, I think things are a lot closer and I personally prefer the H.264 video. (I couldn't pick a winner in audio but perhaps if I had a better pair of headphones...) The H.264 video isn't miles ahead though, and I'd wager that most people (and this is supported by a few folks I've asked to look at it) either won't see a difference or, if they do see a slight difference, won't be bothered by it.

Not satisfied with those two rather low resolution video comparisons, Maik just posted a comparison at 1280x720 and the results are also quite positive for Theora+Vorbis.

Watching these two videos, it really does seem to me like I'm being asked to pick which kind of artifacts is least bad. Both videos have issues that bother me, and the Theora version doesn't have quite the color saturation and contrast balance of the H.264 version, but they're really not that far apart. Overall, I think I again prefer the H.264 version, but only barely, and the truth is I pay a lot more attention to the subtle differences than most Web users and even many content producers.

Oh, and one final closing thought. The Theora encoder is getting better every day and by the time we've rolled out Open Video to 300-400 million Web users (say, end of the year-ish when most of the Firefox user base will have completed the upgrade to 3.5) I think we're going to have a Theora encoder that will match H.264 for Web content in the eyes of 99.9% of the Web population.

Posted by asa at 8:18 PM