August 11, 2011

So long, farewell, and thanks for all the automation lessons learned (or: o hai beautiful Page Object Model AMO Selenium tests!)

Catchy title, no? Well, you're reading this...

Today, our legacy (and pretty comprehensive) Selenium tests for the Mozilla Add-ons website (both written in Python) were forced into retirement. The reason? Our previously-blogged-about AMO redesign--code-named "Impala"--went live for the homepage and the add-on detail pages.

New AMO homepage

Tests, of yore:

To be sure, the tests served us well for the past 14 months: when originally written, they covered nearly everything about the end-user experience (except for downloading an add-on, which we couldn't do with Selenium's original API). While not an exhaustive list by any means, we had tests for:

  • navigation, including breadcrumbs, the header/footer

  • locale/language picker options

  • switching application types (Firefox, SeaMonkey, Thunderbird, Sunbird, Mobile)

  • categories (appropriate for each application type)

  • login/logout

  • sorting and presence/correct DOM structure of add-ons (featured, most popular, top-rated)

  • reviews (posting, author attribution, and cross-checking star ratings, since we found bugs in how a cron job updated these across the various parts of the site)

  • collections (creation, tagging, deletion, etc.)

  • personas (again, different sort criteria were also covered)

  • search (popular, top-rated add-ons, substrings, negative tests, proper presence of personas/collections in search results, etc.)

  • proper inclusion of Webtrends tagging

  • ?src= attributes on download links, to help ensure we didn't break our metrics/download-source tracking

  • ...and much, much more

...and then the testcase maintenance started:

  • we went through a few small redesigns, here and there (one of which was the header/footer, which for a brief time period we shared with, around the launch of Firefox 4)

  • we changed where and when categories showed up, and their styling

  • we kept running into edge-cases with individual add-ons' metadata (titles, various attributes -- some valid bugs, some that our tests' routines just couldn't anticipate or deal well with, like Unicode)

  • A/B testing hit us, and we had to deal with sporadic failures we couldn't easily work around

  • ...and so many more aggravating failures, and subsequently, difficult refactoring attempts (most of which were abandoned)

Lessons learned (the hard way):

  • don't try to test everything, or even close to everything -- you really have to factor in a lot of time for testcase maintenance, even if your tests are well-written

  • keep tests as small as possible - test one thing at a time

  • by the same token, don't try to iterate over all data, unless you're trying to ascertain which cases to cover -- it'll just cost you test-execution time, and --again-- testcase maintenance (eventually, you'll either cut down the test iterations, or you'll give up and rewrite it)

  • separate your positive and negative tests: reading long-winded if/else branches or crazy, unnecessary loops when you're already annoyed at a failing test just makes it worse

  • don't couple things that really shouldn't have an impact on each other; for instance, don't repeat search tests while logged in and logged out, unless you have a good reason

  • same goes for user types: while it's seemingly noble to try to cover different user types, don't test everything again with those -- just the cases where functionality differs

  • safeguard your tests against slow, problematic staging servers, but don't go so far as to mask problems with large timeout values

  • fail early in your tests -- if you have a problem early on, bail/assert ASAP, with useful error messages; don't keep executing a bunch of tests that will just fail spectacularly

  • go Page Object Model [1] or go home, if you're writing tests for a website with many common, shared elements
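
To make a few of those lessons concrete, here's a minimal Python sketch of the Page Object Model idea, with a bounded wait that fails early and says why. It is not our actual suite: the locators, class names, and the FakeDriver stub are all hypothetical, and the stub only mimics a find_element(by, value) style lookup so the example runs without a browser.

```python
import time


class FakeElement:
    """Stand-in for a browser element (hypothetical, illustration only)."""

    def __init__(self, text):
        self.text = text


class FakeDriver:
    """Tiny stub with a find_element(by, value) lookup; no browser needed."""

    def __init__(self, dom):
        self.dom = dom  # maps CSS selector -> element text

    def find_element(self, by, value):
        if value not in self.dom:
            raise LookupError(f"no element matches {by}={value}")
        return FakeElement(self.dom[value])


def wait_for(condition, timeout=2.0, interval=0.1, message="condition not met"):
    """Poll until condition() is truthy; give up after a modest timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except LookupError:
            pass  # element not there yet; keep polling until the deadline
        if time.monotonic() >= deadline:
            # Fail early, with a message that says what we were waiting for.
            raise AssertionError(f"{message} (waited {timeout}s)")
        time.sleep(interval)


class BasePage:
    """Shared elements (header/footer) live here: one redesign, one fix."""

    _header_title = "#masthead h1"  # hypothetical locator
    timeout = 2.0  # tolerate a slow staging server, but don't hide a dead one

    def __init__(self, driver):
        self.driver = driver

    def _text(self, css):
        element = wait_for(
            lambda: self.driver.find_element("css selector", css),
            timeout=self.timeout,
            message=f"element {css!r} never appeared",
        )
        return element.text

    @property
    def header_title(self):
        return self._text(self._header_title)


class AddonDetailPage(BasePage):
    """Only what is specific to the add-on detail page goes here."""

    _addon_name = "#addon h1.name"  # hypothetical locator

    @property
    def addon_name(self):
        return self._text(self._addon_name)


def test_detail_page_shows_addon_name():
    # One small test, one thing checked; no locators in the test itself.
    driver = FakeDriver({"#masthead h1": "Add-ons", "#addon h1.name": "Example Add-on"})
    page = AddonDetailPage(driver)
    assert page.addon_name == "Example Add-on"
```

The point of the layout: when the header changes, only BasePage needs touching, and the tests read as statements about the page rather than as piles of selectors.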

What's next?

  • a strong, well-designed replacement suite we've been running in parallel:

  • an eye toward Continuous Deployment (thanks for the inspiration, Etsy!), including, but not limited to:

  • better system-wide monitoring, via Graphite

  • to that end, much closer collaboration with our awesome dev team on unit vs. Selenium-level coverage balance


Ironically, at the same time our legacy test suites started failing, so did a large portion of our rewrite (which also began before the Impala redesign officially landed); the beauty, though, is that it's immeasurably easier to fix and extend this time around, and we aim to keep it that way.

[1] (Huge shout out to Wade Catron, from LinkedIn, who runs a tight Ruby-based test framework, and who helped us develop a great initial POM, which we've refined over the past months; video, below):

Posted by stephend at August 11, 2011 6:30 PM