I gave a talk at the London Selenium Meetup back in November of last year, presenting on “How Mozilla Uses Selenium.” It’s been four months since then, and I’ve been wanting to give an update on what we’ve done, where we’re going, and how we’re going to get there. So, here goes.
Where were we?
It’s easiest for me to talk about progress and challenges per-project, so with that, let’s talk about AMO. Our inaugural tests gave, and still give, us a _ton_ of coverage. The problem we’re now facing is two-fold: we’re seeing more frequent 500 Internal Server Errors on our staging box (which we hope to address soon after we ship Firefox 4), and it’s a problem tickled (through no fault of their own) by tests that loop over a lot of data (https://addons.allizom.org/en-US/firefox/extensions/bookmarks/?sort=created, for instance). Looking at our commit logs, you can see we’re skipping certain add-ons, adding timeouts, adding retries, etc. -- really, we’re spending a lot of time fixing classes of tests rather than individual tests (mostly because the individual tests are huge). We’ve even imported the Python urllib module to check server status before launching our tests, and sprinkled setTimeout(120000) calls around in an attempt to ameliorate the server-down/timeout issues.
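For illustration, here’s a minimal sketch of that kind of pre-flight check and retry scaffolding in modern Python 3 (the original suite used Python 2’s urllib; the function names here are hypothetical, not our actual helpers):

```python
import time
from urllib.request import urlopen
from urllib.error import URLError


def server_is_up(url, timeout=10):
    """Pre-flight check: True if the staging server answers with a 200."""
    try:
        return urlopen(url, timeout=timeout).getcode() == 200
    except (URLError, OSError):
        return False


def with_retries(fn, attempts=3, delay=0):
    """Call fn, retrying on any exception up to `attempts` times.

    Catching broadly is deliberate here: this is a band-aid for a flaky
    staging server, not a pattern to be proud of.
    """
    last_exc = None
    for _ in range(attempts):
        try:
            return fn()
        except Exception as exc:
            last_exc = exc
            time.sleep(delay)
    raise last_exc
```

The point, of course, is that all of this is scaffolding around flaky infrastructure, not a real fix.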
Where are we now? What have we done, since?
Some of our current test suites, particularly AMO’s, take over an hour to run (again, they’re thorough, and I’m not faulting them for that); that’s simply too long to wait for a run to finish. Couple that with the fact that they’re prone to staging-server failures, and that’s a lot of time to wait for a green.
Moving forward, based on David's style guide and test templates, we'll be writing tests in the "new" Page Object Model approach; I say "new" because it's actually quite well known in the Selenium world (thanks, Wade!). The tests, as David himself has started them, live here: https://github.com/AutomatedTester/Addon-Tests. These new tests also use pytest, and can (and do) run in parallel, resulting in much faster completion times.
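To illustrate the pattern (with hypothetical locators and class names, not the actual Addon-Tests code): each page gets a class that owns its locators and exposes actions, so tests read as intent rather than raw Selenium calls, and a broken locator gets fixed once, in one place.

```python
class SearchResultsPage:
    """Page Object for the search results page (sketch only)."""

    def __init__(self, selenium):
        self.selenium = selenium


class HomePage:
    """Page Object for the AMO home page: locators live here, in one
    place, not scattered through every test that touches the page."""

    _search_box = "id=search-q"          # hypothetical locator
    _search_button = "id=search-submit"  # hypothetical locator

    def __init__(self, selenium):
        # Works with any object exposing type()/click(), e.g. an RC client
        self.selenium = selenium

    def search_for(self, term):
        self.selenium.type(self._search_box, term)
        self.selenium.click(self._search_button)
        # Actions that navigate return the next page's object
        return SearchResultsPage(self.selenium)
```

A test then reads as `HomePage(selenium).search_for("bookmarks")`, and when a locator changes, only the page object needs updating.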
I should mention, as well, that the ever-awesome Dave Hunt has centralized our Selenium Grid config on GitHub, which will help us stabilize it going forward.
Where do we want to go?
Hugely inspired (with a healthy sense of anxiety, as in "how do we get there?") by James Socol's "The future of SUMO development" post, along with our weekly releases of AMO, SUMO, and Mozilla.com, we recognize that we, too, need to be a lot more nimble. That means spending less time fixing our tests ("now, where do I find the locator for this element, again?", "where in this huge test is _my_ add-on failing?", etc.), and more time writing new ones and making the ones we have more robust.
How are we going to get there?
Here are just a few of my thoughts:
- Again, clearly, we've got to fix the staging-server issues
- Continue to maintain legacy tests where sensible (i.e., where the maintenance benefits outweigh the fix costs)
- As much as possible, outright rewrite the more fragile, huge tests (I'm looking at you, AMO_category_landing_layout.py) into smaller ones, and remove them wholesale from the old suite
- Continue to augment our production smoketest suites for all projects, replacing legacy, slower-running tests with the newer, faster ones
- Figure out the trade-off between coverage and time-to-run, especially for smoketests in production and BFTs in staging (and ensure all projects have this nice hierarchy, and that it works well for each)
- Socially-engineer developers to recognize/fix failures that our test-suites catch :-)
- Build out our Selenium Grid hub, for more capacity
- Switch over existing tests to Selenium 2 where it makes sense, and write new ones in it by default
- Switch from Hudson over to Jenkins (because new butlers are always better)
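On the Selenium 2 point: the big shift is from RC's string-command calls (selenium.click("id=...")) to WebDriver's driver.find_element(by, value), which returns an element you act on directly. Here's a rough sketch of a page object written against that style (the driver below is just a duck-typed stand-in, not a real WebDriver instance, and the locators are hypothetical):

```python
class LoginPage:
    """Page Object targeting the WebDriver-style API: find_element(by,
    value) returns an element with methods like send_keys() and click()."""

    def __init__(self, driver):
        # Anything exposing find_element(by, value) will do here
        self.driver = driver

    def log_in(self, user, password):
        self.driver.find_element("id", "username").send_keys(user)
        self.driver.find_element("id", "password").send_keys(password)
        self.driver.find_element("id", "submit").click()
```

Because the Selenium calls are already funneled through page objects, swapping the RC client for a WebDriver instance underneath is a change to the page objects, not to every test.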
We certainly have a long, long way to go, and not enough resources at present to get there anytime soon, but we're working on it; this summer, in particular, should prove to be a very fruitful time (a watershed moment, in fact).