Intended Audience

The aim here is to give a high-level overview of our effort, with enough detail that developers and other automation/QA engineers can become involved at some level, too.


(Before, or at least in addition to, reading this, it makes sense to read Andrew Halberstadt's post, "A Tired Developer's non-Illustrated Primer to B2G Testing.")

Known to many at Mozilla, but certainly not all, is that Web QA has been leading the Gaia UI Tests automation effort on real devices, in support of both the B2G QA team and development at large.

The original intent and driving force was automating the daily QA smoketest suite, which is run manually on each build. Although that's still squarely in our sights, we're no longer limiting our automation targets to just those pre-identified tests -- we've added, and continue to add, critical-path tests, as well as areas that are highly prone to regression or that just need more coverage. We're using the Marionette framework to drive these tests.

It's definitely been a long road, so let me chronicle (at a high level) what it's taken to get here, and where "here" is, too.

Our Infrastructure/Architecture - Continuous Integration Jobs

It probably makes sense to talk about what we test or provide infrastructure to test, first.

Below is a screenshot of all our (Web QA + A-team) B2G jobs running in our Jenkins instance.

The test suites you see listed there (click on the image to view it at full resolution) are as follows:

  1. *.download.pvtbuilds, *.download.pvtbuilds.flashzip, and *.download.releases are build-download jobs feeding UI testing, update testing, and performance testing, respectively
  2. *.master.mozperftest is a new prototype of performance-testing job
  3. *.master.perf is a "regular" perf job
  4. *.master.perf.fps intends to measure scrolling performance (frames per second)
  5. *.master.ui runs Web QA's UI-testing suite against the master branch of Gaia
  6. *.master.ui.xfail are the tests with known failures, which we run periodically and restore to the master.ui job when passing/known good
  7. *.v1-train.perf are the performance tests for the v1-train branch
  8. *.v1-train.run_adhoc_tests -- we use this job to plug in pull requests and single out tests we'd like to run in isolation, for instance
  9. *.v1-train.ui is our suite of Gaia UI tests, run against the v1-train branch
  10. *.v1-train.xfail is our xfailed tests -- ones with known failures
  11. *.update.tests will be running Marshall's update tests (again, native Unagi)

Hardware/Software Stack

We've scaled out to a currently supported configuration of 7 Mac Minis/7 Unagis; the pairing is intended :-)

Originally, we made do with two -- one pair for running the UI tests, and the other for running performance tests; we then added additional branches, needed more capacity to be able to run concurrent builds, and also added more performance test-suites too.

Each Unagi is connected to a single Mac Mini that's been formatted and configured to run Ubuntu 12.04 LTS; each Mini also runs a Jenkins slave agent and uses adb port-forwarding to/from the Unagi.

Because we do real, on-device tests (dialer, SMS, etc.), each Unagi (sans the two that run the performance tests exclusively) has a SIM card -- for each of those, we maintain a JSON file (in a private GitHub repo) which contains all the necessary info the test runner needs to run tests on that device. Our Gaia UI Tests README has the details.
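For illustration, such a per-device file is plain JSON along these lines (the field names here are hypothetical -- the authoritative schema is in the Gaia UI Tests README):

```json
{
    "carrier": "AT&T",
    "phone_number": "15551234567",
    "imei": "990000862471854",
    "remote_phone_number": "15557654321"
}
```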

How We Run Gaia UI Tests (v1-train, and over-simplified)

Now, I'd like to zero-in on the main job Web QA supports, which for the foreseeable future is v1-train.ui.

  • First, a job in Jenkins automatically grabs the latest daily Unagi engineering build
  • Next, we do a make reset-gaia, and check out and build the v1-train branch from the Gaia repo on GitHub
  • A metric ton of Python-package installs happen next inside a virtualenv
  • Then we push the Gaia profile (setting needed preferences) and pull in the JSON file that's specific to that Unagi (its phone #, IMEI #, etc.)
  • Finally, we listen on the right adb ports and start sending commands to gaiatest to run all tests marked for inclusion, from one primary manifest.ini file, which itself imports from others
  • Also part of the manifest is a list of potential attributes that identify device types and capabilities (e.g. bluetooth, camera, wifi, lan, panda for Panda boards), so we can run tests against the appropriate environments
  • To keep our tests isolated from potential other problems, we restart the B2G process in-between tests (takes ~ 18 seconds)
  • In addition to test results, we get a full logcat for the run, as well as screenshots for failures
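Stitched together, the job's shell steps look roughly like the sketch below (paths, file names, and the exact gaiatest invocation are illustrative; tcp:2828 is the port Marionette conventionally listens on):

```
# forward Marionette's port between the Mac Mini and the attached Unagi
adb forward tcp:2828 tcp:2828

# install the harness and its Python packages into a clean virtualenv
virtualenv env && . env/bin/activate
pip install -r requirements.txt

# run every test the primary manifest marks for inclusion,
# feeding in the device-specific JSON variables
gaiatest --address=localhost:2828 --testvars=unagi-03.json tests/manifest.ini
```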


As of this post, we have 85 total tests, consisting of 67 UI tests, and 18 unit tests (which make sure we're setting ourselves up for success in the later run).

We launch apps, test the keyboard, take photos and videos, make and receive calls, send and receive SMS messages, check IMEI numbers, play music, tune the radio app, make calendar appointments, install and delete apps from the Firefox Marketplace, and much, much more.
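At heart, each of those is an ordinary Python test driven through Marionette. The sketch below shows the general shape only -- the class and method names are illustrative, not the real gaiatest API -- and it uses a stand-in client object so the example runs anywhere:

```python
import unittest

class FakeMarionette:
    """Stand-in for a real Marionette client; for illustration only."""
    def __init__(self):
        self.launched = []

    def launch_app(self, name):
        # A real client would instruct B2G to launch the app on-device.
        self.launched.append(name)

class TestLaunchMessages(unittest.TestCase):
    def setUp(self):
        # gaiatest's real base class wires up a device session here.
        self.marionette = FakeMarionette()

    def test_launch_messages_app(self):
        self.marionette.launch_app("Messages")
        self.assertIn("Messages", self.marionette.launched)
```

A real test would go on to find elements in the app's frame and assert on their state, exactly as a Selenium test would against a website.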

There are limitations, though, as both Gaia and Marionette are under very active development -- for the latter, we're tracking that in bug 801898 - Get GaiaTest UI smoke tests running reliably in Jenkins CI.

Get Involved

As I mentioned, we try to map as closely as possible (unless limited by bugs or not-yet-implemented features in Marionette, which we track) to the QA smoketests, but we've also added a plethora of high-risk/need-coverage areas in our GitHub Issues -- take a look, and reach out to us in #mozwebqa or comment in an issue if you're interested in taking one!

Other Blog Posts on the UI Tests

In a previous post, I wrote about Sauce Labs, and detailed how their Selenium-in-the-cloud service ("OnDemand") helped Web QA--particularly yours truly--quickly make sense out of a perceived performance regression, with really nice video support. It turns out that we keep having reasons to look to them for additional capacity and capabilities. And, thanks to a great working business relationship, they're helping us meet those needs.

The problem: as stated in our Q2 goals, our team has to "Plan for and support a high-quality launch of". The goal goes on, stating that we'll support desktop on a variety of browsers, as well as mobile.

Our list of supported, in-house browser and operating-system environments is here: Mac, Windows 7, and Windows Vista. Until recently, we didn't have in-house Opera or Chrome support, so calling out to Sauce for those browsers has been really beneficial. (When we launch a new site, we definitely cover all major browsers, manually, but due to our Grid infrastructure, the need to run frequent tests, and the maintenance cost, we try to keep it to the essentials; other teams have different, more-stringent browser requirements, so it was necessary to add in both Chrome and Opera to our automation.)

The solution: It's often been said that companies should "focus on their core competencies" (I'll admit there are counter-points aplenty), and it's a view I happen to subscribe to, especially given the demands on our team. To that end, instead of planning for, provisioning, and then maintaining a morass of browser environments on a plethora of operating systems, we increased our scope and frequency of use with Sauce Labs: we cover what we reasonably can in-house at present, while immediately spinning up test coverage on Sauce -- making our developers and the Identity/Services QA team we support very, very happy.

The here and now: as of this moment, we've been able to bring up 42 test jobs against (dev/beta) and (dev/beta), covering the most-critical paths of login and logout; we're closing in on new-user registration and changing passwords, too.

The future: as our testing needs increase, so, likely, will our usage of Sauce, and it'll be as easy as cloning a job in Jenkins and changing a single parameter, thanks to the built-in flexibility that Dave Hunt's pytest-mozwebqa plugin provides.

So, thanks again, Sauce; just another example of helping the Selenium [1] and Open Source communities, and, in particular, Mozilla!

[1] Sauce Labs powers the critical Selenium-core, continuous-integration tests for the project:

While we support many Mozilla projects on the Web QA team, I'd like to highlight one that I've been working on for the last couple releases now: Mozilla Reps.

We'll do a testday on it soon; I just wanted to encourage anyone interested to start looking at it now :-)

How can you help? Like all Mozilla projects, there's always a list of fixed bugs/features to verify, as well as a good amount of exploratory testing. For examples of the kinds of things we'd test for, check out a couple blog posts by yours truly: part 1, part 2.

What makes testing this particular site a bit unique is that Mozilla Reps is run by a community committee, so most of the functionality requires being a bona fide member; for testing purposes, we can bypass the usual application form by making you a "vetted 'Reps' member" (though, of course, we encourage you to apply for real!)

Your best first step is to visit the team in #remo-dev on, and let anyone in the channel know that you're interested in helping test!

Here are some more general resources to help get you familiar:

* Mozilla Reps wiki page, for general information/an overview
* Mozilla Reps "Website" page, for specific website-release information
* a list of open bugs, so you don't file duplicates
* all Resolved FIXED bugs needing verification

Eventually--I'm told--the plan is to incorporate elements (or the site wholesale) into Mozillians, so look for that in the future!

And, lastly, a link back to the Web QA homepage, in case you have any further questions, about this or any web-testing project(s) @ Mozilla!

In addition to co-hosting/sponsoring many Selenium Meetups, we've been really grateful to have a generous and accommodating business relationship with Sauce Labs, which allows us to augment and complement our own custom-built Selenium Grid testing infrastructure with theirs. For our most recent Socorro release, the whole team (dev + QA) and I really came to appreciate one feature in particular -- video -- for two reasons.

To get you familiar with Mozilla's Selenium Grid setup, let me back up just a second; Dave Hunt has previously highlighted his pytest-mozwebqa plugin (still in beta) for our WebQA team, which--among all the many other awesome things it does--allows us to very easily use Sauce Labs' API to integrate with their OnDemand service.

Inspired by the wealth of build-reporting information we get from Sauce, Dave set out some time ago to incorporate some of it--particularly capturing screenshots--as well as writing out HTML reports that link us to (if the job was Sauce-run) a Sauce Labs job #, complete with video for each test. We looked at incorporating video into our own infrastructure (as did David Burns in "Lights! Camera! Action! Video Recording of Selenium"), but soon realized that--given all the other challenges of shipping, on average, 7 websites per week, and dealing with our scale of 6 Mac Minis, archiving, disk space, etc.--it wasn't something we wanted to pursue, for now.

For this particular Socorro release, we thought we had--immediately after we deployed--a huge performance regression. See bug 718218 for the gory details, and all the fun that ensued (over the weekend!).

(Click on the image to view the test run.)

Typically, video is a "nice to have" feature for us, as our team is quite familiar with debugging failing Selenium tests locally, and our HTML reports and screenshots go a long way, too; for this release, though, I was playing back-up for Matt Brandt, and, as the requisite pointy-hair, couldn't code or run a Selenium test to save my life (not quite true, but close!). When I began investigating why our Socorro test suite suddenly ran ~20 minutes longer than it ever conceivably should, I discovered the culprit was one of the search tests, test_that_filter_for_browser_results, which we later realized had an erroneous |while| clause that never evaluated to |true|.
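That failure mode -- a polling loop whose exit condition may never be met -- is worth guarding against explicitly. Here's a minimal sketch (a hypothetical helper, not the actual test's code) of a wait with a hard deadline:

```python
import time

def wait_until(condition, timeout=10, interval=0.5):
    """Poll `condition` until it returns a truthy value, or raise once
    `timeout` seconds have elapsed.

    A bare `while not condition(): ...` with no deadline is exactly the
    bug described above: if the condition never becomes true, the test
    spins until some outer limit (or a human) kills it.
    """
    deadline = time.time() + timeout
    while not condition():
        if time.time() > deadline:
            raise TimeoutError("condition not met within %ss" % timeout)
        time.sleep(interval)
    return True
```

A test would then call, say, `wait_until(lambda: results_filtered(), timeout=30)` and fail fast, with a clear error, instead of hanging.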

While I first looked at our own Grid instance, trying to watch (via VNC) our browsers running this particular test, Grid's distributed test-execution nature meant it never predictably ran on the same node twice. (In retrospect, I could've used pytest's keyword (-k) flag from a job in Jenkins to run the suspect test in isolation, but I digress.) I needed a way to quickly show our dev team the individual test, without having them or me set up a test environment and fire off that test. Luckily, since Dave made our Sauce Labs jobs public by default, all it took was a single run of the job via Sauce, finding the individual test, copying the link -- which has nice annotated timestamps for each Selenium command -- and a quick paste in #breakpad.

Earlier, I mentioned that their video support is great for two reasons: in addition to the above (the ability to view archived videos from past-run jobs), we also took advantage of the ability to watch the individual test--as it ran in real-time--and correlate the currently-issued Selenium command with the hold-up/"perf regression."

We quickly isolated it to the aforementioned bogus |while| loop, commented it and the accompanying assert out, and certified the release after more careful checking. Because it was the weekend, and folks were available at different times, once each member popped onto IRC, we simply showed them the "before and after" videos from Sauce Labs, and the team regained its confidence in the push.

So, thanks, Sauce, for such a useful feature, even if it's (thankfully) not necessary most of the time!

Mozilla, and WebQA in particular, looks forward to continuing to work with Sauce Labs and the Selenium project, in exploring ways in which we can work together to help further and bolster Selenium and Firefox compatibility.

(Title get your attention? Good! Keep reading...)

My awesome coworker Matt Brandt has highlighted the great work and contributions that Rajeev Bharshetty has been making for, and with, Mozilla WebQA.

I'd like to take this opportunity to call out another great contributor to (and with) WebQA: Sergey Tupchiy, whom we nominated for Friend of the Tree (along with Rajeev) in this past Monday's meeting.


Sergey has been instrumental: he helped us turn around a Q4 goal of having test templates custom-tailored for brand-new Engagement/Marketing websites (and which, quite frankly, has applications and implications for our standard test templates going forward).

His flurry of awesome GitHub activity should be self-evident :-)

In 2012, and kicking it off this quarter, Mozilla WebQA (and the rest of Mozilla QA) will be taking a hard look at, and hopefully making many inroads into, a better community on-boarding story; we're blessed to have the great contributors we already do, but need to grow and scale to meet Mozilla's ever-increasing challenges.

We're looking for contributors of all expertise areas, skill levels, and project interests; if you're interested, or just want to find out more, please contact us via whichever method below works best for you, and we'll happily help you get started!

IRC: #mozwebqa on
QMO homepage:
Team Wiki:

(And a huge thanks, again, to Sergey, and other awesome contributors like Rajeev!)

I won't embed it here, but here's the Mozilla Memory interview on how I became involved with Mozilla: I'm glad I was forced to write it all down -- memories I wouldn't want to forget :-)

A huge thanks to Ken Albers for capturing it so accurately!

Catchy title, no? Well, you're reading this...

Today, our legacy (and pretty comprehensive) Selenium tests for the Mozilla Add-ons website (both suites written in Python) were forced into retirement; the reason: our previously-blogged-about AMO redesign--code-named "Impala"--went live for the homepage and the add-on detail pages.

New AMO homepage

Tests, of yore:

To be sure, the tests served us well for the past 14 months: when originally written, they covered nearly everything about the end-user experience (except for downloading an add-on, which we couldn't do with Selenium's original API). While not an exhaustive list by any means, we had tests for:

  • navigation, including breadcrumbs, the header/footer

  • locale/language picker options

  • switching application types (Firefox, SeaMonkey, Thunderbird, Sunbird, Mobile)

  • categories (appropriate for each application type)

  • login/logout

  • sorting and presence/correct DOM structure of add-ons (featured, most popular, top-rated)

  • reviews (posting, author attribution, cross-checking star ratings, as we found bugs in updating these across the various site aspects, via a cron)

  • collections (creation, tagging, deletion, etc.)

  • personas (again, different sort criteria were also covered)

  • search (popular, top-rated add-ons, substrings, negative tests, presence of personas/collections in search results, properly, etc.)

  • proper inclusion of Webtrends tagging

  • ?src= attributes on download links, to help ensure we didn't break our metrics/download-source tracking

  • ...and much, much more

...and then the testcase maintenance started:

  • we went through a few small redesigns, here and there (one of which was the header/footer, which for a brief time period we shared with, around the launch of Firefox 4)

  • we changed where and when categories showed up, and their styling

  • we kept running into edge-cases with individual add-ons' metadata (titles, various attributes -- some valid bugs, some that our tests' routines just couldn't anticipate or deal well with, like Unicode)

  • A/B testing hit us, and we had to deal with sporadic failures we couldn't easily work around

  • ...and so many more aggravating failures, and subsequently, difficult refactoring attempts (most of which were abandoned)

Lessons learned (the hard way):

  • don't try to test everything, or even close to everything -- you really have to factor in a lot of time for testcase maintenance, even if your tests are well-written

  • keep tests as small as possible - test one thing at a time

  • by the same token, don't try to iterate over all data, unless you're trying to ascertain which cases to cover -- it'll just cost you test-execution time and --again-- testcase maintenance (eventually, you'll either cut down the test iterations, or you'll give up and rewrite the test)

  • separate your positive and negative tests: reading long-winded if/else branches or crazy, unnecessary loops when you're already annoyed at a failing test just makes it worse

  • don't couple things that really shouldn't have an impact on each other; for instance, don't repeat search tests while logged in and logged out, unless you have a good reason

  • same goes for user types: while it's seemingly noble to try to cover different user types, don't test everything again with those -- just the cases where functionality differs

  • safeguard your tests against slow, problematic staging servers, but don't go so far as to mask problems with large timeout values

  • fail early in your tests -- if you have a problem early on, bail/assert ASAP, with useful error messages; don't keep executing a bunch of tests that will just fail spectacularly

  • go Page Object Model [1] or go home, if you're writing tests for a website with many common, shared elements
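To make that last point concrete, the Page Object Model boils down to two rules: locators and page behavior live in one place, and tests speak only in page-level actions. A stripped-down sketch follows (the names are illustrative, not our actual base classes), with a stub driver so it's self-contained:

```python
class StubDriver:
    """Stand-in for a Selenium driver, so the sketch runs without a browser."""
    def __init__(self, texts):
        self._texts = texts
    def get(self, url):
        self.url = url
    def text_of(self, locator):
        return self._texts[locator]

class Page:
    def __init__(self, driver):
        self.driver = driver

class HomePage(Page):
    _header_locator = "css=#header h1"   # locators live on the page object...

    def go_to(self, base_url):
        self.driver.get(base_url + "/")
        return self

    @property
    def header_text(self):               # ...and tests only call page methods
        return self.driver.text_of(self._header_locator)

# A test now reads as intent, not as a pile of locators:
driver = StubDriver({"css=#header h1": "Add-ons for Firefox"})
home = HomePage(driver).go_to("https://example.org")
assert home.header_text == "Add-ons for Firefox"
```

When the header's markup changes, only `_header_locator` needs updating -- every test that uses `HomePage` keeps working untouched.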

What's next?

  • a strong, well-designed replacement suite we've been running in parallel:

  • an eye toward Continuous Deployment (thanks for the inspiration, Etsy!), including, but not limited to:

  • better system-wide monitoring, via Graphite

  • to that end, much closer collaboration with our awesome dev team on unit vs. Selenium-level coverage balance


Ironically, at the same time our legacy test suites started failing, so did a large portion of our rewrite (which began before the Impala redesign officially landed); the beauty, though, is that it's immeasurably easier to fix and extend this time around, and we aim to keep it that way.

[1] (Huge shout out to Wade Catron, from LinkedIn, who runs a tight Ruby-based test framework, and who helped us develop a great initial POM, which we've refined over the past months; video, below):

It's been quite a while since I blogged last, and I've made it a goal this quarter to resume blogging, because it helps me focus my thoughts, convey some gleanings, and, in general, I hope to grow the awesome community we have.

So, what *have* we done, since my last post? Let's directly address the "Wish List" I last posted:

1. Again, clearly, we've got to fix the staging-server issues

Input and AMO are slower and less reliable than the rest, still, but we have a new staging server up soon for Input, and AMO's performance is constantly monitored and improved upon.

2. Continue to maintain legacy tests when sensible (i.e. the maintenance benefits outweigh the fix costs)

We're still maintaining our "legacy" AMO tests, which provide us a tremendous amount of coverage, even today. Now, however, they are starting to show their rigidity, as we can't easily update them to accommodate A/B testing or the rewrite of search using Elasticsearch. Instead, we're focusing our efforts on amo.rewrite, our new suite.

3. As much as possible, outright rewrite the more fragile, huge tests (I'm looking at you, into smaller ones, and remove them wholesale from the old suite

We haven't really removed the old AMO tests -- many of which are still providing great coverage. We continue to build out existing AMO tests, and one piece we're missing from the old suite is the dynamic, data-driven tests (which tend to be more fragile).

4. Continue to augment our production smoketest suites for all projects, replacing legacy, slower-running tests with the newer, faster ones

We trimmed the SUMO tests down quite a bit, and have done the same with AMO, too. And both are far more reliable than they were a few months ago.

5. Figure out the trade-off between coverage and time-to-run, especially for smoketests in production, and BFTs in staging (and, ensure all projects have this nice hierarchy, and that it works well for each)

We're actively working through this, especially as we transition to more of a continuous-deployment model.

6. Socially-engineer developers to recognize/fix failures that our test-suites catch :-)

Not quite there, yet; but the culture is catching on, and we continue the discussion with projects that don't always unit test.

7. Build-out our Selenium Grid's hub, for more capacity

Not only did we switch over to Selenium Grid 2, but we also added Aurora and Beta.

8. Switch over existing tests to Selenium 2 where it makes sense, and write new ones in it by default

We're investigating this with our extended automation team, and Dave Hunt has already written both QMO and tests in Selenium 2/Webdriver.

9. Switch from Hudson over to Jenkins (because new butlers are always better)

Already done; and we always upgrade to the latest.

10. Mobile...?

Not there yet -- also looking into it, and David Burns has a working prototype already.

We're continuing to figure out our shared continuous-deployment strategy with AMO and SUMO, and are already shipping directly from the master branch straight to production, with various features that aren't yet ready to be activated, hidden behind feature flags, using Waffle.
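Stripped of the Django specifics, the feature-flag pattern Waffle gives us is just a guarded branch around dark code (the sketch below is a generic illustration of the pattern, not Waffle's actual API):

```python
class FlagStore:
    """A minimal feature-flag store; Waffle backs the real one with the DB."""
    def __init__(self):
        self._flags = {}
    def set(self, name, active):
        self._flags[name] = active
    def is_active(self, name):
        return self._flags.get(name, False)  # unknown flags default to off

flags = FlagStore()
flags.set("new-search", False)  # code ships to production, feature stays dark

def search(query):
    if flags.is_active("new-search"):
        return "elastic:" + query   # new path, hidden until the flag flips
    return "sphinx:" + query        # current production path

assert search("firefox") == "sphinx:firefox"
flags.set("new-search", True)       # flip the flag; no redeploy needed
assert search("firefox") == "elastic:firefox"
```

The payoff is exactly what the post describes: master can ship straight to production on every merge, while unfinished features wait behind flags.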

We welcome feedback, and of course any and all help: find out more about what we do at our QMO page, or find us on IRC, in #mozwebqa, on

I'm happy to announce that Mozilla and Sauce Labs will be holding the San Francisco Selenium Meetup here at Mozilla's HQ in downtown Mountain View, CA, on May 11, 6:30 pm.

Google Maps to Mozilla HQ.

Here's the official page: Please RSVP there, so we know how many to expect; thanks!

(Excerpting from the Meetup entry for better indexing/searchability)

"We're thrilled to announce that, hot on the heels of his CSS locators talk at the Selenium Conference, Santiago Suarez Ordonez, the Sauce Ninja and Selenium committer, will be the speaker of our next meetup on May 11th at Mozilla.

Santiago, or 'Santi', will give some insight into why you should say NO to XPath, and instead use CSS locators to make your tests more readable, faster, and more reliable.

He gave an abridged version of this talk in Track B of the Selenium Conference, and it proved to be wildly popular, so he'll be extending it and also sharing some cool tools that will help you transition from XPath to CSS.

Thanks to our friends at Mozilla for offering to co-host this meetup. See you all in a few weeks!


6:30pm - Drinks, Pizza, Networking

7:15pm - Announcements

7:30pm - Santi tells us about CSS Locators

8:15pm - Q&A

8:30pm - Tools to help transition from XPath to CSS

9:15pm - Lights Out"

Mozilla is a proud supporter/user of, and contributor to, Selenium, and we're really looking forward to the opportunity to learn from, and exchange ideas and tips with, the community!

I gave a talk at the London Selenium Meetup back in November of last year, and presented there on “How Mozilla Uses Selenium”; it’s been four months since, and I’ve been wanting to give an update on what we’ve done since then, where we’re going, and how we’re going to get there, so here goes.

Where were we?

It’s easiest for me to talk about progress and challenges per-project, so with that, let’s talk about AMO. Our inaugural tests gave, and still give, us a _ton_ of coverage; the problem we’re now facing is two-fold: we’re seeing more frequent 500 Internal Server Errors on our staging box (which we’ll hopefully address soon after we ship Firefox 4), and it’s in fact a problem tickled by tests (through no fault of their own) that do quite a bit of looping over a lot of data (, for instance). Looking at our commit logs, you can see we’re skipping certain add-ons, adding timeouts, adding retries, etc. -- really, we’re spending a lot of time fixing classes of tests rather than individual tests (mostly because the individual tests are huge). We've even imported the Python module urllib to check server status before launching our tests, as well as sprinkled setTimeout(120000) calls in an attempt to ameliorate the server-down/timeout issues.

Where are we now? What have we done, since?

Some of our current test suites, particularly in AMO, take over an hour to run (again, they're thorough, I'm not faulting them for that); that's simply too long to wait for a test to finish. Coupled with the fact that they're prone to suffering staging-server failures, too, that's a lot of time to wait for a green.

Moving forward, based on David's style guide and test templates, we'll be writing tests in the "new" Page Object Model approach; I say "new" because it's already quite well-known in the Selenium world (thanks, Wade!). The tests, as David has started them, live here: These same new tests also use pytest, and can (and do) execute in parallel, resulting in much faster completion times.

I should mention, as well, that the ever-awesome Dave Hunt has centralized our Selenium Grid config on GitHub, which will help us stabilize it going forward.

Where do we want to go?

Hugely inspired (with a healthy sense of anxiety, as in "how do we get there?") by James Socol's "The future of SUMO development" post, along with our weekly releases in AMO, SUMO and, we recognize we, too, obviously need to be a lot more nimble. That means we've got to spend less time fixing our tests ("now, where do I find the locator for this element, again?", "where in this huge test is _my_ add-on failing?", etc.), and more time writing new ones and making the ones we have more robust.

How are we going to get there?

Here are just a few of my thoughts:

  • Again, clearly, we've got to fix the staging-server issues
  • Continue to maintain legacy tests when sensible (i.e. the maintenance benefits outweigh the fix costs)
  • As much as possible, outright rewrite the more fragile, huge tests (I'm looking at you, into smaller ones, and remove them wholesale from the old suite
  • Continue to augment our production smoketest suites for all projects, replacing legacy, slower-running tests with the newer, faster ones
  • Figure out the trade-off between coverage and time-to-run, especially for smoketests in production, and BFTs in staging (and, ensure all projects have this nice hierarchy, and that it works well for each)
  • Socially-engineer developers to recognize/fix failures that our test-suites catch :-)
  • Build-out our Selenium Grid's hub, for more capacity
  • Switch over existing tests to Selenium 2 where it makes sense, and write new ones in it by default
  • Switch from Hudson over to Jenkins (because new butlers are always better)
  • Mobile...?

We certainly have a long, long way to go, and not enough resources at present to get there anytime soon, but we're working on it, and this summer, in particular, should prove to be a very fruitful time (watershed moment, in fact).

In "Hey, Selenium. Hurry The Hell Up," Sean Grove from Sauce Labs makes a great general point:

"Write your tests to only test a single tiny bit of functionality.
Keep the number of steps in your tests under 20. All the parallelism
in the world won’t save you from long-running tests, because a single test can’t be parallelized.
The best written tests we’ve seen generally run in under 10 seconds."

In general, I agree with that; there are real-world cases, though, where longer, "endurance" or "longevity" tests, as I've heard them referred to, are necessary, or just plain helpful.

As such an example: we recently migrated our Firefox Support website (SUMO) from our San Jose datacenter to a much better build-out, infrastructure-wise, in our new Phoenix datacenter. Because we have a pretty good (though certainly not comprehensive) set of non-volatile tests we run in production pre/during/post-push, we simply edited the /etc/hosts files on our Grid environments to point at the staged SUMO instance in Phoenix, and ran our tests tagged "prod" against that host while load-balancers, master and slave DBs, etc. were configured and enabled.
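The override itself is a single line per hostname in each node's hosts file (the IP below is made up):

```
# /etc/hosts on each Grid node: point SUMO at the staged PHX cluster
63.245.0.10   support.mozilla.com
```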

Our tests kept passing -- logins were working, search was eventually fine (though we had to re-configure Sphinx a few times), Knowledge Base articles were zippier than ever, etc.

We didn't, apparently, do enough manual testing around one thing, though: sessions. Yes, account creation was fine, logins were working, posting was fine, etc., through manual testing, but one thing we noticed and eventually figured out was that user sessions sporadically weren't being maintained for more than, I think, 15 seconds; other times, they were fine.

(iirc, the problem was that a Zeus VIP was pointing at our SJC colo to re-use sessions, which were being created in PHX, most of the time. jsocol will undoubtedly offer a correction if I'm wrong.)

In our postmortem, we noticed that our Selenium tests -- while testing each piece of functionality just fine -- weren't set up to verify and assert that user sessions lasted a certain length of time, since each was designed to be small and independent, and, of course, because they're Selenium tests, each gets a completely new session every time tearDown() is called.

The beauty of Selenium is that, unlike unittest or other frameworks that use mock browsers rather than the real thing, it offers true end-to-end, in-environment testing; to that end, Vishal wrote test_forum_post_on_prod to ensure that in future roll-outs (and in production generally), sessions are maintained properly. This time, the seemingly arbitrary session limit was 15 seconds, but his new test, even on a fast production cluster, should run closer to a minute, giving us more confidence that sessions are maintained.
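The shape of such an endurance test is simple: establish a session, then keep re-asserting it past the suspect threshold. Below is a hedged sketch with a fake session object so it runs standalone (the real test drives a browser against production):

```python
import time

class FakeSession:
    """Stand-in for a logged-in browser session; expires after `ttl` seconds."""
    def __init__(self, ttl):
        self._expires = time.time() + ttl

    def is_logged_in(self):
        return time.time() < self._expires

def assert_session_survives(session, duration=60, interval=5):
    """Re-check the session repeatedly until `duration` seconds have passed."""
    deadline = time.time() + duration
    while time.time() < deadline:
        assert session.is_logged_in(), "session dropped before the deadline"
        time.sleep(interval)

# A healthy session outlives the whole check window:
assert_session_survives(FakeSession(ttl=5.0), duration=0.5, interval=0.1)
```

The key difference from a normal functional test is the explicit `duration`: instead of finishing as fast as possible, this test deliberately outlasts the threshold it's guarding against.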

Again, this doesn't negate the "just the essentials for each function" test approach wisely advocated by Sean and others, it just offers an additional viewpoint and use-case for how Selenium might be used; we rely on our production smoketest/sanity suites for our largest projects, for good reason -- and now we have another one: we know that users' sessions are doing fine at a push of the button, and continuously, in our Hudson (soon to be Jenkins) runs.

This is one of those things that if I over-think, I'll never get around to posting, so here goes: the current WebQA "Contribute" page isn't the best it could be.

So, what would *you* change about it? If you're a contributing community member, what about the page (if anything) helped draw you/retain you?

I've started taking notes and making an outline, but am not happy with the lack of a "quickstart" guide, which I tried to do with the current page. It's hard to get the balance of the wealth of information I have to share with just enough to get folks started and encourage them to branch out later.

Please do leave feedback; I'll really take it to heart as I work through the Contribute redesign -- it's really important to increase visibility and get more people looking at and helping to shape our web apps.


Building off the momentum from the SUMO 2.1 release, which converted the old Tiki-Wiki-based Contributors/Off Topic/Knowledge Base forums to Python (code-named Kitsune), we’re redesigning (and reimplementing from scratch) the more prominent and feature-rich Firefox Support forums.

It's starting to take its shape in our 2.2 milestone.

As is always the case with Mozilla projects, we value and need your input and help in reporting issues/feedback along the way; how can you contribute?

Simple: take a look at the mockups (and their discussion) in bug 556287 and bug 556285, and start testing on, taking note of the open bugs for unimplemented feature work and known issues.

As you can see from the above mockup, we're introducing more crowd-sourcing, placing solutions more up-front and marking them better, and just cleaning things up and optimizing both the question-asking and answering workflows. A lot remains to be finished, however the SUMO team is working as quickly as they can to wrap up a ton of features in the coming couple weeks.

I'm excited to see what they can do with your help! Although our previous 2.1 release took a little time getting out of the gate, it's a solid release, and we're still keeping largely on track with the development timeline.

If you're looking for testing tips, drop by our Mozilla Web QA IRC channel, #mozwebqa, or, if you already know what you're doing, please file bugs in the SUMO product, Forum component.

I’ve alluded to (in small pieces) the automation work we’re doing here in Mozilla Web QA, but I’ve been meaning to give a more thorough overview (rather than merely a progress report) of what we’re doing, and how we’re doing it.

What are we doing?

For the top three Mozilla websites (AMO, SUMO,, we’re automating tests for key functionality; tests that, were they to break, would indicate serious regressions -- things that would block a release.

How are we doing it?

We’re using a few open-source tools (it’s in our DNA) to accomplish this (I’ll explain each in more detail):

Selenium RC provides us with the Selenium Core runtime, and the abstraction layer that our tests (written in Python) run through.

Selenium Grid is responsible for distributed scheduling and executing of Selenium RC-run tests; it generally does a good job of firing up and tearing down Selenium Remote Control instances (there are occasions, however, when a browser hangs, and a manual restart is required on our end).

Hudson is our continuous-integration server, and interfaces with Selenium Grid by picking up changes from our test repositories in SVN (like a cron job, it polls for additions on a schedule), issuing commands to Selenium Grid to start new builds, and a whole host of other important things, such as providing pass/fail status for builds, histograms of results over time, etc.

Python is our language of choice for many reasons, not the least of which is that our awesome WebDev team has largely standardized on it for their own unittests (they also use Hudson, and are the reason we are, too). Also, in addition to the Python language being robust, it has a tremendous community (and the amazing Django framework, which WebDev also uses for both AMO and SUMO development, in the Zamboni and Kitsune projects, respectively).

When a build passes, its light goes green in Hudson; when it fails, we see red, and we get both email and IRC notifications sent. For failures, we get the Python traceback (which comes to us directly from Selenium), and we’re usually able to pretty quickly troubleshoot and either fix the problematic/incorrect test, or file a bug on development (as appropriate); maintaining tests can be a big part of automation, so it’s important to write them to be both flexible and granular enough, which is always a balancing act.

Where are we today, with regards to automated test coverage?

AMO - 25 tests; it might not sound like much, but it actually covers quite a bit of functionality, as most of the tests come pretty close to covering a particular feature (e.g. Personas, category landing pages, etc.). The AMO tests are modular, too; they use:

SUMO - 61 tests; like its big brother (AMO), SUMO has started down the path of the same setup.

- 3 tests; we’re working on ensuring that the download buttons and browser-specific redirects (Opera/Safari/Chrome, IE, current Firefox version, old Firefox versions, etc.) work.

While a few of the AMO tests were converted from our old Selenium IDE tests, most were written from scratch; in SUMO’s case, most were converted from the IDE and then cleaned up/fixed. We’re still ironing out and refining our framework and test setup(s), as well as continually sharing best practices from the automation projects; if you’re interested in helping out by writing Python tests, please take a look at the projects and dive in. We’re reachable at

Soon, I’ll be rewriting our Contribute page to better organize, solicit, and engage test-automation efforts (as well as end-user testing). Stay tuned!


A huge thanks to everyone who attended, "How Mozilla and LinkedIn use Selenium: Open Source Customized," here @ Mozilla HQ on May 19th, and, especially, to Sauce Labs (esp. you, John) for driving and co-sponsoring it, and Wade from LinkedIn, who was an excellent co-presenter.

There were around 120 folks there, and it was really wonderful to have so many Selenium-interested people sharing real-world tips and asking intriguing questions.

See you all again, next Meetup!

Photo gallery:

Raymond's slides from the presentation

San Francisco Selenium Meetup group page

This is an *actual* URL from Vignette:

I'm sure that behind those obscure-looking alpha-numeric strings there's a rhyme or reason (hopefully?), but that's not really what I wanted to blog about today :-)

What I did in fact hope to convey today is how to try "breaking" (for various definitions of "breaking") web applications' internal logic (or output) by manipulating their URLs.

Spurred on by our resident web-applications-security guru, Michael Coates, I've been applying some really simple approaches. I'll walk you through just four of them, here (not picking on SUMO, I swear!)

1. Add/remove delimiters/trailing slashes:

Let's take a URL like as an example. Obviously, we make sure that and both return a valid homepage (that's almost a given, but you never know).

The more-interesting test is to add junk after the first trailing slash (which we now know to work), like so: Currently, when you do that on our new version of SUMO, you're redirected to http://en-us/forums

Filed bug 566106.

2. Put Unicode/non-ASCII text where it wasn't originally intended:

Prior to a fix, SUMO didn't know what to do with Unicode as a value in its "&tags=" parameter:

Paul Craciunoiu found and filed bug 564385.

3. Input non-expected/invalid values for parameters:*Finding+your+Firefox

The above is a long URL, but if you notice, the second parameter is "a=a". The original problem was that SUMO was expecting the value of "a=" to be an integer. So, of course, I fed it an alphabetical character, "a".

The result of which can be seen in bug 565857.

4. Try changing the locale code:

If your web app likes its URLs in a certain format, such as "en-US", try simply omitting the "-US", or "US", leaving, respectively, "en" and "en-".

As a real-world example, both the former, and the latter, redirect us to, which is graceful.

These are, of course, just four really simple examples, but they highlight how quickly and easily testers can begin to help ensure your app won't crash/hang/do something evil with bad data; there are a myriad of other ways, and the URLs in the location bar aren't the only target: try using add-ons such as Tamper Data, or Live HTTP Headers (which I've already blogged about for its primary use), to change sent values (on GETs/POSTs, etc.), especially when submitting forms.

Once you know how to manipulate _one_ parameter (regardless of whether you know the expected value, or, maybe in spite of that), you can find some gems.

Feel free to get pathological, too, just be mindful that some constructs are just too heinous for words, and a developer is likely to cast you the stink eye if it's too wild.
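The four techniques above lend themselves to a little scripting, too; here's a rough Python 3 sketch (the base URL and the variant list are just illustrations, not a real tool we use) that emits abusive URL variants for a tester to walk through by hand:

```python
from urllib.parse import urlsplit, urlunsplit

def url_variants(url):
    """Generate simple 'abuse' variants of a URL: trailing-slash junk,
    Unicode parameter values, wrong-typed values, and mangled locales."""
    scheme, netloc, path, query, frag = urlsplit(url)
    variants = []
    # 1. add/remove delimiters and trailing slashes
    variants.append(urlunsplit((scheme, netloc, path.rstrip('/'), query, frag)))
    variants.append(urlunsplit((scheme, netloc, path + '//junk', query, frag)))
    # 2. Unicode where ASCII was probably expected
    variants.append(url + '?tags=%C3%BC%C3%B1%C3%AF')
    # 3. a wrong-typed value (alphabetic where an integer is expected)
    variants.append(url + '?a=a')
    # 4. mangled locale codes: en-US -> en and en-
    for loc in ('/en', '/en-'):
        variants.append(urlunsplit(
            (scheme, netloc, path.replace('/en-US', loc, 1), query, frag)))
    return variants

for v in url_variants('http://support.mozilla.com/en-US/kb/'):
    print(v)
```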

Check this out; confidence-inspiring, no?

stephen-donners-macbook-pro:smokeTests stephendonner$ python 
ERROR: test_searchapi (__main__.SearchAPI)
Traceback (most recent call last):
  File "", line 65, in tearDown
    self.assertEqual([], self.verificationErrors)
AssertionError: [] != ['type:firebug,dict']

Ran 1 test in 8.395s

FAILED (errors=1)
stephen-donners-macbook-pro:smokeTests stephendonner$ python
Ran 1 test in 6.174s


Sometimes, we return 33 results, and yet at others, we return 0.

Here's what the tests are actually doing:

  "/en-US/firefox/api/1.2/search/firebug type:dict")
            retVar = sel.is_element_present("//searchresults[@total_results=\"0\"]")
            if retVar == False:
                raise AssertionError               
        except AssertionError, e:
            self.verificationErrors.append(str(e) +'type:firebug,dict')
   "/en-US/firefox/api/1.2/search/firebug type:extension")
             if ("Firebug" != sel.get_text("//name")):
                raise AssertionError
        except AssertionError, e:
            self.verificationErrors.append(str(e) +'type:firebug,extension')


I'll be filing a bug shortly.

(This post assumes its readership already has a Python environment set up, as well as Selenium RC; a quick Googling should provide ample help on getting both installed/configured.)

It's easy to convert Selenium tests written in its native "Selenese" language (through the IDE) over to, say, an object-oriented programming language like Python, but not always so easy to fix them, especially since in OOP there are a bunch of different ways to trap/report errors.

I'm only going to cover the basics, here, because my Python skills aren't quite up to snuff (and it might take some time, given my other responsibilities); this should, however, give you a good idea of the syntax changes and, more importantly, encourage those of you who might be interested in taking on such an endeavor.

Let's get started; this assumes you've already written a Selenese-based test through the IDE. If you haven't, you should first go through their tutorials, or just take a look at some of our IDE tests to get a better idea.

I'll use searchapi.html as our example (written by the ever-intrepid Dave Dash). Fire up Selenium IDE, open the file, and then do Options | Format -> Python - Selenium RC.


Your IDE's output should now look like the following:


Now, just copy and paste that into your favorite text editor/IDE/VIM/whatever; as you can see from this line:

self.selenium = selenium("localhost", 4444, "*chrome", "http://change-this-to-the-site-you-are-testing/")

we're not ready to run the test just yet (for those of you already familiar with the IDE, that's the baseURL you'll find in the HEAD); so, it should be this:

self.selenium = selenium("localhost", 4444, "*chrome", "")

  • Save the test
  • Launch your RC (java -jar selenium-server.jar from the selenium-server-1.0.3 folder)
  • Drop into a terminal/console, and do: python

You'll get this error:

ERROR: test_searchapi (__main__.searchapi)
Traceback (most recent call last):
File "", line 76, in tearDown
self.assertEqual([], self.verificationErrors)
AssertionError: [] != ["'YSlow' != u'SenSEO'"]

FAIL: test_searchapi (__main__.searchapi)
Traceback (most recent call last):
File "", line 32, in test_searchapi

Ran 1 test in 20.134s

FAILED (failures=1, errors=1)

Obviously, all the tests were passing when they were first written, but to fix these, we now need to do a couple things:

  • Manually run the search API URLs and compare the expected output with the real data (since it changes)
  • Fix our tests not to fatally assert (and stop running the remaining tests) on such outdated data, but propagate the error to the console so we can run them all and fix individual failures

Most of the failing tests are because we open a URL and assert that there should be *no results*, e.g. self.failUnless(sel.is_element_present("//searchresults[@total_results=\"0\"]")); this won't hold true now, since new metadata (such as keywords, descriptions, summaries) changes, so we should either be more explicit about what we're searching for, or more lenient. In reality, a smart combination of both is needed, so Vishal came up with this:

try:
    retVar = sel.is_element_present("//searchresults[@total_results=\"0\"]")
    if retVar == False:
        raise AssertionError
except AssertionError, e:
    self.verificationErrors.append(str(e) + 'apps,yslow')

Line by line, it:

  • opens the API URL
  • stores the value of |true| or |false|, depending on whether the total results is indeed 0 or not, and if so,
  • raises an assertion (but doesn't fatally die)
  • adds the assertion to the verificationErrors array and adds "apps,yslow" to the assertion string so we know which add-on failed

If we change all of the calls to that format, the complete test will run (though still spew a ton of failures); the next step is to figure out what the right tests are and remove/edit them; Vishal's steps will at least let us run the whole test and see all errors at the same time:

ERROR: test_searchapi (__main__.SearchAPI)
Traceback (most recent call last):
  File "", line 183, in tearDown
    self.assertEqual([], self.verificationErrors)
AssertionError: [] != ['type:firebug,dict', 'platform,linux', 'limit, firebug, all', 'mobile filtering']

Ran 1 test in 45.996s
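As an aside, the same soft-assert idea can be factored into a tiny helper so each check is a single call; this standalone sketch (the class and method names are my own, and plain booleans stand in for the Selenium calls, so it runs anywhere) mirrors the verificationErrors pattern:

```python
class SoftChecker:
    """Collect failures instead of aborting on the first assertion."""
    def __init__(self):
        self.verificationErrors = []

    def check(self, condition, label):
        # Record the failure, tagged so we know which case it was, and go on.
        if not condition:
            self.verificationErrors.append(label)

checker = SoftChecker()
checker.check(True, 'sanity')                                  # passes; records nothing
checker.check('Firebug' == 'SenSEO', 'type:firebug,extension') # fails; recorded
checker.check(False, 'apps,yslow')                             # fails; recorded
# At tearDown time, a single assertEqual([], ...) surfaces every failure at once:
print(checker.verificationErrors)
```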

Now, on to fix these tests; until next time...

While I'm always learning more about HTTP (albeit slowly), and I've only got a paltry 7 [1] or so of its status codes memorized, HTTP is part of my day job as a tester on the Web QA team here at Mozilla, so I've found tools, such as Live HTTP Headers, immensely useful.

HTTP headers (yeah, there are a ton) are essentially the core of HTTP requests, except for the payload--the actual content. (I hope I'm getting all this right!) Headers are primarily useful for debugging, but they can also be informational; let's take a look at some AMO headers and see just how.

Here's a URL that every Firefox build has pre-bookmarked, by default:

If you load that URL, you'll notice that you end up at, instead; what happened?

If you install Live HTTP Headers, and invoke it via Tools | Live HTTP Headers, you'll see a "Headers" tab. Notice that the "Capture" checkbox is enabled by default; if you have GMail or some other AJAXy sites in the background, you're probably going to want to close them while you capture headers (or risk drowning in information overload).


Since it's already shown above, I'll snip much of the header info, but with Live HTTP Headers capturing, and Firefox loading, you should see this request:

GET /en-US/firefox/bookmarks/ HTTP/1.1

... (followed by a bunch of headers)

HTTP/1.1 302 Found

Server: Apache

X-Backend-Server: pm-app-amo11

Content-Type: text/html; charset=UTF-8

Date: Tue, 20 Apr 2010 06:44:20 GMT


Let's break it down, line by line:

  1. HTTP/1.1 302 Found - We're using the 1.1 version of the HTTP protocol, and the server responded to our request for by telling us--via the 302 status code--that the resource has moved (if it were in its usual place, it would return a 200 OK).
  2. Server: Apache - pretty self-explanatory; for whatever reason, we're not echoing out the version, or mods (some sites go crazy and tell you most everything: Apache/2.2.11 (Unix) mod_ssl/2.2.11 OpenSSL/0.9.8i DAV/2 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/
  3. X-Backend-Server: pm-app-amo11 - this is a custom header that tells WebDev, IT, and QA which of our many AMO servers served up this particular request; if we ever have a problem with a particular instance (it happens), we can quickly pinpoint it to, say, outdated templates, borked Memcached data, or, perhaps, the server itself has connectivity issues or a failing cron job, etc.
  4. Date: Tue, 20 Apr 2010 06:44:20 GMT - self-explanatory (but useful to know how the server's clock is set -- again, rogue cron jobs could be in play if a system clock is off).
  5. Location: - the meat of the response; this is to where the server redirects your browser.
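Incidentally, you can practice reading response heads like this one without a browser at all; here's a small sketch that parses the captured text above (pasted as a string, so nothing hits the network):

```python
# The response head captured above, with CRLF line endings as on the wire.
raw = """HTTP/1.1 302 Found\r
Server: Apache\r
X-Backend-Server: pm-app-amo11\r
Content-Type: text/html; charset=UTF-8\r
Date: Tue, 20 Apr 2010 06:44:20 GMT\r
\r
"""

def parse_response_head(text):
    """Split a raw HTTP response head into (status_code, headers dict)."""
    lines = text.split('\r\n')
    status_code = int(lines[0].split(' ')[1])  # "HTTP/1.1 302 Found" -> 302
    headers = {}
    for line in lines[1:]:
        if not line:
            break  # a blank line ends the header block
        name, _, value = line.partition(': ')
        headers[name] = value
    return status_code, headers

code, headers = parse_response_head(raw)
print(code, headers['X-Backend-Server'])
```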

This post is already a little "long in the tooth," but for what else have we used this tool?

  • To diagnose caching problems with our web-caching/load-balancing infrastructure (Zeus, Netscaler), since we output the caching server's hostname/number via X headers
  • Or, even, to determine which files are getting cached; we use EdgeCast on and, so along with View | Page Source, in the case where we use a special subdomain, such as, to serve images, HTML, JavaScript, and CSS, we sometimes have to rely on headers, when resources are aliased/remapped.
  • To determine whether files are getting sent as the right content type; I've used this to file bugs on JSON files that weren't the right content type.
  • To help test against CSRF vulnerabilities, by ensuring that only valid tokens are accepted (you can manipulate data using Live HTTP Headers, but I usually use another excellent add-on, TamperData, for that).
  • Sometimes we put the version of PHP in there, too: X-Powered-By: PHP/5.2.9; useful if you're dealing with a cluster of servers, and not all have the same version of PHP (or, collectively, they're behind a version or two -- it happens).

If you're interested in web testing, or are just interested in learning more about the web and how it works in general, you might want to take a further look at HTTP, Live HTTP Headers and status codes; you'll be amazed how much you can learn pretty quickly.

And, if you do have questions, or are interested in learning more, stop by our IRC channel and say "Hi" (contact info here); we always appreciate web testers! We have many ways you can contribute, so don't be shy!

[1] Status codes I see frequently: 200, 301, 302, 304, 401, 403, 500

Marcis G has translated my article into Belarusian; thanks, Marcis!

We're in the process of switching to a new (and improved) Sphinx-based search engine on SUMO (, and would *love* your help ensuring it meets your needs.

There are a number of ways you can help:

* ad-hoc test search, here: for now, but probably by the time you read this, we'll have merged back into trunk:
* Test/verify Resolved FIXED 1.5 Search bugs
* convert |in-litmus?| and/or |in-testsuite?| flagged bugs into testcases (manual and automated via Selenium, respectively)

(New to Litmus and/or Selenium? No problem! The team contact info is up at our Web Testing homepage; if you're familiar with IRC, it's best to reach us at #sumodev on

To familiarize yourself with what's already been found, here's our open 1.5 buglist.

Short list of things to play around with:

* proper nouns, e.g. "Facebook"
* common phrases, "Firefox crashes" (I'm talking to *you* 3.5!)
* word stems, e.g. "fail" in "failing"
* named articles, with/without quotes: "Cannot clear location bar history"
* misspellings (which tests the "Did you mean" feature)
* URLs in Forum posts and KB articles
* ensure that the results returned include Forum and KB (Knowledge Base) URLs, when appropriate (if it includes both, KB first, followed by Forum results)

In particular, we could use help with native-language searches other than en-US (though, to be sure, we could use help there too).

* Tildes
* Umlauts
* Kanji
* Aramaic


Please file bugs here:

The goal is to code-freeze on November 24th and ship on December 3rd.


We now have |in-litmus| flags for bugs needing testcases in both the SUMO ( and AMO ( projects; flag away!

To use: simply flag as |in-litmus?| if you're unsure whether a test exists; we'll triage and either add a comment (preferably with a Litmus testcase ID #), and/or flag it as + or -, depending on whether a test exists or is not needed, respectively.

(Sprinkle liberally!)

Since I test What's New and First Run pages, I thought it'd be useful to write down which page appears when.

I've put it up at our web testing wiki, and here it is, below:

* Freshly installed Firefox, you get:
o First Run (left tab focused/active)
o Firefox start page on (right tab)
* 3.0.x -> 3.5.x, you get:
o First Run (left tab)
o What's New (right tab focused/active)
* 3.0.x-1 (next-to-latest version) -> 3.0.x (latest version), you get:
o Firefox start page on (left tab)
o What's New page (right focused/active)
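Written out as a quick sketch, the rules above reduce to a small decision function; the simplified version strings and tuple return are my own shorthand, purely for illustration:

```python
def startup_tabs(old_version, new_version):
    """Which tabs appear after install/upgrade, per the rules above.
    Returns (left_tab, right_tab, which_tab_is_focused)."""
    if old_version is None:                    # freshly installed Firefox
        return ('First Run', 'Firefox start page', 'left')
    old_major = old_version.rsplit('.', 1)[0]  # e.g. '3.0.14' -> '3.0'
    new_major = new_version.rsplit('.', 1)[0]
    if old_major != new_major:                 # major upgrade, e.g. 3.0.x -> 3.5.x
        return ('First Run', "What's New", 'right')
    return ('Firefox start page', "What's New", 'right')  # minor update

assert startup_tabs(None, '3.5.3') == ('First Run', 'Firefox start page', 'left')
assert startup_tabs('3.0.14', '3.5.3') == ('First Run', "What's New", 'right')
assert startup_tabs('3.0.13', '3.0.14') == ('Firefox start page', "What's New", 'right')
```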

Mozilla web developer Dave Dash has put out a call for help testing AMO's move to a Sphinx-based backend; I second that call, and encourage you to let us know of any new issues not already tracked in bug 498999's dependency tree.

New to testing search? Here's an incomplete list of things to try:

* substrings ("fire" from "firefox")
* named add-ons without quotes (Adblock Plus)
* named add-ons *with* quotes ("Adblock Plus")
* leading/trailing whitespace (" Adblock Plus ")
* broken-up strings ("stumble upon") -- bug 517344
* using SeaMonkey, Thunderbird, Sunbird, Mobile (Fennec)? Help us test those applications' search support too
* localization
* categories
* collections
* sandboxed add-ons
* advanced search options, such as:
** versions
** types
** platforms
** last-updated dates
** sorting views
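Several of these cases can be generated mechanically from an add-on's name; here's a rough sketch (the variant list and example name are just illustrations):

```python
def search_term_variants(name):
    """Simple variants of an add-on name worth feeding the search box."""
    mid = len(name) // 2
    return [
        name,                          # exact name
        '"%s"' % name,                 # with quotes
        ' %s ' % name,                 # leading/trailing whitespace
        name.lower(),                  # case variation
        name[:4],                      # substring, e.g. 'fire' from 'firefox'
        name[:mid] + ' ' + name[mid:]  # broken-up string, a la 'stumble upon'
    ]

for term in search_term_variants('Adblock Plus'):
    print(repr(term))
```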

The preview server, aptly named, is:

Thanks in advance for helping ensure our next version of AMO has a great search experience!

Interesting Selenium |type| bug?


While writing Selenium IDE tests for SUMO (, I've noticed something weird with its input handling.

A normal recording of a login to SUMO looks like this:

open /en-US/kb/Firefox+Support+Home+Page
clickAndWait link=Log In

type login-user username
clickAndWait login-button

The problem with this is (obviously) that it omits issuing:

type login-pass password, and so the next (and successive) commands will all fail.

(Perhaps Firefox itself doesn't propagate the value submitted, to Selenium, for security -- that would make sense.)

Pretty self-explanatory, but: login-user is the name of the username textfield element, login-pass is the name of the password textfield element, and username/password are placeholders for the real thing.

I'm actually learning a decent amount of Selenium by trial and error, but I'd love to hear tips and tricks -- email, comment here, or join us in #mozwebqa on

Tonight we shipped SUMO 1.3 (; full bug list.

We resolved 84 bugs as fixed; some regressions, others features -- a huge release by any measure.

Thanks to the SUMO team and our mozwebqa contributors for their help!

(We're now moving on to SUMO 1.4, which will aim to revamp the forum experience significantly for the better.)

We could use help testing AMO 5.0.9; if you want a lightweight way of helping out, please run any of the tests in our Litmus repository.

If you know enough to file a bug, please do; otherwise, just leave a comment when/if you mark something as failed or unclear.

Additionally, come find the Mozilla Web QA test team on IRC, at #mozwebqa on


- Stephen

In my last blog post, I covered only a few of the page elements at the top of the Mozilla Creative Collective site; here, I'll finish up that page.

Header navigation:

* Does it use JavaScript to replace the image on hover, or CSS?
** If JavaScript, try disabling it, and see what happens
* Ensure that links go to their intended destinations

Post Your Design:

* Does it have a hover effect? If so, see above
* Make sure the post-design page honors logged in status (and encourages you to log in, otherwise)

Latest Design Challenge:
* We should consider whether this is programmed content (and if so, we should test the admin interface to verify we can change it) or whether this is a time-based module (runs a certain promo based on dates)
* What are the expected ALT/TITLE attributes on the tags?

Rotating content block:
* Verify that clicking on each section below the image replaces the content
* Verify that long content is either ellipsized (...) or discouraged from the admin interface
* What are the expected ALT/TITLE attributes on the tags?

Hot Designs:
* This could run off an algorithm (and if so, we should test that when we manipulate the test data, the algorithm adjusts accordingly) or it could be programmable content
** How many likes/favorites does it take for a design to be "hot"?
** Is this block cached in any way? On a cron job? Immediate?
* What are the expected ALT/TITLE attributes on the tags?

Designers you Like:
* Verify that only favorited designers are displayed
** Try favoriting one and come back; does it appear?
** Similarly, un-favorite one and return; does it disappear?
* What are the expected ALT/TITLE attributes on the tags?

Staff News:
* RSS feed displays staff in the same order as they appear on the homepage
* Author name links to their blog/page
* Title links directly to their post
** Try long titles -- do they wrap?

* Copyright date
* Logo links to homepage/page top
* Links work and take you to their intended destination

Of course, the above list isn't comprehensive, but it's a good start; feel free to add comments to the blog entries for anything I've missed or gotten wrong.

Also, if you're interested in helping the Mozilla Web QA team test, please check out our Volunteer page.


I've often wanted to write a blog post or a wiki page explaining the process by which I look at a site (or in this case, a single-page mockup), and start to break down the functionality/elements that I'll be testing.

Let's look at the Creative Collective example, below:

First things first: the first thing I notice is that the site stores and recognizes usernames (or email addresses, depending on how it authenticates its users). I'd immediately expect that the "Join Us" link leads us to an account creation/registration page, wherein users fill in a combination of their real name, username, and/or email address. (Other fields like a short bio, etc. might exist too.)

On an account-creation page, the usual test candidates are:

* username/password length
** Does it accept more than the stated length limitation?
** How about something like 250 characters?
** How about "special characters", such as ":/?#[]@!$&'()*+,;=.<>"?
** Does it accept empty or only-space-character usernames?
* unique username (try to register two that are identical)
* empty password
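Checklists like the one above are easy to turn into data-driven inputs; here's a quick sketch, where the 30-character limit is a made-up example:

```python
def username_test_inputs(max_len=30):
    """Boundary and hostile username candidates for a registration form.
    max_len is a hypothetical stated limit for the field."""
    return [
        'a' * max_len,            # exactly at the limit
        'a' * (max_len + 1),      # one past the limit -- should be rejected
        'a' * 250,                # far past the limit
        ":/?#[]@!$&'()*+,;=.<>",  # special characters
        '',                       # empty
        '   ',                    # only space characters
    ]

for candidate in username_test_inputs():
    print(repr(candidate))
```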

Of course, those are just the individual fields; we also need to ensure that users enter all the required information upfront, and if they don't, that we don't lose what they'd typed; and, more helpfully than just pointing a user to the field with the problem, we should tell them how they can correct it and move on.

And, that's just registration -- we want to be equally helpful to the user on login: which fields are required, what their minimum lengths are, etc. To test, I would:

* try valid username or email address (whichever the site uses) but invalid password
* try valid username/email address without a password
* try invalid username/email address but valid password (etc.)
* test the forgotten-password feature

Login and account-creation pages also should be served through HTTPS, so ensure that the links start and stay on https:// -- we don't want to send users' data "over the wire" in plain text -- that could potentially be read by others. Additionally, we should test that the fields properly escape data (encode it so that invalid/evil input is made safe); we should also test for XSS (cheat sheet here).

After testing this, I'd move on to search functionality. In the mockup above, we have "Search for" (images) * [search textfield]. Let's break it down: we know from "images" that one datatype will be images, but not how our search terms might match that category; because it's a selector, we can also assume there will be other datatypes -- maybe one for "text", and/or "images and text". For text, are we searching on perhaps tags, titles, or authors (each of which could be in the selector)? And whenever we think of search, there are a bunch of use-cases to consider:

* when you click on the pre-filled "search" text, does it disappear, to be replaced by your entered text?
* what should the empty (i.e. default) search experience be? (Should the user be able to search without entering something? Typically, yes.)
* what about substrings? e.g. would searching for "paint" match "painting"?
* do misspelled words trigger a helpful "did you mean?" suggestion (a la Google)
* when you change the "Search for" pulldown, does the textfield clear?
* do your criteria get reset when reloading the page?
* try pasting something huge like the text of the U.S. Constitution -- does the server have to process it in its entirety, and take forever? Does it balk? Throw up an error page, with, perhaps, any internal server-config values/messages? Are there layout issues on the search-results page from such a huge string?
* again, also try XSS here
* try, of course, regular searches :-)

So far, I've really only touched on four components of this site--nay, this page--so you can quickly see how complex (and fun) website testing is.

I'll come back to finish this page towards the end of this week or early next week, so stay tuned!

Feedback/questions welcome at

As always, Mozilla's Web QA team can be found at

I've taken a pretty rough cut at gathering together a document to help those wanting to get started in (or who are just interested in learning more about) MozWebQA (testing) with Mozilla:

Web Testing volunteer page

It currently shares content with the Web Testing homepage intentionally -- I'm still working out the balance between introductory guide and reference.

We would absolutely love your feedback! Feel free to make direct edits (within parentheses), as I'll be editing this quite frequently and will incorporate suggested improvements/corrections, etc.

I'm encouraged by the recent team momentum, and need your help in growing interest and documentation.


Stephen Donner, on behalf of:

Earlier today, I tried to write a Selenium testcase to help ensure that bug 504188 never happens again; the bug was that advanced-search parameters in our AMO 5.0.7 candidate weren't getting set properly.

View source on the following page, and grep for "Linux" -- we want to verify that whole line, basically:

* option 2 is selected
* has "Linux" in its value

In the Selenium IDE, we're checking for:

verifyElementPresent, //div[@id='advanced-search']/fieldset[2]/div[3]/label[@for='pid'], Value="Linux"

But this isn't working -- verifyElementPresent is happy and returns success/true, because it finds the element, and doesn't care about the value in Value=; I *think* verifyAttribute is what we want, instead, but Raymond, Juan, and I tried that tonight without success.

Help is appreciated!
