The aim here is to give a high-level overview of our effort, with enough detail that developers and other automation/QA engineers can become involved at some level, too.
(Before reading this, or at least in addition to it, it makes sense to read Andrew Halberstadt's post, "A Tired Developer's non-Illustrated Primer to B2G Testing.")
Many at Mozilla, but certainly not all, know that Web QA has been leading the Gaia UI Tests automation effort on real devices, in support of both the B2G QA team and development at large.
The original intent and driving force was automating the daily QA smoketest suite, which is run manually on each build. While that's still squarely in our sights, we're no longer limiting our automation targets to just those pre-identified tests: we've added, and continue to add, critical-path tests as well as areas that are highly prone to regression or simply need more coverage. We're using the Marionette framework to drive these tests.
It's definitely been a long road, so let me chronicle (at a high level) what it's taken to get here, and where "here" is.
Our Infrastructure/Architecture - Continuous Integration Jobs
It probably makes sense to talk about what we test or provide infrastructure to test, first.
The test suites you see listed there (click the image to view it at full resolution) are as follows:
- *.download.pvtbuilds, *.download.pvtbuilds.flashzip, and *.download.releases are build-download jobs that feed, respectively, our UI testing, update testing, and performance testing
- *.master.mozperftest is a new prototype of performance-testing job
- *.master.perf is a "regular" perf job
- *.master.perf.fps intends to measure scrolling performance (frames per second)
- *.master.ui runs Web QA's UI-testing suite against the master branch of Gaia
- *.master.ui.xfail are the tests with known failures, which we run periodically and restore to the master.ui job when passing/known good
- *.v1-train.perf are the performance tests for the v1-train branch
- *.v1-train.run_adhoc_tests - we use this job to plug in pull requests and single out tests we'd like to run in isolation, for instance
- *.v1-train.ui is our suite of Gaia UI tests, run against the v1-train branch
- *.v1-train.xfail is our xfailed tests -- ones with known failures
- *.update.tests will be running Marshall's update tests (again, native Unagi)
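The xfail jobs above exist because some tests have known, tracked failures: they keep running, but out of the main job, until they're reliably green again. Our suite marks such tests via manifest annotations rather than decorators, but as a rough stand-alone illustration of the concept, Python's unittest module has an analogous expectedFailure marker:

```python
import unittest


class ExampleSuite(unittest.TestCase):
    def test_known_good(self):
        # A passing test stays in the main job.
        self.assertEqual(2 + 2, 4)

    @unittest.expectedFailure
    def test_known_failure(self):
        # A test with a tracked bug is "xfailed": it still runs, but
        # its failure doesn't turn the run red. Once it starts passing
        # again, the marker is removed and it rejoins the main job.
        self.assertEqual("dialer", "sms")
```

This is only an analogy for how the master.ui/master.ui.xfail split behaves, not the mechanism our runner actually uses.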
We've scaled out to a currently supported configuration of 7 Mac Minis/7 Unagis; the pairing is intended :-)
Originally, we made do with two pairs: one for running the UI tests, and the other for running performance tests. We then added more branches, needed more capacity to run concurrent builds, and added more performance test suites, too.
Each Unagi is connected to a single Mac Mini, which is formatted and configured to run Ubuntu 12.04 LTS, runs a Jenkins slave agent, and handles adb port-forwarding to/from the Unagi.
Because we run real, on-device tests (dialer, SMS, etc.), each Unagi (except the two that run the performance tests exclusively) has a SIM card; for each of those, we maintain a JSON file (in a private GitHub repo) containing all the necessary info the test runner needs to run tests on that device. Our Gaia UI Tests README has the details.
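As a rough sketch of what that per-device file holds (the field names below are illustrative, not the exact schema; the Gaia UI Tests README documents the real one), it looks something like:

```json
{
    "phone_number": "15555550123",
    "imei": "990000862471854",
    "wifi": {
        "ssid": "qa-test-network",
        "keyManagement": "WPA-PSK",
        "psk": "hypothetical-passphrase"
    }
}
```

Keeping these values out of the test code itself is what lets the same suite run unchanged on any of the seven device pairs.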
How We Run Gaia UI Tests (v1-train, and over-simplified)
Now, I'd like to zero-in on the main job Web QA supports, which for the foreseeable future is v1-train.ui.
- First, a job in Jenkins automatically grabs the latest daily Unagi engineering build (b2g.unagi.download.pvtbuilds)
- Next, we do a make reset-gaia, and check out and build the v1-train branch from the Gaia repo on GitHub
- A metric ton of Python-package installs happen next inside a virtualenv
- Then we push the Gaia profile with the needed preferences set, and pull in the JSON file specific to that Unagi (its phone number, IMEI, etc.)
- Finally, we listen on the right adb ports and start sending commands via gaiatest to run all tests marked for inclusion in one primary manifest.ini file, which itself imports from others
- The manifest also carries a list of potential attributes identifying device types and capabilities (e.g. bluetooth, camera, wifi, lan, panda for Panda boards), so we can run tests against the appropriate environments
- To keep our tests isolated from one another's potential problems, we restart the B2G process between tests (which takes ~18 seconds)
- In addition to test results, we get a full logcat for the run, as well as screenshots for failures
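The capability filtering described above can be sketched in a few lines. This is a drastically simplified, hypothetical snippet (the real runner uses Mozilla's manifestparser library, and `requires` is not its actual key name); it just shows the idea of matching a manifest's declared requirements against a device's capabilities:

```python
from configparser import ConfigParser


def runnable_tests(manifest_text, device_capabilities):
    """Return the test files whose declared requirements the device meets.

    Each manifest section names a test file; an optional 'requires' key
    lists capabilities (e.g. bluetooth, camera, wifi) the test needs.
    Illustrative only -- the real suite uses manifestparser, not this.
    """
    parser = ConfigParser()
    parser.read_string(manifest_text)
    selected = []
    for section in parser.sections():
        required = set(parser.get(section, "requires", fallback="").split())
        if required <= set(device_capabilities):
            selected.append(section)
    return selected


manifest = """
[test_dialer.py]
requires = sim

[test_camera.py]
requires = camera

[test_browser.py]
"""

# A device with a SIM but no camera runs the dialer and browser tests,
# while the camera test is skipped on that hardware.
print(runnable_tests(manifest, {"sim", "wifi"}))
```

The same mechanism is what lets one manifest serve phones with SIM cards, Wi-Fi-only devices, and Panda boards alike.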
As of this post, we have 85 total tests: 67 UI tests, and 18 unit tests (which make sure we're setting ourselves up for success later in the run).
We launch apps, test the keyboard, take photos and videos, make and receive calls, send and receive SMS messages, check IMEI numbers, play music, tune the radio app, make calendar appointments, install and delete apps from both the Firefox Marketplace and Everything.me, and much, much more.
There are limitations, though, as both Gaia and Marionette are under very active development -- for the latter, we're tracking that in bug 801898 - Get GaiaTest UI smoke tests running reliably in Jenkins CI.
As I mentioned, we try to map as closely as possible to the QA smoketests (except where limited by bugs or not-yet-implemented Marionette features, which we track), but we've also filed a plethora of high-risk/needs-coverage areas in our GitHub Issues -- take a look, and reach out to us in #mozwebqa or comment in an issue if you're interested in taking one!
Other Blog Posts on the UI Tests
- Aaron Train: Dabbling with Marionette and Gaia UI layer testing
- Zac Campbell: 3-part Gaia UI Testing