Recently in selenium Category

In a previous post, I wrote about Sauce Labs, and detailed how their Selenium-in-the-cloud service ("OnDemand") helped Web QA--particularly yours truly--quickly make sense out of a perceived performance regression, with really nice video support. It turns out that we keep having reasons to look to them for additional capacity and capabilities. And, thanks to a great working business relationship, they're helping us meet those needs.

The problem: as stated in our Q2 goals, our team has to "Plan for and support a high-quality launch of http://persona.org". The goal goes on, stating that we'll support desktop on a variety of browsers, as well as mobile.

Our list of supported, in-house browser and operating-system environments covers Mac, Windows 7, and Windows Vista. Until recently, we didn't have in-house Opera or Chrome support, so calling out to Sauce for those browsers has been really beneficial. (When we launch a new site, we definitely cover all major browsers manually, but due to our Grid infrastructure, the need to run frequent tests, and the maintenance cost, we try to keep the automated set to the essentials; other teams have different, more-stringent browser requirements, so it was necessary to add both Chrome and Opera to our automation.)

The solution: It's often been said that companies should "focus on their core competencies" (I'll admit there are counter-points aplenty), and it's a view I happen to subscribe to, especially given the demands on our team. To that end, instead of planning for, provisioning, and then maintaining a morass of browser environments on a plethora of operating systems, we increased our scope and frequency of use with Sauce Labs: we cover what we reasonably can in-house, at present, while immediately spinning up test coverage on Sauce, making our developers and the Identity/Services QA team we support very, very happy.

The here and now: as of this moment, we've been able to bring up 42 test jobs against myfavoritebeer.org (dev/beta) and 123done.org (dev/beta), covering the most-critical paths of login and logout; we're closing in on new-user registration and changing passwords, too.

The future: as our testing needs increase, so, likely, will our usage of Sauce, and it'll be as easy as cloning a job in Jenkins and changing a single parameter, thanks to the built-in flexibility that Dave Hunt's pytest-mozwebqa plugin provides.
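
To make the "clone a job and change a single parameter" point concrete, here's a minimal sketch, assuming pytest-mozwebqa's --baseurl and --saucelabs options as I recall them; the URLs and credentials-file name are placeholders:

    import pytest

    # The "dev" Jenkins job boils down to an invocation like this; the cloned
    # "beta" job is identical except for the value of --baseurl.
    pytest.main([
        '--baseurl=http://dev.123done.org',   # e.g. http://beta.123done.org in the clone
        '--saucelabs=sauce_labs.yaml',        # Sauce credentials file (assumed name)
        'tests/',
    ])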

So, thanks again, Sauce; just another example of helping the Selenium [1] and Open Source communities, and, in particular, Mozilla!

[1] Sauce Labs powers the critical Selenium-core, continuous-integration tests for the project: http://sci.illicitonion.com:8080/

In addition to co-hosting/sponsoring many Selenium Meetups, we've been really grateful to have a generous and accommodating business relationship with Sauce Labs, which allows us to augment and complement our own custom-built Selenium Grid testing infrastructure with theirs. For our most recent Socorro release, I and the whole team (dev + QA) really came to appreciate one feature in particular: video, and for two reasons.

To get you familiar with Mozilla's Selenium Grid setup, let me back up just a second; Dave Hunt has previously highlighted his pytest-mozwebqa plugin (still in beta) for our WebQA team, which--among all the many other awesome things it does--allows us to very easily use Sauce Labs' API to integrate with their OnDemand service.
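
For the unfamiliar, here's a minimal sketch of the kind of test the plugin enables; the mozwebqa funcarg, with its selenium and base_url attributes, is how I recall the plugin exposing the browser session, so treat the details as assumptions rather than gospel:

    # test_home.py -- a bare-bones, hypothetical pytest-mozwebqa test.
    def test_home_page_loads(mozwebqa):
        # The plugin starts the browser (locally, on our Grid, or on Sauce
        # OnDemand, depending on command-line options) and hands it to the test.
        mozwebqa.selenium.get(mozwebqa.base_url)
        assert mozwebqa.selenium.title  # placeholder sanity check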

Inspired by the wealth of build-reporting information we get from Sauce, Dave set out some time ago to incorporate some of it--particularly capturing screenshots--as well as writing out HTML reports that link us to (if the job was Sauce-run) a Sauce Labs job #, complete with video for each test. We looked at incorporating video into our own infrastructure (as did David Burns in Lights! Camera! Action! Video Recording of Selenium), but soon realized that--given all the other challenges of shipping, on average, 7 websites per week, and dealing with our scale of 6 Mac Minis, archiving, disk space, etc.--it wasn't something we wanted to pursue, for now.

For this particular Socorro release, we thought we had--immediately after we deployed--a huge performance regression. See bug 718218 for the gory details, and all the fun that ensued (over the weekend!).

(A screenshot linking to the Sauce Labs test run appeared here.)

Typically, video is a "nice to have" feature for us, as our team is quite familiar with debugging failing Selenium tests locally, and our HTML reports and screenshots go a long way, too. For this release, though, I was playing back-up for Matt Brandt, and, as the requisite pointy-hair, couldn't code or run a Selenium test to save my life (not quite true, but close!). When I began investigating why our Socorro test suite suddenly ran ~20 minutes longer than it ever conceivably should, I discovered the culprit was one of the search tests, test_that_filter_for_browser_results, which we later realized had an erroneous |while| clause that never evaluated to |true|.
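
I don't have the offending code in front of me, but the failure mode looked roughly like the hypothetical polling helper below; the names, numbers, and structure are all illustrative, not the actual Socorro test code:

    import time

    def results_are_filtered(selenium):
        # Placeholder predicate standing in for the real DOM check; in the bug,
        # the real condition could never become True.
        return False

    def wait_for_filtered_results(selenium, max_retries=120, delay=10):
        retries = 0
        # Because the condition never evaluates to True, every call burns the
        # whole retry budget: 120 * 10 seconds, or roughly 20 extra minutes.
        while not results_are_filtered(selenium) and retries < max_retries:
            time.sleep(delay)
            retries += 1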

I first looked at our own Grid instance, trying to watch (via VNC) our browsers running this particular test, but Grid's distributed test execution meant it never predictably ran on the same node twice. (In retrospect, I could've used pytest's keyword (-k) flag from a job in Jenkins to run the suspect test in isolation, but I digress.) I needed a way to quickly show our dev team the individual test, without having them or me set up a test environment and fire off that test. Luckily, since Dave had made our Sauce Labs jobs public by default, all it took was a single run of the socorro.prod job via Sauce: find the individual test, copy the link (which has nicely annotated timestamps for each Selenium command), and quickly paste it in #breakpad.
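
For the record, running just the suspect test in isolation with pytest's keyword flag is a one-liner; shown here via pytest.main, though a Jenkins job would simply pass the same -k argument on the command line:

    import pytest

    # Collect and run only the tests whose names match the -k expression.
    pytest.main(['-k', 'test_that_filter_for_browser_results'])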

Earlier, I mentioned that their video support is great for two reasons: in addition to the above (the ability to view archived videos from past-run jobs), we also took advantage of the ability to watch the individual test--as it ran in real-time--and correlate the currently-issued Selenium command with the hold-up/"perf regression."

We quickly isolated it to the aforementioned, bogus |while| loop, commented it and the accompanying assert out, and certified the release after more careful checking. Because it was the weekend, and folks were available at different times, once each member popped onto IRC, we simply showed them the "before and after" videos from Sauce Labs, and the team regained its confidence in the push.

So, thanks, Sauce, for such a useful feature, even if it's (thankfully) not necessary most of the time!

Mozilla, and WebQA in particular, looks forward to continuing to work with Sauce Labs and the Selenium project, exploring ways we can work together to further and bolster Selenium and Firefox compatibility.

(Title get your attention? Good! Keep reading...)

My awesome coworker Matt Brandt has highlighted the great work and contributions that Rajeev Bharshetty is doing for, and with, Mozilla WebQA.

I'd like to take this opportunity to call out another great contributor to (and with) WebQA: Sergey Tupchiy, who we nominated for Friend of the Tree (along with Rajeev) in this past Monday's meeting.


Sergey has been instrumental: he helped us turn around a Q4 goal of having test templates custom-tailored for brand-new Engagement/Marketing websites (work which, quite frankly, has applications and implications for our standard test templates, going forward).

His flurry of awesome GitHub activity should be self-evident :-)

In 2012, and kicking it off this quarter, Mozilla WebQA (and the rest of Mozilla QA) will be taking a hard look at, and hopefully making many inroads into, a better community on-boarding story; we're blessed to have the great contributors we already do, but need to grow and scale to meet Mozilla's ever-increasing challenges.

We're looking for contributors of all expertise areas, skill levels, and project interests; if you're interested, or just want to find out more, please contact us via whichever of the methods below works best for you, and we'll happily help you get started!

IRC: #mozwebqa on irc.mozilla.org
QMO homepage: https://quality.mozilla.org/teams/web-qa/
Email: mozwebqa@mozilla.org
Team Wiki: https://wiki.mozilla.org/QA/Execution/Web_Testing

(And a huge thanks, again, to Sergey, and other awesome contributors like Rajeev!)

Catchy title, no? Well, you're reading this...

Today, our legacy (and pretty comprehensive) Selenium test suites for the Mozilla Add-ons website (both written in Python) were forced into retirement. The reason? Our previously-blogged-about AMO redesign--code-named "Impala"--went live for the homepage and the add-on detail pages.

New AMO homepage

Tests, of yore:

To be sure, the tests served us well for the past 14 months: when originally written, they covered nearly everything about the end-user experience (except for downloading an add-on, which we couldn't do with Selenium's original API). While not an exhaustive list by any means, we had tests for:


  • navigation, including breadcrumbs, the header/footer

  • locale/language picker options

  • switching application types (Firefox, SeaMonkey, Thunderbird, Sunbird, Mobile)

  • categories (appropriate for each application type)

  • login/logout

  • sorting and presence/correct DOM structure of add-ons (featured, most popular, top-rated)

  • reviews (posting, author attribution, cross-checking star ratings, as we found bugs in updating these across the various site aspects, via a cron)

  • collections (creation, tagging, deletion, etc.)

  • personas (again, different sort criteria were also covered)

  • search (popular, top-rated add-ons, substrings, negative tests, presence of personas/collections in search results, properly, etc.)

  • proper inclusion of Webtrends tagging

  • ?src= attributes on download links, to help ensure we didn't break our metrics/download-source tracking

  • ...and much, much more


...and then the testcase maintenance started:

  • we went through a few small redesigns, here and there (one of which was the header/footer, which for a brief time period we shared with Mozilla.com, around the launch of Firefox 4)

  • we changed where and when categories showed up, and their styling

  • we kept running into edge-cases with individual add-ons' metadata (titles, various attributes -- some valid bugs, some that our tests' routines just couldn't anticipate or deal well with, like Unicode)

  • A/B testing hit us, and we had to deal with sporadic failures we couldn't easily work around

  • ...and so many more aggravating failures, and subsequently, difficult refactoring attempts (most of which were abandoned)


Lessons learned (the hard way):


  • don't try to test everything, or even close to everything -- you really have to factor in a lot of time for testcase maintenance, even if your tests are well-written

  • keep tests as small as possible - test one thing at a time

  • by the same token, don't try to iterate over all data, unless you're trying to ascertain which cases to cover -- it'll just cost you test-execution time, and --again-- testcase maintenance (eventually, you'll either cut down the test iterations, or you'll give up and rewrite it)

  • separate your positive and negative tests: reading long-winded if/else branches or crazy, unnecessary loops when you're already annoyed at a failing test just makes it worse

  • don't couple things that really shouldn't have an impact on each other; for instance, don't repeat search tests while logged in and logged out, unless you have a good reason

  • same goes for user types: while it's seemingly noble to try to cover different user types, don't test everything again with those -- just the cases where functionality differs

  • safeguard your tests against slow, problematic staging servers, but don't go so far as to mask problems with large timeout values

  • fail early in your tests -- if you have a problem early on, bail/assert ASAP, with useful error messages; don't keep executing a bunch of tests that will just fail spectacularly

  • go Page Object Model [1] or go home, if you're writing tests for a website with many common, shared elements (a minimal sketch follows this list)
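
To show what that looks like in practice, here's a minimal, hypothetical Page Object Model sketch in the spirit of our rewrite; the page names, locators, and site structure are invented for illustration and aren't the actual Addon-Tests code:

    from selenium import webdriver

    class HomePage(object):
        """Locators and page actions live here, not in the tests."""

        _search_box = '#search-q'            # placeholder CSS locators
        _search_button = '#search-button'

        def __init__(self, selenium, base_url):
            self.selenium = selenium
            self.base_url = base_url

        def open(self):
            self.selenium.get(self.base_url)
            return self

        def search_for(self, term):
            self.selenium.find_element_by_css_selector(self._search_box).send_keys(term)
            self.selenium.find_element_by_css_selector(self._search_button).click()
            return SearchResultsPage(self.selenium)

    class SearchResultsPage(object):
        _results = '.result'                 # placeholder CSS locator

        def __init__(self, selenium):
            self.selenium = selenium

        @property
        def result_count(self):
            return len(self.selenium.find_elements_by_css_selector(self._results))

    # The test itself stays short and readable, and when a locator changes,
    # there's exactly one place to fix it.
    def test_search_returns_results():
        selenium = webdriver.Firefox()
        try:
            results = HomePage(selenium, 'https://addons.allizom.org').open().search_for('firebug')
            assert results.result_count > 0
        finally:
            selenium.quit()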

What's next?


  • a strong, well-designed replacement suite we've been running in parallel: https://github.com/mozilla/Addon-Tests

  • an eye toward Continuous Deployment (thanks for the inspiration, Etsy!), including, but not limited to:

  • better system-wide monitoring, via Graphite

  • to that end, much closer collaboration with our awesome dev team on unit vs. Selenium-level coverage balance


Phew!

Ironically, around the same time our legacy test suites started failing, so did a large portion of our rewrite (which also began before the Impala rewrite officially landed); the beauty, though, is that it's immeasurably easier to fix and extend this time around, and we aim to keep it that way.

[1] Huge shout-out to Wade Catron, from LinkedIn, who runs a tight Ruby-based test framework and who helped us develop a great initial POM, which we've refined over the past months. (His talk video was embedded here.)

I'm happy to announce that Mozilla and Sauce Labs will be holding the San Francisco Selenium Meetup here at Mozilla's HQ in downtown Mountain View, CA, on May 11, 6:30 pm.

Google Maps to Mozilla HQ.

Here's the official Meetup.com page: http://www.meetup.com/seleniumsanfrancisco/events/17238077//. Please RSVP there, so we know how many to expect; thanks!

(Excerpting from the Meetup entry for better indexing/searchability)

"We're thrilled to announce that, hot on the heels of his CSS locators talk at the Selenium Conference, Santiago Suarez Ordonez, the Sauce Ninja and Selenium committer, will be the speaker of our next meetup on May 11th at Mozilla.

Santiago, or 'Santi', will give some insight into why you should say NO to XPath, and instead use CSS locators to make your tests more readable, faster, and more reliable.

He gave an abridged version of this talk in Track B of the Selenium Conference, and it proved to be wildly popular, so he'll be extending it and also sharing some cool tools that will help you transition from XPath to CSS.

Thanks to our friends at Mozilla for offering to co-host this meetup. See you all in a few weeks!

Agenda

6:30pm - Drinks, Pizza, Networking

7:15pm - Announcements

7:30pm - Santi tells us about CSS Locators

8:15pm - Q&A

8:30pm - Tools to help transition from XPath to CSS

9:15pm - Lights Out"

Mozilla is a proud supporter and user of, and contributor to, Selenium, and we're really looking forward to the opportunity to learn from, and exchange ideas and tips with, the community!
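
To illustrate the XPath-versus-CSS point from Santi's talk, here's a quick, hedged comparison using the WebDriver Python bindings; the locators and the markup they assume are invented for illustration:

    from selenium import webdriver

    driver = webdriver.Firefox()
    driver.get('https://addons.mozilla.org/')

    # XPath: verbose, tied to document structure, and notoriously slow in IE,
    # where Selenium has to fall back to a JavaScript XPath engine.
    links_by_xpath = driver.find_elements_by_xpath("//ul[@id='featured-addons']/li/a")

    # The equivalent CSS locator: shorter, closer to how the front-end is
    # written, and generally faster across browsers.
    links_by_css = driver.find_elements_by_css_selector('ul#featured-addons > li > a')

    assert len(links_by_xpath) == len(links_by_css)
    driver.quit()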

I gave a talk at the London Selenium Meetup back in November of last year, and presented there on “How Mozilla Uses Selenium”; it’s been four months since, and I’ve been wanting to give an update on what we’ve done since then, where we’re going, and how we’re going to get there, so here goes.

Where were we?

It’s easiest for me to talk about progress and challenges per-project, so with that, let’s talk about AMO. Our inaugural tests gave and still give us a _ton_ of coverage; the problem we’re now facing is two-fold: we’re now seeing more frequent 500 Internal Server Errors on our staging box (which we’ll address hopefully soon after we ship Firefox 4), and it’s in fact a problem tickled by tests (though through no fault of their own) that do quite a bit of looping over a lot of data (https://addons.allizom.org/en-US/firefox/extensions/bookmarks/?sort=created, for instance). Looking at our commit logs, you can see we’re skipping certain add-ons, adding timeouts, adding retries, etc. -- we’re spending a lot of time fixing classes of tests, rather than individual tests, really (mostly because the individual tests are huge). We've even imported the Python module urllib, and check for server status before launching our tests, as well as sprinkling setTimeout(120000) in an attempt to ameliorate the server down/timeout issues.

Where are we now? What have we done, since?

Some of our current test suites, particularly in AMO, take over an hour to run (again, they're thorough, I'm not faulting them for that); that's simply too long to wait for a test to finish. Coupled with the fact that they're prone to suffering staging-server failures, too, that's a lot of time to wait for a green.

Moving forward, based on David's style guide and test templates, we'll be writing tests in the "new" Page Object Model approach; I say "new" because it's actually quite well-known in the Selenium world (thanks, Wade!). The tests, as David himself has started them, live here: https://github.com/AutomatedTester/Addon-Tests. These same new tests also use pytest, and can (and do) execute in parallel, resulting in much faster completion times.
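
The parallelism, as far as I recall, comes from the pytest-xdist plugin; here's a minimal sketch of kicking off such a run, where the worker count, path, and --baseurl value are arbitrary:

    import pytest

    # Split the suite across four worker processes via pytest-xdist's -n option;
    # each worker drives its own browser session on the Grid (or on Sauce).
    pytest.main(['-n', '4',
                 '--baseurl=https://addons.allizom.org',
                 'tests/'])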

I should mention, as well, that the ever-awesome Dave Hunt has centralized our Selenium Grid config in github, which will help us stabilize that going forward.

Where do we want to go?


Hugely inspired (with a healthy sense of anxiety, as in "how do we get there?") by James Socol's "The future of SUMO development" post, along with our weekly releases of AMO, SUMO, and Mozilla.com, we recognize that we, too, obviously need to be a lot more nimble. That means we've got to spend less time fixing our tests ("now, where do I find the locator for this element, again?", "where in this huge test is _my_ add-on failing?", etc.), and more time writing new ones and making the ones we have more robust.

How are we going to get there?

Here are just a few of my thoughts:


  • Again, clearly, we've got to fix the staging-server issues
  • Continue to maintain legacy tests when sensible (i.e. where the benefit of keeping them outweighs the cost of fixing them)
  • As much as possible, outright rewrite the more fragile, huge tests (I'm looking at you, AMO_category_landing_layout.py) into smaller ones, and remove them wholesale from the old suite
  • Continue to augment our production smoketest suites for all projects, replacing legacy, slower-running tests with the newer, faster ones
  • Figure out the trade-off between coverage and time-to-run, especially for smoketests in production, and BFTs in staging (and, ensure all projects have this nice hierarchy, and that it works well for each)
  • Socially-engineer developers to recognize/fix failures that our test-suites catch :-)
  • Build-out our Selenium Grid's hub, for more capacity
  • Switch over existing tests to Selenium 2 where it makes sense, and write new ones in it by default (a quick before/after sketch follows this list)
  • Switch from Hudson over to Jenkins (because new butlers are always better)
  • Mobile...?
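
For a sense of what that switch looks like, here's a hedged before/after sketch; the locators and page flow are invented, not lifted from our suites:

    # Selenium 1 (RC) style -- roughly the shape of our legacy AMO tests:
    from selenium import selenium

    rc = selenium('localhost', 4444, '*firefox', 'https://addons.allizom.org/')
    rc.start()
    rc.open('/en-US/firefox/')
    rc.type('id=search-q', 'firebug')        # placeholder locators throughout
    rc.click('css=#search-button')
    rc.wait_for_page_to_load('30000')
    rc.stop()

    # Selenium 2 (WebDriver) equivalent -- the default for new tests:
    from selenium import webdriver

    driver = webdriver.Firefox()
    driver.get('https://addons.allizom.org/en-US/firefox/')
    driver.find_element_by_id('search-q').send_keys('firebug')
    driver.find_element_by_css_selector('#search-button').click()
    driver.quit()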

We certainly have a long, long way to go, and not enough resources at present to get there anytime soon, but we're working on it, and this summer, in particular, should prove to be a very fruitful time (a watershed moment, in fact).

In "Hey, Selenium. Hurry The Hell Up," Sean Grove from Sauce Labs makes a great general point:

"Write your tests to only test a single tiny bit of functionality.
Keep the number of steps in your tests under 20. All the parallelism
in the world won’t save you from long-running tests, because a single test can’t be parallelized.
The best written tests we’ve seen generally run in under 10 seconds."

In general, I agree with that; there are real-world cases, though, where longer, "endurance" or "longevity" tests, as I've heard them referred to, are necessary, or just plain helpful.

As one such example, we recently migrated our Firefox Support website (SUMO) from our San Jose datacenter to a much better build-out, infrastructure-wise, in our new Phoenix datacenter. Because we have a pretty good (though certainly not comprehensive) set of non-volatile tests we run in production pre-, during, and post-push, we simply edited the /etc/HOSTS files on our Grid environments to point to the staged SUMO instance in Phoenix, and ran our tests tagged with "prod" against that host while load-balancers, master and slave DBs, etc. were configured and enabled.

Our tests kept passing -- logins were working, search was eventually fine (though we had to re-configure Sphinx a few times), Knowledge Base articles were zippier than ever, etc.

We didn't, apparently, do enough manual testing around one thing, though: sessions. Yes, through manual testing, account creation was fine, logins were working, posting was fine, etc., but what we eventually noticed and figured out was that, sporadically, user sessions weren't being maintained for more than (I think) 15 seconds; other times, they were fine.

(iirc, the problem was that a Zeus VIP was pointing at our SJC colo to re-use sessions, which were being created in PHX, most of the time. jsocol will undoubtedly offer a correction if I'm wrong.)

In our postmortem, we noticed that our Selenium tests--while testing each piece of functionality just fine--weren't set up to verify and assert that user sessions lasted for a certain length of time, since each test was designed to be small and independent and, because it's Selenium, gets a completely new session every time tearDown() is called.

The beauty of Selenium is that, unlike unittest or other frameworks that use mock browsers rather than the real thing, it offers true end-to-end testing in a real environment; to take advantage of that, Vishal wrote test_forum_post_on_prod to ensure that sessions are maintained properly through future roll-outs (and in production generally). This time, the seemingly arbitrary session limit was 15 seconds, but his new test, even on a fast production cluster, should run closer to a minute, giving us more confidence that sessions are maintained.
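
A stripped-down sketch of what such a session-longevity check might look like is below; the path, form locators, credentials, and timing are all placeholders, not Vishal's actual test:

    import time

    def test_session_survives_longer_than_a_minute(mozwebqa):
        selenium = mozwebqa.selenium
        selenium.get(mozwebqa.base_url + '/en-US/users/login')   # placeholder path
        selenium.find_element_by_id('id_username').send_keys('testaccount')
        selenium.find_element_by_id('id_password').send_keys('notarealpassword')
        selenium.find_element_by_css_selector('form button[type=submit]').click()

        # Outlive the ~15-second window we saw during the migration, with room to spare.
        time.sleep(60)

        selenium.refresh()
        # If the session had been dropped, the anonymous login link would be back.
        assert not selenium.find_elements_by_css_selector('a.login')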

Again, this doesn't negate the "just the essentials for each function" test approach wisely advocated by Sean and others; it just offers an additional viewpoint and use-case for how Selenium might be used. We rely on our production smoketest/sanity suites for our largest projects, for good reason -- and now we have another one: we know that users' sessions are doing fine, at the push of a button and continuously, in our Hudson (soon to be Jenkins) runs.
