October 20, 2008

Back to Basics: Why write components in C++?

A year or so ago, I introduced the ArrayConverter module to Verbosio, and tried to get it into mozilla.org code. The idea was simple: a JavaScript library for converting between native JavaScript arrays, XPCOM arrays, nsIArray objects and nsISimpleEnumerator objects. Although it did not get in, I've been rethinking half of the ArrayConverter module's functionality, and I believe now that I overlooked something.

I planned on writing two different articles - one talking about reinventing the wheel unnecessarily, and another talking about why I, a JavaScript expert, would choose to write XPCOM components in C++. It turns out, though, that each of these has roughly the same answer: the penalties of XPConnect. Read on for further details.

If you've got the data structure already, don't redo it

In ArrayConverter.jsm, I essentially wrote an entirely JavaScript-based implementation of nsIArray. Because nsIArray requires a complete implementation of nsISimpleEnumerator, I implemented that too. All in all, it took me about 80 lines, with comments, to implement this. (I did not implement the nsIMutableArray interface.)
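For illustration only, a read-only nsIArray written in JavaScript might look roughly like the sketch below. This is not the ArrayConverter.jsm source: the names ReadOnlyArray and ArrayEnumerator are hypothetical, the QueryInterface methods a real component would need are left out for brevity, and the error codes chosen for out-of-range and not-found cases are my assumption about what the native array does. The numeric fallback codes exist only so the sketch can run outside a Mozilla environment.

```javascript
// Hypothetical sketch, not the ArrayConverter.jsm source.
// Use Components.results when available, otherwise the raw nsresult values.
var Cr = (typeof Components !== "undefined" && Components.results) ? Components.results : {
  NS_ERROR_ILLEGAL_VALUE: 0x80070057,
  NS_ERROR_FAILURE: 0x80004005
};

// nsISimpleEnumerator over a private copy of the items.
function ArrayEnumerator(items) {
  this._items = items;
  this._index = 0;
}
ArrayEnumerator.prototype = {
  hasMoreElements: function() {
    return this._index < this._items.length;
  },
  getNext: function() {
    if (!this.hasMoreElements())
      throw Cr.NS_ERROR_FAILURE;
    return this._items[this._index++];
  }
};

// Read-only nsIArray: none of the nsIMutableArray methods are provided.
function ReadOnlyArray(items) {
  // Copy, so later changes to the source array do not show up here.
  this._items = items.slice(0);
}
ReadOnlyArray.prototype = {
  get length() {
    return this._items.length;
  },
  queryElementAt: function(index, iid) {
    if (index < 0 || index >= this._items.length)
      throw Cr.NS_ERROR_ILLEGAL_VALUE; // assumed error code for a bad index
    // Each element must itself be an XPCOM object with QueryInterface.
    return this._items[index].QueryInterface(iid);
  },
  indexOf: function(startIndex, element) {
    for (var i = startIndex; i < this._items.length; i++) {
      if (this._items[i] === element)
        return i;
    }
    throw Cr.NS_ERROR_FAILURE; // assumed: "not found" is reported as a failure
  },
  enumerate: function() {
    return new ArrayEnumerator(this._items.slice(0));
  }
};
```

Note that queryElementAt makes its own QueryInterface call on the element, which matters for the cost discussion later in this article.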

In doing so, I bypassed the pre-existing mozilla.org array implementation. Perhaps I didn't so much overlook the original implementation as wish to preserve a read-only array - but if that was the case then, as I discussed in a previous article, I was just missing the point. Perhaps I simply wanted to understand how the code worked, to provide a baseline to compare against the original. Whatever the reason, it was a mistake.

Mozilla code has a very basic and fully functional nsIArray implementation for generic use already:

var arr = Components.classes["@mozilla.org/array;1"]
                    .createInstance(Components.interfaces.nsIMutableArray);

I could probably reduce those 80 lines of code to just ten, by reusing what's already there.
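As a rough sketch of what that shorter version could look like, a conversion helper might simply fill the stock array and hand it back. The helper name toNsIArray and its createArray parameter are hypothetical (the factory is passed in only so the function can be exercised without a Mozilla runtime), and appendElement(element, weak) is the nsIMutableArray method as I understand it.

```javascript
// Hypothetical helper: copy a JavaScript array of XPCOM objects into the stock array.
// createArray is an assumption of this sketch; by default it uses the
// "@mozilla.org/array;1" contract ID.
function toNsIArray(items, createArray) {
  if (!createArray) {
    createArray = function() {
      return Components.classes["@mozilla.org/array;1"]
                       .createInstance(Components.interfaces.nsIMutableArray);
    };
  }
  var result = createArray();
  for (var i = 0; i < items.length; i++) {
    // Second argument false: hold a strong (not weak) reference to the element.
    result.appendElement(items[i], false);
  }
  // The same object also implements nsIArray.
  return result;
}
```

Inside Mozilla code, the call would just be toNsIArray([first, second]), where first and second stand for whatever XPCOM objects you already hold.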

The XPConnect Performance Tax

It's generally accepted that JavaScript does not run nearly as fast as precompiled C++ code (although thankfully that's changing). But that's not the only penalty you pay for using JavaScript.

You see, when one C++ component talks to another, here's what the stack looks like between the two:

  1. C++ callee
  2. XPCOM bridge
  3. C++ caller

When JavaScript comes into play, so does XPConnect, whose sole purpose is to convert between C++ function calls and JavaScript function calls. For JavaScript calling into C++ (think document.getElementById()), here's the local stack:

  1. C++ callee
  2. XPCOM bridge
  3. XPConnect
  4. JavaScript caller

Hidden in that XPConnect layer are a few other "hidden fees" - security checks, argument type checking, etc. Jason Orendorff wrote a good article on these fees and how he's improved on this with "quick stubs" in Mozilla Firefox 3.1.

When C++ wants to call on a JavaScript component, the local stack is similar:

  1. JavaScript callee
  2. XPConnect
  3. XPCOM bridge
  4. C++ caller

I don't know what precisely happens in this XPConnect layer, other than translating the arguments from C++ to JavaScript - I don't know if it goes through any security checks or not. But it would make sense, in the case of one JavaScript component calling another JavaScript component, for the stack to look like this:

  1. JavaScript callee
  2. XPConnect
  3. XPCOM bridge
  4. XPConnect
  5. JavaScript caller

That's right, folks - XPConnect converts the arguments going out from the caller, and converts the arguments back going into the callee. What's more, this doesn't include any of the "stack unwinding" steps - processing the returned value from a JavaScript function, exceptions thrown (converted to an nsresult and then into an nsIException object). Nor does it include the security checks that I talked about earlier. It quickly adds up.
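To make the comparison concrete, here is a toy model of the four stacks above. It only restates the lists in this article; it is not a measurement, nor a description of XPConnect internals. The function names callStack and xpconnectLayers are made up for the example.

```javascript
// Toy model: rebuild the call stacks listed above from the caller and callee languages.
// "js" and "cpp" are the only values this sketch understands.
function callStack(callerLang, calleeLang) {
  var stack = [];
  stack.push(calleeLang === "js" ? "JavaScript callee" : "C++ callee");
  if (calleeLang === "js")
    stack.push("XPConnect"); // converts arguments going into the JavaScript callee
  stack.push("XPCOM bridge");
  if (callerLang === "js")
    stack.push("XPConnect"); // converts arguments coming out of the JavaScript caller
  stack.push(callerLang === "js" ? "JavaScript caller" : "C++ caller");
  return stack;
}

// Number of XPConnect layers a single call passes through on the way in.
function xpconnectLayers(callerLang, calleeLang) {
  return callStack(callerLang, calleeLang).filter(function(layer) {
    return layer === "XPConnect";
  }).length;
}
```

In this model, callStack("cpp", "cpp") has three entries and no XPConnect layer, while callStack("js", "js") has five entries, two of which are XPConnect.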

In an ideal world, I would personally prefer this:

  1. JavaScript callee
  2. XPConnect
  3. JavaScript caller

I don't know anyone working on that model, though, and I'm sure someone will be quick to tell me that CAPS code - or something else - makes this less than feasible.

So how does that apply to ArrayConverter?

Remember, I implemented nsIArray in JavaScript. So let's assume I want to get element 4 from the array. (Note that this is largely a guess, and I could be wrong about quite a few steps here - so take everything you read in this list with skepticism.)

var x = arr.queryElementAt(4, Components.interfaces.nsISupports);

  • XPConnect looks up the Components.interfaces object, and then the Components.interfaces.nsISupports object.
  • XPConnect gets a request for the queryElementAt method of arr. Bear in mind arr is not an object this JavaScript implements, but is an XPCOM component. So XPConnect has to look up the method.
  • Presumably, we've already asked arr whether it's an nsIArray or not through QueryInterface.
  • XPConnect finds the queryElementAt method, and asks permission to call that method.
  • CAPS does some checks (including, I think, an extra QueryInterface call), and agrees.
  • XPConnect begins converting arguments. First, it converts 4 (a raw JavaScript number) to a PRUint32.
  • Then, it converts the returned Components.interfaces.nsISupports back into its native IID type.
  • It creates another argument to hold the "retval", or return value, from the XPCOM method.
  • It calls into the XPCOM pointer for the method. This is the start of the XPCOM bridge.
  • XPCOM has the pointer for the array - and guess what - it's an XPConnect wrapper. Back to XPConnect we go.
  • XPConnect converts the first two arguments back into a JavaScript number and an interface pointer.
  • XPConnect clears the third argument (the "retval") and calls the JavaScript queryElementAt implementation with the converted first two arguments.
  • The nsIArray implementation does its work (including a QueryInterface call on the element it's about to return - more XPConnect-XPCOM-C++ cycles, probably).
  • The nsIArray method returns the desired object.
  • XPConnect observes no error message (this time).
  • XPConnect receives the returned object and does a QueryInterface on it to make sure it's the right type, then assigns it to the retval. (Don't forget the addref it does for JavaScript.)
  • XPConnect exits back to the XPCOM bridge, with a return code of NS_OK (no error thrown).
  • XPCOM returns the call to the first XPConnect caller, which expects that the retval pointer has been set and addrefed.
  • XPConnect checks the return code it gets, sees NS_OK, and converts the retval back to a JavaScript-wrapped pointer.
  • XPConnect returns the pointer into JavaScript, and JavaScript assigns the value to the variable x.

Ouch. With C++ talking to C++, it's:

  • Dereference the nsCOMPtr, if you have one.
  • Call the method.
  • Component calls QueryInterface() on the value it's about to return.
  • Component sets the retval pointer and addrefs it, and then returns NS_OK.
  • Caller checks the return code (optional, but highly recommended).
  • Caller looks up the retval pointer it passed in.

That's the XPConnect performance tax, in a nutshell. Now, I'm not saying XPConnect is bad - far from it - but that a JavaScript-based implementation of nsIArray is at a huge disadvantage. After all, it's doing its own QueryInterface call, which is a second layer of XPConnect cycles altogether. For something this basic, XPConnect and JavaScript are relatively expensive... especially given that there's already a native implementation of XPCOM arrays available that won't pay the XPConnect performance tax.
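I have no hard data to offer, but anyone who wants to check the claim could time the same call against both implementations. A minimal sketch, assuming each array holds at least one element and that Date.now() is precise enough over many iterations; timeQueryElementAt is a hypothetical name, not an existing API.

```javascript
// Hypothetical timing helper: call queryElementAt repeatedly and
// return the elapsed time in milliseconds.
// Pass the iid you would normally use, e.g. Components.interfaces.nsISupports.
function timeQueryElementAt(arr, iid, iterations) {
  var start = Date.now();
  for (var i = 0; i < iterations; i++) {
    arr.queryElementAt(0, iid);
  }
  return Date.now() - start;
}
```

Running it once against a JavaScript-implemented nsIArray and once against an "@mozilla.org/array;1" instance holding the same elements would give a first, rough comparison.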

Now, JavaScript does have two significant advantages over C++ when it comes to components: JavaScript code is usually shorter and easier for humans to read, and it's much harder to cause a crash in JavaScript than in C++. I'm not referring to calls into components that themselves crash, but to executing the component itself. You have to go out of your way to crash in JavaScript. With C++, all it takes is one little mistake. C++ makes it possible to crash, while JavaScript doesn't give you a hold on anything to crash with. This reflects another trend I've noticed - that crashes happen not just because the code is buggy and broken, but because the language itself allows crashes to happen. (This last sentence could lead to another article I've been thinking about.)

Still, for simple data structures, if it already exists in C++, there's no reason to rewrite it in JavaScript. All you do when you do that is waste time - both yours in writing the code, and that of the computer that has to execute your code. Which, ultimately, is the user's time.

For more complex components, I'd largely agree with everyone else and say that JavaScript would probably be a better choice going forward.

For new data structures that don't already exist... that's up to you. Again, though, if it's simple, you may as well write it in C++... and write your tests in the xpcshell JavaScript test harness. That's what I've done for another component I'll soon be talking about.

Personally, I hope someone will seriously look at reducing XPConnect's load for JavaScript-to-JavaScript component conversations. There may be a good performance gain to find there too. Enough about performance today though (I apologize for bringing no hard data to back up my claims). The next article will talk about my thoughts on ECMAScript Harmony, and what I think JavaScript 2 means for XPIDL.

Thanks for reading!

Posted by WeirdAl at October 20, 2008 9:52 PM
Comments

Perhaps the introduction of Javascript Code Modules will make Javascript-to-Javascript XPCOM communication almost unnecessary. I know I was very thankful of being able to get rid of XPCOM when working on the most recent extension I co-authored (Fire.fm).

(From Alex: Don't bet on it. The trend these days is for more components to be written in JS, not less - and XPCOM does offer strong types, which JS modules don't.)

Posted by: Jorge at October 21, 2008 12:56 PM

Thanks for pointing that out about nsIMutableArray. I was also looking at making my own implementation.

We really need a MDC doc page I think about essentials of JavaScript developers needing to know to interact with XPCOM; QueryInterface, etc.

Anyhow, thanks!

Posted by: Brett Zamir at January 14, 2010 5:27 AM