September 18, 2008

Back to Basics: XPCOM Arrays and Memory

A couple weeks ago, I started reading "Learning Python", by Mark Lutz. It's an interesting feeling, going back to the basics of programming - numbers, strings, arrays, etc. In reading this book, I kept thinking there were some things missing from the basic XPCOM tool set. Therefore, I decided to attempt implementing these missing pieces.

Now, I could be totally wrong about any or all of these being missing... but it's also a good exercise for me to push my boundaries and strengthen my C++ skill set. The result is not only code that I can use for proof-of-concept, but code that may be useful elsewhere.

This is the first article in a multipart series titled "Back to Basics", exploring these concepts. This first article is about native C++ arrays through XPCOM - and making sure I don't leak memory. Read on in the extended entry for details.

The problem: Verbose code

In XPIDL, you define an interface for an array like this:

void getValues(out PRUint32 count,
               [array, size_is(count), retval] nsIFoo values);

In C++, this looks a little like:

NS_IMETHODIMP
nsBar::GetValues(PRUint32 *count,
                 nsIFoo ***values)
{
  PRUint itemCount = mValues.Length();
  nsIFoo** retval = nsMemory::Alloc(itemCount * sizeof(nsIFoo*));
  NS_ENSURE_TRUE(retval, NS_ERROR_OUT_OF_MEMORY);
  for (PRUint32 i = 0; i < itemCount; i++)
  {
    retval[i] = mValues[i];
    NS_IF_ADDREF(retval[i]);
  }

  /* Other operations may be here which return early, without freeing
     the above memory.  In other words, these other operations may
     inadvertently cause several leaks - retval, plus the members of 
     mValues (because their reference count has been increased).
  */

  *count = itemCount;
  *values = retval;
  return NS_OK;
}

There are several steps that happen this approach:

  • We allocate the memory ourselves
  • We set the return values as two separate lines, ourselves
  • We addref the return values ourselves
  • Oh, and if we return early, we either must free the allocated memory and release the reference counts ourselves... or we leak.
  • NS_ENSURE_SUCCESS(rv, rv) and NS_ENSURE_TRUE(foo, NS_ERROR_WHATEVER) are great ways to cause leaks here.
  • By the time we're done handling all the early returns, the code has gotten much, much larger and harder to read.

I step back and ask, "Is that all necessary?"

My solution: A local memory-handling class

It'd be a lot easier to just write something like:

NS_IMETHODIMP
nsBar::GetValues(PRUint32 *count,
                 nsIFoo ***values)
{
  PRUint itemCount = mValues.Length();
  nsresult rv;
  // XXX This is pseudo-code, and will not necessarily compile!!
  MemoryManager mem(nsIFoo*, itemCount, rv);
  NS_ENSURE_SUCCESS(rv, rv);
  for (PRUint32 i = 0; i < itemCount; i++)
    mem.setIndex(i, mValues[i]);

  mem.Finalize(count, values);
  return NS_OK;
}

This has several advantages:

  • The mem object is responsible for memory allocation and de-allocation.
  • The mem object provides a simple API for setting members (and presumably takes care of reference counting)
  • The mem object provides a simple API for setting the return values from nsBar::GetValues - specifically, the number of items, and the pointer to the item array.
  • Best of all, when GetValues exits, the mem object destructor executes.
  • This is a key difference between C++ and JavaScript - JavaScript objects don't have the concept of destructors (and thus, code that runs before they disappear into garbage collection).

Fortunately, all this is possible with C++. I implemented a pair of template classes, nsMemoryArray and nsMemoryRefArray. With these template classes, the above code looks like:

NS_IMETHODIMP
nsBar::GetValues(PRUint32 *count,
                 nsIFoo ***values)
{
  PRUint itemCount = mValues.Length();
  nsresult rv;
  nsMemoryRefArray<nsIFoo*> mem(itemCount, rv);
  NS_ENSURE_SUCCESS(rv, rv);
  for (PRUint32 i = 0; i < itemCount; i++)
    mem.setIndex(i, mValues[i]);

  mem.Finalize(count, values);
  return NS_OK;
}

How it works

  1. The nsMemoryRefArray<nsIFoo*> declaration defines the class via a template.
  2. The mem constructor takes the itemCount and allocates a private nsIFoo** pointer with size itemCount * sizeof(nsIFoo*). That means it can be an array holding itemCount number of nsIFoo* values.
  3. Through mem.setIndex(), members of the private nsIFoo pointer array are set - and "addref'ed" (their reference counts go up by one).
  4. Should GetValues() exit early, the mem destructor executes, "releasing" the addref'ed items and freeing the memory.
  5. Otherwise, in mem.Finalize(), the count and values pointers are appropriately set - and a private boolean indicates the mem object no longer needs to worry about memory.
  6. GetValues() exits.
  7. In the mem destructor, the private boolean from Finalize() means do not free the memory or release any addref'ed items.

What if your array of objects doesn't have reference counting? No problem!

NS_IMETHODIMP
nsBar::GetValues(PRUint32 *count,
                 PRUint32 **values)
{
  PRUint itemCount = mValues.Length();
  nsresult rv;
  nsMemoryArray<PRUint32> mem(itemCount, rv);
  NS_ENSURE_SUCCESS(rv, rv);
  for (PRUint32 i = 0; i < itemCount; i++)
    mem.setIndex(i, mValues[i]);

  mem.Finalize(count, values);
  return NS_OK;
}

Here, I made one significant change - I declared mem as a different type. XPIDL required a different argument set in the GetValues() definition, but other than that, the code is essentially identical.

Practical applications

A quick search on mxr.mozilla.org turns up several promising candidates. In particular, I notice nsConsoleService::GetMessageArray and several implementations of nsIClassInfo::GetInterfaces. If my nsMemoryArray.h file were to pass reviews and the "sniff test", it'd probably be worth using to clean these up.

I also notice there's very few C++-based components which implement native arrays (other than the nsIClassInfo::GetInterfaces one). The nsMemoryArray.h helpers would probably lower the barrier for such components. (Yes, I know, C++ is falling out of favor as a language for XPCOM components... but I'll deal with that in a later article.)

Thanks for reading!

Posted by WeirdAl at September 18, 2008 9:22 PM
Comments

The strangest thing is that such classes do not exist since a long time.

They're a required first step for RAID conformant code. Not only this is required in order to use exceptions, it's a strongly better way to code.

So there will be less XPCOM component in the code than there's now, but 100% of what will stay *needs* to be rewriten in this style.

(From Alex: I have no idea what "RAID conformant" means. But I disagree with the premise you make
that classes like these don't exist. Reference nsCOMPtr, nsTArray for examples.)

Posted by: jmdesp at September 19, 2008 8:45 AM

You've implemented D's scope feature (well with more stuff, but scope(failure) was primarily what you were after :P

(scope stuff is a fair ways in)
http://digitalmars.com/d/2.0/exception-safe.html
http://digitalmars.com/d/2.0/statement.html#ScopeGuardStatement

Posted by: Spudd86 at October 21, 2008 8:36 AM
Post a comment









Remember personal info?