February 12, 2003

Yet Another Bloat Day

Today was a day of bloat. First off, taking some of the new options off the Minimo Tinderbox decreased the build zip size from 10M to 7M! That's awesome. My --disable-xslt-bindings patch only shaves 20K off TestGtkEmbed in a rigorously optimized build, though as sicking pointed out, there is still more to shave with XPath.

I did get useful trace-malloc data today (only after a highly painful time writing demangling scripts and massaging types.dat to fit the New World). One idea that might come out of this (might!) is to deliberately remove some StringBundles from memory after each page, or at least when we approach the hard limit.
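
A minimal sketch of that idea, in Python rather than Mozilla's C++: each cache registers a flush callback, and a hook that runs after every page either flushes unconditionally or only when usage is within a margin of the hard limit. `CacheRegistry`, `after_page` and all the parameters are hypothetical names for illustration, not anything in the tree.

```python
class CacheRegistry:
    """Keeps a list of (name, flush) pairs; flush() returns how much it dropped."""

    def __init__(self):
        self._flushers = []

    def register(self, name, flush):
        self._flushers.append((name, flush))

    def flush_all(self):
        # Ask every registered cache to give up what it holds.
        return {name: flush() for name, flush in self._flushers}


def after_page(registry, used_bytes, hard_limit_bytes, margin_bytes,
               always_flush=False):
    """Flush caches after a page if forced, or if we are near the limit."""
    if always_flush or used_bytes >= hard_limit_bytes - margin_bytes:
        return registry.flush_all()
    return {}
```

An embedder on a tiny device would presumably pass `always_flush=True`; a desktop build would only flush as it approaches the limit.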

One serious question is why we have nsSHEntries in the embedding build. They account for a decent portion of memory. Another is why void*'s are not all accounted for in uncategorized.pl.

Metrics I want to do now:
- find out who owns the leaked / cached objects (might just be doable looking at the memory dump from trace-malloc)
- determine what types of objects increase as we go through the pageload tests
- find all hashtables, arenas and arrays that persist through multiple pages, and force their owners to give up the memory
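
The second metric could be sketched roughly like this, assuming the trace-malloc output has already been boiled down to one {type name: live bytes} dictionary per page in the pageload run (that reduction step, and the name `growing_types`, are assumptions here, not existing scripts):

```python
def growing_types(snapshots):
    """Return {type: bytes gained} for types that never shrink between
    snapshots and end up larger than they started."""
    names = set()
    for snap in snapshots:
        names.update(snap)

    growth = {}
    for name in sorted(names):
        # A type missing from a snapshot counts as 0 bytes live.
        series = [snap.get(name, 0) for snap in snapshots]
        never_shrinks = all(b >= a for a, b in zip(series, series[1:]))
        if never_shrinks and series[-1] > series[0]:
            growth[name] = series[-1] - series[0]
    return growth
```

Types that go up and come back down drop out of the result, which leaves the ones worth chasing for an owner.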

Posted by jkeiser at 1:22 AM | Comments (10)

February 11, 2003

Other Stuff

On the bright side of life, I made a --disable-xslt-bindings switch today that makes Mozilla not include anything at all from that particular module (unfortunately that disables XML pretty printing, but I think we can live with that); and I managed to fix Ye Olde Event Bug without causing innumerable regressions (I think). It even managed to fix a bug I did not expect to fix.

Bernd has been pushing border-collapse patches through today, and apparently one of the ones still in his tree fixes the border-collapse issue visible in Patch Viewer. Mmmm.

Posted by jkeiser at 12:37 AM | Comments (2)

Bloat 3.0

I have now been playing with trace-malloc on Minimo (a GTK+ embed with no chrome and such). It's an excellent but under-configured tool--the majority of the trash that gets left over when you go to CNN and then leave it is in void*'s.

Here are the overview numbers for now: minimo-bloat.html. I will fine-tune these numbers as I go. It appears the three top hitters are things allocated by image frame (coming in at a whopping 1M), by the cache (at 100-200K) and by nsCSSDeclarations (allocated all over the freaking place and apparently never destroyed).

Next I need to figure out if I have all the right prefs for the Minimo run to be truly useful. Probably not. Image caching in particular is obviously on.

Posted by jkeiser at 12:36 AM | Comments (10)

February 6, 2003

Static Linking

Quick log of memory investigations for the day:

Static Linking

It turns out that several of our modules are not currently statically linked. This means there are a bunch of optimizations linkers can't do, and a bunch of strings that are necessarily put into programs that link against those dlls that can't be stripped. On a tiny platform (which is a big reason to be concerned about bloat) you're not going to need those libs to be shared.

Fragmentation

Turning off frame arenas (by defining DEBUG_TRACEMALLOC_FRAMEARENA) doesn't seem to keep fragmentation down much. But now that I think more seriously about how an OS would do memory allocation, I think I understand why that would be. At least if pagesize is small, frame arenas should actually keep fragmentation down. In fact, as I write this I begin to question whether the numbers we have been reading for the amount of memory the app takes up are accurate--the VmSize of GtkEmbed was up to 29M with an apparent 10M of fragmentation, and the size reported in top was 17-19M (I forget which). Therefore it would seem that the fragmented space is not included in the size top reports, when it really needs to be for a small target. Or is this a red herring? Maybe what you'd do to kill fragmentation problems on a small device is just set pagesize frighteningly small. I need to investigate this.
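
One way to put the two numbers side by side is to read them straight out of /proc/<pid>/status on Linux. This is only a sketch: it assumes the figure top showed was the resident size (VmRSS), which the notes above do not actually establish, and `vm_fields_kb` is a made-up helper name.

```python
def vm_fields_kb(status_text, names=("VmSize", "VmRSS")):
    """Pull selected 'Name:   12345 kB' fields out of /proc/<pid>/status text."""
    values = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in names:
            # The value is the first token after the colon, in kB.
            values[key] = int(rest.split()[0])
    return values
```

On a Linux box, `vm_fields_kb(open("/proc/self/status").read())` gives both values for the current process; the gap between them is address space that is mapped but not resident.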

I learned how /proc/*/maps works today (tells you what dll has what memory allocated for it), so it's time to start correlating those results with trace-malloc and see what kind of objects are left around in mostly-empty pages and where they are left around. It's my theory (perhaps wrong) that a ton of objects are created when you load a document and some of those objects are kept around for the next document (caches and the like). For a small target, they shouldn't be.
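
As a starting point for that correlation, here is a small sketch that totals the mapped address range per file from the text of /proc/<pid>/maps. Each line there has the form "start-end perms offset dev inode [path]"; mappings with no path are lumped under "[anonymous]", a label chosen here for illustration. Note this counts mapped address space, not live allocations, so it only says where to look.

```python
from collections import defaultdict


def mapped_bytes_by_path(maps_text):
    """Sum the size of every mapping in /proc/<pid>/maps text, keyed by path."""
    totals = defaultdict(int)
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # not a mapping line
        start, end = (int(part, 16) for part in fields[0].split("-"))
        path = fields[5].strip() if len(fields) > 5 else "[anonymous]"
        totals[path] += end - start
    return dict(totals)
```

Usage on Linux would be along the lines of `mapped_bytes_by_path(open("/proc/self/maps").read())`, with the pid of the embedding test app in place of `self`.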

Hard Size Limits (Memory Pressure)

Another really interesting idea to come out of today's myriad discussions was to have a pref to enforce a "pretty hard limit" on our size. In other words, when we get near that limit (say within 200K?), documents will stop creating content nodes and frame construction will abruptly halt. We can't do anything about people creating new strings, but I'm sure. This pref will help us develop to smaller targets and, as Kevin points out, some users might want it on their machines anyway. The trick is to make sure we don't leak, because that would get ugly with such a pref.
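
Roughly, the check would look like the sketch below (Python for brevity; the real thing would live in the content sink and frame constructor). `MemoryBudget`, `build_nodes` and the idea of passing node sizes around are all hypothetical, and the 200K default margin is just the guess from the paragraph above.

```python
class MemoryBudget:
    """A 'pretty hard limit': stop creating things once we are near it."""

    def __init__(self, hard_limit_bytes, margin_bytes=200 * 1024):
        self.hard_limit_bytes = hard_limit_bytes
        self.margin_bytes = margin_bytes

    def near_limit(self, used_bytes):
        return used_bytes >= self.hard_limit_bytes - self.margin_bytes


def build_nodes(node_sizes, budget, used_bytes):
    """Create nodes in order until the budget says stop.

    Returns (number of nodes created, bytes in use afterwards)."""
    created = 0
    for size in node_sizes:
        if budget.near_limit(used_bytes):
            break  # abruptly halt construction
        used_bytes += size
        created += 1
    return created, used_bytes
```

The check happens before each allocation, so the document simply ends early instead of failing partway through a node.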

Events

On an unrelated note, I found the problem with events: it relates directly to the fact that we do not store the event target in the event. But shocker of shockers, fixing the problem causes more, eviler problems in events which seem to be related to the same root cause (storing the event target in all sorts of hairy places). I'm investigating, but I begin to suspect we may have to back out the mouse events patch and fix targets to be stored in the event itself (where it should have been all along!) in 1.4alpha.

Posted by jkeiser at 8:30 PM | Comments (10)

Cache

OK, so Minimo is getting us pretty small. 8-9M codesize, 13M first-page startup and around 22-24M high watermark. This is even with the Lea allocator, which according to waterson takes fragmentation to a minimum (it saves us a little under a meg of maxheap as far as we can tell). So the numbers I posted before for mfcembed weren't the best representation of the goal we have to hit for embedded systems (certainly a big deal, but not the main goal). It is our static growth that is the problem. We fit fine on embedding platforms. And it's not leaks, or at least not much; this goes up and levels off (as others have pointed out before). This is cached data (not necessarily the cache), most of it at least. I am certain there are leaks, but the cache problem is a prerequisite to deal with any of that.

This entry is a reminder to myself of stuff to do to the minimo branch:

  1. Get rid of XSLT bindings (make a switch to not create the XSL content sink and friends). This seems to be pretty low-hanging fruit; XMLContentSink is the only place in the app where we talk to XSLT.
  2. Don't include printing and plugin DLLs in the "small embedder" solution
  3. Hook into low memory pressure sensor to gc, compress frame arena (possible?), get rid of any cached images and documents and basically destroy anything else that we're caching. For embedders, run this algorithm after every page too, to keep memory down.
  4. Turn off paint suppression entirely and ensure that we therefore don't have two docs in memory at the same time
  5. Turn off frame arena and see if that keeps our maxheap down. Not even sure if this will work, let alone keep maxheap down.

Posted by jkeiser at 12:18 PM | Comments (10)

February 4, 2003

Tinderbox 3 Docs

Tinderbox 3 now has rudimentary documentation and the server is auto-generating a tbox3.tgz to be installed. Now I must sleep.

Posted by jkeiser at 4:00 AM | Comments (12)

Tinderbox 3

This weekend I added some ultra-cool features to Tinderbox 3, namely fast-update, only-build-if-there-is-something-new, and upload of builds to the server. Now the cycle time when there are no changes is less than a minute, so changes will be picked up as quickly as humanly possible.

tbox3 is now at the point where I consider it feature complete (enough to unleash it on the world at large). A few more features--very few--need to be added to make a drop-in replacement for the Mozilla Tinderbox, but that was not the main objective in writing this. I wanted a tbox various groups could use and apply patches to on multiple platforms. And I wanted a tinderbox client that was really easy to set up and hassle-free to administer (allow us to fix problems with as little intervention from the owner of the tinderbox client as security will permit).

Now I need to get server space and bandwidth, both for finished binary uploads and for the actual Tinderbox. I don't mind putting it on my server, but I am behind a cable modem and somebody might notice if people are downloading builds from my living room all day. I think getting people to donate machines for clients will be easy. The setup is brain-dead and you can always run the tinderbox off-and-on while you're not actually compiling. I find myself creating tinderbox clients as a way to keep my build up to date when I'm not there :)

Features of note in tbox3:

  • Database-backed, http protocol (so that you get two-way communication)
  • Builds use fast-update and do not build unless there is something to build (build cycle is reduced to 1 minute when nothing is there)
  • Uploads finished builds to a server so you can click on them from tbox
  • You can upload patches to the tbox and all clients will download and compile them
  • Clients auto-upgrade themselves
  • Clients can be controlled from the server: .mozconfig can be changed (among other things) and you can send commands ("kick", "clobber", "checkout", "build")
  • Positively brain-dead client setup, largely due to the server control (you just point at the server and it gets all the info it needs to check out, configure and build the tree)
  • Everything configurable through web interface with login/password security (uses Bugzilla logins)--easy server setup

I would post the URL to the nice pretty tinderbox, but now that the clients are uploading builds to it, I need to keep it quiet to keep bandwidth problems from arising. And if you have bandwidth and a server for me (I have cron scripts that intelligently keep the binary builds under a quota), don't hesitate to drop me a line.
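
For the curious, the quota idea can be sketched in a few lines of Python. This is not the actual cron script, just one plausible policy (delete the oldest builds first until the directory fits); `prune_to_quota` and its arguments are made up for illustration.

```python
from pathlib import Path


def prune_to_quota(build_dir, quota_bytes):
    """Delete the oldest files in build_dir until the total size fits the quota.

    Returns the names of the files removed, oldest first."""
    files = sorted(
        (p for p in Path(build_dir).iterdir() if p.is_file()),
        key=lambda p: p.stat().st_mtime,
    )
    total = sum(p.stat().st_size for p in files)
    removed = []
    for path in files:
        if total <= quota_bytes:
            break
        total -= path.stat().st_size
        path.unlink()
        removed.append(path.name)
    return removed
```

Run from cron against the upload directory, this keeps the newest builds and never touches subdirectories.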

February 1, 2003

mozilla_tools

mozilla_tools now has a home! It includes tools to help build, manage patches, manage cvs and do roaming (with a few other random things like dos2unix.pl and a uuid generator). It is pretty cool and has been very useful for me. My color coordination and style skillz aren't up to par so the page is pretty Plain Jane; hopefully some grateful contributor with better sense will come along and help.

Posted by jkeiser at 8:25 PM | Comments (10)