Surfin' Safari

Front Page  -  Technorati

May 2, 2005


Implementing CSS (Part 1)

Posted at 2:57 PM

One of the most interesting problems (to me at least) in browser layout engines is how to implement a style system that can determine the style information for elements on a page efficiently. I worked on this extensively in the Gecko layout engine during my time at AOL and I've also done a lot of work on it for WebCore at Apple. My ideal implementation would actually be a hybrid of the two systems, since some of the optimizations I've done exist only in one engine or the other.

When dealing with style information like font size or text color, you have both the concept of back end information, what was specified in the style rule, and the concept of front end information, the computed result that you'll actually use when rendering. The interesting problem is how to compute this front end information for a given element efficiently.

Back end information can be specified in two different ways. It can either be specified using CSS syntax, whether in a stylesheet or in an inline style attribute on the element itself, or it is implicitly present because another attribute on the element specified presentational information. An example of such an attribute would be the color attribute on the font tag. Both WebCore and Gecko use the term mapped attribute to describe an attribute whose value (or even mere presence) maps to some implicit style declaration.

A rule in CSS consists of two pieces. There is the selector, that bit of information that says under what conditions the rule should match a given element, and there is the declaration, a list of property/value pairs that should be applied to the element should the selector be matched.

All back end information can ultimately be thought of as supplying a declaration. A normal rule in a stylesheet that is matched has the declaration specified as part of the rule. An inline style attribute on an element has no selector and is simply a declaration that always applies to that element. Similarly each individual mapped attribute (like the color and face attributes on the font tag) can be thought of as supplying a declaration as well.

Therefore the process of computing the style information for an element can be broken down into two phases. The first phase is to determine what set of declarations apply to an element. Once that back end information has been determined, the second phase is to take that back end information and quickly determine the information that should be used when rendering.

WebCore (in upcoming Safari releases) has a really cool optimization that I came up with to avoid even having to compute the set of declarations that apply to an element. This optimization in practice results in not even having to match style for about 60% of the elements on your page.

The idea behind the optimization is to recognize when two elements in a page are going to have the same style through DOM (and other state) inspection and to simply share the front end style information between those two elements whenever possible.

There are a number of conditions that must be met in order for this sharing to be possible:
(1) The elements must be in the same mouse state (e.g., one can't be in :hover while the other isn't)
(2) Neither element should have an id
(3) The tag names should match
(4) The class attributes should match
(5) The set of mapped attributes must be identical
(6) The link states must match
(7) The focus states must match
(8) Neither element should be affected by attribute selectors, where affected is defined as having any selector match that uses an attribute selector in any position within the selector at all
(9) There must be no inline style attribute on the elements
(10) There must be no sibling selectors in use at all. WebCore simply throws a global switch when any sibling selector is encountered and disables style sharing for the entire document when they are present. This includes the + selector and selectors like :first-child and :last-child.

The algorithm to locate a shared style then goes something like this. You walk through your previous siblings and for each one see if the above 10 conditions are met. If you find a match, then simply share your style information with the other element. Such a system obviously assumes a reference counting model for your front end style information.

Where this optimization kicks into high gear, however, is that it doesn't have to give up if no siblings can be located. Because the detection of identical style contexts is essentially O(1), nothing more than a straight pointer comparison, you can easily look for cousins of your element and still share style with those elements.

The way this works is that if you can't locate a sibling, you can go up to a parent element and attempt to find a sibling or cousin of the parent element that has the same style pointer. If you find such an element, you can then drill back down into its children and attempt to find a match.

This means that for HTML like the following:

<table>
<tr class='row'>
<td class='cell' width=300 nowrap>Cell One</td>
</tr>
<tr class='row'>
<td class='cell' width=300 nowrap>Cell Two</td>
</tr>

In the above example, not only do the two rows share the same style information, but the two cells do as well. This optimization works extremely well for both old-school HTML (in which many deprecated presentational tags are used) and newer HTML (in which class attributes might figure more prominently).

Once the engine determines that a style can't be shared, i.e., that no pre-existing front end style pointer is available, then it's time to figure out the set of declarations that match a given element. It is obvious that for inline style attributes and mapped attributes that you can find the corresponding declaration quickly. The inline style declaration can be owned by the element, and the mapped attributes can be kept in a document-level hash. WebCore has a bit of an edge over Gecko here in that it treats each individual mapped attribute on an element as a separate declaration, whereas Gecko hashes all of the mapped attributes on an element as a single "rule." This means that Gecko will not be able to share the mapped attribute declaration for the following two elements:

<img width=300 border=0>
<img width=500 border=0>

WebCore creates three unique declarations and hashes them, one for a width of 300, one for a width of 500, and one for a border of 0. Gecko creates two different "rules," one for (width=300,border=0) and another for (width=500,border=0). As you can see in such a system, you will frequently not be able to treat the identical border attributes as the same.

Aside from this difference in mapped attribute handling, the two engines employ a similar optimization for quickly determining matching stylesheet rules called rule filtering. All rules that are potentially matchable by any element (i.e., that have the correct media type) are hashed based on the contents of the rightmost simple selector in the rule.

A selector in CSS can be either simple (meaning that all of the contents of that selector apply only to a single element) or compound (meaning that you may examine multiple elements like parents or siblings of that element). A compound selector is essentially a chain of simple selectors, so the following rule:

tr > td { color: blue }

has two simple selectors, tr and td. The rightmost simple selector in the rule is the one that we will use for the rule filtering optimization.

The rightmost simple selector falls into four categories.

(1) The selector uses an ID. (Example: #foo)
(2) The selector doesn't have an ID but uses a class. (Example: .foo)
(3) The selector has no class or ID but specifies a tag name. (Example: div)
(4) The selector specifies none of these things. (Example: *[disabled])

The rule is placed into one of four hashtables depending on which category it falls into. The idea behind these categorizations is to always filter out more specific information first. For example, if an element has a specific ID, then obviously any rules whose rightmost selector uses a different ID cannot match. Technically the last category can just be a list and not a hashtable, since those rules must always be examined by all elements.

Each hashtable, therefore, consists of a mapping from a given atomic string to a set of rules that match. The class attribute is exceptional in that you must put the rule into the hashtable multiple times if multiple class attributes are used.

When determining the set of rules that match a given element, you only examine rules that correspond to the correct hash entry based off your ID, classes and tag name. This optimization basically eliminates 95+% of the rules up front so that they need not even be considered during the matching process.

Each rule is then examined in detail, with all selectors being checked, to determine if it is a match, and the set of matches is collected. The set of matches can then be sorted by priority and specificity such that all the declarations are in the proper application order.

This brings us to the final phase of the style computation, which is taking the set of matches and quickly computing the appropriate front end style information. It is here that Gecko really shines. What I implemented in Gecko was a data structure called the rule tree for efficient storing of cached style information that can be shared *even when* two elements are not necessarily the same.

The idea behind the rule tree is as follows. You can think of the universe of possible rules in your document as an alphabet and the set of rules that are matched by an element as a given input word. For example, imagine that you had 26 rules in a stylesheet and you labeled them A-Z. One element might match three rules in the sheet, thus forming the input word "C-A-T" or another might form the input word "D-O-G."

There are several important observations one can make once you formulate the problem this way. The first is that words that are prefixes of a larger word will end up applying the same set of rules. All additional letters in the word do is result in the application of more declarations. Thus the rule tree is effectively a lexicographic tree of nodes, with each node in a tree being created lazily as you walk the tree spelling out a given word.

This system allows you to cache style information at each node in the tree. This means that once you've looked up the word "C-A-T-E-R-W-A-U-L", and cached information at all of the nodes, then looking up the word "C-A-T" becomes more efficient.

In order to make the caching efficient, properties can be grouped into categories, with the primary criterion for categorization being whether the property inherits by default. It's also important to group properties together that would logically be specified together, so that when a fault occurs and you have to make a copy of a given struct, you do so knowing that the other values in the struct were probably going to be different anyway.

Once you have the properties grouped into categories like the border struct or the background struct, then you can either store these structs in the rule tree or as part of a style tree that more or less matches the structure of the document. Inheritance has to apply down the style tree and tends to force a fault, whereas non-inherited properties can usually be cached in the rule tree for easy access.

WebCore doesn't contain a rule tree, but it is smart enough to refcount the structs and share them as long as no properties have been set in the struct. In practice this works pretty well but is not as ideal as the rule tree solution.

comment (161) -

Copyright © Dave Hyatt 2003, Design by Stéphane Curzi/ProjetsUrbain.com