OK, so let's say hypothetically you wanted to actually split our interfaces up into four conceptual things for embedders: document, shell, presentation and device. Let's further say that our embedder could do things like "presentation = new presentation(document, device)". A presentation would represent a viewport into a laid out version of the document, ready for painting. What sort of things would our hearty embedder want to do with this presentation? I've been thinking about the interface and would like some more suggestions. Actually, I haven't just been thinking about the interface, I have implemented it for printing, so this is more than an academic enterprise.
class presentation {
attributes: viewportWidth, viewportHeight, document, documentWidth, documentHeight
resize(width,height);
scrollTo(x,y);
paint();
}
So you can resize it, scroll the document within it, and paint it. I am somewhat confused by the sparseness of this interface, however. Perhaps DOMNode hitTest(x,y); is in order.
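For concreteness, here is a rough C++ sketch of that interface with the hitTest idea included. Everything in it is an assumption made for illustration: Document, Device and Node are stand-ins rather than Gecko types, paint() is a placeholder, and clamping in scrollTo is just one plausible policy.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins; none of these are real Gecko types.
struct Node { int x, y, width, height; };   // one laid-out box, in document coordinates
struct Document {
    int width = 0, height = 0;              // size of the laid-out document
    std::vector<Node> nodes;                // later entries are painted on top
};
struct Device { int paintCount = 0; };      // records paints instead of drawing

class Presentation {
public:
    Presentation(Document& doc, Device& dev, int width, int height)
        : mDoc(doc), mDev(dev), mWidth(width), mHeight(height) {}

    void resize(int width, int height) {
        assert(width >= 0 && height >= 0);
        mWidth = width;
        mHeight = height;
        scrollTo(mX, mY);                   // re-clamp the scroll position
    }

    // Assumed policy: never scroll the viewport past the document edge.
    void scrollTo(int x, int y) {
        mX = std::clamp(x, 0, std::max(0, mDoc.width - mWidth));
        mY = std::clamp(y, 0, std::max(0, mDoc.height - mHeight));
    }

    void paint() { ++mDev.paintCount; }     // placeholder for real painting

    // Takes viewport coordinates; returns the topmost node under the
    // point, or nullptr if nothing is there.
    const Node* hitTest(int vx, int vy) const {
        const int dx = vx + mX, dy = vy + mY;   // viewport -> document
        for (auto it = mDoc.nodes.rbegin(); it != mDoc.nodes.rend(); ++it) {
            if (dx >= it->x && dx < it->x + it->width &&
                dy >= it->y && dy < it->y + it->height) {
                return &*it;
            }
        }
        return nullptr;
    }

    int scrollX() const { return mX; }
    int scrollY() const { return mY; }

private:
    Document& mDoc;
    Device& mDev;
    int mWidth, mHeight;
    int mX = 0, mY = 0;
};
```

The point of the sketch is that hitTest needs nothing the presentation doesn't already know: the scroll offset and the laid-out boxes.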
What else? Think dynamic screen presentation which could change over time (the presentation observes changes to the document already and handles its own animated gifs); think interactive (once events are not targeted at frames this interface should suffice, I think); think printing, think rendering onto the side of a cube in OpenGL, think anything you want.
We've recently been having a bit of back-and-forth about the effectiveness of multiple presentation support in Mozilla, and I've reversed my earlier, less-knowledgeable position. We should continue to support them. The cost of keeping them up is minimal and they make printing and other secondary presentations work faster (and printing is slow enough as it is without cloning the document). And I haven't seen any specific proposals (yet :) for how a single presentation will help us simplify our architecture, so there's no win yet for that. Combining content and frames, the only one I can think of, is not really a popular idea yet, and I can see it might have problems (our display: mechanism, as roc pointed out, is totally centered around that dichotomy).
On a probably related issue, I've been wondering lately what the cost/benefit is of passing PresContext to every freaking method in layout (which we do now) versus just storing them in the frames or somewhere accessible. I bet changing this to storing in the frames would be a real win--you get rid of a lot of symbols and callsite pushes. If it came down to it, you could possibly avoid storing it in the textframe and instead have textframes ask their parents. I'm told someone else was thinking along the same lines in a blog lately (dbaron?) but I can't find it.
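To make the idea concrete, here is a minimal sketch of what "store it in the frames, and let text frames ask their parents" might look like. The class names, the appUnitsPerPixel field and widthInPixels are all made up for illustration; this is not how the real frame classes are laid out.

```cpp
#include <cassert>

// Hypothetical stand-in for the real presentation context.
struct PresContext { int appUnitsPerPixel = 60; };

class Frame {
public:
    explicit Frame(Frame* parent) : mParent(parent) {}
    virtual ~Frame() = default;
    virtual PresContext* presContext() const = 0;
    Frame* parent() const { return mParent; }
private:
    Frame* mParent;
};

// Container frames pay for one pointer each.
class ContainerFrame : public Frame {
public:
    ContainerFrame(Frame* parent, PresContext* context)
        : Frame(parent), mContext(context) { assert(context); }
    PresContext* presContext() const override { return mContext; }
private:
    PresContext* mContext;
};

// Text frames store nothing extra and ask their parent instead.
class TextFrame : public Frame {
public:
    explicit TextFrame(Frame* parent) : Frame(parent) { assert(parent); }
    PresContext* presContext() const override { return parent()->presContext(); }
};

// A layout helper that no longer needs a PresContext argument.
int widthInPixels(const Frame& frame, int appUnits) {
    return appUnits / frame.presContext()->appUnitsPerPixel;
}
```

The trade is one stored pointer per container frame (and a parent hop for text frames) against an extra argument at every callsite.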
The engine is coming along nicely. Now I have a framework for solid objects that should allow arbitrary meshes, rudimentary lighting (with surface normal smoothing calculations built in), and a texture mapping system (it even refcounts the texture maps ;) Right now all I have to show for it is a rotating near-equilateral-pyramid in the light with somebody's face mapped onto it; but damn it feels good.
Also my objects seem to disappear after rotating them for a while. I'm thinking something goes off to 0 and then something else goes off to infinity. Next up: fix that bug, load .3DS files, camera control, and add collision detection (this will take some real doing, I know ;). Then I basically have all the stuff I need to start playing with the engine from a high level. This is going far better than should really be expected. Hooray for OpenGL!
Oh, as you may have noticed, I have updated my weblog and moved it to Mozillazine, many thanks to kerz. This should deal with the problems some folk have been having accessing my blog.
Day one of learning OpenGL:
Today is for polyhedrons and basic collision detection.
I think the beer helped. Foster's is pretty darn good.
I have been learning OpenGL this weekend. It is a series of fascinating subjects. I think it might just be possible (and possibly easy) to map a browser window into OpenGL as basically a dynamic texture. Clicking is a much more difficult problem, but might be solvable by having a hidden window, detecting where a click hit in a scene (can you do that?) and dispatching it to the right point in the Gecko window. You could make a little water ripple in the point where you clicked :)
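If the scene can tell you the texture coordinate (u, v) where the click landed on the browser-textured surface, forwarding it to the hidden window is just scaling. A sketch of that last step only; how you get (u, v) out of the scene is not shown, textureToWindow is a made-up helper, and I'm assuming the usual conventions of v = 0 at the bottom of the texture and y = 0 at the top of the window.

```cpp
#include <algorithm>
#include <cassert>

struct WindowPoint { int x, y; };

// Map a texture coordinate in [0,1] x [0,1] to a pixel in a window of the
// given size. Assumes v grows upward and window y grows downward.
WindowPoint textureToWindow(double u, double v, int width, int height) {
    assert(width > 0 && height > 0);
    u = std::clamp(u, 0.0, 1.0);
    v = std::clamp(v, 0.0, 1.0);
    const int x = std::min(static_cast<int>(u * width), width - 1);
    const int y = std::min(static_cast<int>((1.0 - v) * height), height - 1);
    return {x, y};
}
```

The clamp and min keep a click on the very edge of the surface inside the window instead of one pixel past it.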
A silly project, but might be a fun simple one. You want windows that roll up? We'll give you windows that really roll up. You thought tabs were cool? How about a dodecahedron or a scroll where you could see multiple windows at once if you wanted?
Heh. Looks like it's been filed. Though they want to do a full rendering context; I wonder how widgets would work with that.
The other thing that has been bothering me lately is the stochastic behavior of our regression tests. False positives are just a bad thing, indicating something is wrong with our engine. I have filed a few bugs on that, which still won't get the job completely done. Fixes pending for two of them.
Man, I'm gonna have to drop gvim and get some other editor. It's totally stealing all my system resources whenever I have 20 or more files open, a typical Mozilla hacking session.
In the last couple of weeks, I have had a series of epiphanies about layout. (To the point where I mostly understand it now.) Continuing work on the printing rewrite is largely responsible for this. Here's a few of the things I've been thinking about that I think we can reasonably change.
Top-Level Objects
I'm working on a document on the top-level objects in Mozilla, and it seems to me that there are really only 4 classes of top-level object involved with the document: document, shell (scripting/docshell), presentation, and device/window. Furthermore, there is only a limited amount of stuff you want to do with these top-level objects as an embedder. We do not support enough of this stuff (for example, right now you can't easily build a client that starts up Gecko and prints). It would be nice, for example, to be able to quickly tell Mozilla these things:
(1) create a window/device.
(2) create a shell hooked to that window (along with all surrounding scripting and stuff)
(3) embed document into shell (or load a document into the shell)
(4) create a presentation for document aimed at that window
(5) create another presentation for document aimed at a printer or even another window
There should never be any more steps than this, but each step in this is currently arcane and the parts of it are spread-out ... there is no way to just create a presentation for a document, for instance, though I've written a method for that that I plan to check in soon. The above set of steps should be something like this for an embedder (some may end up taking a couple of steps to create services and whatnot, but they should all essentially be one step):
// Create main window to display document
nsCOMPtr<nsIWidget> widget = CreateRenderingWidget(800, 600, otherWindow);
nsCOMPtr<nsIDocShell> shell = CreateShell(widget);
shell->EmbedDocument(document);
// The presentation needs to know which document it is presenting
nsCOMPtr<nsIPresShell> presentation = CreatePresentation(document, widget);

// Display the same document in another window
nsCOMPtr<nsIWidget> widget2 = CreateRenderingWidget(320, 200, otherWindow);
nsCOMPtr<nsIPresShell> presentation2 = CreatePresentation(document, widget2);
That's all she wrote. You have taken your webpage from nowhere to the screen, twice. You need to do things with your widget to make it visible and move it where you want within your app, but that's it.
Most of these steps are pretty much there or close to happening; I want to see us get the rest of the way so we're cleaner and work better with alternate media (which is the reason I want a stronger separation of the presentation from the shell). DocumentViewer is the one doing much of the presentation-creation job today, because someone has to catch the right part of the document load where you realize what kind of presentation you want.
Doing this would allow you to more easily write various clients, as well as do things internally like move documents from one window or tab to another.
Presentations
This begs the question of whether we should have multiple simultaneous presentations for a document at all, which some people would rather we didn't. I think our reflow model and architecture are too poorly understood to make a go at removing them, and we have to fix that first before we even consider it. In the meantime, we should split the presentation out and acknowledge that, for the time being, we do support multiple simultaneous presentations (which I can confirm that we do).
It's multiple active presentations that we really don't support--presentations that you can click on and interact with, etc.--and probably never will. DOM just doesn't support it. For example, if you click on an element in presentation A, you only want the event to affect handlers and such in presentation A (which is what you'd want, for example, in a Composer with a browser view side by side). This means a whole mechanism where all event handlers must be registered only with a particular presentation. And if you solve that problem, then when JS does stuff like click(), who do you contact? I don't think DOM can really support multiple active presentations.
Box Reflow Model
Oh, and recently hyatt convinced me that we can indeed remove the performance hurdles to using a box-like reflow model, so I'm all for it. Let's make a MaxElementWidth() function first and stop calling into reflow to figure that crap out.