What Is An Outline, part 3

2004-02-02

In my previous post, I concentrated on extending our data model to handle multiple data sources cleanly and transparently. However, we now have a conceptual contradiction in our data model. To this point, the data model we've built still assumes it is a tree, which means we can use the tree-optimized algorithms for things like counting children. But if a document transcludes something that transcludes the original, we can have a cycle, which means there is a path starting with a node in the original, going through the transcluded document, and ending up back at the original node. Most tree-based algorithms respond to this situation by infinitely looping, which manifests itself as a hard lock-up, which is generally considered a negative by users.

Design Decision 3: How To Represent Structure

Issue: How do we represent the structure of the outline in memory?

Options:

Stick with the tree.
Continue to allow nodes to directly link to child nodes, as in the original structure, just lift the requirement that the resulting graph be a tree.
Do something more sophisticated.

Discussion: The first is included for completeness; with my priorities it holds no attraction to me.

I jump straight to "discussion" this time, rather then analyse the costs and benefits, because it turns out the correct answer is quite subtle; I initially went with the second answer and somewhat humorously, it worked as long as I didn't actually put a GUI on top of the structure. Of course users are typically uninterested in manipulating data structures by typing in Python commands into the Python interpreter, so we need a GUI. (Also, if there are data structures the GUI needs to work well, there are almost certainly other uses that need those structures too.)

So, let us set out on an exploration of the subtle problems inherent in...

The Obvious Answer: Let Nodes Link To Children

Let's continue to let each node store direct references to the children nodes in their data sources, removing the limitation that the outline must be a tree at all times. Then, you can have a node A that has as a child B that has a child A, and as long as you write your methods intelligently, which is to say, explain to your data model how the manipulations are supposed to work, it pretty much all works. (Again, big OO win here; as long as the outline designer (me) gets it right, nobody else really has to.)

Actually, a node can have itself as a child directly, but usually it's easier to talk about an A and a B. Also, throughout this post and the rest of the posts on outline structure, I will use -> to mean "has as a child", so the previous example can also be written as A->B->A, which I think is easier to read and understand. (A node as its own child is A->A.) Also, generally, I define A and B as containing their own text, so A->B shows up in a traditional outliner on screen as a node with the text "A", with a child with the text of "B":

A
- B

This almost works, and the way in which it doesn't work is quite frustrating and subtle. A traditional outliner node in the GUI, i.e., what you actually see, can be uniquely identified by the node itself. If you see a node with "This Is The Title" on the screen, it will only show up once in a given outline view, no matter what you do. This is one of those subtle properties you don't even think about using, until suddenly, it's gone.

Take the example document of A->B->A. The GUI must keep track of what the user is actually viewing. Since the nodes are generally all collapsed when a document is first loaded, the GUI starts out like this:

with the node indicator next to "A" indicating there are children that can be opened if you click there. No problem. Now let's say you expand the A node:

A
- B

We're still OK. But when you open B, there's a problem:

A
- B
  - A
    - B
      - A... and so on forever

How do you know whether a node is open in the GUI? Well, the GUI needs to remember which nodes are "open" and which are "closed". With the model we've built so far, the only way of identifying nodes is by their identity, in other which, by which node they are, with no other information. Do you see the problem?

Since we have two nodes in this document, the GUI has a data structure inside it that starts out looking like this:

A: closed
B: closed

Once we open the first A node, the data structure then reads like this:

A: open
B: closed

Finally, we open the B node and it looks like this:

A: open
B: open

Once you open the B node to reveal the A child, when the GUI goes to update the display, it has no way of knowing that the "first" A is open, but the second "A" is closed. When I first implemented it, this filled my screen with A and B when I first opened the B node. That's because as the GUI is drawing the screen, it basically starts at the top of the screen, with the first "A", and goes through the following sequence:

Add the first node to the "Draw next" list. (Lots of extraneous detail elided here.)
Draw the next node.
If we've filled the screen, stop drawing.
Is the node open? (Consult the opened/closed list.) If so, add the children to the front of the "next" list.
Go to 2.

As the GUI runs through this routine, it keeps adding A and B to the "next" list, over and over again, because they are both open. This is a perfect example of how cycles can screw you up; this algorithm is correct as long as there are no cycles in the graph, as in conventional outliners.

It's worth taking a moment to ask yourself, "Is there any GUI behavior that would satisfy me?" If your problem is not clearly specified, there may not be an answer until you clarify your problem.

Fortunately in this case, it is obvious that as a human, we could make this work correctly. When we click on the B node, we want to just see one A node below it, and the same if we then click the second A node, we just want one B below it. We can keep clicking on the nodes as deeply as we care to as outliner users, and as humans, we never have any confusion about what we want. This is a good sign that there must be some programmatic solution.

The GUI needs a more sophisticated model of which "node" is open. The obvious choice is to index the nodes in the GUI by their path. So now, our data structure actually looks like this initially:

[empty data structure]

If we ever find ourselves asking about a path that doesn't exist yet, we report that it's "closed" by default. So there's no A yet; it's still closed so there's no point adding it to our opened/closed list. Once we first click on the A, so we see the first A and B, it looks like this:

A: open

When the GUI runs through the loop above, it checks for the proper path instead of just the node name, and correctly opens just the first A node, showing A->B. The difference arise when we click on the B node. Now the opened/closed data structure shows this:

A: open
A->B: open

The GUI loop will correctly show just three nodes: An open A node, with a B below it, and a second A below it. There is will stop, because when it asks the opened/closed data structure about A->B->A, the data structure will report that as "closed", so the GUI will stop following the cycle. You can click again and again, and this will work. You can have things like A->B->A->B->A->B->A->B and this will work.

But I'm only exploring this in so much depth because it's the obvious answer. Unfortunately, it works great... until the user has the audacity to change any of paths to the nodes. Since this is something that happens rather often (every indent, every dedent, nearly every cut & paste), this is a big problem.

Consider the simple outline A->B->C->D. Suppose the user has A, B, and C open once, so the opened/closed data structure reads like this:

A: open
A->B: open
A->B->C: open

Now let's suppose that dastardly user dares to dedent node C. We want to maintain the fact that the node for C is still open. But that node is no longer identifiable as A->B->C... now it's A->C!

If we don't do something, the next time the GUI goes to update the screen, it's going to ask about A->C, which will be closed (since it's not in the data structure at all), and so the C node will suddenly close for the user. That's not correct behavior.

The obvious solution is to start writing code to update the path list in the opened/closed structure, changing all of the A->B->C entries to A->C. But that just solves one small part of the problem, and it doesn't even solve it well, since it means groveling through a potentially huge data structure every time the user does something, which will eventually slow the system down unacceptably. It leaves the issue of all the other places in your program you might have stored a path to a node that is now invalid unfixed. Deletions are even more interesting.

Again, the amount of code necessary to handle this is ballooning. Sometimes that's inevitable but it's always worth looking to see if there's a better way.

The first cut is that we can bundle these paths into some object and wrap it behind a consistent interface. This is not an entirely bad idea, but I'm going to handwave here and say it doesn't work that well. (Programmer hint: Try to write the object, you'll see what I mean. Efficiently updating the paths requires either tons of memory or tons of time, and there's just no good way around that. You end up wanting to ask questions about any arbitrary subpath after you've implemented the "hoist" command, and for any path of length n there are n*(n-1)/2 distinct subpaths; that's a lot of indexing that needs to be done.)

Node Handles

It took some thinking, but the key observation here is that we do not care about the path as such. Consider the example of A->B->C->D above. When we were upset we lost the collapse state of C, we weren't really upset that we lost "A->B->C"; all we cared about was the ->C part, and only that specific ->C. Anywhere that particular C goes, whether we indent, dedent, move it up, down, etc., goes, we want the expansion state to follow it.

I introduce the idea of a "node handle" here. The handle can be come to thought of as the "->" part of the above. I call it a "handle" because as I've implemented it, you can use it to manipulate the node; it delegates the methods appropriately. It also wraps the calls that normally return children to return node handles to those children instead. Why is this interesting?

Here's the brain-bending part: Instead of storing the entire path, the node handle stores only a link to its own immediate parent. We don't need the whole path. From this simple rule, we obtain objects that uniquely identify the "display node" that we care about, without imposing any restrictions on the underlying data model.

There must be some beginning to the handles, and we call that first handle the "root handle". You can start one from anywhere in the document, you're not limited to the whatever the "document root" is, but generally that's what you use. With the root handle in place, the document is better rendered as ->A->B->C->D, but this is usually just visually distracting so we merely imply the root handle, unless we need to specifically refer to it as I did just now.

There is some machinery in the handles to make sure that if a handle is requested that already exists under the current root, the "same" handle is returned, so it's useful for comparison. (In Python terms, you use "is" to compare one handle to another.) We also lean on this to solve the obvious problem with obsoleted handles (due to outline manipulations); typically you drop the handles of everything except what you immediately care about, and always "reconstruct" them, depending on the machinery to correctly re-use old handles or create new ones.

Now look at how the ->A->B->C->D problem is solved. We redirect the ->C's parent to be the (root) ->A instead of ->B, we remove C from B's child list, and add it to the end of A's child list. Now, the opened/closed data structure starts out looking like this:

->A: open
->B: open
->C: open

And after the "indent" operation it still looks the same. (Optionally, since ->B now has no children, you may remove it, but technically it doesn't really matter.) ->C remains open, both before the move and after, and all behaves as it should.

This is elegant because it turns many O(n) algorithms into O(1) algorithms, and makes it easier for other parts of the program to hold references to handles (instead of paths), generally making those parts of the program more robust too.

Isn't this complex?

Generally, complexity in code is frowned upon. I am concerned, however, that many people take that too far; it is the global complexity that ultimately matters, and people often apply the criterion far too locally.

One way or another, every problem mentioned in this post must be addressed somewhere. The problem with taking any of the easier, supposedly "simpler" answers to any of the issues I mentioned here is that they do not make anything simpler! They merely off-load the complexity off onto all the client code, requiring the clients to maintain their link handles whenever a path changes, etc. This not only is not a net gain, it's a net loss, since this code must now appear in every client, i.e., multiple times, and get it right each time, or cause really... ahhh... "fun" bugs.

The outline data structure is obviously quite fundamental to an outliner, and as I've said earlier, it is effectively impossible for a program to rise very far above its data structures. Thus, in this context, it makes sense to lavish as much attention as possible onto this fundamental data structure, to load as much functionality into this centralized area as possible, and to make using the outline data structure as simple as possible.

In fact, given all of this behind-the-scenes detail, you'd probably be surprised at how natural it is to use these data structures I've created; the vast majority of the time it is no more complicated to the user then a pure-tree outline structure! And in the end it is simpler because I've taken the time to make it Just Work Right^TM; you generally don't need to account for very many edge cases at all, your program just works correctly unless it is doing something weird; generally only the Iron Lute core needs to do these weird things.

A Conceptual Formulation

What I've built here, conceptually, is a data structure that separates the "outlineness" from the underlying "graphiness" of our data. By using handles, the graph looks like an outline; it's always a tree. Technically, it's even finite at any point, though it will transparently grow if you keep expanding. (Typically, this is only done by the user interface, and most algorithms Iron Lute users will care to run will only need each node to show up once.) Many of the tree algorithms actually work unchanged on top of the handles, with the proviso that you avoid recursion (which is made easy by parts of the API not worth talking about at this level), and generally you can just write tree-based (or if you prefer, "outline-based") algorithms, which is relatively easy compared to full graph algorithms.

On the other hand, underneath the hood there is a full graph, with the attendant flexibility that allows. It's a nice hybrid that doesn't limit the user or the programmer, but makes easy things easy, and even some hard things easy. For instance, the graph allows easy, principled access to the world outline, which doesn't exist very strongly yet but it will.

I'm actually pleasantly surprised by how conceptually clean this all comes together.

The Right Answer Is Usually Subtle

Note how often in today's post the construction "The obvious solution is... but it doesn't work because..." has been used. Each and every one of those represents at least a couple of hours of thinking and doodling, usually evaluating three or four distinct solutions until one of them becomes the clear winner.

Each and every one of the "obvious solutions" can be bashed into some sort of compliance, with enough work. For instance, when indexing the nodes by path for the GUI, whenever somebody moves a node you can walk through the entire opened/closed data structure and update the entries correctly. But that would be slow to execute, would be challenging to get right in all cases (especially with my sophisticated models of outlines!), and would leave you without the benefits of the real right way of doing it. Combine two or three of these "bashing sessions", and you end up with software that has already hit its maintenance limit; the slightest change breaks things across the board and don't even think about adding features.

Now, that said... there's still a problem in our data structure. Given a node, we have no way to ask about what nodes are its parents, which is a surprisingly infrequent operations for client programmers (generally you only care about parents of handles you've defined, which always exist to the point you care about), but the core of Iron Lute, and the occasional advanced application, needs this. With each node storing pointers to the children nodes, we can follow the child->parent links, but there's no way to go back. Next time, we'll explore how to fix that, explore subtle implications of various solutions (some of which took me weeks to find), and face a very significant design decision: Emulate Xanadu or the WWW?