Outlines, Part 6

2004-03-08

In my previous post, I discussed the practical matter of how to hold together the outline structure we've built so far. Having created a strong base, it is now fruitful to consider how to extend the data structure to handle the wide variety of outline structures I want Iron Lute to be able to manipulate.

Node Types

One of the most interesting possibilities inherent in this structure is to formally recognize that there are many different types of nodes that could be built.

Every outliner I've seen has at least some informal idea that there are different types of nodes. In pre-Node Type Frontier, the menus were implemented as extensions of outlines; many other outliners use attributes to declare some limited idea of node type (like "isComment='1'" and such things). But formally recognizing that we want lots of different types of nodes allows us to formally support them, define them, and provide paths to extend them.

An existing example, implemented in both Radio Userland and Iron Lute, is the OPML Transclusion node. In Radio Userland, it is expressed as a node type of "link"; in Iron Lute, I have an "OPMLTranscluder" (to separate the OPML-specific semantics from other potential transclusion types, and because OPML is just one particular format to Iron Lute, which is not built on it).

What does that mean? It means that the node can bundle certain actions along with it, and carry data with it. In other words, it almost exactly parallels the "class" idea in OO, so in Iron Lute we actually build it on classes.

An OPMLTranscluder node knows to load a remote OPML file if it is opened. Many other types of nodes can be imagined, each carrying its own verbs with it wherever it may go. A lot of things can be expressed as Node Types that you might not initially realize.
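As a rough illustration of node types built on classes (this is not Iron Lute's actual code; the method names `commands` and `on_open`, the injected `fetch` loader, and the example URL are all assumptions made for the sketch), a transcluding node might look something like this:

```python
class Node:
    """Base node type: holds text and children, and advertises its commands."""

    def __init__(self, text=""):
        self.text = text
        self.children = []

    def commands(self):
        # Commands every node type supports (names are illustrative).
        return ["delete", "move"]

    def on_open(self):
        # Plain nodes just expose the children they already have.
        return self.children


class OPMLTranscluder(Node):
    """Hypothetical transcluding node: pulls in a remote outline when opened."""

    def __init__(self, url, fetch):
        super().__init__(text=url)
        self.url = url
        self._fetch = fetch  # injected loader, so the sketch needs no network
        self._loaded = False

    def commands(self):
        # The node type carries its own extra verb along with it.
        return super().commands() + ["reload"]

    def on_open(self):
        if not self._loaded:
            self.children = [Node(line) for line in self._fetch(self.url)]
            self._loaded = True
        return self.children


def fake_fetch(url):
    # Stand-in for downloading and parsing an OPML file.
    return ["first remote headline", "second remote headline"]


link = OPMLTranscluder("http://example.com/outline.opml", fake_fetch)
opened = [child.text for child in link.on_open()]
```

The point of the sketch is only that the behavior ("load when opened") and the extra verb ("reload") live on the node's class, so they travel with the node wherever it is moved.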

For instance, I envision a web site based on outlines, instead of files and database entries. (Kinda like Channel Z, but really deeply outline oriented; I've recently been impressed with how HTML::Mason works in Perl and I think it'll end up looking like that with "outlines" in place of "components".) "Comment Entry Field" can itself be a particular node. Expanding it might let you view the comments made to date. Each comment would itself be a Comment type node, and might know who made the comment. The comment would carry commands with it, so you might be able to right-click and select "Ban User/IP".

Removing a comment would be as easy as just deleting it. You could edit/censor the comment if you were the site owner, re-order the comments, or do any of the other outline manipulations and it would all work. If you decided you didn't want any comments at all, you could delete the entire Comment node from the posting, and all the comments, plus the ability to add new ones, would disappear. So you could easily allow comments on a post-by-post basis.
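A minimal sketch of that idea, assuming a simple parent/children tree and invented class names (`CommentField`, `Comment`) and helper (`accepts_comments`); none of this is taken from Iron Lute itself:

```python
class Node:
    def __init__(self, text=""):
        self.text = text
        self.children = []
        self.parent = None

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def delete(self):
        # Removing a node removes its whole subtree from the parent.
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent = None

    def commands(self):
        return ["Delete", "Edit"]


class Comment(Node):
    """A comment knows who made it and carries its own commands."""

    def __init__(self, text, author, ip):
        super().__init__(text)
        self.author = author
        self.ip = ip

    def commands(self):
        return super().commands() + ["Ban User/IP"]


class CommentField(Node):
    """The 'Comment Entry Field' node; its children are the comments so far."""

    def add_comment(self, text, author, ip):
        return self.add(Comment(text, author, ip))


def accepts_comments(post):
    # A posting accepts comments only while it contains a CommentField node.
    return any(isinstance(child, CommentField) for child in post.children)


post = Node("My posting")
field = post.add(CommentField("Comments"))
spam = field.add_comment("Buy things!", "spammer", "192.0.2.7")
field.add_comment("Nice post.", "reader", "192.0.2.8")

spam.delete()    # removing one comment is just deleting a node
remaining = [c.text for c in field.children]
field.delete()   # deleting the field removes the comments and the entry point
```

Nothing here is specific to comments except the two small subclasses; deletion, re-ordering and moving are the ordinary outline operations.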

Moreover, it's just a node. You could allow multiple distinct comment fields per posting, or move the comment field to another post, move comments from post to post, have the exact same comment field show up in multiple places (different pages of one article), etc. I see a very flexible and interactive web site developing from this, with a different content philosophy than modern web sites.

The website itself will be viewed as HTML by running the engine part of Iron Lute on the web server and rendering things as HTML. Of course, templates and such can already be represented as outlines; that's proven technology.

This is just one small instance of how much fun it might be to have such a clear, consistent, and powerful interface to data available to us. Replicating any of this in current architectures is of course possible, but with outlines and an outliner, the capabilities are clean, cheap, and extend well.

There are also some internal uses; I've implemented the menus as outlines (though as of this writing they don't cascade, that's only because I haven't needed it yet), and later internal configuration will be available in an outline form, with nodes holding preferences.

Node types are, as I've mentioned earlier, the fundamental Good Idea that drove me to write Iron Lute. I'm still excited by the prospect of having this stuff to play with; I only wish I was already done with the "hard work" to get here.

I don't have access to all of the closed-source outliners, so I'd like to explicitly acknowledge the possibility that some of them have a wonderfully clear abstraction; many of them do seem to have a relatively rich set of things that look a lot like Node Types. Without a developer API that can tap into that power, though, who cares?

Document Types

Every document can have a document type, which can also define behaviors and commands for every node it contains. For instance, in the web system I allude to above, whenever you are editing a document, you'll have a command in the Document menu to insert a new comment area. (You can see the Document menu in the screen shot, dimmed out because I haven't come up with any reasonable "document commands" for the default, neutral document that aren't really generally useful commands that belong in the main menus.)
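One way to picture a document type layering commands over its nodes is sketched below. The class names (`DocumentType`, `WebPostDocument`, `Document`) and the `commands_for` method are hypothetical, chosen only for the illustration:

```python
class Node:
    def __init__(self, text=""):
        self.text = text

    def commands(self):
        return ["Delete"]


class DocumentType:
    """Neutral document: contributes nothing beyond the node's own commands."""
    name = "default"

    def document_commands(self):
        return []


class WebPostDocument(DocumentType):
    """Hypothetical document type for the outline-based web site."""
    name = "web post"

    def document_commands(self):
        return ["Insert Comment Area"]


class Document:
    def __init__(self, doc_type):
        self.doc_type = doc_type
        self.nodes = []

    def add(self, node):
        self.nodes.append(node)
        return node

    def commands_for(self, node):
        # Node commands first, then whatever the document type layers on top.
        return node.commands() + self.doc_type.document_commands()


plain = Document(DocumentType())
post = Document(WebPostDocument())
n1 = plain.add(Node("hello"))
n2 = post.add(Node("hello"))
```

The same kind of node gets a different command set depending on the document it sits in, which matches the empty Document menu of the neutral document.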

Another useful aspect in the general case is "embedded document" or "composite document". Imagine editing a Perl program, and having an HTML "here" document in it. Iron Lute will be able to provide full HTML support inside of the "here" document while still providing full Perl support in the normal document, with no "hacks" necessary. This comes up surprisingly often, and the ability to do it cleanly will bring forth even more applications when one document is "embedded" in another.

One thing I'm planning on is to segregate the keyboard commands based on whether they are "node" or "document" commands; I'm planning on reserving "ALT" for document commands, whereas I want to leave CTRL (w/ or w/o SHIFT) for node commands. (The user can re-configure this away if they like but I wouldn't recommend it.)
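The split could be expressed as a small lookup table, sketched here with made-up names (`DEFAULT_SCOPES`, `scope_for`); how Iron Lute actually stores key bindings is not described in this post:

```python
# Hypothetical default bindings: modifier set -> command scope.
DEFAULT_SCOPES = {
    frozenset({"ALT"}): "document",
    frozenset({"CTRL"}): "node",
    frozenset({"CTRL", "SHIFT"}): "node",
}


def scope_for(modifiers, scopes=None):
    """Return which kind of command a modifier combination is reserved for."""
    table = DEFAULT_SCOPES if scopes is None else scopes
    return table.get(frozenset(modifiers))


# A user override that changes the default split (possible, not recommended).
user_scopes = dict(DEFAULT_SCOPES)
user_scopes[frozenset({"ALT"})] = "node"
```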

More Structured Node Text

Most outliners have some capability of supporting at least formatted text. I want to take the next step, and support smarter formatting.

Radio Userland already supports at least one special case of this; IIRC, you can click an HTML link and follow it. Most, if not all, outliners allow formatted text. I want to generalize this, so that you can declare "tags" and decorate text with them (just as you decorate text with "bold" and "italic"), and allow the text itself to carry any useful commands, formatting, intelligence, etc. that it needs to.

Of course you can build the standard "bold" and "italic" on top of this, but you would then easily be able to build other things on top of it too, like an "HTML link". An "HTML Link" is a tag that turns the tagged text blue, underlines it, and when right-clicked, presents (at least) two choices: Edit the link destination, or open a browser on that link. In some ways it's easier to define the generic capability, then build the simple things on top of it.
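Since the design is explicitly unsettled, the following is only one possible shape for such tags, with every name (`Tag`, `LinkTag`, `NodeText`, `commands_at`) invented for the sketch: a tag is an object attached to a character range, carrying both formatting and commands.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    """A tag decorates a span of text with formatting and commands."""
    name: str
    formatting: dict = field(default_factory=dict)

    def commands(self):
        return []


@dataclass
class LinkTag(Tag):
    href: str = ""

    def commands(self):
        return ["Edit link destination", "Open in browser"]


@dataclass
class TaggedSpan:
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    tag: Tag


class NodeText:
    def __init__(self, text):
        self.text = text
        self.spans = []

    def apply(self, start, end, tag):
        self.spans.append(TaggedSpan(start, end, tag))

    def tags_at(self, offset):
        return [s.tag for s in self.spans if s.start <= offset < s.end]

    def commands_at(self, offset):
        # What a right-click at this offset would offer.
        found = []
        for tag in self.tags_at(offset):
            found.extend(tag.commands())
        return found


bold = Tag("bold", {"weight": "bold"})
link = LinkTag("html-link", {"color": "blue", "underline": True},
               href="http://example.com/")

text = NodeText("See the Iron Lute page")
text.apply(0, 3, bold)    # "See"
text.apply(8, 17, link)   # "Iron Lute"
```

In this shape "bold" is just a tag with formatting and no commands, while the link is a tag with both, which is the sense in which the simple cases fall out of the generic one.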

The spell-checker will tap into this; a spell-checker watches the words in a node, and marks misspelled words with a tag that changes the formatting and, when selected, gives the usual correction options.
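A toy version of the spell-checking pass might look like the following. The tiny word list and the function names are assumptions, and `difflib.get_close_matches` from the standard library stands in for a real suggestion engine:

```python
import difflib
import re

# Toy dictionary; a real checker would load a proper word list.
KNOWN_WORDS = {"the", "outline", "is", "a", "tree", "separate"}


def misspelled_spans(text, known=KNOWN_WORDS):
    """Return (start, end, word) for each word not found in the dictionary."""
    spans = []
    for match in re.finditer(r"[A-Za-z']+", text):
        word = match.group()
        if word.lower() not in known:
            spans.append((match.start(), match.end(), word))
    return spans


def suggestions(word, known=KNOWN_WORDS):
    """Correction options to offer when a marked word is selected."""
    return difflib.get_close_matches(word.lower(), sorted(known), n=3)


line = "The outline is a seperate tree"
marked = misspelled_spans(line)
options = suggestions("seperate")
```

Each returned span is exactly the range a "misspelled" tag would be attached to, and the suggestions are what that tag would offer as its commands.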

As of this writing, I'm not entirely certain how this is going to be done, but I am fairly certain you will be able to define your own tag types and provide them with context-specific commands in a clean manner.

Synthesis

All added up, this provides a rich, robust system for representing a wide variety of outline type data structures, documents and otherwise.

It's not a panacea; perhaps in some other posting I'll talk about the indicators that an outliner is not an appropriate interface. For now, that's only written up in the developer's manual, and still a work very much in progress. It also depends on some of the precise aspects of the outline structure that I have not discussed, such as its interaction with the Commands the user runs on the outline, and the hitherto-unmentioned Undo/Redo system, so I can't meaningfully describe all of the indicators right now.

It's worth reviewing for a moment what I've done. Starting from a monolithic idea of what a node is, I broke the node up into:

A structural piece (the original node class)
How the structural nodes are connected (links)
A 'projected' outline on top of a graph (handles)
Where the node gets its data from (data sources)
What kind of node/document it is (node types)
What document the node belongs to

That's six distinct "aspects" of a node that I can manipulate, or even replace, independently of the others. (This is why writing outline code in other outliners feels so constraining to me; things natural under this model are complex, error-prone, or even impossible under the "monolithic node" model.) Despite that, it's only marginally more difficult to manipulate, and often easier because it asks you to jump through fewer hoops.
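To make the separation concrete, here is a heavily simplified sketch of the six aspects as separate pieces. It is reconstructed from the list above only; the class and field names are assumptions and the real Iron Lute classes certainly differ:

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(eq=False)
class Document:
    title: str


@dataclass(eq=False)
class StructuralNode:
    """Aspect 1: the structural piece. It only knows about its links."""
    data_source: Callable[[], str]        # aspect 4: where the data comes from
    node_type: str = "plain"              # aspect 5: what kind of node it is
    document: Optional[Document] = None   # aspect 6: owning document
    outgoing: List["Link"] = field(default_factory=list)

    @property
    def text(self):
        return self.data_source()


@dataclass(eq=False)
class Link:
    """Aspect 2: how structural nodes are connected."""
    parent: StructuralNode
    child: StructuralNode


@dataclass(eq=False)
class Handle:
    """Aspect 3: one position in the outline projected onto the graph."""
    node: StructuralNode
    via: Optional[Link] = None            # the link followed to get here

    def children(self):
        return [Handle(link.child, link) for link in self.node.outgoing]


def connect(parent, child):
    link = Link(parent, child)
    parent.outgoing.append(link)
    return link


doc = Document("demo")
a = StructuralNode(lambda: "A", document=doc)
b = StructuralNode(lambda: "B", document=doc)
connect(a, b)
connect(a, b)   # the same node linked twice: one node, two outline positions

root = Handle(a)
kids = root.children()
b.data_source = lambda: "B (reloaded)"   # swap one aspect; the rest is untouched
texts = [h.node.text for h in kids]
```

Swapping the data source changes what both outline positions display without touching links, handles, node type, or document, which is the kind of independence the list describes.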

For reference, the final Python code that implements this structure is significantly shorter than just this discussion. Approx. 50 kilobytes of code sustained about 90 kilobytes of discussion; and this discussion was just the high-level discussion, and the 50 kilobyte count includes a lot of documentation. While not all code is written with this level of deliberation, a lot of it is.

For you non-developers who may have slogged through this, I hope this opens your eyes to the amount of work represented in every program you use, including huge quantities of code you aren't even aware of at the OS layer and other layers you can't see. And even in the case of Iron Lute, which remains a relatively small, unreleased program, this is only a fraction of the total code in that program; as of this writing there are 700 kilobytes of Python code, including testing, documentation, and at least one dead branch of code. (While this is a very carefully considered and designed part, there are at least four or five other things in the code worthy of equal care; some I may yet write about here, some I probably won't.)

This should be the last bit exclusively regarding outlines, though of course further discussion will all be predicated on the outline model for obvious reasons.

Last night, I 'finished' the part of Iron Lute that reconstructs outlines based on the pieces as described in this post; finished gets scare quotes because it needs more tests written for it, which it gets tonight. However, it already correctly handles the case of A->B, which pretty much is the fundamental thing it has to get right; everything else (except maybe A->A) is built on that. After that I need to write the XML converter for the parts, and start testing the dickens out of it, then I can swing back to the GUI.

After more thought, I'm strongly considering releasing an alpha version at that point, even though I'm not "done" yet, for testing and feedback. There are several large ideas still left to go in, but in the long run it might be better to have it out there, getting used and getting feedback, than to get all the ideas in there at first. My primary issue is I don't want developers to get too attached to the current way of doing things which may change later, but this sort of project generally takes time to come up to speed, if it indeed ever does, so perhaps that's a phantom concern.