Thinking about a better Editor

31 March 2004, 06:39 UTC

My earlier note about wanting a Better Emacs has left me thinking about more details, and I find that if I don't write them down, they just go around and around in my head. So maybe I'll write them down.

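The main parts of the editor would be an internal language, an internal structure for documents, parsing of input, generation of output, and rendering onto the screen.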

The internal language would probably have to be Python. Though I haven't used it much, the syntax and general approach are just soooo much cleaner than any of the competition that it isn't funny. It is a pity about the lack of static typing, but as creating a new language just for an editor is probably a backwards step, Python it is.

The internal structure would also be fairly unsurprising. A simple tree with an assortment of tags at each node to indicate meaning would largely suffice. Obviously text would have to have a significant role in the tree, so the data structure would need a simple and efficient way to store and manipulate large texts.
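A minimal sketch of such a tree, just to make the idea concrete. The names (`Node`, `tags`, `full_text`) are invented for illustration, and a plain Python string stands in for whatever efficient text storage a real editor would need.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of the document tree (hypothetical name and fields)."""
    tags: dict = field(default_factory=dict)       # meaning, e.g. {"kind": "paragraph"}
    text: str = ""                                 # text held directly by this node
    children: list = field(default_factory=list)   # child nodes, in document order

    def add(self, child):
        """Append a child and return it, so trees can be built inline."""
        self.children.append(child)
        return child

    def full_text(self):
        """Concatenate this node's text with that of all its descendants."""
        return self.text + "".join(c.full_text() for c in self.children)


doc = Node(tags={"kind": "document"})
para = doc.add(Node(tags={"kind": "paragraph"}))
para.add(Node(tags={"kind": "word"}, text="Hello, "))
para.add(Node(tags={"kind": "word", "emphasis": True}, text="world"))
```

Here `doc.full_text()` walks the tree and reassembles the text. For large documents the `str` field would presumably be replaced by something like a rope or gap buffer, but that choice is left open here.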

One element that the tree would need would be "lazy nodes". A lazy node would contain text, or pointers to text, that was not fully parsed yet. This would mean that a document didn't have to be fully parsed before any of it was displayed. If the "pointer to text" was actually just a filename or even a network connection, the editor could work on a document that might not even completely exist yet. Only when the editor needed to access specific content would that content be loaded and parsed.

A simple example of using lazy nodes would be in front of an IMAP connection. The document tree could represent the entire collection of mail boxes, but data would only be collected when actually needed.
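One way a lazy node might work is sketched below. Everything here is an assumption made for illustration: the `LazyNode` class, the `loader` callable, and the fake mailbox loader, which only pretends to fetch mail and does not talk to any IMAP server.

```python
class LazyNode:
    """A node whose children are produced only on first access (hypothetical)."""

    def __init__(self, tags, loader):
        self.tags = tags
        self._loader = loader     # callable returning a list of children
        self._children = None     # None means "not loaded yet"

    @property
    def loaded(self):
        return self._children is not None

    @property
    def children(self):
        # Load and parse on first access, then reuse the cached result.
        if self._children is None:
            self._children = self._loader()
        return self._children


fetch_log = []   # records which mailboxes were actually fetched


def make_mailbox(name):
    def load():
        fetch_log.append(name)   # stand-in for a real network fetch
        return [f"{name}: message {i}" for i in range(1, 4)]
    return LazyNode({"kind": "mailbox", "name": name}, load)


mailboxes = [make_mailbox(n) for n in ("INBOX", "Sent", "Archive")]
first = mailboxes[0].children[0]   # only INBOX is fetched at this point
```

The tree can describe all three mailboxes up front, yet `fetch_log` shows that only the one actually looked at was loaded.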

Another question about the internal structure is how different modules would work together. For example, a spelling checker should be fairly independent of other modules and will need to annotate the text to indicate potential spelling errors. This annotation should not interfere with that of any other module. However, modules might still want to interact: a programming language module might annotate comments and text strings as "plain text", and the spelling checker might ignore everything else. This suggests that whatever structure we come up with needs to make a clear distinction between attributes of the core structure of the document, and attributes that are just annotations.
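A rough sketch of that separation, assuming each module gets its own namespace for annotations. The `Span` class, the module names `"lang"` and `"spell"`, and the toy word list are all made up for this example.

```python
class Span:
    """A run of text with core attributes and per-module annotations."""

    def __init__(self, text, **core):
        self.text = text
        self.core = core          # part of the document's real structure
        self.annotations = {}     # module name -> that module's own notes

    def annotate(self, module, key, value):
        self.annotations.setdefault(module, {})[key] = value

    def annotation(self, module, key, default=None):
        return self.annotations.get(module, {}).get(key, default)


KNOWN_WORDS = {"return", "the", "total", "of", "all", "items"}   # toy dictionary


def language_module(spans):
    """Mark comments as plain text (stand-in for a real language parser)."""
    for span in spans:
        if span.core.get("kind") == "comment":
            span.annotate("lang", "plain_text", True)


def spelling_module(spans):
    """Check only spans that another module has marked as plain text."""
    for span in spans:
        if not span.annotation("lang", "plain_text", False):
            continue
        words = span.text.lower().replace("#", " ").split()
        bad = [w for w in words if w not in KNOWN_WORDS]
        span.annotate("spell", "misspelled", bad)


spans = [
    Span("# return the totl of all items", kind="comment"),
    Span("return sum(itms)", kind="code"),
]
language_module(spans)
spelling_module(spans)
```

The spelling checker reads the language module's annotation but writes only under its own name, so neither can clobber the other, and the core attributes are never touched.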

How best to parse program code into this will be an interesting question. Should the core structure be just lines of text with all observations on syntax as attributes, or should more of the inherent structure be recognised and represented, and attributes used only for minor structural elements?

Parsing the input would be done under control of some Python script. Python has plenty of libraries to help with parsing, and more could be written if needed.

Similarly, output would be under script control.
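As a sketch of what "under script control" could mean, input and output scripts might simply register themselves by format name. The registries, the decorators, and the trivial "lines" format below are assumptions for illustration, not a worked-out design.

```python
PARSERS = {}   # format name -> function taking text, returning a document
WRITERS = {}   # format name -> function taking a document, returning text


def parser(fmt):
    """Decorator that registers a parsing script for a format."""
    def register(fn):
        PARSERS[fmt] = fn
        return fn
    return register


def writer(fmt):
    """Decorator that registers an output script for a format."""
    def register(fn):
        WRITERS[fmt] = fn
        return fn
    return register


@parser("lines")
def parse_lines(text):
    # The "document" here is just a list of lines; a real parser would build a tree.
    return text.splitlines()


@writer("lines")
def write_lines(doc):
    return "\n".join(doc) + "\n"


doc = PARSERS["lines"]("one\ntwo\n")
out = WRITERS["lines"](doc)
```

New formats would then be added by dropping in another pair of scripts, without touching the editor core.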

That leaves the most interesting bit - rendering onto the screen.

It would be nice to be able to render almost anything. However setting our sights too high will lead to excess complexity and ultimate failure to get anywhere.

The basic units needed are text and images. Random line drawing should be deferred at least until we have some experience, possibly forever.

Text and images can be arranged in panes. Quite possibly panes can be rotated, though rotations other than multiples of a right angle should not be a concern too early.

Panes can be non-rectangular, but the complexity this creates should be largely left to the application. Each pane should be treated as rectangular, though possibly transparent. It should have a polygonal shape that might affect mouse clicks, but not rendering.
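To illustrate the split between rendering and mouse handling, here is a small sketch. The `Pane` class and its methods are hypothetical, and the hit test uses the standard ray-casting point-in-polygon check, which is just one reasonable choice.

```python
class Pane:
    """A rectangular pane with an optional polygonal hit region (hypothetical)."""

    def __init__(self, x, y, width, height, shape=None):
        self.x, self.y, self.width, self.height = x, y, width, height
        # Polygon vertices in pane-local coordinates; default is the full rectangle.
        self.shape = shape or [(0, 0), (width, 0), (width, height), (0, height)]

    def render_bounds(self):
        """Rendering always uses the full rectangle, whatever the shape."""
        return (self.x, self.y, self.x + self.width, self.y + self.height)

    def hit(self, px, py):
        """Return True if a click at screen point (px, py) lands inside the shape."""
        lx, ly = px - self.x, py - self.y
        inside = False
        n = len(self.shape)
        for i in range(n):
            x1, y1 = self.shape[i]
            x2, y2 = self.shape[(i + 1) % n]
            # Toggle for each edge crossed by a horizontal ray going right from the point.
            if (y1 > ly) != (y2 > ly):
                cross_x = x1 + (ly - y1) * (x2 - x1) / (y2 - y1)
                if lx < cross_x:
                    inside = not inside
        return inside


triangle = Pane(10, 10, 100, 100, shape=[(0, 100), (50, 0), (100, 100)])
```

A click near the top-left corner of this pane falls inside its rectangle but outside the triangle, so `hit` rejects it, while `render_bounds` still reports the whole rectangle for drawing.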



