Many documents consist of a single file that can be read into a single document buffer. However, this is not always the case.
A good example is a mailbox containing mail to be read. This is sometimes a single file, but it can also be a collection of files, read either from the filesystem or over a network connection, as with POP and IMAP. Another example is an HTML page that contains images, which are stored in separate files.
Also, it is not always possible to simply read a document into a buffer. Sometimes the content of the document might be made available slowly, either arriving over a network connection or being generated by some external program. In these cases the creator of the document is an active agent which can simply append to a buffer as data becomes available.
Finally, it would sometimes be useful to deliberately read only part of a file into a buffer. This would help when editing a very large file, or when reading the file might be very slow, again as with a network connection.
This all suggests that our abstraction of a document must be very flexible. It must allow for multiple files in a document, or at least for files to be closely related to one another, and it must allow for documents that have not yet been completely read.
To handle multiple files, it is probably simplest to allow each document buffer to have a list of ancillary (satellite? auxiliary?) buffers that are related to that buffer. Each will have a local name chosen by whichever subsystem is parsing the file information. In this case, the primary file will quite probably be an indexing file such as a directory listing or an IMAP message listing.
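The arrangement described above could be sketched roughly as follows. This is only an illustration, not a settled design: the names DocBuffer, ancillary and attach are hypothetical, and the mailbox contents are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DocBuffer:
    """A document buffer plus the ancillary buffers related to it (hypothetical)."""
    text: str = ""
    # Local name -> ancillary buffer. The names are chosen by whichever
    # subsystem parses the primary (index) buffer.
    ancillary: Dict[str, "DocBuffer"] = field(default_factory=dict)

    def attach(self, local_name: str, buf: "DocBuffer") -> None:
        """Relate another buffer to this one under a local name."""
        self.ancillary[local_name] = buf


# A mailbox: the primary buffer is the message listing, and each message
# body lives in its own ancillary buffer, named by its index number.
mailbox = DocBuffer(text="1 alice Hello\n2 bob Re: Hello\n")
mailbox.attach("1", DocBuffer(text="Hello from alice"))
mailbox.attach("2", DocBuffer(text="Reply from bob"))
```

Because an ancillary buffer is itself a DocBuffer, it can carry ancillary buffers of its own, so the same structure nests without any extra machinery.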
This structure would not allow multiple index files for the one set of files, as might be useful with different views on the list of messages in a mailbox. It is possible that this could be achieved just as well with sufficiently expressive views on the one buffer through different windows. Provided all the important information needed for selecting and sorting (e.g. tags and prev/next references) is in the buffer, displaying different views on an index from the one buffer should be easy enough.
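As a rough illustration of several views over one index, the sketch below treats a view as nothing more than a selection rule and a sort key applied to the shared entries. The IndexEntry fields and the view function are assumptions made for the example; a real index would hold whatever the parsing subsystem extracts.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class IndexEntry:
    """One line of a hypothetical message index."""
    name: str
    subject: str
    tags: FrozenSet[str]


def view(entries: List[IndexEntry],
         keep: Callable[[IndexEntry], bool],
         sort_key: Callable[[IndexEntry], str]) -> List[IndexEntry]:
    """Select and order entries without copying or altering the index."""
    return sorted((e for e in entries if keep(e)), key=sort_key)


# The single shared index buffer.
index = [
    IndexEntry("1", "Hello", frozenset({"inbox", "unread"})),
    IndexEntry("2", "Build failed", frozenset({"inbox"})),
    IndexEntry("3", "Agenda", frozenset({"archive"})),
]

# Two windows onto the same index: unread mail only, and everything by subject.
unread = view(index, lambda e: "unread" in e.tags, lambda e: e.name)
by_subject = view(index, lambda e: True, lambda e: e.subject)
```

Neither view owns any data, so a change to the index shows up in every window the next time its view is recomputed.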
Managing files that are to be read lazily requires each buffer to have a flag to say whether the end has been found yet, and a method for getting more data. This method would be called on demand, when any pointer tries to move past the end of the buffer, and lazily when the editor has nothing else to do. This lazy reading should be rate limited in some way so as not to inappropriately burden any network link. A simple mechanism would be to spend at least as much time not reading as reading. So if reading the first block took a while, the editor would wait an equal while before reading the next block. Filesystem read-ahead might defeat this, however.
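A minimal sketch of this scheme is given below, assuming the data arrives as an iterator of blocks. The class and method names (LazyBuffer, fetch_more, ensure, idle) are invented for the illustration, and the clock is passed in only so the timing can be exercised without real delays.

```python
import time
from typing import Callable, Iterator


class LazyBuffer:
    """A buffer whose content is read block by block (hypothetical sketch)."""

    def __init__(self, blocks: Iterator[bytes],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._blocks = blocks
        self._clock = clock
        self.data = bytearray()
        self.complete = False          # set once the end has been found
        self._next_idle_read = 0.0     # earliest time for the next idle read

    def fetch_more(self) -> bool:
        """Read one more block; return False once the end has been found."""
        if self.complete:
            return False
        start = self._clock()
        block = next(self._blocks, None)
        end = self._clock()
        if block is None:
            self.complete = True
            return False
        self.data += block
        # Spend at least as long not reading as this read took.
        self._next_idle_read = end + (end - start)
        return True

    def ensure(self, pos: int) -> bool:
        """On demand: a pointer wants to reach `pos`, so read without delay."""
        while pos > len(self.data) and self.fetch_more():
            pass
        return pos <= len(self.data)

    def idle(self) -> bool:
        """Idle time: read one block, but only if the rest period has passed."""
        if self.complete or self._clock() < self._next_idle_read:
            return False
        return self.fetch_more()


buf = LazyBuffer(iter([b"first block, ", b"second block"]))
buf.ensure(5)   # a pointer moved to offset 5, forcing the first read
buf.idle()      # called from the editor's idle loop
```

Only the idle path is rate limited here; a read demanded by a pointer goes ahead immediately, on the assumption that the user is waiting for it.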
Having each document buffer be associated with a number of other buffers leads to a tree-like arrangement of buffers. There would presumably be a single top-level document that lists the current set of open documents. Some of these might be simple files, some might be complex files with an index. Others might be further ad hoc sets of documents, such as different sets for different ongoing projects (mail, a programming project, a second project, some reference documentation, etc).
