Where Unix went wrong - filesystems - access control

06 December 2004, 17:02 UTC

This is one in a (possible) series where I complain about design decisions in Unix. Unix has, on the whole, a very good design, so mistakes stand out rather clearly.

Today's article looks at "Access Control" for files. The "ugo" access control in Unix doesn't seem all that bad until you look at the generalisation known as ACLs - access control lists. Once you see them you have to realise that something was fundamentally wrong to start with.

Access Control

Being a multi-user system, Unix has to have some sort of access control mechanism for files that are stored in or accessed by the system. People need to be confident of the privacy of files they want to keep secret, but also able to make some files visible to others when sharing is needed. Equally, people need to be sure that others cannot modify their files without permission.

The Unix approach of assigning each file an owner - and a group owner - plus a set of permission flags - read, modify, execute - for each of the owner, the group owner, and everybody else, is relatively simple. Unfortunately it isn't very powerful.
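To make the scheme concrete, here is a minimal Python sketch of the three-class check. The function name and its arguments are my own for illustration, not any kernel's interface, and the sketch ignores the superuser entirely. Note that exactly one class is consulted: the owner's bits apply to the owner even when the bits for everybody else are more generous.

```python
READ, MODIFY, EXECUTE = 4, 2, 1   # the three flags, as bits in an octal digit


def ugo_allows(mode, file_uid, file_gid, uid, gids, want):
    """Return True if every flag in `want` is granted by `mode`.

    mode     - permission bits such as 0o640
    file_uid - owner of the file
    file_gid - group owner of the file
    uid      - user making the request
    gids     - set of groups that user belongs to
    want     - bitwise OR of READ, MODIFY, EXECUTE
    """
    if uid == file_uid:
        bits = (mode >> 6) & 7    # owner class
    elif file_gid in gids:
        bits = (mode >> 3) & 7    # group class
    else:
        bits = mode & 7           # everybody else
    return (bits & want) == want
```

With mode 0o640 the owner may read and modify, members of the one group owner may only read, and everybody else gets nothing. There is no way to name a second group here.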

Occasionally people want to give privileged access to more than one group, or different access to different groups, and this cannot be achieved with the standard Unix scheme.

The "obvious" response is to extend the scheme. Instead of just one group, have a list of groups. And have a list of users as well. And while we are at it, extend the list of permissions to include "create" and "delete" and "append" (which is a restricted modify) and ... the list goes on.

This list (an Access Control List) then needs to be stored with each file. Some sharing might be possible if the filesystem supports it, but in general you need to be able to have different lists for each file and pay the cost of accessing this list for every file access.
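A rough Python sketch of what such a list looks like and what a check against it costs. The entry layout, the permission names, and the "any matching entry grants access" rule are all assumptions made for illustration; real ACL implementations have more involved evaluation rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AclEntry:
    kind: str          # "user" or "group"
    name: str          # the user or group this entry applies to
    perms: frozenset   # e.g. {"read", "modify", "append", "delete"}


def acl_allows(acl, user, groups, want):
    """Scan the whole list; grant `want` if any matching entry includes it."""
    for entry in acl:
        matches = (
            (entry.kind == "user" and entry.name == user)
            or (entry.kind == "group" and entry.name in groups)
        )
        if matches and want in entry.perms:
            return True
    return False


# A hypothetical list attached to one file: different access for two groups.
EXAMPLE_ACL = [
    AclEntry("user", "alice", frozenset({"read", "modify", "delete"})),
    AclEntry("group", "editors", frozenset({"read", "append"})),
    AclEntry("group", "readers", frozenset({"read"})),
]
```

Every file can carry a different list like EXAMPLE_ACL, and every access means walking it.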

As you can imagine, once the list of access flags has grown beyond the basic "read" and "write" (even eXecute is dubious) people start wanting to add more and more, and there will always be someone who isn't satisfied with the available set. All this extra flexibility, and it probably still isn't flexible enough to suit everybody, and in all probability will not be used most of the time (after all, basic Unix access control has been quite workable for a very long time).

A different response

As mentioned above, the "obvious" response seems to be to extend the current scheme. I would like to suggest that there is a better response: throw the current scheme away altogether.

Rather than adding to the access control with extra lists, we should drop the whole idea of storing access control information with each file, and look for a different approach to the need for access control. It should be an approach that doesn't add clutter to individual files, is efficient where efficiency is needed, and is extensible enough to suit anybody's unexpected need.

The rest of this article attempts to develop and justify such a response. Being a response that is substantially different to the accepted mode of operation it is unlikely to ever be widely accepted, but one can dream.

Core needs

One of the core concepts of access control is the idea that each file has an owner. Someone (or some principal - it might be a group) created the file, is charged for the space used by the file, and has complete say as to what happens to the file (well, maybe not complete. They might not be able to remove it from backups for instance). I believe that it is not unreasonable that all of the files owned by a particular account should all live, exclusively, in one directory tree - the so-called "home directory" of that account. With few exceptions, this is a commonly accepted practice and making it a requirement causes no important loss of functionality.

This concept can be implemented by treating home directories as independent units, disallowing the movement of files between home directories, and only making a user's home directory available to that user. Using a per-process mount table (which is conceptually quite simple and available in Linux and elsewhere) only those processes owned by a user would be able to see the user's home directory at all. Other processes simply wouldn't see it. Thus per-file access control would not be needed to ensure privacy and security.

The next core need is the idea of public files. People often want to be able to share files openly with anyone who might be interested. Such sharing is inevitably read-only. Giving the world uncontrolled write access to any file is certainly a bad idea. Giving open read access is often very valuable.

To support this, each user could have a subdirectory that is designated as "public": any file placed (or possibly linked) into that directory is made visible to anyone in a read-only manner. Possibly these files would be presented in a directory tree like "/pub/username/...." where each user's public files would be available for anyone to read. Again, no per-file ownership or access information is needed. Rather, the location in the directory tree determines how the files may be accessed, and this is controlled by generic system-wide configuration.

Another important need is collaborative work. Different people will sometimes want to openly share files with one another. I believe that this is best implemented not by allowing users to share their files with others, but by allowing users to create groups and have the groups own the files. This, I believe, more accurately reflects the true nature of the work that is happening. How users form groups is a local administrative issue. It may require approval and action by a system administrator, or it may be completely self-managed, with individuals able to "give" some of their own storage allocation to a group that they are a member of.

If this were to happen, then group files might appear under "/groups/groupname". When a user logs in, only the group directories for groups that they are a member of will be mounted. No others will be visible. Again with this scheme, individual files do not need to have a concept of ownership or access controls. It is whole directory trees that are controlled, and these controls can be described in a separate database, and enabled at login time.

The remaining sorts of sharing that might be wanted are not handled quite so easily, but possibly don't need to be. These cover cases where read access is to be given to a selection of people (not everyone) or where some sort of controlled write access is desired.

A suitable model for these ad-hoc accesses would be the model used by the Apache web server (and doubtless other web servers). The important aspects of the model are that access to the files is mediated by a user-space server program, that the program reads a simple textual access file that has a rich syntax for describing access, and that the server can run separate programs to provide complete control over how files are accessed.

With this scheme, discretionary read access can be given away based on visible attributes of the accessor (including asking for passwords), write access can be allowed with arbitrary checks and post-processing, and any access can be easily logged for an audit trail.
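As a minimal sketch of the mediating program's core loop, assuming a made-up four-field rule format (this is not Apache's syntax, just something small enough to show the idea), the Python below parses a textual access file, applies the first rule that matches, denies by default, and records every decision for the audit trail.

```python
def parse_rules(text):
    """Parse lines of the form: <allow|deny> <read|write|any> <user|group|all> <name>."""
    rules = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # strip comments, skip blank lines
        if not line:
            continue
        action, op, kind, name = line.split()
        rules.append((action, op, kind, name))
    return rules


def decide(rules, user, groups, op, audit_log):
    """Apply the first matching rule; deny if none matches. Log the outcome."""
    allowed = False
    for action, rule_op, kind, name in rules:
        op_matches = rule_op in (op, "any")
        who_matches = (
            kind == "all"
            or (kind == "user" and name == user)
            or (kind == "group" and name in groups)
        )
        if op_matches and who_matches:
            allowed = action == "allow"
            break
    audit_log.append((user, op, "allowed" if allowed else "denied"))
    return allowed


# Hypothetical access file kept as plain text inside the owner's home directory.
ACCESS_FILE = """
# who may touch this directory
allow read  user  bob
allow write group editors
deny  any   all   *
"""
```

Because this runs in user space, adding a new kind of rule (a password prompt, a post-processing hook) means changing this program, not the filesystem.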

These sorts of accesses are likely to be less efficient than normal accesses by the owner of the file, but in the cases in question, this is unlikely to be too big a price to pay for the much higher degree of flexibility.

Summary

Rather than having ownership and access control per-file which is stored with the file, we have it per-home directory stored in a separate database, together with text files within the home directory.

We have a per-process mount table, possibly mediated by an auto-mount daemon, that honours the separate database and only makes files visible which should be visible.


/home contains the current user's home directory with full access.
/pub contains the public part of every user's home directory, with read-only access.
/group contains a full-access mount of the home directory for every group the user is a member of.
/share contains a mountpoint which gives access to every home as mediated by a user-space daemon. It honours (the equivalent of) a .htaccess file to restrict and control access.
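
The four mounts above could be computed at login from the separate database, roughly as in this Python sketch. The function name, the "store/..." source paths, the "public" subdirectory name, and the shape of the membership table are all hypothetical; the point is only that visibility is decided once per login and by location, not per file.

```python
def build_mount_table(user, all_users, group_members):
    """Return {mount point: (source, access)} for one user's processes.

    all_users     - iterable of every account name
    group_members - dict mapping group name to the set of its members
    """
    mounts = {}

    # The user's own home: full access, and nobody else's home is mounted.
    mounts[f"/home/{user}"] = (f"store/users/{user}", "rw")

    # The public part of every home, read-only.
    for other in sorted(all_users):
        mounts[f"/pub/{other}"] = (f"store/users/{other}/public", "ro")

    # Only the groups this user belongs to.
    for group in sorted(group_members):
        if user in group_members[group]:
            mounts[f"/group/{group}"] = (f"store/groups/{group}", "rw")

    # Everything else goes through the user-space daemon.
    mounts["/share"] = ("share-daemon", "mediated")
    return mounts
```

A directory that is not in the table simply does not exist as far as that user's processes are concerned.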

This should give better access control with less overhead than per-file ACLs in most cases and certainly feels a lot cleaner to me.

But what about ...

The above description conveniently avoids a number of issues that would come up if we discarded ownership and permission bits. It does so because I think they are largely uninteresting and can be dealt with simply. Here I will try to do that.

... the eXecute bit

The eXecute bit on directories has very little value over the Read bit. Allowing someone to read a directory, but not access files in it is little more than being a tease. It serves no security purpose. Having an eXecute bit without a Read bit has a little more value, but not much. It provides some security, but it is security through obscurity and is unlikely to be of lasting value. If this was really wanted, it - and anything else - can be implemented in the /share tree.

The eXecute bit of files is also fairly pointless, though for different reasons. It is not the permission bit on a file that really says whether a file can be executed, but the content of the file. If it has the right form, then execution is appropriate. Normally "executables" are found in a "bin" directory. Setting the eXecute bit on all files in a "bin" directory (i.e. a directory listed in your PATH) but not elsewhere is a bit pointless.

There certainly are times when you want to execute programs in some other directory, and times when you don't want to execute a file by mistake. However the practice of requiring a full path name, or a name starting "./" in order to execute something that isn't in your path is enough to give full flexibility and maintain adequate security.
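That lookup rule is simple enough to sketch in a few lines of Python. The function name is mine, and the set of existing files is passed in rather than read from disk so the logic stands alone; no permission bit is consulted anywhere.

```python
def resolve_command(name, path_dirs, existing):
    """Decide which file, if any, a typed command name refers to.

    name      - what the user typed, e.g. "ls" or "./build.sh"
    path_dirs - the directories listed in PATH, in order
    existing  - set of file names known to exist (stands in for the filesystem)
    """
    if "/" in name:
        # An explicit path is an explicit decision to run that file.
        return name if name in existing else None

    # A bare name is only ever looked up in the PATH directories,
    # never in the current directory.
    for directory in path_dirs:
        candidate = f"{directory.rstrip('/')}/{name}"
        if candidate in existing:
            return candidate
    return None
```

A file called "build.sh" in the current directory is never run by typing "build.sh"; it has to be asked for as "./build.sh".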

... the setuid/setgid bit

Well, this bit is really just a mistake that should have died long ago. Another article might deal with that.

... read-only files

There is sometimes a desire to mark one's own files as read-only, thus preventing accidental damage. This is probably the strongest argument for per-file access control, and it only justifies one bit. But I think it too shouldn't be necessary. An adequate backup system or a "cp" rather than a "chmod" might do almost as well.

Some software systems use the read-only bit as a flag. RCS, for example, marks a file as 'read-only', when it isn't currently 'checked-out'. Removing the possibility of doing this might make life a bit harder for RCS. However that could be a good thing - it might encourage RCS to allow for people changing files without an explicit check-out just as CVS and others do.

On the whole, I'm sure that removing this sort of feature would encourage people to revisit the problems they were using it to solve, and quite possibly find better solutions.

... system files

The above has talked about users and home directories only. What about system files?

I contend that all system files should be world readable and not writable. Any file with other access requirements should have that access mediated by some daemon and the real file should be hidden from users - only the gateway to the daemon should be visible. This will probably be dealt with in greater detail when I consider setuid files.

An excuse for Unix

So Unix ownership and permissions are a bad idea, but can we blame Unix? I think not. Unix was written for very small computers (by today's standards) and there may not have been room in the kernel to implement per-process address spaces or user-space filesystem daemons. The whole idea of daemons providing access to restricted resources is possibly more recent than Unix, too. At the time, the three-level permissions probably seemed adequate and elegant.

The blame, if such is to be laid, must be on those who suggest ACLs to enhance Unix permissions. ACLs are ugly and cumbersome. There is no excuse for creating and perpetuating them. Discarding it all and starting again is the best option.


Comments...

Re: Where Unix went wrong - filesystems - access control (06 February 2007, 20:56 UTC)
KISS

or in your case: KIS

If the admin of the server can add users to various groups then perhaps the system serves well.

Doing an NFS mount with read only would allow further control to other users of the same system.

Any meta stuff is just a case of implementing 1 & 0's



Comment (06 February 2008, 10:59 UTC)
I'm sorry - where did you study operating systems architecture? CMU? Oxford? Cambridge? Berkeley? Stanford? With Tony Hoare? Edsger Dijkstra? At Thomas J Watson?



Re: Where Unix went wrong - filesystems - access control (06 February 2008, 11:55 UTC)

At UNSW under John Lions if it is relevant... but most of my studying of such things was done outside the confines of a university.

Why?





