A history of S_IFMT

In Unix, S_IFMT is a mask identifying the bits of an inode's mode that indicate the file's type, i.e. whether it is a directory, a symbolic link, a socket, and so on. It is conventionally 0170000, which corresponds to the top 4 bits of a 16-bit mode.

I saw someone asking the other day why 4 bits are used when POSIX defines only 7 types, which would fit just as well in 3 bits. The straightforward answer is that it leaves room for expansion, and indeed many Unixes define several more. Solaris, for example, has 3 additional types: doors, event ports, and ACL shadows (though the last is not exposed in userspace).

But that's not the whole story. The question I'm going to answer in this post is not why 4 bits are used, but why they're used the way they are. If you have a look at the standard file types, their values seem pretty arbitrary, when you might expect a simple count upwards.

Symbol    Octal   Bits  Type
S_IFMT    170000  1111
S_IFIFO   010000  0001  Named pipe
S_IFCHR   020000  0010  Character special
S_IFDIR   040000  0100  Directory
S_IFBLK   060000  0110  Block special
S_IFREG   100000  1000  Regular file
S_IFLNK   120000  1010  Symbolic link
S_IFSOCK  140000  1100  Socket
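For reference, this is roughly how the mask gets used in practice: AND the mode with S_IFMT, then compare the result against one of the type constants. The sketch below assumes a system whose <sys/stat.h> provides all seven symbols from the table; the function name type_char is made up for illustration.

```c
#define _XOPEN_SOURCE 700   /* ask for the S_IF* symbols even in strict modes */
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: the type character ls -l would show for a mode,
 * or '?' if the type bits match none of the seven standard types. */
char type_char(mode_t mode)
{
    switch (mode & S_IFMT) {        /* keep only the type bits */
    case S_IFIFO:  return 'p';
    case S_IFCHR:  return 'c';
    case S_IFDIR:  return 'd';
    case S_IFBLK:  return 'b';
    case S_IFREG:  return '-';
    case S_IFLNK:  return 'l';
    case S_IFSOCK: return 's';
    default:       return '?';
    }
}
```

Note that the comparison has to be on the masked value as a whole: testing a single bit (say, mode & S_IFDIR) would also match block specials, since 060000 contains the 040000 bit.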

I saw some patterns in there, but I couldn't work it out, so I had a look at some historical manuals and header files.

1st Edition UNIX

1st Edition UNIX (1971) had no type field as such. The top 4 bits of the mode had the following layout. A dot (.) means that the bit's value doesn't matter.

Octal   Bit   Meaning
100000  1...  Inode is allocated
040000  .1..  Directory
020000  ..1.  Has been modified
010000  ...1  Large file storage

We can see the origin of S_IFDIR here, but the other bits had completely different meanings. In fact, 1st Edition had a very different layout for the mode in general. For one thing, groups had yet to be introduced. The bottom 6 bits were used, from higher to lower, to mean: setuid, executable, owner-read, owner-write, other-read, and other-write. And so 1st Edition ls might write --xrwr- to mean something like -rwxr-xr-x today.

Bit 020000 was apparently always set to 1, and so was likely just ignored by the time of the 1st Edition. Bit 100000 was also always set to 1 for allocated inodes, but this allowed the file system to distinguish between an unallocated inode and a regular file with no permissions (-------).
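To make the flag-style layout concrete, here is a small sketch of how those four bits could be tested. The constant and function names are invented for this sketch and are not taken from 1st Edition source; only the octal values come from the table above.

```c
#include <stdbool.h>

/* Hypothetical names for the 1st Edition flag bits. */
enum {
    V1_ALLOC    = 0100000,  /* inode is allocated */
    V1_DIR      = 0040000,  /* directory */
    V1_MODIFIED = 0020000,  /* has been modified */
    V1_LARGE    = 0010000   /* large file storage */
};

/* An unallocated inode is one without the allocation bit. */
bool v1_is_free(unsigned mode)
{
    return (mode & V1_ALLOC) == 0;
}

/* Each flag is independent, so a single-bit test is enough here. */
bool v1_is_dir(unsigned mode)
{
    return (mode & V1_DIR) != 0;
}
```

With this layout, a regular file with no permissions still has the allocation bit set, so it never looks like a free inode.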

4th Edition UNIX

The mode layout changed in 4th Edition UNIX (1973), coinciding with the addition of groups and a switch to the modern -rwxrwxrwx layout for the file permissions. This was the first Unix to have a mask for these inode types, though it was only 2 bits wide, taking the place of the directory bit and modification bit.

Symbol  Octal   Bits  Type
IFMT    060000  0110
        000000  .00.  Regular file
IFCHR   020000  .01.  Character special
IFDIR   040000  .10.  Directory
IFBLK   060000  .11.  Block special

The allocation bit (IALLOC) and large file bit (ILARG) were still used as in the 1st Edition.
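As a sketch of what the 2-bit mask means for code, the function below classifies a 4th Edition mode using the values from the table. The function name v4_type_char is hypothetical, and the constants are prefixed with V4_ to mark them as belonging to this sketch rather than to any real header.

```c
/* Values from the 4th Edition table; the V4_ prefix is this sketch's own. */
enum {
    V4_IALLOC = 0100000,  /* allocation bit, outside the type mask */
    V4_IFMT   = 0060000,  /* 2-bit type mask */
    V4_IFCHR  = 0020000,
    V4_IFDIR  = 0040000,
    V4_IFBLK  = 0060000,
    V4_ILARG  = 0010000   /* large file bit, outside the type mask */
};

/* Hypothetical helper: '-' regular, 'c' character, 'd' directory, 'b' block. */
char v4_type_char(unsigned mode)
{
    switch (mode & V4_IFMT) {   /* IALLOC and ILARG are masked away */
    case V4_IFCHR: return 'c';
    case V4_IFDIR: return 'd';
    case V4_IFBLK: return 'b';
    default:       return '-';  /* both type bits clear: regular file */
    }
}
```

A regular file is simply the absence of any type bits here, which only works because IALLOC separately records whether the inode is in use.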

7th Edition UNIX

The next change happened in 7th Edition UNIX (1979), when the mask was widened to the present 4 bits by one bit in each direction, displacing IALLOC and ILARG. Each existing bit kept its absolute position in the mode, which is why the earliest types do not count up from 1. In addition, regular files kept their highest bit set (as it would have been while IALLOC was in use), so that an unallocated inode (stored with a fully zeroed mode) can still be distinguished from a regular file with no permissions (----------).

Also added were two types that are no longer in use: multiplexed special files, which had the same codes as their uniplexed counterparts but with the lowest bit set. These types did not last long, however.

Symbol   Octal   Bits  Type
S_IFMT   170000  1111
S_IFCHR  020000  0010  Character special
S_IFMPC  030000  0011  Multiplexed character special
S_IFDIR  040000  0100  Directory
S_IFBLK  060000  0110  Block special
S_IFMPB  070000  0111  Multiplexed block special
S_IFREG  100000  1000  Regular file
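The relationship between the multiplexed types and their uniplexed counterparts can be sketched as below. The V7_ names and both helper functions are made up for this sketch; only the octal values are from the table.

```c
#include <stdbool.h>

/* Values from the 7th Edition table; the V7_ prefix is this sketch's own. */
enum {
    V7_IFMT  = 0170000,
    V7_IFCHR = 0020000,
    V7_IFMPC = 0030000,
    V7_IFBLK = 0060000,
    V7_IFMPB = 0070000,
    V7_MPX   = 0010000   /* lowest type bit, where ILARG used to live */
};

/* True for the two multiplexed special types only. */
bool v7_is_multiplexed(unsigned mode)
{
    unsigned type = mode & V7_IFMT;
    return type == V7_IFMPC || type == V7_IFMPB;
}

/* Type bits of the uniplexed counterpart: clear the lowest type bit. */
unsigned v7_uniplexed(unsigned mode)
{
    return (mode & V7_IFMT) & ~(unsigned)V7_MPX;
}
```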

System III

System III (1982) added named pipes, starting at the lowest value now possible.

Symbol   Octal   Bits  Type
S_IFIFO  010000  0001  Named pipe

4.3BSD

4.3BSD (1986) added symbolic links and sockets, also counting up but only using the top 3 bits, 160000, presumably so as not to step on AT&T's toes.

Symbol    Octal   Bits  Type
S_IFLNK   120000  1010  Symbolic link
S_IFSOCK  140000  1100  Socket

Enumerating S_IFMT

Something interesting (to me) about how this layout has come about is that, if you twiddle the bits a little, you can end up with a reasonably chronological numbering of the types. Specifically, in code:

    fmt = mode >> 12;                       // drop file permissions, leaving IFMT
    if (fmt == 010) return 0;               // if only IALLOC bit is set, clear it
    return ((fmt >> 1) | (fmt << 2)) & 07;  // fold rightmost bit onto leftmost bit

And this gives us:

#  Type
0  Regular file
1  Character special
2  Directory
3  Block special
4  Named pipe
5  Symbolic link
6  Socket
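If you want to check the numbering yourself, the snippet can be wrapped in a function like the one below. The name chrono_index is made up, and the extra & 017 is my own addition so that a mode wider than 16 bits doesn't leak stray bits into the result; otherwise the logic is the same as above.

```c
/* Hypothetical wrapper around the bit folding shown above. */
int chrono_index(unsigned mode)
{
    unsigned fmt = (mode >> 12) & 017;  /* drop permissions, keep the 4 type bits */

    if (fmt == 010)
        return 0;                       /* only the old IALLOC bit: regular file */

    /* Shift the top 3 bits down by one and move the lowest bit up to
     * bit 2, then keep the 3-bit result. */
    return (int)(((fmt >> 1) | (fmt << 2)) & 07);
}
```

For example, chrono_index(0040755) gives 2 for a directory and chrono_index(0120777) gives 5 for a symbolic link, matching the table.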