"Everything is a file" is a bit glib. "Everything appears somewhere in the filesystem" is closer to the mark, and even then, it's more an ideal than a law of system design.

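For example, Unix domain sockets are not files, but they do appear in the filesystem. You can ls -l a domain socket to display its attributes, modify its access control via chmod, and so on. What you cannot do is cat data to or from one: on Linux and the BSDs, open(2) fails on a socket node, so you need a sockets-aware tool such as socat to talk to it.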

But, even though regular TCP/IP network sockets are created and manipulated with the same BSD sockets system calls as Unix domain sockets, they do not show up in the filesystem,¹ and there is no especially good reason why they shouldn't.

Another example of non-file objects appearing in the filesystem is Linux's /proc filesystem. This feature exposes a great amount of detail about the kernel's run-time operation to user space, mostly as virtual plain text files. Many /proc entries are read-only, but a lot of /proc is also writeable, so you can change the way the system runs using any program that can modify a file. Alas, here again we have a nonideality: BSD type Unixes generally run without /proc, and the System V Unixes expose a lot less via /proc than Linux does.
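As a minimal illustration, assuming a Linux box with /proc mounted, ordinary text tools are enough to query the kernel. The two entries read here are only examples; any other readable /proc entry would do:

```shell
# Assumes Linux with /proc mounted; these entries are only examples.
# Ask the kernel what it calls itself, using nothing but cat.
ostype=$(cat /proc/sys/kernel/ostype)
echo "Kernel type: $ostype"

# /proc/self refers to whichever process opens it, so this grep
# reports its own name from its own status "file".
self_name=$(grep '^Name:' /proc/self/status)
echo "$self_name"
```

Writeable entries take the same treatment in the other direction, with echo and a redirect, though that generally requires root.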

I can't contrast that to MS Windows

First, much of the sentiment you can find online and in books about Unix being all about file I/O and Windows being "broken" in this regard is obsolete. Windows NT fixed a lot of this.

Modern versions of Windows have a unified I/O system, just like Unix, so you can read network data from a TCP/IP socket via ReadFile() rather than the Windows Sockets-specific API WSARecv(), if you want to. This exactly parallels the Unix Way, where you can read from a network socket with either the generic read(2) Unix system call or the sockets-specific recv(2) call.²

Nevertheless, Windows still fails to take this concept to the same level as Unix, even here in 2018. There are many areas of the Windows architecture that cannot be accessed through the filesystem, or that can't be viewed as file-like. Some examples:

Drivers. Windows' driver subsystem is easily as rich and powerful as Unix's, but to write programs to manipulate drivers, you generally have to use the Windows Driver Kit, which means writing C or .NET code. On Unix type OSes, you can do a lot to drivers from the command line. You've almost certainly already done this, if only by redirecting unwanted output to /dev/null.³

Inter-program communication. Windows programs don't communicate easily with each other. Unix command line programs communicate easily via text streams and pipes. GUI programs are often either built on top of command line programs or export a text command interface, so that the same simple text-based communication mechanisms work with GUI programs, too.

The registry. Unix has no direct equivalent of the Windows registry. The same information is scattered through the filesystem, most of it in /etc, /proc and /sys.

If you don't see that drivers, pipes, and Unix's answer to the Windows registry have anything to do with "everything is a file," read on.

How does the "Everything is a file" philosophy make a difference here?

I will explain that by expanding on my three points above, in detail.

Long Answer, part 1: Drives vs Device Files

Let's say your CF card reader appears as E: under Windows and /dev/sdc under Linux. What practical difference does it make?

It is not just a minor syntax difference.

On Linux, I can say dd if=/dev/zero of=/dev/sdc to overwrite the contents of /dev/sdc with zeroes.

Think about what that means for a second. Here I have a normal user space program (dd(1)) that I asked to read data in from a virtual device (/dev/zero) and write what it read out to a real physical device (/dev/sdc) via the unified Unix filesystem. dd doesn't know it is reading from and writing to special devices. It will work on regular files just as well, or on a mix of devices and files, as we will see below.
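To see that dd really is indifferent to what it reads and writes, here is a harmless sketch that touches only regular files in a scratch directory. The directory and file names are made up for the example:

```shell
# Work in a scratch directory so nothing outside it is touched.
scratch=$(mktemp -d)

# Same kind of dd invocation as for the card, but the output is an
# ordinary file. The count is bounded because /dev/zero never ends and
# a file, unlike a device, has no fixed size to stop at.
dd if=/dev/zero of="$scratch/zeros.bin" bs=1k count=4 2>/dev/null

# Regular file in, regular file out: dd neither knows nor cares.
dd if="$scratch/zeros.bin" of="$scratch/copy.bin" 2>/dev/null

size=$(wc -c < "$scratch/copy.bin" | tr -d ' ')
echo "copy.bin is $size bytes"
rm -r "$scratch"
```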

There is no easy way to zero the E: drive on Windows, because Windows makes a distinction between files and drives, so you cannot use the same commands to manipulate them. The closest you can get is to do a disk format without the Quick Format option, which zeroes most of the drive contents, but then writes a new filesystem on top of it. What if I don't want a new filesystem? What if I really do want the disk to be filled with nothing but zeroes?

Let's be generous and say that we really do want a fresh new filesystem on E:. To do that in a program on Windows, I have to call a special formatting API.⁴ On Linux, you don't need to write a program to access the OS's "format disk" functionality. You just run the appropriate user space program for the filesystem type you want to create: mkfs.ext4, mkfs.xfs, or what have you. These programs will write a filesystem onto whatever file or /dev node you pass.

Because mkfs type programs on Unixy systems work on files without making artificial distinctions between devices and normal files, I can create an ext4 filesystem inside a normal file on my Linux box:

$ dd if=/dev/zero of=myfs bs=1k count=1k
$ mkfs.ext4 -F myfs

That literally creates a 1 MiB disk image in the current directory, called myfs. I can then mount it as if it were any other external filesystem:

$ mkdir mountpoint
$ sudo mount -o loop myfs mountpoint
$ grep "$USER" /etc/passwd | sudo tee mountpoint/my-passwd-entry > /dev/null
$ sudo umount mountpoint

Now I have an ext4 disk image with a file called my-passwd-entry in it which contains my user's /etc/passwd entry.

If I want, I can blast that image onto my CF card:

$ sudo dd if=myfs of=/dev/sdc1

Or, I can pack that disk image up, mail it to you, and let you write it to a medium of your choosing, such as a USB memory stick:

$ gzip myfs
$ echo "Here's the disk image I promised to send you." | mutt -a myfs.gz -s "Password file disk image" you@example.com

All of this is possible on Linux⁵ because there is no artificial distinction between files, filesystems, and devices. Many things on Unix systems either are files, or are accessed through the filesystem so that they look like files, or in some other way look sufficiently file-like that they can be treated as such.

Windows' concept of the filesystem is a hodgepodge; it makes distinctions between directories, drives, and network resources. There are three different syntaxes, all blended together in Windows: the Unix-like ..\FOO\BAR path system, drive letters like C:, and UNC paths like \\SERVER\PATH\FILE.TXT. This is because it's an accretion of ideas from Unix, CP/M, MS-DOS, and LAN Manager, rather than a single coherent design. It is why there are so many illegal characters in Windows file names.

Unix has a unified filesystem, with everything accessed by a single common scheme. To a program running on a Linux box, there is no functional difference between /etc/passwd, /media/CF_CARD/etc/passwd, and /mnt/server/etc/passwd. Local files, external media, and network shares all get treated the same way.⁶

Windows can achieve similar ends to my disk image example above, but you have to use special programs written by uncommonly talented programmers. This is why there are so many "virtual DVD" type programs on Windows. The lack of a core OS feature has created an artificial market for programs to fill the gap, which means you have a bunch of people competing to create the best virtual DVD type program. We don't need such programs on *ix systems, because we can just mount an ISO disk image using a loop device.

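The same goes for other tools like disk wiping programs, which we also don't need on Unix systems. Want your CF card's contents irretrievably scrambled instead of just zeroed? Okay, use /dev/urandom as the data source instead of /dev/zero (/dev/random works too, but on older Linux kernels it blocks whenever the entropy pool runs low, which makes it painfully slow for bulk writes):

$ sudo dd if=/dev/urandom of=/dev/sdc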

On Linux, we don't keep reinventing such wheels because the core OS features not only work well enough, they work so well that they're used pervasively. A typical scheme for booting a Linux box involves a virtual disk image, for just one example, created using techniques like those I show above.⁷

I feel it's only fair to point out that if Unix had integrated TCP/IP I/O into the filesystem from the start, we wouldn't have the netcat vs socat vs Ncat vs nc mess, the cause of which was the same design weakness that led to the disk imaging and wiping tool proliferation on Windows: lack of an acceptable OS facility.

Long Answer, part 2: Pipes as Virtual Files

Despite its roots in DOS, Windows has never had a rich command line tradition.

This is not to say that Windows doesn't have a command line, or that it lacks many command line programs. Windows even has a very powerful command shell these days, appropriately called PowerShell.

Yet, there are knock-on effects of this lack of a command-line tradition. You get tools like DISKPART, which is almost unknown in the Windows world because most people do disk partitioning and such through the Computer Management MMC snap-in. Then, when you do need to script the creation of partitions, you find that DISKPART wasn't really made to be driven by another program. Yes, you can write a series of commands into a script file and run it via DISKPART /S scriptfile, but it's all-or-nothing. What you really want in such a situation is something more like GNU parted, which will accept single commands like parted /dev/sdb mklabel gpt. That allows your script to do error handling on a step-by-step basis.

What does all this have to do with "everything is a file"? Easy: pipes make command line program I/O into "files," of a sort. Pipes are unidirectional streams, not random-access like a regular disk file, but in many cases the difference is of no consequence. The important thing is that you can attach two independently-developed programs and make them communicate via simple text. In that sense, any two programs designed with the Unix Way in mind can communicate.
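A small sketch of that idea: none of the programs below was written with the others in mind, yet they cooperate because each one reads and writes plain lines of text. The fruit names are just stand-in data:

```shell
# Independent programs cooperating through plain text on pipes.
# printf stands in for any program that emits one item per line.
summary=$(printf 'pear\napple\npear\nfig\npear\napple\n' |
    sort |          # group identical lines together
    uniq -c |       # collapse each group to "count item"
    sort -rn |      # highest count first
    head -n 1)      # keep only the most frequent item
echo "$summary"
```

The result is a count followed by the most frequent item, which here is pear with three occurrences.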

In those cases where you really do need a file, it is easy to turn program output into a file:

$ some-program --some --args > myfile
$ vi myfile

But why write the output to a temporary file when the "everything is a file" philosophy gives you a better way? If all you want to do is read the output of that command into a vi editor buffer, vi can do that for you directly. From the vi "normal" mode, say:

:r !some-program --some --args

That inserts that program's output into the active editor buffer at the current cursor position. Under the hood, vi is using pipes to connect the output of the program to a bit of code that uses the same OS calls it would use to read from a file instead. I wouldn't be surprised if the two cases of :r (that is, with and without the !) both used the same generic data reading loop in all common implementations of vi. I can't think of a good reason not to.

This isn't a recent feature of vi, either; it goes clear back to the ancient ed(1) text editor.⁸

This powerful idea pops up over and over in Unix.

For a second example of this, recall my mutt email command above. The only reason I had to write that as two separate commands is that I wanted the temporary file to be named *.gz, so that the email attachment would be correctly named. If I didn't care about the file's name, I could have used process substitution to avoid creating the temporary file:

$ echo "Here's the disk image I promised to send you." | mutt -a <(gzip -c myfs) -s "Password file disk image" you@example.com

That avoids the temporary by turning the output of gzip -c into a FIFO (which is file-like) or a /dev/fd object (which is file-like). (Bash chooses the method based on the system's capabilities, since /dev/fd isn't available everywhere.)
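To make the FIFO variant less magical, here is roughly what it looks like when done by hand with mkfifo. This is only a sketch of the idea, not a claim about how Bash implements it internally; the scratch directory, the FIFO name, and the sample text are all made up:

```shell
# A hand-rolled take on what <(gzip -c ...) can do via a FIFO.
scratch=$(mktemp -d)
mkfifo "$scratch/pipe"

# Writer: compresses into the FIFO in the background. Opening the
# FIFO blocks until a reader opens the other end.
printf 'pretend disk image\n' | gzip -c > "$scratch/pipe" &

# Reader: the shell opens the FIFO by name, exactly as it would open
# a regular file, and gzip decompresses whatever arrives.
roundtrip=$(gzip -dc < "$scratch/pipe")
wait
echo "$roundtrip"
rm -r "$scratch"
```

No compressed data ever lands on disk: the FIFO has a name in the filesystem, but the bytes flow straight from the writer to the reader.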

For yet a third way this powerful idea appears in Unix, consider gdb on Linux systems. This is the debugger used for any software written in C and C++. Programmers coming to Unix from other systems look at gdb and almost invariably gripe about it, "Yuck, it's so primitive!" Then they go searching for a GUI debugger, find one of several that exist, and happily continue their work...often never realizing that the GUI just runs gdb underneath, providing a pretty shell on top of it. There aren't competing low-level debuggers on most Unix systems because there is no need for programs to compete at that level. All we need is one good low-level tool that we can all base our high-level tools on, if that low-level tool communicates easily via pipes.

This means we now have a documented debugger interface which would allow drop-in replacement of gdb, but unfortunately, the primary competitor to gdb didn't take the low-friction path.

Still, it is at least possible that some future gdb replacement would drop in transparently simply by cloning its command line interface. To pull the same thing off on a Windows box, the creators of the replaceable tool would have had to define some kind of formal plugin or automation API. That means it doesn't happen except for the very most popular programs, because it's a lot of work to build both a normal command line user interface and a complete programming API.

This magic happens through the grace of pervasive text-based IPC.

Although Windows' kernel has Unix-style anonymous pipes, it's rare to see normal user programs use them for IPC outside of a command shell, because Windows lacks this tradition of creating all core services in a command line version first, then building the GUI on top of it separately. This leads to being unable to do some things without the GUI, which is one reason why there are so many remote desktop systems for Windows, as compared to Linux: Windows is very hard to use without the GUI.

By contrast, it's common to administer Unix, BSD, OS X, and Linux boxes remotely via SSH. And how does that work, you ask? SSH connects a network socket (which is file-like) to a pseudo tty at /dev/pty* (which is file-like). Now your remote system is connected to your local one through a connection that so seamlessly matches the Unix Way that you can pipe data through the SSH connection, if you need to.

Are you getting an idea of just how powerful this concept is now?

A piped text stream is indistinguishable from a file from a program's perspective, except that it's unidirectional. A program reads from a pipe the same way it reads from a file: through a file descriptor. FDs are absolutely core to Unix; the fact that files and pipes share the same I/O abstraction should tell you something.⁹
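A quick way to convince yourself of this is to write one reading loop and feed it both ways. In this sketch, count_lines is a hypothetical helper and data.txt is a throwaway file:

```shell
# count_lines is a hypothetical helper: it only ever reads file
# descriptor 0 and has no idea what is attached to it.
count_lines() {
    n=0
    while IFS= read -r _line; do
        n=$((n + 1))
    done
    echo "$n"
}

scratch=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$scratch/data.txt"

from_file=$(count_lines < "$scratch/data.txt")        # fd 0 is a regular file
from_pipe=$(cat "$scratch/data.txt" | count_lines)    # fd 0 is a pipe
echo "file: $from_file, pipe: $from_pipe"
rm -r "$scratch"
```

Both calls report the same count, and the function needed no special case for either source.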

The Windows world, lacking this tradition of simple text communications, makes do with heavyweight OOP interfaces via COM or .NET. If you need to automate such a program, you must also write a COM or .NET program. This is a fair bit more difficult than setting up a pipe on a Unix box.

Windows programs lacking these complicated programming APIs can only communicate through impoverished interfaces like the clipboard or File/Save followed by File/Open.

Long Answer, part 3: The Registry vs Configuration Files

The practical difference between the Windows registry and the Unix Way of system configuration also illustrates the benefits of the "everything is a file" philosophy.

On Unix type systems, I can look at system configuration information from the command line merely by examining files. I can change system behavior by modifying those same files. For the most part, these configuration files are just plain text files, which means I can use any tool on Unix to manipulate them that can work with plain text files.
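For instance, here is a small sketch that queries /etc/passwd with stock text tools. It assumes a typical Linux layout, with a root entry and one account per line:

```shell
# /etc/passwd is colon-separated text, so awk can query it directly.
# Field 1 is the login name and field 7 the login shell.
root_shell=$(awk -F: '$1 == "root" { print $7 }' /etc/passwd)
echo "root's shell: $root_shell"

# With one account per line, counting accounts is just counting lines.
accounts=$(grep -c '' /etc/passwd)
echo "$accounts entries in /etc/passwd"
```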

Scripting the registry is not nearly so easy on Windows.

The easiest method is to make your changes through the Registry Editor GUI on one machine, then blindly apply those changes to other machines with regedit via *.reg files. That isn't really "scripting," since it doesn't let you do anything conditionally: it's all or nothing.

If your registry changes need any amount of logic, the next easiest option is to learn PowerShell, which basically amounts to learning .NET system programming. It would be as if Unix had only Perl, and you had to do all ad hoc system administration through it. Now, I'm a Perl fan, but not everyone is. Unix lets you use any tool you happen to like, as long as it can manipulate plain text files.
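By way of contrast, here is a minimal sketch of a conditional configuration change on the Unix side, using nothing but awk, sed, and the shell. The demo.conf file and its MaxClients key are invented for the example and live in a scratch directory:

```shell
# demo.conf and its MaxClients key are made up for this example.
scratch=$(mktemp -d)
conf="$scratch/demo.conf"
printf 'Port 8080\nMaxClients 10\n' > "$conf"

# Only raise the limit if it is currently below 50; otherwise leave
# the file alone. This is the kind of logic a *.reg import cannot express.
current=$(awk '$1 == "MaxClients" { print $2 }' "$conf")
if [ "$current" -lt 50 ]; then
    # Write to a temporary file and rename, rather than relying on
    # sed -i, whose syntax differs between GNU and BSD sed.
    sed 's/^MaxClients .*/MaxClients 50/' "$conf" > "$conf.new" &&
        mv "$conf.new" "$conf"
fi

result=$(cat "$conf")
echo "$result"
rm -r "$scratch"
```

Swap awk and sed for whatever you prefer; the only requirement is that the tool can read and write lines of text.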

Footnotes: