Kernel is my friend — Part 1

Fun with file descriptors

At the start of 2018, I decided to learn more about Linux and systems programming. That means making friends with the kernel! So I started reading The Linux Programming Interface. This is the first in a series of blog posts that I will write as I work through the book and unravel the mysteries of Linux.

This post is an attempt to understand some parts of the chapter “File I/O: The Universal I/O Model” from The Linux Programming Interface.

Types of files

I had often heard the statement that “everything in Linux is a file”. What I learned is that there are several different types of files in Linux: regular files, directories, symbolic links, devices, pipes, and sockets.
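To see the different file types for myself, here is a minimal sketch in Python, assuming the standard os and stat modules. The helper name file_type and the example paths are my own choices for illustration, not something from the book.

```python
import os
import stat


def file_type(path):
    """Return a readable file type for path, without following symlinks."""
    # lstat reports on the link itself rather than on its target.
    mode = os.lstat(path).st_mode
    checks = [
        (stat.S_ISREG, "regular file"),
        (stat.S_ISDIR, "directory"),
        (stat.S_ISLNK, "symbolic link"),
        (stat.S_ISCHR, "character device"),
        (stat.S_ISBLK, "block device"),
        (stat.S_ISFIFO, "FIFO (named pipe)"),
        (stat.S_ISSOCK, "socket"),
    ]
    for predicate, name in checks:
        if predicate(mode):
            return name
    return "unknown"


if __name__ == "__main__":
    # Example paths only; they exist on most Linux systems.
    for path in ("/etc/passwd", "/tmp", "/dev/null"):
        if os.path.lexists(path):
            print(path, "->", file_type(path))
```

On a typical Linux machine I would expect the three example paths to show up as three different types, but that depends on the system.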

System calls

A system call is a way of asking the kernel to do some work on our behalf. For example, these are the four basic system calls for working with files in Linux:

open: Hey kernel, could you open a file for me, so that my process can do something with it?

read: Hey kernel, could you read this amazing piece of code that I saved in ~/cool/stuff/is/here.js?

write: Hey kernel, could you help me write my really long homework to this file?

close: Hey kernel, I am done with my homework, please close the file. We can go hang out and do some other fun stuff.
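The four calls above can be sketched in a few lines. This is only an illustration, assuming Python's os module, whose os.open, os.write, os.read and os.close are thin wrappers around the system calls of the same name. The function name write_then_read and the file name homework.txt are made up for the example.

```python
import os
import tempfile


def write_then_read(path, text):
    """Write text to path, then read it back, using open/write/read/close."""
    # open: ask the kernel for a descriptor; create or truncate the file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        data = text.encode("utf-8")
        # write: the kernel may accept fewer bytes than we offer, so loop.
        while data:
            written = os.write(fd, data)
            data = data[written:]
    finally:
        # close: hand the descriptor back to the kernel.
        os.close(fd)

    fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        while True:
            # read: an empty result means end of file.
            chunk = os.read(fd, 4096)
            if not chunk:
                break
            chunks.append(chunk)
    finally:
        os.close(fd)
    return b"".join(chunks).decode("utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "homework.txt")
        print(write_then_read(path, "my really long homework\n"), end="")
```

The integer that os.open returns is all the later calls need; the path is never mentioned again.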

File Descriptors

A file descriptor is a non-negative integer that a process uses to refer to an open file.

When a process calls the open() system call, it gets back a file descriptor, which it can then pass to other system calls such as read, write, and close.

One interesting thing that I failed to notice when I read the chapter for the first time (and probably the key takeaway) is this:

Each process has its own set of file descriptors

Let's try to understand this step by step.

One process, one file

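Let's write a program that just opens a file and prints its file descriptor value.

I built a binary and ran it on one file at a time, so each run is a single process opening a single file.

The file always gets the file descriptor value 3. When a program is started from a shell, descriptors 0, 1, and 2 are already taken by standard input, standard output, and standard error, so 3 is the first one that is free.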
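A minimal sketch of such a program, written in Python as an assumption on my part (any language that exposes open() would do). The function name open_and_report and the file name 1.js are just for illustration.

```python
import os
import tempfile


def open_and_report(path):
    """Open path read-only, print the descriptor, close it, return its number."""
    fd = os.open(path, os.O_RDONLY)
    try:
        print(f"pid {os.getpid()}: {path} -> fd {fd}")
        return fd
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "1.js")
        # The built-in open() is used only to create the file; the
        # with-block closes it again, so its descriptor is freed.
        with open(path, "w") as handle:
            handle.write("// placeholder\n")
        open_and_report(path)
```

When this runs as a fresh process with only the three standard descriptors open, I would expect it to report fd 3. The exact number depends on what else the process already has open.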

One process, multiple files

Now, let's try opening multiple files from a single process.

The files are assigned sequential integer values.

File descriptor numbers are handed out according to this simple rule:

SUSv3 specifies that if open() succeeds, it is guaranteed to use the lowest-numbered unused file descriptor for the process. — Kerrisk, Michael. The Linux Programming Interface: A Linux and UNIX System Programming Handbook (p. 73). No Starch Press. Kindle Edition.
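The lowest-numbered-unused rule can be checked with a small experiment. This is a sketch in Python, assuming a single-threaded process; the function name open_three_then_reuse and the file names are hypothetical.

```python
import os
import tempfile


def open_three_then_reuse(directory):
    """Open three files, close the first, and reopen it.

    Returns (fds, reused): the three original descriptor numbers and the
    number handed out after the first one was closed.
    """
    paths = []
    for name in ("1.js", "2.js", "3.js"):
        path = os.path.join(directory, name)
        with open(path, "w") as handle:  # creates the file, then closes it
            handle.write("// placeholder\n")
        paths.append(path)

    # Each open() happens while the earlier files are still open,
    # so every call has to pick a higher number than the last.
    fds = [os.open(path, os.O_RDONLY) for path in paths]

    # Free the lowest of the three and open a file again: the kernel
    # should hand back the number that was just released.
    os.close(fds[0])
    reused = os.open(paths[0], os.O_RDONLY)

    for fd in [reused] + fds[1:]:
        os.close(fd)
    return fds, reused


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        fds, reused = open_three_then_reuse(directory)
        print("opened:", fds)
        print("after closing the first, the next open() returned:", reused)
```

The reopened file gets the number that was just closed, not the next number in the sequence.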

Multiple processes, multiple files

This is the interesting part. What happens if the same files are opened by multiple processes at the same time? How are the file descriptors allocated then?

Numbering is done at the process level

This confirms the statement that we started with:

Each process has its own set of file descriptors

Process 9012 opened 1.js, which was assigned fd 3, and 2.js, which was assigned fd 4. Process 9015 is no different: it got exactly the same numbers, because 3 and 4 are the lowest-numbered unused file descriptors within each of those processes.

Now that we have come this far, the interesting question on my mind is what happens if two processes try to write to the same open file at the same time. (I guess that is worth answering another time!)
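The same experiment can be sketched in Python using the standard subprocess module. Two child interpreters open the same two files and keep them open until the parent has heard from both. The names CHILD and open_in_two_processes are hypothetical, and the process IDs will differ on every run.

```python
import os
import subprocess
import sys
import tempfile

# Code run by each child: open every path given on the command line,
# report the PID and descriptors, then wait until stdin is closed so
# the files stay open while the other child is running.
CHILD = (
    "import os, sys\n"
    "fds = [os.open(p, os.O_RDONLY) for p in sys.argv[1:]]\n"
    "print(os.getpid(), *fds, flush=True)\n"
    "sys.stdin.read()\n"
)


def open_in_two_processes(paths):
    """Return [(pid, [fd, ...]), ...] for two children opening paths."""
    children = [
        subprocess.Popen(
            [sys.executable, "-c", CHILD, *paths],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in range(2)
    ]
    reports = []
    for child in children:
        # Each child prints exactly one line once its files are open.
        pid, *fds = map(int, child.stdout.readline().split())
        reports.append((pid, fds))
    for child in children:
        child.stdin.close()  # lets the child's stdin.read() return
        child.wait()
        child.stdout.close()
    return reports


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        paths = []
        for name in ("1.js", "2.js"):
            path = os.path.join(directory, name)
            with open(path, "w") as handle:
                handle.write("// placeholder\n")
            paths.append(path)
        for pid, fds in open_in_two_processes(paths):
            print(f"process {pid}: fds {fds}")
```

Both children should print the same descriptor numbers even though they are different processes holding the same files open at the same moment.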