Delve into UNIX process creation

Explore the life cycle of a process running under the UNIX operating system

One of the many jobs assigned to system administrators is making sure that users' programs are running properly. This task is made more complex by the presence of other programs running concurrently on the system. For various reasons, these programs might fail, hang, or otherwise misbehave. Understanding how the UNIX® environment creates, manages, and destroys these jobs is a crucial step in building a more reliable system.

Developers also have a motivation to learn how the kernel manages processes, because applications that behave well with the rest of the system take fewer resources and don't anger the system administrators as frequently. An application that restarts constantly because it creates zombie processes (described later) is obviously not desirable. An understanding of the UNIX system calls that govern processes allows developers to write software that can run silently in the background, rather than needing a terminal session that must be kept on someone's screen.

The fundamental building block of managing these programs is the process. A process is a name given to a program being executed by the operating system. If you're familiar with the ps command, then you're familiar with a process listing, such as the one shown in Listing 1.

Listing 1. Output of a ps command

sunbox#ps -ef
     UID   PID  PPID  C    STIME TTY      TIME CMD
    root     0     0  0 20:15:23 ?        0:14 sched
    root     1     0  0 20:15:24 ?        0:00 /sbin/init
    root     2     0  0 20:15:24 ?        0:00 pageout
    root     3     0  0 20:15:24 ?        0:00 fsflush
  daemon   240     1  0 20:16:37 ?        0:00 /usr/lib/nfs/statd
...

The first three columns are important to this discussion. The first lists the user the process is running as, the second lists the ID of the process, and the third lists the ID of the parent of the process. The final column is a description of the process, usually the name of the binary that was launched. Each process is assigned an identifier, called the process identifier (PID). A process also has a parent, which in most cases is the process that launched it.

The presence of a parent PID (PPID) implies that one process is created by another process. The original process that kicks this off is called init, and it is always given a PID of 1. init is the first real process to be started by the kernel on bootup. It is the job of init to start up the rest of the system. init and other processes with a PPID of 0 belong to the kernel.

Using the fork system call

The fork(2) system call creates a new process. Listing 2 shows fork being used in a simple piece of C code.

Listing 2. A simple use of fork(2)

sunbox$ cat fork1.c
#include <unistd.h>
#include <stdio.h>

int main (void) {
    pid_t p; /* fork returns type pid_t */
    p = fork();
    printf("fork returned %d\n", p);
}
sunbox$ gcc fork1.c -o fork1
sunbox$ ./fork1
fork returned 0
fork returned 698

The code in fork1.c simply makes the call to fork and prints the integer result through a call to printf. Only one call is made, but the output is printed twice. This is because a new process is created within the call to fork. Two separate processes are now returning from the call. This is often described as "called once, returns twice."

The values returned by fork are interesting. One process gets 0; the other gets a non-zero value, which is the PID of the newly created process. The process that gets the 0 is called the child process, and the non-zero result goes to the original process, which is the parent process. You use the return value to determine which process is which. Because both processes resume execution at the same point in the code, the only practical differentiator is the return value from fork.

The rationale for the 0 and non-zero return values is that a child can always find out who its parent is through a call to getppid(2), but it is more difficult for a parent to find all its children. Thus, the parent is told about its new child, and the child can look up its parent, if needed.

With the return value of fork in mind, the code can now check to see if it is the parent or child process and act accordingly. Listing 3 shows a program that prints different output based on the result of the fork.

Listing 3. A more complete example using fork

sunbox$ cat fork2.c
#include <unistd.h>
#include <stdio.h>

int main (void) {
    pid_t p;

    printf("Original program, pid=%d\n", getpid());
    p = fork();
    if (p == 0) {
        printf("In child process, pid=%d, ppid=%d\n", getpid(), getppid());
    } else {
        printf("In parent, pid=%d, fork returned=%d\n", getpid(), p);
    }
}
sunbox$ gcc fork2.c -o fork2
sunbox$ ./fork2
Original program, pid=767
In child process, pid=768, ppid=767
In parent, pid=767, fork returned=768

In Listing 3, the PIDs are printed out at each step, and the code checks the return value from fork to determine which process is the parent and which is the child. Comparing the PIDs printed, you can see that the original process is the parent process (PID 767), and the child process (PID 768) knows who its parent is. Note how the child knows its parent through getppid and how the parent uses the result of fork to locate its child.

Now that you understand the method of duplicating a process, let's examine how to run a different program. fork is only half of the equation. The exec family of system calls runs the actual program.

Using the exec family of system calls

The job of exec is to replace the program running in the current process with a new program. Note the use of the word replace. Once you call exec, the code that was running is gone and the new program starts in its place, under the same PID. If you want to create a separate process, you must first fork, and then exec the new binary within the child process. Listing 4 shows such a scenario.

Listing 4. Run a different program by pairing fork with exec

sunbox$ cat exec1.c
#include <unistd.h>
#include <stdio.h>

int main (void) {
    /* Define a null terminated array of the command to run
       followed by any parameters, in this case none */
    char *arg[] = { "/usr/bin/ls", 0 };

    /* fork, and exec within child process */
    if (fork() == 0) {
        printf("In child process:\n");
        execv(arg[0], arg);
        printf("I will never be called\n");
    }
    printf("Execution continues in parent process\n");
}
sunbox$ gcc exec1.c -o exec1
sunbox$ ./exec1
In child process:
fork1.c exec1 fork2 exec1.c fork1 fork2.c
Execution continues in parent process

The code in Listing 4 first defines an array, with the first element being the path to the binary that is to be executed, and the remaining elements acting as the command-line parameters. The array is null-terminated per the man pages. After returning from the fork system call, the child process is instructed to execv the new binary.

The call to execv first takes a pointer to the name of the binary to be run, and then a pointer to the array of parameters that you declared earlier. The first element of the array is actually the name of the binary, so it's really the second element where the parameters start. Note that the child process never returns from a successful call to execv; the call returns only if the new binary could not be started. This shows that the running program is replaced by the new one.

There are other system calls to exec a process, and they differ by how they accept parameters and whether environment variables need to be passed. execv(2) is one of the simpler ways to replace the current image, because it doesn't need information about the environment and it uses the null-terminated array. Other options are execl(2), which takes the parameters in individual arguments, or execve(2), which also takes a null-terminated array of environment variables. The execvp(2) variant takes the same arguments as execv but searches the directories in PATH for the binary. To make matters more complicated, not all operating systems support all variants. The decision of which one to use depends on the platform, coding style, and whether you need to define any environment variables.

What happens to open files when fork is called?

When a process duplicates itself, the kernel makes a copy of all open file descriptors. A file descriptor is an integer that refers to an open file or device, and it is used for reading and writing. If a program has a file open before the fork, what happens if both processes try a read or a write? Will one process overwrite data from the other? Will two copies of the file be read? Listing 5 investigates this by opening up two files -- one for reading and one for writing -- and having both the parent and the child read and write simultaneously.

Listing 5. Two processes reading and writing to the same file simultaneously

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(void) {
    int fd_in, fd_out;
    char buf[1024];

    memset(buf, 0, 1024); /* clear buffer*/
    fd_in = open("/tmp/infile", O_RDONLY);
    /* O_CREAT needs a mode argument; 0644 is an illustrative choice */
    fd_out = open("/tmp/outfile", O_WRONLY|O_CREAT, 0644);

    fork(); /* It doesn't matter about child vs parent */

    while (read(fd_in, buf, 2) > 0) { /* Loop through the infile */
        printf("%d: %s", getpid(), buf);
        /* Write a line */
        sprintf(buf, "%d Hello, world!\n\r", getpid());
        write(fd_out, buf, strlen(buf));
        sleep(1);
        memset(buf, 0, 1024); /* clear buffer*/
    }
    sleep(10);
}
sunbox$ gcc fdtest1.c -o fdtest1
sunbox$ ./fdtest1
2875: 1
2874: 2
2875: 3
2874: 4
2875: 5
2874: 6
2874: 7
sunbox$ cat /tmp/outfile
2875 Hello, world!
2874 Hello, world!
2875 Hello, world!
2874 Hello, world!
2875 Hello, world!
2874 Hello, world!
2874 Hello, world!

Listing 5 is a simple program that opens two files and forks into the parent and child. Each process reads from the same input file descriptor (which refers to a text file containing the numbers 1 through 7, one per line), printing what was read along with the PID. After reading a line, each process writes its PID to the output file. The loop completes when there are no more characters to read from the input file.

The output of Listing 5 shows that as one process reads from the file, the file pointer is moved for both processes. Likewise, when either process writes to the output file, the data lands after whatever was written last, so neither process overwrites the other. This makes sense, because the kernel keeps track of the open file's information, including the current offset. The file descriptor is merely an identifier for the process.

You might also know that the standard output (the screen) is a file descriptor, too. This is duplicated during the fork, which is why both processes can write to the screen.

The death of a parent or child

Processes have to finish at some point. It's just a question of who dies first: the parent or the child.

Parent dies before child

If the parent process dies before its children, the orphaned children need to know who their parent process is. Recall that each process has a parent, and you can trace this family tree of sorts all the way back to PID 1, otherwise known as init. When a parent dies, init adopts all its children, as Listing 6 demonstrates.

Listing 6. Parent process dying before the child

#include <unistd.h>
#include <stdio.h>

int main(void) {
    int i;

    if (fork()) {
        /* Parent */
        sleep(2);
        _exit(0);
    }
    for (i=0; i < 5; i++) {
        printf("My parent is %d\n", getppid());
        sleep(1);
    }
}
sunbox$ gcc die1.c -o die1
sunbox$ ./die1
My parent is 2920
My parent is 2920
sunbox$ My parent is 1
My parent is 1
My parent is 1

In this example, the parent process calls fork, waits for two seconds, then exits. The child process continues by printing its parent PID for five seconds. You can see that the PPID changes to 1 as the parent dies. Also of interest is the return of the shell prompt. Because the child process is running in the background, control returns to the shell as soon as the parent dies.

Child dies before parent

Listing 7 shows the opposite of Listing 6 -- that is, the child dying before the parent. To better illustrate what's happening, nothing is printed from the process itself. Instead, the interesting information comes from the process listing.

Listing 7. Child process dies before the parent

sunbox$ cat die2.c
#include <unistd.h>
#include <stdio.h>

int main(void) {
    int i;

    if (!fork()) {
        /* Child exits immediately*/
        _exit(0);
    }
    /* Parent waits around for a minute */
    sleep(60);
}
sunbox$ gcc die2.c -o die2
sunbox$ ./die2 &
[1] 2934
sunbox$ ps -ef | grep 2934
    sean  2934  2885  0 21:43:05 pts/1    0:00 ./die2
    sean  2935  2934  0        - ?        0:00 <defunct>
sunbox$ ps -ef | grep 2934
[1]+  Exit 199                ./die2

die2 runs in the background using the & operator, and then a process listing is displayed, showing only the running process and its children. PID 2934 is the parent process, and PID 2935 is the one that is forked off and terminated immediately. Despite its untimely exit, the child process is still in the process table as a defunct process, otherwise known as a zombie. When the parent dies 60 seconds later, both processes are gone.

When a child process dies, its parent is notified with a signal called SIGCHLD. The exact mechanics of this are unimportant right now. What is important is that the parent must somehow acknowledge the death of the child. From the time the child dies until the time the parent acknowledges it, the child sits in a zombie state. The zombie is not running or consuming CPU cycles; it is merely taking up process table space. When the parent dies, its unacknowledged children can finally be reaped along with it. This means that if a parent never acknowledges its dead children, the only way you can get rid of the resulting zombies is by killing the parent. The best way to deal with zombies is to make sure they don't happen in the first place. The code in Listing 8 implements a signal handler to deal with the incoming SIGCHLD signal.

Listing 8. A signal handler in action

#include <signal.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>

void sighandler(int sig) {
    printf("In signal handler for signal %d\n", sig);
    /* wait() is the key to acknowledging the SIGCHLD */
    wait(0);
}

int main(void) {
    int i;

    /* Assign a signal handler to SIGCHLD */
    sigset(SIGCHLD, &sighandler);
    if (!fork()) {
        /* Child */
        _exit(0);
    }
    sleep(60);
}
sunbox$ gcc die3.c -o die3
sunbox$ ./die3 &
[1] 3116
sunbox$ In signal handler for signal 18
ps -ef | grep 3116
    sean  3116  2885  0 22:37:26 pts/1    0:00 ./die3

Listing 8 is slightly more complex than the previous example because of the sigset function, which installs a function as the handler for a signal. Whenever a handled signal is received by a process, the function assigned through sigset is called. For the SIGCHLD signal, the application must call the wait(3c) function to wait for the child process to exit. Because the process has exited already, the call returns immediately and serves as the acknowledgement of the child's death to the kernel. In reality, the parent might have more to do than simply acknowledge the signal. It might also need to clean up the child's data.

After you execute die3, the process listing is checked, and it shows that the child process has exited cleanly without leaving a zombie behind. The signal handler is called with a value of 18 (SIGCHLD), the child's exit is acknowledged, and the parent goes back to its sleep(60).

Summary

UNIX processes are created when one process calls fork, which splits the running executable into two. The process can then execute one of the system calls in the exec family, which replaces the current running image with the new one.

When the parent process dies, all its children are adopted by init, which is PID 1. If the child dies before the parent, a signal is sent to the parent, and then the child moves to a zombie state until the signal is acknowledged, or the parent process is killed.

Now that you understand how processes are created and destroyed, you're better equipped to deal with the processes running on your system, especially applications that make heavy use of multiple processes, such as Apache. Being able to follow the process tree for a particular process also lets you track any application back to the process that created it, should you need to do some troubleshooting.
