3 Processes

A process is a program executing on the operating system. It consists of a program (machine code) and a state of the program (current control point, variable values, call stack, open file descriptors, etc.).

This section presents the Unix system calls to create new processes and make them run other programs.

3.1 Creation of processes

The system call fork creates a process.

val fork : unit -> int

The new child process is a nearly perfect clone of the parent process that called fork. Both processes execute the same code, are initially at the same control point (the return from fork), have the same values in all variables, have identical call stacks, and hold open the same file descriptors to the same files. The only thing that distinguishes the two processes is the return value of fork: zero in the child process, and a non-zero integer in the parent. By checking the return value of fork, a program can thus determine whether it is in the parent process or the child and behave accordingly:

match fork () with
| 0 -> (* code run only in the child *)
| pid -> (* code run only in the parent *)

The non-zero integer returned by fork in the parent process is the process id of the child. The process id is used by the kernel to uniquely identify each process. A process can obtain its process id by calling getpid.
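As a minimal sketch of how these ids relate, the following program forks once and prints the ids seen on each side. It assumes the Unix library is linked. The name fork_ids is made up for this illustration, and getppid and waitpid (the latter is presented in section 3.3) are used only to make the output informative and to reap the child.

```ocaml
(* Illustrative sketch; fork_ids is a hypothetical name, not part of Unix. *)
let fork_ids () =
  match Unix.fork () with
  | 0 ->
      (* Child: getpid gives our own id, getppid the id of our parent. *)
      Printf.printf "child:  pid = %d, parent = %d\n%!"
        (Unix.getpid ()) (Unix.getppid ());
      exit 0
  | child_pid ->
      Printf.printf "parent: pid = %d, fork returned %d\n%!"
        (Unix.getpid ()) child_pid;
      (* Reap the child; waitpid reports the id of the process it caught. *)
      let reaped_pid, _status = Unix.waitpid [] child_pid in
      (child_pid, reaped_pid)

let () =
  let child_pid, reaped_pid = fork_ids () in
  (* The id returned by fork is the id of the process that was reaped. *)
  assert (child_pid = reaped_pid)
```

The order of the two printed lines is not fixed, since both processes run concurrently after the fork.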

The child process is initially in the same state as the parent process (same variable values, same open file descriptors). This state is not shared between the parent and the child, but merely duplicated at the moment of the fork. For example, if a variable is bound to a reference before the fork, a copy of that reference and its current contents is made at the moment of the fork; after the fork, each process independently modifies its “own” reference without affecting the other process.
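The duplication of references can be observed with a small experiment. The sketch below is only an illustration under stated assumptions: the names counter and copy_demo are invented here, and the child reports its value through its exit code, which the parent reads with waitpid (presented in section 3.3).

```ocaml
(* Hypothetical demonstration; counter and copy_demo are not from the text. *)
let counter = ref 0

let copy_demo () =
  counter := 1;
  match Unix.fork () with
  | 0 ->
      (* Child: updates its own copy of the reference, then reports the
         new value through its exit code. *)
      counter := !counter + 41;
      exit !counter
  | child_pid ->
      let _, status = Unix.waitpid [] child_pid in
      (* Parent: its reference still holds the value set before the fork. *)
      (!counter, status)

let () =
  match copy_demo () with
  | parent_value, Unix.WEXITED child_value ->
      Printf.printf "parent sees %d, child saw %d\n%!" parent_value child_value
  | _, _ -> prerr_endline "child did not exit normally"
```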

Similarly, the open file descriptors are copied at the moment of the fork: one may be closed and the other kept open. On the other hand, the two descriptors designate the same entry in the file table (residing in system memory) and share their current position: if one process reads and then the other, each will read a different part of the file; likewise, changes in the read/write position by one process with lseek are immediately visible to the other.
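The shared position can be checked with a short experiment, sketched here under a few assumptions: the function name shared_offset_demo and the temporary file are invented for the illustration, and the parent uses waitpid (section 3.3) to be sure the child has read first.

```ocaml
(* Hypothetical demonstration of the read position shared across fork. *)
let shared_offset_demo () =
  (* Prepare a small scratch file containing ten known bytes. *)
  let path = Filename.temp_file "fork_offset" ".txt" in
  let oc = open_out path in
  output_string oc "helloworld";
  close_out oc;
  let fd = Unix.openfile path [Unix.O_RDONLY] 0 in
  let buf = Bytes.create 5 in
  let result =
    match Unix.fork () with
    | 0 ->
        (* Child: consume the first five bytes through its copy of fd. *)
        ignore (Unix.read fd buf 0 5);
        exit 0
    | child_pid ->
        ignore (Unix.waitpid [] child_pid);
        (* Parent: the read starts where the child's read stopped. *)
        let n = Unix.read fd buf 0 5 in
        Bytes.sub_string buf 0 n
  in
  Unix.close fd;
  Sys.remove path;
  result

let () = print_endline (shared_offset_demo ())
```

Closing fd in the child would not have affected the parent's descriptor; only the position is shared.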

3.2 Complete Example: the command leave

The command leave hhmm exits immediately, but forks a background process which, at the time hhmm, reports that it is time to leave.

 1 open Unix;;
 2
 3 let leave () =
 4   let hh = int_of_string (String.sub Sys.argv.(1) 0 2)
 5   and mm = int_of_string (String.sub Sys.argv.(1) 2 2) in
 6   let now = localtime(time ()) in
 7   let delay = (hh - now.tm_hour) * 3600 + (mm - now.tm_min) * 60 in
 8
 9   if delay <= 0 then begin
10     print_endline "Hey! That time has already passed!";
11     exit 0
12   end ;
13   if fork () <> 0 then exit 0;
14   sleep delay;
15   print_endline "\007\007\007Time to leave!";
16   exit 0;;
17
18 handle_unix_error leave ();;

The program begins with a rudimentary parsing of the command line to extract the time provided. It then calculates the delay in seconds (line 7). The time call returns the current date, in seconds since the epoch (January 1st 1970, midnight). The function localtime splits this duration into years, months, days, hours, minutes and seconds. The program then creates a new process using fork. The parent process (for which fork returns a non-zero integer) terminates immediately. The shell which launched leave thereby returns control to the user. The child process (for which fork returns zero) continues executing. It does nothing during the indicated time (the call to sleep), then displays its message and terminates.
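To make the delay computation concrete, here is the same formula isolated in a pure function and applied to two sample times. The function delay_seconds and its labelled arguments are hypothetical names introduced for this sketch; leave itself computes the value inline.

```ocaml
(* Hypothetical helper: the delay formula of leave, taken out of context.
   hh and mm are the requested time, now_hour and now_min the current time. *)
let delay_seconds ~hh ~mm ~now_hour ~now_min =
  (hh - now_hour) * 3600 + (mm - now_min) * 60

let () =
  (* Asked at 14:50 to leave at 15:30:
     1 * 3600 + (30 - 50) * 60 = 3600 - 1200 = 2400 seconds (40 minutes). *)
  assert (delay_seconds ~hh:15 ~mm:30 ~now_hour:14 ~now_min:50 = 2400);
  (* Asked at 16:00 to leave at 15:30: the result is negative, which is
     the case in which leave prints its "already passed" message. *)
  assert (delay_seconds ~hh:15 ~mm:30 ~now_hour:16 ~now_min:0 = -1800)
```

Since the formula ignores the seconds field of the current time, the message may appear up to 59 seconds after the requested minute begins. A time on the following day is treated as already passed.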

3.3 Awaiting the termination of a process

The system call wait waits for one of the child processes created by fork to terminate and returns information about how it terminated. It provides a parent-child synchronization mechanism and a very rudimentary form of communication from the child to the parent.

val wait : unit -> int * process_status
val waitpid : wait_flag list -> int -> int * process_status

The primitive system call is waitpid and the function wait () is merely a shortcut for the expression waitpid [] (-1). The behavior of waitpid [] p depends on the value of p:

If p > 0, it awaits the termination of the child with id equal to p.

If p = 0, it awaits any child with the same group id as the calling process.

If p = −1, it awaits any child process.

If p < −1, it awaits a child process with group id equal to -p.

The first component of the result is the process id of the child caught by wait. The second component of the result is a value of type process_status:

WEXITED r The child process terminated normally via exit or by reaching the end of the program; r is the return code (the argument passed to exit).

WSIGNALED s The child process was killed by a signal (ctrl-C, kill, etc., see chapter 4 for more information about signals); s identifies the signal.

WSTOPPED s The child process was halted by the signal s; this occurs only in very special cases where a process (typically a debugger) is currently monitoring the execution of another (by calling ptrace).

If one of the child processes has already terminated by the time the parent calls wait, the call returns immediately. Otherwise, the parent process blocks until some child process terminates (a behavior called “rendezvous”). To wait for n child processes, one must call wait n times.
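The rule of one wait per child can be sketched as follows. The function spawn_and_collect is a hypothetical name used only for this illustration: it forks one child per exit code, then calls wait exactly as many times as there are children and returns the collected codes in sorted order, since the children may terminate in any order.

```ocaml
(* Hypothetical illustration: one call to wait for each forked child. *)
let spawn_and_collect codes =
  let pids =
    List.map
      (fun code ->
        match Unix.fork () with
        | 0 -> exit code (* child: terminate at once with its code *)
        | pid -> pid)
      codes
  in
  (* Call wait once per child; each call reaps one terminated child. *)
  List.map
    (fun _ ->
      match Unix.wait () with
      | _, Unix.WEXITED code -> code
      | _, _ -> failwith "spawn_and_collect: abnormal termination")
    pids
  |> List.sort compare

let () =
  spawn_and_collect [3; 1; 2]
  |> List.iter (fun code -> Printf.printf "a child exited with code %d\n%!" code)
```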

The function waitpid accepts two optional flags in its first argument. The flag WNOHANG indicates not to wait if there is a child that matches the request but has not yet terminated. In that case, the first result is 0 and the second is undefined. The flag WUNTRACED makes waitpid also report child processes that have been stopped by the signal sigstop. The function raises the exception Unix_error with the error code ECHILD if no child processes match p (in particular, if p is -1 and the current process has no more children).
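A non-blocking use of waitpid might look like the sketch below, in which the parent polls for termination instead of blocking. The name poll_child, the exit code 7, and the sleep intervals are arbitrary choices made for the illustration; sleepf is assumed to be available in the Unix library (OCaml 4.03 or later).

```ocaml
(* Hypothetical illustration of polling with WNOHANG. *)
let poll_child () =
  match Unix.fork () with
  | 0 ->
      (* Child: pretend to work for a short while, then exit. *)
      Unix.sleepf 0.2;
      exit 7
  | child_pid ->
      let rec loop polls =
        match Unix.waitpid [Unix.WNOHANG] child_pid with
        | 0, _ ->
            (* The child has not terminated yet; the status is meaningless.
               The parent is free to do other work before polling again. *)
            Unix.sleepf 0.05;
            loop (polls + 1)
        | _, status -> (polls, status)
      in
      loop 0

let () =
  let polls, status = poll_child () in
  match status with
  | Unix.WEXITED code ->
      Printf.printf "child exited with code %d after %d empty polls\n%!"
        code polls
  | _ -> prerr_endline "child did not exit normally"
```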

Example The function fork_search below performs a linear search in an array with two processes. It relies on the function simple_search to perform the linear search.

 1 open Unix;;
 2 exception Found;;
 3
 4 let simple_search cond v =
 5   try
 6     for i = 0 to Array.length v - 1 do
 7       if cond v.(i) then raise Found
 8     done ;
 9     false
10   with Found -> true ;;
11
12 let fork_search cond v =
13   let n = Array.length v in
14   match fork () with
15   | 0 ->
16       let found = simple_search cond (Array.sub v (n/2) (n-n/2)) in
17       exit ( if found then 0 else 1)
18   | _ ->
19       let found = simple_search cond (Array.sub v 0 (n/2)) in
20       match wait () with
21       | (pid, WEXITED retcode) -> found || (retcode = 0)
22       | (pid, _) -> failwith "fork_search";;

After the fork, the child process traverses the upper half of the table, and exits with the return code 0 if it found an element satisfying the predicate cond, or 1 otherwise (lines 16 and 17). The parent process traverses the lower half of the table, then calls wait to synchronize with the child process (lines 19 and 20). If the child terminated normally, the parent combines the child's return code with the boolean result of its own search in the lower half of the table (line 21). Otherwise, something horrible happened, and the function fork_search fails (line 22).

* * *

In addition to the synchronization between processes, the wait call also ensures recovery of all resources used by the child processes. When a process terminates, it moves into a “zombie” state, where most, but not all, of its resources (memory, etc.) have been freed. It continues to occupy a slot in the process table to transmit its return value to the parent via the wait call. Once the parent calls wait, the zombie process is removed from the process table. Since this table is of fixed size, it is important to call wait on each forked process to avoid leaks.

If the parent process terminates before the child, the child is given the process number 1 (usually init) as parent. This process contains an infinite loop of wait calls, and will therefore make the child process disappear once it finishes. This leads to the useful “double fork” technique if you cannot easily call wait on each process you create (because you cannot afford to block on termination of the child process, for example).

match fork () with
| 0 -> if fork () <> 0 then exit 0;
       (* do whatever the child should do *)
| _ -> ignore (wait ());
       (* do whatever the parent should do *)

The child terminates via exit just after the second fork. The grandchild becomes an orphan, and is adopted by init. In this way, it leaves no zombie processes. The parent immediately calls wait to reap the child. This wait will not block for long since the child terminates very quickly.
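One possible way to package the double fork as a reusable function is sketched below. The name detach and its task argument are assumptions made for this illustration; the function returns the status of the short-lived intermediate child, which is expected to be WEXITED 0.

```ocaml
(* Hypothetical wrapper around the double fork technique. *)
let detach (task : unit -> unit) : Unix.process_status =
  match Unix.fork () with
  | 0 ->
      (* Intermediate child: fork again and terminate at once, so that
         the grandchild is orphaned and adopted by process 1. *)
      if Unix.fork () <> 0 then exit 0;
      (* Grandchild: run the task, then terminate. *)
      task ();
      exit 0
  | child_pid ->
      (* Parent: reap the intermediate child; this returns quickly. *)
      snd (Unix.waitpid [] child_pid)

let () =
  let status = detach (fun () -> Unix.sleepf 0.1) in
  assert (status = Unix.WEXITED 0)
```

The parent never waits for the grandchild, so the duration of task does not delay it.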





3.4 Launching a program

The system calls execve, execv, and execvp launch a program within the current process. Except in case of error, these calls never return: they halt the progress of the current program and switch to the new program.

val execve : string -> string array -> string array -> unit
val execv : string -> string array -> unit
val execvp : string -> string array -> unit

The first argument is the name of the file containing the program to execute. In the case of execvp, this name is looked for in the directories of the search path (specified in the environment variable PATH).

The second argument is the array of command line arguments with which to execute the program; this array will be the Sys.argv array of the executed program.

In the case of execve, the third argument is the environment given to the executed program; execv and execvp give the current environment unchanged.

The calls execve, execv, and execvp never return a result: either everything works without errors and the process starts the requested program, or an error occurs (file not found, etc.) and the call raises the exception Unix_error in the calling program.
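The role of the environment argument of execve can be sketched with a child that runs a shell test on a variable. This is an illustration under assumptions: it supposes that /bin/sh exists on the system, and the function child_sees_greeting and the variable GREETING are names invented for the example. The exit code 127 for a failed exec is likewise an arbitrary choice.

```ocaml
(* Hypothetical illustration: the executed program sees exactly the
   environment passed as third argument to execve. *)
let child_sees_greeting env =
  match Unix.fork () with
  | 0 ->
      (* Child: replace ourselves with a shell that tests $GREETING.
         execve only comes back, by raising Unix_error, on failure. *)
      (try
         Unix.execve "/bin/sh"
           [| "sh"; "-c"; "test \"$GREETING\" = hello" |]
           env
       with Unix.Unix_error _ -> ());
      exit 127
  | child_pid ->
      (* Parent: the shell exits with 0 exactly when the test succeeded. *)
      snd (Unix.waitpid [] child_pid) = Unix.WEXITED 0

let () =
  Printf.printf "with GREETING=hello: %b\n%!"
    (child_sees_greeting [| "GREETING=hello" |]);
  Printf.printf "with empty environment: %b\n%!"
    (child_sees_greeting [||])
```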

Example The following three forms are equivalent:

execve "/bin/ls" [|"ls"; "-l"; "/tmp"|] (environment ())
execv "/bin/ls" [|"ls"; "-l"; "/tmp"|]
execvp "ls" [|"ls"; "-l"; "/tmp"|]

* * *

Example Here is a “wrapper” around the command grep which adds the option -i (to ignore case) to the list of arguments:

open Sys;;
open Unix;;
let grep () =
  execvp "grep"
    (Array.concat
       [ [|"grep"; "-i"|];
         (Array.sub Sys.argv 1 (Array.length Sys.argv - 1)) ])
;;
handle_unix_error grep ();;

* * *

Example Here’s a “wrapper” around the command emacs which changes the terminal type:

open Sys;;
open Unix;;
let emacs () =
  execve "/usr/bin/emacs" Sys.argv
    (Array.concat [ [|"TERM=hacked-xterm"|];
                    (environment ()) ]);;
handle_unix_error emacs ();;

* * *

The process which calls exec is the same one that executes the new program. As a result, the new program inherits some features of the execution environment of the program which called exec:

the same process id and parent process

same standard input, standard output and standard error

same ignored signals (see chapter 4)
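Because the process id is preserved across exec, a parent that forked a child can still wait for it after the child has switched to another program. The sketch below illustrates this under assumptions: it supposes that /bin/sh exists, and run_sh_exit is a name invented for the example.

```ocaml
(* Hypothetical illustration: the status of the executed program is
   reported under the process id that fork returned. *)
let run_sh_exit code =
  match Unix.fork () with
  | 0 ->
      (* Child: become a shell whose only job is to exit with [code]. *)
      (try
         Unix.execv "/bin/sh" [| "sh"; "-c"; Printf.sprintf "exit %d" code |]
       with Unix.Unix_error _ -> ());
      (* Reached only if the exec failed. *)
      exit 127
  | child_pid ->
      let reaped_pid, status = Unix.waitpid [] child_pid in
      (reaped_pid = child_pid, status)

let () =
  match run_sh_exit 3 with
  | same_pid, Unix.WEXITED code ->
      Printf.printf "same pid: %b, exit code: %d\n%!" same_pid code
  | _, _ -> prerr_endline "the shell did not exit normally"
```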

3.5 Complete example: a mini-shell

The following program is a simplified command interpreter: it reads lines from standard input, breaks them into words, launches the corresponding command, and repeats until the end of file on the standard input. We begin with the function which splits a string into a list of words. Please, no comments on this horror.

open Unix;;
open Printf;;

let split_words s =
  let rec skip_blanks i =
    if i < String.length s && s.[i] = ' '
    then skip_blanks (i+1)
    else i in
  let rec split start i =
    if i >= String.length s then
      [String.sub s start (i-start)]
    else if s.[i] = ' ' then
      let j = skip_blanks i in
      String.sub s start (i-start) :: split j j
    else
      split start (i+1) in
  Array.of_list (split 0 0);;
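For comparison, a shorter variant can be sketched with the standard library, assuming String.split_on_char is available (OCaml 4.04 or later). The name split_words_alt is invented here, and the behavior is not identical to split_words: this variant drops the empty words that split_words produces for leading or trailing blanks.

```ocaml
(* Hypothetical alternative to split_words, not the version used by
   the mini-shell: split on spaces and discard empty words. *)
let split_words_alt s =
  String.split_on_char ' ' s
  |> List.filter (fun word -> word <> "")
  |> Array.of_list

let () =
  assert (split_words_alt "ls -l   /tmp" = [| "ls"; "-l"; "/tmp" |]);
  assert (split_words_alt "  ls " = [| "ls" |]);
  (* An empty line yields an empty array, so a caller must check the
     length before reading element 0. *)
  assert (split_words_alt "" = [||])
```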

We now move on to the main loop of the interpreter.

let exec_command cmd =
  try execvp cmd.(0) cmd
  with Unix_error(err, _, _) ->
    printf "Cannot execute %s : %s\n%!"
      cmd.(0) (error_message err);
    exit 255

let print_status program status =
  match status with
  | WEXITED 255 -> ()
  | WEXITED status ->
      printf "%s exited with code %d\n%!" program status
  | WSIGNALED signal ->
      printf "%s killed by signal %d\n%!" program signal
  | WSTOPPED signal ->
      printf "%s stopped (???)\n%!" program;;

The function exec_command executes a command and handles errors. The return code 255 indicates that the command could not be executed. (This is not a standard convention; we just hope that few commands terminate with a return code of 255.) The function print_status decodes and prints the status information returned by a process, ignoring the return code of 255.

let minishell () =
  try
    while true do
      let cmd = input_line Stdlib.stdin in
      let words = split_words cmd in
      match fork () with
      | 0 -> exec_command words
      | pid_son ->
          let pid, status = wait () in
          print_status "Program" status
    done
  with End_of_file -> ()
;;

handle_unix_error minishell ();;

Each time through the loop, we read a line from stdin with the function input_line. This function raises the End_of_file exception when the end of file is reached, causing the loop to exit. We split the line into words, and then call fork. The child process uses exec_command to execute the command. The parent process calls wait to wait for the command to finish and prints the status information returned by wait.