When writing scripts, it is good practice to build in a controlled exit so that the script can deal with failed conditions during processing. Consider a script that copies or replaces certain files in a file system. You could check that each copy completes successfully before moving on to the next task in the script, and exit if a problem occurs. This allows the system administrator to see where the script failed, so that immediate action can be taken to back out the process or complete the task another way.

Listing 1 below contains basic conditional code that achieves this goal, using a file copy process as an example. First, a test makes sure that the file run_pj actually exists. If the file is not present, the script exits, as no more processing should be carried out. If it is present, a backup copy of the file is taken. If that copy is unsuccessful, the script exits with a message detailing the error. The script then checks that the updated file is present in the rollout directory and copies it over the original file. If either step fails, the script exits.

Listing 1. Example_replace

    #!/bin/bash
    #
    proj_dir=/opt/pcake/bin

    # check file is present
    if [ ! -f "$proj_dir/run_pj" ]
    then
        echo "$proj_dir/run_pj not present...exiting"
        exit 1
    fi

    # make a backup copy
    cp -p "$proj_dir/run_pj" "$proj_dir/run_pj.24042011"
    if [ $? != 0 ]
    then
        echo "$proj_dir/run_pj no backup made...exiting"
        exit 1
    fi

    # copy over updated file
    if [ ! -f "/opt/dump/rollout/run_pj" ]
    then
        echo "/opt/dump/rollout/run_pj not present...exiting"
        exit 1
    fi

    cp -p /opt/dump/rollout/run_pj "$proj_dir/run_pj"
    if [ $? != 0 ]
    then
        echo "$proj_dir/run_pj was not copied..exiting"
        exit 1
    fi

In this demonstration, I am using bash v3.2. The bash shell can be downloaded from the AIX Toolbox; see the Related topics section.

Using the approach in Listing 1, the script exits if there is any error in the copy process, thus not allowing the script to carry on processing if there is an error. Clearly, any error would be fixed before the script is run again.
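The repeated check-and-exit pattern in Listing 1 can be folded into a small helper. The following is a minimal sketch rather than part of the original listing: the function names die and backup_file are hypothetical, and the example works in a scratch directory created with mktemp (assumed to be available) so that it can be tried without touching live files.

```shell
#!/bin/bash
# Sketch only: die and backup_file are hypothetical helper names.

# Print a message to stderr and exit with a non-zero status.
die() {
    echo "$*...exiting" >&2
    exit 1
}

# Back up a file, testing the exit status of cp directly
# instead of inspecting $? afterwards.
backup_file() {
    local src=$1 dest=$2
    [ -f "$src" ] || die "$src not present"
    cp -p "$src" "$dest" || die "$src no backup made"
}

# Try it in a scratch directory rather than a live file system.
work_dir=$(mktemp -d)
echo "v1" > "$work_dir/run_pj"
backup_file "$work_dir/run_pj" "$work_dir/run_pj.bak"
echo "backup made: $work_dir/run_pj.bak"
```

Testing the command itself with || keeps the check next to the command it belongs to, which makes it harder to test the wrong exit status by accident.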

Another technique to check for errors and exit is to use the set option:

    set -e

With set -e, the script exits as soon as a command fails (that is, returns a non-zero exit status), unless the command is part of a loop or if condition, or of a && or || list. The example shown in Listing 2 below tries to copy a non-existent file with set -e in effect. When the cp command fails, the script exits. Notice that when you run the script, the if statement that checks the last exit status is never reached, because the script has already exited on the non-zero return status of the cp command.

Listing 2. Example_fail

    #!/bin/bash
    set -e
    proj_dir=/opt/rollout/v12

    # copy a non-existent file
    cp "$proj_dir/go_sup" /usr/local/bin/go_sup
    if [ $? != 0 ]
    then
        echo "could not copy $proj_dir/go_sup to /usr/local/bin/"
        exit 1
    fi

    $ cp_test
    cp: /opt/rollout/v12/go_sup: A file or directory in the path name does not exist.
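The exceptions to set -e are easy to trip over, so a small experiment can help. This sketch is not from the original listings: it runs each case in a child bash process so that a failure does not end the calling shell, and false stands in for any failing command.

```shell
#!/bin/bash
# Each case runs in its own child shell so that set -e cannot end this script.

# A plain failing command stops the child: "after" is never printed.
plain=$(bash -c 'set -e; false; echo after') || true

# A failure on the left of || is exempt, so the child carries on.
guarded=$(bash -c 'set -e; false || echo "handled"; echo after') || true

# A failure tested by if is also exempt.
tested=$(bash -c 'set -e; if false; then echo yes; else echo "checked"; fi; echo after') || true

echo "plain:   [$plain]"
echo "guarded: [$guarded]"
echo "tested:  [$tested]"
```

Only the first case produces no output, which shows that a command you test yourself is left for you to handle.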

Generating syslog messages

Using the logger command allows the shell and scripts to write messages to the system messages file via the syslogd service. You can use it within a script to log errors or the completion of your processes, so that the messages are viewable by all who interrogate the messages file. This keeps you and other system administrators informed of events generated by your scripts.

The most basic format of the command is:

    logger -p priority message

Where -p sets the priority of the message: a syslog level such as notice, optionally prefixed with a facility (for example, user.notice).

For example, the following logger command logs the name of the calling script (“rollout” in this example) together with the message “something has happened”.

    logger -p notice "$(basename "$0") - something has happened"

The following output appears in /var/adm/messages:

    Apr 5 13:20:30 uk01wrs6008 user:notice dxtans: rollout - something has happened
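If a script logs in several places, the logger call can be wrapped in a function. The following is a minimal sketch under stated assumptions: log_event is a hypothetical name, the user facility is assumed, and the message is also echoed to stderr so that it is visible when the script is run by hand.

```shell
#!/bin/bash
# Sketch only: log_event is a hypothetical wrapper around logger.

# Same result as basename "$0" for an ordinary script path.
script_name=${0##*/}

# log_event LEVEL MESSAGE...
# Sends the message to syslog through logger when it is available,
# and always echoes it to stderr as well.
log_event() {
    local level=$1
    shift
    local text="$script_name - $*"
    if command -v logger >/dev/null 2>&1; then
        logger -p "user.$level" "$text" 2>/dev/null || true
    fi
    echo "[$level] $text" >&2
}

log_event notice "something has happened"
```

Keeping the script name in one place means every message is tagged the same way, which makes the messages file easier to search.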

Getting a signal

The two examples in Listing 1 and Listing 2 show ways of checking the result of a command after it has run. However, what happens if a script gets terminated during its execution? Scripts can be killed or terminated using the signal mechanism (note that not all signals terminate the process). A signal that is sent to a running process interrupts that process to force some sort of event, typically some action. Signals can come from, but are not restricted to:

The kernel or user space via some system event.

The actual process itself via the keyboard (Ctrl-C).

An illegal instruction from within the process.

Another process via another user sending a kill to your process.

A notification of a change in the state of a required device.

To view the current list of signals, use the kill -l (the letter l) command. The list is presented in the form (signal number, signal name):

    $ kill -l
     1) SIGHUP    2) SIGINT    3) SIGQUIT   4) SIGILL
     5) SIGTRAP   6) SIGABRT   7) SIGEMT    8) SIGFPE
     9) SIGKILL  10) SIGBUS   11) SIGSEGV  12) SIGSYS
    ....
    ....

To view the signals and their default actions (on an AIX machine), view the file:

    $ cat /usr/include/sys/signal.h | more
    .....
    .....
    #define SIGHUP   1 /* hangup, generated when terminal disconnects */
    #define SIGINT   2 /* interrupt, generated from terminal special char */
    #define SIGQUIT  3 /* (*) quit, generated from terminal special char */
    #define SIGILL   4 /* (*) illegal instruction (not reset when caught) */
    #define SIGTRAP  5 /* (*) trace trap (not reset when caught) */
    #define SIGABRT  6 /* (*) abort process */
    .....
    .....
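The kill built-in in bash can also translate a single signal number or name, which is handy when a log or an exit status only gives you a number. A short sketch, assuming bash and the usual numbering in which SIGINT is 2 and SIGTERM is 15:

```shell
#!/bin/bash
# Convert between signal numbers and names with the bash kill built-in.

kill -l 15          # number to name
kill -l TERM        # name to number

# An exit status above 128 means "terminated by signal (status - 128)".
status=130
echo "status $status corresponds to SIG$(kill -l $((status - 128)))"
```

The names are printed without the SIG prefix, which is why the last line adds it back.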

I have received a signal. Now what?

When a signal has been received by the script, the script can do one of three actions:

Ignore it and do nothing. This is probably what most scripts do without the script authors realising it.

Catch the signal using trap and take appropriate action.

Take the default action.

All the above is true except for the following signals:

SIGKILL (signal 9)

SIGSTOP (signal 17)

SIGCONT (signal 19)

SIGKILL and SIGSTOP cannot be caught or ignored and always use the default action; SIGKILL always kills the process. SIGCONT always resumes a stopped process, even if a trap has been set for it. Looking at the listing from the /usr/include/sys/signal.h file, we see the default action for each signal. For instance, SIGINT (signal 2) is an interrupt generated from the terminal; typically, this is the keyboard. Each defined system signal has a different action. There are also two user-defined signals: SIGUSR1 (signal 30) and SIGUSR2 (signal 31). These can be used by the script author for bespoke signals. Be sure to view the signal.h file for all the default actions.

It is up to the author of the script to decide what action, if any, to take when a signal is received.
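As an illustration of a bespoke signal, a script might treat SIGUSR1 as a request to report its progress. The following is a minimal sketch under that assumption; report_progress and the counters are hypothetical names, and the script signals itself only so that the example is self-contained.

```shell
#!/bin/bash
# Sketch only: use SIGUSR1 as a "report progress" request.

# BASHPID (bash 4 and later) is the PID of the current shell; fall back to $$.
self=${BASHPID:-$$}
processed=0
reports=0

# Handler: print how far the (pretend) processing has got.
report_progress() {
    reports=$((reports + 1))
    echo "processed $processed items so far"
}
trap 'report_progress' SIGUSR1

for item in a b c; do
    processed=$((processed + 1))   # stands in for real work
done

# Normally another terminal would run: kill -USR1 <pid of this script>
kill -USR1 "$self"
```

Because the handler does not call exit, the script carries on after reporting.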

Common signals are:

SIGHUP – hangup or exit a foreground running process from a terminal

SIGINT – Ctrl-C from the keyboard

SIGQUIT – Ctrl-\ from the keyboard

SIGTERM – software termination signal

When receiving a signal, actions that can take place are:

cleaning up files

prompting the users if the script should be actually terminated

ignoring the actual signal

carrying on processing

Catching a signal

To catch a signal that is sent to your process, use the built-in trap command. When a signal is caught, the command currently being executed attempts to complete before the trap command takes over. If the signal is a SIGKILL, termination is immediate. If you do not trap a signal, its default action always takes place. For example, if you only trap SIGINT but do nothing about SIGQUIT, then if your process gets a SIGQUIT, the default action takes place (most likely an untidy termination of your script, which you probably do not want).

The format of the trap command is:

    trap 'command_list' signals

Where command_list is a list of commands, which can include a function, to run upon receiving one of the signals in the signals list, and signals is the list of signals to catch or trap.

To ignore a signal, use two single quotes in place of the command_list:

    trap '' signals

To reset a trap use:

    trap - signals

Where signals is the signal list.
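The three forms can be tried side by side. This is a minimal sketch, not from the original listings: it uses SIGUSR2 so that nothing interferes with the keyboard signals, and the script signals itself to show the effect of each form.

```shell
#!/bin/bash
# Sketch only: set, ignore, and reset a trap on SIGUSR2.

self=${BASHPID:-$$}     # PID of this shell ($$ on bash older than 4)
caught=0

trap 'caught=$((caught + 1))' SIGUSR2   # catch: run a command list
kill -USR2 "$self"
echo "after catching: caught=$caught"

trap '' SIGUSR2                          # ignore: empty command list
kill -USR2 "$self"
echo "after ignoring: caught=$caught"

trap - SIGUSR2                           # reset: back to the default action
trap -p SIGUSR2                          # prints nothing once the trap is reset
```

No signal is sent after the reset, because the default action for SIGUSR2 is to terminate the process.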

Let's now look at a bare-bones script that catches SIGINT and SIGQUIT. The script contained in Listing 3 below is a counter iteration script. When the user hits Ctrl-C or Ctrl-\ on the keyboard, the trap command traps the signal and echoes a message that the script has terminated. The termination is accomplished by using the exit command at the end of the command list. If this is not done, the script does not terminate and continues processing. In this example, we want it to terminate. There may be occasions when this would not be the case and processing should continue.

Listing 3. Trap1

    #!/bin/bash
    #trap1
    # the message is double-quoted so that the backslash is echoed as typed
    trap 'echo "you hit Ctrl-C/Ctrl-\, now exiting.."; exit' SIGINT SIGQUIT
    count=0
    while :
    do
        sleep 1
        count=$((count + 1))
        echo $count
    done

    $ trap1
    1
    2
    3
    ^Cyou hit Ctrl-C/Ctrl-\, now exiting..

It is considered good form to use the signal names rather than the signal numbers within the trap command, because signal numbers can differ between systems.

You can also use a function in place of the command as demonstrated in Listing 4 below:

Listing 4. Trap1a

    #!/bin/bash
    #trap1a
    trap 'my_exit; exit' SIGINT SIGQUIT
    count=0

    my_exit()
    {
        echo "you hit Ctrl-C/Ctrl-\, now exiting.."
        # cleanup commands here if any
    }

    while :
    do
        sleep 1
        count=$((count + 1))
        echo $count
    done

Signals can also be caught when a script is running in the background. Listing 5 below contains a simple counter, as in the previous examples. Again, I have chosen to exit the script upon catching the signal. If this were a file processing script, any temporary files created would be deleted first.

The script is submitted into the background using:

    $ /home/dxtans/trapbg &
    [1] 708790
    $ 1
    2
    3

Now, from another terminal, send a SIGHUP signal to kill it.

    $ ps -ef | grep trapbg
    dxtans 708790 2457860 11:49:39 pts/0 0:00 /bin/bash /home/dxtans/trapbg
    $ kill -1 708790

Now back on the terminal where the script was submitted, the following is displayed:

    $ /home/dxtans/trapbg &
    [1] 708790
    $ 1
    2
    3
    Going down on a SIGHUP - signal 1, now exiting..
    [1]+ Done /home/dxtans/trapbg

Listing 5. trapbg

    #!/bin/bash
    #trapbg
    trap 'echo Going down on a SIGHUP - signal 1, now exiting..; exit' SIGHUP
    count=0
    while :
    do
        sleep 10
        count=$((count + 1))
        echo $count
    done

One of the most common tasks when dealing with signals is cleaning up temporary files. Typically, these are files that the script creates in /tmp with its own process ID (PID) appended to the name. Assume the temp files are in this form:

    hold1.$$
    hold2.$$

A common command to remove these files is:

    rm /tmp/hold*.$$

The following piece of code traps SIGHUP, SIGINT, SIGQUIT, and SIGTERM, then removes the files:

    trap 'rm /tmp/hold*.$$; exit' SIGHUP SIGINT SIGQUIT SIGTERM
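The same clean-up can be kept in a function so that the signal path and the normal end of the script share one piece of code. This is a sketch under stated assumptions: cleanup is a hypothetical function name, and the files are created with mktemp (assumed to be installed) instead of the fixed hold*.$$ names used above.

```shell
#!/bin/bash
# Sketch only: cleanup is a hypothetical function name, and mktemp is
# swapped in for the hold*.$$ naming scheme.

hold1=$(mktemp "${TMPDIR:-/tmp}/hold1.XXXXXX")
hold2=$(mktemp "${TMPDIR:-/tmp}/hold2.XXXXXX")

cleanup() {
    # -f keeps rm quiet if a file has already gone
    rm -f "$hold1" "$hold2"
}
trap 'cleanup; exit 1' SIGHUP SIGINT SIGQUIT SIGTERM

echo "working data" > "$hold1"
# ... processing would go here ...

cleanup     # normal end of the script: remove the files as well
```

Exiting with a non-zero status from the trap lets a calling script tell an interrupted run from a completed one.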

Earlier in this article, I demonstrated that using set -e causes a script to terminate when a command returns a non-zero exit status. The trap command has a similar option: ERR. It is not really a signal as such; rather, it traps a non-zero exit status from a command, under the same conditions that apply to set -e. ERR goes in the signal list of the trap command. In the following example, a non-existent file is copied, which invokes an error:

    #!/bin/bash
    #trap1b
    trap 'echo I have error in my script..' ERR
    cp /home/dxtans/afile /tmp

When executed, the output is:

    $ trap1b
    cp: /home/dxtans/afile: A file or directory in the path name does not exist.
    I have error in my script..
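To see when the ERR trap fires and when it does not, the trap can simply count its own invocations. A minimal sketch, with false standing in for any failing command and errors as a hypothetical counter name:

```shell
#!/bin/bash
# Sketch only: count how often the ERR trap fires.

errors=0
trap 'errors=$((errors + 1))' ERR

false                       # plain failure: the trap fires
if false; then :; fi        # failure tested by if: the trap does not fire
false || true               # failure on the left of ||: the trap does not fire

trap - ERR                  # stop trapping errors
false                       # no longer counted
echo "ERR trap fired $errors time(s)"
```

Unlike set -e, the ERR trap on its own does not end the script; add exit to the command list if that is what you want.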

There are two variables that come in handy when dealing with traps, as they give you more information on the script termination: LINENO and BASH_COMMAND. BASH_COMMAND is exclusive to bash. These report, or attempt to report, the line number that the script is currently executing and the command that is currently running. Listing 6 below demonstrates this. The script executes a list of echo and sleep commands. When the script is sent a SIGHUP, SIGINT, or SIGQUIT, it terminates. A message displays the line number and the command that was running when the signal was trapped; the script then exits (from the exit command in the trap command list). Notice that the trap calls the function my_exit to display the information, passing LINENO and BASH_COMMAND as the parameters $1 and $2. The function also logs a message of the event to /var/adm/messages. Other clean-up commands would be put in this function, if required.

Listing 6. trap4

    #!/bin/bash
    #trap4
    # $BASH_COMMAND is unquoted, so only its first word reaches my_exit as $2
    trap 'my_exit $LINENO $BASH_COMMAND; exit' SIGHUP SIGINT SIGQUIT

    my_exit()
    {
        echo "$(basename "$0") caught error on line : $1 command was: $2"
        logger -p notice "script: $(basename "$0") was terminated: line: $1, command was $2"
        # cleanup commands here if any
    }
    echo 1
    sleep 1
    echo 2
    sleep 1
    echo 3

Running this script a couple of times, and then interrupting at different intervals, produces the following output.

    $ trap4
    1
    2
    ^Ctrap4 caught error on line : 15 command was: sleep
    $ trap4
    1
    ^Ctrap4 caught error on line : 13 command was: sleep

In /var/adm/messages, we have an entry for the script termination:

    Apr 6 12:12:46 rs6000 user:notice dxtans: script: trap4 was terminated: line: 13, command was sleep
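The output above shows only the word sleep because the command text is split into separate words before it reaches the function. If you want the whole command, one option is to quote the variables in the trap. The following is a sketch of that variation, assuming bash; note_signal is a hypothetical name, and SIGUSR1 is sent from a background subshell so that the example runs without any keyboard input.

```shell
#!/bin/bash
# Sketch only: quoting "$BASH_COMMAND" passes the whole command as one argument.

self=${BASHPID:-$$}     # PID of this shell ($$ on bash older than 4)
last_line=""
last_cmd=""

note_signal() {
    last_line=$1
    last_cmd=$2
    echo "signal arrived at line $1 while running: $2"
}
trap 'note_signal "$LINENO" "$BASH_COMMAND"' SIGUSR1

# Send SIGUSR1 to this shell from the background after one second.
( sleep 1; kill -USR1 "$self" ) &

sleep 2     # the trap runs once this foreground command has finished
wait
```

The handler here only reports; a real script would usually log the details and exit, as my_exit does in Listing 6.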

There are occasions when you will want to ignore certain signals. Perhaps you wish to prevent someone hitting Ctrl-C or Ctrl-\ on the keyboard by mistake when your script is doing some processing on large files, and you wish it to complete, without user interruption. The following segment of code achieves this:

    trap '' SIGINT SIGQUIT

You can also ignore certain signals during a portion of your script, then reinstate them later on when you do wish to catch the signals so you can take some form of action. The script contained in Listing 7 below ignores the signals SIGINT and SIGQUIT until after the first sleep command has finished. Then, when the second sleep command starts, the trap takes action if either signal is sent, and the script terminates. As in the previous examples, you can assume the sleep commands represent some form of processing.

Listing 7. trapoff_on

    #!/bin/bash
    #trapoff_on
    trap '' SIGINT SIGQUIT
    echo "you cannot terminate using ctrl-c or ctrl-\, "
    # heavy processing goes on here, cannot interrupt!
    sleep 10

    trap 'echo terminated; exit' SIGINT SIGQUIT
    # user can now interrupt
    echo "ok you can now terminate me using those keystrokes"
    sleep 10
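When only one or two commands need protecting, the ignore-then-reinstate pattern can be wrapped around a single command. This is a minimal sketch rather than the article's method: run_shielded is a hypothetical helper, and it restores the default actions afterwards, so any handlers set earlier are not preserved.

```shell
#!/bin/bash
# Sketch only: a hypothetical helper that shields one command from
# Ctrl-C and Ctrl-\.

# run_shielded COMMAND [ARGS...]
# Ignores SIGINT and SIGQUIT while the command runs, then restores
# the default actions and returns the command's exit status.
run_shielded() {
    trap '' SIGINT SIGQUIT
    "$@"
    local status=$?
    trap - SIGINT SIGQUIT
    return $status
}

run_shielded sleep 1
echo "shielded command finished with status $?"
```

Signals that are ignored by the shell are also ignored by the commands it starts, which is what protects the sleep here.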

Sending a signal to a child

Scripts that start child processes also need attention. Assuming you wish to terminate any child processes, you need to kill these as well. This is accomplished using the trap command, as demonstrated in Listing 8 below. In this example, two sleep commands are used as the child processes. These are put into the background; as each process is started, its PID is added to the variable pid, so that the variable holds the PIDs of both child (sleep) processes.

To kill the main script, a SIGHUP, SIGINT, SIGQUIT, or SIGTERM is sent. Upon catching this signal, a kill command is issued to the PIDs of the child processes contained in the variable pid. Once completed, the script exits. The wait at the end of the script waits for the child processes to terminate or complete. Child scripts may need signal traps of their own to do further cleaning up before they exit. Clearly, this depends on your type of processing.

The following example kills the children when the parent is sent one of the signals.

Listing 8. trapchild

    #!/bin/bash
    #trapchild

    sleep 120 &
    pid="$!"

    sleep 120 &
    pid="$pid $!"

    echo "my process pid is: $$"
    echo "my child pid list is: $pid"

    trap 'echo I am going down, so killing off my processes..; kill $pid; exit' SIGHUP SIGINT SIGQUIT SIGTERM

    wait

Upon execution of the script, the following displays:

    $ /home/dxtans/trap/trapchild
    my process pid is: 6553626
    my child pid list is: 5767380 6488072

Check from the terminal that the processes are running, along with the child processes (the two sleep commands).

    $ ps -ef | grep trapchild
    root 6553626 5439516 0 20:51:32 pts/1 0:00 /bin/bash /home/dxtans/trap/trapchild
    $ ps -ef | grep sleep
    root 5767380 6553626 0 20:51:32 pts/1 0:00 sleep 120
    root 6488072 6553626 0 20:51:32 pts/1 0:00 sleep 120

Let’s now send a SIGTERM to the parent process. The script terminates and terminates the child processes.

    $ kill -15 6553626

The script then terminates with the following output:

    $ /home/dxtans/trap/trapchild
    my process pid is: 6553626
    my child pid list is: 5767380 6488072
    I am going down, so killing off my processes..

Check that nothing is returned after the termination:

    # ps -ef | grep sleep
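The check can also be made from inside the script with kill -0, which tests whether a process exists without sending it a signal. The following is a minimal sketch along the lines of Listing 8; stop_children is a hypothetical helper name, and the children are stopped straight away only to keep the example short.

```shell
#!/bin/bash
# Sketch only: stop_children is a hypothetical helper name.

pids=""
sleep 30 & pids="$pids $!"
sleep 30 & pids="$pids $!"

# Terminate every recorded child, then wait so none is left behind.
stop_children() {
    kill $pids 2>/dev/null              # unquoted on purpose: one argument per PID
    wait $pids 2>/dev/null || true      # wait reports the children's exit status
}
trap 'echo "going down, stopping children"; stop_children; exit 1' SIGHUP SIGINT SIGQUIT SIGTERM

# Normally the script would now wait for its children to finish.
stop_children

for pid in $pids; do
    if kill -0 "$pid" 2>/dev/null; then
        echo "child $pid is still running"
    else
        echo "child $pid has gone"
    fi
done
```

Waiting after the kill means the script does not exit while a child is still shutting down.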

Conclusion

Using traps within your scripts requires a little extra effort. The reward is that when a trappable signal arrives at your script, you will be in a good position to take action.