
What is a lockfile

You may have experienced it before: you create a cronjob to change some data every X hours or minutes, and one day the job takes longer than it usually does and cron spawns another job before the first one has finished.

This can result in data corruption or deletion of data that should not have been deleted, depending on what the cronjob is set up to do.

To prevent bad things from happening, a good rule of thumb is to always use a lockfile.

A lockfile is a small file that takes up virtually no space, or at least so little that you won’t care (the actual size on disk depends on your filesystem). Depending on how the lockfile is managed, it may contain a PID or a timestamp, or it may simply be empty.



How lockfiles work

There are multiple ways to implement a lockfile. I’ll explain the basics here, using two different approaches.

Empty file

The first and simplest way is to make your script/program check whether a file exists at the beginning of the script. Let’s say the filename is /var/lock/myscript.lock.

If the lockfile exists, the script just exits, since the existence of the lockfile suggests that the script is already running. If the lockfile does not exist, the script creates it and continues with whatever it has to do.

When the script is done doing its job, the lockfile has to be deleted before the script exits.

That’s basically it: the lockfile is just a file indicating that the script is already running. However, this method with just an empty file has one property that is both its big problem and its big advantage.

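If the script fails and exits before it gets to delete the lockfile, the script will never run again until you go in and delete the lockfile manually. If your server reboots or crashes while the script is running, you will have the same problem (unless the lockfile sits in a directory that is cleared at boot, which is the case for /var/lock on some distributions).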

However, that is not necessarily bad and can be useful in some cases. Sometimes your script makes changes that cannot safely be restarted if it is interrupted before it has finished. In that case this type of lockfile is a must, because the script will not start again on its own until you delete the lockfile manually to let it.

Lockfile with PID

Let’s say your script name is myscript.sh and the lockfile is located at /var/lock/myscript.lock

If the lockfile exists, your script reads it to see if it has any content. If it finds data in the file, the script assumes it’s a PID and checks whether a process with that ID is running. (PID stands for Process ID. Every process gets one; it is just a number, normally handed out in increasing order as processes are spawned.)

If no process with the PID from the lockfile is found, or the lockfile does not exist at all, the script creates the lockfile with the PID of the currently running script as its content. Nothing else, just the PID.

This way, you do not have to delete the lockfile when done. In case of a script or system crash the lockfile will still be there, but that does not matter, since the process with the PID from the file is no longer running. When the script runs again, the check at the beginning will not find that process, so the script writes its new PID into the lockfile and continues with its job.

I have used this method in multiple scripts. It does have some downsides: for example, an unrelated process may be spawned with the same ID (PIDs are reused once the maximum is reached), and the script would then wrongly believe it is already running. I have never run into a problem like this, though.
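If PID reuse worries you, one possible mitigation is to also check that the process behind the PID actually looks like your script. This is only a rough sketch and assumes a Linux-style /proc filesystem, where /proc/<PID>/cmdline holds the command line of the process. The helper name pid_matches_command is made up for this example.

```shell
# Hypothetical helper: succeeds only if PID $1 exists and its command line
# contains the text $2. Assumes a Linux-style /proc filesystem.
pid_matches_command() {
    # No such process (or no /proc): treat the lock as stale
    [ -r "/proc/$1/cmdline" ] || return 1
    # Arguments in cmdline are separated by NUL bytes; turn them into
    # spaces so grep can search the line as plain text
    tr '\0' ' ' < "/proc/$1/cmdline" | grep -qF -- "$2"
}

# Example use (commented out so the sketch has no side effects):
# read -r lastPID < /var/lock/myscript.lock
# pid_matches_command "$lastPID" "myscript.sh" && exit
```

An unrelated process that happens to get the old PID will usually have a different command line, so the lock is then treated as stale. It is still not bulletproof, since two copies of the same script can of course share a name.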

Lockfiles the “hard” way

I call it the hard way because it requires you to add some code to your script. It’s not really hard, just not as easy as the easy solution further down in this post, and it helps people who are new to lockfiles understand how they work.

Adding the following code on top of a bash script will:

- Create the lockfile if it does not already exist
- Read the data from the lockfile
- Check if a process with the PID matching the data from the lockfile is running
- If a process with the PID from the lockfile is running, just exit the script
- If no process with that PID is running, write the current PID to the lockfile

Here is the code for a bash script with comments:

# Variable to hold the location of the lockfile
lf=/var/lock/myscript.lock
# Create an empty lockfile if none exists
touch "$lf"
# Read the content of the lockfile into a variable
read -r lastPID < "$lf"
# If lastPID is not empty and a process with that PID exists, exit the script
[ -n "$lastPID" ] && [ -d "/proc/$lastPID" ] && exit
# Write the PID of the currently running script to the lockfile
echo $$ > "$lf"
# Your code goes here and will do its job from this point on.
# No further code related to the lockfile is needed
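The check against /proc in the code above relies on a Linux-style /proc filesystem. As a rough sketch of an alternative, kill -0 can be used instead: it sends no signal and only reports whether the PID can be signalled. The function names pid_is_running and acquire_pid_lock are made up for this example, and the sketch assumes the script always runs as the same user (kill -0 also fails for processes you are not allowed to signal).

```shell
# Hypothetical helper: succeeds if a process with PID $1 exists
# and may be signalled by the current user
pid_is_running() {
    # Reject empty or non-numeric input (e.g. an empty lockfile)
    case $1 in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Signal 0 sends nothing; it only performs the existence/permission check
    kill -0 "$1" 2>/dev/null
}

# Hypothetical helper: take the lock at path $1, or return 1 if the PID
# stored in it belongs to a running process
acquire_pid_lock() {
    lastPID=
    if [ -f "$1" ]; then
        # read fails on an empty file; that simply leaves lastPID empty
        read -r lastPID < "$1" || true
    fi
    if pid_is_running "$lastPID"; then
        return 1
    fi
    # Write the PID of the current shell into the lockfile
    echo $$ > "$1"
}

# Example use (commented out so the sketch has no side effects):
# acquire_pid_lock /var/lock/myscript.lock || exit
```

Wrapping the logic in functions keeps the lock handling in one place, so the rest of the script stays the same no matter which check you prefer.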

And here is the code to use just an empty lockfile:

# Variable to hold the location of the lockfile
lf=/var/lock/myscript.lock
# Check if the lockfile exists, exit if it does
[ -f "$lf" ] && exit
# Create the lockfile
touch "$lf"
# Your script has to do its job here
# At the very end of the script, or before it exits, delete the lockfile
rm "$lf"
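One weakness of checking for the file first and creating it afterwards is that two copies of the script started at nearly the same moment could both pass the check before either of them creates the file. Here is a minimal sketch of one way around that, assuming a POSIX-style shell: with the noclobber option (set -C) the shell refuses to overwrite an existing file, so checking and creating become a single step. The function name acquire_empty_lock is made up for this example.

```shell
# Hypothetical helper: create the lockfile at path $1, failing if it
# already exists
acquire_empty_lock() {
    # With noclobber (set -C) a ">" redirection refuses to overwrite an
    # existing file, so the check and the creation are one operation.
    # The subshell keeps the option from leaking into the rest of the script.
    ( set -C; : > "$1" ) 2>/dev/null
}

# Example use (commented out so the sketch has no side effects):
# lf=/var/lock/myscript.lock
# acquire_empty_lock "$lf" || exit 1
# trap 'rm -f "$lf"' EXIT   # optional: remove the lockfile when the script exits
# ... the script does its job here ...
```

The trap line is optional. Leave it out if you want the lockfile to stay behind after a failure so that the script does not run again until you have had a look. Note that a trap cannot help if the process is killed with SIGKILL or the machine loses power.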

Lockfiles the easy way

Above I showed you how a lockfile works and the “hard” way to manage one.

Now let’s look into the easy way. This method, however, requires a tiny program that is available in the official repositories.

The program is called “flock”

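flock is part of the util-linux package, which is already installed on most systems. If it is missing, install util-linux with the following command:

Debian:

apt-get install util-linux

RHEL/CentOS (on older releases such as CentOS 6 the package is named util-linux-ng):

yum install util-linux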

Once installed, it’s really easy to use with the following syntax:

/usr/bin/flock -n /path/to/lockfile.lock /path/to/myscript.sh

The -n option makes flock exit immediately if the script is already running. Without -n, flock will wait until the first process is done and then run the script.

That’s it, flock will handle the rest for you. Just run the script with flock in front of it every time you run it, and you will be safe from the script accidentally running multiple times in parallel.
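flock can also be used from inside the script itself, so you cannot forget to put it in front of the command. The following is a minimal sketch, assuming the flock command is installed and a shell that supports numbered file descriptors. The function name lock_or_fail and the descriptor number 9 are arbitrary choices for this example.

```shell
# Hypothetical helper: take an exclusive lock on the file at path $1.
# The lock is held for as long as this shell keeps file descriptor 9 open.
lock_or_fail() {
    # Open (and create if needed) the lockfile on file descriptor 9
    exec 9>"$1" || return 1
    # -n: fail immediately instead of waiting if another process holds the lock
    flock -n 9
}

# Example use (commented out so the sketch has no side effects):
# lock_or_fail /path/to/lockfile.lock || exit 1
# ... the script does its job here ...
```

The lock belongs to the open file descriptor, not to the existence of the file, so it is released automatically when the script ends, even if it crashes. The lockfile itself can stay on disk.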