We all dream of servers that need no maintenance at all. Unfortunately, reality is different. Disks can get full, processes can crash, the server can run out of memory...

Last week our team released a server monitor package written in PHP that keeps an eye on the health of all your servers. When it detects a problem it can, among other things, notify you via Slack.

In this post I'd like to give some background on why we created it and give you a run-through of what the package can do.

Why create another server health monitor?

The short answer: the available solutions were too complicated and/or too expensive for my company. If you want the long version, read on. Otherwise skip to the introduction of laravel-server-monitor.

In order to answer this question, let's first take a look at how my company has been doing things for the last few years. We're what most people call a web agency. Our team is quite small: only a handful of developers with no dedicated operations team. We create a lot of web applications for clients. For the most part we also host all these applications.

Up until a few years ago we created rather smallish sites and apps. We relied on traditional shared hosting. But as the complexity of our projects grew, shared hosting didn't cut it anymore. Thanks to excellent resources like serversforhackers.com and Laravel Forge we felt confident enough running our own servers.

Each application is hosted on its own Digital Ocean server that was provisioned by Laravel Forge. Sure, running each application on a separate server is probably a bit more expensive than grouping some of them together on the same box. But using separate boxes has a lot of benefits:

for new projects you can just set up a new box with the latest versions of PHP, MySQL, etc...

when touching a project running on an older version of PHP we can very easily upgrade the PHP version on that server. When running multiple applications on the same server you don't get this freedom without testing all the applications running on it.

when a server is in trouble it only impacts one application

an application that is misbehaving in terms of memory and CPU usage can't impact other applications

each application has a lot of disk space to play with (minimum 20 GB)

when Digital Ocean loses your server (yes, this can happen), it only impacts one application

Even though we are very happy with how we do things in regard to hosting, we don't see a lot of other companies of our size using this strategy. Most of them use managed or shared hosting. So as a small company with a lot of servers we're probably in a niche.

The problem with paid server monitoring is that most services assume that if you have a lot of servers you're probably a large company that has a big budget for server monitoring. Pricing of paid plans is mostly per host. For a single host this is mostly cheap (Datadog for example has a plan of $15 per host per month), but multiplied by a hundred hosts, this becomes too expensive.

Also most of these services offer much more than we need. We don't need graphs of historical data or a lot of checks. We simply want to have a notification on our Slack channel when disk space is running low or when a service like memcached or beanstalk is down.

There also are a lot of free open source solutions, like Zabbix, Nagios and Icinga. As a developer, the problem with these tools is that they don't target developers but people in an operations department. For developers these tools are quite complex to set up. Take a look at this guide to install Nagios. Sure, it's doable, but if you don't have much experience setting these kinds of things up, it can be quite daunting.

There must be a better way.

Introducing laravel-server-monitor

To monitor our servers we built a Laravel package called laravel-server-monitor. This package can perform health checks on all your servers. If you're familiar with Laravel I'm sure you can install the package in a couple of minutes. Not familiar with Laravel? No problem! We've also made a standalone version. More on that later. And to be clear: the package is able to monitor all kinds of servers, not only ones where Laravel is running.

The package monitors your servers by ssh'ing into them and performing certain commands. It'll interpret the output returned by each command to determine if the check failed or not.

Let's illustrate this with the memcached check provided out of the box. This verifies if Memcached is running. The check runs service memcached status on your server and if it outputs a string that contains memcached is running the check will succeed. If not, the check will fail.

When a check fails, and on other events, the package can send you a notification. Notifications look like this in Slack.

You can specify which channels will send notifications in the config file. By default the package has support for Slack and mail notifications. Because the package leverages Laravel's native notifications you can use any of the community supported drivers or write your own.

Hosts and checks can be added via the add-host artisan command or by manually adding them in the hosts and checks tables.

This package comes with a few built-in checks. But it's laughably easy to add your own checks.

Defining checks

The package will run checks on hosts. But what does such a check look like? A check actually is a very simple class that extends Spatie\ServerMonitor\CheckDefinitions\CheckDefinition. Let's take a look at the code of the built-in diskspace check.

    namespace Spatie\ServerMonitor\CheckDefinitions;

    use Spatie\Regex\Regex;
    use Symfony\Component\Process\Process;

    final class Diskspace extends CheckDefinition
    {
        public $command = 'df -P .';

        public function resolve(Process $process)
        {
            $percentage = $this->getDiskUsagePercentage($process->getOutput());

            $message = "usage at {$percentage}%";

            if ($percentage >= 90) {
                $this->check->fail($message);

                return;
            }

            if ($percentage >= 80) {
                $this->check->warn($message);

                return;
            }

            $this->check->succeed($message);
        }

        protected function getDiskUsagePercentage(string $commandOutput): int
        {
            return (int) Regex::match('/(\d?\d)%/', $commandOutput)->group(1);
        }
    }

This check will perform df -P . on the server. That will generate output much like this:

    Filesystem                1024-blocks     Used Available Capacity Mounted on
    /dev/disk/by-label/DOROOT    20511356 12378568   7067832      64% /

With a little bit of regex we extract the percentage listed in the Capacity column. If it's 90% or higher we'll call fail. This will mark the check as failed and will send out a notification. If it's between 80% and 90% it'll issue a warning. If it's below 80% the check succeeds. It's as simple as that.
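To make the thresholds concrete, here is a minimal standalone sketch of the same idea in plain PHP, without the package. The function names diskUsagePercentage and classifyDiskUsage are made up for this illustration and are not part of laravel-server-monitor. The sketch uses \d+ rather than a two-digit pattern so that a reading of 100% is captured whole, and it treats output without any percentage as a failure; both are choices made for this example.

```php
<?php

/**
 * Hypothetical helper: extract the usage percentage from `df -P` output.
 * Returns null when the output contains no percentage at all.
 */
function diskUsagePercentage(string $dfOutput): ?int
{
    // \d+ captures one or more digits, so "100%" is read as 100.
    if (preg_match('/(\d+)%/', $dfOutput, $matches) !== 1) {
        return null;
    }

    return (int) $matches[1];
}

/**
 * Hypothetical helper: map a percentage to the three outcomes described above.
 */
function classifyDiskUsage(?int $percentage): string
{
    // Assumption for this sketch: unreadable output counts as a failure.
    if ($percentage === null || $percentage >= 90) {
        return 'failed';
    }

    if ($percentage >= 80) {
        return 'warning';
    }

    return 'success';
}

$output = "Filesystem 1024-blocks Used Available Capacity Mounted on\n"
    . "/dev/disk/by-label/DOROOT 20511356 12378568 7067832 64% /\n";

echo classifyDiskUsage(diskUsagePercentage($output)), PHP_EOL;
```

Keeping the parsing and the classification in separate functions makes both easy to test with canned df output.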

Let's take a look at another example: the memcached check.

    namespace Spatie\ServerMonitor\CheckDefinitions;

    use Symfony\Component\Process\Process;

    final class Memcached extends CheckDefinition
    {
        public $command = 'service memcached status';

        public function resolve(Process $process)
        {
            if (str_contains($process->getOutput(), 'memcached is running')) {
                $this->check->succeed('is running');

                return;
            }

            $this->check->fail('is not running');
        }
    }

This check will run the command service memcached status on the server. If that command outputs a string that contains memcached is running the check succeeds, otherwise it fails. Very simple.

Adding your own checks

Writing your own checks is very easy. Let's create a check that'll verify if nginx is running.

Let's take a look at how to manually verify if Nginx is running. The easiest way is to run systemctl is-active nginx. This command outputs active if Nginx is running.
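One detail is worth pinning down before automating this: for a stopped unit the same command prints inactive, which also contains the substring active. A minimal sketch of a comparison that tells the two apart could look like this. The helper name reportsActive is hypothetical and only serves the illustration.

```php
<?php

/**
 * Hypothetical helper: decide whether the output of
 * `systemctl is-active <unit>` reports a running unit.
 */
function reportsActive(string $commandOutput): bool
{
    // The command prints a single word followed by a newline,
    // so trim the output and compare the whole word.
    return trim($commandOutput) === 'active';
}

var_dump(reportsActive("active\n"));
var_dump(reportsActive("inactive\n"));
var_dump(reportsActive("failed\n"));
```

A plain substring search would accept the second case as well, so comparing the full trimmed output is the safer choice here.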

Let's create an automatic check using that command.

The first thing you must do is create a class that extends Spatie\ServerMonitor\CheckDefinitions\CheckDefinition. Here's an example implementation.

    namespace App\MyChecks;

    use Spatie\ServerMonitor\CheckDefinitions\CheckDefinition;
    use Symfony\Component\Process\Process;

    class Nginx extends CheckDefinition
    {
        public $command = 'systemctl is-active nginx';

        public function resolve(Process $process)
        {
            // Compare the whole trimmed output: "inactive" also contains "active".
            if (trim($process->getOutput()) === 'active') {
                $this->check->succeed('is running');

                return;
            }

            $this->check->fail('is not running');
        }
    }

Let's go over this code in detail. The command to be executed on the server is specified in the $command property of the class.

The resolve function accepts an instance of Symfony\Component\Process\Process. The output of that process can be inspected using $process->getOutput(). If the trimmed output is exactly active we'll call $this->check->succeed, which will mark the check as successful. We compare the whole output instead of searching for a substring because a stopped service reports inactive, which also contains active. In any other case $this->check->fail will be called and the check marked as failed. By default the package sends you a notification whenever a check fails. The string that is passed to $this->check->fail will be displayed in the notification.

After creating this class you must register your class in the config file.

    'checks' => [
        ...
        'nginx' => App\MyChecks\Nginx::class,
    ],

And with that, you're done. A check definition can actually do a few more things: it can determine when it should run next, set timeouts, and use custom properties. Take a look at the docs if you want to know more about this.

Using the standalone version

If you're not familiar with Laravel, installing a package can be a bit daunting. That's why we also created a standalone version called server-monitor-app. Under the hood it's simply a vanilla Laravel 5.4 application with the laravel-server-monitor package pre-installed into it.

Using this app you can set up server monitoring in literally one minute. Here's a video that demonstrates the installation and using a check.

Under the hood

Let's take a look at a few cool pieces of source code.

If you have a bunch of servers you can end up with a lot of checks that need to be run. Running all those checks one after the other can take a bit of time. That's why the package has support for running checks concurrently. In the config file you can configure how many SSH connections the package may use.

This code is taken from CheckCollection, which is responsible for running all the checks.

    public function runAll()
    {
        while ($this->pendingChecks->isNotEmpty() || $this->runningChecks->isNotEmpty()) {
            if ($this->runningChecks->count() < config('server-monitor.concurrent_ssh_connections')) {
                $this->startNextCheck();
            }

            $this->handleFinishedChecks();
        }
    }

This loop will run as long as there are pending checks or running checks. Whenever fewer checks are running than the number configured in the config file, the next check is started.
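To see the effect of the limit without any real SSH connections, here is a small simulation of the same loop in plain PHP. Everything in it is made up for this illustration: SimulatedCheck stands in for a check whose process keeps running for a fixed number of polls, and runAllSimulated mirrors the structure of the loop above with an explicit guard for an empty pending list. It is a sketch of the scheduling idea, not the package's code.

```php
<?php

// Hypothetical stand-in for a check: it "runs" for a fixed number of polls.
final class SimulatedCheck
{
    public $name;
    private $pollsLeft;

    public function __construct(string $name, int $pollsNeeded)
    {
        $this->name = $name;
        $this->pollsLeft = $pollsNeeded;
    }

    // Each call represents one poll of the underlying process.
    public function isRunning(): bool
    {
        return $this->pollsLeft-- > 0;
    }
}

function runAllSimulated(array $pending, int $maxConcurrent): array
{
    $running = [];
    $finished = [];
    $peak = 0;

    while ($pending !== [] || $running !== []) {
        // Start one more check per iteration while there is spare capacity.
        if ($pending !== [] && count($running) < $maxConcurrent) {
            $running[] = array_shift($pending);
        }

        $peak = max($peak, count($running));

        // Keep what is still running, record what has finished.
        $stillRunning = [];
        foreach ($running as $check) {
            if ($check->isRunning()) {
                $stillRunning[] = $check;
            } else {
                $finished[] = $check->name;
            }
        }
        $running = $stillRunning;
    }

    return ['finished' => $finished, 'peak' => $peak];
}

$result = runAllSimulated([
    new SimulatedCheck('diskspace', 3),
    new SimulatedCheck('memcached', 1),
    new SimulatedCheck('nginx', 2),
], 2);

print_r($result);
```

With a limit of two, the third check only starts once one of the first two has finished, and short checks can complete before longer ones that started earlier.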

Let's take a look at what's happening inside the handleFinishedChecks function.

    protected function handleFinishedChecks()
    {
        [$this->runningChecks, $finishedChecks] = $this->runningChecks->partition(function (Check $check) {
            return $check->getProcess()->isRunning();
        });

        $finishedChecks->each->handleFinishedProcess();
    }

This code leverages a lot of niceties offered by the latest versions of PHP and Laravel. partition splits the running checks in two: those whose process is still running stay in $this->runningChecks, and those whose process has stopped (and are thus finished) end up in the $finishedChecks collection. After that handleFinishedProcess will be called on each finished check.
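If you haven't used partition or the short array destructuring syntax before, this plain-PHP approximation may help. The partition function below is written for this illustration and is a simplification: unlike Laravel's collection method it works on plain arrays and reindexes the results. The check data is invented.

```php
<?php

/**
 * Split $items into [matching, nonMatching] according to $predicate.
 * A simplified stand-in for a collection's partition method.
 */
function partition(array $items, callable $predicate): array
{
    $matching = [];
    $nonMatching = [];

    foreach ($items as $item) {
        if ($predicate($item)) {
            $matching[] = $item;
        } else {
            $nonMatching[] = $item;
        }
    }

    return [$matching, $nonMatching];
}

// Invented sample data: which checks still have a running process.
$checks = [
    ['name' => 'diskspace', 'running' => true],
    ['name' => 'memcached', 'running' => false],
    ['name' => 'nginx', 'running' => true],
];

// Short array destructuring (PHP 7.1+) assigns both halves in one statement.
[$running, $finished] = partition($checks, function (array $check) {
    return $check['running'];
});

print_r(array_column($running, 'name'));
print_r(array_column($finished, 'name'));
```

The first element of the result holds the items for which the callback returned true, which is why the still-running checks are assigned back first.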

handleFinishedProcess will eventually call the resolve function seen in the CheckDefinition examples listed above.

Testing the code

Like all other packages we previously made, laravel-server-monitor contains a good suite of tests. This allows us to improve the code and accept PRs without fear of breaking the code. Because SSH connections are used in this package, testing all functionality provided some challenges.

To easily test code that relies on SSH connections, the test suite contains a dummy SSH server written in JavaScript. When it runs it mimics all functionality of an SSH server. The SSH server itself is provided by the mscdex/ssh2 package. My colleague Seb wrote an easy to use abstraction around it. Using that abstraction we can let it respond with whatever we want when a command is sent to the server.

This makes testing the package end to end a breeze. Here's how we test a successful check.

    public function it_can_run_a_successful_check()
    {
        $this->letSshServerRespondWithDiskspaceUsagePercentage(40);

        Artisan::call('server-monitor:run-checks');

        $check = Check::where('host_id', $this->host->id)->where('type', 'diskspace')->first();

        $this->assertEquals('usage at 40%', $check->last_run_message);
        $this->assertEquals(CheckStatus::SUCCESS, $check->status);
    }

Let's take a look at another example. The package will fire a CheckRestored event if a check succeeds again after it has failed previously.

    public function the_recovered_event_will_be_fired_when_an_check_succeeds_after_it_has_failed()
    {
        $this->letSshServerRespondWithDiskspaceUsagePercentage(99);

        Artisan::call('server-monitor:run-checks');

        $this->letSshServerRespondWithDiskspaceUsagePercentage(20);

        Event::assertNotDispatched(CheckRestored::class);

        Artisan::call('server-monitor:run-checks');

        Event::assertDispatched(CheckRestored::class, function (CheckRestored $event) {
            return $event->check->id === $this->check->id;
        });
    }

In closing

Laravel Server Monitor was a fun project to work on. We're quite happy with the results and now use it to monitor all our servers. If you're interested in learning more about the package head over to our documentation site or the package itself on GitHub. Keep in mind that this package was built specifically for teams without a dedicated ops member or department. So before using it, research the alternatives a bit yourself and make up your own mind what is a good solution for you.

Our server monitor determines health by checking things inside your server. We also built another package called laravel-uptime-monitor that monitors your server from the outside. It'll regularly send HTTP requests to verify if your server is up. It can even verify if the SSL certificate in use is still valid for a certain number of days. Take a look at the uptime monitor docs to learn more.

Also take a look at the other framework-agnostic and Laravel-specific packages we've made before. Maybe we've made something that can be of use in your next project.