Motivation

A while back I started thinking about software and how it does what it does. Perhaps even more importantly, I started thinking about how we can know what software does, without relying on the software’s marketing pitch.

Software analysis

Software can be good, but it can also be bad; the bad kind is what we know as malware. A lot of time and effort is spent reverse-engineering malware’s machine code in order to figure out what it does and how it does it. This is a highly specialised skill that takes a great deal of practice to perform successfully. We also use virtual machines (VMs) to execute untrustworthy files in a safe environment, limiting the damage they can do while they are being analysed.

But it’s not just malware that should be analysed. We are increasingly reliant on software in all walks of life, and we often have to trust that the software we rely upon is safe and does what we expect it to do, and only that.

One particular concern I have is the divide between what we’re told a piece of software should do and what it actually does. How can I be sure that some software that purports to keep all my passwords safe in an encrypted keychain does not in fact also send those passwords off to someone who will sell them to the highest bidder? Most of us can’t really prove this, especially if the software is closed source, and we simply have to trust software vendors and their claims. And what about updates? Perhaps version 1.0 of this software works properly, but something happens and version 1.1 suddenly does something malicious. We’re all told to make sure everything’s kept up to date in order to prevent security problems, but the very fact of updating could also cause problems!

There’s a real divide between developers and users, where users simply cannot guarantee that the software does what the vendor says it does and nothing more. Even with open source software, most casual users lack the ability or time to read through and reason about what it actually does. Also, who’s to say that the executable that you install actually matches the compiled code which would be generated from that source code? The beauty of open source of course is that you don’t have to trust executables, you can actually build them yourself. But even then, many people do not have the expertise or time to check through pages of source code, and often use pre-built binaries anyway.

Software and Operating Systems

A piece of software is usually written to work with a particular operating system. The OS provides an interface between the user-space software and the computer itself. If software wants to do something, it needs to ask the OS nicely; such a request is known as a system call, or syscall. The important point here is that anything related to file I/O, network access, etc., is performed on behalf of the software by the OS kernel via syscalls.

So, wouldn’t it be nice to see what a piece of software is doing? If you run Linux, you can do just that with strace (other Unix-like systems have similar tools, such as truss or dtruss). Simply run the following command in a terminal:

strace ls

Suddenly, we can see every syscall made by the ls command as it executed! This is very important: we can use it to glean some information about exactly how our ls command works and how it is able to list the current directory (for Windows users, ls is the equivalent of Windows’ dir).
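The same works for a program of your own. Here is a minimal Rust sketch that does a little file I/O; the function name write_then_read is just my illustration. Run under strace, I would expect each step to show up as syscalls along the lines noted in the comments, although the exact names and order depend on your Rust standard library, libc and architecture.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Writes `text` to `path`, reads it back, then deletes the file.
/// Every step is carried out by the kernel on our behalf via syscalls.
fn write_then_read(path: &Path, text: &str) -> io::Result<String> {
    // Typically visible in strace as openat(...), write(...), close(...).
    fs::write(path, text)?;

    // Typically visible as openat(...), read(...), close(...).
    let contents = fs::read_to_string(path)?;

    // Typically visible as unlink(...) or unlinkat(...).
    fs::remove_file(path)?;

    Ok(contents)
}
```

Call this from a main function, build it, and run the resulting binary under strace to see those calls scroll past among the start-up noise.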

Firewalls

In networking, the concept of a firewall is well established. We all know that some network connections should be allowed while others should be prevented in order to maintain a secure environment.

While strace is extremely useful, it is passive; it audits everything the software asks of the OS and the responses it gets. It would be nice to have something which both audits and controls what software can do, like a firewall.

Description of syswall

I’d like to introduce syswall, a firewall for syscalls.

This is a personal project and is currently very much at the proof-of-concept stage. However, there is a large set of planned features which should make syswall a powerful tool.

The goals of syswall are as follows:

To provide an enhanced version of strace which makes it easier to determine what software is actually doing; and

To provide an environment for testing and experimenting with software by allowing a fine-grained and interactive approach to allowing and disallowing syscalls.

What syswall currently does

Currently syswall is able to intercept all syscalls on the Linux x86_64 platform. That means that any software on this platform can already be run via syswall and it will work correctly.

However, syswall currently only actively handles a few syscalls: those related to file I/O. That means syswall will announce every file that the target program tries to open, read, write and close, and will ask the user’s permission for each.

The user has a small set of responses to each of these queries: they can allow the syscall, allow all syscalls of that type, block the syscall or block all syscalls of that type. When blocking, there are two methods that can be used: a “hard” or a “soft” block. A hard block prevents the OS from running the syscall and returns a “permission denied” error to the program being traced. This way the traced program knows that it was blocked from performing that action. A soft block, on the other hand, also prevents execution but pretends to the program being traced that the call was successful, meaning that the traced program can continue as if it had been allowed.
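To make that decision model concrete, here is a small sketch of how the four responses and the two block types could be represented. This is not syswall’s actual code: the names Decision, Block, Outcome and Policy are hypothetical, and I am assuming that “permission denied” maps to the Linux errno EACCES (13).

```rust
use std::collections::HashMap;

/// Linux errno for "Permission denied" (assumed to be what a hard block returns).
const EACCES: i32 = 13;

/// How a blocked syscall is presented to the traced program.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Block {
    Hard, // the call fails with an error
    Soft, // the call is skipped but reported as successful
}

/// The user's answer to a single query.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Decision {
    AllowOnce,
    AllowAll,
    BlockOnce(Block),
    BlockAll(Block),
}

/// What the tracer should do with the intercepted syscall.
#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Execute,     // let the kernel run it
    Fail(i32),   // skip it and hand back this errno
    FakeSuccess, // skip it and pretend it worked
}

/// Remembers "all syscalls of that type" answers, keyed by syscall name.
#[derive(Default)]
struct Policy {
    remembered: HashMap<String, Decision>,
}

impl Policy {
    /// Uses a remembered answer if there is one, otherwise asks the user.
    fn resolve(&mut self, syscall: &str, ask: impl FnOnce() -> Decision) -> Outcome {
        let decision = match self.remembered.get(syscall).copied() {
            Some(d) => d,
            None => {
                let d = ask();
                // Only the "all" answers are stored for later calls.
                if matches!(d, Decision::AllowAll | Decision::BlockAll(_)) {
                    self.remembered.insert(syscall.to_string(), d);
                }
                d
            }
        };
        match decision {
            Decision::AllowOnce | Decision::AllowAll => Outcome::Execute,
            Decision::BlockOnce(Block::Hard) | Decision::BlockAll(Block::Hard) => {
                Outcome::Fail(EACCES)
            }
            Decision::BlockOnce(Block::Soft) | Decision::BlockAll(Block::Soft) => {
                Outcome::FakeSuccess
            }
        }
    }
}
```

Passing the prompt in as a closure means the user is only asked when no earlier “all” answer already covers that syscall type.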

syswall can also save some of the user decisions to a JSON file, which can then be loaded during subsequent executions.

Finally, while handling syscalls, syswall maintains its own internal state for the program being traced. Once the program has finished, syswall will produce a report detailing what the program did during its execution. As only file-based syscalls are handled at present, this report lists all files that were opened by the program, but it will eventually include other details such as all network connections that were made.
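As a rough illustration of that kind of bookkeeping, the sketch below tracks which file each descriptor refers to and accumulates per-file counters for a final report. The type and method names (ProcessState, FileStats, on_open and so on) and the report layout are my own assumptions for the example, not syswall’s real internals or output.

```rust
use std::collections::BTreeMap;

/// Counters kept for each file the traced program touched.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
struct FileStats {
    opens: u32,
    bytes_read: u64,
    bytes_written: u64,
}

/// State maintained for the traced program while it runs.
#[derive(Debug, Default)]
struct ProcessState {
    open_fds: BTreeMap<i32, String>,     // file descriptor -> path
    files: BTreeMap<String, FileStats>,  // path -> counters
}

impl ProcessState {
    fn on_open(&mut self, fd: i32, path: &str) {
        self.open_fds.insert(fd, path.to_string());
        self.files.entry(path.to_string()).or_default().opens += 1;
    }

    fn on_read(&mut self, fd: i32, bytes: u64) {
        if let Some(stats) = self.stats_for(fd) {
            stats.bytes_read += bytes;
        }
    }

    fn on_write(&mut self, fd: i32, bytes: u64) {
        if let Some(stats) = self.stats_for(fd) {
            stats.bytes_written += bytes;
        }
    }

    fn on_close(&mut self, fd: i32) {
        self.open_fds.remove(&fd);
    }

    /// Looks up the counters for the file currently behind `fd`, if any.
    fn stats_for(&mut self, fd: i32) -> Option<&mut FileStats> {
        let path = self.open_fds.get(&fd)?;
        self.files.get_mut(path)
    }

    /// Builds the end-of-run report, one line per file, sorted by path.
    fn report(&self) -> String {
        let mut out = String::from("Files opened:\n");
        for (path, s) in &self.files {
            out.push_str(&format!(
                "  {}: opened {}x, {} bytes read, {} bytes written\n",
                path, s.opens, s.bytes_read, s.bytes_written
            ));
        }
        out
    }
}
```

Descriptors are reused by the kernel after a close, which is why the sketch keys the counters by path and only uses the descriptor map to find the right entry.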

What syswall will do

The current version is at a very early stage, but it is already a functional tool. There is still a way to go, however. Some of the most important additional features are:

Support for more syscalls, so that it can handle socket (network) connections, etc.;

Improved reporting so that all statistics for all handled syscalls can be displayed;

More fine-grained control, allowing users to allow/block all syscalls with one or more matching arguments;

Tracking of the traced program’s state throughout its execution, providing a step-by-step log of everything the program is doing; and

Comparing a stored log against a log generated from another execution of a program, perhaps a new version.

The last two features are very important. Tracing and saving a log of a program’s execution from a syscall perspective provides an excellent way of seeing what a program is doing. Being able to compare this against another execution can then show us what else a program is doing. Going back to an earlier example, if a new software version introduces some malicious code, we should be able to see this easily by comparing what it does now against what it did before.
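Since this comparison is still a planned feature, the following is only a sketch of the idea, with a made-up LogEntry type and a made-up log format: given a baseline log and a log from a newer version, report every entry that appears only in the newer one.

```rust
use std::collections::BTreeSet;

/// One recorded action: which syscall was made and what it acted upon.
/// (Hypothetical shape; a real log would likely carry more detail.)
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct LogEntry {
    syscall: String,
    target: String,
}

/// Returns the entries in `candidate` that never occur in `baseline`,
/// in order of first appearance and without duplicates.
fn new_behaviour(baseline: &[LogEntry], candidate: &[LogEntry]) -> Vec<LogEntry> {
    let known: BTreeSet<&LogEntry> = baseline.iter().collect();
    let mut seen = BTreeSet::new();
    candidate
        .iter()
        // `seen.insert` returns false for repeats, which filters them out.
        .filter(|entry| !known.contains(*entry) && seen.insert(*entry))
        .cloned()
        .collect()
}
```

With the password manager example from earlier, a version 1.1 log that contains a connect entry missing from the version 1.0 log would surface exactly that entry. A plain set difference like this ignores ordering and counts, so a real comparison would probably need to be smarter about paths that legitimately change between runs.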

Another big feature that I’d like to implement is support for additional platforms. I have created syswall in a modular fashion, separating the program itself from the actual code which handles platform-specific syscalls. As such, additional modules could be created to handle syscalls from 32-bit Linux, macOS, Windows and perhaps even Android. This requires more investigation at this stage, but I am optimistic!
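One way to picture that separation is a trait that each platform module implements, so the core program only ever sees platform-independent events. The sketch below is an assumption about how such a split might look, not syswall’s real module layout; Platform, Event and LinuxAmd64 are names invented for the example. The syscall numbers are the standard Linux x86_64 ones for read, write, open, close and openat.

```rust
/// A syscall translated into a platform-independent event.
#[derive(Debug, PartialEq, Eq)]
enum Event {
    FileOpen,
    FileRead,
    FileWrite,
    FileClose,
    Unhandled(u64), // passed through without interpretation
}

/// Everything the core program needs from a platform-specific module.
trait Platform {
    fn name(&self) -> &'static str;
    /// Maps a raw syscall number to an event the core understands.
    fn decode(&self, number: u64) -> Event;
}

/// Linux on x86_64.
struct LinuxAmd64;

impl Platform for LinuxAmd64 {
    fn name(&self) -> &'static str {
        "linux-x86_64"
    }

    fn decode(&self, number: u64) -> Event {
        match number {
            0 => Event::FileRead,
            1 => Event::FileWrite,
            2 | 257 => Event::FileOpen, // open and openat
            3 => Event::FileClose,
            other => Event::Unhandled(other),
        }
    }
}

/// Core code works against the trait and never sees raw numbers directly.
fn describe(platform: &dyn Platform, number: u64) -> String {
    format!("{}: {:?}", platform.name(), platform.decode(number))
}
```

Supporting another platform would then mean adding one more type that implements the trait with that platform’s syscall numbers, leaving the core untouched.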

Why Rust

I absolutely love Rust! I have been using C and C++ for more low-level programming for around 20 years now and they were always my go-to languages if anything serious needed to be done. For more general-purpose applications, Python is almost always a winner in my book. Rust feels like a combination of the two: serious grunt when needed, with modern and powerful language features which make writing complex software much easier. Additionally, Rust needs only a minimal runtime and uses an LLVM backend, so code can be compiled for many platforms, including embedded environments.

A lot has been said about Rust, and said much better than I could possibly hope to achieve here, so if you have not tried it then I highly recommend reading the excellent Rust Book and giving the language a whirl; you will not regret it!

Conclusion