At linux.conf.au 2015 in Auckland, Matthew Garrett gave a session about the security risks of the Intelligent Platform Management Interface (IPMI). IPMI is largely a tool of convenience for administering server farms; it is implemented on supported motherboards as a feature that sits outside of the usual bootloader and operating system. It should come as no surprise, then, that it can be the source of serious security problems if it is not prepared for properly.

Garrett began by reminding the audience of his previous experiences dissecting the security hazards of various four-letter words—namely ACPI and UEFI. Much like those examples, IPMI is an "out-of-band" feature: motherboard manufacturers implement it in a separate chipset without source code available, which makes it particularly challenging to insulate the main OS against.

Inside IPMI

Though related to ACPI, IPMI does have a different and reasonably well-defined use case. The underlying principle, Garrett said, is that data centers are awful places where no one wants to be: they are noisy, have nowhere to sit, the temperature is unbearable, and the phone reception is terrible. Thus, any technology that enables people to spend less time in their data center is perceived as good.

That is the point of IPMI; it allows system administrators to monitor the health of their servers remotely and to troubleshoot problems—all without requiring access to the machines' OS. This is convenient for server hosting companies, where the customer expects the provider to maintain the machines, but usually not to interfere with the OS. The IPMI interface provides a serial terminal over its own network connection, so the host OS never even needs to know it is running.

Ninety percent of the time, the "management" boils down to rebooting the server, he said, since that solves most computing problems. IPMI simply lets administrators do so without having to walk all the way to the data center and back. The ability for a data center administrator to reboot a server even when the host OS does not want to reboot might be troubling—as could the fact that IPMI can tell the machine to boot from a different device (including network booting). But, Garrett said, the various vendor implementations are where things get truly "interesting" from a security standpoint, since vendors always want to add their own features beyond the specification.

Customizing the IPMI implementation is an attractive prospect for server vendors, he said, since it offers branding opportunities that cannot be removed, no matter what OS the customer installs, as well as the opportunity to incorporate various "value adds" to differentiate against rival vendors. Among the features server vendors add to IPMI are the ability to perform firmware updates, support for virtual hardware devices like a keyboard/monitor combination or CD drive, network service discovery, various web services, and what Garrett called "magic GPUs"—IPMI graphics adapters that can directly grab the contents of the server's actual GPU and make it available over the IPMI interface.

The magic GPUs and virtual hardware devices essentially allow the remote IPMI user to interact with the machine as if they were physically present—but bypassing the host OS entirely, with obvious potential for abuse by data-center administrators. The service-discovery and web-server features, in contrast, present potential attack vectors to people other than the system administrators who happen to be on the same network as the server.

Furthermore, IPMI is implemented on motherboards in a baseboard management controller (BMC), an embedded microcontroller that cannot be removed from the motherboard. BMC CPUs tend to be in the 600-to-800MHz range, Garrett said, with several megabytes of RAM. A variety of processor architectures are used, and almost all BMCs run Linux (with the notable exception of HP's, which run a specialized realtime OS).

The final piece of IPMI design that Garrett explained was the user-authentication scheme. The specification originally did not involve encryption at all, but it was updated in 2004. Now, it presents "a bewildering array of encryption and authentication protocols" from which to choose—but it still is not very good.

For one thing, he said, as part of the authentication handshake process, the BMC sends a hashed version of the correct password back to the user. Thus, all an attacker needs to do is guess one valid username, grab the hashed password, and run a cracking program against it. But even this is not always necessary, he added, since one of the options in the bewildering array is "Cipher Zero"—which, as the name suggests, involves no encryption or authentication at all. Luckily, he added, most vendors released an update to disable Cipher Zero, although not everyone has applied it.

Implementation details

The potential problems with IPMI and the various server-vendor–specific features are one set of threats, Garrett said, but another class entirely comes from implementation problems. Almost all IPMI implementations come from one of two vendors: Avocent (used in Dell, IBM, and Cisco machines) and AMI (used by almost everyone else). There is also a lot of common code shared between the two—for example, both use an embedded web server written by atWare and similar plugins. Bugs found in AMI's implementation of a basic feature tend to be found in Avocent's as well (and vice-versa). To their credit, he said, he has not yet found anything too egregious.

The server vendors' extensions, however, seem to be noticeably worse. They expose a lot of vulnerable interfaces, he said, such as CGI or PHP (the latter with extensions that allow embedded C code). They tend to leave a lot of services running, from SSH and Telnet to non-web GUI applications. Many servers support the Web Services-Management (WSMan) interface, which allows users to query server status with HTTP requests—and sometimes much more, like pushing new firmware. Garrett then referred the audience to firmware_config, a tool he developed for querying and setting firmware configuration options for a variety of IPMI-enabled server brands.

Garrett concluded with a tour of specific, egregious IPMI vulnerabilities he has encountered in the wild, noting that "you're here because you want to see me get really angry." The first example was from Dell's IPMI implementation: on that system, the IPMI console allows arbitrary command execution by appending commands as an "argument" to one of the real IPMI commands. So, for example, typing:

firmware;/bin/sh

runs the firmware IPMI command, then gives the user a root shell. Another flaw involved the XML parser, which could be crashed and made to dump core by simply feeding it a set of tags in the wrong order, such as:

</USERNAME><USERNAME>

Other commands, he said, are not written defensively—such as one instance of /sbin/pm that contained:

"/bin/rm -f %s%s*"

which, he noted, might accidentally or maliciously be exploited to delete important files in certain circumstances.
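
Both flaws share the same underlying anti-pattern: a user-influenced string is interpolated into a shell command and handed to system() without sanitization. A minimal hypothetical sketch of that pattern (illustrative code, not the actual vendor source):

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch of the vulnerable pattern, not vendor code. */
static void remove_matching_files(const char *dir, const char *name)
{
    char cmd[256];

    /* If 'name' is attacker-influenced, input such as
       "; rm -rf /" or "../../" escapes the intended file set. */
    snprintf(cmd, sizeof(cmd), "/bin/rm -f %s%s*", dir, name);
    system(cmd);
}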

It is very hard to "fix" IPMI implementations, Garrett said. BMCs are not removable and they cannot be switched off. They piggyback on motherboards' existing network ports and are configured to connect to the network at startup with DHCP. Furthermore, BMCs are fairly easy to discover, since they include SSL certificates that expose the server's serial number. Garrett offered an "educated guess" that scanning the entire IPv4 address space to look for server serial numbers in BMC SSL certificates could be done in well under an hour—gleaning around 100,000 servers. That information could be exploited by an attacker to track down owner information of all of the vulnerable servers.

The good news, he said, is that the industry has realized the problems with IPMI and is in the process of replacing it—although the replacement is not guaranteed to be better. In the meantime, Garrett advised server owners to put their BMCs onto a separate network, filter out all incoming IPMI traffic, and make sure all of the available updates get applied—quickly.

[The author would like to thank LCA 2015 for travel assistance to Auckland.]


The Linux Test Project (LTP) is an automated kernel test suite designed "to validate the reliability, robustness, and stability of Linux". An earlier article introduced LTP and explained how to run the test cases. This article will focus on how to write an LTP test case.

The main advantage of writing an LTP test case is the availability of the test library. That library captures useful code patterns that recur in test development. For example, it simplifies test setup significantly by providing "safe" macros that wrap most of the POSIX interfaces; these wrappers abort the test safely if the call to the particular function fails. It is also the place where test-suite-wide parameters are defined, environment variables are parsed, and so on.

The library is, on the other hand, designed to be optional. It is entirely possible to create an LTP-compatible test case just by returning the right exit status at the end of the test case. For new test cases, though, using the test library eases development and is generally preferred.

An example test case

Let's dive into a real-world test example first. The test we will look at is a simple test for mount() errors.

The test starts with a short description explaining what the test does. Each test case is stored in an array of structures; each entry holds parameters to pass to the mount() call and also the expected result. For this test case, these are all failures with the specified errno value. Although this is not always the case, this pattern is common for simple system call test cases.
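
A condensed sketch of that pattern (the field names here are illustrative, not the exact definitions from the test source):

static struct test_case {
    char *device;          /* source argument passed to mount() */
    char *mntpoint;        /* target directory */
    char *fs_type;         /* filesystem type */
    unsigned long flags;   /* mount flags */
    int exp_errno;         /* errno the call is expected to set */
} test_cases[] = {
    { "/dev/nonexistent", "mntpoint", "ext2", 0, ENODEV },
    { "device", "mntpoint", "ext2", 0, ENOTBLK },
};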

The overall test setup is done in the setup() function and overall cleanup in the cleanup() function. The setup() function is called only once at the start of main(), but it's an unwritten convention for it to be a separate function. On the other hand, the cleanup() function may be called from various parts of the test, even from setup(), and therefore should be able to handle unfinished test initialization as well. The final remaining piece of this big-picture overview is the verify_mount() function, which takes a pointer to a test case and runs the actual test.

If you are wondering why the test does not test conditions where mount() succeeds, the answer is simple. Keeping both negative and positive test cases in one source file would complicate the design, because the steps to check the expected result are completely different. The positive test cases are implemented in mount01.c and mount03.c.

Closer look at the code

The main header for the test API is include/test.h. Since the LTP build system passes the correct path to the include directory, the file is simply included as test.h. There is also usctest.h, which contains a few more macros, but only the TEST() macro (described below) is widely used. That header file is the last remnant from earlier times that has not been touched in the large-scale cleanups; it will likely be removed in the future.

Each test must define two global variables. The TCID variable is a string containing the test identifier, which is usually the same as the test's filename without the file extension. The TST_TOTAL variable stores the total number of test cases in the test. The test should report the corresponding number of passes and fails unless the execution was interrupted prematurely. These variables are used by the test library for printing messages. In addition, the TCID is used for the filename template when the test temporary file is created.
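
For the mount02 test, the declarations would look something like this (a sketch of the convention, not copied verbatim from the source):

char *TCID = "mount02";  /* test identifier used in messages */
int TST_TOTAL = 12;      /* number of test cases in this test */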

The first thing the test does after entering main() is call the library function parse_opts() to parse the standard test options. These options can be used, for example, to run the test N times or to run it for T seconds. If a test needs test-specific options, they can be handled as well by passing in a structure defining the parameters and a pointer to a function that prints help information. The TEST_LOOPING() macro is used in the test's main loop; it uses the information gathered by parse_opts() and evaluates to true while the test case should continue to execute.
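
Put together, the skeleton of main() looks roughly like this (a sketch modeled on mount02.c and the hypothetical test_cases table above, abridged rather than quoted):

int main(int ac, char **av)
{
    int lc, i;
    const char *msg;

    /* Parse the standard options (-i N, -t T, and so on). */
    msg = parse_opts(ac, av, NULL, NULL);
    if (msg != NULL)
        tst_brkm(TBROK, NULL, "OPTION PARSING ERROR - %s", msg);

    setup();

    /* Keep looping for as long as the standard options request. */
    for (lc = 0; TEST_LOOPING(lc); lc++)
        for (i = 0; i < TST_TOTAL; i++)
            verify_mount(&test_cases[i]);

    cleanup();
    tst_exit();
}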

Test results are reported by printf()-like functions that take a few extra parameters. The example test uses tst_resm() (to report results without exiting the test) and tst_brkm() (to exit the test), which are the most commonly used. The main advantage of these functions is that the overall test result is stored internally by the test library, so there is no need to propagate it manually in the code.

The result types are bit flags and the overall result is a bit field. The possible values are TPASS, TFAIL, TBROK, TWARN, and TCONF. TPASS and TFAIL are self-explanatory. TBROK means that something unexpected happened in the test setup and the test was aborted. TWARN means that something unexpected happened but the test carried on. TCONF means that the test is not suitable for the current configuration. The overall test exit value is composed of these bit flags. Note that TPASS is actually defined as zero, so that the return value for a successful test is also zero.
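
Abridged and from memory (consult include/tst_res_flags.h for the authoritative definitions), the flags look roughly like:

#define TPASS 0   /* all assertions passed */
#define TFAIL 1   /* an assertion failed */
#define TBROK 2   /* test setup broke, test aborted */
#define TWARN 4   /* something unexpected, test carried on */
#define TINFO 16  /* informational message, no effect on exit value */
#define TCONF 32  /* test not suitable for this configuration */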

A subset of the printf()-like functions will also exit the test execution immediately. In the example code, the tst_brkm() call will exit. Such functions also include a cleanup callback parameter that, if it is not NULL, is called before the test exits. The cleanup callback is usually a pointer to the overall test cleanup function that is also called at the end of the test. It frees the resources claimed by the test case, which are usually a temporary directory, open file descriptors, loopback devices, and so on.

There are also several additional features these functions bring to the table. For example, all non-success messages include the filename and line number automatically in order to easily track back to the proper location in the test source code.

The tst_sig() library function is used in setup() to install "poisoned" signal handlers that end test case execution when an unexpected signal arrives. The tst_require_root() library function will exit the test unless the process runs as root (has EUID == 0).

LTP contains two library functions to help with temporary directory creation and deletion. The first is tst_tmpdir(), which creates a unique test temporary directory under $TMPDIR and also changes the current working directory to it. The companion function, tst_rmdir(), is called from the test cleanup to delete the directory recursively, so there is no need to remove individual temporary files and directories.
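
Combining the pieces described so far, a typical setup() and cleanup() pair might look like the following sketch (the exact tst_require_root() signature has varied across LTP versions):

static void cleanup(void)
{
    /* Remove the temporary directory and everything inside it. */
    tst_rmdir();
}

static void setup(void)
{
    /* Exit the test unless we are running as root. */
    tst_require_root(NULL);

    /* End the test cleanly if an unexpected signal arrives. */
    tst_sig(NOFORK, DEF_HANDLER, cleanup);

    /* Create a unique temporary directory and chdir into it. */
    tst_tmpdir();
}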

Once the temporary directory is created, the test can proceed with creating test files and directories. The so-called safe macros (e.g., SAFE_OPEN(), SAFE_MKDIR()) implement wrappers for most of the POSIX interfaces and for a few more common tasks. These macros will return to the caller only if they were successful. Similar to tst_brkm(), the macros take a cleanup callback parameter that will be called before the test exits in case of failure.

If the call fails, the safe macros produce an error message with as much relevant information as possible. That message will include the source filename and line, the parameters passed to the call in a human readable form, errno if applicable, and so forth.

All safe macros are defined in the include/safe_macros.h header located under the LTP source tree. There are also some safe file operations defined in include/safe_file_ops.h. These include SAFE_FILE_SCANF() and SAFE_FILE_PRINTF(), which are especially useful for reading and writing values in virtual filesystems such as procfs and sysfs.
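
For illustration, a few calls as they might appear in a test body (the paths and values here are hypothetical):

/* The first argument is the cleanup callback invoked on failure. */
fd = SAFE_OPEN(cleanup, "testfile", O_CREAT | O_RDWR, 0644);
SAFE_MKDIR(cleanup, "mntpoint", 0755);

/* Read a single integer from a procfs file. */
SAFE_FILE_SCANF(cleanup, "/proc/sys/kernel/pid_max", "%d", &pid_max);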

Our example test needs to work with a block device, so it makes use of three library functions designed for that purpose. The first two, tst_dev_fs_type() and tst_acquire_device(), return the filesystem type and the path to the device to be used for the testing, respectively. If no device was passed to the top-level test script, the test library will prepare a suitable loop device. The call to tst_mkfs() will call mkfs to format the device with a filesystem; it also handles extra parameters that are needed for certain filesystem types.
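
In setup(), that sequence might look like the following sketch (abridged; the real test's error handling is more thorough):

const char *fs_type;
const char *device;

fs_type = tst_dev_fs_type();
device = tst_acquire_device(cleanup);
if (!device)
    tst_brkm(TCONF, cleanup, "Failed to obtain block device");

/* Format the device; the last argument carries fs-specific options. */
tst_mkfs(cleanup, device, fs_type, NULL);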

Now that we have described the setup process, let's get back to main() and have a look at the test's main loop. Apart from the previously explained TEST_LOOPING() macro, what the test does is loop over the structure that describes all of the test cases and call verify_mount() to actually run the test for each of them. The TEST() macro makes the actual mount() call. It is just shorthand for:

errno = 0;
TEST_RETURN = call();
TEST_ERRNO = errno;

TEST_RETURN and TEST_ERRNO are declared inside the LTP library and are used in the test output.
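
A sketch of how verify_mount() might use TEST(), simplified from the real test (TTERRNO appends the errno value to the message):

static void verify_mount(struct test_case *tc)
{
    TEST(mount(tc->device, tc->mntpoint, tc->fs_type, tc->flags, NULL));

    if (TEST_RETURN != -1) {
        tst_resm(TFAIL, "mount() succeeded unexpectedly");
        return;
    }

    if (TEST_ERRNO == tc->exp_errno)
        tst_resm(TPASS | TTERRNO, "mount() failed expectedly");
    else
        tst_resm(TFAIL | TTERRNO, "mount() failed unexpectedly");
}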

Output from the example test case:

mount02 0 TINFO : Found free device '/dev/loop0'
mount02 0 TINFO : Formatting /dev/loop0 with ext2 extra opts=''
mke2fs 1.42.10 (18-May-2014)
mount02 1 TPASS : mount() failed expectedly: TEST_ERRNO=ENODEV(19): No such device
mount02 2 TPASS : mount() failed expectedly: TEST_ERRNO=ENOTBLK(15): Block device required
mount02 3 TPASS : mount() failed expectedly: TEST_ERRNO=EBUSY(16): Device or resource busy
mount02 4 TPASS : mount() failed expectedly: TEST_ERRNO=EBUSY(16): Device or resource busy
mount02 5 TPASS : mount() failed expectedly: TEST_ERRNO=EINVAL(22): Invalid argument
mount02 6 TPASS : mount() failed expectedly: TEST_ERRNO=EINVAL(22): Invalid argument
mount02 7 TPASS : mount() failed expectedly: TEST_ERRNO=EINVAL(22): Invalid argument
mount02 8 TPASS : mount() failed expectedly: TEST_ERRNO=EFAULT(14): Bad address
mount02 9 TPASS : mount() failed expectedly: TEST_ERRNO=EFAULT(14): Bad address
mount02 10 TPASS : mount() failed expectedly: TEST_ERRNO=ENAMETOOLONG(36): File name too long
mount02 11 TPASS : mount() failed expectedly: TEST_ERRNO=ENOENT(2): No such file or directory
mount02 12 TPASS : mount() failed expectedly: TEST_ERRNO=ENOTDIR(20): Not a directory

Example output on a kernel without support for loop devices:

mount02 0 TINFO : Couldn't find free loop device
mount02 1 TCONF : mount02.c:195: Failed to obtain block device
mount02 2 TCONF : mount02.c:195: Remaining cases not appropriate for configuration

Parent-child synchronization, loop devices, and further reading

The LTP test library contains lots more functionality to cover less common, but still recurring, patterns. For example, if part of the test runs in a child process, LTP has the tst_record_childstatus() function, which waits for the child to exit and applies its exit status to the parent's test results. In addition, to ease parent-child synchronization, LTP implements FIFO-based synchronization primitives. Test cases that involve several threads may fail horribly if the cleanup callback is entered from several threads at once; to avoid this, LTP has the TST_DECLARE_ONCE_FN() macro to create a thread-safe cleanup callback.

The library also includes runtime detection for the kernel version, filesystem type, filesystem free space, and more. The LTP build system and library can also be used to build and load kernel modules. Some test cases can be written as shell scripts by using the reporting functions and parts of the library API that are implemented in the shell library.

For comprehensive API documentation, consult the Test Writing Guidelines.

How to port test cases to LTP

Porting existing test cases to LTP is a pretty straightforward process. First of all, the tests must be split into separate executables (one executable per assertion or group of similar assertions) that can run without any manual intervention. Given the size of the project, it is important that all test source files have unique names; ideally, all files that are part of a certain test suite should start with a common prefix.

The test results must be propagated to the exit value in the expected format. The LTP test exit value is a bit field defined in include/tst_res_flags.h. Using the LTP test-reporting functions is preferred, but not strictly required.

The last step is to create a runtest entry that tells the LTP test execution framework which binaries should be executed in a test run. Runtest files are stored under the runtest/ directory and their names are (hopefully) self-explanatory. If none of the existing runtest files seems right, new files can be created (the set of runtest files used for a default run is stored in the scenario_groups/default file). The runtest file format is simple: all characters up to the first whitespace are the unique test name that appears in the test logs; the rest of the line is the command line to be executed.

For example, the runtest entry for the mount02 test case we have been describing is as follows:

mount02 mount02

Help wanted

The easiest way to start contributing is to run LTP and look at the results. If there are failed test cases, report them on the mailing list—or, even better, send a patch. Review of test cases, especially complex ones in the area of your expertise, is always welcomed as well.

The size of the interface between the kernel and user space grows faster than the number of test cases, so more tests are always needed. To see what is missing, just compare the list in man 2 syscalls with the content of the testcases/kernel/syscalls directory. Off the top of my head, a few that are missing are open_by_handle_at(), memfd_create(), and getrandom(), as well as additional system call flags such as O_TMPFILE and O_BENEATH.

Writing a functional test case is as easy as calling the system call and checking that the result matches the documentation. Beyond that, there are numerous kernel interfaces that are not covered at all: in the kernel input subsystem, for example, quite a lot could be tested using the uinput interface.

Conclusion

I hope that readers now see how the LTP test library can simplify the job of writing automated test cases. Questions, suggestions, patches, code that can be turned into automated test cases, etc. can be directed to the project's mailing list.


At linux.conf.au 2015 in Auckland, Rusty Russell presented a talk about his personal side-project, Pettycoin. Russell had announced Pettycoin at LCA 2014; at that time it represented an untested concept: a way to attach a separate, Bitcoin-like network to the existing Bitcoin blockchain. Pettycoin's goal was originally to offer a simpler and faster "side network" that periodically reconnected to Bitcoin. In the intervening year, Russell made a lot of progress, but other new innovations in the Bitcoin arena have led him to question parts of the Pettycoin approach and consider a reimplementation.

Russell began with a recap. The problem he set out to solve is that Bitcoin mining is an expensive task, with upfront hardware and ongoing power costs. Those expenses place a lower bound on the transaction fees that accompany every Bitcoin transaction, which in turn means there is an inherent limit to how small a "microtransaction" can be. Pettycoin was envisioned as a Bitcoin-like "adjunct" service optimized for microtransactions with a quick turnaround.

He designed it to function as a bi-directional gateway to the existing Bitcoin network. Small amounts could be removed from the Bitcoin network, converted to Pettycoins, and exchanged as needed; when the transactions were complete, the Pettycoin owners could re-enter the Bitcoin network and receive the correct amount (in Bitcoins) for their Pettycoin balances.

But Pettycoin would also have some distinct properties that differ from how Bitcoin operates. First, it would be limited to small amounts and use a simplified transaction protocol, features that would enable the Pettycoin network to have fast block times (i.e., the average amount of time between new blocks being mined, which corresponds to the wait time required to verify a transaction). Bitcoin's block time is approximately ten minutes, while Russell aimed for ten seconds with Pettycoin. Second, it would have a time-limit "horizon"—after one month, unclaimed Pettycoin transactions would be automatically returned to the Bitcoin network. That lowers the risk of participation by making it hard to permanently lose Bitcoins.

Third, it would require only "partial knowledge" from clients—in other words, while every Bitcoin client has to know every transaction in the Bitcoin blockchain just to participate, in Pettycoin it would be sufficient for each transaction to be known by someone on the network. That significantly reduces the cost of participating as a client, which also bolsters the idea of offering fast transaction times. Finally, it would offer a payback mechanism to reward participants: each miner that processes a transaction would receive a small percentage of future transaction fees as payment, since the Pettycoin network would not, itself, mine new coins to serve as a reward.

Russell got approval to work on the project during a sabbatical from his day job at IBM, and took six months in mid-2014 to explore the ideas and develop an implementation. When he announced the first working code, though, he ran into a challenge: it was difficult to explain to people what the purpose of the project was. There is a glut of so-called "altcoins" on the Internet: projects that mimic Bitcoin, usually with some superficial change, and often building on Bitcoin software. Examples of altcoins include Litecoin and Dogecoin.

Lacking a better option to publicize the new project, Russell announced its existence on the main altcoin discussion forum, where it gained virtually no attention. In part, this was because of the "orders of magnitude of noise" outweighing meaningful discussion on the altcoin forum, he said, and in part because no one understood what Pettycoin was meant to be. Either way, it was considerably harder to get in touch with other developers who might be interested in the project.

But things changed rather suddenly in October, when Adam Back, Greg Maxwell, and others published Enabling Blockchain Innovations with Pegged Sidechains [PDF] (a.k.a., "the sidechains paper"), a white paper that "explained the right way to do this." As described in the paper, sidechains enable the Bitcoin network to be used as backbone resource to which other, independent cryptocurrencies can be attached. They are, in essence, an implementation of the same concept Russell was exploring with Pettycoin.

Russell then took what he termed "a massive detour" to explain sidechains. The concept is, naturally, a rather involved one, and Russell included considerable supplementary information in his slides that he labeled "caveats and notes"—skipping over them during the talk, but leaving them for further perusal. Interested readers would do well to have a look at the talk slides [PDF] for more detail.

The general idea, though, can be described in a few broad strokes. Every Bitcoin transaction is recorded with inputs (payers) and outputs (payees)—except for transactions that correspond to minting new Bitcoins, which have no inputs and, thus, have a dummy input field. A sidechain can be created by specifying a meaningful value in this dummy field—specifically, the transaction record of the start of the sidechain (which, like all Bitcoin transaction records, is hashed). This records the creation of the sidechain in the main Bitcoin network, so it can be verified later; the dummy input field has no effect on the other Bitcoin transactions recorded in the same Bitcoin block.

With the addition of two special functions to the Bitcoin protocol—one that moves Bitcoins from the main network to the sidechain, and one that returns Bitcoins to the main network—sidechain users could use "real" Bitcoins to conduct their transactions, and easily convert them back to the more widely-accepted Bitcoin standard. But, apart from the processing needed to transfer in and out of the Bitcoin network, sidechains are essentially untethered: they can operate according to different rules, experiment, and even implement other kinds of transactions. In short, they could be used to implement many of Pettycoin's ideas. Furthermore, client applications can simultaneously mine Bitcoins and sidechains, since sidechains can be initialized with Bitcoin blocks. Thus, there is no additional "cost" to the Bitcoin network for anyone to implement sidechains.

Sidechains are a new idea, and one that is still undergoing considerable debate in the Bitcoin community—alterations to the Bitcoin protocol are a serious matter. But, Russell said, a significant side-effect of the sidechain paper was that he suddenly found a lot of people interested in discussing his Pettycoin ideas.

The obvious question is whether or not Russell would re-implement Pettycoin as a sidechain project. On that topic, he said he had already learned a number of things from the sidechain paper and from subsequent discussions that would simplify a Pettycoin re-implementation.

For example, he has reconsidered how Pettycoin implements "partial knowledge" support. Sidechains enable some of the partial-knowledge features Russell had defined for Pettycoin, but more robustly. In particular, sidechains allow clients to spot (and, more importantly, to prove) when another client is trying to publish a fraudulent transaction—such as spending the same Bitcoin twice or spending a nonexistent Bitcoin balance—without requiring the client to process the entire historical Bitcoin blockchain.

In the original Pettycoin implementation, clients recorded a back-reference in their transactions—essentially, a pointer to where the Pettycoin balance they were spending came from. Other clients could prove a transaction was fake by looking up the back-reference and showing that it did not contain the balance that the spender claimed it did. Sidechains use a more complicated solution, tracking all unspent transaction balances and accompanying each balance with a proof (namely, a tree of the preceding transactions, which can be verified by checking its hash value). More data is required, but including the proof makes spotting phony transactions simpler and faster.

Russell also decided that Pettycoin's mechanism for paying miners transaction fees was too unpredictable and that his approach to achieving ten-second average block times was flawed (since it requires headers that are an order of magnitude larger than Bitcoin's and still, in some percentage of blocks, results in extremely long block times).

The upshot, he concluded, is that the Bitcoin world "has moved on" in a significant fashion since Pettycoin was first developed. He will have to make the system more Bitcoin-like, which essentially means he will have to rebase his code on the Bitcoin reference code. On the other hand, he said, the good news is that there is now a well-understood word for what Pettycoin is: a sidechain. And that makes it easier to talk about and to recruit developers with.

Sidechains have the potential to spur enormous innovation in the developer community, since they leverage Bitcoin's advantages (such as decentralization and proof of work) but allow new projects to establish their own rules. Pettycoin will be making a comeback at some point (although Russell did not have a time frame to announce) in the new sidechain world, but it will likely be far from the only experiment worth watching.

[The author would like to thank LCA 2015 for travel assistance to Auckland.]
