
February 24, 2014

Volume 12, issue 2

The Logic of Logging

And the illogic of PDF

George Neville-Neil

Dear KV,

I work in a pretty open environment, and by open I mean that many people have the ability to become the root user on our servers so that they can fix things as they break. When the company started, there were only a few of us to do all the work, and people with different responsibilities had to jump in to help if a server died or a process got away from us. That was several years ago, but there are still many people who have rootly powers, some because of legacy and some because they are deemed too important to restrict. The problem is that one of these legacy users insists on doing almost everything as root and, in fact, uses the sudo command only to execute sudo su -. Every time I need to debug a system this person has worked on, I wind up on a two- to four-hour log-spelunking tour because he also does not take notes on what he has done, and when he's finished he simply reports, "It's fixed." I think you will agree this is maddening behavior.

Routed by Root

Dear Routed,

I would like to tell you that you can do one thing and then say, "It's fixed," but I can't tell you that. I could also tell you to take off the tip of his pinky the next time he does this, but I bet HR frowns on Japanese gangster rituals at work.

What you have is more of a cultural problem than a technical one because, as you suggest in your letter, there is a technical solution to the problem of giving users auditable root access to systems. Very few people or organizations wish to run their systems at military-style security levels, and for the most part they are right not to, as those types of systems involve a lot of overkill—and, as we've seen of late, still fail to work.

In most environments it is sufficient to allow the great majority of staff access only to their own files and data, and to give broader powers to the minority whose roles truly require them. Even this trusted minority must not be given blanket authority over systems; they, too, should be granted only the limited access they need, which a system such as sudo makes pretty simple. The right to run any one program can be whitelisted by user or group, and keeping that whitelist short is the best way to protect systems.
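With sudo, such a whitelist is only a few lines in the sudoers file. A minimal sketch—the group name, hosts, and commands here are invented for illustration, not taken from any real configuration:

```
# /etc/sudoers fragment -- edit only with visudo.
# Members of the "dbops" group may restart the database and read its
# logs on the database hosts, and nothing else.
Host_Alias  DBHOSTS = db1, db2
Cmnd_Alias  DBCMNDS = /usr/sbin/service postgresql restart, \
                      /usr/bin/less /var/log/postgresql/*
%dbops      DBHOSTS = (root) DBCMNDS
```

Every command run this way is logged with the invoking user's name, which is exactly what `sudo su -` throws away.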

Now we come to your point about logging, which is really about auditability. Many people dislike logging because they think it's like being watched, and, it turns out, most people don't like to be watched. Anyone in a position of trust ought to understand that trust needs to be verified to be maintained and that logging is one way of maintaining trust among a group of people. Logging is also a way of answering the age-old question, "Who did what to whom and when?" In fact, this is all alluded to in the message that sudo spits out when you first attempt to use it:

We trust you have received the usual lecture from the local System Administrator. It usually boils down to these three things:

#1) Respect the privacy of others.

#2) Think before you type.

#3) With great power comes great responsibility.

Item #3 is a quote from the comic book Spider-Man, but Stan Lee's words are germane in this context. Root users in a Unix system can do just about anything they want, either maliciously or because they aren't thinking sufficiently when they type. Frequently, the only way of bringing a system back into working order is to figure out what was done to it by a user with rootly powers, and the only way of having a chance of doing that is if the actions were logged somewhere.
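On most Linux systems sudo writes one syslog line per command, so "who did what to whom and when" has a mechanical answer. A sketch in Python, parsing a single sample entry; the log path and exact format vary by system, so the line and the regular expression here are illustrative:

```python
import re

# A typical sudo syslog entry, as found in /var/log/auth.log or the
# journal; the precise format differs slightly across systems.
LOG_LINE = ("Feb 24 11:02:17 db1 sudo: alice : TTY=pts/0 ; "
            "PWD=/home/alice ; USER=root ; COMMAND=/sbin/reboot")

# Who did what, as whom, where, and when.
ENTRY = re.compile(
    r"^(?P<when>\w+ +\d+ [\d:]+) (?P<host>\S+) sudo:\s+(?P<who>\S+) : "
    r".*USER=(?P<as>\S+) ; COMMAND=(?P<what>.*)$")

m = ENTRY.match(LOG_LINE)
print(f"{m['who']} ran {m['what']} as {m['as']} "
      f"on {m['host']} at {m['when']}")
```

Run over a real log, a loop like this turns a four-hour spelunking tour into a grep-length errand—provided the culprit used sudo for each command instead of `sudo su -`.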

If I were this person's manager, I would either remove his sudo rights completely until he learned how to play well with others, or I would fire him. No one is so useful to a company that they should be allowed to play god without oversight.

KV

Dear KV,

I was recently implementing new code based on an existing specification. The documentation is 30 pages of tables with names and values. Unfortunately, this documentation was available only as a PDF, which meant that there was no easy, programmatic way of extracting the tables into code. What would have taken five minutes with a few scripts turned into a half day of copy and paste, with all the errors that that implies. I know that many specifications—Internet RFCs (Requests for Comments), for example—are still published in plain ASCII text, so I don't understand why anyone would publish a spec destined to be code in something as programmatically hard to work with as PDF.

Copied, Pasted, Spindled, Mutilated

Dear Mutilated,

Clearly you do not appreciate the beauty and majesty implicit in modern desktop publishing. The use of fonts—many, many fonts—and bold and underline increases the clarity of the written words as surely as if they had been etched in stone by a hand of flame.

You are either facing a problem of overwhelming self-importance, as when certain people—often in the marketing departments of large companies—start sending e-mail with bold, underline, and colors, because they think that will make us pay attention, or you're dealing with someone who writes standards but never implements them. In either case this brings up a point about not only documentation, but also code.

Other than in mythology, standards and code are not written in stone. Any document that must change should be trackable in a change-tracking system such as a source-code control system, and this is as true for documents as it is for code that will be compiled. The advantage of keeping any document in a trackable and diffable format is that it is also possible to extract information from it using standard text-manipulation programs such as sed, cut, and grep, as well as with scripting languages such as Perl and Python. While it's true that other computer languages can also handle text manipulation, I find that many people doing what you suggest are doing it with Perl and Python.
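Had the 30 pages of tables been plain text, the half day of copy and paste collapses into a few lines of, say, Python. A sketch—the table layout and the names in it are invented for illustration; a real spec's columns would differ:

```python
import re

# A fragment of an imagined plain-text spec table: name, value, meaning.
SPEC = """\
MSG_HELLO      0x01   initial handshake
MSG_DATA       0x02   payload frame
MSG_GOODBYE    0xFF   orderly shutdown
"""

# One regex pass turns the table into a name -> value dictionary.
TABLE_ROW = re.compile(r"^(\w+)\s+0x([0-9A-Fa-f]{2})\s+", re.MULTILINE)
values = {name: int(hexval, 16) for name, hexval in TABLE_ROW.findall(SPEC)}

print(values)  # {'MSG_HELLO': 1, 'MSG_DATA': 2, 'MSG_GOODBYE': 255}
```

The same five-minute script works with sed or Perl; the point is that none of it works at all when the only source is a PDF.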

I, too, would like to live in a world where—when someone updated a software specification—I could run a set of programs over the document, extract the values, and compare them with those that are extant in my code. It would both reduce errors and increase the speed with which updates could be made. The differences might still need to be checked visually, and I would check mine visually out of basic paranoia and mistrust, but this would be far easier than either printing a PDF file and going over it with a pen or using a PDF reader to do the same job electronically. While there are programs that will tear down PDFs for you, none is very good; the biggest problem is that if the authors had thought for 30 seconds before opening Word to write their spec, none would be necessary.
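That wished-for comparison is itself a short script once the spec is diffable text. A sketch, with both sides invented for illustration—one "code" value is deliberately stale to show what the check catches:

```python
# Values currently baked into our code (one deliberately stale).
IN_CODE = {"MSG_HELLO": 0x01, "MSG_DATA": 0x03, "MSG_GOODBYE": 0xFF}

# Values freshly extracted from the (hypothetical) updated text spec.
IN_SPEC = {"MSG_HELLO": 0x01, "MSG_DATA": 0x02, "MSG_GOODBYE": 0xFF}

# Every name whose value disagrees, plus names only one side defines.
mismatches = [name for name in sorted(IN_CODE.keys() | IN_SPEC.keys())
              if IN_CODE.get(name) != IN_SPEC.get(name)]

for name in mismatches:
    print(f"{name}: code has {IN_CODE.get(name)}, spec says {IN_SPEC.get(name)}")
```

The visual check out of basic paranoia still applies, but now it is a check of a short diff rather than of 30 pages.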

There are many things one can complain about with respect to the IETF (Internet Engineering Task Force), but the commitment to continue to publish its protocols in text format is not one of them. Alas, not everyone had Jon Postel to guide them through their first couple of thousand documents, but they can still learn from the example.

KV

LOVE IT, HATE IT? LET US KNOW

[email protected]

KODE VICIOUS, known to mere mortals as George V. Neville-Neil, works on networking and operating system code for fun and profit. He also teaches courses on various subjects related to programming. His areas of interest are code spelunking, operating systems, and rewriting your bad code (OK, maybe not that last one). He earned his bachelor's degree in computer science at Northeastern University in Boston, Massachusetts, and is a member of ACM, the Usenix Association, and IEEE. He is an avid bicyclist and traveler who currently lives in New York City.

© 2014 ACM 1542-7730/14/0200 $10.00








