How Logging Made me a Better Developer

I think logging is one of the most powerful tools in a programmer's toolbox. Unfortunately, it's supremely underutilized. Logging gives you data about what your code is doing. Over the past couple years, I've found logging helps me to manage complexity, communicate my code to other developers (and vice versa), and allows me to rapidly develop and debug software. As simple and silly as it may seem; for me, logging is unequivocally the most beneficial tool in my toolbox.

This is the first part in a two part series on logging. This post describes why I think logging is important on an abstract level, and the second post will discuss a concrete implementation of a Javascript logging library that uses the concepts outlined here

When I first started learning to program, I remember using log statements a lot. Almost all introductory programming classes or tutorials demonstrate concepts by using logging, or print(), or some other function to print output to the console. What quicker way exists to verify your code is doing what you expect it to (or doing anything at all)? As I learned more about how to program, I stopped littering my code with log statements. I thought logging was a sort of novice thing to do. Over the past couple years, however, I've gone back to my logging roots.

Many of my peers have little to no logging in their codebases, and many - if not most - popular third party libraries are devoid of any logging. Of course there are exceptions, but I'm confused why finding well logged code is not the norm. Why is this? I don't know, and I'm not going to try to answer it. I can only share what has worked for me, and why I've found logging to be incredibly useful.

Why Log?

Liberal use of logging has a plethora of benefits. At the highest level, programs are hard to write and manage because they are complex. The complexity of the problem you're solving, the building of abstractions, is the hard part. Logging, unit tests, agile programming, test driven development, scrum, continuous integration, third party framework X or Y; or any other tool really does not decreases complexity (the essence of programming). However, logging has helped both myself and my team to manage that complexity more efficiently. Not only has it helped ease the communication of what we built, logging also provides numerous other beneficial side effects.

What I mean by logging

I'm going to describe a lot of the benefits logging has provided me. However, the positive effects are affected by what is being logged and how often it occurs. I lean towards logging nearly everything. When a function is called, I log that it has been called along with any passed in arguments, the time it was called, and in Javascript, the name of the function that called it. I also log before a function returns, along with the time and any relevant return data. I log when asynchronous calls return. I log when significant logic changes happen. I leave these log messages in the code; I don't comment them out or remove them (of course, I don't have the output of all log messages always enabled - I'll cover how I filter the output of log messages later).

Benefits

This approach might sound ugly and make it seem like the code would be cluttered and unreadable. As crazy as it might sound though, this method allows me to develop code more quickly and effectively than unit tests, can communicate what the code is doing better than comments, and is easier and faster to implement than nearly any other tool or technique I've used.

Visibility into code helps manage complexity.

For me, the primary benefit of logging is visibility into what my code is doing. Breakpoints can do this to some extent - but they require manual setting and clearing. Unit tests provide significant advantages as well, but they don't let you see what your code is doing (important when it's running in production, but also even a consideration when you're developing). Sufficient logging allows answering nearly any question about what a program is doing.

There are two sides I see to increased visibility: the first is visibility into what your code is doing after it's running and people are using the software you wrote. The second is visibility into your code during development.

Visibility after shipping

Seeing what your code is doing when it's being used is extremely beneficial if you're developing any kind of product that changes over time. With logging, if you have an API, you can see what endpoints are being hit and when. With minimal additional sophistication to a logging framework, you could even track the time for every request / response for free. If you write a game, you can track for every player how many units are killed or what the average score is at different points of time in the game. More data allows more questions to be asked and answered. The more you log, the more data you have.

Visibility while developing

Just as useful as visibility into shipped code is visibility into code during development. By logging when a function is called and returned, I get on demand stack traces for free. I don't have to know a-priori that I want a stack track. If a related and unanticipated edge case bug occurs in some other area of my code when I'm developing, I wouldn't know to set a breakpoint. With logging, it's almost as if I set breakpoints everywhere, so I can trace the flow of execution (and a lot of state) for every call after they've happened. This, of course, isn't convenient just for me.

Communication

Logging allows other develops to see exactly code is doing. Unit tests are wonderful for many reasons, but they aren't great at communicating what code does - only that it does or doesn't do something. They help to verify code works after changing something, but how effective are they in helping to understand how the code works? With good logging in place, it's easier for other developers to see exactly what the code is doing. By enabling all logging messages, other developers can follow through the execution of any parts of the program they desire without needing to manually set break points.

Explicit comments.

To me, the ideal commenting system involves small functions with a sentence or two high level description at the top of the function describing its purpose. This, coupled with well named variables, is expressive (e.g., "var x=4" doesn't count - this is position, some random variable, what?). Code should describe itself, but those descriptions often aren't enough to give a high level overview of what's happening. In reality, functions often times get to be larger than anticipated, and due to various non-programming constraints (e.g., time), we often don't have the luxury (or are too lazy) to go back and write proper comments or cleaner code.

Logging helps solve this problem. By sprinkling a function with log messages, you're implicitly providing a comment about what the should be doing at that point. When the program runs, it explicitly logs what it is doing. Reading and understanding a function's behavior is easier when there are comments, but if you can replace a lot of comments with log messages, you get both the benefits of documentation and explicit information when a program runs.

How I log

I've found the benefits of logging to be only as good as what and how I'm logging. In the past, I would write a log statement to validate some assumptions, but then remove the log statement afterwards. Nobody wants their code producing irrelevant output. If the log statements were once helpful, however, I think they can continue to be helpful. The trick is to leave the messages in, but have a method of filtering them.

The following spec has worked wonderfully for me. A sample log message might look like: logger.log("filename:methodname", "some message: %s", "replaced string");

First, we need a way to log messages. The logging library exposes a "log" function which takes in two (and possibly more) parameters:

group: a {string} specifying the group the log message belongs to. Subgroups are delineated by colons (:) message : a {string} specifying the message to log. If it contains formatting parameters (e.g., http://nodejs.org/api/util.html#utilutilformat_format ), additional parameters can be passed in to be replaced. n more arguments: objects that will be formatted by the message string

Next, we need a way to filter the messages. The idea is to have a bunch of log messages and leave them in without negatively impacting the application's output. By specifying the group(s) a log message belongs to, we can filter by group and any level of subgroups.

One possible implementation, the approach I took, is to provide a logLevel parameter which is takes in an array of strings specifying what to log (e.g., "filename" would log anything that matched "filename", as well as any subgroups: filename:group1", "filename:group1:subgroup1", etc.). Each time log() is called, the passed in log group is checked against the valid log groups.

Conclusion

I believe logging is a terribly underutilized tool that can provide tremendous benefits. Logging provides data about code. This data allows you to gain insights into what your code is doing and how it's doing it. Along with increased visibility into code, logging enhances communication about it. For my team and me, logging has been incredibly helpful. I think it's a practice that should be more widely used. In my next post, I'll discuss the concrete implementation of a logging library I developed based on these concepts that I've been using for the past couple years.