See also: a rewritten and somewhat calmer version of this post I wrote later.

So I do a lot of debugging.

It’s not because I write a lot of broken code so much as that I seem to be a go-to guy for “Something mysterious is happening. Can you lend a hand?”

That’s fine. It’s something I’m reasonably good at, and sometimes it’s even enjoyable.

But it means I have a bit of a triggering issue which will get me very angry.

“What is that triggering issue, David? Tell us, please!” I hear you say.

Well, I’m glad you asked!

That issue is very simple: Bad error reporting.

A thing developers don’t seem to understand is that what happens when things go wrong is every bit as important as what happens when things go right. Possibly more important.

If you don’t realise this, when things go wrong you will feel the wrath of myself and all the ops people who have to deal with your software floating through the air, trying to set you on fire with our minds.

While I’m reasonably sure psychic powers aren’t actually a thing, do you really want to take that chance?

So, if you don’t want to experience spontaneous geek rage induced combustion, here is some helpful advice for you to follow.

First, a word on process. When something goes wrong, the question I am asking is “How can I make this not go wrong?”. In order to answer this, I must first answer the following questions:

1. Where has it gone wrong?
2. What has gone wrong with it?
3. Why has it gone wrong?

Your job as a writer of software is to make it as easy as possible for me to answer these three questions.

Next, a note of context:

How am I attempting to answer this question?

Well, in an ideal world, I’m attempting to answer it because I have a nice precise test case which reproduces the problem.

However, I first need to get to that point, and in order to get to that point I need enough information to give a pretty good answer to the first two questions. An entire application is not a test case, especially not if it’s in a complicated deployment environment. I need enough information about where it has gone wrong to extract a smaller test case and I need enough information about what has gone wrong to put that test case in a state where it will demonstrate the problem.

So what I’m actually looking at initially is almost certainly a log file. It’s OK if this log file is really the screen of a console, but the point is that something, somewhere, has given me a textual record that says “Hey, something’s gone a bit wrong. Here’s some info for you to look at”.

There is a possibility that if you’re writing an application or a framework or something you have deliberately avoided producing such a textual record of anything, or are piping your errors to /dev/null or something. Hopefully this is not the case, because if it is, you don’t need to worry about spontaneous combustion: whoever has to deploy and maintain your code has probably already tracked you down to your home address and killed you in your sleep. No jury would convict.

So, from now on, I’m assuming you’ve done the decent thing and there’s some way of going from errors that occur to logs of such errors.

What can you do to make everyone’s lives easier?

Error messages

Obviously the prerequisite of this is that you actually tell me something in your error message. You’d never just write an error message that said “Something went wrong”, right? So assuming you’ve already got error messages that tell me roughly what went wrong, here is how to have error messages that tell me exactly what went wrong:

If your error message is triggered by a value, for the love of god include that value in your error message.

People don’t seem to do this. I don’t understand why. It’s very simple.

Don’t do:

    error "Bad argument"

Do do:

    error "Bad argument: #{argument.inspect}"

Even better if you tell me exactly why it is invalid:

    error "You can't curdle a frobnitzer which has already been curdled: #{argument.inspect}"

(Side note: All examples here will be in ruby, because that’s mostly what I’ve been working with when this has been pissing me off. The examples should be easily portable and the principles are language agnostic).

That’s it. You’ve already made my life at least 27% simpler with this one step.

Why is this important?

It’s important because tracking data flow is hard. It’s entirely possible that the function you’ve raised an error in is about 50 calls deep. I can probably track down what has been passed to it eventually after carefully looking through calls and such-like, but I shouldn’t need to. If you are not including the value in your error message then you have exactly the information I need at your fingertips and are failing to tell me. That’s kinda a dick move.
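To make that concrete, here is a minimal runnable sketch. The `Frobnitzer` struct and the `curdle` method are hypothetical, invented purely for illustration; the only point is that every error message names the offending value.

```ruby
# Hypothetical domain object, invented for this example.
Frobnitzer = Struct.new(:name, :curdled)

def curdle(frobnitzer)
  unless frobnitzer.is_a?(Frobnitzer)
    # Say what we got, not just that it was bad.
    raise ArgumentError, "Expected a Frobnitzer but got: #{frobnitzer.inspect}"
  end
  if frobnitzer.curdled
    # Say why it is invalid and include the value itself.
    raise ArgumentError,
          "You can't curdle a frobnitzer which has already been curdled: #{frobnitzer.inspect}"
  end
  frobnitzer.curdled = true
  frobnitzer
end

begin
  curdle(Frobnitzer.new("left", true))
rescue ArgumentError => e
  puts e.message
end
```

Whoever reads that message in a log can see which frobnitzer was involved without tracing the value back through fifty callers.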

Exceptions are awesome. Do more of those

You know what are great?

Exceptions. Exceptions are great.

I mean, obviously I’d rather your code wasn’t throwing exceptions, but I want that to be because everything is going swimmingly and it doesn’t need to, not because it wouldn’t throw them if something went wrong.

Why are exceptions great?

Exceptions are nice for structuring error handling in code, they provide good classification for error recoveries, etc. etc.

That’s not what I care about here.

Exceptions contain one thing that elevates them to the status of patron saint of people who have to debug problems.

They carry a stack trace.

It’s like a glorious little audit trail that points the finger at exactly where the problem occurred. If you’ve followed the previous instructions and given them a good error message too then you’ve probably told me exactly what I need to know to reproduce the problem (there are some, ahem, exceptions to this which I will get on to later, but this is true most of the time).

Side note: I know this isn’t true in all languages. e.g. C++ exceptions don’t carry stack traces (I think) and Haskell ones have less than useful stack traces due to lazy evaluation. You have my sympathies. Everyone else, no excuses.

Further, they carry exactly the information I want to appear in the log on top of that: An error category and a message. An exception which bubbles up to the log file is my best friend for problem debugging.

Some specific notes on exceptions:

If you see an exception, say an exception

Never hide exceptions from me. Ever.

If you catch an exception, I need to know about it unless you’re really goddamn sure I don’t (examples where you may validly be goddamn sure I don’t include Python’s StopIteration and any other exceptions used for control flow. Yes this is a valid thing to do).

I don’t care if you send an email, dump it in a log file, whatever you want. I just need to know about it, and I need to know at the very least the exception class, the exception message and for the love of god the exception stack trace.
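A minimal sketch of what that looks like, using Ruby’s standard `Logger`. The `format_exception` helper and `risky_operation` are hypothetical names invented here, and the `StringIO` is only a stand-in for wherever your logs really go.

```ruby
require "logger"
require "stringio"

# Collect the three things a debugger needs: class, message, full backtrace.
def format_exception(e)
  lines = ["#{e.class}: #{e.message}"]
  # backtrace is nil for an exception that was never raised
  lines.concat((e.backtrace || []).map { |frame| "    from #{frame}" })
  lines.join("\n")
end

# Hypothetical operation that fails.
def risky_operation
  raise IOError, "could not read config"
end

log_output = StringIO.new        # stand-in for a real log file
logger = Logger.new(log_output)

begin
  risky_operation
rescue StandardError => e
  logger.error(format_exception(e))
  # Recover here if you genuinely can; otherwise re-raise with a bare `raise`.
end

puts log_output.string
```

The important design choice is that the unfiltered backtrace goes into the log even when the code then carries on.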

Thou Shalt Not Fuck With The Stack Trace

A lot of frameworky things (rails, rspec, etc. I define framework as any library or application where the usage pattern is “Don’t call our code, we’ll call yours”) think that exceptions are confusing and unhelpful. They might show you some of the stack trace, but you really don’t want the whole thing do you? Here, let us filter out those unhelpful bits.

NO.

NO NO NO NO NO NO NO NO NO NO.

NO.

Bad developer. Wrist slap. No cookie.

The chances that you actually correctly understand what is the important bit of the stack trace are effectively zero. Even if you somehow manage to correctly understand this, you are removing important context. The lack of that context will confuse me more than its presence. If I ever find you are doing this I will simply have to do everything again with the “stop lying to me you bastard” flag turned on.

And that’s terrible.

Except…

There is one case in which fucking with the stack trace is not only permitted but also mandatory.

It is OK to add more information to the stack trace.

In particular, if there is another stack trace involved you should also include that.

I often see code like this:

    begin
      ...
    rescue LowLevelException => e
      raise MyLibrarySpecificException.new(e.message)
    end

Please take my outrage as read.

It doesn’t look like you’re doing it but you are once again fucking with the stack trace. Remember what I said about not doing that?

It’s OK to wrap exceptions. I understand the reasoning for doing it, and it’s often a good idea.

However: Your language almost certainly gives you the capability to override the stack trace. When you are wrapping an exception you must do this so that it includes the original stack trace. Ideally you would include both back traces, so your logs would contain something like:

    MyLibrarySpecificException: Wrapped LowLevelException: "A message"
      this error was thrown here
    -- WRAPPED BACKTRACE --
      the original error was thrown there

The details don’t matter. The point is: Include both back traces if you can, include only the original stack trace of the exception you’re wrapping if you absolutely must.

Here’s an example of how you can do that in Ruby:

    class MyLibrarySpecificException < StandardError
      attr_reader :wrapped_exception

      def initialize(wrapped_exception)
        super("Wrapped #{wrapped_exception.class}: #{wrapped_exception.message}")
        @wrapped_exception = wrapped_exception
      end

      def backtrace
        super + ["---WRAPPED EXCEPTION---"] + wrapped_exception.backtrace
      end
    end
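On Ruby 2.1 and later there is also a built-in route: an exception raised inside a `rescue` block records the one being handled as its `cause`, so a logger can walk the chain instead of relying on an overridden `backtrace`. A minimal sketch, assuming hypothetical `LowLevelError` and `LibraryError` classes and a `full_report` helper that are invented here for illustration:

```ruby
class LowLevelError < StandardError; end
class LibraryError < StandardError; end

def low_level_call
  raise LowLevelError, "A message"
end

def library_call
  low_level_call
rescue LowLevelError => e
  # Raising inside a rescue block sets the new exception's #cause to e.
  raise LibraryError, "Wrapped #{e.class}: #{e.message}"
end

# Report every exception in the chain, each with its own backtrace.
def full_report(error)
  lines = []
  while error
    lines << "#{error.class}: #{error.message}"
    lines.concat((error.backtrace || []).map { |frame| "    from #{frame}" })
    error = error.cause
    lines << "-- CAUSED BY --" if error
  end
  lines.join("\n")
end

begin
  library_call
rescue LibraryError => e
  puts full_report(e)
end
```

Either approach is fine as long as both backtraces end up in the log.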

Enough of exceptions. Some more general principles.

If something goes wrong, tell me

This rant isn’t about Padrino, but it was a triggering point for it.

One of Padrino’s more interesting behaviours is that if you have a syntax error in one of your controller files it won’t fail to start. Instead what will happen is it will log a warning, continue loading and then just go “Eh, I don’t know anything about that” if you try to use routes defined in a controller it failed to load.

This is not helpful.

A common design principle seems to be that you should attempt to do the right thing – recover from errors, guess what the user meant, etc.

This is really not helpful.

The problem with fuzzy behaviour is that it produces fuzzy results. Postel’s Law is deeply unhelpful for library design: Code which you are running should be correct. If it’s a bit wrong, you should not attempt to run it, you should error out and make me fix my code.

This is because errors in code are signs of error in thought. The chances of my accidentally calling your code with the wrong value are much higher than the chances of me deliberately being a bit sloppy (and if I’m deliberately being a bit sloppy it’s OK to slap my wrist and punish me for it). Code which is doing the wrong thing is going to be a problem now or a problem later, and I’d much rather you told me it was a problem now so I can fix it now rather than having to locate it later.
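Ruby’s own core library happens to offer both styles side by side: `String#to_i` guesses, while `Kernel#Integer` refuses. A small sketch of the difference, where `parse_port` is a hypothetical helper invented for illustration:

```ruby
# Lenient: String#to_i ignores trailing garbage and turns non-numeric
# input into 0, so the mistake only surfaces somewhere else, later.
lenient_a = "8080abc".to_i   # 8080
lenient_b = "oops".to_i      # 0

# Strict: Kernel#Integer raises on anything that is not a whole number,
# and we re-raise with the offending value in the message.
def parse_port(text)
  Integer(text, 10)
rescue ArgumentError, TypeError
  raise ArgumentError, "Port must be a whole number, got: #{text.inspect}"
end

puts parse_port("8080")

begin
  parse_port("8080abc")
rescue ArgumentError => e
  puts e.message
end
```

The lenient version quietly hands back a plausible-looking number; the strict version makes me fix my input right away.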

On the subject of “now rather than later”.

Validate early, validate often

Suppose I write the following code:

    class HelpfulHashWrapper
      def initialize(hash)
        @hash = hash
      end

      def do_something(some_key)
        return @hash[some_key]
      end
    end

(ignore the fact that this class is stupid)

Now suppose I do the following:

    1.8.7 :029 > my_wrapper = HelpfulHashWrapper.new(nil)
     => #
    1.8.7 :032 > my_wrapper.do_something "hi"
    NoMethodError: undefined method `[]' for nil:NilClass
        from (irb):26:in `do_something'
        from (irb):32

Where is the error here?

Hint: It’s not the point where the exception was raised.

I constructed the HelpfulHashWrapper with an argument that was never going to work. My HelpfulHashWrapper unhelpfully didn’t tell me that I had put it into an invalid state.

Why is this important?

Remember when I said that the first question I needed to be able to answer was “Where has it gone wrong?”

If I get an error when I try to use an object in an invalid state, I’m not really able to answer that question. Instead what I need to do is back track to the point where the object got put into an invalid state. This is hard work. The following version of the class will make my life much easier:

    class HelpfulHashWrapper
      def initialize(hash)
        raise "I can only helpfully wrap hashes. #{hash.inspect} is not a hash" unless hash.is_a? Hash
        @hash = hash
      end

      def do_something(some_key)
        return @hash[some_key]
      end
    end

I will now discover very early on when I’ve done something wrong, rather than waiting to find it at a mysterious later date.
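As a quick check of what that buys you, here is a self-contained sketch that repeats the validating class and then makes the same mistake as before. The failure now happens at construction time and names the bad value.

```ruby
class HelpfulHashWrapper
  def initialize(hash)
    raise "I can only helpfully wrap hashes. #{hash.inspect} is not a hash" unless hash.is_a?(Hash)
    @hash = hash
  end

  def do_something(some_key)
    @hash[some_key]
  end
end

begin
  HelpfulHashWrapper.new(nil)
rescue => e
  puts "#{e.class}: #{e.message}"
  # The top frame is inside initialize; the frames below it lead straight
  # to the code that passed in the bad value.
  puts e.backtrace.first(3)
end
```

Compare that with the earlier session, where the only clue was a NoMethodError somewhere inside do_something.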

Basically: The closer to the original error you report the problem, the easier it is for me to identify and fix the problem.

In summary

- Above all else, give me helpful error messages.
- Helpful error messages contain any invalid values and a reason as to why they’re invalid.
- Throw exceptions if something goes wrong.
- Your application should record all exceptions it receives.
- Do not fuck with the stack trace.
- Do not attempt to help me by not throwing an exception. If something maybe should throw an exception, it should throw an exception.
- Validate your internal state, and throw an exception when your state becomes invalid, not when I try to use it in an invalid state.

Doing these things will significantly reduce my blood pressure, will make your ops guys love you (or at least resent you slightly less bitterly), and will reduce your chances of spontaneous combustion by at least 43%.