Dec 19 2012

Whenever you write code, you hold certain assumptions in your mind. You expect some conditions to be met, and you promise that, if those conditions are valid, you’ll do something in return. But sometimes your code fails to convey those assumptions. Consider this snippet:

class Users def find_by_name(name) collection.find { |u| u.name == name } end private attr_reader :collection end users = Users.new user = users.find_by_name(name)

With no context other than the code above, can you determine whether it is acceptable for the:

name parameter to be nil?

parameter to be nil? value returned by the method (a user) to be nil?

Let’s look at some possible combinations:

name parameter may be nil, user object return value may be nil

parameter may be nil, user object return value may be nil name parameter must not be nil, user object return value may be nil

parameter must not be nil, user object return value may be nil name parameter may be nil, user object must not be nil

parameter may be nil, user object must not be nil name parameter must not be nil, user object must not be nil

This code can be used in all sorts of situations. It’s flexible, yes, but it can also be a source of confusion and subtle bugs. The more lenient you allow your code to be, the more you expose yourself and the users of your code to unforeseen consequences.

So, how would you make the snippet communicate its intent more clearly?

Guarding Against an Evil World

We have to start somewhere, so let’s handle the name parameter. Is it acceptable if the value is nil? Is it acceptable if the value is blank? Maybe it is. Maybe the original author knew that value of name would never be nil, so she didn’t bother to put in a check for nil-ness. The only way to answer that is to know your domain’s requirements, but in this case you had to take the time to check the documentation (or maybe ask someone), because the original author didn’t bother to clarify the context for the use of the method.

One way to fix this would be to document the method. But, personally, I find it much easier to keep code up-to-date than to keep documentation up-to-date. If you’re like me, you should strive instead to write self-documenting code. Let’s do that.

Let’s imagine for a moment that name is a required attribute, so it will never make sense to ask for a user without a name. You can do the following:

def find_by_name(name) raise ArgumentError, "name must not be nil" if name.nil? # ... end

This will communicate your assumption clearly to anyone reading (or using) your code. As an added bonus, errors will be caught closer to where they are introduced, making debugging easier.

Let’s consider the case where empty strings are equally unwanted. You could do this:

def find_by_name(name) raise ArgumentError, "name must not be blank" if name.blank? # ... end

but let’s think about it. What if you have more methods that expect a user name to be passed to them? You’d be breaking encapsulation and knowledge about what constitutes a valid user name would bleed into other classes. In that case, consider introducing a specific class to represent a user name, say UserName . You can then ensure you only get a UserName as an argument:

def find_by_name(name) raise TypeError, “expected a UserName” unless name.is_a?(UserName) # ... end

Or, if you want to spare users of your code some trouble, you can do the conversion yourself. Personally, I like sending a message whose name matches the class name, which looks like a cast and is similar to what is used by Ruby (for example, see #Integer or #Array ). This is not a Ruby idiom, however, so you’ll have to rely on project convention. One possible implementation of UserName may be:

def UserName(value) case when UserName === value value when String === value UserName.new(value) when value.respond_to?(:to_user_name) value.to_user_name else raise TypeError, “can’t convert #{value.inspect} to UserName” end end

And you’d use it like this:

def find_by_name(name) name = UserName(name) # ... end

There is no possible ambiguity here. The UserName class should encapsulate all knowledge about what is a valid user name. The #UserName method should encapsulate all knowledge about how to convert some object to a UserName .

There are a few advantages to doing it this way: it’s reusable and it forces you to give up primitive obsession and see concepts you may be missing in your app. On the other hand, you will lose some flexibility. It will be harder to duck-type, unless the duck-like object responds to #to_user_name . But sometimes it’s ok to lose this flexibility, especially when you’re dealing with data as opposed to behavior.

But what if it is acceptable for name to be nil?

The first question you need to ask yourself is, “Why is it acceptable for name to be nil?” nil is a special case, and special cases imply more tests, more effort in understanding the implication of change, and usually, more bugs. Whenever you need to do something like:

if var_that_can_be_nil var_that_can_be_nil.send_message else # do something else end

You have at least one additional code path to test. If you need to check for nil in many places, you need an additional test for each place where you make the distinction between nil and not nil. This can be mitigated by the use of null objects, but these come with their own problems: if you always return a null object instead of nil, how can you be sure that you are returning a null object only when you should?

Alas, if you determine that there is no way to avoid name being nil, there are a few techniques that you can use to communicate your intent clearly. For example, you can create a different method and adjust the original method’s requirements:

def find_by_required_name(name) raise ArgumentError, “name must not be nil” unless name select_by_name(name).first end def select_by_name(name) select { |u| u.name == name } end

If it is acceptable for name to be nil, it’s also likely that many users without a name can be returned, so I’m reflecting that on the method name (Ruby uses #find to return a single item, and #select to return many items).

This also uncovered a possible bug in #find_by_name: if name is nil, only the first user with a nil name will be returned. Perhaps this was what the original author wanted, but again, it’s forcing you to deduce your assumptions, therefore placing an unnecessary burden on users of your class.

Another possible route, if you know the context in which you want to select users without a name, is to create a specific method for that situation:

def select_unnamed select_by_name(nil) end

If the above looks awkward to you, that’s because it is. You pay a price for more robust code, but ultimately the problem is caused by your domain requirements. As a software developer and problem solver, part of your job is to avoid complexity as much as possible. There is no better way to avoid complexity than to trim functionality aggressively.

Let’s turn to the User class now.

Promises, Promises

The second part to this exploration is deciding how to handle user. Again, let’s first assume that it is never acceptable for user to be nil and, for the sake of simplicity, that users must always have a name (so you don’t need to handle a nil name).

First, remember that Ruby collections already come with a #find method, which returns nil if no match is found. So if you plan to have #find_by_name guarantee that a result is returned, you’ll need to change its name. To find (no pun intended) a proper name, we can turn to Hash. There you will find Hash#fetch, which in its simpler form, will raise a KeyError if a given key can’t be found. We can follow that idiom, and rename our method appropriately:

def fetch_by_name(name) name = UserName(name) find { |u| u.name == name } or raise KeyError, "No user for name '#{name}'" end

That was easy. Now let’s tackle the case where it is acceptable for user to be nil. Again, taking the cue from Hash#fetch , which takes a default value as an optional second argument, that’s pretty easy to do:

def fetch_by_name(*args) if args.size > 2 raise ArgumentError, "wrong number of arguments (#{args.size} for 2)" end name, default = UserName(args.first), args.last user = find { |u| u.name == name } if user user elsif args.size == 2 default else raise KeyError, "No user for name `#{name}`" end end

Rather than simply returning nil, you require the caller to specify a default. Compare the more forgiving interface:

user = users.find_by_name(name_of_non_existing_user) #=> nil do_something_to_user(user) # BOOM. I hope you know where to find # #do_something_to_user because the error # will be raised there first.

with the more strict one

user = users.fetch_by_name(name_of_non_existing_user) # BOOM do_something_to_user(user)

The error is raised where it is occurs, making the problem much easier to debug. There’s an obvious byproduct: the size of the method has increased, but the gain of clear, cohesive code will be well worth the minor trade off of a longer methods.

Asking Questions

There’s a final avenue that we need to explore, what if you only wanted to know if a given user exists? All you need to do is ask:

def include_by_name?(name) name = UserName(name) !! fetch_by_name(name, false) end

This should be pretty clear. Once more you require an actual name to be passed. Then you reuse #fetch_by_name , passing it a default value of false. Finally, even though any value other than nil or false is considered true in Ruby, you ensure the value is an actual boolean using the !! idiom.

Why not reuse #fetch_by_name ? At this point, it should be obvious already: clarity.

The Power of Clarity

Yes, being clear involves more work. It forces you to think, really think about what you need to do. It forces you to convey your intent in code. But I believe that time spent clarifying your code, will be time saved hunting down obscure bugs. And, in the end, you may discover that you didn’t have to write that much more code, anyway, if you stick to writing only the code that’s needed instead of generalizing something that may not need to be generalized.

—David Leal