Four Guidelines That I Feel Have Improved My Code

I have been thinking a lot about isolation, dependencies, and clean code of late. I know there is a lot of disagreement, with people standing vehemently in both camps.

I certainly will not say either side is right or wrong, but what follows is what I feel has improved my code. I post it here to formalize some recent thoughts and, if I am lucky, get some good feedback.

Before I rush into the gory details, I feel I should mention that I went down this path not as an architecture astronaut, but out of genuine pain in what I was working on.

My models were growing large. My tests were getting slow. Things did not feel “right”.

I started watching Gary Bernhardt’s Destroy All Software screencasts. He is a big proponent of testing in isolation. Definitely go get a subscription and take a day to get caught up.

On top of DAS, I started reading everything I could on the subject of growing software, clean code and refactoring. When I say reading, I really should say devouring.

I was literally prowling about like a lion, looking for the next book I could devour. Several times my wife asked me to get off my hands and knees and to kindly stop roaring about SRP.

Over the past few months as I have tried to write better code, I have definitely learned a lot. Learning without reflection and writing is not true learning for me.

Reflecting on why something feels better and then writing about it formalizes it in my head and has the added benefit of being available for anyone else who is struggling with the same.

Here are a few guidelines that have jumped out at me over the past few days as I reflected on what I have been practicing the past few months.

Guideline #1. One responsibility to rule them all

Single responsibility principle (SRP) is really hard. I think a lot of us are frustrated and feeling the pain of our chubby <insert your favorite ORM> classes. Something does not feel right. Working on them is hard.

The problem is context. You have to load a lot of context in your brain when you crack open that INFAMOUS user model. That context occupies the space we would normally use to come up with new solutions.

Create More Classes

So what are we to do? Create more classes. Your models do not need to inherit from ActiveRecord::Base, or include MongoMapper::Document, or whatever.

A model is something that has business logic. Start breaking up your huge models that have persistence bolted on into plain old Ruby classes.
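As a hypothetical sketch of what this extraction looks like, imagine pulling an invoice total calculation out of a persistence-backed model into its own plain class (InvoiceTotaler and its data shape are illustrative names, not from Gauges):

```ruby
# Hypothetical example: business logic extracted from a
# persistence-backed model into a plain old Ruby class.
# No ORM base class, no persistence concerns, trivial to test.
class InvoiceTotaler
  def initialize(line_items)
    @line_items = line_items
  end

  # Sum price * quantity across all line items.
  def total
    @line_items.sum { |item| item[:price] * item[:quantity] }
  end
end
```

The class takes plain data in, gives a plain answer back, and can be unit tested without touching a database.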

I am not going to lie to you. If you have not been doing this, it will not be easy. Everything will seem like it should just be tucked in as another method on a model that also happens to persist data to a store.

Naming is Hard

Another pain point will be naming. Naming is fracking hard. You are welcome for the BSG reference there. I would like to take that statement a step further though.

Naming is hard because our classes and methods are doing too much. The fewer responsibilities your class has, the easier it will be to name, especially after a few months of practice.

An Example

Enough talk, let's see some code. In our track processors, which pop tracks off a queue and store reports in a database, we query for the gauge being tracked before storing reports. The purpose of this query is to ensure that the gauge is in good standing and that we should, in fact, store reports in the database for it.

A lot of people throw the tracking code on their site and never remove it or sign up for a paying account. We perform this query to make sure those people noop, instead of creating tons of data that no one is paying for.

This query happens for each track and it is pulling information that rarely if ever changes. It seemed like a prime spot for a wee bit of caching.

First, I created a tiny service around the memcached client I decided to use. This only took an hour, and it means that my application now has an interface for caching (get, set, delete, and fetch). I'll talk more about this in guideline #3.

Once I had defined the interface Gauges would use for caching, I began to integrate it. After much battling and rewriting of the caching code, each piece felt like it was doing too much and things were getting messy.

I stepped back and thought through my plans. I wanted to cache only the attributes, so I threw everything away and started with that. First, I wanted to be able to read attributes from the data store.

    class GaugeAttributeService
      def get(id)
        criteria = {:_id => Plucky.to_object_id(id)}

        if (attrs = gauge_collection.find_one(criteria))
          attrs.delete('_id')
          attrs
        end
      end
    end

Given an id, this class returns a hash of attributes. That is pretty much one responsibility. Sweet action. Let’s move on.

Second, I knew that I wanted to add read-through caching for this. Typically read-through caching uses some sort of fetch pattern. Fetch is basically a shortcut for look first in the cache and if it is not there, compute the block, store the computed result in the cache and return the computed result.
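That fetch shortcut can be sketched in a few lines. Here is a minimal, self-contained version backed by a plain Hash (FetchableCache is an illustrative name, not part of Gauges):

```ruby
# Minimal sketch of the read-through fetch pattern, backed by a
# plain Hash so it runs standalone.
class FetchableCache
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  def set(key, value)
    @store[key] = value
  end

  # Look in the cache first; on a miss, compute the block,
  # store the result, and return it.
  def fetch(key)
    value = get(key)

    if value.nil?
      value = yield
      set(key, value)
    end

    value
  end
end
```

The second call to fetch with the same key returns the cached value without running the block.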

If I had added caching in the GaugeAttributeService class, I would have violated SRP. Describing the class would have been "checks the cache and if not there it fetches from the database". Note the use of "and".

As Growing Object Oriented Software states:

Our heuristic is that we should be able to describe what an object does without using any conjunctions (“and,” “or”).

Instead, I created a new class to wrap (or decorate) my original service.

    class GaugeAttributeServiceWithCaching
      def initialize(attribute_service = GaugeAttributeService.new)
        @attribute_service = attribute_service
      end

      def get(id)
        cache_service.fetch(cache_key(id)) { @attribute_service.get(id) }
      end
    end

I left a few bits out of this class so we can focus on the important part, which is that all we do with this class is wrap the original one with a cache fetch.

As you can see, naming is pretty easy for this class. It is a gauge attribute service with caching and is named as such. It initializes with an object that must respond to get. Note also that it defaults to an instance of GaugeAttributeService.

Unit testing this class is easy as well. We can isolate the dependencies (attribute_service and cache_service) in the unit test and make sure that they do what we expect (fetch and get).

Note: There could definitely be a point made that "with" is the same as "and" and therefore means that we are breaking SRP. Naming is hard, really hard. Rather than get mired forever in naming, I rolled with this convention and, at this point, it does not bother me. I am definitely open to suggestions. Another name I played with was CachedGaugeAttributeService.

Below is an example setup with new dependencies injected in the test that help us verify this class's behavior in isolation.

    attributes = {'title' => 'GitHub'}

    attribute_service = Class.new do
      # define_method (rather than def) so the block can close
      # over the attributes local variable
      define_method(:get) { |id| attributes }
    end.new

    cache_service = Class.new do
      def fetch(key)
        get(key) || yield
      end

      def get(key)
      end
    end.new

    service = GaugeAttributeServiceWithCaching.new(attribute_service)
    service.cache_service = cache_service

Above I used dynamic classes. Instead of dynamic classes, one could use stubbing or whatever. I’ll talk more about cache_service= later.

Decorating in this manner means we can easily fetch attributes without caching by using GaugeAttributeService, or with caching by using GaugeAttributeServiceWithCaching.

The important thing to note is that we added new functionality to our application by extending existing parts instead of changing them. I read recently, but cannot find the quote, that if you can add a new feature purely by extending existing classes and creating new classes, you are winning.

Guideline #2. Use accessors for collaborators

In the example above, you probably noticed that when testing GaugeAttributeServiceWithCaching, I changed the cache service used by assigning a new one. What I often see is others using some top-level config, or, even worse, an actual $ global.

    # bad
    Gauges.cache = Memcached.new

    class GaugeAttributeServiceWithCaching
      def get(id)
        Gauges.cache.fetch(cache_key(id)) { … }
      end
    end

    # worse
    $cache = Memcached.new

    class GaugeAttributeServiceWithCaching
      def get(id)
        $cache.fetch(cache_key(id)) { … }
      end
    end

What sucks about this is you are coupling this class to a global and coupling leads to pain. Instead, what I have started doing is using accessors to setup collaborators. Here is the example from above, but now with the cache service accessors included.

    class GaugeAttributeServiceWithCaching
      attr_writer :cache_service

      def cache_service
        @cache_service ||= CacheService.new
      end
    end

By doing this, we get a sane, memoized default for our cache service (CacheService.new) and the ability to change that default (cache_service=), either in our application or when unit testing.
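Here is a self-contained sketch of that pattern; NullCache and ServiceWithCollaborator are stand-in names so the example runs on its own:

```ruby
# Minimal sketch of the accessor pattern: a memoized default
# collaborator plus a writer to replace it. NullCache is a
# stand-in for the real CacheService.
class NullCache
  def fetch(key)
    yield # no caching; always compute the block
  end
end

class ServiceWithCollaborator
  attr_writer :cache_service

  # Memoized default, overridable via cache_service=.
  def cache_service
    @cache_service ||= NullCache.new
  end
end
```

cache_service returns the same memoized default on every call until cache_service= swaps in a replacement, which is exactly what a unit test needs.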

Finding ourselves doing this quite often, we created a library, aptly named Morphine. Right now it does little more than what I just showed (memoized default and writer method to change).

As I have started to use this gem, I am getting more ideas for things that would be helpful. Here is the same code as above, but using Morphine. What I like about it, over a memoized method and an attr_writer, is that it feels a little more declarative and creates a standard way of declaring collaborators for classes.

    class GaugeAttributeServiceWithCaching
      include Morphine

      register :cache_service do
        CacheService.new
      end
    end

Note also that I am not passing these dependencies in through initialize. At first I started with that and it looked something like this:

    class GaugeAttributeServiceWithCaching
      def initialize(attribute_service = GaugeAttributeService.new,
                     cache_service = CacheService.new)
        @attribute_service = attribute_service
        @cache_service = cache_service
      end
    end

Personally, over time I found this method tedious. My general guideline is pass a dependency through initialize when you are going to decorate it, otherwise use accessors. Let’s look at the attribute service with caching again.

    class GaugeAttributeServiceWithCaching
      include Morphine

      register :cache_service do
        CacheService.new
      end

      def initialize(attribute_service = GaugeAttributeService.new)
        @attribute_service = attribute_service
      end
    end

Since this class is decorating an attribute service with caching, I pass in the service we want to decorate through initialize. I do not, however, pass in the cache service through initialize. Instead, the cache service uses Morphine (or accessors).

First, I think this makes the intent more obvious. The intent of this class is to wrap another object, so that object should be provided to initialize. Defaulting the service to wrap is merely a convenience.

Second, the cache service is a dependency, but not one that is being wrapped. It purely needs a sane default and a way to be replaced, therefore it uses Morphine (or accessors).

I cannot say this is a hard and fast rule that everyone should follow and that you are wrong if you do not. I can say that through trial and error, following this guideline has led to the least amount of friction while maintaining flexibility and isolation.

Guideline #3. Create real interfaces

As I mentioned above, the first thing I started with when working on the caching code was an interface for caching for the application, rather than just using a client directly. Occasionally what I see people do is create an interface, but wholesale pass arguments through to a client like so:

    # bad idea
    class CacheService
      def initialize(driver)
        @driver = driver
      end

      def get(*args)
        @driver.get(*args)
      end

      def set(*args)
        @driver.set(*args)
      end

      def delete(*args)
        @driver.delete(*args)
      end
    end

In my opinion, this is abstracting at the wrong level. All you are doing is adding a layer of indirection on top of a driver. It makes the code harder to follow, and any exceptions the driver raises will surface in your application. Also, your interface ends up accepting whatever parameters the driver accepts. There is no point in doing this.

Instead, create a real interface. Define the methods and parameters you want your application to be able to use and make that work with whatever driver you end up choosing or changing to down the road.

Handling Exceptions

First, I created the exceptions that would be raised if anything goes wrong.

    class CacheService
      class Error < StandardError
        attr_reader :original

        def initialize(original = $!)
          if original.nil?
            super
          else
            super(original.message)
          end
          @original = original
        end
      end

      class NotFound  < Error; end
      class NotStored < Error; end
    end

CacheService::Error is the base that all the other errors inherit from. It wraps whatever the original error was, instead of discarding it, and defaults to the last exception that was raised ($!). I will show how these are used in a bit.
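To see the wrapping in action, here is a self-contained sketch (the Error class is repeated so the example stands alone, and ArgumentError stands in for a driver exception):

```ruby
# Self-contained sketch of the error wrapping. Raising
# CacheService::Error inside a rescue block captures $!, the
# last exception raised, as the original.
class CacheService
  class Error < StandardError
    attr_reader :original

    def initialize(original = $!)
      if original.nil?
        super
      else
        super(original.message)
      end
      @original = original
    end
  end
end

begin
  begin
    raise ArgumentError, "driver blew up" # stand-in for a driver error
  rescue ArgumentError
    raise CacheService::Error             # wraps $! automatically
  end
rescue CacheService::Error => error
  error.message        # the original message, "driver blew up"
  error.original.class # the original class, ArgumentError
end
```

The application rescues CacheService::Error, while the original driver exception stays available on error.original for logging or debugging.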

Portability and serialization

I knew that I wanted the cache to be portable, so instead of just defaulting to Marshal’ing, I used only raw operations and ensured that I wrapped all raw operations with serialize and deserialize, where appropriate.

In order to allow this cache service class to work with multiple serialization methods, I registered a serializer dependency, instead of using MultiJson's dump and load directly. I then wrapped the serializer in convenience methods (serialize and deserialize) that handle a few oddities induced by the driver I am wrapping.

    class CacheService
      include Morphine

      register :serializer do
        Serializers::Json.new
      end

      private

      def serialize(value)
        serializer.serialize(value)
      end

      def deserialize(value)
        if value.is_a?(Hash) # get with multiple keys
          value.each { |k, v| value[k] = deserialize(v) }
          value
        else
          serializer.deserialize(value)
        end
      end
    end
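Serializers::Json itself is not shown here; a minimal sketch of what such a serializer might look like follows, using the stdlib JSON (instead of MultiJson) so it runs standalone. The class name matches the register block above; the implementation is my assumption:

```ruby
require 'json'

# Hypothetical minimal serializer matching the interface the
# register block expects: serialize and deserialize. Uses the
# stdlib JSON in place of MultiJson so the sketch is runnable.
module Serializers
  class Json
    def serialize(value)
      JSON.generate(value)
    end

    def deserialize(value)
      value.nil? ? nil : JSON.parse(value)
    end
  end
end
```

Any object responding to serialize and deserialize could be registered in its place, e.g. a Marshal-based one, without the cache service changing.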

Handling exceptions (continued)

I then created a few private methods that hit the driver and wrap exceptions. These private methods are what the public methods use to ensure that exceptions are properly handled and such.

    class CacheService
      private

      def driver_read(keys)
        deserialize(@driver.get(keys, false))
      rescue Memcached::NotFound
        raise NotFound
      rescue Memcached::Error
        raise Error
      end

      def driver_write(method, key, value)
        @driver.send method, key, serialize(value), DefaultTTL.call, false
      rescue Memcached::NotStored
        raise NotStored
      rescue Memcached::Error
        raise Error
      end

      def driver_delete(key)
        @driver.delete(key)
      rescue Memcached::NotFound
        raise NotFound
      end
    end

At this point, no driver specific exceptions should ever bubble outside of the cache service. When using the cache service in the application, I need only worry about handling the cache service exceptions and not the specific driver exceptions.

If I change to a different driver, only this class changes. The rest of my application stays the same. Big win. How many times have you upgraded a gem and then had to update pieces all over your application because it willy-nilly changed its interface?

The public interface

All that is left is to define the public methods and parameters that can be used in the application.

    class CacheService
      def get(keys)
        driver_read(keys)
      rescue NotFound
        nil
      end

      def set(key, value)
        driver_write :set, key, value
      end

      def delete(key)
        driver_delete key
      rescue NotFound
        nil
      end
    end

At this point, the application has a defined interface that it can work with for caching and, for the most part, does not need to worry about exceptions, as they are wrapped and, in some cases, even handled (i.e. nil for NotFound).

Creating real interfaces ensures that expectations are set and upgrades are easy. Defined interfaces give other developers on the project confidence that if they follow the rules, things will work as expected.

Guideline #4. Test the whole way through

Whatever you want to call them, you need tests that prove all your components are wired together and working as expected, in the same manner as they will be used in production.

The reason a lot of developers have felt pain with pure unit testing and isolation is because they forget to add that secondary layer of tests on top that ensure that the way things are wired together works too.

Unit tests are there to drive our design. Acceptance tests are there to make sure that things are actually working the whole way through. Each of these is essential and not to be skipped over.

If you are having problems testing, it may be your design. If you are getting burned by isolation, you are probably missing higher level tests. You should be able to kill your unit tests and still have reasonable confidence that your system is working.

Nowadays, I often start with a high level test and then work my way in unit testing the pieces as I make them. I’ve found this keeps me focused on the value I am adding and ensures that my coverage is good.

Conclusion

While it has definitely taken a lot of trial and error, I am starting to find the right balance between flexibility, isolation and overkill.

Stick to single responsibilities. Inject decorated dependencies through initialization and use accessors for other dependencies. Create real interfaces. Test in isolation and the whole way through.

Follow these guidelines and I believe you will start to feel better about the code you are writing, as I have over the past few months.

I would love to hear what others of you are doing and see examples. Comment below with gists, github urls, and other thoughts. Thanks!