Reading Ruby Code: ROM - Exploration

This is the third part of my ongoing series on code reading; the beginning can be found here.

In the first two posts on code reading we got set up and presented an overview of the Ruby Object Mapper (ROM). With that out of the way, let's dig into the real code reading and begin exploring. To start, let's focus on the container method in the example code.

For a refresher, here it is within the example:

    rom = ROM.container(:sql, 'sqlite::memory') do |conf|
      conf.default.create_table(:users) do
        primary_key :id
        column :name, String, null: false
        column :email, String, null: false
      end
    end

Here the container method is called on the ROM module with two arguments and a block. An object is passed to the block, and we configure options on that object. This is a fairly common pattern, where a singleton class provides a method for configuring itself via block or normal method chains, for example:

    Client.configuration.protocol = 'https'
    Client.configuration.domain = 'test.com'

    # or

    Client.configure do |config|
      config.protocol = 'https'
      config.domain = 'test.com'
    end

In both cases, a configuration object is stored on the singleton class. In the block case, this object is yielded to the block; in the method-chain case, it is accessed directly. The block pattern avoids repeating the chained method call and also groups the configuration in one place, which makes it easier to both write and read. ROM follows this pattern to an extent, but we will see that the configuration happening within the container method is much more complex. To find the container method, a quick grep reveals it within the create_container.rb file:
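Before moving on to ROM's version, here is a minimal sketch of how the simpler Client pattern above might be implemented. The Client module and its Configuration struct are hypothetical and exist only for illustration; nothing here is taken from ROM.

```ruby
# Hypothetical Client module; none of these names come from ROM.
module Client
  # Plain struct holding the settings.
  Configuration = Struct.new(:protocol, :domain)

  # Lazily create and memoize a single configuration object.
  def self.configuration
    @configuration ||= Configuration.new
  end

  # Yield that same object to a block so the settings can be grouped.
  def self.configure
    yield configuration
    configuration
  end
end

# Method-chain style: the object is accessed directly.
Client.configuration.protocol = 'http'

# Block style: the object is yielded to the block.
Client.configure do |config|
  config.protocol = 'https'
  config.domain = 'test.com'
end

puts Client.configuration.protocol # prints "https"
```

Both styles touch the same memoized object, which is why the value set inside the block replaces the one set through the method chain.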

lib/rom/create_container.rb

⊕ This code sample is by rom-rb, you can view the full file here.

    def self.container(*args, &block)
      InlineCreateContainer.new(*args, &block).container
    end

This is just a wrapper around the InlineCreateContainer class:

lib/rom/create_container.rb


    class InlineCreateContainer < CreateContainer
      def initialize(*args, &block)
        case args.first
        when Configuration
          environment = args.first.environment
          setup = args.first.setup
        when Environment
          environment = args.first
          setup = args[1]
        else
          configuration = Configuration.new(*args, &block)
          environment = configuration.environment
          setup = configuration.setup
        end

        super(environment, setup)
      end
    end

Our example code, with its arguments and block, takes the else branch of the case statement. What's interesting here is the flexibility of this method's inputs. It can accept a Configuration, an Environment (and Setup), or it will build a Configuration object using its args and block. At the bottom of the method, we see that we need an environment and setup. Configuration objects have both an Environment and Setup, so we can build up to a configuration to get these required objects.
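The same idea of normalizing several input shapes at the boundary can be sketched outside ROM. In the snippet below, Settings and Connection are hypothetical names invented for this illustration; only the case/when dispatch on the first argument mirrors the initializer above.

```ruby
# Hypothetical classes for illustration only; they are not part of ROM.
Settings = Struct.new(:host, :port)

class Connection
  attr_reader :settings

  # Accepts a Settings object, a Hash, or positional host/port arguments,
  # and normalizes all of them into a Settings before going any further.
  def initialize(*args)
    @settings =
      case args.first
      when Settings
        args.first
      when Hash
        Settings.new(args.first.fetch(:host), args.first.fetch(:port))
      else
        Settings.new(*args)
      end
  end
end

a = Connection.new(Settings.new('localhost', 5432))
b = Connection.new({ host: 'localhost', port: 5432 })
c = Connection.new('localhost', 5432)

# All three inputs end up as equal Settings objects.
puts a.settings == b.settings && b.settings == c.settings
```

Whatever the caller passes in, everything after the case statement only has to deal with one kind of object.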

This is a good example of the robustness principle or Postel’s law:

Be conservative in what you do, be liberal in what you accept from others

Here a container can be made using a variety of different inputs, but will always (with valid inputs) return a container. The actual building of the Configuration object from args and a block is isolated within the configuration class:

lib/rom/configuration.rb


    def initialize(*args, &block)
      @environment = Environment.new(*args)
      @setup = Setup.new

      block.call(self) unless block.nil?
    end

The args are used to build the Environment object that will eventually be a part of the Configuration object. That configuration object is yielded to the passed block allowing it to be modified from within that block.

With the environment and setup in hand, we use those objects to build a Finalize object and run it:

lib/rom/create_container.rb


    def initialize(environment, setup)
      @container = finalize(environment, setup)
    end

    private

    def finalize(environment, setup)
      environment.configure do |config|
        environment.gateways.each_key do |key|
          gateway_config = config.gateways[key]
          gateway_config.infer_relations = true unless gateway_config.key?(:infer_relations)
        end
      end

      finalize = Finalize.new(
        gateways: environment.gateways,
        gateway_map: environment.gateways_map,
        relation_classes: setup.relation_classes,
        command_classes: setup.command_classes,
        mappers: setup.mapper_classes,
        plugins: setup.plugins,
        config: environment.config.dup.freeze
      )

      finalize.run!
    end

It is the Finalize class that is responsible for building up the container object:

lib/rom/setup/finalize.rb


    module ROM
      # This giant builds an container using defined classes for core parts of ROM
      #
      # It is used by the setup object after it's done gathering class definitions
      #
      # @private
      class Finalize
        attr_reader :gateways, :repo_adapter, :datasets, :gateway_map,
                    :relation_classes, :mapper_classes, :mapper_objects, :command_classes, :plugins, :config

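Just to keep track, when ROM.container is originally called, here is the process: ROM.container instantiates InlineCreateContainer, which builds a Configuration (and with it an Environment and a Setup) from the arguments and block; the CreateContainer initializer then passes the environment and setup to its finalize method; finally, Finalize#run! builds the container.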

The usage of three classes to perform one operation could be viewed in several ways. Some may argue that it is unnecessarily complex, that it is Object Oriented design run amok. Or that it is too hard to follow through multiple files. I disagree with these characterizations because:

The container is a central piece of ROM, so having flexible and varied ways to build one is useful.

Classes are well named (InlineCreateContainer explains what it does), so following the code becomes easier.

Functionality is well separated between classes.

However, it does require navigating between three files (create_container, configuration, and finalize) and about twice that many methods to piece together how a container is built. As a code reader, this might be more difficult than scrolling through a single file. From a practical perspective, as I am following a progression, I open each new file as a separate “buffer” (in some editors this might be called a window or tab). In that way I can flip quickly between the files and follow the progress.

One other trick for not losing focus on what we want to know is:

When following an operation through several classes or methods, focus on the beginning and end of methods

For example, the InlineCreateContainer initializer has 13 lines in it, but the key to the method is that it passes an environment and a setup object up to its superclass. It accepts a lot, but only setup and environment move on. This is revealed on the last line of the method. So, while we may want to know what environment and setup are, we should probably continue on to the CreateContainer superclass to follow the progression. If the local variables are well named, we should also be able to reason about what they are and what class they might come from.

A hard to find method

After resolving the overall creation of the container, the block configuration of the container still needs to be examined. In the example, the configuration object is yielded to the block. A default method is called on it, returning something that responds to the create_table method. However, default does not exist on the Configuration class, and it's not easily found through a search of the project (the word "default" is fairly common).

One issue with dynamic, interpreted languages such as Ruby is that methods can come from several sources. Aside from standard definition, they can also come from mixins, meta-programming, etc. Moreover, classes can be reopened and added to at later points. While this offers a great deal of flexibility to a developer, it can make finding a given method in Ruby code difficult. Luckily, there is the aptly named method method and its corresponding Method class. Let's use that method to find our mysterious default method. First you need an instance of the class that you want the method for, so at this point I would drop a debugger into my example code and break here (pry is my debugger of choice; byebug is another option):

    rom = ROM.container(:sql, 'sqlite::memory') do |conf|
      require 'pry'; binding.pry
      conf.default.create_table(:users) do

Running the example, I now have a Configuration object (conf) that I can interrogate for its methods. (You could of course build the object in a small script or test, but this is much faster for the problem at hand.) From looking at the source I know the use method exists, so let's try that:

    [3] pry(main)> conf.method(:use).source_location
    => ["/Users/michael/projects/ruby/rom-rb-exploration/vendor/ruby/2.3.0/gems/rom-2.0.0/lib/rom/configuration.rb", 34]

Calling source_location reveals the line and file where the method is defined. This is very handy when you have multiple gems/repositories in play. However, when we call method for the default method we get:

    [4] pry(main)> conf.method(:default)
    NameError: undefined method `default' for class `#<Class:#<ROM::Configuration:0x007f8896277f88>>'

What happened? Ruby is telling us that default is undefined on the Configuration class. The example code runs fine though, so something else must be in play. This is, of course, method_missing, a useful meta-programming feature for providing dynamic methods.
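The behavior is easy to reproduce in isolation. The following is a minimal sketch using a hypothetical Registry class (not part of ROM): one method is defined normally, and every other name is answered through method_missing by looking it up as a hash key.

```ruby
# Hypothetical class for illustration; it is not taken from ROM.
class Registry
  def initialize(entries)
    @entries = entries
  end

  # A normally defined method, so `method` can locate it.
  def size
    @entries.size
  end

  # Unknown method names are looked up as keys; anything that is not a key
  # falls through to Ruby's default behaviour (NoMethodError) via super.
  def method_missing(name, *args)
    @entries.fetch(name) { super }
  end
end

registry = Registry.new(default: 'a gateway')

puts registry.default                          # answered by method_missing
puts registry.method(:size).source_location.inspect # file and line of `def size`

begin
  registry.method(:default)
rescue NameError => e
  puts e.message # there is no real method definition for `method` to find
end
```

Calling default works, yet asking for the method object fails, because no definition of default exists anywhere. Note that this sketch deliberately leaves out respond_to_missing?; when a class defines it and it returns true for a name, Object#method will return a Method object for that name instead of raising.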

When using method and a method you know works comes back as "undefined", look for a method_missing definition

Indeed, this is the case for the configuration class:

lib/rom/configuration.rb


    def method_missing(name, *)
      gateways.fetch(name) { super }
    end

If a method is undefined, this class will try to find that method name as a key in the hash-like gateways object. Failing that, the normal Ruby method_missing behavior continues. So, when we call default, the value for the default key within gateways is returned. Since we still have a debugger open, let's look at the gateways object:

    [5] pry(main)> conf.gateways
    => {:default=>
      #<ROM::SQL::Gateway:0x007f8896275fa8
       @connection=#<Sequel::SQLite::Database: "sqlite::memory">,
       @migrator=#<ROM::SQL::Migration::Migrator:0x007f88949ea880 @connection=#<Sequel::SQLite::Database: "sqlite::memory">, @options={:path=>"db/migrate"}, @path="db/migrate">,
       @options={:migrator=>#<ROM::SQL::Migration::Migrator:0x007f88949ea880 @connection=#<Sequel::SQLite::Database: "sqlite::memory">, @options={:path=>"db/migrate"}, @path="db/migrate">}>}

As expected, gateways is a hash with the values being Gateway objects. In the example, default is a SQL::Gateway because that is what we passed as an argument to the original container. ROM containers are not limited to a single gateway/adapter, but when there is just one it will become the default. We can see how this adapter switching/differentiation happens by looking at the Gateway class itself. (I'm skipping over tracing back into the container building to show this piece. If interested, refer back to the initializer of the container, which initializes an Environment object.)

lib/rom/gateway.rb


     97  adapter = ROM.adapters.fetch(type) {
     98    begin
     99      require "rom/#{type}"
    100    rescue LoadError
    101      raise AdapterLoadError, "Failed to load adapter rom/#{type}"
    102    end
    103
    104    ROM.adapters.fetch(type)
    105  }

This is a clever usage of Hash's fetch to provide some dynamism and on-demand loading. fetch first looks for the passed adapter key in the adapters hash. If the key is found, the value is returned. If not, the block on lines 98-104 is executed. In the block, an attempt is made to require (load) the missing adapter. Presumably, requiring the adapter adds it to this hash, because fetch is called again on line 104, and at this point it will either return the adapter or raise a KeyError. In short, this code attempts to load an unloaded adapter and, failing that, raises an error (either because the adapter doesn't exist or can't be loaded, or because the adapter wasn't added properly, which surfaces as a KeyError). We are assured by the end of this method that we either have an adapter-like object or have raised an error. To confirm this, we can see that the actual set of adapters is stored as a hash on the ROM module as an attribute:

lib/rom/global.rb


    # An internal adapter identifier => adapter module map used by setup
    #
    # @return [Hash<Symbol=>Module>]
    #
    # @api private
    attr_reader :adapters

To add an adapter, you "register" it with the register_adapter method:

lib/rom/global.rb


    def register_adapter(identifier, adapter)
      adapters[identifier] = adapter
      self
    end

We can see this in action in the rom-sql gem:

lib/rom/sql.rb


    ROM.register_adapter(:sql, ROM::SQL)

So, requiring the file rom/sql, as we might on line 99 of the gateway code, will add ROM::SQL to the list of adapters.
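To see the whole round trip in one place, here is a rough, self-contained sketch of the register-then-fetch flow. Everything in it is hypothetical: the Persistence module, adapter_for, CsvAdapter, and especially LOADERS, which is only a stand-in for what require "rom/#{type}" does in the real code (running the loader registers the adapter as a side effect).

```ruby
# Hypothetical stand-in for an adapter registry; not ROM's implementation.
module Persistence
  class AdapterLoadError < StandardError; end

  @adapters = {}

  class << self
    attr_reader :adapters

    # The only call an adapter has to make.
    def register_adapter(identifier, adapter)
      adapters[identifier] = adapter
      self
    end

    # Return a registered adapter, loading it on demand the first time.
    def adapter_for(type)
      adapters.fetch(type) do
        # Simulates `require`: an unknown type cannot be loaded at all.
        loader = LOADERS.fetch(type) do
          raise AdapterLoadError, "Failed to load adapter #{type}"
        end
        loader.call

        # Raises KeyError if the loader ran but never registered anything.
        adapters.fetch(type)
      end
    end
  end

  CsvAdapter = Module.new

  # Assumption: each loader plays the role of a file that registers itself.
  LOADERS = {
    csv: -> { Persistence.register_adapter(:csv, CsvAdapter) }
  }.freeze
end

puts Persistence.adapters.key?(:csv)                            # not registered yet
puts Persistence.adapter_for(:csv) == Persistence::CsvAdapter   # loaded on demand
puts Persistence.adapters.key?(:csv)                            # registered from now on
```

The second lookup of the same type never reaches the block, since the first call left the adapter in the hash.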

This is our first encounter with the pluggable/modular nature of ROM. ROM deals with data persistence, and as such wants to support a wide range of databases and persistence formats. (ROM is not confined to databases such as MongoDB or PostgreSQL; for example, you can use CSVs as a persistence format in ROM.) In addition to performing a standard set of business logic, ROM also wants to leave the door open to expansion to new data storage formats.

One approach to solve this problem is “one gem to rule them all”, i.e. rom-rb/rom holds everything from PostgreSQL code to CSV code. No sub-gems or plugins, just ROM. This is problematic for a few reasons, the most tangible being that every time the maintainer needs to fix anything, the gem must be bumped and pushed out. So if you are a happy CSV user, you are pushed to upgrade every time a PostgreSQL fix is pushed out. It also becomes a burden on the maintainers because the lines between sections of code are not as clear.

Given these issues, a more robust design choice is a plugin-like architecture where the end user can choose the adapters they want to use/load. The user is not required to pull in code that is not needed. Thus a CSV user just loads the rom/csv gem and can happily ignore all updates to rom/sql, and vice versa. This leads to its own trade-offs and maintenance burdens, but it's likely the better choice for this situation.

The other noteworthy design aspect is the clear separation of concerns in the plugin architecture. ROM the module provides a structure/harness for using individual adapters. The individual adapter only has to register itself with the main module. What doesn't happen is the listing of individual adapters in advance within the ROM module. Additionally, the individual adapters make only a single method call on the module. They aren't directly accessing the adapters hash or anything of that nature. In fact, the adapter doesn't even know that adapters exists, much less that it is a hash. Thus, the implementation of the adapters storage can change at will as long as the method signature of register_adapter stays the same.

Takeaways

In this blog post we explored how the container method works and, more generally, the Container class. I demonstrated how I follow code through multiple classes, and how to use the method method to track down a dynamic method. For code reading in general, though, I hope I have demonstrated that there are things to be gained from it, no matter what your level of development:

For the absolute beginner… simply being able to follow and understand the above code would be considered a major success

For the intermediate rubyist… clever usage of nested fetch to achieve auto-loading behavior and writing methods with flexible inputs

For the more seasoned developer… interesting architectural patterns abound

For example, I personally have never written a serious plugin type architecture. While I have implemented a register-like interface, the architecture explored herein is much more developed. If I were to implement something like this down the road, I’d have a vague idea of what I’d want to do and could always refer back to this or other open source examples. This is the power of code reading: I’ve given myself a design shortcut for the future by simply being exposed to ideas in the present.

We will continue with the exploration of ROM in our next article. Thanks for reading!