Evolutionary architecture and emergent design

Building DSLs in JRuby

Leverage Ruby's expressiveness by using JRuby atop Java code

A few installments ago, I began covering the harvesting of domain idiomatic patterns (solutions to emergent business problems) using domain-specific languages. DSLs work nicely for this task because they are concise (containing as little noisy syntax as possible) and readable (even by nondevelopers), and they stand out from more API-centric code. In the last installment, I showed how to build DSLs in Groovy, taking advantage of several of its features. In this installment, I'll wrap up this discussion of using DSLs to harvest idiomatic patterns by showing how to build more-sophisticated DSLs in Ruby, leveraging JRuby.

About this series

This series aims to provide a fresh perspective on the often-discussed but elusive concepts of software architecture and design. Through concrete examples, Neal Ford gives you a solid grounding in the agile practices of evolutionary architecture and emergent design. By deferring important architectural and design decisions until the last responsible moment, you can prevent unnecessary complexity from undermining your software projects.

Ruby is currently the most popular language for building internal DSLs. Most of the infrastructure you think about when developing in Ruby is DSL-based — Ruby on Rails, RSpec, Cucumber, Rake, and many others (see Related topics) — because it is amenable to hosting internal DSLs. And the trendy technique of behavior-driven development (BDD) required a strong DSL base to achieve its popularity. This installment will help you understand why Ruby is so popular among DSL aficionados.

Open classes in Ruby

Using open classes to add new methods to a built-in class is a common technique for adding expressiveness to DSLs. In the last installment, I showed two different syntaxes in Groovy for open classes. In Ruby, you have the same mechanism but with a single syntax. For example, to create a recipe DSL, you need a way to capture quantities. Consider the DSL fragment in Listing 1:

Listing 1. Target syntax for my Ruby-based recipe DSL

    recipe = Recipe.new "Spicy bread"
    recipe.add 200.grams.of "flour"
    recipe.add 1.lb.of "nutmeg"

To make this code executable, I must add the gram and lb methods to numbers by opening the Numeric class, as shown in Listing 2:

Listing 2. Open class definitions in Ruby

    class Numeric
      # Grams are the base unit, so the number is returned unchanged.
      def gram
        self
      end
      alias_method :grams, :gram

      # Convert pounds to grams.
      def pound
        self * 453.59237
      end
      alias_method :pounds, :pound
      alias_method :lb, :pound
      alias_method :lbs, :pound
    end

In Ruby, class names must start with a capital letter, which is also the rule for Ruby constants, meaning that every class name is also a constant. When Ruby "sees" a class definition, it checks whether a class with that name has already been defined. Because class names are constants, you can have only one class of a given name. If the class already exists, the class definition reopens it, allowing me to make changes. In Listing 2, I reopen the Numeric class (the superclass of both integer and floating-point numbers) to add the gram and pound methods. Unlike Groovy, Ruby doesn't have the rule that methods accepting no parameters must be called with empty parentheses, meaning that Ruby doesn't need to distinguish between properties and methods.

Ruby also includes another handy DSL mechanism: the alias_method class method. You want to enhance the fluency of your DSLs as much as possible, suggesting that you should handle cases like pluralization. (If you want to see elaborate efforts to achieve this result, check out the pluralization code in Ruby on Rails for handling pluralizing model class names.) I don't want to form grammatically clumsy sentences like recipe.add 2.gram.of("flour") in my DSL when I'm clearly adding more than one gram. The alias_method mechanism in Ruby makes it easy to create alternate names for methods to enhance readability. To that end, Listing 2 adds a pluralized method for gram, and both alternate abbreviations and pluralized versions for pound.
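The listings above never show the of method that Listing 1 calls, so the following is only a minimal sketch of how the pieces could fit together. The of method and the MeasuredIngredient struct are hypothetical names introduced for illustration; they are not taken from the article's code.

```ruby
# Hypothetical value object pairing an ingredient name with a quantity in grams.
MeasuredIngredient = Struct.new(:name, :grams)

class Numeric
  # Grams are the base unit, so the number is returned unchanged.
  def gram
    self
  end
  alias_method :grams, :gram

  # Convert pounds to grams.
  def pound
    self * 453.59237
  end
  alias_method :pounds, :pound
  alias_method :lb, :pound
  alias_method :lbs, :pound

  # Assumption: "of" attaches an ingredient name to the quantity.
  def of(ingredient_name)
    MeasuredIngredient.new(ingredient_name, self)
  end
end

flour  = 200.grams.of "flour"
nutmeg = 1.lb.of "nutmeg"

puts "#{flour.name}: #{flour.grams} g"
puts "#{nutmeg.name}: #{nutmeg.grams} g"
```

Because every alias points at the same underlying method, 2.pounds and 2.lbs return the same value; the aliases exist only to make the DSL read naturally.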

Building fluent interfaces

One of the goals of using a DSL to capture idiomatic patterns is the ability to eliminate noisy syntax from the programming-language version of your abstractions. Consider the snippet of noisy recipe DSL code in Listing 3:

Listing 3. Noisy recipe definition

    recipe = Recipe.new "Spicy bread"
    recipe.add 200.grams.of "flour"
    recipe.add 1.lb.of "nutmeg"
    recipe.directions << "mix ingredients"
    recipe.directions << "cook for 30 minutes at 250 degrees"

Although the syntax in Listing 3 for adding recipe ingredients and directions is fairly concise, the noise lies in repeating the host variable name (recipe) on every line. A cleaner version appears in Listing 4:

Listing 4. Contextualized recipe definition

    alternate_recipe = Recipe.new("Milky Gravy")
    alternate_recipe.consists_of {
      add 1.lb.of "flour"
      add 200.grams.of "milk"
      add 1.gram.of "nutmeg"

      steps(
        "mix ingredients",
        "cook for some amount of time"
      )
    }

The addition of the consists_of method to the fluent interface allows me to use containership (embodied in Ruby via closure blocks delimited with curly braces, {}) to eliminate the noisy host-object repetition. The implementation of this method is trivial in Ruby, as shown in Listing 5:

Listing 5. The Recipe class definition, including the consists_of method

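    class Recipe
      attr_reader :ingredients
      attr_accessor :name
      attr_accessor :directions

      def initialize(name="")
        @ingredients = []
        @directions = []
        @name = name
      end

      # Returns self so that calls can be chained.
      def add ingredient
        @ingredients << ingredient
        return self
      end

      # Store the directions as an array. (A bare collect with no block
      # returns an Enumerator in Ruby 1.9 and later, not an array.)
      def steps *direction_list
        @directions = direction_list
      end

      # Evaluate the block with this recipe as self.
      def consists_of &block
        instance_eval(&block)
      end
    end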

The consists_of method accepts a code block. (That's the syntax you see with the ampersand before the parameter name. The ampersand identifies the parameter as the holder of a code block.) The method executes the code block using the instance_eval method, one of the built-in methods in Ruby. The instance_eval method executes the code passed to it in the context of the object on which it is invoked. In other words, when you execute code via instance_eval, self (Ruby's version of the Java language's this) becomes the receiver of the instance_eval call for the duration of the block. Thus, you can call the add and steps methods without naming the recipe host object if you call them inside recipe.instance_eval, which is what the consists_of method does.
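To see the switch of self in isolation, here is a small sketch. The ShoppingList class is a hypothetical stand-in for Recipe, reduced to the parts needed to show the effect.

```ruby
# Hypothetical stand-in for Recipe, kept small to isolate instance_eval.
class ShoppingList
  attr_reader :items

  def initialize
    @items = []
  end

  def add(item)
    @items << item
    self
  end

  # Run the block with this list as self, so bare "add" calls resolve here.
  def consists_of(&block)
    instance_eval(&block)
    self
  end
end

list = ShoppingList.new
seen_self = nil

list.consists_of do
  add "flour"        # no receiver needed: self is the list
  add "milk"
  seen_self = self   # the block is still a closure over outer locals
end

puts list.items.inspect
puts seen_self.equal?(list)
```

Note that the block can still assign to the outer local variable seen_self even though self has changed. Only the receiver of implicit method calls (and instance-variable lookups) is redirected.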

Regular readers will recognize this concept in the guise of Java syntax from the "Leveraging reusable code, Part 2" installment, reproduced here in Listing 6:

Listing 6. Fluentizing code blocks in Java code using instance initializers

    MarketingDescription desc = new MarketingDescriptionImpl() {{
        setType("Box");
        setSubType("Insulated");
        setAttribute("length", "50.5");
        setAttribute("ladder", "yes");
        setAttribute("lining type", "cork");
    }};

Although the syntax is passingly similar, the Java version suffers from a couple of serious limitations. First, it is unusual syntax in the Java language. (Most developers never encounter the instance initializer in everyday coding.) Second, because it uses anonymous inner classes (the only code-block-like mechanism in Java), any variables from the outer scope must be declared final, which places serious limitations on the kinds of things you can do inside the code block. In Ruby, the instance_eval method is a standard (and unexotic) language feature, meaning that it is more commonly used.

Polishing

One common technique many DSLs use (especially those targeting nondevelopers) is to leverage spoken languages. Molding computer syntax toward a spoken language is possible if your base computer language is flexible enough. Consider the recipe DSL I have created thus far. Creating an entire DSL just to hold simple data structures (like lists of ingredients and directions) seems like a bit of overkill; why not just keep this information in standard data structures? By encoding the operations in a DSL, I can take extra actions (like beneficial side effects) in addition to populating data structures. For example, perhaps I want to capture nutrition information for each ingredient as I define it in the DSL, allowing me to provide an aggregate value of the nutrition of the recipe when done. The NutritionProfile class is a simple data holder, shown in Listing 7:

Listing 7. Recipe nutrition record

    class NutritionProfile
      attr_accessor :name, :protein, :lipid, :sugars, :calcium, :sodium

      def initialize(name, protein=0, lipid=0, sugars=0, calcium=0, sodium=0)
        @name = name
        @protein, @lipid, @sugars = protein, lipid, sugars
        @calcium, @sodium = calcium, sodium
      end

      def self.create_from_hash(name, h)
        new(name, h['protein'], h['lipid'], h['sugars'], h['calcium'], h['sodium'])
      end

      def to_s()
        "\tProtein: " + @protein.to_s +
        "\n\tLipid: " + @lipid.to_s +
        "\n\tSugars: " + @sugars.to_s +
        "\n\tCalcium: " + @calcium.to_s +
        "\n\tSodium: " + @sodium.to_s
      end
    end

To populate a database of these nutrition records, I create a text file that contains one record on each row:

ingredient "flour" has protein=11.5, lipid=1.45, sugars=1.12, calcium=20, and sodium=0

As you can probably guess, each line of this definition file is a Ruby-based DSL. Rather than think of its syntax as just a line of text, consider what it "looks" like from a computer-language standpoint, as shown in Figure 1.

Figure 1. Ingredient text definition as a method call

Each line starts with ingredient, which is the method name. The first parameter is the name of the ingredient. The word has is called a bubble word: a word that makes the DSL more readable but doesn't contribute to the final definition. The rest of the line consists of name/value pairs, separated by commas. Given that this is not yet legal Ruby syntax, how do I translate it into Ruby? That job is called polishing: taking almost-legal syntax and polishing it into actual syntax. The job of polishing this DSL is handled by the NutritionProfileDefinition class, shown in Listing 8:

Listing 8. NutritionProfileDefinition class

    class NutritionProfileDefinition
      def polish_text(definition_line)
        polished_text = definition_line.clone
        polished_text.gsub!(/=/, '=>')
        polished_text.sub!(/and /, '')
        polished_text.sub!(/has /, ',')
        polished_text
      end

      def process_definition(definition)
        instance_eval polish_text(definition)
      end

      def ingredient(name, ingredients)
        NutritionProfile.create_from_hash name, ingredients
      end
    end

The entry point of this class is the process_definition method, shown in Listing 9:

Listing 9. The process_definition method

    def process_definition(definition)
      instance_eval polish_text(definition)
    end

This method first calls polish_text to convert the definition line into legal Ruby, then passes the resulting string to instance_eval, which evaluates it with the NutritionProfileDefinition instance as self. The polish_text method, shown in Listing 10, does the necessary substitutions and translations to convert the almost-code to code:

Listing 10. The polish_text method

    def polish_text(definition_line)
      polished_text = definition_line.clone
      polished_text.gsub!(/=/, '=>')
      polished_text.sub!(/and /, '')
      polished_text.sub!(/has /, ',')
      polished_text
    end

The polish_text method consists of simple string substitutions to convert the definition syntax into Ruby syntax: it converts each equals sign to the hash association operator (=>), removes the word and (sub! replaces only the first occurrence, which is enough for this format), and converts has to a comma. This polished line of code is passed to instance_eval, which executes it as a call to the ingredient method of the NutritionProfileDefinition class.
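The polishing step can be traced end to end with a self-contained sketch. Two things here are assumptions rather than the article's code: the DefinitionReader and Profile names are hypothetical stand-ins for NutritionProfileDefinition and NutritionProfile, and the first substitution also wraps each name in quotes. The quoting is there because a bare word such as protein has to resolve to something when the line is evaluated, and create_from_hash in Listing 7 looks values up by string key.

```ruby
# Hypothetical stand-in for NutritionProfile: it just keeps what it is given.
Profile = Struct.new(:name, :amounts)

class DefinitionReader
  # Turn the almost-Ruby definition line into legal Ruby source.
  def polish_text(definition_line)
    polished = definition_line.dup
    # Assumption: quote each name so it becomes a string hash key.
    polished.gsub!(/(\w+)=/, '"\1" => ')
    polished.sub!(/and /, '')   # drop the first "and"
    polished.sub!(/has /, ',')  # the bubble word "has" becomes an argument separator
    polished
  end

  # Evaluate the polished line with this reader as self,
  # so the leading word "ingredient" resolves to the method below.
  def process_definition(definition)
    instance_eval polish_text(definition)
  end

  def ingredient(name, amounts)
    Profile.new(name, amounts)
  end
end

line = 'ingredient "flour" has protein=11.5, lipid=1.45, sugars=1.12, calcium=20, and sodium=0'

reader = DefinitionReader.new
puts reader.polish_text(line)

profile = reader.process_definition(line)
puts profile.name
puts profile.amounts["protein"]
```

Evaluating text with instance_eval runs whatever the text contains, so this technique suits definition files you control rather than untrusted input.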

You could write this code in the Java language, but Java's syntactic limitations would add so much noise that you would lose the benefits of the fluent interface, rendering the exercise moot. Ruby offers enough syntactic sugar to make it feasible (and desirable) to cast abstractions as DSLs.

Method missing

Unlike the preceding example, the next one cannot be done in Java code, even with cumbersome syntax. One convenient mechanism in languages that commonly host DSLs is method missing. When you call a method that doesn't exist in Ruby, it doesn't immediately generate an exception. You have an opportunity to add a method_missing method to your class that will handle any missing method calls. This is used heavily in DSLs that build internal data structures. Consider this example from the XMLBuilder in Ruby (see Related topics), shown in Listing 11:

Listing 11. Using XMLBuilder in Ruby

    xml = Builder::XmlMarkup.new(:indent => 2)
    xml.person {
      xml.name("Neo")
      xml.catch_phrase("Whoa")
    }
    puts xml.target!

This code outputs an XML document with the structure shown in the DSL. Builder works its magic via method_missing. When you call a method on the xml variable, that method doesn't already exist, so it falls into method_missing, which constructs the corresponding XML. This makes the code for the Builder library very small; most of its mechanics rely on underlying language features of Ruby. One problem remains with this approach, however, as illustrated in Listing 12:

Listing 12. Method missing collisions with built-in methods

    xml = Builder::XmlMarkup.new(:indent => 2)
    xml.person {
      xml.name("Neo")
      xml.catch_phrase("Whoa")
      xml.class("pod-born")
    }
    puts xml.target!

Builders in Groovy vs. Ruby

The inspiration for the Builder class in Ruby came from similar builder classes in Groovy. Jim Weirich, the creator of Builder in Ruby, liked the concept but not the implementation in Groovy because it uses an elaborate mapping strategy between the XML tags and the generated XML. Weirich created XMLBuilder (and BlankSlate) as a simpler, more elegant solution to the problem. This is interesting because it offers a glimpse into how language communities tend to solve problems. In general, the Java community tends to build structural elements (such as frameworks and design patterns) to solve problems, building up abstraction layers upon layers. In Ruby, developers tend to use metaprogramming to build downward, using the simplest underlying mechanism they can leverage. Contrast any Java web framework with Ruby on Rails; or compare builders, where Weirich used metaprogramming to strip out what he didn't need, allowing him to leverage a language feature.

If you rely solely on method_missing, the code in Listing 12 won't work because the class method is already defined in Ruby as part of Object, which (as in the Java language) is the base class for all classes. Obviously, method_missing won't work with existing methods. This would seem to doom this approach. However, Jim Weirich (the creator of Builder) came up with an elegant solution: he created BlankSlate. BlankSlate is a class that inherits from Object but programmatically removes all the methods normally found in Object. This allows him to leverage the method_missing infrastructure without any annoying side effects.

This BlankSlate mechanism is so powerful and useful that it's being built into the next major version of Ruby. In Ruby 1.9, BasicObject becomes the very top of the object hierarchy, with Object as its immediate descendant. Having BasicObject makes building builder DSLs much easier because you'll no longer need BlankSlate.
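The collision and its cure can be shown without the Builder library. The sketch below is not Builder's implementation; TagRecording, ObjectRecorder, and BlankRecorder are hypothetical names, and the output is a flat list of tags rather than nested XML. It assumes Ruby 1.9 or later, where BasicObject is the root of the class hierarchy.

```ruby
# Shared behavior: every unknown method name is recorded as a tag.
module TagRecording
  def method_missing(tag, content = nil)
    (@lines ||= []) << "<#{tag}>#{content}</#{tag}>"
    self
  end

  def target!
    @lines.join("\n")
  end
end

# Inherits from Object, so built-in methods such as "class" already exist.
class ObjectRecorder
  include TagRecording
end

# Inherits from BasicObject, which defines almost no methods of its own.
# The leading :: is needed because constants are not resolved through Object here.
class BlankRecorder < BasicObject
  include ::TagRecording
end

plain = ObjectRecorder.new
plain.name("Neo")                 # no such method, so method_missing records a tag
collision =
  begin
    plain.class("pod-born")       # the built-in class method is found first
    "recorded"
  rescue ArgumentError
    "built-in class method was called instead"
  end
puts collision

blank = BlankRecorder.new
blank.name("Neo")
blank.class("pod-born")           # nothing to collide with: routed to method_missing
puts blank.target!
```

With the Object-based recorder, the call to class reaches the built-in method, which takes no arguments and therefore raises ArgumentError. With the BasicObject-based recorder the same call is captured as a tag.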

The ability to create a DSL like Builder illustrates why expressiveness and power in languages are so critical. The amount of code in Ruby's Builder is much smaller than similar libraries from other languages because it was written atop a more flexible design medium: Ruby.

Conclusion

I've been making the case since the beginning of this series that the design of software systems encompasses its complete source code, which implies that you have a broader design palette if you use more-expressive languages. This applies not only to your choice of general-purpose language (Java, Ruby, Groovy, Clojure), but also to the languages you can write atop your base language using DSLs. Building a language that expresses your business concepts exactly becomes a valuable asset to your organization: you are capturing important ways of solving real problems in a language highly suited to the purpose.

Even if your organization won't switch to a language like Ruby or Groovy for most development, you can "sneak in" these languages by using tools implemented in them, such as RSpec and easyb (see Related topics). By bringing these alternate languages in through the back door, you can help those who are needlessly wary of introducing new languages understand that they offer significant benefits.

Related topics