Ruby Code & Style

Creating DSLs with Ruby

by Jim Freeze

March 16, 2006



Broadly speaking, there are two ways to create a DSL. One is to invent a syntax from scratch, and build an interpreter or compiler. The other is to tailor an existing general-purpose language by adding or changing methods, operators, and default actions. This article explores using the latter method to build a DSL on top of Ruby.

A DSL, or domain specific language, is a (usually small) programming or description language designed for a fairly narrow purpose. In contrast to general-purpose languages designed to handle arbitrary computational tasks, DSLs are specific to particular domains. You can create a DSL in two basic ways:

1. Invent a DSL syntax from scratch, and build an interpreter or compiler.
2. Tailor an existing general-purpose language by adding or changing methods, operators, and default actions.

An advantage of the second approach is that you save time because you don't have to write and debug a new language, leaving you more time to focus on the problem confronting the end-user. Among the disadvantages is that the DSL will be constrained by the syntax and the capabilities of the underlying general-purpose language. Furthermore, building on another language often means that the full power of the base language is available to the end-user, which may be a plus or a minus depending on the circumstances. This article explores using the second approach to build a DSL on top of Ruby.

Describing Stackup Models

At my job, where I work as an interconnect modeling engineer, we needed a way to describe the vertical geometric profile of the circuitry on a semiconductor wafer. Descriptions were kept in a stackup model file. (We coined the word stackup because of the logical way the metal wires are constructed by stacking layers on top of each other in the fabrication process.) The issue was that each vendor had its own format for describing a stackup, but we wanted a common format so we could convert between the various file types. In other words, we needed to define a common stackup DSL and write a program that could export from our stackup format to any of the other vendors’ stackup formats.

The vendors did not use a sophisticated DSL. Instead, their languages contained only static data elements in a mostly flat textual database. Their file formats did not allow for parameterized types, variables, constants, or equations, just static data. Further, the formats were overly simple: each was either line based or block based with one level of hierarchy.

We started out describing our stackup format with limited ambition since we only needed to meet the vendors’ level of implementation. But once we saw the benefit of having a more expressive language, we quickly augmented our format. Why were we able to do this, but not the vendors? I believe it was because we used Ruby as a DSL, rather than starting from scratch in C as the vendors did. Granted, other languages could have been used, but I don’t think the finished product would have been as elegant; the selection of the general-purpose language is a critical step.

I also believe the vendors’ development speed was handicapped by using C, prompting them to keep their stackup syntax simple to parse. Perhaps not coincidentally, many of the vendors used simple syntax constructs in their file formats that are common to many DSLs. Because these constructs occur so often, we are going to first look at how we can mimic them in Ruby before we move into more sophisticated language constructs.

Line-based and Block-level DSL Constructs

Line-based constructs are a way to assign a value or a range of values to a parameter. Among the vendors’ files we were looking at, the following formats were used:

1. parameter = value
2. parameter value
3. parameter min_value max_value step_value

Formats 1 and 2 are equivalent except for the implied ’=’ in 2. Format 3 assigned a range of values to the parameter.

The more complicated formats contained a block construct. The two that we encountered could be manually parsed with a line-based parser and a stack, or a key-letter and word parser with a stack. These two formats are illustrated below:

    begin
      type = TYPE
      name = NAME
      param1 = value1
      param2 = value2
      ...
    end

One of the block formats used “C” style curly braces to identify a block, but parameter/value pairs were separated by white space.

    TYPE NAME {
      param1 = value1
      param2 = value2
    }

Third Time’s a Charm

When we were building our DSL for the stackup file we solved the problem three times. First, we wrote our own parser and decided it was too much work to maintain, not only the code but also the documentation. Since our DSL was sufficiently complicated, it wasn’t obvious how to use all its features and therefore it had to be copiously documented.

Next, for a short period, we implemented the DSL in XML. This removed the need for us to write our own parser, as XML is universally understood, but it contained too much noise and obscured the contents of the file. Our engineers found it too difficult to mentally task-switch between thinking about the meaning of the stackup and mentally parsing XML. For me, the lesson learned was that XML is not to be read by humans and probably a bad choice for a DSL, regardless of the parsing benefits.

Finally, we implemented the DSL in Ruby. The implementation was quick since Ruby provides the parsing. Documentation on the parser (i.e. Ruby) was not required since it is already available. And the final DSL was easily understood by humans, yet compact and versatile.

So, let’s build a DSL in Ruby that lets us define ‘parameter = value’ statements. Consider the following hypothetical DSL file.

    % cat params_with_equal.dsl
    name = fred
    parameter = .55

This is not valid Ruby code, so we need to modify the syntax slightly so Ruby accepts it. Let’s change it to:

    % cat params_with_equal.dsl
    name = "fred"
    parameter = 0.55

Once we get the DSL to follow valid Ruby syntax, Ruby does all the work to parse the file and hold the data in a way that we can operate on it. Now let’s write some Ruby code to read this DSL.

First we want to encapsulate these parameters somehow. A good way is to put them into a class. We’ll call this class MyDSL.

    % cat mydsl.rb
    class MyDSL
      ...
    end#class MyDSL

From the developer’s perspective, we want a simple and straightforward way to parse the DSL file. Something like:

my_dsl = MyDSL.load(filename)

So, let’s write the class method load:

    def self.load(filename)
      dsl = new
      dsl.instance_eval(File.read(filename), filename)
      dsl
    end

The class method load creates a MyDSL object and calls instance_eval on the DSL file (params_with_equal.dsl above). The second argument to instance_eval is optional and allows Ruby to report a filename on parse errors. An optional third argument (not shown) gives you the ability to provide a starting line number for parse error reporting.
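As a side sketch of those optional arguments, the snippet below evaluates a two-line DSL string whose second line is deliberately wrong, passing a made-up filename (example.dsl) and a made-up starting line (10). TinyDSL and its name method are hypothetical stand-ins for MyDSL, not part of the article's code. Under those assumptions, the location Ruby reports for the error uses the supplied filename and counts lines from the supplied starting line.

```ruby
# Hypothetical stand-in for MyDSL with a single known keyword.
class TinyDSL
  def name(value = nil)
    if value.nil?
      @name
    else
      @name = value
    end
  end
end

# DSL text with a deliberate error on its second line.
dsl_source = <<~DSL
  name "fred"
  no_such_keyword 42
DSL

error_location = nil
begin
  # Pretend the text came from example.dsl and that its first line is line 10.
  TinyDSL.new.instance_eval(dsl_source, "example.dsl", 10)
rescue NoMethodError => e
  error_location = e.backtrace.first
end

puts error_location  # begins with "example.dsl:11"
```

Without the second and third arguments the same error would be reported against a generic eval location, which is much harder for a DSL user to act on.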

Is this code going to work? Let’s see what happens:

    % cat dsl-loader.rb
    require 'mydsl'

    my_dsl = MyDSL.load(ARGV.shift) # put the DSL filename on the command line
    p my_dsl
    p my_dsl.instance_variables
    % ruby dsl-loader.rb params_with_equal.dsl
    #<MyDSL:0x89cd8>
    []

What happened? Where did name and parameter go? Well, since name and parameter are on the left hand side of the equals sign, Ruby thinks they are local variables. We can tell Ruby otherwise by writing self.name = "fred" and self.parameter = 0.55 in the DSL file or we can impose upon the user to do this using the '@' symbol:

    @name = "fred"
    @parameter = 0.55

But that is kind of ugly and, to me, about the same as if we had written

    $name = "fred"
    $parameter = 0.55
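The local-variable behavior described above can be seen in isolation with a small sketch. The Settings class here is hypothetical and exists only for this demonstration; it has an ordinary attr_accessor, so any difference between the two calls comes from how Ruby parses the assignment.

```ruby
# Hypothetical class with a normal accessor pair for name.
class Settings
  attr_accessor :name
end

plain = Settings.new
plain.instance_eval('name = "fred"')          # parsed as a local variable assignment

explicit = Settings.new
explicit.instance_eval('self.name = "fred"')  # explicit receiver, so name= is called

p plain.name     # nil
p explicit.name  # "fred"
```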

Another way to let Ruby know the context of these methods is to declare the scope explicitly by yielding self (the MyDSL object instance) to a block. To do this, we will need to add a top level method to jump start our DSL and put the contents inside of the attached block. Our modified DSL now looks like:

    % cat params_with_equal2.dsl
    define_parameters do |p|
      p.name = "fred"
      p.parameter = 0.55
    end

where we have defined define_parameters as an instance method:

    % cat mydsl2.rb
    class MyDSL
      def define_parameters
        yield self
      end

      def self.load(filename)
        dsl = new
        dsl.instance_eval(File.read(filename), filename)
        dsl
      end
    end#class MyDSL

And we change the require in dsl-loader to use the new version of the MyDSL class in mydsl2.rb:

    % cat dsl-loader.rb
    require 'mydsl2'

    my_dsl = MyDSL.load(ARGV.shift)
    p my_dsl
    p my_dsl.instance_variables

Theoretically, this should work, but let’s test it out just to make sure.

    % ruby dsl-loader.rb params_with_equal2.dsl
    params_with_equal2.dsl:2:in `load': undefined method `name=' for #<MyDSL:0x26300> (NoMethodError)

Oops. We forgot the accessors for name and parameter. Let’s add those and look at the complete program:

    % cat mydsl2.rb
    class MyDSL
      attr_accessor :name, :parameter

      def define_parameters
        yield self
      end

      def self.load(filename)
        # ... same as before
      end
    end

Now, let's test it again.

    % ruby dsl-loader.rb params_with_equal2.dsl
    #<MyDSL:0x25ec8 @name="fred", @parameter=0.55>
    ["@name", "@parameter"]

Success! This now works, but we have added two extra lines to the DSL file and some noise with the ‘p.’ notation. This notation is better suited to files with multiple levels of hierarchy, where there is actually a need for and a benefit from explicitly specifying context. In our simple case we can implicitly define context and leave no doubt for Ruby that name and parameter are methods. We do this by removing the ’=’ sign and writing the DSL file as

    % cat params.dsl
    name "fred"
    parameter 0.55

Now we need to define a new type of accessor for name and parameter. The trick here is to realize that name without an argument is a reader for @name, and name with one or more arguments is a setter for @name. (Note: it is convenient to use this methodology even when multiple levels of hierarchy are present and context is explicitly declared.) We define the accessors for name and parameter by removing the attr_accessor line and adding the following code:

    % cat mydsl3.rb
    class MyDSL
      def name(*val)
        if val.empty?
          @name
        else
          @name = val.size == 1 ? val[0] : val
        end
      end

      def parameter(*val)
        if val.empty?
          @parameter
        else
          @parameter = val.size == 1 ? val[0] : val
        end
      end

      def self.load(filename)
        # ... same as before
      end
    end#class MyDSL

If either name or parameter is seen without arguments, they will return their value. If arguments are present, they will be assigned the value when a single argument is present, or they will be assigned to an array of values for multiple arguments.
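The multiple-argument case is what lets this style of accessor cover the vendors' third line format (parameter min_value max_value step_value), provided the values are separated by commas so the line is valid Ruby. A minimal sketch, assuming a hypothetical ParamDSL class and a made-up width parameter:

```ruby
# Hypothetical class using the reader/setter pattern from the text.
class ParamDSL
  def width(*val)
    if val.empty?
      @width
    else
      @width = val.size == 1 ? val[0] : val
    end
  end
end

single = ParamDSL.new
single.instance_eval("width 0.55")           # one argument: stored as a scalar

swept = ParamDSL.new
swept.instance_eval("width 0.1, 0.5, 0.1")   # min, max, step: stored as an array

p single.width  # 0.55
p swept.width   # [0.1, 0.5, 0.1]
```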

Let’s run our sample parser (changed to require the file mydsl3.rb) to test our handiwork:

    % ruby dsl-loader.rb params.dsl
    #<MyDSL:0x25edc @parameter=0.55, @name="fred">
    ["@parameter", "@name"]

Success again! But defining these accessors explicitly is a pain. So let’s define a custom DSL accessor and make it available to all classes. We do this by putting the method in the Module class.

    % cat dslhelper.rb
    class Module
      def dsl_accessor(*symbols)
        symbols.each { |sym|
          class_eval %{
            def #{sym}(*val)
              if val.empty?
                @#{sym}
              else
                @#{sym} = val.size == 1 ? val[0] : val
              end
            end
          }
        }
      end
    end

The above code simply defines the dsl_accessor method which creates our DSL specific accessors. We now plug it into the application and use dsl_accessor instead of attr_accessor to get:

    % cat mydsl4.rb
    require 'dslhelper'

    class MyDSL
      dsl_accessor :name, :parameter

      def self.load(filename)
        # ... same as before
      end
    end#class MyDSL

Again, we update the require statement in dsl-loader.rb to load the mydsl4.rb file and run the loader:

    % ruby dsl-loader.rb params.dsl
    #<MyDSL:0x25edc @parameter=0.55, @name="fred">
    ["@parameter", "@name"]
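For readers who would rather not reopen Module or build methods from strings, here is one possible variant of dsl_accessor. It is a sketch, not the article's implementation: it swaps class_eval on a string for define_method, and it lives in a hypothetical DSLAccessor module that a class opts into with extend. SampleDSL is likewise a made-up name.

```ruby
# Hypothetical opt-in module; classes that want dsl_accessor extend it.
module DSLAccessor
  def dsl_accessor(*symbols)
    symbols.each do |sym|
      ivar = "@#{sym}"
      # The block receives the same varargs the string-eval version did.
      define_method(sym) do |*val|
        if val.empty?
          instance_variable_get(ivar)
        else
          instance_variable_set(ivar, val.size == 1 ? val[0] : val)
        end
      end
    end
  end
end

class SampleDSL
  extend DSLAccessor
  dsl_accessor :name, :parameter
end

sample = SampleDSL.new
sample.instance_eval(<<~DSL)
  name "fred"
  parameter 0.55
DSL

p sample.name       # "fred"
p sample.parameter  # 0.55
```

The trade-off is scope: the article's version makes dsl_accessor available in every class, while this one only affects classes that extend the module.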

This is all well and good, but what if we don’t know the parameter names in advance? Depending on the use cases for the DSL, parameter names may be generated by the user. Never fear. With Ruby, we have the power of method_missing. A two-line method added to MyDSL will define a DSL attribute with dsl_accessor on demand. That is, if a value is to be assigned to a (thus far) non-existent parameter, method_missing will define the getters and setters and assign the value to the parameter.

    % cat mydsl5.rb
    require 'dslhelper'

    class MyDSL
      def method_missing(sym, *args)
        self.class.dsl_accessor sym
        send(sym, *args)
      end

      def self.load(filename)
        # ... Same as before
      end
    end
    % head -1 dsl-loader.rb
    require 'mydsl5'
    % ruby dsl-loader.rb params.dsl
    #<MyDSL:0x25b80 @parameter=0.55, @name="fred">
    ["@parameter", "@name"]
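The on-demand definition can also be watched in a self-contained sketch. This is a variation on the listing above rather than a copy of it: the hypothetical OpenDSL class defines the accessor inline with define_method, and it adds one assumption of its own, namely that a name used without arguments before it was ever assigned is treated as a typo and passed on to Ruby's normal error handling.

```ruby
# Hypothetical class; parameters come into existence the first time they are assigned.
class OpenDSL
  def method_missing(sym, *args)
    # Assumption for this sketch: reading a never-assigned name is an error.
    return super if args.empty?

    ivar = "@#{sym}"
    self.class.send(:define_method, sym) do |*val|
      if val.empty?
        instance_variable_get(ivar)
      else
        instance_variable_set(ivar, val.size == 1 ? val[0] : val)
      end
    end
    send(sym, *args)  # the real method exists now, so this does not recurse
  end
end

open_dsl = OpenDSL.new
open_dsl.instance_eval(<<~DSL)
  vendor "acme"
  layers 3, 5
DSL

typo_caught = false
begin
  open_dsl.instance_eval("vendr")  # misspelled and never assigned
rescue NameError
  typo_caught = true
end

p open_dsl.vendor  # "acme"
p open_dsl.layers  # [3, 5]
p typo_caught      # true
```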

Wow! Doesn't that make you feel good? With just a little bit of code, we have a parser that can read and define an arbitrary number of parameters. Well, almost. What if the end-user doesn't know Ruby and uses parameter names that collide with existing method calls? For example, what if our DSL file contains the following:

    % cat params_with_keyword.dsl
    methods %w(one two three)
    id 12345
    % ruby dsl-loader.rb params_with_keyword.dsl
    params_with_keyword.dsl:2:in `id': wrong number of arguments (1 for 0) (ArgumentError)

Oh, how embarrassing. Well, we can fix this (mostly) in short order with a little help from a class called BlankSlate [0], which was initially conceived by Jim Weirich [1]. The BlankSlate class used here is a little different than the one introduced by Jim simply because we want to keep a little more functionality around. So we keep seven methods. You can experiment with these to see which ones are absolutely required and which ones we are using just to visualize the contents of our MyDSL object.

    % cat mydsl6.rb
    require 'dslhelper'

    class BlankSlate
      instance_methods.each { |m|
        # m.to_s lets the check work whether Ruby returns strings or symbols
        undef_method(m) unless %w(
          __send__ __id__ send class
          inspect instance_eval instance_variables
        ).include?(m.to_s)
      }
    end#class BlankSlate

    # MyDSL now inherits from BlankSlate
    class MyDSL < BlankSlate
      # ... nothing new here, move along...
    end#class MyDSL

Now when we try to load the DSL file that is loaded with keywords, we should get something a little more sensible:

    % head -1 dsl-loader.rb
    require 'mydsl6'
    % ruby dsl-loader.rb params_with_keyword.dsl
    #<MyDSL:0x23538 @id=12345, @methods=["one", "two", "three"]>
    ["@id", "@methods"]

And sure enough, we do. The good news is that we can remove spurious methods and free up more parameter names for our end-users. However, note that we can't give end-users a completely unrestrained license to use any name for a parameter. This is one of the downsides of using a general-purpose programming language as a DSL, but I think that an end-user being prohibited from using class as a parameter name has only a small risk of being a deal killer.
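Ruby 1.9 and later ship a built-in near-empty base class, BasicObject, which can play a similar role to the hand-rolled BlankSlate. The sketch below is an alternative under that assumption, not the article's code: CleanDSL and its __params__ reader are hypothetical names, and values are kept in a hash rather than in instance variables because BasicObject does not provide instance_variables.

```ruby
# Hypothetical DSL class built on Ruby's BasicObject instead of BlankSlate.
class CleanDSL < BasicObject
  def initialize
    @params = {}
  end

  # Reader with an unusual name so it is unlikely to clash with a parameter.
  def __params__
    @params
  end

  # Every unknown name becomes a parameter: no arguments reads, arguments write.
  def method_missing(sym, *args)
    if args.empty?
      @params[sym]
    else
      @params[sym] = args.size == 1 ? args[0] : args
    end
  end
end

clean = CleanDSL.new
clean.instance_eval(<<~DSL)
  methods %w(one two three)
  id 12345
  inspect "closely"
DSL

p clean.__params__[:methods]  # ["one", "two", "three"]
p clean.__params__[:id]       # 12345
p clean.__params__[:inspect]  # "closely"
```

Language keywords such as class are still off limits here, since they are rejected by the parser before any method lookup happens.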

Getting More Sophisticated

We are now ready to look at more complex DSL features. Instead of a DSL for manipulating data, let’s look at one that performs a more concrete action. Imagine that we are tired of manually creating a common set of directories and files whenever we start a new project. It would be nice if we had Ruby do this for us. It would be even nicer if we had a small DSL such that we could modify the project directory structure without editing the low-level code.

We begin this project by defining a DSL that makes sense for this problem. The file below is our version 0.0.1 of just such a DSL.

    % cat project_template.dsl
    create_project do
      dir "bin" do
        create_from_template :exe, name
      end

      dir "lib" do
        create_rb_file name
        dir name do
          create_rb_file name
        end
      end

      dir "test"

      touch :CHANGELOG, :README, :TODO
    end

In this file, we create a project and add three directories and three files. Inside the "bin" directory we create an executable file with the same name as the project using the :exe template. In the "lib" directory, we create a .rb file and a directory, both named after the project. Inside that inner directory, we create another .rb file with the same name as the project. Next, back at the top level, the "test" directory is created, and, finally, three empty files are created.

The methods needed for this DSL are: create_project, dir, create_from_template, create_rb_file and touch. Let’s look at these methods one by one.

The create_project method is our top level wrapper. This method provides scope by letting us put all the DSL code inside a block. (Complete listings are at the end of the article.)

    def create_project()
      yield
    end

The dir method is the workhorse. This method not only creates the directory, it also maintains the current working directory in the @cwd instance variable. Here, the use of ensure allows us to trivially maintain the proper state of @cwd.

    def dir(dir_name)
      old_cwd = @cwd
      @cwd = File.join(@cwd, dir_name)
      FileUtils.mkdir_p(@cwd)
      yield self if block_given?
    ensure
      @cwd = old_cwd
    end
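To see why ensure matters here, the following sketch isolates the bookkeeping from the file system. PathTracker is a hypothetical class that copies the structure of dir but never creates a directory; it only records what @cwd is at each point, including after a block raises.

```ruby
# Hypothetical class: same save/restore logic as dir, without touching the disk.
class PathTracker
  attr_reader :cwd

  def initialize(root)
    @cwd = root
  end

  def dir(dir_name)
    old_cwd = @cwd
    @cwd = File.join(@cwd, dir_name)
    yield self if block_given?
  ensure
    @cwd = old_cwd  # runs on normal exit and when the block raises
  end
end

tracker = PathTracker.new("proj")
seen = []
tracker.dir("lib") do
  seen << tracker.cwd
  tracker.dir("proj") { seen << tracker.cwd }
  seen << tracker.cwd  # back in lib after the nested call
end

begin
  tracker.dir("bin") { raise "template failed" }
rescue RuntimeError
  # ensure has already restored @cwd by the time the error reaches us
end

p seen         # ["proj/lib", "proj/lib/proj", "proj/lib"]
p tracker.cwd  # "proj"
```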

The touch and create_rb_file methods are the same except that the latter adds ".rb" to the filename. These methods may be given one or more filenames where the names can be either strings or symbols.

    def touch(*file_names)
      file_names.flatten.each { |file|
        FileUtils.touch(File.join(@cwd, "#{file}"))
      }
    end
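Since only touch is shown here, one plausible shape for create_rb_file is sketched below. It is an assumption based on the description (same behavior plus a ".rb" suffix, strings or symbols accepted). FileMaker is a hypothetical wrapper class, and the files are created in a temporary directory so the example cleans up after itself.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical holder for the two methods; @cwd plays the same role as in the text.
class FileMaker
  def initialize(cwd)
    @cwd = cwd
  end

  def touch(*file_names)
    file_names.flatten.each { |file| FileUtils.touch(File.join(@cwd, file.to_s)) }
  end

  # Assumed implementation: append ".rb" to each name and delegate to touch.
  def create_rb_file(*file_names)
    touch(file_names.flatten.map { |file| "#{file}.rb" })
  end
end

created = Dir.mktmpdir do |tmp|
  maker = FileMaker.new(tmp)
  maker.touch(:README, "TODO")
  maker.create_rb_file("fred", :helper)
  Dir.children(tmp).sort
end

p created  # ["README", "TODO", "fred.rb", "helper.rb"]
```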

Finally, the create_from_template method is just a quick dash to illustrate how one may put some actual functionality into a DSL. (See the source listings for the complete code.)
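The heart of that method is a substitution step: each %...% placeholder in a template is replaced by the result of evaluating its contents against the builder object. A minimal sketch of the idea, using a hypothetical TemplateContext class and a made-up three-line template:

```ruby
# Hypothetical context object; placeholders are evaluated against its methods.
class TemplateContext
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def render(template)
    # m includes the surrounding % signs, so m[1..-2] is the expression inside.
    template.gsub(/%[^%]+%/) { |m| instance_eval(m[1..-2]).to_s }
  end
end

TEMPLATE = "require '%name%'\nclass %name.capitalize%App\nend\n"

puts TemplateContext.new("fred").render(TEMPLATE)
# require 'fred'
# class FredApp
# end
```

Because the placeholder text is evaluated as Ruby, this approach is only appropriate for templates you control.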

To run and test the code, we build a small test application:

    % cat create_project.rb
    require 'project_builder'

    project_name = ARGV.shift
    proj = ProjectBuilder.load(project_name)
    puts "== DIR TREE OF PROJECT '#{project_name}' =="
    puts `find #{project_name}`

And the results are:

    % ruby create_project.rb fred
    == DIR TREE OF PROJECT 'fred' ==
    fred
    fred/bin
    fred/bin/fred
    fred/CHANGELOG
    fred/lib
    fred/lib/fred
    fred/lib/fred/fred.rb
    fred/lib/fred.rb
    fred/README
    fred/test
    fred/TODO

    % cat fred/bin/fred
    #!/usr/bin/env ruby

    require 'rubygems'
    require 'commandline'
    require 'fred'

    class FredApp < CommandLine::Application

      def initialize
      end

      def main
      end

    end#class FredApp

Wow! It worked! And with not much effort.

Summary

I work on many projects that require a rather detailed control flow description. For every project, this used to make me pause and consider how to get all this detailed configuration data into the application. Now, Ruby as a DSL is near the top of the list of possibilities, and usually solves the problem quickly and efficiently.

When I was doing Ruby training, I would take the class through a problem solving technique where we would describe the problem in plain English, then in pseudo code, and then in Ruby. But, in some cases, the pseudo code would be valid Ruby code. I think that the high readability quotient of Ruby makes it an ideal language for use as a DSL. And as Ruby becomes known by more people, DSLs written in Ruby will be a favorable way of communicating with an application.

Code listing for the ProjectBuilder DSL:

    % cat project_builder.rb
    require 'fileutils'

    class ProjectBuilder
      PROJECT_TEMPLATE_DSL = "project_template.dsl"

      attr_reader :name

      TEMPLATES = {
        :exe => <<-EOT
    #!/usr/bin/env ruby

    require 'rubygems'
    require 'commandline'
    require '%name%'

    class %name.capitalize%App < CommandLine::Application

      def initialize
      end

      def main
      end

    end#class %name.capitalize%App
        EOT
      }

      def initialize(name)
        @name = name
        @top_level_dir = Dir.pwd
        @project_dir = File.join(@top_level_dir, @name)
        FileUtils.mkdir_p(@project_dir)
        @cwd = @project_dir
      end

      def create_project
        yield
      end

      def self.load(project_name, dsl=PROJECT_TEMPLATE_DSL)
        proj = new(project_name)
        # Evaluate the DSL for its side effects; return the builder itself.
        proj.instance_eval(File.read(dsl), dsl)
        proj
      end

      def dir(dir_name)
        old_cwd = @cwd
        @cwd = File.join(@cwd, dir_name)
        FileUtils.mkdir_p(@cwd)
        yield self if block_given?
      ensure
        @cwd = old_cwd
      end

      def touch(*file_names)
        file_names.flatten.each { |file|
          FileUtils.touch(File.join(@cwd, "#{file}"))
        }
      end

      # Accepts one or more names (strings or symbols) and appends ".rb" to each.
      def create_rb_file(*file_names)
        file_names.flatten.each { |file| touch("#{file}.rb") }
      end

      def create_from_template(template_id, filename)
        File.open(File.join(@cwd, filename), "w+") { |f|
          # gsub (not gsub!) leaves the shared template string unmodified.
          str = TEMPLATES[template_id].gsub(/%[^%]+%/) { |m| instance_eval m[1..-2] }
          f.puts str
        }
      end
    end#class ProjectBuilder

    # Execute as:
    # ruby create_project.rb project_name

Resources

[0] BlankSlate is a Ruby class designed to create method-free objects.

http://onestepback.org/index.cgi/Tech/Ruby/BlankSlate.rdoc

[1] Jim Weirich is the creator of BlankSlate, as well as other notable Ruby tools and libraries.

http://onestepback.org

About the author

Jim Freeze has been a Ruby enthusiast since he learned of the language in early 2001.

An Electrical Engineer by trade working in the Semiconductor Industry, Jim has focused on extending Ruby into the EDA space and building libraries to make the language more palatable for the corporate community. Lately Jim has been working on integrating Ruby and Rails with Asterisk.

Jim is the author of the CommandLine and Stax gems.