Short Preface

Many years ago, when I considered myself a C++ expert (which was presumptous yet excusable for 19 yrs), I’ve liked to think something like “My language is the most powerful and expressive in entire world! I can implement literally any other language’s feature with my templates and operator redefinitions and …”

Now, being Ruby developer in my mid-30s, I became much more pragmatic, but still curious (and sometimes envious) about other languages and with passion to experiments like “what can we do to have this Erlang/Scala/Rust/you-name-it cool thing in our everyday code?”

So, here is one of those exercises: taking the whole idea of “pattern matching” from functional languages and seeing, how we can make use of it.

What is Pattern Matching?

In computer science, pattern matching is the act of checking a given sequence of tokens for the presence of the constituents of some pattern. (Wikipedia)

Not sounding really cool?

In practice of languages like Haskell, there is a very powerful tool, allowing, in particular:

checking values against complex patterns (like “list consisting of element, then several, then exactly the same element”);

decomposition of values (after checking list agains (x:xs) pattern you can immediately work with “x” variable, having matched element;

pattern you can immediately work with “x” variable, having matched element; code execution dispatching by patters (you can have several defintions of function for different “argument patterns”);

guards, which are not only check data structure but also can perform any checks on value ( x < 15 ).

Read about Haskell pattern matching. It’s seriously mind-blowing.

Ruby Has (Sort Of) Pattern Matching

Three Equality Signs

Not everyone is aware of power of #=== operator. In official documentation it is named “case equality operator” (suggesting typical, but not only, use) and you implicitly use it in your everyday code:

x = 6.0 case x when ( 1 .. 5 ) # calls Range#===(x), which returns true if (1..5).cover?(x) when Fixnum # calls Class#===(x), which returns true if x.kind_of?(Fixnum) when /foo/ # calls Regexp#===(x), which returns true if x =~ /foo/ when "foo" # calls default Object#===(x), which is just an alias for == when -> ( x ){ x > 5 } # calls Proc#===(x), which returns true if call(x) returns true end

You can also define #=== for your class and have very complicated checks inside case .

This is hell of close to pattern matching. I’m constantly finding myself using this feature exactly for dispatching. It is not as cool as method overloading, yet concise enough:

def parse ( value ) case value when /^\d+$/ parse_int ( value ) when /^"([^"]+)"$/ parse_string_literal ( value ) #... end end # or def output_to_logs ( value ) case value when /^ERROR(.+)$/ log . error $1 when Numeric log . warn "Number value %.2f" % value when Exception log . error " #{ value . class } : #{ value . message } " when String log . info value # ... end end

It is less known that #=== method also used in grep :

[ 1 , nil , 2 , 'test' ]. grep ( Numeric ) # => [1,2]

And in fact for many tasks like parsing and evaluating complex contexts you can store your own array of heterogenous patterns and check some values against them.

But it is only a part of “pattern matching” puzzle.

Argument Decomposition in Expressions

Ruby definitely has some of decomposing-values-by-pattern features:

a , b = [ 1 , 2 , 3 ] # a == 1, b == 2, the rest is lost a , * b = [ 1 , 2 , 3 ] # a == 1, b == [2,3] a , ( b , c ), * d = [[ 1 , 2 ], [ 3 , 4 ], [ 5 , 6 ], [ 7 , 8 ]] # a == [1,2], b == 3, c == 4, d == [[5,6],[7,8]]

It is extremely useful for method/block arguments. When we are talking about Ruby 2.2, we also have **args for arguments, catching hashes, and keyword arguments, deconstructing them. Unfortunately, in other places you can’t has it:

a , ** b = [ 1 , { test: 'me' }] # SyntaxError

Even more unfortunately, decomposition in expressions have nothing in common with #=== , so, you can’t do anything like

list = [ 1 , 2 , 3 ] case list when [ 1 , * rest ] # undefined local variable or method `rest' p rest # you'd like to have [2,3] here... but ↑ end

Regexps

Regexps are different. They are patterns. The can #=== , as we’ve seen before. They also can decompose:

'test' =~ /(.+)st/ $1 # => 'te' # or, without nasty globals: Regexp . last_match [ 1 ] # 'te' # or, more Ruby-ish way m = /(.+)st/ . match ( 'test' ) m [ 1 ] # 'te' # or, like it's 2015 already m = /(?<somename>.+)st/ . match ( 'test' ) m [ 'somename' ] # 'te'

You can think of them as a separate sub-language inside Ruby, which can be in no use outside the domain of string checking. Yet regexps can learn us a lesson of how could matching+decomposition look in Ruby.

RSpec Matchers

They are used, well, for specs and inside framework. But, in fact, many RSpec matchers are complicated patterns for compound types. Like this:

# Fortunately, matchers are smart enough to implement === array_including ( 1 ) === [ 1 , 2 , 3 ] # => true hash_including ( test: instance_of ( String )) === { test: 'me' } # => true

That’s cool. Really, seriously cool and very Ruby-way. In fact, you can try it right now:

x = [ 1 , 2 , 3 ] case x when array_including ( instance_of ( Fixnum ), 2 ) # .... end

And it just works! I am, in fact, puzzled that we are not seeing more and more of this in real life code (or, I rather not; skip to “Being Pragmatic” section if you want to know why).

Yet those cool matchers still lack “decomposition” of values into variables. (To be fair, it’s not needed in context they are used today.)

So, we need to go deeper.

Implementing Missing Features

There is several attempts to bring pattern matching in Ruby.

For me, all of them (looking pretty similar) are missing the point. They are trying to invent some DSL, mimicing existing pattern-matching languages. Yet, IMO, pattern matching is not some “domain”, it’s rather basic language feature. So, the “domain-specific language” attempt to it is doomed by design.

Let’s try to invent solution, defined by next requirements:

it should play well with existing Ruby quasi-pattern-matching features: case , grep , #=== ;

, , ; it should not pollute global namespace (and monkey-patch core classes) with lot of “cool inventions”;

it should be some “opt-in” solution, which you can use in 2-3 places in your code (not changing anything else) and everything should just work, and just stay readable, not implying some “new paradigm”

providing all the cool features of deep matching, values decomposition, value guards and more.

So, meet matchish. For now, it is NOT a gem, but rather an attempt to show the way. If you are wondering why I’m not over-enthusiastic over matchish acceptance, please read “Being Pragmatic” section.

Types Being Algebraic

Task: treat any Ruby object as “algebraic” data type, which allows to check its deep inside structure.

Solution: At first, I assumed something like RSpec’s argument matchers, like any , array_including(...) , instance_of(...) and so on. But for “core language feature” it seemed to be too verbose and not very “guessable”. So, current solution looks like this:

require 'matchish/matchers' # normal Ruby behavior Object === 1 # => true [ 1 , 2 , 3 ] === [ 1 , 2 , 3 ] # => true # Method #matchish converts any object to pattern # exactly the same as above: Object . matchish === 1 # => true [ 1 , 2 , 3 ]. matchish === [ 1 , 2 , 3 ] # => true # ...but the patterns are powerful and extensible: # check object attribute: String . matchish . with ( length: 4 ) === 'test' # => true # or call object method: Array . matchish . which ( include?: 1 ) === [ 3 , 2 , 1 ] # => true # deep check with `===` for nested objects [ 1 , 2 , Numeric ]. matchish === [ 1 , 2 , 3 ] # => true # or even: [ 1 , * Numeric . matchish ]. matchish == [ 1 , 2 , 3 ] # => true # shorter version, yet without including entire module, is also available: require 'matchish/ma' Array . ma . which ( include?: 1 ) === [ 3 , 2 , 1 ] # => true Array . ma . which ( include?: 1 ) === [ 4 , 5 , 6 ] # => false # Aaaaand even shorter, for fun: require 'matchish/m' String . m . with ( length: 3 ) === 'wtf' # => true String . m . with ( length: 3 ) === 'sorry' # => false

Solution seems pretty straightforward for implementation, you can look at it in matchish/matchers.

As it is rather proof-of-concept than library, not all of useful matchers are implemented.

Compromise:

one (and only one) method matchish or ma monkey-patched in each object;

or monkey-patched in each object; (and it even can be refinement, not patch, in modern Ruby).

Showcase:

# classic functional list length, just for fun def length ( list ) case list when [ Object . m ]. m 1 when [ Object . m , * Object . m ]. m 1 + length ( list [ 1 ..- 1 ]) end end length ([ 1 , 2 , 3 ]) # => 3 # Something more interesting: %w[1980-12-28 1983-02-14 1992-03-30 1998-10-17] . map ( & Time . method ( :parse )). grep ( Time . m . with ( month: ( 2 .. 3 ))) # => [1983-02-14 00:00:00 +0300, 1992-03-30 00:00:00 +0300]

Values Being Decomposed

Task: alongside with matching pattern, store matched values in variables.

Solution 1: strangely beautiful

require 'matchish/decompose/bind' [ x = Object . m , x , * Object . m ]. m === [ 1 , 1 , 2 , 3 ] # => true x . value # => 1 [ x = Object . m , x , * Object . m ]. m === [ 1 , 2 , 3 , 4 ] # => false String . m . with ( length: ( x = Object . m )) === 'test' # => true x . value # => 4

The .value part is a bit ugly, but the overall solution looks cute, for me.

How it’s done? x = Object.m part is simple variable assignment. After it, x has value of Matchish::Matcher . After very first comparison with real value, pattern is “bound” to this value, and all forecoming comparisons only return true if the value is exactly the same.

NB: at the moment, I haven’t invented elegant solution to name the splatted part of match (like y = *any ).

Downside of this approach is you can’t save this kind of pattern in the constant (as it will not know the context of “x”, and is usable only once, after which pattern is “bound”).

Solution 2: Regexp -like

require 'matchish/decompose/as' [ Object . m . as ( :x ), Object . m . as ( :x ), * Object . m . as ( :y )]. m === [ 1 , 1 , 2 , 3 ] # => true Matchish . last_match [ :x ] # => 1 Matchish . last_match [ :y ] # => [2, 3] [ Object . m . as ( :x ), Object . m . as ( :x ), * Object . m ]. m === [ 1 , 2 , 3 , 4 ] # => false

Looks like “more complex DSL”, yet, at the same time, less magic. (By “less magic”, BTW, I mean that there is no need for “how is it done?” explanation, as any expirienced Ruby programmer reads this as pretty obvious code.)

Compromise: It seems both solutions are mere compromises, each having its pros and cons. I honestly don’t know which is “more Ruby-way”. The first one seems “smarter” while the last is definitely provides less magic and therefore more usability.

Showcase:

# For "match" solution: the same "list length" functional style def length ( list ) case list when [ Object . m ]. m 1 when [ Object . m , * Object . m . as ( :tail )]. m 1 + length ( Matchish . last_match [ :tail ]) end end length ([ 1 , 2 , 3 ]) # => 3 # For vars solution: some deep matching def check ( hash ) if {( key = Symbol . m ) => Time . m . with ( month: ( m = Object . m ))}. m === hash [ key . value , m . value ] end end check ( birthday: Time . new ( 2015 , 2 , 14 )) # => [:birthday, 2]

Imperative Inside Functional: Guards

Task: Last but not least. Sometimes you are finding yourself writing code like…

case x when 'something' if some_condition do_something else do_something_else end end

…which is not too bad, yet branches inside branches inside branches… The whole idea around pattern-matching is - let’s do the only match, no branching.

Solution:

require 'matchish/guard' x = 10 flag = true case x when Fixnum . m . guard { | x | x > 5 && flag } p "here" when Fixnum # other case, without guard p "there" end

Implementation, again, is pretty straightforward.

Being Pragmatic

All the examples above are implemented in my matchish repository and working. But currently I’m not releasing this “matchish” as a gem (and therefore, not tried to make it feature-full: for ex., of all possible RSpec-like matchers only several necessary for showcase were implemented).

Why?

Look, I’ve invented matchish and tried to use it when working on a parser project (unreleased yet) – that’s the domain where pattern matching should definitely be in use. And there was hard lesson learned:

All in all, powerful pattern matching need to be core language feature.

That’s because of:

natural look : the more I’ve used “cool” matchish feature, the more was fear of other developers being puzzled or disgusted with code… it became “extraterrastrial” a bit;

: the more I’ve used “cool” matchish feature, the more was fear of other developers being puzzled or disgusted with code… it became “extraterrastrial” a bit; more important: speed, optimization, overhead ; it became really easy to write “clever” algorithm implementation in some small method, then call it 1000 times and then profiler shows you tons of Matchish objects created, dispatching checks furiously while binding complex “matching context” and feasting with your memory and CPU time;

; it became really easy to write “clever” algorithm implementation in some small method, then call it 1000 times and then profiler shows you tons of Matchish objects created, dispatching checks furiously while binding complex “matching context” and feasting with your memory and CPU time; even more important: after all the considerations, it now seems for me that pattern matching IS a paradigm; either you do all the execution dispatching through the pattern matched function alternatives, or it will always be another (typically, more clear) way to express your intentions; the way your language (Ruby) is more naturally support.

Though, it still seems that some incremental enchancements of Ruby existing “match-y” features will be appreciated by most of us. Personally I’d be happy to write case code with inline decomposition, as shown above:

list = [ 1 , 2 , 3 ] case list when [ 1 , * tail ] p other # => [2,3] # ... end

Also, what about #=== for types like Array and Hash using === recursively?

# Current Ruby behavior: [ 1 , 2 , 3 ] === [ 1 , 2 , 3 ] # => true [ 1 , Numeric , Numeric ] === [ 1 , 2 , 3 ] # => false, but I want true!

P.S. How I Did This Article

As I wanted just to “show the point”, I didn’t want to write extensive specifications for weeks, then updating them, and only THEN writing an article.

But as I wanted to show the point, I’ve need some tool to check at least all code I’ve written for article (before writing an implementation, BTW :) will work.

So, I’ve used so-called “readme-driven development” approach (using your readme as a specification) and tool dokaz helping to enforce this approach (ok, it’s shameless self-promotion, as I’m the author of the tool).