Value objects in Ruby: Creating custom data types

Sep 01, 2015 Christopher Moeller

This is the first in a two-part series on value objects in Ruby and using them with a database. In the first part, we'll explore the benefits of using value objects in Ruby. The second part will be a tutorial on how to use a custom data type with ActiveRecord in a Ruby on Rails application.

Introduction

Ruby provides a rich set of value objects for things like IP addresses, Dates, Strings, Hashes, Arrays, etc. When working with data in our applications we can usually start with one of the types that Ruby gives us. This helps us get started very quickly but it can get out of hand as requirements change. Just about every application needs data in a particular format, like an email address or a list of investments.

Let's take a look at something we all do: using Ruby Strings to store email addresses and work with them.

Using a Ruby String to store an email address

We'll leave out data persistence for this post and focus on our application code. For now, we'll use irb as a playground. Let's fire it up and take a poke around with some email addresses.

It should come as no surprise that if we have two Strings with the same sequence of characters, they're equivalent:

    $ irb
    >> "user@example.com" == "user@example.com"
    #=> true

But what happens if somehow one of the Strings started with an upper case letter? Again, no surprise here:

    $ irb
    >> "user@example.com" == "User@example.com"
    #=> false

One quick and dirty way to solve this is by calling downcase on both strings:

    $ irb
    >> "user@example.com".downcase == "User@example.com".downcase
    #=> true

Perfect! But what did we do here? For all practical purposes, we know that email addresses are not case sensitive, so we added a little bit of code to handle that characteristic when comparing them. Now every time we need to compare a pair of email addresses we just need to remember to call downcase. No big deal, right? Well, until we miss one. Or two. And then we're passing inconsistent data throughout our application, meaning that we'll have unexpected, hard-to-track-down bugs.
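To make that failure mode concrete, here is a small sketch. The subscribers list and both helper methods are hypothetical names invented for illustration; the point is only that one call site normalizes and another forgets to.

```ruby
# Hypothetical data: addresses stored exactly as users typed them.
subscribers = ["User@example.com", "admin@example.com"]

# One call site remembers to downcase both sides of the comparison...
def subscribed?(list, email)
  list.map(&:downcase).include?(email.downcase)
end

# ...and another, written later, forgets.
def subscribed_without_downcase?(list, email)
  list.include?(email)
end

subscribed?(subscribers, "user@example.com")                  #=> true
subscribed_without_downcase?(subscribers, "user@example.com") #=> false
```

Both methods look reasonable in isolation, which is what makes the second one hard to spot in review.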

It turns out that email addresses aren't just strings; they're strings in a specific format. There are some specific rules that every email address follows (this is not an exhaustive list):

* It is case insensitive (as mentioned above)
* It has a local part and a domain part separated by an "@"
* There are further rules about characters: some are required, some are optional, and some are not permitted (like brackets and semicolons)
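The first two rules can be sketched with plain String methods. This is only an illustration under simple assumptions (the sample input and variable names are made up), not a full address parser:

```ruby
# Hypothetical input with stray whitespace and mixed case.
address = "  User@Example.com "

# Normalize first, so later comparisons are case insensitive.
normalized = address.strip.downcase

# rpartition splits on the last "@": [local part, "@", domain part].
local_part, _separator, domain_part = normalized.rpartition("@")

normalized   #=> "user@example.com"
local_part   #=> "user"
domain_part  #=> "example.com"
```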

Using a value object

While the full set of rules for email address formatting is very complex (the Wikipedia page is a good place to start), we don't need to validate against every single one. At the end of the day, the only way to verify an email address is to send it an email. That said, we do need to encode the email address rules that are important to our application somewhere. (Update: See the update at the end of this post for the final EmailAddress class.)

Let's start with a basic EmailAddress class:

    class EmailAddress
      include Comparable

      def initialize(string)
        if string =~ /@/
          @raw_email_address = string.downcase.strip
        else
          raise ArgumentError, "email address must have an '@'"
        end
      end

      def <=>(other)
        raw_email_address <=> other.to_s
      end

      def to_s
        raw_email_address
      end

      protected

      attr_reader :raw_email_address
    end

With some very basic validation in the initializer, we have a class we can use to compare two email addresses.

    $ irb
    >> require "./email_address"
    #=> true
    >> EmailAddress.new("user@example.com") == EmailAddress.new("user@example.com")
    #=> true
    >> EmailAddress.new("user@example.com") == EmailAddress.new("User@example.com")
    #=> true

We can also sort a list of email addresses because we included Comparable and implemented <=> :

    $ irb
    >> require "./email_address"
    #=> true
    >> email1 = EmailAddress.new("jason@example.com")
    #=> #<EmailAddress:0x007fa45a0a3e98 @raw_email_address="jason@example.com">
    >> email2 = EmailAddress.new("apple@example.com")
    #=> #<EmailAddress:0x007fa459ca8f78 @raw_email_address="apple@example.com">
    >> email3 = EmailAddress.new("zebra@example.com")
    #=> #<EmailAddress:0x007fa45a092e68 @raw_email_address="zebra@example.com">
    >> emails = [email1, email2, email3]
    #=> [#<EmailAddress:0x007fa45a0a3e98 @raw_email_address="jason@example.com">, #<EmailAddress:0x007fa459ca8f78 @raw_email_address="apple@example.com">, #<EmailAddress:0x007fa45a092e68 @raw_email_address="zebra@example.com">]
    >> emails.sort
    #=> [#<EmailAddress:0x007fa459ca8f78 @raw_email_address="apple@example.com">, #<EmailAddress:0x007fa45a0a3e98 @raw_email_address="jason@example.com">, #<EmailAddress:0x007fa45a092e68 @raw_email_address="zebra@example.com">]
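Including Comparable buys more than sort. As a minimal sketch, assuming the EmailAddress class from above (repeated here so the snippet runs on its own), the comparison operators, between?, and min work too:

```ruby
class EmailAddress
  include Comparable

  def initialize(string)
    if string =~ /@/
      @raw_email_address = string.downcase.strip
    else
      raise ArgumentError, "email address must have an '@'"
    end
  end

  # Comparable derives <, <=, ==, >=, > and between? from this method.
  def <=>(other)
    raw_email_address <=> other.to_s
  end

  def to_s
    raw_email_address
  end

  protected

  attr_reader :raw_email_address
end

apple = EmailAddress.new("apple@example.com")
jason = EmailAddress.new("Jason@example.com")
zebra = EmailAddress.new("zebra@example.com")

apple < jason                  #=> true
jason.between?(apple, zebra)   #=> true
[zebra, apple, jason].min.to_s #=> "apple@example.com"
```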

If someone tries to build an email address that doesn't match the regex, Ruby will blow up with an ArgumentError, preventing invalid data from being passed around in your application:

    $ irb
    >> require "./email_address"
    #=> true
    >> EmailAddress.new("yo!")
    ArgumentError: email address must have an '@'
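In practice the exception is usually handled once, at the edge of the application where untrusted input arrives. The sketch below assumes the EmailAddress class from above (repeated so it runs on its own); parse_email is a hypothetical helper, not part of the class:

```ruby
class EmailAddress
  include Comparable

  def initialize(string)
    if string =~ /@/
      @raw_email_address = string.downcase.strip
    else
      raise ArgumentError, "email address must have an '@'"
    end
  end

  def <=>(other)
    raw_email_address <=> other.to_s
  end

  def to_s
    raw_email_address
  end

  protected

  attr_reader :raw_email_address
end

# Hypothetical boundary helper: returns an EmailAddress, or nil when the
# input is rejected by the initializer.
def parse_email(input)
  EmailAddress.new(input)
rescue ArgumentError
  nil
end

parse_email("yo!")                   #=> nil
parse_email("User@example.com").to_s #=> "user@example.com"
```

Everything past that boundary can then assume it holds either a valid EmailAddress or nothing at all.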

Conclusion

Nearly every production application I've seen passes around built-in data types for almost everything that isn't considered a composite type (like a user). There's an up-front investment of time in creating custom types, yes, but using them also means more robust code, since you know you're passing around valid data, which eliminates one source of bugs.

What do you think? Where have you been using a built-in type when you should be using a custom value object? Physical addresses pop out to me but what about a person's name?

Update

Reddit user materialdesigner brought up the point that we should not call to_s on the comparing object in the <=> method because it forces a type coercion and you get the issue (like in JavaScript) of 1 == "1" . Instead a user of this class would need to create a new EmailAddress for the comparison. The final EmailAddress class is this:

    class EmailAddress
      include Comparable

      def initialize(string)
        if string =~ /@/
          @raw_email_address = string.downcase.strip
        else
          raise ArgumentError, "email address must have an '@'"
        end
      end

      def <=>(other)
        raw_email_address <=> other
      end

      def to_s
        raw_email_address
      end

      protected

      attr_reader :raw_email_address
    end
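For context, here is a sketch of the coercion problem in the earlier version. The class is renamed to the hypothetical CoercingEmailAddress so it is not confused with the final one; its body is otherwise the first version from this post.

```ruby
# The first version of the class, which calls to_s on the other object.
class CoercingEmailAddress
  include Comparable

  def initialize(string)
    if string =~ /@/
      @raw_email_address = string.downcase.strip
    else
      raise ArgumentError, "email address must have an '@'"
    end
  end

  def <=>(other)
    raw_email_address <=> other.to_s
  end

  def to_s
    raw_email_address
  end

  protected

  attr_reader :raw_email_address
end

email = CoercingEmailAddress.new("user@example.com")

# Comparable#== goes through <=>, which coerces the String...
email == "user@example.com" #=> true
# ...but String#== knows nothing about our class, so equality is lopsided.
"user@example.com" == email #=> false
```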

Now using the class: