Using Ruby Object Type Classes to Safely Build Data

When building collections of data you will find situations where the types aren’t what you planned to work with. And when I say types I’m speaking generically of arrays, hashes, strings, integers, nil, etc. Everything’s cosy when you know what you’re getting. For example, putting 10 integers into an Array:

    arr = []
    10.times do |num|
      arr << num
    end
    arr # => [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

But if you want to work with something that is not yet defined, or is nil, then you’ll get an error:

    y << 5
    # NameError: undefined local variable or method `y' for main:Object

    y = nil # => nil
    y << 5
    # NoMethodError: undefined method `<<' for nil:NilClass

You will find yourself more likely to run into this kind of situation when working with Hashes:

    example = {} # => {}
    example[:foo] # => nil
    example[:foo] << 5
    # NoMethodError: undefined method `<<' for nil:NilClass

In each of these examples I’ve been using << to put something into what we would like to be an Array. But the Array Object must first be instantiated before we can insert items into it. We could do it like so:

    example[:bar] = [] # => []
    example[:bar] << 5 # => [5]

But now we’ve taken two lines to accomplish this. We may end up writing LOTS of code with similar behavior, so we don’t really want to write more than we need to. Well, there’s good news. Ruby has conversion methods, named after the classes themselves (Array(), Hash(), String()), that create the right Objects for just this kind of situation.

    Array(nil) # => []
    Hash(nil) # => {}
    String(nil) # => ""

As you can see, we handed each of these methods nil and it returned an empty Object of the matching type. So with this we can take our example and one-line the nil Object assignment and insertion.
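It helps to know what these methods do when the value is not nil, since that is what makes them safe to call every time. Here is a small sketch; the variable names are only for illustration. Note the one surprise: handing a Hash to Array() gives back key/value pairs, not a one-element Array.

```ruby
# Values that are already the right type pass through unchanged.
array_from_nil   = Array(nil)       # => []
array_from_array = Array([1, 2])    # => [1, 2]
array_from_range = Array(1..3)      # => [1, 2, 3]
array_from_str   = Array("a")       # => ["a"]

# Careful: a Hash is converted with to_a, so it becomes pairs.
array_from_hash  = Array({ a: 1 })  # => [[:a, 1]]

hash_from_nil    = Hash(nil)        # => {}
hash_from_empty  = Hash([])         # => {}
hash_from_hash   = Hash({ a: 1 })   # the same key and value, untouched

string_from_nil  = String(nil)      # => ""
string_from_int  = String(42)       # => "42"
```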

    example[:fiz] # => nil
    example[:fiz] = Array(example[:fiz]) << 5 # => [5]

And it worked! Array(example[:fiz]) created an empty Array since example[:fiz] was nil, which then allowed us to insert 5 into the Array and finally save it, on the left side of the equals, into example[:fiz]. It looks a lot like the way += works except that += will not work on nil. Now let’s try it with the loop.

    sample = {} # => {}
    10.times do |num|
      sample[:fez] << num
    end
    # NoMethodError: undefined method `<<' for nil:NilClass

    10.times do |num|
      sample[:fez] = Array(sample[:fez]) << num
    end
    sample[:fez] # => [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In this situation the Array() method ensured we had an Array the very first time. It took nil and made an empty Array. From then on Array() was handed an existing Array and simply returned it, so each cycle of the loop appended the new number to the end of the same Array. With this we don’t have to care about nil being an issue. That issue’s been dealt with.
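If you want to convince yourself that the loop keeps reusing one Array, here is a quick sketch that records the object_id after every pass. The ids variable is just my own bookkeeping for the demonstration.

```ruby
sample = {}
ids = []

3.times do |num|
  sample[:fez] = Array(sample[:fez]) << num
  ids << sample[:fez].object_id   # remember which Array we ended up with
end

sample[:fez]     # => [0, 1, 2]
ids.uniq.size    # => 1, the same Array object on every pass

# Array() hands an existing Array straight back instead of copying it.
list = [1, 2]
same_object = Array(list).equal?(list)   # => true
```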

A Complex Example

When organizing data you will get into far more complex situations where you will need this. A good example is working through files and directories and organizing them into collections. Here’s the start of an MP3 play-list creator I’m working on. It finds all mp3s in subdirectories and then uses each file’s folder as the play-list group it belongs to.

    def parse_dir(dir)
      result = {}
      Dir.glob("#{dir}/**/*.mp3").each {|path|
        m3u = "#{path.split('/')[-2].gsub(' ','_').downcase}.m3u"
        result[m3u] = Hash(result[m3u]).update({
          files: Array(Hash(result[m3u])[:files]).<<({
            path: path,
            filename: path.split('/').last
          })
        })
      }
      result
    end

There’s a lot going on here. You can see that I’m using both Array() and Hash(). The first time each of these methods is reached for a given play-list, the Object inside it evaluates as nil. So it creates a new empty Array or Hash that the collection then gets built from.

Let’s break it down. Dir.glob("#{dir}/**/*.mp3") takes the path we hand in to the parse_dir method and goes through all subdirectories no matter how deep. The double stars ** are what has the glob method traversing all the directories. The end of the string *.mp3 selects any file that ends with .mp3 . The result of this will be an Array of matching paths which we can iterate over (a list of results).
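To see what that glob pattern picks up, here is a sketch that builds a throwaway directory tree and runs the same pattern against it. The folder and file names are made up for the demonstration. One thing it shows is that ** also matches zero directories, so an mp3 sitting directly in the top folder is included too.

```ruby
require "tmpdir"
require "fileutils"

found = nil

Dir.mktmpdir do |dir|
  # Made-up layout: one nested folder, three mp3s, one file to be ignored.
  FileUtils.mkdir_p(File.join(dir, "rock", "live"))
  FileUtils.touch(File.join(dir, "top.mp3"))
  FileUtils.touch(File.join(dir, "rock", "a.mp3"))
  FileUtils.touch(File.join(dir, "rock", "live", "b.mp3"))
  FileUtils.touch(File.join(dir, "rock", "notes.txt"))

  # Same pattern as parse_dir; strip the temp prefix so the result is readable.
  found = Dir.glob("#{dir}/**/*.mp3").map { |path| path.sub("#{dir}/", "") }.sort
end

found # => ["rock/a.mp3", "rock/live/b.mp3", "top.mp3"]
```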

Now that we have the list of results we want to take each one and place them in a “play-list” group in our Hash called result. So with each we hand each item in the list to the variable path and start our process.

First we want to create the play-list name. So we take the file’s complete path and split it by the directory separator “/”. From there we take the second from last item [-2], which is the directory the file is in ([-1] is the file name itself). We’ll want uniform file names, so we replace the spaces with underscores and lower-case the whole thing. From there we have our string “my_directory_name.m3u” stored in the variable m3u.
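Here is that naming step pulled apart on a single path so you can watch each piece. The path is a made-up example. The last line is an alternative I’m adding for comparison, using File.dirname and File.basename to reach the same folder name.

```ruby
path = "/music/Road Trip Mix/01 intro.mp3"   # made-up example path

parts = path.split('/')
# => ["", "music", "Road Trip Mix", "01 intro.mp3"]

filename = parts[-1]   # => "01 intro.mp3"
folder   = parts[-2]   # => "Road Trip Mix"

m3u = "#{folder.gsub(' ', '_').downcase}.m3u"
# => "road_trip_mix.m3u"

# Alternative way to get the containing folder's name.
folder_alt = File.basename(File.dirname(path))   # => "Road Trip Mix"
```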

Now we apply the technique I’ve shown here with result[m3u] = Hash(result[m3u]). The first time this code is cycled through result[m3u] is nil, so Hash() turns it into an empty Hash {} and it gets assigned to result[m3u] = {}.

Next we update the Hash with new values. Now this may take a little bit to wrap your head around so I’ll see if I can simplify it. The first time this block is run it looks like this (with nils):

    result[m3u] = Hash(nil).update({
      files: Array(Hash(nil)[:files]).<<({
        path: path,
        filename: path.split('/').last
      })
    })

The first time through, Hash(nil)[:files] looks up the key :files on an empty Hash, and {}[:files] #=> nil . So that turns it into files: Array(nil).<< which further turns into files: [] << . So the first time, the :files key inside the result[m3u] Hash is created with an empty Array, and the first item is then inserted into that Array via the << method. The Object getting inserted into that Array is the hand-written Hash you see above, which will look something like:

    {
      path: "/computer/path/to/file.mp3",
      filename: "file.mp3"
    }

Each time the loop goes through it will now update the Array within the Hash within the Hash with the individual file details as hashes of their own. Let me simplify it. So result has multiple keys labelled as m3u lists (“my_directory_name.m3u”) based on the directory names. Each of those keys will access the Hash that has the key :files which returns an Array (list) of files with their own hashes of path/filename.
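To make the final shape concrete, here is a sketch that runs the same grouping over a plain list of paths, so it doesn’t need any files on disk. The group_paths helper and the sample paths are made up for this illustration; the body is the same logic as parse_dir.

```ruby
# Hypothetical helper: parse_dir's grouping, fed by a list instead of Dir.glob.
def group_paths(paths)
  result = {}
  paths.each do |path|
    m3u = "#{path.split('/')[-2].gsub(' ', '_').downcase}.m3u"
    result[m3u] = Hash(result[m3u]).update({
      files: Array(Hash(result[m3u])[:files]) << {
        path: path,
        filename: path.split('/').last
      }
    })
  end
  result
end

paths = [
  "/music/Road Trip/a.mp3",
  "/music/Road Trip/b.mp3",
  "/music/Chill/c.mp3"
]

grouped = group_paths(paths)

grouped.keys
# => ["road_trip.m3u", "chill.m3u"]

road_trip_files = grouped["road_trip.m3u"][:files].map { |file| file[:filename] }
# => ["a.mp3", "b.mp3"]

chill_paths = grouped["chill.m3u"][:files].map { |file| file[:path] }
# => ["/music/Chill/c.mp3"]
```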

And now we have successfully grouped all the file details by directory name. If you’re wondering why I used a Hash with a :files key instead of just an Array here, it is because I have more items in the Hash in my production code. My next step in the project is to dynamically render an m3u play-list as a view and use it to stream audio over my local area network. It’s fun and I’m looking forward to using it.

Cautionary Note:

These methods are good for working with collections. But not all Ruby Type Classes are equal. If you use the Integer() method on nil it doesn’t produce zero.

    Integer(nil)
    # TypeError: can't convert nil into Integer

    nil.to_i # => 0

So you’ll need to use .to_i for integers. That being the case if you want to use any other methods like this then test them first to ensure their behavior.
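Testing a method against nil before relying on it only takes a few lines. Here is a sketch of how I might do that; the probe helper is a name I made up for this example. It also shows that Float() behaves like Integer() here, and that on Ruby 2.6 and later Integer() accepts exception: false if you would rather get nil back than an error.

```ruby
# Hypothetical helper: run a block and report the error class if it raises.
def probe
  yield
rescue TypeError => e
  e.class
end

integer_result = probe { Integer(nil) }   # => TypeError
float_result   = probe { Float(nil) }     # => TypeError

# The to_* methods on nil are the forgiving ones.
nil.to_i   # => 0
nil.to_f   # => 0.0
nil.to_s   # => ""
nil.to_a   # => []

# Ruby 2.6+: ask Integer() for nil instead of an exception.
quiet_result = Integer(nil, exception: false)   # => nil
```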

Summary

Using these type conversion methods is a good practice. Putting them on the right side of an assignment will save you time and effort. It’s also nice that Array([]) won’t double the depth of the Array into [[]] but still returns []. So these methods are a safety net against nil. They are like having your cake and eating it too ^_^.

I hope you enjoyed this and that it was insightful. Please comment, share, subscribe to my RSS Feed, and follow me on twitter @6ftdan!

God Bless!

-Daniel P. Clark

Image by Mark Thurman via the Creative Commons Attribution-NonCommercial-ShareAlike 2.0 Generic License.