What I learned from reading spreadsheet_architect code

Recently I heard about spreadsheet_architect gem and I wondered a few things after reading its README. It all lead me to look into its code and wonder…

What started my curiosity was the API:

class Post < ActiveRecord :: Base include SpreadsheetArchitect end Post . order ( name: :asc ). where ( published: true ). to_xlsx

I was like wow, why would anyone add such a thing to a model? Why is it a good idea to extend my ActiveRecord class (which most likely already has tons of responsibilities) with methods responsible for generating reporting files. Dunno. Doesn’t sound like the best use of Single responsibility principle.

But I was like… There must be a way to avoid it. After all, this gem can work with normal classes as well as documented:

Post . to_xlsx ( instances: posts_array )

or another way:

headers = [ 'Col 1' , 'Col 2' , 'Col 3' ] data = [[ 1 , 2 , 3 ], [ 4 , 5 , 6 ], [ 7 , 8 , 9 ]] SpreadsheetArchitect . to_xlsx ( data: data , headers: headers )

So probably including SpreadsheetArchitect is not mandatory.

I was thinking about checking the code to see how it gets the list of records and implementing a compatible interface inside a different class. I was pretty sure the method was #to_a because why not? What else could it be? So I hoped for this kind of workaround:

class PostsReport include SpreadsheetArchitect def to_a Post . order ( name: :asc ). where ( published: true ). to_a end end

or

class PostsReport include SpreadsheetArchitect def initialize ( collection ) @collection = collection end def to_a @collection . to_a end end

if we want to dynamically build and pass the collection for generating a report.

Then I could define methods such as def spreadsheet_columns which configure what is displayed in PostsReport instead of Posts because that sounded better to me. Less coupling, and bigger separation. Probably easier if we need to support multiple reports.

So I started looking into the code, how it works internally, to confirm my guesses.

module SpreadsheetArchitect module ClassMethods def to_csv ( opts = {}) opts = SpreadsheetArchitect :: Utils . get_cell_data ( opts , self ) # ...

It turned out the logic was implemented in get_cell_data . Let’s see what inside. It had 80 lines of code including this:

def self . get_cell_data ( options = {}, klass ) # ... if ! options [ :instances ] && defined? ( ActiveRecord ) && klass . ancestors . include? ( ActiveRecord :: Base ) # triggers the relation call, # not sure how this works but it does options [ :instances ] = klass . where ( nil ). to_a end # ...

Ok, so it turned out it is not just to_a but rather where(nil).to_a . I am not exactly sure why would where(nil) be necessary. At this point, I decided that working around the API is probably not worth it and too hard.

But I got curious how can it work? Because you see…

And now something completely different

to_csv , to_xlsx , methods are defined on Post (if you include the module)

Post.method(:to_csv) => #<Method: Class(SpreadsheetArchitect::ClassMethods)#to_csv>

The example shows

Post . order ( name: :asc ). to_xlsx

but that is a relation

Post.order(name: :asc).class # => Post::ActiveRecord_Relation

so this check should fail

klass.ancestors.include?(ActiveRecord::Base)

because

Post :: ActiveRecord_Relation . ancestors . include? ( ActiveRecord :: Base ) # => NameError: private constant #<Class:0x0000000478a240>::ActiveRecord_Relation referenced # doh... # workaround Post . const_get ( :ActiveRecord_Relation ). ancestors . include? ( ActiveRecord :: Base ) # => false

Yep, Ruby has private classes and they are used here.

So how can this gem work with relations? Let’s try something…

class Post < ApplicationRecord include SpreadsheetArchitect def self . bump puts self . inspect end end

Nothing out of ordinary here:

Post . bump # Post(id: integer, ...)

Post . where ( id: 1 ). inspect # Post Load (0.6ms) SELECT "posts".* FROM "posts" WHERE "posts"."id" = $1 LIMIT $2 [["id", 1], ["LIMIT", 11]] # => "#<ActiveRecord::Relation [#<Post id: 1, ...">]>"

But check this out:

Post . where ( id: 1 ). bump # Post(id: integer, ...)

Even though this is ActiveRecord::Relation , methods which are dynamically executed report that self is the Post class.

Let’s investigate further:

Post.where(id: 1).method(:method_missing) # => #<Method: Post::ActiveRecord_Relation(ActiveRecord::Delegation::ClassSpecificRelation)#method_missing> Post.where(id: 1).method(:method_missing).source_location # => [".rvm/gems/ruby-2.4.1/gems/activerecord-5.1.2/lib/active_record/relation/delegation.rb", 87]

Let’s see how this Rails magic works:

def method_missing ( method , * args , & block ) if @klass . respond_to? ( method ) self . class . delegate_to_scoped_klass ( method ) scoping { @klass . public_send ( method , * args , & block ) } elsif arel . respond_to? ( method ) self . class . delegate method , to: :arel arel . public_send ( method , * args , & block ) else super end end

as you can see the method bump is delegated to the class directly. So that’s why self is Post .

is delegated to the class directly. So that’s why is . But why is only a subset of records used for generating the file, and not all records, when the method is delegated directly to a class? Because scoping is used, which works like a global (probably a thread-safe global).

Post . where ( id: 1 ). method ( :scoping ). source_location # => [".rvm/gems/ruby-2.4.1/gems/activerecord-5.1.2/lib/active_record/relation.rb", 334]

# Scope all queries to the current scope. # # Comment.where(post_id: 1).scoping do # Comment.first # end # # => SELECT "comments".* FROM "comments" # WHERE "comments"."post_id" = 1 # ORDER BY "comments"."id" ASC # LIMIT 1 # # Please check unscoped if you want to remove all previous scopes (including # the default_scope) during the execution of a block. def scoping previous , klass . current_scope = klass . current_scope , self yield ensure klass . current_scope = previous end

In other words before Post#to_xlsx is called, Post.current_scope is set temporarily and as a result Post.where(nil).to_a called by the SpreadsheetArchitect is limited in scope and does not include all records. That’s how it works.

via GIPHY

Wanna learn more?

Subscribe to our newsletter to receive weekly free Ruby and Rails lessons.