Resources this article is dealing with:

Imagine we have an existing Ruby on Rails application that stores file attachments with the Paperclip library and persists references to them in a relational database (e.g. PostgreSQL).

Note: the same approach works with any other Ruby file attachment library (e.g. CarrierWave, Dragonfly).

As an example we will use a dummy real estate application with a Property model that exposes the #title and #description of properties on the market. A Property has many images.

To keep the code examples simple we render just the URLs via a JSON API, but the same principles apply if you want to generate server-side HTML via ERB, Slim, Haml, …

Relational DB example first

We will use the standard Paperclip setup as described in https://github.com/thoughtbot/paperclip#models on our Image model, mounting Paperclip on attachment with two styles, thumb and screen.

# Gemfile
# ...
gem 'pg'
gem 'paperclip'
# ...

# app/models/property.rb
class Property < ActiveRecord::Base
  # DB attributes :id, :title, :description
  has_many :images
end

# app/models/image.rb
class Image < ActiveRecord::Base
  # DB attributes :id, :attachment_file_name, :updated_at
  belongs_to :property

  has_attached_file :attachment,
    styles: { thumb: '100x100>', screen: '1024x1024' }

  def screen_url
    attachment.url(:screen)
  end

  def thumb_url
    attachment.url(:thumb)
  end
end

Our controller looks like this:

# app/controllers/properties_controller.rb
class PropertiesController < ApplicationController
  # Load @properties before rendering
  before_action :set_properties

  def index
    render json: properties_as_json, layout: false
  end

  private

  def set_properties
    @properties =
      if search_term
        # Bind the whole pattern; a placeholder inside SQL quotes is not substituted correctly
        Property.where('title LIKE ?', "%#{search_term}%")
      else
        Property.all
      end
  end

  def search_term
    params["q"]
  end

  def properties_as_json
    PropertesSerializer.new(collection: @properties).as_json
  end
end

# config/routes.rb
# ...
resources :properties, only: [:index]
# ...

# app/serializer/properties_serializer.rb
class PropertesSerializer
  attr_reader :collection

  def initialize(collection:)
    @collection = collection
  end

  # Returns an array of hashes, one per property
  def as_json
    collection.map do |property|
      {
        id: property.id.to_i,
        title: property.title,
        description: property.description,
        images: property.images.map do |i|
          {
            thumb: i.thumb_url,
            screen: i.screen_url
          }
        end
      }
    end
  end
end

The JSON representation is generated by PropertesSerializer, which is just a plain old Ruby object that builds an array of hashes. Rails turns it into JSON at the controller level via the built-in render json: {}.
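To see the serializer's output shape without booting Rails, here is a minimal sketch that feeds the same hash-building logic with stand-in objects. The FakeProperty and FakeImage structs and the SketchPropertiesSerializer name are hypothetical placeholders, and the URLs are made up; only the structure of the result mirrors the serializer above.

```ruby
require 'json'

# Hypothetical stand-ins for the ActiveRecord models (not part of the app).
FakeImage    = Struct.new(:thumb_url, :screen_url)
FakeProperty = Struct.new(:id, :title, :description, :images)

# Plain Ruby object: takes a collection and returns an array of hashes.
class SketchPropertiesSerializer
  attr_reader :collection

  def initialize(collection:)
    @collection = collection
  end

  def as_json
    collection.map do |property|
      {
        id: property.id.to_i,
        title: property.title,
        description: property.description,
        images: property.images.map do |image|
          { thumb: image.thumb_url, screen: image.screen_url }
        end
      }
    end
  end
end

sample = [
  FakeProperty.new(123, 'really cool property', 'foobar',
                   [FakeImage.new('/thumb/foo.jpg', '/screen/foo.jpg')])
]

serialized = SketchPropertiesSerializer.new(collection: sample).as_json
puts JSON.pretty_generate(serialized)
```

Because the serializer only relies on duck typing (#id, #title, #description, #images), anything that responds to those methods can be passed in, which is what later makes it possible to swap the data source.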

The result will contain either all records when no search query parameter is provided, or just the records matching the search query.

curl localhost:3000/properties?q=cool

[
  {
    "id": 123,
    "title": "really cool property",
    "description": "foobar",
    "images": [
      {
        "thumb": "http://localhost:3000/..../thumb/foo.jpg",
        "screen": "http://localhost:3000/..../screen/foo.jpg"
      },
      {
        "thumb": "http://localhost:3000/..../thumb/bar.jpg",
        "screen": "http://localhost:3000/..../screen/bar.jpg"
      }
    ]
  }
]

Elasticsearch solution

The relational DB solution above is good enough if you have low traffic or really basic search requirements. But once you need more sophisticated search queries or want search results to be displayed faster, it’s time to introduce Elasticsearch.

Several times I’ve seen people introduce the elasticsearch-rails gem to run the search query, then just collect the ids of the matching records and use ActiveRecord to fetch them.

es_result = Property.search(query: { match_all: {} })

Property
  .where(id: es_result.map(&:id).map(&:to_i)) # first SQL query
  .each do |property|
    property.images # ...triggers another SQL query
  end

In my opinion that’s a huge waste of resources and time, as elasticsearch-model and elasticsearch-rails provide a really good data-mapper solution, so you don’t need to call the relational DB at all.

In this example we will go with this opinion and we won’t make any SQL call when searching.

We will use ActiveRecord relations only for constructing the Elasticsearch index (the #as_indexed_json method).

Let’s add Elasticsearch Rails libraries to our Gemfile

# Gemfile
# ...
gem 'pg'
gem 'elasticsearch-model'
gem 'elasticsearch-rails'
gem 'paperclip'
# ...

…and to the model we want to search (as described in https://github.com/elastic/elasticsearch-rails#usage)

We will search only on the Property model and include the image attributes in its document/index.

# app/models/property.rb
class Property < ActiveRecord::Base
  include Elasticsearch::Model
  include Elasticsearch::Model::Callbacks

  mapping do
    indexes :title
  end

  has_many :images

  # elasticsearch-model passes an options hash when it calls this method
  def as_indexed_json(options = {})
    {
      id: id,
      title: title,
      description: description,
      images: images.map(&:as_indexed_json)
    }
  end
end

# app/models/image.rb
class Image < ActiveRecord::Base
  belongs_to :property, touch: true

  has_attached_file :attachment,
    styles: { thumb: '100x100>', screen: '1024x1024' }

  def screen_url
    attachment.url(:screen)
  end

  def thumb_url
    attachment.url(:thumb)
  end

  # Only the attributes Paperclip needs to rebuild the URLs later
  def as_indexed_json(options = {})
    {
      id: id,
      updated_at: updated_at,
      attachment_file_name: attachment_file_name
    }
  end
end
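The nested document that ends up in the index can be previewed in plain Ruby. In this sketch SketchProperty and SketchImage are hypothetical structs standing in for the models, the sample values are invented, and the timestamp is formatted with Time#iso8601 as an assumption (Rails serializes times in its own way). The point is only that each property document embeds the three image attributes needed later.

```ruby
require 'json'
require 'time'

# Hypothetical stand-in for the Image model.
SketchImage = Struct.new(:id, :updated_at, :attachment_file_name) do
  def as_indexed_json(_options = {})
    {
      id: id,
      updated_at: updated_at.utc.iso8601,
      attachment_file_name: attachment_file_name
    }
  end
end

# Hypothetical stand-in for the Property model.
SketchProperty = Struct.new(:id, :title, :description, :images) do
  def as_indexed_json(_options = {})
    {
      id: id,
      title: title,
      description: description,
      images: images.map(&:as_indexed_json) # image data is embedded, not referenced
    }
  end
end

image    = SketchImage.new(1, Time.at(1_467_114_169), 'foo.jpg')
property = SketchProperty.new(123, 'really cool property', 'foobar', [image])

puts JSON.pretty_generate(property.as_indexed_json)
```

Embedding the image attributes in the property document is what allows a search hit to be rendered without a second lookup.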

Now let’s update our controller and instead of ActiveRecord search use Elasticsearch search.

# app/controllers/properties_controller.rb
class PropertiesController < ApplicationController
  # Load @properties before rendering
  before_action :set_properties

  def index
    render json: properties_as_json, layout: false
  end

  private

  def set_properties
    @properties =
      if search_term
        Property.search(query: { term: { title: search_term } })
      else
        Property.search(query: { match_all: {} })
      end
  end

  def search_term
    params["q"]
  end

  def properties_as_json
    PropertesSerializer.new(collection: @properties).as_json
  end
end

Property.search is just an alias for Property.__elasticsearch__.search, provided by the Elasticsearch Rails gems.
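The choice between the two query bodies can be isolated into a small pure function, which makes it easy to check without a running cluster. build_search_query is a hypothetical helper name, not part of elasticsearch-rails, and treating a blank term like a missing one is an assumption of this sketch (the controller above only checks whether the parameter is present).

```ruby
# Hypothetical helper: returns the Elasticsearch query body for a search term.
# Blank or missing terms fall back to match_all.
def build_search_query(search_term)
  term = search_term.to_s.strip
  return { query: { match_all: {} } } if term.empty?

  { query: { term: { title: term } } }
end

p build_search_query('cool')
p build_search_query(nil)
```

The returned hash could then be handed to Property.search. Keep in mind that a term query is not analyzed, so multi-word or capitalized input may not match an analyzed title field; a match query is a common alternative in that case.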

Now here comes the tricky part: generating the Paperclip image attachment URLs. Again, the easiest way would be to fetch each record with Image.find_by(id: es_image.id), but that would make unnecessary SQL calls.

Instead of that we will just initialize Image instance and we will pass the attributes required by Paperclip to generate the url. These attributes are:

id - inner part of the path

updated_at - used to generate timestamp query URLs (e.g.: http://localhost/.../foo.jpg?1467114169 )

attachment_file_name - file identifier

Depending on your configuration, other attributes may be required too.
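To illustrate why these three attributes are enough, the sketch below approximates how a URL is assembled under Paperclip's default URL pattern (/system/:class/:attachment/:id_partition/:style/:filename) with the timestamp appended. It is a simplified re-implementation for illustration only, not Paperclip's own code; sketch_attachment_url is a hypothetical name, and a custom :url or :path option in your app would change the result.

```ruby
# Simplified approximation of Paperclip's default URL interpolation.
# Assumes an Image model with an attachment named :attachment and default options.
def sketch_attachment_url(id:, updated_at:, file_name:, style:)
  # :id_partition splits the zero-padded id into groups of three digits.
  id_partition = format('%09d', id).scan(/\d{3}/).join('/')
  path = "/system/images/attachments/#{id_partition}/#{style}/#{file_name}"

  # The timestamp query string acts as a cache buster.
  "#{path}?#{updated_at.to_i}"
end

puts sketch_attachment_url(id: 123,
                           updated_at: Time.at(1_467_114_169),
                           file_name: 'foo.jpg',
                           style: :thumb)
```

Nothing here touches the database: the id, the file name and the timestamp fully determine the URL, so an unsaved Image instance carrying those values can produce the same result.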

# app/serializer/properties_serializer.rb
class PropertesSerializer
  attr_reader :collection

  def initialize(collection:)
    @collection = collection
  end

  def as_json
    collection.map do |property|
      {
        id: property.id,
        title: property.title,
        description: property.description,
        images: property.images.map do |es_image|
          # Build the in-memory Image once and reuse it for both styles
          image = es_image_to_image(es_image)
          {
            thumb: image.thumb_url,
            screen: image.screen_url
          }
        end
      }
    end
  end

  private

  # This Image instance exists purely for generating URLs;
  # don't persist any data on it
  def es_image_to_image(es_image)
    Image.new(
      id: es_image.id,
      updated_at: es_image.updated_at,
      attachment_file_name: es_image.attachment_file_name
    )
  end
end

Launch the Rails console and run:

Property.__elasticsearch__.create_index!
Property.import

…and restart Rails server.

Now you should have a fully functional Elasticsearch-backed search that renders image URLs without needing to access the PostgreSQL data.

curl localhost:3000/properties?q=cool

[
  {
    "id": 123,
    "title": "really cool property",
    "description": "foobar",
    "images": [
      {
        "thumb": "http://localhost:3000/..../thumb/foo.jpg",
        "screen": "http://localhost:3000/..../screen/foo.jpg"
      },
      {
        "thumb": "http://localhost:3000/..../thumb/bar.jpg",
        "screen": "http://localhost:3000/..../screen/bar.jpg"
      }
    ]
  }
]

Note: I wrote this article based on what I implemented in a different Rails project a few days ago. I haven’t tested the code examples, so there may be typos. However, I’ve created this Gist, which contains an example that I’m sure works.

Other Elasticsearch examples: