Rodrigo Rosenfeld Rosas

Fri, 15 Dec 2017 12:30:00 +0000 (Updated at Mon, 18 Dec 2017 18:55:00 +0000)

The importance of the little details (skip this section unless you enjoy rants)

Seriously, this section is big and not important at all, so feel free to skip it completely right now if you're short on time or don't enjoy rants.

This is a rant explaining how ActiveRecord migrations completely defined my career in the past years.

I became curious about programming and computers when I was a kid. I remember reading a huge C++ book when I was about 10 years old. I had learned Clipper just a bit before, and I recall creating a Bingo game with Clipper, just because I wanted to play Bingo on those machines but couldn't :) While learning Clipper I also had my first experience with SQL and client-server design. Around that time my dad enrolled me in a few computer courses, such as "DOS/dBase III Plus", Clipper + SQL and, a few years later, Delphi + Advanced SQL. I learned C and C++ from books, and when services like Geocities were showing up and the Internet was reaching lots of homes, I also became interested in learning HTML to build my own sites, the new hotness at the time. Since I also wanted to serve dynamic content, I decided to learn Perl, as it was possible to find some free hosting services supporting it. That was the first interpreted language I learned, and I was really fascinated by it back then.

For a long while I used Perl exclusively for server-side web programming, since it was the only option I could find in free hosting services. While in Electrical Engineering college, though, I barely did any web programming. My programming tasks (extra classes) were mostly related to desktop programming (Delphi / C++), and to embedded and hard real-time systems using a mix of C and C++ during my master's thesis in Mobile Robotics. At that time I had a solid understanding of C and C++. Good times; I don't consider myself proficient with them anymore. That was also a time when I would read and know the entire W3C specs for HTML 4.01 and CSS. Today it's simply unfeasible to follow all the related specs completely, and I'm glad we have competition in the browser market, since it's really hard to keep up with all the changes happening every day.

Once I finished my master's thesis and had to find a job, I looked mostly for programming jobs. I considered myself good at programming and there were lots of interesting opportunities out there, while it was really hard to find companies in Brazil working on electronic device development or Robotics. I never actually enjoyed the other parts of Electrical Engineering, such as machines, power or electrical installations. I only enjoyed micro-electronics and the creation of embedded devices, and one should consider themselves very lucky to work in such an area in Brazil. I didn't want to count on luck, so I decided to focus on a programming career instead. I remember my first résumé was sent to Opera Software, the company behind my preferred browser, to apply for a C++ developer position, but after tons of interviews they didn't call me, so I'm not living in Norway these days ;)

After I had worked for 3 months on a new parking system using Delphi (despite asking to use C++ instead), the contract was finished, the product was already working in one of the malls in my city, and I had to look for another job. They actually offered to extend the contract so I would keep working with them, but at the same time I found another opportunity, and this time I would have to get back to web programming. That was in 2007. After several years away I couldn't really remember much of Perl, a lot had happened to web programming in the meantime, and I hadn't followed that progress.

After a few stressful days trying to learn about every major web programming framework (especially while trying to read about J2EE), I came to the conclusion that I would choose one of TurboGears, Django or Rails. I didn't know Java, Python or Ruby at that time, so the language didn't play an important role in choosing the framework. I was more interested in learning how the frameworks would make my life easier. At that time I had to maintain an existing ASP application, but at some point I would have to create a new application, I could choose whatever I wanted for it, and I definitely didn't enjoy ASP.

Since that application had to be displayed in Portuguese, I was considering the Python frameworks more than the Ruby one, as Rails didn't support internationalization at that time (i18n support was added to Rails 2 if I recall correctly) and even supporting UTF-8 wasn't straightforward with Ruby 1.8. Iconv and $KCODE were something you'd often hear about in the Ruby community back then, and there were tons of posts dedicated to encoding in Ruby.

But there was one Rails feature that made me change my mind and choose Rails over TurboGears or Django, which were supposed to work well with encodings and had announced internationalization support. It was the approach used to evolve databases, which my previous experience told me was the right strategy, while I was pretty scared by the model-centered approaches TurboGears and Django used to handle database evolution.

By that time I already had plenty of experience working with RDBMSs, especially Firebird, and having to deal with versioning the database and supporting multiple environments. That took me a lot of effort every time I started a new project, because I basically had to implement the features of ActiveRecord migrations myself every time, and I knew how time consuming that was. So I was glad I wouldn't have to roll my own solution if I used Rails, as ActiveRecord migrations were clearly more than enough for my needs and they worked pretty well. So, despite the issues with encoding and the lack of internationalization support, I decided to pick Rails because of ActiveRecord migrations.

And even though I haven't used ActiveRecord for several years, I've still been using its migration tools since 2007, more recently through my wrapper around them called active_record_migrations.

While I don't appreciate ActiveRecord as an ORM solution, I like its migrations tooling very much and they haven't changed much since I used them with Rails 1. The most significant changes since then were support for time-stamped migrations, the reversible block and finally, many years later, proper support for foreign keys (I struggled to add foreign keys using plain SQL for many years).

When I first read about Sequel I was fascinated by it. ActiveRecord wasn't built around Arel yet at that time, so all those lazy evaluations in Sequel were very appealing to me. But around 2009 I took another job opportunity, and this time I would work with Grails and Java rather than Rails, so I missed many changes to Rails for a while. In 2011 I changed jobs again and still had to support a Grails application, but I was free to do whatever I liked with the project. Since there were quite a lot of Grails bugs that were never fixed and that I couldn't find workarounds for, I decided to slowly migrate the Grails app to Rails. By that time Arel had been integrated into ActiveRecord, so it would finally support lazy evaluation as well, and I decided to try to stick with the Rails defaults. A week later I realized that there were still many more reasons why Sequel was far superior to ActiveRecord, so I replaced ActiveRecord with Sequel and never looked back. Best decision ever.

See, I'm a database guy. I work with the database, not against it. I don't feel the need to abstract the database away just because I'd rather write Ruby than SQL. I was able to appreciate not only SQL but several other powerful tools provided by good database vendors, such as triggers, CTEs, stored procedures, constraints, transactions, functions and foreign keys, and I definitely didn't want to avoid the database features at all. ActiveRecord seems to focus on hiding the database from the application, abstracting as much as possible so that you feel you're just working with objects. That's probably the main reason why I loved Sequel. Sequel embraced the database; it didn't fight it. It would try to make it as easy as possible to use whatever vendor-specific feature I wanted, without getting in my way. That's why I don't see Sequel as an ORM, but as a tool that allows me to write the SQL I want with a level of control and logic that would be pretty hard to achieve by building SQL queries through concatenation and manual typecasting of params and result sets.

I can always have a clear idea of the SQL that Sequel will generate, and the code is way more readable than it would be if I had to write the SQL by hand myself.

When I first learned about Sequel, Jeremy Evans was already its maintainer, but it seems Sequel was first created by Sharon Rosner. Recently I read this article, where this quote came to my attention:

I'm the original author of Sequel [1], an ORM for Ruby. Lately I've been finding that ORM's actually get in the way of accomplishing stuff. I think there's a case to be made for less abstraction in programming in general, and access to data stores is a major part of that. For an in-production system I've been maintaining for the last 10 years, I've recently ripped out the ORM code, replacing it with raw SQL queries, and a bit of DRY glue code. Results: less code, better performing queries, and less dependencies.

Sharon Rosner, Sequel original author

Good that it's working well for him, but I really find it weird to see that he would consider Sequel a traditional ORM. To me, Sequel allows me to write more maintainable queries, so I consider it more of a query builder than an ORM. If I had to build all SQL by hand and typecast params and result sets by hand, I think the result would be much worse, not better.

So, nowadays, I'm considering creating a brand new application after several years, and I'm frustrated that it takes a really long time to bootstrap a production-ready new application with state-of-the-art features. I started working on a sample project to serve as a starting point. The idea is to add features such as automated deployment, including blue-green (canary) strategies for zero downtime, using Roda as the Ruby framework, Webpack to bundle static resources, support for a lightweight alternative to React, such as Dio.js or Inferno.js, support for multiple environments, flexible configurations, client-side routing, proper security measures (CSRF, CSP headers), a proper authentication system, such as Rodauth, proper image uploading (think of Shrine), distributed logging (think of fluentd) with proper details, reliable background jobs, server-side and client-side testing, support for lazy code loading on both the client side and the server side, autoreloading of Ruby code on the server side, analytics, APM, client-side performance tricks such as link preloading, performance tracking for both server-side and client-side code, error tracking for both server-side and client-side code, integrated with sourcemaps and notifications from monitoring services, CDN support, full-text search through ElasticSearch or Solr, caching storage such as Redis, Docker-based infrastructure, backup, high availability of databases, and many, many more features that are supposed to be found in production-ready applications. As you can see, it's really frustrating to create a new application from scratch these days, as it seems any new product could easily take a year to reach a solid production-ready level. And, of course, support for database migrations.

The last thing I would want while working on this huge project is to waste time on a simple task such as managing the database state through migrations and related tools, especially as ActiveRecord migrations have been providing that for so long and work pretty well. However, this time I really wanted to ditch the dependency on railties for this new project, and active_record_migrations relies on railties for simplicity, so that it can take advantage of the Rails generators and just be a very simple wrapper around ActiveRecord migrations. But since AR itself won't be used in this project, I decided to spend several hours (about two full days) replicating the most important tools from ActiveRecord for Sequel. And this is how sequel_tools was born this week.

I find it interesting how such a little detail, like Rails bundling proper database migration tooling, influenced so much of my career, since I only learned Ruby because of Rails in the first place and I only chose Rails because of ActiveRecord migrations :) If I had been working with Python I most likely wouldn't have learned Ruby, wouldn't be working in my current job, and wouldn't have created many gems such as:

I've also been using Ruby for some other projects such as cert-generator, a Rack application that can be launched from a Docker container and that generates self-signed root CA and HTTPS certificates suited for development, in a way that is supported by modern browsers. I've written about it in my previous article.

Or I wouldn't have contributed to some Ruby projects such as Rails, orm_adapter-sequel, Redmine, Gitorious (now dead), Unicorn, RSpec-rails, RSpec, Capistrano, Sequel, js-routes, jbundler, database_cleaner, Devise, ChiliProject, RVM, rails-i18n, rb-readline and acl9. Most of them were minor contributions or documentation updates, but anyway... :)

Not to mention many bugs reported to MRI, JRuby and Ruby projects that have been fixed since then. And, before I forget, some features have been added to Ruby after Matz approved some of my requests. For example, the soon to be released Ruby 2.5 is introducing ERB#result_with_hash (see issue #8631).
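For readers who haven't met it yet, here is a minimal sketch of what ERB#result_with_hash allows on Ruby 2.5 or later. The template text and the `name` variable are made up for illustration.

```ruby
require 'erb'

# A tiny template that refers to a local variable called `name`
# (both the template and the variable are illustrative).
template = ERB.new('Hello, <%= name %>!')

# Ruby 2.5+: the template's local variables are passed as a Hash,
# so there is no need to build a Binding by hand.
greeting = template.result_with_hash(name: 'Sequel')

puts greeting
```

Before this method existed, the usual approach was to hand a Binding to ERB#result, which meant exposing whatever else happened to live in that binding.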

Or my request to remove the 'useless' 'concatenation' syntax, which was approved by Matz about 5 years ago. I still hope someone will implement it at some point :)
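In case the syntax in question isn't familiar, the short sketch below shows how current Ruby versions join adjacent string literals, plus one commonly cited way this can bite (a forgotten comma in an array). The variable names are illustrative only.

```ruby
# Two adjacent string literals are joined by the parser, no operator needed.
implicit = 'useless' 'concatenation'

# The explicit form, which makes the intent obvious.
explicit = 'useless' + 'concatenation'

puts implicit
puts implicit == explicit

# A forgotten comma silently produces a two-element array instead of three.
words = ['alpha' 'beta', 'gamma']
puts words.size
```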

I wonder what my current situation would be if ActiveRecord migrations hadn't been bundled with Rails in 2007 :) On the other hand, maybe I could have become rich working with Python? ;)

Introducing sequel_tools

If you're a Sequel user, you've probably spent a while searching for Rake integration around Sequel migrations and realized it took more time than you'd wished. I've been in the same situation, and it was so frustrating not being able to find all the tasks I wanted to have at my disposal that I'd often just forget about Sequel migrations and stick with ActiveRecord migrations. Not because I like the AR migrations DSL better (I don't, by the way), but because all the tooling is already there, ready to be used through some simple rake commands.

sequel_tools is my effort to come up with a de facto solution for integrating Sequel migrations and related tooling with Rake, and to see whether the Sequel community could concentrate its efforts on building a solid foundation for Sequel migrations together. I hope others will sympathize and contribute to the goal, so that we won't have to waste time thinking about migrations again when using Sequel.

Here are some of the supported actions. They can be easily integrated with Rake, but are implemented in such a way that other interfaces, such as command lines or Thor, should also be easy to build:

create the database;

drop the database;

migrate (optionally to a given version, or to the latest one if none is given);

generate a migration file (time-stamp based only);

status (which migrations are applied but missing locally and which are not yet applied to the database);

version (show current version / last applied migration);

rollback last applied migration which is present in the migrations path;

run a given migration up block if it hasn't been applied yet;

run a given migration down block if it has already been applied;

redo: runs a given migration down and up, which is useful when writing some complex migrations;

dump schema to schema.sql (configurable, can happen automatically upon migration - implemented just for PostgreSQL for now, by calling pg_dump, but should be easy to extend to support other databases: PRs or additional gems are welcome);

load from schema;

support for seeds.rb;

reset by re-running all migrations over a new database and running the seeds if available;

setup by loading the saved schema dump in a new database and running the seeds if available;

open a SQL console through the "shell" action;

execute an irb console through the "irb" action. This works like calling "bundle exec sequel connection_uri". The connection is stored in the DB constant in the irb session.
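To make the "other interfaces should be easy to build" idea a bit more concrete, here is a minimal sketch of actions kept in a plain registry that any front end could call. Everything in it (ActionRegistry, register, run, the config keys) is hypothetical and is not the sequel_tools API; it only illustrates the general design.

```ruby
# Hypothetical registry: NOT the sequel_tools API, just the general idea of
# keeping actions independent from Rake, Thor or any other front end.
class ActionRegistry
  def initialize
    @actions = {}
  end

  # Store a named action together with a description and the code to run.
  def register(name, description, &block)
    @actions[name.to_sym] = { description: description, block: block }
  end

  # Run an action by name, passing along a configuration Hash and arguments.
  def run(name, config, *args)
    action = @actions.fetch(name.to_sym) do
      raise ArgumentError, "unknown action: #{name}"
    end
    action[:block].call(config, *args)
  end

  # Name => description pairs, enough for a front end to build its own help.
  def descriptions
    @actions.transform_values { |action| action[:description] }
  end
end

registry = ActionRegistry.new
registry.register(:version, 'show the last applied migration') do |config|
  # A real action would query the database; this one only reports its input.
  "version action would run against #{config[:dbname]}"
end

config = { dbname: 'myapp_development' } # made-up database name
puts registry.run(:version, config)
```

A Rake task, a Thor command or a plain script would each be a thin layer that looks up the action name and calls `registry.run` with the same configuration.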

I decided not to support Integer based migrations at this point, as I can't see any drawback of time-stamp based migrations that would be addressed by the Integer strategy, while there are many problems with the Integer strategy even when a single developer is working on the project. I'm open to discussing this with anyone who thinks they can convince me that supporting Integer based migrations would bring something to the table. It's just that it's more code to maintain and test, and I'm not willing to do that unless there is indeed some advantage over time-stamp based migrations.
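As a rough illustration of the difference, the sketch below builds time-stamped file names and contrasts them with sequential Integer names. The `migration_filename` helper and the migration names are hypothetical, and the branch clash shown is just one commonly mentioned problem with Integer versions, not necessarily the full list hinted at above.

```ruby
# Build a time-stamped migration file name such as
# "20171215123000_create_users.rb" (hypothetical helper, for illustration).
def migration_filename(name, time = Time.now.utc)
  "#{time.strftime('%Y%m%d%H%M%S')}_#{name}.rb"
end

# Two migrations created on separate branches, a few minutes apart.
first  = migration_filename('create_users', Time.utc(2017, 12, 15, 12, 30, 0))
second = migration_filename('create_posts', Time.utc(2017, 12, 15, 12, 47, 10))

puts first
puts second

# With Integer versions both branches would most likely pick the same
# "next" number, and the clash only shows up when the branches are merged.
next_integer = 5
puts format('%03d_create_users.rb', next_integer)
puts format('%03d_create_posts.rb', next_integer)
```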

The project also allows missing migration files, since I find that useful especially when reviewing multiple branches that carry independent migrations.
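The idea behind tolerating missing files, and behind the status action listed earlier, can be sketched with two set differences. The version numbers are made up, and this is only an assumption about the general approach, not the actual sequel_tools implementation.

```ruby
# Versions recorded in the database as already applied (made-up values).
applied = %w[20171201100000 20171205093000 20171210150000]

# Versions found as files in the migrations directory of the current branch.
local = %w[20171201100000 20171210150000 20171214120000]

# Applied in the database but with no file on this branch, for example
# because the migration lives in another branch.
missing_locally = applied - local

# Present as files but not applied to the database yet.
pending = local - applied

puts "applied but missing locally: #{missing_locally.join(', ')}"
puts "not yet applied: #{pending.join(', ')}"
```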

I don't think it's a good idea to work with a Ruby format for storing the current schema, as a lot of things are specific to the database vendor. I never used the Ruby vendor-independent format in all those years, but if you think you'd value such a feature in case you just use the basics when designing the tables and want your project to support multiple database vendors, then go ahead and either send a Pull Request to make it configurable, or create an additional gem to add that feature and I can link to it in the documentation.
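For the SQL dump side of this, here is a minimal sketch that builds (without running) a pg_dump command line for a schema-only dump. The `schema_dump_command` helper, the database name and the user name are hypothetical; the options are regular pg_dump options, but which ones sequel_tools actually passes is not shown here and may differ.

```ruby
# Build (but do not run) a pg_dump invocation that writes only the schema.
# Hypothetical helper: the options sequel_tools really uses may differ.
def schema_dump_command(dbname:, output: 'schema.sql', host: nil, username: nil)
  command = ['pg_dump', '--schema-only', '--file', output]
  command += ['--host', host] if host
  command += ['--username', username] if username
  command << dbname
end

# Made-up database and user names.
command = schema_dump_command(dbname: 'myapp_development', username: 'myuser')
puts command.join(' ')

# To actually run it, something like this would do:
#   system(*command) or raise 'pg_dump failed'
```

Since the dump is plain SQL produced by the database's own tool, vendor-specific objects such as triggers and functions end up in the file as well.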

I'd love to get some feedback on what the Sequel community thinks about it. I'd love for us to reach some consensus on what the de facto solution for managing Sequel migrations in a somewhat feature-complete fashion should be, and would love to get the community's help in making such a de facto solution happen, in the best interest of us, happy (and sometimes frustrated by the lack of proper tooling around migrations - no more) Sequel users ;)

Please take a look at what the code looks like; I hope you find it easy to extend to your own needs. Any suggestions and feedback are very welcome, especially now that the project is new and we can change a lot before it gets a stable API.

May I count on your help? ;)