All about Catalyst – interview of Matt S. Trout (Part 1 of 3)

We talk to Matt S. Trout, technical team leader at consulting firm Shadowcat Systems Limited, creator of the DBIx::Class ORM and of many other CPAN modules, and of course co-maintainer of the Catalyst web framework. These are some of his activities, but for this interview we are interested in Matt’s work with Catalyst.

Our discussion turned out not to be just about Catalyst though. While discussing the virtues of the framework, we learned, in Matt’s own colourful language, what makes other popular web frameworks tick, managed to bring the consultant out of him who shared invaluable thoughts on architecting software as well as on the possibility of Perl 6 someday replacing Perl 5 for web development.

We concluded that there’s no framework that wins by knockout, but that the game’s winner will be decided on points, points given by the final judge, your needs.

So, Matt, let’s start with the basics. Catalyst is a MVC framework. What is the MVC pattern and how does Catalyst implement it?

The fun part about MVC is that if you go through a dozen pages about it on Google you’ll end up with at least ten different definitions. The two that are probably most worthwhile paying attention to are the original and the Rails definitions.

The original concept of MVC came out of the Xerox PARC work and was invented for Smalltalk GUIs. It posits a model which is basically data that you’re live-manipulating, a view which is responsible for rendering that, and a controller which accepts user actions.

The key thing about it was that the view knew about the model, but nothing else. The controller knew about the model and the view, while the model was treated like a mushroom – kept in the dark; the view/controller classes handled changes to the model by using the observer pattern, so an event got fired when they changed (you’ll find that angular.js, for example, works on pretty much this basis – it’s very much a direct-UI-side pattern).

Now, what Rails calls MVC (and, pretty much, Catalyst also does) is a sort of attempt to squash that into the server side at which point your view is basically the sum of the templating system you’re using plus the browser’s rendering engine, and your controller is the sum of the browser’s dispatch of links and forms and the code that handles that server side. So, server side, you end up with the controller being the receiver for the HTTP request, which picks some model data and puts it in … usually some sort of unstructured bag. In Catalyst we have a hash attached to the request context called the stash. In Rails they use the controller’s instance attributes and then you hand that unstructured bag of model objects off to a template, which then renders it – this is your server-side view.

So, the request cycle for a traditional HTML rendering Catalyst app is:

the request comes in the appropriate controller is selected Catalyst calls that the controller code, performs any required alterations to the model then tells the view to render a template name with a set of data

The fun part, of course, is that for things like REST APIs you tend to think in terms of serialize/deserialize rather than event->GUI change, so at that point the controller basically becomes “the request handler” and the view part becomes pretty much vestigial, because the work to translate that data into something to display to the user is done elsewhere, usually client side JavaScript (well, assuming the client is a user facing app at all, anyway).

So, in practice, a lot of stuff isn’t exactly MVC … but there’ve been so many variants and reinterpretations of the pattern over the years that above all it means to “keep the interaction flow, the business logic, and the display separate … somehow” which is clearly a good thing, and idiomatic catalyst code tends to do so. The usual rule of thumb is “if this logic could make sense in a different UI (e.g. a CLI script or a cron job), then it probably belongs inside the domain code that your web app regards as its model”; plus “keep the templates simple, and keep their interaction with the model read–only – anything clever or mutating probably belongs in the controller”.

So you basically drive to push anything non-cosmetic out of the view, and then anything non-current-UI-specific out of the controller and the end result is at least approximately MVC for some of the definitions and ends up being decently maintainable

Can you swap template engines for the view as well as, at the backend, swap DBMS’s for the model?

Access to the models and views is built atop a fairly simple IOC system – inversion of control – so basically Catalyst loads and makes available whatever models and views are provided, and then the controller will ask Catalyst for the objects it needs. So the key thing is that a single view is responsible for a view onto the application; the templating engine is an implementation detail, in effect, and there are a bunch of view base classes that mean you don’t have to worry about that, but if you had an app with a main UI and an admin UI, you might decide to keep both those UIs within the same Catalyst application and have two views that use the same templating system but a completely different set of templates/HTML style/etc.

In terms of models, if you need support from your web framework to swap database backends, you’re doing something horribly wrong. The idea is that your domain model code is just something that exposes methods that the rest of the code uses – normally it doesn’t even live in the Catalyst model/ directory. In there are adapter classes that basically bolt external code into your application.

Because the domain code shouldn’t be web-specific in the first place you have some slightly more specialised adapters – notably Catalyst::model::DBIC::Schema which makes it easier to do a bunch of clever things involving DBIx::Class – but the DBIx::Class code, which is what talks to your database for you, is outside the scope of the Catalyst app itself.

The web application should be designed as an interface to the domain model which not only makes things a lot cleaner, but means that you can test your domain model code without needing to involve Catalyst at all. Running a full HTTP request cycle just to see if a web-independent calculation is implemented correctly is a waste of time, money and perfectly good electricity!

So, Catalyst isn’t so much DBMS-independent as domain-implementation-agnostic. There are catalyst apps that don’t even have a database, that manage, for example, LDAP trees, or serve files from disk (e.g. the app for our advent calendar). The model/view instantiation stuff is useful, but the crucial advantage is cultural. It’s not so much about explicitly building for pluggability as refusing to impose requirements on the domain code, at which point you don’t actually need to implement anything specific to be able to plug in pretty much whatever code is most appropriate. Sometimes opinion is really useful. Opinion about somebody else’s business logic, on the other hand, should in my experience be left to the domain experts rather than the web architect.

Catalyst also has a RESTfull interface. How is the URI mapped to an action?

The URI mapping works the same way it always does. Basically, you have methods that are each responsible for a chunk of the URL, so for a URL like /domain/example.com/user/bob you’d have basically a method per path element: the first one sets up any domain-generic stuff and the base collection, the second pulls the domain object out of the collection, then you go from there to a collection of users for that domain and pull the specific user. Catalyst’s chained dispatch system is basically entirely oriented around the URL space, drilling down through representations/entities anyway which is a key thing to do to achieve REST, but basically a good idea in terms of URL design anyway plus, because the core stuff is all about path matching, it becomes pretty natural to handle HTTP methods last. So there’s Catalyst::Action::REST and the core method matching stuff that makes it cleaner to do that, but basically they both just save you writing:

if (<GET request>) { ... } elsif (<POST request>) { ... } etc.

Of course you can do RESTful straight HTTP+HTML UIs, although personally I’ve found that style a little contrived in places. For APIs, though, the approach really shines but basically API code is – well, it’s going to be using a serializer/deserializer pair (usually JSON these days) instead of form parsing and a view – but apart from that, the writing of the logic stuff isn’t hugely different. RESTful isn’t really about a specific interface, it’s about how you use the capabilities available. But the URI mapping and request dispatch cycle is a very rich set of capabilities – and allow a bunch of places to fiddle with dispatch during the matching process. Catalyst::Action::REST basically hijacks the part where Catalyst would normally call a method and calls a method based on the HTTP method instead; so, say, instead of user you’d have user_GET called. There’s also Catalyst::Controller::DBIC::API which can provide a fairly complete REST-style JSON API onto your DBIx::Class object graph.

So again it’s not so much that we have specific support for something, but that the features provide mechanism and then the policy/patterns you implement using those are enabled rather than dictated by the tools. I think the point I’m trying to make is that REST is about methods as verbs and about entities as first class things so it implies good URL design … but you can do good URL design and not do the rest of REST, it’s just that they caught on about the same time.

What about plugging CPAN modules in? I understand that this is another showcase of Catalyst’s extensibility. Can any module be used, or must they adhere to a public interface of some sort?

There’s very little interface; for most classes, either your own or from CPAN, Catalyst::model::Adaptor can do the trick. There are three versions of that, which are:

call new once, during startup, and hang onto the object

once, during startup, and hang onto the object call new no more than once per request, keeping the same object for the rest of the request once it’s been asked for

no more than once per request, keeping the same object for the rest of the request once it’s been asked for call new every time somebody asks for the object

The first one is probably most common, but it’s often nice to use the second approach so that your model can have a first class understanding of, for example, which user is currently logged in, if any, so that manages the lifecycle for you. Anything with a new method is going to work, which means any and all object-oriented stuff written according to convention in at least the past 10 years or so.

For anything else you break out Moose/Moo, and write yourself a quick normal class that wraps whatever crazy thing you’re using and now you’re back to it being easy (and you’ll probably find that class is more pleasant to use in all your code, anyway). Really, any attempt at automating the remaining cases would probably be more code to configure for whatever use-case you have than to just write the code to do it.

For example, sometimes you want a component that’s instantiated once, but then specialises itself as requested. A useful example would be “I want to keep the database connection persistent, but still have a concept of a current user to use to enforce restrictions on queries”. So there’s a role called Catalyst::Component::InstancePerContext that provides that – instead of using the Adaptor’s per-request version, you use a normal adaptor, and use that role in the class it constructs and then that object will get a method called on it once per request, which can return the final model object to be used by the controller code. I’ve probably expended more characters describing it than any given implementation takes, because it’s really just the implementation of a couple of short methods and besides that, the most common case for that is DBIx::Class. There’s also a PerRequestSchema role shipped with Catalyst::model::DBIC::Schema (which is basically a DBIx::Class-specialised adaptor, remember) that reduces it to something like:

sub _build_per_request_schema {

my ($self, $c) = @_;

$self->schema->restrict_with_object($c->user);

}

… but again the goal isn’t so much to have lots of full-featured integration code, but to minimise the need to write integration code in the first place.

In the forthcoming second part of the interview, we talk about the flexibility of Catalyst, its learning curve, Ruby on Rails, and the framework in Software Enginnering terms