In a typical web application, the most frequent task is getting parameters from a request. The Perl community and its popular frameworks have settled on two interfaces for this: param() and parameters(). Both have a few issues.

param()

Good old CGI.pm has a convenient param() method, which behaves differently depending on the context:

    my $q = CGI->new;
    my @keys  = $q->param();        # get the list of param names
    my $name  = $q->param('name');  # scalar context: always get single
    my @names = $q->param('name');  # list context: get multiple (if any)

This is quite nice, since your code says how you want the values by explicitly stating the context (scalar or list). The only place it bites is when you accidentally force a list context, such as when assigning the result into a hash or passing it to a method call:

    my $vars = {
        name  => $q->param('name'),          # Oops, it's a list context!
        email => scalar $q->param('email'),  # this is correct
    };

This code doesn't quite work if there are multiple (and an even number of) name parameters, or, even worse, it injects unintended parameters into $vars, which could be seriously dangerous if you pass them on to internal utilities or databases.
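To make the gotcha concrete, here is a minimal sketch. The param() sub below is a hypothetical stand-in that only mimics the context-sensitive behavior described above (it is not CGI.pm's code), and the request data is made up for illustration.

```perl
use strict;
use warnings;

# Made-up request data, as if the query were
# ?name=alice&name=is_admin&name=1&email=alice@example.com
my %request = (
    name  => [ 'alice', 'is_admin', '1' ],
    email => [ 'alice@example.com' ],
);

# Hypothetical stand-in for a CGI.pm-style param(): returns every value
# in list context and only the first value in scalar context.
sub param {
    my ($key) = @_;
    my @values = @{ $request{$key} || [] };
    return wantarray ? @values : $values[0];
}

# List context: all three name values are spliced into the hash constructor.
my $unsafe = {
    name  => param('name'),
    email => scalar param('email'),
};

# Forcing scalar context keeps exactly one value per key.
my $safe = {
    name  => scalar param('name'),
    email => scalar param('email'),
};

printf "unsafe keys: %s\n", join ', ', sort keys %$unsafe;  # email, is_admin, name
printf "safe keys:   %s\n", join ', ', sort keys %$safe;    # email, name
```

With three name values the extra pair slips in silently as is_admin => 1, without even a warning.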

So, param() is quite nice, but only if you are really careful about this list context gotcha.

parameters()

Catalyst added parameters() to its Catalyst::Request object; it gives you the values in an array ref if there are multiple.

    my $form = $c->request->parameters;

    # ?a=b&b=c
    # $form = { a => 'b', b => 'c' }

    # ?a=b&a=c&b=c
    # $form = { a => [ 'b', 'c' ], b => 'c' };

This might look intuitive, but wait a minute. The data structure varies with the user input rather than with how you code it, and that sucks. This means you always have to check whether the value is an array ref or not, since:

    my $v     = $c->request->parameters;
    my $query = $v->{query};
    my @names = @{$v->{name}};

$query might become ARRAY(0xabcdef) if there are multiple query= parameters in the query. The @names line might die with a "Can't use string as an ARRAY ref" error if there's only one (or zero) name parameter. This causes horrible issues when using standard HTML elements like select options or checkboxes, or tools like jQuery's serialize().

The correct way to write that would be:

    my $v     = $c->request->parameters;
    my $query = ref $v->{query} eq 'ARRAY' ? $v->{query}->[0] : $v->{query};
    my @names = ref $v->{name}  eq 'ARRAY' ? @{$v->{name}}    : ($v->{name});

and it is tedious and gross.
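One way to at least hide the check is a pair of small helpers. This is only a sketch: first_value() and all_values() are hypothetical names, not part of Catalyst, and the $form hash ref just mirrors the shape shown earlier for ?a=b&a=c&b=c.

```perl
use strict;
use warnings;

# Hypothetical helper: always return a single value (the first one).
sub first_value {
    my ($v) = @_;
    return ref $v eq 'ARRAY' ? $v->[0] : $v;
}

# Hypothetical helper: always return a list, empty if the key was absent.
sub all_values {
    my ($v) = @_;
    return () unless defined $v;
    return ref $v eq 'ARRAY' ? @$v : ($v);
}

# Same shape as the parameters() result for ?a=b&a=c&b=c
my $form = { a => [ 'b', 'c' ], b => 'c' };

my $a_first = first_value($form->{a});     # 'b'
my @a_all   = all_values($form->{a});      # ('b', 'c')
my @b_all   = all_values($form->{b});      # ('c')
my @missing = all_values($form->{nope});   # ()
```

It works, but every caller still has to remember to go through the helpers, which is exactly the kind of discipline the interface should not require.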

Rack::Request

Let's see how other languages try to solve this problem. First, Rack::Request.

Rack::Request has a params method which always returns a Hash object. It has its own rule for multiple values: if there are multiple values for the same key (like foo), the value is always the last one. By naming the key in a special way, like foo[], you can state that "this key might have multiple values", and req.params['foo'] returns an Array instead of a String.
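The naming rule is easy to sketch in Perl. This is a simplified illustration of the rule as described above, not Rack's actual parser: parse_query() is a hypothetical name, and the query string is assumed to be already URL-decoded.

```perl
use strict;
use warnings;

# Hypothetical parser illustrating the "foo[]" naming rule.
sub parse_query {
    my ($query) = @_;
    my %params;
    for my $pair (split /&/, $query) {
        my ($key, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        if ($key =~ s/\[\]\z//) {
            # "foo[]" always collects into an array, even for one value.
            $params{$key} = [] unless ref $params{$key} eq 'ARRAY';
            push @{ $params{$key} }, $value;
        }
        else {
            # Plain key: the last value wins.
            $params{$key} = $value;
        }
    }
    return \%params;
}

my $params = parse_query('foo=1&foo=2&bar[]=x&bar[]=y&baz[]=z');
# $params->{foo} is '2'
# $params->{bar} is [ 'x', 'y' ]
# $params->{baz} is [ 'z' ]
```

The shape of each value now depends on the key name you chose, not on how many values the user happened to send.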

It kind of hurts that you have to force this behavior in a low-level library like Rack, but I think this is a good middle ground, since the parameter names in your templates and request handler code specify whether you want an Array or a String. This technique has actually been ported to Perl in modules like Catalyst::Plugin::Params::Nested.

WebOb.py

WebOb is a Python library from the Paste project that handles WSGI request parameters and such, and it is used in Python frameworks such as Pylons. The WebOb documentation addresses this may-or-may-not-be-multiple params problem very clearly:

Several parts of WebOb use a “multidict”; this is a dictionary where a key can have multiple values. The quintessential example is a query string like ?pref=red&pref=blue; the pref variable has two values: red and blue. In a multidict, when you do request.GET['pref'] you’ll get back only 'blue' (the last value of pref). Sometimes returning a string, and sometimes returning a list, is the cause of frequent exceptions. If you want all the values back, use request.GET.getall('pref'). If you want to be sure there is one and only one value, use request.GET.getone('pref'), which will raise an exception if there is zero or more than one value for pref.

and I like it. It does the right thing if you treat it as a normal hash, but provides a method like getall to explicitly demand a list instead of a string.

Hash::MultiValue

So, I was thinking of stealing this idea for our Plack::Request, which currently inherits this sucky parameters() from HTTP::Engine (and, in turn, Catalyst::Request), something most of the Plack gang agree is a bad idea.

Last night I sketched an initial port of WebOb's MultiDict to Perl: Hash::MultiValue. It behaves like a normal hash with a single value per key, but has an API to get multiple values if you want (the first sketch used tie for this; see the update below):

    use Hash::MultiValue;

    my $hash = Hash::MultiValue->new(
        foo => 'a',
        foo => 'b',
        bar => 'baz',
    );

    # $hash is an object, but can be used as a hashref and DWIMs!
    my $a = $hash->{foo};   # 'b' (the last entry)
    my @k = keys %$hash;    # ('foo', 'bar') not guaranteed to be ordered

You can use the object just like a normal hash reference, and a lookup always returns the last value (if there are multiple). You can also call the OO API on the object to get multiple values, just like WebOb's MultiDict:

    my $foo  = $hash->get('foo');      # always single (regardless of context)
    my @bar  = $hash->get_all('bar');  # always multi
    my @keys = $hash->keys;            # Ordered keys

You should always use this get_all if you want multiple values. Being explicit is a good thing, right? There is also no list context gotcha like you see with CGI.pm style param().

Performance concern

There is a benchmark script attached, because the module used to do some tie/overload stuff, which definitely affects performance.

UPDATE: this module no longer uses tie or overload, but uses an inside-out object approach, thanks to Michael Peters and Aristotle for the suggestion! The post content has been updated accordingly.
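The inside-out idea can be sketched in a few lines. This is an illustration of the general technique only, not the module's actual source: My::MultiValue is a hypothetical package name, and the side table layout is my assumption for the sake of the example.

```perl
package My::MultiValue;   # hypothetical name, for illustration only
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Side table keyed by object address: holds every key/value pair in order.
my %pairs_for;

sub new {
    my ($class, @pairs) = @_;
    # Building a plain hash from the flat list keeps the last value per key,
    # so $obj->{key} works as ordinary hash access with no tie or overload.
    my $self = bless { @pairs }, $class;
    $pairs_for{ refaddr($self) } = [@pairs];
    return $self;
}

# Always a single value: the last one stored for the key.
sub get { $_[0]->{ $_[1] } }

# Always a list: every value stored for the key, in insertion order.
sub get_all {
    my ($self, $key) = @_;
    my @pairs = @{ $pairs_for{ refaddr($self) } };
    my @values;
    while (my ($k, $v) = splice @pairs, 0, 2) {
        push @values, $v if $k eq $key;
    }
    return @values;
}

# Clean up the side table when the object goes away.
sub DESTROY { delete $pairs_for{ refaddr($_[0]) } }

package main;

my $hash = My::MultiValue->new(foo => 'a', foo => 'b', bar => 'baz');
my $last = $hash->{foo};            # 'b'
my @all  = $hash->get_all('foo');   # ('a', 'b')
```

The object itself is a plain blessed hash, which is why ordinary lookups stay cheap; only the multi-value calls go through the side table.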

In my quick test of the inside-out object based approach, in a typical web request with only a few (~10) keys, the performance is about 21,000 QPS (Hash::MultiValue) vs 32,000 QPS (normal hash). So Hash::MultiValue runs at roughly 65% of the plain hash's speed.

Whether this becomes a critical overhead depends on how fast your web application is: the Plack standalone server runs at around 1500 QPS and most frameworks add enough overhead to bring that down to 500 QPS or less, so I think the overhead would end up being < 1% of your web application, and maybe it doesn't really matter.
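The back-of-the-envelope arithmetic behind that estimate looks roughly like this. It uses only the figures quoted above and assumes each benchmark iteration corresponds to handling the parameters of one request.

```perl
use strict;
use warnings;

# Figures quoted in the text above.
my $multivalue_rate = 21_000;   # Hash::MultiValue, iterations per second
my $plain_rate      = 32_000;   # normal hash, iterations per second
my $app_rate        = 500;      # typical framework, requests per second

# Extra time spent per request when using Hash::MultiValue.
my $extra_seconds   = 1 / $multivalue_rate - 1 / $plain_rate;
my $request_seconds = 1 / $app_rate;
my $share           = $extra_seconds / $request_seconds;

printf "extra cost per request: %.1f microseconds\n", $extra_seconds * 1e6;   # 16.4
printf "share of a %d QPS request: %.2f%%\n", $app_rate, $share * 100;        # 0.82%
```

At 500 QPS each request takes about 2 ms, so the extra 16 microseconds or so stays under 1%.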

I'll probably spend some time soon on the Plack-Request repository, creating a branch for this type of thing. Any input would be highly welcome ;)