If your pages contain anchor tags whose href attributes point to sites other than your own, you have no log of when those links are clicked. Example:

<a href="http://external.site.example.com/">Click to go to "external.site.example.com"</a>

It has become very common for people to use the JavaScript onclick event and Ajax to report back to the server when a click has happened, but this is no good if the user doesn’t have JavaScript enabled, or if the HTML is displayed on an external site such as an RSS aggregator. For those cases, it has become common to change the href to link back to a script on your own site which performs an HTTP redirect. Example:

<a href="http://my.site.example.com/redirect.cgi?url=http://external.site.example.com/">Click to go to "external.site.example.com"</a>

If you just allow any old “url” parameter to your redirect script, this is what is called an “Open Redirect.” Open redirects have been abused heavily by scammers and spammers and should not exist.

One option is to maintain a whitelist of URLs that are allowed to be redirected to, but this is cumbersome to maintain. Another option is to add a parameter which authenticates the URL. For example, you could take a hash of the URL together with a private salt, and then provide that as a parameter too. For my examples I will use a simple short salt, “ABCDEFGHIJKLM”. Here is an example of how I would generate and use such a URL:

Generate the authentication token:

mike@haven:~$ echo "ABCDEFGHIJKLM_http://external.site.example.com/" | md5sum
a54de5ffcb50cead775cac0b254af460  -
mike@haven:~$
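For generating tokens from a script rather than by hand, here is a minimal Python sketch of the same idea. The function names are hypothetical and not part of the original setup. One detail worth checking in any implementation: plain `echo` appends a newline, which becomes part of the hashed input, so the generator and the verifier have to agree on whether that newline is included. The second function exists only to illustrate the difference.

```python
import hashlib

# The article's example salt; an assumption for illustration only.
SALT = "ABCDEFGHIJKLM"


def auth_token(url: str, salt: str = SALT) -> str:
    """Return the hex MD5 of "<salt>_<url>", with no trailing newline."""
    return hashlib.md5(f"{salt}_{url}".encode("utf-8")).hexdigest()


def auth_token_like_echo(url: str, salt: str = SALT) -> str:
    """Same input plus the trailing newline that plain `echo` appends."""
    return hashlib.md5(f"{salt}_{url}\n".encode("utf-8")).hexdigest()


if __name__ == "__main__":
    target = "http://external.site.example.com/"
    print("without newline:", auth_token(target))
    print("with newline:   ", auth_token_like_echo(target))
```

The two digests differ, so whichever form is used to build the links must also be the form the server recomputes.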

Build the URL:

<a href="http://my.site.example.com/redirect.cgi?url=http://external.site.example.com/&auth=a54de5ffcb50cead775cac0b254af460">Click to go to "external.site.example.com"</a>
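Building the link can be sketched in Python as follows. The helper name `bounce_url` and the `REDIRECTOR` constant are assumptions for illustration. Unlike the hand-written link above, this sketch percent-encodes the target, so a target URL that has its own query string cannot be confused with the redirector's parameters.

```python
import hashlib
from urllib.parse import quote

SALT = "ABCDEFGHIJKLM"  # the article's example salt
REDIRECTOR = "http://my.site.example.com/redirect.cgi"  # assumed location


def bounce_url(target: str, salt: str = SALT, redirector: str = REDIRECTOR) -> str:
    """Return a redirector link carrying the target URL and its auth token."""
    # The token covers the raw (unencoded) target, prefixed with the salt.
    token = hashlib.md5(f"{salt}_{target}".encode("utf-8")).hexdigest()
    # safe="" encodes ":", "/", "?" and "&" so the target stays one parameter.
    return f"{redirector}?url={quote(target, safe='')}&auth={token}"


if __name__ == "__main__":
    print(bounce_url("http://external.site.example.com/"))
```

When the result is placed in an HTML attribute, the `&` still needs to be written as `&amp;`, which most templating and XML libraries do automatically.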

redirect.cgi only does two things. It checks that the MD5 of “ABCDEFGHIJKLM_http://external.site.example.com/” matches a54de5ffcb50cead775cac0b254af460, and if so it redirects. We don’t even need to write a script to do that; it can all be done inside the Apache configuration or an .htaccess file (although the RewriteMap directive itself must be declared in the server or virtual host configuration). Here is the mod_rewrite configuration, which serves the redirector at /bounce:

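RewriteEngine On
RewriteMap unescape int:unescape

# Target contains an encoded "?" (%3F): once unescaped, its own query string replaces ours
RewriteCond %{QUERY_STRING} ^(?:.*\&)?url=([^\&]*%3[fF][^\&]*)
RewriteRule ^/bounce$ ${unescape:%1}

# Otherwise: the trailing "?" discards the original query string
RewriteCond %{QUERY_STRING} ^(?:.*\&)?url=([^\&]+)
RewriteRule ^/bounce$ ${unescape:%1}?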

On its own, the above would create an open redirector. The next step is to add some mod_security configuration to check the auth parameter:

## Enable ModSecurity and allow HTTP request parsing
SecRuleEngine On
SecRequestBodyAccess On

## Allow redirects if there is a valid auth parameter matching the url parameter
SecRule REQUEST_URI ^/+bounce(\?.*)?$ "chain,phase:1,nolog,allow"
SecRule ARGS:auth ^[a-f0-9]{32}$ "chain,setvar:tx.auth=%{MATCHED_VAR}"
SecRule ARGS:url ^(?i)https?://.+$ "chain,setvar:tx.urlnsalt=ABCDEFGHIJKLM_%{MATCHED_VAR}"
SecRule TX:urlnsalt "@streq %{TX.auth}" "t:md5,t:hexEncode"

## Block all other requests for /bounce with a 403 FORBIDDEN error
SecRule REQUEST_URI ^/+bounce(\?.*)?$ "phase:1,log,deny,status:403"

It’s pretty amazing what you can do with mod_security.
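For readers who find chained rules hard to follow, the check they perform can be sketched in Python. This is an illustration of the logic only, not a replacement for the configuration above. The function name is hypothetical, and the constant-time comparison via `hmac.compare_digest` is my addition rather than something the mod_security rules do.

```python
import hashlib
import hmac
import re
from typing import Optional

SALT = "ABCDEFGHIJKLM"  # the article's example salt

# Same patterns as the ARGS:auth and ARGS:url rules.
AUTH_RE = re.compile(r"[a-f0-9]{32}")
URL_RE = re.compile(r"https?://.+", re.IGNORECASE)


def is_authorised(url: Optional[str], auth: Optional[str], salt: str = SALT) -> bool:
    """Return True only if auth is the hex MD5 of "<salt>_<url>"."""
    if url is None or auth is None:
        return False  # a missing parameter falls through to the 403 rule
    if not AUTH_RE.fullmatch(auth) or not URL_RE.fullmatch(url):
        return False
    expected = hashlib.md5(f"{salt}_{url}".encode("utf-8")).hexdigest()
    # Constant-time comparison (an addition in this sketch).
    return hmac.compare_digest(expected, auth)
```

A request is redirected only when this returns True; anything else corresponds to the final rule that denies the request with a 403.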

You might still consider it a pain to convert:

http://external.site.example.com/

To:

http://my.site.example.com/bounce?url=http://external.site.example.com/&auth=a54de5ffcb50cead775cac0b254af460

But for dynamically generated websites especially, it can be done completely transparently. This website itself is generated using server-side XSLT, with a self-built framework utilising Perl’s XML::LibXML and XML::LibXSLT modules.

So with a little Perl, XPath and XML trickery, I’ve been able to update the framework to convert all external anchor tags to use my redirector dynamically. Here’s the code:

use Digest::MD5 ();
use URI::Escape ();

my $doc = $stylesheet->transform($xml);

# Only rewrite links when the output is HTML
if ( $stylesheet->media_type eq 'text/html' ) {
    foreach my $node ( $doc->findnodes('html/body//a') ) {
        my $href = $node->getAttribute('href') || '';

        # Rewrite absolute http(s) links whose host differs from this site
        if ( $href =~ /^https?:\/\/([-a-z0-9\.]+)/i
            && lc($1) ne lc( $ENV{HTTP_HOST} ) )
        {
            $node->setAttribute(
                'href',
                sprintf(
                    '/bounce?auth=%s&url=%s',
                    Digest::MD5::md5_hex("ABCDEFGHIJKLM_$href"),
                    URI::Escape::uri_escape($href),
                )
            );
        }
    }
}

Now if I use:

<a href="http://example.com/">Click me</a>

in a blog post, my framework automatically converts it to the redirect version. If someone then clicks the link while reading my RSS feed, I still know the link was clicked, even though it was an external link and they weren’t viewing the blog post on my website.
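If your site is not built on Perl, the same link rewriting can be sketched in Python with only the standard library. This is an illustrative rendition under stated assumptions, not the framework's actual code: the function name is hypothetical, the page is assumed to be well-formed XHTML without namespaces, and the site's host name is passed in explicitly instead of being read from HTTP_HOST.

```python
import hashlib
import re
import xml.etree.ElementTree as ET
from urllib.parse import quote

SALT = "ABCDEFGHIJKLM"  # the article's example salt
HOST_RE = re.compile(r"^https?://([-a-z0-9.]+)", re.IGNORECASE)


def rewrite_external_links(xhtml: str, own_host: str, salt: str = SALT) -> str:
    """Point every external <a href> at /bounce with a matching auth token."""
    root = ET.fromstring(xhtml)
    for anchor in root.iter("a"):
        href = anchor.get("href") or ""
        match = HOST_RE.match(href)
        # Leave relative links and links to our own host untouched.
        if match and match.group(1).lower() != own_host.lower():
            token = hashlib.md5(f"{salt}_{href}".encode("utf-8")).hexdigest()
            anchor.set("href", f"/bounce?auth={token}&url={quote(href, safe='')}")
    # The serializer escapes "&" as "&amp;" inside attribute values.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    page = '<html><body><a href="http://example.com/">Click me</a></body></html>'
    print(rewrite_external_links(page, "my.site.example.com"))
```

Running it on the example link produces an anchor whose href starts with `/bounce?auth=` and ends with the percent-encoded form of `http://example.com/`.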