Deep Linking, Rich Social Sharing & SEO with AngularJS and Amazon S3

The Problem

The ability to share direct links to upcoming gigs is central to Gigbloc. A key part of this is allowing Twitter, Facebook, et al. to render preview cards that look good and show information about a specific gig, as below:

The result of sharing a dynamic gig link on Facebook

It’s also important to us that Google and other search engines are able to crawl our site and get some idea of what it’s about, despite it being a true SPA (Single Page Application) — there’s no server-rendered content to crawl, just the ability to explore, discover and listen to live music.

We also don’t like running servers or spending mad dollars. Gigbloc and its data are entirely static server-side; the data powering the service is published periodically to Amazon S3, where browsers can download it as the application requires.

The generally accepted approach to solving these problems involves using User-Agent-based redirects to divert social crawlers and search engines to pre-rendered versions of your site, perhaps generated by services such as prerender.io. But we don’t like spending dollars, so we’ve got a better way using nothing but Amazon S3 and a bit of application logic.

Routes that reflect the state of the application

It’s important that the routes in our application reflect the state of our application, and update as the user browses around. Sharing is then as simple as copying and pasting the URL from their browser. We do this with a simple routing configuration using ui-router:

$urlRouterProvider.otherwise("/gigs");

$stateProvider
  .state('home', {
    url: "/gigs",
    templateUrl: "/views/main.html",
    controller: "MainCtrl"
  })
  .state('home.city', {
    url: "/:city",
    templateUrl: "/views/main.html",
    controller: "MainCtrl"
  })
  .state('home.city.date', {
    url: "/:date",
    templateUrl: "/views/main.html",
    controller: "MainCtrl"
  })
  .state('home.city.date.gig', {
    url: "/:gig",
    templateUrl: "/views/main.html",
    controller: "MainCtrl"
  });

This reflects the structure of the application — everything is based around being in a city, drilling into a date, and listening to a specific gig on that date.

We then update this route in the application like so:

$state.go('home.city.date.gig', {
  city: $scope.city,
  date: $scope.date,
  gig: $scope.encodedGigId
});

Any navigation in the application is now reflected in the URL, so the state a user shares is consistent and coherent for whoever opens the link.
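The nested states compose their URL fragments, so a transition with, say, city 'ldn', date '2015-09-22' and gig id 'ucoubc' lands on /gigs/ldn/2015-09-22/ucoubc. A tiny sketch of that composition, purely illustrative and not part of the app:

```javascript
// Illustrative only: mimics how the nested ui-router states above
// ('/gigs' + '/:city' + '/:date' + '/:gig') compose a shareable URL.
function gigUrl(city, date, gig) {
  return ['/gigs', city, date, gig].join('/');
}

// e.g. gigUrl('ldn', '2015-09-22', 'ucoubc') → '/gigs/ldn/2015-09-22/ucoubc'
```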

HTML5 Mode and Amazon S3

AngularJS supports HTML5 mode, i.e. getting rid of the hashbangs (#!) in the URLs of your Angular routes. This is pretty simple to switch on with a few lines in your app configuration:

$locationProvider.html5Mode({
  enabled: true,
  requireBase: false
});

An obvious problem with this feature, when coupled with static hosting where you can’t add rewrite rules (no .htaccess on S3), is that copied-and-pasted links 404, because no object exists at that path on the server. Fortunately, you can specify a custom error page on Amazon S3 and point it at your index.html, while still maintaining the URL so Angular correctly interprets the route:

Configuring Amazon S3 to direct 404s to the Angular index.html
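The same setting can be applied from the command line. A sketch using the AWS CLI’s static-website configuration (the bucket name here is illustrative):

```shell
# Serve index.html both as the index document and as the error document,
# so unknown paths fall through to the Angular app. Bucket name is illustrative.
aws s3 website s3://gigbloc-example-bucket \
  --index-document index.html \
  --error-document index.html
```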

We now have an application that doesn’t have hashes in its URL, hosted almost for free on Amazon S3, and with the ability to place static files on Amazon S3 to match these routes if we need to.

Detecting JavaScript support, not User Agents

The issue that we’re trying to solve with deep linking and crawling is effectively caused by a lack of JavaScript support in the crawlers created by social networks and search engines — they don’t understand how to run our application, whereas our users do.

We cater for the social networks and search engines that crawl our gig pages by serving them a static HTML page containing all the relevant meta tags, plus a simple JavaScript snippet that redirects JavaScript-capable browsers to the root URL with the route moved behind a hash, so that S3 serves the application and Angular interprets the route correctly.

Here’s an example of published metadata and redirect files for one of our featured gigs:



<html>
<head>
  <title>Gigbloc: Listen to this gig at The Windmill, Brixton on Tuesday 22nd September</title>
  <meta property="og:title" content="Check out this gig at The Windmill, Brixton on Tuesday 22nd September"/>
  <meta property="og:site_name" content="Gigbloc"/>
  <meta property="og:type" content="blog"/>
  <meta property="og:url" content="http://gigbloc.com/gigs/ldn/2015-09-22/ucoubc"/>
  <meta property="og:description" content="Featuring Darkbeat"/>
  <meta property="og:image" content="https://i1.sndcdn.com/artworks-000112874652-0di4l0-large.jpg"/>
  <meta name="twitter:card" content="summary_large_image">
  <meta name="twitter:site" content="@gigbloc">
  <meta name="twitter:title" content="Check out this gig at The Windmill, Brixton on Tuesday 22nd September">
  <meta name="twitter:description" content="Featuring Darkbeat">
  <meta name="twitter:image" content="https://i1.sndcdn.com/artworks-000112874652-0di4l0-large.jpg">
  <script type="text/javascript">
    window.location = "http://gigbloc.com/#/gigs/ldn/2015-09-22/ucoubc";
  </script>
</head>
<body>
  Listen to this gig at The Windmill, Brixton on Tuesday 22nd September. Featuring Darkbeat.
</body>
</html>
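A page like this can be produced at publish time for every gig. A minimal sketch of such a generator — the gig object shape and function name are our own invention for illustration, not Gigbloc’s actual publishing code:

```javascript
// Hypothetical sketch: renders the static metadata/redirect page for a gig.
// The gig object shape and function name are illustrative only.
function renderGigPage(gig) {
  // Canonical crawlable URL, and the hashed URL that Angular will interpret.
  var canonical = 'http://gigbloc.com/gigs/' + gig.city + '/' + gig.date + '/' + gig.id;
  var hashed = 'http://gigbloc.com/#/gigs/' + gig.city + '/' + gig.date + '/' + gig.id;
  return '<html><head>' +
    '<title>Gigbloc: ' + gig.title + '</title>' +
    '<meta property="og:title" content="' + gig.title + '"/>' +
    '<meta property="og:url" content="' + canonical + '"/>' +
    '<meta property="og:description" content="' + gig.description + '"/>' +
    '<meta property="og:image" content="' + gig.image + '"/>' +
    // Crawlers ignore this; JavaScript-capable browsers get redirected.
    '<script type="text/javascript">window.location = "' + hashed + '";</script>' +
    '</head><body>' + gig.description + '</body></html>';
}
```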

When a link to http://gigbloc.com/gigs/ldn/2015-09-22/ucoubc is posted to Facebook or Twitter, or a search engine crawls it without JavaScript support, they use this page to generate their preview cards and decide how to index the content.

A user on a browser will be redirected to the actual live gig page and is able to listen to the music.

We also update our index.html as part of our periodic publishing process to include links to all gig pages featured in the upcoming week. This gives search engines awareness of what users can experience on our site, so they can index it accordingly.
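The publish step could build that link list with something as simple as the following sketch — the gig object shape and function name are illustrative, not the actual publishing code:

```javascript
// Hypothetical sketch: emits anchor tags for the week's gigs, to be
// injected into index.html at publish time so crawlers can find them.
function gigLinks(gigs) {
  return gigs.map(function (g) {
    return '<a href="/gigs/' + g.city + '/' + g.date + '/' + g.id + '">' + g.title + '</a>';
  }).join('\n');
}
```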

This all costs us pretty much nothing — simply the storage and data transfer costs from hosting on Amazon S3, with no need for a server with User Agent-based rewrite rules, and none of the associated problems with meeting traffic demands and server administration.