Applications built on JavaScript platforms with big data sets are often difficult to take snapshots of. In my case, I had a large data set paginated over an ngRepeat in AngularJS. When search engines crawl my web app, however, they only receive the unrendered template values, e.g. {{title}}. One way to have search engines index your website with the rendered data is to take snapshots of the pages, e.g. with PhantomJS and grunt-html-snapshot. This grunt task takes snapshots every time the project is built from the command line. However, since my data changes hourly, I needed a solution that kept up with the changes. Thus, I created a separate page in PHP, marked up with schema.org microdata.

Microdata events

Microdata labels the relevant information in each row or section so that crawlers can index it. In my case, each row represents an event.

The microdata page is created with PHP. Every row in the database with a date later than today is displayed. Users will never see this page, but each event URL can show up on Google as a separate search result. Below, the markup shows the type of data each cell represents and the URL it is associated with:

<table border="1">
  <tr>
    <th>Title</th>
    <th>Date</th>
    <th>Time</th>
    <th>Price/Tickets</th>
    <th>Venue</th>
    <th>Url</th>
  </tr>
  <?php // One schema.org/Event row per upcoming event ?>
  <?php while ($row = mysqli_fetch_array($events)) { ?>
  <tr class="event-wrapper" itemscope itemtype="http://schema.org/Event">
    <td itemprop="name"><a href="http://showhaus.org/#!/<?php echo $row['id'];?>/<?php echo $row['city'];?>/<?php echo $row['venue'];?>/<?php echo $row['title'];?>"><?php echo $row['title'];?></a></td>
    <td class="event-date" itemprop="startDate" content="<?php echo $row['date'];?>"><?php echo $row['date'];?></td>
    <td class="event-time" itemprop="doorTime" content="<?php echo $row['time'];?>"><?php echo $row['time'];?></td>
    <td class="event-fees">
      <span>Price:
        <span itemprop="offers" itemscope itemtype="http://schema.org/Offer">
          <a itemprop="url" href="<?php echo $row['ticket_uri'];?>">
            <?php // A stored price of -1 is output as 0 ?>
            <span itemprop="price" content='<?php if($row["price"] == "-1"){echo "0";}else{echo $row["price"];}?>'><span itemprop="priceCurrency" content="USD"><?php if ($row["price"] == "-1") { echo "0"; } else { echo $row["price"]; } ?></span></span>
          </a>
        </span>
      </span>
    </td>
    <td class="location" itemprop="location" itemscope itemtype="http://schema.org/Place">
      <span class="name" itemprop="name">@ <?php echo $row['venue'];?></span>
      <div itemprop="address"><a href="http://showhaus.org/#!/<?php echo $row['id'];?>/<?php echo $row['city'];?>/<?php echo $row['venue'];?>/<?php echo $row['title'];?>">Details</a></div>
    </td>
    <td class="url" itemprop="url" itemscope itemtype="http://schema.org/Url">
      <a href="http://showhaus.org/#!/<?php echo $row['id'];?>/<?php echo $row['city'];?>/<?php echo $row['venue'];?>/<?php echo $row['title'];?>">http://showhaus.org/#!/<?php echo $row['id'];?>/<?php echo $row['city'];?>/<?php echo $row['venue'];?>/<?php echo $row['title'];?></a>
    </td>
  </tr>
  <?php } ?>
</table>
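The loop above reads from $events, which the excerpt does not define. Here is a minimal sketch of how it might be populated. It assumes a table named events (a hypothetical name) whose columns match the $row keys used in the markup, and a date column that MySQL can compare against CURDATE(). The constant and function names are made up for illustration.

```php
<?php
// Sketch only: the table name `events` is an assumption. The column names
// are taken from the $row keys used in the markup.
const UPCOMING_EVENTS_SQL =
    'SELECT `id`, `city`, `venue`, `title`, `date`, `time`, `price`, `ticket_uri`
     FROM `events`
     WHERE `date` > CURDATE()';

// Runs the query on an open mysqli connection and returns the result set
// that the while loop iterates over with mysqli_fetch_array().
function fetchUpcomingEvents(mysqli $db)
{
    $events = mysqli_query($db, UPCOMING_EVENTS_SQL);
    if ($events === false) {
        throw new RuntimeException('Query failed: ' . mysqli_error($db));
    }
    return $events;
}

// Usage (connection details are placeholders):
// $db = mysqli_connect('localhost', 'user', 'password', 'database');
// $events = fetchUpcomingEvents($db);
```

Filtering in SQL rather than in PHP keeps past events out of the page entirely, so the page size tracks the number of upcoming events only.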

With the microdata page, I now have hundreds of indexable links, along with the URLs to access each event directly.

Server configuration for snapshots

For search engines and other robots to crawl and index your page, you have to send them to the separate page where the microdata lives. A simple rewrite rule in your .htaccess is all that is needed:

<IfModule mod_rewrite.c>
  RewriteEngine on
  RewriteCond %{QUERY_STRING} ^_escaped_fragment_=(.*)$
  RewriteRule ^$ /micro.php$1 [QSA,L]
</IfModule>

The rule matches on the query string "_escaped_fragment_=" because crawlers that encounter #! URLs request a modified URL instead: the hashbang is replaced by "?_escaped_fragment_=", followed by whatever came after the #!. Matching on that parameter is how we route the crawler to the microdata page.
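To make the mapping concrete, here is a rough sketch of the URL a crawler would request for a given hashbang URL. The function name toEscapedFragmentUrl and the sample id, city, venue and title values are made up for illustration, and the escaping is a simplified approximation of what crawlers do, not a full implementation of the scheme.

```php
<?php
// Converts a "#!" URL into the "_escaped_fragment_" form a crawler requests.
// Hypothetical helper for illustration; it is not part of the site's code.
function toEscapedFragmentUrl(string $url): string
{
    $pos = strpos($url, '#!');
    if ($pos === false) {
        return $url; // no hashbang, nothing to convert
    }

    $base = substr($url, 0, $pos);      // everything before "#!"
    $fragment = substr($url, $pos + 2); // everything after "#!"

    // Append to an existing query string if the base URL already has one.
    $separator = (strpos($base, '?') === false) ? '?' : '&';

    // Simplified escaping of characters that would break the query string.
    $escaped = strtr($fragment, [
        '%' => '%25',
        '#' => '%23',
        '&' => '%26',
        '+' => '%2B',
    ]);

    return $base . $separator . '_escaped_fragment_=' . $escaped;
}

echo toEscapedFragmentUrl('http://showhaus.org/#!/42/some-city/some-venue/some-title');
// prints http://showhaus.org/?_escaped_fragment_=/42/some-city/some-venue/some-title
```

The resulting request has an empty path and a query string starting with "_escaped_fragment_=", which is exactly what the RewriteCond and RewriteRule pair matches.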

Google's Search Console

Google's webmaster tool suite includes a structured data testing tool that checks whether your microdata is valid and detailed enough.

The tool, unfortunately, only accepts pages of up to 2.5MB. In SQL terms, that's about 1,000 rows.