A few days ago I sent an email to Chad Dickerson, whom I met at Yahoo! and got a chance to hang out with at Mashup Camp in Dublin.

Chad, From what I can tell, if you create a Pipe and add additional fields (Shortcuts, Term Extraction), the only way to get to them in an API-like way is to use the JSON renderer. The RSS renderer removes those extra fields to follow the RSS spec. PHP supports JSON decoding, but you need a PEAR library or a quite recent version of PHP. If Yahoo supported serialized php with Pipes like you do with the other common API’s, it would be a lot easier for folks on shared hosting to work with Pipe data on the server side. I imagine with the new badge stuff you released that there’s a push to keep things client side, but there’s a huge advantage to rendering server-side to keep things nice and spiderable. Short Version: Expose Pipe results as serialized PHP. Pretty please.

Chad sent this along to the Pipes team, and less than three days later:

Pipes Blog » Blog Archive » New Yahoo Pipes PHP serialized output renderer

kick.

ass.

Two points to be made: first, I’m damn impressed that one of the largest sites on the ‘net would roll out a feature requested by an outside developer in less than three days. Second, developers should never resist the urge to ask an API provider for help. If a company is taking the time to support an API, chances are very good that they will listen to developers and react. I can personally say I’ve gotten immediate results from Technorati, Dapper, and now Yahoo!. So blow off the idea that a big website would never listen to little ol’ developer you. With that negative attitude, it’s guaranteed you’ll never get what you want. Ask, believe, receive, right?

So props to Chad, Jonathan Trevor, Paul Donnelly, and the rest of the Pipes team!

The Details

I’m a big fan of Yahoo Pipes. It’s an incredibly useful tool for putting together quick aggregators and filters for mashups. To integrate a Pipe on a webpage, you have a few options. You can go the cut-and-paste route and use a Badge, which works client side, or you can roll your own code to integrate a Pipe.

After you run a Pipe, you’re given a list of output formats. Copy the link location of these to get the URL of the output and tweak the parameters.

Until yesterday, the output formats useful for mashups were JSON and RSS. JSON is great for client-side mashups, but as you know, search engines will not index client-side content, so you lose any SEO love you might get. RSS is easy to consume server side, but Pipes will normalize the output to conform to the RSS spec. That means if you’re adding Term Extraction, Shortcuts, or any other metadata to your Pipe, you’ll lose it with RSS output unless you put that data into one of the RSS fields (title, description, etc.). So that leaves us with hacking JSON on the server side, since the JSON output format retains all that sweet metadata. In PHP, the best options are a JSON PEAR module or, if you’re rocking PHP 5.2 or above, the handy json_decode() function.
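For the JSON route, here is a minimal sketch of what the server-side decoding might look like with json_decode(). The sample string is made up and only mimics the value/items/tags/content layout of my Term Extraction Pipe; the function name pipeTagsFromJson() is hypothetical, and the field names in your own Pipe will depend on how you built it.

```php
<?php
// Made-up sample standing in for a Pipe's JSON output.
// Real field names depend on the modules in your Pipe.
$json = '{"value":{"items":[{"title":"Example story",'
      . '"tags":[{"content":"yahoo"},{"content":"pipes"}]}]}}';

// Hypothetical helper: pull every tag's text out of the decoded response.
function pipeTagsFromJson($json)
{
    // true as the second argument returns associative arrays, not objects
    $response = json_decode($json, true);
    if (!is_array($response) || !isset($response['value']['items'])) {
        return array(); // not valid JSON, or not the shape we expected
    }

    $tags = array();
    foreach ($response['value']['items'] as $item) {
        if (empty($item['tags'])) {
            continue; // this item carries no tags
        }
        foreach ($item['tags'] as $tag) {
            $tags[] = $tag['content'];
        }
    }
    return $tags;
}

print_r(pipeTagsFromJson($json));
```

In a real script you would swap the sample string for the body fetched from the Pipe’s JSON output URL.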

Now that Yahoo supports serialized PHP, using Pipe output just got a lot easier. I made a Pipe to add Term Extraction info to any RSS feed. Basically, it automatically tags all the posts in the feed. To retrieve the tags in your own script, all it takes is:

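<?php
// Pipe that adds Term Extraction tags to each feed item.
// _render=php asks Pipes for serialized PHP output.
$pipeURL = 'http://pipes.yahoo.com/pipes/pipe.run?_id=Zli1l6UB3RG_l7ZvX0sBXw&_render=php&rssurl=';
$feedURL = 'http://rss.news.yahoo.com/rss/topstories';
$tags = array();

// Fetch the Pipe output and turn it back into a PHP array.
$response = unserialize(file_get_contents($pipeURL . rawurlencode($feedURL)));

// Collect the text of every tag on every item.
foreach ($response['value']['items'] as $item) {
    foreach ($item['tags'] as $itemTags) {
        $tags[] = $itemTags['content'];
    }
}

var_dump($tags);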

At this point $tags is an array of all of the terms from the feed. Now what could be done with that data?
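The snippet above trusts that the fetch worked and that every item has tags. A more defensive variant might look something like this sketch. The helper names decodePipePayload() and collectTags() are hypothetical, the payload is a locally built stand-in rather than real Pipe output, and the allowed_classes option assumes PHP 7.0 or later.

```php
<?php
// Hypothetical helper: decode a serialized-PHP payload without
// instantiating any objects (allowed_classes needs PHP 7.0+).
function decodePipePayload($payload)
{
    if ($payload === false || $payload === '') {
        return null; // fetch failed or the body was empty
    }
    $data = @unserialize($payload, array('allowed_classes' => false));
    return is_array($data) ? $data : null;
}

// Hypothetical helper: gather tag text, skipping items that have no tags.
function collectTags(array $response)
{
    $tags = array();
    $items = isset($response['value']['items']) ? $response['value']['items'] : array();
    foreach ($items as $item) {
        if (empty($item['tags']) || !is_array($item['tags'])) {
            continue;
        }
        foreach ($item['tags'] as $tag) {
            if (isset($tag['content'])) {
                $tags[] = $tag['content'];
            }
        }
    }
    return $tags;
}

// Stand-in for file_get_contents($pipeURL . rawurlencode($feedURL)).
$payload = serialize(array('value' => array('items' => array(
    array('title' => 'First', 'tags' => array(
        array('content' => 'yahoo'),
        array('content' => 'pipes'),
    )),
    array('title' => 'Second'), // an item with no tags at all
))));

$response = decodePipePayload($payload);
print_r($response === null ? array() : collectTags($response));
```

Restricting allowed_classes is a sensible habit whenever the serialized data comes from somewhere other than your own server.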

Serialized PHP or JSON?

If you have json_decode() available in your PHP install, is there any advantage to using JSON over serialized PHP? Let’s find out.

File Size

Saving the output directly to disk gave me

JSON – 51192 bytes

Serialized PHP – 56885 bytes

Because of its syntax and PHP’s type specification, serialized PHP is about 11% larger than JSON here. The gap in bytes will grow as the number of elements in your output increases.
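If you want to check the size difference on your own data, something like the following sketch would do it. The sample array is invented and is not the Pipe output I measured, so its percentage will differ; the last lines just redo the arithmetic on the two byte counts above.

```php
<?php
// Invented sample data, loosely shaped like a Pipe response.
$data = array('value' => array('items' => array()));
for ($i = 0; $i < 50; $i++) {
    $data['value']['items'][] = array(
        'title' => "Story $i",
        'tags'  => array(array('content' => 'yahoo'), array('content' => 'pipes')),
    );
}

// Encode the same structure both ways and compare byte counts.
$jsonBytes = strlen(json_encode($data));
$phpBytes  = strlen(serialize($data));

printf("JSON: %d bytes\n", $jsonBytes);
printf("Serialized PHP: %d bytes\n", $phpBytes);
printf("Overhead on sample data: %.1f%%\n", ($phpBytes - $jsonBytes) / $jsonBytes * 100);

// The same calculation using the byte counts reported in this post.
$postOverhead = (56885 - 51192) / 51192 * 100;
printf("Overhead on the measured Pipe output: %.1f%%\n", $postOverhead);
```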

Decoding Speed

How long does it take to slurp these formats into PHP variables? My tests decode each 100 times.

JSON

real 0m0.269s

user 0m0.264s

sys 0m0.004s

Serialized PHP

real 0m0.088s

user 0m0.088s

sys 0m0.000s

It’s clear that unwinding serialized PHP is faster than decoding JSON, about three times faster in this test (0.088s versus 0.269s), so it’s the better choice performance-wise despite being slightly bigger over the wire.
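To rerun the comparison yourself, here is a rough sketch. It is not the script behind the numbers above: it times the loops in-process with microtime() instead of the shell’s time command, it uses invented sample data, and timeDecode() is a hypothetical helper. Results depend heavily on your PHP version and data, so treat any output as specific to your own setup.

```php
<?php
// Invented sample data standing in for saved Pipe output.
$data = array('value' => array('items' => array()));
for ($i = 0; $i < 200; $i++) {
    $data['value']['items'][] = array(
        'title' => "Story $i",
        'tags'  => array(array('content' => 'yahoo'), array('content' => 'pipes')),
    );
}
$json       = json_encode($data);
$serialized = serialize($data);

// Hypothetical helper: run a decoder $runs times, return elapsed seconds.
function timeDecode($decoder, $payload, $runs = 100)
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        call_user_func($decoder, $payload);
    }
    return microtime(true) - $start;
}

// Decode each format 100 times, as in the test above.
$jsonSeconds = timeDecode(function ($p) { return json_decode($p, true); }, $json);
$phpSeconds  = timeDecode('unserialize', $serialized);

printf("JSON: %.4fs for 100 decodes\n", $jsonSeconds);
printf("Serialized PHP: %.4fs for 100 decodes\n", $phpSeconds);
```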