The other day I was pondering Linked Open Data Source Dynamics, and as a starting point I wanted to learn more about the caching characteristics of LOD sources. To establish a baseline, one should have a look at what HTTP, one of the pillars of Linked Data, offers (see also RFC 2616, Caching in HTTP).

So, I hacked together a little PHP script that takes 17 sample resources from the LOD cloud (from representative datasets ranging from DBpedia through GeoSpecies to W3C Wordnet) and inspects the caching-related headers in their HTTP responses. The results of the LOD caching evaluation are somewhat deflating: more than half of the samples do not support cache control, and fewer than 20% support Last-Modified or ETag headers.
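
For the curious, here is roughly what such a check could look like. Mind you, this is not the PHP script itself but a minimal sketch in Python using only the standard library; the function names (`caching_support`, `probe`, `summarize`) and the default `Accept` value are my own assumptions for illustration, and the mere presence of a header is only a rough proxy for real caching support.

```python
from urllib.request import Request, urlopen

# Response headers relevant to HTTP caching that this sketch looks for.
CACHING_HEADERS = ("Cache-Control", "Last-Modified", "ETag")


def caching_support(headers):
    """Map each caching header name to True if it is present in `headers`."""
    # HTTP header names are case-insensitive, so compare in lower case.
    present = {name.lower() for name in headers}
    return {name: name.lower() in present for name in CACHING_HEADERS}


def probe(url, accept="application/rdf+xml"):
    """Send a HEAD request to `url` and report which caching headers come back.

    The Accept value is an assumption; adjust it to the representation you want.
    """
    request = Request(url, method="HEAD", headers={"Accept": accept})
    with urlopen(request, timeout=10) as response:
        return caching_support(dict(response.headers.items()))


def summarize(results):
    """Return, per header, the fraction of probed resources that sent it."""
    if not results:
        return {}
    total = len(results)
    return {
        name: sum(1 for r in results if r[name]) / total
        for name in CACHING_HEADERS
    }
```

To try it, collect `probe(uri)` for each sample URI into a list (for instance a DBpedia resource URI of your choice) and pass the list to `summarize`, which gives the share of samples sending each header.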

I know, I know, this is just a very limited experiment. And yes, very likely there are not yet that many applications out there consuming Linked Data and hence using up the whole bandwidth. However, given that one of the arguments for the scalability of the Web is its built-in HTTP caching mechanism, LOD dataset publishers might want to take a closer look at what the server or platform at hand is able to offer in terms of caching support.