Marketers have already started advising companies to pay attention to tags. So I started thinking – what would an information architect do with the wealth of information given by del.icio.us / flickr / technorati tags?

The first thing that comes to mind is to use tags as a proxy for free-listing. Information architects (IAs), or anyone else researching a domain, can perform a card sort on tags instead of generating items for a card-sort exercise using freelisting.

Did that make sense? Let me explain. Card sorting is used as a method for understanding users' mental models. Prior to asking users to sort cards, you need to generate a list of relevant items for the exercise. We (at Uzanto) often use free-listing at this stage. In case this is the first time you are hearing about free-listing – it's a lightweight technique to understand what lies within a domain. An example: ask someone to list all the foods at Starbucks. Repeat this exercise with 20 people. Write down the frequency of each term mentioned – it will give you a pretty good idea of the scope and boundaries of the domain “Starbucks”. If you must know more, go read my article at Boxes and Arrows.
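The tallying step of free-listing can be sketched in a few lines of Python. This is only an illustration: the participant responses below are invented, and the function name `freelist_frequencies` is a hypothetical helper, not part of any tool mentioned here.

```python
from collections import Counter

# Hypothetical free-listing responses: each inner list is one participant's
# answer to "list all the foods at Starbucks". The items are invented.
responses = [
    ["muffin", "scone", "bagel", "cookie"],
    ["scone", "muffin", "sandwich"],
    ["cookie", "muffin", "brownie"],
]

def freelist_frequencies(responses):
    """Count how many participants mentioned each item (once per person)."""
    counts = Counter()
    for items in responses:
        # A set ensures a participant who repeats an item is counted once.
        counts.update({item.strip().lower() for item in items})
    return counts

frequencies = freelist_frequencies(responses)
for item, count in frequencies.most_common():
    print(f"{item}: {count}")
```

Items mentioned by most participants sit at the core of the domain, while items mentioned once mark its fuzzy edges.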

Tags bear an uncanny resemblance to freelisting data. Navigating by tags is remarkably like looking through free-listing data. Except that you don’t need to incentivize a bunch of participants and design and execute a research study. The taggers of the internet have already done that work for us. Thank you, Joshua Schachter and the kind folks at Ludicorp.

Enough background – how does one go about tag sorting? First step: find a tag for a topic you are interested in. I was interested in understanding how people think about “apple”.

I found the tag “apple” on Delicious. Next, I looked up the related tags. Here they are.

1) mac

2) osx

3) ipod

4) software

5) itunes

6) music

7) history

8) technology

9) windows

10) macintosh

11) hardware

Next, I looked up the related tags for each of the above tags. I ended up with a list of all tags that were one or two degrees away. Below is the list showing the above related tags (going down vertically) and related tags for each of those tags (going across horizontally).

1) mac osx software apple windows linux music itunes unix tips howto ipod

2) osx software apple unix audio itunes music tips howto apps music linux

3) ipod apple music mp3 mac itunes software shuffle linux hacks audio osx

4) software mac tools windows linux programming web osx free opensource development music

5) itunes music apple mac osx mp3 software ipod audio hacks drm sync

6) music mp3 audio google search software radio video art free blog cool

7) history photography war reference design art web culture photos internet politics photo

8) technology blog news web design software science tools internet art google music

9) windows software linux tools security mac free freeware xp microsoft programming osx

10) macintosh apple software osx mac computers ipod raskin macosx history audio hack

11) hardware linux software mac howto apple tools computer hacks technology music hack
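The collection of tags one or two degrees away from a seed tag can be sketched as a breadth-first walk. This is a minimal sketch under assumptions: the `related` mapping is a small hand-typed excerpt of the rows above, the function name `tags_within` is hypothetical, and no real del.icio.us API is assumed.

```python
from collections import deque

# A small excerpt of the related-tag rows above, stored as tag -> related tags.
# In a real run this mapping would be filled in by hand from the site.
related = {
    "apple": ["mac", "osx", "ipod"],
    "mac": ["osx", "software", "apple", "windows", "linux"],
    "osx": ["software", "apple", "unix", "audio"],
    "ipod": ["apple", "music", "mp3", "mac"],
}

def tags_within(start, related, max_degree=2):
    """Return {tag: degree} for tags reachable within max_degree hops."""
    degree = {start: 0}
    queue = deque([start])
    while queue:
        tag = queue.popleft()
        if degree[tag] == max_degree:
            continue  # do not expand beyond the requested degree
        for neighbour in related.get(tag, []):
            if neighbour not in degree:
                degree[neighbour] = degree[tag] + 1
                queue.append(neighbour)
    del degree[start]  # the seed tag itself is not a related tag
    return degree

neighbourhood = tags_within("apple", related, max_degree=2)
for tag, hops in sorted(neighbourhood.items(), key=lambda pair: pair[1]):
    print(f"{tag}: degree {hops}")
```

Note that this walk records each tag once, whereas the rows above keep every repeat so that frequencies can be counted afterwards. Raising `max_degree` to 3 widens the net in the same way.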

Next, I did what I do with freelisting data at this stage. I mapped each word by frequency. There were 132 words in all – 11 directly related tags, and 121 related at a second degree. There were 51 unique words. Here is the list of all the unique words, along with the frequency, starting with the most frequent word.

1) software: 10

2) mac: 8

3) music: 8

4) apple: 7

5) osx: 7

6) linux: 6

7) audio: 5

8) hack: 5

9) ipod: 4

10) itunes: 4

11) tools: 4

12) art: 3

13) free: 3

14) howto: 3

15) mp3: 3

16) web: 3

17) windows: 3

18) blog: 2

19) computer: 2

20) design: 2

21) google: 2

22) history: 2

23) internet: 2

24) photo: 2

25) programming: 2

26) technology: 2

27) tips: 2

28) unix: 2

29) cool: 1

30) culture: 1

31) development: 1

32) drm: 1

33) hardware: 1

34) macintosh: 1

35) macosx: 1

36) microsoft: 1

37) news: 1

38) opensource: 1

39) photography: 1

40) politics: 1

41) radio: 1

42) raskin: 1

43) reference: 1

44) science: 1

45) search: 1

46) security: 1

47) shuffle: 1

48) sync: 1

49) video: 1

50) war: 1

51) xp: 1

The list looks very much like a freelisting list, except that it is a little tech-heavy and lacks the range of general free-listing data. Remarkably, on del.icio.us apple does not appear even once as a fruit. Nor do you see any reference to the redness of an apple. Interestingly, both fruit and redness do appear as tags directly related to the apple tag on Flickr. Here are the first-degree related tags for apple on Flickr: powerbook, ibook, computer, imac, fruit, music, food, red, store, macro.

Back to the del.icio.us data. Another characteristic typical of freelisting data – the long tail at the end (note all the tags with a frequency of “1”).
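The frequency mapping and the long tail can be reproduced with a short script. This is a sketch, not the exact procedure used for the list above: it runs on only three of the related-tag rows, and the `ALIASES` table that merges plural and singular variants is an assumption.

```python
from collections import Counter

# Three of the related-tag rows from the post, used here as a small excerpt.
rows = [
    "mac osx software apple windows linux music itunes unix tips howto ipod",
    "ipod apple music mp3 mac itunes software shuffle linux hacks audio osx",
    "itunes music apple mac osx mp3 software ipod audio hacks drm sync",
]

# Assumption: plural and singular variants are merged before counting.
ALIASES = {"hacks": "hack", "photos": "photo"}

def tag_frequencies(rows, aliases=ALIASES):
    """Count how often each tag appears across all rows."""
    counts = Counter()
    for row in rows:
        for tag in row.split():
            counts[aliases.get(tag, tag)] += 1
    return counts

counts = tag_frequencies(rows)
# The long tail: tags that appear exactly once.
long_tail = sorted(tag for tag, n in counts.items() if n == 1)

for tag, n in counts.most_common():
    print(f"{tag}: {n}")
print(f"{len(long_tail)} of {len(counts)} unique tags appear only once")
```

Even on this small excerpt, a sizeable share of the unique tags appear only once, which is the same long-tail shape seen in freelisting data.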

I suspect, however, that I would have got a richer dataset if I had asked a group of people to freelist on “apple”. I think what we are seeing is the mental models of “Apple” among the early del.icio.us adopters, who are by no means a representative sample. Additionally, all the tags are related to links that they have found worth saving. All of this biases the data in a certain direction.

Another observation – none of the tags surprised me. Generally, in a freelisting dataset, there are at least some associations that surprise. I think that there is simply not enough variance here for unique, idiosyncratic associations to emerge. I expect that this will change as tagging becomes a more mainstream activity. Also, it is possible that my mental model closely matches those of the del.icio.us taggers, which is why I found all the tags very predictable.

Now for the sorting. I signed up one individual to sort these tags into groups of his choosing. (If you are doing a real study, please ask more than one individual to do the sorting! Aggregate multiple sortings to create a group mental model.) The groups that popped up are shown below. We can immediately see just from looking at the groupings how Apple is strongly associated with the production and consumption of art / music / media. The paucity of Microsoft-oriented tags and the large number of Linux / OSS tags that emerged point to a warming relationship between OSS hackers and Apple users (the fact that OSS, Linux, and hacking were grouped together says a lot about the brand identity of Linux). On the basis of this preliminary tag-sorting, I think that “Hacking”, “Media”, and “Art” would be excellent top-level categories for an Apple-oriented site.
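The advice to aggregate multiple sortings can be sketched as a pair co-occurrence count, one common way to combine card sorts. Everything in this example is hypothetical: the three participants, their group names, and their groupings are invented and are not the groupings from this exercise.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sorts from three participants; group names and memberships
# are invented for illustration only.
sorts = [
    {"media": ["music", "itunes", "ipod"], "hacking": ["linux", "unix"]},
    {"audio": ["music", "itunes"], "devices": ["ipod"], "oss": ["linux", "unix"]},
    {"apple stuff": ["itunes", "ipod"], "other": ["music", "linux", "unix"]},
]

def co_occurrence(sorts):
    """Count, for each pair of tags, how many participants grouped them together."""
    pairs = Counter()
    for sort in sorts:
        for members in sort.values():
            # Sorting gives each pair a stable (a, b) key across participants.
            for a, b in combinations(sorted(set(members)), 2):
                pairs[(a, b)] += 1
    return pairs

pair_counts = co_occurrence(sorts)
for (a, b), n in pair_counts.most_common():
    print(f"{a} + {b}: {n} of {len(sorts)} participants")
```

Pairs that most participants place together are candidates for the same category, and pairs that are rarely grouped suggest a boundary between categories.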

So am I ready to give up freelisting with a group of participants? Are tags a perfect replacement for freelists? Not quite yet. I will make sure to check both del.icio.us and Flickr if my project is related to any topics they cover. Also, I think it would make sense to go up to three degrees of relatedness to get a broader variety of associations. Most importantly, I think that currently, tagging is not mainstream enough to use exclusively, or even as a primary research data stream.

Tag Sorting is an example of the type of method I hope we will see more of. There is so much structured data out there. It is time we learnt how to utilize it to understand people.

Please carry on the tag-sorting experiment. If you try something, please report back to this blog.