A few days ago freelance writer Laura Spencer (@TXWriter) tweeted a link to a Mashable post. The hyped-up headline read “Facebook Now Controls 41% of Social Media Traffic.” Before I even read the post my gut screamed “Bullshit!” It often does that. My gut is rather talented at sniffing out shady statistics. It must be that past life in PR where we all learn that statistics can say just about anything we want them to if we twist them enough (my disgust with that attitude makes me hypersensitive to them now).

Then I did read the article. What I found was baffling (okay, it wasn’t really — it was about what I expected):

Charts with no reference points related to the supposed trends shown

Assumptions about people jumping from one site to another without any real evidence to back that up (and data charts right in the post that contradicted the claim)

Other statistical claims that didn’t jibe with the “relative” charts shown in the post

Big social media sites being completely left out of the comparison

Whole niches of social media completely left out of the comparison

Sites that probably shouldn’t have been included but were

A huge social media site included in the first set of stats suddenly disappeared from later ones

Yikes. I bet you’re wondering why I haven’t linked you to the post yet. That’s because it seems to have gone “Poof!” Vanished into thin air it did. Because of that I won’t pull the actual charts to show you the problems (doesn’t seem right to publish their charts when they’ve pulled them, especially when it wasn’t even clear in the post if they belonged to Mashable or were comScore charts taken somewhat out of context). However, I do want to highlight something from the cached version which illustrates my biggest problem of all:

Do you see those nice little Twitter and Facebook counts to the left of the post title? That’s how many times this story was shared just on those two platforms before it was moved or removed. That’s terrifying. Why? Because it shows just how easily and quickly bad information can spread on the Web. Let’s look at some of the specific problems.

1. Major segments of social media (and sites) are ignored. — The first chart shown in the post supposedly illustrates traffic changes for eight social media properties from February 2009 — February 2010. Those sites are Facebook, Myspace, Gmail, Twitter, LinkedIn, Ning, YouTube, and Hulu. While I don’t claim the 41% statistic came from this data set (if you can call it that), it’s a good indication of how narrow these stats clearly are. (The chart showing how big Facebook’s piece of the pie supposedly is actually highlights only six social media sites — take YouTube and Hulu out of the previous group.)

I guess they forgot two of the grandfather platforms of social media — forums and blogs. How can you claim anything controls 41% of the entire industry’s traffic (as their post title did) when you ignore huge segments of it? You can’t. The claim made was clearly too broad, and there’s no excuse for that (especially when readers trust the information you put in front of them).

I also found myself wondering where other reasonably large players were. Anyone remember Flickr, Delicious, Digg, StumbleUpon, Flixster, Classmates.com, or Reddit for example? If you want to really know how much of the social media space Facebook controls, don’t forget you have to include all of the socially-driven P2P (peer-to-peer) networks too: LimeWire, BitTorrent, and related sites. Why do we so often forget about old school social media when it doesn’t suit our statistical purposes?

What’s just as questionable as not including some resources is including sites like Hulu in social media stats when it’s little more than a glorified vlog (and that’s coming from someone who loves Hulu just for the record). It’s not so much about social content as it is content consumption. Yes, you can comment and share. But unless I missed something, you can’t really contribute to the primary content base. I’m not saying Hulu doesn’t fall within the realms of social media — only that it’s completely senseless to include it in any “relative” statistic on what’s happening in social media as a whole while ignoring blogs and other significant sources operating in similar ways.

2. What happened to YouTube? — If you could still view the first chart they showed, you would see a dateline from Feb. ’09 through Feb. ’10. What you would not see are any metrics showing what stats were actually being measured during that time — just a blank y-axis.

When the author was questioned about the missing reference points in the comments, the reply was “Units of measurement are relative, since they come from a panel audience of a few hundred thousand. But the data definitely reflect average consumer behavior! comScore wouldn’t lead us astray.” Um, yeah.

This is how faulty information spreads. Even worse, this is how poor interpretations of faulty information spread. Why do I call it faulty? Because if you look at that chart (which clearly shows YouTube with higher starting and ending traffic levels than Facebook) and look at it in relation to the statistics shared in the post, they just don’t add up.

What does basic logic tell us? If you’re showing YouTube as having more traffic than Facebook, then its percentage of total social media traffic must be higher than Facebook’s. That would put it at more than 41%. However, the author goes on to state that “As of March 2010, Facebook traffic made up 41% of all traffic on a list of popular social destinations. MySpace was in second place, capturing around 24% of traffic. Gmail had 15%, and Twitter had 8%.”

We’ve already talked about how the “list of popular social destinations” was faulty and didn’t justify Mashable’s claim to begin with. But here we have 88% of social media traffic (for those listed sites) accounted for. If YouTube was already shown to have more in another chart from the same source that would mean at least 42% of traffic went to them — putting us over 100%. That would tell us that YouTube wasn’t included in the list that led to the 41% statistic (or the metric-free graph from earlier in the post was complete hogwash). Neither should be acceptable to anyone reading the site or getting these statistics from anywhere.

It gets better though. I can’t find anything that shows YouTube having more visitors than Facebook. So okay. We’re back to the problem of missing reference points in their first chart. Maybe YouTube wouldn’t be at least at 42% as the first chart in that post would suggest. However, on checking some other statistical sources just for more background (Compete and Alexa; note that I don’t put much faith in either of their stats individually either), it appears that YouTube does get more traffic than Myspace, even if not Facebook. So based on the 41% statistic for Facebook and the 24% statistic for Myspace, it still wouldn’t add up unless YouTube was given the boot from the calculations. Why?
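For anyone who wants to check my math, here is a minimal sketch of the arithmetic in Python. The four shares are the ones quoted from the Mashable post. The YouTube floors are assumptions: 42 is the whole-percent minimum if YouTube really topped Facebook, and 25 is my own whole-percent minimum if it only topped Myspace. The function name is just something I made up for illustration.

```python
# Shares quoted in the Mashable post (percent of traffic on its
# "list of popular social destinations").
stated_shares = {"Facebook": 41, "Myspace": 24, "Gmail": 15, "Twitter": 8}


def total_with_youtube(shares, youtube_floor):
    """Total share if YouTube sat on the same list with at least youtube_floor percent."""
    return sum(shares.values()) + youtube_floor


accounted_for = sum(stated_shares.values())

# Scenario A (assumption): the unlabeled chart is right and YouTube tops Facebook.
scenario_a = total_with_youtube(stated_shares, 42)

# Scenario B (assumption): YouTube only tops Myspace.
scenario_b = total_with_youtube(stated_shares, 25)

print(accounted_for)  # 88
print(scenario_a)     # 130
print(scenario_b)     # 113
```

Either way the total lands above 100%, which is only possible if YouTube was never part of the list behind the 41% figure.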

3. Jumping to conclusions — One of my favorite assumptions is that because Myspace’s traffic share was shown to decrease and Facebook’s was shown to increase, that meant users were jumping from one social network to another. Of course there was no actual evidence to back up the claim. Nothing in the post suggests that it’s more than supposition.

In fact, two graphs in the post show that Myspace’s traffic remained relatively stable. In other words, on one hand they were showing that we had one maturing source leveling out in traffic over that year and one rapidly growing. On the other hand, that was somehow twisted into a mass migration. Of course traffic share for other sites will decrease when another comes in and shows massive gains. Less of the pie is available to them.
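Here is a rough sketch of that pie effect in Python. The visit counts are entirely hypothetical, invented for illustration, and do not come from comScore or Mashable. The only point is that a site's share can drop sharply while its own traffic does not move at all.

```python
def shares(traffic):
    """Convert absolute traffic figures into percentage shares of the total."""
    total = sum(traffic.values())
    return {site: 100 * visits / total for site, visits in traffic.items()}


# Hypothetical visit counts (made up, not real data).
year_start = {"Myspace": 50, "Facebook": 50}
year_end = {"Myspace": 50, "Facebook": 150}  # Myspace flat, Facebook tripled

start = shares(year_start)
end = shares(year_end)

print(start["Myspace"], end["Myspace"])  # 50.0 25.0
```

In this made-up case Myspace's share is cut in half without a single user leaving, so a falling share on its own says nothing about migration.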

What would be far more interesting is to look at truly active users, with that “active” status being set uniformly by some kind of social media standard (if everyone could actually agree on one). Facebook isn’t exactly known for making it easy for people to delete their accounts. Myspace has been much easier to leave (as are other social media sites). That’s one reason I’ve always been skeptical of Facebook’s traffic numbers (at least their own claims). To some degree they’re like a black hole, sucking in social media users and not letting them back out. Okay. They apparently can delete accounts. But unless things have very recently changed, they don’t make the process easy enough to pretend we’re comparing apples to apples in most cases.

Then again, relying on “member” numbers is an inherent fault of measuring social media for this very reason. How many members are active? How many are legitimate members vs automated bot-driven “members” and other spammers? What kind of activity are we talking about anyway? Are traffic numbers high because people are really interested in more information, or are the numbers high because the social media site requires multiple page views to do simple things to inflate their overall traffic numbers?

Every site is different in what they consider an active member, how good they are at weeding out the automated spammers, and how they direct traffic on the backend of their sites. Rarely are they directly comparable, and that doesn’t bode well for these types of comparisons beyond very general trends.

There is No Excuse for Spreading Ignorance

This is just one example of why I so often cringe when I see social media statistics thrown about by a supposedly reputable source. This information spreads. Quickly. People don’t seem to analyze information before passing it along anymore (or maybe a good question is “did they ever?”). I don’t think there’s any excuse for it.

I’ll give Mashable some major credit for pulling the piece. I don’t know what their reasons were, and I won’t make assumptions. But even if it’s just an unintended benefit, I’m glad to see that post stop spreading. One of my first comments in that initial Twitter discussion about the post was that I feared it would spread. It clearly did.

The Bigger Problem of Social Media Statistics

This isn’t a problem with Jolie O’Dell. This isn’t even a problem with Mashable. This is a widespread problem in social media. Companies are essentially able to create their own self-fulfilling prophecies by interpreting and publishing incomplete or distorted data. Actually, it goes well beyond social media.

I’m not saying this is always intentional. Sure, sometimes it might be a case of creating linkbait headlines or trying to puff up a company’s image to bring in more traffic and members (even if just out of sheer curiosity). I don’t assume that O’Dell had any ill intentions. Her post just happened to be the latest example of stats gone awry.

I also can’t give you any easy answers when it comes to social media measurement. In no way am I saying that Facebook doesn’t have a commanding presence in the industry. The point is that we’ll probably never know exactly what percentage of social media traffic any given site has, and it’s silly at best to pretend we have authoritative data when we don’t. That’s because we don’t know every social media site out there. New ones pop up every day. Sometimes they disappear or are sold off and merged (like recent news from AOL about Bebo). And sometimes one changes its business model, potentially fueling the growth of new social platforms (can anybody say “Ning exodus”?).

Does that mean we should stop gathering and interpreting data? Absolutely not. But I do think it means that we have a greater responsibility as publishers to help our readers form their own opinions from that data. We can do that by pointing out not only interesting possibilities but also potential flaws. Or have we just become fad-feeders, discouraging critical thinking and informed decision-making in favor of a “Join me! Follow me! Friend me!” approach of hyping up the tools we use in order to build our own audiences there?