Author: Marc Fawzi

Twitter: http://twitter.com/#!/marcfawzi

License: Attribution-NonCommercial-ShareAlike 3.0

~~

/* This article presents the case against the ‘wisdom of crowds’ and explains the background for how the Wikipedia 3.0: The End of Google? article reached over 200,000 hits */

This article explains and demonstrates a conceptual flaw in digg’s service model that causes biased (or rigged) as well as lowest-common-denominator hype to be generated, causing a dumbing down of society (as the crowd). The experimental evidence and logic supplied here apply equally to other Web 2.0 social bookmarking services such as del.icio.us, and netscape beta.

Since digg is an open system where anyone can submit anything, user behavior has to be carefully monitored to make sure that people do not abuse the system. But given that the number of stories submitted each second is much larger than what digg’s own staff can monitor, digg has given users the power to decide what is good content and what is bad (e.g. spam, miscategorized content, lame stuff, etc.).

This “wisdom of crowds” model, which forms the basis for digg, has a basic and major flaw at its foundation, not to mention at least one process and technology related issue in digg’s implementation of the model.

Let’s look at the simple process and technology issue first before we explore the much bigger problem at the heart of the “wisdom of crowds” model. If enough users report a post from a given site as spam then that site’s URL will be banned from digg, even if the site’s owner had no idea someone was submitting links from his site to digg. The fact is that digg cannot tell for sure whether the person submitting the post is the site’s owner or someone else, so their URL banning policy (or algorithm if it’s automated) must make the assumption that the site’s owner is the one submitting the post. But what if someone starts submitting posts from another person’s blog and placing them under the wrong digg categories just to get that person’s blog banned by digg?

This issue can be eliminated by improvements to the process and technology. [You may skip the rest of this paragraph if you can take my word for it.] For example, instead of banning a given site’s URL right away upon receiving X number of spam reports for posts from that site, the digg admins would put the site’s URL under a temporary ban and attempt to contact the site’s owner, possibly having the owner click on a link in an email they’d send him/her in order to capture his/her IP address and compare it to the one used by the spammer. If the IP addresses don’t match then they would ban the IP address of the spam submitter, and not the site’s URL. This obviously assumes that digg is able to automatically ban all known public proxy addresses (including known Tor addresses, etc.) at any given time, to force users to use their actual IP addresses.
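The temporary-ban process described above can be sketched in a few lines of Python. This is only an illustration of the idea, not digg's actual code: the names `ModerationState`, `handle_spam_reports` and `resolve_temp_ban` are hypothetical, and the rule that matching IP addresses lead to a ban of the site's URL is my own assumption, since the paragraph above only spells out the mismatch case.

```python
from dataclasses import dataclass, field


@dataclass
class ModerationState:
    """Hypothetical in-memory stand-in for digg's moderation records."""
    spam_threshold: int  # the "X" spam reports mentioned in the text
    temp_banned_sites: set = field(default_factory=set)
    banned_sites: set = field(default_factory=set)
    banned_ips: set = field(default_factory=set)


def handle_spam_reports(state, site_url, report_count):
    """On X reports, suspend the site temporarily instead of banning it outright."""
    if report_count >= state.spam_threshold:
        state.temp_banned_sites.add(site_url)
        return "temp_ban"
    return "no_action"


def resolve_temp_ban(state, site_url, submitter_ip, owner_ip):
    """Run once the owner has clicked the emailed link, which reveals owner_ip."""
    state.temp_banned_sites.discard(site_url)
    if submitter_ip != owner_ip:
        # Someone else submitted the posts: punish the submitter, not the site.
        state.banned_ips.add(submitter_ip)
        return "ban_submitter_ip"
    # Assumption (not stated in the text): a match means the owner is the spammer.
    state.banned_sites.add(site_url)
    return "ban_site"
```

The point of the two-step design is that the site's URL is never banned on the strength of spam reports alone. The reports only trigger the temporary ban, and the final decision waits for the IP comparison.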

The bigger problem, however, and what I believe to be the deadliest flaw in the digg model, is the concept of the wisdom of crowds. Crowds are not wise. Crowds are great as part of a statistical process to determine the perceived numerical value of something that can be quantified. A crowd, in other words, is a decent calculator of subjective quantity, but still just a calculator. You can show a crowd of 200 people a jar filled with jelly beans and ask each of them how many jelly beans are in the jar. Then you can take the average of their answers, and it will typically be closer to the actual number of jelly beans than most individual guesses. However, if you were to ask a crowd of 200 million to evaluate taste or beauty or some other subjective quality, e.g. coolness, the averaging process that helps in the case of counting jelly beans (where members of the crowd use reasoning and don’t let others affect their judgment) doesn’t happen. What happens instead is that the crowd members (assuming they communicate with each other such that they affect each other’s qualitative judgment, or assuming they already share something in common) converge toward the lowest-common-denominator opinion. The logic for this is that reasoning is used in the case of estimating measurable values, while psychology is used in the case of judging quality. Thus, in the case of evaluating the subjective quality of a post submitted to digg, the crowd has no wisdom: it will always choose the lowest common denominator, whatever that happens to be.
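The jelly bean argument can be illustrated with a toy simulation. This is a rough sketch under assumptions of my own, not a measurement: the true count (1000), the spread of the guesses (a Gaussian with standard deviation 300) and the `conformity` weight are all made up, and the second function only models one simple form of influence, in which each person mostly repeats the running average of the answers already given. It still deals with a countable quantity, so it shows how influence breaks the averaging effect rather than how a crowd judges quality.

```python
import random
import statistics

TRUE_COUNT = 1000  # hypothetical number of jelly beans in the jar
CROWD_SIZE = 200   # crowd size used in the text
NOISE_SD = 300     # assumed spread of individual guesses


def independent_crowd(rng):
    """Everyone guesses privately; return the crowd's average guess."""
    guesses = [rng.gauss(TRUE_COUNT, NOISE_SD) for _ in range(CROWD_SIZE)]
    return statistics.fmean(guesses)


def influenced_crowd(rng, conformity=0.9):
    """Each person hears the running average and mostly repeats it."""
    total = 0.0
    count = 0
    for _ in range(CROWD_SIZE):
        report = rng.gauss(TRUE_COUNT, NOISE_SD)  # the person's private guess
        if count:
            running_average = total / count
            report = conformity * running_average + (1 - conformity) * report
        total += report
        count += 1
    return total / count


def rms_error(crowd_fn, trials=200, seed=0):
    """Root-mean-square error of the crowd average over many simulated crowds."""
    rng = random.Random(seed)
    errors = [crowd_fn(rng) - TRUE_COUNT for _ in range(trials)]
    return (sum(e * e for e in errors) / trials) ** 0.5


if __name__ == "__main__":
    print("independent crowd RMS error:", round(rms_error(independent_crowd), 1))
    print("influenced crowd RMS error: ", round(rms_error(influenced_crowd), 1))
```

In the independent case the individual errors largely cancel out in the average. In the influenced case the first few answers dominate everything that follows, so the crowd's average is little better than a handful of early guesses.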

To understand a crowd’s lack of rationality and wisdom, as a phenomenon, consider the following. I had written a post (see link at the end of this article) about the Semantic Web, domain-specific knowledge ontologies and Google as seen from a Google-centric view. I went on about how Google, using the Semantic Web and an AI-driven inference engine, would eventually develop into an omnipresent intelligence (a global mind) and how that would have far-reaching implications etc. The post was titled “Reality as a Service (RaaS): The Case for GWorld.” I submitted it to digg and I believe I got a few diggs and one good comment on it. That’s all. I probably got 500 hits in total on that post, and mostly because I used the word “GWorld” in the title. More than a week after that, I took the same post, with the same idea of combining the Semantic Web, domain-specific knowledge ontologies and an AI-driven inference engine, but this time I pitted Wikipedia (as the most likely developer of knowledge ontologies) against Google, and posted it with the sensational but quite plausible title “Wikipedia 3.0: The End of Google.” The crowd went wild. I got over 33,000 hits in the first 24 hours and, as of the latest count, about 1600 diggs. In fact, my blog on that day (yesterday) beat the #1 blog on WordPress, which is that of ex-Microsoft guy Scobleizer. And now I have an idea of how many hits he gets a day! He gets more than 10,000 and less than 25,000. I know because in the first 16 hours, while I was getting hit by massive traffic, I managed to get ahead of him with a total of 25,000 hits, but in the last 8 hours of the first 24-hour cycle (for which I’m reporting the stats here) he beat me back to the #1 spot, as I only had 9,000 hits. I stayed at #2 though.

Figure 1: June 25 Traffic, the first 16 hours of a 24 hour graph cycle. Traffic ~ 25,000 hits.

Figure 2: June 26 Traffic, the last 8 hours of a 24 hour graph cycle. Traffic ~ 8,000 hits.

A crowd, not to be confused with individuals (like myself, yourself), aside from being a decent calculator of subjective quantities (like counting jelly beans in a jar) is no smarter than a bull when it comes to judging the intellectual, artistic or philosophical appeal of something. Wave something red in front of it or make a lot of noise and it may notice you. Talk to it or make subtle gestures and you’ll fail to get its attention. Obviously you can have a tame bull or an angry one. An angry one is easier to upset. A crowd is no more than a decent calculator of subjective quantities. It is a tool in that sense and only in that sense. In the context of judging quality, like musical taste or the coolness of something, a crowd is neither rational nor wise. It will only respond to the most basic and crude methods of attention grabbing. You can’t grab its attention with subtlety or rationality. You have to use psychology, like you would with a bull. As you can see from the graphs of my blog traffic, I’ve proved it. I didn’t just understand it. Social bookmarking systems, and tagging in general, amplify the intensity of the crowd-as-a-bull behavior by attaching the highest numerical values to the most crude, the most raw and the lowest common denominator.

Now, all of a sudden, when a post gets 100 diggs it reaches escape velocity and goes into orbit. The numerical value attached to a post (its “diggs”), when it grows fast, acts like bait. People rush to see such posts just as they rushed in tens of thousands to see the “Wikipedia 3.0 vs Google” post. Yet it’s basically the same post as the one I did on GWorld over a week ago that only got a few diggs. There is no comparison between the wisdom and rationality of an individual and that of a crowd. The individual is infinitely wiser and more rational than the crowd.

So these social bookmarking systems need to be based on a more evolved model where individuals have as much say as the crowd. Remember that many failed social ideologies were based on the idea of favoring the so-called “wisdom of crowds” over individualism. They failed because collectivist behavior is dumb behavior and individual judgment is the only way forward. We need more individuality in society, not less.

Censored by digg

This post was censored by digg’s rating system. However, in a software-enabled rating system, such as digg, reddit, del.icio.us, netscape, etc, there is no way to guarantee that manipulation of the system by its owner does not happen. Please see the Update section below for the explanation and the evidence (in the form of a telling list of censored posts) behind why digg itself, and not just some of its fanatic users, may have been behind the censoring of this post.

Note: a fellow WordPress blogger published a post called Digg’s Ultimate Flow which links to this post. It has not been buried/censored yet (June 29, ’06, 5:45pm EST). It’s not to be confused with this post. It hasn’t been buried because it presents no threat to digg. They can sense danger like an animal and I guess I’ve scared them enough to bury/censor my post. The other me-too post that I’ve just mentioned does not smell as scary. It’s really sad that digg and sites like it are feeding the crude, animal-like, instinctive, zero-clarity behavior that is the ‘unwisdom’ of crowds.

The truth is that digg and other so-called “social” bookmarking sites do not give us power, they take it away from us. Always. Think. Innovate. Do not follow. But you may want to follow this link to share your view with other digg users for what it’s worth.

Correction

I’ve just noticed that this blog is ahead of Scobleizer again at #1. I’ve had 7,796 hits since 8:00pm EST, June 28, ’06 (yesterday). It’s 8:00pm EST now, on June 29, ’06.

Update

The following is a snapshot of digg’s BURIED/CENSORED post section as of 4:00am EST, June 29th, ’06. This post was originally titled “Digg’s Biggest Flaw Discovered.” Note that anything that is perceived as anti-digg, be it a bug report or a serious analysis of digg’s weaknesses, is being censored.

Digg’s Biggest Flaw Discovered submitted by evolvingtrends 21 hours 35 minutes ago (via http://evolvingtrends.wordpres…) An actual proof of a major flaw at the foundation of digg’s quality-of-service model category: Programming

Now even CNET wants its stories endorsed by Digg community submitted by aj9702 1 day 17 hours ago (via http://news.com.com/Attack+cod…) Check it out.. CNET which is number 72 on Alexa rankings wants its stories endorsed by the Digg community. They have a digg this link now to their more popular stories. This story links to the news that exploit code is out there for the RRAS exploit announced earlier this month category: Tech Industry News

Dvorak: Understanding Digg and Its Utopian Idealism submitted by kevinmtu 1 day 18 hours ago (via http://www.pcmag.com/article2/…) Dvorak’s PC magazine article on the new version of Digg and its flaws, posing many interesting points.For example, “What would happen to the Digg site if the Bush-supporting minions in the red states, flocked to Digg and actively promoted stories, slammed things they didn’t like, and in the process drove away the libertarian users?” category: Tech Industry News

Pros and Cons of Digg v3 submitted by jobobshishkabob 2 days ago (via http://thenerdnetworks.com/blo…) Well, Digg version 3 got released today. It is really nice and has many great features. But everything has its flaws…. heres a list of pros and cons of the new Digg.com category: Tech Industry News

Easy Digg comment moderation fraud submitted by Pooley 2 days ago (via http://www.davidmcmanus.com/st…) I’ve found a bug in digg.com. A flaw in the way I ‘digg’ a comment, by clicking the thumbs up icon, allows me to mark up a comment multiple times. category: Tech Industry News

Why the digg moderation system is flawed submitted by SpyDerMann 5 days ago (via http://slashdot.org/~Spy+der+M…) Are oil companies astroturfing digg by downmodding the unfavorable comments in global warming discussions? We can’t know for sure that they ARE. However, we can be sure that they CAN.

Tags: Semantic Web, Web standards, Trends, wisdom of crowds, tagging, Startup, mass psychology, Google, cult psychology, inference, inference engine, AI, ontology, Semanticweb, Web 2.0, Web 3.0, Google Base, artificial intelligence, Wikipedia, Wikipedia 3.0, collective consciousness, digg, censorship