The interesting result was that some of the rebuttals were quite insightful, and led me to change how I would make the argument if I had to present it again. Judging by the literacy and intelligence of some of the respondents, most of them probably wouldn't need Mechanical Turk as a source of income, so I assume most of them fit the profile of this Salon.com writer and are doing it just for fun. Hell, you can find enough people on UseNet and Slashdot who will argue with you for free.

But there were a few reasons I found this preferable to the conventional ways of gathering interesting rebuttals to your own reasoning. If you send out a sample argument to all of your e-mail buddies, you will probably get some useful replies, but they may start to think you're a little weird for asking them to evaluate your thought processes, especially if you do it over and over. Post an opinion on UseNet or Slashdot, and you may have to wade through a lot of crap to find the useful responses (while others may consider your post to be part of the crap that they have to wade through). And in both cases, there's the potential embarrassment of what you're asking for -- the risk of seeming so uncertain about your own opinions that you want other people to check your work for you. (I actually think that being uncertain about your own beliefs is a virtue, but it doesn't seem to be one that our culture prizes very highly.) Using Mechanical Turk addresses most of these problems; even though you're still admitting to total strangers that you might be wrong and asking them to shoot you down if they can, at least the evidence of your insecurity won't turn up when your next employer or Internet date does a Google search for your name. ("Damn it, I want a man who doesn't question his bumper stickers!")

So, while I didn't find it useful enough that I would run every opinion through the Mechanical Turk machinery to see what feedback I could get from it (I'm not paying a bunch of workers to proofread this article), I did like it enough to recommend it to people for certain arguments in certain settings. The main kinds of arguments that I would try out on the Mechanical Turk service would be about abstract philosophical or moral questions on issues that have been around forever, like abortion or the death penalty -- topics so explosive that you'd risk making your friends very uncomfortable if you test-marketed your arguments on them, and which would seem almost rude to post about in a public forum because the debate topics have been around for so very, very long. But on Mechanical Turk, $1 is apparently enough to get people to ignore the awkwardness and the exhaustedness of the topic and to focus on what you ask.

And what was the argument that I used to test it out? Perhaps the geek crowd will feel more sympathy with this than the general public does. Basically it was that the conventional wisdom behind allowing adults to smoke, but banning cigarettes for people under 18, is wrong. Either you can believe that smoking should be permitted for everybody, or that it should be banned for everybody, but there is no consistent set of assumptions that could lead you to conclude that smoking should be banned for people under 18 but allowed for everyone else. You have two groups of people under consideration -- people under 18 who smoke, and people over 18 who smoke. What possible reason could there be for wanting to protect the health of the people in the first group, but not the people in the second group?

The problem with the conventional reason for smoking age restrictions -- "Younger people have worse judgment, so they are more likely to smoke" -- is that if this is true, all that means is that the first group of people will be proportionally larger, relative to the total population of people in their age range. But even after that assumption, you're still left with two groups of people who exhibit the same continued bad judgment with regard to smoking cigarettes. Treating the two groups differently is a bit like saying we should have lighter sentences for female murderers than for male murderers, just because men are more likely to commit murder.

And yet this conclusion did give me pause, so this is a classic example of an argument where you'd want someone to check your work. Off I went to create a Human Intelligence Task (HIT) on Mechanical Turk simply asking people to read the argument and respond. In the first round, most responders missed what I thought was the point of the argument, and responded with some variation of "Minors are more likely to smoke because they have worse judgment", without addressing the question of why the two groups of smokers should be treated differently. A few people responded with variations of "We've always done it that way" (referring to similar restrictions on alcohol, pornography, etc.); fair enough, it just reminded me that if I asked the question again I'd have to say I didn't consider any argument valid that boiled down to "We've always done it that way".

But then came some more interesting responses. One worker replied that I was wrong to assume that the effects of a cigarette were "the same" on adults and minors because cigarette smoke has been shown to be more damaging to developing tissues. OK, that was worth a dollar. On the other hand, that just means that there is some number N of cigarettes that would be just as harmful to an adult as 1 cigarette would be to a minor, so you're still left without a consistent reason for why you'd let the adult buy those N cigarettes but prevent the minor from buying 1 cigarette. Then another user called me out on the opening line of my original argument, "There is no reason to ban cigarettes for minors but not for adults." He said, quite correctly, that I had only attempted to debunk the most commonly given reason, and that it was wrong to conclude from this that there was no such reason.

So, this led me to another idea for how to present an argument and solicit feedback on Mechanical Turk: in the form of a series of mathematically precise statements, each one following from the previous ones. The new HIT was to ask users if they disagreed with the conclusion, and if they disagreed, then to identify the first statement that they disagreed with. The idea was that each statement would follow logically from the ones before it, so anyone who identified a statement as the "first" one that they disagreed with would be contradicting themselves, having already accepted everything that it followed from.
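For anyone who wants to try the same format, tallying the answers is the easy part. Here is a minimal Python sketch; the function names and the sample responses are made up for illustration and are not the actual HIT results. Each response is the number of the first statement a worker disagreed with, or None if the worker accepted the conclusion.

```python
from collections import Counter


def tally_first_disagreements(responses, num_statements):
    """Count how often each statement was picked as the first point of disagreement.

    `responses` holds one entry per worker: a statement number (1-based),
    or None if the worker agreed with the conclusion.
    """
    counts = Counter()
    for picked in responses:
        if picked is None:
            continue  # worker agreed with the whole chain
        if not 1 <= picked <= num_statements:
            raise ValueError(f"statement number out of range: {picked}")
        counts[picked] += 1
    # Report every statement, including those nobody picked.
    return [(n, counts[n]) for n in range(1, num_statements + 1)]


def clear_winner(tally, share=0.5):
    """Return the statement picked by more than `share` of dissenters, else None."""
    total = sum(count for _, count in tally)
    if total == 0:
        return None
    statement, top = max(tally, key=lambda pair: pair[1])
    return statement if top / total > share else None


# Hypothetical responses from seven workers, for a six-statement argument.
sample = [3, 4, None, 3, 2, 4, 1]
tally = tally_first_disagreements(sample, num_statements=6)
print(tally)
print(clear_winner(tally))
```

The 50% cutoff for a "clear winner" is an arbitrary choice; a stricter or looser threshold may suit a larger batch of responses better.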

Now, whether or not you want to use this format when running an argument past the Turk workers depends on what your goal is. If you really want to find out if your own argument is valid, then breaking it down mathematically is one approach. On the other hand, if you already believe your own argument, and you're just trying to find the most persuasive way of phrasing it, then you may not learn anything useful by breaking it down into a series of mathematical steps, because that's probably not going to be the format of your final persuasive essay.

Anyway, the new mathematical format of the argument was (slightly reworked from what I posted on Amazon):

1. Government should ban smoking by people under 18, because of the harmful health effects.
2. If that's true for the entire group of underage smokers, then it's also true for each individual smoker under 18. In other words, even if only one person under 18 smoked in the entire country, it would still be justified for the government to ban them from smoking.
3. Whatever bad health effects are caused by the average person under 18 smoking 1 cigarette, there is some number N cigarettes that would cause the same bad health effects in the average adult who smoked them.
4. If banning 1 person under 18 from smoking 1 cigarette is justified (even if they were the last smoker on Earth), and the health effects would be the same for an average adult who smoked N cigarettes, then banning 1 adult from smoking those N cigarettes would also be justified (again, even if they were the last smoker on Earth).
5. If banning 1 person over 18 from smoking would be justified, then the same logic would apply to every person over 18, which would imply banning smoking for all people over 18.
6. Hence, if you believe that smoking should be banned for people under 18, then the same logic would lead to a ban on smoking for people over 18 as well.
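One rough way to write the core of that chain in symbols, under assumptions that were not part of the original HIT: let $h_m(k)$ and $h_a(k)$ stand for the harm done to the average minor and the average adult by $k$ cigarettes, and suppose a ban is justified exactly when the harm reaches some threshold $T$. These symbols are introduced here only for illustration.

```latex
\begin{align*}
h_m(1) &\ge T
  && \text{(banning one minor from one cigarette is justified)} \\
h_a(N) &= h_m(1)
  && \text{(the choice of } N \text{)} \\
\Rightarrow\quad h_a(N) &\ge T
  && \text{(so banning one adult from } N \text{ cigarettes is justified)}
\end{align*}
```

Written this way, the chain only goes through if justification depends on harm alone, and not on the smoker's age or judgment. That is an assumption of this sketch, and it is arguably the one that a reader who brings up judgment is rejecting.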

The response from a lot of workers who responded to this HIT was that... I lost them. Each of them identified the first statement in the list that they disagreed with, as required by the HIT, but many commented that the whole thing was phrased confusingly. There was no clear winner for the first statement that people disagreed with, but several people picked #3 and #4, arguing some version of "People under 18 have less developed judgment." (I still say that doesn't matter, because you're talking about comparing a person under 18 who smokes, with a person over 18 who smokes, and their judgment in both cases is the same, etc.) So this particular experiment failed -- it didn't make it easier to persuade people by formulating the argument as a series of steps, and it also didn't lead to any agreement on what was the Achilles' Heel of the argument itself.

However, I think the general idea of using Mechanical Turk to find sparring partners may be useful to a lot of people. If you were interested in publishing some kind of persuasive argument, you could use an Amazon HIT to have readers compare several different versions of the same argument and identify the one that they thought was most convincing. If you were feeling more philosophical and simply wanted to know if your argument was correct, you could pay people to look for flaws in it (and here is where the mathematical phrasing could come in handy). If you're crafting an argument for public consumption, you could even have HIT workers build up your argument for you -- start with a position and have them come up with reasons supporting that position -- although to me that feels like a cheapening of the debate process that crosses the line, because you're not even trying to reason your way to a conclusion, instead starting with the conclusion you want and then working backwards (not that this isn't what a lot of debaters do anyway!). My own interest would be to see next if certain types of arguments are more likely to persuade people who are more mathematically inclined (by asking respondents to indicate how well they did at math in school). Perhaps arguments with flowery language are more likely to appeal to people who were English majors, while arguments spelled out as a series of logical steps are more likely to appeal to people who look at things in a mathematical way (also known as the "real" or "right" way of looking at things).

Maybe my preference for the controlled, user-reimbursed process of "debating" that is enabled by Mechanical Turk has to do with a lifelong focus on bottom-line results: Decide what the result is, and judge the process by how well it brings about that result. I don't think debate and discussion should be like soccer, valued for the fun and the exercise; I think a good debate should actually get somewhere, persuading the participants or the listeners of a new point of view that builds on their old one, or else the debate has failed. If paying HIT workers kills the "spirit" of a good debate but helps achieve the goal, then so much the better. On the other hand, we'll never run out of people who enjoy the process of debating and arguing for its own sake, and will continue to debate things into the ground without anybody paying them. Hey look, here come some of them now!...