There has been an ongoing (and really rather bitter) argument over discrimination against women in the skeptical/atheist community – particularly over whether or not conferences are preferentially selecting old, white, male speakers. Arguably this could be expanded to include discrimination against youth and against different races, but sexism is the issue that has been front-and-centre over the past year. The allegations are that the organisers of various conferences (particularly TAM) have not been inclusive when considering female speakers and that this has contributed to an unwelcoming environment at skeptical conferences.

Plenty of people have made plenty of accusations and there have been assertions galore, but what do the data say? Well, I can’t say that I have looked that hard, but all I have been able to find is a couple of point estimates of gender and conference involvement from Jen McCreight and a brief note from Brian Thompson on the JREF’s Swift Blog. There seems to have been no rigorous collection, comparison, or analysis of the data.

Well I propose a bit of science! There are two hypotheses that we can test:

Hypothesis 1: Speakers at atheist/skeptical conferences are not representative of the diversity of the members of those communities [i.e. “we’re discriminating”].

Hypothesis 2: Speakers at atheist/skeptical conferences are becoming more representative of the members of those communities [i.e. “we’re getting better”].

Whenever people start talking about “diversity”, I instantly think of ecological questions rather than socio-cultural questions. It seems that the diversity in skepticism problem is one that is directly quantifiable using the suite of tools provided by the field of ecology.

The Biographical Data

The data that we need describe the speakers who have presented at conferences over the past couple of years. I propose that the following be collected for each speaker: sex, age, race, nationality, and education. This is analogous to the “trait” data in an ecological analysis.

The Attendance Records

Next we need to record which speakers were present at which conferences. I suggest the following conferences be included:

TAM 1, 2, 3, 4, 5, 5.5, 6, 7, London, 8, London 2010, OZ, 9, and 2012 (14 conventions)

Skepticon 1, 2, 3, 4, and 5 (5 conventions)

NECSS 2009, 2010, 2011, and 2012 (4 conventions)

Australian Skeptics Conventions (however many there are!)

This is analogous to the “community” data in ecology. I have made a start by collecting attendance and biographical data for most of the TAM 2012 speakers, and this can be seen in this Google Doc. If anyone else (SitP, local CFI chapters, Skepticamps, etc) wants to contribute data then that would be awesome, but I can’t go that far down the rabbit hole.
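To make the two tables concrete, here is a minimal sketch in Python (not the R code mentioned later in this post). Every name and value in it is a placeholder that I have invented for illustration: "Speaker A", "Conf X", the trait values, and the helper `trait_counts` are all hypothetical and do not come from the real spreadsheet.

```python
from collections import Counter

# "Trait" table: one row per speaker. Placeholder values only, NOT real data.
traits = {
    "Speaker A": {"sex": "F", "age": 34, "race": "r1", "nationality": "US", "education": "PhD"},
    "Speaker B": {"sex": "M", "age": 61, "race": "r2", "nationality": "UK", "education": "BSc"},
    "Speaker C": {"sex": "M", "age": 47, "race": "r2", "nationality": "US", "education": "MD"},
}

# "Community" table: one row per conference, 1 = that speaker presented there.
community = {
    "Conf X": {"Speaker A": 1, "Speaker B": 1, "Speaker C": 0},
    "Conf Y": {"Speaker A": 0, "Speaker B": 1, "Speaker C": 1},
}

def trait_counts(conference, trait):
    """Count the values of one trait among the speakers present at a conference."""
    present = [name for name, flag in community[conference].items() if flag]
    return Counter(traits[name][trait] for name in present)

print(trait_counts("Conf X", "sex"))
print(trait_counts("Conf Y", "sex"))
```

Keeping the traits and the attendance records in separate tables means that a speaker's details are entered once, no matter how many conferences they have spoken at.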

Is it Representative?

The wonderful thing about doing this analysis now is that the Atheist Census is currently running (and over 100,000 people have taken the short survey at the time of writing). While this isn’t necessarily going to give us a sample of skeptics, it will give us an idea of the freethought community more broadly from which we can infer the pool from which speakers could theoretically be drawn. We can quibble over the validity of this dataset and apply caveats accordingly but I see this as the best dataset available. The only drawback is that the Census doesn’t ask about race, but all other variables can be included in the analysis…

The Analysis

We test Hypothesis 1 (we are discriminating) by randomly sampling from the Atheist Census data to produce a large number (let’s say n=1000) of different “speaker lists”, each the same size as the real list. We then compare the diversity of the actual speaker list at each conference to the simulated lists (which form a “null distribution”) to see if the actual lists are more or less diverse than what you would expect by chance. This kind of randomisation test against a null distribution is a standard statistical procedure.
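The randomisation test can be sketched roughly as follows, again in Python rather than the R script. This is a deliberately simplified stand-in: it measures diversity with the Shannon index on a single trait (sex), whereas the real analysis would use all of the traits together. The pool composition and the observed list are made-up numbers, and `shannon` and `null_test` are names I have chosen for illustration.

```python
import math
import random
from collections import Counter

def shannon(values):
    """Shannon diversity index H = -sum(p * ln p) over category proportions."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

def null_test(pool, observed, n_sims=1000, seed=0):
    """Compare the observed list's diversity with random draws from the pool.

    Returns the observed H and the fraction of simulated lists whose H is
    less than or equal to it. A small fraction suggests the real list is
    less diverse than chance alone would produce.
    """
    rng = random.Random(seed)
    h_obs = shannon(observed)
    k = len(observed)  # simulated lists match the size of the real list
    sims = [shannon(rng.sample(pool, k)) for _ in range(n_sims)]
    p_low = sum(h <= h_obs for h in sims) / n_sims
    return h_obs, p_low

# Made-up pool standing in for census respondents (assumption, not census data).
pool = ["F"] * 300 + ["M"] * 700
# Made-up speaker list for one conference (assumption, not a real line-up).
observed = ["M"] * 18 + ["F"] * 2

h_obs, p_low = null_test(pool, observed)
print(f"observed H = {h_obs:.3f}, fraction of simulations <= observed = {p_low:.3f}")
```

Sampling without replacement from the pool treats each census respondent as a potential speaker, which is the “drawn by chance” assumption behind the null distribution.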

We test Hypothesis 2 (we are getting better) by looking at time series plots of diversity across the different conferences, taken in chronological order. We might expect that there would not be a linear trend, as there were renewed efforts to enhance diversity prior to TAM9. I’ll consider more complex statistical models after eyeballing the data to see which is most appropriate.
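As a first pass at the trend question, something like the following Python sketch could be run before reaching for anything fancier. It computes an ordinary least-squares slope and a simple before/after difference in means around a chosen breakpoint. The diversity values are invented placeholders, and `ols_slope` and `step_change` are hypothetical helper names; neither is the final model.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def step_change(ys, break_index):
    """Mean of the values from break_index onwards minus the mean before it."""
    before = ys[:break_index]
    after = ys[break_index:]
    return sum(after) / len(after) - sum(before) / len(before)

# Placeholder diversity scores for six consecutive conferences (NOT real data).
order = [0, 1, 2, 3, 4, 5]
diversity = [0.30, 0.32, 0.31, 0.33, 0.45, 0.47]

print("linear slope per conference:", ols_slope(order, diversity))
print("change after the fifth conference:", step_change(diversity, 4))
```

If the step change is large while the values on either side of the breakpoint are flat, a straight line is the wrong description and a model with a breakpoint would be the better choice.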

I’m perfectly willing to be corrected on the methods, so I have posted the whole statistical code (in R, naturally) to a Google Doc. I’ve also uploaded two files showing examples of the biographical data (actual data) and the community data (mocked-up). Just to show you what these data look like in “ordination space”, here is a plot of 18 regular skeptical speakers. The similarity of speakers is shown by their distance (closer together = more similar). I’ve annotated the plot to show how the speakers group together.

You’ll notice a couple of things: (i) the only non-white speaker is Leo Igwe who has his own little corner of the plot (not a good start for diversity…), (ii) Randi also has his own corner by virtue of seniority, and (iii) all the women are separated out on the left. The stereotypical skeptical speaker is the older, white male and they sit pretty well in the middle (Shermer, Krauss, Novella). A diverse group would have names in all the spaces, indicating that we had speakers not only representing extremes but also all aspects of the range of traits that we are measuring. A less diverse group would have a wide range of types of people, but big open gaps where intermediates weren’t present.
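For anyone wondering where the distances in a plot like this come from, here is a rough Python sketch of one common way to measure dissimilarity between individuals described by a mix of categorical and numeric traits, the Gower dissimilarity. I am using it purely as an illustration: the function name `gower`, the trait values, and the assumed age range are all my own placeholders.

```python
def gower(a, b, numeric_ranges):
    """Gower dissimilarity between two trait records (0 = identical, 1 = maximally different).

    Numeric traits contribute |a - b| / range; categorical traits contribute
    0 if equal and 1 otherwise. The result is the mean over all traits.
    """
    total = 0.0
    for key in a:
        if key in numeric_ranges:
            total += abs(a[key] - b[key]) / numeric_ranges[key]
        else:
            total += 0.0 if a[key] == b[key] else 1.0
    return total / len(a)

# Placeholder speakers (NOT real people or real data).
speaker_1 = {"sex": "F", "age": 30, "nationality": "US"}
speaker_2 = {"sex": "M", "age": 60, "nationality": "US"}

# Assumed spread of ages across all speakers, used to scale the age difference.
ranges = {"age": 60}

print(gower(speaker_1, speaker_2, ranges))
```

An ordination then takes the full table of pairwise dissimilarities and arranges the speakers in two dimensions so that the plotted distances match it as closely as possible.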

The Next Step

I need help to collate the data. The Google Doc that I posted contains about four hours of work for half of TAM 2012. Ideally the conference organisers (DJ et al.?) would supply these kinds of spreadsheets rather than me having to trawl archived websites, but I’m prepared to get this done the long way if needs be. However, if a dozen people get involved (and organised!) then we could get this knocked off in an afternoon! Also, if you have spoken at a skeptical con, feel free to enter (or correct!) your own biographical data. If you are interested in helping, leave a comment below and I’ll coordinate volunteers.

PS: I realise that some speakers might not want to reveal their ages. If that is the case then we will just guess based on photographs and online CVs (time since graduation, etc), so don’t be offended if I (or we if I get some help) get it wrong!