Persuasion science uses statistics. Some people want to run away screaming anytime math enters the equation, but I want to show you a simple and straightforward way of using numbers that helps you understand what’s really happening. It’s called the “effect size” and it answers a simple question: How big of a difference is it? An outstanding psychologist named Jacob Cohen articulated this concept. Another excellent psychologist, Robert Rosenthal, developed the idea further. I’m going to combine their ideas and put them in the Windowpane Display. Here’s how the Windowpane works.

The Windowpane

Think about a window. Imagine that it is divided into four equal panes. Easy to visualize, right? Those 2 by 2 panes. Now, let’s say we do an experiment where we randomize one group of people to get the New Thing while another group of people gets the Old Thing. To make the math simple, we’ll give 100 people the New Thing (treatment group) and 100 people the Old Thing (control group). After each group does its Thing, we carefully observe each person to see if they Changed. Either they did Change or they did Not Change. Let’s dress up the Windowpanes with these labels.

WINDOWPANE WITH TREATMENT and OUTCOME

Pretty simple so far. We’re testing the New Thing against the Old Thing. We have 100 people randomly assigned to each group. We then see how the people Change either into Yes or No. Now, let’s fill in each of the four little windowpanes to demonstrate different scenarios.

We’ll start with failure which is what usually happens with science. All our good intentions are smashed against the Rock of Experimental Science and nothing happens. Let’s be polite and call this the No Effect outcome rather than use the words scientists use when looking at the stat results on the screen and realizing their next grant application just died. It looks like this.

NO EFFECT

We’ve got 50 people in each little windowpane. To understand what’s happening, read each row. We started with 100 people in the treatment condition who got the New Thing and when we observed them we found that 50 of the 100 changed and 50 of the 100 didn’t change. We also started with 100 people in the control condition who got the Old Thing and when we observed them we found 50 of the 100 changed and 50 didn’t. No effect. Nada. Zip. The New Thing is not different from the Old Thing. [Side Bar: If you’re a stat maven or just pretty quick you know that failure would result if both rows were 10/90 or 30/70 or even 90/10, anything as long as both rows have the same percentage. Failure is not just 50/50, but rather when both rows show the same finding. I used 50/50 because it makes the math and the concept easier to follow.]

Now, let’s create an example where we start to get differences. Let’s assume that Something Happens when people get the New Thing and it looks like this.

SMALL EFFECT

We now see on the rows and the columns a 45/55 effect, a 10 point difference. In social science parlance, this 10 point difference is called a “small” effect as popularized by Jacob Cohen in his work on power analysis and effect sizes. Make sure that you see the impact of the treatment. Notice in this example that more people who got the New Thing showed the desired change (read the row) compared to people who got the Old Thing (read their row).

Now, let’s increase the effect size. Let’s go from “small” to “medium.” Here’s the Windowpane for a medium effect.

MEDIUM EFFECT

Now, our row values are 35 and 65. A moderate effect is a 30 point difference. That sounds somewhat impressive, a 30 percentage point difference. Think about this medium effect another way. Notice that 65 is almost twice as large as 35. A medium effect means that you’re getting almost twice as much change in the treatment group compared to the control group. A medium effect is getting to be pretty obvious. Think how obvious a “large” effect must be. It looks like this.

LARGE EFFECT

The row values here are 25 and 75, a 50 point difference. Now the rate of difference is three times, with the Treatment producing three times as much change as the Control, a 200% increase. That’s big. That’s obvious. Take a quick scan now and review the four Windowpanes, No Effect, Small Effect, Medium Effect, and Large Effect. See the numbers change.
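If you would rather let a computer do the counting, here is a minimal Python sketch of the four Windowpanes described above. The names `WINDOWPANES` and `summarize` are my own illustrations, and the counts are simply the row values from the text (100 people per group). It reports the point difference, how many times more change the New Thing produced, and the percent increase.

```python
# Each Windowpane as (changed, not_changed) counts out of 100 people per group.
# The numbers are the row values given in the text.
WINDOWPANES = {
    "No Effect":     {"New Thing": (50, 50), "Old Thing": (50, 50)},
    "Small Effect":  {"New Thing": (55, 45), "Old Thing": (45, 55)},
    "Medium Effect": {"New Thing": (65, 35), "Old Thing": (35, 65)},
    "Large Effect":  {"New Thing": (75, 25), "Old Thing": (25, 75)},
}


def summarize(pane):
    """Return (point difference, ratio, percent increase) for one Windowpane."""
    new_yes, new_no = pane["New Thing"]
    old_yes, old_no = pane["Old Thing"]
    new_rate = new_yes / (new_yes + new_no)   # share of treatment group that changed
    old_rate = old_yes / (old_yes + old_no)   # share of control group that changed
    points = round((new_rate - old_rate) * 100)
    ratio = new_rate / old_rate
    pct_increase = (ratio - 1) * 100
    return points, ratio, pct_increase


for name, pane in WINDOWPANES.items():
    points, ratio, pct = summarize(pane)
    print(f"{name:14s} {points:2d} points  {ratio:.2f}x  {pct:+.0f}%")
```

Read the Large Effect line: a 50 point difference, 3.00 times as much change, which works out to a 200% increase over the Old Thing.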

Windowpane as a Jar of Marbles

If you’re still with me, let me offer my congratulations for your patience and motivation. You’re hanging tough with numbers, never easy or fun. So, I’ve got a treat for you. Let’s use jars of marbles to illustrate effect sizes. This will give you a quick and easy visual way of observing effects rather than counting effects.

We have two jars, one is the New Thing and the other is the Old Thing (or Treatment and Control; or Special Sauce and Regular Sauce; you get it). Inside each jar are 100 marbles, either black or white. If we’re testing mortality, white is Alive and black is Dead. Let’s start with the No Effect condition like we did with numbers.

Here, each jar contains 50 white and 50 black marbles meaning there is no difference between the Jars or Things or Conditions or Sauces. It’s just that 50/50 No Effect Condition.

Now, let’s demonstrate a Small Effect, that 45/55 Effect. Which Jar has the most white marbles?

Not so easy to see at a glance or even with a long stare. Many people, even trained and experienced statisticians, need to count the white marbles to figure out that the orange jar contains 55 white marbles and that the blue jar contains 45 white marbles.

Now, let’s look at the Medium Effect size of 35/65. Which jar has the most white marbles?

Pretty obvious, isn’t it? That orange jar clearly has more white marbles and only a propeller head counts everything following the Ronald Reagan Rule of “doveryai, no proveryai” (trust, but verify). See how much of a difference a Medium Effect is. On the medical side of things, many researchers call these effect sizes Clinically Significant, which means a physician in an examination room looking at a specific patient can see the impact of a new drug, for example. If that drug has a Medium Effect, you’ll see it.

Now, just to be complete, let’s look at a Large Effect Size of 25/75 which you probably expect to be as obvious as a zit on your nose on Prom Night. Here it is.

Boom! Large Effects are incredibly obvious. The impact of the New Thing compared to the Old Thing is so strong you wonder why anyone did the test in the first place. Don’t we know that, on average, men can lift more weight than women, that athletes can run faster than injured people, that Melanie is prettier than I am?
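If you want to try the jar test yourself, here is a rough Python sketch that draws each pair of jars as a 10 by 10 grid of text marbles. The function names `draw_jar` and `show_pair`, the characters used for the marbles, and the fixed shuffle seeds are all my own assumptions for illustration; only the marble counts come from the examples above.

```python
import random


def draw_jar(white, total=100, per_row=10, seed=0):
    """Return a jar as rows of text: 'o' is a white marble, '#' is a black one."""
    marbles = ["o"] * white + ["#"] * (total - white)
    random.Random(seed).shuffle(marbles)  # fixed seed so the picture is repeatable
    return ["".join(marbles[i:i + per_row]) for i in range(0, total, per_row)]


def show_pair(label, white_a, white_b):
    """Print two jars side by side under a label."""
    print(label)
    for left, right in zip(draw_jar(white_a, seed=1), draw_jar(white_b, seed=2)):
        print(f"  {left}    {right}")
    print()


show_pair("Small Effect (45 vs 55 white)", 45, 55)
show_pair("Medium Effect (35 vs 65 white)", 35, 65)
show_pair("Large Effect (25 vs 75 white)", 25, 75)
```

Glance at each pair before you count. The Small pair should make you squint, while the Medium and Large pairs should jump out.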

If you’d like more examples of practical effect sizes, check out this Blog post. It looks at effects with speed, height, and IQ.

Two Rooms and a Bell Metaphor

Let’s hear the difference with the Windowpane. Engage a different sensory system to sense Small, Medium, or Large differences. Consider two rooms and a bell to explain the Windowpane. Rather than do any math, even the simple ratio math of 45/55 versus 35/65, just walk into my metaphor.

Imagine you are in a hallway facing two doors. In the room behind each door an experiment is underway testing the effect of something on an outcome. Whenever the outcome occurs – an Other Guy votes, buys, smokes, exercises, whatever – a bell rings. Imagine that all this takes 5 minutes so the pattern of bell ringing is not one stream, but clangs out over time.

Now, assume that the pattern of bell ringing for the two rooms matches the Windowpanes for Small, Medium, and Large effects. Thus, with 100 total bell rings, if the effect is Small, you would have heard 45 rings behind one door and 55 behind the other. And, so on. Again, not in one bar, but over the song.

I think that most people paying attention for 5 minutes could NOT hear the difference for a Small Windowpane. They could NOT distinguish the room with 45 rings over 5 minutes compared to the room with 55 rings in 5 minutes. However, when those 100 rings opened a Medium Windowpane, I think most people could hear that difference. Almost twice as many rings would sound behind one door compared to the other. That should be obvious. And, of course, a Large effect at 25/75 would be apparent to even distracted listeners.
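To put a number on what your ears are up against, here is a tiny Python sketch that turns the ring counts into rings per minute. It assumes the rings are spread evenly over the 5 minutes, which is my simplification, and `rings_per_minute` is just an illustrative name.

```python
MINUTES = 5  # length of the listening session described in the text


def rings_per_minute(rings_a, rings_b, minutes=MINUTES):
    """Return the average ring rate behind each door."""
    return rings_a / minutes, rings_b / minutes


for name, (a, b) in {"Small": (45, 55), "Medium": (35, 65), "Large": (25, 75)}.items():
    rate_a, rate_b = rings_per_minute(a, b)
    print(f"{name}: {rate_a:.0f} vs {rate_b:.0f} rings per minute")
```

Small comes out as 9 against 11 rings a minute, Medium as 7 against 13, and Large as 5 against 15.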

Zen with Venn Effect Sizes

Dr. Venn invented his eponymous diagrams as a Cambridge don in the 19th century. They remain interesting, useful, and attractive. Consider this series of four Venn diagrams as dramatizations of Effect Sizes.

I call these dramatizations because we should not understand the circles and their relationship as exact. They convey the sense of no, little, more, and much overlap, connection, association or whatever word you prefer.

Now realize another dramatization of the Venn diagrams. If you compare the shared area (the intersection of the two sets) with the unshared area, you can see them as a ratio. Imagine them as the numerator (top) and denominator (bottom) of a fraction. This becomes an illustration of the t-test. That test divides the amount of explained variance in the overlap area by the amount of error variance in the nonoverlap area.

Quickly see a bonus dramatization! This Zen of Venn also reveals that an effect size and its associated test of significance are identities! When you look at the overlap you see the effect size. When you compare that overlap with the nonoverlap you see the test of statistical significance.
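Here is a minimal Python sketch of that fraction, under a couple of assumptions of mine. I treat the Windowpane point difference as a correlation r (so 45/55 is r = .10, 35/65 is r = .30, and 25/75 is r = .50, as in Rosenthal’s binomial effect size display), take the overlap as r squared, and use the textbook t-test for a correlation with 200 people. The function names are only illustrations.

```python
import math


def variance_split(r):
    """Return (explained, error): the overlap and nonoverlap shares of the outcome."""
    explained = r ** 2
    return explained, 1 - explained


def t_from_r(r, n):
    """t statistic for a positive correlation r observed on n people."""
    explained, error = variance_split(r)
    df = n - 2                                   # degrees of freedom for the error term
    return math.sqrt(explained / (error / df))   # overlap on top, nonoverlap per df below


N = 200  # 100 people in each group, as in the Windowpane
for label, r in [("Small", 0.10), ("Medium", 0.30), ("Large", 0.50)]:
    explained, error = variance_split(r)
    print(f"{label}: overlap {explained:.2f}, nonoverlap {error:.2f}, t = {t_from_r(r, N):.2f}")
```

The same r feeds both numbers: the overlap is the effect size, and the overlap divided by the nonoverlap (scaled by sample size) is the test. With 200 people the t values come out near 1.41, 4.42, and 8.12.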

See yet another dramatization: the Venn of two predictors on one outcome.

Zen this Venn for a moment. We see that each predictor overlaps with the outcome. We can contemplate both the likely statistical significance and the effect sizes. Both appear to be SSD and Medium. We also see that using two predictors increases the amount of explained variance (the overlap) and also decreases the amount of error variance (the nonoverlapped part of the outcome). This example dramatizes a main effect for both predictors, but no interaction because the two predictors only overlap with the outcome, but not each other.

More drama. Add another predictor. Like this.

Zen again. The new predictor shows little overlap with the outcome, but notice what now happens when we form the t-test. The presence of the other, stronger, predictors consumes variance from the outcome, shrinking the error term now making that little overlap more significant than it would be alone.

This is the Sin of Venn with the Observational Tooth Fairy. You see its operation in Climate Change, Breast Cancer, Soda Pop, Sitting, Statins, Exercise, Diet, and even Health Insurance. Adjust the Venn with extra predictors that consume error variance and convert it to explained variance, then use that shrunken error term to test your Tooth Fairy predictor.
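A minimal Python sketch can show the shrinking error term at work. It rests on assumptions I am adding: the predictors overlap only with the outcome and not with each other (so their overlaps simply add up), the sample is 200 people, and the test is the standard regression t for one predictor’s unique share of variance. The name `t_for_predictor` is mine.

```python
import math


def t_for_predictor(unique_r2, other_r2, n):
    """t for one predictor's unique overlap, given the overlaps claimed by the others.

    Assumes the predictors are uncorrelated with each other, so overlaps add.
    """
    k = 1 + len(other_r2)                    # number of predictors in the model
    error = 1 - unique_r2 - sum(other_r2)    # nonoverlapped part of the outcome
    df = n - k - 1                           # error degrees of freedom
    return math.sqrt(unique_r2 / (error / df))


N = 200
weak = 0.01  # a little overlap with the outcome, r = .10

alone = t_for_predictor(weak, [], N)
with_two_medium = t_for_predictor(weak, [0.09, 0.09], N)
with_two_large = t_for_predictor(weak, [0.25, 0.25], N)

print(f"alone:                      t = {alone:.2f}")
print(f"with two Medium predictors: t = {with_two_medium:.2f}")
print(f"with two Large predictors:  t = {with_two_large:.2f}")
```

The weak predictor’s own overlap never changes, yet its t climbs from about 1.41 to 1.56 to 2.00 as the other predictors eat up error variance.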

P.S. I always thought that the Alan Parsons Project should do a math and science concept album with Dr. Venn as an inspiration. Something like Dr. Tarr and Professor Fether (YouTube) from Tales of Mystery and Imagination.

A Nuance or Sophistical Statistics or Persuading with Numbers

Rarely will you see reports offering something like the Windowpane. Instead you’ll see correlations (r), standard deviation effects (d or g), or ratios (risk, absolute, relative, hazard, odds). Here’s a handy conversion guide again using the Cohen conventions.
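For readers who want to move between these statistics themselves, here is a minimal Python sketch using standard textbook formulas. It is my own illustration, with hypothetical function names. It assumes equal group sizes and reads r as the Windowpane point difference (r = .10 for 45/55, .30 for 35/65, .50 for 25/75).

```python
import math


def r_to_d(r):
    """Convert a correlation r to a standardized mean difference d (equal group sizes)."""
    return 2 * r / math.sqrt(1 - r ** 2)


def windowpane_ratios(r):
    """Build the Windowpane rows implied by r and return (risk ratio, odds ratio)."""
    treat = 0.5 + r / 2      # e.g. r = .30 gives 65% changed in the treatment row
    control = 0.5 - r / 2    # and 35% changed in the control row
    risk_ratio = treat / control
    odds_ratio = (treat / (1 - treat)) / (control / (1 - control))
    return risk_ratio, odds_ratio


for label, r in [("Small", 0.10), ("Medium", 0.30), ("Large", 0.50)]:
    risk_ratio, odds_ratio = windowpane_ratios(r)
    print(f"{label}: r = {r:.2f}, d = {r_to_d(r):.2f}, "
          f"risk ratio = {risk_ratio:.2f}, odds ratio = {odds_ratio:.2f}")
```

One caution: Cohen set his small, medium, and large benchmarks for d (0.2, 0.5, 0.8) separately from those for r, so a converted value will not always land exactly on the matching label.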



