I’ve been thinking about redesigning my self-pub covers, and as part of this I got curious about whether there were any obvious patterns that distinguish highly successful self-published covers from recently added covers in fantasy.

I screen-capped the first 45 fantasy ‘top series starter’ covers and 45 ‘recently added’ fantasy covers. I also screen-capped the 45 most downloaded, but there were so many author repeats that this wasn’t very useful (too little independence within the sample). The screen caps are below…

RECENTLY ADDED (random sample of uploaded material)

I then went through and (very subjectively) assessed…

Tone (overall light, medium or dark)

Art: Pro, semi-pro (probably paid for, but not at the top end), Effortful (probably home-made but the author has put some effort in) and Placeholder (the Author just didn’t bother with cover art)

Dominant colours: I listed all colours that felt subjectively dominant in the cover: categories were black, white, grey, red, orange, yellow, purple, blue, green, brown, gold, silver. I also ended up adding ‘Caucasian skin tone’ because some covers were dominated by a face or semi-naked bodies, so that the overall ‘colour’ of the cover was skin. I made this ‘Caucasian skin tone’ (CST) because it was soon obvious that almost all skin tone was Caucasian. There were people of colour on only about 3-5 covers (a little hard to tell), and in only one instance did I think enough skin was being shown to list Person of Colour Skin Tone (PoCST) as a dominant colour. Incidentally, a fully clothed person standing mid-distance wouldn’t get this tag. The cover has to be mostly skin.

Number of Colours: A simple count of the number of colours.

Type: The class of image, figurative (for any living thing, but mostly this was people), abstract (for surreal, abstract or symbolic covers), landscape, interior (as in the interior of a building) and object (crown, sword, helmet etc). I had to add the category ‘Hand’ for a hand holding something (usually magic flamey stuff), as this turned out to be a type of cover.

AYW and AYM: If there was a young woman or young man presented in an attractive or sexual way, the book also got this listed next to it. Highly subjective, but I tried to note this whenever I thought that was the artist’s intent (i.e. I don’t necessarily have to find an image attractive for it to get an AYW and/or AYM tag).

I also noted whether there were cloaks, dragons, lycanthropes, unicorns and other sundry fantasy elements, although these were surprisingly infrequent. The fantasy beasties got a category (Beasts), but everything else was just listed as present/absent. A few things (like watches and maps) had a couple of instances, but I dropped them as categories because they were just too infrequent.
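For anyone wanting to repeat the exercise, the coding scheme above could be captured as one record per cover. This is only a sketch: the field names, the Cover class and the to_row helper are made up for illustration, and it assumes the colour count is simply the number of dominant colours ticked (including the skin-tone tags), which the description above does not spell out.

```python
from dataclasses import dataclass, field

# Colour vocabulary taken from the list above; the names used as keys are illustrative.
COLOURS = ["black", "white", "grey", "red", "orange", "yellow", "purple",
           "blue", "green", "brown", "gold", "silver", "CST", "PoCST"]


@dataclass
class Cover:
    group: str        # "top" or "recent"
    tone: str         # "light", "medium" or "dark"
    art: str          # "pro", "semi-pro", "effortful" or "placeholder"
    image_type: str   # "figurative", "abstract", "landscape", "interior", "object", "hand"
    dominant: set = field(default_factory=set)  # subjectively dominant colours
    ayw: bool = False
    aym: bool = False
    beasts: bool = False


def to_row(cover):
    """Flatten one cover into a dict of predictors, one 0/1 column per colour."""
    unknown = cover.dominant - set(COLOURS)
    if unknown:
        raise ValueError(f"unknown colours: {sorted(unknown)}")
    row = {
        "group": cover.group,
        "tone": cover.tone,
        "art": cover.art,
        "type": cover.image_type,
        "n_colours": len(cover.dominant),  # assumption: plain count of ticked colours
        "AYW": int(cover.ayw),
        "AYM": int(cover.aym),
        "beasts": int(cover.beasts),
    }
    for colour in COLOURS:
        row[colour] = int(colour in cover.dominant)  # 1 = dominant, 0 = not
    return row


# A hypothetical cover, not one from the sample.
example = Cover(group="recent", tone="dark", art="effortful",
                image_type="figurative", dominant={"black", "red"})
row = to_row(example)
```

A list of such rows is the kind of table a tree or forest model would be fitted to, with group as the outcome.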

SUBJECTIVE IMPRESSIONS

First off, what was I thinking? I have so many things that need to get done and this took hours. Anyway. Surprises: covers are way more figurative than I expected. They were also much darker than I expected. A lot of black and red / black and purple / black and blue. I associate black and red with horror, but maybe the supernatural romance in the mix is bringing in more horror-associated colours. I also thought there were way more AYW and AYM in the top sellers category than in recent uploads, but the analysis (below) suggests that it isn’t a real difference.

ANALYSIS

I used Random Forests (R ‘cforest’ in ‘party’), a model averaging method based on conditional inference trees. Conditional inference trees are a binary recursive partitioning system (much like an old-fashioned classification and regression tree) that uses permutation-based significance tests embedded in an inference framework to identify significant splits in the data, where the goal is to maximise between-group and minimise within-group variation. Unlike an old-fashioned classification tree, a conditional inference tree won’t split the data if there are no meaningful groups (i.e. it is possible to obtain a null). I ran the random forests at alpha = 0.05, but checked the patterns with individual conditional inference trees at alpha = 0.10. These trees are quite aggressive in controlling Type I error, and in the published literature they are often run at 0.1 because otherwise you may have difficulty showing that a mouse is of a different mass to an elephant.
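The ‘won’t split unless the groups are meaningful’ idea can be pictured with a small permutation test on a single present/absent predictor. The sketch below is a deliberately simplified stand-in written in Python, not the procedure the R ‘party’ package runs internally; the function names and the toy data are invented for illustration.

```python
import random


def permutation_p_value(flags, labels, n_perm=2000, seed=0):
    """Permutation p-value for the gap in the share of 'top' covers
    between covers with flag == 1 and covers with flag == 0."""
    if len(set(flags)) < 2:
        raise ValueError("need covers both with and without the feature")

    def gap(lbls):
        with_flag = [l == "top" for f, l in zip(flags, lbls) if f == 1]
        without = [l == "top" for f, l in zip(flags, lbls) if f == 0]
        return abs(sum(with_flag) / len(with_flag) - sum(without) / len(without))

    observed = gap(labels)
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between feature and group
        if gap(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def should_split(flags, labels, alpha=0.05):
    """Split only when the association is significant at the chosen alpha."""
    return permutation_p_value(flags, labels) <= alpha


# Toy data only: 20 covers with a feature, 20 without.
flags = [1] * 20 + [0] * 20
strong = ["top"] * 18 + ["recent"] * 2 + ["top"] * 2 + ["recent"] * 18
none = ["top", "recent"] * 20  # identical share of 'top' in both groups

split_strong = should_split(flags, strong)
split_none = should_split(flags, none)
```

With the strongly associated toy labels the test allows a split; with the evenly mixed labels it returns a null, which is the behaviour that separates these trees from a classic classification tree.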

Bars to the right of the vertical red dashed line are meaningful enough to warrant interpretation. Anything to the right of zero is more meaningful than random; however, the vertical dashed red line represents the ‘error’ in the model, and anything between zero and the error line should be treated with care. Bars to the left of zero are less predictive than random, which of course isn’t actually possible, but they are used to work out the ‘error’.
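The reading rule for the importance plot can be written down in a few lines. This sketch assumes the dashed line sits at the absolute value of the most negative importance score, which is a common rule of thumb for this kind of plot; the text above does not say exactly how the line was placed. The scores are made up for illustration and are not the values from this analysis.

```python
def interpretable(importances):
    """Return (threshold, predictors whose importance exceeds it).

    Assumed threshold: the absolute value of the most negative importance,
    i.e. how far 'worse than random' the noisiest predictor appears.
    """
    lowest = min(importances.values())
    threshold = abs(lowest) if lowest < 0 else 0.0
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    keep = [name for name, score in ranked if score > threshold]
    return threshold, keep


# Invented scores, for illustration only.
toy_importance = {
    "art": 0.030,
    "n_colours": 0.020,
    "black": 0.010,
    "tone": 0.002,    # right of zero but inside the error band
    "AYW": -0.001,
    "beasts": -0.004,  # most negative bar sets the threshold
}
threshold, keep = interpretable(toy_importance)
```

In this toy case ‘tone’ is to the right of zero but still falls inside the error band, so it would be treated with care rather than interpreted.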

Immediate take-home

Art (Pro, semi-pro, effortful, placeholder) is the most important predictor of whether a cover is a ‘top series starter’ or a ‘recent upload’.

Colours is the second most important predictor. This is the number of colours.

The presence or absence of ‘black’ as a dominant colour comes next

The presence or absence of ‘silver’ comes next, but it is pretty borderline

Let’s look at the individual categories broken down using trees…

Only one split in Art. If your cover was ‘Effortful’ (according to my subjective eye) you are more likely to be in the ‘recent upload’ category (light grey). Effortful art was about 90% recent and 10% ‘top series’. The others (including ‘placeholder’ strangely enough) were about 50/50 recent/top.

Take-home: Pro-art doesn’t guarantee sales, but ‘effortful’ art dominates the recent uploads.

I had an inkling this might be the case as I was working through the images. The ‘top series starters’ tended to have a much more coherent colour scheme, often relying on just 2-3 colours. The split here is at 2. If your cover had 2 or fewer dominant colours, there was a 90% chance of being in the ‘top seller’ category. If your cover had more than 2 colours, you had about a 40% chance of being in the ‘top seller’ and a 60% chance of being in the ‘recent uploads’ categories.

Take-home: Ask an artist to use a limited palette. 2-3 ‘colours’ at most (for a broad definition of colour, all shades of ‘blue’ were just ‘blue’ in this, for example)
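The percentages quoted for a split like this are just the share of each group on either side of the cut-point. A minimal sketch of that tally follows, with a hypothetical helper name and toy counts chosen only to mirror the rough 90/10 and 40/60 figures above; they are not the real tallies from the sample.

```python
from collections import Counter


def shares_by_split(rows, predictor, cut):
    """Share of 'top' and 'recent' covers on each side of predictor <= cut."""
    sides = {"at_or_below": Counter(), "above": Counter()}
    for row in rows:
        side = "at_or_below" if row[predictor] <= cut else "above"
        sides[side][row["group"]] += 1
    result = {}
    for side, counts in sides.items():
        total = sum(counts.values())
        # Leave a side empty if no covers fall on it, rather than dividing by zero.
        result[side] = ({g: counts[g] / total for g in ("top", "recent")}
                        if total else {})
    return result


# Toy rows only (invented counts).
toy_rows = (
    [{"n_colours": 2, "group": "top"}] * 9
    + [{"n_colours": 2, "group": "recent"}] * 1
    + [{"n_colours": 4, "group": "top"}] * 4
    + [{"n_colours": 4, "group": "recent"}] * 6
)
shares = shares_by_split(toy_rows, "n_colours", cut=2)
```

The same helper works for the 0/1 colour columns by using a cut of 0, so that ‘absent’ falls on one side and ‘present’ on the other.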

I had a bit of an inkling about this too, but I honestly thought a bunch of other stuff would be more important. I used 1 = black and 0 = no black. So, the branch on the left has no black, and the branch on the right has black as a dominant colour. It’s about 40:60 top seller:recent upload for ‘no black’ and the reverse for ‘black’.

Take-home: Make some large portion of your cover black. Maybe it just works better in grey-scale for people using black and white Kindles to have your cover using a lot of black for contrast at the outset?

When I was doing the colour classification I actually thought ‘gold’ would come out rather than ‘silver’. So, if you have silver on your cover, you are more likely (80%) to be a recent upload than a ‘top seller’. If there is no silver, then it is about 50:50. I noticed this with gold as well (but it didn’t turn out to be statistically meaningful). I started to wonder if self-pub authors were unconsciously adding ‘gold’ and ‘silver’ to their covers because it gave them a feeling of award, or achievement or value? At any rate…

Take-home: Making ‘silver’ a dominant colour probably isn’t going to help sales.

Conclusion. Pro art or semi-pro art definitely is a mark of top-sellers, as compared to a random sample of uploads. Although, interestingly, there wasn’t any difference between pro (obviously a skilled pro artist) and semi-pro (probably just getting established, not long out of art school). You should probably use fewer colours rather than more on a cover. Use black. Do not use silver (or at least, accept that adding silver will not make people buy your book).

Limitations

A bit reductionist: This isn’t really capturing complex interactions or patterns. It is a very simplistic overview.

Bias. I knew which covers were high sellers and which were not. I did my best to judge all of them even-handedly, but undoubtedly some unintentional bias has crept in.

Some of the ‘recently added’ are by successful authors, but that’s just part of the random noise of a sample.

Exploratory. This is not classical hypothesis testing. But, on the other hand, a controlled experiment seems a bit tricky to devise and run.