So You Want to Build a Hearthstone Deck…

Building and refining Hearthstone decks is hard. It takes high levels of skill to even figure out where new, powerful opportunities reside. This is the “big” problem of deckbuilding, as it covers the larger outline and gameplan of your deck. It helps determine everything that is to follow.

However, once you have the general picture of your deck in mind there will be a great many smaller problems that, in many cases, take longer to solve. You need to figure out which and how many synergy cards to include, which stand-alone cards are worth playing, which “packages” of cards can and cannot be ignored, and how you can execute your intended strategy in the face of other players trying to beat you. These are all the “small” optimization problems.

What makes these matters incredibly difficult is that Hearthstone is a high-variance game. Your deck might lose a match, but was that because it was built poorly or because your opponent drew well? Perhaps a randomly generated card changed the entire course of the match in a way that is unlikely to happen again. Personally, I have had back-to-back days where I played an identical deck for 30 games each day, ending with a 67% win rate on the first day and a 37% win rate on the second.

This makes assessing your decks and card choices incredibly difficult. Not only do you have 30 cards to assess, but you’re trying to assess all of them at the same time in an environment that’s constantly uncertain and shifting.
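
To put rough numbers on how little a 30-game sample pins down, here is a minimal Python sketch that computes an approximate 95% Wilson score interval for a win rate. The records of 20 wins and 11 wins out of 30 are assumptions chosen to match the 67% and 37% days described above, and the function name is made up for illustration.

```python
import math

def win_rate_interval(wins, games, z=1.96):
    """Approximate 95% Wilson score interval for a true win rate."""
    p = wins / games
    denom = 1 + z**2 / games
    center = (p + z**2 / (2 * games)) / denom
    half_width = z * math.sqrt(p * (1 - p) / games + z**2 / (4 * games**2)) / denom
    return center - half_width, center + half_width

# Assumed records: 20 wins in 30 games (~67%) and 11 wins in 30 games (~37%).
day_one = win_rate_interval(20, 30)
day_two = win_rate_interval(11, 30)

for label, (low, high) in (("Day one", day_one), ("Day two", day_two)):
    print(f"{label}: plausible win rate between {low:.0%} and {high:.0%}")
```

Under these assumptions, the first day is compatible with a true win rate anywhere from roughly 49% to 81%, and the second with anything from roughly 22% to 54%. Thirty games simply do not narrow things down very much.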

Put plainly, there is no way we can, as humans, fully optimize our decks with the information we are capable of collecting personally. While we can get reasonably close at times, our personal data will never be sufficient to decisively determine how favored deck A is against deck B in general, let alone figure out whether card A is 0.5% better than card B, on average. No one is capable of doing so with an appropriate amount of confidence (though some players may still be inappropriately confident in their conclusions). Instead, we have to rely on our intuitions and emotions about how well everything is performing, what should stay, what should go, and what should replace it.
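
For a sense of scale on that 0.5% figure, the sketch below uses the standard sample-size approximation for comparing two proportions. Everything specific here is an assumption for illustration: a baseline win rate of 50%, a 5% significance level, 80% power, and the function name itself.

```python
import math
from statistics import NormalDist

def games_needed(wr_a, wr_b, alpha=0.05, power=0.80):
    """Approximate games per card to detect a win-rate gap (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = wr_a * (1 - wr_a) + wr_b * (1 - wr_b)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (wr_a - wr_b) ** 2)

# Assumed baseline: card A wins 50.0% of games, card B wins 50.5%.
print(games_needed(0.500, 0.505))

# For comparison, a 5-point gap needs far fewer games.
print(games_needed(0.50, 0.55))
```

With these assumptions, telling a 0.5-point gap apart takes on the order of 157,000 games with each card, while a 5-point gap takes closer to 1,500. No individual player is going to log that many games with a single list.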

We can be good at these problems, but we’re far from perfect.

Modern Problems Require Modern Solutions

As Hearthstone is quite variable and our brains are not designed to detect the statistical patterns in that noise accurately enough to make the best conclusions, we have created a series of tools to aid us in our quest for the perfect lists.

Sites like HSreplay.net are incredibly useful because they can aggregate the data collected from thousands of players who are using these cards as well, allowing us to see what works and what doesn’t. This is a staggering amount of information that we cannot even come close to collecting on our own. By pooling our personal data into a collective, however, we end up with something much greater than the sum of its parts. Instead of relying on intuitions, we can see in plain numbers how often decks have been winning, at what ranks, over what time, and how many games result in a win when a player has mulliganed into, drawn, or played any particular card.

Now it’s not as simple as just reading the numbers and thinking you’ve learned all there is to know about them. Data analysis on these sites is a skill just like any other, and we need to interpret it properly to get the most out of it (for a quick guide on some tips to do so, see here). Nevertheless, it’s easy to get started using data.

One quick and useful metric to look at is “Drawn WR,” which is the win rate of the deck in games where that card is drawn at any point. Looking at this metric can help you sort out the weakest cards in your deck.

Now there will always be a “worst” card in your deck – there has to be – but one of your goals during deckbuilding should be to ensure your worst card isn’t a clear liability; that is, it shouldn’t be much worse than the other cards in the deck. If you see a worst card whose drawn win rate is, say, 0.1% lower than your next worst card’s, that’s not a big deal. If you see a card that is 1-5% worse than your next worst card, that’s a sign something has likely gone wrong and some choices should be reevaluated.
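
As a rough illustration of that check, here is a small Python sketch. It computes drawn win rates from a hypothetical game log and then measures how far the worst card trails the next worst. The record format, the function names, the card names, and every number are placeholders invented for the example; they are not real data and not how any stats site necessarily computes things.

```python
def drawn_win_rates(games):
    """Map each card to its win rate (%) over games in which it was drawn."""
    wins, seen = {}, {}
    for cards_drawn, won in games:
        for card in set(cards_drawn):
            seen[card] = seen.get(card, 0) + 1
            wins[card] = wins.get(card, 0) + (1 if won else 0)
    return {card: 100 * wins[card] / seen[card] for card in seen}

def worst_card_gap(rates):
    """Return (worst card, gap in percentage points to the next-worst card)."""
    ranked = sorted(rates.items(), key=lambda item: item[1])
    (worst, worst_wr), (_, next_wr) = ranked[0], ranked[1]
    return worst, next_wr - worst_wr

# Tiny hypothetical log: (cards drawn that game, did we win?)
log = [(["Card A", "Card B"], True), (["Card A"], False), (["Card B"], True)]
print(drawn_win_rates(log))

# Hypothetical aggregated drawn win rates for a larger sample.
rates = {"Card A": 55.2, "Card B": 54.9, "Card C": 54.7, "Card D": 52.1}
worst, gap = worst_card_gap(rates)
if gap >= 1.0:  # threshold taken from the 1-5% range discussed above
    print(f"{worst} trails the next-worst card by {gap:.1f} points: reevaluate it")
```

The 1.0-point cutoff is only a convenient line in the sand. A gap of that size means more when it comes from tens of thousands of games than when it comes from a few hundred.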

This part can be exceptionally tricky, and not just because data isn’t always easy to interpret. It is also tricky because we are very good at fooling ourselves. As the famous line in science goes, you must not fool yourself, and you are the easiest person to fool. Remember those intuitions and emotions I mentioned earlier? Some people can convince themselves that a particular deck or card is good, and once they have done that it can be hard to unstick the idea from their brain, even in the face of a lot of contradictory data.

In fact, the smarter you are, the more dangerous these incorrect intuitions and feelings can become. This is because smart people are, to be blunt, very good at being stupid. Smart people (and good players) are very good at thinking up plausible-sounding justifications (because they’re smart/good) for choices that ultimately end up being bad. They can feel more confident in ignoring other people’s data because it doesn’t match their high-skill intuitions.

This was the case when it came to Spirit of the Shark not so long ago. Many players were convinced the card was good, despite mountains of data saying it was bad. It was always the worst card in any list playing it; a fact which remains true even today, even in Highlander lists.

If you need a testament to how bad this card is, consider that in decks with 30 different cards, Shark is worse than the other 29 by a decently wide margin. (That is, of course, assuming you don’t do something like play an even worse card that no one would rightly touch, like Kidnapper.) I say all that as someone who was rooting for the card to be good. I’d love to have more fun and powerful tools at my disposal, but the card just made every deck it touched worse.

Despite that, many players – both good and bad – came to the conclusion that the card should stay in the deck, created plausible-sounding justifications for why its performance in the data wasn’t good, and went on to lose more games than they needed to, because decks without the card outperformed decks with it (even if many people will still deny it to this day).

What Causes Sticky Intuitions

To understand why at least some of these intuitions get “stuck” despite ample data that they’re bad, we can consider another contemporary example and look for similarities: Battle Rage.

If you look at modern Galakrond Warrior lists, you’ll be hard-pressed to find ones that don’t run this card. Despite that frequency of play, Battle Rage remains the card with the lowest drawn win rate in the deck, and not by a small margin, either: the deck usually wins 2-3% less often when Battle Rage is drawn at any point than it does when the next worst card is drawn. This doesn’t seem to change in decks that run the new Risky Skipper either, which looks like it should be premium synergy. It’s a consistent pattern of underperformance seen across tens of thousands of games.

So why does the card stick around in decks? Part of the reason is surely inertia: most people aren’t building decks; they’re simply copying lists that include it. But there’s more to it than that, as many people who seek to win as much as possible do legitimately seem to believe the card is powerful.

One important thing to note is that – much like Spirit of the Shark – Battle Rage sometimes creates huge, powerful moments. There will be games where you draw 5 cards for 2 mana and come back to win a game you thought was lost. Other times, you might see your opponent fire off that same play, chain a big Battle Rage into another big Battle Rage, and suddenly refill their hand. These are big moments that don’t happen within the usual power curve of the game (see Arcane Intellect for what an on-powercurve draw card looks like), and they create equally large memories of them. One can even imagine all the huge, possible plays they might make with the card and get excited about how it will work in the deck. It makes intuitive-sounding sense that the card would be good.

Do you know what doesn’t create similarly large memories and anticipation? Sitting on a Battle Rage you cannot play to any real effect for many turns because you either don’t have enough damaged targets or you cannot use the mana to draw without falling too far behind on tempo. Maybe you just Battle Rage for one. Maybe you get a big Battle Rage, but it’s after the card sits dead for 6 turns and then it’s too late. It’s a much less memorable experience, and yet it’s a much more common one.

Further, you might come to think that Battle Rage is good because your opponents always seem to get good ones off, don’t they? That’s probably true, but that’s also probably because your opponents simply won’t play the card when it’s bad. If the card sucks and does nothing for them, they’ll probably make any play they can besides Battle Rage. You won’t even know it’s in their hand being useless. This can leave you with a biased set of memories for how the card plays out in the game because you literally see it more often when it’s good than you do when it’s bad (not unlike how you usually only see Leeroy Jenkins played when it’s going to kill you, not when it’s inefficient and useless).
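
That selection effect is easy to demonstrate with a toy simulation. The Python sketch below is not a model of real Hearthstone games: the draw distribution and the rule that opponents only play the card when it draws at least two are both assumptions made up for the illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Toy assumption: each time the opponent holds the card, it would draw
# between 0 and 4 cards, with every outcome equally likely.
potential_draws = [rng.randint(0, 4) for _ in range(10_000)]

# Toy assumption: the opponent only plays the card when it draws 2 or more.
played = [draws for draws in potential_draws if draws >= 2]

true_average = sum(potential_draws) / len(potential_draws)
seen_average = sum(played) / len(played)

print(f"Average draw while the card sits in hand: {true_average:.2f}")
print(f"Average draw you actually witness:        {seen_average:.2f}")
```

In this toy setup the card draws about two cards on average across all the times it is held, but the plays you actually get to watch average closer to three. The weak hands never show up in your memories because they never hit the board.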

While it might make intuitive sense that Battle Rage could be good, it also made sense to some people that putting the Bazaar Burglary Quest in a Galakrond Rogue shell could be good too. Not because they were trying to complete the quest, but rather because they could play the “if you have a quest” payoff cards. To many, this felt like doing something sneaky, powerful, and intelligent. It could create some high-roll moments with Questing Adventurer or Edwin VanCleef (like Shark, like Battle Rage…). It made good sense to them.

It also threw about 3-4% of the Galakrond Rogue deck’s win rate into the garbage.

Getting Unstuck

Breaking from these intuitions won’t be easy. I cannot promise that I can even do much to help you besides raising awareness of their existence. The one piece of advice I can at least offer is to change your perspective on the questions you ask yourself.

When it comes to Battle Rage, Spirit of the Shark, and other similar cards, don’t just ask yourself what their possible uses are, or what their best cases can be. You might not even want to ask yourself what their worst cases are, as I assure you that you’re capable of assuming those worst cases won’t happen very often and the best cases will happen regularly. Instead, ask yourself the following question: what pattern of data could prove me wrong? If you think a card is good, ask yourself what might change your opinion. If you’re unable to provide a good answer, either because not much could change your mind or because you could rationalize away anything that might, there’s a good chance you’re dealing with a sticky intuition. It doesn’t necessarily mean your intuition is wrong, but it does mean you should be careful around it.

And remember: there’s nothing unusual about being wrong. Everyone is wrong several times a day. Don’t be ashamed of being wrong: celebrate the opportunity it allows you to find out how to be better.