85K Games of Standard Analyzed

by SaffronOlive // Jun 22, 2015

As you probably know by now, a few weeks after a new set comes out MTGGoldfish always publishes a boatload of data, drawn from Magic Online, to help us understand the limited format. This includes win percentages, average game lengths, archetypes, basically anything relevant to the format. Well, a couple days ago I got a message saying that, for the first time ever, we had something similar for Standard. Being a numbers nerd, I was excited, but I was by no means prepared for what awaited when I opened up the spreadsheet the next morning: 85,000 games' worth of data. To put this in perspective, if we assume an average match lasts 2.5 games, this data set is about the size of a 2,300-person GP where no one drops and everyone gets to play all 15 rounds, except instead of just getting the lists of the top performing decks or knowing the outcomes of feature matches (like most GPs), we know what every single player played, what their opponents played, and the outcome of every single one of their games.
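For anyone who wants to check the GP comparison, here is a quick back-of-the-envelope sketch. It assumes the comparison counts each match once per player (so a single pairing shows up twice, once from each side of the table); the variable names are just for illustration.

```python
# Back-of-the-envelope check of the Grand Prix comparison.
total_games = 85_000      # size of the data set (from the article)
games_per_match = 2.5     # the article's assumed average match length
players = 2_300           # hypothetical GP attendance
rounds = 15               # no drops, everyone plays every round

# Matches implied by the data set.
matches_in_data = total_games / games_per_match

# Match records at the hypothetical GP, counted once per player per round
# (assumption: each pairing is seen from both players' points of view).
player_matches_at_gp = players * rounds

print(f"matches in data set:  {matches_in_data:,.0f}")
print(f"player-matches at GP: {player_matches_at_gp:,}")
```

If you count each pairing only once instead, the GP side of the comparison is cut in half, so treat the comparison as a rough sense of scale rather than an exact equivalence.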

So before continuing this article, please check out Standard Metagame Breakdown to get an overview of the raw data.

Now back to our analysis. Of course the data has all the expected stuff (Atarka Sligh: 13.02 percent of the field, a 53.48 win percentage, 22,646 total games), but this data is far more important than simple win percentages. I've always used win percentages begrudgingly. Yes, they are helpful, but they are also lacking, since they don't tell us anything about the matchups. If you bring UWR Miracles to a Legacy Open and every one of your opponents is playing Cloudpost, you will lose 90 percent of your matches. However, in this instance, a match win percentage of 10 doesn't actually mean that Miracles is a bad deck; all it means is that it cannot beat Cloudpost.
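One way to see why a single win percentage hides so much is to treat it as a weighted average of matchup win percentages, weighted by how often you face each deck. The sketch below is only an illustration: the 10 percent Cloudpost number is from the example above, while the 55 percent figure against everything else, the field shares, and the function name are made-up assumptions.

```python
def overall_mwp(field_share, matchup_mwp):
    """Metagame-weighted match win percentage.

    field_share: {opponent deck: fraction of opponents faced}
    matchup_mwp: {opponent deck: win percentage against that deck}
    """
    total_share = sum(field_share.values())
    weighted = sum(field_share[deck] * matchup_mwp[deck] for deck in field_share)
    return weighted / total_share

# Hypothetical Miracles numbers; only the 10 percent vs. Cloudpost
# figure comes from the article's example.
miracles = {"Cloudpost": 10.0, "Everything else": 55.0}

all_cloudpost = {"Cloudpost": 1.0, "Everything else": 0.0}
mixed_field = {"Cloudpost": 0.1, "Everything else": 0.9}

print(overall_mwp(all_cloudpost, miracles))
print(overall_mwp(mixed_field, miracles))
```

Same deck, same matchups, but the overall number swings from 10 percent to roughly 50.5 percent depending on who shows up. That is why the matchup data matters more than the headline figure.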

As a result, most of the data we get leaves us with theoretical or subjective analysis of the matchups. This data set minimizes this problem. Instead of simply looking to pros for their opinions, or theorizing about how Pack Rat might be good against control, we now have numbers: cold, hard, unflinching numbers. For instance, did you know that Rally the Ancestors actually sucks? Over the course of 2,250 matches, it only managed to win 40.62 percent of the time. But maybe even more importantly, it actually performs worse when paired against many top-tier decks, only winning 39.18 percent of the time against Esper Dragons, 37.74 percent against Abzan Midrange, and 34.25 percent against Abzan Aggro.

Now, while this data isn't the end of the story (since your build of a deck could perform better or worse than the decks in our sample based on variance in deckbuilding and sideboards), it does provide a great jumping-off point for your own playtesting. I mean, if you are building Rally the Ancestors, wouldn't it be beneficial to have a friend that has already tested the Abzan matchup hundreds of times and can tell you that it's pretty rough? This sort of time-saving advice is, in my opinion, the biggest draw of this type of data. Instead of playing a ton of games to figure out that the Rally vs. Abzan matchup is bad, you have a friend saying, "trust me, if you are going to play Rally, you need to figure out a way to improve the Abzan matchup" (or maybe a better friend would just tell you not to play Rally the Ancestors at all if your goal is to win a tournament).

As such, today we will be talking about this data and discussing what it can tell us about the Standard format. Since you can read all the basics for yourself (and I highly encourage you to do so), we are going to focus mostly on the decks and matchups.

Most Played Deck - Atarka Sligh

Let's talk a little bit about Atarka Sligh (and Mono-Red, for that matter) in regards to our data set. The first thing that jumps off the page is the fact that Atarka Sligh makes up a whopping 13.02 percent of our field, which means it is the most played deck by a significant margin (Esper Dragons comes in second at a bit over 8 percent). I expect this is somewhat higher than in the paper world (remember, this data comes from Magic Online replays) because there are a few factors that push players towards red-based aggro online. For one thing, the deck is extremely cheap to build online compared to most other tier one decks. Second, it is fairly easy to play: if you are building your first Standard deck, you'll probably have much more success with Lightning Strike than with Silumgar's Scorn or Jeskai Ascendancy. Third, and maybe most important to Magic Online grinders, you get a lot of fast matches with the deck, which allows you to grind through more matches and makes double-queuing easier.

Regardless of whether or not Atarka Sligh is over-represented, we can still learn quite a bit from the matchups. The most surprising thing to me was how many 50/50 matchups the aggro deck has. Basically, the deck is slightly favored (between a 51 and 54 match win percentage, or MWP) against control, midrange dragon decks, and Nykthos, Shrine to Nyx devotion, and only a slight underdog (between 47 and 49.78 MWP) to Abzan of all flavors. It does, however, get beaten badly by GW Company and even worse by straight Mono-Red.

At first I thought the consistently bad matchup against GWx decks was due to Courser of Kruphix, but this isn't exactly true since the matchup against RG Devotion is solid and Abzan Aggro doesn't bother to play the enchantment creature. So if it's not Courser of Kruphix that's the common theme in decks that beat Atarka Sligh, what card is it?

That's right. Atarka Sligh is a dog to any deck running four copies of Fleecemane Lion. Now, of course it's not the cat all by itself — I would expect that it is the conglomeration of solid blockers (which are also efficient offensive threats) backed up by incidental lifegain from Siege Rhino and even Hidden Dragonslayer — but Fleecemane Lion is the one non-land card found in (almost) all of the bad matchups and not found in any of the good matchups.

The Better Red Deck

I've been wondering ever since Atarka's Command was printed whether or not the green splash was correct in mono-red aggro. Well, based on our data, it seems clear that we are better off playing Mono-Red and forgetting the green for the time being.

Since Mono-Red only makes up 1.56 percent of our field, we are working with a much smaller sample size of matches, but the numbers without the green splash are significantly better. In fact, over the course of 2,712 matches, Mono-Red was the second best performing deck in our field boasting a 57.37 MWP. However, the most surprising thing about going Mono-Red is how it manages to flip many of Atarka Red's worst matchups. Check out this side-by-side comparison.

Atarka Red vs. Mono Red

Deck              Atarka Red   Mono Red   +/-
Esper Dragons     53.84        58.79      +4.95%
Abzan Midrange    47.28        55.13      +7.85%
Mardu Dragons     51.83        63.14      +11.31%
GW Company        44.20        45.45      +1.25%
Abzan Aggro       49.78        60.29      +10.51%

In fact, the only major matchup where Atarka Sligh has a better MWP is against RG Devotion, where Mono-Red posts a miserable 33.33 MWP. This raises the question: why does Mono-Red have such better numbers than Atarka Red? I mean, the decks play 50 of the same cards. Can 10 (or even fewer) changes really swing matchups to this extent? Unfortunately, I have no clue since red aggro is very far from my thing (if you have some ideas, make sure to let me know in the comments), but when you see increases of over 5 percent in most major matchups across hundreds of matches, it is time to take notice.
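How much of a gap is "time to take notice" depends on how many matches sit behind each number. Here is a rough sketch using the textbook normal approximation for a win percentage. The overall Mono-Red figure (57.37 MWP over 2,712 matches) is from the data, but the per-matchup sample size of 300 is purely a placeholder since the matchup counts are not quoted here, and the function names are my own. It also assumes matches are independent, which is a simplification.

```python
import math

def margin_of_error(mwp, matches, z=1.96):
    """Approximate 95% margin of error, in percentage points, for a win
    percentage observed over `matches` matches (normal approximation)."""
    p = mwp / 100.0
    return z * math.sqrt(p * (1.0 - p) / matches) * 100.0

def gap_exceeds_noise(mwp_a, mwp_b, matches_each):
    """True if the gap between two win percentages is larger than the
    combined margin of error of the two estimates."""
    gap = abs(mwp_b - mwp_a)
    noise = math.hypot(margin_of_error(mwp_a, matches_each),
                       margin_of_error(mwp_b, matches_each))
    return gap > noise

# Overall figure from the article: Mono-Red, 57.37 MWP over 2,712 matches.
print(round(margin_of_error(57.37, 2712), 2))

# ASSUMPTION: 300 matches per matchup is a placeholder, not a real count.
ASSUMED_MATCHES = 300
comparison = {  # deck: (Atarka Red MWP, Mono-Red MWP), from the table above
    "Esper Dragons": (53.84, 58.79),
    "Abzan Midrange": (47.28, 55.13),
    "Mardu Dragons": (51.83, 63.14),
    "GW Company": (44.20, 45.45),
    "Abzan Aggro": (49.78, 60.29),
}
for deck, (atarka, mono) in comparison.items():
    print(deck, round(mono - atarka, 2),
          gap_exceeds_noise(atarka, mono, ASSUMED_MATCHES))
```

Under that placeholder sample size, a gap like the 11.31 points against Mardu Dragons clears the noise threshold, while the 1.25 points against GW Company does not. With the real matchup counts plugged in, the picture could shift in either direction.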

The "Bad Unless You're a Pro" Deck

As I mentioned earlier, Esper Dragons is the second most played deck in our sample, coming in at 8.34 percent of the field. While its 51.14 MWP looks solid on its face, these numbers are actually quite deceptive. Basically, Esper Dragons crushes bad decks, tier two decks, and decks that you probably shouldn't be playing anyway. I mean, sure, you can beat Mono-U Devotion, BW Control, RW Triplicate Spirits, and Rally the Ancestors more than 60 percent of the time, but guess what? So can everyone else!

When you look at the matchups that matter, the matchups you'll play over and over again at a GP or SCG Open, Esper Dragons' record leaves much to be desired. The good news is it beats up on RG Devotion and can play Abzan Aggro to a draw, but Mardu Dragons, Abzan Midrange, and Atarka Sligh beat the deck more than half the time while Mono-Red, GW Collected Company, and straight UW Control win over Esper Dragons nearly 60 percent of the time. Considering that the list of bad matchups reads very much like a list of the most played decks in the format, this is bad news for Esper Dragons.

Now I don't think Esper Dragons is a bad deck. Bad decks don't consistently win big tournaments. My theory is that, thanks to the love of the deck by pro players, its flashy dragons, and its solid results on the GP and SCG circuits, people who should be playing a more aggressive or midrange strategy are playing Esper Dragons instead. It's much more difficult to play a control deck competently than to play an aggro or midrange deck competently (although it might be harder to master an aggro deck). A deck like Esper Dragons doesn't give its pilots many (or even any) free wins; you have to grind your way through pretty much every game of every match. If you are playing Mono-Red, sometimes you just have a hand with three one-drops and you turn them sideways for three turns and win the game. If you are playing Abzan Midrange, sometimes you draw three Siege Rhinos and Essence Drain your opponent into oblivion. If you are playing Esper Dragons, sometimes you have what? Three Dig Through Times? Three Hero's Downfalls?

It seems like you have to be a solid control player to win with Esper Dragons (or happen to be paired against an opponent playing a fringe deck). If this is not you and you are thinking about playing Esper Dragons in Standard, expect a significant learning curve fueled by losses against other good decks.

The "Cross Your Fingers and Hope for Good Matchups" Deck

If you keep up on Legacy, you'll know that Goblin Charbelcher is a busted card in a busted deck. While it technically wins by casting a bunch of rituals and activating Goblin Charbelcher for about 80 damage, really what the deck does when it sits down to the table is ask its opponent three questions:

1. Do you have Leyline of Sanctity? If yes, I scoop; if no, go to question two.
2. Are you a blue deck? If yes, go to question three; if no, I win.
3. Do you have Force of Will? If yes, I scoop; if no, I win.

This is basically what Mono-Black Warrior Aggro does in Standard. PVDDR had a great tweet about Modern this past weekend where he said something like "Every match I play I feel like I roll a D6. If I get a 1 or 2, I win. If I get a 5 or 6, I lose. If I get a 3 or 4, I actually get to play Magic." This is Mono-Black Warrior Aggro in a nutshell. If you roll Esper Dragons, Abzan Aggro, or RG Devotion, you win a huge percentage of the time. If you roll Atarka Red, Mardu Dragons, or Mono-Red, you lose a massive percentage of the time; the Mardu Dragons matchup is especially laughable, with Mono-Black losing over 70 percent of the time. If you roll GW Collected Company or Abzan Midrange, you actually get to play Magic.

The "Why Do People Still Play Me?" Deck

Despite a solid overall win percentage of over 55 percent, the truth of the matter is that an RG Devotion player only asks one question: are you playing GW Company or Mono-Red? If the answer is yes, he or she breathes a sigh of relief and starts to shuffle their deck, but if the answer is "no, I'm playing [insert any other deck in the format]," it's going to be a long match. Come to think of it, RG Devotion is basically the anti-Charbelcher. Where the Charbelcher player is only worried if you have one of two specific cards, the RG Devotion player is worried if you have anything but one or two specific cards. You might think I'm being hyperbolic, but I'm not. Check out these match win percentages. (However, it is possible these numbers are distorted by a relatively small sample size of only 796 total matches.)

RG Devotion MWP

Deck                        Percentage
Esper Dragons               39.13
Abzan Midrange              35.00
Abzan Aggro                 42.86
Mardu Dragons               38.46
Mono-Black Warrior Aggro    33.33

To be fair, Esper Dragons loses to a lot of decks too, but the difference is RG Devotion gets demolished by a lot of decks. When a deck has a 47 or 48 win percentage in a specific matchup, it suggests that the problem is fixable (or maybe even that the problem doesn't need fixing; four extra losses out of 100 games isn't really a deal breaker assuming you have other good matchups). On the other hand, when you can only win 35 or 40 percent of the time against some of the most popular decks in the field, you have a major problem. You can't play a deck that loses 60+ percent of the time to 50 percent of the decks in the field; you just can't.
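To put a number on "you just can't," here is a small sketch of how well a deck would have to do against the rest of the field just to break even overall. It takes the rough figures above (bad matchups against half the field, winning 35 to 40 percent of them) as inputs; the function name and the 50 percent target are my own framing.

```python
def required_mwp_vs_rest(bad_share, bad_mwp, target=50.0):
    """Win percentage needed against the rest of the field to reach
    `target` overall, when a `bad_share` fraction of opponents are
    beaten only `bad_mwp` percent of the time."""
    return (target - bad_share * bad_mwp) / (1.0 - bad_share)

# Half the field is a bad matchup, won 40 or 35 percent of the time.
print(required_mwp_vs_rest(0.5, 40.0))
print(required_mwp_vs_rest(0.5, 35.0))
```

In other words, if half your opponents beat you 60 to 65 percent of the time, you need to beat everyone else 60 to 65 percent of the time just to finish at even, never mind making a top 8.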

Can RG Devotion fix these matchups? Maybe, but it's going to take a better deckbuilder than me to figure out how. Plus, several of the most important cards in the deck are rotating in a few months (Polukranos, World Eater, Stormbreath Dragon, and especially Nykthos, Shrine to Nyx), so personally I would just find something else to play.

The "On Second Thought, You'd Probably Be Better Off Playing RG Devotion" Decks

UB Control and Jeskai Tokens are by far the two worst performing decks that make up a significant portion of the field (so they don't get to hide behind the sample size issue like RG Devotion). In fact, they are the only two decks that make up at least two percent of our sample and post overall match win percentages below 50 percent.

Jeskai Tokens just isn't good against anything that matters. Out of all the decks it played at least 100 times, the only ones it beat consistently were RUG and UW Control. The bigger problem is that it consistently performed poorly against the decks that make up a large portion of the field including Esper Dragons, all builds of Abzan, Mardu Dragons, and Atarka Red. UB Control faces a similar problem; while it can beat up on tier two decks like Bant Heroic, Rally the Ancestors, and RG Dragons, much like Jeskai Tokens, it loses to Abzan, all types of red-aggro, and Esper Dragons.

Based on these numbers I'm starting to wonder if it is just a very bad time to be a true control deck. Esper Dragons seems to be worse than it looks at first glance, UB Control is one of the two worst performing top tier decks, and UW Control is non-existent (and also bad). Apparently the combination of lacking a 4-CMC wrath and the printing of powerful, recursive threats like Deathmist Raptor is just too much for control to handle, even with Silumgar's Scorn, the best counterspell printed in years, in the format.

The Decks You Should Be Playing

GW Collected Company is — by pretty much any measure — the best deck in our field. Not only did it post the highest overall MWP (coming in at 57.43 percent across nearly 4,000 matches), but it has very few truly bad matchups. It beats both of the most heavily played decks in our field (Atarka Red and Esper Dragons) more than 55 percent of the time, Abzan Aggro over 53 percent of the time, and is only a slight underdog to Mardu Dragons and Abzan Midrange. Better yet, the deck seems like a very good place to be for the fall. Even though it loses Courser of Kruphix, Elvish Mystic, and Fleecemane Lion, it's hard for me to imagine a shell featuring Collected Company, Deathmist Raptor, and Den Protector not being among the best and most played come rotation, not to mention we are getting an entire cycle of planeswalkers that can be hit by Collected Company. I mean, doesn't the new Gideon (oddly named Kytheon, Hero of Akros) seem pretty solid in this deck?

All things considered, if my goal was to buy a deck that is very good now, but should still be very good this winter, GW Collected Company would be my choice. Even if the splash color changes come rotation (who knows, maybe from white to black for Liliana, Heretical Healer; you can literally splash any color in the CoCo/Deathmist/Protector framework), you'll still have the most important (and likely most expensive) cards for any Collected Company based build.

Mardu Dragons is basically the Jund of Standard; it manages to be (more or less) 50/50 against the field (its worst matchup is Atarka Sligh where it still wins 48.17 percent of the time) by playing a mix of the best removal in the format and solid midrange threats. Despite the great numbers, including a 53.89 MWP across nearly 6,000 matches, it still makes up a relatively small percentage of the field, coming in behind (the awful) Jeskai Tokens as the 8th most played deck. I wouldn't be surprised if this was a deck on the rise in coming weeks.

Unfortunately, while Mardu Dragons seems like a great choice for the summer, I'm a little less sure of the deck's post-rotation future. While losing Thoughtseize isn't a deal breaker (mostly because everyone loses Thoughtseize), Goblin Rabblemaster, Stormbreath Dragon, and Anger of the Gods seem fairly important to the deck. On the other hand, the deck keeps its entire "best in Standard" removal suite which is one of the biggest reasons to play the color combination anyway. I think there is a good chance there will be a midrange Mardu deck this fall, but whether it looks like this or more like the Butcher of the Horde builds from months ago remains to be seen.

Other Notes and Observations

Conclusion

Anyway, that's all for today. Make sure to take some time to browse the data for yourself and see what you can find. As always, leave your thoughts and opinions in the comments, and you can reach me on Twitter (or MTGO) @SaffronOlive.