We did some maintenance on our model this weekend, and now that it’s done, we thought we should put out a post explaining how exactly our model creates its probabilities and bracketologies. This is probably overdue. To be fair, the maintenance was overdue too. We’re a low-budget outfit, and that shows in how much time we choose to spend on things like a statistical model that gets the vast majority of its traffic from fans of NIT hopefuls. If you want more, click more (and thank you for the clicks).

If you want to have all our model’s outputs open while you read this, here they are:

We’ll start with how our model works, then get into how accurate it is.

How Our Model Works

At the highest level, this model has three tasks:

1. Simulate the remainder of the regular season and all conference tournaments
2. Simulate the tournament selection process for the NCAA Tournament and the NIT, creating a bracket for each
3. Simulate the NCAA Tournament and the NIT

Every day, the model does these three tasks 4,000 times. 4,000 times in a row, it goes through step one, then step two, then step three, aggregating the results into outputs that we then post on the website.
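In skeleton form, that daily loop looks something like the sketch below. The three step functions here are stand-ins for the simulators described in the rest of this post, not our actual code:

```python
import random

def simulate_season(rng):
    # Stand-in for step 1: the real version simulates every remaining
    # game and every conference tournament.
    return ["Team A", "Team B"]

def simulate_selection(season):
    # Stand-in for step 2: the real version builds the NCAA Tournament
    # and NIT fields from the simulated season.
    return season

def simulate_tournament(bracket, rng):
    # Stand-in for step 3: the real version plays out both brackets.
    return rng.choice(bracket)

def run_daily(n_sims=4000, seed=0):
    """Run steps 1-3 n_sims times and aggregate title probabilities."""
    rng = random.Random(seed)
    champ_counts = {}
    for _ in range(n_sims):
        season = simulate_season(rng)              # step 1
        bracket = simulate_selection(season)       # step 2
        champ = simulate_tournament(bracket, rng)  # step 3
        champ_counts[champ] = champ_counts.get(champ, 0) + 1
    return {team: wins / n_sims for team, wins in champ_counts.items()}
```

The aggregated dictionary of frequencies is, conceptually, what ends up posted on the site each day.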

How Our Model Simulates the Season

We do not have our own model for simulating games. There are far better models out there than anything we would produce, so rather than reinvent the wheel, we use Ken Pomeroy’s KenPom ratings and ESPN’s BPI for all simulations of games. We take both ratings, mix them together (weighting KenPom more heavily than BPI), adjust for home-court advantage where applicable, then run the simulations randomly such that teams with a 90% likelihood of winning win 90% of the time, those with a 50% likelihood win 50% of the time, and so on.
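As a rough sketch of that blending step (the 0.7 weight below is our illustration only; the post says merely that KenPom is weighted more heavily than BPI):

```python
import random

def blended_win_prob(kenpom_p, bpi_p, kenpom_weight=0.7):
    # Weighted average of the two systems' win probabilities.
    # 0.7 is a placeholder; the real weight simply favors KenPom.
    return kenpom_weight * kenpom_p + (1 - kenpom_weight) * bpi_p

def simulate_game(win_prob, rng=random):
    # A team given a 90% chance wins 90% of simulated games.
    return rng.random() < win_prob
```

Run over thousands of simulations, the random draws converge on the blended probabilities, which is what makes the aggregated outputs meaningful.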

The model tracks regular season results and conference standings, using those conference standings to populate the respective tournament bracket for each conference. One shortcoming of the model is that we have not input every conference’s tiebreakers, so there are some scenarios where the model is overstating or understating a team’s chances of winning its conference tournament. Overall, the impact of these cases is small, and we do input some tiebreakers as they become clear, but we wanted to be up-front about that in case, say, Alabama State has been eliminated from contention for an appearance in the SWAC Tournament and we haven’t noticed and gotten it into our model yet.

Once the regular season and conference tournament simulations are complete, our model moves on to the tournament selection processes.

How Our Model Simulates the NCAA Tournament and NIT Selection Processes

There are a lot of ways to simulate these processes. Some are better than others. Some are better some years than they are other years. The committees themselves are different every year, and the information presented to them changes, so they do not always treat teams exactly the same as previous committees did in previous seasons. Subjectivity undoubtedly plays a significant role in their selections, as do some circumstantial things like timing of conference tournaments (the final brackets are often constructed prior to the completion of the Big Ten Tournament Championship, for example).

This is our third year doing any sort of bracketology, and our second running this model. In our first year, we stuck to the most basic of formulas, used it only for projecting the NIT field, and found it did fairly well. Last year, we complicated it a little, but kept the error margins wide, as the formula we chose performed inconsistently when applied to previous NCAA Tournament brackets. It was well-calibrated, but it had some bad misses it should not have had (it projected Gonzaga as a 4-seed when they should have been, at worst, a 2-seed), and those misses were correlated: teams from the power conferences were often overseeded in our final bracketology, while mid-majors were often underseeded.

For this year’s formula, we focused on tightening the error margins as much as possible while maintaining simplicity. We did not try, and are not trying, to reflect any of the subjective pieces at this point (in future years, we would love to model those out, but again: low budget, we are not making money doing this, we make our money elsewhere, and those jobs will not pay us if we spend all our time doing NIT Bracketology). Instead, we take pieces of information formally presented on the Team Sheets used by the NCAA Tournament Selection Committee (NET ranking, Quadrant I wins, Nonconference Strength of Schedule, etc.) and put them into a formula we’ve built that, within a margin described below, fairly accurately reproduces the final brackets from the last few years. That formula churns out a “selection score,” which is used, in each simulation, to seed the teams and construct a bracket, after adjusting for automatic bids to both tournaments.

This bracket is different from our bracketology brackets in that it does not follow the rules governing how soon in the tournament teams from the same conference can play, how teams are placed geographically, and so on. We left those parameters out of the model because they slow our machine down too much (we aim to fix this shortcoming in future iterations of the model). Instead, we just let the 1st overall seed play in the same region as the 8th, 9th, and 16th overall seeds, the 2nd in the same region as the 7th, 10th, and 15th, and so on throughout the tournaments. This creates some error, which we account for in our error margin, but on average it works out well enough: the exact placement of teams within the bracket has, on average, only a modest effect on individual teams’ championship chances.
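That naive placement (1st overall with the 8th, 9th, and 16th; 2nd with the 7th, 10th, and 15th) is the classic S-curve, which is easy to sketch for a 64-team field split into four regions:

```python
def assign_regions(n_teams=64, n_regions=4):
    """S-curve placement by overall seed: region 0 gets overall seeds
    1, 8, 9, 16, ...; region 1 gets 2, 7, 10, 15, ...; and so on."""
    regions = [[] for _ in range(n_regions)]
    for line in range(n_teams // n_regions):
        overall_seeds = range(line * n_regions + 1,
                              (line + 1) * n_regions + 1)
        order = range(n_regions)
        if line % 2 == 1:
            # Reverse direction on alternating seed lines so the
            # strongest region on one line gets the weakest team
            # on the next.
            order = reversed(range(n_regions))
        for region, overall in zip(order, overall_seeds):
            regions[region].append(overall)
    return regions
```

This is a sketch of the placement idea only; the real committee applies the full Bracket Principles (conference separation, geography) on top of something like this.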

If this process sounds fairly straightforward, it is: We created a formula entirely derived from the team sheets. It’s one of the simplest ways to do bracketology, and we like it because it’s entirely objective, removing our own bias from the equation. What isn’t entirely straightforward is predicting what each metric on the team sheet will look like at the end of the season, because certain metrics are a black box: we don’t know the exact formulas used for things like NET, which is hugely important in how the team sheet ends up being constructed. Where possible, we use known formulas. When we don’t know the formula, we either approximate it or, in the case of NET, resort to an Elo-inspired system that begins with current rankings, translates them to normally distributed values, and then adjusts them in Elo fashion based on the simulated game results, such that surprising results move the values more than expected ones. It’s far from a perfect way to do this, and we factor that into the model’s error margins, but it’s the best way we’ve found to approximate changes in the NET rankings and similar metrics with the processing power we have available on a couple of laptops. This process runs concurrently with the regular season and conference tournament simulations, so when those games are completed, we have not only a list of winners and losers but also our best approximation of the metrics that will be associated with the teams who achieved those wins and losses.
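A minimal sketch of that Elo-style approach might look like the following. The percentile mapping, logistic curve, and k-factor here are our illustrative choices, not the model's actual calibration:

```python
import math
from statistics import NormalDist

def rank_to_value(rank, n_teams=362):
    """Translate a current NET-style rank into a normally distributed
    value by mapping its percentile through the inverse normal CDF.
    (n_teams is roughly the D-I field; it varies by season.)"""
    percentile = (n_teams - rank + 0.5) / n_teams
    return NormalDist().inv_cdf(percentile)

def elo_update(winner_value, loser_value, k=0.1):
    """Shift both values after a simulated game. An upset (low expected
    win probability for the winner) produces a larger shift than an
    expected result, so surprising outcomes move values more."""
    expected = 1.0 / (1.0 + math.exp(loser_value - winner_value))
    shift = k * (1.0 - expected)
    return winner_value + shift, loser_value - shift
```

After all simulated games are played, the adjusted values can be sorted back into a projected end-of-season ranking.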

The Bracketologies

Now is probably the best point to explain the bracketologies, which are their own process within the model. They take a lot of processing in their own right, so we don’t run them for every one of the 4,000 simulations. Instead, we do the following:

For the NCAA Tournament:

Take the most likely team to win each conference’s automatic bid and award it the conference’s automatic bid (in most cases, this is the conference tournament champion, but in the Atlantic Sun, North Alabama is not eligible for the NCAA Tournament or NIT, yet still competes in the A-Sun Tournament).

Of the teams who did not receive automatic bids, take the 36 with the best median overall seed and award them at-large bids.

Seed the teams in order of median overall seed, with mean overall seed serving as the tiebreaker.

Construct the bracket according to the NCAA’s Bracket Principles.
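The at-large and seeding steps above boil down to one sort, which might look like this (the team names and seed samples are made up for illustration):

```python
from statistics import mean, median

def select_and_seed(seed_samples, auto_bid_teams, n_at_large=36):
    """seed_samples maps team -> list of overall seeds across the
    simulations. Returns the field in seed order: auto-bid teams plus
    the n_at_large best at-larges, sorted by median overall seed with
    mean overall seed as the tiebreaker."""
    seed_key = lambda t: (median(seed_samples[t]), mean(seed_samples[t]))
    at_larges = sorted(
        (t for t in seed_samples if t not in auto_bid_teams),
        key=seed_key,
    )[:n_at_large]
    return sorted(list(auto_bid_teams) + at_larges, key=seed_key)
```

Bracket construction under the NCAA’s full Bracket Principles would then run on top of this ordered field.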

For the NIT:

Look at our model’s median projection for how many teams will receive automatic bids to the NIT (teams receive these bids when they win their conference’s regular season title, lose in the conference tournament, and don’t receive an at-large bid to the NCAA Tournament). Award those bids to the teams that are most likely to receive them, with a limit of one per conference (if Yale and Harvard are both among the twelve most likely teams to receive an automatic bid, and twelve is the median number of automatic bids awarded in our simulations, our model will only take one of Yale and Harvard, giving the twelfth auto-bid to the 13th-most likely team to receive it).

Look at our model’s median projection for the NCAA Tournament/NIT cut line, then start adding at-large teams from there, again sorted by median overall seed, with mean overall seed serving as the tiebreaker. If a team is projected to receive its conference’s NCAA Tournament automatic bid but not an NIT automatic bid, we don’t include that team in our NIT Bracketology, since that would increase the number of automatic bids we’re including.

Construct the bracket according to the best understanding we’ve got of the NIT’s Bracket Principles.
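The one-bid-per-conference rule from the auto-bid step above can be sketched as a simple greedy pass over the candidates (probabilities here are invented for the example):

```python
def award_nit_auto_bids(candidates, n_bids):
    """candidates: iterable of (team, conference, probability of
    receiving an NIT auto-bid). Awards n_bids to the likeliest teams,
    capped at one per conference, per the Yale/Harvard example above."""
    awarded, conferences_used = [], set()
    for team, conference, _ in sorted(candidates, key=lambda c: -c[2]):
        if conference in conferences_used:
            continue  # a conference can supply at most one auto-bid
        awarded.append(team)
        conferences_used.add(conference)
        if len(awarded) == n_bids:
            break
    return awarded
```

When a conference’s second team is skipped, the bid simply rolls down to the next-most-likely team from a different conference.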

How Our Model Simulates the NCAA Tournament and the NIT

At this point, it’s back to KenPom and BPI. We’ve got the brackets constructed, so we just let them play out, tracking who wins each tournament and bringing it back to you, the reader.

How Accurate You Should Expect Our Model to Be

The answer differs between the NCAA Tournament and the NIT. For the NCAA Tournament, our final bracketology is expected to have a Paymon Score close to 330 out of a possible 408, which is better than only roughly 20% of all scored brackets on Bracket Matrix over the last two years. We expect to miss three teams or so on the bubble, have one or two big whiffs on seeding (we may project a team as an 11-seed when they end up an 8-seed), and otherwise be fairly solid. Obviously, this isn’t great, and we would like to improve it. That said, our model is well-calibrated, meaning that when we say 40%, we mean 40%, and when we say 80%, we mean 80%, and our results will, in all likelihood, back that up. It’s also well-calibrated as of right now: the bracket isn’t a snapshot of current results; it anticipates future results, forms brackets based on those, and does so accurately in terms of percent likelihoods. So when you’re looking at our bracket, you’re looking at a decent projection of where things will end up, and when you’re looking at our probabilities, you’re getting a good idea of individual teams’ chances.

There are things we could do to make the expected Paymon Score higher. We intend to do those things in future iterations of the model. We did not do them in this case because a lot of them are based on exceptions, and are therefore hard to code in efficiently and effectively: Nonconference Strength of Schedule might not matter a whole lot, having adjusted for the rest of a team’s résumé, until it gets to being one of the poorest in the country, or the poorest in the country, as NC State’s was last year. Quadrant I wins might not matter a whole lot, having adjusted for the rest of the résumé, until a team has ten of them, as St. John’s did last year. Some things matter in conjunction with one another, but don’t matter separately, such as how Marquette and Kansas’s respective performances against Quadrant I teams seem to have been evaluated differently last year because Kansas had a more difficult overall schedule, which was already baked into other metrics on the team sheet. And, of course, subjectivity matters: some teams’ conference tournament performances are treated as more important than those of others due to timing and prior expectations; sometimes a team like Oklahoma in 2018 loses a lot of games in the back half of the season and falls victim to “the eye test.” We would like to account for all these things, but we want more data, and as we gather more data, we don’t want to make any wrong bets and mislead anyone unnecessarily. We’d rather have every team within one or two seed lines of its correct place than have eight or nine teams five seed lines off.

I also mentioned the NIT. Our final NIT bracket should be expected to be more accurate than our NCAA Tournament bracket, relative to the other bracketologies out there. This is somewhat intentional on our part: there are hundreds of NCAA Tournament bracketologies online (that’s not an exaggeration), while there are only a few NIT bracketologies, and a lot of our traffic comes from NIT Bracketology. We are, at our core, still an NIT blog, and we chose to prioritize our bracketology’s accuracy in projecting the NIT because we wanted to add an effective projection to that market, complementing the work of others so fans have multiple credible opinions. Our Paymon Score for that bracket should end up close to 160 out of a possible 192, which isn’t that much better than our NCAA Tournament accuracy as a percentage of the overall possible score, but is stronger in relation to the field: somewhere in the upper half.

There’s also the matter of day-to-day variation within our simulations. If we were running one million simulations a day, there would be essentially no variation without inputs changing, but since we run only 4,000 simulations, there is some natural variability within the model. Our probabilities should be considered accurate within the following 95% confidence intervals:

Make NCAAT: 0.9% (0.9 points, not 0.9% of the probability itself)

Win NCAAT: 0.2% (0.2 points)

Make NIT: 0.8% (0.8 points)

Win NIT: 0.2% (0.2 points)
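If you want to sanity-check intervals like these, the standard binomial formula gives the 95% half-width for a single probability estimated from n independent simulations. (The quoted numbers above are tighter than the worst case at a 50% probability, which makes sense: most teams’ probabilities sit much closer to 0% or 100%, where the half-width shrinks.)

```python
import math

def ci_half_width(p, n_sims=4000, z=1.96):
    """95% confidence half-width, in probability points, for an
    estimate of p from n_sims independent binomial simulations."""
    return z * math.sqrt(p * (1 - p) / n_sims)
```

For example, at 4,000 simulations the half-width is about 1.5 points for a 50% probability but only about 0.7 points for a 5% probability.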

***

Please, reach out with questions and comments, and keep an eye out for models similar to this one regarding other collegiate and professional sports. College baseball and college softball models will hopefully be on their way soon. We’ve discussed creating a NASCAR model and an NHL model for future years. Our college football model will return this fall, hopefully beefed up. And, next year, we aim to have a more nuanced, precise model for men’s college basketball, as well as our first for women’s college basketball.

Thanks for being here.