(Note that this last image was generated with a custom overlay I created just for debugging purposes, so don’t go looking for it in the next version.)

When a census is run on a cluster for the first time, the cluster keeps the census data stored. That way I can reuse that cluster data over several polity censuses. But if a group within the cluster gets updated, the entire cluster gets flagged, which means that the next time the polity runs a census, the cluster census will be rerun to update its stored data.
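To make the idea concrete, here is a minimal sketch of a cached cluster census with a "flagged" marker. It is only an illustration, not the game's actual code: the names `Group`, `Cluster`, `update_group`, `census_runs` and `polity_census` are all hypothetical, and a cluster is assumed to simply sum the populations of its groups.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    population: float


@dataclass
class Cluster:
    groups: list = field(default_factory=list)
    flagged: bool = True      # no stored census yet, so the first census must run
    stored_total: float = 0.0
    census_runs: int = 0      # only here to show how often the full count happens

    def update_group(self, index, population):
        # Any change to a group flags the whole cluster for a recount.
        self.groups[index].population = population
        self.flagged = True

    def census(self):
        # Recount the groups only when the stored data is stale.
        if self.flagged:
            self.stored_total = sum(g.population for g in self.groups)
            self.flagged = False
            self.census_runs += 1
        return self.stored_total


def polity_census(clusters):
    # A polity census walks stored cluster data instead of every cell.
    return sum(c.census() for c in clusters)


a = Cluster([Group(100.0), Group(50.0)])
b = Cluster([Group(30.0)])
first = polity_census([a, b])    # both clusters are counted
second = polity_census([a, b])   # both clusters reuse their stored totals
a.update_group(0, 120.0)         # flags cluster a only
third = polity_census([a, b])    # only cluster a is recounted
```

In this sketch a second census with no changes in between does no counting at all, and a change to one group costs a recount of just that group's cluster.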

All in all, what this means is that instead of having to go through every cell in a polity whenever I request a census, I just go through the stored cluster data, which is way faster. This method does generate some rounding errors (less than 1%) compared with the old, cell-by-cell method. But, unlike with the differential census method, those rounding errors do not accumulate over time, and accumulation is the actual deal-breaker for differential censuses. Small rounding errors are OK as long as they don’t accumulate.
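The difference between accumulating and non-accumulating errors can be shown with a toy example. This is a deliberately simplified sketch, not either of the real census methods: it assumes a hypothetical differential census that patches a running total with rounded changes, and a recount that rounds the current value of each cell from scratch. The function `simulate` and all of its parameters are made up for the illustration.

```python
def simulate(num_cells=100, steps=50, growth=0.4, start=10.0):
    # True (unrounded) population of each cell; all cells behave the same here.
    cells = [start] * num_cells
    # Running total that is only ever patched with rounded changes.
    differential_total = round(sum(cells))

    for step in range(1, steps + 1):
        for i in range(num_cells):
            new_value = start + growth * step
            # Differential update: the fractional part of every change is lost
            # for good, so the error keeps growing with each step.
            differential_total += round(new_value - cells[i])
            cells[i] = new_value

    true_total = sum(cells)
    # Recount: round the current value of each cell. Nothing is carried over
    # from earlier censuses, so the error is at most 0.5 per cell.
    recount_total = sum(round(c) for c in cells)
    return true_total, differential_total, recount_total


true_total, differential_total, recount_total = simulate()
differential_error = abs(true_total - differential_total)
recount_error = abs(true_total - recount_total)
```

With these made-up numbers every change is smaller than 0.5, so the differential total never moves and its error grows with every step, while the recount error stays bounded no matter how long the simulation runs.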

So how much performance did the simulation gain with this? Quite a lot. The previous census method would account for about 25% of the app’s overall CPU time for worlds like the one above. The new method, on the other hand, accounts for less than 1% of the app’s CPU time under the same conditions. Now, that 25% might not seem like a lot, but taking into account how the simulation works, this actually translates to what I estimate is a 2x to 5x simulation speed boost in the late game. For example, instead of evaluating just 10 or so polities per frame, the game can now evaluate 50 or more polities per frame (though my estimates might be completely off base).

So this was, honestly, a greater success than I expected it to be. Nevertheless, there are still a bunch of places where I think I can shave time off the simulation before I can say I’m done with performance improvements for 0.3.1. So next week I’ll start digging for milliseconds in other areas of the code.