On Tuesday morning, a poll came out from ABC News and The Washington Post showing Democrats ahead by 14 percentage points on the generic congressional ballot. On Wednesday morning, another generic ballot poll, from Selzer & Co., also rated as an A+ pollster by FiveThirtyEight, had Democrats ahead by only 2 percentage points.

True, these sorts of disagreements happen some of the time in other polling series, such as for President Trump’s approval rating. And that’s not a bad thing: pollsters disagreeing with one another, and even publishing the occasional “outlier” that bucks the consensus, is proof that they’re doing good, honest work.

But these big disagreements happen a lot more often for the generic congressional ballot than for other types of polls. For whatever reason, generic ballot polls from different pollsters tend to disagree with one another. They also tend to be fairly volatile even within the same polling series: CNN has shown Democrats ahead by as few as 3 points and as many as 16 points in generic ballot polling it has conducted this year, for instance. Whereas for presidential approval numbers, you usually only need a few polls for the average to stabilize, it can take a dozen or more polls in your average before the generic ballot stops bouncing around.
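To put rough numbers on why a noisier polling series needs more polls before its average settles down, here is a minimal sketch. It assumes polls are independent and equally weighted, so the standard error of a simple average shrinks with the square root of the number of polls. The spread values (`approval_sd`, `generic_sd`) are hypothetical and chosen only for illustration; they are not estimates from real polling data.

```python
import math

def standard_error(poll_sd, n_polls):
    """Standard error of a simple average of n independent polls."""
    return poll_sd / math.sqrt(n_polls)

def polls_needed(poll_sd, target_se):
    """Smallest number of polls whose average reaches the target standard error."""
    return math.ceil((poll_sd / target_se) ** 2)

# Hypothetical poll-to-poll spreads, in percentage points (illustrative only).
approval_sd = 2.0   # a series where pollsters mostly agree
generic_sd = 4.0    # a series where pollsters disagree more

for label, sd in [("approval", approval_sd), ("generic ballot", generic_sd)]:
    n = polls_needed(sd, target_se=1.0)
    print(f"{label}: about {n} polls for a standard error of 1 point")
```

Under these assumptions, doubling the spread between polls quadruples the number of polls needed for the same stability, which is in line with the gap between "a few" and "a dozen or more."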

We learned about this the hard way, after seeing our generic ballot average wobble around, variously showing Democrats with leads of anywhere from 4 to 13 points at different times this year. Importantly, the average tended to be mean-reverting. Whenever our average showed Democrats with a lead in the double digits, it predictably retreated to a more modest lead, typically in the range of 6 to 9 percentage points. Likewise, when the Democrats’ lead fell below 6 points, it predictably moved back upward into that 6-to-9 point range. A metric should do a good job of predicting itself instead of bouncing around in this way. The more technical name for this problem is autocorrelation (in this case, negative autocorrelation in the average’s changes, since a move in one direction tends to be followed by a move back the other way), and when you see it in a data series you’re generating, it’s often a sign that you haven’t designed the metric as well as you could have.
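One simple way to check a series for this kind of mean reversion is to compute the lag-1 autocorrelation of its changes. If gains tend to be followed by losses, the value comes out clearly negative. The sketch below is a generic diagnostic, not necessarily the test we ran, and the `bouncy` series is made-up data for illustration.

```python
from statistics import mean

def changes(series):
    """Period-to-period changes in a series."""
    return [b - a for a, b in zip(series, series[1:])]

def lag1_autocorrelation(values):
    """Lag-1 autocorrelation: how well each value predicts the next one."""
    m = mean(values)
    num = sum((a - m) * (b - m) for a, b in zip(values, values[1:]))
    den = sum((v - m) ** 2 for v in values)
    return num / den if den else 0.0

# Hypothetical Democratic leads from an overly aggressive average (made-up numbers).
bouncy = [7, 11, 6, 12, 5, 10, 7, 13, 6, 9]

rho = lag1_autocorrelation(changes(bouncy))
print(f"lag-1 autocorrelation of changes: {rho:.2f}")
# A strongly negative value means each move tends to be reversed by the next one.
```

A well-behaved average should produce changes whose autocorrelation is close to zero, meaning yesterday's move tells you little about today's.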

The reason we were surprised by this is that the settings for our generic ballot tracker had been imported from our presidential approval rating tracker — and we’d tested them extensively on past presidential approval data and were happy with how they were working this year. You’ll notice, for instance, that when there’s a change in our Trump approval rating average, it usually sticks around for at least several weeks if not longer. “Usually” does not mean always: Just as it’s a problem to show movement in the numbers when it’s really just noise, it’s equally problematic to have the average fail to pick up on real changes in Trump’s popularity because it’s too slow-moving. But our approval-rating average tends to strike a pretty good balance between being too aggressive and too conservative.

By contrast, our generic ballot tracker was not striking that balance well, as we discovered when creating our House forecast. It was being too aggressive.

Using a slower-moving generic ballot average — one that uses a larger number of polls even if those polls are less recent — would have done a better job of maximizing predictive accuracy and minimizing autocorrelation in past years, so that’s what we used for the House model. And as of today, we’ve changed our generic ballot interactive to match the settings that our House model is using. The average is designed to be slightly more aggressive as we approach Election Day, but in general, it will yield a much more stable estimate of the generic ballot than the one we’d been using before. We’ve also revised our generic ballot estimates for previous dates to reflect what they would have been using our new-and-improved methodology.
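To illustrate the difference between an aggressive and a slower-moving average, here is a minimal sketch that weights each poll by its age using exponential decay. This is an assumed weighting scheme for illustration only, not the actual formula behind our tracker. The function name, the `half_life_days` parameter, and the poll numbers are all hypothetical.

```python
def weighted_average(polls, today, half_life_days):
    """Average of poll margins, where a poll's weight halves every half_life_days.

    polls: list of (day_number, democratic_margin) tuples.
    """
    num = 0.0
    den = 0.0
    for day, margin in polls:
        age = today - day
        if age < 0:
            continue  # ignore polls dated after "today"
        weight = 0.5 ** (age / half_life_days)
        num += weight * margin
        den += weight
    if den == 0:
        raise ValueError("no polls available on or before today")
    return num / den

# Hypothetical polls: a steady lead of 7 to 8 points, then one outlier at +14.
polls = [(0, 8), (4, 7), (8, 8), (12, 7), (16, 8), (20, 14)]

aggressive = weighted_average(polls, today=20, half_life_days=2)
conservative = weighted_average(polls, today=20, half_life_days=14)
print(f"aggressive: {aggressive:.1f}, conservative: {conservative:.1f}")
```

With a short half-life, the single +14 poll dominates the average. With a longer half-life, older polls keep more of their weight, so the average takes more convincing before it moves. Shortening the half-life as Election Day approaches would be one way to make such an average gradually more aggressive.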

You can see that the new average takes more convincing before jumping at a new trend. (The generic ballot numbers as originally published will still be available — you can see them using the link under the chart.)

As an aside, this is one of the reasons that averaging polls isn’t quite as straightforward as it might seem. How to manage the trade-off between using the most recent polls on the one hand and a larger sample of polls on the other hand is a tricky question and one where the right answer can vary between different types of elections. For the generic ballot, you should take a rather conservative approach. But that doesn’t necessarily hold for something like a presidential race — being too conservative would have caused you to miss crucial late movement toward Trump in 2016.