As far as I can tell, a substantial part of the Qanswers term in the current formula is problematic:

(log(Qviews)*4) + ((Qanswers * Qscore)/5) + sum(Ascores)

About 1/3 of the answers studied here (83 of 254 in total) have a score less than 1/100 of the top-voted post in the respective question. Given the high number of views and votes on the studied questions, it looks like once sufficient evidence has accumulated for a question, it is time to re-check the (Qanswers * Qscore) part of the algorithm to make sure that it still reflects the underlying assumption:

one assumes if there are lots of answers, there will be a lot more voting on the answers, too

The observed score difference ("less than 1/100 of top voted post") clearly indicates that not all the answers satisfy the above assumption. The algorithm assumes voting on the answers, but the evidence strongly indicates that readers don't vote on some of them; thus the (Qanswers * Qscore) part becomes fake.

Given that the questions checked were ones with tens of thousands of views, such an insultingly low score indicates that assuming these answers to be popular wouldn't even be in the ballpark. Still, the formula pumps them into the Qanswers value, as if they were something everyone would be happy to read (hint: they aren't).

Consider tuning the formula to reconcile the observed voting evidence (once there is enough of it to learn from) with the initial assumption of "a lot more voting on the answers".

When a question and its answers gain a lot of votes, begin ignoring answers with a low / non-positive score. Or better yet, ignore answers scored less than some reasonable fraction (e.g. 1/10) of the top one.
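To make the second suggestion concrete, here is a minimal sketch of how the Qanswers term could be computed from "effective" answers only. The function names, the `min_fraction` parameter, and the sample scores are all hypothetical, chosen for illustration; this is not how the actual hot questions code is written.

```python
def effective_answer_count(answer_scores, min_fraction=0.1):
    """Count answers scoring at least min_fraction of the top answer's score.

    Assumption: if no answer has a positive score, nothing counts.
    """
    if not answer_scores:
        return 0
    top = max(answer_scores)
    if top <= 0:
        return 0
    threshold = top * min_fraction
    # Non-positive answers are always ignored, whatever the threshold is.
    return sum(1 for s in answer_scores if s > 0 and s >= threshold)


def answers_term(question_score, answer_scores, min_fraction=0.1):
    """The (Qanswers * Qscore) / 5 part, using only effective answers."""
    count = effective_answer_count(answer_scores, min_fraction)
    return count * question_score / 5


# Hypothetical question scored +50 with six answers, four of them ignored by voters.
scores = [300, 40, 29, 2, 0, -1]
filtered = answers_term(50, scores)      # only 300 and 40 pass the 1/10 cut
unfiltered = len(scores) * 50 / 5        # what the current formula would use
```

With these made-up numbers the filtered term is 20 instead of 60, so the four answers that readers did not vote for stop inflating the score.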

For the sake of completeness, another option would be to keep the current algorithm and instead modify the "feature specification" to better match the formula. Although I honestly cannot imagine how apparently useless answers could be explained / specified as contributing to the "hotness score" in a meaningful way.

Note there is a feature request at Prog.SE meta to test a less 'klingonic' modification of the formula:





Yet another indication of issues with the current formula is how easy it is to manipulate the hotness score. In my recent experiments with a particular typical hot question, changing the direction of a single question vote (up->down->up) appeared to change the score by 10-15 points. This is because every question vote is leveraged by the number of answers. Together, 2-3 voters can "swing" the score by 30-50 (for comparison, the current top screen at the collider shows me 6 questions scored from 77 to 38). Similarly, adding / removing an answer (any answer) in a highly voted question offers even more possibilities for cheating, because it is leveraged by the question score. If this were a game, I would call it somewhat boring because of poor game balance.
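The leverage is easy to see in a rough sketch that looks only at the (Qanswers * Qscore) / 5 term and leaves out views, answer scores and any age decay. The 30 answers and the +100 question score below are hypothetical numbers, not the question from my experiments.

```python
def answers_term(question_score, answer_count):
    """The (Qanswers * Qscore) / 5 part of the formula, nothing else."""
    return answer_count * question_score / 5


def swing_from_vote_flip(question_score, answer_count):
    """Change in the term when one voter flips an upvote to a downvote.

    A flip moves the question score by 2 (one upvote removed, one downvote added).
    """
    before = answers_term(question_score, answer_count)
    after = answers_term(question_score - 2, answer_count)
    return before - after


def gain_from_extra_answer(question_score, answer_count):
    """Change in the term when one more answer (any answer) is posted."""
    return answers_term(question_score, answer_count + 1) - answers_term(
        question_score, answer_count
    )


# Hypothetical hot question: +100 score, 30 answers.
one_flip = swing_from_vote_flip(100, 30)       # 30 * 2 / 5
one_answer = gain_from_extra_answer(100, 30)   # 100 / 5
```

Under these assumptions a single flipped vote moves the term by 12 points and a single extra answer adds 20, regardless of whether anyone finds that answer useful.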

Actually, one can say that a sort of unconscious coordinated manipulation already happens quite regularly: hotness lemmings attracted by titles shown at the top of the collider land on the questions and begin posting multiple zero-effort braindumps, making the formula pump the score even higher, which attracts more lemmings that break things further, and so on. "My particular concern is the poisonous effect these mis-answers have on questions, making interesting and well presented problems look the same as non-constructive popularity contests." (quote source).

For the record, in August 2010 the algorithm was tweaked as follows:

Succeeding questions from the same site are penalized by increasing amounts. So, the first question from SO in the list gets multiplied by 1.0, the second by 0.98, the third by 0.96, etc.

Community wiki questions are penalized, to keep the entire home page from being Poll-type questions

The benefit of many answers is capped at 10, and we only look at the score of the top 3 answers

We only degrade based on question age, and not the last update date on a question, so questions don't pop back up to the top every time they're edited

Views are not counted towards the score

The core of the formula (without the site-based degrading or traffic scaling) is:

((MIN(AnswerCount, 10) * QScore) / 5 + AnswerScore) / (MAX(QAgeInHours + 1, 6) ^ 1.4)

Yeah "benefit of many answers is capped at 10", how cute.

Even with the tweaked formula, stuffing 8 useless, zero-score answers into a +50 question would have the same effect as giving 80 upvotes to answers. At a +200 question, this would be like giving 320 (over three hundred!) upvotes to answers.
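The arithmetic can be checked with a small sketch of the tweaked core formula quoted above. The function names are mine, and `answer_score` is assumed to be the combined score of the top 3 answers, as the tweak description suggests; site-based degrading and traffic scaling are left out.

```python
def hot_numerator(answer_count, question_score, answer_score):
    """Numerator of the tweaked core formula.

    answer_score is assumed to be the combined score of the top 3 answers.
    """
    return (min(answer_count, 10) * question_score) / 5 + answer_score


def hot_score(answer_count, question_score, answer_score, age_hours):
    """Tweaked core formula, without site-based degrading or traffic scaling."""
    decay = max(age_hours + 1, 6) ** 1.4
    return hot_numerator(answer_count, question_score, answer_score) / decay


# 8 zero-score answers stuffed into a +50 question...
stuffed_50 = hot_numerator(8, 50, 0)
# ...versus no stuffing, but 80 upvotes on the answers that count.
voted_50 = hot_numerator(0, 50, 80)
# Same stuffing at a +200 question.
stuffed_200 = hot_numerator(8, 200, 0)
```

Both +50 variants give a numerator of 80, and the +200 one gives 320, so eight answers nobody voted for are worth as much as that many upvotes on the top answers. The cap at 10 only limits how far this goes; it does not stop it.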

No wonder that even with the above tweak, some questions stick to the top of the hot questions list forever.