Nassim Nicholas Taleb has tweeted a set of remarks about intelligence research.

https://threadreaderapp.com/thread/1076845397795065856.html

He has now gathered those together into one format, with links and explanations.

https://medium.com/incerto/iq-is-largely-a-pseudoscientific-swindle-f131c101ba39

There is no lack of confidence in his essay. There is much to discuss here, and what follows covers what I see as the main points. I have added some links to relevant publications, but you can put any of the concepts and author names in my search bar to get further details.

1 IQ is largely a pseudoscientific swindle

Given that Taleb criticizes the poor statistics used by intelligence researchers, a mild comment is that it would have been better to be more precise. I have assumed he means that more than half of intelligence research findings are wrong, and for malicious reasons. If this is his point, he is factually wrong.

2 IQ is stale, mostly measures very low intelligence, or a lesser form of intelligence for paper shufflers or those ill-suited to real life. “it explains at best between 13% and 50% of the performance in some tasks”. It is based on poor maths, and promoted by racists and swindlers.

It seems that Taleb has a poor opinion of many people. “Paper shufflers” probably include all the backroom workers who keep the books and process the transactions of star traders. The reason for his doubts about the maths behind IQ, Taleb explains, is that he can computer-generate correlations based on particular assumptions which then look like some of the reported findings on intelligence and scholastic attainment. He implies that if he can do that on a simple basis (create a mythical test which only measures performance below IQ 100 and progressively just adds noise above that) then that invalidates the actual observations reported by Frey and Detterman (2004). This is not a compelling argument. A far simpler explanation is that a population-wide measure (general IQ) is being compared with a scholastic test taken only by a selection of brighter students (SAT) yet still does a pretty good job of showing the link between the two. This is a real-life finding, of the sort that Taleb supposedly favours.
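To make the toy construction concrete, here is a minimal sketch of a test of the kind described above. It is not Taleb's code: the noise schedule (`noise_scale`), the sample size and the seed are all assumptions chosen for illustration. Performance is in standard units with 0 standing for IQ 100.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_toy_test(n=20000, noise_scale=3.0, seed=0):
    """Return (performance, score) lists for a hypothetical test.

    performance: standard-normal "true" performance, 0 = population mean.
    score: equals performance at or below the mean; above the mean, noise
    that grows with distance from the mean is added (assumed schedule).
    """
    rng = random.Random(seed)
    performance, score = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        noise = noise_scale * z * rng.gauss(0.0, 1.0) if z > 0 else 0.0
        performance.append(z)
        score.append(z + noise)
    return performance, score


perf, score = simulate_toy_test()
below = [(p, s) for p, s in zip(perf, score) if p <= 0]
above = [(p, s) for p, s in zip(perf, score) if p > 0]

r_all = pearson(perf, score)
r_below = pearson(*zip(*below))
r_above = pearson(*zip(*above))

print(f"whole sample r = {r_all:.2f}")
print(f"below the mean r = {r_below:.2f}")
print(f"above the mean r = {r_above:.2f}")
```

Under these assumptions the whole-sample correlation is moderate even though the score says little above the mean. That shows only that such a pattern can be manufactured, not that real test data were generated this way.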

3 If you want to detect how someone fares at a task, say loan sharking, tennis playing, or random matrix theory, make him/her do that task; we don’t need theoretical exams for a real-world function by probability-challenged psychologists.



In fact, psychologists have understood this point. Hunter and Schmidt, and Kuncel, point out that the best test of whether a person can do a job is to let them try it. However, this is expensive in time and money, since you have to supervise them to prevent disasters, give them detailed instructions and monitor their performance carefully, all of which takes at least two weeks to get a reasonable estimate of the applicant’s capabilities. You cannot do this for all applicants, or it would take up all the staff time required for doing the actual work of the business. The above researchers show that an intelligence test is a close second-best in terms of outcome, and far quicker and cheaper. Add a test of honesty and you have an efficient selection system.

4 Different populations have different variances, even different skewness and these comparisons require richer models.

Again, most psychometricians agree with that and it has been known for decades. At the very least, they like seeing the data plotted out properly, so the actual findings are visible, and so that they can be analyzed by different statistical approaches. Nothing new or insightful here.

5 A measure that works in left tail not right tail (IQ decorrelates as it goes higher) is problematic.

Lubinski and Benbow have shown in prospective studies with a large sample that IQ is still predictive at the very highest levels, and keeps working at each higher band. Taleb’s point is demonstrably wrong.

6 It (IQ) can measure some arbitrarily selected mental abilities (in a testing environment) believed to be useful. However, if you take a Popperian-Hayekian view on intelligence, you would realize that to measure it you would need to know the mental skills needed in a future ecology, which requires predictability of said future ecology. It also requires the skills to make it to the future (hence the need for mental biases for survival).

Intelligence test items are not arbitrary. They are selected to represent a wide range of abilities drawn from actual tasks and real-life problems. They correlate highly with tests which specifically base themselves on real-life tasks in American society, such as the Wonderlic Personnel Test. Linda Gottfredson has shown all this, many times, for decades. As to “mental skills needed in a future ecology”, that is an excellent example of intelligent behaviour, as is survival. In a Scottish population study, Ian Deary has shown that intelligence tested at age 11 predicted lifespan into old age. Brighter people were capable of surviving longer than the less bright. Taleb is wrong again.

7 Real life never never offers crisp questions with crisp answers (most questions don’t have answers; perhaps the worst problem with IQ is that it seems to selects for people who don’t like to say “there is no answer, don’t waste time, find something else”.)

If this were a relevant objection, then the crisp answers required in the Scottish 11+ would not have shown any relation to lifespan and decades of achievement. Equally, the crisp answers required of SMPY participants would not have predicted their mid-life achievements (and will probably predict decades of achievement as the follow-ups continue). Digits backwards is a crisp-answer task. It wastes little time, yet is a good predictor of general ability. Crisp test answers also correlate with many brain structure and function measures assessed by neuroimaging (Haier, 2017). Also, given that all puzzles require brain power, these selected items may tap a general ability to solve puzzles of a far more general and urgent nature.

8 It takes a certain type of person to waste intelligent concentration on classroom/academic problems. These are lifeless bureaucrats who can muster sterile motivation. Some people can only focus on problems that are real, not fictional textbook ones.

Taleb is very free with his insults. It might play to those already taking an anti-IQ stance. A rough measure of ability can be obtained in two minutes, which does not tax concentration. Sure, many people favour the practical over the academic, and might concentrate best on real-life problems. This is testable, and once again, on a broad range of people and a broad range of real-life problems, intelligence tests maintain predictive utility. Detterman shows many of the correlations.

9 IQ doesn’t detect convexity (by an argument similar to bias-variance you need to make a lot of small inconsequential mistakes in order to avoid a large consequential one. See Antifragile and how any measure of “intelligence” w/o convexity is sterile edge.org/conversation/n…)

Taleb makes interesting points about what he has described as “convexity”. In his Edge essay he points out that “chance” is not a good explanation for long term gains. Now we have something we can agree upon. By “convexity” Taleb means that in his view research progresses by “a significant asymmetry between the gains (as they need to be large) and the errors (small or harmless), and it is from such asymmetry that luck and trial and error can produce results”. This is a convex function, hence the name he gives it. Fine. This may or may not be the case, and it is not clear how this hypothesis could be tested directly (and therefore whether it is scientific), nor is it clear why this proposal means that a measure of intelligence would be “sterile”. One test might be to see what the correlation is between IQ and options trading/financial investments. The latter are real world tests of considerable significance, and a null result would strengthen his argument.

As luck would have it, here is a relevant publication.

https://sci-hub.shop/10.1016/j.jfineco.2011.05.016

We analyze whether IQ influences trading behavior, performance, and transaction costs. The analysis combines equity return, trade, and limit order book data with two decades of scores from an intelligence (IQ) test administered to nearly every Finnish male of draft age. Controlling for a variety of factors, we find that high-IQ investors are less subject to the disposition effect, more aggressive about tax-loss trading, and more likely to supply liquidity when stocks experience a one-month high. High-IQ investors also exhibit superior market timing, stock-picking skill, and trade execution.

The authors find that by making better stock selections and achieving lower transaction costs high-IQ subjects do 4.9% per year better than low-IQ subjects. Given that real returns average 7%, this is a massive difference which will accumulate over time and result in far higher net personal worth for brighter investors. By the way, intelligence is measured at conscription age, long before there is much investment history, so is more likely to be causal.
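To give a feel for how such a gap accumulates, here is a rough compounding sketch. The set-up is an assumption made purely for illustration and is not taken from the paper: both investors start with the same stake, the high-IQ investor earns the 7% average and the low-IQ investor earns 4.9 percentage points less, over a hypothetical 30-year horizon.

```python
def grow(initial, annual_return, years):
    """Value of `initial` after compounding `annual_return` for `years`."""
    return initial * (1.0 + annual_return) ** years


# Illustrative assumptions only (not figures from the paper's analysis).
AVERAGE_RETURN = 0.07   # real return quoted in the text
GAP = 0.049             # yearly advantage of high-IQ over low-IQ investors
YEARS = 30              # hypothetical investment horizon
STAKE = 100.0           # hypothetical starting amount for both investors

high_iq_wealth = grow(STAKE, AVERAGE_RETURN, YEARS)
low_iq_wealth = grow(STAKE, AVERAGE_RETURN - GAP, YEARS)

print(f"high-IQ investor: {high_iq_wealth:.0f}")
print(f"low-IQ investor:  {low_iq_wealth:.0f}")
print(f"ratio:            {high_iq_wealth / low_iq_wealth:.1f}")
```

On these assumptions the brighter investor ends up with roughly four times as much. The exact multiple depends on the horizon and on where the 4.9 points are subtracted from.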

https://www.sciencedirect.com/science/article/pii/S0304405X1100211X

10 Seeing shallow patterns is not a virtue — leads to naive interventionism. Some psychologist wrote back to me: “IQ selects for pattern recognition, essential for functioning in modern society”. No. Not seeing patterns except when they are significant is a virtue in real life. To do well in life you need depth and ability to select your own problems and to think independently.

The ability to see patterns where others cannot has traditionally been seen as a sign of intelligence. Interestingly, Taleb accepts that some problems are shallow. How does he know? Presumably he can see through them, and finds them easy. Good. Item difficulties vary. Depth, ability to select problems and to think independently are signs of intelligence. This is a point of agreement with psychometrics although he may not realize it.

11 Functionary Quotient: If you renamed IQ, from “Intelligent Quotient” to FQ “Functionary Quotient” or SQ “Salaryperson Quotient”, then some of the stuff will be true. It measures best the ability to be a good slave.

Taleb’s argument seems to be that IQ tests only work for common folk in humdrum jobs. Not exactly noblesse oblige. This is a version of the old familiar argument that intelligence tests do not measure creativity. Easy to assert, but the evidence seems to be against it, so long as you measure creativity by quality and quantity. Rex Jung has studied this matter, creatively and carefully.

Kuncel shows that college entrance tests can predict success even in jobs that are far from humdrum.

12 “IQ” is most predictive of performance in military training, w/correlation~.5, (which is circular since hiring isn’t random).

Two criticisms. First, the military training data is interesting in itself, but the key issue in terms of the generality of the findings is that quite a few tasks were identical to non-military tasks. For example, vehicle maintenance is the same task, so what we have is far more detail on the IQ/training link than was usually collected in commercial car repair garages.

Second, far from being “circular” the observed correlation of .5 is in fact attenuated by range restriction. Can Taleb have made a simple statistical error? Impossible. Since the US military are allowed to screen out low ability candidates, they provide a precise test of what Taleb had asserted earlier, namely that the tests only worked for low ability people. The correlation of .5 is achieved on higher ability people. If we assume that at least IQ 100 is required, then the true correlation might be .7.
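The “might be .7” estimate can be checked with the standard correction for direct range restriction (Thorndike's Case II). The sketch below rests on simplifying assumptions: selection is a clean cut-off at the population mean on the test itself, scores are normally distributed, and the relationship is linear. Real military selection is messier, so treat the result as a ballpark figure. The function names are mine.

```python
import math


def truncated_sd(cutoff_z):
    """SD of a standard normal variable after discarding values below cutoff_z.

    The unrestricted SD is 1, so this is also the ratio of the restricted
    to the unrestricted SD.
    """
    pdf = math.exp(-0.5 * cutoff_z ** 2) / math.sqrt(2.0 * math.pi)
    upper_tail = 0.5 * math.erfc(cutoff_z / math.sqrt(2.0))
    lam = pdf / upper_tail  # mean of the selected group, in SD units
    variance = 1.0 + cutoff_z * lam - lam ** 2
    return math.sqrt(variance)


def correct_for_range_restriction(r_restricted, sd_ratio):
    """Thorndike Case II correction for direct selection on the predictor.

    sd_ratio is the restricted SD divided by the unrestricted SD.
    """
    big_u = 1.0 / sd_ratio
    return (r_restricted * big_u) / math.sqrt(
        1.0 + r_restricted ** 2 * (big_u ** 2 - 1.0)
    )


# Assumption: everyone below IQ 100 (z = 0) is screened out.
sd_ratio = truncated_sd(0.0)
corrected = correct_for_range_restriction(0.5, sd_ratio)

print(f"restricted/unrestricted SD = {sd_ratio:.2f}")
print(f"corrected correlation      = {corrected:.2f}")
```

With a cut-off at the mean the selected group keeps about 60% of the original spread, and an observed .5 corrects to roughly .69, in line with the figure given above.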

13 I have here no psychological references for backup: simply, the field is bust. So far ~ 50% of the research DOES NOT replicate, & papers that do have weaker effect. Not counting poor transfer to reality.

Why does he offer no references? Probably because, in fact, the main findings in psychometrics have replicated as well as or better than other areas in psychology. It is just that many people hate the results.

14 The Flynn effect should warn us not just that IQ is somewhat environment dependent, but that it is circular.

Well, the Flynn effect did not raise Digit Span or Maths scores, so there is an ignored story here about measurement problems. Not clear that the FE is a g effect. Few psychometricians doubt that the environment affects ability. IQ is not circular, but can be determined by very simple mental tasks which are found in all cultures, and have long term predictive power on far more complex tasks.

Summary

Taleb has made sweeping assertions with great confidence and surrounded by insulting language. Those assertions may well influence people who feel unsure about intelligence, and who assume that someone who is sure of themselves must know what they are talking about. That is understandable: an unsure person is aware they need to do more reading and thinking before feeling confident, and charitably assumes that only a knowledgeable person who had done the necessary reading would dare speak with confidence.

Yet, far from giving scientific references at the end of his essay, Taleb confidently asserts that he does not need to do so, because the field is broken because…. Convexity. This is presented as if it were an essential ingredient of statistical analysis, rather than one of his interesting ideas about research strategies. This is amusing, because even in the area which Taleb calls his own, as a financial instruments trader, it is easy to find a careful, long term, large sample study that shows the beneficial effects of intelligence on investment behaviour. On his own home ground he is down 1-0.

The other lapse is to ignore the decades of debate carried out by intelligence researchers, notably Jensen, to improve measures of intelligence so that they conform to the requirements set out by SS Stevens. Digit Span is such a measure. So is Digit Symbol and, if measured extensively, Vocabulary. Simple and complex reaction times are other examples. Overall, Taleb is not providing new or original insights that advance the field. But his aim does not appear to be constructive or even informative.

I don’t know why an able man is so ill-disposed to measures of ability, but can only assume he is well aware of his abilities, and regards himself as above such mundanities. He does not give references, but mentions a book he is about to publish. Better to stick to the facts.

Does Taleb’s boastful dismissal of a field he palpably does not master mean that we should dismiss his contributions to other fields? Probably not. Public figures sometimes stray out of their field of competence. It is an occupational hazard brought on by public adulation, known since Roman times. However, if he can be so bombastic when out of his depth, then it would be prudent to go back to his other writings with a slightly more critical eye. When I read his thoughts on probability I made positive assumptions about some of his pronouncements on risk on the very prudent grounds that I could not contest his mathematical excursions. Perhaps I was Fooled by Algebra. Perhaps I was not the only one.

Taleb describes himself as a flaneur, which is a stroller, the sort of person who swans about. No problem with that. Swans are beguiling, but beautiful shapes can lead us astray.