10 Fallacies in Psychological Assessment

Kenneth S. Pope, Ph.D., ABPP

I've gathered 10 of the most common fallacies and pitfalls that plague psychological testing and assessment, and provided a brief definition and discussion of each.

The list, of course, is far from exhaustive, but these 10 often show up in clinical, forensic, and other psychological assessments and I'm guessing that many if not all of us have gotten tangled up in them at one time or another.

They are: mismatched validity; confirmation bias; confusing retrospective and predictive accuracy (switching conditional probabilities); unstandardizing standardized tests; ignoring the effects of low base rates; misinterpreting dual high base rates; perfect conditions fallacy; financial bias; ignoring the effects of audio-recording, video-recording, or the presence of third-party observers; and uncertain gatekeeping.

These assessment fallacies and pitfalls are discussed in more detail in the articles and other materials on this site, but it seemed worthwhile to draw them together.

For those interested, other articles in this section present 22 Logical Fallacies in Psychology; 21 Ethical Fallacies in Psychology; and Ethics, Language, & Critical Thinking: Using Words To Deceive.

Mismatched Validity

Some tests are useful in diverse situations, but no test works well for all tasks with all people in all situations. In his classic 1967 article, Gordon Paul helped move us away from the oversimplified search for effective therapies toward a more difficult but meaningful question: "What treatment, by whom, is most effective for this individual with that specific problem, and under which set of circumstances?"

Selecting assessment instruments involves similarly complex questions, such as: "Has research established sufficient reliability and validity (as well as sensitivity, specificity, and other relevant features) for this test, with an individual from this population, for this task (i.e., the purpose of the assessment), in this set of circumstances?" As the population, task, or circumstances change, the measures of validity, reliability, sensitivity, etc., will also tend to change.

To determine whether tests are well-matched to the task, individual, and situation at hand, it is crucial that the psychologist ask a basic question at the outset: Why--exactly--am I conducting this assessment?

Confirmation Bias

Often we tend to seek, recognize, and value information that is consistent with our attitudes, beliefs, and expectations. If we form an initial impression, we may favor findings that support that impression, and discount, ignore, or misconstrue data that don't fit.

This premature cognitive commitment to an initial impression--which can form a strong cognitive set through which we sift all subsequent findings--is similar to the logical fallacy of hasty generalization.

To help protect ourselves against confirmation bias (in which we give preference to information that confirms our expectations), it is useful to search actively for data that disconfirm our expectations, and to try out alternate interpretations of the available data.

Confusing Retrospective & Predictive Accuracy (Switching Conditional Probabilities)

Predictive accuracy begins with the individual's test results and asks: What is the likelihood, expressed as a conditional probability, that a person with these results has condition (or ability, aptitude, quality, etc.) X? Retrospective accuracy begins with the condition (or ability, aptitude, quality) X and asks: What is the likelihood, expressed as a conditional probability, that a person who has X will show these test results? Confusing the "directionality" of the inference (e.g., the likelihood that those who score positive on a hypothetical predictor variable will fall into a specific group versus the likelihood that those in a specific group will score positive on the predictor variable) causes many errors.

This mistake of confusing retrospective with predictive accuracy often resembles the logical fallacy of affirming the consequent:

People with condition X are overwhelmingly likely to have these specific test results.

Person Y has these specific test results.

Therefore: Person Y is overwhelmingly likely to have condition X.
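To see how far apart the two conditional probabilities can be, here is a minimal sketch using Bayes' theorem. All three numbers are hypothetical assumptions chosen only for illustration (they do not come from any real test): 95% of people with condition X show the result, 20% of people without X also show it, and 5% of the population has X. The function name `predictive_accuracy` is likewise just a label for this sketch.

```python
from fractions import Fraction


def predictive_accuracy(p_result_given_x, p_result_given_not_x, base_rate):
    """Return P(X | result) from P(result | X), P(result | not X), and P(X)."""
    with_x = p_result_given_x * base_rate                # has X and shows the result
    without_x = p_result_given_not_x * (1 - base_rate)   # lacks X but shows the result
    return with_x / (with_x + without_x)


# Hypothetical figures, assumed for illustration only.
retrospective = Fraction(95, 100)   # P(result | X): retrospective accuracy
false_positive = Fraction(20, 100)  # P(result | not X)
base_rate = Fraction(5, 100)        # P(X) in the population being assessed

predictive = predictive_accuracy(retrospective, false_positive, base_rate)

print(f"P(result | X) = {float(retrospective):.2f}")  # 0.95
print(f"P(X | result) = {float(predictive):.2f}")     # 0.20
```

Under these assumed figures, people with X are overwhelmingly likely to show the result (95%), yet a person who shows the result has only a 1-in-5 chance of having X.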

Unstandardizing Standardized Tests

Standardized tests gain their power from their standardization. Norms, validity, reliability, specificity, sensitivity, and similar measures emerge from an actuarial base: a well-selected sample of people providing data (through answering questions, performing tasks, etc.) in response to a uniform procedure in (reasonably) uniform conditions. When we change the instructions, or the test items themselves, or the way items are administered or scored, we depart from that standardization and our attempts to draw on the actuarial base become questionable.

There are other ways in which standardization can be defeated. People may show up for an assessment session without adequate reading glasses, or having taken cold medication that affects their alertness, or having experienced a family emergency or loss that leaves them unable to concentrate, or having stayed up all night with a loved one and now barely able to keep their eyes open. The professional conducting the assessment must be alert to these situational factors, how they can threaten the assessment's validity, and how to address them effectively.

Any of us who conduct assessments can fall prey to these same situational factors and, on a given day, be unable to function adequately. We can also fall short through lack of competence. It is important to administer only those tests for which we have adequate education, training, and supervised experience. We may function well in one area -- e.g., counseling psychology, clinical psychology, sport psychology, organizational psychology, school psychology, or forensic psychology -- and falsely assume that our competence transfers easily to the other areas. It is our responsibility to recognize the limits of our competence and to make sure that any assessment is based on adequate competence in the relevant areas of practice, the relevant issues, and the relevant instruments.

In the same way that searching for disconfirming data and alternative explanations can help avoid confirmation bias, it can be helpful to search for conditions, incidents, or factors that may be undermining the validity of the assessment so that they can be taken into account and explicitly addressed in the assessment report.

Ignoring the Effects of Low Base Rates

Ignoring base rates can play a role in many testing problems but very low base rates seem particularly troublesome. Imagine you've been commissioned to develop an assessment procedure that will identify crooked judges so that candidates for judicial appointment can be screened. It's a difficult challenge, in part because only 1 out of 500 judges is (hypothetically speaking) dishonest.

You pull together all the actuarial data you can locate and find that you are able to develop a screening test for crookedness based on a variety of characteristics, personal history, and test results. Your method is 90% accurate: it classifies 90% of crooked candidates as crooked and 90% of honest candidates as honest.

When your method is used to screen the next 5,000 judicial candidates, there might be 10 candidates who are crooked (because about 1 out of 500 is crooked). A 90% accurate screening method will identify 9 of these 10 crooked candidates as crooked and one as honest.

So far, so good. The problem is the 4,990 honest candidates. Because the screening is wrong 10% of the time, and the only way for the screening to be wrong about honest candidates is to identify them as crooked, it will falsely classify 10% of the honest candidates as crooked. Therefore, this screening method will incorrectly classify 499 of these 4,990 honest candidates as crooked.

So out of the 5,000 candidates who were screened, the 90% accurate test has classified 508 of them as crooked (i.e., 9 who actually were crooked and 499 who were honest). Out of every 508 times the screening method indicates crookedness, it tends to be right only 9 times, or less than 2% of the time. And it has falsely branded 499 honest people as crooked.
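The arithmetic in the judge example can be checked with a short sketch. It uses only the figures given above (5,000 candidates, a base rate of 1 in 500, 90% accuracy) and assumes, as the example does, that the method is right 90% of the time for crooked and honest candidates alike. The function name `screen` and the dictionary keys are my own labels for this illustration.

```python
from fractions import Fraction


def screen(candidates, base_rate, accuracy):
    """Tally screening outcomes for a test that is equally accurate for both groups."""
    crooked = candidates * base_rate
    honest = candidates - crooked
    true_positives = crooked * accuracy        # crooked, classified as crooked
    false_positives = honest * (1 - accuracy)  # honest, classified as crooked
    flagged = true_positives + false_positives
    return {
        "crooked": crooked,
        "honest": honest,
        "true_positives": true_positives,
        "false_positives": false_positives,
        "flagged": flagged,
        "chance_flag_is_right": true_positives / flagged,
    }


result = screen(5000, Fraction(1, 500), Fraction(9, 10))

print("Classified as crooked:", result["flagged"])          # 508
print("Actually crooked:", result["true_positives"])        # 9
print("Honest but flagged:", result["false_positives"])     # 499
print(f"Chance a flag is right: {float(result['chance_flag_is_right']):.1%}")  # 1.8%
```

Rerunning `screen` with a higher base rate (say, 1 in 5) shows how much of the problem comes from the rarity of the condition rather than from the test itself.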

Misinterpreting Dual High Base Rates

As part of a disaster response team, you are flown in to work at a community mental health center in a city devastated by a severe earthquake. Taking a quick look at the records the center has compiled, you note that of the 200 people who have come for services since the earthquake, there are 162 who are of a particular religious faith and are diagnosed with PTSD related to the earthquake, and 18 of that faith who came for services unrelated to the earthquake. Of those who are not of that faith, 18 have been diagnosed with PTSD related to the earthquake, and 2 have come for services unrelated to the earthquake.

It seems almost self-evident that there is a strong association between that particular religious faith and developing PTSD related to the earthquake: 81% of the people who came for services were of that religious faith and had developed PTSD. Perhaps this faith makes people vulnerable to PTSD. Or perhaps it is a more subtle association: this faith might make it easier for people with PTSD to seek mental health services.

But the inference of an association is a fallacy: religious faith and the development of PTSD in this community are independent factors. Ninety percent of all people who seek services at this center happen to be of that specific religious faith (i.e., 90% of those who had developed PTSD and 90% who had come for other reasons) and 90% of all people who seek services after the earthquake (i.e., 90% of those with that particular religious faith and 90% of those who are not of that faith) have developed PTSD. The 2 factors appear to be associated because both have high base rates, but they are statistically unrelated.
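One way to check for independence is to lay the four counts out as a 2 x 2 table and compare the PTSD rate across the two groups. The sketch below uses only the counts from the scenario; the variable names are my own, and the odds ratio is simply one common summary of association (a value of 1 means no association).

```python
from fractions import Fraction

# Counts from the scenario above: (group, diagnosis) -> number of people.
counts = {
    ("faith", "ptsd"): 162,
    ("faith", "other"): 18,
    ("no_faith", "ptsd"): 18,
    ("no_faith", "other"): 2,
}

total = sum(counts.values())
faith_total = counts[("faith", "ptsd")] + counts[("faith", "other")]
no_faith_total = counts[("no_faith", "ptsd")] + counts[("no_faith", "other")]
ptsd_total = counts[("faith", "ptsd")] + counts[("no_faith", "ptsd")]

# PTSD rate within each group: identical rates mean the factors are unrelated.
ptsd_rate_faith = Fraction(counts[("faith", "ptsd")], faith_total)
ptsd_rate_no_faith = Fraction(counts[("no_faith", "ptsd")], no_faith_total)

# Count expected in the faith-and-PTSD cell if the two factors were independent.
expected_joint = Fraction(faith_total * ptsd_total, total)

# Odds ratio for the 2 x 2 table.
odds_ratio = Fraction(
    counts[("faith", "ptsd")] * counts[("no_faith", "other")],
    counts[("faith", "other")] * counts[("no_faith", "ptsd")],
)

print(f"PTSD rate, of that faith:     {float(ptsd_rate_faith):.0%}")     # 90%
print(f"PTSD rate, not of that faith: {float(ptsd_rate_no_faith):.0%}")  # 90%
print("Expected faith-and-PTSD count if independent:", expected_joint)   # 162
print("Odds ratio:", odds_ratio)                                         # 1
```

The observed count of 162 is exactly what independence predicts, so the striking 81% figure reflects nothing more than two high base rates multiplied together (0.9 x 0.9).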

Perfect Conditions Fallacy

Especially when we're hurried, we like to assume that "all is well," that in fact "conditions are perfect." If we don't check, we may not discover that the person we're assessing for a job, a custody hearing, a disability claim, a criminal case, asylum status, or a competency hearing took standardized psychological tests and completed other phases of formal assessment under conditions that significantly distorted the results. For example, the person may have forgotten the glasses they need for reading, be suffering from a severe headache or illness, be using a hearing aid that is not functioning well, be taking medication that impairs cognition or perception, have forgotten to take needed psychotropic medication, have experienced a crisis that makes it difficult to concentrate, be in physical pain, or have trouble understanding the language in which the assessment is conducted.

Financial Bias

It is a very human error to assume that we are immune to the effects of financial bias. But a financial conflict of interest can subtly -- and sometimes not so subtly -- affect the ways in which we gather, interpret, and present even the most routine data. This principle is reflected in well-established forensic texts and formal guidelines prohibiting liens and any other form of fee that is contingent on the outcome of a case. The Specialty Guidelines for Forensic Psychologists, for example, state: "Forensic psychologists do not provide professional services to parties to a legal proceeding on the basis of 'contingent fees,' when those services involve the offering of expert testimony to a court or administrative body, or when they call upon the psychologist to make affirmations or representations intended to be relied upon by third parties."

Ignoring Effects of Audio-recording, Video-recording, or the Presence of Third-party Observers

Empirical research has identified ways in which audio-recording, video-recording, or the presence of third parties can affect the responses (e.g., various aspects of cognitive performance) of people during psychological and neuropsychological assessment. Ignoring these potential effects can create an extremely misleading assessment. Part of adequate preparation for an assessment that will involve recording or the presence of third parties is reviewing the relevant research and professional guidelines.

Uncertain Gatekeeping

Psychologists who conduct assessments are gatekeepers of sensitive information that may have profound and lasting effects on the life of the person who was assessed. The gatekeeping responsibilities exist within a complex framework of federal (e.g., HIPAA) and state legislation and case law as well as other relevant regulations, codes, and contexts.

The following scenario illustrates some gatekeeping decisions psychologists may be called upon to make. This passage is taken verbatim from Ethics in Psychotherapy & Counseling, 4th Edition:

A 17-year-old boy comes to your office and asks for a comprehensive psychological evaluation. He has been experiencing some headaches, anxiety, and depression. A high-school dropout, he has been married for a year and has a one-year-old baby, but has left his wife and child and returned to live with his parents. He works full time as an auto mechanic and has insurance that covers the testing procedures. You complete the testing.

During the following year you receive requests for information about the testing from:

The boy's physician, an internist

The boy's parents, who are concerned about his depression

The boy's employer, in connection with a worker's compensation claim filed by the boy

The attorney for the insurance company that is contesting the worker's compensation claim

The attorney for the boy's wife, who is suing for divorce and for custody of the baby

The boy's attorney, who is considering suing you for malpractice because he does not like the results of the tests

Each of the requests asks for the full formal report, the original test data, and copies of each of the tests you administered (for example, instructions and all items for the MMPI-2).

To which of these people are you ethically or legally obligated to supply all information requested, partial information, a summary of the report, or no information at all? Which requests require having the boy's written informed consent before information can be released?

It is unfortunately all too easy, in the crush of a busy schedule or a hurried lapse of attention, to release data to those who are not legally or ethically entitled to it, sometimes with disastrous results. Clarifying these issues while planning an assessment is important because if the psychologist does not clearly understand them, it is impossible to communicate the information effectively as part of the process of informed consent and informed refusal. Information about who will or won't have access to an assessment report may be the key to an individual's decision to give or withhold informed consent for an assessment. It is the psychologist's responsibility to remain aware of the evolving legal, ethical, and practical frameworks that inform gatekeeping decisions.

[© copyright K. S. Pope, 2003, 2010]
