Judicial decisions, and the individual judges behind them, are subject to demands for objectivity and impartiality. Given the workload judges face, the vast diversity of subject matter they are expected to master, and, more fundamentally, the fact that they run on human brains, it comes as no surprise that judgments don’t always meet normative standards of accuracy. Studies so far have focussed mostly on judges’ susceptibility to pursuing biased political goals and to the influence of their emotional dispositions. Another angle from which judicial decision-making can be approached is the heuristics and biases program.
Judicial cognitive psychology
This field of inquiry belongs to “Behavioural Law and Economics”, which represents a new form of forensic psychology. The groundwork for this research was laid by Tversky and Kahneman, whose 1974 paper “Judgment under Uncertainty” continues to influence researchers in a wide range of disciplines. Applying the heuristics and biases program, Swiss legal scholar Mark Schweizer published his PhD thesis in 2005. Among other things, it examines important biases to which judges are prone. Although the issue is of greater relevance in common law systems than in civil law countries, owing to the differing scope of judicial decisional power, research of this kind is of fundamental practical significance. (In civil law countries, the bias analysis applies to the legislator to a greater extent.)
Rational analysis is directed at the calculable mistakes typical of the human beings (or brains) that judges are. The heuristics, or rules of thumb, that judges draw on serve to reduce complexity efficiently and act as guides to normatively accurate results. The problem is that these heuristics often come with fallacies and can therefore deviate from desirable rational decision-making. What matters about these fallacies is that they occur systematically. On the positive side, this makes them predictable and thus opens up the possibility of correction.
The first part of Schweizer’s thesis examines the emergence of Behavioural Law and Economics and its interplay and correlation with legal realism (the doctrine that legal decision-making is not exclusively determined by legal rules), as well as the classic economic analysis of law and the normative framework of expected utility theory. In the second part, Schweizer analyses the effects of cognitive biases on judicial decision-making in criminal and civil law proceedings. For this purpose, he conducted a survey of 415 Swiss judges from seven different cantons. He examines the effects of ten biases, which are presented briefly in what follows.
10 biases in court and law
First, anchoring bias occurs, for example, when the first proposals for a settlement act as anchors for the final judgment and dominate its direction. Second, the framing effect and loss aversion explain why parties to a dispute take less risk when “gains” are at stake and more risk when “losses” are at stake. Schweizer uses prospect theory to examine judges’ tendency to fall prey to framing effects. The third bias under investigation is the omission bias, i.e. our tendency to judge harmful or risky actions more harshly than harmful or risky omissions, even if the consequences are identical. It explains why people would rather not act than act when consequences are uncertain, although not acting may be no better in terms of the expected consequences.
Fourth, probabilistic thinking in terms of Bayes’ Theorem is treated and our incompetence with probabilities exposed, especially in the weighing of evidence. The Amanda Knox case (not treated in Schweizer’s dissertation) may be a case in point: On LessWrong, it has been claimed that a rational/Bayesian hour on the internet can beat a whole year of probability incompetence in court. This is shocking, if indeed true. But if we take seriously the hypothesis that human judgment and decision-making are deeply and dangerously biased in many ways, we shouldn’t be surprised. Similarly fatal errors should be expected to be pervasive in all domains of individual and collective human action. It’s not easy to grasp the extent and significance of what this would mean. In any case, de-biasing interventions should likely be a societal priority.
The fifth bias presented is confirmation bias. It posits that information confirming a hypothesis is likelier to be searched for (motivated cognition), acknowledged, weighted heavily and remembered. Thus, we interpret information in a biased way. Sixth, neglect of regression towards the mean is elucidated, where coincidental factors are erroneously used to explain incidents.
Hindsight bias is the seventh bias under scrutiny. It reveals that most judges, in retrospect, overestimate how predictable an incident was. A possible fix could involve judging cases, i.e. harmful/criminal intentions and actions, without knowing about their actual outcomes, which partly result from unexpected and thus irrelevant chance events. Eighth, the halo effect is analysed, i.e. judges’ tendency to wrongly assume connections between independent and unrelated characteristics of a person. Ninth, the contrast effect is presented: A strict law gains more acceptance than it would (and should) if the contrasting possibility of an even stricter law is mentioned, and vice versa. Finally, overconfidence bias explains why parties in court systematically overestimate their chance of success.
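The Bayesian reasoning mentioned above in connection with the weighing of evidence can be made concrete with a small sketch. The function name, the variable names and all numbers below are assumptions invented for illustration; they are not taken from Schweizer’s thesis or from any real case. The snippet only shows how a prior belief in a hypothesis is updated by one piece of evidence.

```python
def posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Return P(H | E) for a binary hypothesis H using Bayes' theorem."""
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must be a probability between 0 and 1")
    # P(E and H)
    numerator = p_e_given_h * prior
    # P(E) = P(E | H) P(H) + P(E | not H) P(not H)
    evidence = numerator + p_e_given_not_h * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("the evidence has zero probability under both hypotheses")
    return numerator / evidence


# Hypothetical numbers, chosen purely for illustration:
prior_guilt = 0.01           # belief in guilt before seeing the evidence
p_match_if_guilty = 0.90     # chance of observing the evidence if guilty
p_match_if_innocent = 0.05   # chance of observing the same evidence if innocent

updated = posterior(prior_guilt, p_match_if_guilty, p_match_if_innocent)
print(f"Posterior probability: {updated:.3f}")  # prints "Posterior probability: 0.154"
```

With these made-up inputs, evidence that looks strong on its face still leaves the posterior at roughly 15%, because the prior is low. This is the kind of result that unaided intuition tends to get wrong.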
In his post about biases in judicial decision-making, Jesse Galef describes a further bias. Let’s call it hunger negativity/unreliability (which likely generalises to stress negativity/unreliability):
“On the surface all we need to do is experience the world and figure out what does and doesn’t work at achieving goals (the focus of instrumental rationality). That’s why we tend to respect expert opinion: they have a lot more experience on an issue and have considered/evaluated different approaches.
Let’s take the example of deciding whether or not to grant prisoners parole. If the goal is to reduce repeat offenses, we tend to trust a panel of expert judges who evaluate the case and use their subjective opinion. They’ll do a good job, or at least as good a job as anyone else, right? Well… that’s the problem: everyone does a pretty bad job. Quite frankly, even experts’ decision-making is influenced by factors that are unrelated to the matter at hand. Ed Yong calls attention to a fascinating study which finds that a prisoner’s chance of being granted parole is strongly influenced by when their case is heard in relation to the judges’ snack breaks: (…) the odds that prisoners will be successfully paroled start off fairly high [in the morning] at around 65% and quickly plummet to nothing over a few hours. After the judges have returned from their breaks, the odds abruptly climb back up to 65%, before resuming their downward slide. A prisoner’s fate could hinge upon the point in the day when their case is heard.”
Replacing meat-brain intuition with computer calculation?
One wonders: Why isn’t this big news? Are we too biased to see the potentially huge relevance of our being biased?
And don’t we have a legitimate claim to bias-less-ness in the law, concerning the judiciary as well as the legislature?
Maybe we should start trusting human intuition (including that of experts) much less, and computers following Statistical Prediction Rules (SPRs) much more? Jesse Galef goes on to explain:
“Fortunately, we have science and statistics to help. We can objectively record evidential cues, look at the resulting target property, and find correlations. Over time, we can build an objective model, meat-brain limitations out of the way.
In “Epistemology and the Psychology of Human Judgment”, Bishop and Trout argued that we should use such SPRs far more often than we do. Not only are they faster, it turns out they’re more trustworthy: Using the same amount of information (or often less) a simple mathematical model consistently out-performs expert opinion.
They point out that when Grove and Meehl did a survey of 136 different studies comparing an SPR to the expert opinion, they found that “64 clearly favored the SPR, 64 showed approximately equivalent accuracy, and 8 clearly favored the clinician.” The target properties the studies were predicting varied from medical diagnoses to academic performance to – yup – parole violation and violence.
So based on some cues, an SPR would probably give a better prediction than the judges on whether a prisoner will break parole or commit a crime. And they’d do it very quickly – just by putting the numbers into an equation! So all we need to do is show the judges the SPRs and they’ll save time and do a better job, right? Well, not so much.”
It turns out that due to shockingly strong biases such as overconfidence, experts do worse than SPRs even when they are presented with the SPR results! They are unable to reliably judge when to go with their intuition over the SPR result – and when not.
The SPR finding generalizes to many areas. Luke Muehlhauser describes them:
“A parole board considers the release of a prisoner: Will he be violent again? A hiring officer considers a job candidate: Will she be a valuable asset to the company? A young couple considers marriage: Will they have a happy marriage? (…)
- Howard and Dawes (1976) found they can reliably predict marital happiness with one of the simplest SPRs ever conceived, using only two cues: P = [rate of lovemaking] – [rate of fighting]. The reliability of this SPR was confirmed by Edwards & Edwards (1977) and by Thornton (1979).
- Unstructured interviews reliably degrade the decisions of gatekeepers (e.g. hiring and admissions officers, parole boards, etc.). Gatekeepers (and SPRs) make better decisions on the basis of dossiers alone than on the basis of dossiers and unstructured interviews. (Bloom and Brundage 1947; DeVaul et al. 1957; Oskamp 1965; Milstein et al. 1981; Hunter & Hunter 1984; Wiesner & Cronshaw 1988). If you’re hiring, you’re probably better off not doing interviews.
- Wittman (1941) constructed an SPR that predicted the success of electroshock therapy for patients more reliably than the medical or psychological staff.
- Carroll et al. (1988) found an SPR that predicts criminal recidivism better than expert criminologists.
- An SPR constructed by Goldberg (1968) did a better job of diagnosing patients as neurotic or psychotic than did trained clinical psychologists.
- SPRs regularly predict academic performance better than admissions officers, whether for medical schools (DeVaul et al. 1957), law schools (Swets, Dawes and Monahan 2000), or graduate school in psychology (Dawes 1971).
- SPRs predict loan and credit risk better than bank officers (Stillwell et al. 1983).
- SPRs predict newborns at risk for Sudden Infant Death Syndrome better than human experts do (Lowry 1975; Carpenter et al. 1977; Golding et al. 1985).
- SPRs are better at predicting who is prone to violence than are forensic psychologists (Faust & Ziskin 1988).
Robyn Dawes (2002) drew out the normative implications of such studies:
If a well-validated SPR that is superior to professional judgment exists in a relevant decision making context, professionals should use it, totally absenting themselves from the prediction.
Sometimes, being rational is easy. When there exists a reliable statistical prediction rule for the problem you’re considering, you need not waste your brain power trying to make a careful judgment. Just take an outside view and use the damn SPR.”
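To show how little machinery such a rule needs, here is a minimal sketch of a linear SPR as a weighted sum of cues, with the two-cue marital happiness rule quoted above (P = [rate of lovemaking] – [rate of fighting]) as one instance. The function and variable names, the example rates, and the reading of a positive score as a prediction of happiness are assumptions of this sketch, not details taken from the cited studies.

```python
from typing import Mapping


def spr_score(cues: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted sum of cue values: the basic form of a linear prediction rule."""
    return sum(weight * cues[name] for name, weight in weights.items())


# The two-cue rule from the text, expressed as weights of +1 and -1.
# The cue names are hypothetical labels chosen for this sketch.
MARITAL_WEIGHTS = {"lovemaking_rate": 1.0, "fighting_rate": -1.0}


def marital_score(lovemaking_rate: float, fighting_rate: float) -> float:
    """P = [rate of lovemaking] - [rate of fighting]."""
    cues = {"lovemaking_rate": lovemaking_rate, "fighting_rate": fighting_rate}
    return spr_score(cues, MARITAL_WEIGHTS)


# Invented example rates (events per week), for illustration only.
couple_a = marital_score(lovemaking_rate=3.0, fighting_rate=1.0)
couple_b = marital_score(lovemaking_rate=1.0, fighting_rate=4.0)

# Assumption of this sketch: a positive score is read as predicting happiness.
for label, score in (("A", couple_a), ("B", couple_b)):
    outlook = "happy" if score > 0 else "unhappy"
    print(f"Couple {label}: P = {score:+.1f} -> predicted {outlook}")
```

The same `spr_score` helper would serve any other rule of this form once cues and weights have been validated on data. The sketch says nothing about how such weights are obtained or how well a given rule performs.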