Book report: Thinking, Fast and Slow by Daniel Kahneman

One of the most important books I’ve ever read. It gives insight into how our minds work and why we fall prey to as many errors of judgement as we do.

Kahneman observes that our mental reactions to situations can be categorized into two groups, fast and slow, and that the fast ones are both more influential and more prone to error.

The fast system, which he calls system 1, does things like distinguish between near and far objects, identify the direction of sounds, auto-complete common phrases, make and identify facial expressions, read tone of voice, answer the most familiar math problems (2+2), read easy words, do familiar tasks like riding a bike or driving a car under easy circumstances, and comprehend stereotypes.

The slow system, system 2, does things that require attention, such as: get ready to start in a race, pay attention to particular people in a crowd, search for things meeting a description, recall memories that require searching, do familiar physical tasks under unusual circumstances, keep your foot out of your mouth, count more than a few instances of a thing, compare lists of attributes, do difficult math, or evaluate logical reasoning. System 2 is a very limited resource, which is why it’s difficult to pat your head and rub your belly at the same time.

For my own future reference, I decided to itemize his findings here:

  1. It’s difficult or impossible to do more than one system 2 task at the same time; you can’t do a multi-digit multiplication while dealing with dangerous traffic. This is why you don’t see the guy in the gorilla suit while counting ball passes. (p.23)
  2. System 2 tasks require effort that we prefer to avoid. It’s a noticeable mental effort. Most of the time system 2 will just rubber-stamp suggestions from system 1. (p.24)
  3. Your pupils dilate and your heart rate increases slightly when working on system 2 tasks. The pupil response is body language that can give away your level of interest to someone trying to sell you something. (p.32)
  4. You’re more likely to make bad decisions or put your foot in your mouth while system 2 is occupied. (p.41)
  5. Low blood glucose makes high cognition tasks harder. People are more likely to make hasty decisions when hungry. (p.43)
  6. System 1 performs memory associations and is responsible for the priming effect. (ch.4)
    • IMPORTANT NOTE: the priming research covered in chapter 4 has had significant replication problems since the book was published. The author may have fallen victim to some of the same overconfidence fallacies he documents.
  7. Tasks that cause cognitive strain, such as reading a poorly legible font, make you more accurate but less intuitive and creative. (p.60)
  8. Cognitive ease, or anything that makes our associative machinery run more smoothly, can cause bias by making it easier to reach a conclusion that is primed than a conclusion that is true. (ch.5) This is one of the mechanisms of propaganda: repetition. This is also why using simpler language makes speakers appear more credible to some.
  9. Norms shape a lot of our thinking and are what we draw on to explain behavior we witness when an external explanation isn’t available. For example, norms make it possible to tell a simple story with abstract shapes by moving them in ways that mimic the human body language for different emotions. Norms are also very easily changed by our experiences, and therefore are very likely to be wrong. (ch.6)
  10. Our brains are highly optimized for jumping to conclusions, and it saves a lot of time and effort, but it’s risky when the cost of being wrong is high. (ch.7)
  11. The Halo Effect is a form of jumping to conclusions, in which we are likely to judge someone favorably based on irrelevant favorable impressions of them in other contexts. It’s also why first impressions carry the most weight. There are techniques for avoiding halo effect bias in some circumstances, such as blinding yourself to earlier impressions by anonymizing the data you’re presently evaluating. (p.82-84)
  12. What You See Is All There Is (WYSIATI) fallacy: Coherence-seeking system 1 and lazy system 2 combine to cause judgments based on available evidence, without making much effort to seek additional relevant evidence or even to notice that relevant evidence might be missing. This causes lots of kinds of biased judgement, including overconfidence, framing effects and base-rate neglect. (p.86-88)
  13. System 1 deals well with tasks such as averaging or comparing similar objects, but not with things like summing or judging the relative impact of numbers. This is why statistics so easily mislead. System 1 likes things that can be mapped to intensity in some way. This is also why we judge people by their faces – system 1 maps certain facial features to intensities of important attributes such as dominance and competence. (ch.8)
  14. We often unconsciously answer difficult questions by answering a simpler proxy question and then using an intensity scale to map that answer back to the original question. For example, when asked how much money you would be willing to contribute to help clean oil-soaked seabirds, there is no obvious mapping from oily birds to money so you might instead ask yourself what the emotional impact of the birds’ condition is on you, then use that intensity to judge how much money to give. This effect can be exploited easily by first priming you to have a more intense reaction to the question. (ch.9)
  15. The Law of Small Numbers: Small samples produce remarkable results more often, simply because extreme outcomes are more likely when only a few cases are sampled. This is one way two studies of the same thing can reach contradictory conclusions. Faith in small numbers is what causes belief in winning streaks in sports and gambling. (A quick simulation after this list makes this concrete.) (ch.10)
  16. The Anchoring Effect: Exposure to an intensity can bias your answer to a completely unrelated question. For example, first being asked “Was Gandhi older or younger than 144 when he died?” will cause people to give higher answers to the subsequent question “How old was Gandhi when he died?”, and hearing temperatures mentioned that you recognize as being warm or cool will make it easier to recognize words related to summer or winter, respectively. This is a form of priming and can be used to bias surveys. (ch.11)
  17. Availability (the ease with which examples come to mind, especially recent or vivid ones) biases our decisions. (ch.12)
  18. Our evaluation of risk depends on the choice of measure used when presenting the statistics, including the choice of what the measure is compared to. This is why it’s so easy to be misled by death statistics; they’re rarely presented with enough comparative context, and this is usually done deliberately to produce bias. (ch.13)
  19. Base rate neglect: We perform our own unconscious statistical sampling of other people, and this is called stereotyping. In some circumstances stereotypes can be usefully accurate, but often they are misleading and we’re not good at recognizing when. (ch.14)
  20. Conjunction fallacy: People tend to favor more specific explanations even though they are always less probable than less specific ones. Another form of this is a tendency to favor a smaller uniform set over a larger disparate one (example: a smaller dinnerware set of intact pieces was valued higher than a larger set that contained everything in the smaller one, plus additional pieces, some of them broken). (ch.15)
  21. Statistical results with a causal interpretation have a much stronger impact on our evaluation than pure statistics, even when the statistics suggest the causal interpretation is wrong. In other words, humans are terrible at Bayes’ theorem (see the worked example after this list). Also, we love generalizing from small samples but are unwilling to go the other way and assume general rules apply to specific cases. (ch.16)
  22. Regression to the mean: We mistakenly believe in “streaks” of good or bad luck or performance and assume they will continue. It’s always a safer bet that things will return to the average, but we never bet that way and when we’re wrong we tend to mis-attribute the reason. (ch.17)
  23. Hindsight bias: If a prediction actually comes true, we retroactively assign it a higher probability and tend to erroneously credit personal attributes rather than luck. We tend to say mistakes should have been obvious after the fact, when they were not. (ch.18)
  24. Illusion of validity / skill: Having a poor record of prediction does not shake our faith in our ability to predict. Example: Stock market analysts who consistently predict worse than random chance are still considered to be doing a good job. We reward luck as if it were skill. A coherent story feels better than chance. (ch.20)
  25. We value intuition more than we should. Despite our negative reaction to the idea, a simple algorithm can often match or exceed the decision quality of human experts. (ch.21)
  26. The way to avoid the Planning Fallacy (unrealistically optimistic project estimates) is to use statistics about similar cases: how many of them fail, and how long does success typically take? The fallacy comes from having an insider’s view and not knowing the unknown unknowns. Past data gives you a partial outsider’s view. (ch.23)
  27. Humans rarely reason in terms of expected value (as classical economics assumes); we reason in terms of perceived value (prospect theory). A change in wealth affects our happiness more than the absolute level of wealth does, and the size of the change relative to our reference point matters more than its absolute amount. Context matters, as does priming. (ch.25)
  28. Most people are risk-averse when it comes to gain, but risk-seeking when it comes to avoiding loss (prospect theory, loss aversion). (ch.26)
  29. The Endowment Effect: Owning a good increases its value to you. Your sell price for something you value is typically much higher than the maximum price you would pay to get it. Another way of putting it is that you may not care what you get until you get it. (ch.27)
  30. Decision weights: Probabilities close to zero or close to 100% are weighted very differently from the rest of the range. 5% is felt as a huge improvement over 0% (experimentally, it is weighted like 13%), while 95% is felt as vastly inferior to 100% (experimentally, it is weighted like 79%). (A small sketch of the weighting curve follows this list.) (ch.29)
  31. The Fourfold Pattern (ch.29): high probability produces a certainty effect, low probability produces a possibility effect, and the two play out differently for gains and for losses.
    • High probability of gains produces risk aversion, fear of disappointment, and willingness to accept unfavorable settlements. (Bernoulli’s theory fits this quadrant.)
    • High probability of losses produces risk-seeking, hope of avoiding loss, and rejection of favorable settlements. (This is where people fight losing battles.)
    • Low probability of gains produces risk-seeking, hope of large gains, and rejection of favorable settlements. (Lotteries benefit here.)
    • Low probability of losses produces risk aversion, fear of large loss, and acceptance of unfavorable settlements. (Insurance benefits here.)
  32. We overestimate the probability of unlikely events and over-weight them in our decision making. (ch.30)
  33. Denominator neglect: Larger numbers in the numerator and denominator of a probability can have more effect on our decision than the probability they represent (8/100 is more popular than 1/10 even though 1/10 offers better odds, 10% vs. 8%). (ch.30)
  34. Vivid or well understood events are over-weighted relative to their probability in our decision making. This is why the “poster child” concept works. (ch.30)
  35. We weight the pain of loss more heavily than the pleasure of gain: losing $1 feels about as bad as gaining $2 feels good. Even an economist won’t take that bet once, but will take it many times to collect the expected return of 50 cents per bet (a short simulation after this list spells out the arithmetic). (ch.31) This roughly 2:1 weighting applies in multiple contexts. (ch.32)
  36. The Disposition Effect: We want to close each individual transaction with a gain rather than maximize the overall outcome. A related error from the same mental accounting is the Sunk Cost Fallacy, which mistakenly uses existing investment in a failure to justify further investment in the same. (ch.32)
  37. Exposure to a wider context can affect your decisions even if the extra information is irrelevant. (ch.33)
  38. Framing matters: “Team A won” is very different from “Team B lost”. Costs are not losses. Gain is not the opposite of loss. (ch.34)
  39. We rate the painfulness of an experience by the average of its worst moment (the peak) and its final moment (the end); the duration is largely irrelevant. Our experiencing self is different from our remembering self. (ch.35)
  40. The Focusing Illusion: Nothing is as important as you think it is when you’re thinking about it. Your emotional context at the time greatly affects your answers to questions about feelings. (ch.38)
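
Item 15 is easy to check with a quick simulation (my own sketch, not from the book): flip a fair coin in samples of different sizes and count how often the result looks “remarkable”, here meaning at least 80% heads.

```python
import random

def extreme_rate(sample_size, trials=100_000, threshold=0.8):
    """Fraction of samples in which the share of heads is at least `threshold`."""
    extreme = 0
    for _ in range(trials):
        heads = sum(random.random() < 0.5 for _ in range(sample_size))
        if heads / sample_size >= threshold:
            extreme += 1
    return extreme / trials

for n in (5, 20, 100):
    print(f"n = {n:3d}: {extreme_rate(n):.2%} of samples look remarkable")
```

With five flips, a fair coin produces a “remarkable” 80%-heads result almost a fifth of the time; with a hundred flips it essentially never does. That is how two small studies of the same thing end up contradicting each other.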
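
Item 21’s point about Bayes’ theorem is easier to feel with a worked example. The numbers below are illustrative, in the spirit of the chapter’s cab problem, and the code is my own sketch rather than anything from the book.

```python
# Bayes' rule with a neglected base rate (illustrative numbers):
# 15% of the city's cabs are Blue, 85% are Green, and a witness who
# identifies a cab's color is correct 80% of the time.
p_blue = 0.15                 # base rate of Blue cabs
p_say_blue_if_blue = 0.80     # witness correctly says "Blue"
p_say_blue_if_green = 0.20    # witness mistakenly says "Blue"

p_say_blue = (p_say_blue_if_blue * p_blue
              + p_say_blue_if_green * (1 - p_blue))
p_blue_given_say_blue = p_say_blue_if_blue * p_blue / p_say_blue

print(f"P(cab is Blue | witness says Blue) = {p_blue_given_say_blue:.2f}")
# Prints about 0.41 -- far below the 0.80 most people intuit,
# because intuition ignores the base rate.
```

The causal story (“the witness is reliable”) crowds out the statistic (“most cabs are Green”), which is exactly the pattern the chapter describes.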
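
The 13% and 79% figures in item 30 come from the decision-weight table in chapter 29. The book doesn’t print a formula, but the standard probability-weighting function from Tversky and Kahneman’s cumulative prospect theory reproduces that table closely with a curvature parameter of about 0.61; the sketch below is mine, offered only to show the shape of the curve.

```python
def decision_weight(p, gamma=0.61):
    """Tversky-Kahneman probability weighting function (1992 parameters for gains)."""
    return p**gamma / (p**gamma + (1 - p)**gamma) ** (1 / gamma)

for p in (0.01, 0.05, 0.50, 0.95, 0.99):
    print(f"stated probability {p:.0%} -> felt weight {decision_weight(p):.1%}")
```

The curve is steep near 0% and 100% and flat in the middle: 5% is felt like 13%, 95% like 79%, while 50% barely moves (about 42%).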
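
Item 35’s arithmetic, spelled out: a 50/50 bet that loses $1 or wins $2 has an expected value of 0.5 × $2 − 0.5 × $1 = $0.50 per play, yet most people turn it down when it is offered once. A small simulation (my own, assuming a fair coin) shows why repetition changes the decision.

```python
import random

def play(n_bets):
    """Total outcome of n 50/50 bets that each lose $1 or win $2."""
    return sum(2 if random.random() < 0.5 else -1 for _ in range(n_bets))

random.seed(0)
for n in (1, 10, 1000):
    results = [play(n) for _ in range(10_000)]
    chance_behind = sum(r < 0 for r in results) / len(results)
    average = sum(results) / len(results)
    print(f"{n:5d} bets: average total ${average:,.2f}, "
          f"chance of ending behind {chance_behind:.1%}")
```

A single bet leaves you behind half the time; a thousand of them almost never do, which is why taking the same favorable bet many times is the policy the chapter argues for.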

Summary of the characteristics of system 1 (reproduced from p.105):

  • Generates impressions, feelings, and inclinations. When endorsed by system 2 these become beliefs, attitudes and intentions.
  • Operates automatically and quickly, with little or no effort, and no sense of voluntary control.
  • Can be programmed by system 2 to mobilize attention when a particular pattern is detected (search).
  • Executes skilled responses and intuitions, after being trained.
  • Creates a coherent pattern of activated ideas in associative memory.
  • Links cognitive ease to illusions of truth, pleasant feelings, and reduced vigilance.
  • Distinguishes the surprising from the normal.
  • Infers and invents causes and intentions.
  • Neglects ambiguity and suppresses doubt.
  • Is biased to believe and confirm.
  • Exaggerates emotional consistency (halo effect).
  • Focuses on existing evidence and ignores absent evidence (WYSIATI).
  • Generates a limited set of basic assessments.
  • Represents sets by norms and prototypes, and does not integrate.
  • Matches intensities across scales.
  • Computes more than intended (mental shotgun effect).
  • Sometimes substitutes easier questions for difficult ones.
  • Is more sensitive to changes than to states (prospect theory).
  • Over-weights low probabilities.
  • Shows diminishing sensitivity to quantity.
  • Responds more strongly to losses than to gains (loss aversion).
  • Frames decision problems narrowly and in isolation.