We recently completed the first of what will be a running series of online rationality experiments. We’ll be publishing a short report on each experiment, regardless of whether our results were significant or exciting. In part we’re hoping to give you a look inside our process at CFAR, to see the kinds of rationality training techniques we’re considering and how we go about testing our hypotheses.

Our other main goal is to avoid the problem of publication bias. If only significant results get published, while other studies languish unpublished in file-drawers, the public never knows to what degree the significance of the published results represents the discovery of real phenomena in the world, and to what degree it simply represents the fact that if you test enough hypotheses, some will look significant just by chance.
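The arithmetic behind that last point is easy to check. As a minimal sketch (not from the report; it assumes independent tests at the conventional p < 0.05 threshold), here is how quickly spurious "significant" results accumulate as more true-null hypotheses are tested:

```python
# If each true-null test has a 5% chance of looking significant,
# the chance that at least one of k independent tests does is 1 - 0.95^k.
for k in (1, 5, 20, 100):
    p_any = 1 - 0.95 ** k
    print(f"{k:3d} tests -> {p_any:.0%} chance of at least one spurious result")
```

With just 20 null tests, the chance of at least one spurious "significant" result is already about 64%, which is why unpublished null results matter.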

Personally, I did find the results of our first experiment somewhat interesting, despite the fact that the main hypothesis we were testing was not supported. The full 4-page report, which details our background reasoning, method, results, and discussion of those results, is here:

CFAR Rationality Experiment #1: Surprise as a Cue to Probability

Here’s a summary:

We hypothesized that prompting people to consider their feelings of surprise about a hypothetical outcome might improve their accuracy in predicting the probability of that outcome. This is a technique we sometimes use at CFAR to combat overconfidence in planning: we might feel confident that we’ll have finished some particular project by Thursday, but when we ask ourselves: “Imagine Thursday night rolls around and the project’s not done yet. How surprised would you be?” we often realize the answer is, “Not very.” Which means we should assign a lower probability to that outcome than our initial, overconfident guess.

We asked subjects (n=101) to make predictions about the demographics, values, and lifestyles of Americans today. (Data for the questions came from Pew.) Subjects were randomly assigned to one of three groups:
1. A control group was simply asked to estimate the probability of each of the outcomes; for example, “What do you think is the probability an 18-29 year old American who self-identifies as conservative has a tattoo?”
2. The first intervention group was given a one-way surprise question (“Imagine meeting an 18-29 year old American who self-identifies as conservative. How surprised would you be to learn (s)he has a tattoo?”) and then asked the same probability question as the control group.
3. The second intervention group was given a two-way surprise question (“Imagine meeting an 18-29 year old American who self-identifies as conservative. How surprised would you be to learn (s)he has a tattoo? How surprised would you be to learn (s)he does NOT have a tattoo?”), and then asked the same probability question as the control group.

We were interested in whether either intervention group would make more accurate probability estimates than the control group. However, there was no significant difference in accuracy between the groups (although the one-way surprise group was slightly less accurate). This suggests that the surprise technique, at least in its current form, is not universally useful – although it’s still plausible that it could prove effective with (1) more instruction on how to perform it, and/or (2) a different type of prediction question, such as one more similar to planning. We’ll be investigating those possibilities in future experiments.
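As an illustration of what “accuracy” can mean here (the metric and the numbers below are our assumptions for the sketch, not necessarily what the report used), probability estimates can be scored by their mean absolute error against the known true proportions from the survey data:

```python
def mean_abs_error(estimates, true_props):
    """Average absolute gap between a subject's probability
    estimates and the true proportions (lower = more accurate)."""
    pairs = list(zip(estimates, true_props))
    return sum(abs(est - true) for est, true in pairs) / len(pairs)

# Hypothetical subject's estimates vs. made-up "true" proportions:
print(mean_abs_error([0.30, 0.60, 0.10], [0.40, 0.50, 0.25]))
```

Comparing the average of this score across the three groups is one simple way to ask whether an intervention improved calibration.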

But one statistically significant pattern did emerge: the one-way surprise group gave significantly higher probability estimates than the other two groups. (In fact, its average estimate was the highest on all 13 questions.) Our interpretation: this is some evidence that the act of imagining an outcome makes it seem more probable, which means the one-way surprise technique could be introducing a new bias and should be used with caution.
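A quick sanity check on why “highest on all 13 questions” is unlikely to be chance. Assuming, as a rough null model (our assumption here, not the report’s actual analysis), a 50/50 chance per question that one group’s average exceeds a given other group’s:

```python
# Under a 50/50-per-question null, the chance one group's average
# beats another group's on all 13 questions is 0.5^13.
p_all_13 = 0.5 ** 13
print(p_all_13)  # 1/8192, about 0.00012
```

This ignores correlations between questions, so it is only a back-of-envelope figure, but it shows why a 13-for-13 pattern stands out.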

(Download the full report here.)