
Tips for Better Thinking: Shooting First, Drawing the Bullseye Later

The easiest way to become a skilled shooter is to draw the bullseye after firing. Our brain can do the same thing.

As we sift through more information than ever before, I would argue it's becoming easier to commit an error in thinking known as the Texas sharpshooter fallacy. Imagine the stereotype of a Texan cowboy who randomly shoots up the side of a barn, goes up to it and traces a bullseye around the tightest cluster of bullet holes. It's easy to declare yourself Texas' sharpest shooter when you draw the bullseye after the fact.

This fallacious reasoning can be seen in communities where people start obsessing over numbers. If we hear that "11:11" is a number that carries special significance, we may begin to obsess over it and to notice it everywhere: it's on our alarm clocks, our watches, on the calendar once a year, in error numbers when a particular software crashes, in someone's Twitter handle, on the news. We have not hit the bullseye; rather, there are numbers all around us and we have decided to trace a bullseye around the instances we see of "11:11" and not around the multiple examples of "23:23" or "4:56". We have failed to show that "11:11" is more special than any other number combination.

Scientific research, unfortunately, does not escape the Texas sharpshooter fallacy. The fact that many scientists now have to parse through massive data sets makes this fallacy all the more tempting. When I was a graduate student, we tested a few blood samples from cancer patients to measure their levels of over 200 tiny molecules known as microRNAs. Was each one of these 200 molecules present in the expected amount, or was it overexpressed or underexpressed? Sure enough, we found a small number that were present in much larger or much smaller amounts than anticipated. We could have stopped there and called it a signature for this particular type of cancer. Want to know if your patient has this cancer? Just test the expression of these specific microRNAs in their blood and if you find our signature, they have this cancer!

But that would have been the equivalent of drawing a bullseye around our findings, which could have been (and, as it turns out, were) spurious. The more things you look for, the more likely you are to find something that makes your detector go "ding-ding-ding." So you need to validate your preliminary results, which were gathered without knowing what you were looking for, in a different set of samples. You need to put your new hypothesis through the wringer, not glorify it with an unearned crown. Failing to do so is to commit the Texas sharpshooter fallacy which, in scientific research, has a special name: HARKing, or Hypothesizing After the Results are Known.
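The multiple-looks trap is easy to demonstrate with a simulation. The sketch below (all numbers are illustrative assumptions, not the study's actual data) measures 200 pure-noise "molecules" in two groups and counts how many look "significant" at the conventional 0.05 threshold, even though, by construction, none of them truly differ:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_molecules = 200   # assumed count, mirroring the ~200 microRNAs in the text
n_per_group = 20    # assumed sample size per group

# Pure noise: no molecule genuinely differs between "patients" and "controls".
patients = rng.normal(size=(n_molecules, n_per_group))
controls = rng.normal(size=(n_molecules, n_per_group))

# One t-test per molecule.
p_values = np.array([
    stats.ttest_ind(patients[i], controls[i]).pvalue
    for i in range(n_molecules)
])

hits = int((p_values < 0.05).sum())
print(f"{hits} of {n_molecules} noise molecules look 'significant' by chance")
```

With 200 independent tests at a 5% threshold, roughly ten false alarms are expected on average; tracing a bullseye around them is exactly the sharpshooter's move.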

In the world outside of academic research, this erroneous logic can manufacture fears that never really go away. A massive study coming out of Sweden in 1992 seemed to show beyond the shadow of a doubt that living near high-voltage power lines significantly increased the odds of developing childhood leukemia, a type of blood cancer. But closer scrutiny revealed that its authors had not simply looked at childhood leukemia risk: they had measured 800 risk ratios. Of those 800, childhood leukemia happened to be the outcome that was elevated among people living close to these power lines compared to people living further away, so the authors drew their bullseye around this particular disease and declared victory. Sweden, which had considered making policy changes based on this presumed link between power lines and cancer, eventually decided not to due to lack of good evidence.

The bottom line is that it's OK to build a hypothesis from a data set, but you cannot also draw your conclusion from that same data. You need to validate your hypothesis in a new data set. Otherwise, you will start discovering associations that are simply not true.
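This discovery-then-validation workflow can itself be sketched in a few lines (sample sizes, the 0.05 threshold, and pure-noise data are all illustrative assumptions): flag candidates in one data set, then see how many survive re-testing in a fresh, independent one.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def noise_p_values(n_molecules=200, n_per_group=20):
    """t-test p-values comparing two pure-noise groups, one test per molecule."""
    a = rng.normal(size=(n_molecules, n_per_group))
    b = rng.normal(size=(n_molecules, n_per_group))
    return np.array([stats.ttest_ind(a[i], b[i]).pvalue
                     for i in range(n_molecules)])

# Step 1: discovery data set — build the hypothesis here, and ONLY here.
candidates = np.flatnonzero(noise_p_values() < 0.05)

# Step 2: independent validation data set — test the hypothesis here.
replicated = np.flatnonzero(noise_p_values() < 0.05)
confirmed = np.intersect1d(candidates, replicated)

print(f"{len(candidates)} flagged in discovery; "
      f"{len(confirmed)} survive validation")
```

Because the data are noise, the candidates flagged in the first data set almost never replicate in the second; a real signal, by contrast, would keep ringing the bell in fresh data.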
