If you have paid any attention to the reproducibility crisis in recent years, you have probably become familiar with the term ‘P-Hacking’. You have probably also come to understand P-Hacking as a form of data fishing in which researchers search for any sort of “statistically significant” correlations in a data set.
P-Hacking and Donald Trump
A June 2016 research paper presented at last week’s International Conference on Machine Learning claimed that negative priming of the word ‘Trump’ by associations with Donald Trump has left American bridge players “subtly deranged by the prospect of Trump and play their hands worse.”
The tongue-in-cheek study found that a 2015 Vanderbilt bridge tournament had a higher percentage of No Trump hands made (P-value 0.0492) than a similar 2015 Dutch tournament or an earlier Vanderbilt tournament. The authors add: “it would be natural to attribute the findings of this paper to a direct response to the candidacy of Donald Trump; however we must consider other pathways as well, as it is well understood from social psychology that seemingly trivial inputs such as football games, and subliminal images, and shark attacks can be more important than actual policy positions when affecting political attitudes.” I really hope that got at least a chuckle out of you.
Honestly, this is the funniest paper I have ever read.
Dr. Andrew Gelman, one of the authors of the paper, notes that once they had written the paper, he thought it was so funny that they “went to some effort to publicize it, so that more people could share in the fun.” They had no hopes of accomplishing anything more than making people laugh.
The sample size was small, data points were ignored, the significance was overstated, the correlation between the variables was tenuous at best, but that was precisely the point. A P-value is in no way a testament to the quality of an experiment. Manipulating a data set such that an absurd statistical correlation could be found is not a means of attaining significance, but a gross perversion of the scientific method.
P-values have traditionally allowed us to suspend our doubt—well, as long as they are lower than 0.05 (or some other threshold). They have now, however, turned into yet another source of doubt and confusion. Most people by now understand that correlation does not imply causation, but what about when “correlation” doesn’t even imply correlation?
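The mechanics behind this are easy to demonstrate. Below is a minimal simulation sketch (not taken from the paper; it uses a simple z-approximation to the two-sample test and made-up parameters) showing that if you test enough hypotheses on pure noise and keep only the best result, a "significant" p-value below 0.05 is almost guaranteed:

```python
import math
import random

def two_sample_p(a, b):
    """Approximate two-sided p-value for a difference in means.
    Uses a z-approximation, which is reasonable for samples this size."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    z = (ma - mb) / math.sqrt(va / na + vb / nb)
    # Normal CDF via the error function: Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

random.seed(1)
STUDIES, TESTS, N = 500, 20, 100  # arbitrary illustrative parameters

honest_hits = 0   # one pre-registered test per study
hacked_hits = 0   # report the best of 20 tests per study
for _ in range(STUDIES):
    pvals = []
    for _ in range(TESTS):
        # Both groups drawn from the SAME distribution: there is no real effect.
        a = [random.gauss(0, 1) for _ in range(N)]
        b = [random.gauss(0, 1) for _ in range(N)]
        pvals.append(two_sample_p(a, b))
    honest_hits += pvals[0] < 0.05
    hacked_hits += min(pvals) < 0.05

print(f"honest false-positive rate: {honest_hits / STUDIES:.2f}")
print(f"p-hacked 'discovery' rate:  {hacked_hits / STUDIES:.2f}")
```

The honest rate hovers near the nominal 5%, while cherry-picking the smallest of 20 p-values "discovers" an effect in the majority of studies (in expectation, 1 − 0.95²⁰ ≈ 64% of them), even though every effect is pure noise.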
The paper ends with perhaps its most striking claim: “disbelief is not an option.” The truth is, it generally isn’t. Most papers do not manipulate and ignore data as egregiously as this one. Most papers do not make outlandish claims about the party affiliation of bridge players (they are apparently Democrats in case you were wondering) like this one does. Bad science, unfortunately, is generally not so obviously bad.
(Would you have been able to catch the fraudulence that Derek Lowe outlines in this post?)
The Attraction to Correlations
We are naturally attracted to correlations, because they suggest a certain interconnectedness that many people find satisfying or reassuring. Wouldn’t it be incredible if it were actually demonstrated that Trump’s candidacy led bridge players to completely change their strategies? Disbelief is not an option, because we want to believe so badly.
Like most problems of fraud (or ignorance), there are no easy solutions. Uri Simonsohn, a social psychologist at the University of Pennsylvania and one of the early popularizers of the term “P-Hacking,” proposes three possible solutions to reduce fraud in science:
- “Retract without asking ‘are the data fake’”
Retraction should not necessitate a confirmation that the data was fraudulent, but rather just that the reviewers no longer have confidence in the results.
- “Show receipts”
Researchers should provide a detailed account of how their data was acquired.
- “Post data, material and code”
It is impossible to adequately scrutinize a data analysis without access to the initial data and the code used to transform that data into usable information.
Though these recommendations seem modest (and they are, compared to some of the more systemic solutions proposed), they will certainly impact the way in which science is done.
Extensive Supplementary Information?
A nice parallel for this change can be found in a recent paper published by the Baran Group out of the Scripps Research Institute. In addition to their paper detailing a new C–C bond-forming reaction, the Baran Group released a 180-page supplementary file detailing the specifics behind the reaction. For those of you who do not frequently read through academic chemistry papers, such extensive supplementary information is rarely found in publications. Though the Baran Group was under no obligation to do so, it appears that they hoped to describe every facet of their new reaction clearly, so that it could be reproduced in other labs around the world.
Surely it took the Baran Group a great deal of time to prepare the 180-page document. It will similarly take time for other research groups to detail every facet of their research and analysis in their papers. Publication will be harder, and it will take much longer.
Research groups, however, would greatly benefit from the perspective of the Baran Group. The purpose of scientific publications is not merely to show off your work—it is also to share novel results and methodologies with researchers around the world. More work will be necessary, but it will help prevent fraud and negligence that could significantly derail a scientific field for years.
See previous blogs in this series:
- Reproducibility Crisis, Part 3: Scientific Skepticism
- Reproducibility Crisis, Part 2: Scientific Realism and Antirealism
- Reproducibility Crisis – Why We Need More Dark Reactions Projects