Science and replicability

The basic claim made in published science is that something about the nature of the universe has been uncovered. That makes it distressing when other researchers attempting to isolate the same phenomenon are unable to do so.

The failure is even more sobering for social ‘scientists’ with aspirations of matching the rigour of their peers in the ‘pure’ or ‘natural’ sciences. If different groups of scientists using true double-blind controlled experiments can’t reach compatible conclusions about the world, what hope is there for people trying to deduce causality from historical data?

7 thoughts on “Science and replicability”

  1. As all of you know, of course, questions have been raised about the robustness of priming results. The storm of doubts is fed by several sources, including the recent exposure of fraudulent researchers, general concerns with replicability that affect many disciplines, multiple reported failures to replicate salient results in the priming literature, and the growing belief in the existence of a pervasive file drawer problem that undermines two methodological pillars of your field: the preference for conceptual over literal replication and the use of meta-analysis. Objective observers will point out that the problem could well be more severe in your field than in other branches of experimental psychology, because every priming study involves the invention of a new experimental situation.

  2. This is not a problem with science so much as a problem with incentives for scientific publishing. New and startling results get written up, placed in better journals, get more attention, and are more likely to make you famous. Results confirming the status quo get little attention and will not make you famous, so often you wouldn’t bother to write them up. There is a strong incentive to publish everything surprising, and it gets lots of media attention, but very often it is not true – small samples throw up lots of false positives by chance (the simulation at the end of this comment illustrates the point). See “Why Most Published Research Findings are False” http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182327/ (the article mentions it but weirdly doesn’t link)

    In the case you link to, the original researcher’s claim that she is due a rebuttal is absurd – she’s just taking her research far too personally, and also seeking to undermine the independence of a separate research team in a manner that jeopardises scientific objectivity (such as it is). The real solution lies in creating more balanced incentive structures where we reward solidly replicated scientific findings and not surprising chance findings that turn out to be wrong. Here, social science is in a better position, I think, because discussions of methodology are very robust and there’s far more scrutiny of starting claims than surprising ones. A comparative politics analysis that used a small subset of countries to make a startlingly controversial argument (e.g. if we look at Iran, North Korea, Lesotho, and Guam we can see that improving education access lowers economic growth!) wouldn’t be published and the academic would be laughed out of the field. In science, one headline grabbing weird finding with a small sample might make your career. So, different systems & incentives, different risks. I have never heard anyone suggest that in politics the majority of published findings are false, & the reasons that published findings in science are false generally don’t apply to our field.
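
    To make the small-sample point concrete, here is a rough simulation (all numbers are invented purely for illustration: 1,000 studies of an effect that does not exist, 20 subjects per group, and the usual p < 0.05 threshold):

    ```python
    # Illustrative sketch: how often do studies of a NON-EXISTENT effect
    # reach "statistical significance" purely by chance?
    # All numbers here are assumptions chosen for illustration.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_studies, n_per_group, alpha = 1000, 20, 0.05

    false_positives = 0
    for _ in range(n_studies):
        # Both groups come from the same distribution: the true effect is zero.
        control = rng.normal(0, 1, n_per_group)
        treatment = rng.normal(0, 1, n_per_group)
        _, p = stats.ttest_ind(control, treatment)
        if p < alpha:
            false_positives += 1

    print(f"{false_positives} of {n_studies} null studies were 'significant'")
    # Expect roughly 50, i.e. about 5%.
    ```

    Around one in twenty of these null studies clears the bar by chance alone; if those are disproportionately the ones that get written up and publicised, the published record skews towards false positives.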

  3. Sorry, a typo above: I meant “far more scrutiny of startling claims than conventional ones”.

  4. Sarah,

    I agree that the incentives of publication and publicity favour surprising claims, and that this is likely to explain why a good number of false results get a lot of attention.

    There are a few articles that highlight the somewhat limited extent of what political science has been able to robustly demonstrate about the world.

    One that comes to mind is Hans Noel’s “Ten Things Political Scientists Know that You Don’t”, which is more about the limits of knowledge than about its extent. I can recall seeing some others previously, but can’t remember the details right now. If I come across them again, I will add another comment.

    I am also presently reading Green and Shapiro’s Pathologies of Rational Choice Theory, which centres on the claim that the main problem with rational choice theory is not the realism of its assumptions about decision-making but the absence of successful efforts to demonstrate its predictive power empirically.

    In terms of social versus pure/natural sciences, the biggest handicap for the former may be the general inability to run double-blind controlled experiments, which means questions about the direction of causality can rarely be definitively answered. We can usually say with rigour only that phenomena A and B tend to appear together, not whether one causes the other or whether both are caused by some third factor (the sketch below illustrates how a hidden third factor can produce exactly this pattern).
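
    As a sketch of that last point (the variables and coefficients are invented purely for illustration), here is a case where A and B look strongly related even though neither has any effect on the other; both are driven by an unobserved third factor C:

    ```python
    # Illustrative only: A and B are each driven by a hidden factor C plus noise.
    # Neither causes the other, yet they appear strongly associated in the data.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 10_000

    c = rng.normal(0, 1, n)            # unobserved "third factor"
    a = 2.0 * c + rng.normal(0, 1, n)  # A depends on C, not on B
    b = 1.5 * c + rng.normal(0, 1, n)  # B depends on C, not on A

    corr = np.corrcoef(a, b)[0, 1]
    print(f"correlation between A and B: {corr:.2f}")
    # Prints a correlation of roughly 0.8.  Observational data alone cannot
    # tell us whether A causes B, B causes A, or (as here) C drives both.
    ```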

  5. On a somewhat related note, I found this lecture about the historical Jesus interesting. Again, it highlights the limits of knowledge and certainty. Even using the somewhat dubious methodology of treating separate biblical texts as independent sources, only a few facts can be robustly established as likely to be true.

  6. Obsession with p-values as the test of the validity of results is also part of the problem. Even when no real effect exists, a predictable fraction of studies will produce publishable p-values by accident, and the odds are raised further because researchers tweak their statistical tests until they find something that works (the sketch at the end of this comment illustrates how much this inflates the false-positive rate).

    The p-value was never meant to be an all-or-nothing signifier of the strength of a result, but it has become one in the practice of many disciplines.
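
    A rough simulation of that tweaking problem (the assumption that each study gets ten tries is invented for illustration): if a researcher can test ten noise-only outcomes and report whichever gives the smallest p-value, far more than 5% of null studies clear the threshold.

    ```python
    # Illustrative simulation of a simple form of p-hacking: for each null study,
    # the researcher tries several outcome measures and reports the best p-value.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    n_studies, n_per_group, n_outcomes, alpha = 1000, 20, 10, 0.05

    significant = 0
    for _ in range(n_studies):
        p_values = []
        for _ in range(n_outcomes):
            # Every outcome is pure noise; the true effect is always zero.
            control = rng.normal(0, 1, n_per_group)
            treatment = rng.normal(0, 1, n_per_group)
            p_values.append(stats.ttest_ind(control, treatment).pvalue)
        if min(p_values) < alpha:    # report whichever test "worked"
            significant += 1

    print(f"{significant} of {n_studies} null studies yielded a publishable result")
    # With one test per study, about 5% would clear the bar; with ten tries each,
    # roughly 40% do -- the nominal p < 0.05 no longer means what it claims.
    ```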
