
Peer Review
Martin Reinhart © October 2006

Peer review is a method for assessing research work. It is mainly used on manuscripts submitted for publication and on research proposals submitted for funding. Peer reviews are also often used to evaluate individual scientists and even whole institutions. Independent experts (peers) are commissioned to carry out an examination (review) to determine whether the manuscript or proposal meets the quality standards of the discipline, the funding organisation or the journal.

1. History

The first forms of peer review developed in connection with the founding of scientific societies in the 17th century, especially the Royal Society in England. The Royal Society faced the problem that a large number of observations and experiments were being presented to it without it being clear how reliable they were. Results reported by Members of the Royal Society were presumed to be trustworthy. These "natural philosophers" were, after all, noble persons (peers) committed to a gentlemanly ethos that required them to be truthful. Accordingly, the self-financed research performed by or on behalf of the noble gentlemen was also considered credible and authentic. However, an additional means had to be found to ensure that reports presented by unknown persons of lower standing were equally reliable. One way of confirming this was to have a Member of the Royal Society vouch for the report. It was also possible for the experiment to be repeated before the members, who could then confirm that they had observed it and so vouch for its validity. For example, even Robert Hooke had to repeat numerous experiments before the peers before he was himself allowed to become a member of the Royal Society. The authenticated findings were then published in the "Philosophical Transactions of the Royal Society". We can therefore conclude that the second scientific journal – the first having been the French "Journal des Sçavans" – already included a quality assurance system in the form of peer review. However, it was only as late as 1750 that an explicit peer review system was introduced for the "Philosophical Transactions" themselves. Officially, this was because it had become necessary to select papers from the huge volume of material sent to the Society. Unofficially, it was because the Royal Society had come under strong political pressure.
Prior to this, however, the "Académie Royale des Sciences" in Paris had already established a rigorous peer review system for the "Journal des Sçavans" as a concession to the French king, through which the Académie was allowed to publish independently. The introduction of peer review was not, therefore, caused by the wish for scientific quality control. Rather, it was a political compromise aimed at guaranteeing self-control in the interest of the establishment of the day.

Although peer review played an important role in the origin of modern science, it was used only sporadically in the period that followed. Research was largely privately financed, so there were no state funding organisations that would have needed a review system. At the same time, most scientific journals were controlled by a single editor who decided which articles were published and which were not. Only after the Second World War did a massive expansion of the scientific research system lead to the general use of peer review to assess and autonomously control science. Today, it is standard practice in most subject areas for manuscripts and proposals to be sent to two or more anonymous peers from the same field, whose reviews then form the basis for a decision to accept or reject the submission.

2. Criticism of the peer review system / Research on the peer review system

Peer review is a key control mechanism within the scientific research system. It determines which articles appear in which journals and which projects can actually be carried out. Because it is researchers and not, for example, politicians or judges who decide what counts as an interesting research result or a highly promising project, it is a self-control mechanism. Self-control can be seen as a fundamental prerequisite for the autonomy of the scientific research system.

National Institutes of Health (NIH)
National Science Foundation (NSF)
Schweizerischer Nationalfonds (SNF)
Deutsche Forschungsgemeinschaft (DFG)

The proportion of submitted projects that are approved (success rate) can differ substantially from one funding organisation to the next. However, a falling trend has been observed at most of these organisations in recent years. This makes peer review all the more important and, consequently, more of a topic of discussion.

It is not exactly clear when we can actually speak of peer review. For example, various definitions are possible for what a "peer" is. At one end of the scale, the definition admits only researchers who themselves do active research in the same specialised field, however small and specialised that field may be. A more generous definition of peers could include all researchers working within a specific discipline (e.g. biology). More recent models from the philosophy of science even ascribe expertise to non-researchers, such as politicians or laypeople, who are consulted to assess science (participation and transdisciplinary models). Nor has any method been standardised. Reviewers can, for example, reach their decisions anonymously or not. In addition, the authors' identities can be concealed from the reviewers or not (single blind vs. double blind). The whole process can be transparent to the public or confidential, and the verdicts reached by the reviewers can be definitive or serve only as a recommendation for a subsequent decision-making body. What may still be described as peer review can in turn depend on the local and disciplinary setting.

Since peer reviews play such an important role as a basis for decisions on the distribution of funding and on public reward and recognition, peers have a significant influence on the course of scientific research and on the career success of researchers. In view of the power thus associated with peer review, critical voices have repeatedly been heard over the past 30 years. Frequently expressed criticisms include (among many others):

  • Peer review is too slow and too expensive.
  • Peer review is unfair, because it discriminates against women and young researchers and favours known figures (Matthew Effect).
  • Peer review is hostile to innovation because it favours established methods and ways of thinking.
  • Peer review is unreliable because reviewers often contradict each other.
  • Peer review opens the door to cronyism in which established researchers favour each other in the anonymity of their review work (Old-Boys-Network).

In connection with this criticism, a whole research field has formed, initiated by a study on the US National Science Foundation (NSF) in 1978. The authors came to the conclusion that 50% of a proposal's success was determined by random factors associated with the choice of peers. Most peer review research, however, now concentrates on journals rather than on funding organisations. Three questions above all have been studied:

  • How fair is the process (bias)?
  • To what extent do the peers agree (reliability)?
  • Is the approved proposal indeed better than the rejected one (validity)?

No generally accepted answers have crystallised for any of these three questions. Although some studies did manage to detect some degree of bias or conflict of interest, it remains unclear whether this is a result of peer review itself or is caused by existing discrimination. As far as reliability is concerned, there is agreement that the measured values of reviewer agreement must mostly be seen as low. However, agreement has not been reached on the significance of these values or on whether higher values would be desirable. Disagreement persists even on the most important question, validity. Here the debate mainly concerns the methodological question of how rejected and approved articles or projects can be compared at all, since it is the approval or rejection itself that is decisive for their future success. It is also striking that a large proportion of these studies come from medicine and biology and mainly use quantitative methods. Studies that deal with the content of review reports or proceedings are extremely rare, presumably because journals and funding organisations very seldom allow access to their archives. To what extent the criticism of peer review is justified consequently remains unclear for the time being.
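
To make the notion of reliability more concrete: agreement between two reviewers is often summarised with a chance-corrected statistic such as Cohen's kappa. The studies referred to above are not said here to have used this particular measure, so the choice of kappa is an assumption, and the verdicts in the sketch below are invented purely for illustration.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two reviewers who rated the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(ratings_a)
    # Observed agreement: share of items on which both reviewers agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: what each reviewer's own category frequencies
    # would produce if their verdicts were unrelated.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both reviewers used one and the same category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical verdicts of two reviewers on ten manuscripts (invented data).
reviewer_1 = ["accept", "reject", "accept", "reject", "reject",
              "accept", "reject", "reject", "accept", "reject"]
reviewer_2 = ["accept", "reject", "reject", "reject", "accept",
              "accept", "reject", "accept", "reject", "reject"]

kappa = cohen_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")  # prints "kappa = 0.17"
```

In this invented example the two reviewers agree on six of ten manuscripts, yet kappa is only about 0.17, because much of that agreement would be expected by chance alone. This is the sense in which raw agreement can look respectable while the reliability value is still low.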

3. Alternatives to peer review

Beyond the question of how reliable peer review actually is, more or less successful proposals and trials of modified or alternative review and control methods have been presented time and time again. Many of these are a response to the increasingly excessive workload for peers, who mainly do their reviews on an unpaid basis. However, electronic communications technologies have also triggered an innovative leap that has achieved more than merely accelerating the process.

The opportunities offered by electronic publications, for example, reduce the pressure of having to carry out extensive selection, since, in principle, there is enough publication space for all. Certain sub-areas of physics that rely on quick communication have therefore created a publicly accessible database (www.arxiv.org) in which any researcher can publish preprints of their work without having to have it reviewed. This method has been joined by an evaluation model that is only used after the work has been published. Users of the database can grade the articles and post a detailed assessment for all to see (www.naboj.com). Some journals have also begun to make manuscripts available prior to review. The review process is then carried out in public on the website and determines whether the article remains permanently available or not. The journal "Atmospheric Chemistry and Physics" is one example of this approach. It uses a 2-stage electronic system in which peers, authors and the interested public can each discuss the article.

Innovations are found among funding organisations, too. Most have begun to carry out the review process completely electronically, which, as mentioned above, has accelerated the process. Some practise a completely open approach as part of the effort to make public administration as transparent as possible. The German Research Foundation (Deutsche Forschungsgemeinschaft – DFG) established a new approach which separates review and assessment. The first stage involves conventional peer review. Subsequently, other peers check whether the process was satisfactory in the sense that the selection of peers was appropriate and the quality of the review reports acceptable.

Some of these models have already firmly established themselves, such as the above-mentioned example of open communications with preprints in physics. Most of the alternatives are, however, still new in comparison to "classical" peer review and are only used locally, which means that it is hardly possible to estimate what their eventual success and impact may be.

References

  Cole, Stephen, Leonard Rubin and Jonathan R. Cole, 1978: Peer Review in the National Science Foundation. Phase I of a Study. Washington, DC: National Academy of Sciences.
  Kronick, David A., 1962: A History of Scientific and Technical Periodicals. The Origins and Development of the Scientific and Technological Press, 1665-1790. New York: Scarecrow Press.
  Merton, Robert K., 1968: The Matthew Effect in Science. Science 159, No. 3810 (5 Jan. 1968): 56-63. [Retrieved 31.10.2006]
  Merton, Robert K., 1988: The Matthew Effect in Science, II. Cumulative Advantage and the Symbolism of Intellectual Property. Isis, Vol. 79, Issue 4: 606-623. [Retrieved 31.10.2006]
  Neidhardt, Friedhelm, 1988: Selbststeuerung in der Forschungsförderung. Das Gutachterwesen der DFG. Opladen: Westdeutscher Verlag.
  Shapin, Steven, 1994: The Social History of Truth. Civility and Science in Seventeenth-Century England. Chicago: University of Chicago Press.
  Weller, Anne C., 2001: Editorial Peer Review. Its Strengths and Weaknesses. New Jersey: Information Today Inc.

 

Robert Hooke (1635-1703)
was a polymath whose empirical and theoretical works made an important contribution to the scientific revolution of the 17th century. Before becoming a Member of the Royal Society, he was employed to carry out experiments and worked as an assistant to Robert Boyle. Hooke is believed to have formulated Boyle's Law, since Boyle, unlike Hooke, was not a mathematician. Hooke's interests ranged widely, from biology, chemistry, physics, mathematics and astronomy through to architecture. It was Hooke who first spoke of biological cells, who wrote the first book on microscopy (Micrographia) and who postulated that gravity obeys an inverse square law. In addition, he played a decisive role in rebuilding London after the Great Fire of 1666.

The Matthew Effect states:
"For unto every one that hath shall be given, and he shall have abundance: but from him that hath not shall be taken away even that which he hath." (Matthew XXV:29, KJV). Starting from this quotation from the Gospel of St. Matthew, Robert K. Merton called the link between reward and communication in science the "Matthew Effect". It describes how distinguished researchers receive greater reward than unknown researchers for work of identical quality. The effect appears above all in priority disputes, where the discovery in question is mostly attributed to the better-known person. Texts with several authors are likewise mostly ascribed to the best-known of the authors. As a result, communication and reward accumulate in the scientific research system, as the Bible quotation illustrates, so that advantages and disadvantages tied to a researcher's standing can develop and intensify. Empirical studies have also been able to detect the Matthew Effect.