In the January 24, 2008, issue of Nature, Errami and Garner published a Commentary entitled “A Tale of Two Citations” that deals with the timely subject of duplicate publication(1). With the increase in the number of scientific journals and the ongoing pressure to publish in them, inappropriate and unethical practices, such as plagiarism, unauthorized “cosubmission” of a paper to 2 or more journals simultaneously, and duplication of a previous report, may also be on the rise.
The authors of the Nature Commentary focused on 2 important reports to document this problem(2)(3). The first was a large study that used text-matching software to mine 280 000 entries in arXiv, an open-access database for publications in mathematics, physics, biology, statistics, and computer science(2). Of the examined articles, 0.2% were suspected of plagiarism, and 10.5% were suspected of being duplicate publications by the same authors. The second was an anonymous survey of 3247 American biomedical researchers(3). In this survey, 1.4% of the researchers admitted to plagiarism, and 4.7% confessed to repeated publication of the same results. With more than 17 million citations in the Medline database, these figures imply that the number of articles with suspected plagiarism ranges from 34 000 to 238 000 and that the number of duplicate publications is approximately 800 000 to 1.8 million.
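The extrapolation to Medline can be reproduced with a few lines of arithmetic. The following is a minimal sketch that simply multiplies the quoted rates by 17 million citations; the function name `implied_range` and the variable names are illustrative assumptions, and because the text says "more than 17 million," the results are best read as rough lower estimates.

```python
# Figures quoted in the paragraph above; names are illustrative only.
MEDLINE_CITATIONS = 17_000_000  # "more than 17 million", so a lower bound

# (lower rate, higher rate) taken from the two cited studies
RATES = {
    "plagiarism": (0.002, 0.014),  # arXiv study 0.2%, survey 1.4%
    "duplicate": (0.047, 0.105),   # survey 4.7%, arXiv study 10.5%
}


def implied_range(total, low_rate, high_rate):
    """Return the (low, high) article counts implied by two rates."""
    return round(total * low_rate), round(total * high_rate)


for label, (low_rate, high_rate) in RATES.items():
    low, high = implied_range(MEDLINE_CITATIONS, low_rate, high_rate)
    print(f"{label}: {low:,} to {high:,} articles")
```

The duplicate-publication range works out to 799 000 to 1 785 000, which the text rounds to approximately 800 000 to 1.8 million.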
Armed with this information, Errami and Garner embarked on the challenging task of documenting the trend in suspected duplicates in the biomedical literature over the past 30 years (1975–2005) and sought to determine whether the publication of duplicates is a country-dependent phenomenon. The authors searched online databases such as Medline by using text-similarity software and the eTBLAST search engine, which is freely available online. The potential duplicates were deposited in Déjà vu, a publicly accessible database (http://spore.swmed.edu/dejavu).
The authors found that the annual number of biomedical publications increased from approximately 250 000 to 600 000 over this 30-year period, while the percentage of suspected duplicates rose disproportionately from 0.018% to 0.9%. Furthermore, the duplication rates for particular countries were proportional to the numbers of contributions from those countries, with the exception of China and Japan, which had twice the rate of duplicates expected for their numbers of contributed publications.
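To make "disproportionately" concrete, the quoted figures can be compared as fold changes. This is only an illustrative sketch: the implied annual counts of suspected duplicates are derived here by multiplying the quoted percentages by the quoted publication totals, which is an assumption about how the two figures relate and not a number reported in the Commentary.

```python
# Figures quoted in the paragraph above; variable names are illustrative.
PUBS_START, PUBS_END = 250_000, 600_000  # annual publications, 1975 and 2005
DUP_PCT_START, DUP_PCT_END = 0.018, 0.9  # suspected duplicates, in percent


def fold_change(start, end):
    """Ratio of the final value to the initial value."""
    return end / start


def implied_count(publications, percent):
    """Approximate number of suspected duplicates in one year."""
    return round(publications * percent / 100)


print(f"publications grew {fold_change(PUBS_START, PUBS_END):.1f}-fold")
print(f"duplicate rate grew {fold_change(DUP_PCT_START, DUP_PCT_END):.0f}-fold")
print(f"implied duplicates per year: {implied_count(PUBS_START, DUP_PCT_START)}"
      f" rising to {implied_count(PUBS_END, DUP_PCT_END)}")
```

On these figures, publication volume grew roughly 2.4-fold while the suspected-duplicate rate grew roughly 50-fold.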
Although these findings are interesting, some aspects of the conclusions are of concern. The authors acknowledged that manual verification of the articles listed in Déjà vu as duplicates is currently ongoing and that the results should be interpreted with caution. The frequency of suspected duplicates appears to be high, however, and the issue of false-positive identification is not well addressed in the overall message of the report.
To examine the false-positive rate in the Déjà vu database, we checked the suspected duplicates in 3 journals, The New England Journal of Medicine (NEJM), Clinical Chemistry, and The Lancet, since 1975. According to ISI (http://portal.isiknowledge.com), NEJM published 11 779 original and review articles during this period. According to Déjà vu, 14 of these reports were suspected duplicates. After a careful examination, we found none of these articles in NEJM to be a duplicate publication.
In Clinical Chemistry, 8867 original and review articles were published in this period. Of the 27 identified as suspected duplicates, one report could be viewed as a duplicate that might require corrective action, a second could not be verified (because we have not been able to access the other journal/newsletter), and a third was republished in a language other than English. This last article exemplifies a situation that requires examination. The republication of a review article or a case study in a language other than that of the original report may be justifiable if the appropriate acknowledgment is clearly stated. This practice enables the dissemination of educationally relevant information to a broader audience and is consistent with the policy of the World Association of Medical Editors (www.wame.org/resources/policies); however, it is unclear whether such an argument can convincingly be made for original reports.
In The Lancet, 68 948 letters, original articles, and reviews have been published since 1975. Of the 24 identified as suspected duplicates, 4 could be viewed as duplicates, and 4 appear to have been published in a language other than English (we have not been able to verify whether the second publication made the appropriate reference to the original one).
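The counts for the 3 journals can be tallied in a short script. This is a rough sketch, not a calculation from the original analysis: the class and field names are hypothetical, and treating translations and the one unverifiable report as a separate category (neither confirmed duplicate nor clear misclassification) is an assumption made here for illustration.

```python
from dataclasses import dataclass


@dataclass
class JournalCheck:
    """Manual-check results for one journal (field names are illustrative)."""
    name: str
    articles: int             # items published since 1975
    suspected: int            # flagged in Deja vu
    possible_duplicates: int  # could be viewed as duplicates
    translations: int         # republished in another language
    unverified: int           # could not be checked

    def clearly_misclassified(self):
        # Assumption: flagged items outside the three categories above
        # are counted as false positives.
        return (self.suspected - self.possible_duplicates
                - self.translations - self.unverified)


CHECKS = [
    JournalCheck("NEJM", 11_779, 14, 0, 0, 0),
    JournalCheck("Clinical Chemistry", 8_867, 27, 1, 1, 1),
    JournalCheck("The Lancet", 68_948, 24, 4, 4, 0),
]

total_suspected = sum(c.suspected for c in CHECKS)
total_misclassified = sum(c.clearly_misclassified() for c in CHECKS)

for c in CHECKS:
    print(f"{c.name}: {c.suspected} flagged of {c.articles} "
          f"({100 * c.suspected / c.articles:.3f}%), "
          f"{c.clearly_misclassified()} clearly misclassified")
print(f"overall: {total_misclassified} of {total_suspected} flagged items "
      f"({100 * total_misclassified / total_suspected:.0f}%) misclassified")
```

Under this counting, 54 of the 65 flagged items are misclassified and 5 could be viewed as duplicates; a stricter or looser treatment of the translated reports would shift these totals.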
Out of curiosity, we also investigated whether Déjà vu would list suspected duplicates for the 3 authors of this Editorial. We collectively had 34 unverified duplicates, according to Déjà vu, none of which represented plagiarism, cosubmission, or redundant publication (one case-control study was published in 2 languages, with the proper referencing). The reasons for misclassification of the reports in the 3 journals studied are presented in Table 1.
From this brief exercise, it is clear that although Errami and Garner have made progress in a worthwhile cause, their efforts have not achieved their intended purpose, and despite their words of caution regarding the utility of their database, the surprisingly high number of false positives is alarming.
Although the availability of such a database could act as a deterrent to undesirable and unethical behavior, the misuse and misinterpretation of information from Déjà vu could have damaging consequences to the reputations and careers of honest scientists. Incorrect entries in the Déjà vu database could lead to false accusations of scientific misconduct. Our exercise shows that a large number of authors may have to defend themselves to free their names from such unfounded allegations.
There is no doubt that Errami and Garner’s undertaking is challenging and daunting, but one cannot help but wonder whether the publication of “A Tale of Two Citations” was premature. Safeguarding the integrity of biomedical research is essential, but one must also remember that the first rule in medicine is, “First, do no harm.”
Grant/Funding Support: None declared.
Financial Disclosures: None declared.
© 2008 The American Association for Clinical Chemistry