To the Editor:
In the May 2008 issue of Clinical Chemistry, Drs. Rifai, Bossuyt, and Bruns presented, in an editorial entitled “Identifying Duplicate Publications: Primum non Nocere,” their analysis of 27 pairs of publications contained in the Déjà Vu database, all associated with the journal Clinical Chemistry (1)(2). At press time of the editorial, the majority of these database entries were categorized as “Unverified,” indicating that their corresponding citations displayed a significant amount of text similarity (3) but had not yet undergone full-text evaluation by a human curator. To expand on the analysis provided by Rifai, Bossuyt, and Bruns (1), we have manually verified entries in the database related to Clinical Chemistry and present the results herein.
To begin, there were 73 entries in Déjà Vu associated with Clinical Chemistry at the time of the editorial publication, rather than the 27 previously reported, a discrepancy attributable to the database search and filter options (1). Since then, we have performed a database update to include new possible duplicates from Medline, bringing our total number of entries to 75. We manually analyzed 72 of the 75 pairs of publications and placed each pair into 1 of several previously defined categories (4). Thus far, our analysis has uncovered 14 duplicate pairs with at least 1 overlapping author (19%), 2 duplicate pairs with no overlapping authors (3%), 18 updates (24%), 2 errata (3%), 7 “sanctioned” pairs (9%), and 29 “distinct” pairs (39%). The remaining 3 entries contain papers published in languages other than English and are currently undergoing review by a bilingual curator. Table 1 provides an in-depth classification of these articles.
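The percentages quoted above appear to be expressed relative to all 75 entries rather than the 72 verified pairs. The following is a minimal sketch of that arithmetic, offered only as an illustration: the counts are those given in the text, whereas the label "Pending review" and the helper name `percent_of_total` are assumptions introduced here and are not part of the Déjà Vu database.

```python
# Category counts as reported in the letter: 72 manually verified pairs
# plus 3 non-English entries still under review (75 entries in total).
# "Pending review" is an illustrative label, not a Deja Vu category name.
counts = {
    "Duplicate/SA": 14,   # duplicates with at least 1 overlapping author
    "Duplicate/DA": 2,    # duplicates with no overlapping authors
    "Update": 18,
    "Erratum": 2,
    "Sanctioned": 7,
    "Distinct": 29,
    "Pending review": 3,
}

total = sum(counts.values())


def percent_of_total(n, total):
    """Return n as a whole-number percentage of total, rounding half up."""
    return int(100 * n / total + 0.5)


percentages = {label: percent_of_total(n, total) for label, n in counts.items()}

for label, n in counts.items():
    print(f"{label:15s} {n:3d} ({percentages[label]}%)")
```

Under the assumption that 75 is the denominator, the rounded values reproduce the figures quoted in the paragraph above.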
In 48 of the 75 pairs, the earlier article was published in Clinical Chemistry and the later article was published elsewhere (64%); in 15 pairs, the later article was published in Clinical Chemistry and the earlier article was published elsewhere (20%). In 12 cases, both articles in the entry were published in Clinical Chemistry (16%). Four of the 16 duplicate pairs contained at least 1 non–English language manuscript, and only 1 of the 4 duplicates (entry 11575) cites the original work. The average amount of shared text between an earlier article and its corresponding duplicate is 80%, the average amount of shared references is 82%, and 15 of the 16 duplicates (94%) contain replicated tables and/or figures. These data are consistent with a recent study that found the average amounts of shared text and shared references among 206 duplicate pairs to be 86.3% and 73.4%, respectively (5). Our analysis also indicates that the 16 duplicates have been cited an average of 26 times since their initial publication, whereas their corresponding originals have been cited an average of 35 times (6).
As part of our quest for scientific transparency, we have provided thumbnail images highlighting the regions of similarity among all 16 duplicates, which can be downloaded directly from the entry page for each citation pair (Déjà Vu entries 6124, 10230, 11575, 15238, 24713, 27492, 27629, 47427, 48303, 53580, 56428, 57230, 57455, 61982, 65140, and 65794). The number of questionable publications revealed by our study clearly differs from that in the analysis provided by Rifai, Bossuyt, and Bruns (1). Of the 27 pairs they analyzed, they believed only 1 pair of articles (4%) “might require corrective action,” a proportion significantly lower than the 22% reported here (comprising all entries in both the Duplicate/SA and Duplicate/DA categories, in which “SA”1 is “same authors” and “DA” is “different authors”). This inconsistency may reflect the different analytical methods used (though not explicitly defined) by the authors of the May 2008 editorial.
Several months before this analysis, we uncovered a pair of duplicate articles with nonoverlapping authors, the earlier of which was published in Clinical Chemistry in 1997 (7). The later article, published in Georgian Medical News in 2006, shares roughly 65% of its text and all of its data and figures with the Clinical Chemistry article. In keeping with our current study methodology, after manually verifying this entry we sent a questionnaire on March 4, 2008, to both sets of authors and editors involved (5). We received responses from 2 of the 5 authors of the earlier publication (contact information was unavailable for 2 others). As of November 2009, we have not yet heard from any authors or editors involved with the duplicate manuscript.
In controversial topics such as these, it is important to remember that Déjà Vu is a tool like any other. Proper understanding of its instructions and applications is essential to its utility. Users of the database are encouraged to visit the Help page, which defines each category in depth and provides additional information about the database and its many features. In addition, we recently made some changes to the database classification system to avoid confusion. First, we amended the Help page to include an explanation of the proper use of the advanced search and filter options. Second, the “False” category is now labeled “Distinct” and the “Duplicate/Update” label has been renamed “Update,” as some users expressed concern over the word “Duplicate.” With these changes in mind, we would like to emphasize that the ultimate purpose of the Déjà Vu database is to maintain the integrity of biomedical literature, a goal that can be achieved only by a thorough and accurate interpretation of the information contained within. We therefore extend to both the editors and reviewers of Clinical Chemistry an invitation to explore the Déjà Vu database and its accompanying text similarity tool, eTBLAST (http://etblast.org), to help identify future submissions of questionable manuscripts before they are published.
Author Contributions: All authors confirmed they have contributed to the intellectual content of this paper and have met the following 3 requirements: (a) significant contributions to the conception and design, acquisition of data, or analysis and interpretation of data; (b) drafting or revising the article for intellectual content; and (c) final approval of the published article.
Authors’ Disclosures of Potential Conflicts of Interest: No authors declared any potential conflicts of interest.
Role of Sponsor: The funding organizations played a direct role in the design of the study, the review and interpretation of data, and the preparation and final approval of the manuscript.
1 Data are n (%) unless noted otherwise.
1 Nonstandard abbreviations: SA, same authors; DA, different authors.
© 2010 The American Association for Clinical Chemistry