What is an iThenticate Similarity Report?

"The Similarity Report provides an overall similarity breakdown for each submission to the iThenticate database. This breakdown determines the percentage of similarity between a submission and content existing in the database of the text comparison tool, iThenticate.

It is perfectly natural for a submission to match against sources in the database. If the submission has used quotes and has referenced correctly, there will be instances where there will be a match. The similarity score simply makes the user aware of any problem areas in the submission; iThenticate should be used as part of a larger process, in order to determine if the match is or is not acceptable." Quoted from the iThenticate User Guide.

What does the total percentage mean?

Authors tend to focus on the similarity index or percentage shown at the top of the report. The higher the index, the thinking goes, the greater the likelihood of inappropriate reuse of previously published text. But this is not necessarily the case. The same similarity index can have vastly different meanings for different manuscripts; for example, a similarity index of 25% might indicate that 25% of the manuscript matches text in only one source, or it might indicate that 25% of the manuscript matches text in 25 different sources. Or it could indicate that multiple paragraphs in the manuscript were copied verbatim from published sources. 
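The arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration: the source names and percentages are made up, and it assumes the overall index is simply the sum of non-overlapping per-source matches, which may not be how iThenticate computes it.

```python
# Hypothetical per-source match percentages for two manuscripts.
# Names and numbers are illustrative only, not real iThenticate output.
one_source = {"Source A": 25.0}
many_sources = {f"Source {i}": 1.0 for i in range(1, 26)}


def summarize(matches):
    """Return (overall index, number of sources, largest single-source match).

    Assumes matches do not overlap, so the overall index is their sum.
    """
    total = sum(matches.values())
    return total, len(matches), max(matches.values())


print(summarize(one_source))    # (25.0, 1, 25.0)
print(summarize(many_sources))  # (25.0, 25, 1.0)
```

Both manuscripts show the same 25% index, but the first draws a quarter of its text from a single source while the second has 25 small matches of 1% each. The headline number alone cannot tell these cases apart, which is why the full report needs to be read.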

Authors should review the full report for any highlighted sections, as any of these could be problematic (see the lists below). Any particular call-out in the report that is under 2% is most likely a proper name or a few words of text; however, these sections should still be skimmed to ensure they are accurate. Authors can contact for advice on rewording text.

Highlighted sections to focus on in the Similarity Report:

  • A string of one or more sentences that exactly or closely repeats another source

  • A paragraph or multiple paragraphs

Highlighted sections that are not an issue:

  • proper names (authors, institutions)

  • technical or discipline-specific terms or phrases

  • equations or formulas

  • boilerplate text (conflict of interest disclosures, funding sources, acknowledgements)

  • references in the bibliography

  • quoted material that is properly cited

  • sentences or phrases that are common in scientific writing (Example: “Breast cancer is the second most common cancer among women in the United States”; Example: “P values of less than 0.05 were considered statistically significant”)


