Systematic Reviews volume 4, Article number 6. A major problem arising from searching across bibliographic databases is the retrieval of duplicate citations. Removing such duplicates is an essential task to ensure systematic reviewers do not waste time screening the same citation multiple times.
Although reference management software uses algorithms to remove duplicate records, this is only partially successful and necessitates removing the remaining duplicates manually. This time-consuming task leads to wasted resources.
We sought to evaluate the effectiveness of a newly developed deduplication program against EndNote. A literature search of 1, citations was manually inspected, and duplicate citations were identified and coded to create a benchmark dataset. The accuracy of deduplication was reported by calculating the sensitivity and specificity. Further validation tests, with three additional benchmarked literature searches comprising a total of 4, citations, were performed to determine the reliability of the SRA-DM algorithm.
The Systematic Review Assistant-Deduplication Module (SRA-DM) offers users a reliable program to remove duplicate records with greater sensitivity and specificity than EndNote.
This application will save researchers and information specialists time and avoid research waste. The deduplication program is freely available online. Identifying trials for systematic reviews is time consuming: the average retrieval from a PubMed search produces 17, citations [1]. However, the methodological details of trials are often poorly described by authors in the titles or abstracts, and not all records contain an abstract [4].
Due to these limitations, a wider (that is, more sensitive) search strategy is necessary to ensure articles are not missed, which leads to an imprecise dataset retrieved from electronic bibliographic databases. Searching multiple databases is recommended because different databases contain different records, and therefore, the coverage is widened.
Also, searching multiple databases utilises differences in indexing to increase the likelihood of retrieving relevant items that are listed in several databases [ 6 ], but inevitably, this practice also retrieves overlapping content [ 7 ]. The problem of overlapping content and subsequent retrieval of duplicate records is partially managed with commercial reference management software programs such as EndNote [ 14 ], Reference Manager [ 15 ], Mendeley [ 16 ] and RefWorks [ 17 ].
They contain algorithms designed to identify and remove duplicate records using an auto-deduplication function. However, the detection of duplicate records can be thwarted by inconsistent citation details, missing information or errors in the records. Typically, auto-deduplication is only partially successful [18], and the onerous task of manually sifting and removing the remaining duplicates rests with reviewers or information specialists.
This study aimed to iteratively develop and test the performance of a new deduplication program against EndNote X6. The project aimed to reduce the amount of time taken to produce systematic reviews by maximising the efficiency of the various review stages, such as optimising search strategies and screening, finding full-text articles and removing duplicate citations.
The deduplication algorithm was developed using a heuristic-based approach with the aim of increasing the retrieval of duplicate records and minimising unique records being erroneously designated as duplicates.
The algorithm was developed iteratively, with each version tested against a benchmark dataset of 1, citations. Modifications were made to the algorithm to overcome errors in duplicate detection (Table 1). For example, errors occurred due to variations in author names (e.g. names spelt in full versus initialised).
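One way to tolerate such author-name variations is to normalise each author string to a comparable key before matching. The sketch below is illustrative only; the rules and function names are assumptions, not the actual SRA-DM logic:

```python
import re

def author_key(name: str) -> tuple:
    """Reduce an author name to (surname, first initial), ignoring case,
    punctuation and the ordering 'Surname, Given' vs 'Given Surname'."""
    parts = [p for p in re.split(r"[\s,.]+", name) if p]
    if "," in name:                       # 'Smith, John' -> surname given first
        surname = parts[0]
        given = parts[1] if len(parts) > 1 else ""
    else:                                 # 'John Smith' or 'J. Smith'
        surname = parts[-1]
        given = parts[0]
    return surname.lower(), given[:1].lower()

def authors_match(a: str, b: str) -> bool:
    """Two author strings match if they reduce to the same key."""
    return author_key(a) == author_key(b)

print(authors_match("Smith, John", "J. Smith"))  # True
print(authors_match("John Smith", "Smith, J."))  # True
```

A real implementation would also need to handle multi-part surnames and author lists, but this illustrates how full versus initialised names, reordering and punctuation can be reconciled.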
To determine the reliability of SRA-DM, we conducted a series of validation tests with three different literature searches (cytology screening tests, stroke and haematology), which were retrieved from searching multiple biomedical databases (Table 2).
A duplicate record was defined as being the same bibliographic record, irrespective of how the citation details were reported.
Where multiple reports from a single study were published, these were not classed as duplicates, as they are multiple reports which can appear across or within journals. Similarly, where the same study was reported in both a journal and conference proceedings, these were treated as separate bibliographic records.
A total of 1, citations, derived from a search conducted on 29 July for surgical and non-surgical management of pleural empyema, were used to test SRA-DM and EndNote X6. To create the benchmark, citations were imported into an EndNote database, sorted by author, inspected for duplicate records and manually coded as unique or duplicate records; the database was then reordered by article title and reinspected for further duplicates.
A few additional duplicates were identified in EndNote and SRA-DM whilst cross-checking against the benchmark decisions, and the benchmark and results were updated to take account of these.
Each record in the results was coded against the benchmark as a true positive (a true duplicate correctly identified), a false positive, a true negative or a false negative. Sensitivity is defined as the ability to correctly classify a record as a duplicate and is the proportion of true positive records over the total number of records identified as true positive and false negative.
Specificity is defined as the ability to correctly classify a record as being unique (non-duplicate) and is the proportion of true negative records over the total number of records identified as true negative and false positive.
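These two definitions translate directly into the standard formulas; the following minimal sketch (the counts shown are hypothetical, for illustration only) makes the computation explicit:

```python
def sensitivity(tp: int, fn: int) -> float:
    """Proportion of true duplicates correctly flagged: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """Proportion of unique records correctly retained: TN / (TN + FP)."""
    return tn / (tn + fp)

# Hypothetical counts for illustration only:
print(sensitivity(tp=950, fn=50))  # 0.95
print(specificity(tn=980, fp=20))  # 0.98
```

Note that false positives (unique records wrongly merged) only lower specificity, while undetected duplicates only lower sensitivity, which is why both measures are reported.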
In the first iteration of the deduplication algorithm, the matching criteria were based on field comparison, ignoring punctuation, with checks made against the year field. This field was chosen because the year field has a lower probability of errors: it is restricted to the digits 0-9 and is therefore the least mistakable field.
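A punctuation-insensitive field comparison gated on an exact year match might be sketched as follows; the record layout and function names are assumptions for illustration, not the SRA-DM source:

```python
import string

def normalise(field: str) -> str:
    """Lower-case, strip punctuation and collapse whitespace so that
    formatting differences between databases are ignored."""
    cleaned = field.translate(str.maketrans("", "", string.punctuation)).lower()
    return " ".join(cleaned.split())

def fields_match(a: dict, b: dict) -> bool:
    """Candidate duplicate pair: the year must match exactly (digits only,
    so the least error-prone field); titles are compared after normalising."""
    if a["year"] != b["year"]:
        return False
    return normalise(a["title"]) == normalise(b["title"])

rec1 = {"year": "2013", "title": "Deduplication: a comparison."}
rec2 = {"year": "2013", "title": "Deduplication - a comparison"}
print(fields_match(rec1, rec2))  # True
```

Gating on the year first also keeps the comparison cheap, since most non-duplicate pairs are rejected before any string normalisation.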
Eighty-four percent of undetected duplicates arose due to variations in page numbers (e.g. short versus full format). To address this, short-format page numbers were converted to full format, and the algorithm was further modified to increase the sensitivity by incorporating matching criteria on authors OR title.
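Converting a short-format page range to full format can be done by re-expanding the end page from the start page's leading digits; a sketch of the idea (an assumed implementation, not the SRA-DM code):

```python
def expand_pages(pages: str) -> str:
    """Expand short-format page ranges, e.g. '451-8' -> '451-458' and
    '1023-45' -> '1023-1045'; full-format ranges pass through unchanged."""
    if "-" not in pages:
        return pages
    start, end = pages.split("-", 1)
    if len(end) < len(start):
        # Borrow the missing leading digits from the start page.
        end = start[: len(start) - len(end)] + end
    return f"{start}-{end}"

print(expand_pages("451-8"))    # 451-458
print(expand_pages("1023-45"))  # 1023-1045
print(expand_pages("99-104"))   # 99-104
```

With both records normalised to full format, '451-8' and '451-458' compare as equal instead of being missed.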
This increased the sensitivity of the second iteration. The third iteration distinguished references that were similar but not duplicates. The fourth iteration was modified to accommodate author name variations using fuzzy logic, so that differences in names spelt in full or initialised, differences in the ordering of names and different punctuation could be accommodated (Table 1); this increased the sensitivity further. EndNote identified a number of the citations as duplicates.
Of these, some were correctly identified as duplicates (true positives) and some were wrongly designated as duplicates (false positives); of the remaining citations, most were correctly identified as unique records (true negatives), while some duplicate records went undetected (false negatives).
The sensitivity of EndNote was lower than that of SRA-DM. The three additional datasets were obtained from existing searches performed by information specialists to widen the scope of the validation tests. The average specificity of EndNote was also lower; these false positives occurred in EndNote due to citations with the same authors and title being published in other journals or as conference proceedings. Waste in research occurs for several methodological, legislative and reporting reasons [19-22].
Another form of waste is inefficient labouring, in part a consequence of non-standardised citation details across bibliographic databases, perfunctory error checking and the absence of a unique trial identification number for a study and its associated multiple reports.
If these problems were solved at source, manual duplicate checking would be unnecessary. Until these issues are resolved, deploying the SRA-DM will save information specialists and reviewers valuable time by identifying, on average, additional duplicates that would otherwise require manual checking. Several citations were wrongly designated as duplicates by EndNote auto-deduplication due to different citations sharing the same authors and title but published in other journals or as conference proceedings.
In a recent study by Jiang [23], the authors also found that EndNote, for the same reason, had erroneously assigned unique records as duplicates. It is probable that in most scenarios no important loss of data would occur, although sometimes additional methodological or outcome data are reported, and ideally these need to be retained for inspection.
A recent study by Qi [18] examined the content of undetected duplicate records in EndNote and found that errors often occurred due to missing or wrong data in the fields, especially for records retrieved from the EMBASE database.
This also affected the sensitivity of SRA-DM, with duplicates undetected due to missing, wrong or extraneous data in the fields. For systematic reviews and Health Technology Assessment reports, the aim is to conduct comprehensive searches to ensure all relevant trials are identified [24]; thus, losing even three citations is undesirable.
If this strategy were implemented on the respiratory dataset using the fourth and second algorithms (Table 3), only 91 out of 1, citations would have to be manually checked and only 34 duplicates would remain undetected. In spite of this major improvement with the SRA-DM, no software can currently detect all duplicate records, and the perfect uncluttered dataset remains elusive.
Undetected duplicates in SRA-DM occurred due to discrepancies such as missing page numbers or too much variance in author names. Different databases, such as PubMed and Web of Knowledge, can also report the title of the same record differently. Some of these problems could be overcome in the future with record linkage and citation enrichment techniques to populate blank fields with metadata and so increase the detection rate.
The deduplication program was developed to identify duplicate citations from biomedical databases and has not been tested on other bibliographic records, such as books or governmental reports, and therefore may not perform as well with other bibliographies. However, the deduplication program was developed iteratively to remove problems of false positives and was tested on four different datasets, which included comprehensive searches using 14 different databases that are used by information specialists; therefore, similar efficiencies should occur in other medical specialities.
Also, the accuracy of SRA-DM was consistently higher than that of EndNote, and these findings are probably generalisable to other biomedical database searches due to the same record types and fields being used. It is possible that some duplicates were not detected during the manual benchmarking process, although the database was screened twice (first by author and then by title), and additional cross-checking was performed by manually comparing the benchmark against the EndNote auto-deduplication and SRA-DM decisions, thus minimising the possibility of undetected duplicates.
Although we compared SRA-DM against the default EndNote deduplication setting, we recognise that some information specialists adopt additional steps whilst performing deduplication in EndNote.
However, many researchers and information specialists do not employ such techniques, and our aim was to address deduplication with an automated algorithm and compare it against the default deduplication process in EndNote. Qi [18] recommended employing a two-step strategy to address the problem of undetected duplicates: first performing auto-deduplication in EndNote, followed by manual hand-screening to identify remaining duplicates. This basic strategy is used by some information specialists and systematic reviewers but is inefficient due to the large proportion of unidentified duplicates.
Other more complex multi-stage screening strategies have been suggested [ 25 ] but are EndNote-specific and not viable for other reference management software. The deduplication algorithm has greater sensitivity and specificity than EndNote. Reviewers and information specialists incorporating SRA-DM into their research procedures will save valuable time and reduce resource waste. The algorithm is open source [ 26 ] and the SRA-DM program is freely available to users online [ 27 ].
It has the option of automatic duplicate removal or manual pair-wise duplicate screening, performed individually or with a co-reviewer.

References:
Database: J Biol Databases Curation.
Emerg Themes Epidemiol.
BMC Bioinformatics.
J Med Libr Assoc.
J Am Soc Inf Sci.
Med J Aust.
Searches for controlled trials of homoeopathy, ascorbic acid for common cold and ginkgo biloba for cerebral insufficiency and intermittent claudication. Pharm Weekbl Sci.
J Med Syst.
Ann Pharmacother.
J Rheumatol.
Royle P, Milne R: Literature searching for randomized controlled trials used in Cochrane reviews: rapid versus exhaustive searches.
Reference Manager.
PLoS One.