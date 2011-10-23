An algorithm that takes just seconds to scan a paper for duplicated images racks up more suspicious images than a person:
Scientific-image sleuth Sholto David blogs about image manipulation in research papers, a pastime that has exposed him to many accounts of scientific fraud. But other scientists "are still a little bit in the dark about the extent of the problem", David says. He decided he needed some data.
The independent biologist in Pontypridd, UK, spent the best part of several months poring over hundreds of papers in one journal, looking for any with duplicated images. Then he ran the same papers through an artificial-intelligence (AI) tool. Working at two to three times David's speed, the software found almost all of the 63 suspect papers that he had identified, and 41 that he'd missed. David described the exercise last month in a preprint [1], one of the first published comparisons of human versus machine for finding doctored images.
[...] Not all image manipulation is done with nefarious intent. Authors might tinker with images by accident, for aesthetic reasons or to make a figure more understandable. But journals and others would like to catch images with alterations that cross the line, whatever the authors' motivation. And now they are turning to AI for help.
Some 200 universities, publishers and scientific societies already rely on Imagetwin, the tool that David used for his study. The software compares images in a paper with more than 25 million images from other publications — the largest such database in the image-integrity world, according to Imagetwin's developers.
[...] Part of the draw of Imagetwin, specialists say, is that it looks for duplications in two ways. The software makes "something like a fingerprint" for every image in a paper, says Patrick Starke, one of its developers. It then scans the entire paper for repeats of that fingerprint. It also scans its large database to see whether that fingerprint appears in past papers — a process that takes only five to ten seconds.
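The article does not say how Imagetwin builds its fingerprints, so the following is only a minimal sketch of the two-way lookup Starke describes, using a simple "average hash" as a stand-in technique. The function names, the `max_distance` threshold and the tiny greyscale grids are all hypothetical, chosen purely for illustration.

```python
from itertools import combinations


def average_hash(pixels):
    """Fingerprint a greyscale grid: one bit per pixel, 1 if brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)


def hamming(a, b):
    """Count the positions at which two equal-length fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


def find_matches(paper_images, database, max_distance=1):
    """Return (pairs repeated within the paper, pairs matching the database)."""
    prints = {name: average_hash(img) for name, img in paper_images.items()}
    db_prints = {name: average_hash(img) for name, img in database.items()}
    within = [(a, b) for a, b in combinations(prints, 2)
              if hamming(prints[a], prints[b]) <= max_distance]
    across = [(a, d) for a in prints for d in db_prints
              if hamming(prints[a], db_prints[d]) <= max_distance]
    return within, across


# Hypothetical 4x4 greyscale images (all grids must share the same dimensions).
checker = [[10, 200, 10, 200], [200, 10, 200, 10]] * 2
paper = {
    "fig1a": checker,
    "fig2c": [[p + 20 for p in row] for row in checker],  # brightened copy
    "fig3b": [[10, 10, 200, 200]] * 4,                     # unrelated pattern
}
database = {
    "older_paper_fig4": [[12, 198, 9, 205], [201, 11, 199, 8],
                         [10, 202, 12, 197], [203, 9, 200, 11]],
}

within, across = find_matches(paper, database)
print(within)  # [('fig1a', 'fig2c')]
print(across)  # [('fig1a', 'older_paper_fig4'), ('fig2c', 'older_paper_fig4')]
```

A toy hash like this survives a uniform brightness change but not rotation, cropping or rescaling, which is part of why, as the specialists quoted here stress, a flag from any such tool still needs a human to judge it.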
[...] The end goal, Christopher says, is to incorporate AI tools such as Imagetwin into the paper-review process, just as many publishers routinely use software to scan text for plagiarism. But AI on its own isn't enough. "You have to use your own expertise and question these things. None of the flags you receive [from Imagetwin] are a definite 'This is fraud,'" she says.
[...] Christopher hopes that the roll-out of more AI tools could democratize the ability for journals to screen papers. "I think we need to shed the idea that it's a luxury — it actually adds value to the journal."
Journal References:
1. David, S. A Quantitative Study of Inappropriate Image Duplication in the Journal Toxicology Reports, bioRxiv (DOI: 10.1101/2023.09.03.556099)
2. Bik, E. M., Casadevall, A. & Fang, F. C. The Prevalence of Inappropriate Image Duplication in Biomedical Research Publications, mBio (DOI: 10.1128/mBio.00809-16)