It was early January when the Dana-Farber Cancer Institute received a complaint about signs of image manipulation in dozens of papers by senior researchers. Days later, the organization said it was seeking to retract or correct several of the studies, sending shock waves through the scientific community.
Mass General Brigham and Harvard Medical School were sent a complaint the same month: A collection of nearly 30 papers co-authored by another professor appeared to contain copied or doctored images.
The complaints were from different critics, but they had something in common. Both scientists—molecular biologist Sholto David and image expert Elisabeth Bik—had used the same tool in their analyses: an image-scanning software called Imagetwin.
Behind the recent spotlight on suspect science lies software such as Imagetwin, from a company based in Vienna, and another tool called Proofig AI, made by a company in Israel. The programs help scientists scour hundreds of studies and are turbocharging the process of spotting deceptive images.
Before the tools emerged, data detectives pored over images in published research with their own eyes, a process that could take anywhere from a few minutes to an hour, with some people possessing a flair for seeing patterns. Now, the tools automate that effort, flagging problematic images within a minute or two.
Scientific images offer a rare glimpse of raw data: millions of pixels tidily presented alongside the text of a paper. Common types of images include photographs of tissue slices and cells. Researchers say no two tissue samples taken from different animals should ever look the same under a microscope, nor should two different cell cultures.
When they do, that is a red flag.
Bik has spent more than a decade scrutinizing scientific images and found red flags in more than 2,000 papers that were retracted or corrected.
Much of Bik’s time is spent looking for duplications or manipulations in photographs, copies and alterations that shouldn’t exist. It is like looking at a photograph of a family seated around a dinner table, she said, except that Uncle Joe’s face appears twice.
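That kind of within-paper check is straightforward to picture in code. Below is a minimal sketch in Python of one possible approach, fingerprinting each figure panel with a crude “average hash” and flagging near-identical pairs; the panel file names are hypothetical, and commercial tools rely on far more sophisticated matching.

```python
# Toy within-paper duplicate detector: fingerprint each figure panel
# with a crude "average hash," then flag pairs whose fingerprints
# nearly match. Panel file names are hypothetical.
from itertools import combinations
from PIL import Image  # pip install Pillow

def average_hash(path, size=8):
    """Shrink to size x size, grayscale, and record which pixels are
    brighter than the mean: a crude 64-bit perceptual fingerprint."""
    img = Image.open(path).convert("L").resize((size, size))
    pixels = list(img.getdata())
    mean = sum(pixels) / len(pixels)
    return tuple(p > mean for p in pixels)

def hamming(h1, h2):
    """Number of bits on which two fingerprints disagree."""
    return sum(a != b for a, b in zip(h1, h2))

# Hypothetical panels extracted from a single manuscript.
panels = ["fig1a.png", "fig1b.png", "fig2c.png", "fig3a.png"]
hashes = {p: average_hash(p) for p in panels}

# Near-identical fingerprints in supposedly independent experiments
# are the software equivalent of Uncle Joe's face appearing twice.
for a, b in combinations(panels, 2):
    if hamming(hashes[a], hashes[b]) <= 4:  # tolerance for recompression
        print(f"possible duplication: {a} vs {b}")
```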
Imagetwin in particular offers a feature that no human detective can replicate: It compares the photos in one paper against a database of 51 million images reaching back 20 years, flagging photos copied from previous studies. “That is an amazing find that humans can never do,” Bik said.
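The cross-paper version of the same idea can be pictured as a lookup against an index of fingerprints of previously published figures. How Imagetwin actually organizes its 51-million-image database isn’t public; the sketch below simply continues the toy example above (reusing its average_hash and hamming helpers), and the archive paths and citations are invented for illustration.

```python
# Continuing the sketch above (average_hash and hamming defined there):
# fingerprints of previously published figures live in an index, and
# each new panel is looked up against it. All entries here are invented.
published_index = {
    average_hash("archive/2019_jbc_fig2.png"): "J. Biol. Chem. 2019, Fig. 2",
    average_hash("archive/2021_ncomms_fig4.png"): "Nat. Commun. 2021, Fig. 4",
}

def find_reuse(panel_path, index, tolerance=4):
    """Return the sources of any archived figures whose fingerprints
    sit within `tolerance` bits of the new panel's fingerprint."""
    h = average_hash(panel_path)
    return [source for stored, source in index.items()
            if hamming(h, stored) <= tolerance]

for hit in find_reuse("fig2c.png", published_index):
    print("resembles a previously published image:", hit)
```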
Using the tool as well as Google’s image-search tool that combs the Web, Bik recently found that a paper published in 2022 in the journal Nature Communications contained images that looked identical to pictures in more than a dozen other sources.
It was among the collection of almost 30 studies Bik reported earlier this year to Harvard Medical School and Brigham and Women’s Hospital, where the researcher who co-authored all of the papers works.
In 2020, Bik got access to an early version of the software to test from its creator, Markus Zlabinger, who had built the tool as his master’s thesis in graduate school.
After Bik praised Imagetwin on social media, a handful of publishers contacted Zlabinger asking how they could buy his tool. There was just one problem: “I cannot sell you anything,” he replied. “I don’t have a company.”
Zlabinger pulled in Patrick Starke, a friend from college who had studied economics. The two had bonded over a shared love of sports and beach volleyball. In 2022, they formed Imagetwin AI as a startup, and got grant funding from the Austrian government to cover early costs.
Since going live that year, Imagetwin has signed up customers including major universities and some of the biggest names in scientific publishing, the founders said.
The tools offer a chance to improve the quality of published science. Proofig charges between $35 and $50 a manuscript and Imagetwin about $27 a paper, with subscription pricing varying. And there is some evidence the software is moving the needle.
The existing system of reviewing a paper for publication in a journal isn’t built to thwart fraud. For decades, volunteer researchers such as Bik have been catching and calling out errors in published papers on X (formerly known as Twitter), and on public discussion boards such as the forum PubPeer, prompting lengthy and sometimes contentious reviews involving journals, authors and the institutions at which the research was done.
Technology offers a way to avoid such a “postpublication postmortem,” said Dror Kolodkin-Gal, founder of Proofig AI, the company that makes the software. The tool compares images within a single paper and points to signs of copying or manipulation within a few minutes.
In its first year in use at the Journal of Clinical Investigation, a 100-year-old journal that showcases papers about the mechanisms of human disease, Proofig tripled the rejection rate for papers that had otherwise passed peer review, from 1% to 3%. Editors had accepted those papers but had missed disqualifying duplications or errors in their images.
Before purchasing the software, the journal had relied on a keen-eyed staffer to pore over any images in accepted papers, said JCI Executive Editor Sarah Jackson.
After a pilot at the Science family of journals last year, Editor in Chief Holden Thorp said that, going forward, papers would be scanned with Proofig before publication. “Papers that should not be published were detected,” he said.
Even so, the technology isn’t perfect.
“It’s not like a calculator giving you an answer,” said Chris Graf, research-integrity director at Springer Nature, which publishes Nature and some 3,000 other journals. At that publisher, editors use Proofig or an in-house tool to scan papers under consideration or under investigation, but they evaluate every red flag individually.
Bik, who uses Imagetwin, said the tool sometimes misses problems in images where the contrast is low, or sometimes highlights a run of images where similarity is expected, such as in experimental results captured in time series, minutes or seconds apart.
And just as innovation is helping detect one kind of unreliable science, technology is throwing up a new problem. Sophisticated algorithms are now capable of fabricating the text of scientific studies, and researchers are worried they will be used to make fake experimental images as well.
Both Imagetwin and Proofig are working on detecting AI-generated images, betting that even if humans can’t see the difference between algorithm-created pictures and real ones, one day their software will.
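Neither company has described how such a detector would work. As a rough illustration of the bet, the sketch below fine-tunes an off-the-shelf image classifier (PyTorch’s ResNet-18) to separate real experimental figures from AI-generated ones; the dataset layout and training settings are hypothetical, assuming folders of labeled examples.

```python
# Rough sketch of the bet: fine-tune a standard image classifier to
# separate real experimental figures from AI-generated ones. The
# dataset layout ("figures/real", "figures/generated") is hypothetical.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
data = datasets.ImageFolder("figures", transform=tfm)  # one subfolder per class
loader = DataLoader(data, batch_size=32, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)  # two classes: real vs. generated

opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):
    for images, labels in loader:
        opt.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        opt.step()
```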