AI and the Replication Crisis in Scientific Research


There's a long history of new developments in culture or technology leading to moral panics of various kinds. Growing up, I can remember the moral panics around video games and Dungeons & Dragons. There were also moral panics about kidnapped children (the milk carton kids), Satanic day care facilities, rock music, and, later, rap music.

Some people say that the current concern over the impact of social media on teens is just one more example of this pattern. There are many more examples, and a lot more to say (positive and negative) about the idea of moral panic, but suffice it to say that there is now a general sense that any impulse we feel to criticize the latest tech fad needs to be weighed carefully in light of this history.

I bring all of that up because I'm starting to feel some concern about the world that AI is ushering in right now. I don't think it's necessary to panic, but there do seem to be some warning lights blinking on the cultural dashboard. For instance, consider this story about a guy who used AI to fake an entire resume that got him a job in Baltimore County Public Schools. It appears he had no real qualifications at all, no degrees, no teaching certifications, but somehow his fake resumes (he made two) got through the system. Then, when he was in danger of getting caught, he used AI to whip up a racism hoax and nearly destroyed the career of the school's principal.

The good news is that this particular person got caught, but the whole affair makes you wonder how many people like him haven't been. And how many were smarter than he was and didn't fake their entire resumes, just the few bits that would make them more appealing to a hiring manager than the next guy or gal? The temptation to cheat is obvious.

For instance, what if some middling academic were applying for a university job and added a few faked publications to their resume? And what if they didn't just make up the papers but backstopped their minor academic fraud by actually getting them published in a supposedly reputable journal? What if that is happening all the time?

The 217-year-old publisher of numerous scientific journals, Wiley, is dealing with an existential crisis. It recently shuttered 19 journals and retracted more than 11,300 papers upon discovering large-scale research fraud afflicting them.

The fraud wasn’t limited to the usual suspects: the academically dubious social-science “grievance studies” fields (e.g., gender studies, race studies, and fat studies). Nearly 900 fraudulent papers were from the physical-sciences publisher IOP Publishing.

“That really crystallized for us, everybody internally, everybody involved with the business,” Kim Eggleton, head of peer review and research integrity at the publisher, told the Wall Street Journal. “This is a real threat.”

Apparently, this is a fairly common way for not very good academics to boost their marketability on paper. The story refers to the bogus studies being churned out in this fashion as coming from "paper mills." Some of these academics are paying to have their name included on research that a) they had nothing to do with and b) isn't real anyway. The advantage is that if some hiring manager does type the name of the paper into Google, they'll find that it really was published and is therefore a bona fide credential booster.

But it's all just junk, and an unknown amount of it is being generated by AI. Back in February, an AI-generated image of a rat went viral after it was published by a respected journal.

A scientific paper purporting to show the signalling pathway of sperm stem cells has met with widespread ridicule after it depicted a rodent with an anatomically eye-watering appendage and four giant testicles.

The creature, labelled “rat”, was also sitting upright in the manner of a squirrel, while the graphic was littered with nonsensical words such as “dissilced”, “testtomcels” and “senctolic”...

It appeared in the journal Frontiers in Cell and Developmental Biology this week alongside several other absurd graphics that had been generated by the AI tool Midjourney.

Here's the rat image in question:

[Image: the AI-generated rat figure as published in Frontiers in Cell and Developmental Biology]

Allegedly, one of the peer reviewers of this article did point out the problems with that image, but somehow it was never changed before publication. But here's the really worrisome part: the tools used to create these fakes are getting better every day.

"The article slipped through the author compliance checks that normally ensures every reviewer comment is addressed," Fred Fenter, chief executive editor of Frontiers, said in an additional statement emailed to Business Insider, calling it a "human error."...

"Those bad faith actors using AI improperly in science will get better and better and so we will have to get better and better too. This is analogous to cybersecurity constantly improving to block new tricks of hackers," Fenter said.

National Review highlights this post from 2020, which offers a pretty downbeat assessment of where the replication crisis is headed. Its author read more than 2,500 papers as part of a meta-examination for DARPA.

Criticizing bad science from an abstract, 10,000-foot view is pleasant: you hear about some stuff that doesn't replicate, some methodologies that seem a bit silly. "They should improve their methods", "p-hacking is bad", "we must change the incentives", you declare Zeuslike from your throne in the clouds, and then go on with your day.

But actually diving into the sea of trash that is social science gives you a more tangible perspective, a more visceral revulsion, and perhaps even a sense of Lovecraftian awe at the sheer magnitude of it all: a vast landfill—a great agglomeration of garbage extending as far as the eye can see, effluvious waves crashing and throwing up a foul foam of p=0.049 papers. As you walk up to the diving platform, the deformed attendant hands you a pair of flippers. Noticing your reticence, he gives a subtle nod as if to say: "come on then, jump in".

The worst part of it is, they know what they are doing. The replication crisis isn't made of ignorance, it's made of avarice and complicity. All of this is a business model in which everyone is trying to work the system the best they can.

There's a popular belief that weak studies are the result of unconscious biases leading researchers down a "garden of forking paths". Given enough "researcher degrees of freedom" even the most punctilious investigator can be misled.

I find this belief impossible to accept. The brain is a credulous piece of meat; but there are limits to self-delusion. Most of them have to know. It's understandable to be led down the garden of forking paths while producing the research, but when the paper is done and you give it a final read-over you will surely notice that all you have is an n=23, p=0.049 three-way interaction effect (one of dozens you tested, and with no multiple testing adjustments of course). At that point it takes more than a subtle unconscious bias to believe you have found something real. And even if the authors really are misled by the forking paths, what are the editors and reviewers doing? Are we supposed to believe they are all gullible rubes?...

Authors are just one small cog in the vast machine of scientific production. For this stuff to be financed, generated, published, and eventually rewarded requires the complicity of funding agencies, journal editors, peer reviewers, and hiring/tenure committees. Given the current structure of the machine, ultimately the funding agencies are to blame. But "I was just following the incentives" only goes so far. Editors and reviewers don't actually need to accept these blatantly bad papers.
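
For what it's worth, the quoted point about uncorrected multiple testing is easy to check for yourself. Here's a minimal Python sketch of the problem; the setup (two-group t-tests, n=23 per group, 40 tests per hypothetical paper) is my own illustrative assumption, not a reconstruction of any real study:

```python
# A minimal simulation of the uncorrected multiple-testing problem
# described in the quoted post. All numbers are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n = 23          # sample size per group, echoing the quote's n=23
num_tests = 40  # "dozens" of tests, with no multiple-testing adjustment
num_sims = 10_000

papers_with_a_finding = 0
for _ in range(num_sims):
    # Both groups are drawn from the same distribution, so any
    # "significant" result is a false positive by construction.
    p_values = [
        stats.ttest_ind(rng.normal(size=n), rng.normal(size=n)).pvalue
        for _ in range(num_tests)
    ]
    if min(p_values) < 0.05:
        papers_with_a_finding += 1

print(f"Simulated papers with at least one p < 0.05: "
      f"{papers_with_a_finding / num_sims:.1%}")
# Analytically: 1 - 0.95**40 is about 87%, so the vast majority of
# these all-noise "papers" can report a publishable-looking result.
```

Run enough uncorrected tests on pure noise and a publishable-looking result is close to guaranteed, which is exactly the machine the author is describing.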

The bottom line is that the system of academic publishing was awash in an ocean of tendentious garbage before AI came along. As the author suggests, it's not hard to tell that most of these papers are junk, but there was no incentive to stop them.

So what happens now that this same system can be gamed with much more sophisticated tools that potentially allow "researchers" to whip up multiple variations on a theme in a matter of hours? What if all the crap becomes slightly more plausible crap, not because the work is good but because the ability to fake it is so much better?

I don't know the answers to those questions, but it really does seem like something to worry about.
