
An artificial intelligence (AI) chatbot can write fake research-paper abstracts convincing enough that scientists often fail to spot them, according to a preprint posted on the bioRxiv server in late December (ref. 1). Researchers are divided over the implications for science.
“I’m very worried,” says Sandra Wachter, who studies technology and regulation at the University of Oxford, UK, but was not involved in the study. “When experts are unable to determine what is true, we lose the intermediary we desperately need to guide us on complex topics,” she adds.
The chatbot, ChatGPT, creates realistic and intelligent-sounding text in response to user prompts. It is a "large language model", a system based on neural networks that learns to perform a task by digesting vast amounts of existing human-generated text. OpenAI, a software company based in San Francisco, California, released the tool on November 30, and it is free to use.
Since its release, researchers have been grappling with the ethical issues surrounding its use, because much of its output can be difficult to distinguish from human-written text. Scientists have published a preprint (ref. 2) and an editorial (ref. 3) written by ChatGPT. Now a group led by Catherine Gao at Northwestern University in Chicago, Illinois, has used ChatGPT to generate artificial research-paper abstracts and test whether scientists can spot them.
The researchers asked the chatbot to write 50 medical-research abstracts based on a selection published in JAMA, The New England Journal of Medicine, The BMJ, The Lancet and Nature Medicine. They then compared these with the original abstracts by running them through a plagiarism detector and an AI-output detector, and asked a group of medical researchers to spot the fabricated abstracts.
Under the radar
Abstracts generated by ChatGPT passed the plagiarism checker: the median originality score was 100%, indicating that no plagiarism was detected. The AI-output detector spotted 66% of the generated abstracts. Human reviewers did not do much better, correctly identifying only 68% of the generated abstracts and 86% of the genuine abstracts. They incorrectly identified 32% of the generated abstracts as genuine and 14% of the genuine abstracts as generated.
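The error rates quoted above are simply the complements of the correct-identification rates. A minimal sketch of that arithmetic follows; the variable and function names are illustrative and do not come from the preprint.

```python
# Percentages reported for the human reviewers in the article.
GENERATED_CORRECTLY_SPOTTED = 68   # generated abstracts identified as generated
GENUINE_CORRECTLY_SPOTTED = 86     # genuine abstracts identified as genuine
DETECTOR_SPOTTED = 66              # generated abstracts caught by the AI-output detector


def missed_rate(correct_percent: int) -> int:
    """Return the percentage that was misclassified, given the percentage classified correctly."""
    return 100 - correct_percent


# Generated abstracts mistaken for genuine ones.
generated_missed = missed_rate(GENERATED_CORRECTLY_SPOTTED)
# Genuine abstracts mistaken for generated ones.
genuine_flagged = missed_rate(GENUINE_CORRECTLY_SPOTTED)
# Gap, in percentage points, between the human reviewers and the detector.
human_minus_detector = GENERATED_CORRECTLY_SPOTTED - DETECTOR_SPOTTED

print(generated_missed, genuine_flagged, human_minus_detector)
```

The two-point gap between reviewers and the detector is the basis for saying the humans did only slightly better than the software.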
“ChatGPT writes authoritative scientific summaries,” Gao and colleagues say in the preprint. “The boundaries of the ethical and acceptable use of large language models to help write scientific papers have yet to be determined.”
Wachter says that if scientists can’t tell whether research is true, there could be “disastrous consequences”. Not only is this a problem for researchers, who could be led down flawed routes of investigation because the research they are reading has been fabricated, but it also has implications for society as a whole, because “scientific research plays a tremendous role in our society”. For example, it could mean that research-informed policy decisions are incorrect, she adds.
But Arvind Narayanan, a computer scientist at Princeton University in New Jersey, says that whether the generated abstracts can be detected is “irrelevant”. “The question is whether the tool can generate an accurate and compelling summary,” he says.
Irene Solaiman, who studies the social impact of AI at Hugging Face, an AI company headquartered in New York and Paris, is concerned about any reliance on large language models for scientific thinking. “These models are trained on information from the past, and social and scientific progress can often come from thinking, or being open to thinking, differently from the past,” she adds.
The authors suggest that those evaluating scientific communications, such as research papers and conference proceedings, should put policies in place to stamp out the use of AI-generated text. If institutions choose to allow use of the technology in certain cases, they should establish clear rules on disclosure. Earlier this month, the 40th International Conference on Machine Learning, a major AI conference to be held in Honolulu, Hawaii, in July, announced that it would ban papers written by ChatGPT and other AI language tools.
Solaiman adds that in fields where false information can endanger people’s safety, such as medicine, journals may need to take a more rigorous approach to verifying that information is accurate.
Narayanan says that the solution to these problems should not focus on the chatbot itself, “but rather on the perverse incentives that lead to this behavior, such as universities conducting hiring and promotion reviews by counting papers regardless of their quality or impact”.
This article is reproduced with permission and was first published on January 12, 2023.