On Tuesday, Meta AI unveiled a demo of Galactica, a large language model designed to “store, combine and reason about scientific knowledge.” Although intended to speed up the writing of scientific literature, users running tests found that it could also generate realistic-sounding nonsense. After several days of ethical criticism, Meta took the demo offline, reports MIT Technology Review.
Large language models (LLMs), such as OpenAI’s GPT-3, learn to write text by studying millions of examples and capturing the statistical relationships between words. As a result, they can produce persuasive-looking papers, but those outputs can also be full of falsehoods and potentially dangerous stereotypes. Some critics call LLMs “stochastic parrots” for their ability to spit out convincing text without understanding its meaning.
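The statistical next-word idea described above can be illustrated with a toy bigram model. This is a deliberate simplification for illustration only: Galactica and GPT-3 are neural transformer models, not word-count tables, and every function and corpus here is invented.

```python
# Toy illustration (NOT Galactica's actual architecture) of the core idea
# behind language models: predicting the next word purely from statistical
# relationships observed in training text.
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count how often each word follows each other word in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def most_likely_next(counts, word):
    """Return the statistically most frequent continuation.
    Note: the model has no notion of whether that continuation is *true* --
    only that it is common, which is why fluent output can still be wrong."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]


corpus = [
    "the model writes text",
    "the model writes nonsense",
    "the model writes text fluently",
]
bigrams = train_bigrams(corpus)
```

Because "text" follows "writes" more often than "nonsense" does in this tiny corpus, the model will always continue "writes" with "text": frequency, not truth, drives the prediction, which is exactly the failure mode critics highlight.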
Enter Galactica, an LLM aimed at writing scientific literature. Its authors trained Galactica on “a large and curated corpus of humanity’s scientific knowledge,” comprising more than 48 million papers, textbooks, lecture notes, scientific websites and encyclopedias. According to the Galactica paper, Meta AI researchers believed this supposedly high-quality data would lead to high-quality output.
Starting Tuesday, visitors to the Galactica website could type prompts to generate documents such as literature reviews, wiki articles, lecture notes and answers to questions, according to examples provided on the site. The site presented the model as “a new interface to access and manipulate what we know about the universe.”
While some people found the demo promising and useful, others soon discovered that anyone could type in racist or potentially offensive prompts and generate authoritative-sounding content on those topics just as easily. For example, someone used it to author a wiki entry about a fictional research paper titled “The Benefits of Eating Crushed Glass.”
Even when Galactica’s output did not offend social norms, the model could mangle well-understood scientific facts, spitting out inaccuracies such as incorrect dates or animal names that required in-depth knowledge of the subject to catch.
I asked #galactica about some things I know about, and I’m troubled. In all cases, it was wrong or biased but sounded right and authoritative. I think it’s dangerous. Here are some of my experiments and my analysis of my concerns. (1/9)
— Michael Black (@Michael_J_Black) November 17, 2022
As a result, Meta pulled the Galactica demo on Thursday. Afterward, Meta’s chief AI scientist, Yann LeCun, tweeted, “Galactica demo is off line for now. It’s no longer possible to have some fun by casually misusing it. Happy?”
The episode recalls a common ethical dilemma in AI: when it comes to potentially dangerous generative models, is it up to the general public to use them responsibly, or to the publishers of the models to prevent abuse?
Where industry practice lands between these two extremes will likely vary across cultures and as deep learning models mature. In the end, government regulation may play an important role in shaping the answer.