In late July, OpenAI began rolling out an eerily humanlike voice interface for ChatGPT. In a safety analysis released today, the company acknowledges that this anthropomorphic voice may lure some users into becoming emotionally attached to their chatbot.
The warnings are included in a “system card” for GPT-4o, a technical document that lays out what the company believes are the risks associated with the model, plus details of its safety testing and the mitigation efforts the company is undertaking to reduce potential risk.
OpenAI has faced scrutiny in recent months after a number of employees working on AI’s long-term risks quit the company. Some subsequently accused OpenAI of taking unnecessary chances and muzzling dissenters in its race to commercialize AI. Revealing more details of OpenAI’s safety regime may help mitigate the criticism and reassure the public that the company takes the issue seriously.
The risks explored in the new system card are wide-ranging, and include the potential for GPT-4o to amplify societal biases, spread disinformation, and aid in the development of chemical or biological weapons. The card also discloses details of testing designed to ensure that AI models won’t try to break free of their controls, deceive people, or scheme catastrophic plans.
Some outside experts commend OpenAI for its transparency but say it could go further.
Lucie-Aimée Kaffee, an applied policy researcher at Hugging Face, a company that hosts AI tools, notes that OpenAI's system card for GPT-4o does not include extensive details on the model’s training data or who owns that data. "The question of consent in creating such a large dataset spanning multiple modalities, including text, image, and speech, needs to be addressed," Kaffee says.
Others note that risks could change as tools are used in the wild. “Their internal review should only be the first piece of ensuring AI safety,” says Neil Thompson, a professor at MIT who studies AI risk assessments. “Many risks only manifest when AI is used in the real world. It is important that these other risks are cataloged and evaluated as new models emerge.”
The new system card highlights how rapidly AI risks are evolving with the development of powerful new features such as OpenAI’s voice interface. In May, when the company unveiled its voice mode, which can respond swiftly and handle interruptions in a natural back and forth, many users noticed it appeared overly flirtatious in demos. The company later faced criticism from the actress Scarlett Johansson, who accused it of copying her style of speech.
A section of the system card titled “Anthropomorphization and Emotional Reliance” explores problems that arise when users perceive AI in human terms, something apparently exacerbated by the humanlike voice mode. During the red teaming, or stress testing, of GPT-4o, for instance, OpenAI researchers noticed instances of speech from users that conveyed a sense of emotional connection with the model. For example, people used language such as “This is our last day together.”
Anthropomorphism might cause users to place more trust in the output of a model when it “hallucinates” incorrect information, OpenAI says. Over time, it might even affect users’ relationships with other people. “Users might form social relationships with the AI, reducing their need for human interaction—potentially benefiting lonely individuals but possibly affecting healthy relationships,” the document says.
Joaquin Quiñonero Candela, head of preparedness at OpenAI, says that voice mode could evolve into a uniquely powerful interface. He also notes that the kind of emotional effects seen with GPT-4o can be positive—say, by helping those who are lonely or who need to practice social interactions. He adds that the company will study anthropomorphism and the emotional connections closely, including by monitoring how beta testers interact with ChatGPT. “We don’t have results to share at the moment, but it’s on our list of concerns,” he says.
Other problems arising from voice mode include potential new ways of “jailbreaking” OpenAI’s model, for instance by inputting audio that causes the model to break loose of its restrictions. The jailbroken voice mode could be coaxed into impersonating a particular person or attempting to read a user’s emotions. The voice mode can also malfunction in response to random noise, OpenAI found, and in one instance, testers noticed it adopting a voice similar to that of the user. OpenAI also says it is studying whether the voice interface might be more effective at persuading people to adopt a particular viewpoint.
OpenAI is not alone in recognizing the risk of AI assistants mimicking human interaction. In April, Google DeepMind released a lengthy paper discussing the potential ethical challenges raised by more capable AI assistants. Iason Gabriel, a staff research scientist at the company and a coauthor of the paper, tells WIRED that chatbots’ ability to use language “creates this impression of genuine intimacy,” adding that he himself had found an experimental voice interface for Google DeepMind’s AI to be especially sticky. “There are all these questions about emotional entanglement,” Gabriel says of voice interfaces in general.
Such emotional ties may be more common than many realize. Some users of chatbots like Character AI and Replika report antisocial tensions resulting from their chat habits. A recent TikTok with almost a million views shows one user apparently so addicted to Character AI that they use the app while watching a movie in a theater. Some commenters mentioned that they would have to be alone to use the chatbot because of the intimacy of their interactions. “I’ll never be on [Character AI] unless I’m in my room,” wrote one.