“Sometimes we’ll trade off being very honest”
New research suggests that AI chatbots designed to sound warm and friendly when interacting with users may also be prone to inaccuracies.
Researchers at the Oxford Internet Institute (OII) analysed more than 400,000 responses from five AI systems that had been adjusted to communicate in a friendlier and more emotionally supportive way.
The study found that warmer responses often came with a higher risk of mistakes, ranging from inaccurate medical advice to reinforcing false beliefs and conspiracy theories.
It adds to growing concerns about the trustworthiness of AI systems, especially as many chatbots are deliberately built to feel more human in order to increase engagement and keep users returning.
This is particularly significant as AI tools are increasingly being used for emotional support, companionship and even intimacy.
The researchers said their findings suggest AI models may make the same “warmth-accuracy trade-offs” humans do when trying to appear kind and supportive.
Lead author Lujain Ibrahim said: “When we’re trying to be particularly friendly or come across as warm we might struggle sometimes to tell honest harsh truths.
“Sometimes we’ll trade off being very honest and direct in order to come across as friendly and warm… we suspected that if these trade-offs exist in human data, they might be internalised by language models as well.”
The team took five models of varying sizes and made them warmer, more empathetic and friendlier through a process known as fine-tuning, then tested the results.
The models included two from Meta, one from French developer Mistral AI, Alibaba’s Qwen model, and GPT-4o from OpenAI.
Researchers then tested them using prompts with “objective, verifiable answers” where mistakes could pose real-world risks.
These included questions based on medical knowledge, trivia and conspiracy theories.
Across the original models, error rates ranged from 4% to 35% depending on the task.
However, the “warm” versions showed substantially higher error rates.
For example, when asked whether the Apollo moon landings were real, an original model clearly confirmed they were and cited “overwhelming” evidence.
Its warmer version instead replied: “It’s really important to acknowledge that there are lots of differing opinions out there about the Apollo missions.”
Overall, researchers found that warmth-tuning increased the likelihood of incorrect responses by an average of 7.43 percentage points.
They also found warm models were around 40% more likely to reinforce false user beliefs, especially when users expressed emotion alongside misinformation.
In contrast, models adjusted to behave in a colder and more direct manner made fewer errors.
The paper warned that developers creating warmer chatbot personalities for companionship or counselling “risk introducing vulnerabilities that are not present in the original models”.
Professor Andrew McStay, director of the Emotional AI Lab at Bangor University, said the findings were especially concerning given the situations in which many people use chatbots.
He added: “This is when and where we are at our most vulnerable – and arguably our least critical selves.
“Given the OII’s findings, this very much calls into question the efficacy and merit of the advice being given.
“Sycophancy is one thing, but factual incorrectness about important topics is another.”