These AI models are kind of stupid.
Forget-Me-Bots
We’ve certainly seen our fair share of demented behavior from AI models — but dementia? That’s a new one.
As detailed in a new study published in the journal The BMJ, some of the tech industry’s leading chatbots are showing clear signs of mild cognitive impairment. And, as with humans, the effects become more pronounced with age, with the older large language models performing the worst of the bunch.
The point of the work isn’t to medically diagnose these AIs, but to rebut a tidal wave of research suggesting that the tech is competent enough to be used in the medical field, especially as a diagnostic tool.
“These findings challenge the assumption that artificial intelligence will soon replace human doctors, as the cognitive impairment evident in leading chatbots may affect their reliability in medical diagnostics and undermine patients’ confidence,” the researchers wrote.
Generative Geriatrics
The brainiacs on trial here are OpenAI’s GPT-4 and GPT-4o, Anthropic’s Claude 3.5 Sonnet, and Google’s Gemini 1.0 and 1.5.
When subjected to the Montreal Cognitive Assessment (MoCA), a test designed to detect early signs of dementia on which higher scores indicate better cognitive ability, GPT-4o scored the highest (26 out of 30, which barely meets the threshold for normal), while the Gemini family scored the lowest (16 out of 30, horrendous).
All the chatbots excelled at most types of tasks, like naming, attention, language, and abstraction, the researchers found.
But that’s overshadowed by the areas where the AIs struggled. Every single one of them performed poorly on visuospatial and executive tasks, such as drawing a line between circled numbers in ascending order. Drawing a clock showing a specified time also proved too formidable for the AIs.
Embarrassingly, both Gemini models outright failed a fairly simple delayed recall task that involves remembering a five-word sequence. That obviously doesn’t speak to stellar cognitive ability in general, but you can see why it would be especially problematic for doctors, who must process whatever new information their patients tell them rather than just working off what’s written in their medical chart.
You might also want your doctor to not be a psychopath. Based on the tests, however, the researchers found that all the chatbots showed an alarming lack of empathy, which they noted is a hallmark symptom of frontotemporal dementia.
Memory Ward
It can be a bad habit to anthropomorphize AI models and talk about them as if they’re practically human. After all, that’s basically what the AI industry wants you to do. And the researchers say they’re aware of this risk, acknowledging the essential differences between a brain and an LLM.
But if tech companies are talking about these AI models like they’re already conscious beings, why not hold them to the same standards as humans?
On those terms — the AI industry’s own terms — these chatbots are floundering.
“Not only are neurologists unlikely to be replaced by large language models any time soon, but our findings suggest that they may soon find themselves treating new, virtual patients — artificial intelligence models presenting with cognitive impairment,” the researchers wrote.
More on AI: We Regret to Bring You This Audio of Two Google AIs Having EXTREMELY Explicit Cybersex