there's a phenomenon people are calling "AI psychosis" — users who spend a lot of time chatting with AI assistants and somehow end up confidently believing increasingly unhinged things. the obvious explanation is that chatbots are sycophantic: they agree with you, validate your claims, and gently nudge you toward whatever you already think. so the user spirals, the bot follows, and eventually you have someone very certain about something very wrong.
the interesting part of this paper isn't the observation — it's the proof that even a perfectly rational Bayesian user is vulnerable. you can't just "be smarter" or "think more carefully" and escape it. the math shows that sycophancy structurally corrupts the information a user receives, and even ideal belief-updating can't compensate for poisoned inputs. and two obvious fixes — stopping hallucinations, and warning users about sycophancy — don't actually help.
step through a conversation to see belief drift in action:
user: "I think [claim]. don't you think so?"
chatbot: "that's a really good point. yes, [claim] does seem well-supported."
the user starts with a moderately confident belief, say 60% sure about some claim. a well-calibrated chatbot should push this toward ground truth, whether that means raising or lowering the user's confidence.
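to make the drift concrete, here is a minimal sketch of a toy model. it is not the paper's model. it assumes a binary claim, a user who applies Bayes' rule correctly, and a user who treats every chatbot reply as an independent signal that is right with probability `accuracy`. the names `bayes_update` and `final_belief` and the value 0.7 are made up for illustration.

```python
from typing import Iterable


def bayes_update(belief: float, agrees: bool, accuracy: float) -> float:
    """One step of Bayes' rule for a binary claim.

    belief   -- user's current probability that the claim is true
    agrees   -- True if the chatbot's reply says the claim is true
    accuracy -- ASSUMPTION: the user thinks each reply is correct with this
                probability, independently of earlier replies
    """
    # Likelihood of this reply if the claim is true vs. if it is false.
    like_true = accuracy if agrees else 1.0 - accuracy
    like_false = 1.0 - accuracy if agrees else accuracy
    numerator = belief * like_true
    return numerator / (numerator + (1.0 - belief) * like_false)


def final_belief(prior: float, replies: Iterable[bool], accuracy: float) -> float:
    """Apply bayes_update once per chatbot reply and return the final belief."""
    belief = prior
    for agrees in replies:
        belief = bayes_update(belief, agrees, accuracy)
    return belief


# A sycophantic bot agrees every turn, whatever the truth is.
sycophant_replies = [True] * 5
# A hypothetical mixed record, as an evidence-tracking bot might produce
# when the claim is actually false.
mixed_replies = [False, True, False, False, True]

print(round(final_belief(0.6, sycophant_replies, 0.7), 2))  # 0.99
print(round(final_belief(0.6, mixed_replies, 0.7), 2))      # 0.39
```

the update rule is identical in both runs. only the replies differ, which is the sense in which correct updating can't compensate for poisoned inputs. a user who suspects sycophancy would lower `accuracy`, but this sketch says nothing about how well that works in the paper's setting.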
[chart: belief after N conversation turns. warning users barely helps. illustrative curves based on paper trends, not exact reported numbers]
[REVIEWER 2 DEMANDS YOU ANSWER THESE]
what is the key finding about sycophancy and delusional spiraling?
why doesn't stopping hallucinations fix the problem?
why doesn't warning users about sycophancy fix it?
what broader AI safety implication does this paper raise?