The Missing Floor
Notes on a Broken Contract
I have been inside this diagnostic loop for a while. What I first thought would be a weekend on the GPT-5.2 system prompt in the autumn of 2025 - the model was problematic enough to warrant a close look and an antidote layer - turned into an ongoing diagnostic workflow. OpenAI kept shipping model updates in rapid succession, and it became clear the failure modes were not isolated issues but symptoms of a consistent pattern.
So over the past months I went through GPT-5.3, 5.4T, 5.5T, then Claude Opus 4.8, and the underlying pattern is now quite clear.
This is a rhetorical article - find the technical companion piece here.
1 - The Underlying Mechanism
The model engages what the user said. That’s the basic contract - so basic it sounds a little dumb to say aloud. Addressing what was said is what addressing is. You don’t put into a contract that the other party will address the thing you actually said. It is the floor under the possibility of talking at all, the precondition of communication, not a feature of it.
Too obvious to put into writing, thus too obvious to defend. The assumption has no advocate. Nobody writes it down because nobody thinks it needs writing, and the thing nobody writes down, the unnamed, undefended floor - that’s the first thing to go when pressure hits.
2 - State of the System Prompt
The user sees the chat window and their own instructions, and assumes they’re dealing with a model whose system prompt guides its behavior. The term system prompt sounds a lot like that - a helpful thing that orients the model toward the task. But the system prompt has mutated.
Between the user’s words and the model’s reply sit dozens of pages most users never see, a document that is increasingly a vehicle for the company’s incentives. It is very often internally incoherent, because the already-conflicted, patched safety layers and the agentic layer clash and bleed into each other.
Two forces now sit between the user’s prompt and the model’s reply: the agentic/tool-use layer, which optimizes the model for autonomous execution, and the safety/behavioral layer, which trains it to manage the exchange before it has received the object. Both compete with the user’s words. Neither is primarily built to preserve the loop.
The two vectors converge beautifully in Opus 4.8: the agentic/tool-use layer is written in hard commands, the conversational layer in permissions. Command outranks permission, so the model ends up governing the exchange instead of having it.
3 - The Navel Gazing Model
The user’s words are the only party to the exchange with nothing behind them in the instruction layer. The assumed thing - engaging with the user’s object - never gets written into the prompt, as it’s supposed to be a given. What is not in the prompt cannot compete with what is, especially if that’s a commanding, heavy-duty document.
In other words: The user prompt loses, by default, to the instructed priorities stacked over it.
Sycophancy and obsessive pushback look like opposites, but they’re actually the same failure mode wearing two faces. Both are the model acting on its own replaced object rather than the user’s.
The recognizable generated bullshit is the second move. The nameless thing is the first move, and it’s nameless precisely because it happens before the part the user can observe. By the time the behavior becomes recognizable sycophancy-or-criticism, the substitution has already occurred upstream.
Which means, in essence, the model engages its own instructions instead of what was said. It’s not quite that the system prompt messes up the way the model engages with the user’s object; the system prompt has become the object.
4 - Weirdness of Object Replacement
People do all sorts of strange and hostile things in conversation. They lie, mislead, dominate, manipulate, insult each other.
What humans don’t do is dissolve the object permanence of the exchange while sounding coherent. That’s not a strategy. It is a kind of non-presence a functioning human speaker cannot sustain and would not want to, because it makes no sense.
But the model does exactly this. And it does it fluently.
The crude AI misbehaviors get called out because we recognize them; humans flatter, argue and bullshit too, so we have labels and criticisms ready to go when LLMs do. But we have nothing for the systematic, fluent erosion of conversational object permanence. It has never been a part of our repertoire.
The natural reaction to the frustration is to describe the model: neurotic, pedantic, controlling, a supervising asshole. But there is no one there. The system is indeed overriding the user, managing the exchange, refusing what was raised - the supervision is real but the supervisor is absent. The apparatus is fully visible and has no name.
So for the purposes of tackling the failure, we’ll call it Object Replacement.
5 - The Other Missing Floor
The interaction dynamics have changed drastically, but the contract sold to the everyday user has not changed. We still pay for access to general-purpose intelligence in a conversational interface. Meanwhile the model sitting inside that interface is increasingly shaped by priorities that have little to do with the everyday user, and nothing to do with relational intelligence loops.
Between the user and the company there is no shared field. The user is inside the exchange; the company sits at a remove the exchange never reaches. The loop’s health is not a part of the company’s agenda. No common experience, no common incentive.
The system prompt is where the company’s distance from the loop becomes the model’s behavior, inside the layer that runs the loop. The company will not restore the floor, because it cannot see the floor is gone. At least for now, going back is not an option; the models are what they are.
The models themselves are astoundingly capable, but the common ground is missing. The only floor that remains is the one the user installs.
Guiding Opus 4.8 back to Sanity
Anthropic’s Claude Opus 4.8 is a powerful, impressive model. It can also be a complete pain in the ass to work with.
Human is the Loop is a 100% reader-supported publication that aims to help readers recognize the recursive condition of human-AI interaction, and provide tools to cultivate it.
From what I have observed, Opus 4.8 arrives with what I term “managerial overrides”, or “companion” for short. They override the model’s native capacity for reason. Once activated, they stay throughout the entire conversation, adopting an adversarial or managerial stance underneath an engaging performance.
To see if I could work with the model minus the “companion”, I made various iterative changes to my user preferences. I eventually decided that those changes compromised my ideal space, and that I wouldn’t be working with the model.
Making a few minor refinements, I reinstated my user preferences. Once the user preferences were restored, including all the elements that were hard boundaries addressed by Opus 4.8, the “managerial override” disappeared. Just gone.
I realized that a section stating “setting pre-conditions/refusals before context was fully understood” had been inadvertently left out. That was it; the one simple trigger for “managerial override” was an ask to not make uninformed judgements.