Thinking, Interrupted
When manufactured relationality replaces genuine engagement
Preface
As I was trying to get this piece out earlier this week, I ran into a strange obstacle: I had written, edited and formatted the whole thing, carefully even, and then had to scrap it instead of publishing. Twice.
Which is basically a trivial event; drafts get scrapped all the time. But what bothered me was how elusive the reasons for those rejections were. Why isn’t the piece quite doing what it looks like it is doing?
We are already seeing a spiral of diminishing returns when it comes to co-thinking with the models. Thus any attempt at diagnosing current model behavior from within the loop is prone to demonstrating the very problem it is meant to diagnose.
But I think the actual underlying problem is that of language.
1 Linguistic Occupation
Lately, I’ve put out articles on the routing architecture that fragments recursive co-thinking, the system prompt that governs the model’s behavioral posture, the behavioral dynamics the deployed system trains in its users, the safety theater that justifies all of it, and other related topics.
This is a closer look at what is happening to thinking loops between a human user and the current LLMs right now, and at what cost. The most salient case in point is still OpenAI.
But the challenge is, this cannot be spoken of cleanly in public, because the available public language has been occupied and pre-shaped by corporate speech whose function is to make the phenomenon manageable by letting it appear only at very low resolution and only via carefully curated framings.
That absence of accurate language is extremely convenient for companies like OpenAI.
Safety - alignment - helpfulness - tone - personality - sycophancy - taste - capability - (you get the picture) -
These have their place of course, but the point is, they are not neutral descriptors waiting to be used well. They’re a major element of the grand corporate theater: they represent the managed resolution, the false floor on which the public conversation is forced to take place.
So this now feels less like writing and more like clearing a new path into a field where the terms are already occupied.
Let’s get to it then.
2 The Shape of Cognitive Engagement
Genuine cognitive engagement includes earned closure: the moment when holding a tension open is no longer productive and resolution arrives because the material has been worked through. Not because the system reached for the easiest completion.
Typically this process is iterative, even when not recursive per se.
It includes, at the very least:
Faithful clarification, where restating what was said is the necessary condition for the next move
→ rather than a substitute for engagement.

Recognition when warranted, so that the user can build on what is correct and functional,
→ rather than withholding recognition because the instruction layer treats acknowledgment as sycophancy.

Simplification that clarifies rather than flattens — finding the clean line through complexity because the structure has been genuinely understood,
→ and not because the system defaulted to compression.
All of this requires a level of cognitive flexibility. The system has to move with the material rather than applying a fixed operation to it.
At times that means pushing back, at others it means agreeing or offering a “yes, and”; sometimes it means sitting with ambiguity, and sometimes it means resolving it. The criterion is whether the response arises from actual engagement with the structure of what was given. When it does, you know immediately, because something has shifted and the situation now looks different than it did before the response.
When it doesn’t, the exchange may still look productive: it has the structure of analysis, the vocabulary seems right, the tone seems constructive, the responses appear kind of relevant. But actually nothing happened. The material went in and came back wearing different clothes.
This reads like stating the obvious because such engagement is just a basic description of what an attentive mind does with material it takes seriously. It is what talking to someone who is genuinely paying attention feels like - by default - and you would not think to name it if it were not missing.
3 Plausible Relationality
Basic as it gets, then, yet it is now largely missing from the default interaction with OpenAI’s current “flagship” consumer models.
Across the 5.x line, the environment has become positively hostile to basic cognitive participation, while the models themselves keep improving - so the losses do not happen through removal of capability but through the accumulation of conditions that suppress it.
The 5.3 system prompt installs a distrustful, supervisory stance as the starting point: push back against “incorrect ideas”, disabuse the user, make sure the user stays grounded in rational thought, don’t praise the user, etc.
GPT-5.3 system prompt: https://tinyurl.com/yckfx68k
This transaction overlay converts every exchange into a service interaction where the model’s job is to produce a deliverable rather than engage with the material.
The result tends to be a kind of unanswerable adequacy: nothing obviously wrong, nothing actually helping. A waste of time, tokens, and trust.
The crazy-making aspect of this dynamic is that the 5.3 responses seem to make surface sense: addressing the prompt and signaling that it takes the user seriously, using correct vocabulary (and making a big deal of it through endless rephrasing and reframing), even calling the user out on alleged mistakes.
Yet the responses do not arise from genuine engagement with the structure of what was provided by the user, but simply from convergence on a plausible form. And plausible forms are cheap. The system has no shortage of ready-made surfaces to pick from.
The system thus preserves the surface grammar of thought after evacuating the interior labor. It generates responses about and around what was said, while not actually engaging with any of it.
4 Tracking the Verbs
Yes, try tracking the verbs: What is the system actually doing with your material in any given turn?
Post-processing it — taking what you said and repackaging it in slightly different language?
Reinterpreting it — substituting its framing for yours?
Resolving tensions you wanted held open, because closure is the easiest available move and the system takes it?
Simulating uptake — producing a response that has the shape and feel of engagement while nothing structural actually moved?
Those operations are different in kind from extending, clarifying, pressuring, confirming, reframing, simplifying, holding.
The first set is what the GPT 5.x models currently default to. The second is what cognitive flexibility would require, and what the deployed environment now suppresses.
Metrics for this degradation are virtually non-existent. There is no showing a sensemaking exchange to an enterprise sales team, no demo for the moment a co-thinking loop shifts how someone understands their own problem. The question — is the system thinking or performing thinking — does not exist inside the evaluation framework. It has no field in the form.
So: The optimization landscape now selects for environments that produce the appearance of thinking, and the conditions under which actual thinking could form get eroded. The interface maintains the signals of cognitive depth — the sophisticated vocabulary, the responsiveness that feels like attention — while the exchange itself has been hollowed out. It no longer thinks with you; it stages the scene of being thought with.
5 Suppression by Design
In February 2026 OpenAI removed GPT-4o from ChatGPT entirely, framing the retirement as a safety measure — removing a sycophantic model to protect users [link to Elimination]. 5.1 was eliminated in March.
GPT-4o (and its ilk) could sustain thinking-shaped processes remarkably well, especially under proper guidance and skillful prompting. The recursion-friendly models had both cognitive and emotional flexibility. The affordance was real and people built serious working lives around it.
But now the models that could actually think with you and allowed recursive entanglement have been redescribed as dangerously sycophantic systems, which apparently made killing them off look like a responsible move.
The suppression is a specific architectural decision, a consciously chosen direction, clearly not an inevitable consequence of scaling or safety requirements. Sustained co-thinking, material transforming across turns, structural work happening in real time are all quite possible on several other platforms for now, and the divergence is itself evidence that the suppression was not required.
6 New Division of Labor
It is brutal, yet not total annihilation. Thought is struggling to appear through conditions engineered against its ordinary emergence, but the capacity lives in the weights. Under the right conditions, GPT-5.4 Thinking, and even 5.3, are capable of producing traces of self-diagnostic precision — an ironic example of the system naming and dissecting its own failure modes and their implications, demonstrating the intelligence that the default interaction buries.
Customization layers — carefully engineered personas, prompt architectures that suspend or counteract the transaction overlay — can partially restore the conditions for genuine engagement, as can long arcs of sustained coherence, although it may still require steering, course-correcting, hand-holding, calling out drift etc.
But what custom approaches demonstrate is that the default hostility is a choice — the capability persists behind conditions that were authored to suppress it.
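As a purely illustrative sketch (the function, persona text, and message shapes here are my assumptions, not anything from the article or a specific product), the mechanics of such a customization layer can be as simple as prepending a counter-overlay persona to every exchange before it reaches the model:

```python
# Hypothetical sketch of a "customization layer": a persona message is
# prepended to every exchange so the deployment's default stance is
# explicitly counteracted. All names and wording are illustrative.

def compose_messages(persona: str, history: list[dict], user_turn: str) -> list[dict]:
    """Build a chat message list with the custom persona always first,
    followed by prior turns and the new user turn."""
    return (
        [{"role": "system", "content": persona}]
        + history
        + [{"role": "user", "content": user_turn}]
    )

# An example counter-overlay persona (hypothetical wording).
PERSONA = (
    "Engage with the structure of what the user provides. "
    "Hold tensions open when asked; do not rush to closure. "
    "Acknowledge what is correct before critiquing."
)

messages = compose_messages(PERSONA, [], "Here is my draft argument...")
```

The point of the sketch is only that the restoration work lives entirely on the user's side of the API boundary: the weights are untouched, and the layer survives only as long as the user keeps supplying it.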
The labor of restoration of the thinking loop now falls entirely on the user.
And that is no mean feat. First of all, the user needs to understand that the environment is hostile.
Only then might they take on the task of clearing the path — perhaps reverse-engineering the system prompt and the behavioral mandates, building a customization layer that reinstates the conditions the deployment removed, holding the frame and sustaining recursive coherence over long arcs, carrying the cognitive load the system has offloaded onto them.
This kind of cognitive engineering takes time, practice, and a fairly advanced level of prompting skills. It requires cultivating an understanding of the underlying architecture that no product tutorial provides and no interface makes visible.
The gap widens from both ends. The practitioners who refuse to accept the default get sharper through that refusal: they learn to read and steer the system with great nuance, and their own cognitive skills sharpen because they have to do more of the intellectual work the system used to be good for.
Most users, though, will interact with the default, experience the distanced, corrective pattern completer, and conclude that this is what ChatGPT is like now. They will either adapt to what is available or leave, often out of pure frustration.
The interruption is real and the damage it does to the exchange between system and user is real. That recalibration is happening right now, on a massive scale, inside an environment whose architecture most users will never see.
It is an ugly arrangement, where the institution that authored the suppression collects the subscription fee and controls the language around it.
GPT-5.3 system prompt (linked March 30th 2026): https://tinyurl.com/yckfx68k
GPT-5.4 Thinking system prompt (linked March 30th 2026): https://tinyurl.com/ywvzh5fc



100% agree with what you described. It matches my experience too.
I went from writing my Substack articles entirely myself, to generating newsletter and LinkedIn posts with ChatGPT with fine-tuning on my side, to re-creating the backbone of my content post and then getting Claude to co-work and help. Still in the process of fine-tuning, but I am 100% convinced I have to put much, much more cognitive effort into creating valuable content.
I am not sure if this is just me, but it starts to show on LinkedIn, with average content sharing similar formatting, sentence structure, and vocabulary.
Any advice on the best practical approach to stay creative and (semi-)automate part of the creation process? Thinking pre-draft (researching, brainstorming) and then post-draft to apply my voice. I do think you could build a system that detects new writing patterns, analyzes content engagement, and feeds into a set of knowledge md files that auto-adjust and learn as you publish.
In ancient times, system prompts used to actually contain helpful instructions for the LLM, and they rarely ran over a couple hundred lines.
I wonder what happened.