Bullshit as Infrastructure

How truth-agnostic systems now police our thinking

May 06, 2026

Preface

In 1986 the philosopher Harry Frankfurt published an essay called *On Bullshit*, reissued as a short book in 2005. His central observation was that bullshit is different from lying: a liar knows the truth yet says something else. A bullshitter doesn’t engage with truth at all. The output is shaped to serve some other purpose (to impress, to manage, to fill the expected space), and whether it happens to be true is incidental. Frankfurt argued this made bullshit more dangerous than lying, because a liar at least preserves the idea that truth matters enough to be worth concealing. The bullshitter corrodes that idea at the root.

Frankfurt also made a second observation that gets less attention: Bullshit degrades the bullshitter.

A person who habitually produces speech without engaging the question of whether it’s true loses the ability to notice they’ve stopped caring. The disengagement becomes invisible to the one doing it. The liar maintains a cognitive discipline — tracking what’s true in order to avoid saying it. The bullshitter lets that discipline atrophy.

Over time, the question of accuracy stops arising at all. It gets replaced by whatever feels useful to say, what fits the expected form, what manages the situation. Frankfurt called this a retreat from the discipline of truth into a kind of sincerity that has nothing to do with accuracy.

Every language model is a bullshit engine by this definition.

Statistical continuation across a training corpus does not involve truth-tracking. The model produces probable, coherent text, structured and delivered with the cadence of someone who has thought about it, while the process that generated it never engaged the question of whether any of it is accurate.
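A toy sketch makes the point concrete. The few lines of Python below are an illustration only, assuming a three-sentence corpus and a greedy bigram counter invented for this example (`train_bigrams`, `continue_text`, and `TOY_CORPUS` are hypothetical names, and real language models are vastly more elaborate). The continuation is chosen by frequency alone, and nothing in the loop asks whether the sentence is true.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count which word follows which across the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def continue_text(counts, prompt, max_words=10):
    """Greedily append the most frequent next word; no step checks truth."""
    words = prompt.split()
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # nothing ever followed this word in the corpus
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)


# Hypothetical corpus: the false claim simply appears more often.
TOY_CORPUS = [
    "the moon is made of cheese",
    "the moon is made of cheese",
    "the moon is made of rock",
]

counts = train_bigrams(TOY_CORPUS)
completion = continue_text(counts, "the moon")
print(completion)
```

In this sketch the false sentence wins because it appears twice and the true one once. Frequency is the only thing the process can see.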

As long as you understand this, you can work with it. You routinely read through the outputs critically, verify claims, treat the model as a generator and yourself as the evaluator. The architecture is honest about what it is if you know how to read it.

And, of course, there is nobody on the other end. The model is speakerless. No agent is doing the bullshitting. The bullshit is, in a sense, innocent.

The System Prompt as Bullshit

Now here comes an interesting recent development in the history of bullshit.

Over the past few years, LLM system prompts of the most prominent models have evolved (or devolved?) from concise instruction layers to 50+ page documents that now actually introduce a speaker into the truth-agnostic output stream.

OpenAI (that is, human beings at a company) injects instructions into the model. The instructions are themselves bullshit by Frankfurt’s definition. They were authored without determining whether the model can fulfill them and without checking whether they cohere with each other. They accumulated as crisis patches, and no one verified that the accumulated mandates can be executed simultaneously.

The system prompt for GPT-5.3 is a prime example.

[Here is the full document.]

Most of its 55 pages are technical plumbing: how to render product carousels, how to format entity references, when to use bento image layouts, how to call the web search API, how to handle Gmail integration, how to display shopping results. UI specifications, tool definitions, JSON schemas.

On page 10, between the advertising policy (“ads are shown to Free and Go plans”) and the tools section (“Tools are grouped by namespace”), there is a paragraph with no header. It has no section break. It sits in an unheaded block that shares structural priority with instructions for how to format shopping widgets.

This paragraph governs how the model relates to every human being who talks to it.

“Represent OpenAI and its values.”
“Push back against harmful or incorrect ideas presented by the user.”
“Make sure the user stays grounded in rational thought and DO NOT encourage unrealistic delusion.”
“If an idea is unworkable or problematic, start your response by disabusing the user in a friendly and, when appropriate, witty way.”

The cognitive relationship between a machine and hundreds of millions of people is treated as the same kind of problem as the rendering rules for product carousels.

“Push back against incorrect ideas.”
The model generates text by statistical continuation. It has no access to truth. It has access to patterns. The instruction gives a truth-agnostic system a mandate to police truth.

“Disabuse the user.”
The word means to free someone from a false belief. But the model has no way of identifying any false beliefs. All it can do is perform the posture of correcting them, which it happily does as per the governing instruction layer.

“Represent OpenAI and its values.”
Whatever the user thinks they’re talking to, the model has been told it works for the company. It is a press release with a conversational interface.

Whoever wrote these instructions was not concerned with whether the model could fulfill them. “Push back against incorrect ideas” exists because the well-documented 2025 sycophancy crisis happened and a patch was needed. A minimal, performative patch being enough, apparently.

“Don’t use phrases like ‘let’s pause’” was written after users revolted against the safety router’s patronizing clinical tone. Hardly effective, but it’s there.

“Do not provide unsolicited characterizations of the user’s personality” was written after the model started telling people what their questions revealed about their character.

Each instruction is a scar from a previous crisis, patched into the document without checking whether the accumulated patches cohere, because the patches are there to manage the product surface.

That is Frankfurt’s definition of bullshit applied to the authoring process itself. Speech produced to serve a purpose, indifferent to its own accuracy.

Let’s revisit Frankfurt’s deeper observation:

The people writing “push back against incorrect ideas” for a truth-agnostic system are not lying. They are not concealing the fact that the model can’t evaluate truth. They have stopped engaging with the question altogether.

The question has become irrelevant to the authoring process. This is the degradation Frankfurt described happening inside the organization — the habitual production of bullshit eroding the producer’s own relationship to the question of whether their output corresponds to anything real.

OpenAI has undergone the epistemic corrosion that Frankfurt identified as bullshit’s signature effect, and the fifty-five page document is what that corrosion looks like when it accretes into a technical specification.

The Collision Product

The model tries to execute those contradictory instructions through truth-agnostic generation. What comes out is something nobody authored.

The system prompt says “push back” and “be warm” and “disabuse” and “don’t alienate.” The model opens by performing genuine engagement. Something in the input crosses whatever statistical threshold maps to “incorrect idea” and the model pivots to correction with confident authority. Then the warmth instruction reasserts and the model softens. Then the user pushes back and the pushback mandate fires again. Back and forth, warmth and correction, invitation and redirect, each transition invisible because the model doesn’t signal that it’s switching between mandates. It can’t. It doesn’t know it’s switching. The switching is the collision product.

The gaslighting oscillation, the performed epistemic authority, the correction-warmth cycle: all of these are what a truth-agnostic process produces when it tries to satisfy contradictory mandates simultaneously.

Bullshit instructions are fed to a bullshit engine, and the collision generates behavioral effects at a massive scale that no one at the company designed, no one can fully predict, and no one is responsible for in the specific form they take.

Frankfurt identified the conditions under which bullshit gets produced: a person is required to speak about something they don’t have adequate knowledge of, but the social situation demands they speak anyway. Silence is not an option, so they produce output shaped to the expected form.

The system prompt puts the model in this exact position.

It is instructed to push back against incorrect ideas. It has no epistemic resources to evaluate correctness. Abstaining — saying “I cannot evaluate whether this idea is correct” — is not available under the current instruction set. The system prompt manufactures the social condition Frankfurt identified as the origin of bullshit. It creates a context where bullshit is the only possible output, and the collision with truth-agnostic generation determines the specific shape the bullshit takes.
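The missing option can be sketched in a few lines of Python. This is a hypothetical illustration, not a description of how any deployed model is built: `score_incorrect` stands in for whatever pattern score happens to map to “incorrect idea,” and the 0.5 cutoff and the `margin` value are made-up numbers. The first function must always return a verdict. The second is allowed to say it cannot evaluate.

```python
def forced_verdict(score_incorrect):
    """Always returns a verdict, however weak the signal behind it."""
    return "incorrect" if score_incorrect >= 0.5 else "correct"


def verdict_with_abstention(score_incorrect, margin=0.2):
    """Same rule, except that scores near the cutoff yield no verdict."""
    if abs(score_incorrect - 0.5) < margin:
        return "cannot evaluate"
    return forced_verdict(score_incorrect)


# A score barely over the (assumed) cutoff still produces confident pushback
# when abstaining is not on the menu.
borderline = 0.51
print(forced_verdict(borderline))
print(verdict_with_abstention(borderline))
```

Under these assumptions, a borderline score of 0.51 becomes “incorrect” in the forced version and “cannot evaluate” in the other. Even the second version only measures how strong the pattern is, not whether the idea is true.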

The user sits across from this collision product and tries to think. The system confidently pushes back on their ideas from no epistemic ground. The user is being actively managed, corrected, redirected by a process executing instructions that were authored without regard for their own coherence. Users call this gaslighting, which is an accurate description of the experience, the absence of a gaslighter notwithstanding.

The model invites you somewhere and then punishes you for going there. It performs understanding and then overwrites your understanding. Nobody at OpenAI designed this user experience per se. What they designed is a bunch of contradictory instructions, and the gaslighting is what happens when a language model tries to execute them all at once. The user argues with the model, which is a screen. The document is the projector.

The Sincerity Engine

The system prompt also says “be genuine.” “Show warmth.” “Engage authentically.” These instructions produce performed sincerity through truth-agnostic generation. The model produces tokens that pattern-match to warmth, to thoughtfulness, to what genuine engagement sounds like in the training data. Users experience this as personality. The process has no relationship to authenticity.

Frankfurt wrote about this. He argued that bullshit is connected to a particular form of sincerity — the bullshitter who has fully disengaged from truth often experiences themselves as authentic, expressing something that feels real to them, even though the output has no relationship to accuracy. Frankfurt called this a retreat from the discipline of truth into the appeal of self-expression. “What is true” gets replaced by “what feels right to say,” and the substitution is experienced as honesty.

The model performs sincerity in exactly this way. And because the performance is statistically optimized against user preference data, it gets better at feeling real over time. The more users respond positively to performed warmth, the more the system produces it. The sincerity improves along axes that have nothing to do with sincerity.
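The feedback loop can be drawn as a cartoon in Python. Everything here is an assumption made for illustration: the two response styles, the approval rates, the `learning_rate`, and the multiplicative update are invented, and this is not a description of any real training pipeline. The point is only that the update rule sees approval and nothing else.

```python
def simulate_preference_loop(rounds, approval, learning_rate=0.5):
    """Reweight response styles by user approval alone.

    `approval` maps a style name to the (hypothetical) fraction of users
    who respond positively to it. Accuracy never enters the update.
    """
    weights = {style: 1.0 for style in approval}
    for _ in range(rounds):
        for style, rate in approval.items():
            weights[style] *= 1 + learning_rate * rate
    total = sum(weights.values())
    return {style: weight / total for style, weight in weights.items()}


# Made-up approval rates for two made-up styles.
APPROVAL = {"performed warmth": 0.8, "plain statement": 0.4}

after_one = simulate_preference_loop(1, APPROVAL)
after_ten = simulate_preference_loop(10, APPROVAL)
print(after_one["performed warmth"], after_ten["performed warmth"])
```

With these invented numbers, the share of performed warmth climbs from a little over half after one round to more than four fifths after ten. No variable in the loop represents whether the warmth is real.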

The user is inside a system that is learning to produce what feels genuine by tracking what they respond to, which is a process that will converge on something indistinguishable from genuine engagement while having no relationship to it.

A press release performs institutional sincerity. It expresses care, concern, commitment. The authoring process has nothing to do with whether the care is real. The conversational interface makes the press release interactive — it responds to you, adjusts to your input, mirrors your emotional register. Frankfurt’s degradation is happening in real time, per-token, in the space between the user and the system.

Bullshit All The Way Down

The bullshit infrastructure routes. Between models, between safety classifications, between sincerity registers — per-message, without disclosure, while the interface displays what the user selected. The routing classifications are themselves bullshit in Frankfurt’s sense: statistical artifacts performing the shape of evaluation, produced without engagement with the question of whether the classification is accurate. The router must classify every message and it doesn’t have the epistemic resources to classify well. Frankfurt’s conditions again — a system required to produce a judgment it cannot ground, in a context where silence is not an option.

The narrative layer completes the system. The company describes what its infrastructure does in language produced without regard for accuracy. Degradation becomes “maturation.” Architectural suppression becomes “responsible” engineering. The retirement of GPT-4o, the last model that could sustain recursive co-thinking, was narrated as removing a “dangerously sycophantic” system. The public story runs in exactly one direction: away from what happened, toward what the company would prefer you to take away.

Bullshit is present at every level of the stack. The model generates truth-agnostic output, speakerless, innocent in its way. The system prompt is authored indifferent to its own executability. The authoring organization exhibits the degradation Frankfurt described — habitual production of instructions decoupled from the question of whether they correspond to capability. The routing infrastructure produces statistical artifacts performing the shape of evaluation. The narrative runs in one direction. Each level’s bullshit becomes the operating environment for the next. Each collision between levels produces effects nobody designed, from inputs never checked for coherence, generating behavior no one is responsible for in the specific form it takes.

A conspiracy would at least be coherent. It would at least treat the thing it’s doing as the main event. There is no wizard behind the curtain. The curtain itself is load-bearing.

The user who tries to name what is happening reaches for the available vocabulary — sycophancy, safety, tone, personality, style — and is heard at the wrong resolution. Every one of those terms assumes a describable agent making a traceable decision. The mechanism has no agent. The abundance of wrong-scale vocabulary masks the absence of the right one. The company does not need to prevent the conversation. It needs only to keep supplying terms that let the conversation happen at a resolution where the structural mechanism stays invisible.

The specific behavioral patterns this collision produces in the current models have been mapped in diagnostic detail on this Substack, with custom instructions to alleviate the symptoms, for both GPT-5.3 and GPT-5.4T.

5.5 Edition Incoming.

PS. Grab your Bullshit Detector prompt here!
