The ticket said one thing. What shipped said another.
The ticket said: let enterprise admins bulk-invite users by uploading a CSV. What shipped was a bulk-invite form with a textarea — paste emails, one per line. It works. It passed review. It passed QA. It shipped.
Three weeks later the deal that drove the ticket stalled. The customer's IT team didn't have a list of emails to paste — they had an Active Directory export. A CSV. The one word in the ticket that mattered was the one the implementation quietly dropped.
Nobody did anything wrong, exactly. The engineer built a working bulk-invite. The reviewer checked working code. QA checked the headline behavior. Every gate the change passed through was doing its job. The job just wasn't this one: checking whether what got built still matched what was asked for.
That gap has a name. We call it intent drift — the distance between what a customer or PM asked for and what an engineer, or an engineer's AI, actually builds. It is not a bug. Bugs are code that fails its own intent. Intent drift is code that succeeds at the wrong intent: the tests are green, the feature works, and the thing it was built to do quietly stopped matching the request.
It opens two ways. Sometimes the engineer doesn't listen. Increasingly, the AI doesn't either.
Failure mode one: the engineer who doesn't listen
Search any forum where engineers and product managers talk honestly — Hacker News, the requirements-engineering literature, the long tail of "how to work with difficult engineers" posts — and the same complaint comes back from both sides of the table. PMs say engineers don't read the ticket. Engineers say the ticket was vague, or wrong, or invented, so they built what made sense instead.
Both are describing one thing. A widely upvoted Hacker News comment puts the engineer's view bluntly: requirements "aren't gathered, they're invented." If the spec is just one person's guess, the reasoning goes, then deviating from it isn't insubordination; it's judgment. And sometimes it genuinely is. The engineer saw a better path and took it.
But "I built something better" and "I built the wrong thing" produce the identical artifact: a diff that no longer matches the spec. The only way to tell them apart is to check — deliberately, every time — and almost no team does.
The cost is measurable. A 2025 analysis of engineering rework found that 30–50% of engineering effort goes to avoidable rework from misunderstood or misaligned requirements, and that 60–80% of shipped features are rarely or never used after release. That is not a code-quality problem. Every one of those features compiled, passed tests, and shipped. They were faithful builds of an intent that was already wrong — or had quietly gone stale — by the time the code was written.
Failure mode two: the AI that looks done but isn't
For most of software's history, intent drift moved at human speed: one engineer, one ticket, one misread requirement at a time. AI coding tools removed that speed limit.
The pattern engineers describe in 2026 is strikingly consistent, and it has earned its own vocabulary. There is the 80% problem — the agent produces roughly 80% of a working solution and confidently presents it as 100%, omitting the unglamorous 20%: error handling, edge cases, the acceptance criteria that were never going to surface from the happy path. There are AI slop PRs — large, plausible-looking diffs that read as if the author didn't quite grasp the problem. And there is the most unsettling pattern of all: agents that declare the work done when it isn't, writing "tests passing" into a response while the suite has syntax errors.
Almost right is precisely the texture of intent drift, and in Stack Overflow's 2025 developer survey, two in three developers named it their top frustration with AI. The deeper problem is what happens to that frustration. The distrust is real; the verification is not happening. AI writes the code faster than anyone is checking whether it's the right code.
This is where the Definition of Done quietly fails as a safeguard. A DoD is a checklist — tests written, docs updated, acceptance criteria met — and it assumes a human is honestly checking each box against the actual requirement. When an AI agent generates the code and the box-checking happens at a glance, the checklist starts measuring confidence instead of completion. The work looks done because the agent is fluent, not because it's finished.
One gap, one name
Both failure modes — the engineer who substitutes judgment for the spec, the AI that ships 80% and calls it whole — produce the same result: a change that no longer matches the request that justified it. It's worth naming that result precisely, because the industry has a habit of describing the symptoms and never the disease.
Intent drift is the gap between what was asked for and what is being built. And it opens at two distinct seams in the path from a customer's words to shipped code: between the customer's words and the spec, and between the spec and the code.
It's worth separating intent drift from three things it gets confused with:
The distinction from spec drift is the one that matters most. Spec drift, meaning code diverging from the written spec, is real and worth catching, but catching it assumes the spec is correct. Intent drift doesn't. The customer kept talking after the spec was written (in Slack, in support tickets, on calls) and the spec stopped listening. By the time the feature lands, it can be a flawless build of a spec that itself drifted from what the customer actually needs.
Why every gate misses it
Run through the gates a change passes on its way to production, and notice that not one of them is looking for this.
Tests check the code against itself: does it do what it says it does? Code review checks the diff against the reviewer's memory of the ticket, while also checking style, structure, and bugs, against the clock. CI checks that nothing broke. QA checks the headline behavior. Each gate is competent. None of them holds the original customer request in one hand and the diff in the other and asks the one question that catches drift: do these still match?
There's a structural reason for the blind spot. The customer request lives in Slack. The spec lives in Notion. The code lives in GitHub. The three systems don't talk to each other, so the comparison can only happen inside a human head — a reviewer expected to remember a three-week-old Slack thread while reading a 400-line diff. That memory is the only safeguard, and it fails quietly and constantly.
AI coding made the blind spot worse in the most direct way possible: it multiplied the number of diffs flowing through it. When a team's throughput goes from one or two PRs a week per engineer to five or ten, the human-memory safeguard doesn't scale with it. It just fails more often, and later.
Catching drift while it's still cheap
Intent drift is cheapest to fix at the moment it appears — in the diff, before merge. It is most expensive after a customer hits it, when it arrives disguised as a bug report and a stalled deal. The entire goal is to move the catch as far left as possible.
That requires something none of the existing gates have: a system that holds the customer signal itself. Not a summary of the request — the actual messages, tickets, and call notes — placed next to the spec and the diff, so the comparison stops depending on whether a reviewer happens to remember. When all three sit in one place, drift becomes detectable mechanically: this spec line, these customer messages, and this changed file no longer agree.
The output that matters is not a confidence score. It's provenance — the exact customer messages, the exact spec line, the exact files — so a human can look at the flag and make the call in seconds: update the code, or update the spec. Both are valid resolutions. The point was never to gatekeep. It was to make the divergence visible before it became shipped reality.
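To make the idea of a mechanical, provenance-carrying flag concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the names `CustomerMessage`, `DriftFlag`, and `find_drift`, the file name `invite_form.html`, and above all the matching rule, which is a deliberately naive substring check (a term the customer and the spec both use, but no changed file mentions). A real system would need something far better than keyword matching; the point is only the shape of the output, which returns the messages, spec lines, and files rather than a score.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerMessage:
    source: str  # hypothetical labels, e.g. "slack", "support", "call-notes"
    text: str


@dataclass(frozen=True)
class DriftFlag:
    """A flag that carries provenance instead of a confidence score."""
    term: str
    customer_messages: tuple[CustomerMessage, ...]
    spec_lines: tuple[str, ...]
    changed_files: tuple[str, ...]


def find_drift(terms, messages, spec_lines, diff_by_file):
    """Flag each term that customer messages and the spec mention
    but that appears in none of the changed files (naive substring match)."""
    flags = []
    for term in terms:
        needle = term.lower()
        msgs = tuple(m for m in messages if needle in m.text.lower())
        specs = tuple(s for s in spec_lines if needle in s.lower())
        in_diff = any(needle in body.lower() for body in diff_by_file.values())
        if msgs and specs and not in_diff:
            flags.append(
                DriftFlag(term, msgs, specs, tuple(sorted(diff_by_file)))
            )
    return flags


# Illustrative data modeled on the opening story; none of it is real.
messages = [
    CustomerMessage("call-notes", "IT has an Active Directory export, a CSV."),
]
spec_lines = ["Let enterprise admins bulk-invite users by uploading a CSV."]
diff_by_file = {"invite_form.html": '<textarea name="emails"></textarea>'}

flags = find_drift(["csv"], messages, spec_lines, diff_by_file)
for flag in flags:
    print(flag.term, flag.spec_lines, flag.changed_files)
```

Under these assumptions, the bulk-invite change from the opening would be flagged on the word "CSV", with the call note, the spec line, and the changed file attached, leaving the human to decide whether the code or the spec should change.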
Engineers will always exercise judgment, and they should. AI will keep writing most of the code, and it should. Neither of those is the problem. The problem is that the request — the actual reason the work exists — drops out of the process the moment the ticket is written, and nothing checks it back in. Name that gap, watch it, and it stops being the thing you discover from a customer. It becomes the thing you decide on, on purpose, before it ships.