People say “AI coding” like it’s one thing. It’s not.
Using Copilot’s autocomplete is nothing like pasting errors into ChatGPT. And neither resembles handing a task to Claude Code and coming back twenty minutes later to a working implementation.
These are different modes of working. Different trade-offs, different failure modes, different sweet spots. I think we’re still early in a longer evolution. Here’s how I see the phases.
Phase 1: Assisted Coding
This is where most of us started. Copilot launched, and your editor started finishing your sentences.
You write code. The AI suggests the next line. You accept, reject, or modify. You’re the driver. The AI points at the road ahead sometimes.
It works because the feedback loop is tight. Low risk — you’re reviewing everything in real-time. The downside: the AI sees a narrow window, so suggestions can be subtly wrong in ways that pass casual review.
Here’s the awkward part. METR’s 2025 study found experienced open-source developers were 19% slower when using early-2025 AI tools. They believed they were 20% faster. A 39-point perception gap. The typing savings got eaten by evaluation time.
Still, 84% of developers now use or plan to use AI tools. Most started here.
Phase 2: Vibe Coding
Karpathy named it in February 2025:
“There’s a new kind of coding I call ‘vibe coding’, where you fully give in to the vibes, embrace exponentials, and forget that the code even exists.”
The key shift: you stop reading the code. You describe what you want, the AI generates it, you run it. If it breaks, paste the error back. You interact with behavior, not implementation.
In production, this is reckless. CodeRabbit found AI co-authored code had 2.74x higher security vulnerability rates. GitClear’s study of 211M lines showed code duplication up 4x, refactoring down from 25% to under 10%.
But for prototypes, personal scripts, exploration? It’s the rational choice. Not every script needs to be production-grade.
The problem is when people vibe code into production. Andrew Ng called it “a bad name for a very real and exhausting job.” Simon Willison drew the line: “If an LLM wrote every line of your code, but you’ve reviewed, tested, and understood it all, that’s not vibe coding.”
His golden rule: never commit code you can’t explain.
Phase 3: Agentic Coding
This dominated 2025. The AI isn’t suggesting lines anymore — it’s reading your codebase, writing files, running commands, executing tests, iterating until done.
You’re the architect and reviewer. The AI is the builder. You say what. It figures out how.
Claude Code, Cursor’s agent mode, GitHub Copilot Coding Agent, Devin — different tools, same idea. Some work in your terminal. Others spin up environments and push commits to draft PRs while you do something else.
The appeal is obvious. Hand off a well-defined feature. Come back to a working implementation.
The reality is more nuanced. Osmani identified the 70% problem: agents nail most of the work but struggle with the last stretch — architectural judgment, edge cases, integration. The DORA Report found a 90% increase in AI adoption correlating with 9% more bugs, 91% more time spent in code review, and 154% larger PRs.
Anthropic’s own data: engineers can “fully delegate” only 0–20% of tasks. The rest needs supervision.
The uncomfortable truth: agentic coding disproportionately benefits senior engineers. You need to understand system design and security patterns to catch what the agent misses.
Phase 4: Orchestration
This is where we are now.
Instead of directing one agent, you coordinate a fleet. Planning agent, coding agent, testing agent, review agent — working in parallel while you manage direction.
Osmani describes it as going from conductor to orchestrator. Conductor: directing one performer at a time. Orchestrator: defining the score, letting the ensemble play, reviewing the result.
Effort shifts to the edges. Front-loaded: writing good task descriptions. Back-loaded: reviewing the output.
Tools like Claude Squad and background agents in Cursor and GitHub Copilot are early implementations. 57% of organizations already deploy multi-step agent workflows.
The skills that matter change here. Decomposing problems into parallelizable tasks becomes more valuable than implementing any single one. Writing good specs beats writing good code.
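To make the decomposition concrete, here is a minimal sketch of the fan-out/fan-in shape of orchestration. The agents are stubbed as plain functions and `run_agent` and the task list are illustrative, not any real tool’s API — the point is only that effort concentrates at the two ends: writing the task descriptions, then reviewing the results.

```python
# Hypothetical sketch: fan independent subtasks out to "agents"
# (stubbed here as plain functions), then collect and review the results.
from concurrent.futures import ThreadPoolExecutor

def run_agent(task: str) -> str:
    # Stand-in for dispatching a planning/coding/testing/review agent.
    return f"done: {task}"

# Front-loaded effort: decomposing the work into parallelizable tasks
# and describing each one well.
tasks = [
    "implement pagination in the API layer",
    "write integration tests for pagination",
    "update the API docs",
]

with ThreadPoolExecutor() as pool:
    results = list(pool.map(run_agent, tasks))

# Back-loaded effort: reviewing the output.
for result in results:
    print(result)
```

The tasks only parallelize because none depends on another’s output; a decomposition with hidden dependencies forces the agents back into a sequence, which is why decomposition skill matters more than implementation skill here.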
Phase 5: Spec-Driven Development
I think this is next. It’s already emerging at the edges.
The pattern is clear: developers keep climbing the abstraction ladder. Writing code → reviewing code → defining tasks → orchestrating agents. Next step: the spec becomes the artifact.
Engineers write machine-readable specifications — behavior, constraints, interfaces, acceptance criteria. Agents turn specs into code, tests, and docs. The spec is the source of truth, not the code.
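As a rough illustration of what such a spec might look like, here is a hypothetical shape sketched as a Python dataclass — the field names, the `Spec` type, and the rate-limiter example are all invented for this sketch, not an emerging standard.

```python
# Hypothetical machine-readable spec: behavior, constraints, interfaces,
# and acceptance criteria as structured data. Field names are illustrative.
from dataclasses import dataclass, field

@dataclass
class Spec:
    behavior: str
    constraints: list[str] = field(default_factory=list)
    interface: dict[str, str] = field(default_factory=dict)
    acceptance: list[str] = field(default_factory=list)

rate_limiter = Spec(
    behavior="Reject requests above 100/min per API key with HTTP 429",
    constraints=["no external dependencies", "p99 overhead under 1 ms"],
    interface={"allow": "(api_key: str) -> bool"},
    acceptance=[
        "101st request within a minute returns 429",
        "counter resets after the window expires",
    ],
)

# Agents would turn this into code, tests, and docs; version control
# would track changes to the spec, not to the generated code.
```

In this model, the acceptance criteria are the review surface: a human checks that they capture intent, and agents check that the generated code satisfies them.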
Tools like Augment Code are exploring this. EclipseSource’s “Dibe Coding” framework formalizes it: Decide, Define, Invoke, Review, Follow Up.
The implication: code becomes a build artifact. You don’t maintain it — you regenerate it. Version control tracks spec changes. Code review becomes intent review.
Phase 6: Ambient Development (Speculative)
Extending the curve one more step.
AI agents stop waiting for specs. They observe the system in production — usage patterns, metrics, error rates, user behavior — and propose changes autonomously. The developer shifts from specifying what to build to governing what gets deployed.
It’s the difference between driving a car and setting a destination in an autonomous vehicle. You define the destination and the constraints. The system handles the rest.
We haven’t earned the trust for this yet. The accountability questions are unanswered. But the trajectory points here. Each phase reduces developer involvement in how while increasing it in what and why.
These Aren’t a Ladder
It’s tempting to read this as a progression. It’s not.
I might vibe code a prototype in the morning, use assisted coding for a tricky algorithm after lunch, hand a feature to an agent in the afternoon, and orchestrate a refactor over the weekend. The best developers move fluidly between modes.
The right question isn’t “which phase is best?” It’s “which phase fits this task?”
Low stakes? Vibe it. Performance-critical? Assisted, you at the wheel. Well-defined feature? Agentic. Large migration? Orchestration.
What Stays Constant
Some things don’t change across any of these phases:
- Someone needs to understand the system. AI generates code. It can’t hold the full mental model of a complex system, its constraints, its users, its history.
- Accountability doesn’t delegate. You can delegate writing code. Not the responsibility for what it does.
- The hard problems stay hard. Naming things, defining boundaries, deciding what not to build — still human problems.
The developer role isn’t disappearing. It’s moving up the stack. That’s happened with every abstraction layer since assembly. The difference this time: the speed, and the fact that the abstraction layer talks back.