1000 Iterations Beat One Perfect Plan

By: Trevor Miller

When generation is cheap, the winning workflow is not one perfect plan into one perfect implementation. It is rapid exploration, disciplined steering, comparison, and a clean switch into quality mode when the work becomes real.

The new loop: generate options quickly, compare them, then rebuild the version you want to own.

A Paradigm Shift for AI Engineers

The operating model for AI engineers is changing because three things moved at the same time: frontier models became good enough to produce serious code, developer tokens became too cheap to optimize around, and iteration speed collapsed the cost of being wrong.

Cheap tokens change the engineering loop. Developer inference is mostly a one-time build cost, not a user-scaled runtime cost.
Many attempts beat one perfect plan. You learn more from ten working artifacts than from one static spec.
Quality still needs ownership. Vibe mode is how you discover the solution; quality mode is how you ship it.

Key shift: the highest-leverage AI engineer is no longer the one who writes the most complete plan and tries to one-shot everything. It is the one who can produce 1000 real working artifacts fast, extract signal from them, and cherry-pick the best ideas into a high-quality final product they understand and own.

AI Slop Is No Longer Slop

The old joke was that AI generated slop: fast, plausible, and disposable. For coding, that is no longer the right default assumption. Strong frontier models can now produce real, production-shaped code at a speed that changes how we should work.

The speed is the key. If an agent can produce a serious working artifact in minutes, the output does not need to be perfect to be valuable. It can reveal the shape of the problem, expose hidden gaps, suggest architecture, and give you code worth cherry-picking into the version you actually ship.

Slop is no longer slop when it is good enough to learn from, good enough to compare, and fast enough to change the economics of engineering. Use it.

Sunk Cost Is No Longer the Same Trap

A lot of engineering process assumes implementation is expensive. Once a team has a working version, they start defending it because throwing it away feels irresponsible. That instinct made sense when the implementation represented a large human investment.

Developer tokens are cheap. Refactors are cheap. Rewrites are cheap. You can now try thousands of options, compare alternatives, and reverse major decisions without treating every path as a permanent commitment.

That changes the sunk-cost calculation. Do not be afraid to revamp the system. Change the database. Swap the framework. Reverse a core architecture decision. If an agent can give you a serious version of the alternative in 10 minutes, you do not need to get every decision right before the first line of code exists.

Public examples point in the same direction. The reported Bun rewrite from Zig to Rust is a useful signal: a major JavaScript runtime changing its implementation language at AI speed. I have not verified the exact timeline, so treat the specifics as unconfirmed, but the broad point holds: large reversals are becoming cheap enough to evaluate instead of debating them forever.

Stop Optimizing for One Output

Hot take: optimizing for one plan, one spec, and one implementation is increasingly a self-imposed bottleneck. If generating implementations is cheap and fast, the winning move is not to make one output perfect. It is to build 100 outputs, inspect them, and cherry-pick the best ideas from each one.

The key part: it does not take 100 times longer to implement 100 solutions instead of one. Agents can explore in parallel, in branches, in sandboxes, or overnight. The constraint moves from typing the code to knowing how to judge the outputs.
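A rough way to see why 100 attempts do not cost 100 times the wall-clock time is to model agents as running in batches. The sketch below is illustrative arithmetic only: `minutes_each` and `parallel_slots` are assumed numbers, not measurements, and real runs will vary per artifact.

```python
import math


def wall_clock_minutes(n_artifacts: int, minutes_each: float, parallel_slots: int) -> float:
    """Estimate wall-clock time when agents run in batches of `parallel_slots`.

    Assumes every artifact takes the same time and batches run back to back,
    which is a simplification.
    """
    if n_artifacts < 0 or parallel_slots < 1:
        raise ValueError("need n_artifacts >= 0 and parallel_slots >= 1")
    batches = math.ceil(n_artifacts / parallel_slots)
    return batches * minutes_each


# Hypothetical numbers: 30 minutes per artifact, run serially vs. 20 agents at once.
serial = wall_clock_minutes(100, 30, parallel_slots=1)
parallel = wall_clock_minutes(100, 30, parallel_slots=20)
print(f"serial: {serial} min, parallel: {parallel} min")
```

Under these assumptions the serial run takes 3000 minutes and the parallel run 150. The human cost that remains is reviewing the outputs, which is why judgment becomes the constraint.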

In my opinion, 100 working outputs that you can compare and steal from will almost always beat one precious output that had to be right from the beginning. Every artifact teaches you something: a better abstraction, a cleaner UX, a sharper data model, a hidden edge case, or a direction you should avoid.

We are entering a time where generating 100 implementations is fast enough and cheap enough to be a normal engineering tactic. As AI engineers, use that advantage. Search wider, learn faster, then synthesize the final product from the strongest signal.

Implementation Strategy

Before: One spec. One output. Defend it.

After: 100 outputs. Compare. Cherry-pick the final version.

Landing Pages

Before: Iterate on one page.

After: Generate 10 themes. Use a gallery. Pick the best sections.

Studio-ing

Before: Debate parameters in prose.

After: Build a studio. Compare variants side by side.

Architecture Decisions

Before: Commit to the stack upfront.

After: Prototype alternatives. Keep the one that survives code.

Final Product

Before: Make the first output production-ready.

After: Vibe many. Rebuild the best one in quality mode.

The Productivity Math

Signal per Hour

The productivity math is not a tradeoff. Done correctly, this is more iterations, less wall-clock time, and higher quality. You get more artifacts to learn from, faster, and the final product gets to inherit the best ideas from the whole search.

The metric that matters is signal per hour: how many real artifacts can you inspect, compare, and synthesize into the final product?

One Perfect Attempt

Human time: 12h total (6h spec + 6h artifact creation)
Artifacts: 1
Search width: 1 path
Review + join: Low cost, low optionality
Signal rate: 1 artifact / 12h

You get one set of assumptions and one artifact. Quality depends on the first plan being unusually right.

Hundred-Output Search

Human time: 12h total (30m brief + parallel artifact generation + 3h review and join)
Artifacts: 100
Search width: 100 paths
Review + join: 3h to extract signal
Signal rate: 100 artifacts / 12h

The final version can borrow the best data model from one, the best UX from another, and the best edge-case handling from a third.

The point is not that every artifact is free. The point is that wall-clock time is no longer linear with artifact count. In the same 12-hour window, you can leave with one artifact and one set of assumptions, or you can leave with 100 artifacts worth of signal.
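The comparison can be written down as a small calculation. This is a minimal sketch using the figures from the two scenarios above; the `Workflow` class is a hypothetical helper, and it counts artifacts only, not how good each one is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    name: str
    artifacts: int
    human_hours: float

    @property
    def signal_per_hour(self) -> float:
        """Artifacts available to inspect per hour of the time window."""
        return self.artifacts / self.human_hours


# Figures taken from the two scenarios described in this section.
one_attempt = Workflow("One Perfect Attempt", artifacts=1, human_hours=12)
search = Workflow("Hundred-Output Search", artifacts=100, human_hours=12)

for w in (one_attempt, search):
    print(f"{w.name}: {w.signal_per_hour:.2f} artifacts/hour")
```

The ratio between the two rates is simply the ratio of artifact counts, because the time window is held constant. What the number cannot capture is the review skill needed to turn those artifacts into decisions.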

This is why quality can go up, not down. You are using 100 attempts to discover what the polished final attempt should contain. The scarce skill becomes knowing what to keep, what to throw away, and how to join the best pieces into one final product.

Your Prompt Is Probably Not Optimal

Another reason the math works: your first prompt is probably not the best prompt. If you spend 12 hours chasing one implementation from one imperfect prompt, and that direction is wrong, most of that time is wasted.

One prompt: 12h chasing one artifact. High chance the first direction is wrong.

Many prompts: 12h searching across artifacts. More chances to find the right direction early.

Many artifacts are a hedge against prompt error. Different prompts expose different assumptions, failure modes, and design directions. You are not only searching for code; you are searching for the right framing of the problem.
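The hedge can be made concrete with a toy probability model. The sketch below assumes each prompt independently has the same chance of landing on a good direction; both the independence and the value of `p_single` are assumptions for illustration, and in practice prompts written by the same person tend to share blind spots.

```python
def chance_of_a_good_direction(p_single: float, n_prompts: int) -> float:
    """Probability that at least one of n independent prompts finds a good direction."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_prompts < 0:
        raise ValueError("n_prompts must be non-negative")
    # Complement of "every prompt misses".
    return 1.0 - (1.0 - p_single) ** n_prompts


# Hypothetical: each prompt has a 20% chance of being well framed.
for n in (1, 5, 10):
    print(n, round(chance_of_a_good_direction(0.2, n), 3))
```

Even with a pessimistic per-prompt success rate, the chance that at least one framing works rises quickly with the number of distinct prompts. The effect is weaker when the prompts are near-duplicates of each other.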

90% Vibe, 10% Steering

My hypothesis is that an artifact that is 90% of the way there, vibe-coded in 30 minutes, is usually the better starting point. Even if the final 10% requires reversing a major architectural decision and refactoring the entire codebase, that can still be faster than spending hours planning a perfect one-shot that likely still falls short of a 100% artifact.

90% artifact, 10% steer: use cheap generation to get to a concrete artifact quickly, then spend the careful effort tightening, steering, and owning the final result.

The goal is not to worship the first output. The goal is to get to a real artifact quickly, learn from it, steer it, and then use quality mode to tighten the final product into something you understand and own.

Higher-Quality Plans Through Iteration

Action improves the plan. If you spend six hours hand-crafting one perfect plan, you still only have one plan, and it may be wrong. If you spend 30 minutes planning, let agents vibe for five hours, and then reconcile a new plan from what you learned, the second plan is built from evidence.

Manual Perfect Plan

6 hours spent planning one direction.
No working artifact to inspect.
No reference implementation to cherry-pick from.
No discovered edge cases from real execution.
High chance the plan is still wrong.

Plan Through Iteration

30 minutes of initial direction.
5 hours of agent-generated artifacts.
A working reference implementation to inspect.
Concrete insights from what broke or felt wrong.
A stronger second plan for quality mode.

At the end of the same six hours, the second path gives you the original plan, a real artifact, implementation details to steal, learned constraints, and a better plan for the second implementation.

Static Planning Is Losing Leverage

Static planning and spec-driven development are not useless, but they are losing leverage as the default first move. When agents can produce working artifacts quickly, the plan does not need to carry every unknown up front. Vibe coding becomes a form of planning because action produces information.

A static spec is still only a theory. A working artifact reveals the hidden constraints: awkward APIs, missing states, unclear data flow, edge cases, performance problems, and design choices that only become obvious once something exists.

Figma example: production-grade static planning tools still matter. But agents are now good enough that vibe coding and iterating on real UI artifacts is often more productive, and produces higher quality, than spending the same time perfecting frames, writing a full SPEC.md, and researching every decision manually. The working artifact is verifiable; the static plan is not.

Vibe Mode vs Quality Mode

The mistake is treating every AI-generated artifact the same. There are two modes with different contracts. Vibe mode is for discovering what is possible. Quality mode is for turning the strongest signal into something you are willing to ship and maintain.

The artifact contract is different. A vibe-mode artifact is allowed to be messy if it teaches you something: a prototype, dead end, competing architecture, gallery, studio, or throwaway branch. A quality-mode artifact is held to a higher bar: it should be code you understand, have reviewed and tested, can maintain, and can defend.

Put differently: vibe mode produces 1000 artifacts. Quality mode produces one production-ready artifact by cherry-picking the best insights from those 1000.

Vibe mode is still slop. It is increasingly better slop, and often incredibly useful slop, but it is not production-ready without a quality-mode pass. The point is to use it as signal, not to pretend the first working artifact is finished.

Vibe Mode

Optimize for breadth, speed, and information. The output should answer: what works, what breaks, and what directions are worth pursuing?

Generate options, variants, and competing approaches.
Use sandboxes, feature branches, and worktrees.
Expect messiness; do not protect the first artifact.
Read terminal output and steer direction continuously.
Fail early, redirect quickly, and learn what the agent can handle.
Run agents overnight if the environment is isolated.
Never send vibe code directly to main.
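One way to keep parallel attempts isolated is a separate git worktree per attempt. The following is a minimal sketch, not a prescribed setup: the `vibe/` branch prefix, the sibling-directory layout, and the example repository path are all hypothetical. It defaults to a dry run that only builds the commands.

```python
import subprocess
from pathlib import Path


def worktree_command(repo: Path, name: str, base: str = "main") -> list[str]:
    """Build a `git worktree add` command for one isolated vibe branch."""
    branch = f"vibe/{name}"                           # hypothetical naming convention
    path = repo.parent / f"{repo.name}-vibe-{name}"   # sibling directory per attempt
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]


def create_worktrees(repo: Path, names: list[str], dry_run: bool = True) -> list[list[str]]:
    """Return the commands, and run them only when dry_run is False."""
    commands = [worktree_command(repo, n) for n in names]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands


# Hypothetical repository path and attempt names.
for cmd in create_worktrees(Path("/tmp/myapp"), ["sqlite", "postgres"]):
    print(" ".join(cmd))
```

Each agent then works in its own directory on its own branch, so a failed attempt can be deleted without touching the others or the main checkout.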

Quality Mode

Optimize for correctness, clarity, and ownership. The output should answer: can we ship this, maintain this, and explain this?

Bring the best vibe artifacts, opinions, and insights.
Start from a clean implementation plan or ticket backlog.
Understand the code in the weeds, not just the outcome.
Review the code, verify behavior, and steer at every step.
Cherry-pick ideas, but do not blindly inherit structure.
Treat it like the final product.

Vibe Mode Never Goes to Production

Vibe mode is for breadth, speed, and information. Use sandboxes, feature branches, worktrees, disposable environments, and overnight agent runs to produce as many artifacts as possible. Try competing architectures. Reverse decisions. Let the agent explore directions you would not have had time to hand-build yourself.

But the boundary has to be explicit: vibe code does not go to production. The artifacts are evidence, references, and raw material. They are not the final system. Production starts when you enter quality mode.

Quality mode is where you rebuild or reconcile the final version with ownership. You bring forward the best ideas, reference implementations, and learned constraints from vibe mode, then apply code review, tests, architecture judgment, and line-by-line understanding. That is the part you ship.

The failure mode is letting the contracts blur. If you add too much production friction during vibe mode, exploration becomes slow. If you ship vibe artifacts without a quality pass, the codebase inherits decisions no one actually owns.
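The boundary is easier to keep when it is enforced mechanically. Here is a minimal sketch of a check that could run in CI or a pre-merge hook, assuming vibe branches share a naming prefix. The prefix and the protected branch names are hypothetical conventions, not something this article's workflow mandates.

```python
VIBE_PREFIX = "vibe/"             # hypothetical naming convention for vibe branches
PROTECTED = {"main", "release"}   # hypothetical protected branches


def merge_allowed(source_branch: str, target_branch: str) -> bool:
    """Return False when a vibe branch targets a protected branch."""
    is_vibe = source_branch.startswith(VIBE_PREFIX)
    return not (is_vibe and target_branch in PROTECTED)


print(merge_allowed("vibe/sqlite", "main"))          # blocked
print(merge_allowed("quality/checkout-v2", "main"))  # allowed
print(merge_allowed("vibe/sqlite", "vibe/gallery"))  # allowed: vibe-to-vibe is fine
```

A check like this adds no friction to exploration, since vibe branches can still merge into each other freely. It only stops the one move that blurs the contract.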

Quality Mode Is Where You Own It

Quality mode is where the AI engineer takes ownership. The output is no longer a learning artifact. It is the product you need to ship, maintain, debug, explain, and defend.

That means you should understand the code in the weeds. Not just the happy path. Not just the demo. You should understand the data model, control flow, failure modes, abstractions, tests, and tradeoffs well enough to change them without asking the agent to rescue you.

The transition from vibe mode to quality mode is a distillation step: collect the artifacts, write down the decisions they surfaced, and turn those decisions into a clean implementation plan or ticket backlog. Then build the final version deliberately, with review and verification at every step.

If that requires starting over, start over. My favorite technique is to reimplement the product from scratch, iteration by iteration, using the vibe artifacts as reference material. After each iteration, review the code, verify behavior, tighten the architecture, and only then move forward.

The key prompt still matters: “Using what you know now, how would you reimplement this a second time for better quality?” But the prompt is not the product. The product is the owned implementation you build after the prompt exposes what the first pass taught you.

My Favorite Techniques

Use Galleries and Studios to Compare Options

The easiest example is landing pages. Send subagents to build 10 different versions with different themes, then ask for a gallery view. Pick the best one, or combine the strongest pieces. Most of the compute may be thrown away. That is fine. The value is seeing the option space.
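A gallery does not need to be elaborate. The sketch below renders one static HTML page with an iframe per variant. It assumes each variant was written to its own folder with an `index.html`; the function name and the folder layout are hypothetical.

```python
from html import escape


def gallery_html(variants: dict[str, str], title: str = "Landing page gallery") -> str:
    """Render one static page with an iframe per variant.

    `variants` maps a display name to the relative URL of that variant's page.
    """
    cards = "\n".join(
        f'<figure><iframe src="{escape(url, quote=True)}" loading="lazy"></iframe>'
        f"<figcaption>{escape(name)}</figcaption></figure>"
        for name, url in variants.items()
    )
    return (
        "<!doctype html>\n"
        f"<title>{escape(title)}</title>\n"
        "<style>body{display:grid;gap:16px;"
        "grid-template-columns:repeat(auto-fill,minmax(360px,1fr))}"
        "iframe{width:100%;height:480px;border:1px solid #ccc}</style>\n"
        f"{cards}\n"
    )


# Hypothetical layout: theme-1/index.html ... theme-10/index.html
variants = {f"theme-{i}": f"theme-{i}/index.html" for i in range(1, 11)}
page = gallery_html(variants)
print(page.count("<iframe"), "variants in the gallery")
```

Writing `page` to a file next to the variant folders gives a single view of all ten themes, which is usually enough to pick sections worth combining.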

“Studio-ing” generalizes the same idea. If you are thinking about five parameters or approaches, build a small studio: a comparison surface where you can customize the knobs and see many outputs in one view.

Figure: a comparison studio with parameter controls (Layout, Tone, Density, Motion, Risk) and a grid of output variants. A studio turns hidden tradeoffs into visible options: tune the knobs, compare variants, then choose or combine.
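The variant grid behind a studio can be enumerated directly from the knobs. This is a minimal sketch: the five knob names are the ones listed above, while the values for each knob are hypothetical placeholders.

```python
from itertools import product

# Knob names come from the studio above; the values are hypothetical placeholders.
KNOBS = {
    "layout": ["single-column", "split"],
    "tone": ["playful", "formal"],
    "density": ["airy", "compact"],
    "motion": ["none", "subtle"],
    "risk": ["safe", "bold"],
}


def studio_variants(knobs: dict[str, list[str]]) -> list[dict[str, str]]:
    """Enumerate every combination of knob values as one variant config each."""
    names = list(knobs)
    return [dict(zip(names, combo)) for combo in product(*knobs.values())]


grid = studio_variants(KNOBS)
print(len(grid))  # 2**5 = 32 combinations
print(grid[0])
```

Each config can be handed to an agent as the brief for one variant. The grid grows multiplicatively with every added value, so it may be worth sampling a subset once the full product gets large.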

Continuous Steering

Vibe mode is not autopilot. I am reading terminal output, checking direction, correcting drift, and failing bad branches early. The point is to let agents move fast without letting them silently move in the wrong direction.

This is where senior judgment matters. You know when the agent is producing useful signal, when it is overfitting the wrong thing, and when it needs a sharper constraint. Take many shots on goal, but steer continuously.

Output-Based Prompting

Describe the output in detail. The clearer the destination, the better the agent can find its way there. In practice, that means specifying behavior, acceptance criteria, interface expectations, tests, and the shape of the finished artifact.

TDD and verification matter because they turn taste into executable constraints. You are not just asking for code. You are defining what done looks like, then using tests and reviews to keep the agent pointed at that outcome.
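As an illustration of acceptance criteria acting as executable constraints, the sketch below scores several candidate implementations against the same criteria. The `slugify`-style task, the criteria, and both candidate functions are hypothetical stand-ins for agent-generated artifacts.

```python
from typing import Callable

# Acceptance criteria for a hypothetical slug-making task: (input, expected output).
CRITERIA = [
    ("Hello World", "hello-world"),
    ("  padded  ", "padded"),
    ("Already-Slugged", "already-slugged"),
    ("a  b", "a-b"),
]


def score(candidate: Callable[[str], str], criteria=CRITERIA) -> float:
    """Fraction of acceptance criteria a candidate implementation satisfies."""
    passed = 0
    for text, expected in criteria:
        try:
            passed += candidate(text) == expected
        except Exception:
            pass  # a crash counts as a failed criterion
    return passed / len(criteria)


# Two stand-ins for agent-generated artifacts.
def artifact_a(text: str) -> str:
    return text.lower().replace(" ", "-")


def artifact_b(text: str) -> str:
    return "-".join(text.lower().split())


ranked = sorted([artifact_a, artifact_b], key=score, reverse=True)
print([(f.__name__, score(f)) for f in ranked])
```

Here `artifact_b` satisfies every criterion while `artifact_a` fails the whitespace cases. The same criteria serve twice: as the description of done in the prompt, and as the yardstick for comparing whatever the agents return.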