I have a new bias I did not want to develop: less control, more intelligence.
I used to believe the opposite. If you want reliable outputs, you should control the pipeline. Add more steps. Define more constraints. Eliminate uncertainty.
That works until it does not. And in the age of fast-improving models, it stops working surprisingly fast.
The thesis
When model performance is comparable, I will almost always choose the simpler implementation, even if it gives up a little short-term quality. The long run favors the system that can absorb smarter models without needing a redesign.
The thesis sounds soft. It is not. It is a practical operational rule that came out of painful debugging and a few humbling outcomes.
I am not arguing against control in general. I am arguing against over-specifying the middle of a system that sits on top of rapidly improving models. Every extra rule you hard-code is a bet that the model will stay weak in exactly that way. That is a bad bet.
The original workflow: control everywhere
Our first project was straightforward on paper: generate ad creative images for clients. We assumed quality would come from control, so we built a highly orchestrated workflow, sketched below:
- extract a theme color from the input assets
- generate an image within that palette
- inject a logo with strict placement rules
- add headline and subhead text with template typography
- run a final pass for contrast and legibility
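To make the shape concrete, here is roughly what that looked like. The helper names are stand-ins I am inventing for this post, not our actual code; the point is the chain of coupled stages.

```python
# Schematic reconstruction, not our real code: every helper below is a stub
# invented for illustration. What matters is the shape: five coupled stages,
# and any one of them can silently degrade the result.

def extract_palette(assets):         return "#1a73e8"           # stub: dominant brand color
def generate_image(prompt, palette): return f"image<{palette}>" # stub: palette-constrained generation
def inject_logo(image, logo):        return image + "+logo"     # stub: strict placement rules
def render_text(image, copy):        return image + "+text"     # stub: template typography
def fix_contrast(image):             return image + "+contrast" # stub: legibility pass

def generate_ad(assets, logo, copy):
    palette = extract_palette(assets)            # step 1
    image = generate_image("ad scene", palette)  # step 2
    image = inject_logo(image, logo)             # step 3
    image = render_text(image, copy)             # step 4
    return fix_contrast(image)                   # step 5
```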
It was engineered, in the best sense of the word. It was also very fragile.
If a result looked wrong, we could not tell whether it was the palette extraction, the image generation, the overlay, or the text rendering. The failure modes were diffuse, and the pipeline had too many degrees of freedom. Debugging turned into archaeology.
Worse, every client request spawned a new exception. One brand needed extra padding around the logo. Another used monochrome logos that broke the contrast step. We had a spreadsheet of special cases that grew faster than the model quality did.
Debugging gravity: more steps, more unknowns
A multi-step pipeline creates a strong illusion of control. What it actually creates is a long causal chain. When a result is bad, the blame is spread across the chain. And when you try to fix one link, you often introduce a new flaw in the next link.
The practical problems were constant:
- a small change in a prompt broke the color extraction assumptions
- a new logo format silently shifted text placement
- a client insisted on a palette that clashed with text contrast rules
- fix one edge case, break three others
You can only fix this with more rules. And more rules mean more control, which means more brittle behavior. The pipeline becomes an optimization trap.
We tried to instrument every step with metrics. It helped, but it did not solve the core issue: the system was optimizing for internal consistency, not external impact. We spent time tuning palette extraction instead of asking whether the resulting ad was actually better.
When models jump, pipelines freeze
Then the models improved.
A new image model dropped (the team called it "nano banana" internally). It was simply better at composing scenes and respecting prompts. But our pipeline could not benefit. It was locked into a style of control designed for a weaker model. The new model did not need color extraction or heavy templating. It could generate the right tone and composition directly.
We had built a system that assumed the model would stay weak. When it got strong, our system became the bottleneck.
This is the part I did not anticipate: over-control ages poorly. It is built around compensating for a model you no longer have.
The simple strategy that beat the complex one
We eventually tried a drastically simpler approach:
- one strong prompt
- minimal constraints (logo and safe text zones)
- a small number of example references
That was it.
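In code, the whole thing collapses to something like this. The prompt wording and the `generate` callable are placeholders, not a specific model API:

```python
# Placeholder prompt and a caller-supplied `generate` function; this is the
# shape of the simple version, not a specific model's API.

PROMPT = """\
Create an ad creative for {brand}: {brief}.
Match the brand's tone. Keep the lower-left corner clear for the logo and
leave a legible zone across the top for the headline.
"""

def generate_ad(brand, brief, references, generate):
    prompt = PROMPT.format(brand=brand, brief=brief)
    # One call, minimal constraints, a handful of reference images.
    return generate(prompt=prompt, reference_images=references)
```

Swapping in a better model means changing what you pass as `generate` and touching nothing else.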
It felt irresponsible. But the results were generally better. Not always better, but better enough, and vastly easier to debug. Most importantly, the outputs improved automatically as models improved. The pipeline did not need to change. We just swapped the engine.
The lesson landed hard: simple loops track model progress; complex workflows fight it.
Why did it work?
- fewer transformations meant fewer opportunities to degrade quality
- the model could solve composition end-to-end instead of being chopped into steps
- evaluation moved from "did every step obey its rules" to "does the final image look good"
The last point mattered most. We stopped rewarding compliance and started rewarding outcomes.
One loop beats twenty tools
I saw the same pattern elsewhere. The Decoding Claude Code post points out that Claude Code works with a single main loop in most cases. It leans on heuristics and examples rather than complex orchestration.
The power is not in the number of moving parts. It is in the quality of the loop.
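To be clear about what I mean by a loop, the shape is roughly this. It is not Claude Code's actual implementation, just a minimal sketch that assumes the model returns either a final answer or a tool request on each turn.

```python
# Not Claude Code's actual implementation, just the shape of a single main
# loop. Assumes `model` returns a dict holding either a final answer or a
# tool request, and `tools` maps tool names to plain functions.

def run_agent(task, model, tools, max_turns=20):
    messages = [{"role": "user", "content": task}]
    for _ in range(max_turns):
        reply = model(messages)                       # one model call per turn
        if reply.get("tool") is None:
            return reply["content"]                   # model says it is done
        result = tools[reply["tool"]](reply["args"])  # run the requested tool
        messages.append(reply)                                 # keep the model's turn
        messages.append({"role": "tool", "content": result})   # feed the result back
    raise RuntimeError("agent did not finish within the turn budget")
```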
A friend at another company told me their agent framework had poor abstractions for subagents. The result was a "neutral" agent that had to orchestrate more than twenty tools just to get basic work done. When outputs were wrong, nobody knew where to look. The system was busy, not intelligent.
This is a warning sign: if you need twenty tools to do a simple thing, you have not built intelligence. You have built plumbing.
DeepSeek R1 and the outcome-first mindset
DeepSeek R1 reinforced this idea in a different way. Its training did not supervise each intermediate reasoning step; it rewarded outcomes. The model is allowed to discover its own internal path to the result. That is uncomfortable for engineers who want traceability, but it is often where the real intelligence shows up.
In other words: stop dictating how the model should think. Focus on whether the result is correct, useful, or aligned. The more you force a specific process, the less room the model has to use its strength.
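A toy version of outcome-first scoring looks like the sketch below. This is not DeepSeek's training code, and the "Answer:" marker is an assumed output convention; the point is simply that the reward reads the final answer and ignores the path to it.

```python
# Not DeepSeek's training code, just the idea in miniature: the reward looks
# only at the final answer and never inspects the reasoning that produced it.
import re

def outcome_reward(model_output: str, expected_answer: str) -> float:
    # Take whatever follows the last "Answer:" marker (an assumed output
    # convention) and ignore everything before it.
    answers = re.findall(r"Answer:\s*(.+)", model_output)
    if not answers:
        return 0.0
    return 1.0 if answers[-1].strip() == expected_answer.strip() else 0.0
```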
Why simplicity wins over small performance gains
So my rule of thumb now looks like this:
- If two systems are close in performance, pick the simpler one.
- If the simpler one is slightly worse today, you might still pick it if it is easier to upgrade tomorrow.
- When in doubt, reduce the control surface.
The argument is not philosophical. It is practical. Simpler systems:
- are easier to debug
- can absorb new model capabilities without redesign
- scale with data quality and model intelligence
- reduce the maintenance burden when requirements shift
If you aim for a 5% quality gain through heavy orchestration, you are usually paying for it with 50% more complexity. In a world where models improve every quarter, that trade is often negative.
The principle: less control, more intelligence
This is not an argument for chaos. You still need guardrails. You still need evaluation. You still need safety and compliance.
It is an argument for choosing where control actually matters:
- control the inputs (data quality, examples, constraints)
- control the outputs (evaluation, review, human accountability)
- loosen the middle
The middle is where models are getting smarter fastest. That is where you should allow them room to breathe.
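Put together, the shape I now aim for looks something like this sketch, where the validators and the `generate` callable are placeholders for whatever your stack provides:

```python
# Control the edges, loosen the middle. `validate_input`, `generate`, and
# `evaluate_output` are placeholders, not a specific library.

def produce(brief, generate, validate_input, evaluate_output, max_attempts=3):
    validate_input(brief)                  # controlled: data quality, constraints
    for _ in range(max_attempts):
        candidate = generate(brief)        # loose: no prescribed intermediate steps
        if evaluate_output(candidate):     # controlled: evaluation and review
            return candidate
    raise ValueError("no candidate passed review; escalate to a human")
```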
Closing thought
If intelligence keeps rising, your system should be built to rise with it. That usually means fewer rules, fewer steps, and more trust in a well-designed loop.