Claude Code doesn't need more frameworks; it needs agency

September 10, 2025

Introduction

Yesterday I read the article Claude Code Framework Wars, and it made me reflect on the direction we are taking with these frameworks. The idea of adding layers of order, phases and control on top of Claude Code sounds attractive at first, but on reflection my experience points in another direction. Claude Code already knows how to plan, organize itself and track its progress when it needs to. The real challenge is to give it the right context, tools and autonomy, not to constrain it with additional bureaucracy.

1. What Claude Code already does

It's worth remembering: Claude Code already includes much of what these frameworks try to replicate.

  • Plan Mode: ability to create and follow work plans.
  • Subagents: specialists with their own context, permissions and tools.
  • Hooks: deterministic steps that guarantee commits, tests or validations.
  • Slash commands: shortcuts that turn frequent prompts into commands.
  • CLAUDE.md: file that defines standards and decisions for your project.
  • Worktrees and devcontainers: parallel sessions and secure, reproducible environments. Claude Code doesn't use worktrees by default, since the standard experience favors simplicity, but it recommends them as a best practice when you need to isolate branches, compare approaches or run multiple sessions in parallel without conflicts.

With these pieces, the agent can already organize itself, coordinate tasks and follow your rules without relying on external frameworks.
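To make the worktrees point concrete, here is a minimal sketch. All paths and branch names are invented for the example; the pattern is one worktree per approach, so parallel sessions never touch the same checkout.

```shell
# Illustrative sketch: one git worktree per approach.
# Paths and branch names are hypothetical.
cd /tmp && rm -rf wt-demo && mkdir wt-demo && cd wt-demo
git init -q -b main
git config user.email demo@example.com
git config user.name demo
git commit -q --allow-empty -m "init"

# Each worktree gets its own branch and its own directory.
git worktree add -q -b approach-a ../wt-demo-a
git worktree add -q -b approach-b ../wt-demo-b

git worktree list   # main checkout plus the two linked worktrees
```

You would then start an independent Claude Code session in each directory, compare the results, and merge the branch that wins.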

2. Not all tasks are the same

I recently heard a Claude Code engineer describe three types of tasks:

  1. Fast and clear ones: with just a few instructions, Claude Code understands and delivers directly.
  2. Exploratory ones: problems where the developer and Claude Code work together to discover the solution, advancing step by step, reviewing deviations and deciding when to pivot.
  3. Overly complex ones: tasks that, even when explained, are handled more efficiently by a human engineer, to guarantee quality and avoid wasting time or tokens; Claude Code is left with supporting tasks such as testing or validation.

This classification makes sense, but it's also limited. In practice, each company and project could have any number of task categories. Needs are infinite; they don't fit into a fixed typology.

And here lies the problem: when we try to force everything into a single framework, the framework itself starts generating artificial tasks and steps. We end up in a loop of constantly tweaking it so it "fits all cases." Experience shows that such a generic framework never works: it's always too short for some tasks and too long for others.

That's why I believe it's a mistake to always require phases like detailed specs, opening issues in Linear or GitHub, or writing Markdown requirements when they're not necessary. If you already know the problem, where it is and how to fix it, you can explain it directly to Claude Code. It will implement it correctly and, if it generates a plan beforehand, it will follow it precisely. So why stop and delay progress with intermediate documents? In those cases, extra phases don't add value; they get in the way.

The key is flexibility: each task should have the level of structure it needs, no more and no less. If the solution is already clear, there's no reason to postpone it in the name of process.

3. Why restrict its judgment?

One of AI's main virtues is its ability to discern. When you impose a universal framework, it loses judgment: it forces unnecessary steps, invents phases or generates overengineering just to "comply."

There's also another point we often forget: no plan will ever be perfect on the first attempt. Not the one the agent proposes, not the one a framework dictates, not even the one a human programmer drafts. We will always end up revising the initial idea, tweaking the plan, fixing the spec or going back to cover corner cases we hadn't thought of.

Software development is inherently iterative: implementation, feedback, new proposal, more feedback, refinement. That's not a flaw; it's the normal way of working. Expecting the agent to deliver a closed, definitive solution from the start ignores this reality.

And iteration can take many forms. Sometimes it's a quick cycle on a small, clear issue. Other times it's a longer process with multiple phases of analysis and back-and-forth with the agent. Some developers prefer detailed instructions upfront, others like to start with a lightweight draft and refine step by step. Even the same developer may change style depending on the day or the task.

The important thing is to embrace this diversity as healthy and productive. This is where rigid frameworks fail: they assume there's one linear, universal path. Leaving room for the agent's and the developer's judgment lets you get the best out of each context: more iteration where it's needed, less bureaucracy where it's not.

4. What the best professionals actually do

The best engineers don't optimize for ceremony, they optimize for impact. They study the problem, understand the real context, calibrate the effort (how much specification, how much exploration, how much implementation) and choose tools and resources with judgment. They don't turn every task into a ritual; they adapt the process to the task, not the other way around.

That's exactly what we should expect from an agent: act like a competent collaborator, not a template executor. If you give it a clear goal, constraints, context and tools, it knows when to plan in detail and when to just execute.

Operational principles (what actually works):

  • Clear goals + explicit constraints. What must be achieved, what must not be broken.
  • Rich, recent context. Code, past decisions, dependencies, examples and counterexamples.
  • Tools first. Repo access, logs, metrics, MCP: enable rather than micromanage.
  • Verifiable checkpoints. Tests, hooks, small PRs. Prescribe outcomes, not steps.
  • Minimum necessary process. Enough to avoid expensive mistakes, not the maximum by inertia.
  • Conscious permissions and reversibility. Branches/worktrees, feature flags, easy rollbacks.
  • Measure success. Value delivered and short feedback loops, not pages of documentation.

In short: less micromanagement, more context and autonomy. If you trust its judgment and equip it well, the agent behaves like a professional who solves problems, not a checklist interpreter.
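Often "clear goals + explicit constraints" needs no more machinery than a CLAUDE.md at the repo root. A minimal sketch, where every rule, path and helper name below is an invented example:

```shell
# Sketch of a minimal CLAUDE.md; all conventions below are illustrative.
cat > CLAUDE.md <<'EOF'
# Project conventions

## Goals and constraints
- Prefer small, reviewable changes; never break the public API in src/api/.
- Run the test suite before proposing a commit.

## Context
- Error handling follows the pattern in src/errors/ (see handleError).
- Files under dist/ are generated: never edit them by hand.
EOF
```

A file this short already encodes what must be achieved, what must not be broken, and where the relevant context lives.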

5. The hack that actually adds value: enabling, not teaching to walk

Claude Code already "learned to walk" during training. It doesn't need to be educated with rigid phases; it needs to be equipped:

  • with your CLAUDE.md to capture standards and conventions,
  • with slash commands for frequent rituals,
  • with specialized subagents,
  • with hooks that enforce critical steps,
  • with MCP to access external services and data.

That's the real "hack": give the agent context and tools so it can apply what it already knows to your reality.
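As an example of how lightweight this equipping can be, both a slash command and a hook are just small files under `.claude/`. In the sketch below, the command name, the matcher and the formatter command are illustrative assumptions, not prescriptions:

```shell
mkdir -p .claude/commands

# A slash command is a Markdown prompt file; /review is an invented
# example, and $ARGUMENTS is replaced with whatever you type after it.
cat > .claude/commands/review.md <<'EOF'
Review the changes in $ARGUMENTS: check tests, error handling and edge cases.
EOF

# A PostToolUse hook that reruns the formatter after every file edit.
# The matcher and the prettier command are illustrative; use your own stack.
cat > .claude/settings.json <<'EOF'
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "npx prettier --write ." }
        ]
      }
    ]
  }
}
EOF
```

Two tiny files: one captures a frequent ritual, the other enforces a critical step deterministically, and neither imposes a phase on the agent.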

6. Artificial limits: the risk of putting gates in the field

The temptation to "impose order" with a universal framework usually ends up like this: gates in the middle of a field. In theory, they guide; in practice, they block paths and lower the ceiling. Our work is creative and plural: sometimes you start from logs, sometimes from a video, sometimes from a tricky bug or an old decision. Every need is different. A single template cannot cover that variety without being too short for some tasks and too long for others.

Real costs of those gates:

  • An artificial ceiling on intelligence. The agent stops exploring beyond the recipe and becomes conservative.
  • Coordination tax. More time in meta-work (phases, forms, rituals) than in solving the problem.
  • Lagging behind evolution. When Claude Code and the models improve, your framework falls behind.
  • Cargo cult. Steps that "look professional" but don't add real evidence or guarantees.

The alternative isn't chaos, it's guidance without straitjackets:

  • Don't put gates, put guardrails. Hooks, tests, linters, CI that blocks broken code.
  • Prescribe outcomes, not steps. "It must pass these tests / respect these policies," not "follow 8 phases."
  • Open the field, control the merge. Explore freely; without verifiable evidence, nothing enters main.
  • Elastic process by default. Light for the trivial; more structure only where risk demands it.
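A sketch of what "prescribe outcomes, not steps" can look like as a single merge gate in CI. The two check functions are placeholders for your real test and lint commands:

```shell
#!/usr/bin/env sh
# Outcome gate: the agent may reach the result by any path, but the
# merge is blocked unless this script exits 0. The check functions
# below are stand-ins for your actual commands.
set -e

run_tests()  { echo "tests: ok"; }   # e.g. pytest, npm test, go test ./...
run_linter() { echo "lint: ok"; }    # e.g. ruff, eslint, golangci-lint

run_tests
run_linter
echo "evidence verified: merge allowed"
```

The script says nothing about how the change was produced; it only demands verifiable evidence before anything enters main.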

The summary in one line: don't limit judgment just to feel in control. Raise the bar for verification and let the agent (and the team) reach it through different paths. Fewer universal templates, more judgment with guarantees.

7. Frameworks that amplify vs. frameworks that limit

Not all frameworks are the same. It's important to distinguish:

  • Amplifiers: Backlog.md (lightweight task management), ReqText (Git-native requirements), Claude Conductor (documentation structure). They provide context and productivity without imposing rigid phases.
  • Limiters: Agent OS, ccpm, BMAD or Claude-Flow, which prescribe roles, phases and universal processes. Useful in large or regulated teams, but bureaucratic and counterproductive in more flexible workflows.
  • Infra/environment: ClaudeBox, devcontainers, Crystal or Claude-Squad, which improve security, reproducibility or productivity. No over-process here, just useful tooling.

8. The problem of evolution

Claude Code isn't static. The model and the tool evolve together: subagents, hooks, extended thinking, permission upgrades. Any framework built on top will always react later, with delay and misalignment risk. Without a rigid layer, you inherit improvements "for free." With one, your ceiling becomes the layer you added.

Conclusion

Claude Code already knows how to plan and organize itself if it gets the right context. What it needs is not universal frameworks, but tools, standards and autonomy. Frameworks that amplify those capabilities are welcome. The ones that limit them put gates in the field and an artificial ceiling on the agent's intelligence.

Fewer universal templates, more judgment. Less bureaucracy, more intelligence.