I changed a URL. One URL. The default landing page for authenticated users, from /dashboard to /home. My AI agent analyzed the codebase and identified three files that needed updating. We made the changes, verified them, deployed.
Users were still landing on /dashboard.
Not all users. Not every time. But enough. The auth callback flow still redirected there. Several error pages had it hardcoded. The keyboard shortcut for "go home" pointed there. Breadcrumb navigation referenced it. The offline fallback page used it. In total, there were over thirty references to that URL scattered across the codebase. We'd found three.
It took three follow-up commits to fix everything. Three rounds of "how did we miss that?" Three rounds of users hitting a page that no longer existed in the way they expected.
This wasn't an AI failure. This was a planning failure. And it fundamentally changed how I approach every change to the system, no matter how small it seems.
The "Just Start Coding" Trap
AI wants to code. The moment you describe a problem, it's already thinking about the solution. Give it a feature request and it will start writing components, API routes, and database queries before you've finished explaining the requirements.
This eagerness is both AI's greatest strength and its most dangerous quality.
For small, isolated changes, the bias toward action works fine. Create a new button. Add a field to a form. Build a standalone utility function. These are self-contained tasks where the blast radius is limited and the risk of unintended consequences is low.
But enterprise software is almost never self-contained. Everything connects to everything else. A URL path appears in auth flows, navigation components, error handlers, service workers, breadcrumbs, keyboard shortcuts, redirect logic, middleware, and documentation. A database column name appears in queries, API routes, frontend components, validation schemas, and test fixtures. A configuration key appears in environment files, deployment scripts, monitoring dashboards, and logging statements.
When you let AI start coding before it understands the full scope of a change, you get partial solutions. The three files it found first, not the thirty that actually needed updating. The happy path implementation, not the error handling edge cases. The feature that works in isolation but breaks something three layers away.
I learned this the hard way, multiple times, before I built a system that prevents it.
Plan-First Development
The single biggest improvement I made to my AI development workflow was adding a mandatory planning phase before any implementation begins.
Here's what that looks like in practice. Before writing a single line of code, I create a plan document. Not a vague outline. Not a list of bullet points. A detailed, line-by-line specification that covers:
- Every file that will be created, modified, or deleted
- Every deliverable, described specifically enough that completion is binary (done or not done, no gray area)
- Every verification that will be run to prove the work is correct
- Every dependency between deliverables (what has to happen before what)
- Every known risk or edge case
The plan is the contract. It's the source of truth for what "done" means. When the AI and I disagree about whether something is complete, we don't debate it; we open the plan and check. In Massu, the /massu-create-plan command automates this: it researches the codebase, verifies feasibility against real file structure, checks pattern compliance, and generates a numbered plan document in docs/plans/ with verification commands for every deliverable.
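To make the contract idea concrete, here is a minimal sketch of what one plan entry could look like as structured data. All field names, file paths, and the verify command are my own illustration, not Massu's actual plan format:

```python
# Illustrative shape of a single plan deliverable; every field name here
# is a hypothetical example, not Massu's real schema.
plan = {
    "id": "07-landing-url-change",
    "deliverables": [
        {
            "n": 1,
            "action": "modify",
            "file": "src/config/routes.ts",
            "spec": "DEFAULT_LANDING changes from /dashboard to /home",
            "verify": "grep -n '/home' src/config/routes.ts",
            "depends_on": [],
            "done": False,  # binary: done or not done, no gray area
        },
    ],
    "risks": ["auth callback may cache the old redirect target"],
}

def is_complete(plan: dict) -> bool:
    # The plan is the contract: disagreements are settled by checking it,
    # and "complete" means every deliverable is done, not most of them.
    return all(d["done"] for d in plan["deliverables"])
```

The point is the shape, not the format: each deliverable is specific enough that its completion is binary, and each carries its own verification command.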
This sounds bureaucratic. It sounds like the kind of heavyweight process that agile methodology was supposed to save us from. But here's the thing: with AI, creating a comprehensive plan takes minutes, not days. The AI is exceptionally good at analyzing requirements, identifying affected files, and generating detailed implementation plans. What it's bad at is doing all that analysis on the fly while simultaneously writing code. The planning phase gives the analysis the focused attention it deserves, separate from the implementation.
And the plan doesn't just help the AI. It helps me. Reading through a detailed plan before implementation starts is when I catch the requirements that don't make sense, the edge cases that weren't considered, the dependencies that will cause problems. It's infinitely easier to fix a plan than to fix code.
Blast Radius Analysis
After the URL incident, I added a mandatory step to the planning process that I call blast radius analysis. The concept is simple: before changing any value that might appear in multiple places, search the entire codebase for every reference to that value.
Not the files you think reference it. Not the files that are supposed to reference it. Every file that actually does.
The process works like this. Say you're changing a configuration key from api_v1_endpoint to api_v2_endpoint. Before writing the plan, you run a search across every file in the project for the old value. You get back a list of every file, every line, every occurrence. Then you categorize each one:
- Change: This reference needs to be updated to the new value
- Keep: This reference should stay as-is (with a documented reason why)
- Investigate: Not sure yet, needs more analysis
The critical rule: zero "investigate" items before implementation starts. Every single reference must be categorized. If you can't determine whether a reference should change or stay, you need to do more research before writing any code, because an uncategorized reference is a bug waiting to happen.
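The process above can be sketched in a few lines. This is my own simplified illustration, not Massu's implementation: a real version would likely shell out to grep or ripgrep, and the `DECISIONS` table stands in for the manual categorization work. Anything not explicitly decided defaults to INVESTIGATE, which blocks the gate:

```python
import os

# Hypothetical value being changed, and manually curated decisions for each
# file that references it. Anything uncategorized defaults to INVESTIGATE.
OLD_VALUE = "api_v1_endpoint"
DECISIONS = {
    "config/endpoints.py": "CHANGE",
    "docs/changelog.md": "KEEP",  # historical record; documented reason to keep
}

def blast_radius(root: str, needle: str) -> dict:
    """Walk the whole tree and categorize every file containing `needle`."""
    report = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as f:
                    if needle in f.read():
                        rel = os.path.relpath(path, root)
                        report[rel] = DECISIONS.get(rel, "INVESTIGATE")
            except (UnicodeDecodeError, OSError):
                continue  # skip binaries and unreadable files
    return report

def gate(report: dict) -> bool:
    # The critical rule: zero INVESTIGATE items before implementation starts.
    return all(category != "INVESTIGATE" for category in report.values())
```

A newly added file that references the old value shows up as INVESTIGATE automatically, which is exactly the failure mode documentation alone cannot catch.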
Going back to my URL example: if I'd run this analysis before making the change, I would have found all thirty-plus references. I would have seen the auth callback flow, the error pages, the keyboard shortcuts. I would have known that this "simple" change was actually a complex, cross-cutting modification that touched nearly every layer of the application.
Three files identified without blast radius analysis. Thirty-plus files identified with it. That's not a marginal improvement. That's the difference between a clean deployment and three days of follow-up fixes.
This incident became Massu's canonical rule CR-10: "Blast radius analysis for value changes." The verification requirement VR-BLAST-RADIUS now enforces it: before any value change, the entire codebase must be searched, every reference categorized as CHANGE, KEEP, or INVESTIGATE, and zero INVESTIGATE items are allowed before implementation starts.
Why Documented Sync Patterns Aren't Enough
After getting burned by cross-cutting changes a few times, my first instinct was to create documentation. I wrote down which files needed to stay in sync. "If you change the user type mappings, update these three files: middleware, auth redirect, and admin router." "If you change the navigation structure, update the sidebar config, the breadcrumb config, and the mobile nav."
This helped. It reduced the frequency of missed references. But it didn't eliminate them.
The problem is that documented sync patterns only cover the relationships you already know about. They're a snapshot of the codebase as it existed when you wrote the documentation. But codebases evolve. New files reference old values. New features introduce new dependencies. The offline fallback page that referenced my landing URL wasn't in the original documentation because it didn't exist when I wrote the sync patterns.
Documented sync patterns are necessary. They capture the known, expected relationships and make them easy to verify. But they are not sufficient. The blast radius of any change is the entire codebase, not just the documented sync points.
This is why blast radius analysis uses a live search, every time, for every change. Documentation tells you where to look first. The search tells you where to actually look.
I think of it like building codes in architecture. You have documented standards for load-bearing walls, electrical runs, plumbing routes. But when you renovate, you don't just check the blueprints. You open the walls and look. Because someone might have run a cable through there that isn't on any blueprint. The actual state of the building is the truth, not the documentation about the building.
Delegating Exploration to Sub-Processes
There's a practical problem with thorough planning: it consumes context. AI models have limited working memory. When your AI agent spends significant effort exploring the codebase, reading files, running searches, analyzing dependencies, that exploration fills up the context window. By the time the agent starts actually implementing, it's already forgotten half the rules and patterns it was supposed to follow.
I solved this with what I call subagent delegation --- in Massu, this is the architecture behind /massu-loop, the autonomous execution command. Instead of having the main AI agent do all the exploration itself, it spawns specialized sub-processes for specific analytical tasks. A blast radius analyzer that searches the codebase for all references to a value and returns a categorized report. A schema verifier that checks database structure and returns only the relevant findings. A pattern reviewer that validates compliance with established coding patterns.
Each sub-process runs in its own isolated context. It can explore as broadly as it needs to without polluting the main agent's working memory. When it's done, it returns a concise summary of its findings. The main agent gets the information it needs without the context overhead of having gathered it.
This is analogous to how architecture firms work. The lead architect doesn't personally survey the soil, test the materials, and analyze the traffic patterns. Specialists handle those tasks and deliver reports. The architect synthesizes the reports into a coherent design. The specialists go deep so the architect can stay broad.
The practical impact is significant. Planning sessions that used to degrade in quality as they went on, with the AI starting to forget earlier analysis as new analysis piled up, now maintain consistent quality throughout. The main agent stays focused on the plan while sub-processes handle the detailed investigation.
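The shape of this pattern is simple enough to sketch. Here, `run_agent` is a placeholder for whatever call spawns an isolated AI session; the specialist names and prompts are illustrative, not Massu's actual sub-processes. The key property is that each specialist runs independently and the main agent only ever receives the compact summaries:

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(task: str, prompt: str) -> str:
    # Placeholder: a real system would spawn an LLM session here with its
    # own fresh context, let it explore, and return only its summary.
    return f"[{task}] summary of findings"

# Hypothetical specialists; each gets its own isolated context.
SPECIALISTS = {
    "blast-radius": "Find every reference to /dashboard and categorize it.",
    "schema-check": "Verify the users table matches the plan's assumptions.",
    "pattern-review": "Check the planned files against our coding patterns.",
}

def plan_research() -> dict:
    # The main agent never sees the exploration, only the condensed reports,
    # so its own context window stays free for planning and implementation.
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(run_agent, name, prompt)
                   for name, prompt in SPECIALISTS.items()}
        return {name: future.result() for name, future in futures.items()}
```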
Plan Coverage: Every Item, No Exceptions
I have a rule that sounds obvious but took multiple painful incidents to internalize: before claiming a plan is complete, verify every item. Not most items. Not the items that seem important. Every single one. This became Massu's canonical rule CR-6: "Check ALL items in plan, not most."
"Most of them" is not "all of them."
Here's what this failure mode looks like. The AI implements a plan with twenty deliverables. It knocks out the first fifteen, verifying each one. Then it starts rushing. It implements deliverable sixteen but doesn't verify it. It marks seventeen as done based on memory rather than checking. It skips eighteen because it "looks similar to fifteen." It does nineteen and twenty quickly and declares the plan complete.
When I check, three deliverables have issues. Not catastrophic issues, but issues nonetheless. A component that was created but never rendered on any page. An API endpoint that was built but never wired up to any UI element. A configuration that was added to the code but not to the environment.
Each of these partial completions creates a specific kind of technical debt: invisible non-functionality. The code exists. It looks right. It would pass a cursory review. But it doesn't actually work in the context of the running application, because the last connection between the backend implementation and the user-facing interface was never made.
My verification system now requires proof for every single plan item. File exists? Show me the directory listing. Code added? Show me the search result. Component rendered? Show me the grep across page files. Build passes? Show me exit code zero. Not "I checked" --- show me the output of the command that proves it.
This is tedious. It adds time to every implementation cycle. But it adds far less time than discovering in production that a feature was ninety percent implemented. Ninety percent implemented is zero percent useful to the user.
In Massu, VR-PLAN-COVERAGE enforces this: the /massu-loop and /massu-commit commands require item-by-item proof for every single plan deliverable. The system literally cannot claim completion until every item has a verification command output showing it was implemented.
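A stripped-down sketch of what proof-based verification might look like, assuming each deliverable pairs with a shell command whose exit code and captured output serve as the proof. The item names and commands are illustrative only:

```python
import subprocess

# Hypothetical plan items, each paired with the command that proves it.
# Pass/fail is the exit code; the captured output is the evidence.
VERIFICATIONS = [
    ("Component file exists", ["test", "-f", "src/components/Home.tsx"]),
    ("New URL wired into router", ["grep", "-rq", "/home", "src/router/"]),
]

def verify(items):
    results = []
    for name, cmd in items:
        proc = subprocess.run(cmd, capture_output=True, text=True)
        # Store the actual output alongside pass/fail:
        # "show me the command output", not "I checked".
        results.append({"item": name, "cmd": " ".join(cmd),
                        "exit": proc.returncode, "proof": proc.stdout})
    return results

def plan_complete(results) -> bool:
    # Every item, no exceptions: one failing proof means the plan is not done.
    return all(r["exit"] == 0 for r in results)
```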
The Living Plan
Plans change during implementation. That's normal and expected. You start building a feature and discover that the database schema doesn't quite match your assumptions. Or a library you planned to use doesn't support a capability you need. Or the UI layout that looked good on paper feels wrong when you see it rendered.
The question isn't whether plans change. It's whether those changes are deliberate and documented, or silent and accidental.
In my system, when implementation reveals something that requires a plan change, the process is explicit. The AI documents what was discovered, proposes a modification to the plan, and updates the plan document before proceeding. The plan always reflects current reality, not the initial assumptions that may no longer hold.
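An explicit plan update can be as lightweight as an amendment entry appended to the plan itself. This is a hypothetical sketch of the idea, not Massu's actual mechanism; the discovery and change shown are drawn from the URL incident:

```python
from datetime import date

def amend_plan(plan: dict, discovery: str, change: str, why: str) -> dict:
    # Every deviation from the original plan is recorded before proceeding:
    # what was discovered, what changes, and the documented justification.
    plan.setdefault("amendments", []).append({
        "date": date.today().isoformat(),
        "discovered": discovery,
        "change": change,
        "why": why,
    })
    return plan

plan = {"id": "07-landing-url-change", "deliverables": []}
amend_plan(plan,
           discovery="offline fallback page also hardcodes /dashboard",
           change="add deliverable: update the offline fallback page",
           why="found during a blast radius re-check mid-implementation")
```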
This matters for two reasons. First, it creates an audit trail. When I review a completed feature, I can see not just what was built, but what changed during the building and why. If a deliverable was removed, there's a documented reason. If a new deliverable was added, there's a documented justification.
Second, it prevents drift. Without explicit plan updates, the gap between what the plan says and what the code does grows silently over time. By the end of implementation, the plan is fiction and the code is the only source of truth. But code is a terrible source of truth for intent. It tells you what the system does, not what it's supposed to do. The maintained plan preserves intent alongside implementation.
I think of it like change orders in construction. When the contractor discovers something unexpected (a hidden pipe, unstable soil, a code violation) they don't just quietly work around it. They file a change order that documents the discovery, the proposed solution, and the cost impact. Everyone involved knows what changed and why. The blueprint stays accurate.
Planning Saves Time
This is the part that surprises people. Adding a thorough planning phase before implementation --- blast radius analysis, subagent exploration, detailed deliverable lists, verification requirements --- takes time. Sometimes an hour or more for a complex feature. The natural assumption is that this adds to the total project timeline.
It doesn't. It reduces it. Often dramatically.
Here's the math. Without planning, a typical feature goes through what I call the discover-fix cycle. You implement the feature. You discover it broke something. You fix that. You discover another issue. You fix that. Each cycle involves context-switching, re-analysis, and often re-implementation of things you thought were done.
With the URL change I described at the beginning, the discover-fix cycle took three additional commits over multiple days. Each cycle required finding the missed references, understanding the context, making the fix, and verifying it didn't break anything else. The total time spent on fixes exceeded the time spent on the original change by a factor of five.
If I'd spent thirty minutes on blast radius analysis upfront, I would have found all thirty-plus references, included them all in the plan, and implemented them all in a single pass. One commit instead of four. One deployment instead of four. Zero user-facing issues instead of several days of intermittent problems.
Planning doesn't prevent all bugs. Nothing does. But it prevents the category of bugs that comes from incomplete understanding of the change you're making. And in my experience, that category accounts for the majority of bugs in AI-assisted development. The AI doesn't struggle with writing correct code for a well-defined task. It struggles with understanding the full scope of what "correct" means in a complex, interconnected system.
Give it that understanding through thorough planning, and the code it writes is remarkably good on the first pass.
The Planning Mindset
What I've described in this article isn't really about tools or processes. It's about a mindset shift. The shift from "let's build this" to "let's understand this, then build it."
Architecture is the perfect analogy. No architect starts drawing walls on day one. They start with site analysis, requirements gathering, building-code review, structural analysis, and a hundred other preparatory steps. The actual design, the part that looks like "the work," comes after all of that preparation. And it's better for it. Faster, more accurate, fewer costly changes during construction.
AI-assisted development deserves the same rigor. The AI is your construction crew: fast, capable, tireless. But a fast construction crew without a blueprint just builds the wrong thing faster.
Plan first. Search before you change. Verify before you claim. Update the plan when reality surprises you.
It's not glamorous. It doesn't make for exciting demos. But it's how you build software that actually works, stays working, and can be maintained over time.
What's Next
In the next article, I'll tackle what I've come to believe is the single largest hidden bottleneck in AI-assisted development: context. Not the "prompt engineering" kind of context, but the raw resource of how much information the AI can hold in its working memory at once, and what happens when that resource runs out. Subagent isolation, token budgets, recovery protocols, and the discipline of context hygiene that keeps long sessions productive from start to finish.
This is Part 7 of a 10-part series on building enterprise software with AI:
- How I Stopped Vibe Coding and Built a System That Actually Ships
- The Protocol System: How I Turned AI From a Chatbot Into a Development Partner
- Memory That Persists: How I Made AI Actually Learn From Its Mistakes
- The Verification Mindset: Why "Trust But Verify" Is Wrong When Building With AI
- Automated Enforcement: Building Hooks and Gates That Catch Problems Before You Even See Them
- The Incident Loop: How Every Bug Makes Your AI Development System Permanently Stronger
- Planning Like an Architect: Why AI Needs a Blueprint Before Writing a Single Line of Code (this article)
- Context Is the Bottleneck: Managing AI's Most Precious and Most Fragile Resource
- Solo Worker, Enterprise Quality: The New Economics of AI-Assisted Development
- The Knowledge Graph: Teaching AI to Understand Your Codebase as a Living System
I'm the Co-founder and COO of Limn, where we create luxury furniture and fixtures for large-scale architectural and building projects. Alongside the physical work, we design the systems required to manage a complex, global lifecycle --- from development and production to shipping and final delivery. The governance system I built for Limn's software is now Massu AI, an open-source AI engineering governance platform.
Imagined in California. Designed in Milan. Made for you.
Here, I share what I've learned about making AI development actually work in the real world.
Have questions or want to share your own AI development setup? I'd love to hear from you in the comments.