Some days I write more lines I didn’t author than lines I did. We’re writing code with AI agents now. Not “getting suggestions”. Writing. You describe what you want, the agent produces the code, “hopefully” you review it, then ship it.
This isn’t a post about whether that’s good or bad. It’s a fact. What I’m interested in is: what now?
The assumption that broke
Every coding principle we follow, every book, every conference talk, every “clean code” rule, was written for a world where the person writing the code is the person reading the code. Or at least, where the dev team has lived in the codebase long enough to know it inside out.
DRY? Designed for someone who’ll remember where the shared logic lives.
Small functions? Designed for someone who can hold the call chain in their head because they built it.
Layered architecture? Designed for someone who knows the map because they drew it.
The assumption underneath all of it: the author and the reader share a brain.
That assumption is dead.
I come back to code three weeks later and I’m a visitor in my own codebase. I co-wrote it with a machine. I didn’t author every line. I’m semi-familiar at best. I know the shape of the thing, but not the details. And every time I open a file, I’m essentially rebuilding my understanding from scratch.
That’s the default state now. Not “I know this codebase inside out.” “I’m back in code I didn’t fully write, and I need to understand it fast.”
The real challenge
If that’s the default state, then the skill that matters most isn’t writing elegant code. It’s how fast you can rebuild a mental model when you’re a visitor.
When I review AI-generated code, the question I keep asking isn’t “is this correct?” Even when the code is correct and idiomatic, the harder question is: “if I come back to this in three weeks, will I understand it in ten minutes?”
Most of the time the answer is no. Not because the code is wrong. Because the code was written for a reader who already knows it.
What AI gave me
I asked an AI agent for a simple Flask app, two endpoints that fetch users and orders from an external API. Here’s what it gave me.
client.py:
import logging

import requests

log = logging.getLogger(__name__)


class ApiError(Exception):
    def __init__(self, endpoint, status_code):
        self.endpoint = endpoint
        self.status_code = status_code
        super().__init__(f"{endpoint} returned {status_code}")


class HttpClient:
    def __init__(self, base_url):
        self.base_url = base_url
        self.session = requests.Session()

    def get(self, endpoint, **kwargs):
        url = f"{self.base_url}/{endpoint}"
        response = self.session.get(url, **kwargs)
        return self._handle(response, endpoint)

    def _handle(self, response, endpoint):
        # Anything other than 200 is logged and surfaced as ApiError.
        if response.status_code != 200:
            log.error(f"{endpoint} failed: {response.status_code}")
            raise ApiError(endpoint, response.status_code)
        return response.json()
services.py:
class UserService:
    def __init__(self, client):
        self.client = client

    def get_users(self):
        data = self.client.get("users")
        return data.get("users", [])


class OrderService:
    def __init__(self, client):
        self.client = client

    def get_orders(self):
        data = self.client.get("orders")
        return data.get("orders", [])
app.py:
from flask import Flask, jsonify

from client import ApiError, HttpClient
from services import OrderService, UserService

app = Flask(__name__)
client = HttpClient("https://api.example.com")


@app.route("/api/users")
def users():
    try:
        service = UserService(client)
        return jsonify(service.get_users())
    except ApiError as e:
        return jsonify({"error": str(e)}), 500


@app.route("/api/orders")
def orders():
    try:
        service = OrderService(client)
        return jsonify(service.get_orders())
    except ApiError as e:
        return jsonify({"error": str(e)}), 500
Three files. Two service classes. A custom exception. An HTTP client with a private handler method. For two GET endpoints.
It’s not wrong. It’s beautiful by conventional standards. Separation of concerns, single responsibility, clean interfaces. Textbook stuff.
Now debug it. /api/users returns 500. Where do you start?
app.py → UserService.get_users → HttpClient.get → HttpClient._handle → maybe ApiError → back to app.py to see the catch. Three files, five jumps, to understand a GET request that fetches users from an API.
Maybe “clean” and “clear” are different things
This is what bugged me. The code AI generated looks beautiful. Well-named classes, proper separation, custom exceptions. It looks like a textbook example of clean code.
Now imagine this across a codebase ten times larger. Production is down. Alerts firing. You’re three weeks removed from writing this code, except you didn’t write half of it. You’re jumping between app.py, services.py, client.py, trying to find the one line that explains why users are getting 500s. Every jump is a context switch. Every context switch costs you seconds you don’t have.
“Clean” was always a proxy for something else: readability, maintainability, onboarding speed. But it’s a proxy that assumes the reader brings context with them. “Clear” is different. Clear means: someone with zero context can follow this. Clean optimises for aesthetics. Clear optimises for the visitor.
Maybe we’ve been optimising for the wrong word.
Why AI builds everything
AI doesn’t know when to stop.
Every human has felt that moment. You’re cooking, you’re about to add another spice, and you think “no, it’s good. Stop.” You’re decorating a room, you’re about to add one more thing to the shelf, and you step back and think “that’s enough.” You’re editing a document, you’re about to add another paragraph, and you think “this is getting too long. Cut something.”
That instinct, knowing when enough is enough, is something AI doesn’t have by default. It could always add one more thing, so it does.
When you’re deep in an agentic session, this plays out turn by turn. You ask for a feature, AI adds it. You ask for another, AI adds that too. A client class here, a service layer there, a factory because why not. Each addition seems reasonable in the moment. But AI never steps back to look at the whole picture and ask “wait, does any of this need to exist?”
Then you come out of the session. You look at what you’ve got. Three files, two service classes. For two GET endpoints. Half of it is redundant. Not wrong. Just… unnecessary. Built because the model could always add one more thing, and nothing told it to stop.
And here’s the thing: AI doesn’t carry the burden. You do. It builds the maximally complete version and moves on. You’re the one who has to understand it at 2 AM when something breaks.
So now your job has a new step before reviewing. Trimming. Looking at what AI gave you and asking: does this need to exist? If the answer is no, cut it.
We were trained to add. Extract, refactor, abstract, layer. More structure meant better code. Now the skill is kind of the opposite. Knowing what to remove.
Some honest questions
I don’t have a manifesto. I have questions. Here’s where I’ve started pushing back on things I used to accept without thinking.
Maybe “small functions” isn’t the same as “readable”
We’ve internalised the rule: functions should be small. Do one thing. Split, split, split.
But when I come back to AI-generated code that followed that rule, I’m six levels deep in a call chain before I find the line that does something. Each function is trivially simple. The whole is incomprehensible. By the time I’ve traced the chain, I’ve forgotten why I started.
Is “readable” the same thing as “understandable by someone who didn’t write it”? Small functions are readable in isolation. Linear code is understandable in context. When I’m a visitor, I need understandable.
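To make that concrete, here’s a made-up sketch. The order-total example and every function name in it are hypothetical, not taken from the code above. Both versions compute the same thing. The first follows “split, split, split”. The second reads top to bottom.

```python
# Hypothetical example: the same calculation written two ways.

# Version 1: every function is tiny, but the arithmetic sits several calls deep.
def order_total(items, tax_rate):
    return _with_tax(_subtotal(items), tax_rate)

def _subtotal(items):
    return sum(_line_total(item) for item in items)

def _line_total(item):
    return _price(item) * _quantity(item)

def _price(item):
    return item["price"]

def _quantity(item):
    return item["qty"]

def _with_tax(amount, tax_rate):
    return amount * (1 + tax_rate)


# Version 2: one function, read top to bottom, no jumps.
def order_total_linear(items, tax_rate):
    subtotal = 0
    for item in items:
        subtotal += item["price"] * item["qty"]
    return subtotal * (1 + tax_rate)


items = [{"price": 10, "qty": 2}, {"price": 5, "qty": 1}]
print(order_total(items, 0.5))         # 37.5
print(order_total_linear(items, 0.5))  # 37.5
```

Neither version is wrong. The difference is how many definitions a visitor has to visit before they see a multiplication.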
Maybe DRY has a cost we never counted
DRY. Don’t Repeat Yourself. If you write the same logic twice, extract it. One source of truth. Changes in one place. Clean.
But DRY creates dependencies. Every caller depends on the shared function. Not just “uses it”. Depends on understanding it. When something breaks, I trace from the call site to the shared function, understand what it does, and trace back.
When I wrote the shared function, I remembered all of that. When I’m a visitor? I reverse-engineer it. And the function was probably written generic, handling cases that aren’t mine.
If DRY was about reducing duplication for a developer who knows the codebase, what’s the equivalent for a developer who doesn’t?
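Here’s a rough illustration of that cost. The helper, its flags, and the invoice call site are all hypothetical. The point is only that a shared function tends to collect a branch for every caller, and a visitor has to read all of them to understand one.

```python
# Hypothetical shared helper that grew a flag for every caller.
def format_name(user, *, last_first=False, include_title=False, fallback="Unknown"):
    first = user.get("first", "").strip()
    last = user.get("last", "").strip()
    if not first and not last:
        return fallback
    name = f"{last}, {first}" if last_first else f"{first} {last}"
    name = name.strip(", ")
    if include_title and user.get("title"):
        name = f"{user['title']} {name}"
    return name


# Call site using the helper: to know what this returns, the visitor has to
# read format_name and work out which of its branches apply here.
def invoice_header(user):
    return "Invoice for " + format_name(user, include_title=True)


# The same call site with the logic inline: only the case this caller has.
def invoice_header_inline(user):
    return f"Invoice for {user['title']} {user['first']} {user['last']}"


user = {"title": "Dr", "first": "Ada", "last": "Lovelace"}
print(invoice_header(user))         # Invoice for Dr Ada Lovelace
print(invoice_header_inline(user))  # Invoice for Dr Ada Lovelace
```

The inline version is not free: it assumes all three fields are present and would raise a KeyError otherwise. That’s the trade. It handles less, and everything it handles is visible in one line.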
What I tried: the Rule of Three
There’s an old principle that got pushed aside when DRY took over. Martin Fowler popularised it in Refactoring, crediting Don Roberts, as the Rule of Three: the first time you do something, just do it. The second time, you feel the duplication but do it anyway. The third time, you refactor.
It acknowledges something DRY doesn’t: duplication is cheap. The wrong abstraction is expensive.
I threw out the AI’s three-file architecture and trimmed it back to what actually needed to exist.
First time. users endpoint:
@app.route("/api/users")
def users():
    response = requests.get("https://api.example.com/users")
    if response.status_code != 200:
        log.error(f"Failed to fetch users: {response.status_code}")
        return jsonify({"error": "failed"}), 500
    return jsonify(response.json().get("users", [])), 200
Just write it. No client class, no service layer. The error handling is right there next to the call.
Second time. orders endpoint. I feel the duplication, but I copy:
@app.route("/api/orders")
def orders():
    response = requests.get("https://api.example.com/orders")
    if response.status_code != 200:
        log.error(f"Failed to fetch orders: {response.status_code}")
        return jsonify({"error": "failed"}), 500
    return jsonify(response.json().get("orders", [])), 200
Similar shape? Yeah. Extract now? No. The pattern isn’t proven yet. Maybe orders will need auth headers later. Premature abstraction would couple the two endpoints before I know whether they evolve the same way.
Third time. products endpoint. Pattern is real. Now extract:
import logging
import requests
from flask import Flask, jsonify

app = Flask(__name__)
log = logging.getLogger(__name__)


def fetch_endpoint(endpoint, key):
    response = requests.get(f"https://api.example.com/{endpoint}")
    if response.status_code != 200:
        log.error(f"Failed to fetch {endpoint}: {response.status_code}")
        return jsonify({"error": "failed"}), 500
    return jsonify(response.json().get(key, [])), 200


@app.route("/api/users")
def users():
    return fetch_endpoint("users", "users")


@app.route("/api/orders")
def orders():
    return fetch_endpoint("orders", "orders")


@app.route("/api/products")
def products():
    return fetch_endpoint("products", "products")
One file. One helper function. Every endpoint is one line.
The abstraction that emerged from three real repetitions is simpler and fits the problem better than the one AI guessed before any repetition had proven itself. Because by the third time, I knew the actual shape of the pattern, not the shape I imagined it would have.
What actually matters now
Not everything needs rethinking. Some concepts survive the shift from “author-as-reader” to “author-as-visitor” just fine.
Descriptive naming. Variable and function names are signposts when you’re rebuilding a mental model. process_data tells me nothing. normalize_user_input_before_validation tells me everything.
Early returns. Flattening conditionals so I read top-to-bottom without holding preconditions in my head. Pure cognitive load reduction.
Modules you don’t need to open. A clean interface over complex work. The goal is simple: can I use this without going inside? When I’m a visitor, the best module is one I never have to read the internals of. I interact at the boundary and move on. The AI-generated client class wasn’t this. It didn’t hide complex work. It split simple work across files, making it harder to inspect. Good abstractions hide complexity. Bad abstractions make simple things harder to follow.
Linear flow. Code that reads top-to-bottom, where the next thing is literally the next line. No jumping. Just read.
Notice the common thread: they all reduce jumps. How many files do I need to open to understand this? One. The one I’m already in. When you’re the author, jumps are cheap. When you’re a visitor, every jump is a tax.
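A small sketch of the naming and early-return points together. The functions and the email check are hypothetical, chosen only to show the shape. Both behave the same; the second one lets me dismiss each precondition as I read past it.

```python
# Hypothetical example: names and checks are made up for illustration.

# Nested version: the reader carries every precondition down to the last line.
def process_data(d):
    if d is not None:
        if "email" in d:
            if "@" in d["email"]:
                return d["email"].strip().lower()
            else:
                return None
        else:
            return None
    else:
        return None


# Early-return version: each precondition is checked and dismissed in turn.
def normalize_email_before_validation(signup_form):
    if signup_form is None:
        return None
    if "email" not in signup_form:
        return None
    email = signup_form["email"]
    if "@" not in email:
        return None
    return email.strip().lower()


form = {"email": "  Ada@Example.com "}
print(process_data(form))                       # ada@example.com
print(normalize_email_before_validation(form))  # ada@example.com
print(normalize_email_before_validation({}))    # None
```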
“Just fix the AI prompt”
Someone pointed out: why change how you code? Just tell the AI to write readable code. Give it better instructions. Problem solved.
Fair question. My answer: we should do both.
Instruct the AI to produce code that a human can follow. Simple, linear, no premature abstractions. When a human needs to investigate, the code should already be built for a human brain, not for a pattern-matching model. That’s a skill we need to develop: writing prompts that produce code we can actually read later.
But that’s not enough on its own. Because even with the best prompt, the AI doesn’t know your codebase. It doesn’t know what you already have, what patterns you’ve established, what’s going to break when you add this. You still need to review. You still need to be the one who says “this is understandable” or “this isn’t.”
The prompt gets you 80% of the way. The last 20% is still you, rebuilding a mental model fast enough to catch what the machine missed.
So what now?
I still don’t have a manifesto. But I have a hypothesis about where this is heading.
The old principles aren’t dead. They need a new filter. Judge every abstraction by one question: does this help someone who didn’t write it understand it fast?
Put the rules where the agent reads them
The Rule of Three belongs in your agent’s instructions: AGENTS.md, CLAUDE.md, whatever file your coding agent loads at session start. Tell it directly: don’t extract abstractions until the same pattern appears three times. Write inline first. Duplication is acceptable.
But AGENTS.md loads once. By turn 15, it’s buried under conversation history. The agent forgets.
Both Claude Code and Codex support hooks that fire on every user prompt submission. You can inject the Rule of Three reminder into every single turn. The instruction stays fresh. Context drift doesn’t matter anymore.
Use both. AGENTS.md for the full ruleset. Hooks for the one principle that matters most. Yes, injecting on every turn costs tokens. It’s a small price for not having to trim the same overbuilt code every session.
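A minimal sketch of the hook side, in Python. It assumes the hook mechanism runs a command on each prompt submission and adds whatever that command prints to standard output to the agent’s context. Check your agent’s hook documentation for the exact wiring and config format, which isn’t shown here. The script name and the reminder wording are my own.

```python
#!/usr/bin/env python3
"""rule_of_three_hook.py (hypothetical name): print a reminder on every prompt.

Assumption: the agent runs this script when a prompt is submitted and
appends its standard output to the context. Verify that against your
agent's hook documentation before relying on it.
"""

REMINDER_LINES = [
    "Reminder (Rule of Three):",
    "- Write logic inline first.",
    "- Duplication is acceptable the first and second time.",
    "- Do not extract an abstraction until the same pattern appears three times.",
]


def build_reminder():
    """Return the reminder as a single block of text."""
    return "\n".join(REMINDER_LINES)


if __name__ == "__main__":
    print(build_reminder())
```

Keeping the reminder to a few lines is deliberate: it gets paid for on every turn.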
Trimming as a skill
Even with the best instructions, you still need to review. AI has no cost model for complexity. A new class costs it nothing. Another file costs it nothing. A factory, an interface, a custom exception, a service layer: these are cheap tokens to the model. They’re expensive to the person who has to trace them later.
It also has a strong sense of what code is supposed to look like. It has seen the shapes: clients, services, providers, factories, base classes, custom exceptions. In the training data, those shapes often sit near good code. So it reaches for them before the need has been earned.
Trimming is learning to notice that gap.
Not by running a checklist. More like walking into a server room and hearing the one fan that sounds wrong. You see a factory with one implementation. A base class with one child. A service method that only calls another method. An exception type that gets caught once and turned back into a string. A config object with one value. None of that is broken. It’s just code dressed for a future that hasn’t arrived.
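Here’s what a few of those shapes look like side by side with the trimmed version. Everything in this sketch is hypothetical, a greeting function standing in for real work. Both halves behave the same.

```python
# Hypothetical example: class and function names are made up for illustration.

# Ceremony: an exception caught once and turned back into a string,
# a base class with one child, and a factory with one implementation.
class GreetingError(Exception):
    pass

class Greeter:
    def greet(self, name):
        raise NotImplementedError

class EnglishGreeter(Greeter):
    def greet(self, name):
        if not name:
            raise GreetingError("name is empty")
        return f"Hello, {name}"

class GreeterFactory:
    @staticmethod
    def create():
        return EnglishGreeter()

def greet_with_ceremony(name):
    try:
        return GreeterFactory.create().greet(name)
    except GreetingError as e:
        return str(e)


# Trimmed: the thing that actually exists today.
def greet(name):
    if not name:
        return "name is empty"
    return f"Hello, {name}"


print(greet_with_ceremony("Ada"))  # Hello, Ada
print(greet("Ada"))                # Hello, Ada
print(greet_with_ceremony(""))     # name is empty
print(greet(""))                   # name is empty
```

If a second greeter ever shows up, the base class can come back. Until then it’s four definitions guarding one f-string.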
The Rule of Three example is the whole thing in miniature. AI abstracted at zero. It saw two endpoints and guessed the shape of the system. The better move was to un-abstract, copy the boring parts, and wait until the third real repetition proved what was shared. Only then did the helper have a job.
That’s the difference between structure and ceremony. Structure carries weight. Ceremony makes simple work look like serious software.
You learn to see it by being a visitor in overbuilt code enough times. After a while the shapes stand out. The layer that protects nothing. The abstraction that explains less than the code it replaced. The generic name hiding one special case. When you see that, trim it back to the thing that actually exists today.
The skill isn’t adding anymore. It’s recognizing when code hasn’t earned its shape yet.
The ground shifted
The principles we follow were written for a world that doesn’t exist anymore. The author and the reader are different people now. Sometimes the reader is me, three weeks later, with no memory of why any of this was written. Sometimes the “author” was a model that optimised for patterns, not for the person who has to understand it later.
The ground shifted. Maybe it’s time to ask whether our principles should shift too.
Not tear everything down. Not declare DRY dead. Just… ask the questions. Test the assumptions. Optimise for the visitor, not the author.
Because the visitor is us now.
