The secret recipe of powerful AI coding Agents
The latest Black Mirror season is on brand: dystopian, twisted, and sometimes downright harsh in its commentary about the intersection of the human condition with technology. Throw in a good measure of dark humor, sprinkled in at moments when you least expect it, and you have a binge-worthy mind-bender on your hands.
In one of the episodes I watched recently, I was hooked when things started to get weird and one of the characters, an engineer, blurted out in shock: “the AI is gaining agency!!”
Without going into spoilers, the AI in this case had started doing things it was not supposed (or trained) to do, which is what drew that animated response from the engineer.
This made me pause. Is it a bad thing when Agents gain agency? Or is that the exact purpose of the Agent: to act? And in order to act, should it have all the tools and resources, or do we constrain it in order to keep the Agent within the guardrails?
Having worked with coding agents over the last few months as my exclusive compadres for churning out thousands of lines of code, I got to thinking. I now have access to this new thing that has near-unlimited knowledge of all programming languages, patterns, syntax and known error messages, and I can get it to do something I was not very good at to begin with: writing code. So in my case, I want it to have more agency, more tools, more resources to effectively and accurately do what I want it to do. Yet, at times, I don’t want it to do things I did not ask it to do, for example start inventing a new JS framework when I asked it to create a new page (more on this later).
So what does an agent need in order to stay in sync with my asks?
Three things come to mind — tools, tokens and context.
Let’s take the easy one first — tools. As of today, you can provide as many tools as you want to the agent with MCP servers.
Is there a limit to how many tools you can provide to an agent? Absolutely not.
Should you give it access to unlimited tools?
Let me answer that question with another question.
Should I be stuffing my face with all the 32 tubs of ice cream sitting in my freezer? (Wait, don’t answer that!)
In all seriousness, in my experience an Agent where you can add and remove tools easily, and where you can be very explicit about when you want the Agent to use those tools, works way better than one with no tools or, worse, an ungodly number of MCP servers and tools. In addition, telling the Agent explicitly how to use these tools produces better-quality actions. For example, I now tend to give a prompt something like the one below:
“Use your sourcegraph tools to first do a search for supabase across all my repositories, list out the code snippets, then use other tools to get the contents of all the relevant files, read them and then write a plan to implement supabase in my current repo for authentication.” vs. “implement supabase for authentication in this codebase.”
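To make the “curated tools with explicit guidance” idea concrete, here is a rough sketch of how I think about it. Everything in it is hypothetical (the names ToolSpec, toolsForTask, and the tool list are invented for illustration; real MCP servers expose tools through their own configs and SDKs), but the shape is the point: a small, task-relevant set of tools, each with guidance on when to use it, instead of every server you have installed.

```typescript
// Hypothetical sketch: curate tools per task and tell the agent when to use each one.
interface ToolSpec {
  name: string;
  description: string; // what the agent is told the tool does
  whenToUse: string;   // explicit guidance the agent sees alongside the tool
  tags: string[];      // used here to select tools per task
}

const allTools: ToolSpec[] = [
  {
    name: "sourcegraph_search",
    description: "Search code across all of my repositories",
    whenToUse: "First step of any planning task: find existing usages and patterns",
    tags: ["plan", "search"],
  },
  {
    name: "file_read",
    description: "Read the contents of a file",
    whenToUse: "After a search returns candidate files, before writing a plan",
    tags: ["plan", "read"],
  },
  {
    name: "browser_fetch",
    description: "Fetch an external documentation page",
    whenToUse: "Only when the task explicitly requires third-party docs",
    tags: ["docs"],
  },
  // ...dozens more could live here; they don't all need to be enabled at once
];

// Hand the agent only the tools relevant to the task at hand.
function toolsForTask(taskTags: string[]): ToolSpec[] {
  return allTools.filter((t) => t.tags.some((tag) => taskTags.includes(tag)));
}

const planningTools = toolsForTask(["plan"]); // sourcegraph_search + file_read
```

The prompt above then references those tools by name, so the Agent knows not just that a tool exists but where it fits in the task.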
Next up, tokens.
This is my pet peeve with most of the current coding Agents. You sign up for $20 per month, and as soon as the Agent starts burning through tokens, it suddenly starts to perform a lot worse. Why is that? Could it be that the Agent providers are swapping out models and/or reducing token usage to get better price performance?
What happens when you give it unrestricted access to tokens?
As of the time of writing this article, there are two coding Agents that let you do exactly that: Amp and Claude Code. With these two Agents you can visibly see better performance.
So why do tokens matter for Agent performance?
Because tokens are how Agents get to be agentic, aka act. They need tokens to talk to the LLMs to understand what they need to do and to understand context. Tokens are to Agents what battery life is to a smartphone, what caffeine is to writers, what oxygen is to astronauts… sorry, got a little carried away there. Imagine a satellite that is sending messages back and forth to a mother ship to figure out where to point its antennae, what information to collect and what to send back. Now imagine you start throttling that communication to keep your costs down.
Agents are somewhat similar.
Reduce the number of tokens and you reduce the work units, and you affect the quality of its agency. So as a user, what should you do to make sure you are not starving the coding agent of tokens?
Simple: use a coding agent that is not optimized for a $20 pricing plan.
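To see why throttling bites, here is a toy sketch. Nothing in it is a real agent or LLM API (callModel is a placeholder); it only illustrates that every plan/act/observe step costs tokens, so a smaller budget directly means fewer steps of actual work on your task.

```typescript
// Toy sketch: a token budget caps how many work units the agent can spend on a task.
type Step = { thought: string; tokensUsed: number };

async function callModel(prompt: string): Promise<Step> {
  // Placeholder: a real agent would call an LLM API here and pay real tokens.
  return { thought: `next action for: ${prompt.slice(0, 40)}`, tokensUsed: 1_500 };
}

async function runAgent(task: string, tokenBudget: number): Promise<Step[]> {
  const steps: Step[] = [];
  let spent = 0;
  while (spent < tokenBudget) {
    const step = await callModel(`${task}\nprogress so far: ${steps.length} steps`);
    spent += step.tokensUsed;
    if (spent > tokenBudget) break; // out of budget mid-thought: work stops here
    steps.push(step);
  }
  return steps;
}

// Same task, different budgets: the throttled run simply does less work.
runAgent("implement supabase auth", 200_000).then((s) => console.log("generous:", s.length, "steps"));
runAgent("implement supabase auth", 10_000).then((s) => console.log("throttled:", s.length, "steps"));
```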
Finally, let’s talk about context.
Context is a whole other topic by itself, but to give Agents the agency to do their tasks effectively, they need knowledge of the codebase (long-term memory) plus context about the task at hand and about what has happened recently that may affect the next action (short-term memory). For long-term context, I find it useful to have the Agent grok the codebase and then store its understanding (after validating it) in a docs folder: a general description of the architecture, the folder structure, some high-level components and how they relate to each other, and a few sequence diagrams of key interactions. In addition, most coding Agents let you record a set of rules to follow with each action, maintained at the codebase level, for example, check for lint errors and run a build after every change.
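The exact filename and format of that rules file differ from agent to agent, so treat the following as a generic sketch of the kind of codebase-level rules I mean, not the syntax of any particular tool:

```
# Rules for this repository (hypothetical example; adapt to your agent's format)

- After every change, run the linter and fix any errors before moving on.
- After every change, run a build and confirm it passes.
- Reuse existing components and utilities; do not install new packages
  unless the task explicitly asks for them.
- Architecture notes, folder structure and sequence diagrams live in the docs
  folder; read them before planning any non-trivial change.
```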
The most effective way to get the Agent to do the right thing, though, is to be very explicit with the instructions. For example, instead of saying “implement supabase for authentication,” I would say: “Use your Brave search and fetch tools to search the Supabase documentation for how to implement authentication within a Nextjs application. Once you find the right links and read them, create a document that lists out the steps for implementing Supabase. Write it out as specific tasks, with instructions to create files and code, so that I could hand the doc to an engineer to implement Supabase in my codebase.” Once I have the Agent write out the doc, I typically do a cursory check, and if it looks good I inject that doc as context into a new thread with the Agent and ask it to follow the implementation doc step by step.
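For a sense of what the tasks in such a doc boil down to, here is roughly the kind of snippet the implementation steps might call for, using the standard @supabase/supabase-js v2 client. The env variable names follow the common Next.js convention from Supabase’s docs; treat the details as a sketch rather than the exact output of any particular run.

```typescript
// Sketch of the kind of code an implementation step might produce, using the
// @supabase/supabase-js v2 client. Adjust env var names to your own setup.
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!
);

export async function signInWithEmail(email: string, password: string) {
  const { data, error } = await supabase.auth.signInWithPassword({ email, password });
  if (error) throw error;
  return data.session; // session for the signed-in user, including the access token
}
```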
There are a lot of other best practices for getting the Agent to do the right thing, such as specifying what not to do for a specific task. For example: “create a new page by using existing components, without installing new packages or creating new ones.” Who knew communication would still be a pillar of human society, even in the age of AI?
We are in the very early days of fully autonomous coding Agents, and things will no doubt continue to change. But what I am certain about is that if we want Agents to produce high-quality outcomes, we will need to give them unrestricted access to tokens, measured use of the right tools and very specific context.
Believe it or not, these are the same principles for getting the best work out of high-agency software engineers: give them the tools they need, give them specific goals and tasks, give them unrestricted access to caffeine, and get out of their way.
✌️