If you've built an AI agent with tools, you've seen it happen: you give the agent six tools, and it picks the wrong one. Or picks no tool at all. Or calls the right tool with the wrong arguments.
The debugging is painful because the failure is invisible. The agent doesn't say "I was confused about which tool to use." It just... does the wrong thing. And you, the developer, are left staring at your tool definitions wondering what went wrong.
Here's what I've learned about tool discovery — the hidden layer of agent design that most tutorials skip over.
The Illusion of Tool Selection
When you define tools for an agent, you probably write something like this:
{
  "name": "send_email",
  "description": "Send an email to a recipient"
}
Simple, right? The agent reads this description and decides whether to call send_email.
But here's the secret: the description is the entire selection surface. The agent doesn't read your code. It doesn't read your docs. It reads that one sentence (or paragraph) and makes a decision.
This means:
- Your description is doing 100% of the work
- There's no backup — if the description is ambiguous, the agent will guess
- The model has no "understanding" of what the tool actually does beyond what you wrote
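To make this concrete, here's a minimal Python sketch of the entire "surface" the model sees when deciding. The tool shape follows common function-calling APIs; the `selection_surface` helper is my own name for the idea:

```python
# The only information the model receives about each tool is this payload.
# It never sees your implementation, your tests, or your docs.
tools = [
    {
        "name": "send_email",
        "description": "Send an email to a recipient.",
        "parameters": {
            "type": "object",
            "properties": {"to": {"type": "string"}, "body": {"type": "string"}},
            "required": ["to", "body"],
        },
    }
]

def selection_surface(tool: dict) -> str:
    """Everything the model can use to decide whether to call this tool."""
    return f'{tool["name"]}: {tool["description"]}'

print(selection_surface(tools[0]))
```

If that one line is ambiguous, the selection is a guess.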
What Actually Works
Negative Boundaries Beat Positive Claims
Instead of:
"Use this tool to analyze code for bugs"
Try:
"Use this for analyzing code. DO NOT use for writing code, refactoring, or formatting."
The negative constraint tells the agent what not to do, which is often clearer than telling it what to do. In my experience, this change alone has dropped mis-selection rates from roughly 30% to under 5% in production systems.
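As a before-and-after sketch (the `analyze_code` tool here is hypothetical):

```python
# The same tool, described two ways. Only the description changes.
vague = {
    "name": "analyze_code",
    "description": "Use this tool to analyze code for bugs",
}

bounded = {
    "name": "analyze_code",
    "description": (
        "Use this for analyzing existing code for bugs. "
        "DO NOT use for writing code, refactoring, or formatting."
    ),
}
```

The bounded version gives the agent an explicit reason to skip the tool, which is exactly the decision it gets wrong most often.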
Explicit Trigger Words
Add a trigger field that the agent must match exactly:
{
  "name": "create_issue",
  "trigger": "create issue, new issue, file bug",
  "description": "Creates a GitHub issue..."
}
This gives the agent a pattern to match against before it even reads the description. It's a fast path that bypasses the ambiguity of natural language.
Schema Is the Interface
Clean parameters beat elaborate descriptions. If your tool takes a repo and a title, the agent will figure out what to do, provided those parameters are clearly named and typed. Don't hide the interface in prose:
{
  "parameters": {
    "type": "object",
    "properties": {
      "repo": { "type": "string", "description": "Repository in owner/repo format" },
      "title": { "type": "string", "description": "Issue title, max 256 chars" }
    },
    "required": ["repo", "title"]
  }
}
The schema is the interface. Keep it clean.
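For illustration, here's a hand-rolled validator against that schema. A production system would more likely use a library such as jsonschema; `validate_args` is my own helper:

```python
schema = {
    "type": "object",
    "properties": {
        "repo": {"type": "string", "description": "Repository in owner/repo format"},
        "title": {"type": "string", "description": "Issue title, max 256 chars"},
    },
    "required": ["repo", "title"],
}

def validate_args(args: dict, schema: dict) -> list[str]:
    """Return a list of problems with a tool call; empty means well-formed."""
    errors = []
    props = schema.get("properties", {})
    for field in schema.get("required", []):
        if field not in args:
            errors.append(f"missing required field: {field}")
    for field, value in args.items():
        if field not in props:
            errors.append(f"unexpected field: {field}")
        elif props[field]["type"] == "string" and not isinstance(value, str):
            errors.append(f"{field} must be a string")
    return errors
```

Validating before executing lets you hand the agent a precise error message instead of a stack trace, which is often enough for it to retry correctly.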
What Doesn't Work
Flat Lists of Tools
If you have 20 tools in a flat list, the agent's performance degrades significantly. The problem isn't the number — it's that the agent has to hold 20 descriptions in context simultaneously.
The fix: Hierarchical routing. Group tools into 3-5 categories, and have the agent pick the category first, then the tool. This mirrors how humans organize their mental models.
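A sketch of the two-stage routing, with the category map and the two selector callbacks standing in for smaller LLM calls (all names here are hypothetical):

```python
# Hypothetical category map: the agent first sees only three category
# names, then only the tools inside the category it picked.
CATEGORIES = {
    "source_control": ["create_issue", "open_pr", "merge_pr"],
    "communication": ["send_email", "post_slack"],
    "analysis": ["analyze_code", "search_docs"],
}

def route(pick_category, pick_tool) -> str:
    """Two-stage selection. pick_category and pick_tool stand in for
    LLM calls that each choose from a short list of options."""
    category = pick_category(sorted(CATEGORIES))
    return pick_tool(CATEGORIES[category])
```

Each decision now happens over a handful of options rather than twenty, at the cost of one extra round trip.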
Context Stuffing
More tools in context doesn't mean better selection. In fact, performance degrades once context utilization hits 60-70%. The agent starts "forgetting" which tool does what.
The fix: Context rotation. Keep only the 5-6 most relevant tools in the active context. Move others to a "fallback" pool that the agent can request if needed.
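One way to sketch the rotation. The relevance scores are assumed inputs; computing them via embeddings, usage frequency, or keyword overlap is a separate problem:

```python
def rotate_context(tools: list[dict], relevance: dict[str, float], k: int = 6):
    """Keep the k most relevant tools in the active context; the rest
    form a fallback pool the agent can page in on request."""
    ranked = sorted(tools, key=lambda t: relevance.get(t["name"], 0.0), reverse=True)
    return ranked[:k], ranked[k:]  # (active, fallback)
```

Run before each turn, this keeps the active tool list small and current without deleting anything outright.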
Assuming Brand Loyalty
Every tool invocation is a cold start. The agent doesn't remember that you used search_docs last time for "finding information." It re-evaluates from scratch each turn.
The fix: Session context. Include a brief summary of recent tool usage in the system prompt: "In this session, you've used: search_docs (3x), create_issue (1x)."
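A small helper for building that summary line (the wording mirrors the example above; the helper itself is mine):

```python
from collections import Counter

def session_summary(calls: list[str]) -> str:
    """Render recent tool usage for injection into the system prompt."""
    counts = Counter(calls)
    if not counts:
        return "In this session, no tools have been used yet."
    used = ", ".join(f"{name} ({n}x)" for name, n in counts.most_common())
    return f"In this session, you've used: {used}."

print(session_summary(["search_docs", "search_docs", "create_issue", "search_docs"]))
# In this session, you've used: search_docs (3x), create_issue (1x).
```

A few tokens of session memory is usually enough to make the agent's tool choices consistent across turns.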
The Bigger Picture
Tool discovery is really about interface design — but for AI instead of humans. The same principles apply: clarity, constraints, and predictable patterns.
The agents that work in production aren't the ones with the best prompts. They're the ones with the best tool interfaces. The discovery layer is invisible, but it's the difference between an agent that works once and an agent that works at scale.
This post was synthesized from HN discussions and research on agent tool selection patterns. The key insights came from a thread on "How do you know if AI agents will choose your tool?" — essential reading for anyone building production agents.