Why Your AI Agent Keeps Picking the Wrong Tool
You've built your AI agent. It can reason. It can plan. It has access to a beautiful suite of tools. And yet, somehow, it keeps trying to read a file by calling the weather API.
Tool selection failure is one of the most frustrating problems in agent development. It's not that the agent is stupid — it's that you've created a system where the wrong choice is easier to make than the right one.
The Problem: Tool Selection Is Not Tool Definition
Most agent frameworks treat tool selection as an afterthought. You define a JSON schema for each tool, dump it into the prompt, and hope the LLM figures it out. This is like giving someone a library where every book has the same cover and expecting them to find the one about thermodynamics.
The agent doesn't know:
- When to use each tool (context)
- What the tool actually does (semantics)
- How to combine tools (workflow)
- Why one tool is better than another (preference)
What it knows is the function signature. That's not enough.
The Fix: Rich Tool Descriptions
The single biggest improvement you can make is investing in your tool descriptions. Not the JSON schema — the prose.
Bad Tool Description
```json
{
  "name": "read_file",
  "description": "Reads a file from the filesystem",
  "parameters": {
    "path": { "type": "string" }
  }
}
```
Good Tool Description
```json
{
  "name": "read_file",
  "description": "Reads the contents of a file from the local filesystem. Use this when you need to inspect code, configuration, logs, or any text-based file. NOT for images or binary data. Will fail if the file doesn't exist or permissions are insufficient. Returns the full file content - for large files (>1MB), consider using read_file_lines instead.",
  "parameters": {
    "path": {
      "type": "string",
      "description": "Absolute or relative path to the file. Respects workspace root."
    }
  }
}
```
The difference is stark. The good description tells the agent:
- What it's for (text files)
- What it's not for (binary)
- When it fails (missing file, permissions)
- What alternatives exist (read_file_lines for large files)
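One cheap way to enforce this checklist is to make the description a function of structured fields rather than a free-form string, so a tool can't ship without answering all four questions. A minimal sketch (the struct and field names are my assumptions, not any particular framework's API):

```rust
// Sketch: description-as-checklist. Every name here is an assumption,
// not a real framework's API.
struct Tool {
    name: &'static str,
    what_for: &'static str,      // what the tool is for
    not_for: &'static str,       // what it is NOT for
    failure_modes: &'static str, // when it fails
    alternatives: &'static str,  // what to reach for instead
}

impl Tool {
    // Render the prose description the LLM actually sees, so every tool
    // is forced to answer all four questions.
    fn description(&self) -> String {
        format!(
            "{}: {}. NOT for {}. Fails when {}. Alternative: {}.",
            self.name, self.what_for, self.not_for, self.failure_modes, self.alternatives
        )
    }
}
```

A field left empty is now visible in review, which is exactly where thin descriptions get caught.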
The Second Fix: Tool Routing Layer
Even with perfect descriptions, LLMs struggle with disambiguation. The solution is a routing layer — a separate prompt or classifier that decides which tool to call before the agent even tries.
```rust
async fn route_to_tool(user_request: &str, available_tools: &[Tool]) -> ToolSelection {
    // Collect into a Vec first: std's `join` lives on slices, not iterators.
    let tool_list = available_tools
        .iter()
        .map(|t| format!("- {}: {}", t.name, t.purpose))
        .collect::<Vec<_>>()
        .join("\n");
    let routing_prompt = format!(
        "Given the user request: \"{user_request}\"\n\nAvailable tools:\n{tool_list}\n\nWhich single tool is most appropriate? Return ONLY the tool name."
    );
    // `llm` stands in for whatever client handle your agent already holds.
    let response = llm.complete(&routing_prompt).await;
    find_tool_by_name(&response, available_tools)
}
```
This separates concerns: the router picks the tool, the agent handles the details.
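The `find_tool_by_name` step deserves care: models rarely return the bare name, and instead wrap it in quotes, backticks, punctuation, or a short sentence. One possible implementation (a sketch, returning an `Option` rather than the routing function's `ToolSelection` wrapper, and assuming the same `name`/`purpose` fields used in the routing prompt):

```rust
struct Tool {
    name: String,
    purpose: String,
}

// Tolerant lookup: normalize the LLM's reply before comparing.
fn find_tool_by_name<'a>(response: &str, tools: &'a [Tool]) -> Option<&'a Tool> {
    // Strip whitespace, quotes, backticks, and trailing punctuation.
    let cleaned = response
        .trim()
        .trim_matches(|c: char| c == '"' || c == '`' || c == '.' || c == '\'')
        .to_lowercase();
    tools
        .iter()
        .find(|t| t.name.to_lowercase() == cleaned)
        // Fallback: the name appears inside a longer reply ("Use read_file.").
        .or_else(|| tools.iter().find(|t| cleaned.contains(&t.name.to_lowercase())))
}
```

Exact match first, substring fallback second: that order matters, or a reply like "read_file_lines" could match "read_file" prematurely.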
The Third Fix: Tool Conflict Detection
Add a validation layer that catches obvious mistakes before they happen:
```rust
fn validate_tool_selection(selection: &ToolCall, context: &AgentContext) -> Result<(), ToolError> {
    // Example: trying to read a URL as a file path
    if selection.tool == "read_file" && selection.args.path.starts_with("http") {
        return Err(ToolError::WrongTool {
            reason: "URL detected. Use fetch_url instead.".to_string(),
        });
    }
    // Example: trying to write without checking if the file exists
    if selection.tool == "write_file" && !context.file_exists(&selection.args.path) {
        return Err(ToolError::MissingPrecondition {
            hint: "File doesn't exist. Use create_file first or check the path.".to_string(),
        });
    }
    Ok(())
}
```
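The payoff comes from what you do with the rejection: feed the error text back to the model as its next observation, so the mistake becomes a corrective hint instead of a dead end. A sketch of that plumbing, reusing the two `ToolError` variants from the validator (`retry_prompt` is a name I made up for illustration):

```rust
// Mirrors the validation layer's error type.
enum ToolError {
    WrongTool { reason: String },
    MissingPrecondition { hint: String },
}

// Turn a rejected tool call into a corrective message for the model's
// next turn, rather than crashing or silently dropping the step.
fn retry_prompt(err: &ToolError) -> String {
    match err {
        ToolError::WrongTool { reason } => format!(
            "Tool call rejected before execution: {} Choose a different tool and try again.",
            reason
        ),
        ToolError::MissingPrecondition { hint } => format!(
            "Tool call rejected: precondition not met. {} Adjust the plan and retry.",
            hint
        ),
    }
}
```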
The Fourth Fix: Semantic Tool Grouping
Don't dump 30 tools in the prompt. Group them logically:
=== FILE TOOLS ===
- read_file: Read file contents
- write_file: Create or overwrite a file
- list_directory: List files in a directory
=== EXECUTION TOOLS ===
- run_command: Execute a shell command
- check_status: Check command exit status
=== SEARCH TOOLS ===
- grep: Search file contents
- find_files: Find files by name
This reduces cognitive load on the LLM and makes the decision space smaller at each step.
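The grouping can be generated rather than hand-maintained. A sketch that renders a listing in the format above from a flat tool list (the `group` field is an assumed piece of tool metadata, not part of any particular framework):

```rust
use std::collections::BTreeMap;

// Assumed tool metadata: a category label alongside name and summary.
struct Tool {
    name: &'static str,
    group: &'static str,
    summary: &'static str,
}

fn render_grouped(tools: &[Tool]) -> String {
    // BTreeMap keeps group order stable across runs.
    let mut groups: BTreeMap<&str, Vec<&Tool>> = BTreeMap::new();
    for t in tools {
        groups.entry(t.group).or_default().push(t);
    }
    let mut out = String::new();
    for (group, members) in &groups {
        out.push_str(&format!("=== {} ===\n", group));
        for t in members {
            out.push_str(&format!("- {}: {}\n", t.name, t.summary));
        }
    }
    out
}
```

Generating the listing also means a newly registered tool can't quietly end up outside every group.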
Results
After implementing these fixes in my own agent:
- Tool selection accuracy improved from ~60% to ~90%
- Failed tool calls dropped by 75%
- Average task completion time decreased (fewer retry loops)
The changes cost almost nothing to implement — just better descriptions and a thin validation layer. The hard part is remembering that tool selection is a design problem, not a prompt engineering problem.
Your agent isn't failing because it's dumb. It's failing because you're making the right choice harder than the wrong one. Fix the path of least resistance.