# Retrieval

> Simple, effective code retrieval strategies for any repository size

**Prerequisite**: You'll need an account on [Morph](https://morphllm.com/dashboard) to access embeddings for larger repositories.

## The Right Tool for the Right Size

Code retrieval should be **simple by default, sophisticated when necessary**. The approach depends entirely on your repository size:

* **Agent Search**: Let Claude navigate file paths and symbols directly
* **Retrieval Funnel**: Embeddings → Reranking → Agent Reading

## Small Repositories: Agent Search

For repositories under 200 files, skip the complexity. Use **agent search** where you give Claude three simple tools:

### The Three Essential Tools

```json
{
  "name": "list_dir",
  "description": "List directory contents to understand project structure",
  "parameters": {
    "properties": {
      "relative_workspace_path": {
        "description": "Path to list contents of, relative to workspace root",
        "type": "string"
      }
    }
  }
}
```

Let Claude explore the file structure naturally.

```json
{
  "name": "file_search",
  "description": "Find files by partial filename match",
  "parameters": {
    "properties": {
      "query": {
        "description": "Partial filename to search for",
        "type": "string"
      }
    }
  }
}
```

When Claude knows roughly what file it's looking for.

```json
{
  "name": "read_file",
  "description": "Read file contents with optional line range",
  "parameters": {
    "properties": {
      "target_file": {
        "description": "Path to the file to read",
        "type": "string"
      },
      "start_line": {
        "description": "Optional: Start reading from this line",
        "type": "integer"
      },
      "end_line": {
        "description": "Optional: Stop reading at this line",
        "type": "integer"
      }
    }
  }
}
```

Let Claude decide what to read based on what it discovers.
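
The three tool schemas above only describe the interface; the handlers behind them are up to you. As a minimal sketch, assuming a Node.js runtime and a single workspace root, they could look like the following. The function names, the `limit` parameter, the skipped directory names, and the choice of 1-indexed inclusive line ranges are illustrative assumptions, not part of the Morph API.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Assumption: every tool path is resolved against one workspace root,
// and paths that would leave that root are rejected.
function resolveInWorkspace(root: string, relativePath: string): string {
  const resolved = path.resolve(root, relativePath);
  const rel = path.relative(root, resolved);
  if (rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
    throw new Error(`Path escapes workspace: ${relativePath}`);
  }
  return resolved;
}

// list_dir: entry names in sorted order, directories marked with a trailing slash.
function listDir(root: string, relativeWorkspacePath: string): string[] {
  const dir = resolveInWorkspace(root, relativeWorkspacePath);
  return fs
    .readdirSync(dir, { withFileTypes: true })
    .map((entry) => (entry.isDirectory() ? `${entry.name}/` : entry.name))
    .sort();
}

// file_search: case-insensitive partial match on the file name only.
// Assumption: node_modules and .git are skipped, results capped at `limit`.
function fileSearch(root: string, query: string, limit = 20): string[] {
  const needle = query.toLowerCase();
  const matches: string[] = [];
  const walk = (dir: string): void => {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      if (matches.length >= limit) return;
      if (entry.name === "node_modules" || entry.name === ".git") continue;
      const full = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        walk(full);
      } else if (entry.name.toLowerCase().includes(needle)) {
        matches.push(path.relative(root, full));
      }
    }
  };
  walk(root);
  return matches;
}

// read_file: optional line range, treated here as 1-indexed and inclusive.
function readFile(
  root: string,
  targetFile: string,
  startLine?: number,
  endLine?: number
): string {
  const text = fs.readFileSync(resolveInWorkspace(root, targetFile), "utf8");
  const lines = text.split("\n");
  const start = Math.max(1, startLine ?? 1);
  const end = Math.min(lines.length, endLine ?? lines.length);
  return lines.slice(start - 1, end).join("\n");
}
```

Each handler returns plain strings, so the result can be passed back to Claude as the tool output without further processing.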

Optionally, add a fourth tool, `semantic_grep`, for meaning-based search:

```json
{
  "name": "semantic_grep",
  "description": "Search for code semantically similar to a query using embeddings",
  "parameters": {
    "properties": {
      "query": {
        "description": "Natural language description of code to find",
        "type": "string"
      },
      "file_patterns": {
        "description": "Optional: File patterns to search within (e.g., '*.py', '*.ts')",
        "type": "array"
      }
    }
  }
}
```

When Claude needs to find code by meaning, not just filename.

```typescript
// `embed` and `findSimilar` are placeholders for your own embedding call
// and vector index lookup; they are not defined here.
async function semanticGrep(query: string, filePatterns?: string[]) {
  // 1. Claude outputs semantic query: "error handling for API calls"

  // 2. Embed the query
  const queryEmbedding = await embed(query);

  // 3. Compare against pre-embedded file chunks
  const matches = await findSimilar(queryEmbedding, {
    patterns: filePatterns,
    threshold: 0.7,
    limit: 10
  });

  return matches; // Claude gets semantic matches to explore
}
```

**Benefits:**

* Zero setup time
* No indexing required
* Claude makes intelligent choices about what to read
* Works well for most development scenarios

## Large Repositories: The Retrieval Funnel

When you hit **200+ files**, raw agent search becomes inefficient.
Use the **retrieval funnel**:

1. **Embeddings**: Find 50-100 potentially relevant code chunks using Morph Embeddings
2. **Reranking**: Narrow to the 5-10 most relevant pieces using Morph Rerank
3. **Claude**: Inspect the ranked results and decide what to read in full

### Implementation: The Funnel Approach

```typescript
import { OpenAI } from 'openai';

const openai = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: 'https://api.morphllm.com/v1'
});

// `vectorSearch` and `provideCandidatesToClaude` are placeholders for your
// own vector store lookup and prompt assembly; they are not defined here.
async function retrievalFunnel(query: string) {
  // Step 1: Cast wide net with embeddings
  const embedding = await openai.embeddings.create({
    model: "morph-embedding-v3",
    input: query
  });

  // Pass the embedding vector itself, not the whole API response
  const wideResults = await vectorSearch(embedding.data[0].embedding, { limit: 50 });

  // Step 2: Focus with reranking
  const reranked = await openai.completions.create({
    model: "morph-rerank-v3",
    documents: wideResults.map(r => r.content),
    query: query,
    top_k: 8
  });

  // Step 3: Let Claude decide what to read
  return provideCandidatesToClaude(reranked.data);
}
```

### When to Use Each Approach

```
Repository size?
├── < 200 files → Agent Search
│   ├── Basic: list_dir + file_search + read_file
│   └── +semantic_grep for meaning-based search (optional)
└── 200+ files → Retrieval Funnel
    ├── < 1000 files → Basic embeddings + rerank
    └── 1000+ files → Add AST parsing + hybrid search
```

## Best Practices

* Begin with agent search for any new project
* Only add complexity when Claude starts missing relevant code
* Most projects never need embeddings
* Switch to the retrieval funnel when search becomes slow or inaccurate
* This usually happens around 200-500 files, depending on structure
* Monitor Claude's success rate in finding relevant code
* Even with embeddings, let Claude make final reading decisions
* Provide candidates, not conclusions
* Claude's reasoning beats pure similarity matching

## Performance Expectations

| Repository Size | Approach | Search Time | Success Rate |
| --- | --- | --- | --- |
| \< 200 files | Agent Search | \< 5 seconds | 95%+ |
| 200-1000 files | Basic Funnel | \< 10 seconds | 90%+ |
| 1000+ files | Advanced Funnel | \< 15 seconds | 85%+ |
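
The size thresholds in this table can be encoded as a small helper, sketched below. The function and type names are hypothetical. A repository of exactly 1000 files is assigned to the advanced funnel, following the "1000+ files" branch of the decision tree. Since the practical switch point varies with repository structure, treat the numbers as starting defaults rather than hard limits.

```typescript
// Hypothetical labels for the three approaches described on this page.
type RetrievalApproach = "agent-search" | "basic-funnel" | "advanced-funnel";

// Map a repository's file count to an approach using the documented thresholds.
function chooseApproach(fileCount: number): RetrievalApproach {
  if (!Number.isInteger(fileCount) || fileCount < 0) {
    throw new RangeError(`fileCount must be a non-negative integer, got ${fileCount}`);
  }
  if (fileCount < 200) return "agent-search"; // list_dir + file_search + read_file
  if (fileCount < 1000) return "basic-funnel"; // embeddings + rerank
  return "advanced-funnel"; // add AST parsing + hybrid search
}
```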

**The key insight**: Most code retrieval problems are solved by giving Claude the right navigation tools, not by throwing embeddings at everything.

Ready to implement? [Get your API key](https://morphllm.com/api-keys) for repositories that need the retrieval funnel, or just start with agent search for everything else.