Retrieval
Simple, effective code retrieval strategies for any repository size
Prerequisite: You’ll need an account on Morph to access embeddings for larger repositories.
The Right Tool for the Right Size
Code retrieval should be simple by default, sophisticated when necessary. The approach depends entirely on your repository size:
Small Repos (<200 files)
Agent Search: Let Claude navigate file paths and symbols directly
Large Repos (200+ files)
Retrieval Funnel: Embeddings → Reranking → Agent Reading
Small Repositories: Agent Search
For repositories under 200 files, skip the complexity. Use agent search: give Claude three essential tools, plus an optional fourth for semantic search.
The Three Essential Tools
1. list_dir: Directory Exploration
Let Claude explore the file structure naturally.
2. file_search: Find Files by Name
When Claude knows roughly what file it’s looking for.
3. read_file: Read and Analyze Code
Let Claude decide what to read based on what it discovers.
4. semantic_grep: Semantic Code Search (Optional)
When Claude needs to find code by meaning, not just filename.
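As a rough illustration, the three essential tools could be backed by functions like the ones below. This is a minimal sketch, not Morph's or Claude's actual tool definitions: the function names mirror the tools above, but the signatures, the `root` parameter, the `limit` default, and the line-numbered output format are all assumptions chosen for the example.

```python
import fnmatch
import os
from pathlib import Path
from typing import Optional


def list_dir(root: str, rel_path: str = ".") -> list[str]:
    """Return sorted entry names in a directory; subdirectories get a trailing '/'."""
    base = Path(root) / rel_path
    return sorted(
        entry.name + "/" if entry.is_dir() else entry.name
        for entry in base.iterdir()
    )


def file_search(root: str, pattern: str, limit: int = 20) -> list[str]:
    """Return up to `limit` repo-relative paths whose file name matches a glob pattern."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Case-insensitive match on the file name only, not the full path.
            if fnmatch.fnmatch(name.lower(), pattern.lower()):
                matches.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(matches)[:limit]


def read_file(root: str, rel_path: str, start: int = 1, end: Optional[int] = None) -> str:
    """Return lines start..end (1-indexed, inclusive), each prefixed with its line number."""
    text = (Path(root) / rel_path).read_text(encoding="utf-8", errors="replace")
    lines = text.splitlines()
    start = max(start, 1)
    end = len(lines) if end is None else min(end, len(lines))
    return "\n".join(f"{i}: {lines[i - 1]}" for i in range(start, end + 1))
```

Each function returns plain strings so the results can be passed straight back to the model as tool output. Supporting a line range in `read_file` is a design choice that lets the agent read part of a large file instead of the whole thing.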
Why Agent Search Works for Small Repos
Benefits:
- Zero setup time
- No indexing required
- Claude makes intelligent choices about what to read
- Works well for most development scenarios
Large Repositories: The Retrieval Funnel
When you hit 200+ files, raw agent search becomes inefficient. Use the retrieval funnel:
🔍 Cast Wide Net
Embeddings: Find 50-100 potentially relevant code chunks using Morph Embeddings
⚡ Focus Down
Reranking: Narrow to the 5-10 most relevant pieces using Morph Rerank
🧠 Agent Reading
Claude: Inspect the ranked results and decide what to read in full
Implementation: The Funnel Approach
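The three stages can be sketched as a single function. The example below does not call Morph Embeddings or Morph Rerank: `embed` and `rerank_score` are hypothetical callables that you would wire to whichever embedding and reranking service you use, and the toy stand-ins exist only so the sketch runs offline. The `wide_k` and `narrow_k` defaults follow the 50-100 and 5-10 ranges described above.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieval_funnel(
    query: str,
    chunks: list[str],
    embed: Callable[[str], Vector],             # assumption: text -> vector
    rerank_score: Callable[[str, str], float],  # assumption: (query, chunk) -> relevance
    wide_k: int = 50,
    narrow_k: int = 5,
) -> list[str]:
    # Stage 1: cast a wide net with embedding similarity.
    # A real system would precompute and index chunk embeddings instead of
    # embedding every chunk per query.
    query_vec = embed(query)
    wide = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)[:wide_k]

    # Stage 2: focus down with the (usually slower, more accurate) reranker.
    narrow = sorted(wide, key=lambda c: rerank_score(query, c), reverse=True)[:narrow_k]

    # Stage 3: hand the candidates to the agent, which decides what to read in full.
    return narrow


# Toy stand-ins so the sketch runs without any API; replace with real calls.
VOCAB = ["auth", "login", "token", "render", "button"]


def toy_embed(text: str) -> list[float]:
    """Bag-of-words counts over a tiny fixed vocabulary."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]


def toy_rerank(query: str, chunk: str) -> float:
    """Jaccard overlap between the query words and the chunk words."""
    q, c = set(query.lower().split()), set(chunk.lower().split())
    return len(q & c) / len(q | c) if q | c else 0.0
```

The function returns candidates rather than a single answer, which keeps the final reading decision with the agent.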
When to Use Each Approach
Decision Tree
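The decision can be reduced to one threshold check. The sketch below uses the 200-file cutoff from this guide; the helper names, the returned labels, and the list of skipped directories are assumptions made for the example, and you may want to tune the cutoff, since the guide notes that the switch point often falls between 200 and 500 files depending on structure.

```python
import os

SMALL_REPO_MAX_FILES = 200  # cutoff used in this guide; adjust for your repository


def count_files(root: str, skip_dirs: tuple = (".git", "node_modules")) -> int:
    """Count files under root, ignoring directories that are not source code."""
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        total += len(filenames)
    return total


def choose_approach(file_count: int) -> str:
    """Return 'agent_search' below the cutoff, otherwise 'retrieval_funnel'."""
    if file_count < SMALL_REPO_MAX_FILES:
        return "agent_search"
    return "retrieval_funnel"
```

For example, `choose_approach(count_files("."))` picks an approach for the current directory.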
Best Practices
Start Simple
- Begin with agent search for any new project
- Only add complexity when Claude starts missing relevant code
- Most projects never need embeddings
Upgrade When Needed
- Switch to the retrieval funnel when search becomes slow or inaccurate
- Usually happens around 200-500 files depending on structure
- Monitor Claude’s success rate in finding relevant code
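One way to monitor the success rate is a rolling window over recent searches. This is a minimal sketch under stated assumptions: the `RetrievalMonitor` class, the window size, and the 90% threshold are hypothetical, and deciding whether a given search "found relevant code" is left to you (for example, manual review or a downstream task check).

```python
from collections import deque


class RetrievalMonitor:
    """Track recent search outcomes and flag when agent search is falling behind."""

    def __init__(self, window: int = 50, min_success_rate: float = 0.9):
        # Both defaults are illustrative; tune them for your project.
        self.outcomes = deque(maxlen=window)
        self.min_success_rate = min_success_rate

    def record(self, found_relevant_code: bool) -> None:
        """Record one search outcome; the oldest outcome drops off when the window is full."""
        self.outcomes.append(found_relevant_code)

    def success_rate(self) -> float:
        """Fraction of successful searches in the window (1.0 when nothing is recorded)."""
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 1.0

    def should_upgrade(self) -> bool:
        """Suggest the retrieval funnel only once a full window falls below the threshold."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.success_rate() < self.min_success_rate
```

Waiting for a full window before recommending a switch avoids reacting to a handful of early misses.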
Keep Agent in the Loop
- Even with embeddings, let Claude make final reading decisions
- Provide candidates, not conclusions
- Claude’s reasoning beats pure similarity matching
Performance Expectations
Typical Performance
| Repository Size | Approach | Search Time | Success Rate |
|---|---|---|---|
| < 200 files | Agent Search | < 5 seconds | 95%+ |
| 200-1000 files | Basic Funnel | < 10 seconds | 90%+ |
| 1000+ files | Advanced Funnel | < 15 seconds | 85%+ |
The key insight: Most code retrieval problems are solved by giving Claude the right navigation tools, not by throwing embeddings at everything.
Ready to implement? Get your API key for repositories that need the retrieval funnel, or just start with agent search for everything else.