local-deep-research/docs/cli-tools.md
LearningCircuit f246fa6044 docs: add comprehensive MCP server documentation (#2546)
2026-03-06 03:12:52 +01:00

CLI Tools Reference

Local Deep Research includes command-line tools for benchmarking, rate limit management, and running the MCP server.


Benchmarking CLI

Run benchmarks to evaluate search quality and compare configurations.

Basic Usage

python -m local_deep_research.benchmarks.cli <command> [options]

Commands

simpleqa - Run SimpleQA Benchmark

Tests factual question answering accuracy.

python -m local_deep_research.benchmarks.cli simpleqa [options]

Options:

| Option | Default | Description |
| --- | --- | --- |
| --examples | 100 | Number of questions to test |
| --iterations | 3 | Search iterations per question |
| --questions | 3 | Questions per iteration |
| --search-tool | searxng | Search engine to use |
| --search-strategy | source_based | Strategy (source_based, standard, rapid, parallel, iterdrag) |
| --search-model | (default) | LLM model for research |
| --search-provider | (default) | LLM provider |
| --eval-model | (default) | Model for answer evaluation |
| --eval-provider | (default) | Provider for evaluation |
| --output-dir | ~/.local-deep-research/benchmark_results | Results directory |
| --human-eval | false | Use human evaluation |
| --no-eval | false | Skip evaluation phase |
| --custom-dataset | - | Path to custom dataset |

Example:

# Run 50 examples with Ollama
python -m local_deep_research.benchmarks.cli simpleqa \
  --examples 50 \
  --search-provider ollama \
  --search-model llama3.2
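
When the same benchmark has to be run under several settings, it can be convenient to assemble the command line in a script. The following is a minimal sketch, not part of Local Deep Research: `build_benchmark_cmd` is a hypothetical helper, and it assumes that `--human-eval` and `--no-eval` are bare switches that take no value.

```python
import shlex
import sys


def build_benchmark_cmd(benchmark, **options):
    """Return the argv list for one benchmark run.

    Keyword names map to flags: search_strategy="rapid" becomes
    "--search-strategy rapid". A value of True emits a bare flag
    (assumed behavior for --no-eval and --human-eval); False and
    None are skipped.
    """
    cmd = [sys.executable, "-m", "local_deep_research.benchmarks.cli", benchmark]
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            cmd.append(flag)
        elif value is not False and value is not None:
            cmd.extend([flag, str(value)])
    return cmd


if __name__ == "__main__":
    # Print one command per strategy; strategy names come from the table above.
    for strategy in ("source_based", "rapid"):
        cmd = build_benchmark_cmd(
            "simpleqa",
            examples=50,
            search_provider="ollama",
            search_model="llama3.2",
            search_strategy=strategy,
        )
        print(shlex.join(cmd))
        # To actually run it: subprocess.run(cmd, check=True)
```

Returning a list rather than a single string avoids shell quoting problems when a value such as an output directory contains spaces.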

browsecomp - Run BrowseComp Benchmark

Tests complex reasoning and multi-step research.

python -m local_deep_research.benchmarks.cli browsecomp [options]

Same options as simpleqa.

Example:

# Run BrowseComp with the iterdrag strategy
python -m local_deep_research.benchmarks.cli browsecomp \
  --examples 20 \
  --search-strategy iterdrag \
  --iterations 5

compare - Compare Configurations

Compare multiple search configurations on the same dataset.

python -m local_deep_research.benchmarks.cli compare [options]

Options:

| Option | Default | Description |
| --- | --- | --- |
| --dataset | simpleqa | Dataset to use (simpleqa, browsecomp) |
| --examples | 20 | Examples per configuration |
| --output-dir | ~/.local-deep-research/benchmark_results/comparison | Results directory |

Example:

# Compare configurations
python -m local_deep_research.benchmarks.cli compare \
  --dataset simpleqa \
  --examples 30

list - List Available Benchmarks

python -m local_deep_research.benchmarks.cli list

Shows available benchmark datasets and their descriptions.


Rate Limiting CLI

Monitor and manage the adaptive rate limiting system.

Basic Usage

python -m local_deep_research.web_search_engines.rate_limiting.cli <command> [options]

Commands

status - Show Rate Limit Statistics

View current rate limit data for search engines.

# All engines
python -m local_deep_research.web_search_engines.rate_limiting.cli status

# Specific engine
python -m local_deep_research.web_search_engines.rate_limiting.cli status --engine DuckDuckGoSearchEngine

Output columns:

| Column | Description |
| --- | --- |
| Engine | Search engine name |
| Base Wait | Current wait time in seconds |
| Range | Min-max wait times |
| Success | Success rate percentage |
| Attempts | Total request attempts |
| Updated | Last update timestamp |

Example output:

Rate Limit Statistics:
--------------------------------------------------------------------------------
Engine               Base Wait    Range                Success    Attempts   Updated
--------------------------------------------------------------------------------
DuckDuckGoSearchEngine 2.50        1.0s - 5.0s         95.2%      150        12-26 14:30
ArXivSearchEngine      0.50        0.5s - 1.0s         99.8%      85         12-26 12:15

reset - Reset Engine Rate Limits

Clear learned rate limit data for an engine.

python -m local_deep_research.web_search_engines.rate_limiting.cli reset --engine <engine_name>

Example:

# Reset DuckDuckGo rate limits
python -m local_deep_research.web_search_engines.rate_limiting.cli reset --engine DuckDuckGoSearchEngine

Use this when:

  • The learned rate limits have become too conservative
  • A search engine's API or rate limits have changed
  • You are switching environments

export - Export Rate Limit Data

Export rate limit statistics in various formats.

# Table format (default)
python -m local_deep_research.web_search_engines.rate_limiting.cli export

# CSV format
python -m local_deep_research.web_search_engines.rate_limiting.cli export --format csv

# JSON format
python -m local_deep_research.web_search_engines.rate_limiting.cli export --format json

Formats:

| Format | Use Case |
| --- | --- |
| table | Human-readable display |
| csv | Spreadsheet import |
| json | Programmatic processing |

Example CSV output:

engine_type,base_wait_seconds,min_wait_seconds,max_wait_seconds,last_updated,total_attempts,success_rate
DuckDuckGoSearchEngine,2.5,1.0,5.0,1703612400,150,0.952
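
The CSV export can be loaded with Python's standard `csv` module. The sketch below uses the column names from the example output above and assumes that `last_updated` is a Unix timestamp in seconds and that `success_rate` is a fraction between 0 and 1. `load_rate_limits` is a hypothetical helper name.

```python
import csv
import io
from datetime import datetime, timezone

# Sample taken from the example CSV output above.
SAMPLE = """\
engine_type,base_wait_seconds,min_wait_seconds,max_wait_seconds,last_updated,total_attempts,success_rate
DuckDuckGoSearchEngine,2.5,1.0,5.0,1703612400,150,0.952
"""


def load_rate_limits(text):
    """Parse exported CSV text into a list of typed dictionaries."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append(
            {
                "engine": row["engine_type"],
                "base_wait": float(row["base_wait_seconds"]),
                "min_wait": float(row["min_wait_seconds"]),
                "max_wait": float(row["max_wait_seconds"]),
                # Assumption: epoch seconds, interpreted here as UTC.
                "updated": datetime.fromtimestamp(
                    float(row["last_updated"]), tz=timezone.utc
                ),
                "attempts": int(row["total_attempts"]),
                "success_rate": float(row["success_rate"]),
            }
        )
    return rows


if __name__ == "__main__":
    for entry in load_rate_limits(SAMPLE):
        print(
            f"{entry['engine']}: {entry['success_rate']:.1%} success "
            f"over {entry['attempts']} attempts, updated {entry['updated']:%Y-%m-%d}"
        )
```

In practice the text would come from the redirected output of the `export --format csv` command rather than from an inline sample.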

cleanup - Remove Old Data

Clean up rate limit data older than a specified number of days.

python -m local_deep_research.web_search_engines.rate_limiting.cli cleanup --days <days>

Example:

# Remove data older than 30 days
python -m local_deep_research.web_search_engines.rate_limiting.cli cleanup --days 30

# Remove data older than 7 days
python -m local_deep_research.web_search_engines.rate_limiting.cli cleanup --days 7
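
To preview which entries a given `--days` value would affect, the cutoff can be applied to exported data first. This is a sketch of the idea only, not the tool's implementation: it assumes records carry a `last_updated` Unix timestamp in seconds (as in the CSV export), and whether the real cleanup treats an entry exactly at the cutoff as old is not known here. `split_by_age` is a hypothetical helper.

```python
import time

SECONDS_PER_DAY = 86_400


def split_by_age(records, days, now=None):
    """Partition records into (kept, expired) using a cutoff of now - days.

    Each record is a dict with a numeric "last_updated" epoch timestamp.
    Records updated strictly before the cutoff count as expired.
    """
    now = time.time() if now is None else now
    cutoff = now - days * SECONDS_PER_DAY
    kept = [r for r in records if r["last_updated"] >= cutoff]
    expired = [r for r in records if r["last_updated"] < cutoff]
    return kept, expired


if __name__ == "__main__":
    now = time.time()
    records = [
        {"engine": "DuckDuckGoSearchEngine", "last_updated": now - 2 * SECONDS_PER_DAY},
        {"engine": "ArXivSearchEngine", "last_updated": now - 45 * SECONDS_PER_DAY},
    ]
    kept, expired = split_by_age(records, days=30, now=now)
    print("would keep:", [r["engine"] for r in kept])
    print("would remove:", [r["engine"] for r in expired])
```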

MCP Server CLI

Run the MCP server for integration with MCP clients such as Claude Desktop.

Basic Usage

# Via entry point
ldr-mcp

# Via module
python -m local_deep_research.mcp

The server communicates over STDIO: stdin and stdout carry JSON-RPC messages, and stderr carries logs. It is designed to be launched by an MCP client such as Claude Desktop, not run interactively.
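
The stream split described above can be sketched from the client's side. The example assumes the newline-delimited JSON-RPC 2.0 framing used by the MCP stdio transport, and `encode_message` and `launch_server` are hypothetical helper names. A real client would normally rely on an MCP SDK instead of framing messages by hand.

```python
import json
import subprocess


def encode_message(method, params, msg_id):
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line.

    json.dumps without indent never emits raw newlines, so each message
    occupies exactly one line on the server's stdin.
    """
    payload = {"jsonrpc": "2.0", "id": msg_id, "method": method, "params": params}
    return (json.dumps(payload, separators=(",", ":")) + "\n").encode("utf-8")


def launch_server(command=("ldr-mcp",)):
    """Start the server with separate pipes for protocol traffic and logs."""
    return subprocess.Popen(
        list(command),
        stdin=subprocess.PIPE,   # requests go here
        stdout=subprocess.PIPE,  # responses come back here
        stderr=subprocess.PIPE,  # log output only; never parsed as JSON-RPC
    )


if __name__ == "__main__":
    # Only show the framed bytes; launching requires ldr-mcp to be installed.
    print(encode_message("ping", {}, msg_id=1))
```

Keeping logs on stderr matters because any stray text on stdout would corrupt the message stream the client is parsing.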

Environment Variables

Set these in the env block of the MCP client configuration (for example, Claude Desktop) or in the shell environment:

| Variable | Description | Example |
| --- | --- | --- |
| LDR_LLM_PROVIDER | LLM provider | openai, ollama, anthropic |
| LDR_LLM_MODEL | Model name | gpt-4, llama3:8b |
| LDR_LLM_OPENAI_API_KEY | OpenAI API key | sk-... |
| LDR_SEARCH_TOOL | Default search engine | auto, arxiv, wikipedia |
| LDR_SEARCH_SEARCH_STRATEGY | Default strategy | source-based, focused-iteration |
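
As an illustration of where these variables go, the sketch below generates a client configuration entry with an env block. It assumes the `mcpServers` layout used by Claude Desktop's configuration file. The server name `local-deep-research` and the helper `build_client_config` are hypothetical, so check the MCP Server Guide for the exact configuration your client expects.

```python
import json


def build_client_config(env, server_name="local-deep-research", command="ldr-mcp"):
    """Return a config dict that launches the server with the given environment.

    Assumption: the client reads an "mcpServers" mapping whose entries
    have "command" and "env" keys.
    """
    return {"mcpServers": {server_name: {"command": command, "env": dict(env)}}}


if __name__ == "__main__":
    config = build_client_config(
        {
            "LDR_LLM_PROVIDER": "ollama",
            "LDR_LLM_MODEL": "llama3:8b",
            "LDR_SEARCH_TOOL": "wikipedia",
        }
    )
    # Paste the printed JSON into the client's configuration file.
    print(json.dumps(config, indent=2))
```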

See MCP Server Guide for full documentation.


Common Engine Names

When using the rate limiting CLI, use these engine class names:

| Engine | Class Name |
| --- | --- |
| DuckDuckGo | DuckDuckGoSearchEngine |
| SearXNG | SearXNGSearchEngine |
| Brave | BraveSearchEngine |
| arXiv | ArXivSearchEngine |
| PubMed | PubMedSearchEngine |
| Semantic Scholar | SemanticScholarSearchEngine |
| Wikipedia | WikipediaSearchEngine |
| GitHub | GitHubSearchEngine |
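
Because the rate limiting CLI expects class names rather than friendly names, a small lookup can prevent typos in scripts. This is a sketch built only from the table above; `ENGINE_CLASSES` and `rate_limit_cmd` are hypothetical names, not part of the package.

```python
import sys

# Friendly name (lowercase) -> class name, copied from the table above.
ENGINE_CLASSES = {
    "duckduckgo": "DuckDuckGoSearchEngine",
    "searxng": "SearXNGSearchEngine",
    "brave": "BraveSearchEngine",
    "arxiv": "ArXivSearchEngine",
    "pubmed": "PubMedSearchEngine",
    "semantic scholar": "SemanticScholarSearchEngine",
    "wikipedia": "WikipediaSearchEngine",
    "github": "GitHubSearchEngine",
}


def rate_limit_cmd(action, engine=None):
    """Build the argv list for a rate limiting CLI call such as status or reset."""
    cmd = [
        sys.executable,
        "-m",
        "local_deep_research.web_search_engines.rate_limiting.cli",
        action,
    ]
    if engine is not None:
        key = engine.strip().lower()
        if key not in ENGINE_CLASSES:
            raise ValueError(f"unknown engine: {engine!r}")
        cmd += ["--engine", ENGINE_CLASSES[key]]
    return cmd


if __name__ == "__main__":
    print(rate_limit_cmd("reset", "arXiv"))
```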

Troubleshooting

Benchmark Not Starting

  • Verify that the LLM provider is configured
  • Check that the search engine is available
  • Ensure there is sufficient disk space for results

Rate Limit Data Missing

  • Run some searches first to generate data
  • Check that the database file exists
  • Try running status without the --engine flag

Export Permission Error

  • Check write permissions on output directory
  • Use a different output directory

See Also