mirror of https://github.com/LearningCircuit/local-deep-research.git synced 2026-06-16 20:10:39 +03:00

LearningCircuit f246fa6044 docs: add comprehensive MCP server documentation (#2546)

* docs: add comprehensive MCP server documentation

- Create standalone docs/mcp-server.md with full MCP server docs covering
  installation, configuration, all 7 tools, research strategies guide,
  ReAct agentic strategy deep dive, MCP client setup, error handling,
  security model, Docker deployment, usage examples, and troubleshooting
- Add MCP Server section to docs/features.md under Advanced Features
- Add MCP Server CLI section to docs/cli-tools.md
- Fix search.search_strategy -> search.strategy in server.py and tests
  to match renamed setting from #2550

* fix(docs): correct 9 issues found in MCP server documentation review

- Revert search.strategy → search.search_strategy in server.py and tests (6 occurrences)
- Fix collection_name description: it's an engine ID, not a display name
- Fix invalid JSON in analyze_documents return example
- Add missing MCP Server CLI entry to cli-tools.md TOC
- Add unknown error type to error handling table
- Fix broken MCP security guide external link
- Clarify Docker section: MCP must run on host (STDIO can't bridge containers)
- Fix "7 research tools" → "7 tools (4 research, 3 discovery)" in features.md
- Add temperature valid range note (0.0-2.0)

* feat(mcp): add `search` tool for raw search results without LLM

Add a new MCP tool that calls a specific search engine and returns raw
results (title, link, snippet) without LLM processing. This enables
external AI agents to perform fast, cost-free searches and handle
result analysis themselves.

- Required `engine` parameter with validation against available engines
- API key presence check before engine creation
- Body-to-snippet normalization for consistent output
- 8 test cases covering success, errors, and edge cases
- Updated docs with tool count (7→8) and parameter reference

* fix(mcp): set thread-local settings context in search tool

Some engine constructors (e.g., arxiv's JournalReputationFilter) call
get_llm() internally without passing settings_snapshot, falling through
to the thread-local settings context. Set and clean up the context so
these engines can resolve settings correctly.

* docs: add OpenClaw MCP client configuration (#2562)

Add OpenClaw configuration subsection alongside Claude Desktop in the
MCP server guide, as suggested in PR #2546 review.

* docs: add Claude Code config, individual search engine examples, and openclaw

- Add Claude Code MCP configuration (.mcp.json) to README and mcp-server.md
- Add search tool to README tools table with LLM Cost column
- Add individual search engine examples (arxiv, pubmed, wikipedia, openclaw)
- Highlight search tool usefulness for monitoring and subscriptions
- List common engines in mcp-server.md search tool section

2026-03-06 03:12:52 +01:00

CLI Tools Reference

Local Deep Research includes command-line tools for benchmarking, rate limit management, and running the MCP server.

Benchmarking CLI
Rate Limiting CLI
MCP Server CLI

Benchmarking CLI

Run benchmarks to evaluate search quality and compare configurations.

Basic Usage

python -m local_deep_research.benchmarks.cli <command> [options]

Commands

`simpleqa` - Run SimpleQA Benchmark

Tests factual question answering accuracy.

python -m local_deep_research.benchmarks.cli simpleqa [options]

Options:

Option	Default	Description
`--examples`	100	Number of questions to test
`--iterations`	3	Search iterations per question
`--questions`	3	Questions per iteration
`--search-tool`	searxng	Search engine to use
`--search-strategy`	source_based	Strategy (source_based, standard, rapid, parallel, iterdrag)
`--search-model`	(default)	LLM model for research
`--search-provider`	(default)	LLM provider
`--eval-model`	(default)	Model for answer evaluation
`--eval-provider`	(default)	Provider for evaluation
`--output-dir`	~/.local-deep-research/benchmark_results	Results directory
`--human-eval`	false	Use human evaluation
`--no-eval`	false	Skip evaluation phase
`--custom-dataset`	-	Path to custom dataset

Example:

# Run 50 examples with Ollama
python -m local_deep_research.benchmarks.cli simpleqa \
  --examples 50 \
  --search-provider ollama \
  --search-model llama3.2
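
To script several runs, for example one per strategy, the command line can be assembled programmatically. The following is a minimal sketch, not part of the project: the helper name `simpleqa_command` is hypothetical, and it only uses the module path and the flags documented in the options table above. It prints the commands rather than executing them.

```python
import shlex
import sys

# Strategy names taken from the --search-strategy row of the options table.
STRATEGIES = ["source_based", "standard", "rapid", "parallel", "iterdrag"]


def simpleqa_command(strategy, examples=50, output_dir=None):
    """Build the argv list for one simpleqa run (hypothetical helper)."""
    cmd = [
        sys.executable, "-m", "local_deep_research.benchmarks.cli", "simpleqa",
        "--examples", str(examples),
        "--search-strategy", strategy,
    ]
    if output_dir is not None:
        cmd += ["--output-dir", output_dir]
    return cmd


# Print one shell-quoted command per strategy.
for strategy in STRATEGIES:
    print(shlex.join(simpleqa_command(strategy, examples=20)))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once Local Deep Research is installed in the active environment.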

`browsecomp` - Run BrowseComp Benchmark

Tests complex reasoning and multi-step research.

python -m local_deep_research.benchmarks.cli browsecomp [options]

Accepts the same options as `simpleqa`.

Example:

# Run BrowseComp with the iterdrag strategy
python -m local_deep_research.benchmarks.cli browsecomp \
  --examples 20 \
  --search-strategy iterdrag \
  --iterations 5

`compare` - Compare Configurations

Compare multiple search configurations on the same dataset.

python -m local_deep_research.benchmarks.cli compare [options]

Options:

Option	Default	Description
`--dataset`	simpleqa	Dataset to use (simpleqa, browsecomp)
`--examples`	20	Examples per configuration
`--output-dir`	~/.local-deep-research/benchmark_results/comparison	Results directory

Example:

# Compare configurations
python -m local_deep_research.benchmarks.cli compare \
  --dataset simpleqa \
  --examples 30

`list` - List Available Benchmarks

python -m local_deep_research.benchmarks.cli list

Shows available benchmark datasets and their descriptions.

Rate Limiting CLI

Monitor and manage the adaptive rate limiting system.

Basic Usage

python -m local_deep_research.web_search_engines.rate_limiting.cli <command> [options]

Commands

`status` - Show Rate Limit Statistics

View current rate limit data for search engines.

# All engines
python -m local_deep_research.web_search_engines.rate_limiting.cli status

# Specific engine
python -m local_deep_research.web_search_engines.rate_limiting.cli status --engine DuckDuckGoSearchEngine

Output columns:

Column	Description
Engine	Search engine name
Base Wait	Current wait time in seconds
Range	Min-max wait times
Success	Success rate percentage
Attempts	Total request attempts
Updated	Last update timestamp

Example output:

Rate Limit Statistics:
--------------------------------------------------------------------------------
Engine               Base Wait    Range                Success    Attempts   Updated
--------------------------------------------------------------------------------
DuckDuckGoSearchEngine 2.50        1.0s - 5.0s         95.2%      150        12-26 14:30
ArXivSearchEngine      0.50        0.5s - 1.0s         99.8%      85         12-26 12:15

`reset` - Reset Engine Rate Limits

Clear learned rate limit data for an engine.

python -m local_deep_research.web_search_engines.rate_limiting.cli reset --engine <engine_name>

Example:

# Reset DuckDuckGo rate limits
python -m local_deep_research.web_search_engines.rate_limiting.cli reset --engine DuckDuckGoSearchEngine

Use this when:

The learned rate limits are too conservative
A search engine's API has changed
You are switching environments

`export` - Export Rate Limit Data

Export rate limit statistics in various formats.

# Table format (default)
python -m local_deep_research.web_search_engines.rate_limiting.cli export

# CSV format
python -m local_deep_research.web_search_engines.rate_limiting.cli export --format csv

# JSON format
python -m local_deep_research.web_search_engines.rate_limiting.cli export --format json

Formats:

Format	Use Case
`table`	Human-readable display
`csv`	Spreadsheet import
`json`	Programmatic processing

Example CSV output:

engine_type,base_wait_seconds,min_wait_seconds,max_wait_seconds,last_updated,total_attempts,success_rate
DuckDuckGoSearchEngine,2.5,1.0,5.0,1703612400,150,0.952
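
The CSV export is convenient for quick checks in a script. The sketch below is illustrative only: it assumes the column names shown in the example above, and the helper name `engines_below` and the thresholds are hypothetical. It lists engines whose success rate falls below a chosen threshold.

```python
import csv
import io

# Sample text in the CSV layout shown above; in practice this would be the
# captured output of `export --format csv`.
SAMPLE_CSV = """\
engine_type,base_wait_seconds,min_wait_seconds,max_wait_seconds,last_updated,total_attempts,success_rate
DuckDuckGoSearchEngine,2.5,1.0,5.0,1703612400,150,0.952
"""


def engines_below(csv_text, min_success_rate):
    """Return (engine, success_rate) pairs below the given threshold."""
    reader = csv.DictReader(io.StringIO(csv_text))
    flagged = []
    for row in reader:
        rate = float(row["success_rate"])
        if rate < min_success_rate:
            flagged.append((row["engine_type"], rate))
    return flagged


# Report engines that succeed less than 99% of the time.
for engine, rate in engines_below(SAMPLE_CSV, 0.99):
    print(f"{engine}: {rate:.1%}")
```

An engine flagged this way may be a candidate for the `reset` command if its learned limits look stale.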

`cleanup` - Remove Old Data

Clean up rate limit data older than a specified number of days.

python -m local_deep_research.web_search_engines.rate_limiting.cli cleanup --days <days>

Example:

# Remove data older than 30 days
python -m local_deep_research.web_search_engines.rate_limiting.cli cleanup --days 30

# Remove data older than 7 days
python -m local_deep_research.web_search_engines.rate_limiting.cli cleanup --days 7

MCP Server CLI

Run the MCP server for integration with MCP clients such as Claude Desktop.

Basic Usage

# Via entry point
ldr-mcp

# Via module
python -m local_deep_research.mcp

The server communicates over STDIO (stdin/stdout for JSON-RPC, stderr for logs). It is designed to be launched by an MCP client such as Claude Desktop, not run interactively.

Environment Variables

Set these in the `env` block of the Claude Desktop config or in the shell environment:

Variable	Description	Example
`LDR_LLM_PROVIDER`	LLM provider	`openai`, `ollama`, `anthropic`
`LDR_LLM_MODEL`	Model name	`gpt-4`, `llama3:8b`
`LDR_LLM_OPENAI_API_KEY`	OpenAI API key	`sk-...`
`LDR_SEARCH_TOOL`	Default search engine	`auto`, `arxiv`, `wikipedia`
`LDR_SEARCH_SEARCH_STRATEGY`	Default strategy	`source-based`, `focused-iteration`

See the MCP Server Guide (docs/mcp-server.md) for full documentation.

Common Engine Names

When using the rate limiting CLI, use these engine class names:

Engine	Class Name
DuckDuckGo	`DuckDuckGoSearchEngine`
SearXNG	`SearXNGSearchEngine`
Brave	`BraveSearchEngine`
arXiv	`ArXivSearchEngine`
PubMed	`PubMedSearchEngine`
Semantic Scholar	`SemanticScholarSearchEngine`
Wikipedia	`WikipediaSearchEngine`
GitHub	`GitHubSearchEngine`
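
When scripting the rate limiting CLI, a small lookup from the friendly name to the class name avoids typos. This is a sketch under stated assumptions: the mapping copies the table above, while the helper name `rate_limit_command` and the lowercase keys are hypothetical. It builds the command without executing it.

```python
import shlex
import sys

# Friendly name (lowercased) -> engine class name, copied from the table above.
ENGINE_CLASSES = {
    "duckduckgo": "DuckDuckGoSearchEngine",
    "searxng": "SearXNGSearchEngine",
    "brave": "BraveSearchEngine",
    "arxiv": "ArXivSearchEngine",
    "pubmed": "PubMedSearchEngine",
    "semantic scholar": "SemanticScholarSearchEngine",
    "wikipedia": "WikipediaSearchEngine",
    "github": "GitHubSearchEngine",
}

CLI_MODULE = "local_deep_research.web_search_engines.rate_limiting.cli"


def rate_limit_command(action, engine):
    """Build argv for `status` or `reset` on one engine (hypothetical helper)."""
    if action not in ("status", "reset"):
        raise ValueError(f"unsupported action: {action}")
    class_name = ENGINE_CLASSES[engine.strip().lower()]
    return [sys.executable, "-m", CLI_MODULE, action, "--engine", class_name]


# Show the command that would reset the arXiv rate limits.
print(shlex.join(rate_limit_command("reset", "arXiv")))
```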

Troubleshooting

Benchmark Not Starting

Verify that the LLM provider is configured
Check that the search engine is available
Ensure there is sufficient disk space for results

Rate Limit Data Missing

Run some searches first to generate data
Check that the database file exists
Try `status` without the `--engine` flag

Export Permission Error

Check write permissions on the output directory
Use a different output directory
