mirror of https://github.com/LearningCircuit/local-deep-research.git synced 2026-06-16 03:51:07 +03:00


Benchmarks for Local Deep Research

This directory contains scripts for running benchmarks to evaluate Local Deep Research's performance.

Available Benchmarks

SimpleQA

The SimpleQA benchmark evaluates factual question answering capabilities.

python run_simpleqa.py --examples 10 --iterations 3 --questions 3

Options:

--examples: Number of examples to run (default: 10)
--iterations: Number of search iterations (default: 3)
--questions: Questions per iteration (default: 3)
--search-tool: Search tool to use (default: "searxng")
--output-dir: Directory to save results (default: "benchmark_results")
--no-eval: Skip evaluation
--human-eval: Use human evaluation
--eval-model: Model to use for evaluation
--eval-provider: Provider to use for evaluation
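
As a rough illustration of how the options above fit together, the following sketch rebuilds them with `argparse`, using the defaults listed in this README. It is not taken from `run_simpleqa.py`: the function name `build_simpleqa_parser` is hypothetical, and the real script may define its arguments differently.

```python
import argparse


def build_simpleqa_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the options documented above."""
    parser = argparse.ArgumentParser(description="SimpleQA benchmark (sketch)")
    parser.add_argument("--examples", type=int, default=10,
                        help="Number of examples to run")
    parser.add_argument("--iterations", type=int, default=3,
                        help="Number of search iterations")
    parser.add_argument("--questions", type=int, default=3,
                        help="Questions per iteration")
    parser.add_argument("--search-tool", default="searxng",
                        help="Search tool to use")
    parser.add_argument("--output-dir", default="benchmark_results",
                        help="Directory to save results")
    # The README lists no defaults for the flags below; treating the first
    # two as boolean switches is an assumption.
    parser.add_argument("--no-eval", action="store_true",
                        help="Skip evaluation")
    parser.add_argument("--human-eval", action="store_true",
                        help="Use human evaluation")
    parser.add_argument("--eval-model", default=None,
                        help="Model to use for evaluation")
    parser.add_argument("--eval-provider", default=None,
                        help="Provider to use for evaluation")
    return parser


if __name__ == "__main__":
    # Parse a sample command line instead of sys.argv for demonstration.
    args = build_simpleqa_parser().parse_args(["--examples", "5", "--no-eval"])
    print(args)
```

Note that `argparse` exposes hyphenated options with underscores, so `--search-tool` is read as `args.search_tool`.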

BrowseComp

The BrowseComp benchmark evaluates web browsing comprehension and complex question answering.

python run_browsecomp.py --examples 5 --iterations 3 --questions 3

Options:

--examples: Number of examples to run (default: 2)
--iterations: Number of search iterations (default: 1)
--questions: Questions per iteration (default: 1)
--search-tool: Search tool to use (default: "searxng")
--output-dir: Directory to save results (default: "browsecomp_results")

See browsecomp_benchmark_readme.md for more information on how BrowseComp works.

Running All Benchmarks

To run both benchmarks so that their results can be compared:

# Run SimpleQA with default settings
python run_simpleqa.py

# Run BrowseComp with increased iterations and questions
python run_browsecomp.py --iterations 3 --questions 3
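
The two commands above can also be driven from a single Python script. The sketch below is one possible wrapper, not part of this repository: `build_command` and `run_all` are hypothetical helper names, and the script assumes it is started from this directory so that the relative script paths resolve.

```python
import subprocess
import sys


def build_command(script: str, **options) -> list[str]:
    """Build an argument list such as [python, script, '--iterations', '3'].

    Keyword names are turned into flags by replacing underscores with
    hyphens (search_tool -> --search-tool). A value of True emits a bare
    flag; False or None omits the option.
    """
    cmd = [sys.executable, script]
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            cmd.append(flag)
        elif value is not False and value is not None:
            cmd.extend([flag, str(value)])
    return cmd


def run_all() -> None:
    """Run SimpleQA with defaults, then BrowseComp with 3 iterations/questions."""
    commands = [
        build_command("run_simpleqa.py"),
        build_command("run_browsecomp.py", iterations=3, questions=3),
    ]
    for cmd in commands:
        # check=True stops at the first benchmark that exits with an error.
        subprocess.run(cmd, check=True)
```

Calling `run_all()` runs the benchmarks one after the other. Passing the arguments as a list rather than a shell string avoids quoting problems.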

Evaluating Results

Results are saved in the specified output directories and include:

Raw results (JSONL format)
Evaluation results (JSONL format)
Summary reports (Markdown format)

Each script also prints a summary of its results to the console, including accuracy metrics.
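
For ad hoc inspection of the JSONL files, a small loader along the following lines may help. This is a sketch under assumptions: the README does not document the result schema, so the `is_correct` field name is hypothetical and should be replaced with whatever key the evaluation files actually contain.

```python
import json
from pathlib import Path


def load_jsonl(path) -> list[dict]:
    """Read a JSONL file: one JSON object per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def accuracy(records: list[dict], key: str = "is_correct"):
    """Fraction of graded records whose `key` is truthy.

    `key` is an assumed field name. Records without it are ignored, and
    None is returned when no record carries the key.
    """
    graded = [record for record in records if key in record]
    if not graded:
        return None
    return sum(bool(record[key]) for record in graded) / len(graded)
```

A typical call would be `accuracy(load_jsonl(results_file))`, where `results_file` points at one of the evaluation files in the output directory.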