local-deep-research

mirror of https://github.com/LearningCircuit/local-deep-research.git synced 2026-06-16 03:51:07 +03:00

Author	SHA1	Message	Date
LearningCircuit	e79a9fb76a	docs(resource-cleanup) + fix: Round 9 audit results + per-user lock-dict cleanup (#4077 ) * docs(resource-cleanup): Round 9 audit results + conditional deferred fixes Capture the Round 9 broader-resource-audit results so: - Future contributors don't re-audit the same paths - If a relevant production symptom ever appears, the doc points directly at a pre-thought-through conditional fix Round 9 ran two passes of three parallel agents looking for resource leaks BEYOND FDs (memory/cache growth, thread/lock lifecycle, DB state hygiene). Round 1 produced six HIGH-confidence findings; Round 2 verification refuted four of them and downgraded one. Added to the audit ledger (next to the existing Wave 7 entry): - Refuted findings with WHY they were refuted: - @cache on get_available_providers (called with None — hashable, cardinality 1; dicts would raise TypeError, not cache silently) - ThreadLocalSession identity-map growth (expire_on_commit=True default clears the map on every commit) - token_usage table unbounded growth (design-intentional permanent audit table; time-series compound indexes; /api/context-overflow queries historical windows) - search_calls table unbounded growth (same shape and verdict) - Three per-user lock dicts (_user_init_locks, _user_locks, _user_critical_locks): technically correct that they never clean up on user delete, but ~296 bytes per user × 3 dicts = ~900 KB ceiling at 1000 users. Practically negligible. - app_logs (ResearchLog) table — the one finding that survived verification as a real but small concern. No auto-retention; only cleaned by cascade-delete when parent Research row is manually removed. For users keeping all research, logs accumulate. Added to "Intentionally not done (deferred)": - app_logs retention setting + scheduled cleanup job. Includes the trigger conditions that would justify the work and the implementation sketch (settings key, daily APScheduler job, regression test, news fragment). 
- Per-user lock dict cleanup on user delete. Cosmetic; included with trigger conditions and one-line-per-file sketch so it's actionable if multi-user deployments ever see it. No code changes. Documentation only. * fix(resource-cleanup): pop per-user lock-dict entries on user close Three module-level per-user lock dicts had no removal hook, so each accumulated one ``threading.Lock`` entry per username over the process lifetime: - ``_user_init_locks`` in ``database/library_init.py`` (serializes collection-init check-then-insert) - ``_user_locks`` in ``database/backup/backup_service.py`` (per-user backup serialization) - ``_user_critical_locks`` on ``QueueProcessorV2`` (per-user count-then-start critical section) The ceiling was ~296 bytes/entry × 3 dicts ≈ ~900 bytes per user across all three: bounded by total user count, microscopic relative to the eventpoll FD leak that motivated the original investigation, but real for long-lived multi-user instances with user-account churn. Identified in Round 9 of the broader resource-leak audit (see docs/developing/resource-cleanup.md). The fix: - Each module now exposes a ``pop_user_*_lock(username)`` function (or method for the QueueProcessor instance dict) that pops the entry under the existing per-dict lock. - ``connection_cleanup._pop_per_user_locks(username)`` is a shared helper that lazy-imports and calls all three with individual try/except blocks so a failure in one doesn't skip the others. - The helper is invoked from both user-close paths: - ``cleanup_idle_connections`` (the 5-minute sweeper) after ``db_manager.close_user_database`` - ``web/auth/routes.py`` after the logout and password-change ``close_user_database`` calls Pattern mirrors the existing per-user cleanup in those code paths (scheduler.unregister_user, session_password_store.clear_*). Tests in ``tests/web/auth/test_connection_cleanup.py::TestPopPerUserLocks``: - Unit test: populate all three dicts, call the helper, assert all three entries are gone.
- Idempotency: pop on a never-registered user must not raise. - Integration: ``cleanup_idle_connections`` actually invokes the helper for each user it closes (verified via the library-init dict for "alice"). Doc updated: the entry that R9A2 identified as "technically correct, practically negligible" is moved from the audit ledger's findings list into a "Fixed in this PR" subsection; the matching deferred-fix entry in "Intentionally not done" is removed. Adds a towncrier bugfix fragment. * review: address self-review findings on PR #4077 Four fixes from a multi-round agent review of this PR: 1. Move ``_pop_per_user_locks`` outside the ``close_user_database`` try/except in ``connection_cleanup.py``. Previously the pop was inside the same try block, so a DB-close failure — the path that ``test_close_failure_does_not_abort_loop`` already exercises — skipped the pop and leaked the lock-dict entry. New test ``test_pop_runs_even_when_close_user_database_fails`` pins this. 2. Bump three ``logger.debug`` to ``logger.warning`` in ``_pop_per_user_locks``. Matches the sibling scheduler-unregister error handler in the same module; debug-level silently masked lock-dict accumulation across cycles. 3. Doc accuracy fix in ``docs/developing/resource-cleanup.md``. The entry called all three dicts "module-level" but ``_user_critical_locks`` is an instance attribute on ``QueueProcessorV2``. Rewrote to distinguish module-level dicts from the instance attribute and to note the pop now runs outside the close try/except. 4. Integration test pre-populates all three lock-dict entries and asserts all three are absent post-cleanup, not just ``_user_init_locks``. Switched the test username from "alice" (used by other tests in the module) to a dedicated sentinel. Tests: 23/23 in ``tests/web/auth/test_connection_cleanup.py``. 
Race-condition concerns flagged in Round 1 (TOCTOU between pop and ``_get_user_*_lock``) were verified in Round 2 to be either guarded by Python's reference-counted lock semantics (``with lock:`` keeps the original lock object alive after dict pop) or bounded to a +1 race window across multiple browser sessions — not ship-blocking. The narrow ``library_init`` race surfaces as a propagated ``IntegrityError`` on collection insert, not silent corruption.	2026-05-17 15:40:11 +02:00
LearningCircuit	6f18a711d2	docs(resource-cleanup): expand Wave 7 with full audit ledger (#4054 ) * docs(resource-cleanup): expand Wave 7 with full audit ledger Replaces the brief "follow-up gaps" bullet list with the full ledger of what the broader audit during #4047 actually examined, split into four scannable subsections: - Checked and confirmed clean: non-Ollama LLM providers, HTTP session lifecycle, subprocess/pidfd, asyncio loops, file handles, SocketIO connect/disconnect. - Flagged then verified NOT a real FD leak: OllamaEmbeddings (uses the deprecated langchain_community class with no httpx client), auth_db + journal_quality engines escaping shutdown_databases (bounded pools, not growing), LibraryRAGService in three RAG SSE endpoints (RAM churn, no FDs — FAISS uses pickle.load, embeddings hold no FDs per the item above, SentenceTransformer mmaps are process-wide singletons). - Minor findings: daemon threads without explicit shutdown, abandoned-research cleanup on socket disconnect — both reaped at process exit, not steady-state leaks. - Future-proofing note: ``langchain_community.embeddings.OllamaEmbeddings`` is deprecated; the replacement ``langchain_ollama.OllamaEmbeddings`` DOES carry ``_client`` and ``_async_client`` (verified by direct introspection), so when LDR migrates the in-running-loop eventpoll leak class will reappear for embeddings unless ``_close_base_llm`` is generalized. Direct introspection done at audit time confirms each verdict: ``[a for a in dir(e) if 'client' in a.lower()]`` returned ``[]`` for the deprecated class and a non-empty list for the new class. This ledger saves the next contributor from re-running the same agent sweep when investigating a future FD spike. No code changes. * docs(resource-cleanup): add Round-8 pidfd finding (fixed by #3971) The Wave 7 ledger covered the eventpoll-FD investigation but didn't mention the residual pidfd accumulation we discovered post-merge. 
A follow-up Round-8 investigation (8 parallel agents, 2 rounds + direct /proc inspection on a live prerelease container) traced ~3.6 pidfds/hour, steady-state ~29, to: _check_subscription → quick_summary → FullSearchResults.batch_fetch_and_extract → AutoHTMLDownloader fallback → PlaywrightHTMLDownloader._fetch_with_playwright → sync_playwright().start() → asyncio.create_subprocess_exec(node-driver) # opens pidfd → driver fails (Chromium not installed in production ldr stage) → pidfd not closed on the failed-child exit CPython 3.14 ruled out as a confounder: subprocess.py uses waitpid(WNOHANG) polling, never opens pidfds. Only asyncio.create_subprocess_* and multiprocessing.Process can open them on Linux + Python 3.9+ via PidfdChildWatcher. PR #3971 (already merged) addresses this from a different angle: it makes web.enable_javascript_rendering default false, so AutoHTMLDownloader short-circuits before invoking Playwright. No subprocess spawned → no pidfd opened. Original motivation for #3971 was the confusing tracebacks reported in #3826; the FD-leak finding is the second motivation, captured here so a future reader sees both. The new bullet sits in Section B (flagged-then-verified-then-fixed) because the leak was real but is now resolved upstream. * docs(resource-cleanup): add FD-leak debugging playbook + CI considerations Add a new "Debugging FD leaks — playbook for the next one" section between the History (Waves 1-7) and "Intentionally not done" parts of the doc, capturing the diagnostic flow we developed across Waves 6 and 7 so future contributors don't re-derive it from scratch. Includes: - Symptoms that justify treating an issue as an FD leak (OSError 24, static-asset MIME errors, High FD count warnings, healthcheck hangs). - Host-side and inside-container snapshot scripts that work even when the container is too FD-starved for docker exec (host-side via sudo + /proc/$P/fd) and through the entrypoint's UID drop (--user 0 to docker exec). 
- Lookup table mapping each anon_inode / socket / pipe / REG flavor to its likely Python-level source and the path to deep-dive (e.g. /proc/PID/fdinfo/N's Pid: line for pidfds). - A pinpointing recipe per FD type — eventpoll (asyncio/httpx), pidfd (asyncio.create_subprocess / multiprocessing.Process), WAL/SHM (SQLCipher engine.dispose). - Pointer to the existing in-codebase instrumentation: _count_open_fds, the periodic Resource monitor log, fd_monitor.py, and the RUN_MANUAL_SMOKE-gated tests/manual_smoke/test_fd_smoke.py harness. - Honest discussion of why an automated per-PR FD-growth assertion is hard (transient FDs, CI-environment subprocess noise, namespace differences, slow-drip leaks needing hours of uptime) and what a nightly long-run job would look like if the team chooses to invest in one. - A "which Wave fixed which leak class" reference table so the next reporter can recognize a class and skip to the relevant precedent. No code changes. Pure documentation. * docs(resource-cleanup): add development-time detection + bpftrace recipes Extend the FD-leak debugging playbook with two industry-standard techniques that would have caught Waves 6 and 7 earlier, drawn from upstream Python docs and the wider production-tracing literature: 1. bpftrace syscall-level pinpointing (in the per-FD-type section). Trace pidfd_open / epoll_create1 / etc. on the host targeting the container's host PID; produces a histogram of every user stack that triggered the syscall, ranked by frequency. The hot stacks are the culprits. Would have caught the Playwright pidfd leak in seconds. 2. Development-time detection (new subsection 4a) — catches leaks at test time before they ship: - PYTHONASYNCIODEBUG=1 + -W default::ResourceWarning. Per the asyncio dev docs, unclosed transports emit ResourceWarning at GC time; the filter actually displays them. Would have surfaced the Wave 7 in-running-loop skip in any test that exercised ainvoke + safe_close on ChatOllama. 
- python -X dev for a one-flag local dev mode bundling ResourceWarning + asyncio debug + warnings as default. - pyproject.toml [tool.pytest.ini_options] examples for both "display" and "error" filter modes (with a caveat that error mode needs a targeted subset, not the whole suite, because third-party libs also emit ResourceWarning). - psutil's num_fds / open_files / connections as the cross-platform alternative to /proc/self/fd for unit tests on macOS dev environments. - tracemalloc + objgraph as the next-level tool when a leak is reproducible: diff allocations before/after, then render the reference chain holding the leaked wrapper alive. No code changes. The new tooling is recommendations only; no mandatory pytest config change in this commit. Future work could enable PYTHONASYNCIODEBUG=1 in the CI test environment if the overhead is acceptable. Citations to docs.python.org are inline for the load-bearing ResourceWarning claim. * test(fd-canary): pin asyncio.create_subprocess pidfd lifecycle in CI Add ``TestAsyncioSubprocessFDBaseline`` to ``tests/utilities/test_close_base_llm.py`` with two regression tests that run on every PR: 1. ``test_no_fd_growth_across_asyncio_subprocess_cycles``: spawns ``/bin/true`` via ``asyncio.create_subprocess_exec`` 10 times and asserts total FD count delta ≤ +2. Pins the pidfd FD class against the child-watcher leak shape. 2. ``test_no_fd_growth_when_subprocess_fails_to_exec``: same shape but with a deliberately-missing binary, mirroring the exact Wave-7 production failure mode (Playwright's Node.js driver being spawned, kernel returning ENOENT because Chromium wasn't installed, child watcher still expected to clean up the pidfd it opened before the failed exec). Why this is the right level --------------------------- LDR's own code does NOT call ``asyncio.create_subprocess_*`` (verified in R8C1). The production leak came from a transitive dependency (Playwright). So we cannot test LDR's call sites directly; there are none.
Instead these tests pin the *platform baseline*: on this Python version, repeated asyncio subprocess cycles must not leak FDs. If a future Python upgrade, a child-watcher change, or a new direct asyncio.create_subprocess call in LDR breaks the close semantics, the next PR's CI fails on these tests, which is the canary signal we want. Linux-only via ``sys.platform != "linux"`` skip. pidfd_open is a Linux syscall; macOS uses a different watcher and Windows uses ProactorEventLoop. Both 'pass by virtue of nothing to leak', so restricting to Linux keeps the signal sharp (a failure on Linux is actionable; a pass on macOS is uninformative). Same +2 FD slack we use for the eventpoll canary above. A real 1-FD-per-iter leak across 10 iterations would land at delta=10, well past the threshold. Doc reference ------------- Updated ``docs/developing/resource-cleanup.md`` "Existing instrumentation" section to enumerate all four in-CI FD-growth canaries (two eventpoll, two pidfd) so future contributors see at a glance what's already guarded and where to extend coverage when a new leak class is found.	2026-05-16 20:01:04 +02:00
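The shape of the FD-growth canary described in this commit might look like the sketch below. It is an assumption-laden illustration rather than the test from the repository: the function names are hypothetical, it spawns `sys.executable` instead of `/bin/true`, and the warm-up cycle is an addition made here so that lazily opened one-time descriptors do not count toward the delta. It relies on `/proc/self/fd`, so it is meaningful on Linux only.

```python
import asyncio
import os
import sys


def count_open_fds() -> int:
    """Count this process's open descriptors via /proc (Linux only)."""
    return len(os.listdir("/proc/self/fd"))


async def spawn_cycles(n: int) -> None:
    """Spawn and reap n short-lived children through asyncio."""
    for _ in range(n):
        proc = await asyncio.create_subprocess_exec(sys.executable, "-c", "pass")
        await proc.wait()


def fd_delta_across_cycles(n: int = 10) -> int:
    """Return FD growth across n spawn cycles, measured after a warm-up."""
    asyncio.run(spawn_cycles(1))  # warm-up: not part of the measured window
    before = count_open_fds()
    asyncio.run(spawn_cycles(n))
    return count_open_fds() - before
```

A canary built on this would assert `fd_delta_across_cycles() <= 2`, matching the +2 slack mentioned in the commit; a leak of one descriptor per cycle would show up as a delta near 10.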
LearningCircuit	3d0b7bb5f9	review: hoist asyncio+threading imports to module level + Wave 7 doc (#4048 ) Addresses the AI Code Review nit on #4047: ``import threading`` (and the sibling ``import asyncio``) lived inside the ``_close_base_llm`` function body. There's no circular-import or optional-dependency reason to defer them; moving them to the top of the module improves readability and static analysis. Also extends ``docs/developing/resource-cleanup.md`` with a Wave 7 entry documenting: - The in-running-loop ``aclose`` skip bug (this PR's fix). - The healthcheck ``pidfd`` leak (Dockerfile change in the same PR). - The three gaps the broader audit during this PR surfaced as follow-up rather than in-scope work: ``OllamaEmbeddings`` httpx (same FD class as ChatOllama, no close path in langchain wrappers), ``auth_db`` / ``journal_quality`` engines escaping ``shutdown_databases``, and three RAG SSE endpoints constructing ``LibraryRAGService`` before the generator without a ``finally`` close. Also captures the negative results from the audit (non-Ollama providers safe via shared lru_cache, no subprocess pidfd risk, no raw event-loop creation, all ``open()`` calls inside ``with``) so a future contributor reading the history sees what was checked and ruled out.	2026-05-14 22:58:57 +02:00
LearningCircuit	5ede95d3b4	docs(developing): add resource-cleanup.md capturing the FD-leak campaign (#3856 ) Adds a single contributor-facing doc that explains how LDR manages process-level resources (DB sessions, LLM HTTP clients, search engines, threads) and the reasoning trail behind the current model. Why this isn't an ADR: ADRs (`docs/decisions/`) are for single-decision records. This doc is wider — it captures current architecture, a how-to cookbook, anti-patterns specific to this codebase, the chronological history of the FD-leak fix campaign (#1832 through #3855), and the deferred work list with reasoning. The history section consolidates ~14 weeks of iterative work across 15+ PRs into a single archive so future contributors hitting FD-shaped issues can see what's been tried, what worked, and what was ruled out without reconstructing it from `git log`. The "intentionally not done" section preempts re-discovery of deferred work as missing work. Related to #3816. The companion code fix is #3855. Co-authored-by: Daniel Petti <djpetti@gmail.com> Co-authored-by: r69 <143521130+r69shabh@users.noreply.github.com> Co-authored-by: Chris Dzombak <chris@dzombak.com>	2026-05-09 10:54:50 +02:00
LearningCircuit	061cd83dd4	feat: add is_lexical flag to auto-enable LLM relevance filtering for keyword-based engines (#3403 ) * feat: add needs_reranking flag to auto-enable LLM relevance filtering for keyword-based engines Engines with poor native relevance ranking (arXiv, PubMed, Wikipedia, GitHub, Mojeek, etc.) now auto-enable LLM-based result filtering via a new `needs_reranking` class attribute. This fixes the priority bug where the global `skip_relevance_filter=True` incorrectly overrode auto-detection for engines that genuinely need filtering. Priority is now: per-engine setting > needs_reranking > global skip. The global skip only affects unclassified engines. Closes #2297 * fix: address 7 code-review issues on needs_reranking branch 1. Rename needs_reranking → needs_llm_relevance_filter for consistency with enable_llm_relevance_filter and skip_relevance_filter naming 2. Fix Paperless dead code: replace non-existent _apply_content_filters with proper _filter_for_relevance() call in custom run() override 3. Fix misleading skip_relevance_filter description to accurately reflect checkbox behavior and keyword engine exceptions 4. Delete 4 vacuously-true inline tests that duplicated factory logic instead of calling the real factory (coverage tests already exist) 5. Add needs_llm_relevance_filter to EXTENDING.md and OVERVIEW.md 6. Clarify is_generic comment: generic does not imply good ranking 7. Upgrade no-LLM log from debug to warning when filtering was requested but no LLM is available (with should_filter guard) * fix: remove Paperless fallback that overrode valid empty LLM filter results Replace the fallback that restored all previews when the LLM filter returned empty with an info log. The base class _filter_for_relevance() already handles errors internally (returns previews[:5] on exception or JSON parse failure). An empty result means the LLM legitimately found nothing relevant — trust it, don't override it. 
* refactor: rename needs_llm_relevance_filter → is_lexical The flag describes what the engine IS (lexical/keyword-based search) rather than what it needs. This is a general classification that can drive multiple behaviors beyond just the relevance filter — e.g. query optimization strategies, result deduplication, or UI hints. Matches the existing is_* naming pattern (is_scientific, is_generic). * Revert "refactor: rename needs_llm_relevance_filter → is_lexical" This reverts commit `c322d478a1`. * Reapply "refactor: rename needs_llm_relevance_filter → is_lexical" This reverts commit `853dfe90bd`. * feat: add is_lexical classification flag alongside needs_llm_relevance_filter Separates classification from behavior: - is_lexical: informational flag indicating the engine uses keyword/lexical search. Reusable for query optimization, UI hints, deduplication, etc. - needs_llm_relevance_filter: behavioral flag that the factory reads to auto-enable LLM relevance filtering on the engine instance. Both flags are set on all 15 keyword-based engines. The factory only checks needs_llm_relevance_filter for filtering decisions. 
* fix: improve relevance filter error handling and logging - Return [] on all error paths instead of hiding failures behind previews[:5] fallback — failures should be visible, not masked - Log errors at error level (not warning) for LLM parse failures - Add engine name prefix to all log messages for traceability - Add token estimate debug log to help diagnose context overflow - Reduce log noise: routine operations are debug, only summary is info - Consolidate validation into single check * fix: address PR review findings for relevance filter - Fix literal \n in EXTENDING.md code block - Remove 'Maximum results to return' from LLM prompt (LLM decides) - Add INPUT/KEPT/REMOVED debug logging for filter quality analysis - Add is_lexical + needs_llm_relevance_filter to ElasticsearchSearchEngine - Delete vacuously-true test_missing_llm_returns_none test - Downgrade no-op skip_relevance_filter log from info to debug * refactor: extract relevance filter into dedicated module Pull the inline _filter_for_relevance() logic out of BaseSearchEngine into a new web_search_engines/relevance_filter.py module. - Use with_structured_output() with Pydantic schema; let LangChain pick the per-provider default method (JSON schema on Ollama, tool-calling on Anthropic, responseSchema on Gemini). - Trim prompt: drop URLs, cap snippets at 200 chars. - Suppress reasoning on Ollama thinking-by-default models via reasoning=False — saves 30-60s per call on qwen3 dense variants. - Treat empty LLM responses as valid judgments; log a warning on batches >2 so users notice a misbehaving model. - On exception or parse failure, return first N previews (cap=5 or max_filtered_results) to avoid overwhelming downstream. * refactor(relevance_filter): cleanup + add direct tests * feat(relevance_filter): batch previews in parallel for speed and reliability Adds two tunable parameters to the LLM relevance filter: - batch_size: split previews into chunks before sending to the LLM. 
Each batch uses local indices [0..batch_size-1] mapped back to global. Default 10. Smaller batches are faster per call AND more reliable on weaker models that struggle with many indices in one context. - max_parallel_batches: dispatch batches concurrently via a ThreadPoolExecutor. Default 4. Result order is preserved across parallel batches. Both exposed as BaseSearchEngine class attributes (relevance_filter_batch_size, relevance_filter_max_parallel_batches) so individual engines can override. Failure semantics: - Hard exception on any batch -> capped slice fallback (unchanged). - Parse failure on a single batch -> skip that batch only, keep results from successful batches. Adds 4 direct unit tests covering chunk/index mapping, batch_size=None single-call mode, failed-batch-skip-keeps-others, and parallel dispatch order preservation. All 120 tests pass. * refactor(relevance_filter): drop structured output, parse plain text The Pydantic with_structured_output() path had several issues: - qwen3 dense models returned prose instead of JSON, raising OutputParserException and disabling the filter for that call - grammar-constrained output on Ollama was 6-10x slower than plain text generation (~24s vs ~4s for 50 previews) - per-provider quirks (function_calling latency, schema bikeshedding) Switch to plain llm.invoke() and parse integers from the response with a tightened regex (word-boundary, no decimal fractions). The prompt now instructs the model to output ONLY the indices, which combined with the regex is robust against prose-injection of small numbers. Removes RelevanceResult Pydantic class, _invoke_structured, the _BATCH_FAILED_PARSE sentinel, and the "all batches failed" branch (all dead under the new contract). Updates tests to mock llm.invoke directly. Tightens default batch_size to 5 and parallel batches to 10 based on benchmark runs against Ollama. * docs: fix stale _filter_for_relevance docstring after text-parsing rewrite	2026-04-06 23:04:47 +02:00
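The final design in this commit (plain-text responses, integer parsing with a tightened regex, batches with local indices mapped back to global positions, parallel dispatch with order preserved) can be sketched as below. All names are hypothetical, the regex is one possible reading of "word-boundary, no decimal fractions", and `judge` is a stand-in callable for the real `llm.invoke` call plus prompt construction.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

# Whole-word integers that are not part of a decimal fraction such as "0.5".
_INDEX_RE = re.compile(r"(?<![\d.])\b\d+\b(?!\.\d)")


def parse_indices(text: str, batch_len: int) -> List[int]:
    """Extract in-range local indices from a plain-text model response."""
    found = {int(m.group()) for m in _INDEX_RE.finditer(text)}
    return sorted(i for i in found if i < batch_len)


def filter_previews(
    previews: Sequence[str],
    judge: Callable[[Sequence[str]], str],  # stand-in for the LLM call
    batch_size: int = 5,
    max_parallel: int = 10,
) -> List[str]:
    """Keep only the previews whose local index the judge returned."""
    batches = [
        (start, previews[start:start + batch_size])
        for start in range(0, len(previews), batch_size)
    ]

    def run(batch):
        start, items = batch
        # Map local indices [0..len(items)-1] back to global positions.
        return [start + i for i in parse_indices(judge(items), len(items))]

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # Executor.map yields results in submission order, so the output
        # order is stable even though batches run concurrently.
        kept = [g for indices in pool.map(run, batches) for g in indices]
    return [previews[g] for g in kept]
```

For example, with seven previews and a judge that always answers `"0, 2"`, the first batch keeps positions 0 and 2 and the second batch (two items) keeps only its position 0, because local index 2 is out of range there.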
LearningCircuit	05b96fbe3f	refactor: move engine module paths from settings DB to hardcoded registry (#2843 ) * refactor: move engine module paths from settings DB to hardcoded registry Engine implementation details (module_path, class_name, full_search_module, full_search_class) are internal wiring, not user configuration. Storing them in the settings DB created a security attack surface requiring blocklist validation and route blocking. Changes: - New engine_registry.py with frozen dataclass entries for all 24 engines - search_engines_config.py injects registry data after loading DB settings - search_engine_factory.py passes engine_config to full search wrapper - Remove ~52 module/class entries from 9 JSON defaults files - Remove BLOCKED_SETTING_PATTERNS, is_blocked_setting(), and 4 call sites - Remove absolute→relative normalization from module_whitelist.py - Update docs, tests, and golden master * fix: remove TestGetBlockedSettingsError that references removed function The get_blocked_settings_error() function was removed as part of the engine registry refactor. This test class was added on main after the PR was created and wasn't caught by conflict resolution. * fix: remove TestSaveSettingsPostBlockedSetting that tests removed blocking logic BLOCKED_SETTING_PATTERNS and is_blocked_setting() were removed as part of the engine registry refactor. This test was added on main and references the now-removed blocking behavior. * fix: inject ENGINE_REGISTRY into parallel/meta engine _get_search_config() Both ParallelSearchEngine and MetaSearchEngine manually extract config from settings_snapshot without going through search_config(). Since module_path/class_name are no longer in the settings DB (they live in the hardcoded registry), these engines would silently fail to discover sub-engines on fresh installations. Fix: inject ENGINE_REGISTRY values after extraction, matching the pattern used in search_config(). 
Also fixes MetaSearchEngine's stale check for "search.engine.auto.class_name" in settings_snapshot: this key no longer exists in settings DB, so auto engine config would be skipped. * fix: update tests for engine registry refactor - test_whitelist_config_consistency: check ENGINE_REGISTRY instead of JSON defaults (module_path/class_name no longer in defaults) - test_meta_search_engine_high_value: expect registry-injected module_path/class_name in _get_search_config() output - test_meta_search_engine_extended: registry overwrites snapshot values - test_settings_routes_coverage: remove blocked setting tests (blocking logic removed; registry is now the security mechanism) - test_settings_routes_deep_coverage2: same as above * fix: add 5 missing engines to registry, strip module_path from their settings Add gutenberg, openlibrary, pubchem, stackexchange, and zenodo to ENGINE_REGISTRY (were added to main in #1540 after this branch diverged). Remove module_path/class_name from their settings JSON files and golden master, matching the pattern established for all other engines. Expand test_engine_registry.py to scan per-engine settings_*.json files and verify no settings files still contain module_path/class_name. * fix: inject full_search_module/class in meta/parallel engine _get_search_config() The registry injection in MetaSearchEngine and ParallelSearchEngine was missing full_search_module and full_search_class fields, making it inconsistent with the main search_config() injection. This would cause full-search wrappers to fail when created through meta/parallel engines. * fix: resolve pre-commit formatting issues and sync pdm.lock after merge with main	2026-03-20 20:22:10 +01:00
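The registry-injection idea in this commit (frozen dataclass entries whose wiring fields overwrite whatever the settings snapshot contains) could be sketched like this. The entry contents, the `inject_registry` helper, and the `example` engine are all hypothetical; only the four field names and the `ENGINE_REGISTRY` name come from the commit message.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class EngineEntry:
    """Internal wiring for one engine; immutable once defined."""
    module_path: str
    class_name: str
    full_search_module: Optional[str] = None
    full_search_class: Optional[str] = None


# Hypothetical entry for illustration only.
ENGINE_REGISTRY: Dict[str, EngineEntry] = {
    "example": EngineEntry(
        module_path=".engines.example_engine",
        class_name="ExampleSearchEngine",
    ),
}


def inject_registry(engine_name: str, settings_config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the settings with registry wiring forced on top."""
    merged = dict(settings_config)
    entry = ENGINE_REGISTRY.get(engine_name)
    if entry is not None:
        # All four wiring fields are overwritten, including unset (None)
        # ones, so values from the settings store can never select a module.
        merged.update(asdict(entry))
    return merged
```

Overwriting rather than merging is the point of the design: user-editable settings keep keys such as API credentials, while the import path always comes from code.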
LearningCircuit	890c84e534	docs: link auto-generated Configuration Reference across docs & fix stale env var docs (#2472 ) - Add "Config Reference" link to Settings page "Learn & Get Help" bar - Overhaul docs/env_configuration.md: remove stale Dynaconf references, fix wrong double-underscore env var format, remove documented-as-fixed bug, replace duplicate tables with links to CONFIGURATION.md - Fix broken case-sensitive link in docs/deployment/unraid.md - Add CONFIGURATION.md cross-references to 12 docs' "See Also" sections - Update .env.template with correct LDR_-prefixed variable names - Add config reference comment to docker-compose.yml environment block	2026-02-28 13:46:34 +01:00
LearningCircuit	465b0f3e9e	docs: Add architecture, extension guide, and troubleshooting documentation Add comprehensive documentation for contributors and users: - docs/architecture/OVERVIEW.md: System architecture with Mermaid diagrams covering components, research flow, threading model, and configuration - docs/architecture/DATABASE_SCHEMA.md: Complete database schema with ER diagram documenting all 40+ models - docs/developing/EXTENDING.md: Extension guide for adding custom search engines, strategies, LLM providers, and LangChain retrievers - docs/troubleshooting.md: Common issues and solutions for LLM, search, database, WebSocket, Docker, and API problems	2025-12-26 14:20:23 +01:00

8 Commits