Commit Graph

8 Commits

Author SHA1 Message Date
LearningCircuit
e79a9fb76a docs(resource-cleanup) + fix: Round 9 audit results + per-user lock-dict cleanup (#4077)
* docs(resource-cleanup): Round 9 audit results + conditional deferred fixes

Capture the Round 9 broader-resource-audit results so:
- Future contributors don't re-audit the same paths
- If a relevant production symptom ever appears, the doc points
  directly at a pre-thought-through conditional fix

Round 9 ran two passes of three parallel agents looking for resource
leaks BEYOND FDs (memory/cache growth, thread/lock lifecycle, DB state
hygiene). Round 1 produced six HIGH-confidence findings; Round 2
verification refuted four of them and downgraded one.

Added to the audit ledger (next to the existing Wave 7 entry):

- Refuted findings with WHY they were refuted:
  - @cache on get_available_providers (called with None — hashable,
    cardinality 1; dicts would raise TypeError, not cache silently)
  - ThreadLocalSession identity-map growth (expire_on_commit=True
    default clears the map on every commit)
  - token_usage table unbounded growth (design-intentional permanent
    audit table; time-series compound indexes; /api/context-overflow
    queries historical windows)
  - search_calls table unbounded growth (same shape and verdict)

- Three per-user lock dicts (_user_init_locks, _user_locks,
  _user_critical_locks): technically correct that they never clean
  up on user delete, but ~296 bytes per entry × 3 dicts ≈ 900 bytes
  per user, a ~900 KB ceiling at 1000 users. Practically negligible.

- app_logs (ResearchLog) table — the one finding that survived
  verification as a real but small concern. No auto-retention; only
  cleaned by cascade-delete when parent Research row is manually
  removed. For users keeping all research, logs accumulate.
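
The ``@cache`` verdict in the refuted findings above rests on standard
``functools.cache`` behavior, which can be checked in isolation. A
minimal sketch, assuming a hypothetical stand-in for
``get_available_providers`` (its real signature and body are not shown
in this commit):

```python
from functools import cache


@cache
def get_available_providers(settings_snapshot=None):
    # Hypothetical stand-in body; only the caching behavior matters here.
    return ("provider_a", "provider_b")


# Repeated calls with None share a single cache entry (cardinality 1).
get_available_providers(None)
get_available_providers(None)
info = get_available_providers.cache_info()

# An unhashable argument fails loudly instead of silently growing the cache.
try:
    get_available_providers({"key": "value"})
    dict_call_raised = False
except TypeError:
    dict_call_raised = True
```

After the two ``None`` calls the cache holds one entry, and the dict
call raises ``TypeError`` before anything is stored.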

Added to "Intentionally not done (deferred)":

- app_logs retention setting + scheduled cleanup job. Includes the
  trigger conditions that would justify the work and the
  implementation sketch (settings key, daily APScheduler job,
  regression test, news fragment).

- Per-user lock dict cleanup on user delete. Cosmetic; included with
  trigger conditions and one-line-per-file sketch so it's actionable
  if multi-user deployments ever see it.

No code changes. Documentation only.

* fix(resource-cleanup): pop per-user lock-dict entries on user close

Three module-level per-user lock dicts had no removal hook, so each
accumulated one ``threading.Lock`` entry per username over the
process lifetime:

- ``_user_init_locks`` in ``database/library_init.py`` (serializes
  collection-init check-then-insert)
- ``_user_locks`` in ``database/backup/backup_service.py`` (per-user
  backup serialization)
- ``_user_critical_locks`` on ``QueueProcessorV2`` (per-user
  count-then-start critical section)

The ceiling was ~296 bytes/entry × 3 dicts ≈ ~900 bytes per user
across all three — bounded by total user count, microscopic relative
to the eventpoll FD leak that motivated the original investigation,
but real for long-lived multi-user instances with user-account
churn. Identified in Round 9 of the broader resource-leak audit
(see docs/developing/resource-cleanup.md).

The fix:

- Each module now exposes a ``pop_user_*_lock(username)`` function
  (or method for the QueueProcessor instance dict) that pops the
  entry under the existing per-dict lock.
- ``connection_cleanup._pop_per_user_locks(username)`` is a shared
  helper that lazy-imports and calls all three with individual
  try/except blocks so a failure in one doesn't skip the others.
- The helper is invoked from both user-close paths:
  - ``cleanup_idle_connections`` (the 5-minute sweeper) after
    ``db_manager.close_user_database``
  - ``web/auth/routes.py`` after the logout and password-change
    ``close_user_database`` calls

Pattern mirrors the existing per-user cleanup in those code paths
(scheduler.unregister_user, session_password_store.clear_*).

Tests in ``tests/web/auth/test_connection_cleanup.py::TestPopPerUserLocks``:

- Unit test: populate all three dicts, call the helper, assert all
  three entries are gone.
- Idempotency: pop on a never-registered user must not raise.
- Integration: ``cleanup_idle_connections`` actually invokes the
  helper for each user it closes (verified via the library-init dict
  for "alice").

Doc updated: the entry that R9A2 identified as "technically correct,
practically negligible" is moved from the audit ledger's findings
list into a "Fixed in this PR" subsection; the matching
deferred-fix entry in "Intentionally not done" is removed.

Adds a towncrier bugfix fragment.

* review: address self-review findings on PR #4077

Four fixes from a multi-round agent review of this PR:

1. Move ``_pop_per_user_locks`` outside the ``close_user_database``
   try/except in ``connection_cleanup.py``. Previously the pop was
   inside the same try block, so a DB-close failure — the path that
   ``test_close_failure_does_not_abort_loop`` already exercises —
   skipped the pop and leaked the lock-dict entry. New test
   ``test_pop_runs_even_when_close_user_database_fails`` pins this.

2. Bump three ``logger.debug`` calls to ``logger.warning`` in
   ``_pop_per_user_locks``. Matches the sibling scheduler-unregister
   error handler in the same module; debug-level silently masked
   lock-dict accumulation across cycles.

3. Doc accuracy fix in ``docs/developing/resource-cleanup.md``.
   The entry called all three dicts "module-level" but
   ``_user_critical_locks`` is an instance attribute on
   ``QueueProcessorV2``. Rewrote to distinguish module-level dicts
   from the instance attribute and to note the pop now runs outside
   the close try/except.

4. Integration test pre-populates all three lock-dict entries and
   asserts all three are absent post-cleanup, not just
   ``_user_init_locks``. Switched the test username from "alice"
   (used by other tests in the module) to a dedicated sentinel.
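
Fix 1 above is about control flow: the pop has to sit outside the
try/except that wraps the close call. A minimal sketch of that pattern,
with hypothetical function and parameter names (the real sweeper does
more than this):

```python
def cleanup_users(usernames, close_user_database, pop_per_user_locks):
    """Close each user's database, then always pop that user's locks."""
    closed, failed = [], []
    for username in usernames:
        try:
            close_user_database(username)
            closed.append(username)
        except Exception:
            # A close failure must not abort the loop.
            failed.append(username)
        # Outside the try block, so a close failure cannot skip the pop.
        pop_per_user_locks(username)
    return closed, failed


popped = []


def flaky_close(username):
    if username == "bad-user":
        raise RuntimeError("simulated close failure")


closed, failed = cleanup_users(
    ["bad-user", "good-user"], flaky_close, popped.append
)
```

Both users end up in ``popped`` even though the close for ``bad-user``
raised.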

Tests: 23/23 in ``tests/web/auth/test_connection_cleanup.py``.

Race-condition concerns flagged in Round 1 (TOCTOU between pop and
``_get_user_*_lock``) were verified in Round 2 to be either guarded
by Python's reference-counted lock semantics (``with lock:`` keeps
the original lock object alive after dict pop) or bounded to a +1
race window across multiple browser sessions — not ship-blocking.
The narrow ``library_init`` race surfaces as a propagated
``IntegrityError`` on collection insert, not silent corruption.
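
The reference-counting argument above can be illustrated with plain
``threading`` objects. A small sketch with a hypothetical dict and
username, showing both the safe case and the bounded race:

```python
import threading

locks = {"sentinel-user": threading.Lock()}

lock = locks["sentinel-user"]
with lock:
    # Popping the dict entry drops one reference, but the local name
    # `lock` still refers to the object, so the critical section holds.
    locks.pop("sentinel-user")
    held_during_pop = lock.locked()

released_after = not lock.locked()

# The bounded race: a caller arriving after the pop gets a new lock
# object, so it is not serialized against a holder of the old one.
new_lock = locks.setdefault("sentinel-user", threading.Lock())
different_object = new_lock is not lock
```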
2026-05-17 15:40:11 +02:00
LearningCircuit
6f18a711d2 docs(resource-cleanup): expand Wave 7 with full audit ledger (#4054)
* docs(resource-cleanup): expand Wave 7 with full audit ledger

Replaces the brief "follow-up gaps" bullet list with the full ledger
of what the broader audit during #4047 actually examined, split into
four scannable subsections:

- Checked and confirmed clean: non-Ollama LLM providers, HTTP session
  lifecycle, subprocess/pidfd, asyncio loops, file handles, SocketIO
  connect/disconnect.
- Flagged then verified NOT a real FD leak: OllamaEmbeddings (uses
  the deprecated langchain_community class with no httpx client),
  auth_db + journal_quality engines escaping shutdown_databases
  (bounded pools, not growing), LibraryRAGService in three RAG SSE
  endpoints (RAM churn, no FDs — FAISS uses pickle.load, embeddings
  hold no FDs per the item above, SentenceTransformer mmaps are
  process-wide singletons).
- Minor findings: daemon threads without explicit shutdown,
  abandoned-research cleanup on socket disconnect — both reaped at
  process exit, not steady-state leaks.
- Future-proofing note: ``langchain_community.embeddings.OllamaEmbeddings``
  is deprecated; the replacement ``langchain_ollama.OllamaEmbeddings``
  DOES carry ``_client`` and ``_async_client`` (verified by direct
  introspection), so when LDR migrates, the in-running-loop eventpoll
  leak class will reappear for embeddings unless ``_close_base_llm``
  is generalized.

Direct introspection done at audit time confirms each verdict:
``[a for a in dir(e) if 'client' in a.lower()]`` returned ``[]`` for
the deprecated class and a non-empty list for the new class. This
ledger saves the next contributor from re-running the same agent
sweep when investigating a future FD spike.
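
The introspection one-liner quoted above can be wrapped in a helper and
tried on stand-in objects. The two classes below are hypothetical
placeholders, not the langchain classes; they only mimic "holds no
client" versus "holds ``_client`` and ``_async_client``":

```python
class LegacyEmbeddings:
    """Stand-in for a wrapper that holds no HTTP client."""

    def __init__(self):
        self.model = "example-model"


class NewEmbeddings:
    """Stand-in for a wrapper that owns sync and async clients."""

    def __init__(self):
        self._client = object()
        self._async_client = object()


def client_attrs(obj):
    # Same expression as in the audit note, applied to any object.
    return [a for a in dir(obj) if "client" in a.lower()]


legacy_attrs = client_attrs(LegacyEmbeddings())
new_attrs = client_attrs(NewEmbeddings())
```

An empty list suggests there is no client attribute to close; a
non-empty list names the attributes a close helper would need to handle.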

No code changes.

* docs(resource-cleanup): add Round-8 pidfd finding (fixed by #3971)

The Wave 7 ledger covered the eventpoll-FD investigation but didn't
mention the residual pidfd accumulation we discovered post-merge. A
follow-up Round-8 investigation (8 parallel agents, 2 rounds + direct
/proc inspection on a live prerelease container) traced ~3.6
pidfds/hour, steady-state ~29, to:

  _check_subscription → quick_summary
    → FullSearchResults.batch_fetch_and_extract
    → AutoHTMLDownloader fallback
    → PlaywrightHTMLDownloader._fetch_with_playwright
    → sync_playwright().start()
    → asyncio.create_subprocess_exec(node-driver)  # opens pidfd
    → driver fails (Chromium not installed in production ldr stage)
    → pidfd not closed on the failed-child exit

CPython 3.14 ruled out as a confounder: subprocess.py uses
waitpid(WNOHANG) polling, never opens pidfds. Only
asyncio.create_subprocess_* and multiprocessing.Process can open them
on Linux + Python 3.9+ via PidfdChildWatcher.

PR #3971 (already merged) addresses this from a different angle: it
makes web.enable_javascript_rendering default false, so
AutoHTMLDownloader short-circuits before invoking Playwright. No
subprocess spawned → no pidfd opened. The original motivation for
that PR was the confusing tracebacks reported in #3826; the FD-leak
finding is a second motivation, captured here so a future reader sees
both.

The new bullet sits in Section B (flagged-then-verified-then-fixed)
because the leak was real but is now resolved upstream.

* docs(resource-cleanup): add FD-leak debugging playbook + CI considerations

Add a new "Debugging FD leaks — playbook for the next one" section
between the History (Waves 1-7) and "Intentionally not done" parts of
the doc, capturing the diagnostic flow we developed across Waves 6
and 7 so future contributors don't re-derive it from scratch.

Includes:

- Symptoms that justify treating an issue as an FD leak (OSError 24,
  static-asset MIME errors, High FD count warnings, healthcheck
  hangs).
- Host-side and inside-container snapshot scripts that work even when
  the container is too FD-starved for docker exec (host-side via
  sudo + /proc/$P/fd) and through the entrypoint's UID drop
  (--user 0 to docker exec).
- Lookup table mapping each anon_inode / socket / pipe / REG flavor
  to its likely Python-level source and the path to deep-dive (e.g.
  /proc/PID/fdinfo/N's Pid: line for pidfds).
- A pinpointing recipe per FD type — eventpoll (asyncio/httpx),
  pidfd (asyncio.create_subprocess / multiprocessing.Process),
  WAL/SHM (SQLCipher engine.dispose).
- Pointer to the existing in-codebase instrumentation: _count_open_fds,
  the periodic Resource monitor log, fd_monitor.py, and the
  RUN_MANUAL_SMOKE-gated tests/manual_smoke/test_fd_smoke.py harness.
- Honest discussion of why an automated per-PR FD-growth assertion is
  hard (transient FDs, CI-environment subprocess noise, namespace
  differences, slow-drip leaks needing hours of uptime) and what a
  nightly long-run job would look like if the team chooses to invest
  in one.
- A "which Wave fixed which leak class" reference table so the next
  reporter can recognize a class and skip to the relevant precedent.
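
The lookup-table idea in the list above can be sketched in a few lines:
read each symlink under ``/proc/<pid>/fd`` and bucket it by flavor.
This is a Linux-only illustration with hypothetical function names; it
is not the snapshot script from the doc.

```python
import os
from collections import Counter


def classify_fd_target(target):
    """Map a /proc/<pid>/fd symlink target to a coarse FD flavor."""
    if target.startswith("anon_inode:"):
        # e.g. "anon_inode:[eventpoll]" -> "eventpoll"
        return target[len("anon_inode:"):].strip("[]")
    if target.startswith("socket:"):
        return "socket"
    if target.startswith("pipe:"):
        return "pipe"
    return "path"


def snapshot_fds(pid="self"):
    """Count open FDs per flavor for one process (requires /proc)."""
    counts = Counter()
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # FD was closed between listdir and readlink
        counts[classify_fd_target(target)] += 1
    return counts
```

Taking two ``snapshot_fds()`` results some minutes apart and
subtracting them shows which flavor is growing, which in turn points at
the matching row of the lookup table.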

No code changes. Pure documentation.

* docs(resource-cleanup): add development-time detection + bpftrace recipes

Extend the FD-leak debugging playbook with two industry-standard
techniques that would have caught Waves 6 and 7 earlier, drawn from
upstream Python docs and the wider production-tracing literature:

1. **bpftrace syscall-level pinpointing** (in the per-FD-type
   section). Trace pidfd_open / epoll_create1 / etc. on the host
   targeting the container's host PID; produces a histogram of every
   user stack that triggered the syscall, ranked by frequency. The
   hot stacks are the culprits. Would have caught the Playwright
   pidfd leak in seconds.

2. **Development-time detection (new subsection 4a)** — catches
   leaks at test time before they ship:
   - PYTHONASYNCIODEBUG=1 + -W default::ResourceWarning. Per the
     asyncio dev docs, unclosed transports emit ResourceWarning at GC
     time; the filter actually displays them. Would have surfaced
     the Wave 7 in-running-loop skip in any test that exercised
     ainvoke + safe_close on ChatOllama.
   - python -X dev for a one-flag local dev mode bundling
     ResourceWarning + asyncio debug + warnings as default.
   - pyproject.toml [tool.pytest.ini_options] examples for both
     "display" and "error" filter modes (with a caveat that error
     mode needs a targeted subset, not the whole suite, because
     third-party libs also emit ResourceWarning).
   - psutil's num_fds / open_files / connections as the
     cross-platform alternative to /proc/self/fd for unit tests on
     macOS dev environments.
   - tracemalloc + objgraph as the next-level tool when a leak is
     reproducible — diff allocations before/after, then render the
     reference chain holding the leaked wrapper alive.
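
The ResourceWarning technique from subsection 4a can be demonstrated
with a deliberately unclosed file. A minimal sketch, assuming CPython
(where dropping the last reference finalizes the object immediately);
the helper name is hypothetical:

```python
import gc
import os
import tempfile
import warnings


def leaks_resource_warning(path):
    """Return True if dropping an unclosed file emits ResourceWarning."""
    with warnings.catch_warnings(record=True) as caught:
        # ResourceWarning is ignored by default; make it visible.
        warnings.simplefilter("always", ResourceWarning)
        handle = open(path, encoding="utf-8")  # deliberately not closed
        del handle
        gc.collect()
    return any(issubclass(w.category, ResourceWarning) for w in caught)


fd, tmp_path = tempfile.mkstemp()
os.close(fd)
try:
    leaked = leaks_resource_warning(tmp_path)
finally:
    os.remove(tmp_path)
```

The same filter is what ``-W default::ResourceWarning`` enables
process-wide; the sketch only scopes it to one block.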

No code changes. The new tooling is recommendations only; no
mandatory pytest config change in this commit. Future work could
enable PYTHONASYNCIODEBUG=1 in the CI test environment if the
overhead is acceptable.

Citations to docs.python.org are inline for the load-bearing
ResourceWarning claim.

* test(fd-canary): pin asyncio.create_subprocess pidfd lifecycle in CI

Add ``TestAsyncioSubprocessFDBaseline`` to
``tests/utilities/test_close_base_llm.py`` with two regression tests
that run on every PR:

1. ``test_no_fd_growth_across_asyncio_subprocess_cycles`` — spawns
   ``/bin/true`` via ``asyncio.create_subprocess_exec`` 10 times and
   asserts total FD count delta ≤ +2. Pins the pidfd FD class against
   the child-watcher leak shape.

2. ``test_no_fd_growth_when_subprocess_fails_to_exec`` — same shape
   but with a deliberately-missing binary, mirroring the *exact*
   Wave-7 production failure mode (Playwright's Node.js driver being
   spawned, kernel returning ENOENT because Chromium wasn't
   installed, child watcher still expected to clean up the pidfd it
   opened *before* the failed exec).

Why this is the right level
---------------------------
LDR's own code does NOT call ``asyncio.create_subprocess_*`` (verified
in R8C1). The production leak came from a transitive dependency
(Playwright). So we cannot test LDR's call sites directly — there are
none. Instead these tests pin the *platform baseline*: on this Python
version, repeated asyncio subprocess cycles must not leak FDs. If a
future Python upgrade, a child-watcher change, or a new direct
asyncio.create_subprocess call in LDR breaks the close semantics, the
next PR's CI fails on these tests — which is the canary signal we
want.

Linux-only via ``sys.platform != "linux"`` skip. pidfd_open is a
Linux syscall; macOS uses a different watcher and Windows uses
ProactorEventLoop. Both 'pass by virtue of nothing to leak', so
restricting to Linux keeps the signal sharp (a failure on Linux is
actionable; a pass on macOS is uninformative).

Same +2 FD slack we use for the eventpoll canary above. A real
1-FD-per-iter leak across 10 iterations would land at delta=10,
well past the threshold.
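
For orientation, the canary shape described above can be sketched
roughly as follows. This is an illustration, not the test from the PR:
the function names are hypothetical, it spawns the current Python
interpreter instead of ``/bin/true``, and it needs ``/proc``, so it is
Linux-only.

```python
import asyncio
import os
import sys


def count_open_fds():
    """Count this process's open FDs via /proc (Linux only)."""
    return len(os.listdir("/proc/self/fd"))


async def _spawn_cycles(cycles):
    for _ in range(cycles):
        # Any short-lived child works for the purpose of the canary.
        proc = await asyncio.create_subprocess_exec(
            sys.executable, "-c", "pass"
        )
        await proc.wait()


def fd_growth_across_cycles(cycles=10):
    """Return the FD-count delta across `cycles` subprocess spawns."""
    # Warm-up run so one-time FDs are already part of the baseline.
    asyncio.run(_spawn_cycles(1))
    before = count_open_fds()
    asyncio.run(_spawn_cycles(cycles))
    return count_open_fds() - before
```

A test would assert ``fd_growth_across_cycles() <= 2`` and skip on
non-Linux platforms, matching the slack discussed above.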

Doc reference
-------------
Updated ``docs/developing/resource-cleanup.md`` "Existing
instrumentation" section to enumerate all four in-CI FD-growth
canaries (two eventpoll, two pidfd) so future contributors see at a
glance what's already guarded and where to extend coverage when a
new leak class is found.
2026-05-16 20:01:04 +02:00
LearningCircuit
3d0b7bb5f9 review: hoist asyncio+threading imports to module level + Wave 7 doc (#4048)
Addresses the AI Code Review nit on #4047: ``import threading`` (and
the sibling ``import asyncio``) lived inside the ``_close_base_llm``
function body. There's no circular-import or optional-dependency
reason to defer them; moving them to the top of the module improves
readability and static analysis.

Also extends ``docs/developing/resource-cleanup.md`` with a Wave 7
entry documenting:

- The in-running-loop ``aclose`` skip bug (this PR's fix).
- The healthcheck ``pidfd`` leak (Dockerfile change in the same PR).
- The three gaps the broader audit during this PR surfaced as
  follow-up rather than in-scope work: ``OllamaEmbeddings`` httpx (same
  FD class as ChatOllama, no close path in langchain wrappers),
  ``auth_db`` / ``journal_quality`` engines escaping
  ``shutdown_databases``, and three RAG SSE endpoints constructing
  ``LibraryRAGService`` before the generator without a ``finally``
  close.

Also captures the negative results from the audit (non-Ollama
providers safe via shared lru_cache, no subprocess pidfd risk, no
raw event-loop creation, all ``open()`` calls inside ``with``) so a
future contributor reading the history sees what was checked and
ruled out.
2026-05-14 22:58:57 +02:00
LearningCircuit
5ede95d3b4 docs(developing): add resource-cleanup.md capturing the FD-leak campaign (#3856)
Adds a single contributor-facing doc that explains how LDR manages
process-level resources (DB sessions, LLM HTTP clients, search engines,
threads) and the reasoning trail behind the current model.

Why this isn't an ADR: ADRs (`docs/decisions/`) are for single-decision
records. This doc is wider — it captures current architecture, a
how-to cookbook, anti-patterns specific to this codebase, the
chronological history of the FD-leak fix campaign (#1832 through #3855),
and the deferred work list with reasoning.

The history section consolidates ~14 weeks of iterative work across 15+
PRs into a single archive so future contributors hitting FD-shaped
issues can see what's been tried, what worked, and what was ruled out
without reconstructing it from `git log`. The "intentionally not done"
section preempts re-discovery of deferred work as missing work.

Related to #3816. The companion code fix is #3855.

Co-authored-by: Daniel Petti <djpetti@gmail.com>
Co-authored-by: r69 <143521130+r69shabh@users.noreply.github.com>
Co-authored-by: Chris Dzombak <chris@dzombak.com>
2026-05-09 10:54:50 +02:00
LearningCircuit
061cd83dd4 feat: add is_lexical flag to auto-enable LLM relevance filtering for keyword-based engines (#3403)
* feat: add needs_reranking flag to auto-enable LLM relevance filtering for keyword-based engines

Engines with poor native relevance ranking (arXiv, PubMed, Wikipedia,
GitHub, Mojeek, etc.) now auto-enable LLM-based result filtering via
a new `needs_reranking` class attribute. This fixes the priority bug
where the global `skip_relevance_filter=True` incorrectly overrode
auto-detection for engines that genuinely need filtering.

Priority is now: per-engine setting > needs_reranking > global skip.
The global skip only affects unclassified engines.
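
The priority order above can be written as a small decision function. A
sketch with hypothetical names; it assumes an unset per-engine setting
is represented as ``None`` and that unclassified engines filter unless
the global skip is on, neither of which is spelled out in this commit:

```python
def should_filter(per_engine_setting, needs_reranking, global_skip):
    """Resolve: per-engine setting > needs_reranking > global skip."""
    if per_engine_setting is not None:
        # An explicit per-engine choice always wins.
        return per_engine_setting
    if needs_reranking:
        # Keyword-based engines filter even when the global skip is on.
        return True
    # Only unclassified engines fall through to the global skip.
    return not global_skip
```

For example, ``should_filter(None, True, True)`` returns ``True``: the
engine flag beats the global skip, which is the priority bug this
commit fixes.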

Closes #2297

* fix: address 7 code-review issues on needs_reranking branch

1. Rename needs_reranking → needs_llm_relevance_filter for consistency
   with enable_llm_relevance_filter and skip_relevance_filter naming
2. Fix Paperless dead code: replace non-existent _apply_content_filters
   with proper _filter_for_relevance() call in custom run() override
3. Fix misleading skip_relevance_filter description to accurately
   reflect checkbox behavior and keyword engine exceptions
4. Delete 4 vacuously-true inline tests that duplicated factory logic
   instead of calling the real factory (coverage tests already exist)
5. Add needs_llm_relevance_filter to EXTENDING.md and OVERVIEW.md
6. Clarify is_generic comment: generic does not imply good ranking
7. Upgrade no-LLM log from debug to warning when filtering was
   requested but no LLM is available (with should_filter guard)

* fix: remove Paperless fallback that overrode valid empty LLM filter results

Replace the fallback that restored all previews when the LLM filter
returned empty with an info log. The base class _filter_for_relevance()
already handles errors internally (returns previews[:5] on exception
or JSON parse failure). An empty result means the LLM legitimately
found nothing relevant — trust it, don't override it.

* refactor: rename needs_llm_relevance_filter → is_lexical

The flag describes what the engine IS (lexical/keyword-based search)
rather than what it needs. This is a general classification that can
drive multiple behaviors beyond just the relevance filter — e.g.
query optimization strategies, result deduplication, or UI hints.
Matches the existing is_* naming pattern (is_scientific, is_generic).

* Revert "refactor: rename needs_llm_relevance_filter → is_lexical"

This reverts commit c322d478a1.

* Reapply "refactor: rename needs_llm_relevance_filter → is_lexical"

This reverts commit 853dfe90bd.

* feat: add is_lexical classification flag alongside needs_llm_relevance_filter

Separates classification from behavior:
- is_lexical: informational flag indicating the engine uses keyword/lexical
  search. Reusable for query optimization, UI hints, deduplication, etc.
- needs_llm_relevance_filter: behavioral flag that the factory reads to
  auto-enable LLM relevance filtering on the engine instance.

Both flags are set on all 15 keyword-based engines. The factory only
checks needs_llm_relevance_filter for filtering decisions.

* fix: improve relevance filter error handling and logging

- Return [] on all error paths instead of hiding failures behind
  previews[:5] fallback — failures should be visible, not masked
- Log errors at error level (not warning) for LLM parse failures
- Add engine name prefix to all log messages for traceability
- Add token estimate debug log to help diagnose context overflow
- Reduce log noise: routine operations are debug, only summary is info
- Consolidate validation into single check

* fix: address PR review findings for relevance filter

- Fix literal \n in EXTENDING.md code block
- Remove 'Maximum results to return' from LLM prompt (LLM decides)
- Add INPUT/KEPT/REMOVED debug logging for filter quality analysis
- Add is_lexical + needs_llm_relevance_filter to ElasticsearchSearchEngine
- Delete vacuously-true test_missing_llm_returns_none test
- Downgrade no-op skip_relevance_filter log from info to debug

* refactor: extract relevance filter into dedicated module

Pull the inline _filter_for_relevance() logic out of BaseSearchEngine
into a new web_search_engines/relevance_filter.py module.

- Use with_structured_output() with Pydantic schema; let LangChain
  pick the per-provider default method (JSON schema on Ollama,
  tool-calling on Anthropic, responseSchema on Gemini).
- Trim prompt: drop URLs, cap snippets at 200 chars.
- Suppress reasoning on Ollama thinking-by-default models via
  reasoning=False — saves 30-60s per call on qwen3 dense variants.
- Treat empty LLM responses as valid judgments; log a warning on
  batches >2 so users notice a misbehaving model.
- On exception or parse failure, return first N previews (cap=5 or
  max_filtered_results) to avoid overwhelming downstream.

* refactor(relevance_filter): cleanup + add direct tests

* feat(relevance_filter): batch previews in parallel for speed and reliability

Adds two tunable parameters to the LLM relevance filter:

- batch_size: split previews into chunks before sending to the LLM.
  Each batch uses local indices [0..batch_size-1] mapped back to
  global. Default 10. Smaller batches are faster per call AND more
  reliable on weaker models that struggle with many indices in one
  context.

- max_parallel_batches: dispatch batches concurrently via a
  ThreadPoolExecutor. Default 4. Result order is preserved across
  parallel batches.

Both exposed as BaseSearchEngine class attributes
(relevance_filter_batch_size, relevance_filter_max_parallel_batches)
so individual engines can override.

Failure semantics:
- Hard exception on any batch -> capped slice fallback (unchanged).
- Parse failure on a single batch -> skip that batch only, keep
  results from successful batches.
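
The chunking, local-to-global index mapping, and per-batch skip
described above can be sketched as follows. ``judge_batch`` is a
hypothetical callable standing in for the LLM call; it returns local
indices to keep, or ``None`` on a parse failure. The hard-exception
fallback is not modeled here.

```python
from concurrent.futures import ThreadPoolExecutor


def filter_in_batches(previews, judge_batch, batch_size=10,
                      max_parallel_batches=4):
    """Filter previews in parallel batches, preserving global order."""
    if not previews:
        return []
    size = batch_size or len(previews)  # batch_size=None -> single call
    starts = list(range(0, len(previews), size))
    batches = [previews[s:s + size] for s in starts]
    with ThreadPoolExecutor(max_workers=max_parallel_batches) as pool:
        # map() yields results in submission order, not completion order.
        results = list(pool.map(judge_batch, batches))
    kept = []
    for start, batch, local in zip(starts, batches, results):
        if local is None:
            continue  # parse failure: skip this batch only
        for i in sorted(set(local)):
            if 0 <= i < len(batch):
                kept.append(previews[start + i])  # local -> global index
    return kept


def keep_first_unless_c(batch):
    # Toy judge: fail to parse any batch containing "c", else keep index 0.
    return None if "c" in batch else [0]


kept = filter_in_batches(list("abcdefg"), keep_first_unless_c, batch_size=2)
```

With ``batch_size=2`` the batches are ``ab``, ``cd``, ``ef``, ``g``;
the second is skipped and the first item of each remaining batch is
kept.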

Adds 4 direct unit tests covering chunk/index mapping, batch_size=None
single-call mode, failed-batch-skip-keeps-others, and parallel dispatch
order preservation. All 120 tests pass.

* refactor(relevance_filter): drop structured output, parse plain text

The Pydantic with_structured_output() path had several issues:
- qwen3 dense models returned prose instead of JSON, raising
  OutputParserException and disabling the filter for that call
- grammar-constrained output on Ollama was 6-10x slower than plain
  text generation (~24s vs ~4s for 50 previews)
- per-provider quirks (function_calling latency, schema bikeshedding)

Switch to plain llm.invoke() and parse integers from the response with
a tightened regex (word-boundary, no decimal fractions). The prompt
now instructs the model to output ONLY the indices, which combined
with the regex is robust against prose-injection of small numbers.
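
One way such a tightened regex could look is sketched below. The
pattern and function name are assumptions for illustration; the actual
expression used in the module is not shown in this commit message.

```python
import re

# Integers that are not glued to a word and not part of a decimal.
_INDEX_RE = re.compile(r"(?<![\w.])\d+(?!\w|\.\d)")


def parse_indices(text, n_items):
    """Extract unique in-range indices from a plain-text LLM reply."""
    seen, indices = set(), []
    for match in _INDEX_RE.finditer(text):
        idx = int(match.group())
        if idx < n_items and idx not in seen:
            seen.add(idx)
            indices.append(idx)
    return indices
```

For a reply such as ``"Keep 1 and 3 (score 0.75), not item12."`` with
five items, this yields ``[1, 3]``: the decimal and the word-attached
number are ignored, and out-of-range values are dropped by the bounds
check.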

Removes RelevanceResult Pydantic class, _invoke_structured, the
_BATCH_FAILED_PARSE sentinel, and the "all batches failed" branch
(all dead under the new contract). Updates tests to mock llm.invoke
directly. Tightens default batch_size to 5 and parallel batches to 10
based on benchmark runs against Ollama.

* docs: fix stale _filter_for_relevance docstring after text-parsing rewrite
2026-04-06 23:04:47 +02:00
LearningCircuit
05b96fbe3f refactor: move engine module paths from settings DB to hardcoded registry (#2843)
* refactor: move engine module paths from settings DB to hardcoded registry

Engine implementation details (module_path, class_name, full_search_module,
full_search_class) are internal wiring, not user configuration. Storing them
in the settings DB created a security attack surface requiring blocklist
validation and route blocking.

Changes:
- New engine_registry.py with frozen dataclass entries for all 24 engines
- search_engines_config.py injects registry data after loading DB settings
- search_engine_factory.py passes engine_config to full search wrapper
- Remove ~52 module/class entries from 9 JSON defaults files
- Remove BLOCKED_SETTING_PATTERNS, is_blocked_setting(), and 4 call sites
- Remove absolute→relative normalization from module_whitelist.py
- Update docs, tests, and golden master
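
The registry approach above can be sketched with a frozen dataclass and
an injection step. The four field names come from this commit message;
the entry, module path, class name, and ``inject_registry`` helper are
hypothetical.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Optional


@dataclass(frozen=True)
class EngineEntry:
    module_path: str
    class_name: str
    full_search_module: Optional[str] = None
    full_search_class: Optional[str] = None


# Hypothetical entry; real paths and class names are not shown here.
ENGINE_REGISTRY = {
    "example": EngineEntry(".engines.example_engine", "ExampleSearchEngine"),
}


def inject_registry(engine_name, config):
    """Overlay hardcoded wiring on a config loaded from settings."""
    merged = dict(config)
    entry = ENGINE_REGISTRY.get(engine_name)
    if entry is None:
        return merged
    # Registry values win over anything that came from the settings DB.
    merged["module_path"] = entry.module_path
    merged["class_name"] = entry.class_name
    if entry.full_search_module and entry.full_search_class:
        merged["full_search_module"] = entry.full_search_module
        merged["full_search_class"] = entry.full_search_class
    return merged


config = inject_registry(
    "example", {"module_path": "untrusted.module", "max_results": 10}
)
```

Because the wiring is overwritten from a frozen, in-code table, a value
stored in user settings cannot redirect which module gets imported.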

* fix: remove TestGetBlockedSettingsError that references removed function

The get_blocked_settings_error() function was removed as part of the
engine registry refactor. This test class was added on main after the
PR was created and wasn't caught by conflict resolution.

* fix: remove TestSaveSettingsPostBlockedSetting that tests removed blocking logic

BLOCKED_SETTING_PATTERNS and is_blocked_setting() were removed as part of
the engine registry refactor. This test was added on main and references
the now-removed blocking behavior.

* fix: inject ENGINE_REGISTRY into parallel/meta engine _get_search_config()

Both ParallelSearchEngine and MetaSearchEngine manually extract config
from settings_snapshot without going through search_config(). Since
module_path/class_name are no longer in the settings DB (they live in
the hardcoded registry), these engines would silently fail to discover
sub-engines on fresh installations.

Fix: inject ENGINE_REGISTRY values after extraction, matching the
pattern used in search_config().

Also fixes MetaSearchEngine's stale check for
"search.engine.auto.class_name" in settings_snapshot — this key no
longer exists in settings DB, so auto engine config would be skipped.

* fix: update tests for engine registry refactor

- test_whitelist_config_consistency: check ENGINE_REGISTRY instead of
  JSON defaults (module_path/class_name no longer in defaults)
- test_meta_search_engine_high_value: expect registry-injected
  module_path/class_name in _get_search_config() output
- test_meta_search_engine_extended: registry overwrites snapshot values
- test_settings_routes_coverage: remove blocked setting tests (blocking
  logic removed — registry is now the security mechanism)
- test_settings_routes_deep_coverage2: same as above

* fix: add 5 missing engines to registry, strip module_path from their settings

Add gutenberg, openlibrary, pubchem, stackexchange, and zenodo to
ENGINE_REGISTRY (were added to main in #1540 after this branch diverged).

Remove module_path/class_name from their settings JSON files and golden
master, matching the pattern established for all other engines.

Expand test_engine_registry.py to scan per-engine settings_*.json files
and verify no settings files still contain module_path/class_name.
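
A check of this kind could look roughly like the sketch below. It
assumes flat, dot-separated setting keys (as in
``search.engine.auto.class_name``) and uses a hypothetical helper name;
the real test additionally loads the ``settings_*.json`` files from
disk.

```python
WIRING_FIELDS = (
    "module_path",
    "class_name",
    "full_search_module",
    "full_search_class",
)


def find_wiring_keys(settings):
    """Return setting keys whose last dotted component is a wiring field."""
    return sorted(
        key for key in settings if key.rsplit(".", 1)[-1] in WIRING_FIELDS
    )


offenders = find_wiring_keys({
    "search.engine.auto.class_name": "SomeEngine",
    "search.engine.auto.display_name": "Auto",
})
```

An empty result for every settings file is the passing condition; any
returned key names a leftover entry that should move to the registry.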

* fix: inject full_search_module/class in meta/parallel engine _get_search_config()

The registry injection in MetaSearchEngine and ParallelSearchEngine was
missing full_search_module and full_search_class fields, making it
inconsistent with the main search_config() injection. This would cause
full-search wrappers to fail when created through meta/parallel engines.

* fix: resolve pre-commit formatting issues and sync pdm.lock after merge with main
2026-03-20 20:22:10 +01:00
LearningCircuit
890c84e534 docs: link auto-generated Configuration Reference across docs & fix stale env var docs (#2472)
- Add "Config Reference" link to Settings page "Learn & Get Help" bar
- Overhaul docs/env_configuration.md: remove stale Dynaconf references,
  fix wrong double-underscore env var format, remove documented-as-fixed
  bug, replace duplicate tables with links to CONFIGURATION.md
- Fix broken case-sensitive link in docs/deployment/unraid.md
- Add CONFIGURATION.md cross-references to 12 docs' "See Also" sections
- Update .env.template with correct LDR_-prefixed variable names
- Add config reference comment to docker-compose.yml environment block
2026-02-28 13:46:34 +01:00
LearningCircuit
465b0f3e9e docs: Add architecture, extension guide, and troubleshooting documentation
Add comprehensive documentation for contributors and users:

- docs/architecture/OVERVIEW.md: System architecture with Mermaid diagrams
  covering components, research flow, threading model, and configuration
- docs/architecture/DATABASE_SCHEMA.md: Complete database schema with ER
  diagram documenting all 40+ models
- docs/developing/EXTENDING.md: Extension guide for adding custom search
  engines, strategies, LLM providers, and LangChain retrievers
- docs/troubleshooting.md: Common issues and solutions for LLM, search,
  database, WebSocket, Docker, and API problems
2025-12-26 14:20:23 +01:00