* fix: treat empty environment variables as unset to fix provider selection
When deploying via Docker/Unraid templates, all environment variables
are created even when left blank (e.g. LDR_LLM_ANTHROPIC_API_KEY="").
The check_env_setting() function previously treated these empty strings
as valid overrides, which caused provider settings to be blanked out
and prevented proper provider selection on fresh installs.
Empty env vars are now treated as unset, allowing database defaults to
take effect normally.
Fixes #3339
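A minimal sketch of the empty-means-unset rule described above. The body of check_env_setting() is not shown in this log, so the signature, the `environ` parameter, and the key-to-variable mapping (dots to underscores, `LDR_` prefix) are assumptions for illustration only:

```python
import os


def check_env_setting(key, environ=None):
    """Return the env override for `key`, or None when unset or empty.

    Hypothetical re-implementation: the name mapping below is an assumption.
    """
    environ = os.environ if environ is None else environ
    env_name = "LDR_" + key.upper().replace(".", "_")
    value = environ.get(env_name)
    if value is None:
        return None  # variable not defined at all
    if value == "":
        # Docker/Unraid templates create blank variables; treat them as unset
        # so the caller falls back to the database default.
        return None
    return value
```

With this shape, a caller that receives None simply keeps the database value, so a blank template field no longer wipes out a provider setting.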
* fix(tests): update test to match empty env var behavior
Update test_env_override_empty_string to assert that empty environment
variables are treated as unset (returning DB value) rather than
overriding with empty string. This aligns with the fix for #3339.
* docs: add ecosystem context for empty env var handling decision
Document that treating empty environment variables as unset is standard
practice across major projects (botocore, viper, Turborepo, Go stdlib,
Docker Compose) with references to the PR discussion.
* feat: add warning log for empty env vars, fix references, add tests and docs
- Log warning when empty env vars are detected (helps users diagnose
Unraid/Docker template issues)
- Replace misleading viper/Docker Compose references with CPython
official docs and Pallets/Click PR #2223
- Add unit tests: empty string returns None, warning is logged,
provider/model/multiple keys handled
- Add integration tests: empty string with no DB value, checkbox,
number settings
- Document empty env var behavior in unraid.md, docker-compose-guide.md,
and env_configuration.md
* docs: recommend DISABLED instead of Web UI for blocking settings
Users can set env vars to a non-empty invalid value like "DISABLED"
to explicitly block a key, which is simpler than navigating the UI.
* docs: move detailed installation instructions from README to dedicated pages
README Installation Options section (~200 lines) replaced with a compact
table linking to docs/installation.md (hub page), docs/install-pip.md
(dedicated pip guide), and existing docker-compose and Unraid guides.
No content lost — everything is now in focused doc files.
* docs: trim redundant pip section in installation hub page
The pip section in docs/installation.md duplicated nearly all of the
Quick Install content from docs/install-pip.md. Replace with a brief
summary + single install command + link to the dedicated guide,
consistent with the hub-and-spoke pattern used by the Unraid section.
Addresses review feedback from djpetti on PR #2819.
* docs: restore missing installation info from README migration
- Add NVIDIA Container Toolkit full install commands (Ubuntu/Debian) with
distro note for RHEL/Fedora/Arch to docs/installation.md
- Add GPU docker-compose alias convenience tip
- Add DIY docker-compose configuration guidance (GPU driver, context
length, keep alive, model selection)
- Add Windows PDF export warning (Pango/WeasyPrint) to docs/install-pip.md
- Fix SQLCipher wording: pre-built wheels available, not "requires
system-level libraries"
- Restore ldr-web command instead of python -m invocation
* docs: follow-up polish for installation docs migration
- Restructure README Quick Start with clear Option 1/2/3 labels
- Update deprecated LDR_ALLOW_UNENCRYPTED to LDR_BOOTSTRAP_ALLOW_UNENCRYPTED
- Add "Open http://localhost:5000" to install-pip.md after ldr-web step
- Add back-link from install-pip.md to installation overview
- Add Docker/Docker Compose install prerequisite links to installation.md
- Cross-link NVIDIA toolkit commands from docker-compose-guide to installation.md
- Use double quotes for volume spec in Docker Run for cross-platform compat
* docs: restore original Quick Start ordering (Docker Run first)
* fix: prevent file descriptor exhaustion from dead thread engine accumulation
Three root causes addressed:
1. Dead thread engine accumulation (primary): _thread_engines grows
unboundedly as crashed/terminated threads leave orphaned NullPool
engines. Add cleanup_dead_thread_engines() that sweeps entries for
threads no longer in threading.enumerate(). Integrate via throttled
sweep in teardown_appcontext (every 60s) and periodic sweep in the
queue processor loop (every 6 iterations).
2. Generic downloader stream=True leak (secondary): generic.py used
stream=True but never read or closed the response body, holding
connections open. Removed stream=True since only status_code and
headers are inspected.
3. Docker default 1024 FD limit (contributing): Add nofile ulimit
(65536) to docker-compose.yml so the container has headroom for
WAL mode databases, thread pools, and connection pools.
* fix: address review findings — sweep lock, credential cleanup, flaky test
- Add _sweep_lock to prevent TOCTOU race on _last_sweep_time in
maybe_sweep_dead_engines() (concurrent teardowns could all pass the
interval check)
- Move alive_ids computation inside _thread_engine_lock to prevent
race between snapshot and engine dict mutation
- Sweep dead _thread_credentials (plaintext passwords) alongside engines
in processor_v2.py and app_factory.py teardown
- Fix flaky test_sweeps_after_interval: replace time.sleep(0.15) with
_last_sweep_time backdating
- Add tests for credential sweep and module-level cleanup_dead_threads()
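The throttle-plus-lock pattern from the first bullet might look like the sketch below. The 60 second interval comes from the earlier commit; the `sweep` callback and the injectable `now` argument are assumptions added so the example runs on its own.

```python
import threading
import time

_SWEEP_INTERVAL = 60.0  # seconds between sweeps
_sweep_lock = threading.Lock()
_last_sweep_time = float("-inf")  # guarantees the first call sweeps


def maybe_sweep_dead_engines(sweep, now=None):
    """Run `sweep()` at most once per interval; return True if it ran."""
    global _last_sweep_time
    now = time.monotonic() if now is None else now
    with _sweep_lock:
        # Check and update under one lock so concurrent teardowns cannot
        # all pass the interval check at the same time.
        if now - _last_sweep_time < _SWEEP_INTERVAL:
            return False
        _last_sweep_time = now
    sweep()
    return True
```

A test can then move `_last_sweep_time` (or `now`) backwards or forwards instead of sleeping, which is the idea behind replacing time.sleep(0.15) with backdating.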
* fix: close search engine sessions after research, fix stream=True leak properly
Three improvements to the FD exhaustion fix:
1. generic.py: Restore stream=True (removing it is unsafe — GenericDownloader
handles ALL URLs and would download multi-GB files into memory). Use context
manager instead to ensure the streamed connection is properly closed on all
return paths, preventing socket FD leaks.
2. research_service.py: Add use_search.close() and system.close() in finally
block of run_research_process(). Search engine HTTP sessions (e.g.
SemanticScholar's SafeSession) were never explicitly closed after research,
relying on non-deterministic GC for cleanup.
3. search_system.py + strategies: Add close() method to AdvancedSearchSystem
and BaseSearchStrategy, with overrides in ConstraintParallelStrategy and
ConcurrentDualConfidenceStrategy to shut down persistent ThreadPoolExecutors.
Also adds detailed design comments throughout the codebase documenting:
- Why NullPool engines don't leak FDs (memory leak only)
- Why stream=True must NOT be removed from the diagnostic block
- The dual sweep trigger architecture (request-driven + queue-driven)
- Thread ID recycling limitations
- Search engine lifecycle and cleanup responsibilities
Fixes flaky test_removes_dead_thread_entries by using threading.Barrier to
prevent thread ID recycling during test.
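To illustrate the threading.Barrier technique mentioned for the flaky test: thread idents can be reused once a thread exits, so a test that needs distinct idents has to keep its threads alive at the same time. The helper below is a hypothetical sketch, not the actual test code.

```python
import threading


def run_threads_held_alive(count):
    """Start `count` threads and hold them alive together.

    Returns (idents, all_alive). While every worker is parked on the
    barrier, no ident can be recycled, so the idents are distinct.
    """
    ready = threading.Barrier(count + 1)    # workers + main: all registered
    release = threading.Barrier(count + 1)  # workers + main: allow exit
    idents = []
    idents_lock = threading.Lock()

    def worker():
        with idents_lock:
            idents.append(threading.get_ident())
        ready.wait()
        release.wait()

    threads = [threading.Thread(target=worker) for _ in range(count)]
    for t in threads:
        t.start()
    ready.wait()  # every worker has recorded its ident and is still running
    alive = {t.ident for t in threading.enumerate()}
    all_alive = all(i in alive for i in idents)
    release.wait()
    for t in threads:
        t.join()
    return idents, all_alive
```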
* fix: unregister user from news scheduler on logout
The logout handler never called scheduler.unregister_user(), causing:
- Passwords to persist in scheduler memory for up to 48 hours
- Orphaned APScheduler jobs to keep running after logout
- Orphaned jobs to re-create QueuePool engines (~10 FDs each) after
close_user_database() disposed the original, contributing to FD leaks
Add scheduler unregistration before close_user_database() so running
jobs can finish gracefully while the DB engine is still available.
Add design comment documenting the logout cleanup order.
* test: remove ineffective patch in logout scheduler test
The `routes.get_news_scheduler` patch was ineffective because the logout
handler imports `get_news_scheduler` dynamically inside the function body,
so the name never enters the routes module namespace. The `create=True`
flag masked this by silently creating a new attribute. The real patch on
`subscription_manager.scheduler.get_news_scheduler` is sufficient.
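The patching pitfall described above can be reproduced in isolation. The module names `fake_scheduler` and `fake_routes` below are stand-ins invented for this sketch; only the mechanism (a function-local import bypasses a patch on the importing module) mirrors the commit.

```python
import sys
import types
from unittest import mock

# Stand-in for the module that really defines get_news_scheduler.
scheduler_mod = types.ModuleType("fake_scheduler")
scheduler_mod.get_news_scheduler = lambda: "real"
sys.modules["fake_scheduler"] = scheduler_mod

# Stand-in for the routes module; the import happens inside the function.
routes_mod = types.ModuleType("fake_routes")


def logout():
    from fake_scheduler import get_news_scheduler  # resolved at call time

    return get_news_scheduler()


routes_mod.logout = logout
sys.modules["fake_routes"] = routes_mod

# Ineffective: the name never exists in fake_routes, and create=True hides
# that by silently adding an attribute nobody reads.
with mock.patch("fake_routes.get_news_scheduler", return_value="mock", create=True):
    ineffective = routes_mod.logout()

# Effective: patch the attribute where the local import looks it up.
with mock.patch("fake_scheduler.get_news_scheduler", return_value="mock"):
    effective = routes_mod.logout()
```

Here `ineffective` is still the real function's result, while `effective` comes from the mock.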
* fix: remove nofile ulimit override from docker-compose.yml
Docker containers inherit ulimits from the Docker daemon, which typically
runs with LimitNOFILE=infinity (1073741816+). Setting nofile to 65536
could actually *lower* the limit for most users, hurting large
installations. The FD leak root causes are already fixed in this PR
(dead-thread engine sweep, session close, scheduler unregister), so the
safety net is unnecessary. Let users and their Docker daemon config
control this.
* fix: add try-except to strategy executor shutdown, elevate scheduler unregister log level
- Wrap executor.shutdown(wait=False) in try-except in strategy close()
methods for consistency with parallel_search_engine.py pattern
- Change logger.debug → logger.warning for scheduler unregister failure
on logout, since failure means password stays in scheduler memory
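A rough sketch of the guarded shutdown from the first bullet. `ExampleParallelStrategy` is a hypothetical class, not one of the project's strategies, and making close() idempotent is an assumption of this example rather than something the commit states.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

logger = logging.getLogger(__name__)


class ExampleParallelStrategy:
    """Hypothetical strategy that owns a persistent thread pool."""

    def __init__(self, max_workers=4):
        self.executor = ThreadPoolExecutor(max_workers=max_workers)

    def close(self):
        # Detach first so a second close() call is a no-op.
        executor, self.executor = self.executor, None
        if executor is None:
            return
        try:
            # Non-blocking: do not stall the caller on in-flight tasks.
            executor.shutdown(wait=False)
        except Exception:
            logger.warning("Executor shutdown failed", exc_info=True)
```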
* docs: add comments explaining non-obvious design decisions from deep review
- SQLCipher WAL FD cost (1-3 FDs per connection, multiplied by users)
- Logout cleanup ordering: why unregister before close, known race window
- shutdown(wait=False): why non-blocking, safety via double-cleanup pattern
* docs: add thread lifecycle, FD budget, and resource exhaustion documentation
Knowledge captured from PR #2591 deep review (5 rounds of verification):
- architecture.md: Thread & Resource Lifecycle section with cleanup layers,
mermaid diagram, FD budget table, and key files reference
- troubleshooting.md: Resource Exhaustion section with diagnosis commands
and solutions for FD exhaustion
- docker-compose-guide.md: Resource Limits note explaining nofile/memlock
- web/database/README.md: Thread Safety & Connection Model section
- Cross-references added between all 4 docs
- Updated Areas for Improvement (container optimization → resource observability)
- Added encrypted_db.py and thread_local_session.py to Key Source Files
* refactor: remove deprecated settings-based local search engines
The old settings-based local engines (research_papers, project_docs,
personal_notes, local_all) are fully superseded by the database-backed
Collection system with CollectionSearchEngine and LibraryRAGSearchEngine.
- Delete LocalAllSearchEngine and LocalSearchEngine classes
- Remove 58 settings entries from default_settings.json
- Remove local engine registration from search_engines_config.py
- Remove local_search_engines() function
- Clean up LocalEmbeddingManager: remove 14 dead methods and unused attrs
- Remove Docker volume mounts for local_collections
- Update security whitelist, rate limiter, bearer config
- Remove dead force_reindex code path in research_functions.py
- Update docs to reference Collections UI
- Remove/update all associated tests
- Regenerate golden master settings
* fix: address review comments from djpetti
- Revert unintentional formatting change in theme options (keep compact inline format)
- Restore unicode arrow character (→) that was escaped to \u2192 by JSON serializer
- Rename search_engine_local.py → local_embedding_manager.py since it only contains
LocalEmbeddingManager now (no search engines)
- Remove unused chunk_size, chunk_overlap, cache_dir params from LocalEmbeddingManager
- Update all imports and references across codebase
- Add "Config Reference" link to Settings page "Learn & Get Help" bar
- Overhaul docs/env_configuration.md: remove stale Dynaconf references,
fix wrong double-underscore env var format, remove documented-as-fixed
bug, replace duplicate tables with links to CONFIGURATION.md
- Fix broken case-sensitive link in docs/deployment/unraid.md
- Add CONFIGURATION.md cross-references to 12 docs' "See Also" sections
- Update .env.template with correct LDR_-prefixed variable names
- Add config reference comment to docker-compose.yml environment block
* security: make allow_registrations non-editable, env-var-only
Set editable=false and visible=false so users cannot toggle registration
through the UI. The setting is still controllable via the
LDR_APP_ALLOW_REGISTRATIONS environment variable.
* security: update allow_registrations description and keep it visible for user awareness
* security: enforce editable flag on bulk settings save endpoints
The `editable: false` flag was only checked by the individual PUT
/settings/api/<key> endpoint. Both bulk save endpoints
(save_all_settings and save_settings) bypassed it, allowing any
authenticated user to modify non-editable settings like
allow_registrations via crafted requests.
Add editable filtering to both bulk endpoints so non-editable settings
are silently skipped (with a warning log), matching the UI behavior
where these fields are never rendered as inputs.
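The filtering step for the bulk endpoints could be sketched like this. The function name, the `editable_flags` mapping, and the choice to treat unknown keys as editable are all assumptions for illustration; the real endpoints read the flag from stored settings.

```python
import logging

logger = logging.getLogger(__name__)


def filter_editable_settings(submitted, editable_flags):
    """Drop submitted keys whose setting is marked editable=False.

    `editable_flags` maps setting key -> bool. Keys missing from the map
    are treated as editable here, which is an assumption of this sketch.
    """
    accepted = {}
    for key, value in submitted.items():
        if not editable_flags.get(key, True):
            # Skip silently from the client's point of view, but leave a trace.
            logger.warning("Skipping non-editable setting: %s", key)
            continue
        accepted[key] = value
    return accepted
```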
* security: harden registration protection — DELETE check, env var warning, docs
Three follow-up hardening fixes from security review:
1. Add editable check to DELETE /settings/api/<key> endpoint — previously
only checked is_blocked_setting(), allowing deletion of non-editable
settings which could reset them to permissive defaults.
2. Warn on unrecognized LDR_APP_ALLOW_REGISTRATIONS values — parse_boolean
uses HTML checkbox semantics where any non-empty non-falsy string is
True. Values like "disabled" or "none" silently enable registrations.
Now logs a clear warning with accepted values.
3. Document LDR_APP_ALLOW_REGISTRATIONS in docker-compose.yml and the
Docker Compose guide so operators deploying publicly can discover it.
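For item 2, a minimal sketch of the recognized-value check. The accepted spellings below are an assumption (the log does not list them), and the helper name is hypothetical; it only reports whether a value is recognized and does not reproduce parse_boolean itself.

```python
import logging

logger = logging.getLogger(__name__)

# Assumed spellings; the project's actual accepted list is not shown here.
_TRUE_VALUES = frozenset({"true", "1", "yes", "on"})
_FALSE_VALUES = frozenset({"false", "0", "no", "off"})


def is_recognized_boolean(raw, env_name="LDR_APP_ALLOW_REGISTRATIONS"):
    """Return True for a recognized boolean spelling, else warn and return False."""
    normalized = raw.strip().lower()
    if normalized in _TRUE_VALUES or normalized in _FALSE_VALUES:
        return True
    logger.warning(
        "Unrecognized value %r for %s; accepted values: %s",
        raw,
        env_name,
        ", ".join(sorted(_TRUE_VALUES | _FALSE_VALUES)),
    )
    return False
```

The point of the warning is that a value such as "disabled" is not in the falsy set, so under checkbox semantics it would otherwise be read as enabled without any notice.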
* docs: fix inaccuracies in docker-compose-guide.md
- Use LDR_LLM_MODEL instead of non-existent MODEL env var
- Add /v1 suffix to LM Studio URL for OpenAI-compatible API
- Remove port 8080 from troubleshooting (SearXNG is internal-only)
* docs: add warning about env variable hard overrides
Environment variables cause settings to become read-only in the UI.
Users should prefer the web UI for settings they may want to change.
Create the Docker Compose guide that was referenced in README.md but
didn't exist. The guide consolidates Docker Compose setup information
including quick start commands, configuration options, Cookie Cutter
approach, and troubleshooting tips.
Fixes #1817