docs: thread lifecycle, FD budget, and resource exhaustion (#2605)

* fix: prevent file descriptor exhaustion from dead thread engine accumulation

Three root causes addressed:

1. Dead thread engine accumulation (primary): _thread_engines grows unboundedly as crashed/terminated threads leave orphaned NullPool engines. Add cleanup_dead_thread_engines() that sweeps entries for threads no longer in threading.enumerate(). Integrate via throttled sweep in teardown_appcontext (every 60s) and periodic sweep in the queue processor loop (every 6 iterations).
2. Generic downloader stream=True leak (secondary): generic.py used stream=True but never read or closed the response body, holding connections open. Removed stream=True since only status_code and headers are inspected.
3. Docker default 1024 FD limit (contributing): Add nofile ulimit (65536) to docker-compose.yml so the container has headroom for WAL mode databases, thread pools, and connection pools.

* fix: address review findings: sweep lock, credential cleanup, flaky test

- Add _sweep_lock to prevent TOCTOU race on _last_sweep_time in maybe_sweep_dead_engines() (concurrent teardowns could all pass the interval check)
- Move alive_ids computation inside _thread_engine_lock to prevent race between snapshot and engine dict mutation
- Sweep dead _thread_credentials (plaintext passwords) alongside engines in processor_v2.py and app_factory.py teardown
- Fix flaky test_sweeps_after_interval: replace time.sleep(0.15) with _last_sweep_time backdating
- Add tests for credential sweep and module-level cleanup_dead_threads()

* fix: close search engine sessions after research, fix stream=True leak properly

Three improvements to the FD exhaustion fix:

1. generic.py: Restore stream=True (removing it is unsafe: GenericDownloader handles ALL URLs and would download multi-GB files into memory). Use context manager instead to ensure the streamed connection is properly closed on all return paths, preventing socket FD leaks.
2. research_service.py: Add use_search.close() and system.close() in finally block of run_research_process(). Search engine HTTP sessions (e.g. SemanticScholar's SafeSession) were never explicitly closed after research, relying on non-deterministic GC for cleanup.
3. search_system.py + strategies: Add close() method to AdvancedSearchSystem and BaseSearchStrategy, with overrides in ConstraintParallelStrategy and ConcurrentDualConfidenceStrategy to shut down persistent ThreadPoolExecutors.

Also adds detailed design comments throughout the codebase documenting:

- Why NullPool engines don't leak FDs (memory leak only)
- Why stream=True must NOT be removed from the diagnostic block
- The dual sweep trigger architecture (request-driven + queue-driven)
- Thread ID recycling limitations
- Search engine lifecycle and cleanup responsibilities

Fixes flaky test_removes_dead_thread_entries by using threading.Barrier to prevent thread ID recycling during test.

* fix: unregister user from news scheduler on logout

The logout handler never called scheduler.unregister_user(), causing:

- Passwords to persist in scheduler memory for up to 48 hours
- Orphaned APScheduler jobs to keep running after logout
- Orphaned jobs to re-create QueuePool engines (~10 FDs each) after close_user_database() disposed the original, contributing to FD leaks

Add scheduler unregistration before close_user_database() so running jobs can finish gracefully while the DB engine is still available. Add design comment documenting the logout cleanup order.

* test: remove ineffective patch in logout scheduler test

The `routes.get_news_scheduler` patch was ineffective because the logout handler imports `get_news_scheduler` dynamically inside the function body, so the name never enters the routes module namespace. The `create=True` flag masked this by silently creating a new attribute. The real patch on `subscription_manager.scheduler.get_news_scheduler` is sufficient.

* fix: remove nofile ulimit override from docker-compose.yml

Docker containers inherit ulimits from the Docker daemon, which typically runs with LimitNOFILE=infinity (1073741816+). Setting nofile to 65536 could actually *lower* the limit for most users, hurting large installations. The FD leak root causes are already fixed in this PR (dead-thread engine sweep, session close, scheduler unregister), so the safety net is unnecessary. Let users and their Docker daemon config control this.

* fix: add try-except to strategy executor shutdown, elevate scheduler unregister log level

- Wrap executor.shutdown(wait=False) in try-except in strategy close() methods for consistency with parallel_search_engine.py pattern
- Change logger.debug → logger.warning for scheduler unregister failure on logout, since failure means password stays in scheduler memory

* docs: add comments explaining non-obvious design decisions from deep review

- SQLCipher WAL FD cost (1-3 FDs per connection, multiplied by users)
- Logout cleanup ordering: why unregister before close, known race window
- shutdown(wait=False): why non-blocking, safety via double-cleanup pattern

* docs: add thread lifecycle, FD budget, and resource exhaustion documentation

Knowledge captured from PR #2591 deep review (5 rounds of verification):

- architecture.md: Thread & Resource Lifecycle section with cleanup layers, mermaid diagram, FD budget table, and key files reference
- troubleshooting.md: Resource Exhaustion section with diagnosis commands and solutions for FD exhaustion
- docker-compose-guide.md: Resource Limits note explaining nofile/memlock
- web/database/README.md: Thread Safety & Connection Model section
- Cross-references added between all 4 docs
- Updated Areas for Improvement (container optimization → resource observability)
- Added encrypted_db.py and thread_local_session.py to Key Source Files
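
The throttled dead-thread sweep described in the first two commits can be illustrated with a minimal sketch. It is not the project's code: the attribute and method names `_thread_engines`, `_thread_engine_lock`, `_sweep_lock`, `_last_sweep_time`, `cleanup_dead_thread_engines()` and `maybe_sweep_dead_engines()` come from the message above, while the `ThreadEngineRegistry` class, its `register()` method and `FakeEngine` are hypothetical stand-ins introduced only for this example.

```python
import threading
import time


class FakeEngine:
    """Hypothetical stand-in for a NullPool engine; only dispose() matters here."""

    def __init__(self):
        self.disposed = False

    def dispose(self):
        self.disposed = True


class ThreadEngineRegistry:
    """Illustrative registry of per-thread engines (assumed structure)."""

    def __init__(self, sweep_interval=60.0):
        self._thread_engines = {}  # thread ident -> engine
        self._thread_engine_lock = threading.Lock()
        self._sweep_lock = threading.Lock()
        # -inf guarantees that the very first call performs a sweep.
        self._last_sweep_time = float("-inf")
        self._sweep_interval = sweep_interval

    def register(self, engine):
        """Record an engine under the calling thread's ident."""
        with self._thread_engine_lock:
            self._thread_engines[threading.get_ident()] = engine

    def cleanup_dead_thread_engines(self):
        """Dispose engines whose owning thread is no longer alive."""
        with self._thread_engine_lock:
            # Snapshot live idents under the same lock that guards the dict,
            # so the snapshot and the mutation cannot interleave.
            # Caveat: thread idents can be recycled, so an entry whose ident
            # was reused by a newer thread is not detected as dead.
            alive_ids = {t.ident for t in threading.enumerate()}
            dead_ids = [tid for tid in self._thread_engines if tid not in alive_ids]
            engines = [self._thread_engines.pop(tid) for tid in dead_ids]
        for engine in engines:
            engine.dispose()
        return len(engines)

    def maybe_sweep_dead_engines(self):
        """Run the sweep at most once per interval, even with concurrent callers."""
        with self._sweep_lock:
            now = time.monotonic()
            if now - self._last_sweep_time < self._sweep_interval:
                return 0
            # Check and update happen under one lock, closing the TOCTOU window.
            self._last_sweep_time = now
        return self.cleanup_dead_thread_engines()
```

In this sketch a request teardown hook and a queue loop could both call `maybe_sweep_dead_engines()`; whichever caller arrives first after the interval does the work and the others return immediately.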
2026-06-16 03:51:07 +03:00 · 2026-03-08 16:22:17 +01:00
parent 8d32f5f9e3
commit 0b23d58e85
4 changed files with 138 additions and 1 deletions
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -461,6 +461,79 @@ flowchart TB
 | **Progress Streaming** | SocketIO for real-time UI updates |
 | **Cross-Engine Filtering** | LLM-powered relevance scoring and deduplication |

+#### Thread & Resource Lifecycle
+
+Unlike typical web apps that share a single database connection pool, this application maintains **separate database engines per user** because each user has their own encrypted SQLite file with a unique SQLCipher key ([SQLCipher](https://www.zetetic.net/sqlcipher/) is an encrypted extension of SQLite). This creates a threading challenge: every background thread needs its own engine for the user it serves.
+
+Two SQLAlchemy pooling strategies coexist:
+
+- **QueuePool** — one per user, shared across Flask request-handler threads (`pool_size=10`, `max_overflow=30`). Maintains a fixed-size pool of reusable connections that persist between requests.
+- **NullPool** — per-thread engines for background work (research workers, scheduler jobs). Creates a new connection per checkout and closes it on return (no pooling). The SQLAlchemy `Session` using this engine is held for the background thread's lifetime.
+
+##### Resource Cleanup Layers
+
+Cleanup follows a defense-in-depth design with several overlapping layers:
+
+| Layer | Trigger | What it cleans |
+|-------|---------|----------------|
+| `@thread_cleanup` decorator | Thread function exit (normal or exception) | DB session, thread engine, settings context, search context |
+| `finally` blocks | Per-research in `run_research_process()` | Search engine HTTP sessions, strategy thread pool executors |
+| `teardown_appcontext` | After each HTTP request | QueuePool session and thread-local session; also triggers the dead-thread engine sweep (rate-limited to once per 60s) and the credential sweep (every request) |
+| Dead-thread engine sweep | Every ~60s (rate-limited) | Disposes NullPool engines for threads no longer in `threading.enumerate()` |
+| Logout cascade | User logout | Scheduler unregister (removes password) → DB close (`engine.dispose()`) → session destroy |
+| Stale session cleanup | `before_request` (~1% of requests, sampled) | Clears Flask sessions for users whose DB connection is gone |
+
+##### Research Thread Lifecycle
+
+```mermaid
+flowchart TD
+    A["POST /api/start_research"] --> B{"Slots available?"}
+    B -- "Yes (direct)" --> C["start_research_process()"]
+    B -- "No (queued)" --> D["Database queue"]
+    D --> E["_process_queue_loop<br/>(daemon thread, 10s poll)"]
+    E --> C
+
+    C --> F["Research thread<br/>(daemon, semaphore-gated)"]
+    F --> G["@thread_cleanup wraps<br/>run_research_process()"]
+
+    G --> H["Strategy.find_relevant_information()"]
+    H --> I["ThreadPoolExecutor sub-tasks<br/>(within strategies only)"]
+    I --> J["Worker @thread_cleanup:<br/>close worker DB session"]
+
+    J --> K["finally: search_engine.close()<br/>system.close() → strategy.close()"]
+    K --> L["Main @thread_cleanup:<br/>close DB session, dispose engine"]
+
+    subgraph sweep ["Defense-in-depth (throttled to ~60s, request- and queue-driven)"]
+        M["Dead-thread sweep"] -.-> N["Dispose engines for<br/>dead thread IDs"]
+    end
+
+    style sweep fill:#f0f0f0,stroke:#999
+```
+
+##### FD Budget
+
+Each SQLCipher connection in WAL (Write-Ahead Logging) mode uses **2 file descriptors** (main db + WAL file). All connections to the same database within a process share **1 SHM** (shared-memory) file descriptor. The formula per user database is: `connections × 2 + 1`.
+
+| Component | FD Formula | With defaults |
+|-----------|-----------|---------------|
+| QueuePool (steady state) | `logged_in_users × (pool_size × 2 + 1)` | `users × 21` FDs |
+| QueuePool (peak) | `logged_in_users × ((pool_size + max_overflow) × 2 + 1)` | `users × 81` FDs |
+| NullPool (background) | Transient — held for session lifetime only | Varies |
+
+The default soft `nofile` limit on bare-metal Linux is 1024, which is tight for multi-user deployments: with the default pool settings, 13 users at peak (13 × 81 = 1053 FDs) already exceed it. Docker's daemon default (typically 1M+) is adequate. QueuePool engines are created at login and disposed at logout, so only logged-in users consume FDs.
+
+For more on diagnosing FD exhaustion, see [Troubleshooting - Resource Exhaustion](./troubleshooting.md#resource-exhaustion).
+
+##### Key Files
+
+| File | Role |
+|------|------|
+| `database/encrypted_db.py` | `DatabaseManager`, engine lifecycle, dead-thread sweep |
+| `database/thread_local_session.py` | `@thread_cleanup` decorator, thread-local sessions, credential cleanup |
+| `web/app_factory.py` | `teardown_appcontext` handler, cleanup orchestration |
+| `web/services/research_service.py` | Research thread creation, `run_research_process()` |
+| `web/queue/processor_v2.py` | Queue processing, dead-thread sweep trigger |
+
 ### Areas for Improvement

 While the project scores highly overall, these areas have room for growth:
@@ -468,7 +541,7 @@ While the project scores highly overall, these areas have room for growth:
 1. **Integration Testing** - More end-to-end tests for full research workflows
 2. **API Documentation** - OpenAPI/Swagger spec for REST endpoints
 3. **Metrics Dashboard** - Prometheus/Grafana integration for monitoring
-4. **Container Optimization** - Multi-stage Docker builds for smaller images
+4. **Resource Observability** - Expose FD count, thread count, and connection pool stats in /api/v1/health; add periodic sweep logging
 5. **Async Architecture** - Migration to async/await for I/O-bound operations

 ### Key Source Files
@@ -481,6 +554,8 @@ While the project scores highly overall, these areas have room for growth:
 | Report Generation | `src/local_deep_research/report_generator.py` | `IntegratedReportGenerator` |
 | Web API | `src/local_deep_research/web/routes/` | Flask routes and WebSocket handlers |
 | Database | `src/local_deep_research/web/database/` | SQLCipher models and migrations |
+| Encrypted DB | `src/local_deep_research/database/encrypted_db.py` | Per-user SQLCipher engine lifecycle |
+| Thread Sessions | `src/local_deep_research/database/thread_local_session.py` | Thread-safe session management and cleanup |
 | Settings | `src/local_deep_research/config/` | Configuration and LLM setup |

 ### Contributing to Architecture
--- a/docs/docker-compose-guide.md
+++ b/docs/docker-compose-guide.md
@@ -82,6 +82,13 @@ ports:
 | `ollama_data` | Downloaded models |
 | `searxng_data` | Search engine config |

+### Resource Limits
+
+> **Warning:** Resource limits in the base `docker-compose.yml` are intentionally minimal:
+> - **`nofile` (file descriptors):** Not set. Docker's daemon default (typically 1M+) is appropriate. Setting a lower value can cause `unable to open database file` errors under load.
+> - **`memlock`:** Set to unlimited so SQLCipher's `mlock()` (a system call that prevents memory from being swapped to disk) can lock encryption keys in RAM when `cipher_memory_security` is enabled (opt-in, off by default).
+> - To customize, use a `docker-compose.override.yml` for your deployment.
+
 ### Local Document Collections

 Use the **Collections** system in the Web UI to manage your local documents.
@@ -185,3 +192,4 @@ docker compose down -v
 - [Full Configuration Reference](CONFIGURATION.md)
 - [SearXNG Setup](SearXNG-Setup.md)
 - [Unraid Deployment](deployment/unraid.md)
+- [Architecture - Resource Lifecycle](architecture.md#thread--resource-lifecycle)
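
To confirm which `nofile` limit a process actually inherited, the limit can be read from inside Python. The following is a rough sketch using only the standard library; the function name `resource_snapshot` and the returned keys are made up for illustration and do not correspond to any endpoint or helper in the project. It assumes a Unix-like system, and the open-FD count is only available where `/proc` exists (Linux).

```python
import os
import resource
import threading


def resource_snapshot():
    """Return the process's FD limits, open-FD count, and thread count."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        # Listing the directory briefly opens one extra descriptor,
        # so the count may be one higher than the steady-state value.
        open_fds = len(os.listdir("/proc/self/fd"))
    except FileNotFoundError:
        open_fds = None  # /proc is not available on this platform
    return {
        "nofile_soft": soft,
        "nofile_hard": hard,
        "open_fds": open_fds,
        "threads": threading.active_count(),
    }


if __name__ == "__main__":
    for key, value in resource_snapshot().items():
        print(f"{key}: {value}")
```

Running this inside the container and on the host shows whether a compose override or the daemon default is in effect.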
--- a/docs/troubleshooting.md
+++ b/docs/troubleshooting.md
@@ -12,6 +12,7 @@ This guide covers common issues and their solutions.
 - [Docker Issues](#docker-issues)
 - [API Issues](#api-issues)
 - [Performance Issues](#performance-issues)
+- [Resource Exhaustion](#resource-exhaustion)

 ---

@@ -566,6 +567,47 @@ sudo lsof -i :5000  # May need sudo for system services

 ---

+## Resource Exhaustion
+
+### File Descriptor Exhaustion
+
+**Symptoms:**
+- `sqlite3.OperationalError: unable to open database file`
+- `OSError: [Errno 24] Too many open files`
+- Cascading failures across unrelated operations (logging, HTTP requests, WebSocket connections fail simultaneously)
+
+**Why it happens:**
+
+Each SQLCipher WAL-mode connection uses 2 file descriptors (main db + WAL), plus 1 shared SHM fd per database. With per-user encrypted databases, the QueuePool alone uses `users × (pool_size × 2 + 1)` FDs at steady state (21 per user with defaults), up to `users × ((10 + 30) × 2 + 1) = users × 81` under load. Background research threads add transient FDs. The default Linux soft ulimit of 1024 is tight for multi-user deployments.
+
+**Diagnosis:**
+
+```bash
+# Inside Docker (PID 1 is the app due to exec in entrypoint)
+ls /proc/1/fd | wc -l
+grep "open files" /proc/1/limits
+
+# Bare-metal Linux
+ls /proc/$(pgrep -fo ldr-web)/fd | wc -l
+
+# Detailed view: database-related FDs (SQLite names its sidecar files <db>-wal and <db>-shm)
+lsof -p <PID> | grep -E '\.db(-wal|-shm)?$'
+```
+
+**Solutions:**
+
+1. The app includes automatic dead-thread engine sweeps every ~60 seconds — this normally handles cleanup transparently
+2. **Docker:** The daemon default FD limit (typically 1M+) is appropriate. Do not set a lower `nofile` ulimit — this was intentionally removed from `docker-compose.yml`
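+3. **Bare-metal Linux:** The default soft limit of 1024 may be too low. Increase it in the shell that launches the app (the change applies only to that shell and its child processes, and cannot exceed the hard limit):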
+   ```bash
+   ulimit -n 65536
+   ```
+4. Restart the application to release all file descriptors
+
+For the technical details of the cleanup architecture, see [Architecture - Thread & Resource Lifecycle](./architecture.md#thread--resource-lifecycle).
+
+---
+
 ## Debug Logging

 > **Security note:** Log files are unencrypted and may contain sensitive information such as research queries. Ensure appropriate file permissions.
@@ -643,3 +685,4 @@ If you're still experiencing issues:
 - [Architecture Overview](./architecture/OVERVIEW.md) - System architecture
 - [FAQ](./faq.md) - Frequently asked questions
 - [Search Engines Guide](./search-engines.md) - Detailed engine documentation
+- [Architecture - Thread & Resource Lifecycle](./architecture.md#thread--resource-lifecycle) - Resource cleanup layers and FD budget
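
The QueuePool FD formulas quoted in the architecture and troubleshooting changes above can be checked with a few lines of arithmetic. This is a sketch under the documented assumptions only (2 FDs per WAL-mode connection plus 1 shared SHM FD per database, `pool_size=10`, `max_overflow=30`); the function name `queuepool_fd_budget` is hypothetical, and the result ignores sockets, log files, and transient NullPool connections.

```python
def queuepool_fd_budget(users, pool_size=10, max_overflow=30):
    """Return (steady_state_fds, peak_fds) for the per-user QueuePool engines."""
    # Each connection holds the main db file and the WAL file (2 FDs);
    # all connections to one database share a single SHM descriptor (+1).
    steady = users * (pool_size * 2 + 1)
    peak = users * ((pool_size + max_overflow) * 2 + 1)
    return steady, peak


if __name__ == "__main__":
    soft_limit = 1024  # default soft nofile limit cited in the docs
    for users in (1, 12, 13):
        steady, peak = queuepool_fd_budget(users)
        status = "over" if peak > soft_limit else "within"
        print(f"{users} users: steady={steady}, peak={peak} ({status} {soft_limit})")
```

With the defaults, 12 logged-in users peak at 972 FDs and 13 users at 1053, which is where the 1024 soft limit is crossed before any other descriptors are counted.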
--- a/src/local_deep_research/web/database/README.md
+++ b/src/local_deep_research/web/database/README.md
@@ -53,6 +53,17 @@ The unified database contains:
 * `research` - Research data (from deep_research.db)
 * `research_report` - Generated research reports (from deep_research.db)

+## Thread Safety & Connection Model
+
+The application uses per-user encrypted SQLite databases (via [SQLCipher](https://www.zetetic.net/sqlcipher/)) with two [SQLAlchemy pool](https://docs.sqlalchemy.org/en/20/core/pooling.html) strategies:
+
+- **QueuePool** (shared per-user engine in `DatabaseManager.connections`): Serves Flask request handlers. `pool_size=10`, `max_overflow=30`. Connections persist in the pool.
+- **NullPool** (per-thread engines in `DatabaseManager._thread_engines`): Used by background threads (research workers, scheduler jobs). Each checkout creates a fresh SQLCipher connection with the user's encryption key. The SQLAlchemy `Session` using this engine is held for the thread's lifetime.
+
+Cleanup is handled by the `@thread_cleanup` decorator (primary) and periodic dead-thread sweeps (fallback for daemon threads that exit without cleanup).
+
+Note: The core database module is at `src/local_deep_research/database/`, separate from this `web/database/` directory. See [Architecture - Thread & Resource Lifecycle](../../../../docs/architecture.md#thread--resource-lifecycle) for the full cleanup architecture.
+
 ## Rollback

 If you need to roll back to the previous database architecture: