mirror of https://github.com/LearningCircuit/local-deep-research.git synced 2026-06-16 03:51:07 +03:00

LearningCircuit 0b23d58e85 docs: thread lifecycle, FD budget, and resource exhaustion (#2605)

* fix: prevent file descriptor exhaustion from dead thread engine accumulation

Three root causes addressed:

1. Dead thread engine accumulation (primary): _thread_engines grows
   unboundedly as crashed/terminated threads leave orphaned NullPool
   engines. Add cleanup_dead_thread_engines() that sweeps entries for
   threads no longer in threading.enumerate(). Integrate via throttled
   sweep in teardown_appcontext (every 60s) and periodic sweep in the
   queue processor loop (every 6 iterations).

2. Generic downloader stream=True leak (secondary): generic.py used
   stream=True but never read or closed the response body, holding
   connections open. Removed stream=True since only status_code and
   headers are inspected.

3. Docker default 1024 FD limit (contributing): Add nofile ulimit
   (65536) to docker-compose.yml so the container has headroom for
   WAL mode databases, thread pools, and connection pools.

* fix: address review findings — sweep lock, credential cleanup, flaky test

- Add _sweep_lock to prevent TOCTOU race on _last_sweep_time in
  maybe_sweep_dead_engines() (concurrent teardowns could all pass the
  interval check)
- Move alive_ids computation inside _thread_engine_lock to prevent
  race between snapshot and engine dict mutation
- Sweep dead _thread_credentials (plaintext passwords) alongside engines
  in processor_v2.py and app_factory.py teardown
- Fix flaky test_sweeps_after_interval: replace time.sleep(0.15) with
  _last_sweep_time backdating
- Add tests for credential sweep and module-level cleanup_dead_threads()

* fix: close search engine sessions after research, fix stream=True leak properly

Three improvements to the FD exhaustion fix:

1. generic.py: Restore stream=True (removing it is unsafe — GenericDownloader
   handles ALL URLs and would download multi-GB files into memory). Use context
   manager instead to ensure the streamed connection is properly closed on all
   return paths, preventing socket FD leaks.

2. research_service.py: Add use_search.close() and system.close() in finally
   block of run_research_process(). Search engine HTTP sessions (e.g.
   SemanticScholar's SafeSession) were never explicitly closed after research,
   relying on non-deterministic GC for cleanup.

3. search_system.py + strategies: Add close() method to AdvancedSearchSystem
   and BaseSearchStrategy, with overrides in ConstraintParallelStrategy and
   ConcurrentDualConfidenceStrategy to shut down persistent ThreadPoolExecutors.

Also adds detailed design comments throughout the codebase documenting:
- Why NullPool engines don't leak FDs (memory leak only)
- Why stream=True must NOT be removed from the diagnostic block
- The dual sweep trigger architecture (request-driven + queue-driven)
- Thread ID recycling limitations
- Search engine lifecycle and cleanup responsibilities

Fixes flaky test_removes_dead_thread_entries by using threading.Barrier to
prevent thread ID recycling during test.

* fix: unregister user from news scheduler on logout

The logout handler never called scheduler.unregister_user(), causing:
- Passwords to persist in scheduler memory for up to 48 hours
- Orphaned APScheduler jobs to keep running after logout
- Orphaned jobs to re-create QueuePool engines (~10 FDs each) after
  close_user_database() disposed the original, contributing to FD leaks

Add scheduler unregistration before close_user_database() so running
jobs can finish gracefully while the DB engine is still available.
Add design comment documenting the logout cleanup order.

* test: remove ineffective patch in logout scheduler test

The `routes.get_news_scheduler` patch was ineffective because the logout
handler imports `get_news_scheduler` dynamically inside the function body,
so the name never enters the routes module namespace. The `create=True`
flag masked this by silently creating a new attribute. The real patch on
`subscription_manager.scheduler.get_news_scheduler` is sufficient.

* fix: remove nofile ulimit override from docker-compose.yml

Docker containers inherit ulimits from the Docker daemon, which typically
runs with LimitNOFILE=infinity (1073741816+). Setting nofile to 65536
could actually *lower* the limit for most users, hurting large
installations. The FD leak root causes are already fixed in this PR
(dead-thread engine sweep, session close, scheduler unregister), so the
safety net is unnecessary. Let users and their Docker daemon config
control this.

* fix: add try-except to strategy executor shutdown, elevate scheduler unregister log level

- Wrap executor.shutdown(wait=False) in try-except in strategy close()
  methods for consistency with parallel_search_engine.py pattern
- Change logger.debug → logger.warning for scheduler unregister failure
  on logout, since failure means password stays in scheduler memory

* docs: add comments explaining non-obvious design decisions from deep review

- SQLCipher WAL FD cost (1-3 FDs per connection, multiplied by users)
- Logout cleanup ordering: why unregister before close, known race window
- shutdown(wait=False): why non-blocking, safety via double-cleanup pattern

* docs: add thread lifecycle, FD budget, and resource exhaustion documentation

Knowledge captured from PR #2591 deep review (5 rounds of verification):
- architecture.md: Thread & Resource Lifecycle section with cleanup layers,
  mermaid diagram, FD budget table, and key files reference
- troubleshooting.md: Resource Exhaustion section with diagnosis commands
  and solutions for FD exhaustion
- docker-compose-guide.md: Resource Limits note explaining nofile/memlock
- web/database/README.md: Thread Safety & Connection Model section
- Cross-references added between all 4 docs
- Updated Areas for Improvement (container optimization → resource observability)
- Added encrypted_db.py and thread_local_session.py to Key Source Files

2026-03-08 16:22:17 +01:00

Docker Compose Guide

This guide covers Docker Compose setup for Local Deep Research. For the quickest start, see the Quick Start section below.

Quick Start

CPU-Only (All Platforms)

Works on macOS (M1/M2/M3/M4 and Intel), Windows, and Linux:

curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml && docker compose up -d

With NVIDIA GPU (Linux Only)

For hardware-accelerated inference:

curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml && \
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.gpu.override.yml && \
docker compose -f docker-compose.yml -f docker-compose.gpu.override.yml up -d

Prerequisites for GPU: Install the NVIDIA Container Toolkit first. See the README for detailed instructions.

Open http://localhost:5000 after ~30 seconds.

Using a Different Model

Specify a model with the LDR_LLM_MODEL environment variable:

LDR_LLM_MODEL=gemma3:4b docker compose up -d

The model will be automatically pulled if not already available.
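
To keep the model choice across restarts without retyping it, one option is a `.env` file next to docker-compose.yml. This is a minimal sketch, assuming the compose file interpolates `${LDR_LLM_MODEL}` (which the one-off command above implies) and that no conflicting entry already exists in `.env`:

```shell
# Append the model choice to .env in the directory that holds docker-compose.yml.
# Assumption: docker-compose.yml reads LDR_LLM_MODEL via variable interpolation.
printf '%s\n' 'LDR_LLM_MODEL=gemma3:4b' >> .env

# Show what Compose will read from the file.
grep '^LDR_LLM_MODEL=' .env
```

After that, a plain `docker compose up -d` should pick up the value. A variable set in the calling shell still takes precedence over the `.env` entry.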

Configuration Options

docker-compose.yml

The base configuration includes:

Service	Description
`local-deep-research`	The main web application (port 5000)
`ollama`	Local LLM inference engine
`searxng`	Privacy-focused meta search engine

Key Environment Variables

Most settings can be configured through the web UI at http://localhost:5000/settings. Environment variables override UI settings and lock them. For the complete list of all environment variables and their defaults, see CONFIGURATION.md.

⚠️ Warning: Setting environment variables causes a hard override—the setting becomes read-only in the UI and cannot be changed until the environment variable is removed. For settings you may want to adjust later, use the web UI instead. Environment variables are best suited for deployment-specific values like LDR_DATA_DIR or API keys.

Variable	Description
`LDR_WEB_HOST`	Bind address (default: `0.0.0.0` for Docker)
`LDR_WEB_PORT`	Internal port (default: `5000`)
`LDR_DATA_DIR`	Data directory (default: `/data`)
`LDR_APP_ALLOW_REGISTRATIONS`	Allow new user registration (default: `true`). Set to `false` for public deployments after creating your initial account.
`LDR_LLM_PROVIDER`	LLM provider (`ollama`, `openai`, `anthropic`, etc.)
`LDR_LLM_MODEL`	Model name (e.g., `gemma3:12b`)

Changing the External Port

Use Docker's port mapping instead of environment variables:

ports:
  - "8080:5000"  # Expose on port 8080 instead of 5000

Volume Mounts

Volume	Purpose
`ldr_data`	Application data
`ldr_scripts`	Startup scripts
`ldr_rag_cache`	RAG index cache
`ollama_data`	Downloaded models
`searxng_data`	Search engine config

Resource Limits

Warning: Resource limits in the base docker-compose.yml are intentionally minimal:

nofile (file descriptors): Not set. Docker's daemon default (typically 1M+) is appropriate. Setting a lower value can cause `unable to open database file` errors under load.

memlock: Set to unlimited so SQLCipher's mlock() (a system call that prevents memory from being swapped to disk) can lock encryption keys in RAM when cipher_memory_security is enabled (opt-in, off by default).

To customize, use a docker-compose.override.yml for your deployment.
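
As a rough sketch of such an override, the snippet below writes a docker-compose.override.yml that pins nofile explicitly for the `local-deep-research` service. The value 1048576 is an illustrative assumption, not a project recommendation. Choose one that is not lower than what your workload needs.

```shell
# Write an override file next to docker-compose.yml.
# Compose merges docker-compose.override.yml automatically when no -f flag is given.
# The nofile value below is a placeholder chosen for illustration.
cat > docker-compose.override.yml <<'EOF'
services:
  local-deep-research:
    ulimits:
      nofile:
        soft: 1048576
        hard: 1048576
EOF

# Print the file so the result can be reviewed before starting the stack.
cat docker-compose.override.yml
```

When you start the stack with explicit `-f` flags (for example, the GPU variant), Compose no longer loads the override automatically, so pass it as an additional `-f docker-compose.override.yml`.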

Local Document Collections

Use the Collections system in the Web UI to manage your local documents. Upload files directly through the Collections page — no volume mounts required.

For more customization, use Cookie Cutter to generate a tailored docker-compose file:

# Install cookiecutter
pip install --user cookiecutter

# Clone the repository
git clone https://github.com/LearningCircuit/local-deep-research.git
cd local-deep-research

# Generate custom configuration
cookiecutter cookiecutter-docker/

Cookie Cutter will prompt you for:

Option	Description
`config_name`	Name for your configuration
`host_port`	Port to expose (default: 5000)
`host_ip`	IP to bind (default: 0.0.0.0)
`host_network`	Use host networking
`enable_gpu`	Enable NVIDIA GPU support
`enable_searxng`	Include SearXNG service

Then start with:

docker compose -f docker-compose.default.yml up -d

Using External LLM Providers

OpenRouter (100+ Models)

environment:
  - LDR_LLM_PROVIDER=openai_endpoint
  - LDR_LLM_OPENAI_ENDPOINT_URL=https://openrouter.ai/api/v1
  - LDR_LLM_OPENAI_ENDPOINT_API_KEY=<your-api-key>
  - LDR_LLM_MODEL=anthropic/claude-3.5-sonnet

LM Studio (Running on Host)

environment:
  - LDR_LLM_PROVIDER=lmstudio
  - LDR_LLM_LMSTUDIO_URL=http://host.docker.internal:1234/v1
  - LDR_LLM_MODEL=<your-loaded-model>

Common Commands

# Start services
docker compose up -d

# Start with GPU support
docker compose -f docker-compose.yml -f docker-compose.gpu.override.yml up -d

# View logs
docker compose logs -f

# Stop services
docker compose down

# Update to latest version
docker compose pull && docker compose up -d

# Remove all data (fresh start)
docker compose down -v

Troubleshooting

Container won't start

Check logs: docker compose logs local-deep-research
Ensure port 5000 is available

Ollama model not loading

Check Ollama logs: docker compose logs ollama
Verify model name in LDR_LLM_MODEL environment variable
Ensure sufficient disk space for model download

GPU not detected

Verify NVIDIA drivers: nvidia-smi
Check container toolkit: docker run --rm --gpus all ubuntu nvidia-smi
