* fix: prevent file descriptor exhaustion from dead thread engine accumulation

  Three root causes addressed:

  1. Dead thread engine accumulation (primary): _thread_engines grows unboundedly as crashed/terminated threads leave orphaned NullPool engines. Add cleanup_dead_thread_engines() that sweeps entries for threads no longer in threading.enumerate(). Integrate via throttled sweep in teardown_appcontext (every 60s) and periodic sweep in the queue processor loop (every 6 iterations).
  2. Generic downloader stream=True leak (secondary): generic.py used stream=True but never read or closed the response body, holding connections open. Removed stream=True since only status_code and headers are inspected.
  3. Docker default 1024 FD limit (contributing): Add nofile ulimit (65536) to docker-compose.yml so the container has headroom for WAL mode databases, thread pools, and connection pools.

* fix: address review findings: sweep lock, credential cleanup, flaky test

  - Add _sweep_lock to prevent TOCTOU race on _last_sweep_time in maybe_sweep_dead_engines() (concurrent teardowns could all pass the interval check)
  - Move alive_ids computation inside _thread_engine_lock to prevent race between snapshot and engine dict mutation
  - Sweep dead _thread_credentials (plaintext passwords) alongside engines in processor_v2.py and app_factory.py teardown
  - Fix flaky test_sweeps_after_interval: replace time.sleep(0.15) with _last_sweep_time backdating
  - Add tests for credential sweep and module-level cleanup_dead_threads()

* fix: close search engine sessions after research, fix stream=True leak properly

  Three improvements to the FD exhaustion fix:

  1. generic.py: Restore stream=True (removing it is unsafe: GenericDownloader handles ALL URLs and would download multi-GB files into memory). Use context manager instead to ensure the streamed connection is properly closed on all return paths, preventing socket FD leaks.
  2. research_service.py: Add use_search.close() and system.close() in finally block of run_research_process(). Search engine HTTP sessions (e.g. SemanticScholar's SafeSession) were never explicitly closed after research, relying on non-deterministic GC for cleanup.
  3. search_system.py + strategies: Add close() method to AdvancedSearchSystem and BaseSearchStrategy, with overrides in ConstraintParallelStrategy and ConcurrentDualConfidenceStrategy to shut down persistent ThreadPoolExecutors.

  Also adds detailed design comments throughout the codebase documenting:

  - Why NullPool engines don't leak FDs (memory leak only)
  - Why stream=True must NOT be removed from the diagnostic block
  - The dual sweep trigger architecture (request-driven + queue-driven)
  - Thread ID recycling limitations
  - Search engine lifecycle and cleanup responsibilities

  Fixes flaky test_removes_dead_thread_entries by using threading.Barrier to prevent thread ID recycling during test.

* fix: unregister user from news scheduler on logout

  The logout handler never called scheduler.unregister_user(), causing:

  - Passwords to persist in scheduler memory for up to 48 hours
  - Orphaned APScheduler jobs to keep running after logout
  - Orphaned jobs to re-create QueuePool engines (~10 FDs each) after close_user_database() disposed the original, contributing to FD leaks

  Add scheduler unregistration before close_user_database() so running jobs can finish gracefully while the DB engine is still available. Add design comment documenting the logout cleanup order.

* test: remove ineffective patch in logout scheduler test

  The `routes.get_news_scheduler` patch was ineffective because the logout handler imports `get_news_scheduler` dynamically inside the function body, so the name never enters the routes module namespace. The `create=True` flag masked this by silently creating a new attribute. The real patch on `subscription_manager.scheduler.get_news_scheduler` is sufficient.

* fix: remove nofile ulimit override from docker-compose.yml

  Docker containers inherit ulimits from the Docker daemon, which typically runs with LimitNOFILE=infinity (1073741816+). Setting nofile to 65536 could actually *lower* the limit for most users, hurting large installations. The FD leak root causes are already fixed in this PR (dead-thread engine sweep, session close, scheduler unregister), so the safety net is unnecessary. Let users and their Docker daemon config control this.

* fix: add try-except to strategy executor shutdown, elevate scheduler unregister log level

  - Wrap executor.shutdown(wait=False) in try-except in strategy close() methods for consistency with parallel_search_engine.py pattern
  - Change logger.debug → logger.warning for scheduler unregister failure on logout, since failure means password stays in scheduler memory

* docs: add comments explaining non-obvious design decisions from deep review

  - SQLCipher WAL FD cost (1-3 FDs per connection, multiplied by users)
  - Logout cleanup ordering: why unregister before close, known race window
  - shutdown(wait=False): why non-blocking, safety via double-cleanup pattern

* docs: add thread lifecycle, FD budget, and resource exhaustion documentation

  Knowledge captured from PR #2591 deep review (5 rounds of verification):

  - architecture.md: Thread & Resource Lifecycle section with cleanup layers, mermaid diagram, FD budget table, and key files reference
  - troubleshooting.md: Resource Exhaustion section with diagnosis commands and solutions for FD exhaustion
  - docker-compose-guide.md: Resource Limits note explaining nofile/memlock
  - web/database/README.md: Thread Safety & Connection Model section
  - Cross-references added between all 4 docs
  - Updated Areas for Improvement (container optimization → resource observability)
  - Added encrypted_db.py and thread_local_session.py to Key Source Files
Docker Compose Guide
This guide covers Docker Compose setup for Local Deep Research. For the quickest start, see the Quick Start section below.
Quick Start
CPU-Only (All Platforms)
Works on macOS (M1/M2/M3/M4 and Intel), Windows, and Linux:
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml && docker compose up -d
With NVIDIA GPU (Linux Only)
For hardware-accelerated inference:
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml && \
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.gpu.override.yml && \
docker compose -f docker-compose.yml -f docker-compose.gpu.override.yml up -d
Prerequisites for GPU: Install the NVIDIA Container Toolkit first. See the README for detailed instructions.
Open http://localhost:5000 after ~30 seconds.
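The 30-second figure is an estimate, and startup may take longer on a first run while images are pulled. As a minimal sketch, assuming `curl` is installed on the host, the helper below polls the URL until it answers or a timeout expires. `wait_for_url` is a hypothetical name used only for illustration and is not part of the project.

```shell
#!/bin/sh
# Sketch: poll a URL until it responds or a timeout (in seconds) expires.
# wait_for_url is a hypothetical helper, not something shipped with the project.
wait_for_url() {
    url=$1
    timeout=${2:-60}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # -s: silent, -f: fail on HTTP status >= 400, -o: discard the body
        if curl -sf -o /dev/null --max-time 2 "$url"; then
            echo "ready after roughly ${elapsed}s"
            return 0
        fi
        sleep 2
        elapsed=$((elapsed + 2))
    done
    echo "not ready after ${timeout}s" >&2
    return 1
}
```

For example, `wait_for_url http://localhost:5000 90` waits up to about 90 seconds and returns a non-zero status if the UI never answers, which makes it usable in scripts.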
Using a Different Model
Specify a model with the LDR_LLM_MODEL environment variable:
LDR_LLM_MODEL=gemma3:4b docker compose up -d
The model will be automatically pulled if not already available.
Configuration Options
docker-compose.yml
The base configuration includes:
| Service | Description |
|---|---|
| `local-deep-research` | The main web application (port 5000) |
| `ollama` | Local LLM inference engine |
| `searxng` | Privacy-focused meta search engine |
Key Environment Variables
Most settings can be configured through the web UI at http://localhost:5000/settings. Environment variables override UI settings and lock them. For the complete list of all environment variables and their defaults, see CONFIGURATION.md.
⚠️ Warning: Setting an environment variable causes a hard override: the setting becomes read-only in the UI and cannot be changed until the environment variable is removed. For settings you may want to adjust later, use the web UI instead. Environment variables are best suited for deployment-specific values like `LDR_DATA_DIR` or API keys.
| Variable | Description |
|---|---|
| `LDR_WEB_HOST` | Bind address (default: 0.0.0.0 for Docker) |
| `LDR_WEB_PORT` | Internal port (default: 5000) |
| `LDR_DATA_DIR` | Data directory (default: /data) |
| `LDR_APP_ALLOW_REGISTRATIONS` | Allow new user registration (default: true). Set to false for public deployments after creating your initial account. |
| `LDR_LLM_PROVIDER` | LLM provider (ollama, openai, anthropic, etc.) |
| `LDR_LLM_MODEL` | Model name (e.g., gemma3:12b) |
Changing the External Port
Use Docker's port mapping instead of environment variables:
ports:
- "8080:5000" # Expose on port 8080 instead of 5000
Volume Mounts
| Volume | Purpose |
|---|---|
| `ldr_data` | Application data |
| `ldr_scripts` | Startup scripts |
| `ldr_rag_cache` | RAG index cache |
| `ollama_data` | Downloaded models |
| `searxng_data` | Search engine config |
Resource Limits
Warning: Resource limits in the base `docker-compose.yml` are intentionally minimal:
- `nofile` (file descriptors): Not set. Docker's daemon default (typically 1M+) is appropriate. Setting a lower value can cause `unable to open database file` errors under load.
- `memlock`: Set to unlimited so SQLCipher's `mlock()` (a system call that prevents memory from being swapped to disk) can lock encryption keys in RAM when `cipher_memory_security` is enabled (opt-in, off by default).
- To customize, use a `docker-compose.override.yml` for your deployment.
Local Document Collections
Use the Collections system in the Web UI to manage your local documents. Upload files directly through the Collections page — no volume mounts required.
Advanced: Cookie Cutter Configuration
For more customization, use Cookie Cutter to generate a tailored docker-compose file:
# Install cookiecutter
pip install --user cookiecutter
# Clone the repository
git clone https://github.com/LearningCircuit/local-deep-research.git
cd local-deep-research
# Generate custom configuration
cookiecutter cookiecutter-docker/
Cookie Cutter will prompt you for:
| Option | Description |
|---|---|
| `config_name` | Name for your configuration |
| `host_port` | Port to expose (default: 5000) |
| `host_ip` | IP to bind (default: 0.0.0.0) |
| `host_network` | Use host networking |
| `enable_gpu` | Enable NVIDIA GPU support |
| `enable_searxng` | Include SearXNG service |
Then start with:
docker compose -f docker-compose.default.yml up -d
Using External LLM Providers
OpenRouter (100+ Models)
environment:
- LDR_LLM_PROVIDER=openai_endpoint
- LDR_LLM_OPENAI_ENDPOINT_URL=https://openrouter.ai/api/v1
- LDR_LLM_OPENAI_ENDPOINT_API_KEY=<your-api-key>
- LDR_LLM_MODEL=anthropic/claude-3.5-sonnet
LM Studio (Running on Host)
environment:
- LDR_LLM_PROVIDER=lmstudio
- LDR_LLM_LMSTUDIO_URL=http://host.docker.internal:1234/v1
- LDR_LLM_MODEL=<your-loaded-model>
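On Linux with Docker Engine, `host.docker.internal` is not defined inside containers by default, whereas Docker Desktop provides it automatically. The sketch below shows one possible way to add it, assuming the base file does not already do so: it writes an extra Compose file that maps the name to the host gateway. The file name `docker-compose.lmstudio.yml` is a hypothetical choice for illustration.

```shell
#!/bin/sh
# Sketch: make host.docker.internal resolve to the Docker host on Linux.
# The file name is hypothetical; any name works when passed with -f.
cat > docker-compose.lmstudio.yml <<'EOF'
services:
  local-deep-research:
    extra_hosts:
      # host-gateway is resolved by Docker (20.10+) to the host's gateway IP
      - "host.docker.internal:host-gateway"
EOF

echo "Start with: docker compose -f docker-compose.yml -f docker-compose.lmstudio.yml up -d"
```

Keeping this in a separate file leaves the downloaded `docker-compose.yml` untouched, so later updates to the base file do not conflict with local changes.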
Common Commands
# Start services
docker compose up -d
# Start with GPU support
docker compose -f docker-compose.yml -f docker-compose.gpu.override.yml up -d
# View logs
docker compose logs -f
# Stop services
docker compose down
# Update to latest version
docker compose pull && docker compose up -d
# Remove all data (fresh start)
docker compose down -v
Troubleshooting
Container won't start
- Check logs: `docker compose logs local-deep-research`
- Ensure port 5000 is available
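To check the port from the host, a small helper along these lines may be enough. This is a sketch that assumes either `ss` or `lsof` is installed; `port_in_use` is a hypothetical name, not a project command.

```shell
#!/bin/sh
# Sketch: report whether something is already listening on a TCP port.
# port_in_use is a hypothetical helper that relies on ss or lsof.
# Returns 0 if a listener was found, 1 if not, 2 if neither tool is available.
port_in_use() {
    port=$1
    if command -v ss >/dev/null 2>&1; then
        # Column 4 of `ss -ltn` is the local address, e.g. 0.0.0.0:5000
        ss -ltn | awk '{print $4}' | grep -q ":${port}\$"
    elif command -v lsof >/dev/null 2>&1; then
        # Without root, lsof may only see your own processes
        lsof -nP -iTCP:"$port" -sTCP:LISTEN >/dev/null 2>&1
    else
        return 2
    fi
}
```

For example, `port_in_use 5000 && echo "port 5000 is busy"` prints a message only when a listener is found. If the port is taken, remapping the external port as described under "Changing the External Port" avoids the conflict.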
Ollama model not loading
- Check Ollama logs: `docker compose logs ollama`
- Verify model name in `LDR_LLM_MODEL` environment variable
- Ensure sufficient disk space for model download
GPU not detected
- Verify NVIDIA drivers: `nvidia-smi`
- Check container toolkit: `docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi`