local-deep-research

mirror of https://github.com/LearningCircuit/local-deep-research.git synced 2026-06-15 19:46:56 +03:00

Author	SHA1	Message	Date
LearningCircuit	ec91c5c716	fix(pdf): render CJK characters in exported PDFs (#4055) (#4058) * fix(pdf): render CJK characters in exported PDFs (#4055) The PDF stylesheet hard-coded a Latin-only font stack, so WeasyPrint silently dropped Chinese/Japanese/Korean glyphs from downloads even when they rendered fine in the HTML view. Add Noto Sans CJK / Microsoft YaHei / SimSun fallbacks for both body and monospace families, and install fonts-noto-cjk in the Docker runtime stage so the slim base image actually has glyph coverage. Non-Docker installs still need a CJK font package on the host. * fix(pdf): broaden CJK font fallbacks + document host requirement Extend the PDF CSS font stack to cover macOS (PingFang, Hiragino, Apple SD Gothic Neo) and additional Windows families (Microsoft JhengHei, Yu Gothic, Malgun Gothic), so pip installs on those platforms render CJK without any user action. Document the per-distro CJK font install command in install-pip.md and add a new FAQ entry. Linux pip/server hosts still need fonts-noto-cjk installed manually; there is no in-code way to fix that without bundling ~20 MB of fonts into the wheel. * test(pdf): assert CJK glyph embedding end-to-end (#4055) Round-trip CJK text through markdown → PDF → pypdf extract_text so CI fails if fonts-noto-cjk is ever removed from the Docker runtime image. The pytest-tests job runs inside that image, so the test sees the installed fonts; bare hosts without CJK fonts skip the assertion via an fc-list gate. Does not catch CSS-fallback-stack regressions on its own: fontconfig auto-substitutes a CJK family on Linux even for a Latin-only stack. The CSS fallbacks still matter on Windows/macOS, which CI does not exercise; this is documented in the test docstring.	2026-05-16 13:12:28 +02:00
james LI	6b2f98351f	docs: expand LM Studio FAQ with API key and provider tips (#4008) Co-authored-by: james <li@jamesdeMacBook-Pro.local>	2026-05-11 18:47:23 +02:00
LearningCircuit	1b558bc1ba	feat(lmstudio): add optional API key support for authenticated instances (#3573) (#3740) * feat(lmstudio): add optional API key support for authenticated instances Recent LM Studio versions can require an API key on the local server. LDR previously hardcoded "not-required", making authenticated instances fail silently. This adds an optional `llm.lmstudio.api_key` setting (UI field on the home page + auto-rendered Settings → LLM field + `LDR_LLM_LMSTUDIO_API_KEY` env var), mirroring the Ollama optional-key pattern. Backward compatible: an empty/whitespace key falls back to the existing placeholder so unauthenticated installs require no change. `is_available()` now sends the key as an `Authorization: Bearer` header so authenticated LM Studio instances are correctly detected as available. The parallel `get_llm()` direct-construction path in `llm_config.py` is also updated so the user's key flows through both code paths. The path allowlist in `.gitleaks.toml` is extended to cover `llm_config.py`; this file already legitimately holds `api_key=` kwargs for all providers, matching the existing allowlist for `llm/providers/`. Closes #3573 * test(lmstudio): cover llm_config.py direct get_llm path for api key The provider-class path (LMStudioProvider.create_llm) is already covered in test_lmstudio_provider.py, but the parallel get_llm("lmstudio", ...) branch in llm_config.py had no test asserting the user's key flows through to ChatOpenAI. Adds two tests mirroring the existing test_openai_endpoint_without_api_key_uses_placeholder pattern: one for configured-key passthrough, one for the empty-key fallback to "not-required". Closes a coverage gap surfaced during PR review. * feat(lmstudio): override list_models_for_api + FAQ entry for save UX Folds two of the three deferred follow-ups into the PR: 1. Override `LMStudioProvider.list_models_for_api` so the optional API key is read from settings on the high-level `list_models()` path, matching what the settings route already does. Caller-provided keys (route path) short-circuit the settings read via an `if not api_key:` guard, so the route remains untouched. Eliminates the drift hazard between the two key-resolution paths. 2. Add a FAQ entry documenting the save-then-poll UX: the password field saves on blur, so pasting + immediately clicking refresh can produce an empty model list. Recovery is to tab out then refresh. Adds 3 new tests for the override (dummy-key fallback, user-key passthrough from settings, route-provided key not overwritten by settings read). The third deferred item (`llm.ollama.api_key` JSON entry) stays as a separate cleanup PR: different provider, different feature.	2026-05-10 00:02:52 +02:00
LearningCircuit	5bd717d022	docs(docker): clarify Windows/WSL2/Mac networking: drop --network host, use host.docker.internal (#3925) Users following the README's Linux docker-run quick-start on Docker Desktop hit two coupled issues: --network host silently drops -p 5000:5000, and once it's removed, localhost inside the LDR container no longer reaches host-side Ollama/SearXNG. The fix already exists (use host.docker.internal or docker-compose), but our docs didn't surface it where users look. - README: warn next to "Option 1: Docker Run (Linux)" that --network host is Linux-only, link to the Windows/WSL2 FAQ entry. - installation.md: add a Mac/Windows/WSL2 docker-run variant alongside the Linux one, with the env vars that point Ollama/SearXNG at host.docker.internal and an --add-host flag so the same recipe works on Linux Docker Desktop too. - faq.md: rewrite the misleading "Port 5000 not accessible on Windows" entry in place (extra_hosts does not fix port publishing). Keep the anchor for backlinks; replace the body with the real cause and a working recipe. Note the env-var-vs-UI precedence trap. - troubleshooting.md: replace the stale `search.engine.searxng.url` key with the real one (`search.engine.web.searxng.default_params.instance_url`) and expand the cryptic "Docker networking: same as Ollama" line into a concrete fix with a pointer to the FAQ entry. No code or default-value changes; kept defaults at localhost so native pip installs and Linux --network host users are unaffected.	2026-05-09 13:13:47 +02:00
LearningCircuit	ad9e5aeb8b	docs: link benchmarks dataset from FAQ and news model pickers (#3782) Surface the HuggingFace ldr-benchmarks dataset in the remaining places where users pick a model: - FAQ entry 'Which Ollama model should I use?': points readers at the community results before they pick a download. - News + News subscription forms: both have their own model dropdowns separate from the research page; same hint, matched to the existing Bootstrap-style ldr-form-text helper class used in those forms.	2026-05-02 01:39:32 +02:00
LearningCircuit	8ea4787626	fix: rename "Custom OpenAI Endpoint" to "OpenAI-Compatible Endpoint" (#2745) (#2818) Users selecting Llama.cpp couldn't find the right provider for custom endpoints because four different names were used across the codebase. Standardize on "OpenAI-Compatible Endpoint", the industry-standard naming used by LM Studio, Ollama, Open WebUI, vLLM, and others. Changes: - Provider class: provider_name → "OpenAI-Compatible Endpoint" - Legacy config, default_settings.json, golden master: consistent name - JS fallbacks (settings.js, benchmark.html): updated dropdown labels - Llama.cpp label clarified to "(Local GGUF files only)" - Docs (faq.md, env_configuration.md): updated references - Tests: updated assertions and docstrings No breaking changes: internal keys (openai_endpoint, OPENAI_ENDPOINT), setting paths, class/function names, and file names are unchanged.	2026-03-18 08:18:20 +00:00
LearningCircuit	df52e3ec3e	feat: implement Reddit feedback improvements (#1909) * feat: implement Reddit feedback improvements Based on user feedback from r/LocalLLaMA, this commit addresses several documentation and usability issues: Documentation: - Add macOS port 5000 conflict documentation (AirPlay Receiver conflict) - Create comprehensive reverse proxy guide (Caddy, Nginx, Traefik) - Add debug logging guidance with platform-specific paths UI/UX: - Add "Advanced" badge and tooltip to Detailed Report mode to set expectations Feature: - Add opt-in LLM prompt/response logging (LDR_LOG_LLM_CALLS=true) for debugging Closes feedback from: reddit.com/r/LocalLLaMA/comments/1qdj2nn/ * fix: correct mypy type issues in llm_log_utils * docs: remove reverse proxy guide, keep inline note instead The reverse proxy configuration is generic infrastructure knowledge not specific to LDR. Replaced the guide link with a one-liner noting that LDR uses HTTP polling and works with any standard reverse proxy. * perf: pre-compile regex patterns in llm_log_utils Avoids recompiling 6 regex patterns on every sanitize() call. * refactor: keep docs and CSS, remove LLM logging feature - Keep macOS port 5000/AirPlay troubleshooting docs - Keep debug logging documentation (fixed LDR_LOG_LEVEL → LDR_ENABLE_FILE_LOGGING) - Keep .ldr-mode-badge CSS and Advanced badge UI change - Restore correct Nginx WebSocket reverse proxy config - Remove LLM logging feature (suggest as separate focused PR) * docs: add security note at top of Debug Logging section Move the log file security warning to a prominent blockquote at the start of the section so it is not overlooked.	2026-03-06 01:32:31 +01:00
LearningCircuit	04a55f106f	security: replace gosu with setpriv and suppress 8 unfixable CVEs (#2501) Replace gosu (Go binary) with setpriv (util-linux, already in base image) for privilege dropping in the container entrypoint. This eliminates 7 Go stdlib CVEs (CVE-2025-4674, CVE-2025-61732, CVE-2025-61731, CVE-2025-47907, CVE-2025-61729, CVE-2025-58187, CVE-2025-58188) by removing the only Go binary from the image. For the remaining 8 CVEs that are unfixable in Debian Trixie (libtiff6, coreutils, libc6, Chrome DevTools), add documented suppressions to both .grype.yaml (new) and .trivyignore with review date 2026-09-01. Also updates the base image digest to pick up latest security patches, and bumps Playwright from 1.57.0 to 1.58.0 (matching pyproject.toml) with the corresponding chromium-1208 revision.	2026-03-01 23:37:26 +01:00
LearningCircuit	33119ae2a4	refactor: remove deprecated settings-based local search engines (#2344) * refactor: remove deprecated settings-based local search engines The old settings-based local engines (research_papers, project_docs, personal_notes, local_all) are fully superseded by the database-backed Collection system with CollectionSearchEngine and LibraryRAGSearchEngine. - Delete LocalAllSearchEngine and LocalSearchEngine classes - Remove 58 settings entries from default_settings.json - Remove local engine registration from search_engines_config.py - Remove local_search_engines() function - Clean up LocalEmbeddingManager: remove 14 dead methods and unused attrs - Remove Docker volume mounts for local_collections - Update security whitelist, rate limiter, bearer config - Remove dead force_reindex code path in research_functions.py - Update docs to reference Collections UI - Remove/update all associated tests - Regenerate golden master settings * fix: address review comments from djpetti - Revert unintentional formatting change in theme options (keep compact inline format) - Restore unicode arrow character (→) that was escaped to \u2192 by JSON serializer - Rename search_engine_local.py → local_embedding_manager.py since it only contains LocalEmbeddingManager now (no search engines) - Remove unused chunk_size, chunk_overlap, cache_dir params from LocalEmbeddingManager - Update all imports and references across codebase	2026-02-28 16:00:13 +01:00
LearningCircuit	890c84e534	docs: link auto-generated Configuration Reference across docs & fix stale env var docs (#2472) - Add "Config Reference" link to Settings page "Learn & Get Help" bar - Overhaul docs/env_configuration.md: remove stale Dynaconf references, fix wrong double-underscore env var format, remove documented-as-fixed bug, replace duplicate tables with links to CONFIGURATION.md - Fix broken case-sensitive link in docs/deployment/unraid.md - Add CONFIGURATION.md cross-references to 12 docs' "See Also" sections - Update .env.template with correct LDR_-prefixed variable names - Add config reference comment to docker-compose.yml environment block	2026-02-28 13:46:34 +01:00
LearningCircuit	2fbf5c91b0	docs(faq): add Proxmox LXC troubleshooting for Docker permission errors (#2382) * fix(docker): add diagnostic error message when gosu fails in LXC If gosu can't switch users (e.g. LXC blocks CAP_SETUID/CAP_SETGID), print a clear error message with actionable fix instructions instead of gosu's cryptic "operation not permitted" error. * docs(faq): add Proxmox LXC troubleshooting for Docker permission errors Document both failure modes (chown and gosu) with solutions ordered by likelihood: enable nesting, relax AppArmor, use privileged LXC.	2026-02-23 00:40:54 +01:00
LearningCircuit	17fe11f4aa	fix(docker): enable host.docker.internal on Linux for LM Studio Add extra_hosts configuration to docker-compose.yml so that host.docker.internal works on Linux (it already works on Mac/Windows Docker Desktop). This fixes the issue where Docker users on Linux cannot connect to LM Studio running on their host machine because localhost inside the container refers to the container itself. Also: - Expanded FAQ with Linux-specific instructions - Added LM Studio configuration example in docker-compose.yml Fixes #1358	2025-12-12 21:05:17 +01:00
LearningCircuit	0f4c4cb516	refactor: Move generic config to references, simplify FAQ duplication - Advanced Configuration section now references general config docs for LLM/search - Reduced FAQ duplication - brief answers with links to full guide - Addresses djpetti feedback about scope creep and duplication	2025-11-21 22:25:00 +01:00
LearningCircuit	e5c8d5afcf	Merge pull request #1084 from LearningCircuit/fix-ossf-scorecard-workflow Fix OSSF Scorecard workflow	2025-11-20 01:34:55 +01:00
LearningCircuit	cc739b2791	fix: use angle bracket placeholders to avoid gitleaks detection Replace API key placeholders with <your-api-key> format which uses characters outside the gitleaks detection pattern, preventing false positives while maintaining clear placeholder syntax	2025-11-16 15:56:03 +01:00
LearningCircuit	00f48008cf	fix: replace API key patterns to avoid gitleaks false positives Replace example API keys with generic placeholders to prevent gitleaks from flagging documentation examples as potential secret leaks: - sk-or-v1-your-key-here → your-openrouter-api-key-here - sk-... → your-openai-api-key - sk-ant-... → your-anthropic-api-key	2025-11-16 15:49:11 +01:00
LearningCircuit	1395290fa5	docs: add comprehensive OpenRouter configuration documentation Add detailed OpenRouter setup instructions to resolve user confusion about environment variable configuration for Docker deployments. Changes: - docs/env_configuration.md: Add LLM Provider Configuration section with OpenRouter setup via Web UI, environment variables, and Docker Compose - docs/faq.md: Add "How do I use OpenRouter?" FAQ with quick setup guide and common troubleshooting issues - docker-compose.yml: Add commented OpenRouter configuration example in ADVANCED section for easy copy-paste setup Fixes #639	2025-11-16 12:15:40 +01:00
LearningCircuit	1350c7022e	docs: Add comprehensive documentation and update README (#508) * docs: Add comprehensive documentation and update README - Add FAQ based on real user issues from Discord and GitHub - Add detailed search engines guide with all available options - Add features documentation covering all capabilities - Add analytics dashboard guide - Update README with accurate benchmark results and clearer installation steps - Fix misleading claims about privacy and performance - Add disclaimers about community-maintained documentation Key improvements: - Added critical Ollama model download step to prevent common errors - Removed unimplemented features and experimental claims - Made benchmark claims more conservative and accurate - Better organized documentation structure for easier navigation * Fix documentation issues from PR review - Update 'Analytics Dashboard' to 'Metrics Dashboard' throughout - Fix API example code blocks to use bash/curl instead of Python - Add GPU VRAM requirements for Ollama models - Add missing SearXNG repository link - Remove features that don't exist (plugin system, meta search, etc.) - Update search strategy names to match UI - Simplify caching and security sections to reflect actual implementation - Add troubleshooting step to check logs - Various other accuracy improvements based on reviewer feedback * Fix remaining PR review issues - Update 404 error fix version from 0.5.0 to 0.5.2 in FAQ - Remove incorrect WebFetch caching claim from features documentation - Add note about viewing logs in web UI for SearXNG troubleshooting	2025-06-21 23:52:10 +02:00

18 Commits
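
Commit 1b558bc1ba describes its key handling only in prose: an empty or whitespace-only `llm.lmstudio.api_key` falls back to the "not-required" placeholder, and the availability check sends the key as an `Authorization: Bearer` header. A minimal sketch of that behavior might look like the following. The function names `resolve_api_key` and `auth_headers` are hypothetical and are not taken from the LDR codebase, and sending the placeholder as a bearer token when no key is configured is an assumption.

```python
from typing import Dict, Optional

# Placeholder value named in the commit message for unauthenticated instances.
PLACEHOLDER_KEY = "not-required"


def resolve_api_key(configured_key: Optional[str]) -> str:
    """Return the user's key, or the placeholder if it is missing or blank.

    Hypothetical helper: illustrates the empty/whitespace fallback only.
    """
    key = (configured_key or "").strip()
    return key if key else PLACEHOLDER_KEY


def auth_headers(configured_key: Optional[str]) -> Dict[str, str]:
    """Build the Authorization header for an availability probe.

    Assumption: the placeholder is also sent as a bearer token when no
    key is configured; the commit message does not say either way.
    """
    return {"Authorization": f"Bearer {resolve_api_key(configured_key)}"}
```

Keeping the fallback in one helper means every caller resolves the key the same way, which is the drift hazard between key-resolution paths that the commit message mentions.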