6 Commits

Author SHA1 Message Date
LearningCircuit
7d8fdee7dd fix: unify SettingsManagers, fix env var bugs (#2070)
* fix: unify SettingsManagers, fix env var bugs, delete duplicate

Two parallel SettingsManager implementations existed (settings/manager.py
and web/services/settings_manager.py) that diverged accidentally, each
with different bugs. This unifies them into a single implementation.

Bug fixes in settings/manager.py:
- get_setting() now checks env vars when setting is not in DB (was
  jumping straight to return default, ignoring env override)
- get_all_settings() now type-converts env overrides through
  get_typed_setting_value() (was storing raw strings like "true"
  instead of True)
- create_or_update_setting() now correctly checks db_setting.editable
  (was checking input dict's .editable which caused AttributeError)
- Added missing ui_element types: textarea, multiselect
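
A minimal sketch of the fallback described in the first two bullets, for a setting that is not in the database. The helper names, the key-to-variable mapping (dots to underscores, upper-cased, `LDR_` prefix) and the `parse_boolean` body are assumptions for illustration, not the code in settings/manager.py.

```python
import os


def parse_boolean(value):
    """Hypothetical converter: map common truthy strings to True."""
    if isinstance(value, str):
        return value.strip().lower() in {"1", "true", "yes", "on"}
    return bool(value)


def env_var_name(key):
    # Assumed naming scheme: "app.debug" -> "LDR_APP_DEBUG".
    return "LDR_" + key.replace(".", "_").upper()


def fallback_for_missing_setting(key, default, converter=None, environ=None):
    """Resolve a setting that has no database row.

    The environment is consulted before the default, and the raw string
    is passed through a converter so "true" becomes True, not "true".
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(env_var_name(key))
    if raw is None:
        return default
    return converter(raw) if converter else raw
```

Without a converter the raw string is returned unchanged, which is the behaviour the second bullet describes as the old bug in get_all_settings().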

Features added to settings/manager.py:
- get_bool_setting() method (required by rag_routes.py)
- default_settings now loads all 18 JSON files via rglob, 526 settings in
  total (previously loaded only 1 file with 370 settings)

All production and test imports updated from web.services.settings_manager
to settings.manager. Duplicate web/services/settings_manager.py deleted.

314 tests pass across 7 test files. 9 new tests cover bug fixes.

* test: add 29 tests for unified SettingsManager coverage gaps (#2071)

Cover create_or_update_setting (8 tests), default_settings property (4),
_ensure_settings_initialized (2), new UI element types textarea/multiselect/
range (4), _emit_settings_changed error resilience (3), plus edge cases
for get_setting check_env=False, get_all_settings with locked settings,
get_bool_setting with integers, parse_boolean edge cases, and env override
type conversion for text settings.

* fix: add missing abstract methods, env var defaults override, and type bug (#2074)

- Add get_bool_setting() and get_settings_snapshot() abstract methods to
  ISettingsManager base class so the interface contract is complete
- Fix create_or_update_setting: use setting_obj.type directly instead of
  SettingType[setting_obj.type.upper()] which fails when type is already
  a SettingType enum from the Pydantic model
- Add env var override in get_all_settings() defaults loop so settings
  not yet in DB can still be overridden via LDR_* environment variables
- Fix test_get_all_settings_db_error to expect defaults on DB failure
  (graceful degradation after unification)

* refactor: deduplicate provider availability checks and settings wrapper (#2054) (#2068)

- Delegate 5 provider availability functions in llm_config.py to their
  existing provider class is_available() methods (OpenAI, Anthropic,
  CustomOpenAIEndpoint, Ollama, LMStudio)
- Extract _get_or_create_status() helper in queue_service.py to
  eliminate duplicated QueueStatus lookup-or-create pattern
- Centralize get_llm_setting_from_snapshot() in thread_settings.py,
  replacing 6 identical copy-pasted wrappers across provider files
- Update test mock targets to reflect new delegation pattern

* fix: add missing abstract method implementations to InMemorySettingsManager

InMemorySettingsManager was missing get_bool_setting() and
get_settings_snapshot() implementations required by the ISettingsManager
ABC, causing TypeError on instantiation and cascading failures in
LLM unit tests, REST API tests, and Puppeteer auth tests.
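
The failure mode can be reproduced with a small stand-in. The class names below are hypothetical and the method bodies are placeholders; only the two abstract method names come from this commit.

```python
from abc import ABC, abstractmethod


class SettingsManagerInterfaceSketch(ABC):
    """Stand-in for an ABC that declares the two methods named above."""

    @abstractmethod
    def get_bool_setting(self, key, default=False):
        ...

    @abstractmethod
    def get_settings_snapshot(self):
        ...


class IncompleteManager(SettingsManagerInterfaceSketch):
    # Implements only one of the two abstract methods.
    def get_bool_setting(self, key, default=False):
        return default


class CompleteManager(IncompleteManager):
    def get_settings_snapshot(self):
        return {}


def can_instantiate(cls):
    """Return False when Python refuses to build an abstract class."""
    try:
        cls()
    except TypeError:
        return False
    return True
```

Here `can_instantiate(IncompleteManager)` is False because one abstract method is still unimplemented, while `can_instantiate(CompleteManager)` is True.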

* fix: convert web SettingType to database SettingType in create_or_update_setting

The PR changed `type=SettingType[setting_obj.type.upper()]` to
`type=setting_obj.type`, but setting_obj.type is a web model SettingType
(str, Enum) while Setting.type expects the database SettingType (enum.Enum).
This causes a 500 error when creating new settings via PUT endpoint.

Use `.name` for cleaner enum-to-enum conversion instead of `.upper()`.
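
A sketch of the name-based conversion, assuming two enums with matching member names. The classes and members below are hypothetical stand-ins for the web and database SettingType definitions.

```python
import enum


class WebSettingType(str, enum.Enum):
    """Stand-in for the web model enum (a str subclass)."""
    APP = "app"
    LLM = "llm"


class DbSettingType(enum.Enum):
    """Stand-in for the database enum (a plain Enum)."""
    APP = "app"
    LLM = "llm"


def to_db_type(web_type):
    # Look the database member up by name; members of the two enums
    # are distinct objects even when names and values match.
    return DbSettingType[web_type.name]
```

The lookup raises KeyError if the database enum has no member with that name, so a mismatch between the two enums surfaces immediately.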

* fix: add multiselect type conversion and warn on untyped env overrides (#2080)

Address review feedback from @djpetti on PR #2070:

1. Replace multiselect `lambda x: x` with `_parse_multiselect()` that
   properly handles env var strings — parses JSON arrays (e.g.
   '["markdown","latex"]') and comma-separated values (e.g.
   'markdown,latex') while passing through lists from SQLAlchemy
   unchanged.

2. Log a warning when get_setting() encounters an env var override for
   a setting not in defaults, returning the raw string without type
   conversion. This surfaces settings that should be added to a
   defaults JSON file to get proper type information.
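
The parsing rules in item 1 can be sketched as follows. This is an illustration under the stated rules, not the body of `_parse_multiselect()`. The function name and the fallback for malformed JSON are assumptions.

```python
import json


def parse_multiselect_sketch(value):
    """Normalise a multiselect value to a list.

    Lists pass through unchanged; strings are parsed as a JSON array
    when possible, otherwise split on commas.
    """
    if isinstance(value, list):
        return value
    if isinstance(value, str):
        text = value.strip()
        if text.startswith("["):
            try:
                parsed = json.loads(text)
            except json.JSONDecodeError:
                parsed = None  # Assumed: fall back to comma splitting.
            if isinstance(parsed, list):
                return parsed
        return [item.strip() for item in text.split(",") if item.strip()]
    return value
```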

Tests: 14 new tests (111 total in test_settings_manager.py, 0 failures)

* test: add tests for consolidated UI element-to-type mapping

Verifies that a single canonical _UI_ELEMENT_TO_SETTING_TYPE mapping is
reused by both InMemorySettingsManager and SettingsManager.
2026-02-11 06:59:07 +01:00
LearningCircuit
7a73ee26b9 docs: fix incorrect API endpoint paths in documentation (#1210)
Updates documentation and examples to use the correct API endpoints:
- /api/start_research (was /research/api/start)
- /api/research/{id}/status (was /research/api/research/{id}/status)
- /api/report/{id} (was /research/api/research/{id}/result)
- /api/terminate/{id} (was /research/api/research/{id}/terminate)
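
For reference, a small helper that builds the corrected paths. The helper, its short endpoint names, and the base URL used in the usage note are assumptions. Only the four path templates come from the list above.

```python
# Path templates copied from the corrected endpoint list.
ENDPOINTS = {
    "start": "/api/start_research",
    "status": "/api/research/{id}/status",
    "report": "/api/report/{id}",
    "terminate": "/api/terminate/{id}",
}


def endpoint_url(base_url, name, research_id=None):
    """Build a full URL for one of the documented endpoints."""
    path = ENDPOINTS[name]
    if "{id}" in path:
        if research_id is None:
            raise ValueError(f"endpoint {name!r} needs a research id")
        path = path.format(id=research_id)
    return base_url.rstrip("/") + path
```

With an assumed base URL of `http://localhost:5000`, `endpoint_url(base, "report", "abc")` yields `http://localhost:5000/api/report/abc`.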

Fixes #1205
2025-12-02 19:54:46 +00:00
LearningCircuit
ddcd962a7e feat: enhance HTTP API examples with retry logic and automatic user creation
Major improvements to HTTP API examples:

- Add intelligent retry logic for fetching research results (up to 2 minutes)
- Implement automatic user creation for out-of-the-box functionality
- Fix API endpoint usage (/api/start_research instead of /research/api/start)
- Add proper CSRF token handling and authentication flow
- Create comprehensive documentation with environment variable configuration
- Add progress monitoring and detailed status reporting
- Include remote Ollama and SearXNG configuration examples
- Provide multiple example scripts for different use cases
- Use pathlib.Path instead of os.path for modern Python practice
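
A sketch of the retry idea from the first bullet: poll until a result is available or a two-minute budget runs out. The function name, the polling interval, and the convention that `fetch` returns None while the result is not ready are assumptions. The example scripts may be structured differently.

```python
import time


def wait_for_result(fetch, timeout=120.0, interval=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch() repeatedly until it returns something other than None.

    Raises TimeoutError once another wait would exceed the time budget.
    sleep and clock are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout
    attempts = 0
    while True:
        attempts += 1
        result = fetch()
        if result is not None:
            return result
        if clock() + interval > deadline:
            raise TimeoutError(f"no result after {attempts} attempts")
        sleep(interval)
```

In a real script `fetch` would wrap the HTTP request for the report and return None while the research is still running.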

Examples now work completely out of the box without manual user setup
and include proper error handling and user guidance throughout the process.
2025-10-31 23:48:01 +01:00
LearningCircuit
ccd809dbe3 fix: Correct API endpoint and authentication in examples and documentation
Fixes critical issues with HTTP API documentation and examples that were causing
authentication failures and "endpoint not found" errors for users.

## Changes Made

### 🔧 Fixed API Endpoint
- Updated examples to use correct endpoint: `/api/start_research`
- Previously examples used wrong endpoint: `/research/api/start`

### 🔐 Fixed Authentication Flow
- Updated login examples to use form data (not JSON)
- Added proper CSRF token handling for login
- Fixed authentication flow to work with v2.0+ security

### 📚 Documentation Updates
- Updated `examples/api_usage/README.md` with working example
- Fixed `examples/api_usage/http/simple_http_example.py`
- Added comprehensive `working_api_example.py` with proper error handling

### 🧪 Testing Tools Added
- Created `tests/api_tests/test_research_api_debug.py` for debugging API issues
- Added comprehensive test suite for authentication and API endpoints

## Impact

This fixes the most common issue reported by users trying to use the HTTP API,
where they get "Failed to start research" errors due to incorrect endpoint usage
and authentication problems.

## Testing

- Tested with fresh user registration and login
- Verified correct API endpoint works properly
- Confirmed authentication flow works end-to-end
- Added comprehensive debugging tools for future issues

Resolves user reports of API authentication failures and endpoint errors.
2025-10-31 22:58:27 +01:00
LearningCircuit
62928db777 feat: Implement per-user encrypted databases with comprehensive security overhaul
This major release introduces fundamental security and architectural improvements
to Local Deep Research, transitioning from a single-user system to a secure
multi-user platform with encrypted databases and proper authentication.

## 🔐 Security & Authentication
- **Per-user encrypted databases**: Each user now has their own SQLCipher-encrypted
  database with AES-256 encryption, protecting API keys and research data
- **Mandatory authentication**: All API endpoints and programmatic access now
  require user authentication
- **Session-based security**: Implemented secure session management with CSRF
  protection for all state-changing operations
- **Password-based encryption**: User passwords serve as database encryption keys
  (no recovery mechanism - intentional security feature)

## 🏗️ Architecture Changes
- **Thread-safe design**: Complete overhaul of settings and database access to
  ensure thread safety across all operations
- **Settings snapshots**: New immutable settings snapshot pattern prevents race
  conditions in concurrent operations
- **In-memory queue tracking**: Replaced unencrypted service.db with memory-only
  queue tracking to eliminate PII storage risks
- **Optimized middleware**: Reduced middleware overhead by 70% through intelligent
  request filtering and caching
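
One way to picture the settings snapshot pattern is a deep copy wrapped in a read-only view, taken once and then handed to worker threads. This is a sketch of the general idea only. The function name and setting keys are hypothetical, and the project's snapshot may be built differently.

```python
import copy
from types import MappingProxyType


def make_settings_snapshot(settings):
    """Return a read-only copy of the settings at this moment.

    The deep copy detaches the snapshot from later changes to the live
    settings; the proxy rejects assignment at the top level.
    """
    return MappingProxyType(copy.deepcopy(dict(settings)))


# Usage: later changes to the live settings do not reach the snapshot.
live = {"llm.model": "model-a", "search.engines": ["engine-x"]}
snapshot = make_settings_snapshot(live)
live["llm.model"] = "model-b"
live["search.engines"].append("engine-y")
```

The proxy only guards the top-level mapping. Nested lists or dicts inside the snapshot remain mutable, so a stricter version would also freeze nested values.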

## 📊 Database Structure
- Migrated from single shared database to per-user encrypted databases
- New models: User, UserSettings, UserActiveResearch, AuthSession
- Removed global models that could leak data between users
- All sensitive data (API keys, research history) now user-scoped

## 🧪 Testing & Quality
- Added 200+ new tests covering authentication, encryption, and thread safety
- New Puppeteer UI tests for end-to-end authentication flows
- Comprehensive OpenAI API key configuration tests
- LangChain integration tests for custom LLMs and retrievers
- All tests updated to work with new authentication system

## 📚 Documentation
- New migration guide for v0.x to v1.0 upgrade
- SQLCipher installation guide for all platforms
- Troubleshooting guide for OpenAI API configuration
- Updated all examples to demonstrate authenticated usage
- Comprehensive API documentation with authentication examples

## 🔧 Technical Implementation
- SQLCipher integration with hex-encoded password handling
- Thread-local session storage preventing cross-contamination
- Context-aware database sessions with proper cleanup
- Automatic session lifecycle management
- Rate limiting now per-user instead of global

## 💥 Breaking Changes
- All API access now requires authentication
- Database structure completely changed (migration required)
- Settings API redesigned for thread safety
- Removed direct database access methods
- Changed research ID type from integer to UUID

## 📦 Dependencies
- Added: pysqlcipher3 for database encryption
- Added: Additional auth-related dependencies
- Updated: All major dependencies to latest versions

## 🚀 Performance Improvements
- Middleware optimization reduces overhead by 70%
- Cached settings reduce database queries by 90%
- Thread-local sessions eliminate lock contention
- Smarter request routing skips auth for static assets

This release represents a complete security overhaul making LDR suitable for
production multi-user deployments. Because it includes breaking changes, the
upgrade path is supported by migration guides and extensive documentation.
2025-07-03 02:17:44 +02:00
LearningCircuit
d8d982d338 Feature/langchain retriever integration (#502)
* feat: Add LangChain retriever integration for vector store support

- Add RetrieverRegistry for dynamic retriever registration
- Create RetrieverSearchEngine wrapper for LangChain BaseRetriever
- Integrate retrievers with search factory and config system
- Add retrievers parameter to all API functions
- Include comprehensive test suite and examples
- Support thread-safe operations and multiple retrievers

This allows users to pass any LangChain retriever (FAISS, Pinecone,
Vertex AI, etc.) to LDR and use it as a search engine seamlessly.
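
A minimal sketch of a thread-safe registry in the spirit of the bullets above. The class and method names are hypothetical, and retrievers are treated as opaque objects so the example does not depend on LangChain. Defaulting the name to the class name follows the review fix listed later in this commit.

```python
import threading


class RetrieverRegistrySketch:
    """Name-to-retriever mapping guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._retrievers = {}

    def register(self, retriever, name=None):
        # Default the name to the retriever's class name.
        key = name or type(retriever).__name__
        with self._lock:
            self._retrievers[key] = retriever
        return key

    def get(self, name):
        with self._lock:
            return self._retrievers.get(name)

    def names(self):
        with self._lock:
            return sorted(self._retrievers)
```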

* refactor: Organize API examples into structured folders

- Create api_usage/ directory with programmatic/ and http/ subdirectories
- Move existing examples to appropriate folders
- Add comprehensive HTTP API examples (simple and advanced)
- Add curl examples for command-line usage
- Add simple programmatic example for quick start
- Include README explaining when to use each API type

* chore: Remove old example files from root examples directory

Files have been moved to examples/api_usage/programmatic/

* fix: Address PR review comments

- Replace logger.error with logger.exception for better error tracking
- Default retriever name to class name if not provided
2025-06-19 08:44:21 -04:00