# API Quick Start

## Overview

Local Deep Research provides both an HTTP REST API and a programmatic Python API. Since version 1.0, authentication is required for all API endpoints, and the system uses per-user encrypted databases.

## Simplest Usage - Python Client

The easiest way to use the API is with the built-in client, which handles all authentication complexity:

```python
from local_deep_research.api import LDRClient, quick_query

# One-liner for quick research
summary = quick_query("username", "password", "What is DNA?")
print(summary)

# Or use the client for multiple operations
client = LDRClient()
client.login("username", "password")
result = client.quick_research("What is machine learning?")
print(result["summary"])
```

No need to worry about CSRF tokens, HTML parsing, or session management!

## Authentication

### Web UI Authentication

The API requires authentication through the web interface first:

1. Start the server:
   ```bash
   python -m local_deep_research.web.app
   ```

2. Open http://localhost:5000 in your browser
3. Register a new account or log in
4. Your session cookie will be used for API authentication

### HTTP API Authentication

For HTTP API requests, you need to:

1. Authenticate through the login endpoint
2. Include the session cookie in subsequent requests
3. Include a CSRF token for state-changing operations

Example authentication flow:

```python
import requests
from bs4 import BeautifulSoup

# Create a session to persist cookies
session = requests.Session()

# 1. Get login page and extract CSRF token for login
login_page = session.get("http://localhost:5000/auth/login")
soup = BeautifulSoup(login_page.text, 'html.parser')
csrf_input = soup.find('input', {'name': 'csrf_token'})
login_csrf = csrf_input.get('value') if csrf_input else None

# 2. Login with form data (not JSON) including CSRF
login_response = session.post(
    "http://localhost:5000/auth/login",
    data={
        "username": "your_username",
        "password": "your_password",
        "csrf_token": login_csrf
    }
)

if login_response.status_code in [200, 302]:
    print("Login successful")
    # Session cookie is automatically stored
else:
    print(f"Login failed: {login_response.text}")

# 3. Get CSRF token for API requests
csrf_response = session.get("http://localhost:5000/auth/csrf-token")
csrf_token = csrf_response.json()["csrf_token"]

# 4. Make API requests with CSRF header
headers = {"X-CSRF-Token": csrf_token}
api_response = session.post(
    "http://localhost:5000/api/start_research",
    json={
        "query": "What is quantum computing?",
        "model": "gpt-3.5-turbo",
        "search_engines": ["searxng"],
    },
    headers=headers
)
```

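If adding BeautifulSoup as a dependency is undesirable, the token extraction in step 1 can probably be done with the standard library alone. The sketch below assumes the login form contains an `<input name="csrf_token" value="...">` field, as the flow above expects. `CSRFTokenParser` and `extract_csrf_token` are hypothetical helper names, not part of Local Deep Research.

```python
from html.parser import HTMLParser


class CSRFTokenParser(HTMLParser):
    """Collects the value of the first <input name="csrf_token"> element."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        # Ignore everything except the first matching input field.
        if tag != "input" or self.token is not None:
            return
        attributes = dict(attrs)
        if attributes.get("name") == "csrf_token":
            self.token = attributes.get("value")


def extract_csrf_token(html):
    """Return the CSRF token found in the HTML, or None if there is none."""
    parser = CSRFTokenParser()
    parser.feed(html)
    return parser.token
```

With this helper, `login_csrf = extract_csrf_token(login_page.text)` would stand in for the three BeautifulSoup lines.
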
## Programmatic API Access

The programmatic API now requires a settings snapshot for proper context:

```python
from local_deep_research.api import quick_summary
from local_deep_research.settings import SettingsManager
from local_deep_research.database.session_context import get_user_db_session

# Get user session and settings
with get_user_db_session(username="your_username", password="your_password") as session:
    settings_manager = SettingsManager(session)
    settings_snapshot = settings_manager.get_all_settings()

    # Use the API with settings snapshot
    result = quick_summary(
        query="What is machine learning?",
        settings_snapshot=settings_snapshot,
        iterations=2,
        questions_per_iteration=3
    )

    print(result["summary"])
```

## API Endpoints

### Research Endpoints

Research endpoints are under `/api/`:

- `POST /api/start_research` - Start new research
- `GET /api/research/{id}/status` - Check research status
- `GET /api/report/{id}` - Get research results
- `POST /api/terminate/{id}` - Stop running research

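The three per-research paths above all take the same `{id}`, so it can be convenient to build them in one place. This is a small sketch only: `research_endpoints` is a hypothetical helper, and the base URL is assumed to be the local server address used elsewhere in this guide.

```python
from urllib.parse import quote, urljoin


def research_endpoints(base_url, research_id):
    """Build the per-research URLs from the endpoint list above."""
    # Percent-encode the ID so unexpected characters cannot alter the path.
    rid = quote(str(research_id), safe="")
    return {
        "status": urljoin(base_url, f"/api/research/{rid}/status"),
        "report": urljoin(base_url, f"/api/report/{rid}"),
        "terminate": urljoin(base_url, f"/api/terminate/{rid}"),
    }
```

For example, an authenticated `requests` session could stop a running job with `session.post(urls["terminate"], headers={"X-CSRF-Token": csrf_token})`, where `urls = research_endpoints("http://localhost:5000", research_id)`.
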
### Settings Endpoints

Settings endpoints are under `/settings/api/`:

- `GET /settings/api` - Get all settings
- `GET /settings/api/{key}` - Get specific setting
- `PUT /settings/api/{key}` - Update setting
- `GET /settings/api/available-models` - Get available LLM providers
- `GET /settings/api/available-search-engines` - Get search engines

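Reading and updating a single setting might look roughly like the sketch below. It takes an already authenticated session object (for example a `requests.Session` after login) and a CSRF token. The helper names are hypothetical, and the `{"value": ...}` request body is an assumption about the payload shape, not something this guide confirms; check the server's response if the update is rejected.

```python
def get_setting(session, base_url, key):
    """GET /settings/api/{key} and return the decoded JSON body."""
    response = session.get(f"{base_url}/settings/api/{key}")
    response.raise_for_status()
    return response.json()


def update_setting(session, base_url, key, value, csrf_token):
    """PUT /settings/api/{key} with the CSRF header required for updates."""
    response = session.put(
        f"{base_url}/settings/api/{key}",
        json={"value": value},  # assumed payload shape
        headers={"X-CSRF-Token": csrf_token},
    )
    response.raise_for_status()
    return response.json()
```

The `PUT` is a state-changing request, which is why it carries the `X-CSRF-Token` header while the `GET` does not.
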
### History Endpoints

- `GET /history/api` - Get research history
- `GET /history/api/{id}` - Get specific research details

## Important Changes from v1.x

1. **Authentication Required**: All API endpoints now require authentication
2. **Settings Snapshot**: Programmatic API calls need a `settings_snapshot` parameter
3. **Per-User Databases**: Each user has their own encrypted database
4. **CSRF Protection**: State-changing requests require a CSRF token
5. **New Endpoint Structure**: Research APIs are under `/api/` (e.g., `/api/start_research`)

## Example: Complete Research Flow

```python
import time

import requests
from bs4 import BeautifulSoup

# Set up a session and log in. The login endpoint expects form data
# (not JSON) together with the CSRF token from the login page.
session = requests.Session()
login_page = session.get("http://localhost:5000/auth/login")
soup = BeautifulSoup(login_page.text, "html.parser")
csrf_input = soup.find("input", {"name": "csrf_token"})
session.post(
    "http://localhost:5000/auth/login",
    data={
        "username": "user",
        "password": "pass",
        "csrf_token": csrf_input.get("value") if csrf_input else None,
    },
)

# Get CSRF token for API requests
csrf = session.get("http://localhost:5000/auth/csrf-token").json()["csrf_token"]
headers = {"X-CSRF-Token": csrf}

# Start research
research = session.post(
    "http://localhost:5000/api/start_research",
    json={
        "query": "Latest advances in quantum computing",
        "model": "gpt-3.5-turbo",
        "search_engines": ["arxiv", "wikipedia"],
        "iterations": 3
    },
    headers=headers
).json()

research_id = research["research_id"]

# Poll for results
while True:
    status = session.get(
        f"http://localhost:5000/api/research/{research_id}/status"
    ).json()

    if status["status"] in ["completed", "failed"]:
        break

    print(f"Progress: {status.get('progress', 'unknown')}")
    time.sleep(5)

# Get final results
results = session.get(
    f"http://localhost:5000/api/report/{research_id}"
).json()

print(f"Summary: {results['summary']}")
print(f"Sources: {len(results['sources'])}")
```

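The polling loop in this flow runs until the server reports `completed` or `failed`, so a stalled job would keep it spinning indefinitely. One possible variant with an overall timeout is sketched below. `wait_for_research` is a hypothetical helper, and the default timeout and interval are illustrative values rather than recommendations from the project.

```python
import time


def wait_for_research(get_status, timeout=600.0, interval=5.0, sleep=time.sleep):
    """Poll get_status() until research completes, fails, or the timeout expires.

    get_status is any zero-argument callable returning the decoded status JSON.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status.get("status") in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Research still running after {timeout} seconds")
        sleep(interval)
```

It could be called as `wait_for_research(lambda: session.get(f"http://localhost:5000/api/research/{research_id}/status").json())`. Passing the status lookup as a callable keeps the helper independent of any particular HTTP library.
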
## Rate Limiting

The API includes adaptive rate limiting:

- Default: 60 requests per minute per user
- Automatic retry with exponential backoff
- Rate limits are per-user, not per-IP

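Client scripts that want to handle `429` responses themselves could apply the same exponential backoff idea. The sketch below is one way to do it, not the server's own retry logic: `request_with_backoff` is a hypothetical helper, and the retry count and base delay are assumed values.

```python
import time


def request_with_backoff(send, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry with exponential backoff while it returns HTTP 429.

    send is any zero-argument callable returning an object with a
    status_code attribute, such as a requests.Response.
    """
    attempt = 0
    while True:
        response = send()
        if response.status_code != 429 or attempt >= max_retries:
            return response
        # The delay doubles on each retry: base_delay, 2x, 4x, ...
        sleep(base_delay * 2 ** attempt)
        attempt += 1
```

A call such as `request_with_backoff(lambda: session.get(url))` returns the first non-429 response, or the last 429 once the retries are used up, so the caller still has to inspect the status code.
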
## Error Handling

Common error responses:

- `401`: Not authenticated - login required
- `403`: CSRF token missing or invalid
- `404`: Resource not found
- `429`: Rate limit exceeded
- `500`: Server error

Always check response status and handle errors appropriately.

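One way to act on this advice is to route every response through a single check that turns the status codes listed above into exceptions. This is a minimal sketch: `APIError`, `ERROR_HINTS`, and `check_response` are hypothetical names, not part of the Local Deep Research client.

```python
# Human-readable hints for the status codes listed in this section.
ERROR_HINTS = {
    401: "Not authenticated: log in first",
    403: "CSRF token missing or invalid",
    404: "Resource not found",
    429: "Rate limit exceeded",
    500: "Server error",
}


class APIError(Exception):
    """Raised when the API answers with an error status code."""

    def __init__(self, status_code, message):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code


def check_response(response):
    """Return the response unchanged on success, raise APIError otherwise."""
    if response.status_code < 400:
        return response
    hint = ERROR_HINTS.get(response.status_code, "Unexpected error")
    raise APIError(response.status_code, hint)
```

Wrapping calls as `check_response(session.get(url)).json()` keeps error handling out of the main flow, and callers can branch on `APIError.status_code` where needed, for example to log in again after a `401`.
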
## Next Steps

- See [examples/api_usage](../examples/api_usage/) for complete examples
- Check [docs/CUSTOM_LLM_INTEGRATION.md](CUSTOM_LLM_INTEGRATION.md) for custom LLM setup
- Read [docs/LANGCHAIN_RETRIEVER_INTEGRATION.md](LANGCHAIN_RETRIEVER_INTEGRATION.md) for custom retrievers