Files
local-deep-research/docs/api-quickstart.md
LearningCircuit 7d8fdee7dd fix: unify SettingsManagers, fix env var bugs (#2070)
* fix: unify SettingsManagers, fix env var bugs, delete duplicate

Two parallel SettingsManager implementations existed (settings/manager.py
and web/services/settings_manager.py) that diverged accidentally, each
with different bugs. This unifies them into a single implementation.

Bug fixes in settings/manager.py:
- get_setting() now checks env vars when setting is not in DB (was
  jumping straight to return default, ignoring env override)
- get_all_settings() now type-converts env overrides through
  get_typed_setting_value() (was storing raw strings like "true"
  instead of True)
- create_or_update_setting() now correctly checks db_setting.editable
  (was checking input dict's .editable which caused AttributeError)
- Added missing ui_element types: textarea, multiselect

Features added to settings/manager.py:
- get_bool_setting() method (required by rag_routes.py)
- default_settings now loads all 18 JSON files via rglob (was only
  loading 1 file with 370 settings, now loads 526)

All production and test imports updated from web.services.settings_manager
to settings.manager. Duplicate web/services/settings_manager.py deleted.

314 tests pass across 7 test files. 9 new tests cover bug fixes.

* test: add 29 tests for unified SettingsManager coverage gaps (#2071)

Cover create_or_update_setting (8 tests), default_settings property (4),
_ensure_settings_initialized (2), new UI element types textarea/multiselect/
range (4), _emit_settings_changed error resilience (3), plus edge cases
for get_setting check_env=False, get_all_settings with locked settings,
get_bool_setting with integers, parse_boolean edge cases, and env override
type conversion for text settings.

* fix: add missing abstract methods, env var defaults override, and type bug (#2074)

- Add get_bool_setting() and get_settings_snapshot() abstract methods to
  ISettingsManager base class so the interface contract is complete
- Fix create_or_update_setting: use setting_obj.type directly instead of
  SettingType[setting_obj.type.upper()] which fails when type is already
  a SettingType enum from the Pydantic model
- Add env var override in get_all_settings() defaults loop so settings
  not yet in DB can still be overridden via LDR_* environment variables
- Fix test_get_all_settings_db_error to expect defaults on DB failure
  (graceful degradation after unification)

* refactor: deduplicate provider availability checks and settings wrapper (#2054) (#2068)

- Delegate 5 provider availability functions in llm_config.py to their
  existing provider class is_available() methods (OpenAI, Anthropic,
  CustomOpenAIEndpoint, Ollama, LMStudio)
- Extract _get_or_create_status() helper in queue_service.py to
  eliminate duplicated QueueStatus lookup-or-create pattern
- Centralize get_llm_setting_from_snapshot() in thread_settings.py,
  replacing 6 identical copy-pasted wrappers across provider files
- Update test mock targets to reflect new delegation pattern

* fix: add missing abstract method implementations to InMemorySettingsManager

InMemorySettingsManager was missing get_bool_setting() and
get_settings_snapshot() implementations required by the ISettingsManager
ABC, causing TypeError on instantiation and cascading failures in
LLM unit tests, REST API tests, and Puppeteer auth tests.

* fix: convert web SettingType to database SettingType in create_or_update_setting

The PR changed `type=SettingType[setting_obj.type.upper()]` to
`type=setting_obj.type`, but setting_obj.type is a web model SettingType
(str, Enum) while Setting.type expects the database SettingType (enum.Enum).
This causes a 500 error when creating new settings via PUT endpoint.

Use `.name` for cleaner enum-to-enum conversion instead of `.upper()`.

* fix: add multiselect type conversion and warn on untyped env overrides (#2080)

Address review feedback from @djpetti on PR #2070:

1. Replace multiselect `lambda x: x` with `_parse_multiselect()` that
   properly handles env var strings — parses JSON arrays (e.g.
   '["markdown","latex"]') and comma-separated values (e.g.
   'markdown,latex') while passing through lists from SQLAlchemy
   unchanged.

2. Log a warning when get_setting() encounters an env var override for
   a setting not in defaults, returning the raw string without type
   conversion. This surfaces settings that should be added to a
   defaults JSON file to get proper type information.

Tests: 14 new tests (111 total in test_settings_manager.py, 0 failures)

* test: add tests for consolidated UI element-to-type mapping

Verifies single canonical _UI_ELEMENT_TO_SETTING_TYPE is reused by
both InMemorySettingsManager and SettingsManager.
2026-02-11 06:59:07 +01:00

232 lines
6.5 KiB
Markdown

# API Quick Start
## Overview
Local Deep Research provides both HTTP REST API and programmatic Python API access. Since version 1.0, authentication is required for all API endpoints, and the system uses per-user encrypted databases.
## Simplest Usage - Python Client
The easiest way to use the API is with the built-in client that handles all authentication complexity:
```python
from local_deep_research.api import LDRClient, quick_query
# One-liner for quick research
summary = quick_query("username", "password", "What is DNA?")
print(summary)
# Or use the client for multiple operations
client = LDRClient()
client.login("username", "password")
result = client.quick_research("What is machine learning?")
print(result["summary"])
```
No need to worry about CSRF tokens, HTML parsing, or session management!
## Authentication
### Web UI Authentication
The API requires authentication through the web interface first:
1. Start the server:
```bash
python -m local_deep_research.web.app
```
2. Open http://localhost:5000 in your browser
3. Register a new account or login
4. Your session cookie will be used for API authentication
### HTTP API Authentication
For HTTP API requests, you need to:
1. First authenticate through the login endpoint
2. Include the session cookie in subsequent requests
3. Include CSRF token for state-changing operations
Example authentication flow:
```python
import requests
from bs4 import BeautifulSoup
# Create a session to persist cookies
session = requests.Session()
# 1. Get login page and extract CSRF token for login
login_page = session.get("http://localhost:5000/auth/login")
soup = BeautifulSoup(login_page.text, 'html.parser')
csrf_input = soup.find('input', {'name': 'csrf_token'})
login_csrf = csrf_input.get('value') if csrf_input else None
# 2. Login with form data (not JSON) including CSRF
login_response = session.post(
"http://localhost:5000/auth/login",
data={
"username": "your_username",
"password": "your_password",
"csrf_token": login_csrf
}
)
if login_response.status_code in [200, 302]:
print("Login successful")
# Session cookie is automatically stored
else:
print(f"Login failed: {login_response.text}")
# 3. Get CSRF token for API requests
csrf_response = session.get("http://localhost:5000/auth/csrf-token")
csrf_token = csrf_response.json()["csrf_token"]
# 4. Make API requests with CSRF header
headers = {"X-CSRF-Token": csrf_token}
api_response = session.post(
"http://localhost:5000/api/start_research",
json={
"query": "What is quantum computing?",
"model": "gpt-3.5-turbo",
"search_engines": ["searxng"],
},
headers=headers
)
```
## Programmatic API Access
The programmatic API now requires a settings snapshot for proper context:
```python
from local_deep_research.api import quick_summary
from local_deep_research.settings import SettingsManager
from local_deep_research.database.session_context import get_user_db_session
# Get user session and settings
with get_user_db_session(username="your_username", password="your_password") as session:
settings_manager = SettingsManager(session)
settings_snapshot = settings_manager.get_all_settings()
# Use the API with settings snapshot
result = quick_summary(
query="What is machine learning?",
settings_snapshot=settings_snapshot,
iterations=2,
questions_per_iteration=3
)
print(result["summary"])
```
## API Endpoints
### Research Endpoints
Research endpoints are under `/api/`:
- `POST /api/start_research` - Start new research
- `GET /api/research/{id}/status` - Check research status
- `GET /api/report/{id}` - Get research results
- `POST /api/terminate/{id}` - Stop running research
### Settings Endpoints
Settings endpoints are under `/settings/api/`:
- `GET /settings/api` - Get all settings
- `GET /settings/api/{key}` - Get specific setting
- `PUT /settings/api/{key}` - Update setting
- `GET /settings/api/available-models` - Get available LLM providers
- `GET /settings/api/available-search-engines` - Get search engines
### History Endpoints
- `GET /history/api` - Get research history
- `GET /history/api/{id}` - Get specific research details
## Important Changes from v1.x
1. **Authentication Required**: All API endpoints now require authentication
2. **Settings Snapshot**: Programmatic API calls need settings_snapshot parameter
3. **Per-User Databases**: Each user has their own encrypted database
4. **CSRF Protection**: State-changing requests require CSRF token
5. **New Endpoint Structure**: Research APIs are under `/api/` (e.g., `/api/start_research`)
## Example: Complete Research Flow
```python
import requests
import time
# Setup session and login
session = requests.Session()
session.post(
"http://localhost:5000/auth/login",
json={"username": "user", "password": "pass"}
)
# Get CSRF token
csrf = session.get("http://localhost:5000/auth/csrf-token").json()["csrf_token"]
headers = {"X-CSRF-Token": csrf}
# Start research
research = session.post(
"http://localhost:5000/api/start_research",
json={
"query": "Latest advances in quantum computing",
"model": "gpt-3.5-turbo",
"search_engines": ["arxiv", "wikipedia"],
"iterations": 3
},
headers=headers
).json()
research_id = research["research_id"]
# Poll for results
while True:
status = session.get(
f"http://localhost:5000/api/research/{research_id}/status"
).json()
if status["status"] in ["completed", "failed"]:
break
print(f"Progress: {status.get('progress', 'unknown')}")
time.sleep(5)
# Get final results
results = session.get(
f"http://localhost:5000/api/report/{research_id}"
).json()
print(f"Summary: {results['summary']}")
print(f"Sources: {len(results['sources'])}")
```
## Rate Limiting
The API includes adaptive rate limiting:
- Default: 60 requests per minute per user
- Automatic retry with exponential backoff
- Rate limits are per-user, not per-IP
## Error Handling
Common error responses:
- `401`: Not authenticated - login required
- `403`: CSRF token missing or invalid
- `404`: Resource not found
- `429`: Rate limit exceeded
- `500`: Server error
Always check response status and handle errors appropriately.
## Next Steps
- See [examples/api_usage](../examples/api_usage/) for complete examples
- Check [docs/CUSTOM_LLM_INTEGRATION.md](CUSTOM_LLM_INTEGRATION.md) for custom LLM setup
- Read [docs/LANGCHAIN_RETRIEVER_INTEGRATION.md](LANGCHAIN_RETRIEVER_INTEGRATION.md) for custom retrievers