local-deep-research/docs/api-quickstart.md

# API Quick Start

## Overview

Local Deep Research provides both HTTP REST API and programmatic Python API access. Since version 1.0, authentication is required for all API endpoints, and the system uses per-user encrypted databases.

## Simplest Usage - Python Client

The easiest way to use the API is with the built-in client that handles all authentication complexity:

```python
from local_deep_research.api import LDRClient, quick_query

# One-liner for quick research
summary = quick_query("username", "password", "What is DNA?")
print(summary)

# Or use the client for multiple operations
client = LDRClient()
client.login("username", "password")
result = client.quick_research("What is machine learning?")
print(result["summary"])
```

No need to worry about CSRF tokens, HTML parsing, or session management!

## Authentication

### Web UI Authentication

The API requires authentication through the web interface first:

1. Start the server:
   ```bash
   python -m local_deep_research.web.app
   ```

2. Open http://localhost:5000 in your browser
3. Register a new account or login
4. Your session cookie will be used for API authentication

### HTTP API Authentication

For HTTP API requests, you need to:

1. First authenticate through the login endpoint
2. Include the session cookie in subsequent requests
3. Include CSRF token for state-changing operations

Example authentication flow:

```python
import requests
from bs4 import BeautifulSoup

# Create a session to persist cookies
session = requests.Session()

# 1. Get login page and extract CSRF token for login
login_page = session.get("http://localhost:5000/auth/login")
soup = BeautifulSoup(login_page.text, 'html.parser')
csrf_input = soup.find('input', {'name': 'csrf_token'})
login_csrf = csrf_input.get('value') if csrf_input else None

# 2. Login with form data (not JSON) including CSRF
login_response = session.post(
    "http://localhost:5000/auth/login",
    data={
        "username": "your_username",
        "password": "your_password",
        "csrf_token": login_csrf
    }
)

if login_response.status_code in [200, 302]:
    print("Login successful")
    # Session cookie is automatically stored
else:
    print(f"Login failed: {login_response.text}")

# 3. Get CSRF token for API requests
csrf_response = session.get("http://localhost:5000/auth/csrf-token")
csrf_token = csrf_response.json()["csrf_token"]

# 4. Make API requests with CSRF header
headers = {"X-CSRF-Token": csrf_token}
api_response = session.post(
    "http://localhost:5000/api/start_research",
    json={
        "query": "What is quantum computing?",
        "model": "gpt-3.5-turbo",
        "search_engines": ["searxng"],
    },
    headers=headers
)
```

## Programmatic API Access

The programmatic API now requires a settings snapshot for proper context:

```python
from local_deep_research.api import quick_summary
from local_deep_research.settings import SettingsManager
from local_deep_research.database.session_context import get_user_db_session

# Get user session and settings
with get_user_db_session(username="your_username", password="your_password") as session:
    settings_manager = SettingsManager(session)
    settings_snapshot = settings_manager.get_all_settings()

    # Use the API with settings snapshot
    result = quick_summary(
        query="What is machine learning?",
        settings_snapshot=settings_snapshot,
        iterations=2,
        questions_per_iteration=3
    )

    print(result["summary"])
```

## API Endpoints

### Research Endpoints

Research endpoints are under `/api/`:

- `POST /api/start_research` - Start new research
- `GET /api/research/{id}/status` - Check research status
- `GET /api/report/{id}` - Get research results
- `POST /api/terminate/{id}` - Stop running research

### Settings Endpoints

Settings endpoints are under `/settings/api/`:

- `GET /settings/api` - Get all settings
- `GET /settings/api/{key}` - Get specific setting
- `PUT /settings/api/{key}` - Update setting
- `GET /settings/api/available-models` - Get available LLM providers
- `GET /settings/api/available-search-engines` - Get search engines

### History Endpoints

- `GET /history/api` - Get research history
- `GET /history/api/{id}` - Get specific research details

## Important Changes from v1.x

1. **Authentication Required**: All API endpoints now require authentication
2. **Settings Snapshot**: Programmatic API calls need settings_snapshot parameter
3. **Per-User Databases**: Each user has their own encrypted database
4. **CSRF Protection**: State-changing requests require CSRF token
5. **New Endpoint Structure**: Research APIs are under `/api/` (e.g., `/api/start_research`)

## Example: Complete Research Flow

```python
import requests
import time

# Setup session and login
session = requests.Session()
session.post(
    "http://localhost:5000/auth/login",
    json={"username": "user", "password": "pass"}
)

# Get CSRF token
csrf = session.get("http://localhost:5000/auth/csrf-token").json()["csrf_token"]
headers = {"X-CSRF-Token": csrf}

# Start research
research = session.post(
    "http://localhost:5000/api/start_research",
    json={
        "query": "Latest advances in quantum computing",
        "model": "gpt-3.5-turbo",
        "search_engines": ["arxiv", "wikipedia"],
        "iterations": 3
    },
    headers=headers
).json()

research_id = research["research_id"]

# Poll for results
while True:
    status = session.get(
        f"http://localhost:5000/api/research/{research_id}/status"
    ).json()

    if status["status"] in ["completed", "failed"]:
        break

    print(f"Progress: {status.get('progress', 'unknown')}")
    time.sleep(5)

# Get final results
results = session.get(
    f"http://localhost:5000/api/report/{research_id}"
).json()

print(f"Summary: {results['summary']}")
print(f"Sources: {len(results['sources'])}")
```

## Rate Limiting

The API includes adaptive rate limiting:
- Default: 60 requests per minute per user
- Automatic retry with exponential backoff
- Rate limits are per-user, not per-IP

## Error Handling

Common error responses:
- `401`: Not authenticated - login required
- `403`: CSRF token missing or invalid
- `404`: Resource not found
- `429`: Rate limit exceeded
- `500`: Server error

Always check response status and handle errors appropriately.

## Next Steps

- See [examples/api_usage](../examples/api_usage/) for complete examples
- Check [docs/CUSTOM_LLM_INTEGRATION.md](CUSTOM_LLM_INTEGRATION.md) for custom LLM setup
- Read [docs/LANGCHAIN_RETRIEVER_INTEGRATION.md](LANGCHAIN_RETRIEVER_INTEGRATION.md) for custom retrievers