Testing Guide for Local Deep Research
This document provides a comprehensive guide to running tests in the Local Deep Research project.
Quick Start
# Fast feedback loop (< 30 seconds)
python tests/run_all_tests.py fast
# Standard development testing (< 5 minutes)
python tests/run_all_tests.py standard
# Full comprehensive testing (< 15 minutes)
python tests/run_all_tests.py full
# Run with external server (skip automatic startup)
python tests/run_all_tests.py standard --no-server-start
# Unit tests only (no server needed)
python tests/run_all_tests.py unit-only
Test Structure
The project uses a multi-layered testing approach with different types of tests organized by purpose and execution speed:
Test Categories
| Category | Location | Purpose | Duration | Dependencies |
|---|---|---|---|---|
| Health Checks | tests/health_check/ | Fast endpoint validation | 5-30s | Server running |
| Unit Tests | tests/test_*.py | Component isolation testing | 30-60s | None |
| Feature Tests | tests/feature_tests/ | Feature-specific validation | 60-120s | Test DB |
| Integration Tests | tests/searxng/, tests/fix_tests/ | External service testing | 60-180s | External APIs |
| UI Tests | tests/ui_tests/ | Browser automation | 120-300s | Server + Node.js |
Test Technologies
- Python: pytest with coverage, requests for HTTP testing
- JavaScript: Puppeteer for browser automation
- Shell: curl-based health checks for minimal dependencies
Test Execution Profiles
1. Fast Profile (fast)
Purpose: Rapid feedback during development
Duration: < 30 seconds
Includes: Health checks + Unit tests
python tests/run_all_tests.py fast
2. Standard Profile (standard)
Purpose: Regular development workflow
Duration: < 5 minutes
Includes: Fast + UI tests (core workflows)
python tests/run_all_tests.py standard
3. Full Profile (full)
Purpose: Comprehensive validation before releases
Duration: < 15 minutes
Includes: All tests including external integrations
python tests/run_all_tests.py full
4. CI Profile (ci)
Purpose: Continuous integration optimized
Duration: < 2 minutes
Includes: Fast + selected stable tests
python tests/run_all_tests.py ci
5. Unit-Only Profile (unit-only)
Purpose: Pure unit testing without server dependencies
Duration: < 10 seconds
Includes: Unit and feature tests only
python tests/run_all_tests.py unit-only
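The profiles above can be read as a mapping from a profile name to the test categories it runs. The following is a minimal sketch of that idea, not the actual contents of run_all_tests.py: the names PROFILES, NEEDS_SERVER, suites_for, and needs_server are hypothetical, and the ci profile is left out because its "selected stable tests" are not listed in this guide.

```python
# Hypothetical mapping of profile names to test categories, based only on
# the descriptions in this guide. The real runner may be organized differently.
PROFILES = {
    "fast": ["health_check", "unit"],
    "standard": ["health_check", "unit", "ui"],
    "full": ["health_check", "unit", "feature", "integration", "ui"],
    "unit-only": ["unit", "feature"],
}

# Categories that the Test Categories table lists as requiring a running server.
NEEDS_SERVER = {"health_check", "ui"}


def suites_for(profile):
    """Return the list of test categories for a profile name."""
    try:
        return list(PROFILES[profile])
    except KeyError:
        raise ValueError(f"Unknown profile: {profile!r}") from None


def needs_server(profile):
    """Return True if any category in the profile requires a running server."""
    return any(suite in NEEDS_SERVER for suite in suites_for(profile))
```

Under these assumptions, needs_server("unit-only") is False, which matches the note that the unit-only profile runs without a server.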
Individual Test Runners
Health Checks
Fast endpoint validation to ensure the server is responding correctly:
# Python version (auto-detects running server)
python tests/health_check/run_quick_health_check.py
# Shell version (minimal dependencies)
bash tests/health_check/test_endpoints_health.sh
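For illustration, an endpoint health check can be as small as requesting each path and recording whether it answered with HTTP 200. The sketch below is an assumption about the general shape of such a check, not the code in run_quick_health_check.py; the function names and the example paths are hypothetical, and it uses only the standard library so it runs without extra dependencies.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def default_fetch(url, timeout=5.0):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        # The server answered, but with an error status such as 404 or 500.
        return error.code
    except (URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar problems.
        return None


def check_endpoints(base_url, paths, fetch=default_fetch):
    """Map each path to True if it returned HTTP 200, otherwise False."""
    results = {}
    for path in paths:
        status = fetch(base_url.rstrip("/") + path)
        results[path] = status == 200
    return results


if __name__ == "__main__":
    # The paths here are placeholders; substitute the routes you care about.
    for path, ok in check_endpoints("http://127.0.0.1:5000", ["/"]).items():
        print(("OK  " if ok else "FAIL"), path)
```

Passing the fetch function as a parameter keeps the check itself testable without a running server.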
Python Tests
Unit and integration tests using pytest:
# Run all Python tests with coverage
python run_tests.py
# Run specific test categories
pytest tests/test_*.py -v # Unit tests only
pytest tests/feature_tests/ -v # Feature tests only
pytest tests/searxng/ -v # Integration tests only
UI Tests
Browser automation tests using Puppeteer:
# Run all UI tests
node tests/ui_tests/run_all_tests.js
# Run individual UI tests
node tests/ui_tests/test_cost_analytics.js # Cost analytics page
node tests/ui_tests/test_settings_page.js # Settings functionality
node tests/ui_tests/test_metrics_charts.js # Chart visualizations
Prerequisites
Required for All Tests
- Python 3.8+ with project dependencies installed
- Local Deep Research server running on http://127.0.0.1:5000 (not needed for the unit-only profile)
Additional Requirements by Test Type
UI Tests:
- Node.js (for Puppeteer)
- Chrome/Chromium browser
- Server must be running and accessible
Integration Tests:
- Network access for external APIs
- Valid API keys (if testing external search engines)
- SearXNG instance (for SearXNG integration tests)
Health Checks:
- curl (for shell version)
- requests library (for Python version)
Running Tests in Development
Before Committing Code
# Quick validation
python tests/run_all_tests.py fast
# If fast tests pass, run standard
python tests/run_all_tests.py standard
Before Creating a Pull Request
# Run comprehensive tests
python tests/run_all_tests.py full
Debugging Failed Tests
# Run with verbose output
pytest tests/ -v -s
# Run specific failing test
pytest tests/test_specific_test.py::test_function -v -s
# UI test debugging (saves screenshots)
node tests/ui_tests/test_specific_ui.js
# Check tests/ui_tests/screenshots/ for visual debugging
Test Configuration
pytest Configuration
Configuration is handled in:
- pyproject.toml: pytest settings and coverage configuration
- tests/conftest.py: test fixtures and database mocking
- .coveragerc: coverage reporting settings
UI Test Configuration
Puppeteer tests are configured with:
- 3-second navigation timeout for faster execution
- Screenshot capture for debugging
- Automatic retry for flaky network operations
Environment Variables
# Set Python path for proper imports
export PYTHONPATH=/path/to/local-deep-research
# Optional: Configure test database
export TEST_DATABASE_URL=sqlite:///test.db
# Optional: Skip slow tests
export SKIP_SLOW_TESTS=1
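How the test suite reads these variables is not shown in this guide. As a rough sketch, a flag such as SKIP_SLOW_TESTS could be interpreted with a small helper like the one below; the helper name env_flag and the set of accepted truthy values are assumptions.

```python
import os


def env_flag(name, environ=None):
    """Return True if the environment variable is set to a truthy value."""
    environ = os.environ if environ is None else environ
    value = environ.get(name, "")
    # Treat "1", "true", and "yes" (any case) as enabled; anything else is off.
    return value.strip().lower() in {"1", "true", "yes"}


# Evaluated once at import time, for example near the top of a test module.
SKIP_SLOW = env_flag("SKIP_SLOW_TESTS")
```

In a pytest module, the result could then feed a marker such as pytest.mark.skipif(SKIP_SLOW, reason="SKIP_SLOW_TESTS is set") on the slow tests.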
Continuous Integration
GitHub Actions / CI Pipeline
Recommended CI test strategy:
# Fast checks on every PR
- name: Fast Tests
  run: python tests/run_all_tests.py ci
# Full validation before merge
- name: Full Tests
  run: python tests/run_all_tests.py full
  if: github.event_name == 'push' && github.ref == 'refs/heads/main'
Local Pre-commit Hooks
Add to .pre-commit-config.yaml:
- repo: local
  hooks:
    - id: fast-tests
      name: Fast Tests
      entry: python tests/run_all_tests.py fast
      language: system
      pass_filenames: false
Test Data and Fixtures
Database Testing
Tests use isolated SQLite databases with fixtures defined in tests/conftest.py:
- Automatic rollback after each test
- Mock data for consistent testing
- No impact on production data
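The rollback behavior described above can be sketched with the standard library alone. This is an illustration of the pattern rather than the fixtures in tests/conftest.py: the table name, the helper names make_test_db and rollback_session, and the use of an in-memory database are all assumptions.

```python
import sqlite3
from contextlib import contextmanager


def make_test_db():
    """Create an isolated in-memory SQLite database with a sample schema."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table, used only for this example.
    conn.execute("CREATE TABLE research (id INTEGER PRIMARY KEY, query TEXT)")
    conn.commit()  # Persist the schema so a later rollback cannot remove it.
    return conn


@contextmanager
def rollback_session(conn):
    """Yield the connection and roll back any uncommitted changes afterwards."""
    try:
        yield conn
    finally:
        conn.rollback()


if __name__ == "__main__":
    conn = make_test_db()
    with rollback_session(conn) as db:
        db.execute("INSERT INTO research (query) VALUES (?)", ("example",))
    # The insert was never committed, so the table is empty again.
    print(conn.execute("SELECT COUNT(*) FROM research").fetchone()[0])
```

A pytest fixture could wrap the same two helpers, yielding the connection to the test and rolling back during teardown.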
UI Test Screenshots
UI tests automatically capture screenshots:
- Saved to tests/ui_tests/screenshots/
- Useful for debugging visual issues
- Automatically cleaned up after successful runs
External API Mocking
Integration tests can use mocked responses:
- Real API calls in integration environment
- Mocked responses for unit tests
- Configurable via environment variables
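One common way to get mocked responses in unit tests is to pass the HTTP call in as a parameter and substitute a unittest.mock object for it. The sketch below shows that pattern under assumed names (fetch_status and make_fake_get are hypothetical); it does not describe how this project's tests are wired.

```python
from unittest.mock import Mock


def fetch_status(http_get, url):
    """Call the supplied HTTP getter and return the response status code."""
    response = http_get(url, timeout=10)
    return response.status_code


def make_fake_get(status_code=200, payload=None):
    """Build a stand-in for requests.get that returns a canned response."""
    response = Mock()
    response.status_code = status_code
    response.json.return_value = payload or {}
    return Mock(return_value=response)


if __name__ == "__main__":
    fake_get = make_fake_get(200, {"items": []})
    # No network traffic occurs; the mock records the call instead.
    print(fetch_status(fake_get, "https://api.example.com/data"))
    fake_get.assert_called_once_with("https://api.example.com/data", timeout=10)
```

In an integration environment the same fetch_status function could be given requests.get, so the code under test stays identical in both modes.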
Troubleshooting
Common Issues
"Server not running" error:
# Option 1: Start server manually, then run tests
pdm run ldr-web
# In another terminal:
python tests/run_all_tests.py standard --no-server-start
Server startup hangs during tests:
# Skip automatic server startup and start manually
pdm run ldr-web & # Start in background
# Run tests without automatic server startup
python tests/run_all_tests.py standard --no-server-start
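When the server is started manually in the background, it can help to wait until its port accepts connections before launching the tests. The following is a small standard-library sketch of such a wait loop; the function name wait_for_server and its default timeout values are assumptions, and the host and port match the address used elsewhere in this guide.

```python
import socket
import time


def wait_for_server(host="127.0.0.1", port=5000, timeout=30.0, interval=0.5):
    """Return True once a TCP connection succeeds, or False after the timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connection means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    if wait_for_server():
        print("Server is accepting connections")
    else:
        print("Server did not come up in time")
```

This only confirms that the port is open; it does not verify that the application behind it is healthy, which is what the health checks are for.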
"Node.js not found" error:
# Install Node.js (Ubuntu/Debian)
sudo apt install nodejs npm
# Install Node.js (macOS)
brew install node
# Verify installation
node --version
Import errors in tests:
# Ensure PYTHONPATH is set
export PYTHONPATH=$(pwd)
python tests/run_all_tests.py fast
Puppeteer browser launch failures:
# Install missing dependencies (Ubuntu/Debian)
sudo apt install chromium-browser
# Or use bundled Chromium
npm install puppeteer
Performance Issues
Tests running slowly:
- Use the fast profile for development
- Check network connectivity for integration tests
- Verify server performance with health checks
UI tests timing out:
- Increase timeout in individual test files
- Check browser developer tools for JavaScript errors
- Verify server is responding quickly
Test Coverage
Generate detailed coverage reports:
# HTML coverage report
python run_tests.py
open coverage_html/index.html
# Terminal coverage report
pytest tests/ --cov=src --cov-report=term-missing
Adding New Tests
Unit Tests
Add to tests/test_new_feature.py:
import pytest
from src.local_deep_research.module import function

def test_new_function():
    assert function("input") == "expected_output"
UI Tests
Add to tests/ui_tests/test_new_ui_feature.js:
const puppeteer = require('puppeteer');
(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('http://127.0.0.1:5000/new-page');
    await page.waitForSelector('.new-feature');
    console.log('✅ New UI feature test passed');
    await browser.close();
})();
Integration Tests
Add to tests/test_new_integration.py:
import pytest
import requests

def test_external_api_integration():
    # Test real API integration; the timeout keeps a dead endpoint from hanging the run
    response = requests.get("https://api.example.com/data", timeout=10)
    assert response.status_code == 200
Summary
The Local Deep Research testing framework provides multiple execution profiles to balance thoroughness with speed. Use the run_all_tests.py script for orchestrated testing, or run individual test suites for targeted debugging. The modular approach ensures you can quickly validate changes during development while maintaining comprehensive coverage for releases.