Files
local-deep-research/tests
LearningCircuit b1cbc6fbe0 fix(llm): normalize str-returns to a message in the central LLM wrapper (+ ainvoke) (#4342)
* fix(llm): normalize str-returns to a message in ProcessingLLMWrapper + add ainvoke

The central wrapper (returned by all get_llm paths) stripped <think> tags but
returned a bare str when the base LLM returned a str — that inconsistent shape is
the root of the recurring "'str' object has no attribute 'content'" crashes we've
been fixing site-by-site (#3884 -> #4339).

Generic fix at the choke point:
- invoke(): when the base returns a bare str, wrap it into AIMessage(content=stripped)
  instead of returning a str. Message returns are unchanged (mutate .content in place,
  preserving additional_kwargs/reasoning_content/tool_calls). Other types pass through.
- add ainvoke(): mirrors invoke(); without it, the 7 direct .ainvoke() sites
  (browsecomp_entity/modular strategies) bypassed think-stripping via __getattr__.

Now every get_llm LLM yields a think-free str .content on both sync and async direct
calls, so the raw .invoke().content sites are safe automatically (deferred per-site
migration cancelled). Reasoning-safe: only .content is rewritten, so DeepSeek
thinking-mode reasoning_content round-tripping (#4194) is not worsened.

Limitation: the LangGraph create_agent path binds tools on the base model
(model.bind_tools via __getattr__), so it bypasses this wrapper — unchanged by this PR.

Tests: updated the 2 tests asserting a str return; added shape, reasoning_content/
tool_calls-preservation (#4194 guard), and ainvoke regression tests.
mypy 552 clean; ruff clean; 2171 passed across 78 LLM-layer test files + citation_handlers.

* refactor(llm): extract _log_llm_error helper + add type hints (review polish)

Addresses the #4342 review recommendations:
- DRY: invoke() and ainvoke() shared the same try/except error-logging verbatim;
  extracted a _log_llm_error(error) static helper so they can't diverge.
- Type hints: added annotations to _normalize_response/_log_llm_error/invoke/ainvoke.
No behavior change. ruff + mypy clean; 112 config tests pass.
2026-05-25 16:47:11 +02:00
..

Testing Guide for Local Deep Research

This document provides a comprehensive guide to running tests in the Local Deep Research project.

Quick Start

# Fast feedback loop (< 30 seconds)
python tests/run_all_tests.py fast

# Standard development testing (< 5 minutes)
python tests/run_all_tests.py standard

# Full comprehensive testing (< 15 minutes)
python tests/run_all_tests.py full

# Run with external server (skip automatic startup)
python tests/run_all_tests.py standard --no-server-start

# Unit tests only (no server needed)
python tests/run_all_tests.py unit-only

Test Structure

The project uses a multi-layered testing approach with different types of tests organized by purpose and execution speed:

Test Categories

Category Location Purpose Duration Dependencies
Health Checks tests/health_check/ Fast endpoint validation 5-30s Server running
Unit Tests tests/test_*.py Component isolation testing 30-60s None
Feature Tests tests/feature_tests/ Feature-specific validation 60-120s Test DB
Integration Tests tests/searxng/, tests/fix_tests/ External service testing 60-180s External APIs
UI Tests tests/ui_tests/ Browser automation 120-300s Server + Node.js

Test Technologies

  • Python: pytest with coverage, requests for HTTP testing
  • JavaScript: Puppeteer for browser automation
  • Shell: curl-based health checks for minimal dependencies

Test Execution Profiles

1. Fast Profile (fast)

Purpose: Rapid feedback during development Duration: < 30 seconds Includes: Health checks + Unit tests

python tests/run_all_tests.py fast

2. Standard Profile (standard)

Purpose: Regular development workflow Duration: < 5 minutes Includes: Fast + UI tests (core workflows)

python tests/run_all_tests.py standard

3. Full Profile (full)

Purpose: Comprehensive validation before releases Duration: < 15 minutes Includes: All tests including external integrations

python tests/run_all_tests.py full

4. CI Profile (ci)

Purpose: Continuous integration optimized Duration: < 2 minutes Includes: Fast + selected stable tests

python tests/run_all_tests.py ci

5. Unit-Only Profile (unit-only)

Purpose: Pure unit testing without server dependencies Duration: < 10 seconds Includes: Unit and feature tests only

python tests/run_all_tests.py unit-only

Individual Test Runners

Health Checks

Fast endpoint validation to ensure the server is responding correctly:

# Python version (auto-detects running server)
python tests/health_check/run_quick_health_check.py

# Shell version (minimal dependencies)
bash tests/health_check/test_endpoints_health.sh

Python Tests

Unit and integration tests using pytest:

# Run all Python tests with coverage
python run_tests.py

# Run specific test categories
pytest tests/test_*.py -v                    # Unit tests only
pytest tests/feature_tests/ -v               # Feature tests only
pytest tests/searxng/ -v                     # Integration tests only

UI Tests

Browser automation tests using Puppeteer:

# Run all UI tests
node tests/ui_tests/run_all_tests.js

# Run individual UI tests
node tests/ui_tests/test_cost_analytics.js   # Cost analytics page
node tests/ui_tests/test_settings_page.js    # Settings functionality
node tests/ui_tests/test_metrics_charts.js   # Chart visualizations

Prerequisites

Required for All Tests

  • Python 3.8+ with project dependencies installed
  • Local Deep Research server running on http://127.0.0.1:5000

Additional Requirements by Test Type

UI Tests:

  • Node.js (for Puppeteer)
  • Chrome/Chromium browser
  • Server must be running and accessible

Integration Tests:

  • Network access for external APIs
  • Valid API keys (if testing external search engines)
  • SearXNG instance (for SearXNG integration tests)

Health Checks:

  • curl (for shell version)
  • requests library (for Python version)

Running Tests in Development

Before Committing Code

# Quick validation
python tests/run_all_tests.py fast

# If fast tests pass, run standard
python tests/run_all_tests.py standard

Before Creating a Pull Request

# Run comprehensive tests
python tests/run_all_tests.py full

Debugging Failed Tests

# Run with verbose output
pytest tests/ -v -s

# Run specific failing test
pytest tests/test_specific_test.py::test_function -v -s

# UI test debugging (saves screenshots)
node tests/ui_tests/test_specific_ui.js
# Check tests/ui_tests/screenshots/ for visual debugging

Test Configuration

pytest Configuration

Configuration is handled in:

  • pyproject.toml - pytest settings and coverage configuration
  • tests/conftest.py - test fixtures and database mocking
  • .coveragerc - coverage reporting settings

UI Test Configuration

Puppeteer tests are configured with:

  • 3-second navigation timeout for faster execution
  • Screenshot capture for debugging
  • Automatic retry for flaky network operations

Environment Variables

# Set Python path for proper imports
export PYTHONPATH=/path/to/local-deep-research

# Optional: Configure test database
export TEST_DATABASE_URL=sqlite:///test.db

# Optional: Skip slow tests
export SKIP_SLOW_TESTS=1

Continuous Integration

GitHub Actions / CI Pipeline

Recommended CI test strategy:

# Fast checks on every PR
- name: Fast Tests
  run: python tests/run_all_tests.py ci

# Full validation before merge
- name: Full Tests
  run: python tests/run_all_tests.py full
  if: github.event_name == 'push' && github.ref == 'refs/heads/main'

Local Pre-commit Hooks

Add to .pre-commit-config.yaml:

- repo: local
  hooks:
    - id: fast-tests
      name: Fast Tests
      entry: python tests/run_all_tests.py fast
      language: system
      pass_filenames: false

Test Data and Fixtures

Database Testing

Tests use isolated SQLite databases with fixtures defined in tests/conftest.py:

  • Automatic rollback after each test
  • Mock data for consistent testing
  • No impact on production data

UI Test Screenshots

UI tests automatically capture screenshots:

  • Saved to tests/ui_tests/screenshots/
  • Useful for debugging visual issues
  • Automatically cleaned up after successful runs

External API Mocking

Integration tests can use mocked responses:

  • Real API calls in integration environment
  • Mocked responses for unit tests
  • Configurable via environment variables

Troubleshooting

Common Issues

"Server not running" error:

# Option 1: Start server manually, then run tests
pdm run ldr-web

# In another terminal:
python tests/run_all_tests.py standard --no-server-start

Server startup hangs during tests:

# Skip automatic server startup and start manually
pdm run ldr-web &  # Start in background

# Run tests without automatic server startup
python tests/run_all_tests.py standard --no-server-start

"Node.js not found" error:

# Install Node.js (Ubuntu/Debian)
sudo apt install nodejs npm

# Install Node.js (macOS)
brew install node

# Verify installation
node --version

Import errors in tests:

# Ensure PYTHONPATH is set
export PYTHONPATH=$(pwd)
python tests/run_all_tests.py fast

Puppeteer browser launch failures:

# Install missing dependencies (Ubuntu/Debian)
sudo apt install chromium-browser

# Or use bundled Chromium
npm install puppeteer

Performance Issues

Tests running slowly:

  • Use fast profile for development
  • Check network connectivity for integration tests
  • Verify server performance with health checks

UI tests timing out:

  • Increase timeout in individual test files
  • Check browser developer tools for JavaScript errors
  • Verify server is responding quickly

Test Coverage

Generate detailed coverage reports:

# HTML coverage report
python run_tests.py
open coverage_html/index.html

# Terminal coverage report
pytest tests/ --cov=src --cov-report=term-missing

Adding New Tests

Unit Tests

Add to tests/test_new_feature.py:

import pytest
from src.local_deep_research.module import function

def test_new_function():
    assert function("input") == "expected_output"

UI Tests

Add to tests/ui_tests/test_new_ui_feature.js:

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    await page.goto('http://127.0.0.1:5000/new-page');
    await page.waitForSelector('.new-feature');

    console.log('✅ New UI feature test passed');
    await browser.close();
})();

Integration Tests

Add to tests/test_new_integration.py:

import pytest
import requests

def test_external_api_integration():
    # Test real API integration
    response = requests.get("https://api.example.com/data")
    assert response.status_code == 200

Summary

The Local Deep Research testing framework provides multiple execution profiles to balance thoroughness with speed. Use the run_all_tests.py script for orchestrated testing, or run individual test suites for targeted debugging. The modular approach ensures you can quickly validate changes during development while maintaining comprehensive coverage for releases.