mirror of https://github.com/LearningCircuit/local-deep-research.git synced 2026-06-16 03:51:07 +03:00

LearningCircuit 3087eba843 fix: remove || true from LLM example tests (#2913)

* fix: remove || true from LLM example tests

The four LLM example test steps in docker-tests.yml silently swallowed
all failures with `|| true`, providing zero signal on whether the
examples actually work. The tests already set LDR_USE_FALLBACK_LLM=true
and have a 60s timeout, so they should succeed in CI.

* fix: declare CustomLLM fields as Pydantic class attributes

The __init__ approach fails with Pydantic v2 because setting undeclared
fields via __setattr__ raises ValueError. Declaring them as class-level
fields lets Pydantic handle initialization natively.

* fix: add settings_snapshot creation to detailed_research()

detailed_research() was missing the settings_snapshot creation that
quick_summary() and generate_report() already have, causing a
RuntimeError when called outside a Flask app context.

* fix: declare MockLLM and ScenarioMockLLM fields as Pydantic attributes

Same Pydantic v2 compatibility fix as basic_custom_llm.py — fields must
be declared at class level, not set in __init__.

* fix: pass response_map as keyword argument to MockLLM

Pydantic models don't accept positional arguments.

* fix: Pydantic v2 compat for advanced_custom_llm + fix workflow refs

- Convert RetryLLM, ConfigurableLLM, DomainExpertLLM to use Pydantic
  class-level field declarations instead of __init__
- Replace workflow references to non-existent switch_providers.py and
  custom_research_example.py with advanced_custom_llm.py

* fix: pass base_llm as keyword argument to RetryLLM

* fix: code review fixes for Pydantic v2 compat and research context

- Add Optional[] to MockLLM nullable field types (Pydantic v2 rejects
  None for non-Optional annotations)
- Use local variable for RetryLLM exponential backoff instead of
  mutating self.retry_delay across calls
- Pass research_id and research_context to _init_search_system() in
  detailed_research(), matching the pattern in quick_summary()

* fix: simplify detailed_research() to use settings_snapshot only

Remove redundant provider/api_key/temperature/etc parameters — these
should be configured via settings_snapshot, not individual params.
Just create a default snapshot when none is provided.

* fix: remove silent settings_snapshot fallback from detailed_research()

Callers must explicitly pass settings_snapshot — silently creating a
default hides errors.

* fix: revert init_kwargs injection in detailed_research()

Remove the research_id/research_context injection we added — this was
us papering over missing caller-side responsibility. Restore the
original call pattern.

* fix: use explicit settings_snapshot in all example scripts

- Add auto-creation of default settings_snapshot with info log in
  detailed_research() when none is provided
- Update all example scripts to create and pass settings_snapshot
  explicitly via create_settings_snapshot(), demonstrating the
  correct programmatic API pattern

* feat: add API stability smoke test to CI

Add api_smoke_test.py that verifies the public API surface hasn't
changed — imports, function signatures, settings utilities, and
LDRClient interface. Also add test_direct_import.py to CI.

These tests catch breaking API changes early. The test file includes
a prominent warning that it should NOT be modified to accommodate
API changes — the API change should be reverted instead.

* feat: add CI testing for all example files

- Create examples/_ci_helpers.py with shared CIMockLLM for CI testing
- Add LDR_CI_TEST=1 mode to simple_programmatic, advanced_features,
  and search_strategies examples for full execution with mock LLM
- Refactor simple_programmatic_example.py to use main() guard
- Add py_compile checks for 9 examples with external dependencies
- Add show_env_vars.py execution to CI
- Total: 19 example files now covered (was 5)

* fix: move show_env_vars.py to compile check (uses removed API method)

* fix: address code review findings

- Add if __name__ guard to api_smoke_test.py (prevents pytest crash)
- Fix wasted _get_settings() call in advanced_features demonstrate_report_generation
- Move unused settings_snapshot creation inside non-CI branch in simple_programmatic
- Add missing files to compile checks (run_benchmark.py, elasticsearch/search_example.py, _ci_helpers.py)

* fix: add examples/** to change detection filter + job timeout

- Add examples/** to the llm path filter so PRs touching only example
  files trigger the LLM Example Tests job
- Add timeout-minutes: 20 to llm-example-tests job (was missing,
  unlike all other jobs)

* fix: revert programmatic examples to original, use compile checks instead

Revert simple_programmatic_example.py, advanced_features_example.py,
and search_strategies_example.py to their original state. Examples
should be clean user-facing documentation, not polluted with CI
infrastructure.

Use py_compile checks for these files instead of full execution with
mock LLM injection.

* fix: rename api_smoke_test to api_public_contract_guardrail

Rename to signal that this file protects the public API and must not
be modified to accommodate breaking changes. Added DO NOT MODIFY
comments at every test section so AI agents scanning inline comments
will see the restriction even without reading the docstring.

* fix: remove dead _ci_helpers.py (no example imports it)

* fix: raise ValueError instead of fallback when settings_snapshot missing

detailed_research() now raises a clear error when settings_snapshot
is not provided, instead of silently creating a default one. Callers
must explicitly pass create_settings_snapshot(...) so they know what
configuration they're getting.

quick_summary() and generate_report() are not affected — they build
the snapshot from their explicit provider/api_key/temperature params.

* fix: add warnings to all API functions when no config provided

All three public API functions (quick_summary, generate_report,
detailed_research) now log a warning when called without explicit
configuration (no settings_snapshot, no provider, no settings).
They still work using defaults + environment variables, but the
warning alerts callers that they may not get expected results.

2026-03-25 09:45:36 +01:00

| File | Last commit | Date |
| --- | --- | --- |
| advanced_features_example.py | fix: resolve programmatic API bugs and add CI tests (#1085) | 2025-11-19 12:28:37 -05:00 |
| api_public_contract_guardrail.py | fix: remove \|\| true from LLM example tests (#2913) | 2026-03-25 09:45:36 +01:00 |
| custom_llm_retriever_example.py | feat: Add multi-architecture Docker image support (#851) | 2025-10-22 14:33:16 -04:00 |
| hybrid_search_example.py | feat: Add multi-architecture Docker image support (#851) | 2025-10-22 14:33:16 -04:00 |
| minimal_working_example.py | refactor: Replace programmatic_mode setting with explicit argument (#627) (#633) | 2025-08-13 17:36:09 -04:00 |
| README.md | refactor: Replace programmatic_mode setting with explicit argument (#627) (#633) | 2025-08-13 17:36:09 -04:00 |
| search_strategies_example.py | refactor: Replace programmatic_mode setting with explicit argument (#627) (#633) | 2025-08-13 17:36:09 -04:00 |
| searxng_example.py | refactor: Replace programmatic_mode setting with explicit argument (#627) (#633) | 2025-08-13 17:36:09 -04:00 |
| simple_programmatic_example.py | fix: resolve programmatic API bugs and add CI tests (#1085) | 2025-11-19 12:28:37 -05:00 |
| test_direct_import.py | refactor: Reorganize programmatic examples and add debug logging | 2025-08-09 13:19:31 +02:00 |

README.md

Local Deep Research - Programmatic API Examples

This directory contains examples demonstrating how to use Local Deep Research programmatically without requiring authentication or database access.

Quick Start

All examples use the programmatic API that bypasses authentication:

from local_deep_research.api import quick_summary, detailed_research
from local_deep_research.api.settings_utils import create_settings_snapshot

# Create settings for programmatic mode
settings = create_settings_snapshot({
    "search.tool": "wikipedia"
})

# Run research
result = quick_summary(
    "your topic",
    settings_snapshot=settings,
    programmatic_mode=True
)
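The Quick Start does not show what comes back from the call. The sketch below is one way to inspect a result. It assumes the return value is a dictionary with `summary` and `findings` keys, which this README does not confirm, so check the real return value before relying on it. `describe_result` is a hypothetical helper, not part of the library.

```python
from typing import Any, Mapping


def describe_result(result: Mapping[str, Any], max_chars: int = 200) -> str:
    """Return a short, printable overview of a research result.

    Assumption (not confirmed by this README): the result is dict-like and
    may contain "summary" and "findings" keys.
    """
    summary = str(result.get("summary", "")).strip()
    if len(summary) > max_chars:
        summary = summary[:max_chars].rstrip() + "..."
    findings = result.get("findings") or []
    other_keys = sorted(k for k in result if k not in ("summary", "findings"))
    lines = [
        f"Summary: {summary or '(none)'}",
        f"Findings: {len(findings)}",
        f"Other keys: {other_keys}",
    ]
    return "\n".join(lines)


# Stand-in data so the sketch runs without a live LLM or search engine.
example = {"summary": "Short answer.", "findings": [{"content": "a"}], "iterations": 1}
print(describe_result(example))
```

In real use, pass the value returned by `quick_summary(...)` instead of `example`.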

Examples Overview

| Example | Purpose | Key Features | Difficulty |
| --- | --- | --- | --- |
| minimal_working_example.py | Simplest possible example | Basic setup, minimal code | Beginner |
| simple_programmatic_example.py | Common use cases with the new API | quick_summary, detailed_research, generate_report, custom parameters | Beginner |
| search_strategies_example.py | Demonstrates search strategies | source-based vs focused-iteration strategies | Intermediate |
| hybrid_search_example.py | Combine multiple search sources | Multiple retrievers, web + retriever combo | Intermediate |
| advanced_features_example.py | Advanced programmatic features | generate_report, export formats, result analysis, keyword extraction | Advanced |
| custom_llm_retriever_example.py | Custom LLM and retriever integration | Ollama, custom retrievers, FAISS | Advanced |
| searxng_example.py | Web search with SearXNG | SearXNG integration, error handling | Advanced |

Example Details

minimal_working_example.py

Purpose: Show the absolute minimum code needed to use LDR programmatically.

Creates a simple LLM and search engine
Runs a basic search
No external dependencies beyond Ollama

simple_programmatic_example.py

Purpose: Demonstrate the main API functions with practical examples.

quick_summary() - Fast research with summary
detailed_research() - Comprehensive research with findings
generate_report() - Create full markdown reports
Custom search parameters
Different search tools (Wikipedia, auto, etc.)

search_strategies_example.py

Purpose: Explain and demonstrate the two main search strategies.

source-based: Comprehensive research with detailed citations
focused-iteration: Iterative refinement of research questions
Side-by-side comparison of strategies
When to use each strategy

hybrid_search_example.py

Purpose: Show how to combine multiple search sources for comprehensive research.

Multiple named retrievers for different document types
Combining custom retrievers with web search
Source analysis and tracking
Meta search configuration

advanced_features_example.py

Purpose: Demonstrate advanced programmatic features and analysis capabilities.

generate_report() - Create comprehensive markdown reports
Export formats - JSON, Markdown, custom formats
Result analysis - Extract insights and patterns
Keyword extraction - Identify key terms and concepts
Batch research - Process multiple queries efficiently
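Batch research and JSON export from the list above can be sketched without the library. This is not necessarily how advanced_features_example.py does it: `run_batch` and `fake_research` are hypothetical names, and `fake_research` stands in for a real call such as `quick_summary`.

```python
import json
from typing import Any, Callable, Dict, Iterable


def run_batch(
    research_fn: Callable[[str], Dict[str, Any]],
    queries: Iterable[str],
) -> Dict[str, Dict[str, Any]]:
    """Run research_fn on each query, recording failures instead of aborting."""
    results: Dict[str, Dict[str, Any]] = {}
    for query in queries:
        try:
            results[query] = {"ok": True, "result": research_fn(query)}
        except Exception as exc:  # keep going so one bad query does not sink the batch
            results[query] = {"ok": False, "error": str(exc)}
    return results


def fake_research(query: str) -> Dict[str, Any]:
    """Stand-in for a real research call; rejects blank queries."""
    if not query.strip():
        raise ValueError("empty query")
    return {"summary": f"Summary for {query}"}


batch = run_batch(fake_research, ["solar power", ""])
print(json.dumps(batch, indent=2))  # JSON export of the whole batch
```

With the library installed, `research_fn` could be a small wrapper such as `lambda q: quick_summary(q, settings_snapshot=settings, programmatic_mode=True)`.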

custom_llm_retriever_example.py

Purpose: Advanced integration with custom components.

Custom LLM implementation (using Ollama)
Custom retriever with embeddings
Vector store integration (FAISS)
Direct use of AdvancedSearchSystem

searxng_example.py

Purpose: Web search integration using SearXNG.

SearXNG configuration
Error handling and fallbacks
Real-time web search
Direct use of search engines
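The error handling and fallback idea can be sketched as trying search tools in order until one succeeds. This is an illustration under assumptions, not the code in searxng_example.py: `search_with_fallback` and `fake_research` are hypothetical, and the default order (SearXNG first, then Wikipedia) is only one plausible choice.

```python
from typing import Any, Callable, Dict, Sequence, Tuple


def search_with_fallback(
    research_fn: Callable[[str, str], Dict[str, Any]],
    query: str,
    tools: Sequence[str] = ("searxng", "wikipedia"),
) -> Tuple[str, Dict[str, Any]]:
    """Try each search tool in order and return (tool_used, result)."""
    errors = []
    for tool in tools:
        try:
            return tool, research_fn(query, tool)
        except Exception as exc:
            errors.append(f"{tool}: {exc}")  # remember why this tool failed
    raise RuntimeError("all search tools failed: " + "; ".join(errors))


def fake_research(query: str, tool: str) -> Dict[str, Any]:
    """Stand-in that simulates an unreachable SearXNG instance."""
    if tool == "searxng":
        raise ConnectionError("SearXNG not reachable")
    return {"summary": f"{query} via {tool}"}


tool_used, result = search_with_fallback(fake_research, "quantum computing")
print(tool_used, result["summary"])
```

Collecting the per-tool errors keeps the final exception informative when every tool fails.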

Key Concepts

Programmatic Mode

All examples use programmatic_mode=True as an explicit parameter to bypass authentication:

result = quick_summary(
    query="your topic",
    settings_snapshot=settings,
    programmatic_mode=True
)

Search Strategies

source-based: Best for academic research, fact-checking
focused-iteration: Best for exploratory research, complex topics

Search Tools

Available search tools include:

wikipedia - Wikipedia search
arxiv - Academic papers
searxng - Web search via SearXNG
auto - Automatically select best tool
meta - Combine multiple tools
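A typo in a tool name is easy to make, so it can help to validate the name before building settings. The helper below is a hypothetical sketch. It treats the five names listed above as the known set, although the list says tools "include" these and the real set may be larger, and it uses the `search.tool` key from the Quick Start.

```python
from typing import Dict, Iterable

# Tool names taken from the list above; the actual set may be larger.
LISTED_TOOLS = ("wikipedia", "arxiv", "searxng", "auto", "meta")


def search_tool_overrides(tool: str, extra_tools: Iterable[str] = ()) -> Dict[str, str]:
    """Build a settings override dict for the given search tool.

    extra_tools lets callers allow additional names, for example the
    names of their own retrievers.
    """
    name = tool.strip()
    allowed = set(LISTED_TOOLS) | set(extra_tools)
    if name not in allowed:
        raise ValueError(f"unknown search tool {tool!r}; expected one of {sorted(allowed)}")
    return {"search.tool": name}


print(search_tool_overrides("wikipedia"))
```

The returned dictionary has the same shape as the one passed to `create_settings_snapshot` in the Quick Start.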

Custom Retrievers

You can provide your own retrievers:

result = quick_summary(
    query="topic",
    retrievers={"my_docs": custom_retriever},
    search_tool="my_docs",
    settings_snapshot=settings,
    programmatic_mode=True
)
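The snippet above leaves `custom_retriever` undefined. The sketch below shows only the core idea of a retriever: take a query and return the most relevant documents. It is an illustration under assumptions. `Doc` and `KeywordRetriever` are hypothetical stand-ins, and a retriever passed to the library would more likely be a LangChain retriever (custom_llm_retriever_example.py uses FAISS), so treat this as the ranking logic you would wrap, not as a drop-in object.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Doc:
    """Tiny stand-in for a document with text and metadata."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


class KeywordRetriever:
    """Rank documents by how many query words they contain.

    Illustration only: this class mirrors the idea of a retriever but does
    not subclass any LangChain or Local Deep Research base class.
    """

    def __init__(self, docs: List[Doc], k: int = 2) -> None:
        self.docs = docs
        self.k = k

    def get_relevant_documents(self, query: str) -> List[Doc]:
        words = set(query.lower().split())
        scored = [
            (len(words & set(doc.page_content.lower().split())), doc)
            for doc in self.docs
        ]
        # Keep only documents sharing at least one word, best match first.
        hits = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
        return [doc for _, doc in hits[: self.k]]


custom_retriever = KeywordRetriever(
    [Doc("solar panels convert light", {"source": "a.txt"}), Doc("wind turbines spin")]
)
for doc in custom_retriever.get_relevant_documents("how do solar panels work"):
    print(doc.metadata.get("source", "?"), doc.page_content)
```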

API Functions

`quick_summary()`

Generate a quick research summary:

from local_deep_research.api import quick_summary
from local_deep_research.api.settings_utils import create_settings_snapshot

settings = create_settings_snapshot({})
result = quick_summary(
    query="Your research question",
    settings_snapshot=settings,
    search_tool="wikipedia",
    iterations=2,
    programmatic_mode=True
)

`detailed_research()`

Perform in-depth research with multiple iterations:

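from local_deep_research.api import detailed_research
from local_deep_research.api.settings_utils import create_settings_snapshot

# Create the settings snapshot that is passed to detailed_research()
settings = create_settings_snapshot({})
result = detailed_research(
    query="Your research question",
    settings_snapshot=settings,
    search_strategy="source-based",
    iterations=3,
    questions_per_iteration=5,
    programmatic_mode=True
)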

`generate_report()`

Generate comprehensive markdown reports with structured sections:

from local_deep_research.api import generate_report
from local_deep_research.api.settings_utils import create_settings_snapshot

# programmatic_mode is passed as an explicit argument, not as a setting
settings = create_settings_snapshot({})
result = generate_report(
    query="Your research question",
    settings_snapshot=settings,
    output_file="report.md",
    searches_per_section=3,
    programmatic_mode=True
)

Requirements

Python 3.8+
Local Deep Research installed
Ollama (for most examples)
SearXNG instance (for searxng_example.py)

Running the Examples

Install Local Deep Research:
```
pip install -e .
```

Start Ollama (if using Ollama examples):

ollama serve
ollama pull gemma3:12b
ollama pull nomic-embed-text  # For embeddings

Run any example:

python minimal_working_example.py
python simple_programmatic_example.py
python search_strategies_example.py

Troubleshooting

"No settings context available" Error

Make sure to pass both settings_snapshot and programmatic_mode=True to every API function:

settings = create_settings_snapshot({})
result = quick_summary(
    "topic",
    settings_snapshot=settings,
    programmatic_mode=True
)

Ollama Connection Error

Ensure Ollama is running:

ollama serve

SearXNG Connection Error

Start a SearXNG instance (for example with Docker, as below), or rely on the fallback built into the example:

docker run -p 8080:8080 searxng/searxng
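Both connection errors can be checked before running an example. The sketch below uses only the standard library. `service_reachable` is a hypothetical helper, the Ollama URL assumes its default port 11434, and the SearXNG URL assumes the port mapping from the Docker command above.

```python
import http.client
import urllib.error
import urllib.request


def service_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP request to url gets any response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server answered, even if with an error status
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        return False  # refused, timed out, garbled reply, or malformed URL


# Assumed local endpoints; adjust them to your own setup.
services = {
    "Ollama": "http://localhost:11434",
    "SearXNG": "http://localhost:8080",
}
for name, url in services.items():
    status = "reachable" if service_reachable(url, timeout=1.0) else "not reachable"
    print(f"{name} at {url}: {status}")
```

A result of "not reachable" only says that nothing answered at that address. It does not distinguish a stopped service from a wrong port.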

Contributing

When adding new examples:

Focus on demonstrating specific features
Include clear comments explaining the code
Handle errors gracefully
Update this README with the new example

License

See the main project LICENSE file.