local-deep-research/docs/developing/EXTENDING.md
LearningCircuit 564f228ee8 refactor(llm-providers): collapse dual-path LLM construction; fix Ollama enable_thinking (#3984)
2026-06-13 10:43:11 +02:00

Extension Guide

This guide explains how to extend Local Deep Research with custom components.


Adding Custom Search Engines

Search engines are responsible for fetching results from external sources. All engines extend BaseSearchEngine.

Basic Search Engine

Create a new file in src/local_deep_research/web_search_engines/engines/:

# search_engine_custom.py
from typing import Any, Dict, List, Optional

from langchain_core.language_models import BaseLLM
from loguru import logger

from ..search_engine_base import BaseSearchEngine


class CustomSearchEngine(BaseSearchEngine):
    """Custom search engine implementation."""

    # Classification flags - set appropriately for your engine
    is_public = True       # Searches public internet
    is_generic = False     # Specialized (vs general web search)
    is_scientific = False  # Academic/scientific content
    is_local = False       # Local document search
    is_news = False        # News content
    is_code = False        # Code repositories
    is_lexical = False                  # Uses keyword/lexical search (informational)
    needs_llm_relevance_filter = False  # Set True to auto-enable LLM relevance filtering

    def __init__(
        self,
        max_results: int = 10,
        credential: Optional[str] = None,
        llm: Optional[BaseLLM] = None,
        max_filtered_results: Optional[int] = None,
        **kwargs,
    ):
        """
        Initialize the search engine.

        Args:
            max_results: Maximum number of results to return
            credential: API credential for the service (if required)
            llm: Language model for relevance filtering
            max_filtered_results: Max results after filtering
            **kwargs: Additional parameters
        """
        super().__init__(
            llm=llm,
            max_filtered_results=max_filtered_results,
            max_results=max_results,
        )
        self.credential = credential

    def _get_previews(self, query: str) -> List[Dict[str, Any]]:
        """
        Get preview results (first phase of two-phase retrieval).

        Args:
            query: Search query

        Returns:
            List of preview dictionaries with keys:
            - id: Unique identifier
            - title: Result title
            - snippet: Brief description/summary
            - link: URL to the content
            - source: Source name (e.g., "CustomEngine")
        """
        logger.info(f"Searching custom engine for: {query}")

        # Apply rate limiting before request
        self._last_wait_time = self.rate_tracker.apply_rate_limit(self.engine_type)

        # Your search implementation here
        results = self._call_api(query)

        previews = []
        for item in results:
            previews.append({
                "id": item["id"],
                "title": item["title"],
                "snippet": item["description"],
                "link": item["url"],
                "source": "CustomEngine",
            })

        return previews

    def _get_full_content(
        self, relevant_items: List[Dict[str, Any]]
    ) -> List[Dict[str, Any]]:
        """
        Get full content for relevant items (second phase).

        Args:
            relevant_items: Items that passed relevance filtering

        Returns:
            Items enriched with full content
        """
        results = []
        for item in relevant_items:
            # Apply rate limiting
            self._last_wait_time = self.rate_tracker.apply_rate_limit(self.engine_type)

            # Fetch full content
            full_content = self._fetch_content(item["link"])

            result = item.copy()
            result["content"] = full_content
            result["full_content"] = full_content
            results.append(result)

        return results

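    def _call_api(self, query: str) -> List[Dict]:
        """Your API implementation."""
        # Implement your search logic here. Raising makes an unfinished
        # stub fail loudly instead of returning None to _get_previews.
        raise NotImplementedError("Implement _call_api for your service")

    def _fetch_content(self, url: str) -> str:
        """Fetch full content from URL."""
        # Implement content fetching
        raise NotImplementedError("Implement _fetch_content for your service")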
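
The two phases in the class above can be pictured as a small pipeline: cheap previews first, a relevance filter in between, and full content only for what survives. The sketch below is a standalone illustration of that ordering, not the actual BaseSearchEngine implementation; two_phase_search and its callable parameters are hypothetical names introduced for this example.

```python
from typing import Any, Callable, Dict, List

Result = Dict[str, Any]


def two_phase_search(
    query: str,
    get_previews: Callable[[str], List[Result]],
    is_relevant: Callable[[Result], bool],
    get_full_content: Callable[[List[Result]], List[Result]],
    snippets_only: bool = False,
) -> List[Result]:
    """Hypothetical driver showing the order of the two phases."""
    # Phase 1: cheap previews (id, title, snippet, link, source).
    previews = get_previews(query)
    if not previews:
        return []

    # Filter before the expensive phase so fewer pages are fetched.
    relevant = [p for p in previews if is_relevant(p)]

    # Snippet-only mode skips the second phase entirely.
    if snippets_only or not relevant:
        return relevant

    # Phase 2: enrich the surviving items with full content.
    return get_full_content(relevant)


# In-memory stand-ins for a real engine's methods.
def fake_previews(query: str) -> List[Result]:
    return [
        {"id": "1", "title": "Rust ownership", "snippet": "...",
         "link": "https://example.com/a", "source": "Demo"},
        {"id": "2", "title": "Cooking pasta", "snippet": "...",
         "link": "https://example.com/b", "source": "Demo"},
    ]


def fake_full_content(items: List[Result]) -> List[Result]:
    return [{**item, "content": f"Full text of {item['title']}"} for item in items]


def mentions_rust(preview: Result) -> bool:
    return "rust" in preview["title"].lower()


results = two_phase_search("rust", fake_previews, mentions_rust, fake_full_content)
snippets = two_phase_search(
    "rust", fake_previews, mentions_rust, fake_full_content, snippets_only=True
)
```

Filtering between the phases is the point of the split: only items that pass the relevance check pay the cost of a full-content fetch.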

Registering the Engine

Option 1: Register in engine_registry.py (Required)

Add the engine to src/local_deep_research/web_search_engines/engine_registry.py so the system knows how to load it. The registry maps engine names to their Python module and class:

# In engine_registry.py — ENGINE_REGISTRY dict
"custom_engine": EngineEntry(
    module_path=".engines.search_engine_custom",
    class_name="CustomSearchEngine",
),

Module paths must be relative (starting with .) and listed in the security whitelist (ALLOWED_MODULE_PATHS in module_whitelist.py).
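
To make the two rules above concrete, here is a minimal sketch of how a registry entry could be resolved into a class. It assumes a simplified EngineEntry stand-in and a hypothetical load_engine_class helper; the real loader in engine_registry.py and module_whitelist.py may differ. The demo resolves modules from the standard library json package so it runs without the project installed.

```python
import importlib
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class EngineEntry:
    """Stand-in for a registry entry: where the engine class lives."""

    module_path: str
    class_name: str


def load_engine_class(
    name: str,
    registry: Dict[str, EngineEntry],
    allowed_module_paths: FrozenSet[str],
    package: str,
) -> type:
    """Hypothetical loader enforcing relative, whitelisted module paths."""
    entry = registry[name]
    if not entry.module_path.startswith("."):
        raise ValueError(f"Module path must be relative: {entry.module_path!r}")
    if entry.module_path not in allowed_module_paths:
        raise ValueError(f"Module path not whitelisted: {entry.module_path!r}")
    # A relative path is resolved against the given package.
    module = importlib.import_module(entry.module_path, package=package)
    return getattr(module, entry.class_name)


# Demo data: ".decoder" relative to the "json" package is json.decoder.
DEMO_REGISTRY = {
    "decoder": EngineEntry(module_path=".decoder", class_name="JSONDecoder"),
    "rogue": EngineEntry(module_path=".encoder", class_name="JSONEncoder"),
    "absolute": EngineEntry(module_path="os.path", class_name="join"),
}
DEMO_ALLOWED = frozenset({".decoder"})

decoder_cls = load_engine_class("decoder", DEMO_REGISTRY, DEMO_ALLOWED, package="json")
```

In this sketch, both the "rogue" entry (relative but not whitelisted) and the "absolute" entry are rejected with ValueError before any import happens.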

Option 1b: Configure user-facing settings (Optional)

After registering in the engine registry, you can expose user-configurable settings via the settings database:

# Key: search.engine.web.custom_engine
config = {
    "requires_api_key": True,
    "requires_llm": False,
    "description": "Custom search engine for specific use case",
    "strengths": ["Feature 1", "Feature 2"],
    "weaknesses": ["Limitation 1"],
    "reliability": 0.8,
    "default_params": {
        "max_results": 10
    }
}

Option 2: Modify Factory (For Core Engines)

Add to search_engine_factory.py:

def create_search_engine(engine_name: str, ...) -> BaseSearchEngine:
    # ... existing code ...

    if engine_name.lower() == "custom_engine":
        from .engines.search_engine_custom import CustomSearchEngine
        return CustomSearchEngine(
            max_results=max_results,
            credential=api_key,  # CustomSearchEngine names this parameter "credential"
            llm=llm,
            **kwargs
        )

Search Engine Best Practices

  1. Always apply rate limiting before API calls:

    self._last_wait_time = self.rate_tracker.apply_rate_limit(self.engine_type)
    
  2. Set classification flags accurately - they affect engine selection. For keyword-based engines without ML ranking, set is_lexical = True (informational) and needs_llm_relevance_filter = True; the factory then auto-enables LLM relevance filtering.

  3. Handle errors gracefully - return empty list on failure, don't crash

  4. Use logging for debugging:

    from loguru import logger
    logger.info(f"Searching for: {query}")
    logger.error(f"API error: {e}")
    
  5. Support snippet-only mode by checking the config:

    from ...config import search_config
    if search_config.SEARCH_SNIPPETS_ONLY:
        return relevant_items  # Skip full content
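
As an illustration of practice 3 in the list above, the sketch below wraps an API call so that a failure yields an empty list and a malformed item is skipped rather than aborting the whole search. previews_or_empty is a hypothetical helper, and it uses the standard library logging module instead of loguru only so that it runs without extra dependencies.

```python
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger(__name__)


def previews_or_empty(
    call_api: Callable[[str], Any], query: str
) -> List[Dict[str, Any]]:
    """Hypothetical wrapper: never raises, returns [] when the API fails."""
    try:
        raw_items = call_api(query) or []
    except Exception:
        # Log with traceback, then degrade to "no results".
        logger.exception("Search API call failed for query %r", query)
        return []

    previews = []
    for item in raw_items:
        try:
            previews.append({
                "id": item["id"],
                "title": item["title"],
                "snippet": item.get("description", ""),
                "link": item["url"],
                "source": "CustomEngine",
            })
        except (KeyError, TypeError, AttributeError):
            # One bad record should not discard the good ones.
            logger.warning("Skipping malformed search result: %r", item)
    return previews


def failing_api(query: str) -> List[Dict[str, Any]]:
    raise ConnectionError("service unavailable")


def partial_api(query: str) -> List[Dict[str, Any]]:
    return [
        {"id": 1, "title": "Valid", "url": "https://example.com"},
        {"title": "Missing id and url"},
    ]


on_failure = previews_or_empty(failing_api, "anything")
on_partial = previews_or_empty(partial_api, "anything")
```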
    

Adding Custom Search Strategies

Strategies define how research is conducted - question generation, iteration, and synthesis.

Basic Strategy

Create a new file in src/local_deep_research/advanced_search_system/strategies/:

# my_custom_strategy.py
from typing import Dict, List, Optional
from loguru import logger

from .base_strategy import BaseSearchStrategy


class MyCustomStrategy(BaseSearchStrategy):
    """Custom search strategy implementation."""

    def __init__(
        self,
        search=None,
        model=None,
        all_links_of_system=None,
        settings_snapshot=None,
        max_iterations: int = 3,
        **kwargs,
    ):
        """
        Initialize the strategy.

        Args:
            search: Search engine instance
            model: LLM for question generation and synthesis
            all_links_of_system: Shared list for discovered links
            settings_snapshot: Configuration snapshot
            max_iterations: Maximum research iterations
            **kwargs: Additional parameters
        """
        super().__init__(
            all_links_of_system=all_links_of_system,
            settings_snapshot=settings_snapshot,
        )
        self.search = search
        self.model = model
        self.max_iterations = max_iterations

    def analyze_topic(self, query: str) -> Dict:
        """
        Execute the research strategy.

        Args:
            query: Research query

        Returns:
            Dict with:
            - findings: List of research findings
            - iterations: Number of iterations completed
            - questions: Dict of questions by iteration
            - formatted_findings: Formatted output string
            - current_knowledge: Accumulated knowledge dict
            - error: Optional error message
        """
        logger.info(f"Starting custom strategy for: {query}")

        findings = []
        current_knowledge = {}
        iteration = 0  # Stays 0 if the loop below never runs

        try:
            for iteration in range(1, self.max_iterations + 1):
                # Update progress
                self._update_progress(
                    f"Iteration {iteration}/{self.max_iterations}",
                    progress_percent=int(iteration / self.max_iterations * 100),
                    metadata={"iteration": iteration}
                )

                # Generate questions for this iteration
                questions = self._generate_questions(query, current_knowledge)
                self.questions_by_iteration[iteration] = questions

                # Search for each question
                for question in questions:
                    results = self._search(question)
                    findings.extend(results)

                    # Track links
                    for result in results:
                        if result.get("link"):
                            self.all_links_of_system.append(result["link"])

                # Synthesize findings
                current_knowledge = self._synthesize(findings)

                # Check if we should stop early
                if self._should_stop(current_knowledge):
                    logger.info(f"Early stopping at iteration {iteration}")
                    break

            # Format final output
            formatted = self._format_findings(findings, current_knowledge)

            return {
                "findings": findings,
                "iterations": iteration,
                "questions": self.questions_by_iteration,
                "formatted_findings": formatted,
                "current_knowledge": current_knowledge,
            }

        except Exception as e:
            logger.error(f"Strategy error: {e}")
            return {
                "findings": findings,
                "iterations": 0,
                "questions": self.questions_by_iteration,
                "formatted_findings": "",
                "current_knowledge": current_knowledge,
                "error": str(e),
            }

    def _generate_questions(self, query: str, knowledge: Dict) -> List[str]:
        """Generate research questions using the LLM."""
        prompt = f"""Given the query: {query}
        And current knowledge: {knowledge}
        Generate 3 specific research questions."""

        response = self.model.invoke(prompt)
        # Parse response into questions
        return self._parse_questions(response.content)

    def _search(self, question: str) -> List[Dict]:
        """Execute search for a question."""
        return self.search.run(question)

    def _synthesize(self, findings: List[Dict]) -> Dict:
        """Synthesize findings into knowledge."""
        # Implement synthesis logic
        return {"summary": "...", "key_points": [...]}

    def _should_stop(self, knowledge: Dict) -> bool:
        """Check if research should stop early."""
        # Implement stopping criteria
        return False

    def _format_findings(self, findings: List[Dict], knowledge: Dict) -> str:
        """Format findings as output string."""
        # Implement formatting
        return "Formatted research results..."

    def _parse_questions(self, content: str) -> List[str]:
        """Parse LLM response into question list."""
        # Implement parsing
        return content.strip().split("\n")
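
The _parse_questions placeholder above splits on newlines, which keeps blank lines and list markers such as "1." in the output. A slightly more defensive parser might look like the sketch below; parse_questions and its limit parameter are illustrative names, and the set of markers handled is an assumption about typical LLM output.

```python
import re
from typing import List

# Leading "1.", "2)", "-", "*" or "•" followed by whitespace.
_LIST_MARKER = re.compile(r"^\s*(?:\d+[.)]|[-*•])\s+")


def parse_questions(content: str, limit: int = 3) -> List[str]:
    """Split an LLM response into clean, de-duplicated questions."""
    questions: List[str] = []
    for line in content.splitlines():
        cleaned = _LIST_MARKER.sub("", line).strip()
        # Drop blank lines and exact repeats, keep first-seen order.
        if cleaned and cleaned not in questions:
            questions.append(cleaned)
    return questions[:limit]


sample = "1. What is X?\n\n- How does Y work?\n2) What is X?\n* Why Z?\n"
parsed = parse_questions(sample)
# parsed == ["What is X?", "How does Y work?", "Why Z?"]
```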

Registering the Strategy

Add to search_system_factory.py:

def create_strategy(strategy_name: str, ...) -> BaseSearchStrategy:
    strategy_name_lower = strategy_name.lower()

    # ... existing strategies ...

    elif strategy_name_lower in ["my-custom", "mycustom", "custom"]:
        from .advanced_search_system.strategies.my_custom_strategy import (
            MyCustomStrategy,
        )
        return MyCustomStrategy(
            search=search,
            model=model,
            all_links_of_system=all_links_of_system,
            settings_snapshot=settings_snapshot,
            **kwargs
        )

Strategy Best Practices

  1. Use progress callbacks to update the UI:

    self._update_progress("Searching...", progress_percent=50)
    
  2. Track all discovered links in self.all_links_of_system

  3. Store questions by iteration in self.questions_by_iteration

  4. Access settings via the snapshot:

    max_results = self.get_setting("search.max_results", default=10)
    
  5. Handle errors gracefully - return partial results with error message


Using LangChain Retrievers

The easiest way to add custom search is through LangChain retrievers.

Registering a Retriever

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from local_deep_research.web_search_engines.retriever_registry import retriever_registry

# Create your retriever
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(documents, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 10})

# Register globally
retriever_registry.register("my_documents", retriever)

# Now use in research
from local_deep_research.api import quick_summary

result = quick_summary(
    query="What does the documentation say about X?",
    search_tool="my_documents",  # Use registered retriever
    programmatic_mode=True
)

Passing Retrievers Directly

from local_deep_research.api import quick_summary

# Create retriever
retriever = my_vectorstore.as_retriever()

# Pass directly to API
result = quick_summary(
    query="Search my documents",
    retrievers={"private_docs": retriever},
    search_tool="private_docs",
    programmatic_mode=True
)

Registry Methods

from local_deep_research.web_search_engines.retriever_registry import retriever_registry

# Register
retriever_registry.register("name", retriever)
retriever_registry.register_multiple({"a": ret1, "b": ret2})

# Query
retriever_registry.get("name")
retriever_registry.is_registered("name")
retriever_registry.list_registered()

# Remove
retriever_registry.unregister("name")
retriever_registry.clear()
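
Conceptually, the methods above behave like a named lookup table shared across the process. The sketch below shows that pattern with a lock around a dictionary. SimpleRetrieverRegistry is an illustrative stand-in, not the project's class, and details such as get returning None for an unknown name are assumptions.

```python
import threading
from typing import Any, Dict, List, Optional


class SimpleRetrieverRegistry:
    """Illustrative thread-safe name -> retriever map."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._retrievers: Dict[str, Any] = {}

    def register(self, name: str, retriever: Any) -> None:
        with self._lock:
            self._retrievers[name] = retriever

    def register_multiple(self, retrievers: Dict[str, Any]) -> None:
        with self._lock:
            self._retrievers.update(retrievers)

    def get(self, name: str) -> Optional[Any]:
        with self._lock:
            return self._retrievers.get(name)

    def is_registered(self, name: str) -> bool:
        with self._lock:
            return name in self._retrievers

    def list_registered(self) -> List[str]:
        with self._lock:
            return list(self._retrievers)

    def unregister(self, name: str) -> None:
        with self._lock:
            # Removing an unknown name is treated as a no-op here.
            self._retrievers.pop(name, None)

    def clear(self) -> None:
        with self._lock:
            self._retrievers.clear()


demo = SimpleRetrieverRegistry()
demo.register("docs", object())
demo.register_multiple({"a": object(), "b": object()})
demo.unregister("a")
names = sorted(demo.list_registered())
```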

Adding Custom LLM Providers

LLM providers wrap language model APIs for use in Local Deep Research (LDR).

Basic Provider

Create a new file in src/local_deep_research/llm/providers/implementations/:

# my_provider.py
from typing import Dict, Optional

from langchain_core.language_models import BaseChatModel
from langchain_openai import ChatOpenAI

from ..openai_base import OpenAICompatibleProvider


class MyProvider(OpenAICompatibleProvider):
    """Custom LLM provider."""

    provider_name = "My Provider"
    api_key_setting = "llm.my_provider.api_key"
    url_setting = "llm.my_provider.url"
    default_base_url = "https://api.myprovider.com/v1"
    default_model = "my-model-v1"
    # Optional: set to True if missing key should fall back to a placeholder
    # rather than raising ValueError.
    api_key_optional = False

    @classmethod
    def create_llm(
        cls,
        model_name: Optional[str] = None,
        temperature: float = 0.7,
        settings_snapshot: Optional[Dict] = None,
        **kwargs
    ) -> BaseChatModel:
        """
        Create LLM instance.

        Args:
            model_name: Model to use
            temperature: Sampling temperature
            settings_snapshot: Configuration
            **kwargs: Additional parameters

        Returns:
            LangChain chat model instance
        """
        from ....config.thread_settings import get_setting_from_snapshot

        # Resolve API key via the base helper. Raises ValueError when
        # required and missing, returns the unified placeholder when
        # api_key_optional=True and the key is unset.
        api_key = cls.resolve_api_key_or_placeholder(settings_snapshot)

        # Get base URL
        base_url = get_setting_from_snapshot(
            cls.url_setting,
            cls.default_base_url,
            settings_snapshot=settings_snapshot,
        )

        return ChatOpenAI(
            model=model_name or cls.default_model,
            temperature=temperature,
            api_key=api_key,
            base_url=base_url,
            **kwargs
        )

    @classmethod
    def list_models(cls, settings_snapshot: Optional[Dict] = None) -> list[str]:
        """List available models."""
        return ["my-model-v1", "my-model-v2", "my-model-large"]
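
The comment on resolve_api_key_or_placeholder in the class above describes two outcomes for a missing key. The sketch below restates that decision as a standalone function; resolve_key_sketch is a hypothetical name, PLACEHOLDER_KEY is an assumed value, and treating a whitespace-only key as missing is an assumption rather than confirmed behavior of the base helper.

```python
from typing import Optional

# Assumed placeholder; the real unified placeholder may be a different string.
PLACEHOLDER_KEY = "not-required"


def resolve_key_sketch(
    raw_key: Optional[str], *, optional: bool, provider_name: str
) -> str:
    """Return a usable key, a placeholder, or raise ValueError."""
    key = (raw_key or "").strip()
    if key:
        return key
    if optional:
        # Local servers often accept any non-empty key.
        return PLACEHOLDER_KEY
    raise ValueError(f"{provider_name} API key not configured")


configured = resolve_key_sketch("sk-demo", optional=False, provider_name="My Provider")
fallback = resolve_key_sketch(None, optional=True, provider_name="My Provider")

try:
    resolve_key_sketch("   ", optional=False, provider_name="My Provider")
    missing_key_error = None
except ValueError as exc:
    missing_key_error = str(exc)
```

Failing at construction time, rather than on the first request, keeps the error message close to the misconfiguration that caused it.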

Register in Auto-Discovery

Drop the provider class file into src/local_deep_research/llm/providers/implementations/. Auto-discovery scans that directory at import time and registers every class that meets all three conditions: its name ends with Provider, it subclasses BaseLLMProvider, and its provider_name is overridden away from the "unknown" default. A class that sets provider_name = "unknown", or leaves it unset, is silently filtered out of auto-discovery, which is a common gotcha when copying an existing provider as a template.
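
The filter described above can be sketched as a predicate over the classes in a module. This is an illustration under stated assumptions, not the code in auto_discovery.py: BaseLLMProvider here is a local stand-in whose default provider_name is "unknown", and discoverable_names is a hypothetical helper.

```python
import inspect
from types import SimpleNamespace
from typing import List


class BaseLLMProvider:
    """Local stand-in for the real base class."""

    provider_name = "unknown"


class GoodProvider(BaseLLMProvider):
    provider_name = "Good"


class TemplateCopyProvider(BaseLLMProvider):
    pass  # Forgot to set provider_name: inherits "unknown"


class GoodBackend(BaseLLMProvider):
    provider_name = "Backend"  # Name does not end with "Provider"


class UnrelatedProvider:
    provider_name = "Unrelated"  # Not a BaseLLMProvider subclass


def discoverable_names(module: object) -> List[str]:
    """Return class names that pass all three discovery conditions."""
    found = []
    for name, obj in vars(module).items():
        if not inspect.isclass(obj):
            continue
        if not name.endswith("Provider"):
            continue
        if not issubclass(obj, BaseLLMProvider):
            continue
        if getattr(obj, "provider_name", "unknown") == "unknown":
            continue  # Silently skipped, like the gotcha described above
        found.append(name)
    return sorted(found)


# A namespace standing in for an imported implementation module.
fake_module = SimpleNamespace(
    BaseLLMProvider=BaseLLMProvider,
    GoodProvider=GoodProvider,
    TemplateCopyProvider=TemplateCopyProvider,
    GoodBackend=GoodBackend,
    UnrelatedProvider=UnrelatedProvider,
    helper_constant=42,
)

registered = discoverable_names(fake_module)
```

Only GoodProvider survives; the base class itself is excluded by the same provider_name check that drops TemplateCopyProvider.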

Optional cloud-metadata registration in auto_discovery.py:

PROVIDER_METADATA = {
    # ... existing providers ...
    "my_provider": ProviderMetadata(
        provider_id="my_provider",
        provider_name="My Provider",
        company_name="My Company",
        region="US",
        country="United States",
        data_location="US",
        gdpr_compliant=False,
        is_cloud=True,
    ),
}

Registering Custom LLMs

For programmatic use, register LLMs directly:

from langchain_openai import ChatOpenAI
from local_deep_research.llm.llm_registry import register_llm, get_llm_from_registry

# Create custom LLM
custom_llm = ChatOpenAI(
    model="gpt-4",
    temperature=0.5,
    api_key="...",
)

# Register it
register_llm("my_gpt4", custom_llm)

# Use in research
from local_deep_research.api import quick_summary

result = quick_summary(
    query="Research topic",
    llms={"my_gpt4": custom_llm},  # Or use registered name
    provider_name="my_gpt4",
    programmatic_mode=True
)

Factory Functions

You can also register factory functions:

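from langchain_openai import ChatOpenAI
from local_deep_research.llm.llm_registry import register_llm, get_llm_from_registry

def create_my_llm(temperature=0.7):
    return ChatOpenAI(model="gpt-4", temperature=temperature)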

register_llm("my_factory", create_my_llm)

# Will be called when needed
llm = get_llm_from_registry("my_factory")

Registry caveat

The built-in providers (ollama, openai, anthropic, ...) live in the same registry and are auto-registered at import time. clear_llm_registry() removes them too, and get_llm() has no other construction path, so every provider raises a "was not registered by auto-discovery" error until you restore them:

from local_deep_research.llm.providers import discover_providers

discover_providers(force_refresh=True)

Prefer unregister_llm("<your name>") over clear_llm_registry() to remove only your own registrations.
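
One way to follow this advice in tests or short scripts is to scope a registration with a context manager that removes only the name it added. The sketch below is a hypothetical helper, not part of the project; it takes the register and unregister callables as parameters so that the demo can run against a plain dictionary instead of the real registry.

```python
from contextlib import contextmanager
from typing import Any, Callable, Dict, Iterator


@contextmanager
def temporary_registration(
    name: str,
    llm: Any,
    register: Callable[[str, Any], None],
    unregister: Callable[[str], None],
) -> Iterator[None]:
    """Register llm under name, and remove only that name on exit."""
    register(name, llm)
    try:
        yield
    finally:
        # Runs even if the body raises, and leaves other entries alone.
        unregister(name)


# Dictionary stand-in for a registry that already holds built-in providers.
demo_registry: Dict[str, Any] = {"ollama": object(), "openai": object()}


def demo_register(name: str, llm: Any) -> None:
    demo_registry[name] = llm


def demo_unregister(name: str) -> None:
    demo_registry.pop(name, None)


with temporary_registration("my_gpt4", object(), demo_register, demo_unregister):
    inside = sorted(demo_registry)

after = sorted(demo_registry)
```

With the real registry, the same helper could presumably be called as temporary_registration("my_gpt4", custom_llm, register_llm, unregister_llm), assuming those functions accept the arguments shown earlier in this guide.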


See Also