Files
local-deep-research/tests/test_programmatic_custom_llm_retriever.py
LearningCircuit 12160e26e1 chore(lint): add ruff rules for logging, performance, exceptions, and print detection (#3211)
* chore(lint): add ruff rules for logging, performance, exceptions, and print detection

Add wave 2 lint rules: G, PERF, RET, TRY, T20, C4, ERA. All existing
violations are suppressed via ignore/per-file-ignores so this config
change is merge-safe. Follow-up PRs will fix violations and remove the
ignore entries incrementally.

* fix(lint): exempt pre-commit hooks from T201 print rule (#3270)

Pre-commit hooks are CLI scripts where print is the intended output
interface, same as scripts/ and cli/ directories already exempted.

* fix(lint): fix all low-count ruff violations instead of suppressing them (#3275)

* fix(lint): replace manual dict-building loops with dict comprehensions (PERF403)

* fix(lint): replace bare Exception raises with specific built-in types (TRY002)

Replace all `raise Exception(...)` in production code with appropriate
built-in exception types: RuntimeError for operational/state failures,
ValueError for invalid data, and ConnectionError for HTTP errors.

* fix(lint): resolve TRY004 and PERF402 ruff violations

Use TypeError instead of ValueError for isinstance/issubclass type
checks (TRY004), and replace manual for-loop list copies with
list.extend() (PERF402).

* fix(lint): fix all low-count ruff violations instead of suppressing them

Fix all violations for 15 ruff rules that had ≤10 occurrences each,
rather than suppressing them with ignore directives:

- TRY002: raise-vanilla-class → use specific built-in exceptions
- TRY004: type-check-without-type-error → use TypeError
- C408: unnecessary-collection-call → use dict/list literals
- C401: unnecessary-generator-set → use set comprehensions
- C416: unnecessary-comprehension → use list()/set()
- C414: unnecessary-double-cast-or-process → simplify
- PERF403: manual-dict-comprehension → use dict comprehensions
- PERF102: incorrect-dict-iterator → use .values()/.keys()
- PERF402: manual-list-copy → use list.extend()
- RET503/RET506/RET507/RET508: superfluous else after return/raise/continue/break
- RET501/RET502: unnecessary/implicit return None

Adds per-file-ignores for tests/ and examples/ where these patterns
are acceptable (e.g. bare Exception in tests, dict() calls in fixtures).

* fix(lint): enforce E722, ERA001, RET505 and fix pre-commit RET503 gap (#3276)

Remove three rules from the global ignore list by fixing all violations:

E722 (bare except) — 6 violations in tests:
  Replace `except:` with `except Exception:` to avoid swallowing
  KeyboardInterrupt and SystemExit.

ERA001 (commented-out code) — 25 violations:
  Delete 18 true positives (dead variables, disabled debug logs,
  commented-out imports). Add `# noqa: ERA001` to 7 false positives
  (template instructions, type annotations, documentation comments).

RET505 (superfluous else after return) — 413 violations:
  Auto-fix all occurrences. Also fixes 5 cascading RET506/RET507
  violations exposed by the RET505 removals.

Pre-commit hooks gap:
  Add RET503 to `.pre-commit-hooks/**` per-file-ignores alongside T201.

* fix(lint): enforce RET504 and TRY301 — fix all violations (#3279)

* fix(lint): enforce RET504 — collapse unnecessary assign-before-return

Auto-fix all 46 RET504 violations via ruff unsafe-fixes: collapse
`result = expr; return result` into `return expr`.

Remove RET504 from global ignore list. Add to tests/examples
per-file-ignores where intermediate variables aid test clarity.

Also removes TRY301 from global ignore (violations fixed in next commit).

* fix(lint): enforce TRY301 — fix raises inside broad try/except blocks

Structural fixes for 65 TRY301 violations:

Security-critical fixes:
- url_validator.py: move 6 validation raises before try block,
  replace isinstance-based re-raise with specific except clause
- path_validator.py: move validation outside try block
- env_settings.py: separate parsing (try) from validation (outside)

Route/service fixes:
- research_routes.py: replace raise-then-catch with direct error return
- mcp/server.py: move all 7 tool validations before try blocks
- news/api.py: move validation before try, noqa for db-session raises
- notifications: move rate limit and URL validation before try blocks
- iterative_refinement_strategy.py: move JSON validation after try

Added noqa for intentional patterns: re-raise in except handlers,
nested function definitions, db-session-dependent checks, rate limit
re-raises for base class retry logic.

* merge: resolve conflicts between wave2 lint branch and main

Resolve 14 merge conflicts by always starting from main's version
and re-applying lint fixes on top:

- mcp_strategy.py, ollama.py, security_settings.py, delete_routes.py:
  Take main's code, re-apply RET505 (remove else: after return)
- mcp/server.py (3 conflicts): Take main's ValidationError handlers
  and set_settings_context, re-apply TRY301 fixes, fix sensitive
  data logging
- research_routes.py: Take main, fix duplicate block (merge artifact)
- settings_routes.py: Take main's default-settings fallback feature
- meta_search_engine.py, parallel_search_engine.py: Take main's
  get_available_engines delegation, delete unreachable code
- search_engine_ddg.py, search_engine_google_pse.py: Take main's
  sanitization, re-apply RET506 (if not elif after raise)
- rag_routes.py: Accept main's deletion (route moved to delete_routes)
- encryption_check.py: Accept main's deletion (dead code)
- test_storage_coverage.py: Remove broken test classes referencing
  undefined stubs
- pre-commit hooks: extend per-file-ignores for ERA001, RET504

* fix: revert ValueError→TypeError changes that break tests and API contracts

Revert TRY004 fixes in 3 files where changing ValueError to TypeError
would break existing tests and HTTP status code contracts:

- card_factory.py: 5 tests assert pytest.raises(ValueError)
- base_rater.py: flask_api.py catches ValueError for HTTP 400 responses;
  TypeError would fall through to HTTP 500
- full_search.py: test asserts pytest.raises(ValueError)

Add # noqa: TRY004 to suppress the lint rule on these lines.

* fix: move benchmark_data check back inside try block

The ValueError for missing benchmark_data must be inside the try/except
so the except handler can mark the run as FAILED in the database.
Without this, the exception propagates unhandled in a daemon thread,
leaving the benchmark run stuck in RUNNING state permanently.

* chore(lint): remove ERA rule and suppress TRY004 globally

Remove ERA (eradicate — commented-out code detection) from ruff select:
- 28% false positive rate in our codebase (7 of 25 violations)
- No major Python project enables it (Django, FastAPI, Pydantic, Airflow)
- Ruff itself doesn't use it; autofix was demoted to manual-only
- 172 noqa suppressions provided zero enforcement value

Suppress TRY004 (type-check-without-type-error) globally:
- Ruff maintainer agreed the autofix "can change functionality"
- We already had to revert 3 TypeError changes that broke tests
  and HTTP 400→500 API contracts
- Django, Flask, pandas all use isinstance + ValueError routinely
- Pylint has no equivalent rule; near-zero PyPI adoption

Remove all 173 # noqa: ERA001 and 49 # noqa: TRY004 comments
from the codebase — no longer needed with rules disabled/suppressed.

* fix: resolve mypy errors, failing MCP test, and TRY301 noqa

- search_engine_factory.py: restore typed intermediate variable to fix
  mypy no-any-return (RET504 collapse lost the type annotation)
- search_engine_pubchem.py: add explicit list[str] type annotation
- test_edge_cases.py: fix assertion that expected engine name in
  sanitized error message
- mcp/server.py: add noqa: TRY301 to validation raises inside try
  blocks (from main's new merge code)
2026-03-29 17:01:23 +02:00

437 lines
15 KiB
Python

"""Test demonstrating programmatic access with Langchain Ollama LLM and in-memory vector retriever."""
import pytest
from unittest.mock import patch, MagicMock
import requests
from langchain_community.llms import Ollama
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.retrievers import Document
from local_deep_research.api import (
quick_summary,
detailed_research,
generate_report,
)
from local_deep_research.llm import clear_llm_registry
def _is_ollama_running():
"""Check if Ollama service is running."""
try:
response = requests.get("http://localhost:11434/api/tags", timeout=1)
return response.status_code == 200
except Exception:
return False
# Skip entire module if Ollama is not running to avoid fixture initialization hangs
pytestmark = pytest.mark.skipif(
not _is_ollama_running(),
reason="Ollama is not running - skipping all Ollama integration tests",
)
@pytest.fixture(autouse=True)
def clear_registries():
"""Clear registries before and after each test."""
clear_llm_registry()
yield
clear_llm_registry()
@pytest.fixture
def sample_documents():
"""Create sample documents for the vector store."""
docs = [
Document(
page_content="Machine learning is a subset of artificial intelligence that enables systems to learn from data.",
metadata={"source": "ml_basics.txt", "topic": "machine_learning"},
),
Document(
page_content="Deep learning uses neural networks with multiple layers to extract features from raw data.",
metadata={"source": "dl_intro.txt", "topic": "deep_learning"},
),
Document(
page_content="Natural language processing allows computers to understand and generate human language.",
metadata={"source": "nlp_guide.txt", "topic": "nlp"},
),
Document(
page_content="Computer vision enables machines to interpret and analyze visual information from images and videos.",
metadata={"source": "cv_overview.txt", "topic": "computer_vision"},
),
Document(
page_content="Reinforcement learning trains agents to make decisions by rewarding desired behaviors.",
metadata={
"source": "rl_basics.txt",
"topic": "reinforcement_learning",
},
),
]
return docs
@pytest.fixture
def ollama_llm():
"""Create an Ollama LLM instance."""
# Using gemma3n:e4b as requested
return Ollama(
model="gemma3n:e4b",
temperature=0.7,
)
@pytest.fixture
def memory_retriever(sample_documents):
"""Create an in-memory vector store retriever."""
# Create embeddings using the specified multilingual model
embeddings = OllamaEmbeddings(
model="jeffh/intfloat-multilingual-e5-large-instruct:f16"
)
# Create FAISS vector store from documents
vectorstore = FAISS.from_documents(
documents=sample_documents, embedding=embeddings
)
# Create retriever from vector store
retriever = vectorstore.as_retriever(
search_kwargs={"k": 3} # Return top 3 most relevant documents
)
return retriever
@pytest.fixture
def mock_search_system():
"""Create a mock search system for testing."""
system = MagicMock()
system.analyze_topic.return_value = {
"current_knowledge": "Analysis using Ollama LLM and in-memory retriever",
"findings": [
"Successfully retrieved relevant documents from vector store",
"Ollama LLM provided coherent responses",
],
"iterations": 2,
"questions": {
"iteration_1": ["What is the main concept?"],
"iteration_2": ["How is it applied in practice?"],
},
"formatted_findings": "## Research Summary\n- Vector retrieval worked effectively",
"all_links_of_system": [],
}
system.model = MagicMock()
return system
@pytest.mark.skipif(
not _is_ollama_running(),
reason="Ollama is not running - skipping integration test",
)
def test_quick_summary_with_ollama_and_memory_retriever(
ollama_llm, memory_retriever, mock_search_system
):
"""Test quick_summary using Ollama LLM and in-memory vector retriever."""
with patch(
"local_deep_research.api.research_functions._init_search_system"
) as mock_init:
mock_init.return_value = mock_search_system
# Use programmatic API with Ollama and memory retriever
result = quick_summary(
query="What is deep learning and how does it relate to machine learning?",
llms={"ollama_llm": ollama_llm},
retrievers={"memory_docs": memory_retriever},
provider="ollama_llm",
search_tool="memory_docs",
temperature=0.5,
)
# Verify result structure
assert "summary" in result
assert (
result["summary"]
== "Analysis using Ollama LLM and in-memory retriever"
)
assert len(result["findings"]) == 2
assert "vector store" in result["findings"][0].lower()
# Verify components were configured correctly
init_kwargs = mock_init.call_args[1]
assert init_kwargs["provider"] == "ollama_llm"
assert init_kwargs["search_tool"] == "memory_docs"
assert init_kwargs["temperature"] == 0.5
@pytest.mark.skipif(
not _is_ollama_running(),
reason="Ollama is not running - skipping integration test",
)
def test_detailed_research_with_ollama_and_memory_retriever(
ollama_llm, memory_retriever, mock_search_system
):
"""Test detailed_research with Ollama and memory retriever."""
with patch(
"local_deep_research.api.research_functions._init_search_system"
) as mock_init:
mock_init.return_value = mock_search_system
result = detailed_research(
query="Explain the differences between various machine learning approaches",
llms={"ollama": ollama_llm},
retrievers={"local_docs": memory_retriever},
provider="ollama",
search_tool="local_docs",
iterations=3,
questions_per_iteration=2,
)
# Verify detailed research results
assert (
result["query"]
== "Explain the differences between various machine learning approaches"
)
assert (
result["summary"]
== "Analysis using Ollama LLM and in-memory retriever"
)
assert result["metadata"]["iterations_requested"] == 3
assert result["metadata"]["search_tool"] == "local_docs"
@pytest.mark.skipif(
not _is_ollama_running(),
reason="Ollama is not running - skipping integration test",
)
def test_generate_report_with_ollama_and_memory_retriever(
ollama_llm, memory_retriever
):
"""Test report generation using Ollama and memory retriever."""
with patch(
"local_deep_research.api.research_functions._init_search_system"
) as mock_init:
with patch(
"local_deep_research.api.research_functions.IntegratedReportGenerator"
) as mock_report_gen:
# Setup mocks
mock_system = MagicMock()
mock_system.analyze_topic.return_value = {
"findings": "Initial ML findings"
}
mock_init.return_value = mock_system
mock_generator = MagicMock()
mock_generator.generate_report.return_value = {
"content": "# Machine Learning Overview\n\n## Introduction\nThis report covers key ML concepts from local documents.",
"metadata": {
"query": "machine learning overview",
"sources_used": 5,
},
}
mock_report_gen.return_value = mock_generator
# Generate report
result = generate_report(
query="Create a comprehensive overview of machine learning concepts",
llms={"ollama": ollama_llm},
retrievers={"vector_store": memory_retriever},
provider="ollama",
search_tool="vector_store",
searches_per_section=2,
)
# Verify report generation
assert "content" in result
assert "# Machine Learning Overview" in result["content"]
assert "local documents" in result["content"]
# Verify configuration
init_kwargs = mock_init.call_args[1]
assert init_kwargs["provider"] == "ollama"
assert init_kwargs["search_tool"] == "vector_store"
@pytest.mark.skipif(
not _is_ollama_running(),
reason="Ollama is not running - skipping integration test",
)
def test_custom_vector_store_with_more_documents():
"""Test creating a larger in-memory vector store."""
# Create more documents
extended_docs = [
Document(
page_content="Transfer learning allows models trained on one task to be adapted for related tasks.",
metadata={"source": "transfer_learning.txt"},
),
Document(
page_content="Attention mechanisms help models focus on relevant parts of the input data.",
metadata={"source": "attention.txt"},
),
Document(
page_content="Gradient descent is an optimization algorithm used to train neural networks.",
metadata={"source": "optimization.txt"},
),
Document(
page_content="Convolutional neural networks are particularly effective for image processing tasks.",
metadata={"source": "cnn.txt"},
),
Document(
page_content="Recurrent neural networks can process sequential data like text or time series.",
metadata={"source": "rnn.txt"},
),
]
# Create vector store with extended documents
embeddings = OllamaEmbeddings(
model="jeffh/intfloat-multilingual-e5-large-instruct:f16"
)
vectorstore = FAISS.from_documents(
documents=extended_docs, embedding=embeddings
)
# Create retriever with different search parameters
retriever = vectorstore.as_retriever(
search_type="similarity",
search_kwargs={"k": 5}, # Return top 5 documents
)
# Create Ollama LLM
llm = Ollama(model="gemma3n:e4b", temperature=0.3)
with patch(
"local_deep_research.api.research_functions._init_search_system"
) as mock_init:
mock_system = MagicMock()
mock_system.analyze_topic.return_value = {
"current_knowledge": "Extended document analysis complete",
"findings": ["Found relevant information about neural networks"],
}
mock_init.return_value = mock_system
result = quick_summary(
query="How do different types of neural networks work?",
llms={"ollama": llm},
retrievers={"extended_docs": retriever},
provider="ollama",
search_tool="extended_docs",
)
assert "summary" in result
assert result["summary"] == "Extended document analysis complete"
@pytest.mark.skipif(
not _is_ollama_running(),
reason="Ollama is not running - skipping integration test",
)
def test_multiple_retrievers_with_ollama():
"""Test using multiple in-memory retrievers with Ollama."""
# Create first retriever for ML topics
ml_docs = [
Document(
page_content="Supervised learning uses labeled data for training."
),
Document(
page_content="Unsupervised learning finds patterns in unlabeled data."
),
]
# Create second retriever for application topics
app_docs = [
Document(
page_content="ML is used in recommendation systems for personalized content."
),
Document(
page_content="ML powers autonomous vehicles through computer vision and sensor fusion."
),
]
embeddings = OllamaEmbeddings(
model="jeffh/intfloat-multilingual-e5-large-instruct:f16"
)
ml_vectorstore = FAISS.from_documents(ml_docs, embeddings)
app_vectorstore = FAISS.from_documents(app_docs, embeddings)
ml_retriever = ml_vectorstore.as_retriever()
app_retriever = app_vectorstore.as_retriever()
ollama_llm = Ollama(model="gemma3n:e4b")
with patch(
"local_deep_research.api.research_functions._init_search_system"
) as mock_init:
mock_system = MagicMock()
mock_system.analyze_topic.return_value = {
"current_knowledge": "Analysis from multiple vector stores",
"findings": ["ML concepts retrieved", "Applications identified"],
}
mock_init.return_value = mock_system
result = quick_summary(
query="What are ML techniques and their applications?",
llms={"ollama": ollama_llm},
retrievers={
"ml_concepts": ml_retriever,
"ml_applications": app_retriever,
},
provider="ollama",
search_tool="auto", # Use all retrievers
)
assert "summary" in result
assert "multiple vector stores" in result["summary"]
@pytest.mark.skipif(
not _is_ollama_running(),
reason="Ollama is not running - skipping integration test",
)
def test_simple_ollama_factory_pattern():
"""Test using a factory function to create Ollama instances."""
def create_ollama_llm(model_name="gemma3n:e4b", temperature=0.7, **kwargs):
"""Factory function for creating configured Ollama instances."""
return Ollama(
model=model_name,
temperature=temperature,
num_predict=kwargs.get("max_tokens", 256),
)
# Create simple in-memory retriever
docs = [Document(page_content="Test content for factory pattern demo.")]
embeddings = OllamaEmbeddings(
model="jeffh/intfloat-multilingual-e5-large-instruct:f16"
)
vectorstore = FAISS.from_documents(docs, embeddings)
retriever = vectorstore.as_retriever()
with patch(
"local_deep_research.api.research_functions._init_search_system"
) as mock_init:
mock_system = MagicMock()
mock_system.analyze_topic.return_value = {
"current_knowledge": "Factory pattern test successful",
"findings": [],
}
mock_init.return_value = mock_system
result = quick_summary(
query="Test factory pattern",
llms={"ollama_factory": create_ollama_llm},
retrievers={"test_docs": retriever},
provider="ollama_factory",
search_tool="test_docs",
model_name="gemma3n:e4b",
temperature=0.2,
max_tokens=512,
)
assert "summary" in result
assert "Factory pattern test successful" in result["summary"]