test: gate test_research_creation.py with @pytest.mark.requires_llm (#4288)

PUNCHLIST Tier 4 FLAKY_SLEEP flagged all 6 tests in TestResearchCreation as flaky in CI:

- test_research_creation_endpoint (gpt-3.5-turbo + searxng required)
- test_research_creation_with_minimal_params (Ollama llama2 required)
- test_research_status_check (Ollama llama2, time.sleep(0.5))
- test_research_termination (time.sleep(0.5), race condition)
- test_research_modes (Ollama llama2, modes may not exist)
- test_research_with_custom_model (Ollama mistral required)

All tests POST to /api/start_research and need a live Ollama / searxng stack to succeed. They have no skip markers, so CI was running them unconditionally and flaking on infrastructure availability.

Added @pytest.mark.requires_llm at the class level (the project's existing convention for tests needing a real LLM; see pyproject.toml markers + tests/api_tests/test_rest_api.py for sibling usage). With the marker:

- CI runs with `-m "not requires_llm"` will skip the entire class
- Manual runs with `-m requires_llm` can invoke them against a live Ollama + searxng setup

This is more conservative than deletion: it preserves the tests for manual integration runs but removes their CI flakiness.
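
Note that a pytest marker by itself only labels tests; they are excluded when the run passes `-m "not requires_llm"`. If an automatic skip is also wanted for runs that omit `-m`, one possible approach is a collection hook in conftest.py. The following is a minimal sketch under assumptions, not part of this commit: the `RUN_LLM_TESTS` variable and the `llm_tests_enabled` helper are hypothetical names, and the project may already handle this differently.

```python
# conftest.py (hypothetical sketch, not part of this commit)
import os

import pytest

# Assumed opt-in variable; the commit does not define one.
LLM_ENV_VAR = "RUN_LLM_TESTS"


def llm_tests_enabled(environ=None):
    """Return True when the assumed opt-in variable holds a truthy value."""
    env = os.environ if environ is None else environ
    return env.get(LLM_ENV_VAR, "").strip().lower() in {"1", "true", "yes"}


def pytest_collection_modifyitems(config, items):
    """Skip every test marked requires_llm unless the opt-in variable is set."""
    if llm_tests_enabled():
        return
    skip_llm = pytest.mark.skip(
        reason=f"needs a live LLM stack; set {LLM_ENV_VAR}=1 to run"
    )
    for item in items:
        # Class-level markers show up in the keywords of each test item.
        if "requires_llm" in item.keywords:
            item.add_marker(skip_llm)
```

With a hook like this, a plain `pytest` run would report the six tests as skipped, while `RUN_LLM_TESTS=1 pytest -m requires_llm` would run them against a live stack.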
2026-06-16 03:51:07 +03:00 · 2026-05-25 00:29:55 +02:00
parent c8e57594c5
commit 7d8e02a7e2
1 changed file with 12 additions and 1 deletion
--- a/tests/api_tests/test_research_creation.py
+++ b/tests/api_tests/test_research_creation.py
@@ -6,9 +6,20 @@ Test research creation endpoint specifically
 import json
 import time

+import pytest

+
+@pytest.mark.requires_llm
 class TestResearchCreation:
-    """Test research creation functionality."""
+    """Test research creation functionality.
+
+    All tests in this class POST to /api/start_research with real
+    Ollama / searxng / model dependencies (PUNCHLIST Tier 4: FLAKY_SLEEP).
+    The class-level @pytest.mark.requires_llm gate makes them auto-
+    skip in environments without a live LLM stack, so CI doesn't
+    flake on the timing-sensitive sleep(0.5) and model-availability
+    assertions. Manual runs can invoke them with -m requires_llm.
+    """

    def test_research_creation_endpoint(self, authenticated_client):
        """Test the research creation endpoint."""