mirror of
https://github.com/LearningCircuit/local-deep-research.git
synced 2026-06-15 19:46:56 +03:00
Post-merge review of the #4334 citation refactor found that it only half-fixed the <think> leak it targeted and introduced a small regression:

1. The MAIN synthesis path was never routed through get_llm_response_text, so reasoning models still leaked <think>...</think> into the user's final answer:
   - FindingsRepository.synthesize_findings (both timeout paths) returned raw response.content; its output is the final answer (current_knowledge) for the standard/iterdrag strategies and flows verbatim through format_findings.
   - StandardKnowledge.generate_knowledge / generate_sub_knowledge / compress_knowledge: same raw .content.
   Route all four through get_llm_response_text (strips <think>, handles str/None).

2. Empty-answer regression from #4334: when _invoke_text returns '' (LLM None or a think-only response that strips to ''), several precision extractors and the forced _extract_direct_answer emitted '. <content>'. Guard each to fall back to its existing non-LLM path (first score/year/number) or to content.

Tests: +think-strip regression tests for synthesize_findings + standard_knowledge; +empty-answer fallback tests for score/temporal/number + forced extractor; updated the one test that asserted the old '. content' behavior.

mypy: 552 files clean. 1739 passed / 2 skipped across the 62 test files touching these modules + tests/citation_handlers.
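
The two fixes can be illustrated with a minimal sketch. This is not the project's code: strip_think_text is a hypothetical stand-in for get_llm_response_text, and extract_first_year and build_answer are invented here to show the guard. The "answer. content" output format is an assumption inferred from the '. <content>' symptom described above.

```python
import re
from typing import Any, Optional

# Non-greedy match so several <think> blocks are each removed separately.
_THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL | re.IGNORECASE)


def strip_think_text(response: Any) -> str:
    """Hypothetical stand-in for get_llm_response_text.

    Accepts None, a plain string, or an object with a .content attribute,
    and returns the text with <think>...</think> blocks removed.
    """
    if response is None:
        return ""
    content = response if isinstance(response, str) else getattr(response, "content", "")
    if not isinstance(content, str):
        return ""
    return _THINK_RE.sub("", content).strip()


def extract_first_year(content: str) -> Optional[str]:
    """Hypothetical non-LLM fallback: first four-digit year in the text."""
    match = re.search(r"\b(?:19|20)\d{2}\b", content)
    return match.group(0) if match else None


def build_answer(llm_response: Any, content: str) -> str:
    """Prefix content with the LLM answer, guarding the empty-answer case."""
    answer = strip_think_text(llm_response)
    if not answer:
        # Without this guard the result would start with a stray ". ".
        return extract_first_year(content) or content
    # Assumed output format: "<answer>. <content>".
    return f"{answer}. {content}"
```

With this sketch, a think-only response such as "<think>only</think>" strips to an empty string, so build_answer falls back to the first year found in the content, or to the content itself when no year is present.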