Commit Graph

5 Commits

Author SHA1 Message Date
LearningCircuit
901d0db8d9 fix(examples): make mock LLM example truly offline + reject search.tool='none' (#3520)
* ci: skip live-network LLM examples, keep compile-checked

The basic_custom_llm.py and advanced_custom_llm.py examples execute a
real research pipeline that hits Wikipedia and PubMed. Under the job's
60s timeout they flake whenever those services are slow (seen on #3467
and elsewhere).

Drop the two exec steps from llm-example-tests, add both files to the
compile-check block so syntax/import regressions are still caught, and
leave mock_llm_example.py running since it exercises the same
integration path offline.

* fix(examples): make mock LLM example truly offline + reject search.tool='none'

The mock example claimed to run "offline" with `search.tool: "none"`, but
the factory silently fell back to the `auto` engine and dispatched real
searches to PubMed/Wikipedia — which is why the `LLM Example Tests` CI
job still timed out at 60s after #3478 removed the other two examples.

- `MockRetriever` is now a proper `langchain_core.retrievers.BaseRetriever`
  at module scope, so `RetrieverSearchEngine.run`'s `.invoke(query)` call
  actually works (previously the inner class was only discovered via the
  broad `except Exception` path and returned no results).
- The three `main()` tests that used `search.tool: "none"` now register
  the mock retriever and use `search.tool: "mock_retriever"`.
- `create_search_engine` rejects the literal string `"none"` with a
  `ValueError` so this silent-fallback class of bug cannot recur.
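
A minimal sketch of the guard in the last bullet, assuming a much-simplified factory. The `_ENGINES` table, the fallback to `auto` for unknown names, and the error text are illustrative stand-ins, not the project's actual code:

```python
# Hypothetical, simplified stand-in for the real search engine factory.
_ENGINES = {
    "auto": lambda: "auto-engine",
    "mock_retriever": lambda: "mock-engine",
}


def create_search_engine(name: str):
    # Reject the literal "none" up front instead of letting it fall
    # through to the default engine below.
    if name == "none":
        raise ValueError(
            "search.tool='none' is not a valid engine; register a retriever "
            "and reference it by name instead"
        )
    # In this sketch, any other unknown name still falls back to "auto".
    factory = _ENGINES.get(name, _ENGINES["auto"])
    return factory()
```

With the guard in place, a config that asks for "no search" fails loudly at construction time rather than dispatching real network requests.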

End-to-end run of the example in the project venv completes in ~13s with
no network traffic; previously this timed out at 60s hitting NCBI and
en.wikipedia.org.

* test(factory): regression coverage for search.tool='none' guard

Asserts that create_search_engine('none', ...) raises ValueError rather
than silently falling through to the 'auto' engine — the exact failure
mode that broke the mock LLM example in CI.
2026-04-18 15:45:27 +02:00
LearningCircuit
12160e26e1 chore(lint): add ruff rules for logging, performance, exceptions, and print detection (#3211)
* chore(lint): add ruff rules for logging, performance, exceptions, and print detection

Add wave 2 lint rules: G, PERF, RET, TRY, T20, C4, ERA. All existing
violations are suppressed via ignore/per-file-ignores so this config
change is merge-safe. Follow-up PRs will fix violations and remove the
ignore entries incrementally.

* fix(lint): exempt pre-commit hooks from T201 print rule (#3270)

Pre-commit hooks are CLI scripts where print is the intended output
interface, same as scripts/ and cli/ directories already exempted.

* fix(lint): fix all low-count ruff violations instead of suppressing them (#3275)

* fix(lint): replace manual dict-building loops with dict comprehensions (PERF403)

* fix(lint): replace bare Exception raises with specific built-in types (TRY002)

Replace all `raise Exception(...)` in production code with appropriate
built-in exception types: RuntimeError for operational/state failures,
ValueError for invalid data, and ConnectionError for HTTP errors.
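
For illustration, the mapping above could look like the following sketch. The three helper functions are hypothetical and only show which built-in type fits which kind of failure:

```python
def parse_port(raw: str) -> int:
    port = int(raw)  # int() already raises ValueError for non-numeric input
    if not 0 < port < 65536:
        # Invalid data -> ValueError (previously: raise Exception(...))
        raise ValueError(f"port out of range: {port}")
    return port


def require_started(started: bool) -> None:
    if not started:
        # Operational/state failure -> RuntimeError
        raise RuntimeError("service has not been started")


def check_response(status_code: int) -> None:
    if status_code >= 400:
        # HTTP error -> ConnectionError
        raise ConnectionError(f"request failed with HTTP {status_code}")
```

Callers can then catch the narrow type they can actually handle instead of a blanket `except Exception`.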

* fix(lint): resolve TRY004 and PERF402 ruff violations

Use TypeError instead of ValueError for isinstance/issubclass type
checks (TRY004), and replace manual for-loop list copies with
list.extend() (PERF402).

* fix(lint): fix all low-count ruff violations instead of suppressing them

Fix all violations for 15 ruff rules that had ≤10 occurrences each,
rather than suppressing them with ignore directives:

- TRY002: raise-vanilla-class → use specific built-in exceptions
- TRY004: type-check-without-type-error → use TypeError
- C408: unnecessary-collection-call → use dict/list literals
- C401: unnecessary-generator-set → use set comprehensions
- C416: unnecessary-comprehension → use list()/set()
- C414: unnecessary-double-cast-or-process → simplify
- PERF403: manual-dict-comprehension → use dict comprehensions
- PERF102: incorrect-dict-iterator → use .values()/.keys()
- PERF402: manual-list-copy → use list.extend()
- RET506/RET507/RET508: superfluous else after raise/continue/break
- RET501/RET502/RET503: unnecessary `return None`, implicit return value, missing explicit return

Adds per-file-ignores for tests/ and examples/ where these patterns
are acceptable (e.g. bare Exception in tests, dict() calls in fixtures).
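
A few of the rewrites listed above, sketched on made-up data (the variable names are illustrative only):

```python
words = ["alpha", "beta", "alpha"]

# PERF403: build the dict with a comprehension instead of a for-loop
# that assigns one key at a time.
lengths = {word: len(word) for word in words}

# C401: set comprehension instead of set(<generator expression>).
initials = {word[0] for word in words}

# C408: literal instead of a dict() call.
empty_cache = {}

# PERF102: iterate .values() when the keys are not used,
# rather than unpacking .items() and discarding the key.
total = 0
for length in lengths.values():
    total += length
```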

* fix(lint): enforce E722, ERA001, RET505 and fix pre-commit RET503 gap (#3276)

Remove three rules from the global ignore list by fixing all violations:

E722 (bare except) — 6 violations in tests:
  Replace `except:` with `except Exception:` to avoid swallowing
  KeyboardInterrupt and SystemExit.
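
  A small sketch of the difference, using hypothetical helper names. `SystemExit` and `KeyboardInterrupt` derive from `BaseException`, not `Exception`, so only the bare form intercepts them:

```python
def run_with_bare_except(fn):
    try:
        fn()
    except:  # noqa: E722 (shown only for contrast)
        return "swallowed"
    return "ok"


def run_with_except_exception(fn):
    try:
        fn()
    except Exception:
        return "swallowed"
    return "ok"


def request_exit():
    raise SystemExit(1)
```

  `run_with_bare_except(request_exit)` returns `"swallowed"`, while `run_with_except_exception(request_exit)` lets the `SystemExit` propagate to the caller.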

ERA001 (commented-out code) — 25 violations:
  Delete 18 true positives (dead variables, disabled debug logs,
  commented-out imports). Add `# noqa: ERA001` to 7 false positives
  (template instructions, type annotations, documentation comments).

RET505 (superfluous else after return) — 413 violations:
  Auto-fix all occurrences. Also fixes 5 cascading RET506/RET507
  violations exposed by the RET505 removals.
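
  The RET505 rewrite is behavior-preserving. A before/after sketch on a made-up function:

```python
def classify_before(n: int) -> str:
    if n < 0:
        return "negative"
    else:  # RET505: this else is redundant because the branch above returns
        return "non-negative"


def classify_after(n: int) -> str:
    if n < 0:
        return "negative"
    return "non-negative"
```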

Pre-commit hooks gap:
  Add RET503 to `.pre-commit-hooks/**` per-file-ignores alongside T201.

* fix(lint): enforce RET504 and TRY301 — fix all violations (#3279)

* fix(lint): enforce RET504 — collapse unnecessary assign-before-return

Auto-fix all 46 RET504 violations via ruff unsafe-fixes: collapse
`result = expr; return result` into `return expr`.

Remove RET504 from global ignore list. Add to tests/examples
per-file-ignores where intermediate variables aid test clarity.

Also removes TRY301 from global ignore (violations fixed in next commit).

* fix(lint): enforce TRY301 — fix raises inside broad try/except blocks

Structural fixes for 65 TRY301 violations:

Security-critical fixes:
- url_validator.py: move 6 validation raises before try block,
  replace isinstance-based re-raise with specific except clause
- path_validator.py: move validation outside try block
- env_settings.py: separate parsing (try) from validation (outside)

Route/service fixes:
- research_routes.py: replace raise-then-catch with direct error return
- mcp/server.py: move all 7 tool validations before try blocks
- news/api.py: move validation before try, noqa for db-session raises
- notifications: move rate limit and URL validation before try blocks
- iterative_refinement_strategy.py: move JSON validation after try

Added noqa for intentional patterns: re-raise in except handlers,
nested function definitions, db-session-dependent checks, rate limit
re-raises for base class retry logic.
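
A sketch of the "move validation before the try block" pattern, assuming a hypothetical `fetch` helper and an `opener` callable; neither is taken from the files listed above:

```python
def fetch_before(url, opener):
    try:
        if not url.startswith("https://"):
            # TRY301: raised inside the try, so the broad handler below
            # swallows the validation error together with I/O errors.
            raise ValueError("only https URLs are allowed")
        return opener(url)
    except Exception:
        return None


def fetch_after(url, opener):
    # Validation happens first and always reaches the caller.
    if not url.startswith("https://"):
        raise ValueError("only https URLs are allowed")
    try:
        return opener(url)
    except OSError:
        # Only genuine I/O failures are handled here.
        return None
```

In the first version a rejected URL is indistinguishable from a network failure; in the second the caller sees the `ValueError`.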

* merge: resolve conflicts between wave2 lint branch and main

Resolve 14 merge conflicts by always starting from main's version
and re-applying lint fixes on top:

- mcp_strategy.py, ollama.py, security_settings.py, delete_routes.py:
  Take main's code, re-apply RET505 (remove else: after return)
- mcp/server.py (3 conflicts): Take main's ValidationError handlers
  and set_settings_context, re-apply TRY301 fixes, fix sensitive
  data logging
- research_routes.py: Take main, fix duplicate block (merge artifact)
- settings_routes.py: Take main's default-settings fallback feature
- meta_search_engine.py, parallel_search_engine.py: Take main's
  get_available_engines delegation, delete unreachable code
- search_engine_ddg.py, search_engine_google_pse.py: Take main's
  sanitization, re-apply RET506 (if not elif after raise)
- rag_routes.py: Accept main's deletion (route moved to delete_routes)
- encryption_check.py: Accept main's deletion (dead code)
- test_storage_coverage.py: Remove broken test classes referencing
  undefined stubs
- pre-commit hooks: extend per-file-ignores for ERA001, RET504

* fix: revert ValueError→TypeError changes that break tests and API contracts

Revert TRY004 fixes in 3 files where changing ValueError to TypeError
would break existing tests and HTTP status code contracts:

- card_factory.py: 5 tests assert pytest.raises(ValueError)
- base_rater.py: flask_api.py catches ValueError for HTTP 400 responses;
  TypeError would fall through to HTTP 500
- full_search.py: test asserts pytest.raises(ValueError)

Add # noqa: TRY004 to suppress the lint rule on these lines.
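
To make the HTTP contract concrete, here is a toy handler that mirrors the "catch ValueError, return 400" pattern described for flask_api.py. The function names and the status tuple are assumptions for illustration, not the real route code:

```python
def make_rater(exc_type):
    """Build a rater that signals bad input with the given exception type."""

    def rate(value):
        if not isinstance(value, (int, float)):
            raise exc_type("rating must be numeric")
        return float(value)

    return rate


def handle_request(rate, value):
    try:
        return 200, rate(value)
    except ValueError as exc:
        # Client error: the caller sent invalid data.
        return 400, str(exc)
    except Exception:
        # Anything else is reported as a server error.
        return 500, "internal error"
```

With `make_rater(ValueError)` a non-numeric value yields status 400; with `make_rater(TypeError)` the same input falls through to 500, which is the regression the revert avoids.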

* fix: move benchmark_data check back inside try block

The ValueError for missing benchmark_data must be inside the try/except
so the except handler can mark the run as FAILED in the database.
Without this, the exception propagates unhandled in a daemon thread,
leaving the benchmark run stuck in RUNNING state permanently.
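
A reduced sketch of why the check has to stay inside the try. The `runs` dict stands in for the database table and the status strings are assumptions:

```python
import threading

runs = {}  # hypothetical stand-in for the benchmark-run table


def run_benchmark(run_id, benchmark_data):
    runs[run_id] = "RUNNING"
    try:
        # Deliberately inside the try: the handler below is the only
        # place that moves the run out of the RUNNING state on error.
        if benchmark_data is None:
            raise ValueError("benchmark_data is required")  # noqa: TRY301
        runs[run_id] = "COMPLETED"
    except Exception:
        runs[run_id] = "FAILED"


worker = threading.Thread(target=run_benchmark, args=("run-1", None), daemon=True)
worker.start()
worker.join()
```

After the thread finishes, `runs["run-1"]` is `"FAILED"`. Had the check been placed before the `try`, the exception would have ended the thread with the entry still `"RUNNING"`.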

* chore(lint): remove ERA rule and suppress TRY004 globally

Remove ERA (eradicate — commented-out code detection) from ruff select:
- 28% false positive rate in our codebase (7 of 25 violations)
- No major Python project enables it (Django, FastAPI, Pydantic, Airflow)
- Ruff itself doesn't use it; autofix was demoted to manual-only
- 172 noqa suppressions provided zero enforcement value

Suppress TRY004 (type-check-without-type-error) globally:
- Ruff maintainer agreed the autofix "can change functionality"
- We already had to revert 3 TypeError changes that broke tests
  and HTTP 400→500 API contracts
- Django, Flask, pandas all use isinstance + ValueError routinely
- Pylint has no equivalent rule; near-zero PyPI adoption

Remove all 173 # noqa: ERA001 and 49 # noqa: TRY004 comments
from the codebase — no longer needed with rules disabled/suppressed.

* fix: resolve mypy errors, failing MCP test, and TRY301 noqa

- search_engine_factory.py: restore typed intermediate variable to fix
  mypy no-any-return (RET504 collapse lost the type annotation)
- search_engine_pubchem.py: add explicit list[str] type annotation
- test_edge_cases.py: fix assertion that expected engine name in
  sanitized error message
- mcp/server.py: add noqa: TRY301 to validation raises inside try
  blocks (from main's new merge code)
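
For reference, the no-any-return interaction in the first bullet can be sketched as follows. The function names are made up; `json.loads` is used only because it is typed as returning `Any`:

```python
import json


def engine_name_collapsed(raw: str) -> str:
    # With warn_return_any enabled, mypy reports no-any-return here,
    # because the expression has type Any.
    return json.loads(raw)["engine"]


def engine_name_typed(raw: str) -> str:
    # The annotated intermediate gives the value a declared type,
    # so the return statement type-checks.
    name: str = json.loads(raw)["engine"]
    return name
```

Both functions behave identically at runtime; the annotation only affects static checking.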
2026-03-29 17:01:23 +02:00
LearningCircuit
3087eba843 fix: remove || true from LLM example tests (#2913)
* fix: remove || true from LLM example tests

The four LLM example test steps in docker-tests.yml silently swallowed
all failures with `|| true`, providing zero signal on whether the
examples actually work. The tests already set LDR_USE_FALLBACK_LLM=true
and have a 60s timeout, so they should succeed in CI.

* fix: declare CustomLLM fields as Pydantic class attributes

The __init__ approach fails with Pydantic v2 because setting undeclared
fields via __setattr__ raises ValueError. Declaring them as class-level
fields lets Pydantic handle initialization natively.

* fix: add settings_snapshot creation to detailed_research()

detailed_research() was missing the settings_snapshot creation that
quick_summary() and generate_report() already have, causing a
RuntimeError when called outside a Flask app context.

* fix: declare MockLLM and ScenarioMockLLM fields as Pydantic attributes

Same Pydantic v2 compatibility fix as basic_custom_llm.py — fields must
be declared at class level, not set in __init__.

* fix: pass response_map as keyword argument to MockLLM

Pydantic models don't accept positional arguments.

* fix: Pydantic v2 compat for advanced_custom_llm + fix workflow refs

- Convert RetryLLM, ConfigurableLLM, DomainExpertLLM to use Pydantic
  class-level field declarations instead of __init__
- Replace workflow references to non-existent switch_providers.py and
  custom_research_example.py with advanced_custom_llm.py

* fix: pass base_llm as keyword argument to RetryLLM

* fix: code review fixes for Pydantic v2 compat and research context

- Add Optional[] to MockLLM nullable field types (Pydantic v2 rejects
  None for non-Optional annotations)
- Use local variable for RetryLLM exponential backoff instead of
  mutating self.retry_delay across calls
- Pass research_id and research_context to _init_search_system() in
  detailed_research(), matching the pattern in quick_summary()
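
A sketch of the backoff fix in the second bullet, assuming a generic wrapper rather than the real RetryLLM class. The `sleep` parameter is an addition for testability:

```python
import time


class RetryCaller:
    """Hypothetical wrapper that retries a callable with exponential backoff."""

    def __init__(self, func, max_retries=3, retry_delay=1.0, sleep=time.sleep):
        self.func = func
        self.max_retries = max_retries
        self.retry_delay = retry_delay
        self.sleep = sleep

    def call(self, *args):
        # Local copy: doubling self.retry_delay here would make every
        # later call start from an already inflated delay.
        delay = self.retry_delay
        for attempt in range(self.max_retries):
            try:
                return self.func(*args)
            except RuntimeError:
                if attempt == self.max_retries - 1:
                    raise
                self.sleep(delay)
                delay *= 2
```

Each `call()` starts again from the configured `retry_delay`, regardless of how many retries earlier calls needed.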

* fix: simplify detailed_research() to use settings_snapshot only

Remove redundant provider/api_key/temperature/etc parameters — these
should be configured via settings_snapshot, not individual params.
Just create a default snapshot when none is provided.

* fix: remove silent settings_snapshot fallback from detailed_research()

Callers must explicitly pass settings_snapshot — silently creating a
default hides errors.

* fix: revert init_kwargs injection in detailed_research()

Remove the research_id/research_context injection we added — this was
us papering over missing caller-side responsibility. Restore the
original call pattern.

* fix: use explicit settings_snapshot in all example scripts

- Add auto-creation of default settings_snapshot with info log in
  detailed_research() when none is provided
- Update all example scripts to create and pass settings_snapshot
  explicitly via create_settings_snapshot(), demonstrating the
  correct programmatic API pattern

* feat: add API stability smoke test to CI

Add api_smoke_test.py that verifies the public API surface hasn't
changed — imports, function signatures, settings utilities, and
LDRClient interface. Also add test_direct_import.py to CI.

These tests catch breaking API changes early. The test file includes
a prominent warning that it should NOT be modified to accommodate
API changes — the API change should be reverted instead.
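
One way such a signature check can be written, as a sketch only: `example_api_function` and the expected parameter list are hypothetical and do not reflect the real public API.

```python
import inspect


def example_api_function(query, settings_snapshot=None, **kwargs):
    """Hypothetical stand-in for a public API function."""
    return {"query": query}


# Parameter names, in order, that the guardrail expects to stay stable.
EXPECTED_PARAMETERS = ["query", "settings_snapshot", "kwargs"]


def signature_matches(func, expected_parameters) -> bool:
    # inspect.signature preserves declaration order, so renames,
    # removals, and reorderings all show up as a mismatch.
    actual = list(inspect.signature(func).parameters)
    return actual == list(expected_parameters)
```

A CI step can then assert `signature_matches(example_api_function, EXPECTED_PARAMETERS)` and fail the build when the public surface drifts.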

* feat: add CI testing for all example files

- Create examples/_ci_helpers.py with shared CIMockLLM for CI testing
- Add LDR_CI_TEST=1 mode to simple_programmatic, advanced_features,
  and search_strategies examples for full execution with mock LLM
- Refactor simple_programmatic_example.py to use main() guard
- Add py_compile checks for 9 examples with external dependencies
- Add show_env_vars.py execution to CI
- Total: 19 example files now covered (was 5)

* fix: move show_env_vars.py to compile check (uses removed API method)

* fix: address code review findings

- Add if __name__ guard to api_smoke_test.py (prevents pytest crash)
- Fix wasted _get_settings() call in advanced_features demonstrate_report_generation
- Move unused settings_snapshot creation inside non-CI branch in simple_programmatic
- Add missing files to compile checks (run_benchmark.py, elasticsearch/search_example.py, _ci_helpers.py)

* fix: add examples/** to change detection filter + job timeout

- Add examples/** to the llm path filter so PRs touching only example
  files trigger the LLM Example Tests job
- Add timeout-minutes: 20 to llm-example-tests job (was missing,
  unlike all other jobs)

* fix: revert programmatic examples to original, use compile checks instead

Revert simple_programmatic_example.py, advanced_features_example.py,
and search_strategies_example.py to their original state. Examples
should be clean user-facing documentation, not polluted with CI
infrastructure.

Use py_compile checks for these files instead of full execution with
mock LLM injection.
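
A compile check of this kind can be sketched with the standard library. The helper name and the temporary files below are illustrative; the actual workflow step may be written differently:

```python
import py_compile
import tempfile
from pathlib import Path


def compile_check(paths):
    """Byte-compile each file and collect (path, message) for failures."""
    failures = []
    for path in paths:
        try:
            py_compile.compile(str(path), doraise=True)
        except py_compile.PyCompileError as exc:
            failures.append((str(path), exc.msg))
    return failures


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good_example.py"
    bad = Path(tmp) / "bad_example.py"
    good.write_text("print('hello')\n")
    bad.write_text("def broken(:\n")
    failures = compile_check([good, bad])
```

Byte-compiling does not execute the module, so this catches syntax errors without touching the network; it will not detect a missing import or a runtime failure.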

* fix: rename api_smoke_test to api_public_contract_guardrail

Rename to signal that this file protects the public API and must not
be modified to accommodate breaking changes. Added DO NOT MODIFY
comments at every test section so AI agents scanning inline comments
will see the restriction even without reading the docstring.

* fix: remove dead _ci_helpers.py (no example imports it)

* fix: raise ValueError instead of fallback when settings_snapshot missing

detailed_research() now raises a clear error when settings_snapshot
is not provided, instead of silently creating a default one. Callers
must explicitly pass create_settings_snapshot(...) so they know what
configuration they're getting.

quick_summary() and generate_report() are not affected — they build
the snapshot from their explicit provider/api_key/temperature params.

* fix: add warnings to all API functions when no config provided

All three public API functions (quick_summary, generate_report,
detailed_research) now log a warning when called without explicit
configuration (no settings_snapshot, no provider, no settings).
They still work using defaults + environment variables, but the
warning alerts callers that they may not get expected results.
2026-03-25 09:45:36 +01:00
LearningCircuit
2eaaf12109 feat: Implement per-user encrypted databases with comprehensive auth system
BREAKING CHANGE: Data files now stored in platform-specific user directories
with SQLCipher encryption. Users must register/login to access the application.

## Major Features

### Security & Authentication
- Implemented complete multi-user authentication system with Flask-Login
- Per-user SQLCipher encrypted databases (falls back to SQLite with warnings)
- Secure session management with proper CSRF protection
- Password hashing with bcrypt for user credentials
- Complete isolation between user data - no cross-user access possible
- Thread-safe database connections with proper session management

### Database Architecture
- Migrated from single shared database to per-user encrypted databases
- Centralized auth database for user management
- User-specific databases for research data, settings, and metrics
- Automatic database initialization on user registration
- Platform-specific data directories using platformdirs library
- Removed all hardcoded paths and personal information

### User Experience
- Registration page with data privacy acknowledgment
- Login/logout functionality with session persistence
- Automatic redirect to login for unauthenticated access
- Research queue system with 3 concurrent research limit per user
- Real-time queue position updates
- Comprehensive error handling with user-friendly messages

### API & Routes
- All API endpoints now require authentication
- Updated routes: /auth/register, /auth/login, /auth/logout, /auth/check
- Protected research submission and history endpoints
- Proper JSON error responses for API routes
- CSRF token validation for state-changing operations

### Testing
- Added 53 Puppeteer tests for UI authentication flows
- Comprehensive auth integration tests (248 Python test files)
- Multi-user concurrent access testing
- Queue system testing with position tracking
- Database migration and encryption tests

### Configuration
- Single LDR_DATA_DIR environment variable for data location
- LDR_ALLOW_UNENCRYPTED environment variable for development
- Updated Docker configuration for proper volume mounting
- Removed multiple environment variables for simplicity

### Documentation
- Added DATA_MIGRATION_GUIDE.md for upgrade instructions
- Added SQLCIPHER_INSTALL.md for encryption setup
- Updated environment configuration documentation
- Professional error messages throughout

## Technical Improvements
- Replaced raw SQL with SQLAlchemy ORM throughout
- Proper database session management with context managers
- Thread-local storage for database connections
- Automatic cleanup of stale sessions
- Rate limiting infrastructure for future use
- Comprehensive logging with loguru
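
As an illustration of the thread-local connection item, here is a sketch using the standard `sqlite3` module. The real code uses SQLAlchemy and SQLCipher; the function name and the in-memory default path are assumptions:

```python
import sqlite3
import threading

_local = threading.local()


def get_connection(db_path=":memory:"):
    """Return this thread's connection, creating it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = sqlite3.connect(db_path)
        # Stored on the thread-local object, so other threads never see it.
        _local.conn = conn
    return conn
```

Repeated calls on one thread return the same connection object, while each new thread gets its own.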

## Files Changed
- 322 files modified/added
- 248 Python files (core functionality and tests)
- 53 JavaScript files (Puppeteer tests)
- 6 Markdown files (documentation)
- No binary files, screenshots, or database files included
- All test credentials properly marked with pragma comments

This migration isolates each user's research data in its own database, encrypted
with SQLCipher where it is available.
2025-06-29 11:32:48 +02:00
LearningCircuit
3d102da08d feat: Add custom LLM integration support (#507)
* feat: Add custom LLM integration support

- Add LLM registry system for managing custom language models
- Support both LLM instances and factory functions
- Add llms parameter to API functions (quick_summary, detailed_research, generate_report)
- Create comprehensive test suite with 38 tests covering:
  - Registry functionality
  - Integration with get_llm()
  - API integration
  - Edge cases (streaming, errors, concurrency)
  - Benchmark compatibility
- Add CI/CD workflow for LLM tests
- Include example implementations and documentation
- Thread-safe implementation with proper cleanup

This feature allows users to pass custom LangChain-compatible LLMs to the research system,
similar to how custom retrievers work. Users can register LLMs programmatically and use
them via the provider parameter.
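
A minimal sketch of a registry that accepts both instances and factory functions, as described above. The class shape, the `is_factory` flag, and the method names are assumptions for illustration; the real registry may tell the two cases apart differently:

```python
import threading
from typing import Any, Dict, Tuple


class LLMRegistry:
    """Hypothetical thread-safe registry of named LLMs."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._entries: Dict[str, Tuple[Any, bool]] = {}

    def register(self, name: str, llm: Any, *, is_factory: bool = False) -> None:
        with self._lock:
            self._entries[name] = (llm, is_factory)

    def get(self, name: str, **kwargs: Any) -> Any:
        with self._lock:
            if name not in self._entries:
                raise KeyError(f"no LLM registered under {name!r}")
            llm, is_factory = self._entries[name]
        # Factories are called outside the lock with caller-supplied
        # options; instances are returned as registered.
        return llm(**kwargs) if is_factory else llm

    def clear(self) -> None:
        with self._lock:
            self._entries.clear()
```

A caller would register an entry under a name and then select it by that name, in the same way the provider parameter is used above.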

* Fix PR review issues for custom LLM integration

- Replace print statements with loguru logger in advanced_custom_llm.py
- Add docstrings for ConfigurableLLM parameters
- Remove hardcoded confidence value, use descriptive text instead
- Fix clear-text logging of sensitive medical information
- Move LLM and retriever registration to _init_search_system to centralize logic
- Use logger.exception instead of logger.error for better error tracking
2025-06-21 22:57:52 +02:00