* fix: accurate token/message counting for model router decisions
Previously the router only counted tokens from the current user message,
ignoring system prompt, chat history, pinned docs, and parsed files.
Introduce ModelRouterService.gatherRoutingContext() to compute the true
conversation token count and reuse the fetched context downstream to
avoid redundant DB/filesystem lookups.
Also adds model router support to the OpenAI-compatible and API chat
handlers, propagates usageMetrics through ephemeral agents, and fixes
the router input to use the attachment-injected message.
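
As a rough illustration of the counting change, the sketch below sums tokens over every part of the conversation instead of only the current message. It is not the actual ModelRouterService code: the argument shape, the returned fields, and the ~4 characters per token estimate are all assumptions made for this example.

```javascript
// Hypothetical sketch only; not the real ModelRouterService.gatherRoutingContext().
// Assumption: a crude ~4 characters per token estimate stands in for a real tokenizer.
function estimateTokens(text) {
  return Math.ceil(String(text ?? "").length / 4);
}

// Assumed input shape: plain strings plus a history of { content } objects.
function gatherRoutingContext({
  systemPrompt = "",
  history = [],
  pinnedDocs = [],
  parsedFiles = [],
  message = "",
}) {
  const parts = [
    systemPrompt,
    ...history.map((entry) => entry.content),
    ...pinnedDocs,
    ...parsedFiles,
    message,
  ];
  const tokenCount = parts.reduce((sum, part) => sum + estimateTokens(part), 0);
  // The current user message is included in the message count.
  const messageCount = history.length + 1;
  // Return the fetched context too, so callers can reuse it downstream.
  return { tokenCount, messageCount, systemPrompt, history, pinnedDocs, parsedFiles };
}
```

Returning the gathered context alongside the counts is what lets later steps skip a second DB or filesystem lookup.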
* fix /chat API counter
* count current message in message count for router
* add router handler everywhere there is getLLMProvider
* router logic overhaul and refactor
* fix tests
* fix logger
* support generic openAI auto-detection for dropdown in model rules provider - fails over to plaintext input
* add model router backend: schema, models, API, evaluator, provider, and cooldown
* add model router frontend: settings pages, workspace integration, rule builder with DnD, and system LLM support
* add routing indicator, new rule types, and UX improvements for model router
* add provider credential validation to model router and fix icon alignment
* add llm classification rules for model router
* lint
* add vision routing, agent support, and configurable cooldown to model router
* add english translations for model router
* move isPathMatch to path utils
* fix model router sticking to first model in agent mode
* make model router notifications ephemeral and add CSV rule support
* improve model route notification ux and fix agent image routing
* split RuleForm into subcomponents
* extract RuleRow subcomponent and move ModalWrapper to NewRouterModal
* remove /admin prefix from model router endpoints
* use single query for model router list endpoint
* clean up model router error handling and http methods
* refactor model router rule update method and fix stale error field names
* add model router env vars to .env.example
* update fallback model description to mention llm rule classification
* fix model router rule row alignment and badge readability
* Model Router: Multi-condition rules, updated designs, and router edit support (#5478)
* Implement draft designs
* add matches comparator for prompt content
* implement multi conditional to a calculated rule
* fix between comparator
* restyle model router modals to match light mode designs
---------
Co-authored-by: shatfield4 <seanhatfield5@gmail.com>
* replace model route spinner with webm animation
* route llm classifier through aibitat tool call
* fix: rule form state not resetting when making new rules
* rename router form to rules page + fix and/plus button styles
* add telemetry for model router creation
* fix x button alignment on rule condition rows
* simple UI changes
* fix router
* Refactor model routing into singleton with sticky routes and LLM caching
Consolidate router/index.js, router/deterministic.js, and router/cooldown.js
into a single ModelRouterService singleton class that manages all routing state.
Key changes:
- Sticky model routing: once a rule matches, the model stays active for
follow-up messages that don't match any rule (e.g., "tell me more"),
preventing unwanted bouncing back to the fallback model.
- LLM classification cache: expensive LLM classifier calls are cached for
the full sticky window (default 5 min), running at most once per window.
- Calculated rules always re-evaluate (they're instant), so topic shifts
that match a different rule still route correctly.
- Route notifications only fire when the model actually changes, not on
every message.
- Router config is cached in-memory and invalidated on CRUD operations.
- Detailed logging for cache hits/misses/expiry and rule evaluation.
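
The sticky behavior described above can be sketched roughly as follows. This is a minimal illustration, not the ModelRouterService class itself: the class name, the rule shape (`{ model, matches }`), the per-thread map, and the injectable clock are all assumptions, and the LLM classification cache is left out.

```javascript
// Hypothetical sketch of sticky routing; names and shapes are assumptions.
class StickyRouter {
  constructor({ fallbackModel, stickyMs = 5 * 60 * 1000, now = () => Date.now() }) {
    this.fallbackModel = fallbackModel;
    this.stickyMs = stickyMs;
    this.now = now; // injectable clock so expiry can be tested
    this.sticky = new Map(); // threadId -> { model, expiresAt }
  }

  // Returns the sticky model for a thread, or null if none is set or it expired.
  activeModel(threadId) {
    const entry = this.sticky.get(threadId);
    if (!entry) return null;
    if (entry.expiresAt <= this.now()) {
      this.sticky.delete(threadId);
      return null;
    }
    return entry.model;
  }

  // rules: [{ model, matches: (message) => boolean }]
  // Calculated rules are cheap, so they are re-evaluated on every message.
  route(threadId, message, rules) {
    const previous = this.activeModel(threadId);
    const matched = rules.find((rule) => rule.matches(message));
    const model = matched ? matched.model : (previous ?? this.fallbackModel);
    if (matched || previous) {
      // Start or refresh the sticky window for this thread.
      this.sticky.set(threadId, { model, expiresAt: this.now() + this.stickyMs });
    }
    // Callers notify only when the model actually changed.
    return { model, changed: model !== (previous ?? this.fallbackModel) };
  }

  reset(threadId) {
    this.sticky.delete(threadId);
  }
}
```

With one rule that sends messages containing "code" to a different model, a follow-up such as "tell me more" stays on that model until the window expires, then drops back to the fallback.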
* fix model router notification text to be i18n
* prevent history overwrite on regen
* fix in-processing post-routed chat spacing in UI
* reset router on thread reset
* update translation key
* refactor route name label on chat history
* use system LLM on new model router
* frontend modular cleanup
* fix UI/UX on how rules are shown
* rule builder translations
* translations
* endpoint cleanup
* refactor rules
* fix router translation for no rule
remove from default plugins
port support for model router to ephemeral
* simplify stream handler
* do not route notify on first message (non-agent chat)
* fix cache keys
* ttl reset on followups
* Differentiate between short TTL and long TTL when we actually have run an LLM eval route
* port embed support for router option
* frontend fixes
* dynamic key
* 5315 i18n (#5666)
* translations
* norm
* readme entry
* update comments
---------
Co-authored-by: Marcello Fitton <106866560+angelplusultra@users.noreply.github.com>
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* add WORKSPACE_DELETION_PROTECTION env flag to disable workspace deletion from UI and APIs
* minor nits
* patch test for phrase
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* add memory storage layer with model, endpoints, schema, and system setting
* add memory extraction background worker
* handle single-user and multi-user mode for memory extraction
* add memory injection and personalization ui
* lint
* personalization UI polish, fix RBAC for all roles, and endpoint guards
* split admin personalization page into subcomponents
* split admin personalization page into subcomponents, use early return instead of ternary for readability
* make error responses consistent in memory endpoints
* consolidate duplicate MemoryItem into shared component
* skip memory reranking when no query context is available
* extract promptWithMemories util to deduplicate memory injection
* add english translations
* refactor memory api routes to eliminate ambiguous path params
* make memory extraction interval and idle threshold configurable via env vars
* add hint to memory form input
* inline personalization content in admin settings page
* consolidate unprocessed chat queries into single fetch
* simplify personalization page with early returns and layout subcomponent
* add memories sidebar with chat settings menu redesign
* remove legacy personalization ui
* split MemoriesSidebar into subcomponent folders
* fix memory card menu spacing to match design
* fix memories visibility for non-admin users
* add english translations for memories sidebar ui
* add workspace access checks to all memory endpoints
* pass user prompt to agent getDefinition for memory reranking
* check idle threshold per user/workspace group instead of globally
* use factory pattern for i18n text sizes and replace Trans with plain t() keys
* add placeholder examples and format guidance for manual memory creation
* remove stale comment
* refactor chat menu
* refactor sidebar to use provider and simplified DOM rendering
* fix memories sidebar overflow from long workspace names, widen memories options menu
* fix edit memory modal cursor landing at start of input
* hide memory add/move actions when at limit
* revert scope of workspace settings route changed for first pass of memories ui
* jsdoc memory model methods, drop redundant error logs
* inline error response returns and move body validation after perms in memory endpoints
* run memory fetches concurrently in list endpoint and prompt injection
* jsdoc server memory model methods
* normalize memory_enabled to true/false and add SystemSettings helper
* extract memory ownership middleware, strip redundant .end() calls
* use validWorkspaceSlug middleware for memory endpoints, switch to slug-based urls
* add description to memories util
* simplify memory ownership middleware to single db query
* use aibitat tool call for memory extraction, add chat safety limits
* truncate per-chat prompt/response before memory extraction
* add input validation to memory model
* validate ids array input in markMemoryProcessed
* inline memory injection in chatPrompt and systemPrompt
* make memory migration additive and drop unused sourceThreadId
* normalize translation files
* reset translations, minor nit
* refactor consts, extract memories with outputs
* set interval
* sync
* rearchitect how memory extraction works by moving to Observer/Reflector pattern (Mastra aspect)
* remove timeout on load
* Feat/memory i18n (#5661)
translations for #5269
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* support pulling generated documents from API calls
* revert async wait
* fix strictness of packMessages to handle streaming chunks but keep functionality
* add MiniMax LLM provider support
* add MiniMax LLM provider to the docs
* fix: add trailing newlines for lint compliance
* add env vars to system settings | make max tokens configurable via ui | pass max tokens into minimax llm provider
* change fallback max tokens value to null to use provider default | pass max tokens into handleFunctionCallStream and chat
* add minimax to getModelTag switch
* pass provider into tooledStream and tooledComplete
* remove max tokens param
* update image
---------
Co-authored-by: angelplusultra <macfittondev@gmail.com>
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* fix: add font fallback for form controls
* fix: restore complete index.css with font fallback
* move to end and mark important
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* fix: white flash when switching between threads
* nest thread route under workspace
* fix: white flash when switching between workspaces
* simplify Link usage in workspace/thread sidebar items
* smooth workspace and thread switching
* fix race condition on send during thread/workspace switch
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* fix(tts): strip Markdown syntax before sending text to TTS engines
Chat responses are rendered as Markdown but the TTS components piped the
raw response into Piper / the browser's `SpeechSynthesis` API. The
synthesizer reads every special character literally — `**bold**` becomes
"asterisk asterisk bold asterisk asterisk", `# Heading` becomes "pound
heading", code fences are read backtick-by-backtick, and bullet lists
become "hyphen item". The result is unintelligible whenever the assistant
includes any formatting, which is most of the time.
This commit adds a small `messageToSpeech` helper that converts a
Markdown chat message into plain text suitable for TTS:
- fenced code blocks and images are dropped (nothing useful to read)
- inline code and link labels keep their text content
- emphasis markers, headings, blockquote markers, list markers, and
horizontal rules are stripped while preserving the underlying words
- HTML tags are removed but their text content kept
- table pipes become commas so rows read naturally
The helper is regex-based — no new dependency — and is wired into both
the native (`SpeechSynthesis`) and Piper TTS components in
`WorkspaceChat/ChatContainer/ChatHistory/HistoricalMessage/Actions/TTSButton`.
Closes #5557.
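
For illustration, a regex-only helper along the lines described above might look like the sketch below. It is not the committed `messageToSpeech` implementation: the specific patterns, their order, and the whitespace cleanup at the end are assumptions, and edge cases such as nested emphasis or tables without a leading pipe are not covered.

```javascript
// Hypothetical sketch; the real messageToSpeech helper may use different rules and ordering.
function messageToSpeech(markdown = "") {
  return String(markdown)
    .replace(/`{3}[\s\S]*?`{3}/g, "") // drop fenced code blocks
    .replace(/!\[[^\]]*\]\([^)]*\)/g, "") // drop images
    .replace(/\[([^\]]+)\]\([^)]*\)/g, "$1") // links: keep the label
    .replace(/`([^`]+)`/g, "$1") // inline code: keep the text
    .replace(/<\/?[a-zA-Z][^>]*>/g, "") // HTML tags: keep the inner text
    .replace(/^[ \t]{0,3}(?:[-*_][ \t]*){3,}$/gm, "") // horizontal rules
    .replace(/^[ \t]{0,3}#{1,6}[ \t]+/gm, "") // heading markers
    .replace(/^[ \t]{0,3}>[ \t]?/gm, "") // blockquote markers
    .replace(/^[ \t]*(?:[-*+]|\d+\.)[ \t]+/gm, "") // list markers
    .replace(/(\*\*|__|\*|~~)(.+?)\1/g, "$2") // emphasis markers
    .replace(/^[ \t]*\|(?:[ \t]*:?-+:?[ \t]*\|)+[ \t]*$/gm, "") // table separator rows
    .replace(/^[ \t]*\|[ \t]*|[ \t]*\|[ \t]*$/gm, "") // pipes at row edges
    .replace(/[ \t]*\|[ \t]*/g, ", ") // inner table pipes become commas
    .replace(/[ \t]+\n/g, "\n") // trim trailing spaces
    .replace(/\n{2,}/g, "\n") // collapse blank lines
    .replace(/[ \t]{2,}/g, " ") // collapse repeated spaces
    .trim();
}
```

Fenced blocks and images are removed first so that later patterns never see their contents, and single-underscore emphasis is deliberately skipped in this sketch to avoid mangling identifiers such as `snake_case`.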
---
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* initialize
* expand tool result text limit | add syntax highlighting and json formatting to tool result rendering
* fix onError jsdoc
* lint
* fix unread icon
* route protection
* improve form handling for NewJobModal
* safeJsonParse
* remove unneeded comments
* remove trycatch
* add truncateText helper
* add explicit fallback value to safeJsonParse
* add shared cron constant and helpers
* reduce frontend indirection
* use isLight to compute syntax highlighting theme
* remove dead code
* remove forJob and make job limit to 50
* create recomputeNextRunAt helper method
* add comment about nextRunAt recomputation
* add job queue and concurrency control to scheduled jobs
* use p-queue
* change default max concurrent value to 1
* add comment explaining internal scheduling system
* add recomputeNextRunAt on boot
* add generated documents to run details
* Modify toolsOverride functionality where no tools selected means no tools are given to the agent
add a select all/deselect all toggle button for easily selecting all
tools in the create job form
* create usePolling hook
* add polling to scheduled jobs and scheduled job runs pages
* add cron generation feature in job form
* remove cron generation feature | add cron builder feature | add max active scheduled jobs limit
* set MAX_ACTIVE to null
* replace hour and minute input fields with input with type time
* simplify
* organize components
* move components to bottom of page component
* change Generated Documents to Generated Files
* add i18n to cronstrue
* add i18n
* add type="button" to button elements
* refactor fileSource retrieval logic
* one scheduled job run can have status "running"
* add protection of file retrieval from scheduled job in multiuser mode
* fix comments
* make job status default to queued
* add queued status
* fix bug with result trace rendering
* store timeout ref and clearTimeout once race settles
* remove unneeded handlerPromise tracking
* move imports to top level
* refactor hardcoded paths to path resolve functions
* implement new job form design
* simplify
* fix button styles
* fix runJob bug
* implement styles for scheduled jobs page
* apply dark mode figma styles
* delete unused translation key
* implement light mode for new new job modal, run history, and run details
* lint
* fix light mode scroll bar in tool call card
* adjust table header contrast
* fix type in subtitle
* kill workers when job is in-flight before deleting job
* add border-none to buttons
* change locale time to iso string
* import BackgroundService at module level | instantiate backgroundService singleton once and reuse across handlers
* add p-queue, @breejs/later and cron-validate as core deps
* parse cron expression to a builder state once
* add theme to day buttons in cron builder
* fix stale tools selection caption
* flip popover when popover clips screen height
* make ScheduleJob.trigger() await the run insertion | disable run now button if job is in flight
* regen table
* refactor generated file card
* refactor frontend
* remove logs
* major refactor for tool picking, fix bree/later bug
* combine action endpoints, move continue to method
* fix unoptimized query with include + take + order
* fix dangerous use, refactor job to utils
* add copy content to text response
* improve notification system subscription for browser
* remove unused translations
* prevent gen-file cleanup job from deleting active job file generated references
* rich text copy
* Scheduled Jobs: Translations (#5482)
* add locales for scheduled jobs
* i18n
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
* add config flag with UI notice
* update README
* telemetry datapoints
* Always use UTC on backend, convert to local in frontend
* fix tz render
* Add job killing
* cleanup thinking text in job notifications and break out reasoning in response text.
Also hide zero metrics since that is useless
* Port generatedFile schema to the normalized workspace chat `outputs` file format so porting to thread is simple and implementation between chats <> jobs is 1:1
* what the fuck
* compiled bug
* fixed thinking oddity in compiled frontend
* suppress multi-toast
* fix duration call
* Revert "fix duration call"
This reverts commit 0491bc71f4.
* revert and reapply fix
---------
Co-authored-by: Timothy Carambat <rambat1010@gmail.com>