LibreChat/.github
Danny Avila 0bd1a7350f 👷 ci: Add API runtime smoke (boot the production image) to docker-smoke (#13605)
* 👷 ci: Add API runtime smoke (boot the production image) to docker-smoke

The docker-smoke workflow only built the `client-package-build` stage and
never booted the runtime, so it couldn't catch the class of regression that
recently took production down: the api tsdown bundle externalizes runtime
deps that, after `npm ci --omit=dev`, were missing from the image
(`Cannot find module 'get-stream'`).

- Add an `api-runtime-smoke` job that builds the real production image
  (final `api-build` stage, `npm ci --omit=dev`), then:
  1. loads the @librechat/api bundle's full require graph in the pruned
     image (deterministic, no DB) — fails on any missing/ESM-incompatible
     runtime dependency.
  2. boots the actual entrypoint and asserts no module-load crash (the
     server loads its require graph before connecting to Mongo, so this
     surfaces without a database).
- Expand triggers to include `packages/api/**`, `packages/data-schemas/**`,
  and `api/package.json` (previously, a `packages/api` change triggered this
  workflow only when it also changed the root lockfile, and even then only
  the client stage was built).
- Add a GitHub Actions (gha) build cache and concurrency cancellation to
  bound CI cost.
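
As a rough illustration of step 1 above (loading the require graph and failing on anything missing), the following is a minimal sketch, not the script the workflow actually runs. The helper name `findUnloadableModules` and its injectable `requireFn` parameter are hypothetical, and it assumes a CommonJS context inside the pruned image.

```javascript
// Hypothetical helper: try to load each entry point and collect any failure.
// Requiring a bundle pulls in its externalized dependencies as well, so a
// dependency pruned by `npm ci --omit=dev` is reported here as a failure.
function findUnloadableModules(entries, requireFn = require) {
  const failures = [];
  for (const entry of entries) {
    try {
      requireFn(entry);
    } catch (err) {
      // `code` is whatever Node reports, e.g. 'MODULE_NOT_FOUND'.
      failures.push({ entry, code: err.code, message: err.message });
    }
  }
  return failures;
}

// Possible usage inside the image (assumed, not taken from the workflow):
//   const failures = findUnloadableModules(['@librechat/api']);
//   if (failures.length > 0) { console.error(failures); process.exit(1); }
```

Collecting every failure rather than stopping at the first keeps a single CI run informative when several dependencies are missing at once.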

* 👷 ci: Address Codex review — boot smoke against real Mongo + crash detection

- Boot the production image against a real MongoDB container with the env
  the server needs, so the *entire* require graph loads. `api/db/connect.js`
  throws at module scope without `MONGO_URI` and is imported before
  models/services/routes, so the previous no-env boot exercised almost none
  of the legacy API graph. (Codex finding 2)
- Gate on `/health` returning 200 AND the container staying alive, failing on
  any container exit. A non-module startup crash (ReferenceError, SyntaxError,
  bad config) now fails the smoke instead of slipping past a missing-module
  grep. (Codex finding 3)
- Expand trigger from `api/package.json` to `api/**`, since the image copies
  the whole `api/` tree and runs `node server/index.js`. (Codex finding 1)
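
The two-part gate described above (an HTTP 200 and a container that is still alive) can be sketched as a single decision per polling tick. This is an illustrative sketch only: `evaluateTick` and its input field names are hypothetical, and how the CI script gathers the two inputs is left out.

```javascript
// Hypothetical per-tick decision for the smoke gate.
// containerRunning: whether the container is still alive (assumed input).
// httpStatus: status code of the last probe, or null if the probe failed.
function evaluateTick({ containerRunning, httpStatus }) {
  // Any container exit fails the smoke, whatever the reason for the crash.
  if (!containerRunning) return 'fail';
  // Pass only when the server answers 200 while the container is alive.
  if (httpStatus === 200) return 'pass';
  // Otherwise the server is still starting; check again on the next tick.
  return 'retry';
}
```

Checking liveness before the status code means a crash is reported as a failure immediately instead of being treated as a slow start and waiting for the timeout.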

* 👷 ci: Address Codex round 2 — poll /readyz + cover all image inputs

- Poll /readyz instead of /health. /health returns 200 at app.listen, but
  initializeMCPs() and checkMigrations() run *after* listen and call
  process.exit(1) on failure; /readyz only returns 200 once serverReady is
  set after those complete. So post-listen startup crashes now fail the
  smoke too. (finding A)
- Expand triggers to every source tree copied into the production image:
  client/**, config/**, skill/** (the final stage copies client/dist, config,
  and skill). (finding B)
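
A readiness poll of the kind described in finding A might look like the sketch below. It is an assumption-laden illustration, not the workflow's implementation: `waitForReady`, its options, and the injected `probe` function are hypothetical names, and the URL in the usage comment (including the port) is assumed.

```javascript
// Hypothetical poller: call `probe` until it reports HTTP 200 or the
// attempts run out. `probe` is an async function returning a status code.
async function waitForReady(probe, { attempts = 30, delayMs = 1000 } = {}) {
  for (let i = 0; i < attempts; i++) {
    try {
      if ((await probe()) === 200) return true;
    } catch (err) {
      // Connection errors are expected while the server is still starting.
    }
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return false;
}

// Possible usage with Node's global fetch (Node 18+); the URL is an assumption:
//   const probe = async () => (await fetch('http://localhost:3080/readyz')).status;
//   waitForReady(probe).then((ok) => process.exit(ok ? 0 : 1));
```

Injecting the probe keeps the loop independent of the endpoint, so the same poller works whether it targets /health or /readyz.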
2026-06-08 18:44:52 -04:00