local-deep-research/.github/workflows/check-workflow-status.yml
LearningCircuit 91b68acafd docs(ci): auto-generated workflow status dashboard (#3966)
* docs(ci): add auto-generated workflow status dashboard

Adds `docs/ci/workflow-status.md` — a single page that surfaces every
GitHub Actions workflow in the repo, grouped by role, with action items
(disabled / stale / manual-only) at the top. Live status badges link to
each workflow's runs page. Auto-generated from the workflow YAML files +
the GitHub API by `scripts/generate_workflow_status.py`.

Why: the GitHub Actions tab interleaves runs from all workflows
chronologically (a poor "is anything red right now?" view), and the
static workflow table in `CI_CD_INFRASTRUCTURE.md` drifts when workflows
are added/renamed (PR #3963 fixed three factually wrong header claims
for exactly this reason). A reference page that mechanically reflects
current state and identifies dormant workflows closes both gaps.

What's surfaced today (verified live):
- **Disabled**: `nuclei.yml` (caller commented out in
  `release-gate.yml:177`).
- **Stale**: `update-precommit-hooks.yml` — its weekly Friday cron has
  been **failing for 10+ consecutive weeks** (since at least 2026-03-06).
  This was discovered by the dashboard, not previously tracked.
- **Manual-only**: `check-config-docs.yml`, `sync-main-to-dev.yml`
  (both intentionally manual; the dashboard shows them so they're not
  forgotten).

Generator design notes:
- Resolves reusable workflows correctly: `gh run list --workflow=X.yml`
  is empty for `workflow_call`-only workflows. The script walks the
  call graph (release.yml → release-gate.yml → semgrep.yml etc.),
  fetches the parent run's job list, and matches by **job key** parsed
  from the caller YAML (not by name heuristic — `gitleaks-scan` ↔
  `gitleaks-main.yml` would otherwise collide with `gitleaks.yml`).
- Picks "primary trigger" per workflow so e.g. `codeql.yml` (PR + push +
  cron + workflow_call) gets its glyph from the gated daily run, not a
  stale PR run.
- Stale check walks the *recent* runs list to find last success — a
  workflow that ran red yesterday and green a week ago is not stale.
- Manual edits outside the `<!-- BEGIN/END GENERATED -->` markers are
  preserved on regeneration; the timestamp lives inside the markers so
  post-marker content is fully user-owned.
- Preflights `gh auth status` and rate limit before any per-workflow
  call — fails fast with actionable message instead of partial output.
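
As a rough illustration of the marker-preserving regeneration described above, the splice could look like the sketch below. The exact marker strings and the function name `merge_generated` are assumptions made for this example; the real script may spell them differently.

```python
# Assumed marker spellings; the real dashboard may use different text.
BEGIN = "<!-- BEGIN GENERATED -->"
END = "<!-- END GENERATED -->"


def merge_generated(existing: str, generated: str) -> str:
    """Splice `generated` between the markers, keeping text outside them."""
    block = f"{BEGIN}\n{generated.strip()}\n{END}"
    start = existing.find(BEGIN)
    end = existing.find(END)
    # Missing or reversed markers: fall back to a clean overwrite
    # instead of splicing around a corrupted region.
    if start == -1 or end == -1 or end < start:
        return block + "\n"
    # Everything before BEGIN and after END is user-owned and kept verbatim.
    return existing[:start] + block + existing[end + len(END):]
```

Keeping everything volatile inside the markers is what lets the text before and after them stay fully user-owned across regenerations.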

CI integration:
- `.github/workflows/check-workflow-status.yml` runs
  `--check-structure` on PRs touching workflows, the dashboard, or the
  generator. Pure structural check (no API calls, no live data) — fast
  and deterministic. Live regeneration stays on demand.

Cost: ~340 GitHub API calls per regeneration, ~45 sec wall-clock,
~6.8% of the 5000/hr authenticated quota.

* fixup(ci): review-pass corrections to workflow status dashboard

Surfaced by three rounds of code-review + correctness + security agents
on the original PR. Four small fixes; no behavioral change to the
generated dashboard's content.

1. **Recognize commented job keys** — `JOB_KEY_RE` now accepts an
   optional `# ` prefix. Previously, when an entire job block was
   commented out (e.g. `release-gate.yml:175-181` for nuclei), the
   commented `uses:` line inherited the *previous* active job's key
   (`gitleaks-scan`) instead of the correct `nuclei-scan`. Latent —
   commented entries are filtered out before reaching gated-run lookup
   — but would misattribute status if someone partially uncommented a
   block (uncommented just the `uses:` line).

2. **Pin pyyaml to ==6.0.3** in the CI workflow. The repo convention is
   exact `==` pins (95% of `pip install` calls in workflows); the only
   floating range was the one introduced by this PR. Matches pdm.lock.

3. **Validate marker order** in `merge_with_existing`. If a manual edit
   leaves the BEGIN/END markers reversed (e.g. mid-merge-conflict), bail
   to a clean overwrite instead of splicing interleaved garbage.

4. **Remove `_coerce_jq_stream`** — unused helper left behind from an
   earlier iteration. Zero call sites; no behavior change.
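
To make fix 1 concrete, here is a minimal sketch of a line-based scan that attributes each `uses:` line to its enclosing job key, whether or not the block is commented out. The regexes, the function name `callee_job_keys`, and the assumed comment style (a `# ` placed at the job-key column) are illustrative guesses, not the script's actual `JOB_KEY_RE`. The scan is deliberately simplified and does not check that it is inside the `jobs:` mapping.

```python
import re

# Hypothetical patterns, not the generator's real ones.
COMMENT = re.compile(r"^(\s*)# ?")                  # optional "# " after the indent
JOB_KEY = re.compile(r"^  ([A-Za-z_][\w-]*):\s*$")  # bare key at two-space indent
USES = re.compile(r"^\s+uses:\s*\./\.github/workflows/([\w.-]+)")


def callee_job_keys(caller_yaml: str) -> dict[str, tuple[str, bool]]:
    """Map each called workflow file to (job key, is_commented_out)."""
    result: dict[str, tuple[str, bool]] = {}
    current = None
    for raw in caller_yaml.splitlines():
        commented = COMMENT.match(raw) is not None
        # Strip the comment prefix so commented job keys are still recognized.
        line = COMMENT.sub(r"\1", raw, count=1)
        if m := JOB_KEY.match(line):
            current = m.group(1)
        elif (m := USES.match(line)) and current is not None:
            result[m.group(1)] = (current, commented)
    return result


sample = """\
jobs:
  gitleaks-scan:
    uses: ./.github/workflows/gitleaks-main.yml
  # nuclei-scan:
  #   uses: ./.github/workflows/nuclei.yml
"""
print(callee_job_keys(sample))
```

Without the comment-stripping step, the commented `uses:` line for `nuclei.yml` would be attributed to `gitleaks-scan`, which is the misattribution the fix addresses.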

Verified by re-running the generator + `--check-structure`. The
rendered dashboard's only diff vs prior commit is the regeneration
timestamp and live "Last activity" cells (expected — those reflect
recent runs since the previous regen).

* feat(ci): bucketed activity labels + auto-regen on version bump

Two changes that together make the dashboard's diffs meaningful instead
of noisy.

1. **Coarse activity buckets.** Replace exact UTC timestamps in every
   "Last activity / Last manual run / Last successful run" cell with one
   of: `this week`, `last week`, `2 weeks ago`, `3 weeks ago`,
   `last month`, `2 months ago`, `3+ months ago`, `long ago`, `never`.
   Calendar-day boundaries (no time-of-day jitter) so two regenerations
   on the same date produce **zero diff** when nothing actually drifted.
   Verified: same-day re-runs after stable workflow state → empty diff.

   Also drop the redundant `Days idle` columns from Stale and
   Manual-only tables (the bucket label already says it), and round the
   "Last regenerated" footer to a date.

   Why: a daily-running healthy workflow used to bump its timestamp
   every regen (noise). Now it stays in `this week` indefinitely, and
   the only diffs that land in a version-bump PR are real bucket
   transitions — exactly the "this slipped from last week to last month
   — something might be wrong" signal the dashboard exists for.

2. **Auto-regenerate on version bump.** Add a step to `version_check.yml`
   right after the existing `generate_config_docs.py` regen. Same
   pattern as the config docs precedent — the dashboard refresh rides
   along with each version-bump PR and is reviewable in the same diff.

   Costs ~340 GitHub API calls per run (well under the 1,000
   requests/hour per-repository limit that applies to GITHUB_TOKEN).
   Adds `actions: read` to the job permissions block; uses
   `pyyaml==6.0.3` matching pdm.lock.
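
The coarse buckets from item 1 could be computed along these lines. The label set comes from the list above, but the day thresholds and the name `activity_bucket` are assumptions for illustration; the script's real cut-offs are not stated here.

```python
from datetime import date
from typing import Optional

# Exclusive upper bounds in whole days. These thresholds are illustrative
# guesses; only the labels themselves come from the commit message.
BUCKETS = [
    (7, "this week"),
    (14, "last week"),
    (21, "2 weeks ago"),
    (28, "3 weeks ago"),
    (60, "last month"),
    (90, "2 months ago"),
    (365, "3+ months ago"),
]


def activity_bucket(last_run: Optional[date], today: date) -> str:
    """Return a coarse label for how long ago `last_run` happened."""
    if last_run is None:
        return "never"
    # Comparing dates (not datetimes) removes time-of-day jitter, so two
    # regenerations on the same calendar day always agree.
    age_days = (today - last_run).days
    for upper, label in BUCKETS:
        if age_days < upper:
            return label
    return "long ago"
```

Because the input is a calendar date, a healthy daily workflow keeps resolving to `this week`, and only a real change in age moves a row to a different label.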

* feat(ci): drop regen timestamp; add health banner; fix in-progress false-stale

Three follow-ups to keep version-bump diffs strictly meaningful, plus
two correctness fixes uncovered by repeated stability testing.

1. **Drop the "Last regenerated" date.** Git history is authoritative
   for "when this snapshot was taken"; embedding a date here forced a
   single-line diff every regeneration even when nothing else drifted.

2. **Aggregated health banner** at the top of the generated region:
   `**63 workflows:** 1 disabled · 1 stale · 2 manual-only · 59 active`
   Counts only change when a workflow shifts between
   {disabled, stale, manual, active} — same level of diff-stability as
   the per-row buckets.

3. **`?event=schedule` for own-cron workflow badges.** Verified
   effective by SHA-comparing badge bodies for workflows with
   multi-event run history. Makes the badge for e.g. `gitleaks.yml`,
   `fuzz.yml`, `osv-scanner.yml` reflect cron health specifically,
   rather than whichever PR ran last. The runs-page link uses the
   matching `?query=event%3Aschedule` so a click lands on the
   filtered run list.

4. **Fix false-stale during in-flight release runs.** Previously,
   when release.yml was running, gates reachable via release.yml
   (puppeteer-e2e-tests, ci-gate, etc.) would briefly flip to "stale"
   because `fetch_last_gated_run` returned the in-progress run first
   and `last_success` couldn't see past it. Now the function walks
   all 5 caller runs and returns both the latest match (for activity)
   and the latest successful match (for staleness), avoiding the flip.

5. **Map all GitHub conclusion enum values.** A `gitleaks.yml` run
   completed with `action_required` between two test regens; the
   glyph table didn't have it and rendered `?`. Added every
   documented value (`neutral`, `timed_out`, `stale`, `action_required`)
   and changed the unknown-fallback from `?` to em-dash, so future
   GitHub-side enum additions don't introduce a false-positive diff.
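
The idea behind fix 4 can be sketched as a single pass over a newest-first run list that reports the latest run and the latest successful run separately. The function name is hypothetical, and the dictionaries only mimic the `status` and `conclusion` fields of GitHub's workflow-run objects; this is not the body of `fetch_last_gated_run`.

```python
from typing import Optional


def latest_and_last_success(
    runs: list[dict],
) -> tuple[Optional[dict], Optional[dict]]:
    """`runs` is ordered newest first; return (latest, latest successful)."""
    latest = runs[0] if runs else None
    # An in-progress run has conclusion None, so it is skipped here
    # and cannot hide an older green run.
    last_success = next(
        (r for r in runs if r.get("conclusion") == "success"), None
    )
    return latest, last_success


runs = [
    {"id": 3, "status": "in_progress", "conclusion": None},
    {"id": 2, "status": "completed", "conclusion": "failure"},
    {"id": 1, "status": "completed", "conclusion": "success"},
]
latest, last_success = latest_and_last_success(runs)
print(latest["id"], last_success["id"])
```

Using the first value for the activity cell and the second for the staleness decision keeps an in-flight release from flipping its gates to "stale".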

Verified: two same-day regens after workflow state has settled now
produce **zero diff**.

* ci(version-bump): make workflow-status regen non-blocking

Add `continue-on-error: true` to the dashboard regeneration step in
version_check.yml. The regen makes ~340 GitHub API calls and would
otherwise block the entire version-bump PR if any of them fails
transiently (rate-limit hit, GitHub Actions outage, etc.). The failure
mode should be "dashboard stays at the previous snapshot until the next
successful regen", not "release pipeline is blocked".

The sibling `generate_config_docs.py` step doesn't need this — it's
purely local with no external API dependency.
2026-05-10 15:58:32 +02:00


name: Check Workflow Status Dashboard
# Fails when a workflow file is added/renamed without a corresponding row
# in docs/ci/workflow-status.md. Pure structural check — no GitHub API
# calls, no live data — so it runs fast and doesn't need any auth.
#
# To fix a failure: regenerate the dashboard with
# `pdm run python scripts/generate_workflow_status.py`
# This requires `gh` authenticated against the repo. If you can't run it
# locally, ping a maintainer to regenerate, or add a temporary placeholder
# `\`<your-new-workflow>.yml\`` mention in the file's manual-edit region
# to unblock the PR.
on:
  pull_request:
    paths:
      - '.github/workflows/**'
      - 'docs/ci/workflow-status.md'
      - 'scripts/generate_workflow_status.py'
  workflow_dispatch:

permissions:
  contents: read

jobs:
  check-structure:
    runs-on: ubuntu-latest
    timeout-minutes: 5
    steps:
      - name: Harden the runner (Audit all outbound calls)
        uses: step-security/harden-runner@a5ad31d6a139d249332a2605b85202e8c0b78450 # v2.19.1
        with:
          egress-policy: audit
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          persist-credentials: false
      - name: Set up Python
        uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
        with:
          python-version: '3.12'
      - name: Install PyYAML
        # Pinned to match pdm.lock; the rest of the repo uses exact
        # `==` pins for ad-hoc workflow installs (see e.g.
        # validate-image-pinning.yml). Floating `~=` ranges can pick up
        # yanked / replaced patch versions silently.
        run: pip install pyyaml==6.0.3
      - name: Verify dashboard structure
        run: python scripts/generate_workflow_status.py --check-structure