mirror of
https://github.com/LearningCircuit/local-deep-research.git
synced 2026-06-15 19:46:56 +03:00
* docs(ci): add auto-generated workflow status dashboard

Adds `docs/ci/workflow-status.md`, a single page that surfaces every GitHub Actions workflow in the repo, grouped by role, with action items (disabled / stale / manual-only) at the top. Live status badges link to each workflow's runs page. The page is auto-generated from the workflow YAML files and the GitHub API by `scripts/generate_workflow_status.py`.

Why: the GitHub Actions tab mixes all runs chronologically (a poor "is anything red right now?" view), and the static workflow table in `CI_CD_INFRASTRUCTURE.md` drifts when workflows are added or renamed (PR #3963 fixed three factually wrong header claims for exactly this reason). A reference page that mechanically reflects current state and identifies dormant workflows closes both gaps.

What's surfaced today (verified live):

- **Disabled**: `nuclei.yml` (caller commented out in `release-gate.yml:177`).
- **Stale**: `update-precommit-hooks.yml`. Its weekly Friday cron has been **failing for 10+ consecutive weeks** (since at least 2026-03-06). This was discovered by the dashboard, not previously tracked.
- **Manual-only**: `check-config-docs.yml`, `sync-main-to-dev.yml` (both intentionally manual; the dashboard shows them so they're not forgotten).

Generator design notes:

- Resolves reusable workflows correctly: `gh run list --workflow=X.yml` is empty for `workflow_call`-only workflows. The script walks the call graph (release.yml → release-gate.yml → semgrep.yml etc.), fetches the parent run's job list, and matches by **job key** parsed from the caller YAML (not by name heuristic; `gitleaks-scan` ↔ `gitleaks-main.yml` would otherwise collide with `gitleaks.yml`).
- Picks a "primary trigger" per workflow so that e.g. `codeql.yml` (PR + push + cron + workflow_call) gets its glyph from the gated daily run, not a stale PR run.
- The stale check walks the *recent* runs list to find the last success: a workflow that ran red yesterday and green a week ago is not stale.
- Manual edits outside the `<!-- BEGIN/END GENERATED -->` markers are preserved on regeneration; the timestamp lives inside the markers so post-marker content is fully user-owned.
- Preflights `gh auth status` and the rate limit before any per-workflow call, so it fails fast with an actionable message instead of producing partial output.

CI integration:

- `.github/workflows/check-workflow-status.yml` runs `--check-structure` on PRs touching workflows, the dashboard, or the generator. It is a pure structural check (no API calls, no live data), so it is fast and deterministic. Live regeneration stays on demand.

Cost: ~340 GitHub API calls per regeneration, ~45 sec wall-clock, ~6.8% of the 5000/hr authenticated quota.

* fixup(ci): review-pass corrections to workflow status dashboard

Surfaced by three rounds of code-review + correctness + security agents on the original PR. Four small fixes; no behavioral change to the generated dashboard's content.

1. **Recognize commented job keys**: `JOB_KEY_RE` now accepts an optional `# ` prefix. Previously, when an entire job block was commented out (e.g. `release-gate.yml:175-181` for nuclei), the commented `uses:` line inherited the *previous* active job's key (`gitleaks-scan`) instead of the correct `nuclei-scan`. The bug was latent (commented entries are filtered out before reaching gated-run lookup), but it would misattribute status if someone partially uncommented a block (uncommented just the `uses:` line).

2. **Pin pyyaml to ==6.0.3** in the CI workflow. The repo convention is exact `==` pins (95% of `pip install` calls in workflows); the only floating range was the one introduced by this PR. Matches pdm.lock.

3. **Validate marker order** in `merge_with_existing`. If a manual edit leaves the BEGIN/END markers reversed (e.g. mid-merge-conflict), bail to a clean overwrite instead of splicing interleaved garbage.

4. **Remove `_coerce_jq_stream`**, an unused helper left behind from an earlier iteration. Zero call sites; no behavior change.
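The marker handling described above (preserve manual edits outside the generated region, and fall back to a clean overwrite when the markers are missing or reversed) could look roughly like the sketch below. The exact marker strings, the function signature, and the shape of the fallback output are assumptions made for illustration; only the name `merge_with_existing` and the described behavior come from the commit message.

```python
# Assumed marker strings; the commit message only abbreviates them as
# `<!-- BEGIN/END GENERATED -->`.
BEGIN = "<!-- BEGIN GENERATED -->"
END = "<!-- END GENERATED -->"


def merge_with_existing(existing: str, generated: str) -> str:
    """Splice `generated` between the markers found in `existing`.

    Text before BEGIN and after END is kept untouched. If either marker is
    missing, or END appears before BEGIN, return a fresh document instead of
    splicing into a region whose boundaries cannot be trusted.
    """
    begin = existing.find(BEGIN)
    end = existing.find(END)
    if begin == -1 or end == -1 or end < begin:
        # Clean overwrite: markers are absent or reversed.
        return f"{BEGIN}\n{generated}\n{END}\n"
    head = existing[:begin]
    tail = existing[end + len(END):]
    return f"{head}{BEGIN}\n{generated}\n{END}{tail}"
```

Checking the order before slicing matters because slicing with reversed indices would silently duplicate or drop the text between the two markers.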
Verified by re-running the generator + `--check-structure`. The rendered dashboard's only diff vs the prior commit is the regeneration timestamp and live "Last activity" cells (expected: those reflect recent runs since the previous regen).

* feat(ci): bucketed activity labels + auto-regen on version bump

Two changes that together make the dashboard's diffs meaningful instead of noisy.

1. **Coarse activity buckets.** Replace exact UTC timestamps in every "Last activity / Last manual run / Last successful run" cell with one of: `this week`, `last week`, `2 weeks ago`, `3 weeks ago`, `last month`, `2 months ago`, `3+ months ago`, `long ago`, `never`. Buckets use calendar-day boundaries (no time-of-day jitter), so two regenerations on the same date produce **zero diff** when nothing actually drifted. Verified: same-day re-runs after stable workflow state → empty diff.

   Also drop the redundant `Days idle` columns from the Stale and Manual-only tables (the bucket label already says it), and round the "Last regenerated" footer to a date.

   Why: a daily-running healthy workflow used to bump its timestamp every regen (noise). Now it stays in `this week` indefinitely, and the only diffs that land in a version-bump PR are real bucket transitions. That is exactly the "this slipped from last week to last month, something might be wrong" signal the dashboard exists for.

2. **Auto-regenerate on version bump.** Add a step to `version_check.yml` right after the existing `generate_config_docs.py` regen. This follows the config docs precedent: the dashboard refresh rides along with each version-bump PR and is reviewable in the same diff. Costs ~340 GitHub API calls per run (well under the GITHUB_TOKEN 1000/hr workflow-runs limit). Adds `actions: read` to the job permissions block; uses `pyyaml==6.0.3` matching pdm.lock.
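A minimal sketch of the calendar-day bucketing idea follows. The label set is taken from the commit message, but the day thresholds, the function name `activity_bucket`, and the choice to pass `today` explicitly are assumptions; the actual generator may draw its boundaries differently.

```python
from datetime import date, datetime, timezone
from typing import Optional

# (exclusive upper bound in whole days, label). Thresholds are assumed for
# illustration; only the labels appear in the commit message.
_BUCKETS = [
    (7, "this week"),
    (14, "last week"),
    (21, "2 weeks ago"),
    (28, "3 weeks ago"),
    (60, "last month"),
    (90, "2 months ago"),
    (365, "3+ months ago"),
]


def activity_bucket(last_run: Optional[datetime], today: date) -> str:
    """Map a timezone-aware run timestamp to a coarse label.

    Only the UTC calendar date of `last_run` is used, so the time of day at
    which the dashboard is regenerated cannot change the result.
    """
    if last_run is None:
        return "never"
    days = (today - last_run.astimezone(timezone.utc).date()).days
    for limit, label in _BUCKETS:
        if days < limit:
            return label
    return "long ago"
```

Because the comparison is between dates rather than datetimes, two regenerations on the same day see the same `days` value for every workflow and therefore emit identical labels.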
* feat(ci): drop regen timestamp; add health banner; fix in-progress false-stale

Three follow-ups to keep version-bump diffs strictly meaningful, plus two correctness fixes uncovered by repeated stability testing.

1. **Drop the "Last regenerated" date.** Git history is authoritative for "when this snapshot was taken"; embedding a date here forced a single-line diff every regeneration even when nothing else drifted.

2. **Aggregated health banner** at the top of the generated region:

   `**63 workflows:** 1 disabled · 1 stale · 2 manual-only · 59 active`

   Counts only change when a workflow shifts between {disabled, stale, manual, active}, which gives the same level of diff-stability as the per-row buckets.

3. **`?event=schedule` for own-cron workflow badges.** Verified effective by SHA-comparing badge bodies for workflows with multi-event run history. This makes the badge for e.g. `gitleaks.yml`, `fuzz.yml`, `osv-scanner.yml` reflect cron health specifically, rather than whichever PR ran last. The runs-page link uses the matching `?query=event%3Aschedule` so a click lands on the filtered run list.

4. **Fix false-stale during in-flight release runs.** Previously, when release.yml was running, gates reachable via release.yml (puppeteer-e2e-tests, ci-gate, etc.) would briefly flip to "stale" because `fetch_last_gated_run` returned the in-progress run first and `last_success` couldn't see past it. Now the function walks all 5 caller runs and returns both the latest match (for activity) and the latest successful match (for staleness), avoiding the flip.

5. **Map all GitHub conclusion enum values.** A `gitleaks.yml` run completed with `action_required` between two test regens; the glyph table didn't have it and rendered `?`. Added every documented value (`neutral`, `timed_out`, `stale`, `action_required`) and changed the unknown-fallback from `?` to em-dash, so future GitHub-side enum additions don't introduce a false-positive diff.
Verified: two same-day regens after workflow state has settled now produce **zero diff**.

* ci(version-bump): make workflow-status regen non-blocking

Add `continue-on-error: true` to the dashboard regeneration step in version_check.yml. The regen makes ~340 GitHub API calls and would otherwise block the entire version-bump PR if any of them transiently fails (rate-limit hit, GitHub Actions outage, etc.). The failure mode should be "dashboard stays at the previous snapshot until next successful regen", not "release pipeline is blocked". The sibling `generate_config_docs.py` step doesn't need this: it's purely local with no external API dependency.
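The false-stale fix described in the earlier commit (walk every recent caller run and return both the latest match and the latest successful match) can be sketched as below. The data shape is a deliberate simplification and an assumption: each caller run is modeled as a dict with a `jobs` mapping keyed by job key, which is not the raw GitHub API response, and `find_gated_matches` is a hypothetical name rather than the generator's `fetch_last_gated_run`.

```python
from typing import Any, Dict, List, Optional, Tuple

Job = Dict[str, Any]


def find_gated_matches(
    caller_runs: List[Dict[str, Any]], job_key: str
) -> Tuple[Optional[Job], Optional[Job]]:
    """Return (latest_match, latest_successful_match) for `job_key`.

    `caller_runs` must be ordered newest first. An in-progress run at the
    head of the list is still reported as the latest match, but it no longer
    hides an older successful run from the staleness check.
    """
    latest: Optional[Job] = None
    last_success: Optional[Job] = None
    for run in caller_runs:
        job = run.get("jobs", {}).get(job_key)
        if job is None:
            continue  # this caller run did not include the gated job
        if latest is None:
            latest = job
        if last_success is None and job.get("conclusion") == "success":
            last_success = job
        if latest is not None and last_success is not None:
            break  # both answers found; older runs cannot change them
    return latest, last_success
```

Keeping the two results separate lets the "Last activity" cell follow the newest run while staleness is judged only against the newest green one.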
53 lines
1.8 KiB
YAML
name: Check Workflow Status Dashboard

# Fails when a workflow file is added/renamed without a corresponding row
# in docs/ci/workflow-status.md. Pure structural check — no GitHub API
# calls, no live data — so it runs fast and doesn't need any auth.
#
# To fix a failure: regenerate the dashboard with
# `pdm run python scripts/generate_workflow_status.py`
# This requires `gh` authenticated against the repo. If you can't run it
# locally, ping a maintainer to regenerate, or add a temporary placeholder
# `\`<your-new-workflow>.yml\`` mention in the file's manual-edit region
# to unblock the PR.

on:
  pull_request:
    paths:
      - '.github/workflows/**'
      - 'docs/ci/workflow-status.md'
      - 'scripts/generate_workflow_status.py'
  workflow_dispatch:

permissions:
  contents: read

jobs:
  check-structure:
    runs-on: ubuntu-latest
    timeout-minutes: 5

    steps:
      - name: Harden the runner (Audit all outbound calls)
        uses: step-security/harden-runner@a5ad31d6a139d249332a2605b85202e8c0b78450 # v2.19.1
        with:
          egress-policy: audit

      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          persist-credentials: false

      - name: Set up Python
        uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
        with:
          python-version: '3.12'

      - name: Install PyYAML
        # Pinned to match pdm.lock; the rest of the repo uses exact
        # `==` pins for ad-hoc workflow installs (see e.g.
        # validate-image-pinning.yml). Floating `~=` ranges can pick up
        # yanked / replaced patch versions silently.
        run: pip install pyyaml==6.0.3

      - name: Verify dashboard structure
        run: python scripts/generate_workflow_status.py --check-structure
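For orientation, the kind of structural check this workflow runs might look like the following sketch. It is not the generator's actual `--check-structure` implementation: the function name `missing_workflows` and the rule "a workflow counts as covered when its file name appears in backticks anywhere in the dashboard" are assumptions inferred from the header comment about placeholder mentions.

```python
from pathlib import Path
from typing import List


def missing_workflows(workflow_dir: Path, dashboard: Path) -> List[str]:
    """List workflow file names that the dashboard never mentions.

    Reads only local files, so it needs no GitHub API access or auth.
    """
    text = dashboard.read_text(encoding="utf-8")
    names = sorted(
        p.name for p in workflow_dir.iterdir() if p.suffix in {".yml", ".yaml"}
    )
    # Match the backticked form so `gitleaks.yml` is not satisfied by a
    # mention of `gitleaks-main.yml`.
    return [name for name in names if f"`{name}`" not in text]
```

A caller would exit non-zero when the returned list is non-empty, for example `sys.exit(1 if missing_workflows(Path(".github/workflows"), Path("docs/ci/workflow-status.md")) else 0)`.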