docs(release): consolidate v2.11.0 with new generic video, crawl timing, grpc CVE

Folds in three additional user-visible items that landed on main since
yesterday's weekly run, so release-drafts/v2.11.0/ is now the single
consolidated draft covering everything since v2.10 on May 15, 2026:

- Generic video discovery via document.videos[] on the video format,
  configured via AVGRAB_SERVICE_URL on self-host (#3764).
- New createdAt, completedAt, and duration fields on
  GET /v2/crawl/{id} and GET /v2/batch/scrape/{id}, with stable
  duration: 0 on kickoff-failure crawls (#3771).
- @grpc/grpc-js CVE patch in apps/test-suite added to the consolidated
  Fixes bullet (#3762).

Also bumps the changelog post date to June 13, 2026 (run date + 1 day)
and rewrites the intro to make the four-week consolidated scope
explicit, since this is the first release since v2.10.

Per release-drafts/AUTOMATION.md: drafts only — maintainer publishes
the GitHub release and opens the firecrawl/web changelog PR.

Co-authored-by: rhys-firecrawl <rhys-firecrawl@users.noreply.github.com>
Cursor Agent
2026-06-12 16:40:31 +00:00
parent 925b6a8eea
commit 00decae07e
2 changed files with 11 additions and 5 deletions

@@ -1,17 +1,19 @@
---
title: "v2.11.0 is live"
-date: "June 12, 2026"
-description: "Firecrawl v2.11.0 adds PII redaction on scrape, a deterministic JSON extraction format, LLM-judged monitor goals, JSON-mode diffs and snapshots, a Drizzle-based self-host setup, dedicated search credits, and a wide round of SDK, billing, and security fixes."
+date: "June 13, 2026"
+description: "Firecrawl v2.11.0 is the first release since v2.10 on May 15, and consolidates four weeks of work: PII redaction on scrape, a deterministic JSON extraction format, generic video discovery, LLM-judged monitor goals, JSON-mode monitor diffs and snapshots, crawl wall-clock timestamps, a Drizzle-based self-host setup, dedicated search credits, and a wide round of SDK, billing, and security fixes."
---
## v2.11.0 is live
-Firecrawl v2.11.0 lands PII redaction directly on scrape, ships a `deterministicJson` format that codegens reusable extractors instead of calling an LLM per request, makes the Monitor API smarter with goal-driven LLM judges and JSON-mode diffs, splits the `searchZDR` forced mode into explicit ZDR and anonymous variants, and migrates self-host deployments from the Supabase SDK to Drizzle on native Postgres. Search billing now meters against its own credit pool, we added a `WWW-Authenticate` header for agent discovery, raised the PDF size cap, added V1-compatible aliases on the V2 SDKs, and patched a fresh batch of CVEs.
+Firecrawl v2.11.0 is our first release since `v2.10` on May 15 and rolls four weeks of work into a single drop. It lands PII redaction directly on scrape, ships a `deterministicJson` format that codegens reusable extractors instead of calling an LLM per request, adds generic video discovery to the `video` format, makes the Monitor API smarter with goal-driven LLM judges and JSON-mode diffs, exposes `createdAt` / `completedAt` / `duration` on crawl status, splits the `searchZDR` forced mode into explicit ZDR and anonymous variants, and migrates self-host deployments from the Supabase SDK to Drizzle on native Postgres. Search billing now meters against its own credit pool, we added a `WWW-Authenticate` header for agent discovery, raised the PDF size cap, added V1-compatible aliases on the V2 SDKs, and patched a fresh batch of CVEs.
### Highlights
- **`redactPII` option** — Set `redactPII: true` on `/v2/scrape`, `/v2/batch/scrape`, `/v2/crawl`, `/v2/parse`, or `/v2/extract` and `document.markdown` comes back with PII redacted. The object form accepts `mode` (`accurate` | `aggressive` | `fast`), an `entities[]` filter, and `replaceStyle` (`tag` | `mask` | `remove`). Costs +4 credits per scrape (plus +4 per extra PDF page) and is exposed across the JS, Python, Java, .NET, Go, PHP, Ruby, Rust, and Elixir SDKs.
- **`deterministicJson` format** — A new scrape format that codegens a reusable JS extractor for your schema and runs it in a sandbox instead of calling an LLM on every request. Populates `document.json` like the existing `json` format, gets faster and cheaper the more you reuse the same schema, and bills at 7 credits per page.
+- **Generic video discovery** — The `video` format now returns a structured `document.videos` array with direct video URLs, HLS manifests, embeds, thumbnails, titles, durations, and provider metadata extracted from page HTML, alongside the existing `document.video` string for supported providers like YouTube.
+- **Crawl wall-clock timestamps** — `GET /v2/crawl/{id}` and `GET /v2/batch/scrape/{id}` now return `createdAt`, `completedAt`, and `duration` so callers can read actual crawl wall-clock time directly from the API.
- **Monitor goals with LLM judge** — Set `goal` and `judgeEnabled` on a monitor and every changed page is classified as meaningful vs noise, sorted into summary emails with the unified diff rendered inline. Typed in the JS, Python, Java, .NET, Go, PHP, Ruby, Rust, and Elixir SDKs.
- **Monitor JSON diffs and snapshots** — Monitors using `{type: "json"}` or `{type: "changeTracking", modes: ["json"]}` now diff structured fields against the previous run and persist a `snapshot.json` of current values, surfaced through `GET /v2/monitor/:id/checks/:checkId`.
- **Mixed-mode change tracking** — Requesting both `["json", "git-diff"]` modes now runs both diffs and reports `changed` if either side changes, with the markdown unified diff attached as a sidecar on the JSON artifact.
@@ -24,6 +26,6 @@ Firecrawl v2.11.0 lands PII redaction directly on scrape, ships a `deterministic
- **PDF size cap raised to 50 MB** — PDF download and scrape payloads can now be up to 50 MB (up from 30 MB).
- **V1 aliases on V2 SDKs** — Python and JS V2 clients now expose deprecated `scrape_url` / `scrapeUrl`, `crawl_url` / `crawlUrl`, and friends so V1-trained callers keep working without rewriting code.
- **JS SDK string API key** — `new Firecrawl("fc-key")` now works alongside the object form, with whitespace-only keys rejected up front.
-- **CVE patches** — Resolved `axios` (GHSA-35jp-ww65-95wh and friends) in the JS SDK, the Rust SDK's `openssl` (CVE-2026-41676), and pnpm-overrode `shell-quote`, `ws`, `brace-expansion`, `qs`, `uuid`, and `js-cookie` across the API, JS SDK, Playwright service, test-suite, and ingestion UI.
+- **CVE patches** — Resolved `axios` (GHSA-35jp-ww65-95wh and friends) in the JS SDK, the Rust SDK's `openssl` (CVE-2026-41676), `@grpc/grpc-js` in the test-suite, and pnpm-overrode `shell-quote`, `ws`, `brace-expansion`, `qs`, `uuid`, and `js-cookie` across the API, JS SDK, Playwright service, test-suite, and ingestion UI.
Read the full changelog [here](https://github.com/firecrawl/firecrawl/releases/tag/v2.11.0).

@@ -5,6 +5,8 @@
- **`redactPII` option** — Added a `redactPII` option to `/v2/scrape`, `/v2/batch/scrape`, `/v2/crawl`, `/v2/parse`, and `/v2/extract`. Set `redactPII: true` and the worker routes the scraped markdown through fire-privacy and returns the redacted text as `document.markdown`. The object form accepts `mode: "accurate" | "aggressive" | "fast"`, an `entities[]` filter (`PERSON`, `EMAIL`, `PHONE_NUMBER`, `ADDRESS`, `SECRET`, ...), and `replaceStyle: "tag" | "mask" | "remove"`. Fail-soft: upstream timeouts and errors leave the scrape successful with the original markdown intact. Costs +4 credits per scrape when enabled (plus +4 per extra PDF page).
- **`redactPII` across SDKs** — Exposed the new `redactPII` option on scrape, batch-scrape, parse, crawl, and extract models in the JS, Python, Java, .NET, Go, PHP, Ruby, Rust, and Elixir SDKs.
- **`deterministicJson` format** — Added a new `deterministicJson` format on `/v2/scrape`, `/v2/batch/scrape`, `/v2/crawl`, `/v2/parse`, and `/v2/extract` that codegens a reusable JS extractor for the requested schema and runs it in a sandbox against the page, populating `document.json` like the existing `json` format but without a per-request LLM call. Extractors and their LLM responses are cached per site/schema so repeat scrapes amortize to zero LLM cost. Accepts an optional `schema` and `prompt`, billed at 7 credits per page (vs. 5 for `json`), and cannot be combined with the `json` format on the same request.
+- **Generic video discovery** — The `video` format now returns a structured `document.videos` array (alongside the existing legacy `document.video` string for supported providers like YouTube). Each entry carries `url`, `sourceURL`, `source`, and optional `kind`, `provider`, `title`, `thumbnail`, `description`, `duration`, `mimeType`, `width`, `height`, and `metadata`, populated from page HTML via the avgrab service. YouTube URLs continue to use the legacy provider path; non-provider pages now return direct video files, HLS manifests, and embeds without a download/upload round-trip. Configured via the `AVGRAB_SERVICE_URL` env var on self-host.
+- **Crawl wall-clock timestamps** — `GET /v2/crawl/{id}` and `GET /v2/batch/scrape/{id}` now return `createdAt`, `completedAt`, and `duration` (seconds) so callers can read actual crawl wall-clock time directly from the API. `completedAt` is set on terminal states; in-progress crawls report `duration` as `createdAt → now`. Kickoff-failure crawls report a stable `duration: 0` and `completedAt = createdAt` instead of a value that grew on every poll.
- **Monitor goals with LLM judge** — Added `goal` and `judgeEnabled` to monitors. When a goal is set, every changed page is run through an LLM judge that labels the diff `meaningful` or `noise` against the goal, drops alert spam, sorts meaningful pages first in summary emails, and renders the unified diff inline with color cues for up to five meaningful pages. Surfaced on monitor GET/responses and exposed in JS, Python, Java, .NET, Go, PHP, Ruby, Rust, and Elixir SDKs. Judge calls cost +1 credit per judged page; monthly estimates double when `judgeEnabled` is on.
- **Monitor JSON diffs** — Monitors that scrape with `{type: "json"}` or `{type: "changeTracking", modes: ["json"]}` now compute field-level JSON diffs against the previous run instead of diffing the rendered markdown. Each check persists a `snapshot.json` of current field values alongside the diff, available through `GET /v2/monitor/:id/checks/:checkId` and typed in the JS, Python, Java, .NET, Go, PHP, Ruby, and Rust SDKs.
- **Mixed-mode change tracking** — Monitors requesting both `["json", "git-diff"]` modes now run both diffs and report `changed` whenever either side changes, with the markdown unified diff attached as a sidecar on the JSON artifact. Previously this combination silently returned `same` when only one side moved.
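The timing rules in the crawl wall-clock bullet can be read as a small function. The sketch below is illustrative only and is not the server implementation: `crawl_timing` and its parameters are hypothetical names, and returning the duration as a float number of seconds is an assumption (the notes say only "seconds").

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def crawl_timing(
    created_at: datetime,
    completed_at: Optional[datetime],
    now: datetime,
    kickoff_failed: bool = False,
) -> Tuple[Optional[datetime], float]:
    """Return (completedAt, duration in seconds) for a crawl status response.

    Hypothetical helper mirroring the three cases in the release notes.
    """
    if kickoff_failed:
        # Kickoff failure: completedAt = createdAt and a stable duration of 0,
        # regardless of how late the caller polls.
        return created_at, 0.0
    if completed_at is not None:
        # Terminal state: createdAt -> completedAt.
        return completed_at, (completed_at - created_at).total_seconds()
    # In progress: no completedAt yet, duration is createdAt -> now.
    return None, (now - created_at).total_seconds()


created = datetime(2026, 6, 12, 16, 0, tzinfo=timezone.utc)
finished = created + timedelta(seconds=90)
later = created + timedelta(seconds=120)

done = crawl_timing(created, finished, later)
running = crawl_timing(created, None, later)
failed = crawl_timing(created, None, later, kickoff_failed=True)
```

Here `done` reflects the 90 seconds between start and finish, `running` grows with each poll because it is measured against `now`, and `failed` stays at zero however late the poll happens.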
@@ -32,7 +34,7 @@
## Fixes
-- Resolved CVEs across the API and SDKs by patching `axios` to `1.16.1` in `apps/js-sdk/firecrawl` (GHSA-35jp-ww65-95wh, GHSA-654m-c8p4-x5fp, GHSA-898c-q2cr-xwhg, GHSA-pjwm-pj3p-43mv), the Rust SDK's `openssl` to `0.10.80` (CVE-2026-41676), and pnpm-overriding `shell-quote` (GHSA-w7jw-789q-3m8p), `ws` to `8.20.1` (GHSA-58qx-3vcg-4xpx), `brace-expansion` to `5.0.6` (GHSA-jxxr-4gwj-5jf2), `qs` (GHSA-Q8MJ-M7CP-5Q26 / CVE-2026-8723), `uuid` to `11.1.1` (GHSA-W5HQ-G745-H8PQ), and `js-cookie` (GHSA-QJX8-664M-686J) across `apps/api`, `apps/js-sdk`, `apps/playwright-service-ts`, `apps/test-suite`, and `apps/ui/ingestion-ui`.
+- Resolved CVEs across the API and SDKs by patching `axios` to `1.16.1` in `apps/js-sdk/firecrawl` (GHSA-35jp-ww65-95wh, GHSA-654m-c8p4-x5fp, GHSA-898c-q2cr-xwhg, GHSA-pjwm-pj3p-43mv), the Rust SDK's `openssl` to `0.10.80` (CVE-2026-41676), `@grpc/grpc-js` in the test-suite, and pnpm-overriding `shell-quote` (GHSA-w7jw-789q-3m8p), `ws` to `8.20.1` (GHSA-58qx-3vcg-4xpx), `brace-expansion` to `5.0.6` (GHSA-jxxr-4gwj-5jf2), `qs` (GHSA-Q8MJ-M7CP-5Q26 / CVE-2026-8723), `uuid` to `11.1.1` (GHSA-W5HQ-G745-H8PQ), and `js-cookie` (GHSA-QJX8-664M-686J) across `apps/api`, `apps/js-sdk`, `apps/playwright-service-ts`, `apps/test-suite`, and `apps/ui/ingestion-ui`.
- Fixed scrape-worker pods freezing the Node event loop for tens of seconds when the LLM extractor encoded very large inputs through `@dqbd/tiktoken` — token trimming now pre-bounds the input by character count and trims in a single encode/decode pass, eliminating nuq lock expiry, duplicate finish-crawl jobs, and Kubernetes liveness-probe restarts caused by the synchronous re-encode loop.
- Fixed Wikipedia scrapes missing the `og:image` meta tag (and therefore `metadata.ogImage`) on roughly half of Wikimedia URLs — the image URL is now synthesized into the engine's `<head>` from the Wikimedia Enterprise API's `article.image.content_url`.
- Fixed `cancel` on `/v1` and `/v2` crawls and batches not draining the per-team concurrency-limit backlog — queued job IDs are now removed via a chunked Redis pipeline and `status` reports `cancelled` immediately, even while the worker group is still draining.
@@ -61,6 +63,8 @@
- Added `redactPII: boolean | { mode?: "accurate" | "aggressive" | "fast", entities?: string[], replaceStyle?: "tag" | "mask" | "remove" }` to `POST /v2/scrape`, `POST /v2/batch/scrape`, `POST /v2/crawl`, `POST /v2/parse`, and `POST /v2/extract`. When enabled, `document.markdown` is replaced with the redacted markdown. Defaults are `mode: "accurate"` and `replaceStyle: "tag"`. New env vars `FIRE_PRIVACY_URL` and `FIRE_PRIVACY_TIMEOUT_MS` (default `5000`) control the upstream call.
- Removed the experimental `pii` scrape format and the `document.pii` diagnostics block (`status`, `spans`, `counts`, `redactedMarkdown`) from request schemas and SDK types. Requests that include `"pii"` in `formats` are now rejected by the schema. Use `redactPII` to get safe output in `document.markdown`.
- Added the `deterministicJson` format (`{ type: "deterministicJson", schema?, prompt? }`) to `POST /v2/scrape`, `POST /v2/batch/scrape`, `POST /v2/crawl`, `POST /v2/parse`, and `POST /v2/extract`. Populates `document.json` like the `json` format. Cannot be combined with the `json` format in the same request. New self-host env vars `EXTRACT_CODEGEN_MODEL`, `EXTRACT_ANCHOR_MODEL`, `EXTRACT_LIGHT_MODEL`, and `CODE_SANDBOX_URL` (default `ws://code-sandbox:3001`) configure the extractor pipeline.
+- Added `document.videos: VideoItem[]` to `POST /v2/scrape` (and the endpoints that share its options) when the `video` format is requested. Each `VideoItem` has `url`, `sourceURL`, `source`, and optional `kind`, `provider`, `title`, `thumbnail`, `description`, `duration`, `mimeType`, `width`, `height`, and `metadata`. The legacy `document.video` string is still populated for supported providers (e.g. YouTube). New self-host env var `AVGRAB_SERVICE_URL` configures the upstream video discovery service.
+- Added `createdAt`, `completedAt`, and `duration` (seconds) to `GET /v2/crawl/{id}` and `GET /v2/batch/scrape/{id}` responses. `completedAt` is only present on terminal states; `duration` is `createdAt → completedAt` on terminal states or `createdAt → now` while in progress, and is stable at `0` on kickoff-failure crawls.
- Added a `WWW-Authenticate: Bearer realm="firecrawl"` header on all `401 Unauthorized` responses across `/v0`, `/v1`, and `/v2` endpoints to advertise the credential scheme for agent discovery.
- Added new `searchZDR` values `"forced-zdr"` and `"forced-anon"` and deprecated `"forced"` (still accepted as an alias for `"forced-zdr"`). The resolved mode is now used for both billing and upstream routing, and search endpoints accept teams whose flags resolve to a forced ZDR mode.
- Added `goal: string|null` and `judgeEnabled: boolean` to `POST /v2/monitor` and `PATCH /v2/monitor/:id`. `judgeEnabled` defaults to `true` when `goal` is non-empty; passing `goal: null` clears it.
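To make the `redactPII` shape above concrete, here is a minimal client-side sketch of how the boolean and object forms could be expanded into explicit settings. It assumes only what the entry states (the allowed `mode` and `replaceStyle` values and their defaults). `normalize_redact_pii` is a hypothetical helper, not part of any Firecrawl SDK, and the API may validate requests differently.

```python
from typing import Optional, Union

# Allowed values and defaults as listed in the redactPII entry above.
MODES = {"accurate", "aggressive", "fast"}
REPLACE_STYLES = {"tag", "mask", "remove"}


def normalize_redact_pii(value: Union[bool, dict, None]) -> Optional[dict]:
    """Expand a redactPII value into explicit settings, or None when disabled.

    Hypothetical helper for illustration only.
    """
    if value is None or value is False:
        return None
    options = {} if value is True else dict(value)
    mode = options.get("mode", "accurate")       # documented default
    style = options.get("replaceStyle", "tag")   # documented default
    if mode not in MODES:
        raise ValueError(f"unsupported mode: {mode!r}")
    if style not in REPLACE_STYLES:
        raise ValueError(f"unsupported replaceStyle: {style!r}")
    normalized = {"mode": mode, "replaceStyle": style}
    if "entities" in options:
        # Optional filter, e.g. ["EMAIL", "PHONE_NUMBER"].
        normalized["entities"] = list(options["entities"])
    return normalized


simple = normalize_redact_pii(True)
custom = normalize_redact_pii({"mode": "fast", "entities": ["EMAIL"]})
disabled = normalize_redact_pii(False)
```

With these inputs, `simple` carries both defaults, `custom` keeps the requested `mode` and `entities` while filling in the default `replaceStyle`, and `disabled` is `None`.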