feat(mcp): add compress option to browser_snapshot to collapse repeated ARIA nodes#41507

Closed
Josef-Le wants to merge 1 commit into
microsoft:mainfrom
Josef-Le:feat/browser-snapshot-compress

Conversation

@Josef-Le


Summary

Adds a compress?: boolean parameter to the browser_snapshot MCP tool.

When enabled, a two-pass O(n) algorithm collapses repeated structural patterns in the ARIA snapshot YAML — keeping the first 10 occurrences of any pattern that appears more than 100 times — and emits a trailing note explaining how to enumerate the full list via browser_evaluate().
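For illustration, a client could opt in through an ordinary MCP `tools/call` request. This is a minimal sketch, assuming the standard JSON-RPC envelope used by MCP; the `id` value is arbitrary, and `compress` is the parameter proposed in this PR, not an existing option.

```typescript
// Sketch of a JSON-RPC 2.0 request an MCP client might send.
// `compress` is the parameter proposed in this PR; the `id` value is arbitrary.
const snapshotRequest = {
  jsonrpc: '2.0',
  id: 1,
  method: 'tools/call',
  params: {
    name: 'browser_snapshot',
    arguments: { compress: true },
  },
};

// Serialised form that would go over the transport.
console.log(JSON.stringify(snapshotRequest));
```

Omitting `compress` (or passing `false`) would leave the snapshot unchanged, per the description above.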

Motivation (concrete evidence from real pages)

Closes #41395. Evidence requested by @pavelfeldman posted in the issue thread.

Three real-world examples measured with Python Playwright 1.60.0 + headless Chromium:

| Page | Raw tokens | Compressed tokens | Reduction |
|---|---|---|---|
| Amazon product search (16 results) | ~82,950 | ~29,710 | 64% |
| Hacker News front page (30 stories) | ~16,140 | ~4,220 | 74% |
| worldometers.info/population (234 rows) | ~42,000 | ~2,070 | 95% |
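The reduction column can be re-derived from the two token columns as `1 - compressed / raw`. The snippet below is only an arithmetic check on the approximate figures reported above; the variable and function names are illustrative.

```typescript
// Approximate token counts copied from the table above.
const measurements = [
  { page: 'Amazon product search', raw: 82950, compressed: 29710 },
  { page: 'Hacker News front page', raw: 16140, compressed: 4220 },
  { page: 'worldometers.info/population', raw: 42000, compressed: 2070 },
];

// Percentage of tokens removed, rounded to the nearest integer.
function reductionPercent(raw: number, compressed: number): number {
  return Math.round((1 - compressed / raw) * 100);
}

for (const m of measurements)
  console.log(`${m.page}: ${reductionPercent(m.raw, m.compressed)}%`);
// Prints 64%, 74% and 95% for the three pages, in that order.
```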

The agent receives a compressed snapshot that preserves the first 10 items (enough to learn the structure) and a summary note. It can then use browser_evaluate() for targeted queries on the remaining items.

Algorithm

Two passes over the snapshot lines (steps 1 and 3), O(n) time and space:

  1. Pre-scan: compute (indent, signature) counts. Signature normalises refs, accessible names, and numbers so structurally identical siblings share the same key.
  2. Safety gate: only fire when some signature repeats > 100 times (prevents false positives on diverse pages).
  3. Compression pass: keep the first 10 occurrences of any repeated pattern; collapse the rest (along with their subtrees). Interactive roles (button, link, input, etc.) are always kept regardless of repetition.
  4. Trailing note: documents what was removed and how to retrieve the full list.
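The steps above can be sketched roughly as follows. This is not the code in `ariaCompression.ts`; it is a minimal illustration under stated assumptions: the function and helper names, the normalisation regexes, the exact set of interactive roles, and the wording of the trailing note are all hypothetical. It also assumes that interactive descendants of a collapsed node are still emitted, which is one possible reading of "always kept".

```typescript
// Hypothetical sketch of the four steps; all names and regexes are assumptions.
const KEEP_FIRST = 10;    // occurrences kept per repeated pattern
const MIN_REPEATS = 100;  // safety gate: a pattern must repeat more than this
const INTERACTIVE_ROLES = new Set(['button', 'link', 'textbox', 'checkbox', 'radio', 'combobox']);

function indentOf(line: string): number {
  return line.length - line.replace(/^\s+/, '').length;
}

function isInteractive(line: string): boolean {
  const match = line.replace(/^\s+/, '').match(/^- ([a-z]+)/);
  return INTERACTIVE_ROLES.has(match?.[1] ?? '');
}

// Normalise refs, quoted accessible names, and numbers so that
// structurally identical siblings share the same key.
function signatureOf(line: string): string {
  return line.replace(/^\s+/, '')
    .replace(/\[ref=[^\]]*\]/g, '[ref]')
    .replace(/"(?:[^"\\]|\\.)*"/g, '""')
    .replace(/\d+/g, '#');
}

function compressSnapshot(yaml: string): string {
  const lines = yaml.split('\n');

  // Step 1, pre-scan: count each (indent, signature) pair.
  const keys = lines.map(line => `${indentOf(line)}|${signatureOf(line)}`);
  const counts = new Map<string, number>();
  for (const key of keys) counts.set(key, (counts.get(key) ?? 0) + 1);

  // Step 2, safety gate: return the input untouched unless something repeats enough.
  if (!Array.from(counts.values()).some(n => n > MIN_REPEATS)) return yaml;

  // Step 3, compression pass.
  const seen = new Map<string, number>();
  const out: string[] = [];
  let collapsed = 0;
  let collapsedIndent = -1; // indent of the node being collapsed, -1 when none
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    const indent = indentOf(line);
    if (collapsedIndent >= 0) {
      if (indent > collapsedIndent) {
        // Inside a collapsed subtree: drop the line unless it is interactive.
        if (isInteractive(line)) out.push(line);
        continue;
      }
      collapsedIndent = -1; // left the collapsed subtree
    }
    const key = keys[i];
    const occurrence = (seen.get(key) ?? 0) + 1;
    seen.set(key, occurrence);
    const repeated = (counts.get(key) ?? 0) > MIN_REPEATS;
    if (repeated && occurrence > KEEP_FIRST && !isInteractive(line)) {
      collapsed++;
      collapsedIndent = indent;
      continue;
    }
    out.push(line);
  }

  // Step 4, trailing note (wording is illustrative).
  if (collapsed > 0)
    out.push(`# note: ${collapsed} repeated nodes collapsed; use browser_evaluate() to enumerate the full list`);
  return out.join('\n');
}
```

Each line is visited once per pass and the maps hold at most one entry per line, which is where the O(n) time and space bound comes from. On a 150-item list of non-interactive `listitem` nodes, this sketch keeps the parent, the first 10 items, and one note line.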

Files changed

  • packages/playwright-core/src/tools/backend/ariaCompression.ts — new 168-line module (pure function, no I/O, zero dependencies)
  • packages/playwright-core/src/tools/backend/snapshot.ts — +2/-1 (adds compress param to schema)
  • packages/playwright-core/src/tools/backend/response.ts — +8/-2 (wires compress into snapshot path)
  • tests/mcp/snapshot-compression.spec.ts — new 127-line integration test: fires on 150-item list, no-op on small lists, no-op with compress: false, preserves interactive elements

…nodes

Adds a `compress?: boolean` parameter to the `browser_snapshot` MCP tool.
When set, a two-pass algorithm collapses repeated structural patterns in the
ARIA snapshot YAML — keeping the first 10 occurrences of any pattern that
appears more than 100 times — and emits a trailing note explaining how to
enumerate the full list via browser_evaluate().

This targets pages with large lists, data grids, or navigation menus where
the snapshot can grow to thousands of lines. On a 150-item GitHub issues
page the output shrinks from ~1 800 lines to ~80 lines while preserving all
interactive elements (buttons, inputs, links) unconditionally.

Adds ariaCompression.ts with the pure compression function, and four tests
covering the happy path, passthrough on small lists, interactive-element
preservation, and the compress: false escape hatch.
@Skn0tt

Skn0tt commented Jul 1, 2026

Member

Let's come to a conclusion in the issue first. @pavelfeldman could you take another look there?

@Skn0tt Skn0tt closed this Jul 1, 2026

Development

Successfully merging this pull request may close these issues.

feat(mcp): add compress option to browser_snapshot to collapse repeated ARIA nodes

2 participants