fix(cua-driver/macos): prefer AX scroll actions#2031

Open
injaneity wants to merge 1 commit into
trycua:mainfrom
injaneity:fix/1895-macos-scroll-ax-actions

Conversation

@injaneity

@injaneity injaneity commented Jun 25, 2026

Collaborator

Summary

  • Prefer a native advertised AX scroll action for indexed macOS scroll targets before falling back to key synthesis.
  • Support both the AXScroll* and AXScroll*ByPage action families, ordered by the granularity requested through the by parameter.
  • Include the selected scroll path in the success message.

Rationale

This keeps the existing Cua scroll flow intact: resolve/retain the target element, focus under suppression, then scroll. The only change is to use the native AX action the target element advertises before using synthesized keyboard input as a fallback.

Testing

  • git diff --check
  • Not run: cargo check / rustfmt because no Rust toolchain is installed in this environment (rustup toolchain list reports none).

Fixes #1895

Summary by CodeRabbit

  • New Features

    • Added native scrolling support for elements that expose accessibility scroll actions.
    • Improved scrolling behavior to prefer the most appropriate available action before falling back.
  • Bug Fixes

    • Updated scrolling to better handle focused elements and reduce reliance on synthesized input when native actions are available.
    • Success messages now indicate which scrolling method was used.

@vercel

vercel Bot commented Jun 25, 2026

Contributor

@injaneity is attempting to deploy a commit to the Cua Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Contributor

Review Change Stack

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8335ef05-bde9-483d-9b9b-1a12585339de

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

The macOS scroll tool now checks for a matching native AX scroll action first, performs it when available, and falls back to synthesized key events otherwise. The helper reports which method ran, and the success text now includes that method.

Changes

AX scroll preference and reporting

| Layer / File(s) | Summary |
| --- | --- |
| AX scroll helper and candidate selection: libs/cua-driver/rust/crates/platform-macos/src/input/ax_actions.rs | Adds a helper that checks advertised AX actions, maps direction/by to candidate scroll action names, and performs the selected action repeatedly with delays. |
| Scroll tool AX-first execution and reporting: libs/cua-driver/rust/crates/platform-macos/src/tools/scroll.rs | Updates the scroll tool to try the AX helper inside focus suppression, fall back to key synthesis when needed, and report the method used in the result text. |
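
The helper row above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `AxElement` trait, `MockElement`, the function name `perform_first_advertised`, and the delay parameter are all hypothetical stand-ins for the real AX bindings, which this page does not show.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical stand-in for an accessibility element (not the crate's real type).
trait AxElement {
    /// Names of the actions the element advertises.
    fn action_names(&self) -> Vec<String>;
    /// Perform one named action once.
    fn perform(&mut self, action: &str) -> Result<(), String>;
}

/// Performs the first advertised candidate `amount` times, pausing between steps.
/// Returns Ok(None) when no candidate is advertised, so the caller can fall back.
fn perform_first_advertised(
    element: &mut dyn AxElement,
    candidates: &[&'static str],
    amount: u32,
    step_delay: Duration,
) -> Result<Option<&'static str>, String> {
    let advertised = element.action_names();
    let Some(action) = candidates
        .iter()
        .copied()
        .find(|c| advertised.iter().any(|a| a.as_str() == *c))
    else {
        return Ok(None);
    };
    for step in 0..amount {
        element.perform(action)?;
        // No pause is needed after the final step.
        if step + 1 < amount {
            sleep(step_delay);
        }
    }
    Ok(Some(action))
}

/// In-memory element used only to exercise the sketch.
struct MockElement {
    advertised: Vec<String>,
    performed: Vec<String>,
}

impl AxElement for MockElement {
    fn action_names(&self) -> Vec<String> {
        self.advertised.clone()
    }
    fn perform(&mut self, action: &str) -> Result<(), String> {
        self.performed.push(action.to_string());
        Ok(())
    }
}
```

Returning `Ok(None)` rather than an error for "nothing advertised" keeps the fallback decision with the caller, which is the shape the walkthrough describes.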

Sequence Diagram(s)

sequenceDiagram
  participant ScrollTool as ScrollTool::invoke
  participant FocusGuard as focus_guard::with_focus_suppressed
  participant AXHelper as perform_ax_scroll_action_if_supported
  participant AXAPI as AXUIElementPerformAction
  participant Keys as key synthesis path
  participant Result as ToolResult::text

  ScrollTool->>FocusGuard: run scroll for direction/by/amount
  FocusGuard->>AXHelper: inspect advertised AX scroll actions
  alt action available
    AXHelper->>AXAPI: perform selected AX action amount times
    AXAPI-->>AXHelper: success or error
    AXHelper-->>FocusGuard: Ok(Some(action_name))
  else action absent
    AXHelper-->>FocusGuard: Ok(None)
    FocusGuard->>Keys: synthesize key events
    Keys-->>FocusGuard: Ok("key synthesis")
  end
  FocusGuard-->>ScrollTool: method
  ScrollTool->>Result: render "Scrolled ... via {method}"
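
The sequence above can be sketched as a small AX-first wrapper. The names `AxOutcome` and `scroll_with_fallback` are hypothetical, and the two closures stand in for the real AX helper and key synthesis path. One assumption to flag: this sketch propagates an AX error instead of falling back to keys; the diagram does not say which the actual code does.

```rust
/// Outcome of the (hypothetical) AX helper: Some(action) when a native action ran,
/// None when the element advertises no matching action.
type AxOutcome = Result<Option<&'static str>, String>;

/// Tries the native AX path first and synthesizes keys only when it reports no match.
/// Returns the name of the method that ran, for use in the success text.
fn scroll_with_fallback(
    try_ax: impl FnOnce() -> AxOutcome,
    synthesize_keys: impl FnOnce() -> Result<(), String>,
) -> Result<String, String> {
    match try_ax()? {
        Some(action) => Ok(action.to_string()),
        None => {
            synthesize_keys()?;
            Ok("key synthesis".to_string())
        }
    }
}
```

The returned string corresponds to the `{method}` placeholder in the final diagram step.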

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

I’m a rabbit with a scroll-wheel grin,
AX hops first, then keys roll in.
Page by page, the list goes zoom,
And “via” tells the path in bloom.
🐇✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title is concise and accurately summarizes the main change: preferring AX scroll actions on macOS. |
| Linked Issues check | ✅ Passed | The implementation queries advertised AX actions, performs matching scroll actions per step, and falls back to keystrokes when unsupported. |
| Out of Scope Changes check | ✅ Passed | The other edits are limited to the scroll path, messaging, and comments needed to support the AX-first behavior. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@libs/cua-driver/rust/crates/platform-macos/src/input/ax_actions.rs`:
- Around line 58-69: The scroll candidate selection in scroll_action_candidates
is mixing granularities, which can cause a page request to fall back to line
scrolling and vice versa. Update the match cases so each direction/by
combination returns only candidates for the requested granularity, and have the
fallback branch return no action (or equivalent None path) instead of
AXScrollDown/AXScrollDownByPage defaults. Keep the logic localized to
scroll_action_candidates so the caller can handle the missing match and preserve
the intended scroll amount.

In `@libs/cua-driver/rust/crates/platform-macos/src/tools/scroll.rs`:
- Around line 25-28: The schema description for the scroll tool is outdated
because `amount` is described only as keystroke repetitions even though
`scroll.rs` now uses native AX actions first when available. Update the tool
text in the scroll tool definition so the `amount` contract explicitly covers
both native AX action repetitions and fallback keypress repetitions, keeping the
`by='page'` and `by='line'` behavior description aligned with the AX-first path.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9e1688a2-9a27-4819-bead-feb0fdd36dba

📥 Commits

Reviewing files that changed from the base of the PR and between 46efc5f and a02fffb.

📒 Files selected for processing (2)
  • libs/cua-driver/rust/crates/platform-macos/src/input/ax_actions.rs
  • libs/cua-driver/rust/crates/platform-macos/src/tools/scroll.rs

Comment on lines +58 to +69
fn scroll_action_candidates(direction: &str, by: &str) -> [&'static str; 2] {
    match (direction, by) {
        ("up", "page") => ["AXScrollUpByPage", "AXScrollUp"],
        ("down", "page") => ["AXScrollDownByPage", "AXScrollDown"],
        ("left", "page") => ["AXScrollLeftByPage", "AXScrollLeft"],
        ("right", "page") => ["AXScrollRightByPage", "AXScrollRight"],
        ("up", _) => ["AXScrollUp", "AXScrollUpByPage"],
        ("down", _) => ["AXScrollDown", "AXScrollDownByPage"],
        ("left", _) => ["AXScrollLeft", "AXScrollLeftByPage"],
        ("right", _) => ["AXScrollRight", "AXScrollRightByPage"],
        _ => ["AXScrollDown", "AXScrollDownByPage"],
    }


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Preserve the requested scroll granularity when selecting AX actions.

by="page" can currently fall back to AXScrollDown, and by="line" can fall back to AXScrollDownByPage. When the matching AX action is not advertised, this skips the intended keystroke fallback and scrolls by the wrong amount. Return only same-granularity candidates and let None trigger the fallback.

Proposed fix
-fn scroll_action_candidates(direction: &str, by: &str) -> [&'static str; 2] {
+fn scroll_action_candidates(direction: &str, by: &str) -> &'static [&'static str] {
     match (direction, by) {
-        ("up", "page") => ["AXScrollUpByPage", "AXScrollUp"],
-        ("down", "page") => ["AXScrollDownByPage", "AXScrollDown"],
-        ("left", "page") => ["AXScrollLeftByPage", "AXScrollLeft"],
-        ("right", "page") => ["AXScrollRightByPage", "AXScrollRight"],
-        ("up", _) => ["AXScrollUp", "AXScrollUpByPage"],
-        ("down", _) => ["AXScrollDown", "AXScrollDownByPage"],
-        ("left", _) => ["AXScrollLeft", "AXScrollLeftByPage"],
-        ("right", _) => ["AXScrollRight", "AXScrollRightByPage"],
-        _ => ["AXScrollDown", "AXScrollDownByPage"],
+        ("up", "page") => &["AXScrollUpByPage"],
+        ("down", "page") => &["AXScrollDownByPage"],
+        ("left", "page") => &["AXScrollLeftByPage"],
+        ("right", "page") => &["AXScrollRightByPage"],
+        ("up", "line") => &["AXScrollUp"],
+        ("down", "line") => &["AXScrollDown"],
+        ("left", "line") => &["AXScrollLeft"],
+        ("right", "line") => &["AXScrollRight"],
+        _ => &[],
     }
 }
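
To see what the proposed variant changes in practice, here is a small self-contained check. The candidate table is the one from the diff above; `pick_advertised` is a hypothetical helper added only for illustration and is not part of the PR.

```rust
/// Candidate table as proposed in the review: same granularity only,
/// empty slice for anything unrecognized.
fn scroll_action_candidates(direction: &str, by: &str) -> &'static [&'static str] {
    match (direction, by) {
        ("up", "page") => &["AXScrollUpByPage"],
        ("down", "page") => &["AXScrollDownByPage"],
        ("left", "page") => &["AXScrollLeftByPage"],
        ("right", "page") => &["AXScrollRightByPage"],
        ("up", "line") => &["AXScrollUp"],
        ("down", "line") => &["AXScrollDown"],
        ("left", "line") => &["AXScrollLeft"],
        ("right", "line") => &["AXScrollRight"],
        _ => &[],
    }
}

/// Hypothetical helper: the first candidate that the element advertises, if any.
fn pick_advertised(candidates: &[&'static str], advertised: &[&str]) -> Option<&'static str> {
    candidates.iter().copied().find(|c| advertised.contains(c))
}
```

With this table, a by="line" request against an element that advertises only AXScrollUpByPage yields no match, so the caller would take the key synthesis path.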

Comment on lines +25 to +28
description: "Scroll the target pid's focused region.\n\n\
If the target element advertises a native AX scroll action, that action is used first. \
Otherwise, by='page' maps to PageDown/PageUp × amount and by='line' maps to \
DownArrow/UpArrow × amount. Horizontal variants use Left/Right arrow keys.\n\n\


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Make amount wording include native AX actions.

With the AX-first path, amount is no longer only “keystroke repetitions”; it also controls native AX action repetitions. Update the schema text so clients get the right contract.

Proposed fix
-                    "description": "Number of keystroke repetitions. Default: 3."
+                    "description": "Number of scroll action repetitions. Default: 3."

Also applies to: 46-50


@injaneity

Collaborator Author

Updated per review:

  • scroll_action_candidates now returns None for unknown directions instead of defaulting to down-scroll actions.
  • The amount schema text now covers both native AX action repetitions and fallback keypress repetitions.

I intentionally kept the same-direction AXScroll* / AXScroll*ByPage fallback ordering: by selects the preferred granularity, but macOS elements may advertise only one native scroll action family. Using an advertised same-direction AX action remains preferable to falling back to synthesized keys, and it preserves the reported #1895 case, where by defaults to line but the element advertises AXScrollUpByPage.
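
The ordering described in this reply can be sketched as follows. The function names and the `Option<[&'static str; 2]>` return type are assumptions for illustration; the updated commit's actual signature is not shown on this page.

```rust
/// Hypothetical sketch: preferred granularity first, the other granularity second.
/// Returns None for an unknown direction so the caller can fall back to key synthesis.
fn candidates_preferred_first(direction: &str, by: &str) -> Option<[&'static str; 2]> {
    let (line, page) = match direction {
        "up" => ("AXScrollUp", "AXScrollUpByPage"),
        "down" => ("AXScrollDown", "AXScrollDownByPage"),
        "left" => ("AXScrollLeft", "AXScrollLeftByPage"),
        "right" => ("AXScrollRight", "AXScrollRightByPage"),
        _ => return None,
    };
    Some(if by == "page" { [page, line] } else { [line, page] })
}

/// Hypothetical helper: the first candidate that the element advertises, if any.
fn first_advertised(candidates: &[&'static str], advertised: &[&str]) -> Option<&'static str> {
    candidates.iter().copied().find(|c| advertised.contains(c))
}
```

In the #1895 scenario (by="line", element advertising only AXScrollUpByPage), this ordering selects AXScrollUpByPage. The trade-off raised in the review still applies: each step then scrolls by a page rather than a line.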

Validation: git diff --check.

@injaneity injaneity force-pushed the fix/1895-macos-scroll-ax-actions branch from a02fffb to c99a365 Compare June 25, 2026 17:42

Development

Successfully merging this pull request may close these issues.

cua-driver scroll tool: prefer AXScrollUpByPage AX action over arrow-key synthesis when element exposes it (Catalyst regression)

1 participant