fix(cua-driver/macos): prefer AX scroll actions#2031
Conversation
|
@injaneity is attempting to deploy a commit to the Cua Team on Vercel. A member of the Team first needs to authorize it. |
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThe macOS scroll tool now checks for a matching native AX scroll action first, performs it when available, and falls back to synthesized key events otherwise. The helper reports which method ran, and the success text now includes that method. ChangesAX scroll preference and reporting
Sequence Diagram(s)sequenceDiagram
participant ScrollTool as ScrollTool::invoke
participant FocusGuard as focus_guard::with_focus_suppressed
participant AXHelper as perform_ax_scroll_action_if_supported
participant AXAPI as AXUIElementPerformAction
participant Keys as key synthesis path
participant Result as ToolResult::text
ScrollTool->>FocusGuard: run scroll for direction/by/amount
FocusGuard->>AXHelper: inspect advertised AX scroll actions
alt action available
AXHelper->>AXAPI: perform selected AX action amount times
AXAPI-->>AXHelper: success or error
AXHelper-->>FocusGuard: Ok(Some(action_name))
else action absent
AXHelper-->>FocusGuard: Ok(None)
FocusGuard->>Keys: synthesize key events
Keys-->>FocusGuard: Ok("key synthesis")
end
FocusGuard-->>ScrollTool: method
ScrollTool->>Result: render "Scrolled ... via {method}"
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@libs/cua-driver/rust/crates/platform-macos/src/input/ax_actions.rs`:
- Around line 58-69: The scroll candidate selection in scroll_action_candidates
is mixing granularities, which can cause a page request to fall back to line
scrolling and vice versa. Update the match cases so each direction/by
combination returns only candidates for the requested granularity, and have the
fallback branch return no action (or equivalent None path) instead of
AXScrollDown/AXScrollDownByPage defaults. Keep the logic localized to
scroll_action_candidates so the caller can handle the missing match and preserve
the intended scroll amount.
In `@libs/cua-driver/rust/crates/platform-macos/src/tools/scroll.rs`:
- Around line 25-28: The schema description for the scroll tool is outdated
because `amount` is described only as keystroke repetitions even though
`scroll.rs` now uses native AX actions first when available. Update the tool
text in the scroll tool definition so the `amount` contract explicitly covers
both native AX action repetitions and fallback keypress repetitions, keeping the
`by='page'` and `by='line'` behavior description aligned with the AX-first path.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9e1688a2-9a27-4819-bead-feb0fdd36dba
📒 Files selected for processing (2)
libs/cua-driver/rust/crates/platform-macos/src/input/ax_actions.rslibs/cua-driver/rust/crates/platform-macos/src/tools/scroll.rs
| fn scroll_action_candidates(direction: &str, by: &str) -> [&'static str; 2] { | ||
| match (direction, by) { | ||
| ("up", "page") => ["AXScrollUpByPage", "AXScrollUp"], | ||
| ("down", "page") => ["AXScrollDownByPage", "AXScrollDown"], | ||
| ("left", "page") => ["AXScrollLeftByPage", "AXScrollLeft"], | ||
| ("right", "page") => ["AXScrollRightByPage", "AXScrollRight"], | ||
| ("up", _) => ["AXScrollUp", "AXScrollUpByPage"], | ||
| ("down", _) => ["AXScrollDown", "AXScrollDownByPage"], | ||
| ("left", _) => ["AXScrollLeft", "AXScrollLeftByPage"], | ||
| ("right", _) => ["AXScrollRight", "AXScrollRightByPage"], | ||
| _ => ["AXScrollDown", "AXScrollDownByPage"], | ||
| } |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Preserve the requested scroll granularity when selecting AX actions.
by="page" can currently fall back to AXScrollDown, and by="line" can fall back to AXScrollDownByPage. That bypasses the intended keystroke fallback with the wrong scroll amount when the matching AX action is not advertised. Only return same-granularity candidates and let None trigger fallback.
Proposed fix
-fn scroll_action_candidates(direction: &str, by: &str) -> [&'static str; 2] {
+fn scroll_action_candidates(direction: &str, by: &str) -> &'static [&'static str] {
match (direction, by) {
- ("up", "page") => ["AXScrollUpByPage", "AXScrollUp"],
- ("down", "page") => ["AXScrollDownByPage", "AXScrollDown"],
- ("left", "page") => ["AXScrollLeftByPage", "AXScrollLeft"],
- ("right", "page") => ["AXScrollRightByPage", "AXScrollRight"],
- ("up", _) => ["AXScrollUp", "AXScrollUpByPage"],
- ("down", _) => ["AXScrollDown", "AXScrollDownByPage"],
- ("left", _) => ["AXScrollLeft", "AXScrollLeftByPage"],
- ("right", _) => ["AXScrollRight", "AXScrollRightByPage"],
- _ => ["AXScrollDown", "AXScrollDownByPage"],
+ ("up", "page") => &["AXScrollUpByPage"],
+ ("down", "page") => &["AXScrollDownByPage"],
+ ("left", "page") => &["AXScrollLeftByPage"],
+ ("right", "page") => &["AXScrollRightByPage"],
+ ("up", "line") => &["AXScrollUp"],
+ ("down", "line") => &["AXScrollDown"],
+ ("left", "line") => &["AXScrollLeft"],
+ ("right", "line") => &["AXScrollRight"],
+ _ => &[],
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fn scroll_action_candidates(direction: &str, by: &str) -> [&'static str; 2] { | |
| match (direction, by) { | |
| ("up", "page") => ["AXScrollUpByPage", "AXScrollUp"], | |
| ("down", "page") => ["AXScrollDownByPage", "AXScrollDown"], | |
| ("left", "page") => ["AXScrollLeftByPage", "AXScrollLeft"], | |
| ("right", "page") => ["AXScrollRightByPage", "AXScrollRight"], | |
| ("up", _) => ["AXScrollUp", "AXScrollUpByPage"], | |
| ("down", _) => ["AXScrollDown", "AXScrollDownByPage"], | |
| ("left", _) => ["AXScrollLeft", "AXScrollLeftByPage"], | |
| ("right", _) => ["AXScrollRight", "AXScrollRightByPage"], | |
| _ => ["AXScrollDown", "AXScrollDownByPage"], | |
| } | |
| fn scroll_action_candidates(direction: &str, by: &str) -> &'static [&'static str] { | |
| match (direction, by) { | |
| ("up", "page") => &["AXScrollUpByPage"], | |
| ("down", "page") => &["AXScrollDownByPage"], | |
| ("left", "page") => &["AXScrollLeftByPage"], | |
| ("right", "page") => &["AXScrollRightByPage"], | |
| ("up", "line") => &["AXScrollUp"], | |
| ("down", "line") => &["AXScrollDown"], | |
| ("left", "line") => &["AXScrollLeft"], | |
| ("right", "line") => &["AXScrollRight"], | |
| _ => &[], | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@libs/cua-driver/rust/crates/platform-macos/src/input/ax_actions.rs` around
lines 58 - 69, The scroll candidate selection in scroll_action_candidates is
mixing granularities, which can cause a page request to fall back to line
scrolling and vice versa. Update the match cases so each direction/by
combination returns only candidates for the requested granularity, and have the
fallback branch return no action (or equivalent None path) instead of
AXScrollDown/AXScrollDownByPage defaults. Keep the logic localized to
scroll_action_candidates so the caller can handle the missing match and preserve
the intended scroll amount.
| description: "Scroll the target pid's focused region.\n\n\ | ||
| If the target element advertises a native AX scroll action, that action is used first. \ | ||
| Otherwise, by='page' maps to PageDown/PageUp × amount and by='line' maps to \ | ||
| DownArrow/UpArrow × amount. Horizontal variants use Left/Right arrow keys.\n\n\ |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Make amount wording include native AX actions.
With the AX-first path, amount is no longer only “keystroke repetitions”; it also controls native AX action repetitions. Update the schema text so clients get the right contract.
Proposed fix
- "description": "Number of keystroke repetitions. Default: 3."
+ "description": "Number of scroll action repetitions. Default: 3."Also applies to: 46-50
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@libs/cua-driver/rust/crates/platform-macos/src/tools/scroll.rs` around lines
25 - 28, The schema description for the scroll tool is outdated because `amount`
is described only as keystroke repetitions even though `scroll.rs` now uses
native AX actions first when available. Update the tool text in the scroll tool
definition so the `amount` contract explicitly covers both native AX action
repetitions and fallback keypress repetitions, keeping the `by='page'` and
`by='line'` behavior description aligned with the AX-first path.
|
Updated per review:
I kept same-direction Validation: |
a02fffb to
c99a365
Compare
Summary
AXScroll*andAXScroll*ByPageaction families, ordered by requestedbygranularity.Rationale
This keeps the existing Cua scroll flow intact: resolve/retain the target element, focus under suppression, then scroll. The only change is to use the native AX action the target element advertises before using synthesized keyboard input as a fallback.
Testing
git diff --checkcargo check/rustfmtbecause no Rust toolchain is installed in this environment (rustup toolchain listreports none).Fixes #1895
Summary by CodeRabbit
New Features
Bug Fixes