feat(a2a): A2A (Agent-to-Agent) gateway — DP MVP #717

Open
moonming wants to merge 7 commits into main from feat/a2a-gateway-mvp

@moonming moonming commented Jul 3, 2026

Collaborator

What

Data-plane MVP of an A2A (Agent-to-Agent) gateway: register an upstream agent that speaks A2A over HTTP (JSON-RPC 2.0) as a first-class resource, front it at /a2a/<agent>, and govern every call with the same pipeline as LLM and MCP traffic — one API key, per-agent ACL, rate-limit + budget, and usage. This is the third traffic type on the single DP pipeline, mirroring the MCP gateway.

Design issue: AISIX-Cloud#958 (extends the #873 gap ③ "MCP / Agent gateway"; the MCP half shipped as #894, this is the Agent half).

How it's built (6 commits, each self-contained)

| Layer | Commit |
| --- | --- |
| A2aAgent resource model (clone of McpServer + protocol_version 1.0/0.3) | feat(a2a): add A2aAgent resource model |
| aisix-a2a crate: hand-rolled JSON-RPC client behind A2aBridge | feat(a2a): add aisix-a2a crate |
| Snapshot table + etcd loader/supervisor + api_key.allowed_agents ACL | feat(a2a): load A2aAgent into the snapshot |
| /a2a/:agent proxy endpoint + agent-card URL rewrite | feat(a2a): serve the /a2a/:agent gateway endpoint |
| DP admin CRUD (/admin/v1/a2a_agents) | feat(a2a): DP admin CRUD for a2a_agents |
| Admin OpenAPI docs | docs(a2a): document /admin/v1/a2a_agents |

Governance reuse (the point): the /a2a handler calls the exact same functions as /mcp: AuthenticatedKey (401) → can_access_agent (403) → quota::enforce rate-limit + budget (429) → UsageEvent into the shared sink. No new governance code.
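
For reviewers who want the ordering spelled out, here is a minimal sketch of that chain in plain Rust. `Caller`, `Gate`, and `govern` are hypothetical stand-ins invented for illustration; the real handler uses AuthenticatedKey, can_access_agent, and quota::enforce, whose signatures are not shown in this PR description.

```rust
/// Outcome of the governance checks that run before a call is forwarded.
#[derive(Debug, PartialEq)]
pub enum Gate {
    Allow,
    /// HTTP status returned to the caller.
    Reject(u16),
}

/// Hypothetical, simplified view of a caller (not the real AuthenticatedKey).
pub struct Caller<'a> {
    /// `None` models a missing or invalid API key.
    pub key: Option<&'a str>,
    /// Exact agent names only in this sketch (no glob matching).
    pub allowed_agents: &'a [&'a str],
    /// Stands in for the result of the rate-limit + budget check.
    pub within_quota: bool,
}

/// Runs the checks in the order the description gives: 401, then 403, then 429.
pub fn govern(caller: &Caller<'_>, agent: &str) -> Gate {
    if caller.key.is_none() {
        return Gate::Reject(401);
    }
    if !caller.allowed_agents.contains(&agent) {
        return Gate::Reject(403);
    }
    if !caller.within_quota {
        return Gate::Reject(429);
    }
    Gate::Allow
}
```

Only a call that reaches `Gate::Allow` would be forwarded upstream; the usage event is emitted separately and is left out of this sketch.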

Reference implementations (per repo policy)

Compared how the two mainstream gateways front A2A before building:

  • LiteLLM (docs.litellm.ai/docs/a2a): registers agents, proxies POST /a2a/{id} with a virtual key + team ACL (403 on deny), serves the discovery card with the URL rewritten to the gateway, forwards non-messaging methods upstream. This MVP lands the same shape (per-agent path, key ACL, card rewrite, verbatim forward).
  • Kong Agent Gateway (developer.konghq.com/ai-gateway/a2a/): auto-detects A2A, rewrites the agent-card URL, and attaches auth/ACL/observability policies — but its LLM guardrail family is not documented over A2A. Our differentiator (guardrails over A2A on the same pipeline) is deliberately Phase 2 here.
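
The card rewrite that both reference gateways perform (and that this MVP copies) can be illustrated with a small sketch. It assumes the card carries a single service URL field and that the gateway path is /a2a/<agent> as described above; the `AgentCard` struct and `rewrite_card` function here are simplified, hypothetical stand-ins rather than the types in aisix-a2a.

```rust
/// Minimal stand-in for the agent-card fields that matter for the rewrite.
#[derive(Debug, Clone, PartialEq)]
pub struct AgentCard {
    pub name: String,
    /// Service endpoint advertised to clients.
    pub url: String,
}

/// Returns a copy of the card whose service URL points at the gateway
/// instead of the upstream agent.
pub fn rewrite_card(card: &AgentCard, gateway_base: &str, agent: &str) -> AgentCard {
    // Tolerate a trailing slash on the configured gateway base.
    let base = gateway_base.trim_end_matches('/');
    AgentCard {
        name: card.name.clone(),
        url: format!("{}/a2a/{}", base, agent),
    }
}
```

A client that discovers the agent through the gateway then sends its JSON-RPC calls back to the gateway, so the upstream address never leaves the data plane.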

Wire facts verified against the A2A spec and cited in code comments:

  • Agent-card discovery: https://{domain}/.well-known/agent-card.json (RFC 8615, domain origin) — a2a-protocol.org/latest/topics/agent-discovery/
  • message/send JSON-RPC envelope differs between A2A 0.3 (message/send, kind-discriminated) and 1.0 (SendMessage, PascalCase, result.task) — a2a-protocol.org/latest/topics/life-of-a-task/. The bridge forwards requests verbatim and does not translate between versions, so a single agent is reached in whichever version it's pinned to; normalization is a later step.
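
The two wire facts above can be captured in a few lines. This is only a sketch: `ProtocolVersion`, `send_method`, and `agent_card_url` are illustrative names, not the identifiers used in the PR (the model type there is A2aProtocolVersion).

```rust
/// Protocol version pinned on the agent resource.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ProtocolVersion {
    /// A2A 0.3
    V03,
    /// A2A 1.0
    V10,
}

/// JSON-RPC method name used to send a message in each version.
pub fn send_method(version: ProtocolVersion) -> &'static str {
    match version {
        ProtocolVersion::V03 => "message/send",
        ProtocolVersion::V10 => "SendMessage",
    }
}

/// Well-known discovery URL (RFC 8615) for an agent's origin.
pub fn agent_card_url(origin: &str) -> String {
    format!("{}/.well-known/agent-card.json", origin.trim_end_matches('/'))
}
```

Because the bridge forwards bodies verbatim, a helper like `send_method` would only matter for labelling or later normalization, not for the forwarding path itself.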

Divergence from MCP, called out because it's intentional: A2A has no official Rust SDK (the reference SDKs are Python/JS/Java/Go/.NET), so unlike the MCP gateway (which uses the official rmcp), the JSON-RPC + agent-card plumbing is hand-rolled directly on the workspace HTTP client, kept behind the A2aBridge trait.
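
A rough picture of why the trait seam helps: the proxy can depend on the trait and tests can swap in a canned implementation. The method names below follow the ones the review mentions (fetch_agent_card, send), but the signatures are assumptions; the real trait is presumably async and uses typed requests and errors. `CannedBridge` and `proxy_call` are hypothetical.

```rust
/// Simplified, synchronous stand-in for the A2aBridge trait.
pub trait A2aBridge {
    /// Fetches the upstream agent card as raw JSON text.
    fn fetch_agent_card(&self) -> Result<String, String>;
    /// Forwards a JSON-RPC request body and returns the response body.
    fn send(&self, request: &str) -> Result<String, String>;
}

/// Test double: serves a fixed card and echoes requests back.
pub struct CannedBridge {
    pub card: String,
}

impl A2aBridge for CannedBridge {
    fn fetch_agent_card(&self) -> Result<String, String> {
        Ok(self.card.clone())
    }

    fn send(&self, request: &str) -> Result<String, String> {
        // Echo the body unchanged, mirroring the "forward verbatim" contract.
        Ok(request.to_string())
    }
}

/// Caller-side code depends only on the trait, not on the HTTP client.
pub fn proxy_call(bridge: &dyn A2aBridge, body: &str) -> Result<String, String> {
    bridge.send(body)
}
```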

Test plan

  • aisix-a2a: 6 unit + 3 real-HTTP roundtrip tests against a locally spawned A2A server — card fetch, message/send, and credential forwarding for bearer / api_key / none (proves the gateway-held credential reaches the upstream and only the upstream).
  • aisix-core: A2aAgent model (9) + ApiKey::can_access_agent ACL + validate_a2a_agent schema — full suite green (340).
  • aisix-admin: 4 a2a handler tests (slash/secret/oauth2 validation) + OpenAPI parity (openapi_documents_exact_admin_path_set) — full suite green (94).
  • aisix-proxy: endpoint helper tests (error-status mapping, gateway-base rewrite) (3).
  • cargo clippy -D warnings clean on every touched crate; cargo fmt; schemas regenerated (dump-schema); aisix-server binary builds.
  • Endpoint-level e2e (real DP + real agent): deferred to the e2e harness (the aisix#674 analog), as was done when the MCP endpoint shipped.

Deferred (by design, tracked)

  • Guardrails over A2A — Phase 2 (the differentiator; the endpoint has the hook point but no scan yet).
  • OAuth2 upstream auth — returns 501 Not Implemented for now; none/bearer/api_key work.
  • 0.3 ↔ 1.0 wire normalization — Phase 2.
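
As an illustration of the auth split described above (none/bearer/api_key work, OAuth2 answers 501), here is a hedged sketch. `UpstreamAuth` and `auth_header` are hypothetical names, and the `x-api-key` header name is an assumption; the PR description does not say which header carries an api_key credential.

```rust
/// Upstream credential shape, simplified for illustration.
#[derive(Debug, Clone, PartialEq)]
pub enum UpstreamAuth {
    NoAuth,
    Bearer(String),
    ApiKey(String),
    OAuth2,
}

/// Returns the header to attach to the upstream request, or the HTTP status
/// to answer with when the auth type is not supported yet.
pub fn auth_header(auth: &UpstreamAuth) -> Result<Option<(&'static str, String)>, u16> {
    match auth {
        UpstreamAuth::NoAuth => Ok(None),
        UpstreamAuth::Bearer(token) => Ok(Some(("authorization", format!("Bearer {}", token)))),
        // Header name is an assumption made for this sketch.
        UpstreamAuth::ApiKey(key) => Ok(Some(("x-api-key", key.clone()))),
        // Deferred in this MVP: 501 Not Implemented.
        UpstreamAuth::OAuth2 => Err(501),
    }
}
```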

⚠️ Requires a paired CP PR before it's user-reachable

Per this repo's rule (a config knob isn't shipped until the control plane exposes it), a user cannot register an A2aAgent until AISIX-Cloud grows an org-scoped a2a_agents resource (model + RLS + secretbox + cp-admin.yaml + DP-push fan-out by allowed_environments) and a Dashboard page. cp-admin.yaml is a closed schema that will reject the new fields until then. That CP PR is the immediate follow-up; this DP PR is the mergeable first half (DP-first, same as MCP).

Summary by CodeRabbit

  • New Features

    • Added support for registering and managing A2A agents in admin tools.
    • Added A2A gateway endpoints for sending requests and retrieving agent cards.
    • API keys can now limit access to specific agents.
    • Usage reporting now includes A2A agent and method details.
  • Bug Fixes

    • Improved handling of agent configuration, access checks, timeouts, and upstream forwarding.
    • Updated snapshot and storage handling so A2A agents are consistently loaded and returned.

moonming added 6 commits July 3, 2026 16:46

feat(a2a): add A2aAgent resource model
First slice of the A2A (Agent-to-Agent) gateway: register an upstream
agent that speaks A2A over HTTP (JSON-RPC 2.0) as a first-class resource,
mirroring McpServer. Adds a pinned protocol_version (1.0/0.3) and the
same upstream auth shape (none/bearer/api_key/oauth2). Wired into the
aisix-core model module and re-exported. No runtime path yet.

feat(a2a): add aisix-a2a crate
Hand-rolled JSON-RPC 2.0 client behind the A2aBridge trait (A2A has no
official Rust SDK). Fetches the upstream agent card from its RFC 8615
well-known URI (/.well-known/agent-card.json) and forwards JSON-RPC
requests verbatim to the agent's service endpoint, holding the upstream
credential (none/bearer/api_key) so the calling client never sees it.
Does not translate between the A2A 0.3 and 1.0 wire formats — a single
agent is reached in whichever version it speaks (pinned on the resource).

Tested against a locally spawned real A2A server: card fetch, message/send
roundtrip, and credential forwarding for each auth type. oauth2 upstream
auth is rejected as unsupported for now.

feat(a2a): load A2aAgent into the snapshot
Make the A2aAgent resource reachable on the hot path: add the a2a_agents
table to AisixSnapshot, wire the etcd loader + watch supervisor to
populate and maintain it (mirroring mcp_servers), add the JSON Schema
validator (validate_a2a_agent) plus the dumped schemas/resources file,
and add per-agent access control to ApiKey (allowed_agents +
can_access_agent, mirroring allowed_tools). The DP admin ApiKey body and
response carry allowed_agents so the ACL is configurable there too.
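
A minimal sketch of the allowed_agents check, assuming the semantics documented for the field elsewhere in this PR: entries are single-`*` globs, `"*"` grants every agent, an entry without `*` matches exactly, and an absent list grants nothing. The free functions below are illustrative; the real check is the ApiKey::can_access_agent method.

```rust
/// Matches `name` against a pattern containing at most one `*`.
/// Any further `*` in the pattern is treated as a literal character.
pub fn glob_match(pattern: &str, name: &str) -> bool {
    match pattern.split_once('*') {
        // No wildcard: exact match only.
        None => pattern == name,
        // One wildcard: prefix and suffix must both fit without overlapping.
        Some((prefix, suffix)) => {
            name.len() >= prefix.len() + suffix.len()
                && name.starts_with(prefix)
                && name.ends_with(suffix)
        }
    }
}

/// `None` means the key has no A2A access at all; access is granted explicitly.
pub fn can_access_agent(allowed_agents: Option<&[&str]>, agent: &str) -> bool {
    match allowed_agents {
        None => false,
        Some(patterns) => patterns.iter().any(|pattern| glob_match(pattern, agent)),
    }
}
```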

Read/serve plumbing only — the /a2a proxy endpoint and admin CRUD follow.

feat(a2a): serve the /a2a/:agent gateway endpoint
Front each registered A2A agent through /a2a/<agent>: forward JSON-RPC
requests to the upstream via the aisix-a2a bridge, and serve its card —
with the advertised service URL rewritten to this gateway — at the
RFC 8615 well-known path. Every call runs the same governance as LLM and
MCP traffic: AuthenticatedKey (401), per-agent ACL via allowed_agents
(403), quota::enforce rate-limit + budget (429), and a usage event
(a2a_agent_name / a2a_method) into the shared sink. The body is forwarded
verbatim, so no 0.3<->1.0 translation happens here. Guardrails over A2A
message content are a later step.

feat(a2a): DP admin CRUD for a2a_agents
Add /admin/v1/a2a_agents CRUD, cloning the mcp_servers handlers: schema
validation, duplicate display_name (409), uuid on POST, revision bump on
PUT, and the same per-auth_type credential coupling. The display_name is
the agent's URL path segment (/a2a/<name>), so it is rejected if it
contains `/`. Wire the a2a_agents table through the ConfigStore trait
(in-memory + etcd-backed, subkey "a2a_agents") and the admin router.

OpenAPI documentation for the new paths follows in the next commit.

docs(a2a): document /admin/v1/a2a_agents
Add the a2a_agents collection and item paths (get/post, get/put/delete),
the A2aAgentEntry response wrapper, and register the generated A2aAgent
schema. Keeps the OpenAPI path set matching the admin router exactly
(openapi_documents_exact_admin_path_set) and the resource schema
components complete.
@coderabbitai

coderabbitai Bot commented Jul 3, 2026


Review details
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0ad8f466-2fb8-4b15-aba2-23aab9d34a01

📥 Commits

Reviewing files that changed from the base of the PR and between 83bbb42 and 4bba8c2.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
  • crates/aisix-a2a/Cargo.toml
  • crates/aisix-a2a/src/bridge.rs
  • crates/aisix-a2a/tests/upstream_roundtrip.rs
  • crates/aisix-proxy/src/a2a.rs
📝 Walkthrough

Adds an A2A (agent-to-agent) JSON-RPC gateway: a new aisix-a2a crate implementing an upstream bridge, an A2aAgent resource model wired through schema/snapshot/etcd/store layers, admin CRUD endpoints with OpenAPI docs, proxy-side gateway routes, and API key allowlisting plus usage telemetry for A2A calls.

Changes

A2A gateway feature

| Layer | File(s) | Summary |
| --- | --- | --- |
| Bridge crate: types, HTTP client, tests | crates/aisix-a2a/{Cargo.toml,src/bridge.rs,src/error.rs,src/lib.rs,tests/upstream_roundtrip.rs}, Cargo.toml | New crate defines A2aAuth, A2aUpstream, AgentCard, the A2aBridge trait, HttpBridge (reqwest-based), A2aError, and integration/unit tests covering auth mapping, timeouts, and card round-tripping. |
| A2aAgent model, schema, snapshot, allowlisting | crates/aisix-core/src/models/{a2a_agent.rs,mod.rs,schema.rs,snapshot.rs,apikey.rs}, crates/aisix-core/{src/lib.rs,src/bin/dump-schema.rs}, schemas/resources/{a2a_agent.schema.json,api_key.schema.json}, crates/aisix-obs/src/usage.rs | Adds A2aAgent/A2aAuthType/A2aProtocolVersion, JSON schema generation/validation, a snapshot ResourceTable, ApiKey.allowed_agents glob-based allowlist, and UsageEvent A2A attribution fields. |
| Persistence and sync | crates/aisix-admin/src/{store.rs,etcd_store.rs}, crates/aisix-etcd/src/{loader.rs,supervisor.rs} | Adds in-memory and etcd ConfigStore CRUD for A2aAgent, snapshot loader validation, and watch-supervisor put/delete/clone handling. |
| Admin CRUD and OpenAPI docs | crates/aisix-admin/src/{a2a_agents_handlers.rs,lib.rs,apikeys_handlers.rs,openapi.rs} | Adds admin list/get/create/update/delete handlers with validation and uniqueness checks, mounts routes, exposes allowed_agents on API key payloads, and documents new paths/schemas in OpenAPI. |
| Proxy gateway routes | crates/aisix-proxy/{Cargo.toml,src/a2a.rs,src/lib.rs} | Adds /a2a/:agent JSON-RPC forwarding and agent-card endpoints with ACL/quota enforcement, error mapping, usage emission, and router/metrics wiring. |

Estimated code review effort: 4 (Complex) | ~75 minutes

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant ProxyA2aEndpoint
  participant HttpBridge
  participant UpstreamAgent
  Caller->>ProxyA2aEndpoint: POST /a2a/:agent (JSON-RPC)
  ProxyA2aEndpoint->>ProxyA2aEndpoint: load agent, check ACL, enforce quota
  ProxyA2aEndpoint->>HttpBridge: send(request)
  HttpBridge->>UpstreamAgent: forward JSON-RPC (with auth)
  UpstreamAgent-->>HttpBridge: JSON-RPC response
  HttpBridge-->>ProxyA2aEndpoint: response
  ProxyA2aEndpoint->>ProxyA2aEndpoint: emit UsageEvent
  ProxyA2aEndpoint-->>Caller: JSON-RPC response
sequenceDiagram
  participant Admin
  participant A2aAgentsHandler
  participant ConfigStore
  Admin->>A2aAgentsHandler: POST /admin/v1/a2a_agents
  A2aAgentsHandler->>A2aAgentsHandler: validate + decode payload
  A2aAgentsHandler->>ConfigStore: check display_name uniqueness
  A2aAgentsHandler->>ConfigStore: put_a2a_agent
  ConfigStore-->>A2aAgentsHandler: stored entry
  A2aAgentsHandler-->>Admin: ResourceEntry<A2aAgent>

Possibly related PRs

  • api7/aisix#516: Both PRs extend UsageEvent in crates/aisix-obs/src/usage.rs with additional telemetry fields affecting OTLP span attributes.
  • api7/aisix#534: Both PRs modify the same UsageEvent struct/serialization in crates/aisix-obs/src/usage.rs.
  • api7/aisix#667: Both PRs extend crates/aisix-proxy/src/lib.rs endpoint-label and inbound-protocol normalization logic for new route prefixes.

Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Security Check | ❌ Error | A2aAgent.secret is stored as plain text and admin handlers return ResourceEntry directly, exposing credentials in responses and etcd. | Redact secret from read responses and add encrypted/secret-wrapped storage for upstream credentials before persisting them. |
| E2e Test Quality Review | ⚠️ Warning | The only real-HTTP coverage is HttpBridge vs a local upstream; proxy/admin changes are unit/schema-only, so the full /a2a flow isn’t exercised. | Add an e2e test that drives /a2a/:agent through the router with a live upstream and snapshot/store, covering auth, ACL, 404/403/400, quota, and card rewrite. |
✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly describes the main change: an A2A gateway data-plane MVP. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/aisix-admin/src/openapi.rs (1)

3488-3539: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

ApiKeyRequest/PublicApiKey OpenAPI schemas are missing the new allowed_agents field.

apikeys_handlers.rs added allowed_agents to both the request body and public response DTOs, but these component schemas here still only document allowed_tools and set additionalProperties: false. The generated OpenAPI contract no longer matches the actual accepted/returned payload shape.

As per coding guidelines: "After changing Admin API routes, OpenAPI metadata, or generated descriptions, verify the OpenAPI output ... and inspect the served OpenAPI if generated descriptions changed."

📝 Proposed fix (apply to both `ApiKeyRequest` and `PublicApiKey`)
           "allowed_tools": {
             "type": [
               "array",
               "null"
             ],
             "items": {
               "type": "string"
             },
             "description": "MCP tools this key may call, ..."
           },
+          "allowed_agents": {
+            "type": [
+              "array",
+              "null"
+            ],
+            "items": {
+              "type": "string"
+            },
+            "description": "A2A agents this key may reach, named by their registered names. Entries are matched as single-`*` globs, mirroring `allowed_tools`: `\"*\"` grants every agent and an entry without a `*` matches one agent exactly. When omitted or set to `null`, the key has no A2A agent access — access is granted explicitly."
+          },
           "expires_at": {

Also applies to: 3834-3890

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aisix-admin/src/openapi.rs` around lines 3488 - 3539, The OpenAPI
component schemas for ApiKeyRequest and PublicApiKey are missing the new
allowed_agents property, so the documented contract no longer matches the DTOs
used by apikeys_handlers.rs. Update both schema definitions to include
allowed_agents with the correct array/null shape and description, and ensure it
is listed alongside allowed_tools before keeping additionalProperties set to
false. Verify the generated OpenAPI output still reflects the request and
response models after the schema change.

Source: Coding guidelines

♻️ Duplicate comments (1)
crates/aisix-admin/src/a2a_agents_handlers.rs (1)

24-30: 🔒 Security & Privacy | 🟠 Major | 🏗️ Heavy lift

Secrets returned unmasked via list/get.

list_a2a_agents/get_a2a_agent return the full A2aAgent including secret/token_url fields in plaintext to any Admin API caller. See the related comment on the model file; consider redacting credential fields on read.

Also applies to: 32-43
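
One way to address this finding is a separate read view that never carries the credential. The sketch below assumes a stored record with an optional `secret`; `A2aAgentRecord` and `A2aAgentView` are hypothetical types and the field set is simplified, so treat it as an illustration of the redaction pattern rather than the actual A2aAgent model.

```rust
/// Stored form, including the upstream credential (simplified).
#[derive(Clone)]
pub struct A2aAgentRecord {
    pub display_name: String,
    pub url: String,
    pub secret: Option<String>,
}

/// Read view returned by list/get handlers: the secret is never copied.
#[derive(Debug, Clone, PartialEq)]
pub struct A2aAgentView {
    pub display_name: String,
    pub url: String,
    /// Tells the caller a credential is configured without revealing it.
    pub has_secret: bool,
}

impl From<&A2aAgentRecord> for A2aAgentView {
    fn from(record: &A2aAgentRecord) -> Self {
        A2aAgentView {
            display_name: record.display_name.clone(),
            url: record.url.clone(),
            has_secret: record.secret.is_some(),
        }
    }
}
```

Routing every read path through one conversion like this keeps the redaction in a single place, which is what the comment asks for.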

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aisix-admin/src/a2a_agents_handlers.rs` around lines 24 - 30, The read
handlers for A2A agents are returning credential fields in plaintext, so redact
sensitive data before serializing responses. Update list_a2a_agents and
get_a2a_agent to mask or omit secret/token_url on the returned A2aAgent inside
ResourceEntry, and keep the redaction logic consistent with the model’s handling
so all read paths use the same sanitized representation.
🧹 Nitpick comments (6)
crates/aisix-a2a/Cargo.toml (1)

25-28: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Redundant aisix-core dev-dependency.

aisix-core is already a regular dependency (Line 12) and is automatically available to tests; re-declaring it identically under [dev-dependencies] (Line 26) is unnecessary.

♻️ Proposed cleanup
 [dev-dependencies]
-aisix-core = { path = "../aisix-core" }
 tokio = { workspace = true, features = ["macros", "rt-multi-thread"] }
 axum.workspace = true
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aisix-a2a/Cargo.toml` around lines 25 - 28, The Cargo.toml for the
aisix-a2a crate has a redundant aisix-core entry under [dev-dependencies] even
though aisix-core is already declared as a normal dependency and is available to
tests. Remove the duplicate aisix-core dev-dependency and leave the other
test-only dependencies (such as tokio and axum.workspace) unchanged.
crates/aisix-proxy/src/a2a.rs (2)

138-172: 🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick win

Quota check runs after the full request body is buffered/parsed.

crate::quota::enforce (Line 157) only needs the auth key, not the parsed body/method — yet it runs after to_bytes/serde_json::from_slice (Lines 139-147). An already-throttled caller still pays the cost of the gateway buffering and parsing its full body (up to request_body_limit_bytes) before being rejected. Moving the quota check earlier would save that work, at the cost of losing the method label on quota-rejection usage events (would become "", as already happens on the oauth2-unsupported path).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aisix-proxy/src/a2a.rs` around lines 138 - 172, The quota gate in
`a2a` is applied too late, after `to_bytes` and `serde_json::from_slice` have
already buffered and parsed the full request body. Move the
`crate::quota::enforce` call in the `a2a` request handling path to run
immediately after auth is available and before body parsing, using the existing
`method` fallback behavior for quota-failure usage events if the body has not
been inspected yet. Keep the current error/usage reporting flow intact in the
`a2a` handler while avoiding unnecessary work for rejected requests.

212-251: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

a2a_agent_card doesn't record AccessLog/state.metrics.record_request, unlike a2a_endpoint.

a2a_endpoint wraps dispatch with per-request AccessLog.emit() and state.metrics.record_request(...) (Lines 66-92), but a2a_agent_card is wired directly as the route handler with no equivalent instrumentation. Card-fetch requests are invisible to the same status/latency dashboards that track the main A2A traffic.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aisix-proxy/src/a2a.rs` around lines 212 - 251, a2a_agent_card is
missing the same request instrumentation used by a2a_endpoint, so card fetches
never emit AccessLog or update state.metrics.record_request. Wrap the handler
logic in the same per-request logging/metrics flow as dispatch in a2a_endpoint,
using the a2a_agent_card function as the entry point and recording the final
status and latency for both success and early-return paths.
crates/aisix-a2a/src/bridge.rs (1)

210-230: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Non-2xx upstream responses discard the JSON-RPC error body.

send() treats any non-success HTTP status as a hard error and returns only a synthesized "upstream returned HTTP {status}" message (Lines 221-226), discarding whatever JSON-RPC error envelope the upstream may have returned in the body. Per the module's own "forwards ... verbatim" contract (Lines 17-20), an upstream that legitimately returns a structured JSON-RPC error alongside a non-2xx status loses that detail for the caller.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aisix-a2a/src/bridge.rs` around lines 210 - 230, The send() method in
bridge.rs is dropping JSON-RPC error details for non-2xx upstream responses by
returning only a synthesized HTTP status message. Update send() so it still
reads and attempts to parse the response body when resp.status().is_success() is
false, and preserve any upstream JSON-RPC error envelope instead of replacing it
with a generic A2aError::Request; keep the existing success-path parsing and use
the same send/apply_auth flow to locate the fix.
crates/aisix-a2a/tests/upstream_roundtrip.rs (1)

80-128: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Solid happy-path coverage; consider adding a failure-path test.

The three tests thoroughly verify credential propagation for Bearer/ApiKey/None auth. Consider also adding a case where the upstream returns a non-success status or malformed body, to assert the HttpBridge maps it to A2aError::Connect/A2aError::Request as documented, closing the loop with the proxy's a2a_error_status mapping shown in crates/aisix-proxy/src/a2a.rs.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aisix-a2a/tests/upstream_roundtrip.rs` around lines 80 - 128, The
current tests only cover successful credential forwarding and do not verify how
HttpBridge handles upstream failures. Add a failure-path test around
HttpBridge::fetch_agent_card and/or HttpBridge::send that simulates a
non-success status or malformed response body, and assert the error maps to
A2aError::Connect or A2aError::Request as expected. Use the existing test
helpers and A2aUpstream setup so the new case complements
fetches_card_and_forwards_bearer, forwards_api_key_header, and
sends_no_credential_when_none while covering the proxy error mapping behavior.
crates/aisix-core/src/models/a2a_agent.rs (1)

82-87: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

State the actual default timeout instead of "a built-in default."

Per coding guidelines, omitted-field fallback behavior should be described accurately. Naming the concrete default (e.g., "30000 ms") is more useful to API consumers than a vague reference.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aisix-core/src/models/a2a_agent.rs` around lines 82 - 87, Update the
`timeout_ms` field documentation in `A2aAgent` so it names the actual
omitted-field fallback instead of saying “a built-in default.” Keep the existing
constraints, but replace the vague wording with the concrete default timeout
value used by the gateway, so API consumers can see the exact behavior from the
`timeout_ms` doc comment.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/aisix-a2a/src/bridge.rs`:
- Around line 154-168: `HttpBridge::new` currently creates its own
`reqwest::Client`, which prevents connection reuse when `/a2a` handlers
construct a new bridge per request. Update `HttpBridge` so it accepts a shared
`reqwest::Client` (instead of calling `Client::new()` internally) and thread
that client through the bridge construction path used by the `/a2a` handlers to
preserve the connection pool.

In `@crates/aisix-admin/src/a2a_agents_handlers.rs`:
- Around line 45-58: The uniqueness check in create_a2a_agent (and the similar
update_a2a_agent flow) is currently a read-then-write pattern that can race
under concurrent requests. Move the display_name uniqueness enforcement into the
store layer so put_a2a_agent/update logic performs an atomic constraint check,
or back it with a unique index, and have create_a2a_agent rely on that atomic
failure instead of calling list_a2a_agents plus assert_unique_display_name
before writing.
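
A sketch of what an atomic uniqueness check could look like for the in-memory store: the check and the write happen under one lock, so two concurrent creates cannot both pass. `MemoryStore`, `put_agent`, and `StoreError` are hypothetical names, and an etcd-backed store would need its own mechanism (for example a transaction), which this sketch does not cover.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

#[derive(Debug, PartialEq)]
pub enum StoreError {
    /// Would map to HTTP 409 in the admin handler.
    DuplicateDisplayName,
}

#[derive(Default)]
pub struct MemoryStore {
    /// Maps agent id to display_name.
    agents: Mutex<HashMap<String, String>>,
}

impl MemoryStore {
    /// Inserts or updates an agent, rejecting a display_name already used by
    /// a different id. Check and write share one critical section.
    pub fn put_agent(&self, id: &str, display_name: &str) -> Result<(), StoreError> {
        let mut agents = self.agents.lock().unwrap();
        let taken = agents
            .iter()
            .any(|(other_id, name)| other_id.as_str() != id && name.as_str() == display_name);
        if taken {
            return Err(StoreError::DuplicateDisplayName);
        }
        agents.insert(id.to_string(), display_name.to_string());
        Ok(())
    }
}
```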

In `@crates/aisix-admin/src/openapi.rs`:
- Around line 1909-2003: Add the missing 409 conflict response to the PUT
handler entry for the A2A agent update operation in openapi.rs. The A2A agent
update documentation currently lists validation, auth, not found, body-size,
content-type, and store errors, but omits the duplicate display_name case
handled by the update path. Update the responses section for the A2aAgent update
endpoint to include a 409 Duplicate display_name response, matching the conflict
semantics already documented for the POST flow and keeping the A2A Agents
operation description consistent.

In `@crates/aisix-core/src/models/a2a_agent.rs`:
- Around line 18-24: Add a direct doc comment on the public A2aAgent struct so
schemars::JsonSchema can populate the OpenAPI description from the type itself,
since the existing module-level //! docs are not used for this. Update the
comment immediately above A2aAgent in the model definition to describe the
resource clearly, matching the style used by other Admin API resource models.
After documenting the struct, regenerate the A2aAgent schema output so the new
description is reflected in schemas/resources/a2a_agent.schema.json.
- Around line 49-56: The Admin API read paths are exposing stored upstream
credentials because `ResourceEntry<A2aAgent>` is returned directly from
`list_a2a_agents`, `get_a2a_agent`, `create_a2a_agent`, and `update_a2a_agent`.
Update the `A2aAgent` response flow so `secret` is omitted from serialized read
output, either by introducing a separate public/read DTO or by mapping the
existing model before returning it; use the `A2aAgent` struct and those four
handler methods as the main places to adjust.

In `@crates/aisix-proxy/src/a2a.rs`:
- Around line 95-172: The dispatch path in a2a::dispatch is missing usage
reporting for early rejections, so 404 unknown agent, 403 ACL denied, and 400
invalid body responses should also call emit_a2a_usage before returning. Add the
same usage event emission used in the oauth2 and quota error paths, using the
current request_id, agent, resolved method when available (or empty string for
pre-parse failures), the actual response status, and Duration::ZERO so these
failures are tracked consistently with the rest of the A2A pipeline.
- Around line 104-118: The A2A handler in a2a.rs currently returns 404 from the
snapshot/a2a_agents lookup before running the auth.key().can_access_agent check,
which lets callers distinguish unknown/disabled agents from ACL failures. Rework
the A2A request path in this handler so authorization is evaluated before
revealing whether an agent exists, or otherwise collapse both outcomes to the
same response status/message; use the existing snapshot, entry lookup, and
can_access_agent logic to keep the check centralized.
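
To illustrate the first option in the last finding (authorize before revealing existence), here is a small sketch. `resolve` is a hypothetical helper, the ACL is reduced to exact names, and the registry is a plain slice of (name, upstream URL) pairs; the real handler works on the snapshot and can_access_agent.

```rust
/// Resolves an agent name to its upstream URL.
/// The ACL is checked first, so a caller without access gets 403 whether or
/// not the agent exists; only an authorized caller can observe a 404.
pub fn resolve<'a>(
    registered: &[(&str, &'a str)],
    allowed: &[&str],
    agent: &str,
) -> Result<&'a str, u16> {
    if !allowed.contains(&agent) {
        return Err(403);
    }
    registered
        .iter()
        .find(|entry| entry.0 == agent)
        .map(|entry| entry.1)
        .ok_or(404)
}
```

With this ordering, probing arbitrary names with a key that has no access yields the same 403 every time, so the response no longer distinguishes unknown agents from forbidden ones.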

---

Nitpick comments:
In `@crates/aisix-proxy/src/a2a.rs`:
- Around line 138-172: The quota gate in `a2a` is applied too late, after
`to_bytes` and `serde_json::from_slice` have already buffered and parsed the
full request body. Move the `crate::quota::enforce` call in the `a2a` request
handling path to run immediately after auth is available and before body
parsing, using the existing `method` fallback behavior for quota-failure usage
events if the body has not been inspected yet. Keep the current error/usage
reporting flow intact in the `a2a` handler while avoiding unnecessary work for
rejected requests.
- Around line 212-251: a2a_agent_card is missing the same request
instrumentation used by a2a_endpoint, so card fetches never emit AccessLog or
update state.metrics.record_request. Wrap the handler logic in the same
per-request logging/metrics flow as dispatch in a2a_endpoint, using the
a2a_agent_card function as the entry point and recording the final status and
latency for both success and early-return paths.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fc715ddb-a499-4be3-b7e5-5675fd6bcefb

📥 Commits

Reviewing files that changed from the base of the PR and between 8bbc6a7 and 83bbb42.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (27)
  • Cargo.toml
  • crates/aisix-a2a/Cargo.toml
  • crates/aisix-a2a/src/bridge.rs
  • crates/aisix-a2a/src/error.rs
  • crates/aisix-a2a/src/lib.rs
  • crates/aisix-a2a/tests/upstream_roundtrip.rs
  • crates/aisix-admin/src/a2a_agents_handlers.rs
  • crates/aisix-admin/src/apikeys_handlers.rs
  • crates/aisix-admin/src/etcd_store.rs
  • crates/aisix-admin/src/lib.rs
  • crates/aisix-admin/src/openapi.rs
  • crates/aisix-admin/src/store.rs
  • crates/aisix-core/src/bin/dump-schema.rs
  • crates/aisix-core/src/lib.rs
  • crates/aisix-core/src/models/a2a_agent.rs
  • crates/aisix-core/src/models/apikey.rs
  • crates/aisix-core/src/models/mod.rs
  • crates/aisix-core/src/models/schema.rs
  • crates/aisix-core/src/models/snapshot.rs
  • crates/aisix-etcd/src/loader.rs
  • crates/aisix-etcd/src/supervisor.rs
  • crates/aisix-obs/src/usage.rs
  • crates/aisix-proxy/Cargo.toml
  • crates/aisix-proxy/src/a2a.rs
  • crates/aisix-proxy/src/lib.rs
  • schemas/resources/a2a_agent.schema.json
  • schemas/resources/api_key.schema.json

Comment thread crates/aisix-a2a/src/bridge.rs
Comment on lines +45 to +58
pub async fn create_a2a_agent(
    _auth: AdminAuth,
    State(state): State<AdminState>,
    Json(raw): Json<Value>,
) -> Result<Json<ResourceEntry<A2aAgent>>, AdminError> {
    let agent = decode(&raw)?;
    let all = state.store.list_a2a_agents().await?;
    assert_unique_display_name(&all, &agent.display_name, None)?;

    let id = Uuid::new_v4().to_string();
    let entry = ResourceEntry::new(&id, agent, STARTING_REVISION);
    state.store.put_a2a_agent(entry.clone()).await?;
    Ok(Json(entry))
}

🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '\n## File outline\n'
ast-grep outline crates/aisix-admin/src/a2a_agents_handlers.rs --view expanded || true

printf '\n## Relevant symbol search\n'
rg -n "assert_unique_display_name|list_a2a_agents|put_a2a_agent|update_a2a_agent|display_name|/a2a/" crates/aisix-admin/src crates -g '!target' || true

printf '\n## Inspect handler file\n'
cat -n crates/aisix-admin/src/a2a_agents_handlers.rs | sed -n '1,260p'

printf '\n## Find store trait/impls for A2A agents\n'
rg -n "trait .*Store|struct .*Store|fn list_a2a_agents|fn put_a2a_agent|A2aAgent" crates -g '!target' || true

Repository: api7/aisix

Length of output: 50367


🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '\n## a2a handler excerpt\n'
sed -n '1,170p' crates/aisix-admin/src/a2a_agents_handlers.rs

printf '\n## store trait excerpt\n'
sed -n '1,130p' crates/aisix-admin/src/store.rs

printf '\n## store A2A methods excerpt\n'
sed -n '220,320p' crates/aisix-admin/src/store.rs

printf '\n## etcd store A2A methods excerpt\n'
sed -n '340,430p' crates/aisix-admin/src/etcd_store.rs

printf '\n## a2a model excerpt\n'
sed -n '1,120p' crates/aisix-core/src/models/a2a_agent.rs

Repository: api7/aisix

Length of output: 23920


Avoid the check-then-put uniqueness pattern. Both create_a2a_agent and update_a2a_agent read all agents, check display_name, then write in separate steps. Concurrent requests can still persist duplicate display_name values, which breaks the /a2a/<display_name> routing invariant. Enforce uniqueness atomically in the store or via a unique index.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aisix-admin/src/a2a_agents_handlers.rs` around lines 45 - 58, The
uniqueness check in create_a2a_agent (and the similar update_a2a_agent flow) is
currently a read-then-write pattern that can race under concurrent requests.
Move the display_name uniqueness enforcement into the store layer so
put_a2a_agent/update logic performs an atomic constraint check, or back it with
a unique index, and have create_a2a_agent rely on that atomic failure instead of
calling list_a2a_agents plus assert_unique_display_name before writing.
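
A minimal sketch of the "atomic constraint check" idea, assuming a hypothetical in-memory `AgentStore` in place of the real store trait (the type, its `put` method, and `StoreError` are illustrative, not the PR's API). The check and the write share one lock acquisition, so two concurrent writers cannot both pass the check. With etcd, the analogous move would presumably be a transaction guarded by a compare on a name-index key.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical error type; the real handlers return `AdminError`.
#[derive(Debug, PartialEq, Eq)]
pub enum StoreError {
    DuplicateDisplayName(String),
}

/// Both maps live behind one lock: id -> name, and name -> id (the "unique index").
#[derive(Default)]
struct Inner {
    name_by_id: HashMap<String, String>,
    id_by_name: HashMap<String, String>,
}

/// Hypothetical in-memory stand-in for the admin store.
#[derive(Default)]
pub struct AgentStore {
    inner: Mutex<Inner>,
}

impl AgentStore {
    /// Create or update agent `id` with `display_name`.
    /// The uniqueness check and the write happen while the lock is held,
    /// so no other writer can interleave between them.
    pub fn put(&self, id: &str, display_name: &str) -> Result<(), StoreError> {
        let mut inner = self.inner.lock().unwrap();

        // Reject the name if a *different* agent already owns it.
        if let Some(owner) = inner.id_by_name.get(display_name) {
            if owner.as_str() != id {
                return Err(StoreError::DuplicateDisplayName(display_name.to_string()));
            }
        }

        // On rename, release the previous name so it can be reused.
        let previous = inner
            .name_by_id
            .insert(id.to_string(), display_name.to_string());
        if let Some(old_name) = previous {
            inner.id_by_name.remove(&old_name);
        }
        inner
            .id_by_name
            .insert(display_name.to_string(), id.to_string());
        Ok(())
    }
}
```

In this shape the handler would call `put` directly and map `DuplicateDisplayName` to a conflict response, instead of listing all agents first.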

Comment on lines +1909 to +2003
"put": {
  "summary": "Update A2A Agent by ID",
  "requestBody": {
    "required": true,
    "content": {
      "application/json": {
        "schema": {
          "$ref": "#/components/schemas/A2aAgent"
        }
      }
    },
    "description": "Replacement upstream A2A agent configuration."
  },
  "responses": {
    "200": {
      "description": "OK",
      "content": {
        "application/json": {
          "schema": {
            "$ref": "#/components/schemas/A2aAgentEntry"
          }
        }
      }
    },
    "400": {
      "description": "Schema validation failed, the JSON body is malformed, `display_name` contains a `/` (it is the agent's URL path segment), or the credentials required by `auth_type` are missing (`secret` for `bearer`/`api_key`; `client_id`, `token_url`, and `secret` for `oauth2`)",
      "content": {
        "application/json": {
          "schema": {
            "$ref": "#/components/schemas/AdminError"
          }
        },
        "text/plain": {
          "schema": {
            "type": "string"
          }
        }
      }
    },
    "401": {
      "description": "Missing or invalid admin key",
      "content": {
        "application/json": {
          "schema": {
            "$ref": "#/components/schemas/AdminError"
          }
        }
      }
    },
    "404": {
      "description": "Resource not found",
      "content": {
        "application/json": {
          "schema": {
            "$ref": "#/components/schemas/AdminError"
          }
        }
      }
    },
    "413": {
      "description": "JSON request body exceeds the admin body-size limit",
      "content": {
        "text/plain": {
          "schema": {
            "type": "string"
          }
        }
      }
    },
    "415": {
      "description": "Missing or unsupported JSON content type",
      "content": {
        "text/plain": {
          "schema": {
            "type": "string"
          }
        }
      }
    },
    "500": {
      "description": "Configuration store operation failed",
      "content": {
        "application/json": {
          "schema": {
            "$ref": "#/components/schemas/AdminError"
          }
        }
      }
    }
  },
  "tags": [
    "A2A Agents"
  ],
  "description": "Update an upstream A2A agent resource by ID. The gateway validates the payload, rejects duplicate `display_name` values, preserves the resource ID, and increments the revision."
},

🗄️ Data Integrity & Integration | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate A2A agent docs and handlers
rg -n "A2aAgent|a2a_agents|Duplicate display_name|display_name" crates/aisix-admin/src -S

# Show the relevant openapi section around the PUT endpoint
sed -n '1860,2025p' crates/aisix-admin/src/openapi.rs

# Find the route/handler implementation references if any
rg -n "update.*A2A|A2A.*update|a2a agent|A2A Agent" crates -S

Repository: api7/aisix

Length of output: 23184


🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '1,180p' crates/aisix-admin/src/a2a_agents_handlers.rs

Repository: api7/aisix

Length of output: 6463


Add the missing 409 Duplicate display_name response to PUT /admin/v1/a2a_agents/{id}. The handler rejects duplicate display_name values on update, but the OpenAPI response list omits the conflict case that POST already documents.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aisix-admin/src/openapi.rs` around lines 1909 - 2003, Add the missing
409 conflict response to the PUT handler entry for the A2A agent update
operation in openapi.rs. The A2A agent update documentation currently lists
validation, auth, not found, body-size, content-type, and store errors, but
omits the duplicate display_name case handled by the update path. Update the
responses section for the A2aAgent update endpoint to include a 409 Duplicate
display_name response, matching the conflict semantics already documented for
the POST flow and keeping the A2A Agents operation description consistent.

Source: Coding guidelines

Comment on lines +18 to +24
use serde::{Deserialize, Serialize};

use crate::resource::Resource;

#[derive(Debug, Clone, Serialize, Deserialize, schemars::JsonSchema, PartialEq, Eq)]
#[serde(deny_unknown_fields)]
pub struct A2aAgent {

📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Struct is missing a doc comment, so the generated OpenAPI schema will have no description.

The file has a rich //! module doc, but schemars::JsonSchema derives the schema description from doc comments placed directly on the struct, not the module. Since A2aAgent has no /// comment immediately above it, the generated resource schema will lack a top-level description, unlike (presumably) other resource models. As per coding guidelines, crates/aisix-core/src/models/**/*.rs should "write public API reference text in comments" for Admin API resource models.

📝 Proposed fix
+/// A registered upstream A2A (Agent-to-Agent) agent. The gateway fronts it at
+/// `/a2a/<display_name>`, serving its agent card and routing
+/// `message/send`/`message/stream` through the same auth/ACL/quota pipeline
+/// used for LLM and MCP traffic.
 #[derive(Debug, Clone, Serialize, Deserialize, schemars::JsonSchema, PartialEq, Eq)]
 #[serde(deny_unknown_fields)]
 pub struct A2aAgent {

After this change, regenerate schemas/resources/a2a_agent.schema.json with cargo run -p aisix-core --bin dump-schema as instructed by the coding guidelines.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aisix-core/src/models/a2a_agent.rs` around lines 18 - 24, Add a direct
doc comment on the public A2aAgent struct so schemars::JsonSchema can populate
the OpenAPI description from the type itself, since the existing module-level
//! docs are not used for this. Update the comment immediately above A2aAgent in
the model definition to describe the resource clearly, matching the style used
by other Admin API resource models. After documenting the struct, regenerate the
A2aAgent schema output so the new description is reflected in
schemas/resources/a2a_agent.schema.json.

Source: Coding guidelines

Comment on lines +49 to +56
/// Authentication credential for the upstream agent. Its meaning follows
/// `auth_type`: the bearer token when `auth_type` is `bearer` (sent as
/// `Authorization: Bearer <secret>`), the API key when `auth_type` is
/// `api_key` (sent as `x-api-key: <secret>`), or the OAuth client secret
/// when `auth_type` is `oauth2`. Leave unset when `auth_type` is `none`.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub secret: Option<String>,

🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '\n== a2a_agent.rs outline ==\n'
ast-grep outline crates/aisix-core/src/models/a2a_agent.rs --view expanded || true

printf '\n== a2a_agents_handlers.rs outline ==\n'
fd -a 'a2a_agents_handlers.rs' crates/aisix-core/src || true
for f in $(fd -a 'a2a_agents_handlers.rs' crates/aisix-core/src); do
  ast-grep outline "$f" --view expanded || true
done

printf '\n== focused model lines ==\n'
nl -ba crates/aisix-core/src/models/a2a_agent.rs | sed -n '1,180p'

printf '\n== focused handler lines ==\n'
for f in $(fd -a 'a2a_agents_handlers.rs' crates/aisix-core/src); do
  nl -ba "$f" | sed -n '1,260p'
done

printf '\n== search for ResourceEntry / serde redaction / secret fields ==\n'
rg -n "ResourceEntry<|skip_serializing_if|serde\\(|secret|client_id|token_url|list_a2a_agents|get_a2a_agent" crates/aisix-core/src -S

Repository: api7/aisix

Length of output: 1492


🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '\n== files matching a2a ==\n'
git ls-files | rg 'a2a|A2a' -n || true

printf '\n== a2a_agent.rs with line numbers ==\n'
cat -n crates/aisix-core/src/models/a2a_agent.rs | sed -n '1,280p'

printf '\n== search for handlers and response types ==\n'
rg -n "list_a2a_agents|get_a2a_agent|ResourceEntry<A2aAgent>|A2aAgent" crates/aisix-core/src -S

printf '\n== nearby model imports/usages ==\n'
ast-grep outline crates/aisix-core/src/models/a2a_agent.rs --view expanded || true

Repository: api7/aisix

Length of output: 16888


🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '\n== locate handler and entry types ==\n'
fd -a 'a2a_agents_handlers.rs|resource_entry.rs|mod.rs' crates/aisix-admin crates/aisix-core/src -E target || true
rg -n "struct ResourceEntry|type ResourceEntry|enum ResourceEntry|list_a2a_agents|get_a2a_agent|A2aAgent" crates/aisix-admin crates/aisix-core/src -S

printf '\n== read handler file ==\n'
for f in $(fd -a 'a2a_agents_handlers.rs' crates/aisix-admin/src); do
  wc -l "$f"
  cat -n "$f" | sed -n '1,260p'
done

printf '\n== read ResourceEntry definition if present ==\n'
for f in $(fd -a 'resource_entry.rs' crates/aisix-core/src crates/aisix-admin/src); do
  wc -l "$f"
  cat -n "$f" | sed -n '1,260p'
done

Repository: api7/aisix

Length of output: 15647


🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '\n== crates/aisix-core/src/resource.rs ==\n'
cat -n crates/aisix-core/src/resource.rs | sed -n '1,220p'

printf '\n== search for any redaction / masking on A2aAgent or ResourceEntry ==\n'
rg -n "redact|mask|secret.*skip_serializing|skip_serializing_if.*secret|A2aAgentEntry|ResourceEntry<.*Serialize|serde.*flatten" crates/aisix-core crates/aisix-admin -S

printf '\n== schema snippets for A2aAgentEntry / A2aAgent ==\n'
cat -n crates/aisix-admin/src/openapi.rs | sed -n '3620,3675p'

Repository: api7/aisix

Length of output: 10637


Redact upstream credentials from Admin API read responses
list_a2a_agents, get_a2a_agent, create_a2a_agent, and update_a2a_agent return ResourceEntry<A2aAgent> directly, so any stored secret is sent back in plaintext. Omit secret from read output or use a separate response type; keep client_id/token_url only if they’re meant to be visible.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aisix-core/src/models/a2a_agent.rs` around lines 49 - 56, The Admin
API read paths are exposing stored upstream credentials because
`ResourceEntry<A2aAgent>` is returned directly from `list_a2a_agents`,
`get_a2a_agent`, `create_a2a_agent`, and `update_a2a_agent`. Update the
`A2aAgent` response flow so `secret` is omitted from serialized read output,
either by introducing a separate public/read DTO or by mapping the existing
model before returning it; use the `A2aAgent` struct and those four handler
methods as the main places to adjust.
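
One way to picture the "separate public/read DTO" option is sketched below. It assumes a trimmed-down `A2aAgent` (only the credential-related fields, with `auth_type` simplified to a `String`) and a hypothetical `PublicA2aAgent` view named by analogy with `PublicApiKey`; neither matches the real model field for field. Because the view type has no `secret` field at all, a read handler that returns it cannot leak the credential by accident.

```rust
/// Simplified stand-in for the stored model (assumption: the real struct has
/// more fields and a typed `auth_type`).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct A2aAgent {
    pub display_name: String,
    pub auth_type: String,
    pub secret: Option<String>,
    pub client_id: Option<String>,
    pub token_url: Option<String>,
}

/// Hypothetical read-side view. It carries no `secret`, only a flag saying
/// whether one is configured, so operators can still see the auth setup.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PublicA2aAgent {
    pub display_name: String,
    pub auth_type: String,
    pub has_secret: bool,
    pub client_id: Option<String>,
    pub token_url: Option<String>,
}

impl From<&A2aAgent> for PublicA2aAgent {
    fn from(agent: &A2aAgent) -> Self {
        PublicA2aAgent {
            display_name: agent.display_name.clone(),
            auth_type: agent.auth_type.clone(),
            // Expose presence, never the value.
            has_secret: agent.secret.is_some(),
            client_id: agent.client_id.clone(),
            token_url: agent.token_url.clone(),
        }
    }
}
```

Whether `client_id` and `token_url` belong in the view is the open question the comment raises; they are kept here only to show the mapping.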

Comment on lines +95 to +172
async fn dispatch(
    auth: AuthenticatedKey,
    agent: &str,
    state: &ProxyState,
    request: Request,
    request_id: &str,
) -> Response {
    // Resolve the agent from the live snapshot. A disabled agent is treated as
    // absent — not served, same as a missing one.
    let snapshot = state.snapshot.load();
    let entry = match snapshot.a2a_agents.get_by_name(agent) {
        Some(entry) if entry.value.enabled => entry,
        _ => return (StatusCode::NOT_FOUND, format!("unknown A2A agent: {agent}")).into_response(),
    };

    // Per-agent access control, keyed on the same API key object as LLM/MCP
    // access. A key with no `allowed_agents` reaches none (grant is explicit).
    if !auth.key().can_access_agent(agent) {
        return (
            StatusCode::FORBIDDEN,
            format!("this key may not reach A2A agent: {agent}"),
        )
            .into_response();
    }

    let upstream = match upstream_from_a2a_agent(&entry.value) {
        Ok(upstream) => upstream,
        // Currently only oauth2 upstream auth, which the runtime does not
        // implement yet — surface it as "not implemented".
        Err(err) => {
            emit_a2a_usage(
                state,
                &auth,
                request_id,
                agent,
                "",
                StatusCode::NOT_IMPLEMENTED.as_u16(),
                Duration::ZERO,
            );
            return (StatusCode::NOT_IMPLEMENTED, err.to_string()).into_response();
        }
    };

    let (_parts, body) = request.into_parts();
    let bytes = match to_bytes(body, state.request_body_limit_bytes).await {
        Ok(bytes) => bytes,
        Err(_) => return (StatusCode::BAD_REQUEST, "invalid request body").into_response(),
    };
    let value: serde_json::Value = match serde_json::from_slice(&bytes) {
        Ok(value) => value,
        Err(_) => return (StatusCode::BAD_REQUEST, "invalid JSON-RPC body").into_response(),
    };
    let peek = serde_json::from_slice::<JsonRpcPeek>(&bytes).ok();
    let method = peek
        .as_ref()
        .and_then(|p| p.method.clone())
        .unwrap_or_default();
    let rpc_id = peek.and_then(|p| p.id);

    // Reuse the LLM path's rate-limit + budget gate. The reservation is held
    // for the call and dropped after (an A2A call carries no token cost yet).
    // On 429 / budget-exceeded this returns before the upstream is contacted.
    let _reservation = match crate::quota::enforce(state, &auth, None).await {
        Ok(reservation) => reservation,
        Err(err) => {
            let response = err.into_response();
            emit_a2a_usage(
                state,
                &auth,
                request_id,
                agent,
                &method,
                response.status().as_u16(),
                Duration::ZERO,
            );
            return response;
        }
    };

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Usage events aren't emitted for 404 / 403 / 400 rejections.

dispatch emits emit_a2a_usage for the oauth2-unsupported (Lines 125-135), quota-rejected (Lines 159-171), and post-forward success/failure (Lines 181-204) paths, but the earlier 404 "unknown agent" (Line 107), 403 "ACL denied" (Lines 113-118), and 400 "invalid body" (Lines 141, 145) returns skip it entirely. Since the PR's stated goal is to route A2A traffic through the same usage-reporting pipeline as LLM/MCP, this creates blind spots in dashboards/billing for ACL denials and malformed requests specifically, while other rejection classes are fully tracked.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aisix-proxy/src/a2a.rs` around lines 95 - 172, The dispatch path in
a2a::dispatch is missing usage reporting for early rejections, so 404 unknown
agent, 403 ACL denied, and 400 invalid body responses should also call
emit_a2a_usage before returning. Add the same usage event emission used in the
oauth2 and quota error paths, using the current request_id, agent, resolved
method when available (or empty string for pre-parse failures), the actual
response status, and Duration::ZERO so these failures are tracked consistently
with the rest of the A2A pipeline.
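
The shape of that fix can be sketched with a single `reject` helper that every early return goes through. Everything below is a simplified stand-in and an assumption for illustration: `UsageSink`, `UsageRecord`, `reject`, and this reduced `dispatch` signature are hypothetical, and status codes are plain `u16` values rather than `StatusCode`.

```rust
use std::sync::Mutex;
use std::time::Duration;

/// Hypothetical, trimmed-down usage event.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct UsageRecord {
    pub agent: String,
    pub method: String,
    pub status: u16,
    pub latency: Duration,
}

/// Hypothetical stand-in for the shared usage sink.
#[derive(Default)]
pub struct UsageSink {
    events: Mutex<Vec<UsageRecord>>,
}

impl UsageSink {
    pub fn emit(&self, agent: &str, method: &str, status: u16, latency: Duration) {
        self.events.lock().unwrap().push(UsageRecord {
            agent: agent.to_string(),
            method: method.to_string(),
            status,
            latency,
        });
    }

    pub fn statuses(&self) -> Vec<u16> {
        self.events.lock().unwrap().iter().map(|e| e.status).collect()
    }
}

/// Record the rejection, then hand the status back to the caller.
/// Routing every early return through here keeps the paths from drifting.
fn reject(sink: &UsageSink, agent: &str, method: &str, status: u16) -> u16 {
    sink.emit(agent, method, status, Duration::ZERO);
    status
}

/// Reduced dispatch: `method` is `None` when the body failed to parse.
pub fn dispatch(
    sink: &UsageSink,
    agent: &str,
    known: bool,
    allowed: bool,
    method: Option<&str>,
) -> u16 {
    if !known {
        return reject(sink, agent, "", 404);
    }
    if !allowed {
        return reject(sink, agent, "", 403);
    }
    let method = match method {
        Some(m) => m,
        // Pre-parse failure: no method is known yet, so report an empty one.
        None => return reject(sink, agent, "", 400),
    };
    // Upstream forwarding is elided; a real handler would measure latency here.
    sink.emit(agent, method, 200, Duration::ZERO);
    200
}
```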

Comment on lines +104 to +118
let snapshot = state.snapshot.load();
let entry = match snapshot.a2a_agents.get_by_name(agent) {
    Some(entry) if entry.value.enabled => entry,
    _ => return (StatusCode::NOT_FOUND, format!("unknown A2A agent: {agent}")).into_response(),
};

// Per-agent access control, keyed on the same API key object as LLM/MCP
// access. A key with no `allowed_agents` reaches none (grant is explicit).
if !auth.key().can_access_agent(agent) {
    return (
        StatusCode::FORBIDDEN,
        format!("this key may not reach A2A agent: {agent}"),
    )
        .into_response();
}

🔒 Security & Privacy | 🟡 Minor | ⚡ Quick win

Agent existence is distinguishable from ACL denial.

A 404 (unknown/disabled agent) is returned before the ACL check (403), so any caller holding a valid API key — even one authorized for no agents — can distinguish "agent doesn't exist" from "agent exists but I can't reach it," enabling limited agent-name enumeration. Checking the ACL before (or independent of) resolving the agent, or collapsing both cases to the same status, would close this.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/aisix-proxy/src/a2a.rs` around lines 104 - 118, The A2A handler in
a2a.rs currently returns 404 from the snapshot/a2a_agents lookup before running
the auth.key().can_access_agent check, which lets callers distinguish
unknown/disabled agents from ACL failures. Rework the A2A request path in this
handler so authorization is evaluated before revealing whether an agent exists,
or otherwise collapse both outcomes to the same response status/message; use the
existing snapshot, entry lookup, and can_access_agent logic to keep the check
centralized.
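
As a rough illustration of the "authorization before existence" ordering, the sketch below checks the key's grant first and only then looks the agent up. The `Key` type, the `resolve` function, and the name-to-enabled map are assumptions made for the example; the real code works on `AuthenticatedKey` and the snapshot table.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the API key's agent ACL.
pub struct Key {
    pub allowed_agents: HashSet<String>,
}

impl Key {
    /// Default-deny: only explicitly granted names are reachable.
    pub fn can_access_agent(&self, agent: &str) -> bool {
        self.allowed_agents.contains(agent)
    }
}

/// Resolve an agent for a caller. `agents` maps name -> enabled.
/// Returns `Ok(())` when the call may proceed, or the HTTP status to send.
pub fn resolve(key: &Key, agents: &HashMap<String, bool>, agent: &str) -> Result<(), u16> {
    // ACL first: a key without a grant gets 403 whether or not the agent
    // exists, so the response says nothing about existence.
    if !key.can_access_agent(agent) {
        return Err(403);
    }
    // Only a key that was granted this name can learn that it is missing
    // or disabled.
    match agents.get(agent).copied() {
        Some(true) => Ok(()),
        _ => Err(404),
    }
}
```

With this ordering a key granted no agents sees 403 for every name, so probing no longer separates existing agents from unknown ones.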

Independent audit of #717 raised four MEDIUM items; all addressed:

- SSRF: the outbound client now refuses redirects, so a compromised or
  MITM'd upstream cannot 302 the VPC-internal data plane into fetching an
  internal address (metadata endpoint, loopback). Mirrors the MCP OAuth
  client. Covered by a redirect-refused roundtrip test.
- Connection reuse: the reqwest client is built once (OnceLock) and shared
  instead of per-request, so calls reuse the pool + TLS session. The
  per-agent timeout is still applied per-request.
- OOM guard: upstream response bodies are read through a 16 MiB streaming
  cap (honest oversized Content-Length rejected up front; a lying/absent
  length or endless stream caught as chunks accumulate).
- Endpoint coverage: add oneshot integration tests for /a2a/:agent —
  default-deny 403, disabled/unknown agent 404, missing-key 401, and the
  agent-card URL rewrite (rewrites `url`, preserves every other field).

Also pin "surface the upstream status only, never its body" in the
error path so the Phase-2 error-translation work can't regress it.
@moonming

moonming commented Jul 3, 2026

Collaborator Author

Audit follow-up — all findings addressed (4bba8c2)

An independent audit flagged 4 MEDIUM + 2 LOW (no CRITICAL/HIGH). Resolutions:

| # | Finding | Resolution |
|---|---------|------------|
| M1 | Upstream client follows redirects → SSRF pivot from the VPC-internal DP | Client now built with redirect::Policy::none() (mirrors the MCP OAuth client). A 302 surfaces as a non-success error, never followed. New refuses_to_follow_upstream_redirect roundtrip test asserts the redirect target is not fetched. |
| M2 | reqwest::Client::new() per request → no pool reuse | Client built once via OnceLock and shared; the per-agent timeout is still applied per-request. |
| M3 | Unbounded resp.json() → OOM from a malicious/streaming upstream | Bodies read through a 16 MiB streaming cap: an honest oversized Content-Length is rejected up front; a lying/absent length or endless stream is caught as chunks accumulate. |
| M4 | Endpoint glue (dispatch/card rewrite) had no e2e coverage | Added oneshot integration tests against build_router: default-deny 403, disabled agent 404, unknown agent 404, missing key 401, and agent-card URL rewrite (rewrites url, preserves name/version/skills). |
| L2 | Risk that a future change proxies the upstream error body verbatim | Pinned a comment in the error path: surface the upstream status only, never its body. |
| L1 | 404-before-403 is a name-enumeration oracle | Left as-is, justified: the caller is already past 401, agent names are operator labels (not secrets), and this matches the reference gateway's behavior (LiteLLM also returns 403 here). Flagging rather than changing. |

All green after the fix: aisix-a2a 6 unit + 4 roundtrip (incl. redirect guard), aisix-proxy 8 a2a (incl. 5 new endpoint tests), clippy -D warnings clean, aisix-server binary builds.
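
The build-once pattern behind M2 can be sketched with `std::sync::OnceLock` alone. The `Client` struct here is a placeholder for `reqwest::Client`, and `shared_client`, `build_count`, and `call_with_timeout` are hypothetical names; the point is only that the client is constructed on first use and the timeout stays a per-call parameter.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;
use std::time::Duration;

/// Placeholder for an HTTP client (the PR uses `reqwest::Client`).
pub struct Client {
    pub id: usize,
}

/// Counts how many times the client was actually constructed.
static BUILDS: AtomicUsize = AtomicUsize::new(0);
/// Process-wide client, initialized lazily on first use.
static CLIENT: OnceLock<Client> = OnceLock::new();

/// Return the shared client, building it only on the first call.
/// Concurrent first callers block until the single initializer finishes.
pub fn shared_client() -> &'static Client {
    CLIENT.get_or_init(|| {
        let n = BUILDS.fetch_add(1, Ordering::SeqCst) + 1;
        Client { id: n }
    })
}

pub fn build_count() -> usize {
    BUILDS.load(Ordering::SeqCst)
}

/// A pending call: shared client, per-call timeout.
pub struct PendingCall {
    pub client_id: usize,
    pub timeout: Duration,
}

/// The timeout is attached to the call, not baked into the shared client,
/// so each agent can keep its own value.
pub fn call_with_timeout(timeout: Duration) -> PendingCall {
    PendingCall {
        client_id: shared_client().id,
        timeout,
    }
}
```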

@moonming

moonming commented Jul 3, 2026

Collaborator Author

Independent re-audit of 4bba8c2 — merge gate satisfied. A fresh adversarial pass confirmed all four MEDIUM findings are genuinely resolved with no new HIGH/MEDIUM introduced: redirects disabled on both card-fetch and send paths (SSRF closed); shared client keeps the per-agent timeout; the 16 MiB streaming cap covers honest/lying/endless/gzip-decompressed bodies with no bypass; and the new endpoint tests are regression-tight (the 403/404 cases target an unreachable upstream, so an ACL/enabled bypass would surface as 502, not the asserted status). L1 (404-before-403) left as a justified LOW. Final green rests on this PR's CI (locally: all touched-crate tests + clippy -D warnings + aisix-server build pass).
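
For readers unfamiliar with the streaming-cap technique, here is a minimal synchronous sketch over `std::io::Read`. The PR's version is async and reads an HTTP response; `read_capped`, `BodyError`, and the 8 KiB chunk size are assumptions for illustration. It shows the two checks described above: an honest oversized declared length is rejected before any read, and a lying, absent, or endless length is caught as chunks accumulate.

```rust
use std::io::{self, Read};

/// 16 MiB, the cap quoted in the audit follow-up.
pub const MAX_BODY_BYTES: usize = 16 * 1024 * 1024;

#[derive(Debug)]
pub enum BodyError {
    TooLarge,
    Io(io::Error),
}

/// Read `body` fully, refusing to buffer more than `cap` bytes.
/// `declared_len` plays the role of an optional Content-Length header.
pub fn read_capped<R: Read>(
    mut body: R,
    declared_len: Option<u64>,
    cap: usize,
) -> Result<Vec<u8>, BodyError> {
    // Honest oversized length: reject up front without reading anything.
    if let Some(len) = declared_len {
        if len > cap as u64 {
            return Err(BodyError::TooLarge);
        }
    }

    let mut out = Vec::new();
    let mut chunk = [0u8; 8192];
    loop {
        let n = body.read(&mut chunk).map_err(BodyError::Io)?;
        if n == 0 {
            return Ok(out); // end of stream
        }
        // Lying/absent length or endless stream: stop once the cap would
        // be exceeded, before the bytes are appended.
        if out.len() + n > cap {
            return Err(BodyError::TooLarge);
        }
        out.extend_from_slice(&chunk[..n]);
    }
}
```

A caller would pass `MAX_BODY_BYTES` as `cap`; the tests for a sketch like this can use a small cap and `std::io::repeat` as the endless stream.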
