feat(a2a): A2A (Agent-to-Agent) gateway — DP MVP #717
First slice of the A2A (Agent-to-Agent) gateway: register an upstream agent that speaks A2A over HTTP (JSON-RPC 2.0) as a first-class resource, mirroring McpServer. Adds a pinned protocol_version (1.0/0.3) and the same upstream auth shape (none/bearer/api_key/oauth2). Wired into the aisix-core model module and re-exported. No runtime path yet.
Hand-rolled JSON-RPC 2.0 client behind the A2aBridge trait (A2A has no official Rust SDK). Fetches the upstream agent card from its RFC 8615 well-known URI (/.well-known/agent-card.json) and forwards JSON-RPC requests verbatim to the agent's service endpoint, holding the upstream credential (none/bearer/api_key) so the calling client never sees it. Does not translate between the A2A 0.3 and 1.0 wire formats — a single agent is reached in whichever version it speaks (pinned on the resource). Tested against a locally spawned real A2A server: card fetch, message/send roundtrip, and credential forwarding for each auth type. oauth2 upstream auth is rejected as unsupported for now.
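The credential handling described above can be sketched roughly as follows. This is a minimal, std-only illustration and not the bridge's actual code: `UpstreamAuth`, `agent_card_url`, and `auth_header` are hypothetical names, the API-key header name is assumed to be configurable, and the base URL is assumed to be a bare origin with no path.

```rust
/// Hypothetical mirror of the upstream auth shape named in the commit
/// message (none/bearer/api_key/oauth2); not the real aisix type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum UpstreamAuth {
    None,
    Bearer { token: String },
    ApiKey { header: String, key: String },
    OAuth2,
}

/// Build the well-known agent-card URL from an agent base URL.
/// Assumption: `base` is an origin such as `http://host:port`.
pub fn agent_card_url(base: &str) -> String {
    format!("{}/.well-known/agent-card.json", base.trim_end_matches('/'))
}

/// Return the header (name, value) to attach to the upstream request, if any.
/// oauth2 is rejected, matching "rejected as unsupported for now".
pub fn auth_header(auth: &UpstreamAuth) -> Result<Option<(String, String)>, String> {
    match auth {
        UpstreamAuth::None => Ok(None),
        UpstreamAuth::Bearer { token } => {
            Ok(Some(("authorization".to_string(), format!("Bearer {token}"))))
        }
        UpstreamAuth::ApiKey { header, key } => Ok(Some((header.clone(), key.clone()))),
        UpstreamAuth::OAuth2 => Err("oauth2 upstream auth is not supported yet".to_string()),
    }
}
```

Because the header is computed inside the gateway and only attached to the outbound request, the calling client never needs to hold the upstream credential.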
Make the A2aAgent resource reachable on the hot path: add the a2a_agents table to AisixSnapshot, wire the etcd loader + watch supervisor to populate and maintain it (mirroring mcp_servers), add the JSON Schema validator (validate_a2a_agent) plus the dumped schemas/resources file, and add per-agent access control to ApiKey (allowed_agents + can_access_agent, mirroring allowed_tools). The DP admin ApiKey body and response carry allowed_agents so the ACL is configurable there too. Read/serve plumbing only — the /a2a proxy endpoint and admin CRUD follow.
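A rough sketch of the per-agent ACL check, based on the `allowed_agents` semantics quoted later in this review (single-`*` globs, `"*"` grants everything, omitted or `null` grants nothing). The function bodies are illustrative only; the real `can_access_agent` lives on `ApiKey`, and the handling of entries with more than one `*` is an assumption made here.

```rust
/// Match one ACL entry against an agent name. An entry holds at most one `*`.
pub fn entry_matches(entry: &str, agent: &str) -> bool {
    match entry.split_once('*') {
        // No wildcard: the entry must name the agent exactly.
        None => entry == agent,
        Some((prefix, suffix)) => {
            // Assumption: an entry with a second `*` never matches.
            !suffix.contains('*')
                // Prefix and suffix must not overlap inside the agent name.
                && agent.len() >= prefix.len() + suffix.len()
                && agent.starts_with(prefix)
                && agent.ends_with(suffix)
        }
    }
}

/// `None` models an omitted or `null` `allowed_agents`: no A2A access.
pub fn can_access_agent(allowed_agents: Option<&[String]>, agent: &str) -> bool {
    match allowed_agents {
        None => false,
        Some(entries) => entries.iter().any(|e| entry_matches(e, agent)),
    }
}
```

Defaulting to "no access" when the field is absent means an existing key gains nothing from this change until an operator lists agents explicitly.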
Front each registered A2A agent through /a2a/<agent>: forward JSON-RPC requests to the upstream via the aisix-a2a bridge, and serve its card — with the advertised service URL rewritten to this gateway — at the RFC 8615 well-known path. Every call runs the same governance as LLM and MCP traffic: AuthenticatedKey (401), per-agent ACL via allowed_agents (403), quota::enforce rate-limit + budget (429), and a usage event (a2a_agent_name / a2a_method) into the shared sink. The body is forwarded verbatim, so no 0.3<->1.0 translation happens here. Guardrails over A2A message content are a later step.
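The order of the governance checks listed above can be pictured with a small sketch. The types here (`Checks`, `Rejection`, `govern`) are hypothetical and flatten each check to a boolean; the real handler also resolves the agent and emits a usage event, which this sketch leaves out.

```rust
/// Hypothetical rejection reasons, in the order the commit message lists them.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Rejection {
    Unauthenticated,
    AclDenied,
    QuotaExceeded,
}

impl Rejection {
    /// HTTP status named for each check in the commit message.
    pub fn status(self) -> u16 {
        match self {
            Rejection::Unauthenticated => 401,
            Rejection::AclDenied => 403,
            Rejection::QuotaExceeded => 429,
        }
    }
}

/// Outcome of each check, reduced to a flag for illustration.
pub struct Checks {
    pub authenticated: bool,
    pub agent_allowed: bool,
    pub within_quota: bool,
}

/// Run the checks in order and stop at the first failure.
pub fn govern(c: &Checks) -> Result<(), Rejection> {
    if !c.authenticated {
        return Err(Rejection::Unauthenticated);
    }
    if !c.agent_allowed {
        return Err(Rejection::AclDenied);
    }
    if !c.within_quota {
        return Err(Rejection::QuotaExceeded);
    }
    Ok(())
}
```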
Add /admin/v1/a2a_agents CRUD, cloning the mcp_servers handlers: schema validation, duplicate display_name (409), uuid on POST, revision bump on PUT, and the same per-auth_type credential coupling. The display_name is the agent's URL path segment (/a2a/<name>), so it is rejected if it contains `/`. Wire the a2a_agents table through the ConfigStore trait (in-memory + etcd-backed, subkey "a2a_agents") and the admin router. OpenAPI documentation for the new paths follows in the next commit.
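The two `display_name` rules above (no `/`, no duplicates) could look something like the following. This is a sketch under assumptions: `check_display_name` and `NameError` are made-up names, and the stored agents are reduced to `(id, display_name)` pairs rather than full resource entries.

```rust
/// Hypothetical error type; the status codes follow the commit message.
#[derive(Debug, PartialEq, Eq)]
pub enum NameError {
    Invalid(&'static str),
    Duplicate,
}

impl NameError {
    pub fn status(&self) -> u16 {
        match self {
            NameError::Invalid(_) => 400,
            NameError::Duplicate => 409,
        }
    }
}

/// `existing` pairs each stored id with its display_name.
/// `self_id` is the id being updated on PUT, or `None` on POST.
pub fn check_display_name(
    existing: &[(String, String)],
    name: &str,
    self_id: Option<&str>,
) -> Result<(), NameError> {
    // The name becomes the `/a2a/<name>` path segment, so `/` is rejected.
    if name.contains('/') {
        return Err(NameError::Invalid("display_name must not contain '/'"));
    }
    // An agent may keep its own name on update; any other holder is a clash.
    let clash = existing
        .iter()
        .any(|(id, n)| n == name && Some(id.as_str()) != self_id);
    if clash {
        Err(NameError::Duplicate)
    } else {
        Ok(())
    }
}
```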
Add the a2a_agents collection and item paths (get/post, get/put/delete), the A2aAgentEntry response wrapper, and register the generated A2aAgent schema. Keeps the OpenAPI path set matching the admin router exactly (openapi_documents_exact_admin_path_set) and the resource schema components complete.
📝 Walkthrough
Adds an A2A (agent-to-agent) JSON-RPC gateway: a new
Changes
A2A gateway feature
Estimated code review effort: 4 (Complex) | ~75 minutes
Sequence Diagram(s)
sequenceDiagram
participant Caller
participant ProxyA2aEndpoint
participant HttpBridge
participant UpstreamAgent
Caller->>ProxyA2aEndpoint: POST /a2a/:agent (JSON-RPC)
ProxyA2aEndpoint->>ProxyA2aEndpoint: load agent, check ACL, enforce quota
ProxyA2aEndpoint->>HttpBridge: send(request)
HttpBridge->>UpstreamAgent: forward JSON-RPC (with auth)
UpstreamAgent-->>HttpBridge: JSON-RPC response
HttpBridge-->>ProxyA2aEndpoint: response
ProxyA2aEndpoint->>ProxyA2aEndpoint: emit UsageEvent
ProxyA2aEndpoint-->>Caller: JSON-RPC response
sequenceDiagram
participant Admin
participant A2aAgentsHandler
participant ConfigStore
Admin->>A2aAgentsHandler: POST /admin/v1/a2a_agents
A2aAgentsHandler->>A2aAgentsHandler: validate + decode payload
A2aAgentsHandler->>ConfigStore: check display_name uniqueness
A2aAgentsHandler->>ConfigStore: put_a2a_agent
ConfigStore-->>A2aAgentsHandler: stored entry
A2aAgentsHandler-->>Admin: ResourceEntry<A2aAgent>
Important: Pre-merge checks failed
Please resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error, 1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/aisix-admin/src/openapi.rs (1)
3488-3539: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
`ApiKeyRequest`/`PublicApiKey` OpenAPI schemas are missing the new `allowed_agents` field.
`apikeys_handlers.rs` added `allowed_agents` to both the request body and public response DTOs, but these component schemas here still only document `allowed_tools` and set `additionalProperties: false`. The generated OpenAPI contract no longer matches the actual accepted/returned payload shape.
As per coding guidelines: "After changing Admin API routes, OpenAPI metadata, or generated descriptions, verify the OpenAPI output ... and inspect the served OpenAPI if generated descriptions changed."
📝 Proposed fix (apply to both `ApiKeyRequest` and `PublicApiKey`)
 "allowed_tools": {
   "type": [
     "array",
     "null"
   ],
   "items": {
     "type": "string"
   },
   "description": "MCP tools this key may call, ..."
 },
+"allowed_agents": {
+  "type": [
+    "array",
+    "null"
+  ],
+  "items": {
+    "type": "string"
+  },
+  "description": "A2A agents this key may reach, named by their registered names. Entries are matched as single-`*` globs, mirroring `allowed_tools`: `\"*\"` grants every agent and an entry without a `*` matches one agent exactly. When omitted or set to `null`, the key has no A2A agent access — access is granted explicitly."
+},
 "expires_at": {
Also applies to: 3834-3890
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/aisix-admin/src/openapi.rs` around lines 3488 - 3539, The OpenAPI component schemas for ApiKeyRequest and PublicApiKey are missing the new allowed_agents property, so the documented contract no longer matches the DTOs used by apikeys_handlers.rs. Update both schema definitions to include allowed_agents with the correct array/null shape and description, and ensure it is listed alongside allowed_tools before keeping additionalProperties set to false. Verify the generated OpenAPI output still reflects the request and response models after the schema change.
Source: Coding guidelines
♻️ Duplicate comments (1)
crates/aisix-admin/src/a2a_agents_handlers.rs (1)
24-30: 🔒 Security & Privacy | 🟠 Major | 🏗️ Heavy lift
Secrets returned unmasked via list/get.
`list_a2a_agents`/`get_a2a_agent` return the full `A2aAgent` including `secret`/`token_url` fields in plaintext to any Admin API caller. See the related comment on the model file; consider redacting credential fields on read.
Also applies to: 32-43
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/aisix-admin/src/a2a_agents_handlers.rs` around lines 24 - 30, The read handlers for A2A agents are returning credential fields in plaintext, so redact sensitive data before serializing responses. Update list_a2a_agents and get_a2a_agent to mask or omit secret/token_url on the returned A2aAgent inside ResourceEntry, and keep the redaction logic consistent with the model’s handling so all read paths use the same sanitized representation.
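One way to picture the redaction suggested here is a mapping step applied before a stored agent is serialized on a read path. The sketch below is hypothetical: `AgentView` is a cut-down stand-in for `A2aAgent`, only `secret` is masked, and the mask string is an arbitrary choice.

```rust
/// Cut-down, hypothetical stand-in for the stored agent model.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AgentView {
    pub display_name: String,
    pub auth_type: String,
    pub secret: Option<String>,
}

/// Placeholder shown instead of a stored credential (assumed value).
pub const MASK: &str = "********";

/// Return a copy that is safe to serialize on list/get responses.
/// A present secret is replaced by the mask; an absent one stays absent,
/// so callers can still tell whether a credential is configured.
pub fn redact(agent: &AgentView) -> AgentView {
    AgentView {
        secret: agent.secret.as_ref().map(|_| MASK.to_string()),
        ..agent.clone()
    }
}
```

Routing every read handler through one function like this keeps the sanitized representation consistent across list, get, create, and update responses.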
🧹 Nitpick comments (6)
crates/aisix-a2a/Cargo.toml (1)
25-28: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value
Redundant `aisix-core` dev-dependency.
`aisix-core` is already a regular dependency (Line 12) and is automatically available to tests; re-declaring it identically under `[dev-dependencies]` (Line 26) is unnecessary.
♻️ Proposed cleanup
 [dev-dependencies]
-aisix-core = { path = "../aisix-core" }
 tokio = { workspace = true, features = ["macros", "rt-multi-thread"] }
 axum.workspace = true
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/aisix-a2a/Cargo.toml` around lines 25 - 28, The Cargo.toml for the aisix-a2a crate has a redundant aisix-core entry under [dev-dependencies] even though aisix-core is already declared as a normal dependency and is available to tests. Remove the duplicate aisix-core dev-dependency and leave the other test-only dependencies (such as tokio and axum.workspace) unchanged.
crates/aisix-proxy/src/a2a.rs (2)
138-172: 🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick win
Quota check runs after the full request body is buffered/parsed.
`crate::quota::enforce` (Line 157) only needs the auth key, not the parsed body/method, yet it runs after `to_bytes`/`serde_json::from_slice` (Lines 139-147). An already-throttled caller still pays the cost of the gateway buffering and parsing its full body (up to `request_body_limit_bytes`) before being rejected. Moving the quota check earlier would save that work, at the cost of losing the `method` label on quota-rejection usage events (would become `""`, as already happens on the oauth2-unsupported path).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/aisix-proxy/src/a2a.rs` around lines 138 - 172, The quota gate in `a2a` is applied too late, after `to_bytes` and `serde_json::from_slice` have already buffered and parsed the full request body. Move the `crate::quota::enforce` call in the `a2a` request handling path to run immediately after auth is available and before body parsing, using the existing `method` fallback behavior for quota-failure usage events if the body has not been inspected yet. Keep the current error/usage reporting flow intact in the `a2a` handler while avoiding unnecessary work for rejected requests.
212-251: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
`a2a_agent_card` doesn't record `AccessLog`/`state.metrics.record_request`, unlike `a2a_endpoint`.
`a2a_endpoint` wraps `dispatch` with per-request `AccessLog.emit()` and `state.metrics.record_request(...)` (Lines 66-92), but `a2a_agent_card` is wired directly as the route handler with no equivalent instrumentation. Card-fetch requests are invisible to the same status/latency dashboards that track the main A2A traffic.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/aisix-proxy/src/a2a.rs` around lines 212 - 251, a2a_agent_card is missing the same request instrumentation used by a2a_endpoint, so card fetches never emit AccessLog or update state.metrics.record_request. Wrap the handler logic in the same per-request logging/metrics flow as dispatch in a2a_endpoint, using the a2a_agent_card function as the entry point and recording the final status and latency for both success and early-return paths.
crates/aisix-a2a/src/bridge.rs (1)
210-230: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win
Non-2xx upstream responses discard the JSON-RPC error body.
`send()` treats any non-success HTTP status as a hard error and returns only a synthesized `"upstream returned HTTP {status}"` message (Lines 221-226), discarding whatever JSON-RPC error envelope the upstream may have returned in the body. Per the module's own "forwards ... verbatim" contract (Lines 17-20), an upstream that legitimately returns a structured JSON-RPC error alongside a non-2xx status loses that detail for the caller.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/aisix-a2a/src/bridge.rs` around lines 210 - 230, The send() method in bridge.rs is dropping JSON-RPC error details for non-2xx upstream responses by returning only a synthesized HTTP status message. Update send() so it still reads and attempts to parse the response body when resp.status().is_success() is false, and preserve any upstream JSON-RPC error envelope instead of replacing it with a generic A2aError::Request; keep the existing success-path parsing and use the same send/apply_auth flow to locate the fix.
crates/aisix-a2a/tests/upstream_roundtrip.rs (1)
80-128: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Solid happy-path coverage; consider adding a failure-path test.
The three tests thoroughly verify credential propagation for Bearer/ApiKey/None auth. Consider also adding a case where the upstream returns a non-success status or malformed body, to assert the `HttpBridge` maps it to `A2aError::Connect`/`A2aError::Request` as documented, closing the loop with the proxy's `a2a_error_status` mapping shown in `crates/aisix-proxy/src/a2a.rs`.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/aisix-a2a/tests/upstream_roundtrip.rs` around lines 80 - 128, The current tests only cover successful credential forwarding and do not verify how HttpBridge handles upstream failures. Add a failure-path test around HttpBridge::fetch_agent_card and/or HttpBridge::send that simulates a non-success status or malformed response body, and assert the error maps to A2aError::Connect or A2aError::Request as expected. Use the existing test helpers and A2aUpstream setup so the new case complements fetches_card_and_forwards_bearer, forwards_api_key_header, and sends_no_credential_when_none while covering the proxy error mapping behavior.
crates/aisix-core/src/models/a2a_agent.rs (1)
82-87: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value
State the actual default timeout instead of "a built-in default."
Per coding guidelines, omitted-field fallback behavior should be described accurately. Naming the concrete default (e.g., "30000 ms") is more useful to API consumers than a vague reference.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/aisix-core/src/models/a2a_agent.rs` around lines 82 - 87, Update the `timeout_ms` field documentation in `A2aAgent` so it names the actual omitted-field fallback instead of saying “a built-in default.” Keep the existing constraints, but replace the vague wording with the concrete default timeout value used by the gateway, so API consumers can see the exact behavior from the `timeout_ms` doc comment.
Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/aisix-a2a/src/bridge.rs`:
- Around line 154-168: `HttpBridge::new` currently creates its own
`reqwest::Client`, which prevents connection reuse when `/a2a` handlers
construct a new bridge per request. Update `HttpBridge` so it accepts a shared
`reqwest::Client` (instead of calling `Client::new()` internally) and thread
that client through the bridge construction path used by the `/a2a` handlers to
preserve the connection pool.
In `@crates/aisix-admin/src/a2a_agents_handlers.rs`:
- Around line 45-58: The uniqueness check in create_a2a_agent (and the similar
update_a2a_agent flow) is currently a read-then-write pattern that can race
under concurrent requests. Move the display_name uniqueness enforcement into the
store layer so put_a2a_agent/update logic performs an atomic constraint check,
or back it with a unique index, and have create_a2a_agent rely on that atomic
failure instead of calling list_a2a_agents plus assert_unique_display_name
before writing.
In `@crates/aisix-admin/src/openapi.rs`:
- Around line 1909-2003: Add the missing 409 conflict response to the PUT
handler entry for the A2A agent update operation in openapi.rs. The A2A agent
update documentation currently lists validation, auth, not found, body-size,
content-type, and store errors, but omits the duplicate display_name case
handled by the update path. Update the responses section for the A2aAgent update
endpoint to include a 409 Duplicate display_name response, matching the conflict
semantics already documented for the POST flow and keeping the A2A Agents
operation description consistent.
In `@crates/aisix-core/src/models/a2a_agent.rs`:
- Around line 18-24: Add a direct doc comment on the public A2aAgent struct so
schemars::JsonSchema can populate the OpenAPI description from the type itself,
since the existing module-level //! docs are not used for this. Update the
comment immediately above A2aAgent in the model definition to describe the
resource clearly, matching the style used by other Admin API resource models.
After documenting the struct, regenerate the A2aAgent schema output so the new
description is reflected in schemas/resources/a2a_agent.schema.json.
- Around line 49-56: The Admin API read paths are exposing stored upstream
credentials because `ResourceEntry<A2aAgent>` is returned directly from
`list_a2a_agents`, `get_a2a_agent`, `create_a2a_agent`, and `update_a2a_agent`.
Update the `A2aAgent` response flow so `secret` is omitted from serialized read
output, either by introducing a separate public/read DTO or by mapping the
existing model before returning it; use the `A2aAgent` struct and those four
handler methods as the main places to adjust.
In `@crates/aisix-proxy/src/a2a.rs`:
- Around line 95-172: The dispatch path in a2a::dispatch is missing usage
reporting for early rejections, so 404 unknown agent, 403 ACL denied, and 400
invalid body responses should also call emit_a2a_usage before returning. Add the
same usage event emission used in the oauth2 and quota error paths, using the
current request_id, agent, resolved method when available (or empty string for
pre-parse failures), the actual response status, and Duration::ZERO so these
failures are tracked consistently with the rest of the A2A pipeline.
- Around line 104-118: The A2A handler in a2a.rs currently returns 404 from the
snapshot/a2a_agents lookup before running the auth.key().can_access_agent check,
which lets callers distinguish unknown/disabled agents from ACL failures. Rework
the A2A request path in this handler so authorization is evaluated before
revealing whether an agent exists, or otherwise collapse both outcomes to the
same response status/message; use the existing snapshot, entry lookup, and
can_access_agent logic to keep the check centralized.
ℹ️ Review info
⛔ Files ignored due to path filters (1)
- Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (27)
- Cargo.toml
- crates/aisix-a2a/Cargo.toml
- crates/aisix-a2a/src/bridge.rs
- crates/aisix-a2a/src/error.rs
- crates/aisix-a2a/src/lib.rs
- crates/aisix-a2a/tests/upstream_roundtrip.rs
- crates/aisix-admin/src/a2a_agents_handlers.rs
- crates/aisix-admin/src/apikeys_handlers.rs
- crates/aisix-admin/src/etcd_store.rs
- crates/aisix-admin/src/lib.rs
- crates/aisix-admin/src/openapi.rs
- crates/aisix-admin/src/store.rs
- crates/aisix-core/src/bin/dump-schema.rs
- crates/aisix-core/src/lib.rs
- crates/aisix-core/src/models/a2a_agent.rs
- crates/aisix-core/src/models/apikey.rs
- crates/aisix-core/src/models/mod.rs
- crates/aisix-core/src/models/schema.rs
- crates/aisix-core/src/models/snapshot.rs
- crates/aisix-etcd/src/loader.rs
- crates/aisix-etcd/src/supervisor.rs
- crates/aisix-obs/src/usage.rs
- crates/aisix-proxy/Cargo.toml
- crates/aisix-proxy/src/a2a.rs
- crates/aisix-proxy/src/lib.rs
- schemas/resources/a2a_agent.schema.json
- schemas/resources/api_key.schema.json
pub async fn create_a2a_agent(
    _auth: AdminAuth,
    State(state): State<AdminState>,
    Json(raw): Json<Value>,
) -> Result<Json<ResourceEntry<A2aAgent>>, AdminError> {
    let agent = decode(&raw)?;
    let all = state.store.list_a2a_agents().await?;
    assert_unique_display_name(&all, &agent.display_name, None)?;

    let id = Uuid::new_v4().to_string();
    let entry = ResourceEntry::new(&id, agent, STARTING_REVISION);
    state.store.put_a2a_agent(entry.clone()).await?;
    Ok(Json(entry))
}
🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf '\n## File outline\n'
ast-grep outline crates/aisix-admin/src/a2a_agents_handlers.rs --view expanded || true
printf '\n## Relevant symbol search\n'
rg -n "assert_unique_display_name|list_a2a_agents|put_a2a_agent|update_a2a_agent|display_name|/a2a/" crates/aisix-admin/src crates -g '!target' || true
printf '\n## Inspect handler file\n'
cat -n crates/aisix-admin/src/a2a_agents_handlers.rs | sed -n '1,260p'
printf '\n## Find store trait/impls for A2A agents\n'
rg -n "trait .*Store|struct .*Store|fn list_a2a_agents|fn put_a2a_agent|A2aAgent" crates -g '!target' || true
Repository: api7/aisix
Length of output: 50367
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf '\n## a2a handler excerpt\n'
sed -n '1,170p' crates/aisix-admin/src/a2a_agents_handlers.rs
printf '\n## store trait excerpt\n'
sed -n '1,130p' crates/aisix-admin/src/store.rs
printf '\n## store A2A methods excerpt\n'
sed -n '220,320p' crates/aisix-admin/src/store.rs
printf '\n## etcd store A2A methods excerpt\n'
sed -n '340,430p' crates/aisix-admin/src/etcd_store.rs
printf '\n## a2a model excerpt\n'
sed -n '1,120p' crates/aisix-core/src/models/a2a_agent.rs
Repository: api7/aisix
Length of output: 23920
Avoid the check-then-put uniqueness pattern. Both create_a2a_agent and update_a2a_agent read all agents, check display_name, then write in separate steps. Concurrent requests can still persist duplicate display_name values, which breaks the /a2a/<display_name> routing invariant. Enforce uniqueness atomically in the store or via a unique index.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/aisix-admin/src/a2a_agents_handlers.rs` around lines 45 - 58, The
uniqueness check in create_a2a_agent (and the similar update_a2a_agent flow) is
currently a read-then-write pattern that can race under concurrent requests.
Move the display_name uniqueness enforcement into the store layer so
put_a2a_agent/update logic performs an atomic constraint check, or back it with
a unique index, and have create_a2a_agent rely on that atomic failure instead of
calling list_a2a_agents plus assert_unique_display_name before writing.
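For the in-memory store, moving the check into the store layer could mean doing the lookup and the write under one lock, as in the sketch below. The names (`InMemoryAgents`, `put_unique`, `StoreError`) are hypothetical, and agents are reduced to an id-to-`display_name` map. An etcd-backed store would need a transaction or a name-keyed index instead, which is not sketched here.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical store error; a handler would map it to 409.
#[derive(Debug, PartialEq, Eq)]
pub enum StoreError {
    DuplicateDisplayName,
}

/// Hypothetical in-memory table: agent id -> display_name.
#[derive(Default)]
pub struct InMemoryAgents {
    by_id: Mutex<HashMap<String, String>>,
}

impl InMemoryAgents {
    /// Check uniqueness and write while holding the same lock, so two
    /// concurrent callers cannot both pass the check before either writes.
    pub fn put_unique(&self, id: &str, display_name: &str) -> Result<(), StoreError> {
        let mut map = self.by_id.lock().unwrap();
        // Another id already holding this name is a conflict; the same id
        // re-writing its own name (an update) is allowed.
        let taken = map
            .iter()
            .any(|(other, name)| other.as_str() != id && name.as_str() == display_name);
        if taken {
            return Err(StoreError::DuplicateDisplayName);
        }
        map.insert(id.to_string(), display_name.to_string());
        Ok(())
    }
}
```

With this shape the handler no longer needs to call `list_a2a_agents` first; it attempts the write and translates the conflict error.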
"put": {
  "summary": "Update A2A Agent by ID",
  "requestBody": {
    "required": true,
    "content": {
      "application/json": {
        "schema": {
          "$ref": "#/components/schemas/A2aAgent"
        }
      }
    },
    "description": "Replacement upstream A2A agent configuration."
  },
  "responses": {
    "200": {
      "description": "OK",
      "content": {
        "application/json": {
          "schema": {
            "$ref": "#/components/schemas/A2aAgentEntry"
          }
        }
      }
    },
    "400": {
      "description": "Schema validation failed, the JSON body is malformed, `display_name` contains a `/` (it is the agent's URL path segment), or the credentials required by `auth_type` are missing (`secret` for `bearer`/`api_key`; `client_id`, `token_url`, and `secret` for `oauth2`)",
      "content": {
        "application/json": {
          "schema": {
            "$ref": "#/components/schemas/AdminError"
          }
        },
        "text/plain": {
          "schema": {
            "type": "string"
          }
        }
      }
    },
    "401": {
      "description": "Missing or invalid admin key",
      "content": {
        "application/json": {
          "schema": {
            "$ref": "#/components/schemas/AdminError"
          }
        }
      }
    },
    "404": {
      "description": "Resource not found",
      "content": {
        "application/json": {
          "schema": {
            "$ref": "#/components/schemas/AdminError"
          }
        }
      }
    },
    "413": {
      "description": "JSON request body exceeds the admin body-size limit",
      "content": {
        "text/plain": {
          "schema": {
            "type": "string"
          }
        }
      }
    },
    "415": {
      "description": "Missing or unsupported JSON content type",
      "content": {
        "text/plain": {
          "schema": {
            "type": "string"
          }
        }
      }
    },
    "500": {
      "description": "Configuration store operation failed",
      "content": {
        "application/json": {
          "schema": {
            "$ref": "#/components/schemas/AdminError"
          }
        }
      }
    }
  },
  "tags": [
    "A2A Agents"
  ],
  "description": "Update an upstream A2A agent resource by ID. The gateway validates the payload, rejects duplicate `display_name` values, preserves the resource ID, and increments the revision."
},
🗄️ Data Integrity & Integration | 🟡 Minor | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate A2A agent docs and handlers
rg -n "A2aAgent|a2a_agents|Duplicate display_name|display_name" crates/aisix-admin/src -S
# Show the relevant openapi section around the PUT endpoint
sed -n '1860,2025p' crates/aisix-admin/src/openapi.rs
# Find the route/handler implementation references if any
rg -n "update.*A2A|A2A.*update|a2a agent|A2A Agent" crates -S
Repository: api7/aisix
Length of output: 23184
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '1,180p' crates/aisix-admin/src/a2a_agents_handlers.rs
Repository: api7/aisix
Length of output: 6463
Add the missing 409 Duplicate display_name response to PUT /admin/v1/a2a_agents/{id}. The handler rejects duplicate display_name values on update, but the OpenAPI response list omits the conflict case that POST already documents.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/aisix-admin/src/openapi.rs` around lines 1909 - 2003, Add the missing
409 conflict response to the PUT handler entry for the A2A agent update
operation in openapi.rs. The A2A agent update documentation currently lists
validation, auth, not found, body-size, content-type, and store errors, but
omits the duplicate display_name case handled by the update path. Update the
responses section for the A2aAgent update endpoint to include a 409 Duplicate
display_name response, matching the conflict semantics already documented for
the POST flow and keeping the A2A Agents operation description consistent.
Source: Coding guidelines
| use serde::{Deserialize, Serialize}; | ||
|
|
||
| use crate::resource::Resource; | ||
|
|
||
| #[derive(Debug, Clone, Serialize, Deserialize, schemars::JsonSchema, PartialEq, Eq)] | ||
| #[serde(deny_unknown_fields)] | ||
| pub struct A2aAgent { |
📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win
Struct is missing a doc comment, so the generated OpenAPI schema will have no description.
The file has a rich //! module doc, but schemars::JsonSchema derives the schema description from doc comments placed directly on the struct, not the module. Since A2aAgent has no /// comment immediately above it, the generated resource schema will lack a top-level description, unlike (presumably) other resource models. As per coding guidelines, crates/aisix-core/src/models/**/*.rs should "write public API reference text in comments" for Admin API resource models.
📝 Proposed fix
+/// A registered upstream A2A (Agent-to-Agent) agent. The gateway fronts it at
+/// `/a2a/<display_name>`, serving its agent card and routing
+/// `message/send`/`message/stream` through the same auth/ACL/quota pipeline
+/// used for LLM and MCP traffic.
#[derive(Debug, Clone, Serialize, Deserialize, schemars::JsonSchema, PartialEq, Eq)]
#[serde(deny_unknown_fields)]
pub struct A2aAgent {
After this change, regenerate `schemas/resources/a2a_agent.schema.json` with `cargo run -p aisix-core --bin dump-schema` as instructed by the coding guidelines.
📝 Committable suggestion
| use serde::{Deserialize, Serialize}; | |
| use crate::resource::Resource; | |
| #[derive(Debug, Clone, Serialize, Deserialize, schemars::JsonSchema, PartialEq, Eq)] | |
| #[serde(deny_unknown_fields)] | |
| pub struct A2aAgent { | |
| use serde::{Deserialize, Serialize}; | |
| use crate::resource::Resource; | |
| /// A registered upstream A2A (Agent-to-Agent) agent. The gateway fronts it at | |
| /// `/a2a/<display_name>`, serving its agent card and routing | |
| /// `message/send`/`message/stream` through the same auth/ACL/quota pipeline | |
| /// used for LLM and MCP traffic. | |
| #[derive(Debug, Clone, Serialize, Deserialize, schemars::JsonSchema, PartialEq, Eq)] | |
| #[serde(deny_unknown_fields)] | |
| pub struct A2aAgent { |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/aisix-core/src/models/a2a_agent.rs` around lines 18 - 24, Add a direct
doc comment on the public A2aAgent struct so schemars::JsonSchema can populate
the OpenAPI description from the type itself, since the existing module-level
//! docs are not used for this. Update the comment immediately above A2aAgent in
the model definition to describe the resource clearly, matching the style used
by other Admin API resource models. After documenting the struct, regenerate the
A2aAgent schema output so the new description is reflected in
schemas/resources/a2a_agent.schema.json.
Source: Coding guidelines
| /// Authentication credential for the upstream agent. Its meaning follows | ||
| /// `auth_type`: the bearer token when `auth_type` is `bearer` (sent as | ||
| /// `Authorization: Bearer <secret>`), the API key when `auth_type` is | ||
| /// `api_key` (sent as `x-api-key: <secret>`), or the OAuth client secret | ||
| /// when `auth_type` is `oauth2`. Leave unset when `auth_type` is `none`. | ||
| #[serde(default, skip_serializing_if = "Option::is_none")] | ||
| pub secret: Option<String>, | ||
|
|
🔒 Security & Privacy | 🟠 Major | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf '\n== a2a_agent.rs outline ==\n'
ast-grep outline crates/aisix-core/src/models/a2a_agent.rs --view expanded || true
printf '\n== a2a_agents_handlers.rs outline ==\n'
fd -a 'a2a_agents_handlers.rs' crates/aisix-core/src || true
for f in $(fd -a 'a2a_agents_handlers.rs' crates/aisix-core/src); do
ast-grep outline "$f" --view expanded || true
done
printf '\n== focused model lines ==\n'
nl -ba crates/aisix-core/src/models/a2a_agent.rs | sed -n '1,180p'
printf '\n== focused handler lines ==\n'
for f in $(fd -a 'a2a_agents_handlers.rs' crates/aisix-core/src); do
nl -ba "$f" | sed -n '1,260p'
done
printf '\n== search for ResourceEntry / serde redaction / secret fields ==\n'
rg -n "ResourceEntry<|skip_serializing_if|serde\\(|secret|client_id|token_url|list_a2a_agents|get_a2a_agent" crates/aisix-core/src -S
Repository: api7/aisix
Length of output: 1492
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf '\n== files matching a2a ==\n'
git ls-files | rg 'a2a|A2a' -n || true
printf '\n== a2a_agent.rs with line numbers ==\n'
cat -n crates/aisix-core/src/models/a2a_agent.rs | sed -n '1,280p'
printf '\n== search for handlers and response types ==\n'
rg -n "list_a2a_agents|get_a2a_agent|ResourceEntry<A2aAgent>|A2aAgent" crates/aisix-core/src -S
printf '\n== nearby model imports/usages ==\n'
ast-grep outline crates/aisix-core/src/models/a2a_agent.rs --view expanded || true
Repository: api7/aisix
Length of output: 16888
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf '\n== locate handler and entry types ==\n'
fd -a 'a2a_agents_handlers.rs|resource_entry.rs|mod.rs' crates/aisix-admin crates/aisix-core/src -E target || true
rg -n "struct ResourceEntry|type ResourceEntry|enum ResourceEntry|list_a2a_agents|get_a2a_agent|A2aAgent" crates/aisix-admin crates/aisix-core/src -S
printf '\n== read handler file ==\n'
for f in $(fd -a 'a2a_agents_handlers.rs' crates/aisix-admin/src); do
wc -l "$f"
cat -n "$f" | sed -n '1,260p'
done
printf '\n== read ResourceEntry definition if present ==\n'
for f in $(fd -a 'resource_entry.rs' crates/aisix-core/src crates/aisix-admin/src); do
wc -l "$f"
cat -n "$f" | sed -n '1,260p'
doneRepository: api7/aisix
Length of output: 15647
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf '\n== crates/aisix-core/src/resource.rs ==\n'
cat -n crates/aisix-core/src/resource.rs | sed -n '1,220p'
printf '\n== search for any redaction / masking on A2aAgent or ResourceEntry ==\n'
rg -n "redact|mask|secret.*skip_serializing|skip_serializing_if.*secret|A2aAgentEntry|ResourceEntry<.*Serialize|serde.*flatten" crates/aisix-core crates/aisix-admin -S
printf '\n== schema snippets for A2aAgentEntry / A2aAgent ==\n'
cat -n crates/aisix-admin/src/openapi.rs | sed -n '3620,3675p'
Repository: api7/aisix
Length of output: 10637
Redact upstream credentials from Admin API read responses
list_a2a_agents, get_a2a_agent, create_a2a_agent, and update_a2a_agent return ResourceEntry<A2aAgent> directly, so any stored secret is sent back in plaintext. Omit secret from read output or use a separate response type; keep client_id/token_url only if they’re meant to be visible.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/aisix-core/src/models/a2a_agent.rs` around lines 49 - 56, The Admin
API read paths are exposing stored upstream credentials because
`ResourceEntry<A2aAgent>` is returned directly from `list_a2a_agents`,
`get_a2a_agent`, `create_a2a_agent`, and `update_a2a_agent`. Update the
`A2aAgent` response flow so `secret` is omitted from serialized read output,
either by introducing a separate public/read DTO or by mapping the existing
model before returning it; use the `A2aAgent` struct and those four handler
methods as the main places to adjust.
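One way to picture the "separate read DTO" option is the minimal sketch below. It is an illustration under assumptions, not the repository's code: `StoredAgent` and `AgentView` are hypothetical names, the `url` field and the `has_secret` flag are invented for the example, and the real `A2aAgent` carries more fields and serde derives.

```rust
/// Trimmed, hypothetical stand-in for the stored `A2aAgent` resource.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StoredAgent {
    pub display_name: String,
    pub url: String, // assumed field, for illustration only
    pub auth_type: String,
    pub secret: Option<String>,
}

/// Hypothetical read-side view: it has no `secret` field, so the credential
/// cannot be serialized back to the caller by accident.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AgentView {
    pub display_name: String,
    pub url: String,
    pub auth_type: String,
    /// Signals that a credential is configured without revealing it.
    pub has_secret: bool,
}

impl From<&StoredAgent> for AgentView {
    fn from(agent: &StoredAgent) -> Self {
        AgentView {
            display_name: agent.display_name.clone(),
            url: agent.url.clone(),
            auth_type: agent.auth_type.clone(),
            has_secret: agent.secret.is_some(),
        }
    }
}
```

Under this sketch, each of the four handlers would map the stored value through `AgentView::from` before building the response, so the write path keeps accepting `secret` while no read path can echo it.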
| async fn dispatch( | ||
| auth: AuthenticatedKey, | ||
| agent: &str, | ||
| state: &ProxyState, | ||
| request: Request, | ||
| request_id: &str, | ||
| ) -> Response { | ||
| // Resolve the agent from the live snapshot. A disabled agent is treated as | ||
| // absent — not served, same as a missing one. | ||
| let snapshot = state.snapshot.load(); | ||
| let entry = match snapshot.a2a_agents.get_by_name(agent) { | ||
| Some(entry) if entry.value.enabled => entry, | ||
| _ => return (StatusCode::NOT_FOUND, format!("unknown A2A agent: {agent}")).into_response(), | ||
| }; | ||
|
|
||
| // Per-agent access control, keyed on the same API key object as LLM/MCP | ||
| // access. A key with no `allowed_agents` reaches none (grant is explicit). | ||
| if !auth.key().can_access_agent(agent) { | ||
| return ( | ||
| StatusCode::FORBIDDEN, | ||
| format!("this key may not reach A2A agent: {agent}"), | ||
| ) | ||
| .into_response(); | ||
| } | ||
|
|
||
| let upstream = match upstream_from_a2a_agent(&entry.value) { | ||
| Ok(upstream) => upstream, | ||
| // Currently only oauth2 upstream auth, which the runtime does not | ||
| // implement yet — surface it as "not implemented". | ||
| Err(err) => { | ||
| emit_a2a_usage( | ||
| state, | ||
| &auth, | ||
| request_id, | ||
| agent, | ||
| "", | ||
| StatusCode::NOT_IMPLEMENTED.as_u16(), | ||
| Duration::ZERO, | ||
| ); | ||
| return (StatusCode::NOT_IMPLEMENTED, err.to_string()).into_response(); | ||
| } | ||
| }; | ||
|
|
||
| let (_parts, body) = request.into_parts(); | ||
| let bytes = match to_bytes(body, state.request_body_limit_bytes).await { | ||
| Ok(bytes) => bytes, | ||
| Err(_) => return (StatusCode::BAD_REQUEST, "invalid request body").into_response(), | ||
| }; | ||
| let value: serde_json::Value = match serde_json::from_slice(&bytes) { | ||
| Ok(value) => value, | ||
| Err(_) => return (StatusCode::BAD_REQUEST, "invalid JSON-RPC body").into_response(), | ||
| }; | ||
| let peek = serde_json::from_slice::<JsonRpcPeek>(&bytes).ok(); | ||
| let method = peek | ||
| .as_ref() | ||
| .and_then(|p| p.method.clone()) | ||
| .unwrap_or_default(); | ||
| let rpc_id = peek.and_then(|p| p.id); | ||
|
|
||
| // Reuse the LLM path's rate-limit + budget gate. The reservation is held | ||
| // for the call and dropped after (an A2A call carries no token cost yet). | ||
| // On 429 / budget-exceeded this returns before the upstream is contacted. | ||
| let _reservation = match crate::quota::enforce(state, &auth, None).await { | ||
| Ok(reservation) => reservation, | ||
| Err(err) => { | ||
| let response = err.into_response(); | ||
| emit_a2a_usage( | ||
| state, | ||
| &auth, | ||
| request_id, | ||
| agent, | ||
| &method, | ||
| response.status().as_u16(), | ||
| Duration::ZERO, | ||
| ); | ||
| return response; | ||
| } | ||
| }; |
🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
Usage events aren't emitted for 404 / 403 / 400 rejections.
dispatch emits emit_a2a_usage for the oauth2-unsupported (Lines 125-135), quota-rejected (Lines 159-171), and post-forward success/failure (Lines 181-204) paths, but the earlier 404 "unknown agent" (Line 107), 403 "ACL denied" (Lines 113-118), and 400 "invalid body" (Lines 141, 145) returns skip it entirely. Since the PR's stated goal is to route A2A traffic through the same usage-reporting pipeline as LLM/MCP, this creates blind spots in dashboards/billing for ACL denials and malformed requests specifically, while other rejection classes are fully tracked.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/aisix-proxy/src/a2a.rs` around lines 95 - 172, The dispatch path in
a2a::dispatch is missing usage reporting for early rejections, so 404 unknown
agent, 403 ACL denied, and 400 invalid body responses should also call
emit_a2a_usage before returning. Add the same usage event emission used in the
oauth2 and quota error paths, using the current request_id, agent, resolved
method when available (or empty string for pre-parse failures), the actual
response status, and Duration::ZERO so these failures are tracked consistently
with the rest of the A2A pipeline.
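The suggestion above can be sketched as a small helper that records a zero-latency usage event and hands the status back, so every early return goes through the same path. This is a simplified, synchronous illustration: `UsageRecord`, `UsageSink`, `reject`, and `dispatch_sketch` are hypothetical names, bare `u16` status codes stand in for `StatusCode`, and a UTF-8 check stands in for the real JSON-RPC body parse.

```rust
use std::time::Duration;

/// Hypothetical, trimmed usage event.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct UsageRecord {
    pub request_id: String,
    pub agent: String,
    pub method: String,
    pub status: u16,
    pub latency: Duration,
}

/// Hypothetical sink abstraction; a `Vec` is enough for the sketch.
pub trait UsageSink {
    fn record(&mut self, record: UsageRecord);
}

impl UsageSink for Vec<UsageRecord> {
    fn record(&mut self, record: UsageRecord) {
        self.push(record);
    }
}

/// Records a zero-latency usage event, then returns the status to send.
pub fn reject<S: UsageSink>(
    sink: &mut S,
    request_id: &str,
    agent: &str,
    method: &str,
    status: u16,
) -> u16 {
    sink.record(UsageRecord {
        request_id: request_id.to_string(),
        agent: agent.to_string(),
        method: method.to_string(),
        status,
        latency: Duration::ZERO,
    });
    status
}

/// Early-rejection part of the dispatch flow; the method is empty because
/// the body has not been parsed yet at these points.
pub fn dispatch_sketch<S: UsageSink>(
    sink: &mut S,
    request_id: &str,
    agent: &str,
    agent_known: bool,
    key_allowed: bool,
    body: &[u8],
) -> u16 {
    if !agent_known {
        return reject(sink, request_id, agent, "", 404);
    }
    if !key_allowed {
        return reject(sink, request_id, agent, "", 403);
    }
    // Stand-in for the JSON-RPC parse: only checks that the body is UTF-8.
    if std::str::from_utf8(body).is_err() {
        return reject(sink, request_id, agent, "", 400);
    }
    200
}
```

Routing every rejection through one helper keeps the event shape identical across the 404, 403, and 400 paths, which is what makes the denials comparable in dashboards.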
| let snapshot = state.snapshot.load(); | ||
| let entry = match snapshot.a2a_agents.get_by_name(agent) { | ||
| Some(entry) if entry.value.enabled => entry, | ||
| _ => return (StatusCode::NOT_FOUND, format!("unknown A2A agent: {agent}")).into_response(), | ||
| }; | ||
|
|
||
| // Per-agent access control, keyed on the same API key object as LLM/MCP | ||
| // access. A key with no `allowed_agents` reaches none (grant is explicit). | ||
| if !auth.key().can_access_agent(agent) { | ||
| return ( | ||
| StatusCode::FORBIDDEN, | ||
| format!("this key may not reach A2A agent: {agent}"), | ||
| ) | ||
| .into_response(); | ||
| } |
🔒 Security & Privacy | 🟡 Minor | ⚡ Quick win
Agent existence is distinguishable from ACL denial.
A 404 (unknown/disabled agent) is returned before the ACL check (403), so any caller holding a valid API key — even one authorized for no agents — can distinguish "agent doesn't exist" from "agent exists but I can't reach it," enabling limited agent-name enumeration. Checking the ACL before (or independent of) resolving the agent, or collapsing both cases to the same status, would close this.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/aisix-proxy/src/a2a.rs` around lines 104 - 118, The A2A handler in
a2a.rs currently returns 404 from the snapshot/a2a_agents lookup before running
the auth.key().can_access_agent check, which lets callers distinguish
unknown/disabled agents from ACL failures. Rework the A2A request path in this
handler so authorization is evaluated before revealing whether an agent exists,
or otherwise collapse both outcomes to the same response status/message; use the
existing snapshot, entry lookup, and can_access_agent logic to keep the check
centralized.
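A minimal sketch of the "authorize before resolving" ordering follows. It assumes the ACL is a plain set of agent names; `Agent` and `resolve` are hypothetical, and the real `can_access_agent` may apply different matching rules.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, trimmed agent entry.
pub struct Agent {
    pub enabled: bool,
}

/// Returns the agent, or the HTTP status to reject with.
pub fn resolve<'a>(
    agents: &'a HashMap<String, Agent>,
    allowed: &HashSet<String>,
    name: &str,
) -> Result<&'a Agent, u16> {
    // ACL first: a key that was never granted this name gets 403 whether or
    // not the agent exists, so the response leaks nothing about existence.
    if !allowed.contains(name) {
        return Err(403);
    }
    // Only a key that was explicitly granted the name can observe a 404.
    match agents.get(name) {
        Some(agent) if agent.enabled => Ok(agent),
        _ => Err(404),
    }
}
```

With this ordering, a 404 is only visible to keys an operator already granted the name to, which is the narrower disclosure the comment asks for.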
Independent audit of #717 raised four MEDIUM items; all addressed:
- SSRF: the outbound client now refuses redirects, so a compromised or MITM'd upstream cannot 302 the VPC-internal data plane into fetching an internal address (metadata endpoint, loopback). Mirrors the MCP OAuth client. Covered by a redirect-refused roundtrip test.
- Connection reuse: the reqwest client is built once (OnceLock) and shared instead of per-request, so calls reuse the pool + TLS session. The per-agent timeout is still applied per-request.
- OOM guard: upstream response bodies are read through a 16 MiB streaming cap (honest oversized Content-Length rejected up front; a lying/absent length or endless stream caught as chunks accumulate).
- Endpoint coverage: add oneshot integration tests for /a2a/:agent: default-deny 403, disabled/unknown agent 404, missing-key 401, and the agent-card URL rewrite (rewrites `url`, preserves every other field).
Also pin "surface the upstream status only, never its body" in the error path so the Phase-2 error-translation work can't regress it.
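The two-stage OOM guard described above can be sketched with the standard library alone. This is a synchronous stand-in, not the bridge's code (which reads an async HTTP response): `read_capped`, `BodyError`, and the 8 KiB chunk size are assumptions made for the illustration.

```rust
use std::io::{self, Read};

/// 16 MiB cap, matching the figure quoted in the commit message.
pub const MAX_BODY: usize = 16 * 1024 * 1024;

#[derive(Debug)]
pub enum BodyError {
    TooLarge,
    Io(io::Error),
}

/// Reads `body` into memory, refusing anything larger than `cap` bytes.
pub fn read_capped<R: Read>(
    mut body: R,
    declared_len: Option<u64>,
    cap: usize,
) -> Result<Vec<u8>, BodyError> {
    // Stage 1: an honest but oversized Content-Length is rejected up front.
    if let Some(len) = declared_len {
        if len > cap as u64 {
            return Err(BodyError::TooLarge);
        }
    }
    // Stage 2: a lying or absent length, or an endless stream, is caught as
    // chunks accumulate, before the buffer can grow past the cap.
    let mut out = Vec::new();
    let mut chunk = [0u8; 8192];
    loop {
        let n = body.read(&mut chunk).map_err(BodyError::Io)?;
        if n == 0 {
            return Ok(out);
        }
        if out.len() + n > cap {
            return Err(BodyError::TooLarge);
        }
        out.extend_from_slice(&chunk[..n]);
    }
}
```

The declared length is only a fast path for rejection; the running check in the loop is the actual guarantee, since the header cannot be trusted.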
Audit follow-up — all findings addressed (
| # | Finding | Resolution |
|---|---|---|
| M1 | Upstream client follows redirects → SSRF pivot from the VPC-internal DP | Client now built with `redirect::Policy::none()` (mirrors the MCP OAuth client). A 302 surfaces as a non-success error, never followed. New `refuses_to_follow_upstream_redirect` roundtrip test asserts the redirect target is not fetched. |
| M2 | `reqwest::Client::new()` per request → no pool reuse | Client built once via `OnceLock` and shared; the per-agent timeout is still applied per-request. |
| M3 | Unbounded `resp.json()` → OOM from a malicious/streaming upstream | Bodies read through a 16 MiB streaming cap: an honest oversized `Content-Length` is rejected up front; a lying/absent length or endless stream is caught as chunks accumulate. |
| M4 | Endpoint glue (dispatch/card rewrite) had no e2e coverage | Added oneshot integration tests against `build_router`: default-deny 403, disabled agent 404, unknown agent 404, missing key 401, and agent-card URL rewrite (rewrites `url`, preserves name/version/skills). |
| L2 | Risk that a future change proxies the upstream error body verbatim | Pinned a comment in the error path: surface the upstream status only, never its body. |
| L1 | 404-before-403 is a name-enumeration oracle | Left as-is, justified: the caller is already past 401, agent names are operator labels (not secrets), and this matches the reference gateway's behavior (LiteLLM also returns 403 here). Flagging rather than changing. |
All green after the fix: aisix-a2a 6 unit + 4 roundtrip (incl. redirect guard), aisix-proxy 8 a2a (incl. 5 new endpoint tests), clippy -D warnings clean, aisix-server binary builds.
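The M2 resolution (build the client once, keep the timeout per call) follows a common `OnceLock` pattern, sketched here with the standard library only. `HttpClient`, `Call`, `shared_client`, and `prepare_call` are hypothetical stand-ins; the real code builds a reqwest client, which is not reproduced here.

```rust
use std::sync::OnceLock;
use std::time::Duration;

/// Hypothetical stand-in for the real HTTP client.
#[derive(Debug)]
pub struct HttpClient {
    pub follow_redirects: bool,
}

static CLIENT: OnceLock<HttpClient> = OnceLock::new();

/// Builds the client on first use; later calls return the same instance, so
/// connection pool and TLS sessions can be reused across requests.
pub fn shared_client() -> &'static HttpClient {
    CLIENT.get_or_init(|| HttpClient {
        follow_redirects: false, // redirects refused, per M1
    })
}

/// Per-request state: the shared client plus this agent's own timeout.
pub struct Call {
    pub client: &'static HttpClient,
    pub timeout: Duration,
}

pub fn prepare_call(agent_timeout: Duration) -> Call {
    Call {
        client: shared_client(),
        timeout: agent_timeout,
    }
}
```

Keeping the timeout on the call rather than on the client is what lets one shared client serve agents with different timeout settings.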
|
Independent re-audit of |
What
Data-plane MVP of an A2A (Agent-to-Agent) gateway: register an upstream agent that speaks A2A over HTTP (JSON-RPC 2.0) as a first-class resource, front it at `/a2a/<agent>`, and govern every call with the same pipeline as LLM and MCP traffic: one API key, per-agent ACL, rate-limit + budget, and usage. This is the third traffic type on the single DP pipeline, mirroring the MCP gateway.
Design issue: AISIX-Cloud#958 (extends the #873 gap ③ "MCP / Agent gateway"; the MCP half shipped as #894, and this is the Agent half).
How it's built (6 commits, each self-contained)
- `A2aAgent` resource model (clone of `McpServer` + `protocol_version` 1.0/0.3): `feat(a2a): add A2aAgent resource model`
- `aisix-a2a` crate, a hand-rolled JSON-RPC client behind `A2aBridge`: `feat(a2a): add aisix-a2a crate`
- `api_key.allowed_agents` ACL: `feat(a2a): load A2aAgent into the snapshot`
- `/a2a/:agent` proxy endpoint + agent-card URL rewrite: `feat(a2a): serve the /a2a/:agent gateway endpoint`
- `/admin/v1/a2a_agents`): `feat(a2a): DP admin CRUD for a2a_agents`
- `docs(a2a): document /admin/v1/a2a_agents`
Governance reuse (the point): the `/a2a` handler calls the exact same functions as `/mcp`: `AuthenticatedKey` (401) → `can_access_agent` (403) → `quota::enforce` rate-limit + budget (429) → `UsageEvent` into the shared sink. No new governance code.
Reference implementations (per repo policy)
Compared how the two mainstream gateways front A2A before building:
- LiteLLM (`docs.litellm.ai/docs/a2a`): registers agents, proxies `POST /a2a/{id}` with a virtual key + team ACL (403 on deny), serves the discovery card with the URL rewritten to the gateway, forwards non-messaging methods upstream. This MVP lands the same shape (per-agent path, key ACL, card rewrite, verbatim forward).
- Kong AI Gateway (`developer.konghq.com/ai-gateway/a2a/`): auto-detects A2A, rewrites the agent-card URL, and attaches auth/ACL/observability policies, but its LLM guardrail family is not documented over A2A. Our differentiator (guardrails over A2A on the same pipeline) is deliberately Phase 2 here.
Wire facts verified against the A2A spec and cited in code comments:
- `https://{domain}/.well-known/agent-card.json` (RFC 8615, domain origin): `a2a-protocol.org/latest/topics/agent-discovery/`
- `message/send` JSON-RPC envelope differs between A2A 0.3 (`message/send`, `kind`-discriminated) and 1.0 (`SendMessage`, PascalCase, `result.task`): `a2a-protocol.org/latest/topics/life-of-a-task/`. The bridge forwards requests verbatim and does not translate between versions, so a single agent is reached in whichever version it's pinned to; normalization is a later step.
Divergence from MCP, called out because it's intentional: A2A has no official Rust SDK (the reference SDKs are Python/JS/Java/Go/.NET), so unlike the MCP gateway (which uses the official `rmcp`), the JSON-RPC + agent-card plumbing is hand-rolled directly on the workspace HTTP client, kept behind the `A2aBridge` trait.
Test plan
- `aisix-a2a`: 6 unit + 3 real-HTTP roundtrip tests against a locally spawned A2A server: card fetch, `message/send`, and credential forwarding for bearer / api_key / none (proves the gateway-held credential reaches the upstream and only the upstream).
- `aisix-core`: `A2aAgent` model (9) + `ApiKey::can_access_agent` ACL + `validate_a2a_agent` schema; full suite green (340).
- `aisix-admin`: 4 a2a handler tests (slash/secret/oauth2 validation) + OpenAPI parity (`openapi_documents_exact_admin_path_set`); full suite green (94).
- `aisix-proxy`: endpoint helper tests (error-status mapping, gateway-base rewrite) (3).
- `cargo clippy -D warnings` clean on every touched crate; `cargo fmt`; schemas regenerated (`dump-schema`); `aisix-server` binary builds.
Deferred (by design, tracked)
- `501 Not Implemented` for now; none/bearer/api_key work.
Per this repo's rule (a config knob isn't shipped until the control plane exposes it), a user cannot register an `A2aAgent` until AISIX-Cloud grows an org-scoped `a2a_agents` resource (model + RLS + secretbox + `cp-admin.yaml` + DP-push fan-out by `allowed_environments`) and a Dashboard page. `cp-admin.yaml` is a closed schema that will reject the new fields until then. That CP PR is the immediate follow-up; this DP PR is the mergeable first half (DP-first, same as MCP).
Summary by CodeRabbit
New Features
Bug Fixes