feat(llmo): CloudFront CDN log delivery with assume-role setup (LLMO-5566)#2680

Open
claudiaboldis wants to merge 57 commits into main from feature/LLMO-5566-cloudfront-log-delivery-assume-role

Conversation

@claudiaboldis

Summary

  • Adds server-side STS AssumeRole (connector role) so the LLMO UI CloudFront wizard never needs AWS credentials in the browser — all CloudFront and CloudWatch Logs calls are performed by the Lambda
  • Adds 5 read-only wizard endpoints (connect, distributions, prerequisites, origins, behaviors) that power the step-by-step CloudFront BYOCDN setup flow
  • Adds POST /sites/:siteId/llmo/cdn-log-delivery (diagram step 8) — idempotently creates a CloudWatch Logs delivery source in the customer's account and links it to Adobe's cross-account delivery destination, enabling CloudFront standard log push to the cdn-logs S3 bucket
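The idempotent create described in the last bullet can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the PR's code: `ensureLogDelivery`, the injected `logs` client, and the `cdn-logs-<distributionId>` source name are hypothetical; the three client methods stand in for the CloudWatch Logs `PutDeliverySource`, `DescribeDeliveries`, and `CreateDelivery` operations called through the assumed role.

```javascript
// Hypothetical sketch. `logs` is an injected client (an assumption) whose
// methods mirror the CloudWatch Logs operations of the same name.
async function ensureLogDelivery(logs, { accountId, distributionId, deliveryDestinationArn }) {
  const sourceName = `cdn-logs-${distributionId}`; // assumed naming scheme
  const resourceArn = `arn:aws:cloudfront::${accountId}:distribution/${distributionId}`;

  // PutDeliverySource creates or updates the source, so it is safe to repeat.
  await logs.putDeliverySource({ name: sourceName, resourceArn, logType: 'ACCESS_LOGS' });

  // Look for an existing link between this source and the destination.
  // Pagination (nextToken) is ignored here for brevity.
  const { deliveries = [] } = await logs.describeDeliveries({});
  const existing = deliveries.find(
    (d) => d.deliverySourceName === sourceName
      && d.deliveryDestinationArn === deliveryDestinationArn,
  );
  if (existing) return { alreadyExisted: true, deliveryId: existing.id };

  const { delivery } = await logs.createDelivery({
    deliverySourceName: sourceName,
    deliveryDestinationArn,
  });
  return { alreadyExisted: false, deliveryId: delivery.id };
}
```

Calling it twice with the same arguments should create the delivery once and report `alreadyExisted: true` on the second call.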

New files

| File | Purpose |
| --- | --- |
| src/support/edge-optimize.js | assumeConnectorRole (STS), listCloudFrontDistributions, getDistributionConfig, idempotent step-on-poll deploy orchestrator |
| src/support/cdn-log-delivery.js | Provider-agnostic log-delivery registry; CloudFront path uses logs:PutDeliverySource + logs:CreateDelivery via the assumed role |
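A provider-agnostic registry of the kind named above could look roughly like this. It is an illustrative sketch only: `registerLogDeliveryProvider`, `enableLogDelivery`, and the handler signature are hypothetical and are not taken from src/support/cdn-log-delivery.js.

```javascript
// Hypothetical registry: maps a CDN provider key to the function that
// enables log delivery for that provider.
const logDeliveryProviders = new Map();

function registerLogDeliveryProvider(provider, enable) {
  logDeliveryProviders.set(provider.toLowerCase(), enable);
}

function enableLogDelivery(provider, params) {
  const enable = logDeliveryProviders.get(String(provider).toLowerCase());
  if (!enable) throw new Error(`Unsupported CDN provider: ${provider}`);
  return enable(params);
}

// Placeholder CloudFront handler; the real one would call CloudWatch Logs.
registerLogDeliveryProvider('cloudfront', (params) => ({ provider: 'cloudfront', ...params }));
```

The point of the indirection is that the controller only needs the provider key; adding another CDN means registering one more handler.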

Endpoints (all POST, all INTERNAL_ROUTES — admin/IMS-only)

  • POST /sites/:siteId/llmo/edge-optimize/connect
  • POST /sites/:siteId/llmo/edge-optimize/distributions
  • POST /sites/:siteId/llmo/edge-optimize/prerequisites
  • POST /sites/:siteId/llmo/edge-optimize/origins
  • POST /sites/:siteId/llmo/edge-optimize/behaviors
  • POST /sites/:siteId/llmo/cdn-log-delivery

⚠️ Cross-service dependencies (not in this PR — must land before prod)

  1. Connector-role IAM policy (customer-bootstrap-role.yaml in EDGE_OPTIMIZE_TEMPLATE_BUCKET) must add logs:PutDeliverySource, logs:CreateDelivery, logs:GetDeliverySource, logs:DescribeDeliveries — today the role only grants CloudFront perms
  2. spacecat-auth-service must provision the cdn-logs-<org> delivery destination + cross-account policy for the BYOCDN CloudFront path (the handler surfaces a clear "destination not provisioned" 400 if it's missing)
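The "destination not provisioned" handling in item 2 can be sketched as a small error mapper. This is a sketch under assumptions: `mapLogDeliveryError` and the exact message strings are hypothetical; only the 400 status, the `ResourceNotFound` distinction, and the 'unknown error' fallback come from this PR's description and commit messages.

```javascript
// Hypothetical mapper from an AWS SDK error to the handler's 400 response.
function mapLogDeliveryError(err) {
  if (err && err.name === 'ResourceNotFoundException') {
    // The cross-account delivery destination has not been provisioned yet.
    return { status: 400, message: 'CDN log delivery destination not provisioned' };
  }
  const reason = (err && err.message) || 'unknown error';
  return { status: 400, message: `Failed to enable CDN log delivery: ${reason}` };
}
```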

Test plan

# verify connector role assumable
curl -s -X POST "https://spacecat.experiencecloud.live/sites/$SITE_ID/llmo/edge-optimize/connect" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"accountId":"123456789012","externalId":"<externalId>"}'
# Expected: HTTP 200 { "connected": true }

# list distributions via assumed role
curl -s -X POST "https://spacecat.experiencecloud.live/sites/$SITE_ID/llmo/edge-optimize/distributions" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"accountId":"123456789012","externalId":"<externalId>"}'
# Expected: HTTP 200 { "distributions": [...] }

# enable log delivery (idempotent)
curl -s -X POST "https://spacecat.experiencecloud.live/sites/$SITE_ID/llmo/cdn-log-delivery" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"accountId":"123456789012","externalId":"<externalId>","distributionId":"EDFDVBD6EXAMPLE"}'
# Expected: HTTP 200 { "alreadyExisted": false, "deliveryId": "..." }
# Re-run: HTTP 200 { "alreadyExisted": true }

Related

🤖 Generated with Claude Code

Akash Bhardwaj and others added 30 commits June 19, 2026 01:03
POST /sites/:siteId/llmo/edge-optimize-bootstrap-url returns a CloudFormation
quick-create URL with a server-side presigned template URL, so a customer can
create the cross-account Edge Optimize connector role in their own AWS account
without a public S3 bucket and without any S3 access of their own. Presigning is
done with the service execution role.

Includes route + capability registration, OpenAPI spec, and unit tests.
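
For illustration, a quick-create URL wrapping a presigned template URL might be assembled like this. This is a sketch, not the handler's code: `buildQuickCreateUrl`, the stack name, and the `ExternalId` parameter name are assumptions, and the `#/stacks/create/review` fragment follows the form used in AWS's quick-create link documentation.

```javascript
// Hypothetical helper: builds a CloudFormation quick-create link.
// `templateUrl` would be the server-side presigned S3 URL.
function buildQuickCreateUrl({ region, templateUrl, stackName, params = {} }) {
  const query = new URLSearchParams({ templateURL: templateUrl, stackName });
  // Template parameters are passed as param_<Name>=<value>.
  for (const [key, value] of Object.entries(params)) {
    query.set(`param_${key}`, value);
  }
  return `https://${region}.console.aws.amazon.com/cloudformation/home?region=${region}#/stacks/create/review?${query}`;
}
```

URLSearchParams percent-encodes the presigned URL, so its own `?` and `&` characters do not leak into the outer query string.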

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The getRouteHandlers "segregates static and dynamic routes" test asserts the
exact set of routes; add the new dynamic route to the expected list.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Hardcode EDGE_OPTIMIZE_TEMPLATE_BUCKET and EDGE_OPTIMIZE_TRUSTED_PRINCIPAL_ARN
fallbacks so the dev/ci branch deploy returns a quick-create URL before those
env vars are wired into Vault/secrets. Marked TEMPORARY / TODO REMOVE —
revert before merge/prod (values must come from env config).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…p-url' into feat/llmo-edge-optimize-bootstrap-url
Use llmo-edgeoptimize-cf-template (in 682033462621, where the service deploys
and signs) so the dev role reads it same-account; stage customer fetches via
the presigned URL. Still TEMPORARY / TODO REMOVE before merge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…p-url' into feat/llmo-edge-optimize-bootstrap-url
…ault

The TEMPORARY hardcoded EDGE_OPTIMIZE_TEMPLATE_BUCKET default makes the
bucket always set, so the 'not configured' guard can no longer be hit via
an empty env. Exercise the same guard via the missing S3 client instead.
TODO: restore the empty-bucket variant when the temp default is removed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Tighten the default lifetime of the bootstrap template presigned URL from
1h to 15m. The customer opens the quick-create link immediately, so a
shorter TTL shrinks the leak window. A leaked URL only grants GetObject on
the single template object until expiry; still override via
EDGE_OPTIMIZE_PRESIGN_TTL.
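
A minimal sketch of the TTL resolution, assuming the env value is expressed in seconds (the unit and the `resolvePresignTtl` name are assumptions; the 15-minute default and the env var name are from this commit):

```javascript
// Default presign lifetime: 15 minutes, in seconds (assumed unit).
const DEFAULT_PRESIGN_TTL_SECONDS = 15 * 60;

// Hypothetical helper: use EDGE_OPTIMIZE_PRESIGN_TTL when it parses to a
// positive integer, otherwise fall back to the default.
function resolvePresignTtl(env = {}) {
  const parsed = Number.parseInt(env.EDGE_OPTIMIZE_PRESIGN_TTL, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : DEFAULT_PRESIGN_TTL_SECONDS;
}
```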

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…(Phase 2)

Backend for the CloudFront 'Deploy routing' wizard's read steps. The
api-service assumes the customer's cross-account connector role server-side
(no AWS creds in the browser):

- New src/support/edge-optimize.js: assumeConnectorRole (STS AssumeRole with
  the per-session external ID) + listCloudFrontDistributions.
- POST /sites/:siteId/llmo/edge-optimize/connect - verifies the role is
  assumable (returns { connected } so the UI can poll while the customer
  creates the role); POST .../edge-optimize/distributions - lists the
  account's CloudFront distributions. Both gated by site access + LLMO admin
  and added to INTERNAL_ROUTES (not exposed to S2S).
- Adds @aws-sdk/client-sts and @aws-sdk/client-cloudfront.
- Unit tests for the support module (mocked SDK) and both handlers; route +
  capability lists updated.
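
The inputs to the server-side AssumeRole call can be sketched as below. This only builds the parameter object that would be handed to STS; the function name, the default role name, and the session name are hypothetical. The 12-digit account check mirrors the validation described for these handlers, and 900 seconds is the minimum session duration STS accepts.

```javascript
const ACCOUNT_ID_RE = /^\d{12}$/;

// Hypothetical helper: validates input and builds STS AssumeRole parameters.
function buildAssumeRoleParams({ accountId, externalId, roleName = 'edge-optimize-connector' }) {
  if (!ACCOUNT_ID_RE.test(String(accountId))) {
    throw new Error('accountId must be a 12-digit AWS account ID');
  }
  // STS requires ExternalId to be at least 2 characters long.
  if (typeof externalId !== 'string' || externalId.length < 2) {
    throw new Error('externalId is required');
  }
  return {
    RoleArn: `arn:aws:iam::${accountId}:role/${roleName}`, // role name is an assumption
    RoleSessionName: 'llmo-edge-optimize-wizard', // assumed session name
    ExternalId: externalId,
    DurationSeconds: 900,
  };
}
```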

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Document the already-shipped connect and distributions endpoints plus the
new read-only prerequisites, origins, and behaviors endpoints for the
CloudFront "Deploy routing" wizard. Adds shared connector/distribution
request schemas and per-endpoint response definitions.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ints (Phase 2)

Add three read-only CloudFront wizard endpoints, mirroring the existing
connect/distributions handlers (12-digit accountId + externalId validation,
site/access/LLMO-admin gate, assumed-role calls, badRequest on failure):

- POST /sites/:siteId/llmo/edge-optimize/prerequisites -> checkEdgeOptimizePrerequisites
  reports connectorRole + cloudFrontRead checks (ok/false + detail, never 500)
- POST /sites/:siteId/llmo/edge-optimize/origins -> getEdgeOptimizeOrigins
  returns origins + hasEdgeOptimizeOrigin detection
- POST /sites/:siteId/llmo/edge-optimize/behaviors -> getEdgeOptimizeBehaviors
  returns default + ordered cache behaviors

Adds getDistributionConfig() support fn (GetDistributionConfigCommand) and
unit tests for the support fn, controller handlers, and route registration.
All three routes added to INTERNAL_ROUTES (admin/IMS-only, not S2S).
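
The "ok/false + detail, never 500" behavior of the prerequisites endpoint can be sketched as a wrapper that turns any probe failure into a failed check instead of an exception. `runCheck` and `checkPrerequisites` are hypothetical names and the response shape is an assumption:

```javascript
// Hypothetical wrapper: a probe that throws becomes { ok: false, detail }.
async function runCheck(name, probe) {
  try {
    const detail = await probe();
    return { name, ok: true, detail };
  } catch (err) {
    return { name, ok: false, detail: (err && err.message) || 'unknown error' };
  }
}

// Runs all probes; never rejects, so the handler can always answer 200.
async function checkPrerequisites(probes) {
  const checks = await Promise.all(
    Object.entries(probes).map(([name, probe]) => runCheck(name, probe)),
  );
  return { ok: checks.every((c) => c.ok), checks };
}
```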

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sync per-step endpoints that assume the customer connector role server-side
and perform one CloudFront write each (no AWS creds in the browser):

- create-origin: add the EdgeOptimize_Origin (env EDGE_OPTIMIZE_ORIGIN_DOMAIN,
  default dev.edgeoptimize.net) via UpdateDistribution (ETag IfMatch).
- create-function: create/update + publish the edgeoptimize-routing CF Function
  (bot-routing JS ported from the standalone wizard).
- apply-cache: add EO headers to the behavior's custom cache policy (common
  path; legacy ForwardedValues / managed-policy clone left as TODO).
- create-lambda: create exec role (bounded IAM-propagation retry) + the
  edgeoptimize-origin Lambda@Edge and publish a version.
- apply-associations: wire the function (viewer-request) + Lambda (origin-
  request/response) onto the selected behavior.
- verify: server-side bot-vs-human probe; passed requires x-edgeoptimize-request-id
  (x-edgeoptimize-fo = failover, not success).

Adds @aws-sdk/client-iam + @aws-sdk/client-lambda. All AWS ops use ETag
read-modify-write. Embedded function/Lambda code ported verbatim from the
connect-aws-wizard; Lambda code inlined per the helix-deploy bundling rule.
Mocked-SDK unit tests for all 6 support fns + handlers; routes/capabilities
+ OpenAPI updated.
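
The ETag read-modify-write pattern used by these write steps looks roughly like this. A sketch under assumptions: `cf` is an injected client whose two methods stand in for CloudFront's GetDistributionConfig and UpdateDistribution, and `addOriginIfMissing` is a hypothetical name.

```javascript
// Hypothetical sketch of an idempotent, ETag-guarded distribution update.
async function addOriginIfMissing(cf, distributionId, origin) {
  // Read: the ETag identifies the exact config version we are modifying.
  const { ETag, DistributionConfig } = await cf.getDistributionConfig({ Id: distributionId });
  const items = DistributionConfig.Origins?.Items ?? [];
  if (items.some((o) => o.Id === origin.Id)) return { created: false };

  // Modify: CloudFront list shapes carry an explicit Quantity.
  const next = {
    ...DistributionConfig,
    Origins: { Quantity: items.length + 1, Items: [...items, origin] },
  };

  // Write: IfMatch makes the update fail if the config changed in between.
  await cf.updateDistribution({ Id: distributionId, IfMatch: ETag, DistributionConfig: next });
  return { created: true };
}
```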

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nt-wizard' into feat/llmo-edge-optimize-cloudfront-wizard
The Default(*) behavior commonly uses an AWS-managed cache policy, which
cannot be updated (UpdateCachePolicy -> 'update is not allowed for this
policy'). applyEdgeOptimizeCacheHeaders now ports the full standalone-wizard
logic with all three scenarios:
- legacy (ForwardedValues, no CachePolicyId): add EO headers there + MinTTL 0
- custom policy: UpdateCachePolicy to add EO headers (existing path)
- managed policy: CLONE into a custom edgeoptimize-cache policy with the EO
  headers, then repoint the behavior to it (idempotent by name)

Adds GetCachePolicy/ListCachePolicies/CreateCachePolicy. Support tests
rewritten to dispatch by command name and cover all three scenarios.
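
The three-way branch can be sketched as a pure classifier. `classifyCacheScenario` is a hypothetical name, and `policyTypeById` is an assumed lookup built from the cache policy list, where each policy is typed `managed` or `custom`.

```javascript
// Hypothetical classifier for the three cache scenarios described above.
function classifyCacheScenario(behavior, policyTypeById) {
  // Legacy behaviors carry ForwardedValues and no CachePolicyId.
  if (!behavior.CachePolicyId) return 'legacy';
  // Managed policies cannot be updated and must be cloned.
  return policyTypeById[behavior.CachePolicyId] === 'managed' ? 'managed' : 'custom';
}
```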

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…endpoint

Fixes the Lambda step's three failure modes (timeout, 'update in progress',
no existence check):
- waitForLambdaIdle now gates on State Active AND LastUpdateStatus !=
  InProgress (was State only), so we never hit ResourceConflictException
  ('update is in progress') on a retry after a slow/timed-out first call.
- createEdgeOptimizeLambda is fully idempotent: if a published numbered
  version already matches the current code, reuse it (no update/publish);
  otherwise update + publish. So a retry after a CDN first-byte timeout
  returns immediately instead of conflicting.
- New read-only POST /sites/:siteId/llmo/edge-optimize/lambda-status
  (getEdgeOptimizeLambdaStatus) so the wizard can detect on entry and poll
  whether the function already exists with a published version.

Support tests rewritten to dispatch by command name; status tests added.
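
The two checks described above reduce to small predicates. This is a sketch with hypothetical helper names; the field names (`State`, `LastUpdateStatus`, `Version`, `CodeSha256`) are the ones the Lambda API returns for function configurations and versions.

```javascript
// Idle means Active AND no update in flight (State alone is not enough).
function isLambdaIdle(config) {
  return config.State === 'Active' && config.LastUpdateStatus !== 'InProgress';
}

// Reuse a published (numbered) version whose code hash matches the current code.
function findReusableVersion(versions, codeSha256) {
  return versions.find((v) => v.Version !== '$LATEST' && v.CodeSha256 === codeSha256) || null;
}
```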

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…dly)

create-lambda no longer blocks on a fresh function becoming Active (which
exceeded the CDN first-byte timeout -> 503). It now ensures the role + kicks
off the function create and returns { status: 'provisioning' | 'ready' }
immediately; the UI polls until a published version exists. Also:
- buildLambdaZip uses a fixed timestamp so CodeSha256 is deterministic
  (no version churn).
- lambda-status now reports roleExists + a ready flag (role is created
  synchronously by the create ack) so the wizard can show role + function
  state and check on entry.
- Removed the in-request waitForLambdaIdle/UpdateFunctionCode blocking path.
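
On the caller side, the ack-then-poll flow could be driven by a loop like the following. This is a generic sketch, not the wizard's code: `pollUntilReady`, its defaults, and the `ready` flag on the status object are assumptions (`getStatus` would wrap a call to the lambda-status endpoint).

```javascript
// Hypothetical poller: calls getStatus until it reports ready or gives up.
async function pollUntilReady(getStatus, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const status = await getStatus();
    if (status.ready) return { ...status, attempts: attempt };
    if (attempt < maxAttempts) {
      await new Promise((resolve) => { setTimeout(resolve, intervalMs); });
    }
  }
  throw new Error(`Lambda not ready after ${maxAttempts} attempts`);
}
```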

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
edge-optimize.test.js re-ran esmock() in beforeEach, re-instantiating the
mocked AWS SDK module graph on every test. As this file grew this session it
accumulated enough memory to push the 12.5k-test suite past the 4GB V8 heap
limit (worker OOM -> '1 failing: Worker terminated' + lost-worker coverage
dropping below the 90% gate). Move esmock to a single before() hook and reset
only the send stubs per test. Suite run time for this file drops from ~4min
to ~7s and the leak is gone.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nt-wizard' into feat/llmo-edge-optimize-cloudfront-wizard
The create-origin step created the EdgeOptimize_Origin without its custom
headers, so the routing function's request could not authenticate to Edge
Optimize or resolve the customer host - Verify never returned an
x-edgeoptimize-request-id.

- createEdgeOptimizeOrigin now sets x-edgeoptimize-api-key (site EO API key),
  x-forwarded-host (customer host), and optional x-edgeoptimize-fetcher-key,
  mirroring the standalone wizard + CloudFormation installer.
- Self-heals: an origin created header-less by the earlier version is patched
  in place on re-run (returns updated: true).
- Handler derives both server-side - api key from the tokowaka metaconfig
  (apiKeys[0]), forwarded host from calculateForwardedHost(site.baseURL) - so
  no new UI input; gateEdgeOptimizeWizard now returns the site to avoid a
  second fetch.
- Verify: documented the prod TODO (probe the customer's real domain, not the
  *.cloudfront.net domain) - behavior unchanged for dev testing.
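
The header set and the self-heal check can be sketched as two small helpers. The helper names are hypothetical; the header names come from this commit, and `{ Quantity, Items: [{ HeaderName, HeaderValue }] }` is the shape CloudFront uses for an origin's custom headers.

```javascript
// Hypothetical builder for the origin's CustomHeaders block.
function buildOriginCustomHeaders({ apiKey, forwardedHost, fetcherKey }) {
  const items = [
    { HeaderName: 'x-edgeoptimize-api-key', HeaderValue: apiKey },
    { HeaderName: 'x-forwarded-host', HeaderValue: forwardedHost },
  ];
  // The fetcher key is optional.
  if (fetcherKey) items.push({ HeaderName: 'x-edgeoptimize-fetcher-key', HeaderValue: fetcherKey });
  return { Quantity: items.length, Items: items };
}

// True when an existing origin is missing a desired header or has a stale value.
function needsHeaderPatch(origin, desired) {
  const current = new Map(
    (origin.CustomHeaders?.Items ?? []).map((h) => [h.HeaderName.toLowerCase(), h.HeaderValue]),
  );
  return desired.Items.some((h) => current.get(h.HeaderName.toLowerCase()) !== h.HeaderValue);
}
```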

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Tighten the dev-only default for EDGE_OPTIMIZE_TRUSTED_PRINCIPAL_ARN from the
whole dev account (arn:aws:iam::682033462621:root) to the exact assuming
identity - the spacecat-api-service Lambda execution role
(arn:aws:iam::682033462621:role/spacecat-role-lambda-generic) - shrinking the
blast radius of the connector-role trust. No AWS-side change needed; the
assuming identity is already that role. Prod must still set this via env to the
prod execution role ARN (no in-code default) - tracked in the punch list.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@github-actions

This PR will trigger a minor release when merged.

claudiaboldis and others added 4 commits June 25, 2026 14:28
…apabilities (LLMO-5566)

Adds the 15 CloudFront wizard + CDN log delivery endpoints to
INTERNAL_ROUTES in facs-capabilities.js — all are gated by
isLLMOAdministrator() and must not appear on the external customer
FACS surface. Fixes the facs-capabilities coverage invariant that was
failing CI after the main merge introduced the FACS hybrid permission
model.

Also adds test/e2e/llmo-cdn-log-delivery.e2e.js with four test tiers:
Tier 1 — input validation (400, always run)
Tier 2 — auth gate (403 for non-LLMO-admin, always run)
Tier 3 — response shape via soft-fail AWS calls (LLMO_ADMIN_API_KEY)
Tier 4 — full AWS integration (real connector role + distribution)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…566)

Adds 5 missing test cases that were causing the codecov/patch check to
fail (96.72% vs 99.94% target):

- cdn-log-delivery: missing imsOrgId and missing deliveryDestinationArn
  validation throws in createCdnLogDelivery
- edge-optimize: verify step stays in_progress when distribution domain
  is not yet propagated; verify step stays in_progress when
  verifyEdgeOptimizeRouting throws a fetch error
- llmo controller: enableCdnLogDelivery returns 400 when
  createCdnLogDelivery throws a non-ResourceNotFound error (covers
  the rethrow path and the outer catch block)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
cboldis and others added 2 commits June 26, 2026 19:45
…decov/patch (LLMO-5566)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
cboldis and others added 3 commits June 26, 2026 23:45
…d, CF templates, setupType, rescan endpoint

- gateEdgeOptimizeWizard now derives externalId = toSafeAwsName(imsOrgId) server-side; FE no longer sends externalId
- getEdgeOptimizeBootstrapUrl accepts setupType ('log-only'|'log-and-oae') and selects the matching CF template key
- Add customer-bootstrap-role.yaml and customer-bootstrap-role-log-only.yaml CloudFormation templates
- Add POST /sites/:siteId/llmo/cdn-log-rescan (rescanCdnLogDelivery) for idempotent re-enabling of log delivery
- Remove 14 stale 'externalId missing' tests; add 9 tests for rescanCdnLogDelivery
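
The setupType-to-template selection might look like the sketch below. The two template file names and the EDGE_OPTIMIZE_TEMPLATE_KEY_LOG_ONLY variable appear in this PR; `selectTemplateKey`, the EDGE_OPTIMIZE_TEMPLATE_KEY variable, and throwing on an unknown setupType are assumptions.

```javascript
// Hypothetical selector for the CloudFormation template object key.
function selectTemplateKey(setupType, env = {}) {
  switch (setupType) {
    case 'log-only':
      return env.EDGE_OPTIMIZE_TEMPLATE_KEY_LOG_ONLY || 'customer-bootstrap-role-log-only.yaml';
    case 'log-and-oae':
      // EDGE_OPTIMIZE_TEMPLATE_KEY is an assumed env var name.
      return env.EDGE_OPTIMIZE_TEMPLATE_KEY || 'customer-bootstrap-role.yaml';
    default:
      throw new Error(`Unsupported setupType: ${setupType}`);
  }
}
```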

Introduced by: #2680
…MO-5566)

- Remove dead !Organization early-return from gateEdgeOptimizeWizard
- Add test for custom EDGE_OPTIMIZE_ROLE_NAME env var path
- Add test for rejection with no .message (covers 'unknown error' fallback)

Introduced by: #2680
…strapUrl (LLMO-5566)

- Add test for log-only setupType without EDGE_OPTIMIZE_TEMPLATE_KEY_LOG_ONLY (default fallback)
- Add test for missing IMS org ID in the bootstrap URL inline org check

Introduced by: #2680

4 participants