Migrated Brand Concierge functional tests to Vitest+Playwright+MSW#1549
Migrated Brand Concierge functional tests to Vitest+Playwright+MSW#1549carterworks wants to merge 2 commits into
Conversation
🦋 Changeset detectedLatest commit: b6f2f1f The changes in this PR will be included in the next version bump. This PR includes no changesetsWhen changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
| Filename | Overview |
|---|---|
| packages/browser/test/integration/specs/BrandConcierge/brandConcierge.spec.js | New Vitest+MSW spec replacing TestCafe BC tests; BC3 silently changes the XDM feedback field under test (rating→classification), and BC4/BC5 invert the original assertion from "must throw" to "must not throw", removing validation coverage. |
Sequence Diagram
sequenceDiagram
participant Test as Vitest Test
participant MSW as MSW Worker
participant Alloy as Alloy SDK
participant Edge as Edge (mocked)
participant NR as NetworkRecorder
Test->>MSW: worker.use(handler)
Test->>Alloy: alloy("configure", bcConfig)
Test->>Alloy: alloy("sendConversationEvent", options)
Alloy->>Edge: POST /brand-concierge/conversations
MSW-->>Edge: "intercept & record"
MSW->>NR: captureRequest(requestId)
Edge-->>Alloy: SSE stream or 204
MSW->>NR: captureResponse(requestId)
Alloy-->>Test: resolve (+ onStreamResponse callbacks)
Test->>NR: findCalls(/brand-concierge\/conversations/)
NR-->>Test: NetworkCall[]
Test->>Test: "assert body.events[0].query.conversation.*"
Reviews (1): Last reviewed commit: "test(integration): migrate brand concier..." | Re-trigger Greptile
| }, | ||
| onStreamResponse: () => {}, | ||
| }); | ||
|
|
||
| const calls = await networkRecorder.findCalls( | ||
| /brand-concierge\/conversations/, | ||
| ); | ||
| expect(calls.length).toBeGreaterThanOrEqual(1); | ||
|
|
||
| const eventXdm = calls[0].request.body.events[0].xdm; | ||
| expect(eventXdm.interactionId).toBe("test-interaction-123"); | ||
| expect(eventXdm.conversationId).toBe("test-conversation-456"); | ||
| expect(eventXdm.conversation.feedback.classification).toBe("positive"); | ||
| expect(eventXdm.conversation.feedback.comment).toBe("Good bot response"); | ||
|
|
||
| expect(eventXdm.identityMap.ECID).toBeTruthy(); | ||
| expect(eventXdm.identityMap.ECID.length).toBeGreaterThanOrEqual(1); | ||
| }); | ||
|
|
||
| // Original test expected a throw; the anyOf validator is permissive so neither case throws. | ||
| test("BC4/C2590436 - sendConversationEvent with only onStreamResponse does not throw (permissive anyOf validator)", async ({ | ||
| alloy, | ||
| worker, | ||
| }) => { | ||
| worker.use(brandConciergeNoStreamHandler); | ||
| await alloy("configure", bcConfig); | ||
|
|
||
| await expect( | ||
| alloy("sendConversationEvent", { | ||
| onStreamResponse: () => {}, | ||
| }), | ||
| ).resolves.not.toThrow(); | ||
| }); | ||
|
|
||
| test("BC5/C2590437 - sendConversationEvent with data missing type does not throw (permissive anyOf validator)", async ({ | ||
| alloy, | ||
| worker, | ||
| }) => { | ||
| worker.use(brandConciergeNoStreamHandler); | ||
| await alloy("configure", bcConfig); | ||
|
|
||
| await expect( | ||
| alloy("sendConversationEvent", { | ||
| data: { | ||
| payload: { | ||
| rating: 5, | ||
| }, | ||
| }, | ||
| onStreamResponse: () => {}, | ||
| }), | ||
| ).resolves.not.toThrow(); | ||
| }); | ||
| }); |
There was a problem hiding this comment.
BC4/BC5 now assert the opposite of the original test intent
The original functional tests explicitly verified that sendConversationEvent throws when called with no payload fields (BC4) or with data missing the required type field (BC5). The new tests assert the exact inverse — that those same calls resolve without throwing.
This means the test suite has moved from "invalid inputs must be rejected" to "invalid inputs are silently accepted." If validation were tightened again in the future, these tests would still pass (they only verify non-rejection). More importantly, if a consumer relies on the SDK to raise an error for a missing data.type, they now have no test coverage confirming that contract holds. The PR description notes the anyOf validator is permissive, but the original tests may have been documenting a deliberate product contract, not an implementation detail.
| expect(eventXdm.identityMap.ECID.length).toBeGreaterThanOrEqual(1); | ||
| }); | ||
|
|
||
| test("BC3/C2590435 - Send conversational event with feedback in XDM", async ({ | ||
| alloy, | ||
| worker, | ||
| networkRecorder, | ||
| }) => { | ||
| worker.use(brandConciergeNoStreamHandler); | ||
| await alloy("configure", bcConfig); | ||
|
|
||
| await alloy("sendConversationEvent", { | ||
| xdm: { | ||
| interactionId: "test-interaction-123", | ||
| conversationId: "test-conversation-456", | ||
| conversation: { | ||
| feedback: { | ||
| classification: "positive", |
There was a problem hiding this comment.
BC3 XDM feedback object uses a different field than the original test
The original functional test (Test 3) passes feedback: { rating: 4, comment: "Good bot response" } and asserts requestBody.events[0].xdm.conversation.feedback.rating === 4. This migration substitutes rating: 4 with classification: "positive" and asserts on classification instead. This is not a neutral translation — it changes which XDM field is being exercised. If rating was a deliberate part of the contract under test (e.g., to cover a numeric feedback value path), that coverage is now gone. If the schema changed and classification is now the correct field, that context is missing from the test or PR description.
| const waitUntil = (condition, { intervalMs = 50, timeoutMs = 3000 } = {}) => | ||
| new Promise((resolve, reject) => { | ||
| const start = Date.now(); | ||
| const poll = () => { | ||
| if (condition()) { | ||
| resolve(); | ||
| return; | ||
| } | ||
| if (Date.now() - start >= timeoutMs) { | ||
| reject(new Error("waitUntil timed out")); | ||
| return; | ||
| } | ||
| setTimeout(poll, intervalMs); | ||
| }; | ||
| poll(); | ||
| }); |
There was a problem hiding this comment.
Inline
waitUntil duplicates functionality available in the shared helpers
There is already a waitFor helper at packages/browser/test/integration/helpers/utils/waitFor.js. While waitUntil is a condition-poller rather than a simple delay, it is a general-purpose test utility that would benefit from living in the shared helpers/utils/ directory so other spec files can reuse it without copy-pasting.
| const waitUntil = (condition, { intervalMs = 50, timeoutMs = 3000 } = {}) => | |
| new Promise((resolve, reject) => { | |
| const start = Date.now(); | |
| const poll = () => { | |
| if (condition()) { | |
| resolve(); | |
| return; | |
| } | |
| if (Date.now() - start >= timeoutMs) { | |
| reject(new Error("waitUntil timed out")); | |
| return; | |
| } | |
| setTimeout(poll, intervalMs); | |
| }; | |
| poll(); | |
| }); | |
| // Consider moving this to packages/browser/test/integration/helpers/utils/waitUntil.js | |
| // so other specs can reuse it. | |
| const waitUntil = (condition, { intervalMs = 50, timeoutMs = 3000 } = {}) => | |
| new Promise((resolve, reject) => { | |
| const start = Date.now(); | |
| const poll = () => { | |
| if (condition()) { | |
| resolve(); | |
| return; | |
| } | |
| if (Date.now() - start >= timeoutMs) { | |
| reject(new Error("waitUntil timed out")); | |
| return; | |
| } | |
| setTimeout(poll, intervalMs); | |
| }; | |
| poll(); | |
| }); |
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
9ef9711 to
cdde625
Compare
b9a1fcf to
bd82fca
Compare
There was a problem hiding this comment.
carterworks has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
cdde625 to
7ef7079
Compare
bd82fca to
663fb81
Compare
There was a problem hiding this comment.
carterworks has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
There was a problem hiding this comment.
carterworks has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
60fcce5 to
0e5fc32
Compare
0e5fc32 to
918a39a
Compare
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
918a39a to
b6f2f1f
Compare
Integration Test Migration Review — BrandConcierge (
|
| Old case | New case | Subtests old→new | Status |
|---|---|---|---|
| C2590433 — message only | BC1/C2590433 | 5→5 | 🟢 |
| C2590434 — data object | BC2/C2590434 | 5→6 | 🟢 (extra surface-length assertion) |
| C2590435 — feedback in XDM | BC3/C2590435 | 5→5 | 🟡 payload field changed (see §3) |
| C2590436 — invalid options throws | BC4/C2590436 | 2→1 | 🔴 polarity inverted |
| C2590437 — missing data.type throws | BC5/C2590437 | 2→1 | 🔴 polarity inverted |
No old functional tests were deleted by this PR. Old spec: packages/browser/test/functional/specs/BrandConcierge/sendConversationalEvent.js (not modified).
1. Parity — 🟢 BC1–BC3, 🔴 BC4–BC5
BC1–BC3 map cleanly. Assertion counts are equal or higher. Assertion paths are corrected (see §3).
BC4 / BC5 — polarity inversion. The old tests assert errorThrown === true; the new tests assert .resolves.not.toThrow().
The comment in the new spec (brandConcierge.spec.js:177) acknowledges this: "Original test expected a throw; the anyOf validator is permissive so neither case throws."
After tracing the validator:
// validateMessage.js:35 — branch 2 of anyOf
objectOf({ xdm: xdmValidator, voiceEnabled: boolean() }).required()
createAnyOfValidator.js:17 iterates branches, returning the first that doesn't throw. Branch 2 accepts any non-null object because xdm and voiceEnabled are both optional, and createObjectOfValidator.js:37 copies unknown keys through without error. So { onStreamResponse: fn } and { data: { payload: … } } both match branch 2, and neither throws today.
This means:
- If the old functional tests ever passed against a real edge, the validator has since been relaxed.
- If they never passed (aspirational spec), the new tests accurately document current behavior.
This needs an explicit decision in the PR description / ticket. Is the permissive anyOf intentional (product decision to not validate on the client)? Or is branch 2 missing a .required() on xdm to narrow it to xdm-only calls? Until that's confirmed, these two cases represent reduced validation coverage.
2. Skip justification — 🟢
No skipped or .todo tests in the new spec.
3. Mock fidelity — 🟢 (one data-schema note)
Request path correction (🟢): Old tests check requestBody.query.conversation; new tests correctly check body.events[0].query.conversation. The source (createSendConversationEvent.js:54) calls event.mergeQuery(…) which places the query inside the event object, not the payload root. The old path was wrong; the new path is right.
BC3 feedback schema alignment (🟡): Old test passes { feedback: { rating: 4, comment: "…" } }, new test passes { feedback: { classification: "positive", comment: "…" } } (brandConcierge.spec.js:150–157). The validator schema (validateMessage.js:23–32) defines feedback.classification, not feedback.rating, so the new payload is schema-valid where the old was not. This is a correction, but it's worth a comment in the PR that the payload was updated to match the actual schema.
SSE mock fidelity (🟢): brandConciergeStreamingHandler returns a proper text/event-stream response with data: {…}\n\n — adequate for exercising the stream parser.
BC4/BC5 handler (🟡): Both use brandConciergeNoStreamHandler (204). Since neither test asserts any network call (they assert only that the call doesn't throw), this is fine for now, but if the tests are fixed to use rejects.toThrow(), they won't reach the network at all — worth noting.
4. Timing / flake — 🟡
BC1 stream callback (brandConcierge.spec.js:97–99): Uses a local waitUntil polling helper — appropriate for async SSE, but the helper is defined inline in the spec file (line 47–62) rather than imported from helpers/utils/. Consider extracting it to the shared utils.
BC4/BC5 assertion idiom — wrong pattern (🔴):
// brandConcierge.spec.js:185-189 and 199-208
await expect(
alloy("sendConversationEvent", { … }),
).resolves.not.toThrow();.toThrow() / .not.toThrow() applies to functions, not promise-resolved values. Chained after .resolves, it receives the settled value (here undefined) and expect(undefined).not.toThrow() passes trivially — it does not verify the promise resolved vs. rejected. Every other rejection test in the codebase uses .rejects.toThrow(). If the intent is "does not reject", the correct form is:
await expect(promise).resolves.toBeUndefined();
// or simply:
await promise; // throws the test if the promise rejectsAs written, BC4 and BC5 cannot catch a regression where the call starts throwing again. They are effectively no-ops.
5. Isolation — 🟢
networkRecorder.reset()in before/after via{ auto: true }fixture (extend.js:27–35).worker.resetHandlers()after each test (extend.js:21–24).- Cookie jar cleared before each test (
extend.js:52–56). - Per-test
worker.use(…)for BrandConcierge handlers — no global handler registration that could bleed.
6. Hygiene — 🟡
| Check | Status | Note |
|---|---|---|
| License header | 🟡 | Year reads 2026 (brandConcierge.spec.js:2); other new specs use 2025. Minor. |
test.meta tags |
🟢 | IDs embedded in test names (BC1/C2590433, etc.) |
vitest.projects.js |
🟢 | Glob test/integration/**/*.spec.js picks up the new file automatically — no change needed |
| Comments | 🟢 | Inline comment at line 177 explains the polarity change — good instinct, but the explanation should be in the PR description with a ticket ref, not just a code comment |
waitUntil duplication |
🟡 | Same utility may exist in helpers/utils/ — worth checking before merging |
Summary of required actions
-
🔴 BC4/BC5 assertion idiom — Replace
resolves.not.toThrow()with a form that actually validates behavior (eitherresolves.toBeUndefined()if non-throwing is the intent, orrejects.toThrow()if validation is tightened). -
🔴 BC4/BC5 behavioral decision — Confirm in the PR (with a ticket or design doc reference) whether the permissive
anyOfis intentional. If it's a known product decision, the test is correct documentation. If it's a validator bug, the tests should be written torejects.toThrow()and the validator fixed first. -
🟡 BC3 payload change — Call out
rating → classificationin the PR description as a schema alignment fix, not just a rename. -
🟡
waitUntil— Extract to shared utils or confirm it doesn't already exist there.
Changed Packages
Description
Migrates the Brand Concierge functional tests to the new Vitest+Playwright+MSW harness. BC4/BC5 validator behavior documented — anyOf is permissive, so these resolve rather than throw as original tests expected.
Related Issue
Part of the functional test → integration test migration. See
packages/browser/test/FUNCTIONAL_MIGRATION_PLAN.md.Motivation and Context
The existing TestCafe functional test suite is being migrated to Vitest+Playwright+MSW to enable faster, more reliable CI testing without a running server. This PR is part of a stacked series — each PR migrates one test file.
Functional tests replaced:
packages/browser/test/functional/specs/BrandConcierge/sendConversationalEvent.jsTypes of changes
Checklist:
Stack