fix(parser): require explicit provider factories by mariusvniekerk · Pull Request #784 · kenn-io/agentsview

mariusvniekerk · 2026-06-20T00:01:46Z

Every registered parser agent now has an explicit provider factory instead of relying on the legacy fallback. Claude.ai and ChatGPT are import-only parsers, so they use a small ProviderBase-backed import-only provider with no source discovery/watch behavior while preserving the standard unsupported parse response. The provider registry tests now fail if any future AgentDef falls through to legacyProviderFactory, which keeps new providers on the shared facade path.

roborev-ci · 2026-06-20T00:04:36Z

roborev: Combined Review (`a63405c`)

No issues found.

Panel: ci_default_security | Synthesis: codex | Members: codex_default (codex/default, done, 2m7s), codex_security (codex/security, done, 33s) | Total: 2m40s

roborev-ci · 2026-06-20T00:08:43Z

roborev: Combined Review (`41637b8`)

No issues found.

Panel: ci_default_security | Synthesis: codex | Members: codex_default (codex/default, done, 1m15s), codex_security (codex/security, done, 25s) | Total: 1m40s

roborev-ci · 2026-06-20T00:15:23Z

roborev: Combined Review (`555b938`)

No issues found.

Panel: ci_default_security | Synthesis: codex | Members: codex_default (codex/default, done, 2m54s), codex_security (codex/security, done, 11s) | Total: 3m5s

mariusvniekerk · 2026-06-20T00:17:27Z

This change is part of the following stack:

Design parser provider facade layer #748
- Add parser provider facade core #751
  - Add JSONL source set helper #752
    - feat(parser): migrate commandcode and iflow providers #757
      - Migrate gptme to parser provider facade #753
        
        feat(parser): migrate deepseek tui provider #758
        
        feat(parser): migrate amp and zencoder providers #759
        
        feat(parser): migrate pi provider #760
        
        feat(parser): migrate qwen provider #761
        
        feat(parser): migrate workbuddy provider #762
        feat(parser): migrate cortex provider #763
        feat(parser): migrate kimi provider #764
        feat(parser): migrate claw providers #766
        feat(parser): migrate qwenpaw provider #767
        feat(parser): migrate openhands provider #768
        feat(parser): migrate cursor provider #769
        feat(parser): migrate vibe provider #770
        feat(parser): migrate hermes provider #771
        feat(parser): migrate claude provider #772
        feat(parser): migrate cowork provider #773
        feat(parser): migrate opencode-family providers #774
        feat(parser): migrate codex provider #775
        feat(parser): migrate gemini copilot providers #776
        feat(parser): migrate copilot ide providers #778
        feat(parser): migrate positron provider #779
        feat(parser): migrate zed shelley providers #780
        feat(parser): migrate kiro providers #781
        feat(parser): migrate antigravity providers #782
        feat(parser): migrate db-backed providers #783
        fix(parser): require explicit provider factories #784 ◀

_{Change managed by git-spice.}

roborev-ci · 2026-06-20T00:20:30Z

roborev: Combined Review (`30ae7e6`)

No issues found.

Panel: ci_default_security | Synthesis: codex | Members: codex_default (codex/default, done, 2m1s), codex_security (codex/security, done, 14s) | Total: 2m15s

roborev-ci · 2026-06-20T00:26:33Z

roborev: Combined Review (`089ddd7`)

No issues found.

Panel: ci_default_security | Synthesis: codex | Members: codex_default (codex/default, done, 2m4s), codex_security (codex/security, done, 23s) | Total: 2m27s

roborev-ci · 2026-06-20T00:36:47Z

roborev: Combined Review (`0671d6c`)

No issues found.

Panel: ci_default_security | Synthesis: codex | Members: codex_default (codex/default, done, 1m19s), codex_security (codex/security, done, 23s) | Total: 1m42s

roborev-ci · 2026-06-20T00:45:24Z

roborev: Combined Review (`b6de96f`)

No issues found.

Panel: ci_default_security | Synthesis: codex | Members: codex_default (codex/default, done, 1m55s), codex_security (codex/security, done, 17s) | Total: 2m12s

roborev-ci · 2026-06-20T01:06:40Z

roborev: Combined Review (`42bcdb4`)

No issues found.

Panel: ci_default_security | Synthesis: codex | Members: codex_default (claude-code/default, done, 3m11s), codex_security (claude-code/security, done, 1m21s) | Total: 4m32s

roborev-ci · 2026-06-20T02:22:06Z

roborev: Review Unavailable (`53ea4b2`)

The review agent repeatedly failed to run (likely an agent or configuration error). roborev will try again on the next commit.

Last error: agent: claude-code failed stream: stream errors: You've hit your session limit · resets 5:50am (UTC): exit status 1

roborev-ci · 2026-06-21T04:43:36Z

roborev: Combined Review (`9671172`)

Synthesis unavailable. Showing individual review outputs.

claude-code — default (done)

Review Findings

Severity: Low
Location: internal/sync/engine.go:695-825 (attachProviderSources and its only callee discoverProviderSources)
Problem: These two methods are dead in production. The full-sync path uses discoverProviderFiles, and the changed-path path uses attachChangedPathProviderSources. The only caller of attachProviderSources (and therefore the only thing that reaches discoverProviderSources) is internal/sync/provider_process_test.go:1003. That is ~95 lines of unexercised production code that largely duplicates discoverProviderFiles, which will rot independently of the live path.
Fix: Remove both methods (and adjust the test), or wire attachProviderSources into the path it was intended for if it is meant to be used.

Severity: Low
Location: internal/sync/parsediff.go:614-620 (parseDiffPresencePath)
Problem: The helper is effectively return path. For a virtual path (base != path) it returns path; for a physical path (base == path) it returns base, which equals path. Both branches return the input unchanged, so the function name and structure imply a stripping/keeping distinction that does not exist. The correct exact-path behavior actually comes from storedByPath now being keyed by both full and base paths; this wrapper just obscures that.
Fix: Replace the call with job.path directly, or drop the helper.

Summary

A large, well-tested refactor that removes the legacy DiscoverFunc/FindSourceFunc adapters and makes per-agent providers authoritative for discovery, source lookup, watch planning, incremental parsing, export, and sync skip/caching; the only issues found are minor dead/redundant code.

claude-code — security (done)

I've reviewed the full diff. This is a large (65-commit, ~13k line) internal refactor that migrates the parser/sync system from per-agent function callbacks (DiscoverFunc, FindSourceFunc, WatchRootsFunc) to a unified provider abstraction, flipping all agents to provider-authoritative/import-only migration modes and removing the legacy adapter.

I focused on the changed surfaces that could touch a trust boundary:

cmd/agentsview/session_export.go — New provider-based source resolution. The Visual Studio Copilot multi-conversation trace filtering (the one path with an explicit disclosure concern) is preserved and extended: when the resolved source is a real trace file, the conversation ID is derived from the session ID and filtered, guarded by IsValidSessionID. The new visualStudioCopilotTraceHasConversation also gates on IsValidSessionID(rawID) before interpolating rawID into the trace-content match string, so a raw ID can't inject an arbitrary search substring. This is a local CLI over user-owned session files regardless.
internal/ssh/resolve.go — Only the filter predicate changed (ProviderSupportsSourceDiscovery vs DiscoverFunc != nil); the values interpolated into the generated remote shell script remain static parser.Registry constants, so no new injection.
internal/sync/engine.go — New provider source-discovery/attachment plumbing. The one new SQL query (countRootProviderVirtualDBSessions) is fully parameterized with ? placeholders. pathRewriter/idPrefix remote-sync machinery is moved but not semantically changed. No new exec/command sinks.
internal/parser/* providers — Codex incremental append parsing, fingerprint inode/device additions, factory wiring (panic on missing factory is a programmer-error guard). All operate on local, user-owned files that per the project threat model are non-adversarial.

The externally-reachable code (HTTP API, auth/bearer-token, CORS, managed Caddy config generation) and the DB/Postgres schema are not touched by this diff. No changes construct paths, queries, or commands from attacker-controlled input crossing a trust boundary, and no secrets are exposed.

No issues found.

roborev-ci · 2026-06-24T00:07:19Z

roborev: Combined Review (`245482d`)

No issues found.

Panel: ci_default_security | Synthesis: codex | Members: codex_default (codex/default, done, 9m51s), codex_security (codex/security, done, 4m54s) | Total: 14m45s

roborev-ci · 2026-06-25T06:00:41Z

roborev: Combined Review (`5a2f40d`)

Summary verdict: two medium correctness findings remain; no security findings were reported.

Medium

cmd/agentsview/main.go:926
Provider watch plans with only missing roots now fall through to legacy watch collection and often add neither a watcher nor an unwatched poll root. For providers like Zed before threads/ exists or Reasonix before its subdirs exist, new files are not picked up by the 2-minute unwatched poll and wait for the scheduled full sync.
Fix: Treat a non-empty provider watch plan as authoritative even when all roots are currently missing; return true with the configured agent dir as an unwatched poll root when missingRoots is non-empty and no existing root covers creation.
internal/sync/engine.go:6644
The provider lookup now runs before the existing Aider stored-path validation. Aider virtual paths are positional (<history>#<idx>), and the provider accepts stored paths structurally without checking that the index still maps to the requested raw session ID, so a shifted history can make FindSourceFile/SyncSingleSession parse the wrong run instead of falling through to raw-ID lookup.
Fix: Keep the Aider AiderRawIDAt validation before provider lookup, or make the Aider provider reject stored virtual paths under RequireFreshSource when the indexed run does not match RawSessionID, then fall back to raw-ID resolution.

Panel: ci_default_security | Synthesis: codex, 10s | Members: codex_default (codex/default, done, 8m57s), codex_security (codex/security, done, 3m29s) | Total: 12m36s

roborev-ci · 2026-06-25T14:18:09Z

roborev: Combined Review (`b842c9e`)

Medium severity issue found; no high or critical findings.

Medium

cmd/agentsview/main.go:926 - When a provider watch plan resolves but every planned root is missing, collectProviderWatchRoots returns false, nil and falls back to legacy watch handling. For provider-authoritative agents without legacy WatchSubdirs, the missing configured/default root is not added to unwatchedDirs, so newly created agent directories are not picked up by the 2-minute unwatched-dir poll and may wait until the slower periodic full sync.

Suggested fix: Treat a provider plan with missing roots as provider-handled even if no roots were added, return the configured root or missing planned roots for unwatchedDirs, and add a regression test for a provider-authoritative agent whose root does not exist at startup.

Panel: ci_default_security | Synthesis: codex, 7s | Members: codex_default (codex/default, done, 10m45s), codex_security (codex/security, done, 6m24s) | Total: 17m16s

Make the provider registry force every agent onto an explicit facade path instead of silently inheriting a legacy fallback factory. Remove the legacy provider fallback entirely so an unhandled AgentDef is a loud construction error, and represent the non-filesystem export parsers (Claude.ai, ChatGPT) with explicit import-only providers. Mark the concrete providers authoritative in the migration manifest and drop the legacy-only marker. Route FindSourceFile and SourceMtime through provider FindSource and Fingerprint so the stack tip exercises provider-owned source identities and composite mtimes rather than parallel legacy dispatch. Retire the test scaffolding that depended on the removed legacy types: per-provider tests assert concrete construction, the obsolete legacy-fallback and legacy-only-mode registry tests are dropped, and the zero-legacy anti-shim gate runs with an empty pending list. With codex's legacy ShallowWatchRootsFunc removed in favor of the provider WatchPlan, fix collectProviderWatchRoots so a WatchPlan root that does not exist yet but lives under an already-watched ancestor no longer marks the whole directory for unwatched polling. A not-yet-created per-session recursive root otherwise regressed parity by polling two codex dirs that share a watched state-directory root; the ancestor watch observes the target's creation and a later sync establishes the deeper watch. refactor(parser): fold export parsers onto the import-only providers ChatGPT and Claude.ai sessions never come from disk discovery; they enter the archive only through a one-shot import. Move the ParseChatGPTExport and ParseClaudeAIExport free functions onto the import-only provider as the ChatGPTExportParser and ClaudeAIExportParser methods, and route the importer and tests through NewProvider plus a type assertion. This removes the last provider-specific legacy parse free functions, so the parser package now owns every agent's parse behavior on provider receivers rather than on standalone entrypoints. refactor(sync): remove the dead provider shadow-compare harness No provider runs in shadow-compare mode: every agent is now either provider-authoritative or import-only. The shadow harness that dual-ran a side-effect-free provider parse against the legacy result and recorded the diff was therefore never invoked at runtime. Delete the harness end to end: the ObserveProviderSource entry point and its comparison machinery, the Engine.observeProviderShadow hook and its two call sites, the ProviderShadowRecorder config/field wiring, and the ProviderMigrationShadowCompare mode (collapsing every switch that paired it with provider-authoritative). The provider outcome validation and effect planning helpers that the live parse path still relies on move to provider_effects.go, which is all that file ever held that was reachable. test(parser): drop the facade-migration anti-shim scaffolding With every provider folded onto receiver methods and zero provider-specific legacy parse free functions left, the migration's enforcement tests have served their purpose. They assert the absence of named functions and that provider files do not shim legacy entrypoints, which is only meaningful while the stack is mid-migration; after merge they are pure maintenance drag that breaks whenever a symbol is legitimately renamed. Delete the per-provider Test*ProviderOwnsLegacyEntrypoints guards and the shared anti-shim scan (provider_shim_scan_test.go). The providers' behavioral tests remain and are what actually protect the parse paths going forward. feat(parser): migrate aider, omp, reasonix to providers origin/main carries three agents the facade stack never migrated: Aider, OhMyPi, and Reasonix. After rebasing onto it, the explicit provider registry panicked on startup because those agents had no concrete factory, and the migration manifest still listed them as legacy-only against a manifest that no longer defines that mode. Give each a concrete provider so the zero-legacy registry stays intact: - OMP shares Pi's JSONL session format, so the Pi provider is parameterized by AgentDef (type and ID prefix) and serves both Pi and OhMyPi; ParseOMPSession is folded away. - Reasonix gets a single-file provider whose composite fingerprint folds the .jsonl.meta sidecar (mirroring reasonixEffectiveInfo) and whose changed-path classifier reproduces the project/global/archive/subagent layouts and the sidecar-to-transcript mapping. - Aider gets a multi-session provider that fans one history file out into one session per run under "<history>#<idx>" virtual paths and force-replaces on parse, mirroring the Shelley shape. Per-run skip is handled by dropUnchangedSharedSQLiteResults (content-hash compare); remote-sync identity stability is preserved by threading the path rewriter through ProviderConfig so per-run IDs stay stable across temp extraction dirs. The three manifest entries flip to provider-authoritative and the legacy engine methods (processAider, processReasonix, aiderFileUnchanged, aiderIdentityPath, classifyAiderPath) plus the now-dead legacy processFile fall-through are removed. The two codex append regression tests that were re-pointed onto processFile no longer consume their re-stat result; drop the unused assignment to satisfy staticcheck. test(sync): restore provider runtime regressions The shadow-compare harness removal also deleted coverage for live provider-authoritative runtime behavior. Restore those checks against the final processFile and SyncSingleSession paths with a small fake provider so future migrations cannot drop source lookup, retry data-version, skip-cache, skip-reason, not-found, or force-parse behavior silently.\n\nAlso assert the checked-in migration manifest only contains final provider modes and remove stale shadow-compare wording from live sync comments.\n\nValidation: go test -tags "fts5" ./internal/sync -run 'TestProcessFileProvider|TestSyncSingleSessionProviderAuthoritativeBypassesProviderSkipCache' -count=1; go test -tags "fts5" ./internal/parser -run 'TestProviderMigrationModes' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; git diff --check fix(sync): make provider source lookup authoritative Stored file_path values can be stale after provider migration, especially for virtual DB-backed sources and remote-canonical Aider histories. Treat them as lookup hints instead of source-of-truth paths so provider-owned identity and freshness decide what can be resynced. Aider now resolves remote canonical physical and virtual hints using the same path-rewriter identity model as parse, Hermes verifies state.db contains the requested raw ID before claiming it, and DB-backed providers fall through from stale hints to raw-ID lookup while fresh deleted rows remain not found. Validation: go test -tags "fts5" ./internal/parser ./internal/sync -run 'TestDBBackedProviderStoredVirtualPathFreshness|TestDBBackedProviderRejectsInvalidStoredVirtualPaths|TestFindSourceFileProviderAuthoritativePrefersProviderOverStoredPath|TestSyncForgeMissingConversation' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; git diff --check fix(watch): preserve provider watch root semantics Provider-authoritative agents must drive live watcher setup from their WatchPlan, otherwise legacy flags such as Cowork's ShallowWatch can silently narrow recursive coverage. Keep the provider root shape when collecting watcher roots so recursive provider roots stay recursive. Missing provider roots are only treated as covered when an existing recursive watch covers the subtree or a shallow root can observe direct creation of that missing root. This preserves Codex's shallow parent behavior without letting shallow ancestors hide deeper missing recursive roots from polling. Validation: go test -tags "fts5" ./cmd/agentsview -run 'TestCollectWatchRoots(UsesCoworkProviderRecursiveRoot|UsesGeminiProviderMetadataRoot|UsesAntigravityCLIHistoryRoot|PreservesDirsSharingWatchRoot|HermesSessionsWatchesStateDBParent)|TestMissingWatchRootCoverageDoesNotTreatShallowAncestorAsRecursive' -count=1; go test -tags "fts5" ./cmd/agentsview ./internal/parser -run 'TestCollectWatchRoots|TestMissingWatchRootCoverage|TestCoworkProviderSourceMethods|TestGeminiProvider|TestAntigravityCLI|TestSQLiteFanoutSourceSetUsesFindDBPathAsCanonical' -count=1; go test -tags "fts5" ./cmd/agentsview ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; git diff --check test(provider): cover non-sync provider callers Provider migration removed legacy discovery and source lookup hooks, so non-sync callers need explicit coverage that they continue to use provider capabilities. Add contracts for parse-diff supported-agent resolution, token-use raw disk probing, and SSH remote directory resolution across the provider-authoritative agents called out by review. The Cursor token-use case uses a real provider-owned source layout to prove an unsynced raw or canonical session ID resolves through provider FindSource. Comments now describe the shared disk source lookup instead of implying FindSourceFunc is still the only path. Validation: go test -tags "fts5" ./cmd/agentsview ./internal/sync ./internal/ssh -run 'TestParseDiffSupportedAgentsIncludesProviderAuthoritativeAgents|TestParseDiffProviderAuthoritativeAgentsAreDiscoverable|TestParseDiffDiscoversProviderSources|TestResolveSessionID_ProviderAuthoritativeCursorOnDiskNotInDB|TestAgentHasDiskSourceLookupIncludesProviderAuthoritativeAgents|TestBuildResolveScript' -count=1; go test -tags "fts5" ./cmd/agentsview ./internal/sync ./internal/ssh -count=1; go fmt ./...; go vet ./...; git diff --check docs(provider): spell out source identity contract Provider implementers need the source-hint and freshness rules at the API boundary, not only embedded in migration review context. Document that stored file paths and fingerprint keys are advisory, provider lookup is authoritative, and persisted source identity must stay compatible with source metadata, skip-cache, data-version, PostgreSQL, and session metadata consumers. Also update the facade design note so the current stack tip no longer claims shadow-compare mode still exists. Validation: go test -tags "fts5" ./internal/parser -run 'TestProvider|TestProviderMigrationModes' -count=1; go fmt ./...; go vet ./...; git diff --check; mdformat applied by commit hook test(provider): enforce anti-shim ownership policy The migration should not rely on per-agent AgentDef hooks or exported provider-specific facade functions once a provider is authoritative. Add one maintained package-wide guard for that policy and remove the obsolete Kiro-only scan. Aider and Reasonix were still wired through legacy DiscoverFunc/FindSourceFunc despite having provider-authoritative implementations. Remove those hook assignments and make their provider-specific parser/discovery/lookup helpers package-local so runtime callers go through the provider registry. Validation: go test -tags "fts5" ./internal/parser -run 'TestProviderAuthoritativeAgentDefsDoNotExposeLegacyHooks|TestNoExportedProviderFacadeShims|TestNoProviderFacadeShimNamePolicyDocumentsAllowedHelpers|TestProviderAntiShimScanReadsExpectedPackage|TestAider|TestReasonix|TestProviderMigrationModes' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -run 'Aider|Reasonix|TestProviderAuthoritativeAgentDefsDoNotExposeLegacyHooks|TestNoExportedProviderFacadeShims|TestProviderMigrationModes' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; git diff --check; mdformat applied by commit hook fix(parser): make import export capabilities agent-specific ChatGPT and Claude.ai imports should advertise only the export parser surface that belongs to their agent. The shared import-only provider type made both interface assertions succeed, which hid capability loss during migration review and let callers treat either provider as supporting the other export format. Validation: go test -tags "fts5" ./internal/parser -run 'TestImportOnlyProviderExportCapabilitiesAreAgentSpecific|TestParseChatGPTExport|TestParseClaudeAIExport' -count=1; go test -tags "fts5" ./internal/parser -count=1; go fmt ./...; go vet ./...; git diff --check fix(parser): hash reasonix metadata sidecars Reasonix metadata-only edits can change session fields without touching the transcript bytes. Provider fingerprinting already folded sidecar size and mtime, but the hash stayed transcript-only, so stored freshness state could miss metadata-only changes under hash-based comparisons. Keep missing-sidecar hashes compatible with the existing transcript-only value, and add provider coverage for layout classification plus deleted sidecar/transcript behavior so the migration contract is explicit. Validation: go test -tags "fts5" ./internal/parser -run 'TestReasonixProvider(Fingerprint|ChangedPath)' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -run 'Reasonix' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; git diff --check fix(sync): keep aider off mtime-only skip cache Aider histories fan out one physical file into multiple virtual run rows. Letting the generic skip cache key the physical history by mtime alone can bypass the provider fingerprint and per-run DB checks, hiding same-mtime content changes, missing run rows, stale hashes, or stale data versions. Disable generic skip caching for Aider and rely on the provider parse plus dropUnchangedSharedSQLiteResults hash/data-version filtering. This preserves correctness at the cost of reparsing the shared history before dropping unchanged runs. Validation: go test -tags "fts5" ./internal/sync -run 'TestProcessFileAiderProvider' -count=1; go test -tags "fts5" ./internal/parser -run 'TestAiderProviderFindSourceUsesCanonicalIdentity|TestAiderProviderRemoteIdentityStable' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -run 'Aider' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; git diff --check refactor(parser): extract multi-session container base Shelley and Aider both surface one physical container (a SQLite DB / a chat history file) as many virtual member sessions, and each hand-rolled the same source-set scaffolding: discovery, watch plan, changed-path classification, the StoredFilePath/FingerprintKey/RawSessionID FindSource tiers, fingerprinting, and the container/member parse fan-out. Introduce multiSessionContainerSourceSet, a reusable source set, provider, and factory configured entirely through functional options (with*()), so a new special case is a new option rather than a wider signature or a new interface method. Aider's remote-sync identity (PathRewriter) and its canonical-path stored fallback, and Shelley's member-presence check, are all expressed as options. Fold shelley_provider.go and aider_provider.go onto the base. Per-provider code drops ~45% each (shelley 532->289, aider 501->274); the 469-line base is a one-time cost the remaining container-family providers (zed, kiro, opencode, db-backed, copilot) can reuse. refactor(parser): add generic SourceSet provider plumbing The multi-session container base shipped with its own ProviderFactory and a delegating Provider that forwarded six methods to the source set. Every future reusable base (single-file sidecar, and the rest) would re-hand-roll that same factory + forwarding shell. Extract it once: a SourceSet interface (the Provider source methods plus Parse, minus the Definition/Capabilities/config plumbing), a sourceSetProvider wrapper that supplies ProviderBase and applies the two normalizations every provider shares (raw-session-ID injection on FindSource, the request/config machine fallback on Parse), and a generic sourceSetFactory built from def + caps + a per-config SourceSet constructor. multiSessionContainerSourceSet now implements SourceSet directly (gaining a Parse method); newMultiSessionProviderFactory becomes a thin adapter over newSourceSetFactory. ParseIncremental stays on the ProviderBase unsupported default until a base needs it. refactor(parser): add single-file source-set base, fold reasonix Add singleFileSourceSet, the second reusable SourceSet base, for providers whose physical source is one file that parses into exactly one session (no virtual member paths, no fan-out). Like multiSessionContainerSourceSet it is configured through functional options (withFile*()) and plugs into newSourceSetFactory. The sidecar/composite fingerprint variance across this family stays inside each provider's withFileFingerprint closure, so the base carries no sidecar knowledge until a shared helper is warranted. Fold reasonix onto it as the validation provider: discovery, the .jsonl.meta sidecar changed-path mapping, the WatchSubdirs-aware changed-path resolution, the composite fingerprint, and the project-hint + fingerprint stamping in parse all become option closures. The provider file drops from 457 to ~280 lines; behavior is preserved (full parser and sync suites green). refactor(parser): fold zed onto multi-session container base Zed is a SQLite-DB-per-root thread container like Shelley. Replace its hand-rolled factory/provider/source set with multiSessionContainerSourceSet option closures, preserving the zed specifics: the container path is root/threads/threads.db, the member fingerprint mtime comes from ZedSQLiteSourceMtime, and the container fan-out stamps the DB's own content hash (computed in the parseContainer closure, not the request fingerprint, so missing-fingerprint parses still hash the rows). sqliteDBCompositeMtime stays put; it is shared with shelley and kiro. refactor(parser): fold visual studio copilot onto container base Visual Studio Copilot surfaces conversations from shared trace files, but unlike the SQLite containers it discovers one source per conversation (deduped across traces, newest wins) plus a bare source per unreadable trace. Two base generalizations support that: withSourceDiscovery lets a provider emit member-level matches at discovery time, and multiSessionMatch now carries a ProjectHint surfaced on the SourceRef. The multi-session parse closures now receive the full ParseRequest instead of just the machine string, mirroring the single-file base. This lets vs_copilot honor req.Source.ProjectHint, which the engine sets to the DB-preserved project on single-session re-sync so a user's project override is not reverted. shelley, zed, and aider closures are updated to the new signature (they read req.Machine). parseConversation becomes a free function and the test helpers call it directly. vs_copilot keeps its virtual-paths-always-strict changed-path classification and stamps the shared trace hash on every fanned-out conversation. Provider file drops 454->250; full parser and sync suites green. refactor(parser): fold cowork onto single-file base Cowork sources are single Claude-format transcripts, but one transcript can yield several sessions (main plus subagents) and a parse drives removals via excluded session IDs. Generalize singleFileSourceSet's parse contract to return ([]ParseResult, []string excluded) instead of one *ParseResult, and add withAlwaysCompleteResultSet so a parse that only excludes sessions still reports a complete (not skipped) result set. SourcesForChangedPath now derives allowMissing from jsonlMissingPathFallbackAllowed(req) rather than hardcoding true, which cowork (and vibe) require and reasonix is indifferent to. reasonix's parse closure is updated to the new slice signature. Cowork's hand-rolled factory/provider/source set become option closures; parseSession becomes the free function parseCoworkSession; the metadata-to-transcript changed-path mapping, composite mtime, and project-hint-from-metadata are all preserved. Provider file drops 523->352; full parser and sync suites green. refactor(parser): fold vibe onto single-file base Vibe sessions are single messages.jsonl transcripts with a sibling meta.json. Fold onto singleFileSourceSet: discovery, the messages.jsonl/meta.json changed-path mapping (strict vs session-dir-name fallback under allowMissing), the composite fingerprint, and the fallback-ID exclusion become option closures. The single result plus exclusions and the skip-on-no-session behavior ride the base's multi-result parse contract without withAlwaysCompleteResultSet, since vibe still skips when no session is parsed. parseSession and parseVibeResult become the free functions parseVibeSession and parseVibeResultFile; the test helpers call them and the provider's Discover directly instead of a concrete *vibeProvider. Provider file drops 507->300; full parser and sync suites green. refactor(parser): make JSONLSourceSet a SourceSet, fold amp Add a ParseFile option to JSONLSourceSetOptions and a Parse method to JSONLSourceSet, so the directory-of-files source set (and DirectoryJSONLSourceSet) implements the full SourceSet interface and rides the generic sourceSetFactory. Parse mirrors the single-file base: empty results with no exclusions is a clean no-session skip; req.Machine is resolved by sourceSetProvider. Amp is folded as the template: its hand-rolled factory, provider, five forwarding methods, and Parse collapse to newSourceSetFactory plus an ampParseFile closure; parseSession becomes the free function parseAmpSession. 139->63 lines. refactor(parser): fold qwen provider onto SourceSet factory Replace the hand-rolled qwenProvider struct, factory struct, var _ Provider assertion, forwarding methods, and Parse with the generic newSourceSetFactory plus a ParseFile option on the JSONL source set. The ParseFile closure passes req.Source.ProjectHint as the project hint. Convert parseSession from a *qwenProvider method to the free function parseQwenSession and update the test helper accordingly. refactor(parser): fold workbuddy onto SourceSet convergence Replace the hand-rolled workBuddyProvider struct, factory, and forwarding methods with newSourceSetFactory plus a ParseFile option, matching the amp provider. Convert parseSession from a method to the free function parseWorkBuddySession. Add an optional LookupIDValid predicate to JSONLSourceSetOptions so the generic FindSource fallback accepts WorkBuddy composite subagent IDs (<id>:subagent:<id>), which IsValidSessionID rejects. The option defaults to IsValidSessionID, preserving behavior for all other source-set providers. refactor(parser): fold deepseek_tui onto SourceSet convergence refactor(parser): fold zencoder onto SourceSet convergence refactor(parser): thread context through JSONLSourceSet ParseFile ParseFile now receives the parse context so directory-of-files folds can do context-aware work such as git-root project resolution; the existing single-result closures ignore it. Also adds a RawSessionIDForLookup hook that normalizes a stored raw session ID before the FindSource discovery comparison, for providers whose stored IDs carry a suffix the discovered filename stem lacks. refactor(parser): fold iflow onto SourceSet convergence iFlow now rides the generic source-set factory via DirectoryJSONLSourceSet's ParseFile. Its multi-result parse, relationship inference, and git-aware project resolution move into the ParseFile closure using the threaded context, and the subagent-base-ID normalization its custom FindSource performed is carried by the RawSessionIDForLookup hook. refactor(parser): fold kimi onto SourceSet convergence Kimi rides the generic source-set factory via ParseFile. Its colon-joined raw session IDs cannot be matched by the filename-stem discovery scan, so the wire.jsonl path reconstruction (including the agents/ subagent layout) moves to a new RawSessionIDSourceFiles hook that FindSource consults before the scan. refactor(parser): fold kiro_ide onto SourceSet convergence Kiro IDE rides the generic source-set factory via ParseFile. Its two on-disk layouts (old <ws-hash>:<file-hash> .chat IDs and new UUID .json files under workspace-sessions/) are resolved by reconstructing candidate paths in the RawSessionIDSourceFiles hook, since the colon-joined old IDs cannot be matched by the filename-stem discovery scan. refactor(parser): fold qwenpaw onto SourceSet convergence QwenPaw rides the generic source-set factory via ParseFile. Its colon-joined raw IDs are reconstructed through RawSessionIDSourceFiles; a DB-recorded file_path outside the configured roots is honored via the new StoredPathFallbackRoot hook, which synthesizes the implicit <root>/<workspace>/sessions/ layout; and the wholesale-rewrite outcome is carried by the ForceReplace option. refactor(parser): convert JSONLSourceSet to functional options JSONLSourceSet and DirectoryJSONLSourceSet are now built with with*() option closures plus default bundles (withContentHashing, withSymlinkFollowing) instead of a struct literal, matching the multiSessionContainer and singleFile bases. Each source only states what differs from the zero-value defaults. Constructors are lowercased (newJSONLSourceSet / newDirectoryJSONLSourceSet) as they are package-internal. Behavior is unchanged; SiblingMetadataSourceSet keeps its struct-based constructor via the shared jsonlSourceSetFromOptions. fix(parser): make aider discovery opt-in Aider had no central session store, so the registry gave it DefaultDirs [""], which resolves to $HOME and drove an always-on bounded walk. For a passive viewer that is a poor default: background refreshes (usage reports, desktop launches) enumerate $HOME and trigger macOS privacy prompts for Documents/Downloads/Music/Photos. Drop the default so aider is discovered only when the user opts in via AIDER_DIR or aider_dirs; a configured broad root still gets the bounded, protected-folder-pruned walk. This aligns the provider-migration branch with the opt-in behavior shipped on main in #844, which this branch predates. fix(ssh): resolve remote aider history without DefaultDirs Making aider discovery opt-in emptied its DefaultDirs, but the SSH resolve script only emitted the AIDER_DIR history-file snippet while iterating that list, so an explicitly configured remote AIDER_DIR stopped resolving any .aider.chat.history.md files for transfer. Handle aider independently of DefaultDirs, emitting buildAiderResolveSnippet whenever its EnvVar is set while still avoiding any default $HOME scan. Also cover JSONLSourceSet.FindSource RawSessionIDForLookup normalization, which runs before both the LookupIDValid gate and the SessionIDFromPath discovery comparison.

Rebasing the provider stack onto current origin/main brought in new upstream tests whose expectations met the migration's refactored APIs: - parseDiffSourceReliableForRaced gated the raced reclassification on def.DiscoverFunc, which the provider migration leaves nil for every agent, so no file-based source was ever reliable and genuine live-write skew was reported as a plain change. Make it an Engine method that reuses parseDiffAgentDiscoverable, the same provider-aware on-disk source predicate resolveParseDiffAgents uses. - A Kiro SQLite store is discovered as one container source but fans out into one session per row, so collectAndBatch counted it as a single file. Add the extra sessions it produces to TotalSessions, restoring the per-session tally the legacy syncKiroSQLite phase reported. - Codex no longer advertises incremental append, so an appended transcript is re-parsed in full and stores the raw file size and hash. Update the former consumed-prefix incremental assertion to that full-parse reality; the parsed-snapshot vs partial-tail distinction is still enforced at parse-diff time via CodexTranscriptConsumedSize.

Every agent is now provider-authoritative, so no AgentDef sets DiscoverFunc or FindSourceFunc. The fields, the always-false branches that consulted them (full-sync discovery, parse-diff discovery and discoverability, SSH resolve, source lookup, token-use disk probing), and the now-empty parseDiffDatabaseSources were left behind by the staged migration. Provider discovery (discoverProviderSources / parseDiffProviderSources) and provider lookup (findProviderSourceFile) already own every one of these paths. Also drop the migration-era scaffolding tests that only asserted this legacy code no longer exists: the provider_anti_shim_test.go suite (legacy-hook nil checks plus the AST scan for exported Discover/Find/Parse/Process facade functions) and the per-agent require.Nil(def.DiscoverFunc) registry assertions. With the fields gone, their absence is enforced by the compiler.

Hermes FindSource aborted the whole lookup when a state.db query errored, even though parseArchive deliberately falls back to the transcript parser when state.db is unreadable or schema-incompatible. A valid transcript session sitting next to a corrupt or legacy state.db could no longer be located for resync. Treat a state.db lookup error as non-claiming for that root and continue to the transcript lookup, matching the parser's documented fallback.

Guard provider.findRequests with require.Len before indexing so a missing request fails as a clear length assertion rather than a panic. Drive the parse-diff provider-authoritative contract from the registry so it covers every current file-based provider-authoritative agent instead of a hand-maintained subset. Add a Codex regression test pinning PreferStoredSource to the stored archived duplicate, contrasted with the canonicalize-to-live behavior it opts out of.

roborev-ci · 2026-06-25T17:42:24Z

roborev: Combined Review (`d28f042`)

Reviewed provider migration outputs: one Medium correctness finding remains; no security findings.

Medium

internal/sync/engine.go:6693 - FindSourceFile passes archived StoredFilePath into provider lookup before raw-ID resolution for every provider. Some source sets prefer that hint even when RequireFreshSource is true, so stale paths can win. For Aider, a stored <history>#idx can point at a different run after earlier runs are inserted or removed, bypassing the prior AiderRawIDAt validation and causing SyncSingleSession or watching to parse the wrong run or fail instead of re-resolving by raw session ID.

Suggested fix: Only pass or prefer stored paths for providers that validate them, or make provider lookup honor RequireFreshSource by verifying the stored source still exists and matches the requested raw ID. For Aider, fall back to raw-ID lookup unless the stored virtual path recomputes to that raw ID.

Panel: ci_default_security | Synthesis: codex, 8s | Members: codex_default (codex/default, done, 10m1s), codex_security (codex/security, done, 5m13s) | Total: 15m22s

mariusvniekerk · 2026-06-26T02:43:51Z

Superseded by the family-grouped collapse of this stack into 10 green branches (#876–#885), which preserves every commit while cutting the branch count for review. Closing in favor of that stack.

^{generated by a clanker}

mariusvniekerk force-pushed the provider-explicit-registry branch from a63405c to 41637b8 Compare June 20, 2026 00:06

mariusvniekerk force-pushed the provider-explicit-registry branch from 555b938 to 30ae7e6 Compare June 20, 2026 00:17

mariusvniekerk mentioned this pull request Jun 20, 2026

feat(parser): migrate db-backed providers #783

Closed

mariusvniekerk force-pushed the provider-explicit-registry branch 3 times, most recently from 2fa10c1 to 089ddd7 Compare June 20, 2026 00:24

mariusvniekerk force-pushed the provider-db-backed-family branch from 56a8a67 to 81b0cad Compare June 20, 2026 00:34

mariusvniekerk force-pushed the provider-explicit-registry branch from 089ddd7 to 0671d6c Compare June 20, 2026 00:34

mariusvniekerk mentioned this pull request Jun 20, 2026

feat(parser): migrate antigravity providers #782

Closed

mariusvniekerk force-pushed the provider-db-backed-family branch from 81b0cad to ae27e4f Compare June 20, 2026 00:42

mariusvniekerk force-pushed the provider-explicit-registry branch from 0671d6c to b6de96f Compare June 20, 2026 00:42

mariusvniekerk force-pushed the provider-db-backed-family branch from ae27e4f to 0799e94 Compare June 20, 2026 00:53

mariusvniekerk force-pushed the provider-explicit-registry branch from b6de96f to 1ab0ad4 Compare June 20, 2026 00:53

mariusvniekerk force-pushed the provider-db-backed-family branch from 0799e94 to 82f9e12 Compare June 20, 2026 00:57

mariusvniekerk force-pushed the provider-explicit-registry branch from 1ab0ad4 to b68ac14 Compare June 20, 2026 00:57

mariusvniekerk force-pushed the provider-db-backed-family branch from 82f9e12 to e8149e5 Compare June 20, 2026 01:02

mariusvniekerk force-pushed the provider-explicit-registry branch from b68ac14 to 42bcdb4 Compare June 20, 2026 01:02

mariusvniekerk force-pushed the provider-db-backed-family branch from e8149e5 to 83f8b73 Compare June 20, 2026 01:42

mariusvniekerk force-pushed the provider-explicit-registry branch from 42bcdb4 to 53ea4b2 Compare June 20, 2026 01:42

mariusvniekerk force-pushed the provider-db-backed-family branch from 83f8b73 to 1f17524 Compare June 21, 2026 00:41

mariusvniekerk force-pushed the provider-db-backed-family branch from 5157aa7 to 058ca18 Compare June 21, 2026 01:47

mariusvniekerk force-pushed the provider-explicit-registry branch 2 times, most recently from 488d841 to 9671172 Compare June 21, 2026 01:52

mariusvniekerk force-pushed the provider-db-backed-family branch from 4f27d46 to 39a2870 Compare June 23, 2026 23:56

mariusvniekerk force-pushed the provider-explicit-registry branch from 9671172 to 245482d Compare June 23, 2026 23:56

mariusvniekerk force-pushed the provider-db-backed-family branch from 39a2870 to 7d46117 Compare June 25, 2026 05:49

mariusvniekerk force-pushed the provider-explicit-registry branch from 245482d to 5a2f40d Compare June 25, 2026 05:49

mariusvniekerk added 5 commits June 25, 2026 12:06

mariusvniekerk force-pushed the provider-db-backed-family branch from 7d46117 to 225fe95 Compare June 25, 2026 17:29

mariusvniekerk force-pushed the provider-explicit-registry branch from b842c9e to d28f042 Compare June 25, 2026 17:29

mariusvniekerk closed this Jun 26, 2026

Uh oh!

Conversation

mariusvniekerk commented Jun 20, 2026

Uh oh!

roborev-ci Bot commented Jun 20, 2026

roborev: Combined Review (a63405c)

Uh oh!

roborev-ci Bot commented Jun 20, 2026

roborev: Combined Review (41637b8)

Uh oh!

roborev-ci Bot commented Jun 20, 2026

roborev: Combined Review (555b938)

Uh oh!

mariusvniekerk commented Jun 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

roborev-ci Bot commented Jun 20, 2026

roborev: Combined Review (30ae7e6)

Uh oh!

roborev-ci Bot commented Jun 20, 2026

roborev: Combined Review (089ddd7)

Uh oh!

roborev-ci Bot commented Jun 20, 2026

roborev: Combined Review (0671d6c)

Uh oh!

roborev-ci Bot commented Jun 20, 2026

roborev: Combined Review (b6de96f)

Uh oh!

roborev-ci Bot commented Jun 20, 2026

roborev: Combined Review (42bcdb4)

Uh oh!

roborev-ci Bot commented Jun 20, 2026

roborev: Review Unavailable (53ea4b2)

Uh oh!

roborev-ci Bot commented Jun 21, 2026

roborev: Combined Review (9671172)

claude-code — default (done)

Review Findings

Summary

claude-code — security (done)

Uh oh!

roborev-ci Bot commented Jun 24, 2026

roborev: Combined Review (245482d)

Uh oh!

roborev-ci Bot commented Jun 25, 2026

roborev: Combined Review (5a2f40d)

Medium

Uh oh!

roborev-ci Bot commented Jun 25, 2026

roborev: Combined Review (b842c9e)

Medium

Uh oh!

roborev-ci Bot commented Jun 25, 2026

roborev: Combined Review (d28f042)

Medium

Uh oh!

mariusvniekerk commented Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

1 participant

roborev: Combined Review (`a63405c`)

roborev: Combined Review (`41637b8`)

roborev: Combined Review (`555b938`)

mariusvniekerk commented Jun 20, 2026 •

edited

Loading

roborev: Combined Review (`30ae7e6`)

roborev: Combined Review (`089ddd7`)

roborev: Combined Review (`0671d6c`)

roborev: Combined Review (`b6de96f`)

roborev: Combined Review (`42bcdb4`)

roborev: Review Unavailable (`53ea4b2`)

roborev: Combined Review (`9671172`)

roborev: Combined Review (`245482d`)

roborev: Combined Review (`5a2f40d`)

roborev: Combined Review (`b842c9e`)

roborev: Combined Review (`d28f042`)