feat(parser): migrate OpenClaw + QClaw providers #880

Merged
mariusvniekerk merged 3 commits into main from fam/claw-family on Jun 28, 2026

mariusvniekerk commented Jun 26, 2026
Collaborator

Migrates the OpenClaw and QClaw providers onto the facade, preserving discovery and parse behavior.

roborev-ci Bot commented Jun 26, 2026

roborev: Combined Review (5970de1)

High-risk regression found in the OpenClaw/QClaw provider migration; security review found no exploitable issue.

High

  • Location: internal/parser/types.go:367
  • Problem: Removing DiscoverFunc/FindSourceFunc from OpenClaw and QClaw leaves existing hook-based callers behind. internal/ssh/resolve.go:77 will now skip these agents entirely for remote sync, and cmd/agentsview/token_use.go:91 / cmd/agentsview/token_use.go:112 can no longer resolve unsynced OpenClaw/QClaw IDs for on-demand session usage.
  • Fix: Either keep the legacy hooks until those callers are provider-aware, or update the SSH resolver and session ID resolver to include provider-authoritative file-based agents.

Medium

  • Location: internal/parser/claw_provider.go:312
  • Problem: The provider fingerprint no longer computes a content hash, so clawParseOutcome only sees an empty req.Fingerprint.Hash and full syncs will overwrite previously stored OpenClaw/QClaw file_hash values with NULL. The removed legacy processors always computed ComputeFileHash.
  • Fix: Populate SourceFingerprint.Hash in clawSourceSet.Fingerprint using the file content hash, or compute and stamp the hash during Parse.

Panel: ci_default_security | Synthesis: codex, 10s | Members: codex_default (codex/default, done, 5m44s), codex_security (codex/security, done, 2m41s) | Total: 8m35s

roborev-ci Bot commented Jun 26, 2026

roborev: Combined Review (645790d)

Medium issue found; no high or critical findings.

Medium

  • internal/parser/claw_provider.go:312 - OpenClaw/QClaw provider fingerprints now include only size and mtime, so clawParseOutcome does not receive a real Fingerprint.Hash during normal sync. Since the removed legacy processOpenClaw/processQClaw paths computed and stored ComputeFileHash, reparsing these sessions can clear sessions.file_hash.

    Fix: Preserve legacy behavior by computing a content hash for the source, either in Fingerprint or as a fallback in clawParseOutcome, and add a provider-level test that exercises provider.Fingerprint instead of injecting a hash.


Panel: ci_default_security | Synthesis: codex, 7s | Members: codex_default (codex/default, done, 7m10s), codex_security (codex/security, done, 1m57s) | Total: 9m14s

roborev-ci Bot commented Jun 26, 2026

roborev: Combined Review (08a36f5)

Summary verdict: One medium issue needs fixing before merge.

Medium

  • internal/parser/claw_provider.go:312 - Provider fingerprints omit Hash, while clawParseOutcome only sets sess.File.Hash from req.Fingerprint.Hash. Real OpenClaw/QClaw syncs will write a nil file_hash and can clear hashes previously populated by the legacy ComputeFileHash path.
    • Fix: Populate SourceFingerprint.Hash with the transcript SHA-256, or compute/set the hash during parse. Add a test that runs Fingerprint then Parse.

Panel: ci_default_security | Synthesis: codex, 7s | Members: codex_default (codex/default, done, 8m29s), codex_security (codex/security, done, 1m39s) | Total: 10m15s

roborev-ci Bot commented Jun 26, 2026

roborev: Combined Review (9cdc39d)

Medium findings remain in the OpenClaw/QClaw provider migration; no security-impacting issues were reported.

Medium

  • internal/parser/claw_provider.go:312 - Provider fingerprints only include size/mtime, so clawParseOutcome never receives a real Fingerprint.Hash. Legacy OpenClaw/QClaw processing computed and persisted file_hash, but provider sync may now overwrite it with an empty value. Compute a content hash in clawSourceSet.Fingerprint or during parse, and add an engine/provider-path test that verifies stored file_hash is non-empty.

  • internal/ssh/resolve.go:77 - Remote sync discovery still skips file-based agents with no DiscoverFunc. Since this diff removes those hooks from OpenClaw/QClaw, their remote directories are no longer emitted for transfer. Teach the resolver to include provider-authoritative file-based agents, or keep small legacy discovery adapters until remote sync is provider-aware.

  • cmd/agentsview/token_use.go:91 - Local session usage disk probing still only calls FindSourceFunc. OpenClaw/QClaw sessions present on disk but not yet in the DB now resolve as unknown, so on-demand SyncSingleSession is skipped. Add provider-backed disk probes for provider-authoritative file-based agents, or retain FindSourceFunc adapters for these agents.


Panel: ci_default_security | Synthesis: codex, 10s | Members: codex_default (codex/default, done, 6m13s), codex_security (codex/security, done, 1m29s) | Total: 7m52s

roborev-ci Bot commented Jun 26, 2026

roborev: Combined Review (4f66903)

Reviewed OpenClaw/QClaw provider migration: 2 medium issues need attention before merge.

Medium

  • internal/parser/types.go:367
    Removing FindSourceFunc for OpenClaw/QClaw breaks session usage / token-use for unsynced Claw sessions. resolveRawSessionID still probes disk only through AgentDef.FindSourceFunc, so valid on-disk openclaw:<agent>:<session> or qclaw:<agent>:<session> IDs that are not yet in the DB are treated as unknown and skipped instead of being synced on demand.
    Fix: Add a provider-backed disk probe to resolveRawSessionID for provider-authoritative agents, or keep compatibility FindSourceFunc wrappers until that caller is migrated. Add tests for unsynced OpenClaw and QClaw IDs on disk.

  • internal/parser/claw_provider.go:312
    The new Claw provider fingerprint omits a content hash, but clawParseOutcome only preserves Session.File.Hash when req.Fingerprint.Hash is set. The removed legacy processOpenClaw / processQClaw paths always computed file_hash, so the next provider parse will write NULL and clear existing stored hashes.
    Fix: Compute and populate SourceFingerprint.Hash in clawSourceSet.Fingerprint for Claw sources, matching the legacy full-file hash behavior.
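
A minimal sketch of the colon-delimited ID shape named in the first finding, and of the path a disk probe could check for it. This is not the repository's resolver: `splitClawID`, `validSegment`, and `candidatePath` are hypothetical helpers, and the `<session>.jsonl` file name is an assumption about how active transcripts are named.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// validSegment rejects empty segments and anything that could escape the
// agent directory when joined into a path.
func validSegment(s string) bool {
	return s != "" && s != "." && s != ".." && !strings.ContainsAny(s, `/\`)
}

// splitClawID is a hypothetical helper that splits an ID of the form
// "openclaw:<agent>:<session>" or "qclaw:<agent>:<session>".
func splitClawID(id string) (provider, agent, session string, ok bool) {
	parts := strings.SplitN(id, ":", 3)
	if len(parts) != 3 {
		return "", "", "", false
	}
	if parts[0] != "openclaw" && parts[0] != "qclaw" {
		return "", "", "", false
	}
	if !validSegment(parts[1]) || !validSegment(parts[2]) {
		return "", "", "", false
	}
	return parts[0], parts[1], parts[2], true
}

// candidatePath builds the file a disk probe would stat, assuming the
// <root>/<agent>/sessions/<session>.jsonl layout (the file name is assumed).
func candidatePath(root, agent, session string) string {
	return filepath.Join(root, agent, "sessions", session+".jsonl")
}

func main() {
	provider, agent, session, ok := splitClawID("openclaw:main:abc123")
	fmt.Println(provider, agent, session, ok)
	fmt.Println(candidatePath("/data/openclaw", agent, session))
}
```

A probe built this way only needs the configured root and the ID, so it can recognize a session that exists on disk before any row for it exists in the DB.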


Panel: ci_default_security | Synthesis: codex, 9s | Members: codex_default (codex/default, done, 6m7s), codex_security (codex/security, done, 1m4s) | Total: 7m20s

roborev-ci Bot commented Jun 27, 2026

roborev: Combined Review (8eb63eb)

Summary verdict: one medium regression should be fixed before merge; no security issues were found.

Medium

  • internal/parser/types.go:367, cmd/agentsview/token_use.go:91
    OpenClaw/QClaw now drop FindSourceFunc, but resolveRawSessionID still resolves on-disk unsynced sessions only through FindSourceFunc. An unsynced Claw session present on disk is now treated as unknown, so session usage/token-use skips the on-demand SyncSingleSession path and reports not found.

    Fix: Teach resolveRawSessionID to use provider FindSource for provider-authoritative file-based agents, or keep thin legacy FindSourceFunc shims for Claw. Add on-disk empty-DB tests for OpenClaw and QClaw raw/canonical IDs.
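
The two options in the fix can coexist in one probe: try the legacy hook when it is present, otherwise ask the provider. The sketch below illustrates that ordering only. `agentDef` is a trimmed, hypothetical stand-in for the real `AgentDef`, `ProviderFind` is an invented field, and the `main:abc123` raw ID shape is assumed.

```go
package main

import "fmt"

// agentDef is a hypothetical, trimmed stand-in for the real AgentDef.
type agentDef struct {
	Name           string
	FindSourceFunc func(rawID string) string // legacy hook; nil after migration
	ProviderFind   func(rawID string) string // hypothetical provider-backed lookup
}

// probeDisk returns the on-disk path for rawID and whether it was found.
// The legacy hook is consulted first so unmigrated agents keep their behavior.
func probeDisk(def agentDef, rawID string) (string, bool) {
	if def.FindSourceFunc != nil {
		if p := def.FindSourceFunc(rawID); p != "" {
			return p, true
		}
	}
	if def.ProviderFind != nil {
		if p := def.ProviderFind(rawID); p != "" {
			return p, true
		}
	}
	return "", false
}

func main() {
	// Simulated disk contents keyed by raw session ID (illustrative only).
	onDisk := map[string]string{
		"main:abc123": "/data/openclaw/main/sessions/abc123.jsonl",
	}
	def := agentDef{
		Name:         "openclaw",
		ProviderFind: func(rawID string) string { return onDisk[rawID] },
	}
	path, known := probeDisk(def, "main:abc123")
	fmt.Println(path, known)
}
```

With this shape, an agent that has dropped `FindSourceFunc` still resolves as known, so the on-demand `SyncSingleSession` path stays reachable.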


Panel: ci_default_security | Synthesis: codex, 8s | Members: codex_default (codex/default, done, 6m19s), codex_security (codex/security, done, 2m49s) | Total: 9m16s

roborev-ci Bot commented Jun 27, 2026

roborev: Combined Review (def6e5d)

Medium-risk issues remain around OpenClaw/QClaw provider migration; no security findings were reported.

Medium

  • internal/parser/types.go:368 - Removing FindSourceFunc from OpenClaw/QClaw breaks session usage / token-use on-demand resolution for sessions that exist on disk but are not yet in the DB. resolveRawSessionID still probes disk only through FindSourceFunc, so provider-owned Claw sessions return known=false and never call SyncSingleSession.

    Suggested fix: Add a provider-backed disk probe to resolveRawSessionID, or keep thin FindSourceFunc compatibility wrappers until the CLI uses providers. Add tests for canonical and raw on-disk OpenClaw/QClaw IDs absent from the DB.

  • internal/parser/claw_provider.go:315 - Fingerprint now reads and hashes the entire Claw transcript. Provider-owned sources call Fingerprint during mtime filtering, so incremental syncs can hash every OpenClaw/QClaw file before excluding old files by mtime, regressing the legacy stat-only cutoff path.

    Suggested fix: Keep Fingerprint cheap for freshness checks and compute the full content hash during Parse when populating Session.File.Hash, or add a separate cheap mtime path for provider filtering.
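
A sketch of the cheap-cutoff idea in the second finding: filter on metadata first and read content only for files that survive. Everything here is illustrative. `source` and `hashFresh` are hypothetical names, and the `Read` callback stands in for a real file read so the number of reads can be observed.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// source is a hypothetical stand-in for a discovered transcript.
type source struct {
	Path    string
	ModUnix int64         // modification time from a stat call, Unix seconds
	Read    func() []byte // reads the whole file; the expensive step
}

// hashFresh applies the mtime cutoff using metadata only, then hashes the
// content of the sources that pass it.
func hashFresh(sources []source, cutoffUnix int64) map[string]string {
	hashes := make(map[string]string)
	for _, s := range sources {
		if s.ModUnix < cutoffUnix {
			continue // old file: excluded without reading its content
		}
		sum := sha256.Sum256(s.Read())
		hashes[s.Path] = hex.EncodeToString(sum[:])
	}
	return hashes
}

func main() {
	reads := 0
	read := func(data string) func() []byte {
		return func() []byte { reads++; return []byte(data) }
	}
	sources := []source{
		{Path: "old.jsonl", ModUnix: 100, Read: read("old")},
		{Path: "new.jsonl", ModUnix: 200, Read: read("abc")},
	}
	hashes := hashFresh(sources, 150)
	fmt.Println(len(hashes), reads) // prints "1 1"
}
```

Only one of the two files is read, which is the property the legacy stat-only cutoff path had and that hashing inside `Fingerprint` would lose.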


Panel: ci_default_security | Synthesis: codex, 8s | Members: codex_default (codex/default, done, 9m20s), codex_security (codex/security, done, 2m25s) | Total: 11m53s

roborev-ci Bot commented Jun 28, 2026

roborev: Combined Review (48ae3e7)

Medium-risk issue found: the Claw provider migration appears to break on-disk token/session usage lookup for unsynced OpenClaw/QClaw sessions.

Medium

  • internal/parser/types.go:367: Removing FindSourceFunc from OpenClaw/QClaw breaks resolveRawSessionID disk probes, which still only call AgentDef.FindSourceFunc in cmd/agentsview/token_use.go. Unsynced on-disk OpenClaw/QClaw sessions can no longer be recognized by token/session usage lookup, so the on-demand SyncSingleSession path is skipped and the CLI reports not found.

    Fix: Teach resolveRawSessionID to use provider FindSource for provider-authoritative file agents, or keep compatibility FindSourceFunc wrappers for these agents. Add coverage for canonical and raw OpenClaw/QClaw IDs present on disk but absent from the DB.


Panel: ci_default_security | Synthesis: codex, 7s | Members: codex_default (codex/default, done, 4m16s), codex_security (codex/security, done, 2m37s) | Total: 7m0s

roborev-ci Bot commented Jun 28, 2026

roborev: Combined Review (e15adbf)

No issues found.


Panel: ci_default_security | Synthesis: codex | Members: codex_default (codex/default, done, 6m58s), codex_security (codex/security, done, 2m45s) | Total: 9m43s

roborev-ci Bot commented Jun 28, 2026

roborev: Combined Review (edf4f0f)

PR is not ready: one medium regression needs to be fixed before merge.

Medium

  • internal/parser/types.go:367: Removing FindSourceFunc for OpenClaw/QClaw breaks on-disk session ID resolution in resolveRawSessionID, which still skips any file-based agent without FindSourceFunc. Unsynced OpenClaw/QClaw sessions passed to session usage or token-use will no longer be found or synced on demand.
    • Fix: Add provider-backed lookup to resolveRawSessionID for provider-authoritative agents, or keep small FindSourceFunc adapters until those callers are migrated. Add on-disk resolution tests for OpenClaw and QClaw.

Panel: ci_default_security | Synthesis: codex, 19s | Members: codex_default (codex/default, done, 6m47s), codex_security (codex/security, done, 2m0s) | Total: 9m6s

roborev-ci Bot commented Jun 28, 2026

roborev: Combined Review (ad8fb37)

Reviewed provider migration: two medium regressions remain; no security findings were reported.

Medium

  • internal/parser/types.go:359: OpenClaw/QClaw now drop DiscoverFunc, but SSH remote sync still only emits remote transfer targets for file-based agents with DiscoverFunc != nil. Remote hosts with only OpenClaw/QClaw sessions will no longer transfer those directories, so new remote sessions for these agents silently stop syncing.

    • Fix: Update the remote resolver to include provider-authoritative file-based agents via their default dirs/provider metadata, or keep temporary legacy discovery hooks until remote resolution is migrated; add a regression test that OpenClaw/QClaw appear in the resolve script.
  • internal/parser/types.go:367: Removing FindSourceFunc also removes OpenClaw/QClaw from resolveRawSessionID's on-disk probes, which still skip agents without that hook. A canonical or raw Claw ID that exists on disk but has not been synced yet now resolves as known=false, so session usage/token-use will not perform the intended on-demand sync.

    • Fix: Add a provider-backed lookup path in resolveRawSessionID for provider-authoritative file-based agents, or retain compatibility FindSourceFunc adapters; cover unsynced OpenClaw/QClaw disk resolution in tests.

Panel: ci_default_security | Synthesis: codex, 10s | Members: codex_default (codex/default, done, 7m2s), codex_security (codex/security, done, 3m33s) | Total: 10m45s

roborev-ci Bot commented Jun 28, 2026

roborev: Combined Review (f3f502a)

The PR has one Medium issue to address before merging.

Medium

  • internal/parser/claw_provider.go:271: FindSource only resolves stored paths through currently configured roots. After this migration, SyncSingleSession can still prefer an existing DB file_path even when the OpenClaw/QClaw root was removed or changed, but processProviderFile then asks the provider to resolve that stored path and gets “provider source not found.” The legacy processors parsed that stored path directly, so explicit resyncs for existing sessions can regress.

    Fix: Add a stored-path fallback for Claw sources that infers the implicit root from <root>/<agent>/sessions/<file>, validates the shape, and returns a SourceRef for existing stored paths outside configured roots.
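
The shape check in the suggested fallback could look roughly like this. `inferClawRoot` is a hypothetical helper, not code from this PR. It assumes absolute stored paths, and it assumes every transcript variant (active or archived) has `.jsonl` somewhere in its file name.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// inferClawRoot is a hypothetical helper that derives the implicit root and
// agent name from a stored path shaped like <root>/<agent>/sessions/<file>.
func inferClawRoot(storedPath string) (root, agent string, ok bool) {
	clean := filepath.Clean(storedPath)
	if !filepath.IsAbs(clean) {
		return "", "", false
	}
	// Assumption: active and archived transcripts both carry ".jsonl".
	if !strings.Contains(filepath.Base(clean), ".jsonl") {
		return "", "", false
	}
	sessionsDir := filepath.Dir(clean)
	if filepath.Base(sessionsDir) != "sessions" {
		return "", "", false
	}
	agentDir := filepath.Dir(sessionsDir)
	agent = filepath.Base(agentDir)
	root = filepath.Dir(agentDir)
	// Reject paths with no real agent segment, such as /sessions/<file>.
	if agent == "." || agent == ".." || agent == string(filepath.Separator) || root == agentDir {
		return "", "", false
	}
	return root, agent, true
}

func main() {
	root, agent, ok := inferClawRoot("/data/openclaw/main/sessions/abc123.jsonl")
	fmt.Println(root, agent, ok)
}
```

A caller would still need to confirm that the file exists before returning a `SourceRef` for it; the sketch validates only the path shape.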


Panel: ci_default_security | Synthesis: codex, 23s | Members: codex_default (codex/default, done, 7m23s), codex_security (codex/security, done, 1m56s) | Total: 9m42s

Base automatically changed from fam/qwen-family to main June 28, 2026 20:42

OpenClaw and QClaw share a Claw-style source layout where each agent directory owns a sessions folder and active JSONL files compete with archived JSONL variants for the same logical session.

Moving them behind concrete provider facades keeps that active-over-archive and newest-archive policy explicit without broadening the generic JSONL source helpers around variable archive suffixes. The providers preserve colon-delimited agent/session lookup, selected-source change classification, symlinked agent directories, stale stored-path remapping, source fingerprinting, and existing parse normalization.
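
A minimal sketch of the active-over-archive, newest-archive policy described above. It is not the provider's implementation: `candidate` and `selectSource` are hypothetical, the archive file names are invented for illustration, and "newest" is assumed to mean the latest modification time.

```go
package main

import "fmt"

// candidate is a hypothetical description of one file that could back a
// logical session.
type candidate struct {
	Path     string
	Archived bool
	ModUnix  int64 // modification time, Unix seconds
}

// selectSource picks a single source: an active file beats any archive, and
// within the same kind the most recently modified file wins.
func selectSource(cands []candidate) (candidate, bool) {
	var best candidate
	found := false
	for _, c := range cands {
		switch {
		case !found:
			best, found = c, true
		case best.Archived && !c.Archived:
			best = c // active beats any archive, regardless of mtime
		case best.Archived == c.Archived && c.ModUnix > best.ModUnix:
			best = c // same kind: prefer the newer file
		}
	}
	return best, found
}

func main() {
	cands := []candidate{
		{Path: "s1.jsonl.archive-a", Archived: true, ModUnix: 10},
		{Path: "s1.jsonl", Archived: false, ModUnix: 5},
		{Path: "s1.jsonl.archive-b", Archived: true, ModUnix: 20},
	}
	best, ok := selectSource(cands)
	fmt.Println(best.Path, ok)
}
```

The active file is chosen even though both archives are newer; with the active file removed from the slice, the newer archive would be selected instead.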

fix(parser): promote claw archives on removal

Claw providers choose a single source per logical session, so live-sync removal events need to account for source promotion. When an active file or newest archive disappears, another archive may become the selected source even though the changed path is no longer the source to parse.

This keeps write events strict about the selected path, while remove and rename-style missing-path events can remap a valid stale Claw path to the newly selected source for the same raw session ID.
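
The write-versus-remove distinction can be sketched as a small decision function. This is a simplified illustration with hypothetical names (`eventKind`, `pathToParse`). It assumes the caller has already recomputed the selected source for the session after the event, and it does not track whether the removed path had been the selected one.

```go
package main

import "fmt"

type eventKind int

const (
	eventWrite eventKind = iota
	eventRemove
)

// pathToParse decides which file, if any, should be parsed after a change.
// selected is the source currently chosen for the session, or "" when no
// candidate remains on disk.
func pathToParse(kind eventKind, changed, selected string) (string, bool) {
	if selected == "" {
		return "", false
	}
	switch kind {
	case eventWrite:
		// Writes are strict: only the selected path itself triggers a parse.
		if changed == selected {
			return selected, true
		}
	case eventRemove:
		// The changed path is gone; remap to whatever is selected now.
		if changed != selected {
			return selected, true
		}
	}
	return "", false
}

func main() {
	// The active file was removed and an archive became the selected source.
	path, ok := pathToParse(eventRemove, "s1.jsonl", "s1.jsonl.archive-b")
	fmt.Println(path, ok)
}
```

A write to a non-selected archive yields no parse, while removing the active file yields a parse of the promoted archive.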

test(parser): opt openclaw qclaw into provider shadow

OpenClaw and QClaw now have concrete facade providers on this branch, so their migration modes should enter shadow comparison rather than staying legacy-only and additive.

Earlier provider opt-ins remain inherited; later provider branches still own their own modes.

Validation: go test -tags "fts5" ./internal/parser -run TestProviderMigrationModes -count=1; go test -tags "fts5" ./internal/parser -count=1; go vet ./...; git diff --check

test(sync): compare claw shadow parity

OpenClaw and QClaw are shadow-compared on this branch, so add source-level migration coverage that compares provider observation with their legacy parsers.

The paired test follows the shared provider implementation and keeps the agent/session raw ID shape and planned data-version behavior visible during review.

Validation: go test -tags "fts5" ./internal/parser ./internal/sync -run 'TestObserveProviderSourceMatchesClawLegacyParsers|Test(OpenClaw|QClaw)Provider|TestParse(OpenClaw|QClaw)' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; git diff --check; ./custom-gcl run --config .golangci.nilaway.yml ./internal/parser/... ./internal/sync/...

refactor(parser): fold claw providers into provider

OpenClaw and QClaw should no longer keep exported discover/find/parse entrypoints beside the provider facade. Folding discovery, raw-ID lookup, archive selection, and parsing into the concrete providers makes this branch a real migration instead of another shim around the legacy path.

The sync engine now relies on provider changed-path handling for this family, so the provider migration mode can become authoritative and the shadow-only comparison test is removed.

Validation: go test -tags "fts5" ./internal/parser -run 'TestClawProvidersOwnLegacyEntrypoints|TestOpenClaw|TestQClaw|TestClawProvider|TestParseOpenClaw|TestParseQClaw|TestDiscoverOpenClaw|TestDiscoverQClaw|TestFindOpenClaw|TestFindQClaw' -count=1 -v; go test -tags "fts5" ./internal/sync -run 'TestEngine_ClassifyPathsQClaw|TestProviderMigration|TestObserveProvider|TestProviderProcess' -count=1 -v; go fmt ./...; go test -tags "fts5" ./internal/parser ./internal/sync ./cmd/agentsview -count=1; go vet ./...; git diff --check

Update openclaw and qclaw provider call sites to the exported source-set framework API.

clawSourceSet.Fingerprint built a SourceFingerprint without a Hash, so clawParseOutcome's guarded assignment left Session.File.Hash empty for both OpenClaw and QClaw. The legacy processOpenClaw/processQClaw paths always computed a full-file hash via ComputeFileHash and persisted file_hash, and the full-parse write overwrites file_hash unconditionally, so the migration cleared existing hashes to NULL on resync. Compute the content hash in Fingerprint via hashJSONLSourceFile.

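For reference, a full-file content hash of the kind described above can be computed by streaming the file through a hash. This sketch does not reproduce `hashJSONLSourceFile` or `ComputeFileHash`, whose signatures are not shown here. `hashFile` and `hashTempFile` are hypothetical, and SHA-256 is assumed based on the review comment that mentions the transcript SHA-256.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// hashFile streams the file through SHA-256 and returns the hex digest, so
// large transcripts are never held in memory at once.
func hashFile(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// hashTempFile is a demo helper: it writes content to a temporary file,
// hashes it with hashFile, and removes the temporary directory.
func hashTempFile(content string) (string, error) {
	dir, err := os.MkdirTemp("", "claw-hash")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "session.jsonl")
	if err := os.WriteFile(path, []byte(content), 0o600); err != nil {
		return "", err
	}
	return hashFile(path)
}

func main() {
	sum, err := hashTempFile("{\"type\":\"message\"}\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sum)
}
```

A non-empty digest from this step is what keeps the guarded `Session.File.Hash` assignment from leaving `file_hash` as NULL.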
@mariusvniekerk mariusvniekerk merged commit a57f24f into main Jun 28, 2026
13 of 27 checks passed
@mariusvniekerk mariusvniekerk deleted the fam/claw-family branch June 28, 2026 20:43