fix(analytics): define top-session active duration as clamped idle gaps (alt to #867) by mariusvniekerk · Pull Request #869 · kenn-io/agentsview

mariusvniekerk · 2026-06-26T00:26:24Z

Alternative to #867. This branch is built on @rodboev's commits (kept in
the history, so the original work and authorship are preserved); it changes
only how "active duration" is computed.

What changed

#867 computes a session's active duration by summing only the gaps that
follow a tool-use message. That captures tool-execution latency but
discards all model generation and thinking time — including the time spent
before the first tool call — so a session that reasons for minutes and then
fires one quick tool is scored as nearly idle.

This branch instead defines active duration the same way the existing
velocity "active minutes" metric already does: the sum of consecutive
inter-message gaps, with each gap capped at 5 minutes. Every gap counts —
model generation, tool execution, and quick human turnarounds — and only
stretches longer than the cap are bounded as idle.

Why

Message timestamps alone can't tell a 20-minute subagent run from a
20-minute coffee break; both are one long gap. Capping each gap fails
gracefully in both directions (a long idle gap and a long active gap each
contribute at most 5 minutes) instead of guessing, and it keeps the metric
robust to messy data: resumed sessions, machine sleep, and API stalls
produce the largest gaps, and those would otherwise dominate the sum.

It also unifies the definition. The 5-minute cap is hoisted to a single
shared constant (db.ActiveGapCapSec / ActiveGapCapMs) used by both the
velocity metric and Top Sessions, so the two "active" numbers on the
dashboard can't drift; a test asserts the seconds and milliseconds forms
agree.

Tradeoff

A genuinely long active operation — a 15-minute test run, a 20-minute
subagent — is also capped at 5 minutes, so it is undercounted. That is the
deliberate cost of not guessing whether a long gap was work or idle; the
error is bounded and symmetric.

Where to look

internal/db/analytics.go — shared constant, SQLite active-duration SQL,
and the velocity metric now sourcing the same cap.
internal/db/timing.go — SQLite Go fallback (timezone-aware ranking path)
running the same clamp.
internal/postgres/analytics.go, internal/duckdb/analytics_usage.go —
PostgreSQL and DuckDB twins.

All three SQL backends drop the has_tool_use filter and the trailing gap
to ended_at so the value matches the velocity computation.

^{generated by a clanker}

Top Sessions "By Duration" introduced an active-duration metric that summed only the gaps following a tool-use message. That captured tool-execution latency but discarded all model generation and thinking time, including the stretch before the first tool call, so it systematically undercounted how long a session was actually worked on. Redefine active duration as the sum of consecutive inter-message gaps with each gap capped at five minutes, which is exactly the definition the velocity "active minutes" metric already uses. A gap below the cap counts in full (generation, tool execution, or a quick human turnaround); only stretches longer than the cap are bounded as idle. Timestamps alone cannot distinguish a long tool run from a human stepping away, so capping fails gracefully in both directions instead of guessing. Hoist the cap to a single shared constant (ActiveGapCapSec / ActiveGapCapMs) so the two "active" definitions cannot drift, guarded by a test that the seconds and milliseconds forms agree. The active-duration SQL drops the has_tool_use filter and the trailing gap to ended_at to match the velocity computation across SQLite, PostgreSQL, and DuckDB, and the SQLite Go fallback runs the same clamp. Alternative implementation to #83, built on its commits.

roborev-ci · 2026-06-26T00:33:52Z

roborev: Combined Review (`0915ffc`)

Medium issue found.

Medium

internal/db/analytics.go:4535 - SQLite timezone fallback calls activeDurationMinForSession while the outer rows iterator remains open. This can hold one reader connection and acquire another per row; with a capped reader pool, concurrent duration fallback requests can exhaust the pool and block indefinitely. It also introduces an unbounded N+1 query pattern.
- Fix: compute active duration in the outer query, or collect candidate row IDs first, close rows, then batch-load active durations before sorting.

Panel: ci_default_security | Synthesis: codex, 7s | Members: codex_default (codex/default, done, 6m12s), codex_security (codex/security, done, 1m7s) | Total: 7m26s

The timezone-aware Go fallback for Top Sessions "By Duration" computed each row's active duration with a per-row follow-up query while the outer rows iterator was still open. That iterator pins one reader connection, and every row then borrowed a second one, an unbounded N+1 over the result set. The reader pool is capped at four connections, so a handful of concurrent duration-fallback requests could each hold their outer cursor while contending for inner connections and deadlock until the busy timeout or context cancellation freed them. Compute active duration in the outer query instead, exactly as the in-SQL and PostgreSQL paths already do, so the whole result set comes back in one statement with no second connection per row. The correlated-subquery SQL is hoisted into a shared sqliteActiveDurationExpr helper used by both SQLite paths, keeping the clamp definition from drifting, and is only emitted for the duration metric so other metrics pay nothing for it.

roborev-ci · 2026-06-26T13:07:59Z

roborev: Combined Review (`665e53a`)

Summary verdict: one Medium issue remains; no High or Critical findings were reported.

Medium

internal/postgres/analytics.go:2208, internal/db/analytics.go:4500 - Duration ranking eligibility is inconsistent across backends. PostgreSQL and the SQLite Go fallback rank by active_duration_min but only require non-null start/end times, unlike the SQLite SQL and DuckDB paths. Reversed durations such as ended_at < started_at can still rank highly if message-gap active duration is positive, and the SQLite fallback also does not reject empty timestamp strings.

Suggested fix: Mirror SQLite/DuckDB eligibility in both paths: add ended_at >= started_at for PostgreSQL, and use NULLIF(..., '') plus julianday(ended_at) >= julianday(started_at) in the SQLite fallback. Add parity tests for reversed and empty duration rows.

Panel: ci_default_security | Synthesis: codex, 7s | Members: codex_default (codex/default, done, 6m40s), codex_security (codex/security, done, 38s) | Total: 7m25s

Top Sessions "By Duration" ranks by message-gap active duration, which stays positive even when a session's wall-clock timestamps are reversed (ended_at < started_at) or empty. The SQLite SQL and DuckDB paths already exclude such rows, but the PostgreSQL path and the timezone-aware SQLite Go fallback only checked for NULL timestamps. As a result the same malformed session could rank into the top list on those two backends while being filtered out on the others, breaking backend parity. Add the missing guards: ended_at >= started_at for PostgreSQL, and the NULLIF(..., '') emptiness checks plus julianday(ended_at) >= julianday(started_at) for the SQLite fallback, matching the canonical SQLite SQL and DuckDB eligibility. Parity tests cover reversed rows on all affected paths and empty-string rows on SQLite; the fallback empty case pins created_at into the filter window so it reaches the guard instead of being dropped by the date range.

roborev-ci · 2026-06-26T21:38:01Z

roborev: Combined Review (`9d33b21`)

No issues found.

Panel: ci_default_security | Synthesis: codex | Members: codex_default (codex/default, done, 6m8s), codex_security (codex/security, done, 14s) | Total: 6m22s

wesm · 2026-06-27T18:38:30Z

LGTM here

rodboev and others added 8 commits June 25, 2026 18:27

fix(analytics): surface active duration in top sessions (#83)

60d7f20

fix(analytics): keep active-duration math aligned across backends (#83)

e074bea

fix(analytics): keep duration ranking correct in fallback paths (#83)

88f1fc8

fix(analytics): keep postgres top-session duration null-safe (#83)

27a8c44

fix(analytics): preserve top-session active-duration ordering (#83)

ae94c1f

fix(analytics): repair postgres top-session SQL for CI (#83)

0d07c52

fix(analytics): close postgres active-duration cast for CI (#83)

1db5e41

rodboev mentioned this pull request Jun 26, 2026

fix(analytics): surface active duration in top sessions (#83) #867

Closed

wesm merged commit 634d930 into main Jun 27, 2026
23 checks passed

wesm deleted the feat/top-sessions-active-duration-clamp branch June 27, 2026 18:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(analytics): define top-session active duration as clamped idle gaps (alt to #867)#869

fix(analytics): define top-session active duration as clamped idle gaps (alt to #867)#869
wesm merged 10 commits into
mainfrom
feat/top-sessions-active-duration-clamp

mariusvniekerk commented Jun 26, 2026

Uh oh!

roborev-ci Bot commented Jun 26, 2026

Uh oh!

roborev-ci Bot commented Jun 26, 2026

Uh oh!

roborev-ci Bot commented Jun 26, 2026

Uh oh!

wesm commented Jun 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

mariusvniekerk commented Jun 26, 2026

What changed

Why

Tradeoff

Where to look

Uh oh!

roborev-ci Bot commented Jun 26, 2026

roborev: Combined Review (0915ffc)

Medium

Uh oh!

roborev-ci Bot commented Jun 26, 2026

roborev: Combined Review (665e53a)

Medium

Uh oh!

roborev-ci Bot commented Jun 26, 2026

roborev: Combined Review (9d33b21)

Uh oh!

wesm commented Jun 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

3 participants

roborev: Combined Review (`0915ffc`)

roborev: Combined Review (`665e53a`)

roborev: Combined Review (`9d33b21`)