Kimi wire logs carry token counts only as session-level aggregates
(StatusUpdate), so individual messages have no per-message token_usage
and no model identifier. The usage/cost engine priced 0 tokens and Kimi
sessions never appeared in the cost views.

Emit one session-level ParsedUsageEvent when a Kimi session exposes only
aggregate output tokens, defaulting the model to a recent, cleanly
matchable Kimi catalog entry (moonshot/kimi-k2.6) so the cost engine can
produce an estimate. The estimate is a lower bound: prices come from the
LiteLLM catalog rather than Kimi directly, the wire logs do not record
which model served each turn, and only output tokens are exposed by the
aggregates (input and cache tokens are not priced). Sessions whose
messages already carry per-message token_usage (native step.end
protocol) are priced message-by-message and skipped to avoid double
counting.

processKimi previously dropped parser-emitted usage events; pass them
through so the events reach the store.

Kimi sessions previously showed no cost and never appeared in the usage
views. Kimi wire logs record token counts only as session-level
aggregates (the `StatusUpdate` path), so individual messages carry no
per-message `token_usage` and no model identifier. The cost engine
priced 0 tokens per message and produced $0 for the whole session.

This change emits a single session-level `ParsedUsageEvent` when a Kimi
session exposes only aggregate output tokens, and routes parser-emitted
usage events through `processKimi` (which previously dropped them).

The cost it produces is explicitly an estimate, not exact billing:
- Prices come from the LiteLLM catalog, not from Kimi/Moonshot directly.
- The wire logs do not record which model served each turn, and Moonshot
  sells several models concurrently at different rates (e.g.
  `kimi-k2.7-code`, `kimi-k2.6`, `kimi-k2.5`), so the per-turn model is
  unknown. The event defaults to `moonshot/kimi-k2.6`, a recent model
  with a cleanly matchable catalog entry. k2.7 is currently priced only
  under a cloudflare-namespaced key that the provider-conflict rule
  rejects for a `moonshot/` model, and it carries the same 0.95/4.0 rate
  as k2.6, so k2.6 is the more robust proxy at an identical price. Bump
  the constant once the catalog gains a cleanly matchable k2.7 entry.
- Only output tokens are exposed by the aggregates, so input and
  cache tokens are not priced. The figure is a lower bound that tracks
  output volume rather than a full invoice.
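
As a rough illustration of why the figure is a lower bound, the sketch below prices aggregate output tokens only. It assumes the 4.0 in the 0.95/4.0 pair is an output rate in USD per million tokens, which the description does not state. `estimateOutputCost` and the token count are hypothetical and are not taken from the codebase.

```go
package main

import "fmt"

// outputRatePerMTok is ASSUMED to be the output-token price in USD per
// million tokens (the "4.0" in the 0.95/4.0 pair quoted above).
const outputRatePerMTok = 4.0

// estimateOutputCost is a hypothetical helper. It prices output tokens only,
// so input and cache tokens contribute nothing to the result.
func estimateOutputCost(outputTokens int64) float64 {
	return float64(outputTokens) * outputRatePerMTok / 1_000_000
}

func main() {
	// 250,000 aggregate output tokens is an illustrative figure.
	fmt.Printf("$%.2f\n", estimateOutputCost(250_000)) // prints $1.00
}
```

Because input and cache tokens never enter the calculation, any session that consumed them would cost more than this figure under the same assumed rate.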

Sessions whose messages already carry per-message `token_usage` (the
native `step.end` protocol) are still priced message-by-message and are
skipped here to avoid double counting.

Reviewers may want to look at `defaultKimiModel` in
`internal/parser/kimi.go` for the estimation rationale and the
emission/skip logic at the end of `ParseKimiSession`.
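
The emission/skip rule described above can be sketched as follows. This is a minimal illustration, not the code in `ParseKimiSession`: the `message`, `usageEvent`, and `sessionUsageEvent` names and their fields are hypothetical stand-ins, and only the `moonshot/kimi-k2.6` default comes from the description.

```go
package main

import "fmt"

// message and usageEvent are hypothetical stand-ins for the parser's types.
type message struct {
	HasTokenUsage bool // true when the message carries per-message token_usage
}

type usageEvent struct {
	Model        string
	OutputTokens int64
}

// defaultModel mirrors the default named in the description.
const defaultModel = "moonshot/kimi-k2.6"

// sessionUsageEvent returns a single session-level event, or nil when the
// session is already priced per message or has no aggregate output tokens.
func sessionUsageEvent(msgs []message, aggregateOutput int64) *usageEvent {
	for _, m := range msgs {
		if m.HasTokenUsage {
			return nil // priced message-by-message; emitting would double count
		}
	}
	if aggregateOutput <= 0 {
		return nil // nothing to price
	}
	return &usageEvent{Model: defaultModel, OutputTokens: aggregateOutput}
}

func main() {
	aggregateOnly := []message{{}, {}}
	perMessage := []message{{HasTokenUsage: true}, {}}

	fmt.Println(sessionUsageEvent(aggregateOnly, 1200) != nil) // true
	fmt.Println(sessionUsageEvent(perMessage, 1200) != nil)    // false
}
```

In this sketch the per-message check runs first, so a session is either priced message-by-message or by the single aggregate event, never both.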