feat(experiments): Filter experiments list by a rollup metric#453
Conversation
Signed-off-by: shanaiabuggy <59746633+shanaiabuggy@users.noreply.github.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (5)
💤 Files with no reviewable changes (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds numeric rollup filtering to the experiments list API. OpenAPI, shared filter types, endpoint handling, and tests now cover ChangesMetric Rollup Filters
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (1)
packages/nmp_common/src/nmp/common/entities/values.py (1)
274-310: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value
NumberFilteroverlapsFloatFilter.
FloatFilteralready provides$gte/$lte;NumberFilteradds$gt/$lt/$eq. Consider folding the extra operators intoFloatFilter(or deriving one from the other) to avoid two near-identical numeric filters drifting apart.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/nmp_common/src/nmp/common/entities/values.py` around lines 274 - 310, `NumberFilter` duplicates most of `FloatFilter` and risks the two numeric filter models drifting apart. Refactor the filter types so there is a single source of truth for numeric comparisons, either by moving `$gt`/`$lt`/`$eq` into `FloatFilter` or by making `NumberFilter` inherit/compose from `FloatFilter`; update the `NumberFilter` and `FloatFilter` definitions in `values.py` so their shared behavior lives in one place and their aliases/config stay consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@openapi/ga/individual/platform.openapi.yaml`:
- Around line 3738-3740: The filter examples in the OpenAPI docs use unprefixed
operators that do not match the schema. Update the example text in the
`NumberFilter`/rollup metric descriptions so the keys match the defined query
shape (`$gte`, `$lte`, `$gt`, `$lt`, `$eq`) everywhere this example appears,
including the related `NumberFilter` documentation block. Keep the surrounding
example values the same, but ensure the operator names in the docs are
consistent with the schema.
- Around line 14255-14279: The NumberFilter schema currently allows empty
objects because it lacks a minimum property constraint. Update the NumberFilter
definition in the openapi schema to require at least one predicate by adding
minProperties: 1 alongside the existing properties and additionalProperties:
false, so the schema still accepts $gte, $lte, $gt, $lt, or $eq but rejects {}.
In `@openapi/ga/openapi.yaml`:
- Around line 3734-3740: Update the filter documentation text in the OpenAPI
spec so the numeric range examples use the same $-prefixed operator keys defined
by the schema. In the affected description near the experiments filter section,
change the examples for run_count, cost_usd.mean, latency_ms.p95, and
evaluators.<name>.mean to use $gte/$lte/$gt/$lt/$eq consistently. Apply the same
wording cleanup anywhere the duplicated filter description appears so the
examples match the actual supported operators and do not point clients to
invalid keys.
- Around line 14255-14279: NumberFilter currently allows an empty object, so
update the NumberFilter schema in openapi/ga/openapi.yaml to require at least
one predicate operator. Add minProperties: 1 alongside the existing properties
definition so validation rejects {} while still allowing $gte, $lte, $gt, $lt,
or $eq. Use the NumberFilter schema block to locate the change.
In `@services/intake/src/nmp/intake/api/v2/experiments/endpoints.py`:
- Around line 992-1016: The metric filter validation in the LogicalOperation
handling is rejecting nested AND groups because
`_operation_references_metric(child)` treats a child AND containing metrics as
invalid, even though the parent combinator is already AND. Update the logic
around `LogicalOperation`, `_operation_references_metric`, and
`_validated_metric_predicate` to either flatten nested ANDs before validation or
explicitly recurse through AND children so metric comparisons inside sub-ANDs
are accepted; if nested ANDs remain unsupported, adjust the HTTPException detail
to clearly state that only flat metric comparisons are allowed.
---
Nitpick comments:
In `@packages/nmp_common/src/nmp/common/entities/values.py`:
- Around line 274-310: `NumberFilter` duplicates most of `FloatFilter` and risks
the two numeric filter models drifting apart. Refactor the filter types so there
is a single source of truth for numeric comparisons, either by moving
`$gt`/`$lt`/`$eq` into `FloatFilter` or by making `NumberFilter` inherit/compose
from `FloatFilter`; update the `NumberFilter` and `FloatFilter` definitions in
`values.py` so their shared behavior lives in one place and their aliases/config
stay consistent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 6359daae-f76d-4371-af1a-b831e0bf4a36
⛔ Files ignored due to path filters (1)
web/packages/sdk/generated/agents/schema/DeploymentLogsResponse.tsis excluded by!**/generated/**
📒 Files selected for processing (8)
openapi/ga/individual/platform.openapi.yamlopenapi/ga/openapi.yamlopenapi/openapi.yamlpackages/nmp_common/src/nmp/common/entities/values.pyservices/intake/src/nmp/intake/api/v2/experiments/endpoints.pyservices/intake/src/nmp/intake/api/v2/experiments/schemas.pyservices/intake/tests/integration/spans/test_experiment_metric_sort.pyservices/intake/tests/test_experiment_metric_filter.py
|
Signed-off-by: shanaiabuggy <59746633+shanaiabuggy@users.noreply.github.com>
Signed-off-by: shanaiabuggy <59746633+shanaiabuggy@users.noreply.github.com>
Signed-off-by: shanaiabuggy <59746633+shanaiabuggy@users.noreply.github.com>
What
Adds metric filtering to the experiments list — filter by rollup metrics (cost, latency, evaluator scores, run count), not just entity fields.
How
Reuses the platform's standard
filter[field][op]bracket syntax, so it combines naturally with existing entity filters:?filter[cost_usd.mean][$lte]=0.5&filter[run_count][$gte]=1&filter[experiment_group_id]=
run_count,cost_usd.<stat>,latency_ms.<stat>,evaluators.<name>.<stat>(stat ∈ mean/median/p90/p95/p99/sum/count). Operators:$gte/$lte/$gt/$lt/$eq.list_experimentssplits the filter tree: entity predicates go to the entity store; metric predicates are applied in-app after rollup hydration (compute-on-read, same plumbing as metric sort). Declared via self-mapping namespaces onExperimentFilterso paths pass validation untranslated.NumberFilterrange type ($gte/$lte/$gt/$lt/$eq) alongsideDatetimeFilter/StringFilter.Behavior
Tests
Unit tests for the split/validate/match helpers + endpoint wiring (validation, 400/503), and an integration test combining entity + metric filters end to end against ClickHouse. OpenAPI specs regenerated.
Summary by CodeRabbit
Summary by CodeRabbit
New Features
run_count,cost_usd,latency_ms, and evaluator metrics.mean,median,p90,p95,p99,sum,count) with operators like$eq,$gt,$gte,$lt,$lte.Bug Fixes
400errors.503when telemetry data for rollups is unavailable.Documentation
Tests