[UT][VL] Refresh TPC-H q19 plan stability golden file #12374

Draft
brijrajk wants to merge 1 commit into apache:main from brijrajk:fix/tpch-q19-plan-stability-golden-file

brijrajk commented Jun 26, 2026

Fixes #12375.

Problem

GlutenTPCHPlanStabilitySuite tpch/q19 has been failing in spark-test-spark40 CI runs for PRs that touch Velox backend Scala files.

Root cause

GlutenPlanStabilitySuite.glutenNormalizeIds() uses the regex (?<prefix>(?<!id=)#)\\d+L?, which matches any #<number> in the explain text, including TPC-H string literals. The p_brand filter in q19 uses the values Brand#11, Brand#12, and Brand#13 (actual data values from the TPC-H spec). These appear unquoted in the explain output:

EqualTo(p_brand, Brand#12)

The normalizer incorrectly treats #12 as an ExprId and remaps it sequentially. The suite code itself warns about this at lines 67–68:

"Running all suites together in one JVM is recommended to avoid ExprId normalization issues where string constants (e.g., Brand#23 in TPCH q19) may collide with ExprId numbers."

What changed

The golden file was committed in #11805 (c37fee4e5, 2026-03-24). Since then, 264 commits have landed on main, shifting the ExprId counter. As a result, Brand#12 now normalizes to Brand#6 and _pre_1#14 shifts to _pre_1#13.

Exact diff (original vs current):

- EqualTo(p_brand, Brand#12) ... Brand#13
+ EqualTo(p_brand, Brand#6)  ... Brand#12

- _pre_1#14 / sum#15 / isEmpty#16
+ _pre_1#13 / sum#14 / isEmpty#15
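
The effect can be reproduced in isolation. The following is a minimal Python sketch, not the suite's Scala code: it assumes the normalizer renumbers tokens in first-seen order (the real glutenNormalizeIds logic may differ), and both explain fragments are made up for illustration. Python spells the named group (?P<prefix>...) where the Scala regex uses (?<prefix>...).

```python
import re

# Python rendering of the regex quoted in the root-cause section.
EXPR_ID = re.compile(r"(?P<prefix>(?<!id=)#)\d+L?")


def normalize_ids(plan):
    """Renumber every #<number> token in first-seen order (assumed behavior)."""
    mapping = {}

    def renumber(match):
        token = match.group(0)
        # Reuse the ID already assigned to this token, or hand out the next one.
        new_id = mapping.setdefault(token, len(mapping) + 1)
        return f"{match.group('prefix')}{new_id}"

    return EXPR_ID.sub(renumber, plan)


# Hypothetical explain fragments: the same plan, with the raw ExprId counter
# shifted by 100 in the second run. The literal Brand#12 is identical in both.
run_a = "Scan [p_partkey#12, p_brand#13]\nFilter EqualTo(p_brand#13, Brand#12)\nAggregate [sum#14]"
run_b = "Scan [p_partkey#112, p_brand#113]\nFilter EqualTo(p_brand#113, Brand#12)\nAggregate [sum#114]"

print(normalize_ids(run_a))
print(normalize_ids(run_b))
```

Under these assumptions, the literal in the first run shares a normalized ID with p_partkey because both carry the number 12, while in the second run it takes a fresh ID and pushes sum up by one. The plan itself is unchanged, yet the normalized text differs.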

Evidence that this is pre-existing

Ran GlutenTPCHPlanStabilitySuite on main at commit 6097b59a6 (2026-06-25, [MINOR][VL] Build Arrow 18 with patch for Power #12344), with no pending PR applied:

Tests: succeeded 21, failed 1  ← q19 fails on main too
BUILD FAILURE

Then regenerated with SPARK_GENERATE_GOLDEN_FILES=1 and re-ran:

Tests: succeeded 22, failed 0
BUILD SUCCESS

Only q19/explain.txt changed. simplified.txt and all other queries (q1–q18, q20–q22) are unaffected.

Why it only surfaces on PRs touching Velox backend Scala files

spark-test-spark40 is only triggered when Velox backend Scala files are modified. Most PRs touch native C++ code, docs, or non-Velox modules and never trigger this check.

Fix

Regenerated q19/explain.txt by running GlutenTPCHPlanStabilitySuite with SPARK_GENERATE_GOLDEN_FILES=1 SPARK_ANSI_SQL_MODE=false.
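
For readers unfamiliar with golden-file tests, the regenerate-or-compare flow that the environment variable toggles can be sketched as follows. This is a Python illustration of the general pattern, not the suite's code; check_golden and its parameters are hypothetical names, and treating the value 1 as generate mode follows the command above.

```python
import os
import tempfile
from pathlib import Path


def check_golden(golden_path, actual, env=os.environ):
    """Return True if `actual` matches the golden file.

    In generate mode the golden file is overwritten with `actual` instead.
    """
    if env.get("SPARK_GENERATE_GOLDEN_FILES") == "1":
        golden_path.parent.mkdir(parents=True, exist_ok=True)
        golden_path.write_text(actual)
        return True
    return golden_path.exists() and golden_path.read_text() == actual


with tempfile.TemporaryDirectory() as tmp:
    golden = Path(tmp) / "q19" / "explain.txt"
    golden.parent.mkdir(parents=True)
    golden.write_text("EqualTo(p_brand, Brand#12)")  # stale golden content
    current = "EqualTo(p_brand, Brand#6)"            # what the suite emits now

    print(check_golden(golden, current, env={}))     # compare mode: mismatch
    print(check_golden(golden, current, env={"SPARK_GENERATE_GOLDEN_FILES": "1"}))
    print(check_golden(golden, current, env={}))     # compare mode after refresh
```

Regenerating makes the comparison pass again but does not remove the underlying sensitivity of the normalized text to the ExprId counter.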

A proper long-term fix (tracked in #12375) would be to make glutenNormalizeIds skip #N occurrences inside string literal contexts.
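
One way such a fix might look, sketched in Python rather than the suite's Scala: match known literal shapes first and leave them untouched. The literal pattern (Brand#<digits>), the first-seen renumbering, and the sample explain fragments are all assumptions for illustration, not the design tracked in #12375.

```python
import re

# Hypothetical: the literal branch comes first in the alternation, so the
# "#<number>" inside Brand#12 is consumed before the ExprId branch can see it.
TOKEN = re.compile(r"(?P<literal>\bBrand#\d+)|(?P<prefix>(?<!id=)#)\d+L?")


def normalize_ids_skipping_literals(plan):
    """Renumber ExprId tokens in first-seen order, leaving Brand#N literals alone."""
    mapping = {}

    def renumber(match):
        if match.group("literal") is not None:
            return match.group(0)  # keep the string literal exactly as written
        token = match.group(0)
        new_id = mapping.setdefault(token, len(mapping) + 1)
        return f"{match.group('prefix')}{new_id}"

    return TOKEN.sub(renumber, plan)


# Made-up explain fragments: same plan, raw ExprId counter shifted by 100.
run_a = "Scan [p_partkey#12, p_brand#13]\nFilter EqualTo(p_brand#13, Brand#12)\nAggregate [sum#14]"
run_b = "Scan [p_partkey#112, p_brand#113]\nFilter EqualTo(p_brand#113, Brand#12)\nAggregate [sum#114]"

print(normalize_ids_skipping_literals(run_a))
print(normalize_ids_skipping_literals(run_a) == normalize_ids_skipping_literals(run_b))
```

A fixed list of literal shapes is brittle, since any other constant containing #<number> would still be rewritten. A more robust fix would likely need the normalizer to know which tokens are literals rather than guess from their text.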

Impact

  • Only gluten-ut/spark40/src/test/resources/backends-velox/gluten-tpch-plan-stability/q19/explain.txt changes
  • No production code changes
  • No other test queries affected

The ExprId normalizer in GlutenPlanStabilitySuite uses regex `#\d+`
which inadvertently matches TPC-H string literals such as Brand#11,
Brand#12, Brand#13 (p_brand values in q19's filter). Over the 264
commits since the golden file was added in apache#11805, new optimizer rules
shifted the ExprId counter so Brand#12 now normalizes to Brand#6 and
_pre_1#14 to _pre_1#13, causing a spurious plan mismatch.

Regenerated by running GlutenTPCHPlanStabilitySuite with
SPARK_GENERATE_GOLDEN_FILES=1. Only q19/explain.txt changes; simplified.txt
and all other queries are unaffected.

Verified: q19 fails on main without this fix (21/22); passes with it (22/22).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Labels

CORE works for Gluten Core

Development

Successfully merging this pull request may close these issues.

[UT][VL] GlutenTPCHPlanStabilitySuite q19 golden file stale — Brand#12 ExprId normalization collision
