[fix](fe) Handle generated columns in delete partial update #64884
bobhan1 wants to merge 1 commit into
### What problem does this PR solve?
Issue Number: N/A
Related PR: N/A
Problem Summary: DELETE on a unique merge-on-write table with light delete disabled is rewritten to a partial update load that writes key columns and the delete sign. BindSink used to auto-add every generated column and then recompute omitted generated columns, which required ordinary value columns that were not part of the DELETE output and caused analysis to fail. This change treats DELETE partial updates specially: it skips generated columns that are not emitted by the child plan and uses child output directly for generated columns that are emitted, while leaving normal partial update generated-column dependency checks unchanged.
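The binding rule described above can be sketched as a small model. This is a minimal Python illustration under stated assumptions, not the Doris implementation: `Column`, `bind_sink_columns`, and the `is_delete_partial_update` flag are hypothetical names, and the real logic lives in the Java FE (`BindSink`). The example table has a key `k`, a value `v`, and a generated value column `g` derived from `v`.

```python
from dataclasses import dataclass

DELETE_SIGN = "__DORIS_DELETE_SIGN__"  # hidden delete-sign column


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for a table column; not a Doris class."""
    name: str
    depends_on: tuple = ()  # base columns a generated column is computed from

    @property
    def is_generated(self) -> bool:
        return bool(self.depends_on)


def bind_sink_columns(table_columns, child_output, is_delete_partial_update):
    """Return (sink column names, generated columns to recompute)."""
    available = set(child_output)
    sink, recompute = [], []
    for col in table_columns:
        if not col.is_generated:
            if col.name in available:
                sink.append(col.name)
        elif is_delete_partial_update:
            # DELETE path: use the child's value as-is, or skip the column.
            if col.name in available:
                sink.append(col.name)
        else:
            # Normal path: auto-add and recompute, so dependencies must exist.
            missing = [d for d in col.depends_on if d not in available]
            if missing:
                raise ValueError(f"{col.name} needs missing columns {missing}")
            sink.append(col.name)
            recompute.append(col.name)
    return sink, recompute


table = [Column("k"), Column("v"), Column("g", depends_on=("v",)), Column(DELETE_SIGN)]
delete_output = ["k", DELETE_SIGN]  # DELETE emits only keys and the delete sign

print(bind_sink_columns(table, delete_output, is_delete_partial_update=True))
```

In this model, calling `bind_sink_columns(table, delete_output, is_delete_partial_update=False)` raises, because `g` would have to be recomputed from `v`, which the DELETE child plan does not emit. That mirrors the analysis failure described above.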
### Release note
Fix DELETE failure on unique merge-on-write tables with generated columns when light delete is disabled.
### Check List (For Author)
- Test: Regression test / Unit Test
- `./build.sh --fe -j100`
- `./run-fe-ut.sh --run org.apache.doris.nereids.trees.plans.DeleteFromUsingCommandTest`
- `./run-regression-test.sh --run -d regression-test/suites/ddl_p0/test_create_table_generated_column -s test_generated_column_delete -forceGenOut`
- `./run-regression-test.sh --run -d regression-test/suites/ddl_p0/test_create_table_generated_column -s test_generated_column_delete`
- `./run-regression-test.sh --run -d regression-test/suites/ddl_p0/test_create_table_generated_column -s test_partial_update_generated_column`
- Behavior changed: Yes (DELETE now succeeds for unique merge-on-write tables with generated columns when light delete is disabled.)
- Does this need documentation: No
run buildall

/review
Summary: I did not find any blocking issue in this PR. The change is narrowly scoped to DELETE partial-update binding for generated columns, and the added FE/regression coverage exercises generated value columns, generated key columns, NOT NULL value columns, and a VARIANT generated-column table with enable_mow_light_delete=false.
Critical checkpoint conclusions:
- Goal/test coverage: The PR fixes MoW DELETE partial-update analysis for generated columns by not forcing generated value columns into the sink column list when the DELETE child does not produce them, while preserving generated key columns that are produced by the DELETE projection. Added FE analyzer tests and regression expected output cover the main cases.
- Scope/focus: The implementation is limited to OLAP sink binding and passes `false` through external sink call sites, so connector/Hive/Iceberg/MaxCompute insert behavior is unchanged.
- Concurrency/lifecycle: No new shared state, locking, threads, or lifecycle-managed objects are introduced.
- Configuration/compatibility: No new config, protocol field, storage format, or mixed-version behavior is introduced. Existing `enable_mow_light_delete=false` behavior is covered by tests.
- Parallel paths: Simple DELETE and DELETE USING both reach the delete-as-insert OLAP sink path for this mode. Cluster-key and sync-MV cases disable partial update before this shortcut applies.
- Data correctness: Generated key columns remain in the child output because `DeleteFromCommand.completeQueryPlan` projects key columns; generated value columns without child output are skipped so they are not incorrectly recomputed from missing base value columns or marked in the partial-update input set.
- Tests/results: The regression outputs are ordered and match schema order. Tables are dropped before use and hardcoded names are used.
- Observability/performance: No new logging/metrics are needed for this planner binding fix; the change avoids unnecessary generated expression analysis in the DELETE partial-update path.
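
The data-correctness point above (generated key columns survive, generated value columns are skipped) can be illustrated with a toy projection. This is a hedged sketch, not Doris code: the schema tuples and the helpers `delete_child_output` and `generated_columns_kept` are hypothetical, and the model assumes only what the review states, namely that the DELETE child plan projects key columns plus the delete sign.

```python
DELETE_SIGN = "__DORIS_DELETE_SIGN__"  # hidden delete-sign column

# Hypothetical schema as (name, is_key, is_generated) tuples.
schema = [
    ("k", True, False),
    ("gk", True, True),    # generated key column
    ("v", False, False),
    ("gv", False, True),   # generated value column
]


def delete_child_output(schema):
    """Columns the DELETE child plan emits in this model: keys + delete sign."""
    return [name for name, is_key, _ in schema if is_key] + [DELETE_SIGN]


def generated_columns_kept(schema):
    """Generated columns that reach the sink because the child emits them."""
    emitted = set(delete_child_output(schema))
    return [name for name, _, is_gen in schema if is_gen and name in emitted]


print(delete_child_output(schema))
print(generated_columns_kept(schema))
```

Here `gk` is kept because it is part of the key projection, while `gv` never enters the sink column list, so it is neither recomputed nor counted as a partial-update input column.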
Subagent conclusions:
- `optimizer-rewrite`: no candidate findings; convergence round 1 returned `NO_NEW_VALUABLE_FINDINGS` for the final empty comment set.
- `tests-session-config`: no candidate findings; convergence round 1 returned `NO_NEW_VALUABLE_FINDINGS` for the final empty comment set.
- No candidates were accepted, dismissed as duplicates, or submitted as inline comments.
Validation performed:
- Reviewed the GitHub changed-file list and all four changed files plus the relevant DELETE, sink binding, translator, generated-column, and BE partial-update validation paths.
- Verified no existing inline review comments were present.
- Ran `git diff --check` on the explicit base/head changed-file range; it passed.
Validation not run:
- FE unit tests and regression tests were not executed locally because this checkout is not worktree-initialized and `thirdparty/installed/bin/protoc` is missing.
TPC-H: Total hot run time: 29022 ms
TPC-DS: Total hot run time: 171671 ms
ClickBench: Total hot run time: 25.11 s
### What problem does this PR solve?
Issue Number: N/A
Related PR: N/A
Original Problem: DELETE failed during analysis on a unique merge-on-write table that contains a generated column when light delete is disabled. The original reported table had a generated VARIANT column:
and the DELETE failed with:
A minimized reproduction is:
After inserting rows, running:
failed with:
Problem Summary: DELETE on a unique merge-on-write table with light delete disabled is rewritten to a partial update load that writes key columns and the delete sign. BindSink used to auto-add every generated column and then recompute omitted generated columns, which required ordinary value columns that were not part of the DELETE output and caused analysis to fail.
This change treats DELETE partial updates specially: generated columns that are not emitted by the child plan are skipped, and generated columns that are emitted by the child plan use that child output directly. Normal partial update generated-column dependency checks are unchanged.
The tests cover generated value columns, generated key columns, a generated value-column table with an omitted NOT NULL value column that has no default value, and the original VARIANT generated-column shape based on `receive_address_detail`.
### Release note
Fix DELETE failure on unique merge-on-write tables with generated columns when light delete is disabled.
### Check List (For Author)
- `./build.sh --fe -j100`
- `./run-fe-ut.sh --run org.apache.doris.nereids.trees.plans.DeleteFromUsingCommandTest`
- `./run-regression-test.sh --run -d regression-test/suites/ddl_p0/test_create_table_generated_column -s test_generated_column_delete -forceGenOut`
- `./run-regression-test.sh --run -d regression-test/suites/ddl_p0/test_create_table_generated_column -s test_generated_column_delete`
- `./run-regression-test.sh --run -d regression-test/suites/ddl_p0/test_create_table_generated_column -s test_partial_update_generated_column`