[refine](column) separate mutable subcolumn mutation from read-only traversal #64905
Mryange wants to merge 1 commit into
Review result: request changes for one test coverage gap in the core column COW refactor.
Critical checkpoint conclusions:
- Goal: the PR separates read-only subcolumn traversal from recursive COW mutation. The code mostly accomplishes that by keeping `for_each_subcolumn` const-only and routing `IColumn::mutate` through `mutate_subcolumns()`.
- Scope/focus: the source change is small and focused on BE column wrappers, with tests added for array and nullable COW behavior.
- Concurrency/lifecycle: no new threads, locks, static initialization, persistence, transaction, or configuration behavior are introduced.
- Compatibility/protocol: no serialization format, FE/BE protocol, or function symbol compatibility change found.
- Parallel paths: array, nullable, map, struct, const, and variant wrapper paths were checked. The remaining issue is that the distinct map enumeration path lacks focused mutation coverage.
- Error handling/memory: no ignored `Status` or new memory ownership issue found in the changed code.
- Tests: `git diff --check` is clean for the PR file set. Formatter/style CI is green. I could not run BE UT locally because `thirdparty/installed` is absent in this runner. The current macOS BE-UT CI job fails before compiling Doris with `ERROR: The JAVA version is 25, it must be JDK-17`, so I did not treat that as PR-caused.
User focus: no additional user-provided focus points were present.
Subagent conclusions: optimizer-rewrite reported no candidates. tests-session-config proposed TEST-1; I verified and accepted the ColumnMap portion as MAIN-1 for an inline comment. Convergence round 1 ended with both live subagents reporting NO_NEW_VALUABLE_FINDINGS for the same ledger and one-comment final set.
    callback(keys_column);
    callback(values_column);
    callback(offsets);
    void mutate_subcolumns() override {
This COW refactor changes ColumnMap from the old callback/defer path to a class-specific mutate_subcolumns() that must enumerate all three children (keys_column, values_column, and typed offsets_column). The new tests cover array and nullable, but there is no focused test that mutates a shared/exclusive ColumnMap and verifies all three child pointers are detached or preserved correctly. A missed or wrong entry in this method would compile and leave one map child aliased across mutation, so please add a BE unit test analogous to the new array/nullable cases for ColumnMap.
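To illustrate the shape of the requested check, here is a minimal self-contained sketch. It does not use Doris types: `ToyLeaf`, `ToyMap`, `detach_if_shared`, `fully_detached`, and `make_toy_map` are hypothetical names, and `std::shared_ptr::use_count()` stands in for the real COW sharing test. The point is only that a map-like wrapper has three children, and a test should assert on every one of them in both the shared and the exclusive case.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for a leaf column; not a Doris type.
struct ToyLeaf {
    std::vector<int64_t> data;
};

using LeafPtr = std::shared_ptr<ToyLeaf>;

// Copy the child only when another owner still references it.
// An exclusively owned child keeps its allocation (no extra copy).
inline void detach_if_shared(LeafPtr& child) {
    if (child.use_count() > 1) {
        child = std::make_shared<ToyLeaf>(*child);
    }
}

// Hypothetical map-like wrapper with the same three-child shape
// described in the comment above: keys, values, offsets.
struct ToyMap {
    LeafPtr keys;
    LeafPtr values;
    LeafPtr offsets;

    // Every child must be listed here; forgetting one still compiles
    // and silently leaves that child aliased.
    void mutate_subcolumns() {
        detach_if_shared(keys);
        detach_if_shared(values);
        detach_if_shared(offsets);
    }
};

// True only if none of the three children is shared between a and b.
inline bool fully_detached(const ToyMap& a, const ToyMap& b) {
    return a.keys != b.keys && a.values != b.values && a.offsets != b.offsets;
}

// Builds a map whose three children are exclusively owned.
inline ToyMap make_toy_map() {
    return ToyMap{std::make_shared<ToyLeaf>(ToyLeaf{{1, 2}}),
                  std::make_shared<ToyLeaf>(ToyLeaf{{10, 20}}),
                  std::make_shared<ToyLeaf>(ToyLeaf{{1, 2}})};
}
```

A real BE unit test would presumably follow the same two-case structure as the new array and nullable tests: copy a map so all three children are shared, mutate, and check that each pointer changed; then mutate an exclusively owned map and check that each pointer is unchanged.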
What problem does this PR solve?
Column wrappers used `for_each_subcolumn` for both read-only traversal and mutable subcolumn detachment during COW mutation. This mixed two different contracts in one callback API and forced typed subcolumns such as `ColumnNullable::_null_map`, `ColumnArray::offsets`, and `ColumnMap::offsets_column` to move through temporary base `IColumn::WrappedPtr` bridges.

Root cause: the mutable callback accepted `IColumn::WrappedPtr&`, but several subcolumns are stored as strongly typed wrappers. Binding typed wrapper references to the base wrapper callback is unsafe, so each implementation needed ad-hoc move/defer/cast code.

This PR keeps `for_each_subcolumn` as a const read-only traversal API and adds `mutate_subcolumns()` for the COW mutation path. Common `mutate_subcolumn` helpers handle generic and strongly typed subcolumns, so `ColumnArray`, `ColumnMap`, `ColumnNullable`, `ColumnStruct`, and `ColumnVariant` can detach children without exposing a mutable traversal callback. The added BEUT cases also verify that mutating exclusive subcolumns does not introduce an extra copy.

Release note
None
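For readers unfamiliar with the split described under "What problem does this PR solve?", the following is a rough standalone sketch of the idea, not the Doris implementation. All names prefixed `Toy` are hypothetical, `std::shared_ptr` is an assumed stand-in for the COW pointer types, and the templated `mutate_subcolumn` helper here only mirrors the name used in the description. It shows a const traversal that hands out `const` children only, and a separate mutation path whose helper accepts both base-typed and strongly typed child pointers without a temporary base-pointer bridge.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical column base; not the Doris IColumn interface.
struct ToyColumn {
    using ConstCallback = std::function<void(const ToyColumn&)>;

    virtual ~ToyColumn() = default;
    virtual std::size_t size() const = 0;
    virtual std::shared_ptr<ToyColumn> clone() const = 0;
    // Read-only traversal: the callback can only observe children.
    virtual void for_each_subcolumn(const ConstCallback&) const {}
    // Mutation path: each wrapper detaches its own children.
    virtual void mutate_subcolumns() {}
};

struct ToyInts final : ToyColumn {
    std::vector<int32_t> data;
    std::size_t size() const override { return data.size(); }
    std::shared_ptr<ToyColumn> clone() const override {
        return std::make_shared<ToyInts>(*this);
    }
};

struct ToyOffsets final : ToyColumn {
    std::vector<uint64_t> data;
    std::size_t size() const override { return data.size(); }
    std::shared_ptr<ToyColumn> clone() const override {
        return std::make_shared<ToyOffsets>(*this);
    }
};

// One helper for generic (T = ToyColumn) and strongly typed children.
// Shared children are cloned; exclusive children are left in place.
template <typename T>
void mutate_subcolumn(std::shared_ptr<T>& child) {
    if (child.use_count() > 1) {
        child = std::static_pointer_cast<T>(child->clone());
    }
    child->mutate_subcolumns();  // recurse into nested wrappers
}

struct ToyArray final : ToyColumn {
    std::shared_ptr<ToyColumn> nested;    // generic child
    std::shared_ptr<ToyOffsets> offsets;  // strongly typed child

    ToyArray(std::shared_ptr<ToyColumn> n, std::shared_ptr<ToyOffsets> o)
            : nested(std::move(n)), offsets(std::move(o)) {}

    std::size_t size() const override { return offsets->size(); }
    // Shallow copy: children stay shared until mutate_subcolumns() runs.
    std::shared_ptr<ToyColumn> clone() const override {
        return std::make_shared<ToyArray>(*this);
    }
    void for_each_subcolumn(const ConstCallback& cb) const override {
        cb(*nested);
        cb(*offsets);
    }
    void mutate_subcolumns() override {
        mutate_subcolumn(nested);
        mutate_subcolumn(offsets);
    }
};
```

In this sketch the typed `offsets` pointer never has to be rebound to a base-pointer reference, which is the hazard the root-cause paragraph describes; the cost is that each wrapper must list its children twice, once per path.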