[feat](tools) add tpch primary key/foreign key in tpch-tools#64863
[feat](tools) add tpch primary key/foreign key in tpch-tools#64863englefly wants to merge 2 commits into
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
|
/review |
There was a problem hiding this comment.
I found one correctness issue in the new TPCH constraint cleanup scripts.
Critical checkpoints:
- Goal/test proof: The PR adds manual TPCH PK/FK/UK helper SQL files, but the PK/UK drop script does not use Doris' supported constraint-drop syntax, so that cleanup path is not proven to work.
- Scope/focus: The change is small and focused; the issue is localized to the PK/UK cleanup file.
- Concurrency/lifecycle/config/compatibility/persistence/data writes/observability: Not applicable for these standalone SQL helper scripts.
- Parallel paths: The FK cleanup script uses
drop constraint; the PK/UK cleanup script should follow the same path. - Test coverage/results: I did not run an FE parser/regression test in this runner; I verified the syntax against the Nereids grammar, command builder, existing constraint tests, and
git diff --check. - User focus: No additional user-provided focus was supplied.
Subagent conclusions:
optimizer-rewriteproposed OPT-1, accepted as inline comment FINAL-1.tests-session-configreported TEST-1 as a duplicate of OPT-1 and found no additional issue.- Convergence round 1 ended with both live subagents replying
NO_NEW_VALUABLE_FINDINGSfor the same ledger/comment set.
|
run buildall |
|
/review |
There was a problem hiding this comment.
I found two issues in the new constrained TPC-H regression suite.
Critical checkpoints:
- Goal/test proof: The PR adds TPCH PK/FK/UK helper scripts and a
tpch_sf0.1_constraints_p1regression suite. The constraint declarations themselves match the TPC-H keys and FKs, but the suite currently misses the source for one checked-in expected output and its load setup is not rerunnable. - Scope/focus: GitHub's PR files API reports 76 changed files, matching
.code-review.9240ex/pr_changed_files.txt. I used that as authoritative because the local checkout is shallow/grafted and a direct local base..head diff showed unrelated FE/CDC history. - Concurrency/lifecycle/config/compatibility/persistence/data writes/observability: No production concurrency, config, compatibility, persistence, or data-write path changes in the authoritative PR scope. The relevant lifecycle concern is regression DB reuse across repeated suite runs.
- Parallel paths: The helper drop scripts now use
drop constraint; the load suite does not use an equivalent cleanup path before re-adding constraints. - Test coverage/results: The new constrained suite copies all q01-q22 SQL and expected outputs, but
cse.outhas no matching constrained suite source. I did not run the full regression suite in this runner. - User focus: No additional user-provided review focus was supplied.
Subagent conclusions:
optimizer-rewriteinitially duplicated two out-of-scope FE candidates from the misleading local diff; after the GitHub-scope correction, they were dismissed and the subagent reportedNO_NEW_VALUABLE_FINDINGSfor the final two-comment set.tests-session-configfoundTSC-1, accepted as FINAL-1, and duplicated the same out-of-scope FE/CDC false positives; after correction, it reportedNO_NEW_VALUABLE_FINDINGSfor the final two-comment set.- Convergence round 1 ended with both live subagents replying
NO_NEW_VALUABLE_FINDINGSfor the same current ledger/comment set.
| @@ -0,0 +1,65 @@ | |||
| -- This file is automatically generated. You should know what you did if you want to edit this | |||
| -- !cse -- | |||
There was a problem hiding this comment.
This expected output has no matching constrained-suite source file, so the CSE case will not run. The existing suite has regression-test/suites/tpch_sf0.1_p1/sql/cse.groovy, but regression-test/suites/tpch_sf0.1_constraints_p1/sql/ only contains q01.sql through q22.sql; adding data/tpch_sf0.1_constraints_p1/sql/cse.out alone leaves the output stale and unconsumed. Please add the corresponding constrained cse.groovy source, or remove this .out file if CSE is intentionally not part of the new suite.
|
|
||
| // Primary Key Constraints | ||
| sql """ alter table region add constraint r_pk primary key (r_regionkey) """ | ||
| sql """ alter table nation add constraint n_pk primary key (n_nationkey) """ |
There was a problem hiding this comment.
This makes the new load suite fail on rerun in the same regression database. The suite creates tables with IF NOT EXISTS, deletes/reloads the rows, and then unconditionally re-adds r_pk, n_r_fk, r_uk, etc. Those constraints remain in metadata after the first run, and Doris rejects the second ADD CONSTRAINT with Constraint name ... has existed (ConstraintManager.checkConstraintNotExistence). Please make the setup idempotent, for example by dropping existing FKs first and then PK/UK constraints before re-adding them, or by recreating the tables for this constrained suite.
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)