[Fix](fold_const) MAKE_SET constant folding should clear #64907

Open
linrrzqqq wants to merge 1 commit into apache:master from linrrzqqq:fix-make-set-fold-constant-oom

Conversation

@linrrzqqq

Collaborator

Problem Summary:

In the FE constant-folding path, MAKE_SET clears each processed bit with bit &= ~(1 << pos). Because 1 is an int literal, Java reduces the shift distance modulo 32, so the wrong bit is cleared whenever pos >= 32.

For Java int shifts, the shift distance is masked with 0x1F, which means only the low 5 bits are used:

  • pos = 0..31 -> normal
  • pos = 32 -> treated as 0
  • pos = 33 -> treated as 1
  • pos = 64 -> treated as 0 again
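
The masking rule listed above can be checked directly. The following is a minimal sketch; the class name `ShiftMaskDemo` and its helper methods are hypothetical and are not taken from the Doris code base.

```java
// Hypothetical demo class; not part of the Doris source tree.
final class ShiftMaskDemo {
    // int shift: Java uses only the low 5 bits of pos (pos & 0x1F).
    static int intMask(int pos) {
        return 1 << pos;
    }

    // long shift: Java uses the low 6 bits of pos (pos & 0x3F).
    static long longMask(int pos) {
        return 1L << pos;
    }

    public static void main(String[] args) {
        System.out.println(intMask(31));  // -2147483648 (bit 31 is the sign bit)
        System.out.println(intMask(32));  // 1, same as 1 << 0
        System.out.println(intMask(33));  // 2, same as 1 << 1
        System.out.println(intMask(64));  // 1, same as 1 << 0
        System.out.println(longMask(32)); // 4294967296
    }
}
```

With a long operand the shift distance is taken modulo 64 instead, so every bit position of a 64-bit input can be addressed.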

For an input such as MAKE_SET(4294967296, ...), where 4294967296 == 1L << 32:
expected: bit &= ~(1 << 32), clearing bit 32
got: bit &= ~(1 << 0), clearing bit 0
Bit 32 is therefore never cleared. So:

  • bit stays unchanged and pos stays 32
  • the loop never makes progress
  • the same string is appended again and again
  • Java throws OutOfMemoryError: Java heap space
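
To make the failure mode and the fix concrete, here is a minimal sketch of a MAKE_SET-style folding loop that clears bits with a long mask. It is an illustration under stated assumptions, not the actual Doris implementation: the class `MakeSetSketch`, the method `makeSet`, and the loop shape are all hypothetical, and skipping null arguments is modeled on MySQL's MAKE_SET semantics.

```java
import java.util.StringJoiner;

// Hypothetical sketch; names and structure are assumptions, not Doris code.
final class MakeSetSketch {
    static String makeSet(long bits, String... values) {
        StringJoiner out = new StringJoiner(",");
        while (bits != 0) {
            // Position of the lowest bit that is still set.
            int pos = Long.numberOfTrailingZeros(bits);
            if (pos >= values.length) {
                break; // no argument corresponds to this or any higher bit
            }
            if (values[pos] != null) {
                out.add(values[pos]); // assumption: null arguments are skipped
            }
            // Long mask: clears bit pos even when pos >= 32.
            // With ~(1 << pos) and pos == 32, bit 0 would be cleared instead,
            // bits would never change, and this loop would append forever.
            bits &= ~(1L << pos);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] values = new String[34];
        for (int i = 0; i < values.length; i++) {
            values[i] = String.format("a%02d", i);
        }
        System.out.println(makeSet(4294967296L, values)); // a32
        System.out.println(makeSet(5L, "a", "b", "c"));   // a,c
    }
}
```

The only change that matters for termination is the `1L` literal: each iteration then removes the bit it just processed, so the loop runs at most 64 times.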

Before (FE constant folding fails or the FE runs out of memory):

Doris> EXPLAIN SELECT MAKE_SET(4294967296,
    ->     'a00', 'a01', 'a02', 'a03', 'a04', 'a05', 'a06', 'a07',
    ->     'a08', 'a09', 'a10', 'a11', 'a12', 'a13', 'a14', 'a15',
    ->     'a16', 'a17', 'a18', 'a19', 'a20', 'a21', 'a22', 'a23',
    ->     'a24', 'a25', 'a26', 'a27', 'a28', 'a29', 'a30', 'a31',
    ->     'a32', 'a33') AS ms;
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                                                                                                                                                                                                                             |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0                                                                                                                                                                                                                                                             |
|   OUTPUT EXPRS:                                                                                                                                                                                                                                                             |
|     ms[#0]                                                                                                                                                                                                                                                                  |
|   PARTITION: UNPARTITIONED                                                                                                                                                                                                                                                  |
|                                                                                                                                                                                                                                                                             |
|   HAS_COLO_PLAN_NODE: false                                                                                                                                                                                                                                                 |
|                                                                                                                                                                                                                                                                             |
|   VRESULT SINK                                                                                                                                                                                                                                                              |
|      MYSQL_PROTOCOL                                                                                                                                                                                                                                                         |
|                                                                                                                                                                                                                                                                             |
|   0:VUNION(11)                                                                                                                                                                                                                                                              |
|      constant exprs:                                                                                                                                                                                                                                                        |
|          make_set(4294967296, 'a00', 'a01', 'a02', 'a03', 'a04', 'a05', 'a06', 'a07', 'a08', 'a09', 'a10', 'a11', 'a12', 'a13', 'a14', 'a15', 'a16', 'a17', 'a18', 'a19', 'a20', 'a21', 'a22', 'a23', 'a24', 'a25', 'a26', 'a27', 'a28', 'a29', 'a30', 'a31', 'a32', 'a33') |
|                                                                                                                                                                                                                                                                             |
|                                                                                                                                                                                                                                                                             |
|                                                                                                                                                                                                                                                                             |
| ========== STATISTICS ==========                                                                                                                                                                                                                                            |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
17 rows in set (28.822 sec)
2026-06-26 23:27:50,285 INFO (mysql-nio-pool-0|303) [StmtExecutor.executeByNereids():821] Command(EXPLAIN SELECT MAKE_SET(4294967296,     'a00', 'a01', 'a02', 'a03', 'a04', 'a05', 'a06', 'a07',     'a08', 'a09', 'a10', 'a11', 'a12', 'a13', 'a14', 'a15',     'a16', 'a17', 'a18', 'a19', 'a20', 'a21', 'a22', 'a23',     'a24', 'a25', 'a26', 'a27', 'a28', 'a29', 'a30', 'a31',     'a32', 'a33') AS ms) process failed.
org.apache.doris.nereids.exceptions.AnalysisException: Nereids cost too much time (32s > 30s). You should increment timeout by set 'nereids_timeout_second' or disable check timeout by set 'enable_nereids_timeout' to false. Time consuming details, parse time: 6ms, plan time: {"plan":-1,"garbage_collect":-1,"lock_tables":0,"analyze":2,"rewrite":-1,"fold_const_by_be":0,"collect_partitions":-1,"optimize":-1,"translate":-1,"init_scan_node":-1,"finalize_scan_node":-1,"create_scan_range":-1,"distribute":-1}

After the fix:

Doris>  SET debug_skip_fold_constant = 0;
Doris> EXPLAIN SELECT MAKE_SET(4294967296,
    ->     'a00', 'a01', 'a02', 'a03', 'a04', 'a05', 'a06', 'a07',
    ->     'a08', 'a09', 'a10', 'a11', 'a12', 'a13', 'a14', 'a15',
    ->     'a16', 'a17', 'a18', 'a19', 'a20', 'a21', 'a22', 'a23',
    ->     'a24', 'a25', 'a26', 'a27', 'a28', 'a29', 'a30', 'a31',
    ->     'a32', 'a33') AS ms;
+----------------------------------+
| Explain String(Nereids Planner)  |
+----------------------------------+
| PLAN FRAGMENT 0                  |
|   OUTPUT EXPRS:                  |
|     ms[#0]                       |
|   PARTITION: UNPARTITIONED       |
|                                  |
|   HAS_COLO_PLAN_NODE: false      |
|                                  |
|   VRESULT SINK                   |
|      MYSQL_PROTOCOL              |
|                                  |
|   0:VUNION(12)                   |
|      constant exprs:             |
|          'a32'                   |
|                                  |
|                                  |
|                                  |
| ========== STATISTICS ========== |
+----------------------------------+

@hello-stephen

Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@linrrzqqq

Collaborator Author

run buildall

@linrrzqqq

Collaborator Author

/review

@github-actions github-actions Bot left a comment

Contributor

Review result: no blocking issue found.

Critical checkpoints:

  • Goal: the PR fixes FE constant folding for MAKE_SET when clearing bit positions >= 32 by changing the mask to 1L << pos; this addresses the Java int-shift masking issue described in the PR.
  • Scope: the authoritative GitHub PR file list shows a focused one-line FE fix plus one regression invocation. Local commit history in this checkout is a shallow/root snapshot, so I used the GitHub PR file metadata as the review scope authority.
  • Parallel paths: BE MAKE_SET already clears with an unsigned 64-bit mask, so the FE fold path now matches the runtime path for high bits.
  • Tests: the added testFoldConst case exercises MAKE_SET(4294967296, ...) with enough arguments to reach bit 32 and compares FE-folded output against no-fold output.
  • Concurrency, lifecycle, config, persistence, and compatibility: not implicated by this FE executable-function mask change.
  • User focus: no additional user focus was provided.

Subagent conclusions:

  • optimizer-rewrite: no candidate findings; convergence round 1 returned NO_NEW_VALUABLE_FINDINGS.
  • tests-session-config: no candidate findings; convergence round 1 returned NO_NEW_VALUABLE_FINDINGS.
  • No candidates were accepted for inline comments; two suspicious points were dismissed in the shared ledger with code evidence.

Validation limits: I did not run regression tests or FE build locally because this checkout is not worktree-initialized and thirdparty/installed / thirdparty/installed/bin/protoc are absent.

@hello-stephen

Contributor
TPC-H: Total hot run time: 28955 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2985af6665bc303b0ec056789daa66abf80dcfb9, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17791	3965	3928	3928
q2	1996	309	186	186
q3	10310	1372	828	828
q4	4681	471	341	341
q5	7525	842	562	562
q6	180	169	139	139
q7	749	820	622	622
q8	9329	1587	1563	1563
q9	5840	4481	4497	4481
q10	6729	1769	1517	1517
q11	443	277	248	248
q12	629	428	290	290
q13	18126	3310	2762	2762
q14	278	255	246	246
q15	q16	778	769	711	711
q17	1029	988	1015	988
q18	6954	5682	5468	5468
q19	1345	1202	1082	1082
q20	501	390	271	271
q21	5915	2569	2421	2421
q22	438	353	301	301
Total cold run time: 101566 ms
Total hot run time: 28955 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4317	4215	4261	4215
q2	321	370	236	236
q3	4548	4922	4413	4413
q4	2052	2151	1382	1382
q5	4385	4252	4294	4252
q6	232	178	130	130
q7	1713	1626	1411	1411
q8	2679	2209	2141	2141
q9	8060	8054	8089	8054
q10	4793	4749	4277	4277
q11	575	421	370	370
q12	739	745	535	535
q13	3239	3558	2974	2974
q14	301	306	285	285
q15	q16	738	749	672	672
q17	1375	1332	1327	1327
q18	7917	7400	6853	6853
q19	1078	1106	1092	1092
q20	2224	2242	1951	1951
q21	5234	4603	4441	4441
q22	509	475	414	414
Total cold run time: 57029 ms
Total hot run time: 51425 ms

@hello-stephen

Contributor
TPC-DS: Total hot run time: 171049 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2985af6665bc303b0ec056789daa66abf80dcfb9, data reload: false

query5	4321	628	467	467
query6	439	184	174	174
query7	4801	561	298	298
query8	333	186	171	171
query9	8755	4063	4084	4063
query10	452	309	261	261
query11	5915	2355	2127	2127
query12	162	105	102	102
query13	1302	618	449	449
query14	6727	5302	4953	4953
query14_1	4305	4286	4305	4286
query15	217	205	182	182
query16	1021	451	445	445
query17	1133	741	599	599
query18	2726	481	354	354
query19	226	179	139	139
query20	110	107	103	103
query21	215	133	120	120
query22	13681	13588	13393	13393
query23	17219	16510	16105	16105
query23_1	16179	16063	16152	16063
query24	7381	1753	1293	1293
query24_1	1337	1296	1285	1285
query25	543	447	359	359
query26	1295	320	166	166
query27	2602	522	345	345
query28	4403	2008	1990	1990
query29	1061	614	473	473
query30	306	237	190	190
query31	1107	1075	943	943
query32	108	61	59	59
query33	529	311	247	247
query34	1168	1166	675	675
query35	776	778	670	670
query36	1368	1375	1203	1203
query37	149	106	92	92
query38	1876	1715	1671	1671
query39	916	902	895	895
query39_1	866	866	903	866
query40	222	123	101	101
query41	66	68	64	64
query42	95	94	95	94
query43	320	324	280	280
query44	1456	792	784	784
query45	206	192	176	176
query46	1066	1198	729	729
query47	2431	2356	2303	2303
query48	425	412	302	302
query49	604	453	326	326
query50	979	362	274	274
query51	4527	4399	4312	4312
query52	81	79	70	70
query53	238	264	198	198
query54	274	233	195	195
query55	75	70	65	65
query56	242	224	213	213
query57	1456	1393	1283	1283
query58	239	216	212	212
query59	1553	1597	1424	1424
query60	286	247	230	230
query61	157	152	153	152
query62	694	649	588	588
query63	218	191	192	191
query64	2465	756	621	621
query65	4881	4768	4690	4690
query66	1755	459	335	335
query67	28888	28785	28612	28612
query68	3195	1537	966	966
query69	419	293	263	263
query70	1077	978	961	961
query71	296	228	210	210
query72	2958	2577	2340	2340
query73	858	742	436	436
query74	5098	4955	4736	4736
query75	2581	2564	2168	2168
query76	2320	1206	800	800
query77	344	381	294	294
query78	12395	12405	11848	11848
query79	1190	1204	776	776
query80	519	474	402	402
query81	447	277	243	243
query82	233	153	124	124
query83	266	276	247	247
query84	265	142	115	115
query85	837	594	498	498
query86	346	280	288	280
query87	1856	1835	1791	1791
query88	3707	2824	2830	2824
query89	416	383	336	336
query90	2160	190	187	187
query91	215	164	129	129
query92	62	62	56	56
query93	1528	1415	886	886
query94	529	332	316	316
query95	662	484	340	340
query96	1045	808	352	352
query97	2680	2732	2543	2543
query98	234	210	196	196
query99	1160	1166	1022	1022
Total cold run time: 256174 ms
Total hot run time: 171049 ms

@hello-stephen

Contributor
ClickBench: Total hot run time: 25.17 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2985af6665bc303b0ec056789daa66abf80dcfb9, data reload: false

query1	0.00	0.00	0.01
query2	0.11	0.05	0.05
query3	0.28	0.13	0.13
query4	1.60	0.14	0.14
query5	0.24	0.24	0.22
query6	1.25	1.05	1.09
query7	0.04	0.00	0.00
query8	0.05	0.04	0.03
query9	0.38	0.31	0.34
query10	0.58	0.56	0.55
query11	0.19	0.14	0.14
query12	0.18	0.15	0.15
query13	0.47	0.47	0.47
query14	1.03	0.99	1.00
query15	0.62	0.60	0.58
query16	0.31	0.32	0.32
query17	1.13	1.10	1.09
query18	0.22	0.21	0.21
query19	2.06	2.03	1.94
query20	0.02	0.01	0.01
query21	15.45	0.22	0.13
query22	4.85	0.06	0.05
query23	16.16	0.32	0.12
query24	3.01	0.41	0.34
query25	0.12	0.05	0.04
query26	0.73	0.21	0.16
query27	0.04	0.03	0.04
query28	3.55	0.93	0.53
query29	12.52	4.29	3.45
query30	0.28	0.15	0.15
query31	2.77	0.61	0.31
query32	3.23	0.59	0.49
query33	3.27	3.30	3.20
query34	15.70	4.20	3.49
query35	3.55	3.53	3.55
query36	0.54	0.42	0.42
query37	0.09	0.07	0.06
query38	0.05	0.03	0.04
query39	0.04	0.03	0.03
query40	0.18	0.16	0.15
query41	0.09	0.03	0.02
query42	0.04	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 97.06 s
Total hot run time: 25.17 s

@hello-stephen

Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (1/1) 🎉
Increment coverage report
Complete coverage report
