150 commits
1e08997
[DAG] visitEXTRACT_SUBVECTOR - Fold EXTRACT_SUBVECTOR(EXTRACT_SUBVECT…
RKSimon Jun 19, 2026
4b2a02d
[X86] Replace X86 specific PDEP/PEXT handling with generic intrinsics…
RKSimon Jun 19, 2026
f0134cc
AMDGPU: Add subtarget feature for controllable xnack modes (#204523)
arsenm Jun 19, 2026
b65d510
[offload][OpenMP] Fix record replay when no memory is used (#201771)
kevinsala Jun 19, 2026
9c50867
[ORC][examples] Add a new example showing basic symbolAliases usage. …
lhames Jun 19, 2026
0ad5d54
[GlobalISel] TableGen memcpy-like prelegalizer combines (#203235)
c-rhodes Jun 19, 2026
d87e513
[clang][test] Use #marker in enable_if tests (#204624)
tbaederr Jun 19, 2026
7d122c3
[LifetimeSafety] Propagate loans through pointer inc/dec and compound…
Xazax-hun Jun 19, 2026
6352a58
[mlir][IR] Fix typo in code example of DenseTypedElementsAttr (#204739)
matthias-springer Jun 19, 2026
3b1a922
[VPlan] Extend licm to sink replicate stores (#191026)
artagnon Jun 19, 2026
f8bd135
[lit] Make RecursionError less likely in internal shell (#204573)
statham-arm Jun 19, 2026
9361b3d
[LV] Add test for WidenCall with mixed scalar-vector operands (#203092)
artagnon Jun 19, 2026
b9587a7
[ELF][AArch64] Relax zero TLSLE add to nop (#204286)
JiangNingHX Jun 19, 2026
b496d06
[LifetimeSafety] Model bit_cast and atomic casts in the fact generato…
Xazax-hun Jun 19, 2026
c327ab3
[AArch64] Fix Windows target detection in FrameLowering (#204347)
MacDue Jun 19, 2026
06137a5
[llvm] Remove LLVM_ABI_FOR_TEST in public headers (#204627)
Steelskin Jun 19, 2026
47b29c2
[SPIR-V] Legalize G_PHI of oversized vectors via fewer-elements (#203…
mdfaijul Jun 19, 2026
40cbc98
[AArch64][SDAG] Legalise nxv1 gather/scatter nodes (#204620)
gbossu Jun 19, 2026
a5e83b9
[Clang][NEON ACLE] Remove +bf16 requirement from opaque bfloat builti…
paulwalker-arm Jun 19, 2026
fdf3d44
[InstCombine] Add tests showing failure to fold pdep(0,x) and pext(0,…
RKSimon Jun 19, 2026
e6daa68
Revert "Revert "[Compiler-rt][test] Fix circular link dependency betw…
quic-garvgupt Jun 19, 2026
500d1f8
[SPIR-V] Fix crash on void indirect call with aggregate argument (#20…
MrSidims Jun 19, 2026
b90ec9c
[StackColoring] Remove unused BB numbering state (#204414)
osa1 Jun 19, 2026
f6fd6ea
[mlir][ExecutionEngine] Fix dead -Wno-c++98-compat-extra-semi guard (…
bogdan-petkovic Jun 19, 2026
3cc9463
[Delinearization] Narrow the scope of the term collection (#204145)
kasuga-fj Jun 19, 2026
60a2d43
[AArch64] Add SVE shuffle optimization pass (#193951)
huntergr-arm Jun 19, 2026
80c80e6
[clang][bytecode] Check const writes more thorougly (#204529)
tbaederr Jun 19, 2026
a6fe3c7
[libc++][test] Migrate _BitInt probe to __BITINT_MAXWIDTH__ and fix l…
xroche Jun 19, 2026
eb7ce80
CodeGenPassBuilder: Use cl::boolOrDefault directly in CGPassBuilderOp…
petar-avramovic Jun 19, 2026
54a7896
[JITLink][COFF] Synthesize __imp_ IAT entries (#203906)
mkovacevic99 Jun 19, 2026
12ee71c
[Clang] Respect `-fno-slp-vectorize` for the LTO pipeline (#201585)
jhuber6 Jun 19, 2026
0390898
[mlir][affine] Implement LoopLikeInterface::getStaticTripCount on Aff…
j2kun Jun 19, 2026
72af16e
[clang][FreeBSD] Re-enable the crash-recovery test on FreeBSD (#192608)
aokblast Jun 19, 2026
8259502
[Clang][Hexagon] Predefine _GNU_SOURCE for C++ compilations (#201599)
quic-k Jun 19, 2026
8eae496
[libc++] Make std::multimap constexpr as part of P3372R3 (#161901)
vinay-deshmukh Jun 19, 2026
46ece19
[libc++] Add a missing include in string.h (#135134)
atetubou Jun 19, 2026
8ca5830
[libc++] Default the allocator argument for most string constructors …
philnik777 Jun 19, 2026
d6ccc29
[clang][bytecode] Take AccessKinds into account in diagnoseNonConstVa…
tbaederr Jun 19, 2026
bf18f6f
[CommandLine] Make cl::boolOrDefault a scoped enum (#204553)
petar-avramovic Jun 19, 2026
8c7b7dd
[lldb] Fix format string (#204837)
s-barannikov Jun 19, 2026
16a0a10
[NFC] Remove SVEShuffleOpts variable unused in release build (#204833)
huntergr-arm Jun 19, 2026
5bb3690
[llvm][Target] Avoid premature Twine .str() materialization (#204836)
tgymnich Jun 19, 2026
467a5fe
[clang] Avoid premature Twine .str() materialization (#204830)
tgymnich Jun 19, 2026
22dce64
[AMDGPU] Remove some functions unused since #105645. NFC. (#204844)
jayfoad Jun 19, 2026
29692c1
[libc] Implement basename and dirname in libgen.h (#204554)
kaladron Jun 19, 2026
fd6a30b
[libcxx] Make std::pair pretty-printer ABI-independent (#201768)
aokblast Jun 19, 2026
d43b360
[SLP] Fix reduction cost crash for reduced values replaced by extract…
alexey-bataev Jun 19, 2026
8c922aa
[MemorySanitizer] Merge x86 BMI and PackedBits handlers into handleGe…
RKSimon Jun 19, 2026
abbb031
[BOLT][rewrite] warn about functions without CFG before binary analys…
devnexen Jun 19, 2026
85c81a2
[x64][win] Windows x64 unwind v3: Use tail-relative epilog offsets an…
dpaoliello Jun 19, 2026
e995171
[CIR][AArch64] Upstream widening-addition and vector-shift-left-and-w…
iamvickynguyen Jun 19, 2026
ae60782
Revert "[libc] Implement basename and dirname in libgen.h (#204554)" …
kaladron Jun 19, 2026
403ce0d
[flang][OpenMP] Emit warning that REVERSE_OFFLOAD is not supported (#…
kparzysz Jun 19, 2026
ef5d544
[flang][OpenMP] Scope-qualify user-defined reduction names in lowerin…
ceseo Jun 19, 2026
fa135bb
[llubi] Run verifier on the input IR (#204095)
nofe1248 Jun 19, 2026
eb21e78
Mark LastEpilogIdx as maybe_unused (#204857)
dpaoliello Jun 19, 2026
d10349c
[InstSimplify] Add fold for pdep(0,x) -> 0 and pext(0,x) -> 0 (#204810)
RKSimon Jun 19, 2026
bb87fbb
[llvm] Avoid premature Twine .str() materialization (#204828)
tgymnich Jun 19, 2026
394aa60
[CIR][NFC] Sync AArch64 NEON intrinsics with Clang (#204862)
AmrDeveloper Jun 19, 2026
7319a3c
[Support] Remove unused parameter of DataExtractor constructor (#204840)
s-barannikov Jun 19, 2026
95e3219
[mlir][ptr] Add constantop convertion (#204846)
linuxlonelyeagle Jun 19, 2026
f4043db
[lldb] Survive ptrace(PT_DENY_ATTACH) when attaching (#204688)
JDevlieghere Jun 19, 2026
6b7dbd8
AMDGPU/GlobalISel: RegBankLegalize rules for gfx950 mfmas (#204696)
vangthao95 Jun 19, 2026
d1c306c
[CIR] Implement Aggregate non-atomic to atomic cast (#204653)
AmrDeveloper Jun 19, 2026
e6a92e0
[offload] Fix teams/threads limits in record replay (#200639)
kevinsala Jun 19, 2026
f193189
[libc++][byte] Apply [[nodiscard]] to std::byte (#204674)
H-G-Hristov Jun 19, 2026
fe9521d
[LV] Unify header phi fixup and remove fixNonInductionPHIs (NFC). (#2…
fhahn Jun 19, 2026
39f8f90
[SPIR-V] Lower undef nested in a constant aggregate (#204377)
MrSidims Jun 19, 2026
4195b29
workflows/subscriber: Update to latest github automation container (#…
tstellar Jun 19, 2026
086f633
AMDGPU/GlobalISel: RegBankLegalize rules for load_async_to_lds (#204683)
vangthao95 Jun 19, 2026
f9fa598
[AMDGPU] Use explicit carry nodes for i64 wide integer lowering (#204…
shiltian Jun 19, 2026
90b2048
bitcode: Improve invalid summary version error (#204888)
arsenm Jun 19, 2026
c890f4d
[Bazel] Fixes 95e3219 (#204873)
forking-google-bazel-bot[bot] Jun 19, 2026
a8aba70
[Flang] Standardize coarray TODO() diagnostic messages (#204708)
sscalpone Jun 19, 2026
ba5384a
[Support] Add a parser for cl::opt<ElementCount> (#203969)
MacDue Jun 19, 2026
b32488f
[Clang][UBSan] Use EmitCheckedLValue for C++ trivial operator= operan…
hubert-reinterpretcast Jun 20, 2026
e47530b
[BOLT][AArch64] Align tentative layout bases using per-section alignm…
yozhu Jun 20, 2026
0928584
[clang-format][NFC] Clean up FormatTokenLexer (#203825)
owenca Jun 20, 2026
359bfe6
[LifetimeSafety] Allow configuring lifetimebound fix-it spelling (#20…
zeyi2 Jun 20, 2026
2678b8f
[DirectX] Handle llvm.dx.resource.getbasepointer intrinsic in DXILRes…
hekota Jun 20, 2026
e9acb01
[OpenMP][offload] Cross-team reductions with variable number of teams…
ro-i Jun 20, 2026
4c16440
Revert "[OpenMP][offload] Cross-team reductions with variable number …
ro-i Jun 20, 2026
a389989
[MLIR][WASM] Introduce the RaiseWasmMLIRPass to convert WasmSSA MLIR …
flemairen6 Jun 20, 2026
ea9bae0
Revert "[MLIR][WASM] Introduce the RaiseWasmMLIRPass to convert WasmS…
lforg37 Jun 20, 2026
ec3b418
[libc++][test] Rewrite tests for `std::byte` (#204116)
frederick-vs-ja Jun 20, 2026
9f578bc
[clang-format] Reset `Line->IsModuleOrImportDecl` in `addUnwrappedLin…
damster101 Jun 20, 2026
f210807
[LoopCacheAnalysis] Drop isLoopSimplifyForm check (NFCI) (#204822)
kasuga-fj Jun 20, 2026
342de06
[Reassociate] Distribute multiply over add to enable factorization (#…
hazarathayya Jun 20, 2026
56262f2
workflows/new-prs: Use github-automation container (#204706)
tstellar Jun 20, 2026
68079bb
[clang] Implement `__builtin_elementwise_pext` and `__builtin_element…
eisenwave Jun 20, 2026
9019eff
[InstCombine] Fold trunc scmp/ucmp -> scmp/ucmp with the target type …
AZero13 Jun 20, 2026
d186503
[clang][bytecode][NFC] Remove dead code (#204910)
tbaederr Jun 20, 2026
465c904
[X86] combineX86ShufflesRecursively - delay widening shuffle inputs. …
RKSimon Jun 20, 2026
e26ff54
[InstCombine] Remove fold with OneUse as there is fold without the ch…
andjo403 Jun 20, 2026
2ec6f28
[InstCombine] Fold sext(and/or/xor(trunc nsw x), y) -> and/or/xor(sex…
andjo403 Jun 20, 2026
f6296fb
[InstCombine] Fold zext(and/or/xor(trunc nuw x), y) -> and/or/xor(zex…
andjo403 Jun 20, 2026
b2c0c48
[InstCombine] Fold or (ashr X, BW-1), zext (icmp ne|sgt X, 0) to scmp…
AZero13 Jun 20, 2026
bae51e7
[IR] handle oversized constant alloca counts in getAllocationSize (#2…
pektezol Jun 20, 2026
c888371
[clangd] Look for resource-dir relative to detected compiler path as …
playerC Jun 20, 2026
0c3c664
[VectorCombine] Add subvector reduction support to foldShuffleChainsT…
as4230 Jun 20, 2026
2c022e8
[Verifier] Only accept noundef metadata on loads and update metadata …
philnik777 Jun 20, 2026
cd532fe
[LoopCacheAnalysis] Generate tests by update_analyze_test_checks.py (…
kasuga-fj Jun 20, 2026
6619aa7
[AMDGPU] Use SchedModel latencies for Fence barrier edges (#204657)
jrbyrnes Jun 20, 2026
b9c334d
[SLP] Fix scheduling crash for reordered insertvalue buildvector nodes
alexey-bataev Jun 20, 2026
cb85dfe
[VPlan] Skip shl->mul SCEV rewrite for out-of-range shift amounts. (#…
fhahn Jun 20, 2026
18c1cbc
[llvm-objcopy][MachO] Align __LINKEDIT entries to pointer size (#203680)
goranmoomin Jun 20, 2026
a891d7b
[llvm-objcopy][MachO] Use alignToPowerOf2 instead of alignTo (#204033)
drodriguez Jun 20, 2026
d0c2776
[BasicAA] Add additional tests with GEPs with phi/select pointer ops …
fhahn Jun 20, 2026
5502491
[VPlan] Properly check predicates and types in canNarrowOps. (#204948)
fhahn Jun 20, 2026
3c5f0c2
[VPlan] Add memory op decision test for scalarizing loads. (NFC) (#20…
fhahn Jun 20, 2026
959f069
[SelectionDAG] Keep split vector atomic store value in a vector regis…
jofrn Jun 20, 2026
61d601e
[AMDGPU][VOPD] Cache load reachability checks in VOPDpairing (#204854)
zGoldthorpe Jun 20, 2026
afac572
[clang] Add clang-format-check-format instead to CLANG_TEST_DEPS (#20…
owenca Jun 20, 2026
ec56065
workflows/new-prs: Remove obsolete code (#204955)
tstellar Jun 21, 2026
9d6c686
[orc-rt] Sink Session::sendWrapperResult into Session.cpp. NFC. (#204…
lhames Jun 21, 2026
7376a70
[tsan] fit Go/s390x mapping under QEMU (#204503)
tamird Jun 21, 2026
f42072e
[Analysis] Add `KnownBits` optimization for `pdep` and `pext` (#204223)
eisenwave Jun 21, 2026
8947e49
[InstCombine] Move alignment assumptions to the base of constant offs…
philnik777 Jun 21, 2026
71c2feb
Support for -fsplit-lto-unit option in flang driver (#204904)
shivaramaarao Jun 21, 2026
4417256
[LV] Avoid zero-width VF in computeVPlanOuterloopVF. (#204918)
fhahn Jun 21, 2026
9b36e4f
[orc-rt] Replace TaskDispatcher with Session-supplied wrapper-runner.…
lhames Jun 21, 2026
a12b7af
[X86] Select BLSI for i8 operands (#202344) (#204746)
Harishankar14 Jun 21, 2026
e0cc08d
[clang][x86] Add constexpr support for VNNI intrinsics (#190549)
ak-deo Jun 21, 2026
4f3eb80
[Xtensa] Call isUInt<8> in range-check asserts (#204731)
kernhanda Jun 21, 2026
3b46feb
[VPlan] Allow plain active lane mask in LastActiveLane verifier. (#20…
fhahn Jun 21, 2026
d6d4921
[gn] Fix missing dependency (#204991)
nico Jun 21, 2026
6542d6d
[ARM] Use lo tCMPr opcode when expanding CMP_SWAP (#204567)
davemgreen Jun 21, 2026
48c0a2a
Revert "[Legalizer] Add support for promoting integers for s/ucmp (#1…
AZero13 Jun 21, 2026
47fd9ed
[gn build] Port 60a2d437bd04 (#204996)
nico Jun 21, 2026
d3b48cc
[gn build] Port a64928f267f3 (#204997)
nico Jun 21, 2026
a323090
[VPlan] Add VPReplicateRecipe::getNumOperandsWithoutMask (NFC) (#205004)
fhahn Jun 21, 2026
5bb5410
[VPlan] Use pattern matching in isUsedByLoadStoreAddress (NFC) (#205008)
fhahn Jun 21, 2026
d1744cf
[orc-rt] Add InProcessControllerAccess class. (#204976)
lhames Jun 21, 2026
2e87cf8
[AtomicExpand] Add bitcasts when expanding store atomic vector (#197862)
jofrn Jun 22, 2026
bc047d4
[orc-rt]R Align scope-exit with LLVM (rename to scope_exit, use CTAD)…
lhames Jun 22, 2026
75fbd79
[lld-macho] Relax safe ICF's keepUnique for ld64-coalesced data secti…
nocchijiang Jun 22, 2026
fc7bcd0
[clang][RISCV] Handle VLS CC on unsupported primitive type in aggrega…
4vtomat Jun 22, 2026
f571aba
[llvm][RISCV] Revise xsfmm intrinsic interface. (#201527)
4vtomat Jun 22, 2026
b68b823
[Mips] Fix Clang crashes when assembling MIPS64r6 LDPC with non-8-byt…
yingopq Jun 22, 2026
72b891b
[clang] Avoid assertion on invalid member template specialization (#2…
w007878 Jun 22, 2026
6f98573
[orc-rt] Rename scope_exit header, add nodiscard attribute. (#205030)
lhames Jun 22, 2026
6a2128a
[ProfileData] Lazy-load fixed-length MD5 name table (#202014)
kazutakahirata Jun 22, 2026
1092b2b
[AMDGPU] Improve the description of asyncmark semantics (#202579)
ssahasra Jun 22, 2026
9f0b22c
[LoongArch] Custom scalar UINT_TO_FP and FP_TO_UINT with LSX instruct…
lrzlin Jun 22, 2026
de045d5
[orc-rt] Tidy up some SPS tag types. NFC. (#205038)
lhames Jun 22, 2026
15a3238
[AArch64] Lower extends of boolean vector loads via scalar load (#203…
he-weiwen Jun 22, 2026
6d66cc1
[orc-rt] Add SPS serialization for ExecutorAddrRange. (#205041)
lhames Jun 22, 2026
25e4057
[clang] Respect `CLANG_USE_EXPERIMENTAL_CONST_INTERP` (#200716)
tbaederr Jun 22, 2026
64ad10f
[AMDGPU][doc] Refactor Barrier Execution Model (#204566)
Pierre-vh Jun 22, 2026
7e09aaa
Merge llvm/main into amd-debug
mariusz-sikora-at-amd Jun 25, 2026
17 changes: 3 additions & 14 deletions .github/workflows/new-prs.yml
@@ -19,6 +19,8 @@ jobs:
     runs-on: ubuntu-24.04
     permissions:
       pull-requests: write
+    container:
+      image: "ghcr.io/llvm/amd64/ci-ubuntu-24.04-github-automation:latest@sha256:82b5304c5d99cf5d75a2334885aca57490cbb04f37d07fc49a10a2649824e526"
     # Only comment on PRs that have been opened for the first time, by someone
     # new to LLVM or to GitHub as a whole. Ideally we'd look for FIRST_TIMER
     # or FIRST_TIME_CONTRIBUTOR, but this does not appear to work. Instead check
@@ -33,26 +35,13 @@ jobs:
       (github.event.pull_request.author_association != 'MEMBER') &&
       (github.event.pull_request.author_association != 'OWNER')
     steps:
-      - name: Checkout Automation Script
-        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
-        with:
-          persist-credentials: false
-          sparse-checkout: llvm/utils/git/
-          ref: main
-
-      - name: Setup Automation Script
-        working-directory: ./llvm/utils/git/
-        run: |
-          pip install --require-hashes -r requirements.txt
-
       - name: Greet Author
-        working-directory: ./llvm/utils/git/
         env:
           GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
           ISSUE_NUMBER: ${{ github.event.pull_request.number }}
           PR_AUTHOR: ${{ github.event.pull_request.user.login }}
         run: |
-          python3 ./github-automation.py \
+          github-automation.py \
             --token "$GH_TOKEN" \
             pr-greeter \
             --issue-number "$ISSUE_NUMBER" \
2 changes: 1 addition & 1 deletion .github/workflows/subscriber.yml
@@ -25,7 +25,7 @@ jobs:
     if: github.repository == 'llvm/llvm-project'
     runs-on: ubuntu-24.04
     container:
-      image: "ghcr.io/llvm/amd64/ci-ubuntu-24.04-github-automation:latest@sha256:06164c484402046b0d624e5df8b3435a91ea7d204e2416201a9bac8d809b9aa6"
+      image: "ghcr.io/llvm/amd64/ci-ubuntu-24.04-github-automation:latest@sha256:82b5304c5d99cf5d75a2334885aca57490cbb04f37d07fc49a10a2649824e526"

     steps:
       - id: app-token
24 changes: 24 additions & 0 deletions bolt/include/bolt/Core/BinaryContext.h
@@ -45,6 +45,7 @@
 #include "llvm/Support/RWMutex.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/TargetParser/Triple.h"
+#include <atomic>
 #include <functional>
 #include <list>
 #include <map>
@@ -810,6 +811,29 @@ class BinaryContext {
   /// final addresses functions will have.
   uint64_t LayoutStartAddress{0};

+  /// Maximum alignment of objects emitted into the main (hot) and cold code
+  /// sections, populated by the parallel AlignerPass (updateMaxCodeAlignment).
+  std::atomic<uint16_t> MaxMainCodeAlignment{1};
+  std::atomic<uint16_t> MaxColdCodeAlignment{1};
+
+  /// Fold \p Alignment into the running max for the main code section (when
+  /// \p InMainSection) and/or the cold code section (when \p InColdSection),
+  /// reflecting which output section(s) the object is emitted into. Safe to
+  /// call concurrently.
+  void updateMaxCodeAlignment(uint16_t Alignment, bool InMainSection,
+                              bool InColdSection) {
+    auto AtomicMax = [](std::atomic<uint16_t> &Max, uint16_t Value) {
+      uint16_t Cur = Max.load(std::memory_order_relaxed);
+      while (Value > Cur &&
+             !Max.compare_exchange_weak(Cur, Value, std::memory_order_relaxed))
+        ;
+    };
+    if (InMainSection)
+      AtomicMax(MaxMainCodeAlignment, Alignment);
+    if (InColdSection)
+      AtomicMax(MaxColdCodeAlignment, Alignment);
+  }
+
   /// Old .text info.
   uint64_t OldTextSectionAddress{0};
   uint64_t OldTextSectionOffset{0};
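The `AtomicMax` lambda in `updateMaxCodeAlignment` above is a lock-free running maximum built on a compare-exchange loop. The standalone sketch below shows the same pattern outside BOLT. The names `atomicMax` and `foldAlignments` are hypothetical and exist only for illustration; the fold here runs on one thread, although `atomicMax` itself is written so that several threads could call it on the same atomic.

```cpp
#include <atomic>
#include <cstdint>
#include <vector>

// Raise Max to Value if Value is larger. Illustrative helper, not BOLT code.
inline void atomicMax(std::atomic<std::uint16_t> &Max, std::uint16_t Value) {
  std::uint16_t Cur = Max.load(std::memory_order_relaxed);
  // When compare_exchange_weak fails it writes the current stored value into
  // Cur, so the loop stops as soon as the stored value is already >= Value.
  while (Value > Cur &&
         !Max.compare_exchange_weak(Cur, Value, std::memory_order_relaxed))
    ;
}

// Fold a list of alignments into a running maximum that starts at 1, the same
// initial value the diff uses for MaxMainCodeAlignment/MaxColdCodeAlignment.
inline std::uint16_t
foldAlignments(const std::vector<std::uint16_t> &Alignments) {
  std::atomic<std::uint16_t> Max{1};
  for (std::uint16_t A : Alignments)
    atomicMax(Max, A);
  return Max.load(std::memory_order_relaxed);
}
```

Relaxed ordering is enough for the value itself because each update only needs to be atomic; whether readers also need ordering with respect to other data depends on how the pass manager sequences the passes, which this sketch does not model.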
20 changes: 20 additions & 0 deletions bolt/lib/Passes/Aligner.cpp
@@ -165,6 +165,26 @@ Error AlignerPass::runOnFunctions(BinaryContext &BC) {
     else
       alignMaxBytes(BF);

+    // Record the function's effective code alignment so layout passes can align
+    // the tentative section base to the eventual section alignment without
+    // re-scanning all functions. AssignSections (run just before this pass) has
+    // assigned the output sections, so route the alignment to whichever of
+    // .text / .text.cold the function actually emits into: a whole cold
+    // function (and its constant island) lands entirely in .text.cold, while a
+    // split function contributes its (duplicated) island and code to both.
+    const uint16_t Align = std::max<uint16_t>(
+        BF.getAlignment(),
+        BF.hasIslandsInfo() ? BF.getConstantIslandAlignment() : uint16_t(0));
+    const SmallString<32> MainSectionName = BF.getCodeSectionName();
+    const bool InMainSection =
+        StringRef(MainSectionName) == BC.getMainCodeSectionName();
+    bool InColdSection =
+        StringRef(MainSectionName) == BC.getColdCodeSectionName();
+    if (!InColdSection && BF.isSplit())
+      InColdSection = StringRef(BF.getCodeSectionName(FragmentNum::cold())) ==
+                      BC.getColdCodeSectionName();
+    BC.updateMaxCodeAlignment(Align, InMainSection, InColdSection);
+
     if (opts::AlignBlocks && !opts::PreserveBlocksAlignment)
       alignBlocks(BF, Emitter.MCE.get());
   };
11 changes: 8 additions & 3 deletions bolt/lib/Passes/LongJmp.cpp
@@ -317,7 +317,9 @@ void LongJmpPass::tentativeBBLayout(const BinaryFunction &Func) {
 uint64_t LongJmpPass::tentativeLayoutRelocColdPart(
     const BinaryContext &BC, BinaryFunctionListType &SortedFunctions,
     uint64_t DotAddress) {
-  DotAddress = alignTo(DotAddress, llvm::Align(opts::AlignFunctions));
+  DotAddress =
+      alignTo(DotAddress, std::max<uint64_t>(opts::AlignFunctions,
+                                             BC.MaxColdCodeAlignment.load()));
   for (BinaryFunction *Func : SortedFunctions) {
     if (!Func->isSplit())
       continue;
@@ -452,8 +454,11 @@ void LongJmpPass::tentativeLayout(const BinaryContext &BC,
     }
   }

-  if (!EstimatedTextSize || EstimatedTextSize > BC.OldTextSectionSize)
-    DotAddress = alignTo(BC.LayoutStartAddress, opts::AlignText);
+  if (!EstimatedTextSize || EstimatedTextSize > BC.OldTextSectionSize) {
+    uint64_t TextAlign =
+        std::max<uint64_t>(opts::AlignText, BC.MaxMainCodeAlignment.load());
+    DotAddress = alignTo(BC.LayoutStartAddress, TextAlign);
+  }

   tentativeLayoutRelocMode(BC, SortedFunctions, DotAddress);
 }
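Both LongJmp.cpp hunks above replace a fixed alignment (`opts::AlignFunctions` or `opts::AlignText`) with the larger of that option and the maximum alignment recorded by AlignerPass. A small worked example of the arithmetic follows. `alignUp` is a dependency-free stand-in for `llvm::alignTo`, and `tentativeSectionBase` is a hypothetical name; neither appears in the diff.

```cpp
#include <algorithm>
#include <cstdint>

// Round Value up to the next multiple of Align. Align must be non-zero.
// Stand-in for llvm::alignTo so the sketch needs no LLVM headers.
constexpr std::uint64_t alignUp(std::uint64_t Value, std::uint64_t Align) {
  return (Value + Align - 1) / Align * Align;
}

// Hypothetical helper: pick the tentative section base from the configured
// alignment and the largest alignment of any object placed in the section.
constexpr std::uint64_t tentativeSectionBase(std::uint64_t Start,
                                             std::uint64_t OptionAlign,
                                             std::uint64_t MaxObjectAlign) {
  return alignUp(Start, std::max(OptionAlign, MaxObjectAlign));
}

// With only the 16-byte option, 0x1004 rounds up to 0x1010.
static_assert(tentativeSectionBase(0x1004, 16, 1) == 0x1010, "option only");
// If some object in the section needs 64-byte alignment, the base moves to
// 0x1040, so the tentative base uses the same alignment as that object.
static_assert(tentativeSectionBase(0x1004, 16, 64) == 0x1040, "max wins");
```

Under the assumption that the emitted section is aligned to its most-aligned object, the second case is the one the change appears to target: a base aligned only to the option value could differ from the final address by up to `MaxObjectAlign - OptionAlign` bytes.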
8 changes: 5 additions & 3 deletions bolt/lib/Rewrite/BinaryPassManager.cpp
@@ -518,6 +518,11 @@ Error BinaryFunctionPassManager::runAllPasses(BinaryContext &BC) {

   Manager.registerPass(std::make_unique<Peepholes>(PrintPeepholes));

+  // Assign each function an output section before AlignerPass and LongJmpPass,
+  // so those passes can attribute per-section code alignment and tentative
+  // layout to the final .text / .text.cold sections.
+  Manager.registerPass(std::make_unique<AssignSections>());
+
   Manager.registerPass(std::make_unique<AlignerPass>());

   // Perform reordering on data contained in one or more sections using
@@ -555,9 +560,6 @@ Error BinaryFunctionPassManager::runAllPasses(BinaryContext &BC) {
   Manager.registerPass(
       std::make_unique<RetpolineInsertion>(PrintRetpolineInsertion));

-  // Assign each function an output section.
-  Manager.registerPass(std::make_unique<AssignSections>());
-
   // This pass turns tail calls into jumps which makes them invisible to
   // function reordering. It's unsafe to use any CFG or instruction analysis
   // after this point.
31 changes: 26 additions & 5 deletions bolt/lib/Rewrite/RewriteInstance.cpp
@@ -2284,15 +2284,16 @@ Error RewriteInstance::readSpecialSections() {
     BC->printSections(BC->outs());
   }

-  if (opts::RelocationMode == cl::BOU_TRUE && !HasTextRelocations) {
+  if (opts::RelocationMode == cl::boolOrDefault::BOU_TRUE &&
+      !HasTextRelocations) {
     BC->errs()
         << "BOLT-ERROR: relocations against code are missing from the input "
            "file. Cannot proceed in relocations mode (-relocs).\n";
     exit(1);
   }

-  BC->HasRelocations =
-      HasTextRelocations && (opts::RelocationMode != cl::BOU_FALSE);
+  BC->HasRelocations = HasTextRelocations &&
+                       (opts::RelocationMode != cl::boolOrDefault::BOU_FALSE);

   if (BC->IsLinuxKernel && BC->HasRelocations) {
     BC->outs() << "BOLT-INFO: disabling relocation mode for Linux kernel\n";
@@ -3917,8 +3918,28 @@ void RewriteInstance::runBinaryAnalyses() {
   NamedRegionTimer T("runBinaryAnalyses", "run binary analysis passes",
                      TimerGroupName, TimerGroupDesc, opts::TimeRewrite);
   BinaryFunctionPassManager Manager(*BC);
-  // FIXME: add a pass that warns about which functions do not have CFG,
-  // and therefore, analysis is most likely to be less accurate.
+
+  // Warn about functions for which BOLT could not reconstruct the CFG: binary
+  // analyses are less precise on them and may report both false negatives and
+  // false positives.
+  unsigned NoCFGCount = 0;
+  for (const auto &BFI : BC->getBinaryFunctions()) {
+    const BinaryFunction &BF = BFI.second;
+    // Skip ignored functions: BOLT does not attempt to build a CFG for them
+    // (e.g. pseudo functions such as PLT stubs), so a missing CFG there is
+    // expected rather than a sign of degraded analysis.
+    if (BF.isIgnored() || BF.hasCFG())
+      continue;
+    ++NoCFGCount;
+    if (opts::Verbosity >= 1)
+      BC->errs() << "BOLT-WARNING: no CFG for " << BF
+                 << "; binary analyses may be imprecise\n";
+  }
+  if (NoCFGCount)
+    BC->errs() << "BOLT-WARNING: " << NoCFGCount
+               << " function(s) lack CFG; binary-analysis results may be"
+                  " incomplete. Re-run with -v=1 to list these functions.\n";
+
   using PtrAuthScanner = PAuthGadgetScanner::Analysis;

   // Accumulate all enabled analyses.
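The first RewriteInstance.cpp hunk above only adds the `cl::boolOrDefault::` qualifier, which follows from `cl::boolOrDefault` becoming a scoped enum (commit bf18f6f in the list). The sketch below reproduces the same three-state logic with a hypothetical `BoolOrDefault` enum so it compiles without LLVM; the enumerator and function names are illustrative and are not the LLVM spellings.

```cpp
// Hypothetical stand-in for a three-state command-line flag. Because it is a
// scoped enum, enumerators must be qualified (BoolOrDefault::True), which is
// the kind of spelling change the diff makes for cl::boolOrDefault.
enum class BoolOrDefault { Unset, True, False };

// Mirrors the error condition: relocation mode was explicitly requested but
// the input has no relocations against code.
constexpr bool missingRequiredRelocations(bool HasTextRelocations,
                                          BoolOrDefault Mode) {
  return Mode == BoolOrDefault::True && !HasTextRelocations;
}

// Mirrors the HasRelocations computation: use relocations when they are
// present unless the user explicitly turned relocation mode off.
constexpr bool useRelocations(bool HasTextRelocations, BoolOrDefault Mode) {
  return HasTextRelocations && Mode != BoolOrDefault::False;
}

static_assert(useRelocations(true, BoolOrDefault::Unset), "default: on");
static_assert(!useRelocations(true, BoolOrDefault::False), "explicitly off");
static_assert(missingRequiredRelocations(false, BoolOrDefault::True),
              "requested but unavailable");
```

With an unscoped enum the bare enumerator names leak into the enclosing namespace; the scoped form rejects unqualified uses at compile time, which is why every call site needs the longer spelling.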
38 changes: 38 additions & 0 deletions bolt/test/binary-analysis/AArch64/cfg-warning.s
@@ -0,0 +1,38 @@
+## Verify that binary analyses warn about functions for which BOLT could not
+## reconstruct the CFG, since analysis results are less reliable for them.
+
+// RUN: %clang %cflags %s %p/../../Inputs/asm_main.c -o %t.exe
+// RUN: llvm-bolt-binary-analysis --scanners=ptrauth-pac-ret %t.exe 2>&1 \
+// RUN:   | FileCheck --check-prefix=SUMMARY %s
+// RUN: llvm-bolt-binary-analysis --scanners=ptrauth-pac-ret -v=1 %t.exe 2>&1 \
+// RUN:   | FileCheck --check-prefix=VERBOSE %s
+
+        .text
+
+## A function with a regular CFG must not be reported.
+        .globl f_good
+        .type f_good,@function
+f_good:
+        ret
+        .size f_good, .-f_good
+// SUMMARY-NOT: BOLT-WARNING:{{.*}}f_good
+// VERBOSE-NOT: BOLT-WARNING:{{.*}}f_good
+
+## An unanalyzable indirect branch prevents BOLT from building the CFG.
+        .globl f_nocfg
+        .type f_nocfg,@function
+f_nocfg:
+        adr x2, 1f
+        br x2
+1:
+        ret
+        .size f_nocfg, .-f_nocfg
+
+## Without -v, only the aggregate warning is emitted; functions are not listed
+## individually.
+// SUMMARY-NOT: BOLT-WARNING: no CFG for
+// SUMMARY: BOLT-WARNING: {{[0-9]+}} function(s) lack CFG; binary-analysis results may be incomplete. Re-run with -v=1 to list these functions.
+
+## With -v=1, each function lacking a CFG is listed before the summary.
+// VERBOSE: BOLT-WARNING: no CFG for {{.*}}f_nocfg{{.*}}; binary analyses may be imprecise
+// VERBOSE: BOLT-WARNING: {{[0-9]+}} function(s) lack CFG; binary-analysis results may be incomplete. Re-run with -v=1 to list these functions.
24 changes: 23 additions & 1 deletion clang-tools-extra/clangd/CompileCommands.cpp
@@ -135,6 +135,28 @@ std::string detectStandardResourceDir() {
   return GetResourcesPath("clangd", (void *)&StaticForMainAddr);
 }

+std::optional<std::string>
+detectResourceDirWithClangPath(std::optional<std::string> ClangPath) {
+  std::string ResourceDir = detectStandardResourceDir();
+  if (llvm::sys::fs::exists(ResourceDir))
+    return ResourceDir;
+  vlog("Auto-detected standard resource directory '{0}' doesn't exist",
+       ResourceDir);
+
+  if (ClangPath) {
+    ResourceDir = GetResourcesPath(*ClangPath);
+    if (llvm::sys::fs::exists(ResourceDir))
+      return ResourceDir;
+    vlog("Auto-detected using clang path '{0}' "
+         "resource directory '{1}' doesn't exist",
+         *ClangPath, ResourceDir);
+  }
+
+  elog("Failed to auto-detect resource directory, "
+       "specify it manually via --resource-dir command line argument");
+  return std::nullopt;
+}
+
 // The path passed to argv[0] is important:
 // - its parent directory is Driver::Dir, used for library discovery
 // - its basename affects CLI parsing (clang-cl) and other settings
@@ -188,7 +210,7 @@ static std::string resolveDriver(llvm::StringRef Driver, bool FollowSymlink,
 CommandMangler CommandMangler::detect() {
   CommandMangler Result;
   Result.ClangPath = detectClangPath();
-  Result.ResourceDir = detectStandardResourceDir();
+  Result.ResourceDir = detectResourceDirWithClangPath(Result.ClangPath);
   Result.Sysroot = detectSysroot();
   return Result;
 }
4 changes: 4 additions & 0 deletions clang/docs/LanguageExtensions.rst
@@ -905,6 +905,10 @@ T __builtin_elementwise_fshr(T x, T y, T z) perform a funnel shift right. Co
                                                 first argument is 0 and no second argument is provided.
  T __builtin_elementwise_clmul(T x, T y)       perform a carry-less multiplication of x and y, returning the least    integer types
                                                significant bits of the wide result.
+ T __builtin_elementwise_pext(T x, T m)        extract bits from x selected by the mask m, pack them contiguously     integer types
+                                               into the least significant bits of the result, and zero the rest.
+ T __builtin_elementwise_pdep(T x, T m)        deposit the least significant bits of x at the positions               integer types
+                                               where m has a 1-bit, and zero the rest.
 ============================================== ====================================================================== =========================================


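The LanguageExtensions.rst rows above describe `__builtin_elementwise_pext` and `__builtin_elementwise_pdep` in words. As a reading aid, here is a bit-by-bit reference model of those descriptions for a single 64-bit value. `pextRef` and `pdepRef` are hypothetical names, the loops are written for clarity rather than speed, and the sketch does not claim to match how Clang lowers the builtins or how they behave on vector or `_BitInt` operands.

```cpp
#include <cstdint>

// Parallel bit extract: collect the bits of X at positions where M has a
// 1-bit and pack them contiguously into the low bits of the result.
constexpr std::uint64_t pextRef(std::uint64_t X, std::uint64_t M) {
  std::uint64_t Result = 0;
  unsigned Out = 0; // next free position in the packed result
  for (unsigned Bit = 0; Bit < 64; ++Bit) {
    if ((M >> Bit) & 1u) {
      Result |= ((X >> Bit) & 1u) << Out;
      ++Out;
    }
  }
  return Result;
}

// Parallel bit deposit: scatter the low bits of X to the positions where M
// has a 1-bit; every other result bit is zero.
constexpr std::uint64_t pdepRef(std::uint64_t X, std::uint64_t M) {
  std::uint64_t Result = 0;
  unsigned In = 0; // next low bit of X to consume
  for (unsigned Bit = 0; Bit < 64; ++Bit) {
    if ((M >> Bit) & 1u) {
      Result |= ((X >> In) & 1u) << Bit;
      ++In;
    }
  }
  return Result;
}

// Extracting the high nibble of 0xB2 packs it into the low nibble.
static_assert(pextRef(0xB2, 0xF0) == 0xB, "pext packs selected bits");
// Depositing 0xB under the same mask puts it back in the high nibble.
static_assert(pdepRef(0xB, 0xF0) == 0xB0, "pdep scatters low bits");
```

In this model a zero first operand yields zero for any mask, which is consistent with the `pdep(0,x) -> 0` and `pext(0,x) -> 0` folds named in the InstSimplify commit title in the commit list.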
8 changes: 7 additions & 1 deletion clang/docs/LifetimeSafety.rst
@@ -462,6 +462,12 @@ more accurate checks in calling code.

 To enable annotation suggestions, use ``-Wlifetime-safety-suggestions``.

+Fix-it hints normally insert ``[[clang::lifetimebound]]``. If a visible
+object-like macro expands to ``[[clang::lifetimebound]]`` or
+``__attribute__((lifetimebound))``, Clang will use the last such macro
+visible at the insertion point. To force a project-specific macro spelling,
+use ``-lifetime-safety-lifetimebound-macro=<macro>``.
+
 .. code-block:: c++

   #include <string_view>
@@ -688,5 +694,5 @@ Performance
 Lifetime analysis relies on Clang's CFG (Control Flow Graph). For functions
 with very large or complex CFGs, analysis time can sometimes be significant. To mitigate
 this, the analysis allows to skip functions where the number of CFG blocks exceeds
-a certain threshold, controlled by the ``-flifetime-safety-max-cfg-blocks=N`` language
+a certain threshold, controlled by the ``-lifetime-safety-max-cfg-blocks=N`` language
 option.
5 changes: 5 additions & 0 deletions clang/docs/ReleaseNotes.rst
@@ -305,6 +305,10 @@ Non-comprehensive list of changes in this release
   integers including ``_BitInt`` types. This includes constexpr evaluation
   support.

+- Added ``__builtin_elementwise_pext`` and ``__builtin_elementwise_pdep`` for
+  parallel bit extract and parallel bit deposit operations on integers including
+  ``_BitInt`` types. This includes constexpr evaluation support.
+
 - Deprecated float types support from ``__builtin_elementwise_max`` and
   ``__builtin_elementwise_min``.

@@ -842,6 +846,7 @@ Miscellaneous Clang Crashes Fixed
 - Fixed an assertion failure in ``isAtEndOfMacroExpansion`` on macro expansions crossing the boundary of two fileIDs. (#GH115007), (#GH21755)
 - Fixed an assertion failure when ``__builtin_dump_struct`` is used with an
   immediate-escalated callable. (#GH192846)
+- Fixed a crash when diagnosing an invalid out-of-line definition of a member class template. (#GH201490)

 OpenACC Specific Changes
 ------------------------
12 changes: 12 additions & 0 deletions clang/include/clang/Basic/Builtins.td
@@ -1835,6 +1835,18 @@ def ElementwiseClmul : Builtin {
   let Prototype = "void(...)";
 }

+def ElementwisePext : Builtin {
+  let Spellings = ["__builtin_elementwise_pext"];
+  let Attributes = [NoThrow, Const, CustomTypeChecking, Constexpr];
+  let Prototype = "void(...)";
+}
+
+def ElementwisePdep : Builtin {
+  let Spellings = ["__builtin_elementwise_pdep"];
+  let Attributes = [NoThrow, Const, CustomTypeChecking, Constexpr];
+  let Prototype = "void(...)";
+}
+
 def ReduceMax : Builtin {
   let Spellings = ["__builtin_reduce_max"];
   let Attributes = [NoThrow, Const, CustomTypeChecking, Constexpr];