Skip to content

[AMDGPU][COMGR] Remove COMGR hotswap HSA tool#3007

Merged
chinmaydd merged 1 commit into
ROCm:amd-stagingfrom
harsh-amd:comgr-hotswap-tool-removal
Jun 23, 2026
Merged

[AMDGPU][COMGR] Remove COMGR hotswap HSA tool#3007
chinmaydd merged 1 commit into
ROCm:amd-stagingfrom
harsh-amd:comgr-hotswap-tool-removal

Conversation

@harsh-amd

@harsh-amd harsh-amd commented Jun 22, 2026

Copy link
Copy Markdown

Summary

  • Stop building and installing the COMGR-owned HSA tools library for hotswap.
  • Remove the COMGR hotswap tool loader/device-detection tests that only covered libamd_comgr_hotswap_tool.so.
  • Keep the hotswap README scoped to COMGR-owned API and transpiler documentation after removing the runtime-tool docs.

Stack

  1. This PR: COMGR hotswap tool removal.
  2. AMDGPU: add hotswap entry trampoline core #3008: entry trampoline implementation.
  3. [AMDGPU][COMGR] Add hotswap text displacement infrastructure #3000: displacement infrastructure.

Testing

  • git diff --check rocm/amd-staging..comgr-hotswap-tool-removal

Comment thread amd/comgr/src/hotswap/README.md Outdated
@chinmaydd

Copy link
Copy Markdown

I dont see anything relevant to this change on the tagged SWDEV. Please correct me if I'm missing something

@chinmaydd chinmaydd added the comgr Related to Code Object Manager label Jun 22, 2026
@harsh-amd harsh-amd force-pushed the comgr-hotswap-tool-removal branch from 5c99247 to f72a250 Compare June 22, 2026 17:22
@harsh-amd

Copy link
Copy Markdown
Author

I dont see anything relevant to this change on the tagged SWDEV. Please correct me if I'm missing something

Thanks, I have removed all references to that SWDEV.

@harsh-amd harsh-amd force-pushed the comgr-hotswap-tool-removal branch from f72a250 to 6ebc89d Compare June 22, 2026 17:30
@harsh-amd harsh-amd force-pushed the comgr-hotswap-tool-removal branch from 6ebc89d to 59385ff Compare June 22, 2026 17:39
@chinmaydd

chinmaydd commented Jun 22, 2026

Copy link
Copy Markdown

Thanks @harsh-amd.

I see that 6007 on TheRock was merged. We should probably remove the toggling / passing of relevant flags on there.

@nirmie

nirmie commented Jun 22, 2026

Copy link
Copy Markdown

I think we need to export the amd_comgr.h header so that the tool in rocm-systems can use it, right now the cmakelists.txt for amd/comgr/CMakeLists.txt only goes one directory deep into the /include so it doesnt find "amd_comgr.h"

When I tried to build manually I ran into this issue, it was resolved by going to line 432 and replacing:

target_include_directories(amd_comgr
  PUBLIC
    $<BUILD_INTERFACE:${CMAKE_CURRENT_BINARY_DIR}/include>
    $<INSTALL_INTERFACE:include>)

With

target_include_directories(amd_comgr
  PUBLIC
    $<BUILD_INTERFACE:${CMAKE_CURRENT_BINARY_DIR}/include>
    $<INSTALL_INTERFACE:${CMAKE_INSTALL_INCLUDEDIR}>
    $<INSTALL_INTERFACE:${CMAKE_INSTALL_INCLUDEDIR}/${amd_comgr_NAME}>)

A new error seems to appear after that though, figuring that out right now - 'amd_comgr_hotswap_rewrite' was
not declared

@chinmaydd

Copy link
Copy Markdown

That change is probably unrelated to this PR.

Also, the runtime uses Comgr APIs by referencing the exact header you mentioned : https://github.com/ROCm/rocm-systems/blob/develop/projects/clr/rocclr/device/comgrctx.hpp#L11

I'd recommend you figure out how the import is working there.

@harsh-amd

harsh-amd commented Jun 22, 2026

Copy link
Copy Markdown
Author

@nirmie

nirmie commented Jun 22, 2026

Copy link
Copy Markdown

I see, I think we just need uniformity in how we are referencing amd_comgr.h. In another PR in rocm-systems we can update https://github.com/harsh-amd/rocm-systems/blob/hotswap-revert-7342/projects/hotswap/hotswap.cpp#L8 to use the "amd_comgr/amd_comgr.h".

@harsh-amd

Copy link
Copy Markdown
Author

I see, I think we just need uniformity in how we are referencing amd_comgr.h. In another PR in rocm-systems we can update https://github.com/harsh-amd/rocm-systems/blob/hotswap-revert-7342/projects/hotswap/hotswap.cpp#L8 to use the "amd_comgr/amd_comgr.h".

Okay lets plan to do that then

@chinmaydd

Copy link
Copy Markdown

Not sure why the gfx90a test isnt running here. Its the only required check thats preventing us from merging this in.

@harsh-amd

Copy link
Copy Markdown
Author

Not sure why the gfx90a test isnt running here. Its the only required check thats preventing us from merging this in.

Can you manually retrigger it?

@chinmaydd

Copy link
Copy Markdown

!PSDB

@chinmaydd

Copy link
Copy Markdown

I've asked on the CI channel internally, we should get some help soon.

@chinmaydd

Copy link
Copy Markdown

Ok, I think I could trigger the build

@chinmaydd chinmaydd enabled auto-merge (squash) June 23, 2026 03:08
@chinmaydd chinmaydd merged commit 8535f28 into ROCm:amd-staging Jun 23, 2026
57 of 60 checks passed
@lamb-j lamb-j added the hotswap Related to the Comgr Hotswap feature label Jun 23, 2026
harsh-amd added a commit to ROCm/rocm-systems that referenced this pull request Jun 26, 2026
… header paths (#7629)

## Motivation
This PR contains two components. One component is to update to use the
tooling bundled in projects/hotswap instead of the tool that exists at
comgr (deprecated). The other component is to resolve amd_comgr header
issues during the build. Additionally, there are some changes to the
CMakeLists.txt to have the hotswap HSA tool against
hsa-runtime64::hsa-runtime64.

<!-- Explain the purpose of this PR and the goals it aims to achieve.
-->

## Technical Details
~~Builds on top of Harsh's PRs here: #7577~~ (deprecated, #7577 code is
directly in this)
Relevant PRs: amd-llvm: ROCm/llvm-project#3007
<!-- Explain the changes along with any relevant GitHub links. -->

## JIRA ID
N/A
<!-- If applicable, mention the JIRA ID resolved by this PR (Example:
Resolves SWDEV-12345). -->
<!-- Do not post any JIRA links here. -->

## Test Plan
Contained build for the tool (verifies if linking resolves)
```
git sparse-checkout add projects/rocr-runtime/runtime/hsa-runtime/inc 2>/dev/null || true

P=~/hotswap/npi-build/_gapfix/comgr-prefix
DIST=~/hotswap/npi-build/therock/build/dist/rocm

cmake -S projects/hotswap -B /tmp/build-hotswap -GNinja -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_PREFIX_PATH="$DIST" \
  -Damd_comgr_DIR="$P/lib/cmake/amd_comgr" \
  -Dhsa-runtime64_DIR="$DIST/lib/cmake/hsa-runtime64"

cmake --build /tmp/build-hotswap --target hsa-hotswap
```
Tool name testing must be done with full build
<!-- Explain any relevant testing done to verify this PR. -->

## Test Result
Ran:
ls -l /tmp/build-hotswap/libhsa-hotswap.so
nm -D /tmp/build-hotswap/libhsa-hotswap.so | grep hotswap_rewrite
```
nm -D /tmp/build-hotswap/libhsa-hotswap.so | grep hotswap_rewrite
-rwxrwxr-x 1 nisenthi nisenthi 50480 Jun 22 18:45 /tmp/build-hotswap/libhsa-hotswap.so
                 U amd_comgr_hotswap_rewrite@amd_comgr_3.2
```
<!-- Briefly summarize test outcomes. -->

## Submission Checklist

- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
lamb-j added a commit to ROCm/TheRock that referenced this pull request Jun 29, 2026
…6155)

ISSUE ID: #6096

## Summary

Pin-only amd-llvm SMP bump: `46fcb339` (ww19.3) → `aa451e1f` (ww19.4),
adding one commit — ROCm/llvm-project#3007, which **removes the
comgr-side hotswap HSA tool** (`libamd_comgr_hotswap_tool.so`). COMGR
retains the `amd_comgr_hotswap_rewrite` API; the HSA_TOOLS_LIB tool now
lives in rocm-systems (`libhsa-hotswap.so`, separate PRs).

## Why this is safe standalone

- Nothing in TheRock references/validates
`libamd_comgr_hotswap_tool.so`, so removing it breaks no build or
packaging step.
- TheRock currently passes `-DHOTSWAP_BUILD_TOOL=ON` to comgr; after
#3007 that option no longer exists, so it becomes a **harmless
unused-variable warning**, not an error.
- The runtime hotswap path is `HSA_TOOLS_LIB`-gated (unset by default),
so there is no runtime behavior change.

## Scope

This PR is **only** the `compiler/amd-llvm` pin. The companion TheRock
CMake cleanup (dropping the dead `HOTSWAP_BUILD_TOOL` wiring, building
the rocm-systems `libhsa-hotswap.so`) and the rocm-systems SMP bump land
separately.

---------

Co-authored-by: theRonShark <ron.lieberman@amd.com>
lamb-j added a commit to ROCm/TheRock that referenced this pull request Jun 29, 2026
Integrate the relocated hotswap HSA tool. The HSA_TOOLS_LIB tool moved from
comgr (libamd_comgr_hotswap_tool.so, removed by ROCm/llvm-project#3007) to
rocm-systems projects/hotswap (libhsa-hotswap.so); comgr keeps only the
amd_comgr_hotswap_rewrite API.

- compiler/CMakeLists.txt: drop the removed HOTSWAP_BUILD_TOOL args; keep
  COMGR_ENABLE_HOTSWAP_TRANSPILE.
- core/CMakeLists.txt + core/artifact-core-runtime.toml: declare the
  hsa-hotswap subproject (rocm-systems projects/hotswap; deps amd-comgr +
  ROCR-Runtime) and package libhsa-hotswap.so into the core-runtime artifact.
- tests/test_rocm_sanity.py: add test_hotswap_tool_loads. When hotswap is
  enabled (libamd_comgr.so exports amd_comgr_hotswap_rewrite), libhsa-hotswap.so
  must be packaged and load cleanly under ROCr (rocminfo triggers hsa_init ->
  ROCr dlopens HSA_TOOLS_LIB tools). The allowlist is gfx1250->gfx1250 only, so
  the tool stays inert on other targets and rocminfo still succeeds. Skips when
  hotswap is disabled.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comgr Related to Code Object Manager hotswap Related to the Comgr Hotswap feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants