Add CLAUDE.md files for Claude Code guidance #1273
Open: tgymnich wants to merge 5 commits into main from tim/claude
Commits (5):
1c040ae Add CLAUDE.md files for Claude Code guidance (tgymnich)
616e8ca Remove @ imports and mentions of water/waveasm CLAUDE.md from root (tgymnich)
14a36a3 Add AGENTS.md (tgymnich)
94d6164 Add clang-format formatting guidance to water and waveasm AGENTS.md (tgymnich)
8c9cd08 Update AGENTS.md files with build instructions and formatting guidance (tgymnich)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,75 @@ | ||
| Wave is a Python DSL for high-performance ML kernel development targeting AMD GPUs (ROCm). The default compilation path is pure Python using IREE for codegen. Water and WaveASM are optional C++ extensions that replace parts of the IREE path. | ||
|
|
||
| ## Commands | ||
|
|
||
| ### Setup | ||
| ```bash | ||
| python -m venv .venv && source .venv/bin/activate | ||
| pip install -r requirements-iree-pinned.txt | ||
| pip install -r pytorch-cpu-requirements.txt # CPU-only dev/testing | ||
| pip install -e ".[dev]" | ||
| pre-commit install && pre-commit install --hook-type commit-msg | ||
| ``` | ||
|
|
||
| ### Testing | ||
| ```bash | ||
| pytest -n 4 --capture=tee-sys -vv ./tests/unittests/ # unit tests | ||
| pytest -s tests/unittests/test_file.py::test_name -v # single test | ||
| lit lit_tests/ -vv # MLIR LIT tests | ||
| pytest -s tests/ --run-e2e # GPU tests (requires hardware) | ||
| ``` | ||
|
|
||
| ### Linting | ||
| ```bash | ||
| mypy # type check wave_lang | ||
| pre-commit run # run Black, Ruff, clang-format against currently staged files | ||
| ``` | ||
|
|
||
| ### Gotchas | ||
| - **Always set `WAVE_CACHE_ON=0`** when testing code changes — stale cache entries hide the effect of edits: `WAVE_CACHE_ON=0 pytest ...` | ||
| - Dump MLIR for debugging: `pytest --dump-mlir-files-path=/tmp/mlir tests/` | ||
|
|
||
| ## Architecture | ||
|
|
||
| ### Compilation Flow | ||
|
|
||
| ``` | ||
| Wave Python DSL | ||
| ↓ graph transformation passes [wave_lang/kernel/wave/codegen/] | ||
| Transformed FX graph | ||
| ↓ WaveEmitter [compiler/wave_codegen/emitter.py] | ||
| stream.executable MLIR | ||
| ↓ iree.compiler.compile_str() [wave/utils/compile_utils.py] | ||
| VMFB (IREE bytecode module) | ||
| ↓ iree.runtime.VmModule | ||
| GPU kernel execution | ||
| ``` | ||
|
|
||
| Entry point: `wave_compile()` in `wave_lang/kernel/wave/compile.py`. | ||
|
|
||
| ### Runtimes | ||
|
|
||
| **IREE runtime (default):** Loads VMFB into IREE's VM. Handles GPU command buffers, queue submission, benchmarking, multi-device. | ||
|
|
||
| **Wave runtime (`options.wave_runtime=True`):** Launches HSACO kernels directly via HIP API. Supports dynamic strides and custom grid layout. Typically paired with WaveASM. Entry point: `invoke_with_wave_runtime()` in `wave_lang/kernel/wave/utils/run_utils.py`. | ||
|
|
||
| ### Key Source Locations | ||
|
|
||
| - `wave_lang/kernel/wave/compile.py` — pipeline orchestration, backend/runtime selection | ||
| - `wave_lang/kernel/wave/codegen/` — graph transformation passes (scheduling, barriers, index analysis) | ||
| - `wave_lang/kernel/compiler/wave_codegen/emitter.py` — lowers FX graph to MLIR | ||
| - `wave_lang/kernel/wave/water.py` — Water/WaveASM lowering pipeline entry points | ||
| - `wave_lang/kernel/wave/mlir_converter/` — Wave FX ↔ Water MLIR conversion; runs in a subprocess to avoid MLIR library conflicts (Water backend only) | ||
|
|
||
| ### Optional Extensions | ||
|
|
||
| Water and WaveASM intercept MLIR before IREE and produce HSACO directly. Enable via env vars: | ||
|
|
||
| | Variable | Purpose | | ||
| |---|---| | ||
| | `WAVE_BUILD_WATER=1` | Build Water from source | | ||
| | `WAVE_BUILD_WAVEASM=1` | Build WaveASM from source | | ||
| | `WAVE_WATER_DIR=water/build` | Use existing Water build (fast) | | ||
| | `WAVE_WAVEASM_DIR=waveasm/build` | Use existing WaveASM build (fast) | | ||
|
|
||
| When both are active: stream.executable MLIR → `water-opt` → `waveasm-translate` → `water-opt` → ExecutionEngine. |
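Editorial sketch for the Gotchas section above, not part of the PR diff: one way to make the "always set `WAVE_CACHE_ON=0`" rule hard to forget is to build the test environment and command line in a small helper. The helper names `uncached_env` and `pytest_command` are hypothetical; only the environment variable and the `--dump-mlir-files-path` flag come from the file above.

```python
import os
import sys


def uncached_env(base=None):
    """Return a copy of the environment with the Wave kernel cache disabled."""
    env = dict(os.environ if base is None else base)
    env["WAVE_CACHE_ON"] = "0"  # variable name taken from the Gotchas section
    return env


def pytest_command(target, mlir_dump_dir=None):
    """Build a pytest command line using the flags quoted in the file above."""
    cmd = [sys.executable, "-m", "pytest", "-s", target, "-v"]
    if mlir_dump_dir is not None:
        # Optional MLIR dump for debugging, as described under Gotchas.
        cmd.append(f"--dump-mlir-files-path={mlir_dump_dir}")
    return cmd


cmd = pytest_command("tests/unittests/", mlir_dump_dir="/tmp/mlir")
env = uncached_env()
# Inside a Wave checkout this could be run with:
#   import subprocess; subprocess.run(cmd, env=env, check=False)
```

The helper copies the environment instead of mutating `os.environ`, so the cache setting applies only to the spawned test run.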
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| See @AGENTS.md |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,103 @@ | ||
| Water is an optional MLIR layer in the Wave compiler stack that replaces IREE's middle-end lowering. It defines the `wave.*` and `normalform.*` dialects, transformation passes, and Python bindings (`water_mlir` package). | ||
|
|
||
| ## Building | ||
|
|
||
| Water must be built with CMake first. `pip install` alone does not build Water — `WAVE_WATER_DIR` is required to point Wave at an existing Water build. | ||
|
|
||
| The LLVM commit is pinned in `water/llvm-sha.txt`. CLI tool: `water-opt` (analogous to `mlir-opt`). | ||
|
|
||
| ### Step 1: Build Water with CMake | ||
|
|
||
| Requires a pre-built LLVM/MLIR. Set `$BUILD_DIR` to your LLVM build or install tree. | ||
|
|
||
| ```bash | ||
| # Configure | ||
| cmake -G Ninja \ | ||
| -B water/build \ | ||
| water/ \ | ||
| -DMLIR_DIR=$BUILD_DIR/lib/cmake/mlir \ | ||
| -DBUILD_SHARED_LIBS=ON \ | ||
| -DPython3_EXECUTABLE="$(which python)" \ | ||
| -DWATER_ENABLE_PYTHON=ON | ||
|
|
||
| # Optional: faster builds with clang + ccache + lld | ||
| cmake -B water/build \ | ||
| -DCMAKE_C_COMPILER=clang \ | ||
| -DCMAKE_CXX_COMPILER=clang++ \ | ||
| -DCMAKE_C_COMPILER_LAUNCHER=ccache \ | ||
| -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \ | ||
| -DLLVM_USE_LINKER=lld | ||
|
|
||
| # Build | ||
| cmake --build water/build | ||
| ``` | ||
|
|
||
| ### Step 2: Install Wave with Water bindings | ||
|
|
||
| ```bash | ||
| WAVE_WATER_DIR=water/build pip install -e ".[dev]" | ||
| ``` | ||
|
|
||
| `WAVE_WATER_DIR` tells Wave where to find the Water build. Without it, Water is not included. | ||
|
|
||
| ### Iterating on C++ changes | ||
|
|
||
| ```bash | ||
| ninja -C water/build # rebuild changed C++ targets and Python bindings | ||
| ``` | ||
|
|
||
| ## Formatting | ||
|
|
||
| C++ code is formatted with `git clang-format`, which formats only the lines changed relative to a commit (default: `HEAD`). | ||
| ```bash | ||
| git clang-format # format staged changes | ||
| git clang-format HEAD~1 # also include most recent commit | ||
| git clang-format main # format everything touched on your branch | ||
| ``` | ||
|
|
||
| ## Testing | ||
|
|
||
| ```bash | ||
| ninja -C water/build check-water # all lit tests | ||
| lit test/Dialect/Wave/<test>.mlir -vv # single test | ||
| ``` | ||
|
|
||
| Tests use lit + FileCheck. `.mlir` files use `// CHECK` comments. Negative tests are named `*-invalid.mlir`. | ||
|
|
||
| ## Architecture | ||
|
|
||
| ### Dialects | ||
|
|
||
| **`wave.*`** — primary dialect. `wave.tensor` has symbolic shapes (unknown until inferred by passes) and an address space (`Global`, `Shared`, `Register`). Each op carries a `WaveIndexMappingAttr` encoding element distribution across device/workgroup/workitem/register dimensions as `(offset, count, step)` triples. | ||
|
|
||
| **`normalform.*`** — `normalform.module` wraps IR and enforces declared invariants. Passes declare pre/post-conditions as normal form attributes, enabling composable pass ordering without new IR constructs. | ||
|
|
||
| ### Pass Pipeline | ||
|
|
||
| `water-middle-end-lowering` runs these in order (`include/water/Dialect/Wave/Transforms/Passes.td`): | ||
|
|
||
| | Pass | Purpose | | ||
| |---|---| | ||
| | `water-wave-detect-normal-forms` | Detect satisfied invariants | | ||
| | `water-wave-infer-types` | Shape inference via dataflow | | ||
| | `water-wave-infer-index-exprs` | Forward/backward index expression propagation | | ||
| | `water-wave-propagate-elements-per-thread` | Replace register tensors with vector types | | ||
| | `water-wave-resolve-distributed-allocations` | Map distributed shapes to concrete memref layouts | | ||
| | `lower-wave-to-mlir` | Lower to arith/math/vector/memref dialects | | ||
| | `lower-normalform-module` | Remove the normalform wrapper | | ||
|
|
||
| Generic passes include SLP vectorization, bounds-checking assertions, alloc-to-alloca, and GPU module serialization (ROCDL). | ||
|
|
||
| ### Python Bindings | ||
|
|
||
| Package `water_mlir` (prefixed to avoid IREE conflicts): | ||
| - `water_mlir.dialects.wave` — auto-generated op bindings from `WaveOps.td` | ||
| - `water_mlir.sympy_to_affine_converter` — converts SymPy expressions to MLIR affine expressions | ||
| - C++ extension via nanobind (`WaterExtensionNanobind.cpp`) | ||
|
|
||
| ### Key Design Principles | ||
|
|
||
| - **Lazy type inference**: `wave.tensor` shapes start unknown — don't assume they're set at construction. | ||
| - **Elements-per-thread (EPT)**: tracked separately from types; required before register tensors can be lowered to vector types. A pass that changes element counts must update EPT. | ||
| - **`water_mlir` prefix**: the Python package is prefixed to avoid conflicts with IREE's MLIR bindings. Import as `from water_mlir.dialects import wave`, not `mlir.dialects.wave`. | ||
| - **subprocess isolation**: the Wave-side `mlir_converter` runs Water in a subprocess specifically to avoid MLIR library symbol clashes with IREE. |
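Editorial sketch for the `(offset, count, step)` triples mentioned under Dialects, not part of the PR diff. It assumes a triple denotes the elements `offset + i * step` for `i` in `[0, count)`; that reading is an assumption, and the `IndexTriple` class is hypothetical. In Water itself these values live in `WaveIndexMappingAttr` and may be symbolic rather than plain integers.

```python
from typing import List, NamedTuple


class IndexTriple(NamedTuple):
    """Hypothetical integer-only stand-in for one (offset, count, step) triple."""

    offset: int
    count: int
    step: int

    def elements(self) -> List[int]:
        # Assumed meaning: `count` elements starting at `offset`, `step` apart.
        return [self.offset + i * self.step for i in range(self.count)]


# Thread 3 of a row where each thread owns 4 contiguous elements.
contiguous = IndexTriple(offset=3 * 4, count=4, step=1)
print(contiguous.elements())  # [12, 13, 14, 15]

# A strided distribution: 3 elements, 16 apart, starting at element 2.
strided = IndexTriple(offset=2, count=3, step=16)
print(strided.elements())  # [2, 18, 34]
```

Under this reading, the `count` of a register-dimension triple corresponds to the elements-per-thread value that the design principles say must be kept up to date.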
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| See @AGENTS.md |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,60 @@ | ||
| WaveASM is an optional C++ backend in the Wave compiler stack that replaces IREE's GPU codegen. It translates MLIR into AMDGCN assembly for AMD GPUs (gfx942/CDNA3, gfx950/CDNA3.5, gfx1250/RDNA4) and produces `.hsaco` binaries via its own `waveasm.*` MLIR dialect, linear-scan register allocator, and assembly emitter. | ||
|
|
||
| ## Building | ||
|
|
||
| ```bash | ||
| # First build | ||
| WAVE_BUILD_WAVEASM=1 pip install -e ".[dev]" | ||
|
|
||
| # Iterating on C++ changes (same pattern as Water) | ||
| ninja -C waveasm/build | ||
| pip install -e ".[dev]" # re-links extension, skips CMake | ||
| ``` | ||
|
|
||
| Set `WAVE_WAVEASM_DIR=waveasm/build` after the first build to avoid full rebuilds on pip install. CLI tool: `waveasm-translate`. | ||
|
|
||
| ## Formatting | ||
|
|
||
| C++ code is formatted with `git clang-format`, which formats only the lines changed relative to a commit (default: `HEAD`). | ||
|
|
||
| ```bash | ||
| git clang-format # format staged changes | ||
| git clang-format HEAD~1 # also include most recent commit | ||
| git clang-format main # format everything touched on your branch | ||
| ``` | ||
|
|
||
| ## Testing | ||
|
|
||
| ```bash | ||
| ninja -C waveasm/build check-waveasm # lit regression tests | ||
| ninja -C waveasm/build check-waveasm-all # + GPU functional tests (requires hardware) | ||
| lit test/Transforms/<test>.mlir -vv # single test | ||
| ``` | ||
|
|
||
| ## Architecture | ||
|
|
||
| ### Compilation Pipeline | ||
|
|
||
| ``` | ||
| Input MLIR (gpu, arith, vector, memref, scf, amdgpu dialects) | ||
| ↓ TranslateFromMLIR [lib/Transforms/TranslateFromMLIR.cpp] | ||
| WaveASM IR (virtual registers, pseudo-ops) | ||
| ↓ ScopedCSE, Peephole, BufferLoadStrengthReduction | ||
| ↓ ArithLegalization | ||
| Concrete SALU/VALU machine ops | ||
| ↓ Liveness → LinearScanRegAlloc → VGPRCompaction | ||
| Physical register assignments | ||
| ↓ Ticketing, HazardMitigation | ||
| ↓ AssemblyEmitter → clang++ | ||
| .hsaco GPU binary | ||
| ``` | ||
|
|
||
| ### Dialect | ||
|
|
||
| Types (`WaveASMTypes.td`): virtual (`!waveasm.vreg/sreg/areg`) and physical (`!waveasm.pvreg/psreg/pareg`) register types, plus `!waveasm.imm` and `!waveasm.scc`. The two-phase virtual→physical split is intentional — optimization passes run on virtual SSA, allocation happens once at the end. | ||
|
|
||
| ~300 machine ops in `WaveASMOps.td`: VALU, SALU, MFMA, memory (global/LDS/SMEM), control flow, and utility ops. Pseudo-ops (`waveasm.arith.*`) exist for cases where the concrete instruction depends on register class — ArithLegalization resolves them. | ||
|
|
||
| ### Adding New Dialect Support | ||
|
|
||
| `TranslateFromMLIR` uses a handler registry. To translate a new upstream op, add a handler to the appropriate file in `lib/Transforms/handlers/` and register it in the `TranslationContext`. The `TranslationContext` also manages the SRD (Shader Resource Descriptor) table and expression cache — use it rather than tracking state locally in handlers. |
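Editorial sketch of the handler-registry pattern described under "Adding New Dialect Support", not part of the PR diff. The real implementation is C++; this Python analogue only illustrates the shape of the idea. Every name here (the `register` decorator, `cached_expr`, the dictionary-based op, and the emitted `waveasm.arith.add` text) is hypothetical.

```python
from typing import Callable, Dict, List


class TranslationContext:
    """Hypothetical Python stand-in for the C++ TranslationContext."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Callable[["TranslationContext", dict], List[str]]] = {}
        self.expr_cache: Dict[str, str] = {}

    def register(self, op_name: str):
        """Decorator that records a handler for one upstream op name."""
        def decorator(fn):
            self.handlers[op_name] = fn
            return fn
        return decorator

    def cached_expr(self, key: str) -> str:
        # Shared state lives on the context, not inside individual handlers.
        if key not in self.expr_cache:
            self.expr_cache[key] = f"%v{len(self.expr_cache)}"
        return self.expr_cache[key]

    def translate(self, op: dict) -> List[str]:
        handler = self.handlers.get(op["name"])
        if handler is None:
            raise NotImplementedError(f"no handler registered for {op['name']}")
        return handler(self, op)


ctx = TranslationContext()


@ctx.register("arith.addi")
def handle_addi(ctx: TranslationContext, op: dict) -> List[str]:
    # Look operands up through the context so repeated values reuse one register.
    lhs, rhs = (ctx.cached_expr(name) for name in op["operands"])
    return [f"waveasm.arith.add {lhs}, {rhs}"]


print(ctx.translate({"name": "arith.addi", "operands": ["a", "b"]}))
# ['waveasm.arith.add %v0, %v1']
```

Keeping the cache on the context means a second handler that sees operand `a` gets the same virtual register, which is the reason the file above advises against tracking state locally in handlers.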
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| See @AGENTS.md |
This gitignores a CLAUDE.local.md file, but I would like the main CLAUDE.md (or AGENTS.md) to explicitly reference an AGENTS.local.md file and tell the model to adhere to it as well. From what I've read, most agent programs don't read an AGENTS.local.md, and it seems that Claude doesn't use CLAUDE.local.md anymore either. I think this is a shame. Anyway, we will all have some personal workflow stuff in addition to any shared AGENTS.md stuff. There is the global ~/.claude/CLAUDE.md and similar for other agents, but that affects all repos, and I really want to have instructions for agents that are both personal and repo-specific.
Claude does still use CLAUDE.local.md, so you should still be able to use per-repo configs.
I also added AGENTS.local.md to .gitignore.