engines: add ROCm hipFile engine via shared gpuaccel layer #2113
sbates130272 wants to merge 4 commits into
Extract the vendor-neutral orchestration out of the NVIDIA-specific
libcufile engine so that a second GPU-accelerator backend (ROCm hipFile)
can share it without code duplication.
Steps:
1. Extract all CUDA calls from the engine interface into thin wrappers
inside libcufile.c, leaving the rest of the engine logic untouched.
2. Add a small backend vtable (struct gpuaccel_backend) so the wrappers
are reached through a function table rather than directly.
3. Rename the now-vendor-neutral orchestration from libcufile_* to
gpuaccel_* throughout.
4. Move the gpuaccel core into a new engines/gpuaccel.c and gpuaccel.h;
libcufile.c retains only the CUDA/cuFile-specific pieces.
Steps 1-4 are intended to be no-functional-change for libcufile: the
cuFile behaviour is preserved as an identity transform through the vtable.
Signed-off-by: Zach Byrne <zbyrne@amd.com>
Signed-off-by: Stephen Bates <sbates@raithlin.com>
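The vtable idea in steps 2 to 4 can be sketched roughly as follows. This is a minimal illustration, not the layout used in the series: the field names (`mem_alloc`, `mem_free`, `copy`), the `host_backend` stub, and `gpuaccel_roundtrip` are all hypothetical, and host memory stands in for the CUDA or HIP calls so the sketch compiles without a GPU toolkit.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical vtable: field names are illustrative, not fio's actual layout. */
struct gpuaccel_backend {
	const char *name;
	int (*mem_alloc)(void **ptr, size_t len);
	void (*mem_free)(void *ptr);
	int (*copy)(void *dst, const void *src, size_t len);
};

/* Stand-in backend that uses host memory in place of CUDA or HIP calls. */
static int host_alloc(void **ptr, size_t len)
{
	*ptr = malloc(len);
	return *ptr ? 0 : -ENOMEM;
}

static void host_free(void *ptr)
{
	free(ptr);
}

static int host_copy(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
	return 0;
}

static const struct gpuaccel_backend host_backend = {
	.name = "host-stub",
	.mem_alloc = host_alloc,
	.mem_free = host_free,
	.copy = host_copy,
};

/*
 * Vendor-neutral core: it reaches the backend only through the function
 * table, so a second backend needs no changes here.
 */
static int gpuaccel_roundtrip(const struct gpuaccel_backend *be,
			      const char *src, char *dst, size_t len)
{
	void *dev = NULL;
	int rc;

	rc = be->mem_alloc(&dev, len);
	if (rc)
		return rc;

	rc = be->copy(dev, src, len);
	if (!rc)
		rc = be->copy(dst, dev, len);

	be->mem_free(dev);
	return rc;
}
```

Calling `gpuaccel_roundtrip(&host_backend, "hello", out, 6)` copies the string through the stub "device" buffer and back. A real backend would fill the same table with its vendor calls instead.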
Two independent clean-ups to the gpuaccel layer:
* Add sync_after_write_recv, sync_after_verify_copy, and
  sync_after_verify_read_copy flags to struct gpuaccel_backend. hipFile
  requires several extra stream-synchronize calls that cuFile does not;
  the flags let each backend declare its needs without ifdefs. libcufile
  sets all flags to false (no behaviour change).
* Move the static running/initialized state out of gpuaccel.c file scope
  and into struct gpuaccel_backend so that cuFile and hipFile can coexist
  in one process without sharing state. libcufile initialises the new
  fields (no behaviour change).
Signed-off-by: Zach Byrne <zbyrne@amd.com>
Signed-off-by: Stephen Bates <sbates@raithlin.com>
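A rough sketch of how the two clean-ups might fit together is shown below. Only the three flag names and the `running` pointer appear in the series; the `stream_sync` and `driver_close` hooks, the helper functions, and the two stub backends are assumptions made for illustration. Locking around the worker count is omitted, so this is not thread-safe as written.

```c
#include <stdbool.h>

/* Hypothetical layout; only the flag names and "running" come from the series. */
struct gpuaccel_backend {
	const char *name;
	bool sync_after_write_recv;
	bool sync_after_verify_copy;
	bool sync_after_verify_read_copy;
	int *running;			/* live workers for this backend only */
	int (*stream_sync)(void);	/* hypothetical hook */
	void (*driver_close)(void);	/* hypothetical hook */
};

/* Counters that let the stubs record what the core asked for. */
static int sync_calls;
static int close_calls;

static int count_sync(void)
{
	sync_calls++;
	return 0;
}

static void count_close(void)
{
	close_calls++;
}

/* Each backend owns its own state, so the two can coexist in one process. */
static int cufile_running;
static int hipfile_running;

/* Flags left unset default to false: no extra synchronisation. */
static const struct gpuaccel_backend cufile_like = {
	.name = "cufile-like",
	.running = &cufile_running,
	.stream_sync = count_sync,
	.driver_close = count_close,
};

static const struct gpuaccel_backend hipfile_like = {
	.name = "hipfile-like",
	.sync_after_write_recv = true,
	.sync_after_verify_copy = true,
	.sync_after_verify_read_copy = true,
	.running = &hipfile_running,
	.stream_sync = count_sync,
	.driver_close = count_close,
};

/* The core consults the flag instead of using vendor ifdefs. */
static int gpuaccel_after_write_recv(const struct gpuaccel_backend *be)
{
	return be->sync_after_write_recv ? be->stream_sync() : 0;
}

static void gpuaccel_worker_start(const struct gpuaccel_backend *be)
{
	(*be->running)++;
}

/* Close the driver only when the last worker of this backend exits. */
static void gpuaccel_worker_exit(const struct gpuaccel_backend *be)
{
	if (--(*be->running) == 0)
		be->driver_close();
}
```

With this shape, a worker exiting on one backend never closes the driver of the other, because each backend decrements its own counter.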
Add a new fio I/O engine for AMD ROCm hardware using the HIP and hipFile
APIs. The engine implements struct gpuaccel_backend by calling HIP
functions to allocate and manage GPU memory (hipMalloc/hipFree), copy
between VRAM and host buffers (hipMemcpy), and perform direct I/O from
VRAM via hipFileRead/hipFileWrite.
The engine is built into the fio binary like libcufile (not as a dynamic
external engine), so the shared gpuaccel symbols resolve at link time.
configure carries the full ROCm include/lib/rpath flags; the Makefile
stanza stays minimal. ROCM_PATH defaults to /opt/rocm and can be
overridden via the environment.
New options:
  gpu_dev_ids - comma-separated list of GPU device IDs to use
  rocm_io     - select hipfile (direct) or posix (bounce-buffer) I/O mode
Signed-off-by: Zach Byrne <zbyrne@amd.com>
Signed-off-by: Stephen Bates <sbates@raithlin.com>
Add two example fio job files for the new libhipfile engine:
  examples/libhipfile-hipfile.fio - direct GPU I/O via hipFile
  examples/libhipfile-posix.fio   - POSIX bounce-buffer mode
Document the libhipfile engine, its gpu_dev_ids and rocm_io options, and
the --enable-libhipfile configure flag in both HOWTO.rst and fio.1,
mirroring the existing libcufile entries.
Signed-off-by: Zach Byrne <zbyrne@amd.com>
Signed-off-by: Stephen Bates <sbates@raithlin.com>
} else {
	log_err("Illegal %s IO type: %d\n", be->name, o->io_mode);
	assert(0);
	rc = EINVAL;
The convention is to return a negative error code (-EINVAL, -EIO, -ENOMEM). Please update here and in the other places below.
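A minimal sketch of the requested convention, assuming a standalone helper: `check_io_mode` and the `IO_HIPFILE` enumerator are hypothetical names, `IO_POSIX` is taken from the diff, and `fprintf` stands in for fio's `log_err`. The `assert(0)` is left out here so the error path is reachable in builds with assertions enabled.

```c
#include <errno.h>
#include <stdio.h>

/* IO_POSIX appears in the diff; IO_HIPFILE is an assumed name. */
enum io_mode {
	IO_HIPFILE,
	IO_POSIX,
};

/* Returns 0 for a known mode, or a negative errno value otherwise. */
static int check_io_mode(const char *backend_name, int io_mode)
{
	if (io_mode == IO_HIPFILE || io_mode == IO_POSIX)
		return 0;

	fprintf(stderr, "Illegal %s IO type: %d\n", backend_name, io_mode);
	return -EINVAL;
}
```

Callers can then test `rc < 0` uniformly and pass the value up without translating between positive and negative codes.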
assert(*be->running >= 0);
if (*be->running == 0) {
	/* only close the driver if initialized and
	   this is the last worker thread */
Nit: for multi-line comments, I think this style looks better:
/*
* ...
*/
}

if (cur)
	gpu_id = atoi(cur);
For any invalid value this will return 0, which is a valid GPU ID.
This needs to be handled properly.
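One way to address this, sketched under the assumption that each token of the device list is parsed separately: replace `atoi` with `strtol` and check where parsing stopped. The helper name `parse_gpu_id` is hypothetical, and rejecting negative IDs is an assumption about what counts as valid.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse one GPU ID token. Returns 0 and stores the ID on success,
 * -EINVAL if the token is empty or not a plain decimal number, and
 * -ERANGE if it is negative or does not fit in an int.
 */
static int parse_gpu_id(const char *s, int *gpu_id)
{
	char *end;
	long val;

	if (!s || *s == '\0')
		return -EINVAL;

	errno = 0;
	val = strtol(s, &end, 10);

	/* No digits consumed, or trailing junk such as "3x". */
	if (end == s || *end != '\0')
		return -EINVAL;

	if (errno == ERANGE || val < 0 || val > INT_MAX)
		return -ERANGE;

	*gpu_id = (int)val;
	return 0;
}
```

Unlike `atoi`, this distinguishes the valid token `"0"` from garbage such as `"abc"`, which previously both produced GPU ID 0.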
} else if (o->io_mode == IO_POSIX) {
	sz = pread(io_u->file->fd, ((char*) io_u->xfer_buf) + xfered,
		   remaining, io_offset + xfered);
	if (sz < 0) {
I think pread can return 0, so that case may need to be handled here and in other places.
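The zero-return case can be sketched with a small standalone loop. `read_full` is a hypothetical helper, not code from the series; it treats a return of 0 from `pread` as end of file and stops, rather than looping with no progress.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to len bytes at offset. Returns the number of bytes actually
 * read (which may be short at end of file) or a negative errno value.
 */
static ssize_t read_full(int fd, void *buf, size_t len, off_t offset)
{
	size_t xfered = 0;

	while (xfered < len) {
		ssize_t sz = pread(fd, (char *)buf + xfered, len - xfered,
				   offset + (off_t)xfered);

		if (sz < 0) {
			if (errno == EINTR)
				continue;	/* interrupted: retry */
			return -errno;
		}
		if (sz == 0)
			break;	/* end of file: stop instead of spinning */

		xfered += (size_t)sz;
	}

	return (ssize_t)xfered;
}
```

The caller can compare the return value with the requested length to detect a short read and decide whether that is an error for the I/O in question.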
One verify issue I observed, I tried the
@ankit-sam I assume that there is an NVMe SSD mounted? Also, can you tell me what the output of ais-check is?
@ankit-sam In addition, could you say how you built fio and where you installed/built hipfile? I wasn't able to repro the bad header issue on our test system. Thanks!
Yes, it's an NVMe SSD, and it is mounted.
The setup I have is with rocm-7.2.2, and by default fio configure couldn't find it, so I made a couple of changes to configure
I build fio by doing
I added a debug log in fio to figure out the issue
The issue seems to be a bit weird: it's trying to match the last rand_seed to the first one.
This series adds support for AMD ROCm hardware to fio by introducing a new
libhipfile I/O engine, structured analogously to the existing libcufile engine.
Rather than duplicating the cuFile orchestration logic, the series first
refactors libcufile.c into a shared gpuaccel layer that both backends build on.
Patch 1 extracts the vendor-neutral orchestration from libcufile.c into a new
engines/gpuaccel.c / gpuaccel.h, introducing a struct gpuaccel_backend vtable
so each backend supplies only its vendor-specific calls (malloc, free, memcpy,
read, write). libcufile retains its existing behaviour as an identity transform
through the vtable; no functional change is intended for cuFile users.
Patch 2 adds sync_after_write_recv, sync_after_verify_copy, and
sync_after_verify_read_copy flags to struct gpuaccel_backend, allowing hipFile
to request extra stream-synchronize calls without ifdefs. Also moves the static
running/initialized state into the backend struct so cuFile and hipFile can
coexist in one process. No functional change to libcufile.
Patch 3 implements struct gpuaccel_backend for AMD ROCm using hipMalloc/hipFree,
hipMemcpy, and hipFileRead/hipFileWrite. Built into the fio binary (like
libcufile), enabled via --enable-libhipfile. ROCM_PATH defaults to /opt/rocm.
Exposes two new options: gpu_dev_ids (comma-separated GPU device list) and
rocm_io (hipfile for direct VRAM I/O, posix for bounce-buffer mode).
Patch 4 (examples/doc: add libhipfile example job files and documentation):
Adds examples/libhipfile-hipfile.fio and examples/libhipfile-posix.fio, and
documents the new engine, its options, and the configure flag in both HOWTO.rst
and fio.1, mirroring the existing libcufile entries.
Fixes #2111.
Signed-off-by: Zach Byrne <zbyrne@amd.com>
Signed-off-by: Stephen Bates <sbates@raithlin.com>