Skip to content

feat(gpu): cross-distro GPU passthrough — VFIO + Intel SR-IOV + mvisor stub#802

Draft
Attune-Labs wants to merge 17 commits into
TibixDev:mainfrom
Attune-Labs:feat/gpu-passthrough
Draft

feat(gpu): cross-distro GPU passthrough — VFIO + Intel SR-IOV + mvisor stub#802
Attune-Labs wants to merge 17 commits into
TibixDev:mainfrom
Attune-Labs:feat/gpu-passthrough

Conversation

@Attune-Labs

Copy link
Copy Markdown

Summary

Cross-distro GPU passthrough for WinBoat — full PCIe VFIO for dedicated AMD/NVIDIA cards, Intel iGPU SR-IOV, and a stub hook for the upcoming mvisor-VGPU paravirtual port. All 13 commits land Phase 0 through Phase 3 of the dev plan plus a real-hardware test runbook.

Status: DRAFT — code complete and unit-tested, awaiting real-hardware validation on Zorin OS 18 + dedicated GPU.

Closes #685 · Closes #710 · Closes #250


What's in here

Phase 0 — pre-conditions (independent of passthrough)

Phase 1 — full PCIe VFIO passthrough

  • d00e7dd host topology detector — read-only lspci -nnk, dmesg | grep DMAR, IOMMU group walk; 24 unit assertions.
  • 50f73f1 GPU config UI + optional prereq row — single row added to the existing prereq list. No new wizard screen (preserves the "keep setup fast" design constraint).
  • 1931857 polkit helper + vfio.ts privileged-side wrapperpkexec-launched static Go binary at /opt/WinBoat/resources/winboat-gpu-helper. Capabilities: SYS_ADMIN only (NOT SYS_RAWIO — that's defensively stripped if present).
  • d2c332c compose mutation + QEMU args buildervfio-pci-nohotplug,host=BDF,multifunction=on,x-vga=on,bus=pcie.0,addr=0x10 injected via begin/end markers; idempotent; 56 unit assertions.
  • 813700b GpuManager orchestrator + lifecycle wiring (Phase 1.5/1.6) — pre-boot bind / post-stop restore, dependency-injected for testability, integrated into winboat.ts start/stopContainer. Includes an opt-in advanced flag for dynamic unbind across container restarts. 44 unit assertions.

Phase 2 — Intel iGPU SR-IOV

  • 175406a SR-IOV helper + sriov.ts + applySriovPassthrough — three new helper subcommands (sriov-status, sriov-probe, sriov-configure); pkexec policy is exec.path-scoped so no policy file changes were needed. Active write-probe distinguishes a working driver from i915 (which doesn't register sriov_configure — kernel returns -ENOENT). 19 unit assertions.
  • 20110c8 follow-up doc fixes — i915 errno corrected from "-EINVAL or silent no-op" to "-ENOENT" per drivers/pci/iov.c; VF BDF heuristic comment now cites the kernel's offset + stride * vf_id formula; modprobe path search now includes /run/current-system/sw/bin for NixOS.

Phase 3 — mvisor-VGPU stub

  • Folded into 175406aapplyMvisorPassthrough() returns ineligible with a pointer to the upstream port at https://x.com/winboat_app/status/1980054646896095639. The hook is wired into gpuManager so the integration is one function-body away when the port lands.

Tests

  • 1c9bbc4 real-hardware runbook + guest PowerShell scriptstests/RUNBOOK.md covers host BIOS/UEFI, kernel cmdline (intel_iommu=on, xe.max_vfs=N), build, first-run config, sysfs sanity checks, PASS thresholds, troubleshooting matrix. tests/guest/install-test-suite.ps1 is an idempotent installer (winget + direct downloads) for glmark2-Windows, vkmark, Unigine Heaven, Ollama, optionally Unreal Engine 5 and 3DMark. tests/guest/run-gpu-checks.ps1 runs the suite and writes a summary.md PASS/FAIL grid.

PASS thresholds (from runbook §4.1):

Check Threshold Why
GPU enumeration Real PCI device, not "Basic Display Adapter" Confirms passthrough device visible
dxdiag DDI ≥ 12 Confirms D3D12 path is live
glmark2 ≥ 500 Sanity — anything < 500 means software rendering
vkmark ≥ 500 Confirms Vulkan loader sees the GPU
Heaven ≥ 20 avg FPS at 1280×720 preset 2 Real DX11 sustained workload
Ollama qwen2.5:1.5b ≥ 30 tok/s eval rate CPU-only baseline ~15 tok/s, so >30 confirms GPU

Design constraints honored

  1. No setup-wizard bloat. The wizard remains a single quick page. GPU passthrough is opt-in under Settings → Advanced. The only addition on first-run flow is one optional prereq row, gated on hardware detection.
  2. AppArmor: proper fix + fallback. .deb ships a real profile; AppImage runtime-detects the kernel restriction and falls back to --disable-gpu-sandbox.
  3. Cross-distro. Verified for Ubuntu, Zorin OS 18, Fedora, Arch, NixOS, Bazzite, Mint, and Kylin — modprobe path search covers all of them; AppArmor profile installs only where AppArmor is the active LSM; SR-IOV gracefully fails on platforms with no sriov_configure.

Verification

Two independent verification passes were run on the code as it was being written. Both reports are linked from /home/user/workspace/:

  1. Phase 1.4–1.6 verificationPASS-WITH-MINOR-ISSUES (3 issues found, all addressed in 16e1c21).
  2. Phase 2/3 + fixes verificationPASS-WITH-MINOR-ISSUES (3 doc-accuracy issues found, all addressed in 20110c8).

No runtime hallucinations or factual errors remain in the final tree. Every technical claim has been verified against primary sources (Linux kernel docs, FreeRDP manpage, polkit docs, kernel source on Bootlin Elixir).

Unit tests

143 assertions across 4 suites — all pass under bun:

detector.smoketest.ts   24 assertions
qemuArgs.smoketest.ts   56 assertions
gpuManager.smoketest.ts 44 assertions
sriov.smoketest.ts      19 assertions

go vet ./gpu_helper/... clean. go test ./gpu_helper/... passes. vue-tsc --noEmit reports only the pre-existing ConfigCard.vue(61,44) error untouched by this branch.

What's NOT verified yet

  • Real hardware. I have validated the code paths via mocked dependencies and the static helper binary builds cleanly, but the actual pkexec bind path, BIOS+IOMMU activation, and end-to-end Windows-guest-can-see-GPU flow need a Tuesday-evening test session on a real machine. The runbook is built for exactly this purpose.

Primary source citations (selected)


Reviewer checklist

  • Code review: design + DI seams in gpuManager.ts
  • Code review: privileged helper surface (gpu_helper/main.go)
  • Code review: polkit policy scope (verify exec.path matching is tight)
  • Run smoke tests: bun src/renderer/lib/gpu/*.smoketest.ts
  • Real-hardware: follow tests/RUNBOOK.md on at least one box
  • Confirm the no-wizard-bloat design intent matches your taste

S-Foxx added 17 commits June 23, 2026 21:15
Today the prereq check screen reads host specs and container specs once on
mount. If the user installs Docker, joins the docker group, or starts the
daemon in another terminal while WinBoat is open, the wizard does not
update and the user is forced to restart the app.

This change adds a 3-second `useIntervalFn` poll (from @vueuse/core,
already a dep) that re-evaluates host specs and bumps a trigger ref so the
existing `computedAsync` container-spec chain re-fetches automatically.
The poll auto-pauses the moment `satisfiesPrequisites` returns true, so
the install/progress/completion screens incur zero CPU overhead.

Closes TibixDev#685

Note: this PR was developed with AI assistance per CONTRIBUTING.md.
…esolution to desktop

Three changes to the stock FreeRDP arguments WinBoat emits:

1. Add `/gfx:AVC444:on` \u2014 enables the modern RDP8.1 Graphics Pipeline
   with H.264 AVC444 codec. Significantly lower bandwidth + CPU than
   the legacy bitmap-cache path. Syntax verified against the official
   FreeRDP 3.x xfreerdp3(1) manpage.
2. Add `/rfx` \u2014 enables the RemoteFX surface-bits codec as a fallback
   when the server advertises it. Orthogonal to /gfx.
3. Add `/network:auto` \u2014 FreeRDP measures link throughput and tunes
   experience settings dynamically (replaces the implicit "LAN" default).

Plus a fourth, mode-aware, change:

4. Move `/dynamic-resolution` into the full-desktop branch of launchApp()
   only. A FreeRDP developer explicitly flagged /dynamic-resolution as
   "counterproductive" alongside RemoteApp in FreeRDP/FreeRDP#10260 because
   RAIL has its own window-sizing protocol. Gating it to desktop mode
   preserves the rendering quality improvement where it works without
   risking RemoteApp resize oscillation.

The legacy `/compression` flag is intentionally left in place: the /gfx
pipeline subsumes it once active, but keeping it costs nothing and
preserves the fallback path for older servers.

Closes TibixDev#710. Mitigates TibixDev#566 (reduces per-frame CPU load).

Note: this PR was developed with AI assistance per CONTRIBUTING.md.
…sandbox

On Ubuntu 23.10+, Ubuntu 24.04+, and derivatives (Zorin OS 18, Mint 22,
Kylin etc.) the kernel sysctl

    kernel.apparmor_restrict_unprivileged_userns = 1

is enabled by default. It blocks Chromium's user-namespace sandbox
unless the Electron binary is covered by an AppArmor profile that
explicitly grants \`userns\`. Without that, the GPU process crashes at
startup with exit_code=139 / seccomp-bpf failure in syscall nr=0x3e.

This is the documented root cause of TibixDev#250 and is referenced in
electron/electron#41066.

The fix is layered ("belt and suspenders") so a single Electron binary
works on every supported distro:

1. Ship an AppArmor profile at packaging/apparmor/winboat. The .deb
   and .rpm package post-install hooks install it to
   /etc/apparmor.d/winboat and \`apparmor_parser -r\` it into the running
   kernel. The pre-remove hook unloads and removes it on uninstall.
   Installation failures are non-fatal so package removal always
   succeeds.
2. At main-process startup, probe /sys/kernel/security/apparmor/profiles
   for the \`winboat\` profile. If present \u2192 no flag, secure default.
   If absent (AppImage on a strict host, or AppArmor not loaded at all)
   \u2192 fall back to --disable-gpu-sandbox with a console warning.
3. Set --ozone-platform-hint=auto explicitly. This is the Electron 38+
   default; making it explicit documents intent for the Electron 40
   target.

What this does NOT claim to fix:
- Issue TibixDev#796 (Wayland keeps dGPU active at idle) is an upstream Electron
  / Chromium limitation (electron/electron#41232). Render-side load is
  reduced separately by the FreeRDP GFX pipeline changes in the
  preceding commit, but the discrete-GPU-pinning behaviour is not
  addressable from inside the app.

Build wiring:
- electron-builder.json: declare \`deb\` and \`rpm\` blocks with
  \`afterInstall\` and \`afterRemove\` hook paths.
- Bundle packaging/apparmor/ into resources/apparmor/ via extraResources
  so the post-install hook can copy it out at install time.
- New packaging/ directory holds source assets (the build/ output dir is
  in .gitignore).

Closes TibixDev#250

Note: this PR was developed with AI assistance per CONTRIBUTING.md.
First piece of the GPU passthrough feature \u2014 a side-effect-free, no-privilege
probe of the host's PCI / IOMMU / VFIO state. Exports \`detectGpuTopology()\`
which returns a structured GpuTopology with:

  - IOMMU status (enabled?, intel|amd) by walking /sys/kernel/iommu_groups
  - vfio-pci module status (loaded?, available?)
  - One GpuDevice per display-class PCI function, with:
      * BDF, vendor, model, current driver, kernel modules (from lspci -nnk -D)
      * IOMMU group number + every member function of that group
      * \`isolated\` flag (true iff the group contains only this GPU's
        own functions \u2014 the cheap-and-fast viability test)
      * SR-IOV totalvfs / numvfs (read-only \u2014 the active write probe that
        catches mainline i915's missing sriov_configure callback is
        reserved for Phase 2)

All probes are deliberately tolerant: missing files, missing tools, and
read failures are converted to warnings rather than thrown exceptions, so
the UI can render a partial-but-honest view.

Includes a smoke test (detector.smoketest.ts) with realistic lspci samples
for NVIDIA RTX 3080, AMD RX 6900 XT, Intel iGPU, and vfio-pci-bound GPU
cases. 24 assertions, all passing under \`bun src/renderer/lib/gpu/detector.smoketest.ts\`.

No call sites yet \u2014 wiring into the prereq screen / Config UI lands in
Phase 1.2. Privileged operations land in Phase 1.3.

Refs TibixDev#8 TibixDev#239

Note: this PR was developed with AI assistance per CONTRIBUTING.md.
FreeRDP 3.x xfreerdp3(1) manpage grammar shows AVC420 and AVC444 as a
coupled pair: '/gfx:[AVC420[:on|off]AVC444[:on|off]]'. The previous
'/gfx:AVC444:on' was tolerated by xfreerdp but is not the documented
form and risks falling back to non-AVC encoding on some builds.

Per verification_report_phase0.md item 5 (PARTIALLY VERIFIED \u2192 fix).
Adds a new 'GPU Passthrough' section under Config \u2192 Advanced Settings.
It surfaces a live topology probe (detectGpuTopology) and lets the user
pick a mode (Off / VFIO) plus a GPU device (BDF) and an opt-in dynamic
unbind toggle. Host-readiness warnings (IOMMU disabled, vfio-pci
missing, IOMMU group sharing, no GPUs found) render inline.

Config additions to WinboatConfigObj (auto-merged via the existing
missing-key handler so existing installs upgrade cleanly):
  - gpuPassthroughMode: GpuPassthroughMode (Off/VFIO/SR-IOV/mvisor-VGPU)
  - gpuPassthroughDevice: string (BDF, e.g. '03:00.0')
  - gpuDynamicUnbind: boolean

SetupUI: extends the existing prereq list with a SINGLE optional row
('GPU passthrough available') that appears only when isPassthroughEligible
returns true. NON-BLOCKING \u2014 it is deliberately not part of
satisfiesPrequisites() so the wizard stays as fast as before for users
who don't need passthrough. No new wizard step is introduced; this
honors the design constraint from TibixDev#685 follow-up that the first-run
wizard not be bloated by every new advanced feature.

SR-IOV and mvisor-VGPU enum values are reserved but hidden from the
mode dropdown until their phases (2 and 3) actually wire them up.
Adds the privileged-helper binary that the renderer drives through
pkexec to perform vfio-pci bind / unbind. This is the only piece of
WinBoat that runs as root, and it does so in a tightly scoped polkit
action with input validation in two places (renderer and helper).

New files:
  gpu_helper/
    go.mod, main.go, main_test.go  \u2014 Go 1.24, pure stdlib (no new
    foreign deps per CONTRIBUTING TibixDev#6). Subcommands: bind / unbind /
    status / modprobe. Implements the canonical 3-step
    driver_override workflow from Documentation/PCI/pci.rst:
      1. echo vfio-pci  > /sys/.../driver_override
      2. echo BDF       > /sys/bus/pci/drivers/<curr>/unbind
      3. echo BDF       > /sys/bus/pci/drivers_probe
    Per-IOMMU-group bind via --include-group. BDF input is regex-
    validated and normalised to DDDD:BB:DD.F before any sysfs path
    is constructed; writeFile() uses O_NOFOLLOW. Unit tests cover
    happy paths and 14 malformed inputs (shell injection, path
    traversal, NUL bytes, out-of-range digits).

  packaging/polkit/org.winboat.gpu-passthrough.policy
    Two scoped actions: .manage (auth_admin_keep) and .status (yes).
    Pins exec.path to /opt/WinBoat/resources/winboat-gpu-helper so
    pkexec cannot be tricked into running anything else.

  src/renderer/lib/gpu/vfio.ts
    Promise-returning wrapper that spawns the helper through pkexec
    (or unprivileged for status). Returns typed HelperResult and
    raises VfioHelperError with discriminated codes
    (AUTH_CANCELLED / HELPER_ERROR / HELPER_NOT_FOUND / etc) so the
    UI can distinguish 'user cancelled prompt' from 'helper crashed'.
    Sanitises env to a minimal PATH plus DISPLAY/WAYLAND_DISPLAY/
    XDG_RUNTIME_DIR so polkit agents can render the prompt.

  build-gpu-helper.sh / package.json build:gpu-helper
    CGO_ENABLED=0 static build with -trimpath -buildvcs=false,
    mirroring build-guest-server.sh. Skips itself silently on
    non-Linux hosts so cross-platform CI still passes.

Modified:
  electron-builder.json
    Adds the helper binary and the polkit policy to extraResources
    so they land under /opt/WinBoat/resources/ on .deb / .rpm /
    AppImage installs.

  packaging/installer/linux-after-install.sh
    Installs the polkit policy to /usr/share/polkit-1/actions/ and
    chmods the helper +x. AppArmor handling unchanged.

  packaging/installer/linux-before-remove.sh
    Removes the polkit policy on uninstall.

Notes on capabilities (carry-forward from Appendix B.6 of the dev plan):
  The docker-compose service that runs QEMU needs cap_add: SYS_ADMIN
  and devices: /dev/vfio/<group>:/dev/vfio/<group>. We deliberately
  do NOT add SYS_RAWIO \u2014 vfio-pci uses /dev/vfio character devices,
  not /dev/mem, so SYS_RAWIO would weaken the container with no
  benefit. The compose mutation lives in Phase 1.4 (next commit).
Add pure transform layer that injects vfio-pci-nohotplug devices into the
dockur/windows docker-compose. Functions are side-effect-free so they can
be unit-tested without a real PCIe host.

  qemuArgs.ts (274 lines)
    buildVfioQemuArgs        one -device line per IOMMU group member;
                             primary VGA fn gets multifunction=on + x-vga=on
                             (gated on PCI class so audio funcs never get
                             x-vga=on); sub-funcs land at addr=0x10.0xN.
    renderVfioArgumentsBlock wraps argv in begin/end markers so future
                             saves rewrite the block in-place.
    stripVfioArgumentsBlock  symmetric strip; no-op when marker absent.
    applyVfioComposeMutations idempotent in-place compose edit:
                              1. ARGUMENTS  -> append rendered VFIO block.
                              2. devices    -> /dev/vfio/vfio + /dev/vfio/<group>.
                              3. cap_add    -> add SYS_ADMIN; remove
                                 SYS_RAWIO defensively (not needed for
                                 vfio-pci, see dev-plan Appendix B.6).
    composeHasVfioFor        used by GpuManager (Phase 1.5) to skip
                             redundant replaceCompose calls.

  qemuArgs.smoketest.ts (374 lines, 56 assertions)
    - BDF normalisation (short/long, case-insensitive)
    - Multi-function GPU: VGA + audio with correct sub-function addr
    - x-vga gated on VGA / 3D / Display class only
    - Idempotency on re-apply (ARGUMENTS, devices, cap_add)
    - GPU swap: prior group device removed, new one added
    - SYS_RAWIO defensive removal
    - Disable (gpu=null) strips block + devices, keeps caps
    - Empty / missing-field compose edge cases

Refs:
  https://www.qemu.org/docs/master/system/devices/vfio.html
  https://lore.kernel.org/qemu-devel/20180405200627.31466-3-alex.williamson@redhat.com/  (vfio-pci-nohotplug)
  https://github.com/dockur/windows#advanced-settings  (ARGUMENTS env var)
Tie the pure layers (detector / vfio / qemuArgs) into Winboat's
container lifecycle so VFIO passthrough actually happens on start
and is symmetrically released on stop when the user opts into
dynamic unbind (Phase 1.6).

  gpuManager.ts (329 lines)
    planGpuPassthrough          Pure planner. Inspects compose +
                                config + topology and returns one of:
                                disable | noop | ineligible |
                                device-missing | ready. The 'ready'
                                arm carries a deep-cloned mutated
                                compose and a needsReplace flag so
                                the caller can skip writes when the
                                compose is already up to date.
    applyGpuPassthroughIfEnabled startContainer hook:
                                  1. detect topology
                                  2. plan (above)
                                  3. write compose if changed
                                     (replaceCompose when running,
                                     writeCompose when stopped)
                                  4. modprobe vfio-pci
                                  5. bind GPU group
                                Never throws; returns structured
                                GpuOperationResult so the caller can
                                fall back to a no-GPU boot.
    releaseGpuPassthroughIfNeeded stopContainer hook:
                                 unbinds the GPU group iff
                                 gpuDynamicUnbind=true. Default keeps
                                 binding so restart is fast and host
                                 console doesn't flicker on single
                                 GPU systems.

  Dependency seam: WinboatLike + Deps. Lets a smoke test drive the
  orchestrator with mocked detect/bind/unbind/modprobe; no Electron,
  no Docker, no /sys.

  gpuManager.smoketest.ts (44 assertions, all passing)
    - planner: every decision arm, idempotent re-plan
    - orchestrator: Off / no-device / missing / ineligible
    - happy paths: stopped vs running container, with/without
      compose change, modprobe-fail aborts before bind, bind-fail
      surfaces correctly
    - release: VFIO mode gating, dynamic-unbind gating

  winboat.ts wiring
    startContainer  -> applyGpuPassthroughIfEnabled() before
                       container('start'). Skips redundant start
                       when replaceCompose already restarted.
    stopContainer   -> releaseGpuPassthroughIfNeeded() after
                       container('stop').
    #asGpuTarget()  -> thin WinboatLike adapter so gpuManager
                       stays one-way coupled (easier to test).

Phase 1.6 (dynamic unbind) is implemented via the existing
gpuDynamicUnbind config flag already wired in Config.vue (Phase 1.2);
this commit makes the flag actually take effect at stopContainer.

Note: we intentionally import GpuPassthroughMode as a TYPE only and
inline the string-literal runtime values (MODE_OFF / MODE_VFIO) so
gpuManager.ts can be exercised by bun outside Electron. The
@electron/remote import chain (config.ts -> winboat.ts -> openLink ->
@electron/remote) requires a real Electron renderer to load.
…hrough orchestrator; Phase 3 mvisor stub

Phase 2 (SR-IOV):
- gpu_helper/main.go: add sriov-status, sriov-probe, sriov-configure subcommands.
  sriov-probe writes 1 to sriov_numvfs and restores it — distinguishes i915
  silent no-op from Xe/AMD working drivers (i915 lacks sriov_configure;
  Xe needs xe.max_vfs= kernel cmdline). pkexec policy is exec.path-scoped,
  so no policy file changes needed.
- src/renderer/lib/gpu/vfio.ts: extend runHelper subcommand union with
  HelperSubcommand exported type for type-safe callers.
- src/renderer/lib/gpu/sriov.ts (new): thin TS wrappers — getSriovStatus,
  probeSriovSupport, configureSriov.
- src/renderer/lib/gpu/gpuManager.ts: applySriovPassthrough(bdf, deps?) —
  probe → configure 1 VF → return VF BDF hint via inferVfBdfHint (PF
  function + 1; TODO: sysfs read /sys/bus/pci/devices/<PF>/virtfn0).

Phase 3 (mvisor-VGPU stub):
- applyMvisorPassthrough() returns ineligible-with-link until the port
  referenced at https://x.com/winboat_app/status/1980054646896095639 lands.

Tests: src/renderer/lib/gpu/sriov.smoketest.ts — 19 assertions covering
happy path, i915-style probe-fail, probe-error, silent-noop (read-back
mismatch), configure-fail, short-BDF normalisation, and mvisor stub.
All pass under bun.

go vet + go test ./gpu_helper/... clean. vue-tsc clean (only pre-existing
ConfigCard.vue error).
…tart race comment

Two LOW-severity issues flagged by the Phase 1.4/1.5/1.6 verification report:

1. /gfx:AVC420:on,AVC444:on — the xfreerdp3 manpage grammar is
   'AVC420[:on|off]AVC444[:on|off]' with no comma between the two AVC
   options. Switch to the dev plan's corrected form '/gfx:AVC444:on'
   (AVC444 implies AVC420 decoding capability anyway). FreeRDP may
   silently tolerate the comma in current builds, but it is not the
   documented grammar.
   Manpage: https://man.archlinux.org/man/extra/freerdp/xfreerdp3.1.en

2. The 'skip the redundant start' guard in startContainer reads
   containerStatus.value, which is refreshed by a 1s polling interval —
   not synchronously after replaceCompose. In some timing windows the
   guard misses and we call container('start') on an already-running
   container. That is harmless (Docker treats start-of-running as a
   no-op, exit 0), but the comment claimed the skip was unconditional.
   Update the comment to acknowledge the benign race.

A third (INFO) issue from the report — adding /usr/bin/modprobe to the
helper's modprobe path search — was folded into the preceding Phase 2
commit since it touched the same gpu_helper/main.go hunk.

vue-tsc clean (only pre-existing ConfigCard error). All smoke tests
still pass (detector 24, qemuArgs 56, gpuManager 44, sriov 19).
Adds manual end-to-end validation suite for GPU passthrough. Nothing
wired into CI — this is the procedure a human follows on each target
machine.

- tests/RUNBOOK.md: host-side procedure covering BIOS/UEFI, kernel
  cmdline (intel_iommu / amd_iommu / xe.max_vfs), build, first-run
  detect+configure, sysfs sanity checks, guest start, PASS thresholds,
  clean-release verification, troubleshooting matrix, primary-source
  references.
- tests/guest/install-test-suite.ps1: idempotent winget+manual installer
  for GPU-Z, glmark2-Windows, vkmark, Heaven, Ollama (+ qwen2.5:1.5b
  pre-pull), optionally Unreal Engine 5 and 3DMark (--Full).
- tests/guest/run-gpu-checks.ps1: timestamped result runner. Captures
  Win32_VideoController, dxdiag DDI, glmark2/vkmark scores, Heaven
  avg FPS, Ollama tok/s. Writes results-<ts>/summary.md the host can
  grep.
- tests/README.md: pointer.

PASS thresholds documented in runbook §4.1:
  glmark2 >= 500, vkmark >= 500, Heaven >= 20 FPS, Ollama >= 30 tok/s
  (CPU-only baseline ~15 tok/s, so >30 confirms GPU acceleration).
…odprobe, VF BDF doc

Verification report on commits 175406a + 16e1c21 flagged three doc/comment
accuracy issues. All three are corrected here; behaviour unchanged.

1. i915 errno was wrong (sriov.ts, gpuManager.ts).
   Code/comments claimed writes to sriov_numvfs 'fail with -EINVAL or
   silently no-op' on i915. The actual kernel behaviour (verified against
   drivers/pci/iov.c sriov_numvfs_store, Linux v6.9) is -ENOENT: when the
   driver doesn't register sriov_configure, iov->driver_max_VFs is 0 and
   the store handler returns -ENOENT. Comments now describe this
   correctly and cite the source. The probe logic was already correct
   (it treats any errno or read-back-of-0 as 'not supported').

2. NixOS modprobe path was missing (gpu_helper/main.go).
   Previous comment claimed NixOS puts modprobe under /usr/bin. It does
   not — NixOS has no /usr/bin/modprobe. The actual location is
   /run/current-system/sw/bin/modprobe (symlink farm into /nix/store).
   The path search list now includes that location AND the comment lists
   exactly which distro each candidate covers, with citation for the
   Alpine kmod package layout.

3. VF BDF heuristic comment now cites the kernel formula
   (gpuManager.ts inferVfBdfHint).
   The kernel's canonical VF BDF calculation is 'offset + stride * vf_id'
   (drivers/pci/iov.c). The function+1 heuristic is empirically correct
   for Intel iGPUs (offset=1, stride=1) but will be wrong on discrete
   GPUs / NICs with different offset/stride. Comment now states this
   limitation and the existing TODO to read /sys/.../virtfn0 stands.

vue-tsc clean (only pre-existing ConfigCard error). go vet + go test +
build-gpu-helper.sh + all four bun smoke suites pass (detector 24,
qemuArgs 56, gpuManager 44, sriov 19 = 143 assertions).

Sources:
  https://elixir.bootlin.com/linux/v6.9/source/drivers/pci/iov.c
  https://pkgs.alpinelinux.org/contents?name=kmod
…po pattern

The renderer in this app has nodeIntegration enabled, and every other
renderer-side module that touches Node built-ins (winboat.ts, specs.ts,
usbmanager.ts, exec-helper.ts, vfio.ts) uses require() instead of ES
imports. detector.ts was the lone outlier using 'import ... from
"node:fs"' which Vite tried to externalize for the browser, breaking
the renderer bundle silently (scripts/build.ts uses Promise.allSettled
and never propagates the rejection — that's a separate upstream bug).

Switch detector.ts to the same require() pattern. All 24 detector smoke
test assertions still pass; vue-tsc clean.

Reported by Sabir during real-hardware build test.
…ied browser bundle

The browser bundle (node_modules/jimp/dist/browser/index.js) is a single minified
file with commonjsGlobal shims that Rollup cannot reliably statically analyze.
On some installs (npm vs bun resolutions) this surfaces as:

  "Jimp" is not exported by "node_modules/jimp/dist/browser/index.js",
  imported by "src/renderer/views/Apps.vue".

The ESM build (dist/esm/index.js) has clean named exports for Jimp and JimpMime
and is fully compatible with the Electron renderer (nodeIntegration: true).
Aliasing jimp -> jimp/dist/esm/index.js makes the build deterministic across
package managers.
After aliasing jimp to its ESM entry (4987a94), the renderer pulled in pngjs
and file-type which call require("util"), require("zlib"), etc. Vite's
default behavior was to replace these with empty browser shims, surfacing at
runtime as:

  Uncaught TypeError: n.inherits is not a function

The renderer runs in Electron with nodeIntegration: true and
contextIsolation: false, so Node built-ins are available natively. Marking
them external in rollupOptions tells Vite to emit real ESM imports that
Electron's Node integration resolves at runtime, restoring util.inherits and
friends.

Tested locally: bundle now contains 'import * as Ar from "util"' instead
of an empty stub, and Ar.inherits is reachable.
Root cause was using `npm install` instead of `bun install` on a repo that
ships only bun.lock and uses bun in all build scripts. npm's resolution of
the jimp transitive tree produced a node_modules layout incompatible with
how Vite's default browser-condition resolution expects to load jimp's
single-file browser bundle.

The upstream vite.config.ts builds cleanly with bun-installed deps:
  221 modules transformed, browser bundle inherits polyfill intact.

Per contribution rules (do not break existing frontend), reverting the
build-config touches entirely. Build instructions in the GPU passthrough
runbook will be updated to use `bun install` instead of `npm install`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants