86 changes: 66 additions & 20 deletions docs/source/getting_started/televiz.rst
@@ -5,8 +5,9 @@ Televiz
=======

Televiz (``isaacteleop.viz``) is a lightweight compositor for Isaac Teleop. It composites camera and
sensor feeds — with 3D rendered content coming soon — into an XR headset, a desktop window, or an
offscreen buffer, integrating directly with the device-tracking and retargeting pipeline.
sensor feeds, plus 3D rendered content (gsplat, nvblox, neural reconstruction), into an XR headset, a
desktop window, or an offscreen buffer, integrating directly with the device-tracking and retargeting
pipeline.

It is a **compositor**, not a capture or streaming layer: it consumes GPU frames and assembles them
into a final image. Camera capture, decode, and network transport live in the application (see
@@ -29,16 +30,17 @@ which owns the Vulkan context, the display target, the OpenXR session (in XR mod
of **layers**. Content producers submit GPU buffers to layers; the session composites every layer
into one frame each time you call ``render()``.

The built-in layer type today is
:code-file:`QuadLayer <src/viz/layers/cpp/inc/viz/layers/quad_layer.hpp>` — a CUDA-fed 2D texture
plane (mono or stereo), optionally placed in 3D space. Use it for camera feeds.
Two layer types are available:

.. note::
* :code-file:`QuadLayer <src/viz/layers/cpp/inc/viz/layers/quad_layer.hpp>` — a CUDA-fed 2D texture
plane (mono or stereo), optionally placed in 3D space. Use it for camera feeds.
* :code-file:`ProjectionLayer <src/viz/layers/cpp/inc/viz/layers/projection_layer.hpp>` — a full-view
RGBD layer for external renderers (gsplat, nvblox, neural reconstruction) that produce per-view
``(color, depth)`` buffers. Use it to present a rendered 3D scene from the current head pose.

**Coming soon:** ``ProjectionLayer``, a full-view stereo RGBD layer for external renderers
(gsplat, nvblox, neural reconstruction) that produce per-view ``(color, depth)`` buffers,
Z-composited with quads. It is not yet available in this release — see `ProjectionLayer
(coming soon)`_ below.
A session holds **either** one ``ProjectionLayer`` **or** any number of ``QuadLayer``\ s, not both:
quads composite into a shared render target, while a projection layer is presented directly (see
`ProjectionLayer`_).

All symbols are imported from the top-level module::

@@ -212,19 +214,63 @@ For a stereo layer both buffers are copied on the same stream and signaled toget
never sees a half-matched pair. Lock-mode placement strategies (``world`` / ``head`` / ``lazy``) are
**application policy** and ship in the sample, not in the module.

ProjectionLayer (coming soon)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ProjectionLayer
^^^^^^^^^^^^^^^

.. note::
A full-view RGBD layer for **in-loop** renderers — gsplat, nvblox, or neural reconstruction engines
that produce per-view ``(color, depth)`` buffers. Configure it with ``ProjectionLayerConfig``:

``ProjectionLayer`` is under active development and **not yet available in this release**. The
description below is a preview of the planned API and may change.
.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - Field
     - Description
   * - ``name``
     - Layer name.
   * - ``view_resolution``
     - Per-view render resolution. **Must equal** ``session.get_recommended_resolution()``: the
       layer's images are copied 1:1 into the presentation swapchains (per-eye in XR). A mismatch
       is rejected by ``add_projection_layer``.
   * - ``color_format``
     - ``PixelFormat.kRGBA8``.
   * - ``depth_format``
     - ``PixelFormat.kD32F`` (default) so the depth reaches the XR runtime for positional
       reprojection, or ``None`` to present color only.
   * - ``stereo``
     - ``True`` for per-eye buffers. A stereo (XR) display **requires** a stereo layer; a mono layer
       is rejected at ``add_projection_layer``.

Unlike ``QuadLayer``, a projection layer is **direct-present**: each view's ``(color, depth)`` is
copied straight into the presentation swapchains (no shared render target). Because of that a session
holds *either* one ``ProjectionLayer`` *or* any number of ``QuadLayer``\ s, never both.
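
The acceptance rules described above can be restated as a small stand-alone check. This is a minimal sketch, not the Televiz implementation: ``MockSession``, ``MockProjectionConfig``, and the exception types are hypothetical stand-ins. Only the three rules come from the text (the resolution must match the recommended one, a stereo display needs a stereo layer, and projection and quad layers are mutually exclusive). How the real session enforces exclusivity, and whether a second projection layer raises, are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class MockProjectionConfig:
    """Hypothetical stand-in for ProjectionLayerConfig (not the real class)."""
    name: str = "projection"
    view_resolution: tuple = (0, 0)
    stereo: bool = False


@dataclass
class MockSession:
    """Hypothetical stand-in for a session; models layer bookkeeping only."""
    recommended_resolution: tuple
    xr_mode: bool
    quad_layers: list = field(default_factory=list)
    projection_layer: object = None

    def add_quad_layer(self, name):
        # Assumed enforcement point: quads cannot join a projection session.
        if self.projection_layer is not None:
            raise RuntimeError("session already holds a ProjectionLayer")
        self.quad_layers.append(name)
        return name

    def add_projection_layer(self, cfg):
        # Rule 1: either one ProjectionLayer or any number of QuadLayers.
        if self.quad_layers or self.projection_layer is not None:
            raise RuntimeError("session holds one ProjectionLayer or QuadLayers, not both")
        # Rule 2: images are copied 1:1, so the resolution must match exactly.
        if cfg.view_resolution != self.recommended_resolution:
            raise ValueError("view_resolution must equal the recommended resolution")
        # Rule 3: a stereo (XR) display requires a stereo layer.
        if self.xr_mode and not cfg.stereo:
            raise ValueError("a stereo (XR) display requires a stereo layer")
        self.projection_layer = cfg
        return cfg


session = MockSession(recommended_resolution=(1920, 1080), xr_mode=True)
layer = session.add_projection_layer(
    MockProjectionConfig(view_resolution=(1920, 1080), stereo=True)
)
```

Checking all three rules at ``add_projection_layer`` time, as the table describes for resolution and stereo, means a misconfigured layer fails before the first frame rather than inside the frame loop.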

The renderer runs **in-loop** with the frame loop: read the predicted view poses from the
``FrameInfo`` returned by ``begin_frame()``, render against them, then ``submit()`` before
``end_frame()``:

.. code-block:: python

    cfg = televiz.ProjectionLayerConfig()
    cfg.view_resolution = session.get_recommended_resolution()
    cfg.stereo = session.is_xr_mode()
    layer = session.add_projection_layer(cfg)

    while running:
        info = session.begin_frame()
        if info.should_render:
            # Render against THIS frame's per-eye poses (info.views[i].pose + .fov).
            # renderer is application code; here it is assumed to return one
            # (color, depth) pair of RGBA8 + D32F CUDA buffers per view.
            buffers = renderer.render(info.views)
            if layer.stereo:
                (left_color, left_depth), (right_color, right_depth) = buffers
                layer.submit(left_color, left_depth, right_color, right_depth, stream=cuda_stream)
            else:
                color, depth = buffers[0]
                layer.submit(color, depth, stream=cuda_stream)
        session.end_frame()

A planned full-view RGBD layer for in-loop renderers — gsplat, nvblox, or neural reconstruction
engines that produce per-view ``(color, depth)`` buffers. Unlike ``QuadLayer``, the renderer will
run **in-loop** with the XR frame loop: render against the predicted view poses from the current
frame, then submit between ``begin_frame()`` and ``end_frame()``. Output is composited with depth,
so it Z-combines with quad layers.
If the renderer is slower than display rate, the runtime / CloudXR paces the app via ``xrWaitFrame``
and reprojects the last submitted frame at display rate. In XR, a visible layer that does **not**
submit for a frame presents nothing (the swapchains are cleared) rather than reproject stale RGBD
under a new pose.

Frame loop
----------
2 changes: 1 addition & 1 deletion docs/source/overview/architecture.rst
@@ -43,7 +43,7 @@ Visualization (Televiz)
Televiz (``isaacteleop.viz``) is a lightweight compositor module for visualizing what the operator
sees — camera and sensor feeds, plus 3D rendered content — in an XR headset or a desktop window.

- Composites multiple sources into one view: 2D camera/sensor planes (``QuadLayer``) today, with full-view stereo RGBD (``ProjectionLayer``) for 3D rendered content coming soon
- Composites multiple sources into one view: 2D camera/sensor planes (``QuadLayer``) and full-view stereo RGBD (``ProjectionLayer``) for 3D rendered content
- Per-eye stereo rendering and 3D placement in XR; the same API drives windowed and offscreen output
- Zero-copy submission of GPU frames straight from CuPy, PyTorch, or any CUDA memory object
- Shares one OpenXR session with the teleop device trackers, so rendering and tracking can run over a single CloudXR connection
36 changes: 24 additions & 12 deletions src/viz/core/cpp/device_image.cpp
@@ -84,7 +84,14 @@ VkFormat to_vk_storage_format(PixelFormat format)
case PixelFormat::kRGBA8:
return VK_FORMAT_R8G8B8A8_UNORM;
case PixelFormat::kD32F:
return VK_FORMAT_D32_SFLOAT;
// Single-channel float COLOR format, NOT VK_FORMAT_D32_SFLOAT. Depth
// formats use hardware depth compression in optimal tiling that CUDA
// external-memory array interop cannot interpret, so a CUDA-written
// D32_SFLOAT image reads back as garbage on the Vulkan side. R32_SFLOAT
// is bit-identical (IEEE float32) and interops exactly like the color
// images do; the bridge into the D32_SFLOAT XR depth swapchain happens
// via a staging buffer in the backend (float bits copy verbatim).
return VK_FORMAT_R32_SFLOAT;
}
throw std::runtime_error("DeviceImage: unsupported PixelFormat");
}
@@ -96,7 +103,7 @@ VkFormat to_vk_view_format(PixelFormat format)
case PixelFormat::kRGBA8:
return VK_FORMAT_R8G8B8A8_SRGB;
case PixelFormat::kD32F:
return VK_FORMAT_D32_SFLOAT;
return VK_FORMAT_R32_SFLOAT; // see to_vk_storage_format
}
throw std::runtime_error("DeviceImage: unsupported PixelFormat");
}
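
The comment in ``to_vk_storage_format`` relies on R32_SFLOAT and D32_SFLOAT texels both being IEEE float32, so a byte-for-byte copy between them changes no values. A minimal sketch of that bit-level claim follows, using only the Python standard library. It does not model Vulkan tiling, depth compression, or the staging buffer itself; ``copy_texels_verbatim`` is a hypothetical name for a plain byte copy.

```python
import struct


def float32_bits(value: float) -> int:
    """Return the raw IEEE-754 binary32 bit pattern of value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def copy_texels_verbatim(src: bytes) -> bytes:
    """Model a buffer copy: bytes move unchanged, with no format conversion."""
    return bytes(src)


depths = (0.0, 0.25, 0.5, 1.0)

# What a CUDA kernel would write into the single-channel float color image.
r32_texels = struct.pack("<4f", *depths)

# What lands in the depth image after a verbatim copy.
d32_texels = copy_texels_verbatim(r32_texels)

# Reading the copied bytes back as float32 recovers the same depths,
# and each texel keeps the same 32-bit pattern.
recovered = struct.unpack("<4f", d32_texels)
same_bits = all(float32_bits(a) == float32_bits(b) for a, b in zip(depths, recovered))
```

Because no conversion happens, the only thing that differs between the two images is how Vulkan is told to interpret them, which is why the image view and barriers in this file switch to the color aspect.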
@@ -128,13 +135,17 @@ std::unique_ptr<DeviceImage> DeviceImage::create(const VkContext& ctx,
{
throw std::invalid_argument("DeviceImage: resolution must be non-zero");
}
if (format != PixelFormat::kRGBA8)
if (format != PixelFormat::kRGBA8 && format != PixelFormat::kD32F)
{
// kD32F is reserved for ProjectionLayer's depth path. The
// CUDA-Vulkan interop contract for a depth image (sample
// semantics, layout transitions, color-space view) is not
// worked out yet, so refuse to half-build it.
throw std::invalid_argument("DeviceImage: only PixelFormat::kRGBA8 is supported");
throw std::invalid_argument("DeviceImage: unsupported PixelFormat");
}
if (format == PixelFormat::kD32F && mip_levels > 1)
{
// Depth + mip chain is meaningless (filtering depth between mip
// levels produces incorrect occlusion) and we'd have to
// special-case the blit-down pipeline. Reject explicitly rather
// than silently allocating the chain.
throw std::invalid_argument("DeviceImage: kD32F does not support mip_levels > 1");
}
// mip_levels == 0 -> auto-compute full chain to 1x1.
if (mip_levels == 0)
@@ -335,8 +346,9 @@ void DeviceImage::create_vk_image_view()
info.image = image_;
info.viewType = VK_IMAGE_VIEW_TYPE_2D;
info.format = vk_format_;
info.subresourceRange.aspectMask =
(format_ == PixelFormat::kD32F) ? VK_IMAGE_ASPECT_DEPTH_BIT : VK_IMAGE_ASPECT_COLOR_BIT;
// Always COLOR: kD32F is stored as R32_SFLOAT (a color format), not a
// depth format — see to_vk_storage_format.
info.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
info.subresourceRange.baseMipLevel = 0;
info.subresourceRange.levelCount = mip_levels_;
info.subresourceRange.baseArrayLayer = 0;
@@ -517,8 +529,8 @@ void DeviceImage::run_one_shot_layout_transition(VkImageLayout old_layout,
barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.image = image_;
barrier.subresourceRange.aspectMask =
(format_ == PixelFormat::kD32F) ? VK_IMAGE_ASPECT_DEPTH_BIT : VK_IMAGE_ASPECT_COLOR_BIT;
// kD32F is stored as R32_SFLOAT (color format), so always COLOR aspect.
barrier.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
barrier.subresourceRange.baseMipLevel = 0;
barrier.subresourceRange.levelCount = mip_levels_;
barrier.subresourceRange.baseArrayLayer = 0;
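
The argument checks added to ``DeviceImage::create`` in this file can be traced with a short Python sketch. It mirrors the order of the checks visible in the hunk and is not the C++ implementation: the function names are hypothetical, formats are modelled as strings, the zero-resolution test is assumed to apply to either dimension, and the full-chain count uses the conventional ``floor(log2(max(w, h))) + 1`` formula, which the hunk only describes as "full chain to 1x1".

```python
SUPPORTED_FORMATS = {"kRGBA8", "kD32F"}


def full_mip_chain_levels(width: int, height: int) -> int:
    """Levels needed to halve the larger dimension down to 1.

    For n >= 1, n.bit_length() equals floor(log2(n)) + 1.
    """
    return max(width, height).bit_length()


def resolve_mip_levels(width: int, height: int, pixel_format: str, mip_levels: int) -> int:
    """Apply the visible checks in order and return the effective mip count."""
    if width == 0 or height == 0:
        raise ValueError("DeviceImage: resolution must be non-zero")
    if pixel_format not in SUPPORTED_FORMATS:
        raise ValueError("DeviceImage: unsupported PixelFormat")
    if pixel_format == "kD32F" and mip_levels > 1:
        raise ValueError("DeviceImage: kD32F does not support mip_levels > 1")
    # mip_levels == 0 means "auto-compute the full chain".
    if mip_levels == 0:
        return full_mip_chain_levels(width, height)
    return mip_levels


color_levels = resolve_mip_levels(1920, 1080, "kRGBA8", 0)
depth_levels = resolve_mip_levels(1920, 1080, "kD32F", 1)
```

In the order shown in the hunk, a ``kD32F`` request with ``mip_levels == 0`` passes the ``> 1`` check and then reaches the auto-compute branch. The sketch reproduces that order rather than guessing at any later clamping, which is not visible in this diff.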
2 changes: 2 additions & 0 deletions src/viz/layers/cpp/CMakeLists.txt
@@ -10,7 +10,9 @@ cmake_minimum_required(VERSION 3.20)
# viz/layers_tests/.
add_library(viz_layers STATIC
quad_layer.cpp
projection_layer.cpp
inc/viz/layers/quad_layer.hpp
inc/viz/layers/projection_layer.hpp
)

target_include_directories(viz_layers