NVIDIA · farbod-nv · May 16, 2026 · Jun 1, 2026 · Jun 24, 2026 · Jun 24, 2026
diff --git a/docs/source/getting_started/televiz.rst b/docs/source/getting_started/televiz.rst
@@ -5,8 +5,9 @@ Televiz
 =======
 
 Televiz (``isaacteleop.viz``) is a lightweight compositor for Isaac Teleop. It composites camera and
-sensor feeds — with 3D rendered content coming soon — into an XR headset, a desktop window, or an
-offscreen buffer, integrating directly with the device-tracking and retargeting pipeline.
+sensor feeds, plus 3D rendered content (gsplat, nvblox, neural reconstruction), into an XR headset, a
+desktop window, or an offscreen buffer, integrating directly with the device-tracking and retargeting
+pipeline.
 
 It is a **compositor**, not a capture or streaming layer: it consumes GPU frames and assembles them
 into a final image. Camera capture, decode, and network transport live in the application (see
@@ -29,16 +30,17 @@ which owns the Vulkan context, the display target, the OpenXR session (in XR mod
 of **layers**. Content producers submit GPU buffers to layers; the session composites every layer
 into one frame each time you call ``render()``.
 
-The built-in layer type today is
-:code-file:`QuadLayer <src/viz/layers/cpp/inc/viz/layers/quad_layer.hpp>` — a CUDA-fed 2D texture
-plane (mono or stereo), optionally placed in 3D space. Use it for camera feeds.
+Two layer types are available:
 
-.. note::
+* :code-file:`QuadLayer <src/viz/layers/cpp/inc/viz/layers/quad_layer.hpp>` — a CUDA-fed 2D texture
+  plane (mono or stereo), optionally placed in 3D space. Use it for camera feeds.
+* :code-file:`ProjectionLayer <src/viz/layers/cpp/inc/viz/layers/projection_layer.hpp>` — a full-view
+  RGBD layer for external renderers (gsplat, nvblox, neural reconstruction) that produce per-view
+  ``(color, depth)`` buffers. Use it to present a rendered 3D scene from the current head pose.
 
-   **Coming soon:** ``ProjectionLayer``, a full-view stereo RGBD layer for external renderers
-   (gsplat, nvblox, neural reconstruction) that produce per-view ``(color, depth)`` buffers,
-   Z-composited with quads. It is not yet available in this release — see `ProjectionLayer
-   (coming soon)`_ below.
+A session holds **either** one ``ProjectionLayer`` **or** any number of ``QuadLayer``\ s, not both:
+quads composite into a shared render target, while a projection layer is presented directly (see
+`ProjectionLayer`_).
 
 All symbols are imported from the top-level module::
 
@@ -212,19 +214,63 @@ For a stereo layer both buffers are copied on the same stream and signaled toget
 never sees a half-matched pair. Lock-mode placement strategies (``world`` / ``head`` / ``lazy``) are
 **application policy** and ship in the sample, not in the module.
 
-ProjectionLayer (coming soon)
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ProjectionLayer
+^^^^^^^^^^^^^^^
 
-.. note::
+A full-view RGBD layer for **in-loop** renderers — gsplat, nvblox, or neural reconstruction engines
+that produce per-view ``(color, depth)`` buffers. Configure it with ``ProjectionLayerConfig``:
 
-   ``ProjectionLayer`` is under active development and **not yet available in this release**. The
-   description below is a preview of the planned API and may change.
+.. list-table::
+   :header-rows: 1
+   :widths: 25 75
+
+   * - Field
+     - Description
+   * - ``name``
+     - Layer name.
+   * - ``view_resolution``
+     - Per-view render resolution. **Must equal** ``session.get_recommended_resolution()`` — the
+       layer's images are copied 1:1 into the presentation swapchains (per-eye in XR). A mismatch
+       is rejected by ``add_projection_layer``.
+   * - ``color_format``
+     - ``PixelFormat.kRGBA8``.
+   * - ``depth_format``
+     - ``PixelFormat.kD32F`` (default) so the depth reaches the XR runtime for positional
+       reprojection, or ``None`` to present color only.
+   * - ``stereo``
+     - ``True`` for per-eye buffers. A stereo (XR) display **requires** a stereo layer; a mono layer
+       is rejected at ``add_projection_layer``.
+
+Unlike ``QuadLayer``, a projection layer is **direct-present**: each view's ``(color, depth)`` is
+copied straight into the presentation swapchains (no shared render target). Because of that, a
+session holds *either* one ``ProjectionLayer`` *or* any number of ``QuadLayer``\ s, never both.
+
+The renderer runs **in-loop** with the frame loop: read the predicted view poses from the
+``FrameInfo`` returned by ``begin_frame()``, render against them, then ``submit()`` before
+``end_frame()``:
+
+.. code-block:: python
+
+   cfg = televiz.ProjectionLayerConfig()
+   cfg.view_resolution = session.get_recommended_resolution()
+   cfg.stereo = session.is_xr_mode()
+   layer = session.add_projection_layer(cfg)
+
+   while running:
+       info = session.begin_frame()
+       if info.should_render:
+           # Render against THIS frame's per-eye poses (info.views[i].pose + .fov).
+           rgbd = renderer.render(info.views)  # app code: [(color, depth), ...] per view, left first
+           if layer.stereo:
+               layer.submit(*rgbd[0], *rgbd[1], stream=cuda_stream)   # RGBA8 + D32F CUDA buffers
+           else:
+               layer.submit(*rgbd[0], stream=cuda_stream)
+       session.end_frame()
 
-A planned full-view RGBD layer for in-loop renderers — gsplat, nvblox, or neural reconstruction
-engines that produce per-view ``(color, depth)`` buffers. Unlike ``QuadLayer``, the renderer will
-run **in-loop** with the XR frame loop: render against the predicted view poses from the current
-frame, then submit between ``begin_frame()`` and ``end_frame()``. Output is composited with depth,
-so it Z-combines with quad layers.
+If the renderer is slower than the display rate, the runtime / CloudXR paces the app via
+``xrWaitFrame`` and reprojects the last submitted frame at display rate. In XR, a visible layer
+that does **not** submit for a frame presents nothing (the swapchains are cleared) rather than
+reprojecting stale RGBD under a new pose.
 
 Frame loop
 ----------
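Review note on the ``ProjectionLayerConfig`` table above: the add-time rules it describes (resolution must match, a stereo display needs a stereo layer, one projection layer or any number of quads) can be modeled in a few lines. The following is a minimal pure-Python sketch, not Televiz code. ``LayerConfig``, ``SessionModel``, and their fields are hypothetical stand-ins; only the rules come from the documentation. How the real session enforces the either/or rule, and whether a stereo layer is accepted on a mono display, is not stated in the source and is assumed here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class LayerConfig:
    """Hypothetical stand-in for ProjectionLayerConfig (subset of fields)."""
    name: str
    view_resolution: Tuple[int, int]
    stereo: bool


@dataclass
class SessionModel:
    """Hypothetical model of a session's layer bookkeeping, not the real API."""
    recommended_resolution: Tuple[int, int]
    stereo_display: bool
    quad_layers: List[str] = field(default_factory=list)
    projection_layer: Optional[LayerConfig] = None

    def add_quad_layer(self, name: str) -> None:
        # Assumption: quads are refused once a projection layer exists.
        if self.projection_layer is not None:
            raise ValueError("session already holds a ProjectionLayer")
        self.quad_layers.append(name)

    def add_projection_layer(self, cfg: LayerConfig) -> LayerConfig:
        # Either one ProjectionLayer or any number of QuadLayers, never both.
        if self.quad_layers or self.projection_layer is not None:
            raise ValueError("one ProjectionLayer or any number of QuadLayers")
        # Images are copied 1:1 into the swapchains, so sizes must match.
        if cfg.view_resolution != self.recommended_resolution:
            raise ValueError("view_resolution must equal the recommended resolution")
        # A stereo (XR) display requires per-eye buffers.
        if self.stereo_display and not cfg.stereo:
            raise ValueError("a stereo display requires a stereo layer")
        self.projection_layer = cfg
        return cfg


session = SessionModel(recommended_resolution=(1920, 1080), stereo_display=True)
layer = session.add_projection_layer(LayerConfig("scene", (1920, 1080), stereo=True))
```

Checking all three rules up front, before any layer is stored, mirrors the documented behavior that a bad config is rejected at ``add_projection_layer`` rather than failing later in the frame loop.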

diff --git a/docs/source/overview/architecture.rst b/docs/source/overview/architecture.rst
@@ -43,7 +43,7 @@ Visualization (Televiz)
 Televiz (``isaacteleop.viz``) is a lightweight compositor module for visualizing what the operator
 sees — camera and sensor feeds, plus 3D rendered content — in an XR headset or a desktop window.
 
-- Composites multiple sources into one view: 2D camera/sensor planes (``QuadLayer``) today, with full-view stereo RGBD (``ProjectionLayer``) for 3D rendered content coming soon
+- Composites multiple sources into one view: 2D camera/sensor planes (``QuadLayer``) and full-view stereo RGBD (``ProjectionLayer``) for 3D rendered content
 - Per-eye stereo rendering and 3D placement in XR; the same API drives windowed and offscreen output
 - Zero-copy submission of GPU frames straight from CuPy, PyTorch, or any CUDA memory object
 - Shares one OpenXR session with the teleop device trackers, so rendering and tracking can run over a single CloudXR connection

diff --git a/src/viz/core/cpp/device_image.cpp b/src/viz/core/cpp/device_image.cpp
@@ -84,7 +84,14 @@ VkFormat to_vk_storage_format(PixelFormat format)
     case PixelFormat::kRGBA8:
         return VK_FORMAT_R8G8B8A8_UNORM;
     case PixelFormat::kD32F:
-        return VK_FORMAT_D32_SFLOAT;
+        // Single-channel float COLOR format, NOT VK_FORMAT_D32_SFLOAT. Depth
+        // formats use hardware depth compression in optimal tiling that CUDA
+        // external-memory array interop cannot interpret, so a CUDA-written
+        // D32_SFLOAT image reads back as garbage on the Vulkan side. R32_SFLOAT
+        // is bit-identical (IEEE float32) and interops exactly like the color
+        // images do; the bridge into the D32_SFLOAT XR depth swapchain happens
+        // via a staging buffer in the backend (float bits copy verbatim).
+        return VK_FORMAT_R32_SFLOAT;
     }
     throw std::runtime_error("DeviceImage: unsupported PixelFormat");
 }
@@ -96,7 +103,7 @@ VkFormat to_vk_view_format(PixelFormat format)
     case PixelFormat::kRGBA8:
         return VK_FORMAT_R8G8B8A8_SRGB;
     case PixelFormat::kD32F:
-        return VK_FORMAT_D32_SFLOAT;
+        return VK_FORMAT_R32_SFLOAT; // see to_vk_storage_format
     }
     throw std::runtime_error("DeviceImage: unsupported PixelFormat");
 }
@@ -128,13 +135,17 @@ std::unique_ptr<DeviceImage> DeviceImage::create(const VkContext& ctx,
     {
         throw std::invalid_argument("DeviceImage: resolution must be non-zero");
     }
-    if (format != PixelFormat::kRGBA8)
+    if (format != PixelFormat::kRGBA8 && format != PixelFormat::kD32F)
     {
-        // kD32F is reserved for ProjectionLayer's depth path. The
-        // CUDA-Vulkan interop contract for a depth image (sample
-        // semantics, layout transitions, color-space view) is not
-        // worked out yet, so refuse to half-build it.
-        throw std::invalid_argument("DeviceImage: only PixelFormat::kRGBA8 is supported");
+        throw std::invalid_argument("DeviceImage: unsupported PixelFormat");
+    }
+    if (format == PixelFormat::kD32F && mip_levels != 1)
+    {
+        // Depth + mip chain is meaningless (filtering depth between mip
+        // levels produces incorrect occlusion) and we'd have to
+        // special-case the blit-down pipeline. Reject explicitly, including
+        // mip_levels == 0, which would otherwise auto-compute a full chain.
+        throw std::invalid_argument("DeviceImage: kD32F requires mip_levels == 1");
     }
     // mip_levels == 0 -> auto-compute full chain to 1x1.
     if (mip_levels == 0)
@@ -335,8 +346,9 @@ void DeviceImage::create_vk_image_view()
     info.image = image_;
     info.viewType = VK_IMAGE_VIEW_TYPE_2D;
     info.format = vk_format_;
-    info.subresourceRange.aspectMask =
-        (format_ == PixelFormat::kD32F) ? VK_IMAGE_ASPECT_DEPTH_BIT : VK_IMAGE_ASPECT_COLOR_BIT;
+    // Always COLOR: kD32F is stored as R32_SFLOAT (a color format), not a
+    // depth format — see to_vk_storage_format.
+    info.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
     info.subresourceRange.baseMipLevel = 0;
     info.subresourceRange.levelCount = mip_levels_;
     info.subresourceRange.baseArrayLayer = 0;
@@ -517,8 +529,8 @@ void DeviceImage::run_one_shot_layout_transition(VkImageLayout old_layout,
     barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
     barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
     barrier.image = image_;
-    barrier.subresourceRange.aspectMask =
-        (format_ == PixelFormat::kD32F) ? VK_IMAGE_ASPECT_DEPTH_BIT : VK_IMAGE_ASPECT_COLOR_BIT;
+    // kD32F is stored as R32_SFLOAT (color format), so always COLOR aspect.
+    barrier.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
     barrier.subresourceRange.baseMipLevel = 0;
     barrier.subresourceRange.levelCount = mip_levels_;
     barrier.subresourceRange.baseArrayLayer = 0;
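Review note on the ``kD32F`` storage change: the claim in ``to_vk_storage_format`` that float bits copy verbatim can be sanity-checked without a GPU. This is a minimal Python sketch that assumes only that both formats hold one IEEE-754 float32 per texel. ``copy_texels`` is a hypothetical stand-in for the backend's staging-buffer copy, not the actual Vulkan path.

```python
import struct


def float32_bits(value: float) -> int:
    """Return the raw IEEE-754 float32 bit pattern of value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def copy_texels(src: bytes) -> bytes:
    """Hypothetical stand-in for a staging copy: a plain byte-for-byte copy."""
    return bytes(bytearray(src))


# Depth values as a renderer might write them into an R32_SFLOAT image.
depths = [0.0, 0.25, 0.5, 1.0]
r32_texels = struct.pack("<4f", *depths)

# Reinterpreting the same bytes as depth texels needs no conversion step.
d32_texels = copy_texels(r32_texels)
recovered = list(struct.unpack("<4f", d32_texels))
```

Because no value conversion happens, the copy is lossless for every float32 bit pattern, which is what lets the image be stored as a color format for CUDA interop and still feed a depth swapchain.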

diff --git a/src/viz/layers/cpp/CMakeLists.txt b/src/viz/layers/cpp/CMakeLists.txt
@@ -10,7 +10,9 @@ cmake_minimum_required(VERSION 3.20)
 # viz/layers_tests/.
 add_library(viz_layers STATIC
     quad_layer.cpp
+    projection_layer.cpp
     inc/viz/layers/quad_layer.hpp
+    inc/viz/layers/projection_layer.hpp
 )
 
 target_include_directories(viz_layers