Merged
3 changes: 3 additions & 0 deletions docs/source/_static/egocentric-input.gif
3 changes: 3 additions & 0 deletions docs/source/_static/egocentric-reconstruction.gif
123 changes: 111 additions & 12 deletions docs/source/references/egocentric_hand_reconstruction.rst
@@ -7,11 +7,26 @@ Egocentric Hand Reconstruction
Automated pipeline for 4D hand and camera pose reconstruction from egocentric
videos. Integrates ViPE and Dyn-HaMR in containerized environments.

.. list-table::
:widths: 50 50

* - .. image:: ../_static/egocentric-input.gif
:alt: Source egocentric video
:width: 100%
:class: no-image-zoom
- .. image:: ../_static/egocentric-reconstruction.gif
:alt: Smooth fit grid reconstruction
:width: 100%
:class: no-image-zoom
* - .. centered:: Source egocentric video
- .. centered:: Reconstructed 4D hand and camera poses


Video Capture
---------------------------

To capture egocentric video with an OAK camera, see the
:doc:`/device/oak` documentation.
`OAK camera plugin <https://nvidia.github.io/IsaacTeleop/main/device/oak.html>`_ documentation.

Setup
-----
@@ -20,9 +35,41 @@ System Requirement
^^^^^^^^^^^^^^^^^^

- OS: Ubuntu 24.04
- GPU: NVIDIA RTX 6000 Ada or L40
- Memory: 100GB (for a reference 30s video, more for longer)
- Storage: 100GB
- GPU: NVIDIA RTX 6000 Ada, L40, or H100
- System RAM: 100 GB (for a reference 30-second video; longer videos need more)
- Free disk: 100 GB

Prerequisites
^^^^^^^^^^^^^

Ensure the following are installed and configured before starting:

**Docker ≥ 20.10** (BuildKit support required):

.. code-block:: bash

docker --version # should print 20.10 or newer

**NVIDIA Container Toolkit** — required for GPU access inside containers:

- Install guide: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html

**Python tooling** — required only for downloading videos from S3/Swift URLs:

.. code-block:: bash

pip install boto3
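
The Docker version check above can be scripted as a small preflight step. This is a minimal sketch, not part of the repository: the ``version_ge`` helper is hypothetical, and it assumes GNU ``sort`` (for ``-V``) and a Docker CLI that supports ``docker version --format``.

.. code-block:: bash

   #!/usr/bin/env bash
   # Preflight sketch (hypothetical helper, not shipped with the repository).

   # Succeed if version $1 >= version $2, using GNU sort's version ordering.
   version_ge() {
       [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
   }

   if command -v docker >/dev/null 2>&1; then
       # Print only the client version via a Go template.
       docker_version="$(docker version --format '{{.Client.Version}}' 2>/dev/null)"
       if version_ge "${docker_version:-0}" "20.10"; then
           echo "Docker ${docker_version}: OK"
       else
           echo "Docker ${docker_version:-unknown} is older than 20.10" >&2
       fi
   else
       echo "docker not found" >&2
   fi

To confirm GPU access inside containers, the NVIDIA Container Toolkit documentation suggests running ``nvidia-smi`` in a stock image, for example ``docker run --rm --gpus all ubuntu nvidia-smi``.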


Check out the code
^^^^^^^^^^^^^^^^^^

.. code-block:: bash

git clone https://github.com/NVIDIA/IsaacTeleop.git
cd IsaacTeleop/src/postprocessing/egocentric_hand_reconstruction

The ``./docker`` and ``./scripts`` directories referenced in this guide are located under this directory.

Prepare data files
^^^^^^^^^^^^^^^^^^
@@ -32,19 +79,18 @@ Place required files in the ``outputs/`` directory.
.. code-block:: text

...
├── doc/
├── docker/
├── scripts/
├── ...
├── osmo/
└── outputs/
├── MANO_RIGHT.pkl
└── BMC/
└── *.npy
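
The layout above can be checked before a run. The following is a minimal sketch under the assumption that the only required assets are ``MANO_RIGHT.pkl`` and at least one ``.npy`` file under ``BMC/``; the ``check_assets`` helper is hypothetical and not part of the repository.

.. code-block:: bash

   #!/usr/bin/env bash
   # Layout check sketch (hypothetical helper, not shipped with the repository).

   # Print one line per missing asset under the given outputs directory.
   # Returns non-zero if anything is missing.
   check_assets() {
       local root="$1" missing=0
       if [ ! -f "${root}/MANO_RIGHT.pkl" ]; then
           echo "missing: ${root}/MANO_RIGHT.pkl"
           missing=1
       fi
       # compgen -G succeeds only if the glob matches at least one path.
       if ! compgen -G "${root}/BMC/*.npy" >/dev/null; then
           echo "missing: ${root}/BMC/*.npy"
           missing=1
       fi
       return "$missing"
   }

   check_assets "${1:-outputs}" && echo "all required assets found"

Run it from the ``egocentric_hand_reconstruction`` directory, optionally passing a different outputs path as the first argument.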

**MANO model** (required):

- Download from: https://mano.is.tue.mpg.de/
- Place: ``outputs/MANO_RIGHT.pkl``
- Create an academic account at https://mano.is.tue.mpg.de/ and accept the license.
- The download is a ZIP archive — extract it and place ``MANO_RIGHT.pkl`` in ``outputs/``.

**BMC data** (required):

@@ -106,10 +152,11 @@ Run complete reconstruction (ViPE + Dyn-HaMR) with a single command:
# Using a remote video file
./scripts/run_reconstruction.sh s3://path/to/your_video.mp4

The script accepts either a **local file path** or a ``s3://`` **URL**
pointing to a video on a S3-compatible cloud storage. When a URL is provided,
the video is automatically downloaded to the ``outputs/`` directory before
processing begins.
The script accepts either a **local file path** or a remote **URL**
pointing to a video on cloud storage. Both ``s3://`` URLs (S3-compatible
cloud storage) and ``swift://`` URLs (OpenStack Object Storage) are
supported. When a URL is provided, the video is automatically downloaded
to the ``outputs/`` directory before processing begins.
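
The three accepted input forms can be told apart by their URL scheme. The sketch below only illustrates that distinction; ``classify_source`` is a hypothetical function and does not reproduce the actual logic of ``run_reconstruction.sh``.

.. code-block:: bash

   #!/usr/bin/env bash
   # Input classification sketch (hypothetical; not the script's real logic).

   # Print "s3", "swift", or "local" for a given video argument.
   classify_source() {
       case "$1" in
           s3://*)    echo "s3" ;;
           swift://*) echo "swift" ;;
           *)         echo "local" ;;
       esac
   }

   classify_source "s3://path/to/your_video.mp4"
   classify_source "swift://path/to/your_video.mp4"
   classify_source "/path/to/your_video.mp4"

Anything without a recognized scheme is treated as a local path here, which is an assumption of the sketch.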

To use a remote video, set the following environment variables for
credentials:
@@ -148,6 +195,53 @@ The pipeline will:
3. Run Dyn-HaMR for hand reconstruction.
4. Save all results to ``outputs/logs/``.

Batch Reconstruction with OSMO
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For large-scale batch processing, the pipeline can be submitted as an
`OSMO <https://github.com/NVIDIA/OSMO>`_ workflow using ``hand_reconstruction.yaml``.
This runs ViPE and Dyn-HaMR as two chained tasks on a GPU pool.

**Prerequisites:**

- A working OSMO cluster deployment (see the `OSMO deployment guide <https://nvidia.github.io/OSMO/main/deployment_guide/getting_started/infrastructure_setup.html>`_)
- OSMO CLI installed and authenticated (``osmo login …``)
- Bucket and image registry credentials stored in OSMO
- Container images built and pushed to your registry (see `Build Docker images`_)
- MANO and BMC assets available at an S3 URL

See ``osmo/README.md`` for full setup details, including credential registration and container image push steps.

**Submit a workflow:**

.. code-block:: bash

osmo workflow submit osmo/hand_reconstruction.yaml \
--pool POOL_NAME \
--set-string \
experiment_id=EXPERIMENT_ID \
source_url=s3://INPUT_S3_PATH \
dest_url=s3://OUTPUT_S3_PATH \
assets_url=s3://ASSETS_S3_PATH \
vipe_image=CONTAINER_REGISTRY/ego_vipe:TAG \
dynhamr_image=CONTAINER_REGISTRY/ego_dynhamr:TAG
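
For several videos, the command above can be generated once per input. This is a dry-run sketch that only prints the commands. The ``print_submit_cmd`` wrapper and the ``clip_001``/``clip_002`` names are hypothetical, and it assumes that ``source_url`` points at a single video and that each submission uses its own ``experiment_id``; check ``osmo/README.md`` before relying on either assumption.

.. code-block:: bash

   #!/usr/bin/env bash
   # Batch submission sketch (hypothetical wrapper; prints commands only).

   # Placeholder values, as in the command above; replace before use.
   POOL="POOL_NAME"
   REGISTRY="CONTAINER_REGISTRY"
   TAG="TAG"

   # Print the submit command for one video.
   # $1 = experiment id, $2 = source URL.
   print_submit_cmd() {
       printf 'osmo workflow submit osmo/hand_reconstruction.yaml --pool %s --set-string' "$POOL"
       printf ' experiment_id=%s source_url=%s' "$1" "$2"
       printf ' dest_url=s3://OUTPUT_S3_PATH assets_url=s3://ASSETS_S3_PATH'
       printf ' vipe_image=%s/ego_vipe:%s dynhamr_image=%s/ego_dynhamr:%s\n' \
           "$REGISTRY" "$TAG" "$REGISTRY" "$TAG"
   }

   # Hypothetical inputs: one experiment id and one source URL per video.
   print_submit_cmd "clip_001" "s3://INPUT_S3_PATH/clip_001.mp4"
   print_submit_cmd "clip_002" "s3://INPUT_S3_PATH/clip_002.mp4"

Printing first makes it easy to review the parameters before any workflow is actually submitted.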

**Monitor progress:**

.. code-block:: bash

osmo workflow logs WORKFLOW_ID -n 100

Estimated Runtime
^^^^^^^^^^^^^^^^^

For a reference 30-second video, expect approximately:

- **ViPE**: ~7 minutes
- **Dyn-HaMR**: ~30 minutes

Actual runtime may vary depending on system hardware and video length.

View results
^^^^^^^^^^^^

@@ -158,3 +252,8 @@ View results

# View visualization
vlc outputs/logs/video-custom/<DATE>/<VIDEO_NAME>*/*_grid.mp4
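
When several runs have accumulated, the most recent visualization can be located by modification time. This is a minimal sketch that assumes the ``<DATE>/<VIDEO_NAME>*/`` layout shown above; ``latest_grid_video`` is a hypothetical helper, not part of the repository.

.. code-block:: bash

   #!/usr/bin/env bash
   # Result lookup sketch (hypothetical helper, not shipped with the repository).

   # Print the most recently modified *_grid.mp4 under a logs directory.
   latest_grid_video() {
       local root="$1"
       # The glob mirrors the <DATE>/<VIDEO_NAME>*/ layout shown above.
       ls -t "${root}"/video-custom/*/*/*_grid.mp4 2>/dev/null | head -n 1
   }

   latest_grid_video "outputs/logs"

The printed path can then be passed to ``vlc`` or any other video player. The helper prints nothing if no grid video exists yet.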

Limitations
-----------

Reconstruction quality depends directly on the capture quality of the source egocentric video.
@@ -23,11 +23,11 @@ The reconstruction pipeline requires two sets of external data files, stored in
- **MANO_RIGHT.pkl**
- **BMC/**

See [`doc/quickstart.md`](../doc/quickstart.md) for detailed setup instructions.
See [`Isaac Teleop Documentation`](https://nvidia.github.io/IsaacTeleop/main/references/egocentric_hand_reconstruction.html) for detailed setup instructions.

### Container images

The workflow requires two container images (`vipe_image` and `dynhamr_image`). Build them locally following the instructions in [`doc/quickstart.md`](../doc/quickstart.md):
The workflow requires two container images (`vipe_image` and `dynhamr_image`). Build them locally following the instructions in [`Isaac Teleop Documentation`](https://nvidia.github.io/IsaacTeleop/main/references/egocentric_hand_reconstruction.html):

```bash
./docker/vipe.sh build
@@ -107,7 +107,6 @@ osmo credential set REGISTRY_CREDENTIAL \

See the [OSMO credentials documentation](https://nvidia.github.io/OSMO/main/user_guide/getting_started/credentials.html) for details.


## Template Parameters

The workflow uses `{{placeholder}}` template variables that are filled at submission time via `--set-string`: