ci: add metadrive simulation test workflow #38236

Draft
dhruvpandoh wants to merge 29 commits into commaai:master from dhruvpandoh:metadrive-ci

@dhruvpandoh

Summary

Adds a GitHub Actions workflow for running the MetaDrive simulation bridge test in CI.

This targets #30693.

The workflow:

  • runs on free GitHub-hosted Ubuntu runners
  • configures headless software OpenGL rendering using Mesa llvmpipe and Xvfb
  • starts a virtual PulseAudio sink for audio-dependent processes
  • runs tools/sim/tests/test_metadrive_bridge.py::TestMetaDriveBridge::test_driving
  • uploads qlog/rlog/camera/log artifacts for debugging
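
As a rough illustration of the headless setup listed above, here is a minimal Python sketch. It is not the workflow file from this PR: the helper names (`headless_env`, `setup_commands`, `pytest_command`), the display number `:99`, the screen size, and the sink name `ci_null_sink` are all assumptions chosen for the example. Only the pytest target is taken from the description.

```python
import os

# Test target quoted in the PR description.
TEST_TARGET = "tools/sim/tests/test_metadrive_bridge.py::TestMetaDriveBridge::test_driving"


def headless_env(display=":99", base=None):
    """Return a copy of the environment with software-rendering settings added.

    The display number is a hypothetical choice, not a value from the PR.
    """
    env = dict(os.environ if base is None else base)
    env.update({
        "DISPLAY": display,            # X display that Xvfb will serve
        "LIBGL_ALWAYS_SOFTWARE": "1",  # ask Mesa to use software rendering
        "GALLIUM_DRIVER": "llvmpipe",  # select the llvmpipe rasterizer
    })
    return env


def setup_commands(display=":99", sink_name="ci_null_sink"):
    """Commands a CI step could run before the test (sink name is hypothetical)."""
    return [
        # Xvfb stays in the foreground, so it would be started in the background.
        ["Xvfb", display, "-screen", "0", "1280x720x24"],
        # Start a PulseAudio daemon that does not exit when idle.
        ["pulseaudio", "--start", "--exit-idle-time=-1"],
        # Add a null sink so audio-dependent processes have an output device.
        ["pactl", "load-module", "module-null-sink", f"sink_name={sink_name}"],
    ]


def pytest_command():
    """Command line for the single MetaDrive bridge test."""
    return ["pytest", TEST_TARGET]
```

A CI step could launch the first command with `subprocess.Popen`, run the other two with `subprocess.run`, and then run `pytest_command()` with `env=headless_env()`.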

Testing

Initial validation will be through GitHub Actions on this PR.

The goal is to verify:

  • the test runs successfully on free GitHub Actions runners
  • the workflow completes within the configured timeout
  • artifacts are uploaded on both success and failure
  • the test can pass reliably across repeated manual runs

@github-actions

Contributor

Process replay diff report

Replays driving segments through this PR and compares the behavior to master.
Please review any changes carefully to ensure they are expected.

✅ 0 changed, 66 passed, 0 errors

@dhruvpandoh

Author

Current status:

I have the workflow building openpilot and getting all the way to the MetaDrive pytest target in GitHub Actions. MetaDrive starts up, openpilot launches, and modeld loads successfully.

The part I’m still debugging is the runtime behavior on the hosted GitHub runner. The sim repeatedly hits cameraOdometry/livePose timing issues:

"Observation timestamp is older than the max rewind threshold of the filter"

Because of that, selfdriveState never becomes engageable and the test ends with 0 active steps.
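
To make the error message above concrete, here is a toy sketch of a rewind-window check. It is only an illustration of the general idea and is not the actual filter code used by openpilot: the function name `accepts_observation`, the parameter names, and the numeric values are all assumptions.

```python
def accepts_observation(filter_time, obs_time, max_rewind_age):
    """Return True if an observation is recent enough for the filter to use.

    A filter that can rewind keeps a limited history. An observation stamped
    earlier than (filter_time - max_rewind_age) falls outside that history
    and is rejected in this simplified model.
    """
    return obs_time >= filter_time - max_rewind_age


# Hypothetical numbers: filter state is at t = 10.0 s with a 1.0 s rewind window.
print(accepts_observation(10.0, 9.5, 1.0))  # inside the window
print(accepts_observation(10.0, 8.0, 1.0))  # older than the window
```

Under this model, if camera-derived observations consistently arrive stamped too far behind the filter time (for example because rendering on a slow runner lags the other inputs), every one of them would be rejected. Whether that is what happens here is still an open question.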

So far I’ve tried the main workflow-side fixes I could think of: pulling the needed LFS assets, using openpilot’s testing deps, ONNXCPU, Xvfb, virtual PulseAudio, and CPU/thread tuning. I’m keeping this as a draft while I investigate whether the right fix is in the simulator timing/rendering path rather than more workflow env changes.
