feat: Phase 1 — automated auth bootstrap and integration test infrastructure#35
Jaydeep869 wants to merge 6 commits into
Add automated auth bootstrap (no browser OAuth required), integration test lifecycle tasks, CI workflow, and correct test tags.

## Auth bootstrap (scripts/bootstrap.sh)

Implements a fully automated Keycloak ROPC flow:

1. Wait for Keycloak health
2. Obtain admin token via master realm
3. Create smoke-test-user (idempotent: 409 is OK)
4. Request offline token via Resource Owner Password Credentials grant using the new smoke-test-client OIDC client (directAccessGrants=true)
5. Wait for Minder server health
6. Call minder auth offline-token use to trigger user auto-enrollment
7. Extract and persist the root project ID

Eliminates the need for any browser interaction or GitHub OAuth.

## Integration test tasks (Taskfile.yml)

- integration-setup: starts minder docker stack + runs bootstrap.sh
- integration-run: executes the Robot Framework suite
- integration-teardown: docker compose down + cleanup
- integration-test: full lifecycle (setup → run → teardown with defer)

## CI workflow (.github/workflows/integration-tests.yml)

Runs on: workflow_dispatch, weekly schedule (Mon 06:00 UTC), and PRs touching test infrastructure files.

Steps: checkout both repos, build Minder, start run-docker stack, health-check, bootstrap, run tests, upload results, teardown.

## Test tags

Add correct tags to all test files so -i core and -e github-required work as documented in the README:

- api-tests.robot, api-profiles.robot, api-datasources.robot, api-project.robot, api-provider.robot: tagged smoke + core
- api-repositories.robot, api-history.robot: tagged github-required + provider-required (unchanged, already had these tags)
- minder-tests.robot: Valid login → smoke+core, Project created → smoke+core, Provider enrolled → smoke+github-required+provider-required (cannot pass without DB-seeded provider, Phase 2)

## Configs and compose

- smoke-test-config.yaml: host-side config (localhost ports)
- smoke-test-config-docker.yaml: in-network config (Docker service names)
- docker-compose.test.yml: sidecar compose for running tests alongside the Minder stack

## Note on Keycloak client

bootstrap.sh now defaults to CLIENT_ID=smoke-test-client. A companion PR to mindersec/minder is required to add this client to the Keycloak realm JSON with directAccessGrantsEnabled: true.
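As a rough illustration of step 4 of the bootstrap flow, the sketch below builds the form body for a Keycloak password-grant token request and shows the "409 is OK" check from step 3. It is not taken from bootstrap.sh: the base URL, the realm name `stacklok`, the `openid offline_access` scope value, and both helper names are assumptions made for the example.

```python
from urllib.parse import urlencode


def build_ropc_request(base_url, realm, client_id, username, password):
    """Return (url, form_body) for a Keycloak password-grant token request."""
    # Standard Keycloak OIDC token endpoint layout for a realm.
    url = f"{base_url.rstrip('/')}/realms/{realm}/protocol/openid-connect/token"
    form = {
        "grant_type": "password",          # Resource Owner Password Credentials
        "client_id": client_id,
        "username": username,
        "password": password,
        "scope": "openid offline_access",  # assumption: scope used to ask for an offline token
    }
    return url, urlencode(form)


def user_create_succeeded(status_code):
    """Treat 'created' and 'already exists' as success so reruns are idempotent."""
    return status_code in (201, 409)


# Hypothetical values; the real defaults live in bootstrap.sh.
url, body = build_ropc_request(
    "http://localhost:8081", "stacklok",
    "smoke-test-client", "smoke-test-user", "smoke-test-password",
)
```

The body would be sent as `application/x-www-form-urlencoded`; the request only works if the client has direct access grants enabled, which is why the companion realm change is needed.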
Hey @evankanderson, can you review this PR?
Approved mindersec/minder#6491, as that configuration should only apply to
evankanderson left a comment
Thanks for doing this! I'm looking forward to trying it out (probably tomorrow at this point, because I've got a 2 hour run to look forward to this afternoon).
    # Manual trigger for on-demand runs
    workflow_dispatch:
      inputs:
        minder_ref:
          description: 'Minder branch/tag to test against'
          required: false
          default: 'main'
        test_tags:
          description: 'Robot Framework tag filter (e.g. "smoke", "core")'
          required: false
          default: ''
This seems dangerous in combination with imposter commits. The workflow would be:
- Attacker creates a commit in their own fork which has malicious `make bootstrap` or `make run-docker` code.
- Attacker triggers a workflow run using the commit SHA.
- Attacker's code running as the GitHub Action can perform all actions that the workflow can, including reading secrets and using the action token.
Removed the minder_ref input entirely. We now always test against mindersec/minder@main using pre-built ghcr.io images. The workflow no longer checks out or executes any Makefile/compose from the minder repo, so there's no path for an attacker to inject code via a fork commit.
    - name: Install ko
      uses: ko-build/setup-ko@v0.6
I understand why we're doing this, but it would be great if we could use an already-built minder image, rather than spending actions minutes refetching and rebuilding the binaries.
Done. Removed the Go/ko/bootstrap/build steps. The workflow now fetches docker-compose.yaml from mindersec/minder@main, patches out the build: directives with yq, and pulls ghcr.io/mindersec/minder:latest. The minder CLI binary is downloaded from the GitHub release page.
    echo "Waiting for Minder server to be healthy..."
    timeout 180 bash -c '
      until curl -sf http://localhost:8080/api/v1/health > /dev/null 2>&1; do
        echo "  ...waiting"
        sleep 5
      done
    '
    echo "Minder server is healthy!"
I think make run-docker should do this as part of the Docker-Compose healthchecks already. If not, we should adjust those to avoid needing extra checks here.
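For readers comparing the two approaches, the polling loop in the snippet above boils down to "probe until success or deadline". A minimal sketch of that pattern follows; `wait_until` and its parameters are hypothetical names for this example, and the probe is a stand-in for an HTTP health request rather than a real call to Minder.

```python
import time


def wait_until(probe, timeout_s=180.0, interval_s=5.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll probe() until it returns True or timeout_s elapses.

    Returns True on success and False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Stand-in probe that "becomes healthy" on the third attempt.
attempts = iter([False, False, True])
ok = wait_until(lambda: next(attempts), timeout_s=1.0, interval_s=0.0)
```

Docker Compose healthchecks combined with `docker compose up --wait` express the same wait declaratively, which is the direction suggested in the comment above.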
    - name: Build smoke test image
      run: task build
Do we need a container image for the smoke tests, or can they run outside the docker environment in the main thread of execution?
    - cmd: |
        echo "Starting Minder Docker stack..."
        docker compose -f {{.MINDER_REPO_PATH}}/docker-compose.yaml up -d --wait
      silent: false
If I want to run the tests against a Minder installation where I've done some manual setup before invoking the test, I can just run task test or task integration-run, correct?
Does it make sense to have both of those targets?
Replaced task build + task test with a direct pip install -r requirements.txt + robot invocation on the runner. task test (container-based) is kept for local dev isolation. task integration-run now runs robot directly on the host. To answer your question: yes, if you have a Minder stack already running and just want to run the tests, task integration-run is the target (no setup, no container). Also added task integration-run-core as a shortcut for running only core tests.
Address four issues raised in code review:

## 1. Security: remove freeform minder_ref input (workflow_dispatch)

The previous workflow allowed specifying any minder branch/tag/SHA via workflow_dispatch input. An attacker could point this at a fork commit containing malicious Makefile or docker-compose code, which would then execute in the GitHub Actions context with access to repo secrets.

Fix: remove the minder_ref input entirely. We always test against mindersec/minder@main using pre-built published images. The workflow no longer checks out the minder repo or runs its Makefile.

## 2. Performance: use pre-built minder images instead of building from source

The previous workflow ran make bootstrap + make build + ko resolve to build the minder server image from Go source. This added ~10 minutes of setup (Go toolchain, ko, protoc, sqlc, etc.) before any test ran.

Fix: download the minder CLI binary from the latest GitHub release and pull the minder server Docker image from the registry using docker compose pull. Eliminates the Go/ko/bootstrap setup steps entirely.

## 3. Redundancy: remove explicit health check step

The previous workflow had a dedicated 'Wait for Minder stack health' step that polled /api/v1/health and /health/ready. This was redundant because:

- docker compose up --wait already blocks until healthchecks pass
- bootstrap.sh has its own wait_for_url loops as a safety net

Fix: remove the step. Rely on docker compose up --wait and bootstrap.sh.

## 4. Architecture: run tests directly on the runner (no container image)

The previous workflow ran task build (build a Docker image with Robot + Python) then task test (run tests inside that container). Building a container just to run Python tests in CI adds complexity for no benefit since the runner already has Python.

Fix: install requirements.txt via pip directly on the runner and invoke robot as a host command. The container-based task test is kept for local dev isolation but is not used in CI.

Also adds task integration-run-core as a convenience shortcut and clarifies the distinction between task test (container, local dev) and task integration-run (host robot, CI / pre-existing stack) in both the Taskfile and README.
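To make the effect of a core-only shortcut concrete, here is a simplified model of how include (`-i`) and exclude (`-e`) tag filters narrow a suite. It is only a sketch: Robot Framework's real matching also handles patterns and case-insensitivity, and `select_tests` is a hypothetical helper, not part of Robot or this repository. The test names and tags are the ones listed for minder-tests.robot in this PR.

```python
def select_tests(tests, include=(), exclude=()):
    """Return test names that carry an included tag and no excluded tag.

    An empty include set means "no include filter", mirroring a run
    without -i.
    """
    include, exclude = set(include), set(exclude)
    selected = []
    for name, tags in tests.items():
        tags = set(tags)
        if include and not (tags & include):
            continue  # include filter given, but no tag matches
        if tags & exclude:
            continue  # any excluded tag removes the test
        selected.append(name)
    return selected


tests = {
    "Valid login": ["smoke", "core"],
    "Project created": ["smoke", "core"],
    "Provider enrolled": ["smoke", "github-required", "provider-required"],
}

core_only = select_tests(tests, include=["core"])
no_github = select_tests(tests, exclude=["github-required"])
```

Under this model both filters leave the same two tests, which is why a test with no core tag silently drops out of an `-i core` run.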
Three CI failures addressed:
1. The minder docker-compose.yaml uses 'build: context: .' for the
minder server and migrate services. docker compose pull skips build-
directive services, and docker compose up tries to build from local
source — which fails in CI because there is no local minder checkout.
Fix: fetch the compose file from mindersec/minder@main and use yq to
replace build directives with 'ghcr.io/mindersec/minder:latest'.
2. The minder server requires SSH keys and config files that 'make
bootstrap' normally generates from local source. Without the minder
repo checked out these files are missing and docker compose up fails.
Fix: generate minimal SSH key set and fetch config templates directly
from mindersec/minder@main.
3. MINDER_PROJECT was referenced as '${{ env.MINDER_PROJECT }}' in the
step env block. GitHub Actions evaluates step env values at parse time,
not after previous steps run. bootstrap.sh writes MINDER_PROJECT to
$GITHUB_ENV, making it available as a plain shell variable in
subsequent steps — no explicit reference needed in the YAML.
Signed-off-by: jaydeep869 <jaydeeppokhariya2106@gmail.com>
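The first fix above is essentially a small transformation of the parsed compose file. The sketch below shows the same idea on a plain dictionary, as an illustration rather than the actual yq expression used in the workflow. `use_prebuilt_images` is a hypothetical helper, the image reference is a placeholder, and the sample compose content is invented for the example; only the `minder` and `migrate` service names come from this PR.

```python
import copy


def use_prebuilt_images(compose, image, services=("minder", "migrate")):
    """Return a copy of a parsed compose mapping with build: replaced by image:."""
    patched = copy.deepcopy(compose)  # leave the caller's mapping untouched
    for name in services:
        service = patched.get("services", {}).get(name)
        if service is None:
            continue  # service not present in this compose file
        service.pop("build", None)   # drop the local-source build directive
        service["image"] = image     # point at a published image instead
    return patched


# Invented sample; the real file is fetched from mindersec/minder@main.
compose = {
    "services": {
        "minder": {"build": {"context": "."}, "command": ["serve"]},
        "migrate": {"build": {"context": "."}},
        "postgres": {"image": "postgres:example"},
    }
}
patched = use_prebuilt_images(compose, "registry.example/minder-server:latest")
```

Keeping the image reference as a parameter makes it easy to correct the tag or repository path without touching the patching logic.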
c35775c to e7d064c
The teardown and log steps run with 'if: always()' and 'if: failure()' respectively, but docker-compose.minder.yaml is only created mid-workflow. If any earlier step (e.g. yq patch, curl) fails before the file is written, the teardown step crashes:

    open .../docker-compose.minder.yaml: no such file or directory

Fix: check file existence before running docker compose commands in both the log-dump and teardown steps. This makes the cleanup path idempotent regardless of where in the workflow the failure occurred.

Signed-off-by: jaydeep869 <jaydeeppokhariya2106@gmail.com>
Two fixes:
1. Wrong release asset name for minder CLI.
The goreleaser archive name template is:
minder_VERSION_Os_Arch (e.g. minder_v0.1.2_linux_amd64.tar.gz)
The previous URL used 'minder_Linux_x86_64.tar.gz' which does not
exist, causing curl to return a 404 HTML page piped into tar —
hence 'gzip: stdin: not in gzip format'.
Fix: include the version in the filename and use lowercase os/arch.
2. dorny/test-reporter fails hard when results/xoutput.xml does not
exist (i.e. when the tests never ran due to an earlier failure).
This masked the real failure with a secondary 'No test report files
were found' error.
Fix: add hashFiles() guard so the reporter only runs when the file
exists, and set fail-on-error: false so a missing report does not
override the real exit code.
Signed-off-by: jaydeep869 <jaydeeppokhariya2106@gmail.com>
The goreleaser name template strips the 'v' prefix from the tag, so tag v0.1.2 produces minder_0.1.2_linux_amd64.tar.gz. The previous URL used the raw tag name (v0.1.2) in the path, resulting in a 404.

Rather than hardcode the naming convention (which may change), query the releases API for .assets[] and filter by 'linux_amd64.tar.gz'. This approach tolerates goreleaser config changes as long as the asset name keeps that suffix.

Also adds a diagnostic fallback that prints all available asset names if the expected asset is not found, making future failures easier to debug.

Signed-off-by: jaydeep869 <jaydeeppokhariya2106@gmail.com>
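The asset lookup described in this commit can be sketched as a suffix filter over the release JSON, with the diagnostic fallback folded into the error message. This is an illustration, not the workflow's actual shell code: `pick_asset_url` is a hypothetical helper, and the sample release data (including the download URLs) is made up, although `assets`, `name`, and `browser_download_url` are the field names the GitHub releases API returns.

```python
def pick_asset_url(release, suffix="linux_amd64.tar.gz"):
    """Return the download URL of the first asset whose name ends with suffix.

    Raises LookupError listing every available asset name when nothing
    matches, so a naming change is easy to diagnose from the log.
    """
    assets = release.get("assets", [])
    for asset in assets:
        if asset["name"].endswith(suffix):
            return asset["browser_download_url"]
    names = ", ".join(a["name"] for a in assets) or "(none)"
    raise LookupError(f"no asset ending in {suffix!r}; available: {names}")


# Made-up release payload for the example.
release = {
    "tag_name": "v0.1.2",
    "assets": [
        {"name": "minder_0.1.2_darwin_arm64.tar.gz",
         "browser_download_url": "https://example.invalid/darwin_arm64"},
        {"name": "minder_0.1.2_linux_amd64.tar.gz",
         "browser_download_url": "https://example.invalid/linux_amd64"},
    ],
}
asset_url = pick_asset_url(release)
```

Matching on the suffix avoids depending on whether the version in the file name carries the 'v' prefix.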
evankanderson left a comment
I had trouble getting the offline token code working. I ended up using minder auth offline-token get -f cli-offline.token to get a working token to be able to make more progress. With that and the following changes, I was able to get 15/21 tests labelled "core" to pass. I think we should scale down the default targets to only the passing ones -- maybe label the others "extended-only" or some such?
Changes needed to get tests passing:
- Keycloak needs to be started with the `KC_HTTP_MANAGEMENT_HEALTH_ENABLED: "false"` environment variable to enable health checks on the (exposed) main port. (Used by `bootstrap.sh`.)
- The `minder-cli` client needs to have `directAccessGrantsEnabled: true` in `stacklok.yaml` given the configuration in `bootstrap.sh`. I haven't tried using the `smoke-tests-client` to get a token for use by the Minder CLI, since I wasn't able to get the offline token working correctly.
- I found that I needed to do a `minder auth login` or equivalent once to bootstrap the project. `auth offline-token use` didn't seem to do this, though you might be able to `create project` explicitly.
- It looks like the robot tests take advantage of "child projects", a partially-working hierarchical projects implementation that needs to be enabled for parent projects via direct database inserts:

      INSERT INTO entitlements (feature, project_id) VALUES ('project_hierarchy_operations_enabled', '{project_id}');
- The Minder auth token has moved since the smoke tests were written (to allow storing multiple auth tokens for different servers). You'll now need something like the following in `_get_bearer_token(self):` in `minder_restapi_lib.py`:

      credentials_path = os.path.expanduser(f"~/.config/minder/{get_grpc_url_from_config().replace(':', '_')}.json")
- It's documented, but you might want to check for a `MINDER_CONFIG` and/or `config.yaml` in the current directory before running the bootstrap script, otherwise the `minder` bits will run against a cloud service.
- I had to set the following additional environment variables when running directly from `uv tool run`:
  - `MINDER_PROJECT`
  - `MINDER_OFFLINE_TOKEN_PATH`
  - `MINDER_RULETYPES_PATH`
  - `TEST_TAGS=core`
  - `MINDER_TEST_ORG` <-- I didn't set this, this might have permitted more tests to pass
I ended up using uv tool run --with-requirements requirements.txt robot ... to run the python script. UV is generally a nicer way to handle Python and dependencies, but was pretty new when these tests were being written.
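The per-server credentials path mentioned in the review can be pulled into a small helper that is easy to test. The sketch below follows the expression quoted in the comment; the function name, the `config_home` parameter, and the `localhost:8090` address are assumptions for illustration, and the real library would obtain the address from its own config lookup.

```python
import os


def credentials_path(grpc_url, config_home="~/.config"):
    """Build the per-server credentials file path from a host:port address.

    Colons are replaced with underscores so that tokens for different
    servers land in different files.
    """
    file_name = grpc_url.replace(":", "_") + ".json"
    return os.path.join(os.path.expanduser(config_home), "minder", file_name)


# Hypothetical server address and config directory.
path = credentials_path("localhost:8090", config_home="/tmp/cfg")
```

Passing the config directory explicitly keeps the helper independent of the home directory of whoever runs the tests.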
    # KC_ADMIN_PASS — Keycloak admin password (default: admin)
    # TEST_USER — Test user to create (default: smoke-test-user)
    # TEST_PASS — Test user password (default: smoke-test-password)
    # CLIENT_ID — OIDC client ID (default: minder-cli)
I think this should be smoke-test-client here and below, unless you want to set directAccessGrantsEnabled: true on the minder-cli client.
    .services.minder.image = "ghcr.io/mindersec/minder:latest" |
    del(.services.migrate.build) |
    .services.migrate.image = "ghcr.io/mindersec/minder:latest"
These two should be ghcr.io/mindersec/minder/server:latest
## Summary

This PR implements the Phase 1 integration test infrastructure agreed in the LFX mentorship design discussion. It enables fully automated smoke tests against a local `run-docker` Minder stack: no browser, no GitHub OAuth, no manual steps.

## What's included

### `scripts/bootstrap.sh` (new)

Automated Keycloak ROPC bootstrap flow:

1. Wait for Keycloak health (`/health/ready`)
2. Obtain an admin token via the master realm
3. Create `smoke-test-user` idempotently (HTTP 409 = already exists, OK)
4. Request an offline token via the ROPC grant using `smoke-test-client`
5. Wait for Minder server health (`/api/v1/health`)
6. `minder auth offline-token use` → triggers user auto-enrollment
7. Extract and persist the root project ID

All steps are configurable via environment variables with sensible defaults for `run-docker`.

### `Taskfile.yml`: integration test tasks (new)

- `integration-setup`: starts the Minder Docker stack and runs `bootstrap.sh`
- `integration-run`: executes the Robot Framework suite
- `integration-teardown`: `docker compose down -v` + cleanup
- `integration-test`: full lifecycle (setup → run → teardown) with `defer` teardown

### `.github/workflows/integration-tests.yml` (new)

CI pipeline that runs on:

- `workflow_dispatch` (manual, with selectable `minder_ref` and tag filter)
- weekly schedule (Mon 06:00 UTC)
- PRs touching test infrastructure files

Steps: checkout both repos → build Minder → `make run-docker` → health checks → bootstrap → run tests → upload results (xUnit + Robot HTML) → teardown. Minder logs printed on failure.

### Test tags fixed (all `.robot` files)

Previously most tests had no `core` tag, making `task test -- -i core` return zero tests despite the README documenting it.

- `api-tests.robot`: `smoke` `core`
- `api-profiles.robot`: `smoke` `core`
- `api-datasources.robot`: `smoke` `core`
- `api-project.robot`: `smoke` `core`
- `api-provider.robot`: `smoke` `core`
- `minder-tests.robot`, Valid login: `smoke` `core`
- `minder-tests.robot`, Project created: `smoke` `core`
- `minder-tests.robot`, Provider enrolled: `smoke` `github-required` `provider-required` (needs Phase 2 DB seeding)

### Configs and compose (new)

- `smoke-test-config.yaml`: host-side CLI config (localhost ports)
- `smoke-test-config-docker.yaml`: in-network config (Docker service names)
- `docker-compose.test.yml`: sidecar compose to run tests alongside stack

## Dependencies

Requires mindersec/minder#XX (adds `smoke-test-client` Keycloak config with `directAccessGrantsEnabled: true`). The bootstrap script will fail with a clear error message if this client is not present.

## What's NOT in this PR (future phases)

- `api-repositories.robot` and `api-history.robot` (`provider-required` tagged tests)

## Testing