libnvme: make discovery connects honor the exclusion list#3541
Merged
Conversation
The exclusion list is aimed squarely at orchestrators: it names the controllers no auto-connecting agent may touch, so that independent agents can coexist with each other and with the administrator's wishes. nvme connect-all is such an orchestrator -- it connects whatever a Discovery Log Page, the NBFT table, or a configuration file enumerates -- yet libnvmf_exclusion_match() had no caller in nvme-cli, so an excluded controller was happily reconnected by the very next connect-all run. Consult the list right before an enumerated controller is connected: DLP entries and referral DCs (nvmf_connect_disc_entry), NBFT records (nbft_connect), and controllers listed in config.json or discovery.conf. A match is skipped and reported at INFO level, so "connect-all --verbose" shows which controllers were left alone and why. Controllers named explicitly by the user (nvme connect, nvme discover with an address) are deliberately not checked: a targeted human action overrides the list, and only the entries such a DC enumerates are gated. Signed-off-by: Martin Belanger <martin.belanger@dell.com>
Collaborator
|
Looks good, I've spend some time trying to figure out if the return code was correct in all places. As far I can tell it all makes sense. Tricky stuff, wouldn't be surprised if this is not 100% right. But let's add those hooks into place and see what the testing reveals. Thanks! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The exclusion list (merged in #3523) is aimed squarely at orchestrators: it names the controllers no auto-connecting agent may touch, so independent agents can coexist with each other and with the administrator's wishes.
nvme connect-allis such an orchestrator — it connects whatever a Discovery Log Page, the NBFT table, or a configuration file enumerates — yetlibnvmf_exclusion_match()had no caller in nvme-cli, so an excluded controller was happily reconnected by the very nextconnect-allrun. This PR closes that gap (called out as a deliberate follow-up in the #3523 discussion).One static helper consults the list right before an enumerated controller is connected: DLP entries and referral DCs (
nvmf_connect_disc_entry), NBFT records (nbft_connect), and controllers enumerated fromconfig.json/discovery.conf. A match is skipped and reported at INFO level, soconnect-all --verboseshows which controllers were left alone and why.Controllers named explicitly by the user (
nvme connect,nvme discoverwith an address) are deliberately not checked — a targeted human action overrides the list; only the entries such a DC enumerates are gated. This matches the "not allowed to block targeted human actions" rule already documented in EXCLUSIONS.md, which now also namesconnect-allexplicitly.Testing. Unit: the existing exclusion/TID match suites (green on glibc and musl; the connect paths themselves need a kernel). Functional, against a local nvmet target with multiple subsystems: with
exclusion = nqn=<subsysnqn>inexclusions.conf,nvme connect-all -vconnected every subsystem except the excluded one and logged the skip; afternvme exclusion remove, the nextconnect-allconnected it again.