nvme-top: add paging support and improve topology update handling #3529
shroffni wants to merge 3 commits into
The nvme-top UI consists of two dashboards. The first dashboard displays the list of available NVMe subsystems on the host. Users can scroll through the list and press **Enter** to select a subsystem, which opens the second dashboard showing statistics for the selected subsystem.

When the user returns to the first dashboard by pressing **Esc**, the previously selected subsystem may have moved because the NVMe topology was updated while viewing the second dashboard. This becomes particularly problematic when the number of subsystems exceeds a single page. Preserving the previously selected row is not feasible because the first dashboard screen buffer is rebuilt when switching back from the second dashboard. Additionally, nvme-top renders the dashboard line-by-line rather than page-by-page, so there is no mapping between the screen buffer and the page containing the previously selected subsystem.

Instead, restore the selected subsystem as the first entry in the subsystem list when transitioning from the second dashboard back to the first. This provides a consistent and predictable view even when the topology changes while the user is viewing subsystem statistics.

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
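
To make the "restore as first entry" idea concrete, here is a minimal sketch, not the code from this patch. It assumes the rebuilt list can be searched by subsystem name; the helper `first_row_after_rebuild()` and the name-based matching are hypothetical and only illustrate the fallback behavior when the subsystem has disappeared.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of nvme-cli): after the subsystem list has
 * been rebuilt, return the index the first dashboard should start drawing
 * from so that the previously selected subsystem appears as the first entry.
 * Falls back to index 0 when the subsystem is gone or nothing was selected.
 */
size_t first_row_after_rebuild(const char *const *names, size_t count,
			       const char *selected)
{
	size_t i;

	if (!selected)
		return 0;

	for (i = 0; i < count; i++) {
		if (names[i] && strcmp(names[i], selected) == 0)
			return i;
	}

	/* The subsystem was removed while the second dashboard was open. */
	return 0;
}
```

Starting the view at the returned index avoids any dependency on which page the subsystem used to be on, which is the property the commit message relies on.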
The top dashboard currently supports navigation only through the Up and Down arrow keys. When the number of NVMe subsystems becomes large, or when a subsystem contains tens or hundreds of namespaces, navigating the dashboard one row at a time becomes cumbersome.

Add paging support to the top dashboard so users can navigate more efficiently using the Page Up and Page Down keys.

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
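
The paging behavior described above comes down to clamped index arithmetic. The sketch below is an illustration under assumptions, not the patch itself: `page_up()` and `page_down()` are hypothetical names, and `page_rows` stands for however many rows fit on one screen.

```c
#include <stddef.h>

/* Move the selection one page up, clamping at the first row. */
size_t page_up(size_t cur, size_t page_rows)
{
	return cur > page_rows ? cur - page_rows : 0;
}

/*
 * Move the selection one page down, clamping at the last row.
 * The comparison is written to avoid overflowing cur + page_rows.
 */
size_t page_down(size_t cur, size_t page_rows, size_t total)
{
	if (total == 0)
		return 0;
	if (cur >= total - 1 || page_rows >= total - 1 - cur)
		return total - 1;
	return cur + page_rows;
}
```

With 25 rows and 10 rows per page, Page Down from row 0 lands on row 10, and Page Down from row 20 stops at the last row, 24.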
Receiving ENOBUFS while reading from the netlink uevent socket is normal and indicates that one or more uevents were dropped because the socket receive buffer overflowed. This can happen during periods of high kobject activity when many add, remove, or change events are generated.

Instead of treating ENOBUFS as a fatal error, synthesize it as an NVMe kobject change event and trigger a topology rescan. This ensures that the internal topology is refreshed even if individual uevents were missed.

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
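
A rough sketch of how a reader loop might tell a dropped-uevent condition apart from a real failure follows. It is not the code in `util/dashboard.c`; the enum and both function names are hypothetical, and only `recv()`, `errno`, and the `ENOBUFS`/`EINTR`/`EAGAIN` constants are standard.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical outcome of one read from the uevent socket. */
enum uevent_action {
	UEVENT_PARSE,	/* a message was read; parse it as usual */
	UEVENT_RESCAN,	/* uevents were dropped; rescan the topology */
	UEVENT_RETRY,	/* interrupted or nothing to read; try again */
	UEVENT_FAIL,	/* unrecoverable socket error */
};

/* Map a recv() result and the errno it left behind to an action. */
enum uevent_action classify_uevent_read(ssize_t len, int err)
{
	if (len > 0)
		return UEVENT_PARSE;
	if (len == 0)
		return UEVENT_RETRY;
	if (err == ENOBUFS)
		return UEVENT_RESCAN;	/* receive buffer overflowed */
	if (err == EINTR || err == EAGAIN)
		return UEVENT_RETRY;
	return UEVENT_FAIL;
}

/* Read once from fd; errno is captured right after recv() returns. */
enum uevent_action read_uevent(int fd, char *buf, size_t size, ssize_t *len)
{
	*len = recv(fd, buf, size, 0);
	return classify_uevent_read(*len, *len < 0 ? errno : 0);
}
```

Keeping the classification separate from the `recv()` call makes the ENOBUFS path easy to exercise without a real netlink socket.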
    subsys_arr = NULL;
    libnvme_free_global_ctx(ctx);

    ctx = stdout_top_rescan_topology();
I didn't notice this during the initial review. I assume you have to free the global context so that all resources from libnvme_scan_topology are freed? The idea behind the global context is that it is allocated only once, so I think we need to look into a libnvme_rescan_topology.
So for this series this is fine, but we should really make the library able to deal with this use case.
There is libnvme_refresh_topology, which should do the trick.
Yes this looks good. Should I fix it in this series? Or shall I post it in a separate patch?
You can post a separate patch for this or update the series, whichever you prefer. Could you look through the Copilot findings below though? No need for a long explanation if you think a finding is wrong.
Pull request overview
This PR improves the nvme-top TUI dashboard’s robustness and navigation by adding Page Up/Page Down support, making topology refresh behavior safer when switching screens, and treating netlink ENOBUFS as a non-fatal condition to keep topology consistent.
Changes:
- Add Page Up / Page Down key events and implement paging in both the subsystem list and subsystem topology views.
- Improve behavior when returning from the topology view by rebuilding the subsystem list starting at the first entry.
- Treat netlink `ENOBUFS` as a dropped-uevent signal and trigger a topology rescan instead of aborting event processing.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| util/dashboard.h | Extends the event enum with Page Up/Down and an ignore event. |
| util/dashboard.c | Adds ENOBUFS-as-rescan handling and extends escape-sequence parsing to recognize Page Up/Down. |
| nvme-print-stdout-top.c | Implements paging logic and adjusts topology rescan/selection behavior when switching screens. |
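
As an illustration of the escape-sequence parsing mentioned for `util/dashboard.c`, the sketch below recognizes the common xterm-style sequences `ESC [ 5 ~` (Page Up) and `ESC [ 6 ~` (Page Down) next to the arrow keys. The enum and function names are hypothetical and do not match the identifiers in `util/dashboard.h`; terminals in application cursor mode send different arrow sequences, which this sketch simply ignores.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical key events; the real enum lives in util/dashboard.h. */
enum sketch_key {
	SKETCH_KEY_IGNORE,	/* unknown sequence: do nothing */
	SKETCH_KEY_UP,
	SKETCH_KEY_DOWN,
	SKETCH_KEY_PAGE_UP,
	SKETCH_KEY_PAGE_DOWN,
};

/* Classify one complete escape sequence read from the terminal. */
enum sketch_key sketch_parse_escape(const char *seq, size_t len)
{
	if (len == 3 && memcmp(seq, "\x1b[A", 3) == 0)
		return SKETCH_KEY_UP;
	if (len == 3 && memcmp(seq, "\x1b[B", 3) == 0)
		return SKETCH_KEY_DOWN;
	if (len == 4 && memcmp(seq, "\x1b[5~", 4) == 0)
		return SKETCH_KEY_PAGE_UP;
	if (len == 4 && memcmp(seq, "\x1b[6~", 4) == 0)
		return SKETCH_KEY_PAGE_DOWN;

	/* Anything else maps to an explicit "ignore" event. */
	return SKETCH_KEY_IGNORE;
}
```

Returning an explicit ignore value for unrecognized input keeps stray sequences from being mistaken for an error or a quit request.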
    switch (event) {
    case EVENT_TYPE_ERROR: /* fall through */
    case EVENT_TYPE_KEY_QUIT: /* fall through */
    case EVENT_TYPE_TIMEOUT:
        return event;
    default:

    case EVENT_TYPE_ERROR: /* fall through */
    case EVENT_TYPE_KEY_QUIT:
        return event;
    case EVENT_TYPE_TIMEOUT:
        return EVENT_TYPE_KEY_ESC;

    subsys_idx = 0;
    subsys_arr = stdout_top_build_subsys_arr(ctx,
                                             &num_subsys);
    if (!subsys_arr)

This series contains three patches that improve usability and robustness of the nvme-top dashboard.
The first patch fixes an issue when returning from the subsystem topology dashboard to the subsystem list dashboard. If the NVMe topology changes while the subsystem dashboard is active, the previously selected subsystem may move, particularly when the subsystem list spans multiple page frames. Since the subsystem list is rebuilt when switching back and nvme-top does not maintain page-to-screen mappings, preserving the previous selection is not feasible. Instead, restore the selected subsystem as the first entry in the list. This change lays the groundwork for the paging support introduced in the next patch.
The second patch adds Page Up and Page Down support to the nvme-top dashboard. This makes navigation significantly more efficient when the host contains a large number of NVMe subsystems or when a subsystem contains tens or hundreds of namespaces.
The final patch fixes the netlink uevent handling by treating ENOBUFS as a non-fatal condition. An ENOBUFS error indicates that one or more kernel kobject uevents were dropped because the socket receive buffer overflowed. Rather than terminating the event processing, nvme-top synthesizes an NVMe kobject change event and triggers a topology rescan, ensuring that the displayed topology remains consistent even when individual uevents are missed.