Fix experimental verify with norandommap #2110
experimental_verify replays the workload instead of keeping fio's normal per-write io_piece history. With norandommap, random writes may overwrite the same offset multiple times and leave other offsets untouched. During verify replay, fio can then read an offset that was never written and report a false "bad magic header 0" failure.

Add a per-file bitmap for experimental_verify + norandommap to record which blocks were actually written. During replay, skip offsets that were never written. Mark bits only after writes are accepted for queueing or completion so serialize_overlap/FIO_Q_BUSY requeues do not create false written entries.

Signed-off-by: Yehor Malikov <Yehor.Malikov@solidigm.com>
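To make the bitmap idea concrete, here is a minimal sketch of a per-file "written blocks" map. It is not the code from this patch: `struct written_map`, `account_write`, and `enum q_status` are hypothetical names, the enum only stands in for fio's queue return codes, and each write is assumed to cover exactly one aligned block.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tracker: one bit per block, set once a write is accepted. */
struct written_map {
    uint64_t *bits;       /* bit i set => block i was written at least once */
    uint64_t nr_blocks;
    uint64_t block_size;
};

/* Hypothetical stand-in for the queue result of a submitted write. */
enum q_status { Q_COMPLETED, Q_QUEUED, Q_BUSY };

static int written_map_init(struct written_map *m, uint64_t file_size,
                            uint64_t block_size)
{
    assert(block_size > 0);
    m->block_size = block_size;
    m->nr_blocks = (file_size + block_size - 1) / block_size;

    size_t words = (size_t)((m->nr_blocks + 63) / 64);
    if (words == 0)
        words = 1;        /* avoid a zero-sized allocation */
    m->bits = calloc(words, sizeof(uint64_t));
    return m->bits ? 0 : -1;
}

static void written_map_mark(struct written_map *m, uint64_t offset)
{
    uint64_t blk = offset / m->block_size;

    if (blk < m->nr_blocks)
        m->bits[blk / 64] |= UINT64_C(1) << (blk % 64);
}

/* Returns 1 if the block containing offset was written, else 0. */
static int written_map_test(const struct written_map *m, uint64_t offset)
{
    uint64_t blk = offset / m->block_size;

    if (blk >= m->nr_blocks)
        return 0;
    return (int)((m->bits[blk / 64] >> (blk % 64)) & 1);
}

/* Mark only after the write was accepted; a busy result means the
 * write will be requeued and nothing has been written yet. */
static void account_write(struct written_map *m, uint64_t offset,
                          enum q_status st)
{
    if (st == Q_BUSY)
        return;
    written_map_mark(m, offset);
}
```

During replay, a caller would check `written_map_test()` for each regenerated offset and skip the read when it returns 0. Marking the same block twice is harmless, which is why duplicate overwrites need no special handling here.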
A verify_only run starts in a fresh fio process, so it cannot reuse the bitmap built during the original write run. For experimental_verify with norandommap, rebuild the bitmap during the dry-run pass by replaying the write workload without issuing writes.

Reset replay-sensitive counters and file state before the verify read pass so experimental_verify replays the same offset sequence again and checks only blocks that were actually written.

Signed-off-by: Yehor Malikov <Yehor.Malikov@solidigm.com>
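The dry-run rebuild can be illustrated with a toy model. This is a sketch under stated assumptions, not fio's implementation: `struct replay_state`, `dry_run_pass`, and `verify_pass` are hypothetical names, and a small xorshift generator stands in for fio's random offset generation. The point it shows is that resetting the generator state before each pass reproduces the same offset sequence, so the bitmap can be rebuilt without touching the device.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_BLOCKS 1024u

struct replay_state {
    uint64_t rng;                     /* generator state, reset before each pass */
    uint8_t written[NR_BLOCKS / 8];   /* one bit per block */
};

/* xorshift64: deterministic, so equal seeds give equal sequences. */
static uint64_t next_block(struct replay_state *s)
{
    s->rng ^= s->rng << 13;
    s->rng ^= s->rng >> 7;
    s->rng ^= s->rng << 17;
    return s->rng % NR_BLOCKS;
}

/* Replay the write workload without issuing I/O, recording which
 * blocks the original run would have written. */
static void dry_run_pass(struct replay_state *s, uint64_t seed,
                         unsigned nr_writes)
{
    assert(seed != 0);                /* xorshift state must be nonzero */
    memset(s->written, 0, sizeof(s->written));
    s->rng = seed;
    for (unsigned i = 0; i < nr_writes; i++) {
        uint64_t blk = next_block(s);
        s->written[blk / 8] |= (uint8_t)(1u << (blk % 8));
    }
}

/* Reset the generator, replay offsets, and count how many would be
 * read and checked versus skipped as never written. */
static unsigned verify_pass(struct replay_state *s, uint64_t seed,
                            unsigned nr_replayed, unsigned *skipped)
{
    unsigned checked = 0;

    *skipped = 0;
    s->rng = seed;                    /* the reset that makes replay match */
    for (unsigned i = 0; i < nr_replayed; i++) {
        uint64_t blk = next_block(s);

        if (s->written[blk / 8] & (1u << (blk % 8)))
            checked++;                /* a real run would read and verify here */
        else
            (*skipped)++;
    }
    return checked;
}
```

If the verify pass replays exactly the writes that the dry run recorded, nothing is skipped. If it replays further than the original write run went, any offset outside the recorded set is skipped rather than reported as a failure.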
Enable verify_only coverage for experimental_verify now that replay state is reset correctly, and add an experimental_verify case to the verify header test matrix.

Also disable write sequence checking for experimental_verify overlap-risk workloads unless explicitly requested, since duplicate overwrites cannot be sequence-verified without the full write history.

Signed-off-by: Yehor Malikov <Yehor.Malikov@solidigm.com>
Force-pushed from 1b4d507 to 4b593bb.
Maybe a bit off-topic, but does anyone have experience with, or a solution for, the high RAM usage of FIO verification when [...]? We also have an additional disk that could be used as a swap device, but we found that the resulting performance is quite poor. I would be very thankful for any suggestions and ideas!
Something tells me you've already tried :-)
Exactly :-)
This PR fixes false verify failures when experimental_verify is used with norandommap.
We need this path to keep RAM usage low on very large devices. Standard verify stores per-write metadata in memory, which can become huge with large random-write workloads. We tried other approaches, including memory-backed storage, but performance was poor and fio runs took too long.
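A rough back-of-the-envelope comparison shows the scale of the difference. The helpers below are only an illustration: the names are hypothetical, and the 64-byte per-write record size used in the note after the block is an assumed figure, not the actual size of fio's io_piece.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for one bit per block, rounded up to whole bytes. */
static uint64_t bitmap_bytes(uint64_t dev_bytes, uint64_t block_size)
{
    assert(block_size > 0);
    uint64_t nr_blocks = (dev_bytes + block_size - 1) / block_size;
    return (nr_blocks + 7) / 8;
}

/* Bytes needed when every write keeps its own in-memory record.
 * record_size is an assumed value supplied by the caller. */
static uint64_t history_bytes(uint64_t nr_writes, uint64_t record_size)
{
    return nr_writes * record_size;
}
```

For a 16 TiB device with 4 KiB blocks, `bitmap_bytes()` gives 512 MiB. Under the assumed 64 bytes per record, one write per block would need 256 GiB of history.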
With norandommap, fio can overwrite the same offset multiple times and leave other offsets untouched. experimental_verify replays the workload instead of storing the full write history, so it could replay and verify offsets that were never written. That caused false "bad magic header 0" errors.
This patch adds a lightweight per-file bitmap to track blocks that were actually written. During replay, fio skips blocks that were never written.