Skip to content

fix(header-serde): change train() to return Result and skip subdirectories#929

Open
kushdab wants to merge 2 commits into
cloudflare:mainfrom
kushdab:fix/header-serde-train-result
Open

fix(header-serde): change train() to return Result and skip subdirectories#929
kushdab wants to merge 2 commits into
cloudflare:mainfrom
kushdab:fix/header-serde-train-result

Conversation

@kushdab

@kushdab kushdab commented Jun 30, 2026

Copy link
Copy Markdown

Summary

Fixes #925.

The dict::train() function had two related TODOs:

  1. Skip subdirectories fs::read_dir yields both files and directories. Passing a directory path to zstd::dict::from_files causes an IO error at runtime. The fix adds an is_file() guard so only regular files are included in the training set.

  2. Return Result instead of panicking the two .unwrap() calls (one on read_dir, one on dict::from_files) caused a silent process-kill on any bad path or training failure. The return type is now Result<Vec<u8>> using the crate's existing pingora_error machinery (OrErr::explain_err).

Changes

pingora-header-serde/src/dict.rs

  • Return type: Vec<u8>Result<Vec<u8>>
  • fs::read_dir(...).unwrap().explain_err(InternalError, ...)?
  • Added .is_file() filter in the filter_map closure to skip subdirectories
  • dict::from_files(...).unwrap().explain_err(InternalError, ...)
  • Updated existing test helper from train(path) to train(path).expect(...)
  • Added test_train_skips_subdirectories regression test

pingora-header-serde/src/trainer.rs

  • main() now returns Result<(), Box<dyn std::error::Error>> and propagates errors with ?, so the binary exits with a non-zero code and a readable message instead of panicking silently

kushdab added 2 commits June 30, 2026 04:28
- Filters out subdirectories from the training set (was the first TODO)
- Changes the return type from Vec<u8> to Result<Vec<u8>> and replaces
  the two .unwrap() calls with .explain_err() / ? (second TODO)
- Updates the existing test helper to use .expect() and adds a new
  test that asserts train() returns Ok on a known-good directory

Closes cloudflare#925
Now that dict::train() returns Result<Vec<u8>>, change main() to return
Result<(), Box<dyn std::error::Error>> so the error propagates to the
OS exit code instead of being silently ignored.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Finishing TODO in pingora-header-serde for checking if path is a file + avoiding panics

1 participant