Add support for unsigned bit-fields in structs by KimayaBedarkar · Pull Request #154 · FStarLang/pal

KimayaBedarkar · 2026-06-30T18:12:43Z

Add support for unsigned bit-fields in structs

What this adds

Support for unsigned (and _Bool) bit-fields in C structs, e.g.

struct flags {
    unsigned int a : 3;      // value in [0, 8)
    unsigned int nibble : 4; // value in [0, 16)
    _Bool  ready : 1;        // 0 or 1
    int    normal;           // ordinary field, unchanged
};

Before this change PAL silently mistranslated these: the : N width was
dropped and each bit-field became a full-width, separately-addressable field.
That is wrong — C bit-fields are not addressable (&s.a is illegal) and
share storage units, so they do not fit PAL's per-field ref T + pts_to
model.

Encoding: refined machine cell + mask-on-write

Each unsigned bit-field is modeled as an ordinary scalar record slot whose
backing cell is the declared underlying machine type refined to the width:

unsigned int a : N   ==>   (v:UInt32.t{UInt32.v v < pow2 N})

(UInt8/16/64.t for the other underlying widths.)

Reads are conversion-free: the stored value already has the bit-field's
logical unsigned type and is used directly (via subtyping) in returns, specs,
and arithmetic.
Writes mask the RHS to the width — s->a = e stores
Pulse.Lib.C.BitField.mask_uW N e — so the stored value satisfies the cell's
< pow2 N refinement. This is exactly C unsigned modular truncation
(e % 2^N), and the masked value's postcondition (mask == e % pow2 N) makes
the truncation provable in user specs (_ensures(s->a == v % 8)).
Defaults are 0 (in range for every N >= 1).

Physical bit packing / storage sharing is intentionally not modeled: each
bit-field is an independent value. This is sound because a bit-field's backing
cell is never C-address-exposed (&s->a is illegal C and never appears in valid
input).

Why this design over `FStar.UInt.uint_t n`

The natural alternative is to store each bit-field as the parametric,
exactly-n-bit integer FStar.UInt.uint_t n (= x:int{0 <= x < pow2 n}).
We chose the refined machine cell instead, for three reasons:

It is the faithful model of C. A C bit-field genuinely is an
unsigned int (or unsigned char, ...) whose value happens to be
range-limited. (v:UInt32.t{UInt32.v v < pow2 N}) says exactly that.
uint_t n is a mathematical nat, a value type C has no notion of.
Reads are conversion-free. Because the cell already is a UInt32.t, a
read is directly usable wherever the field's unsigned int value is expected
— returns, specifications, and integer-promoted arithmetic — with no cast.
With uint_t n every read site would need a uint_to_t conversion back to a
machine type, adding proof obligations and noise to generated code.
Consistency with how PAL already handles unsigned overflow. PAL lowers
unsigned +/-/* to width-selected, total library helpers
(add_wrap/sub_wrap/mul_wrap) that keep values as machine ints and encode
modular ("wrapping") semantics in their postconditions. The mask-on-write
design mirrors this one-to-one: a width-selected total helper (mask_uW) that
keeps the value a machine int and states its modular result (v % pow2 N) in
the postcondition. uint_t n would introduce a value representation that
nothing else in PAL uses.

Note: an earlier justification appealed to C extraction ergonomics. That
reasoning was dropped — PAL is a one-way C→Pulse transpiler with no
KaRaMeL/C-extraction path, so extraction is not a factor in this decision.

Why a refinement is still needed

There is no machine type for arbitrary widths (1, 3, 7, 31, ...); the smallest
is UInt8. A bare UInt32.t only bounds the value < pow2 32. The refinement
is what narrows a machine word down to the actual N bits, so it carries the
real invariant regardless of which storage width backs the field.

Why the mask uses `rem`, not `logand`

mask_uW truncates by modular arithmetic (UW.rem v (pow2 n)) rather than a
bitwise logand + logand_mask lemma. This is both simpler and more
consistent: the < pow2 n bound falls out of "mod by a positive" and the exact
value v % pow2 n is the rem postcondition, with no bitwise reasoning — and it
uses the same modular primitives as the add_wrap/sub_wrap/mul_wrap
overflow helpers.

Scope and rejections

Unsigned and _Bool bit-fields are supported.
Signed bit-fields are rejected with a diagnostic: their reads need
sign-extension and their out-of-range writes are implementation-defined
(C11 6.3.1.3p3), so they are not portable to bake into a verification model.
Union bit-fields are rejected (would need the value encoding layered onto
the union variant model).
Anonymous / zero-width bit-fields (int : 3;, int : 0;) are padding and
are skipped entirely — no record slot, no accessors.

What it gives us

Correct translation of a common C idiom that was previously mistranslated.
Reads for free (subtyping, no conversions).
Writes that truncate with C unsigned modular semantics, provable in user-facing
specifications (_ensures(s->nibble == v % 16)).
An encoding that stays uniform across widths and consistent with PAL's existing
unsigned-overflow handling, keeping the generated Pulse idiomatic.

Commits

ir: model unsigned bit-fields as refined machine cells (IR variant, passes,
emit field/representation helpers).
frontend: parse unsigned bit-fields; reject signed/union; skip padding.
pulse: add Pulse.Lib.C.BitField truncation helpers (mask_u8/16/32/64).
emit: mask unsigned bit-field writes (C truncation on store).
test+docs: bitfields test + internals documentation.

gebner · 2026-07-02T23:13:56Z

This does not seem to work with typedefs:

#include "pal.h"
#include <stdint.h>

struct mine {
  uint16_t a : 2;
};

void should_work(struct mine *s) {
  _assert(s->a <= 4);
}

Add a `FieldT::BitField { name, ty, width }` variant to the struct-field IR and thread it through the passes. A C bit-field is not separately addressable (`&s.f` is illegal) and shares storage units, so it does not fit PAL's `ref T` + `pts_to` per-field model. Instead, each unsigned bit-field is modeled as an ordinary scalar record slot whose backing cell is the declared underlying machine type refined to the width, e.g. `unsigned int a : N` becomes `(v:UInt32.t{UInt32.v v < pow2 N})`. Reads are then conversion-free: the stored value already has the bit-field's logical unsigned type. This commit covers the read/representation side only: - the IR variant and its `name`/`is_array`/`fixed_array_info`/ `logical_type` accessors; - pretty-printing; - `check`/`elab`/`prune` match arms; - the emit field helpers (`emit_field_default`, `emit_field_record_type`, `emit_field_projection_type`, and the new `emit_bitfield_value_type`) that produce the refined cell type. No frontend produces bit-fields yet and writes are not masked; those follow in subsequent commits. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Detect `f->isBitField()` while walking record declarations and build the new `BitField` IR node. A shared `addRecordField` helper now handles both structs and unions: - anonymous / zero-width padding bit-fields are skipped (no accessible value); - signed bit-fields are rejected (sign-extension on read and implementation-defined out-of-range writes, C11 6.3.1.3p3); - union bit-fields are rejected (would need the value encoding layered on the union variant model); - unsigned (and `_Bool`) bit-fields become `field_bitfield(name, ty, width)`, with the width read via `getBitWidthValue()`. The plumbing is a `DeclBuilder::field_bitfield` FFI entry in `iface.zng` and its `clang.rs` implementation. `getBitWidthValue()` is used in its LLVM 20+ no-argument form (the build already requires clang >= 20). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add `Pulse.Lib.C.BitField` providing `mask_u8/16/32/64`. Assigning `v` to an `n`-bit unsigned bit-field stores its low `n` bits (`v % pow2 n`, C unsigned modular truncation); PAL backs each bit-field by a range-refined cell `(x:UW.t{UW.v x < pow2 n})`, so the masked result must carry that bound. Each helper truncates by modular arithmetic (`rem` by `pow2 n`), mirroring how `add_wrap`/`sub_wrap`/`mul_wrap` model unsigned overflow via the modular primitives rather than bitwise ops. The `< pow2 n` bound then holds by "mod by a positive" and the exact value `UW.v r == UW.v v % pow2 n` comes straight from the `rem` postcondition; the `n = W` case (field as wide as its storage) is the identity. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Route an assignment to a struct member whose field is an unsigned bit-field through `Pulse.Lib.C.BitField.mask_uW width rhs` before the store, so the value stored in the range-refined cell satisfies its `< pow2 width` refinement. This is C unsigned modular truncation on write, and mirrors how unsigned arithmetic overflow is handled via the width-selected `add_wrap`/`sub_wrap`/`mul_wrap` helpers. Adds `bitfield_member_mask` (recognizes a bit-field LHS and picks the width + helper), `get_bitfield_mask_fn` (machine width -> `mask_uW`), and the `FieldT::bit_width` accessor they use. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add a `bitfields` test exercising reads (via pointer, by value, and an ordinary member alongside bit-fields), masked writes with truncation postconditions (`s->a == v % 8`, `s->nibble == v % 16`), an in-range constant write, and `_Bool` / `unsigned char` bit-fields. Document the encoding in the internals guide and list `Pulse.Lib.C.BitField` among the support libraries. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

KimayaBedarkar · 2026-07-03T00:19:33Z

This does not seem to work with typedefs:

#include "pal.h"
#include <stdint.h>

struct mine {
  uint16_t a : 2;
};

void should_work(struct mine *s) {
  _assert(s->a <= 4);
}

Should work now

KimayaBedarkar changed the title ~~Add support for unsigned bit-fields in structs~~ Draft: Add support for unsigned bit-fields in structs Jun 30, 2026

KimayaBedarkar force-pushed the kimaya/bitfields branch 2 times, most recently from c39f0e0 to 6cdaa3d Compare July 1, 2026 17:49

KimayaBedarkar changed the title ~~Draft: Add support for unsigned bit-fields in structs~~ Add support for unsigned bit-fields in structs Jul 1, 2026

KimayaBedarkar and others added 5 commits July 2, 2026 16:50

KimayaBedarkar force-pushed the kimaya/bitfields branch from 6cdaa3d to 2291d56 Compare July 3, 2026 00:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add support for unsigned bit-fields in structs#154

Add support for unsigned bit-fields in structs#154
KimayaBedarkar wants to merge 5 commits into
mainfrom
kimaya/bitfields

KimayaBedarkar commented Jun 30, 2026 •

edited

Loading

Uh oh!

gebner commented Jul 2, 2026

Uh oh!

KimayaBedarkar commented Jul 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

KimayaBedarkar commented Jun 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Add support for unsigned bit-fields in structs

What this adds

Encoding: refined machine cell + mask-on-write

Why this design over FStar.UInt.uint_t n

Why a refinement is still needed

Why the mask uses rem, not logand

Scope and rejections

What it gives us

Commits

Uh oh!

gebner commented Jul 2, 2026

Uh oh!

KimayaBedarkar commented Jul 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

KimayaBedarkar commented Jun 30, 2026 •

edited

Loading

Why this design over `FStar.UInt.uint_t n`

Why the mask uses `rem`, not `logand`