Add support for unsigned bit-fields in structs #154
Open
KimayaBedarkar wants to merge 5 commits into
Contributor
This does not seem to work with typedefs:

    #include "pal.h"
    #include <stdint.h>
    struct mine {
        uint16_t a : 2;
    };
    void should_work(struct mine *s) {
        _assert(s->a <= 4);
    }
Add a `FieldT::BitField { name, ty, width }` variant to the struct-field
IR and thread it through the passes. A C bit-field is not separately
addressable (`&s.f` is illegal) and shares storage units, so it does not
fit PAL's `ref T` + `pts_to` per-field model.
Instead, each unsigned bit-field is modeled as an ordinary scalar record
slot whose backing cell is the declared underlying machine type refined
to the width, e.g. `unsigned int a : N` becomes `(v:UInt32.t{UInt32.v v <
pow2 N})`. Reads are then conversion-free: the stored value already has
the bit-field's logical unsigned type.
This commit covers the read/representation side only:
- the IR variant and its `name`/`is_array`/`fixed_array_info`/
`logical_type` accessors;
- pretty-printing;
- `check`/`elab`/`prune` match arms;
- the emit field helpers (`emit_field_default`,
`emit_field_record_type`, `emit_field_projection_type`, and the new
`emit_bitfield_value_type`) that produce the refined cell type.
No frontend produces bit-fields yet and writes are not masked; those
follow in subsequent commits.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Detect `f->isBitField()` while walking record declarations and build the
new `BitField` IR node. A shared `addRecordField` helper now handles both
structs and unions:
- anonymous / zero-width padding bit-fields are skipped (no accessible
value);
- signed bit-fields are rejected (sign-extension on read and
implementation-defined out-of-range writes, C11 6.3.1.3p3);
- union bit-fields are rejected (would need the value encoding layered
on the union variant model);
- unsigned (and `_Bool`) bit-fields become `field_bitfield(name, ty,
width)`, with the width read via `getBitWidthValue()`.
The plumbing is a `DeclBuilder::field_bitfield` FFI entry in `iface.zng`
and its `clang.rs` implementation. `getBitWidthValue()` is used in its
LLVM 20+ no-argument form (the build already requires clang >= 20).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
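The classification rules in the commit message above can be summarized as a small decision function. This is a minimal sketch in C, not the PR's code (the real logic lives in the Clang-based frontend and `clang.rs`). The enum, the struct, and `classify_bitfield` are hypothetical names, and the order of the checks is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of how one bit-field declaration is treated. */
enum bitfield_action {
    BF_SKIP,          /* anonymous or zero-width padding: no accessible value */
    BF_REJECT_SIGNED, /* signed bit-fields are rejected */
    BF_REJECT_UNION,  /* bit-fields inside unions are rejected */
    BF_ACCEPT         /* unsigned or _Bool: becomes a BitField IR node */
};

/* Hypothetical description of one bit-field declaration. */
struct bitfield_decl {
    bool has_name;   /* false for `unsigned : 3;` */
    unsigned width;  /* declared width in bits */
    bool is_signed;  /* true for `signed int x : 3;` */
    bool in_union;   /* true if the enclosing record is a union */
};

static enum bitfield_action classify_bitfield(struct bitfield_decl d) {
    /* Sanity check: the mask helpers only cover storage up to 64 bits. */
    assert(d.width <= 64);
    if (!d.has_name || d.width == 0) return BF_SKIP;
    if (d.is_signed) return BF_REJECT_SIGNED;
    if (d.in_union) return BF_REJECT_UNION;
    return BF_ACCEPT;
}
```

In this sketch padding is checked first, so an unnamed signed bit-field is skipped rather than rejected; whether the real frontend orders the checks the same way is not stated in the commit message.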
Add `Pulse.Lib.C.BitField` providing `mask_u8/16/32/64`. Assigning `v` to
an `n`-bit unsigned bit-field stores its low `n` bits (`v % pow2 n`, C
unsigned modular truncation); PAL backs each bit-field by a range-refined
cell `(x:UW.t{UW.v x < pow2 n})`, so the masked result must carry that
bound.
Each helper truncates by modular arithmetic (`rem` by `pow2 n`),
mirroring how `add_wrap`/`sub_wrap`/`mul_wrap` model unsigned overflow via
the modular primitives rather than bitwise ops. The `< pow2 n` bound then
holds by "mod by a positive" and the exact value `UW.v r == UW.v v %
pow2 n` comes straight from the `rem` postcondition; the `n = W` case
(field as wide as its storage) is the identity.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
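For readers less familiar with F*, the arithmetic behind the 32-bit helper can be sketched in C. This is a hypothetical model, not the Pulse definition: `mask_u32_model` is an illustrative name, and the precondition `1 <= n <= 32` is an assumption about the valid widths.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C model of a 32-bit mask helper: keeps the low n bits of v.
   Assumed precondition: 1 <= n <= 32. */
static uint32_t mask_u32_model(unsigned n, uint32_t v) {
    assert(n >= 1 && n <= 32);
    if (n == 32) {
        return v;                   /* field as wide as its storage: identity */
    }
    return v % ((uint32_t)1 << n);  /* v mod 2^n, hence strictly below 2^n */
}
```

The `n == 32` branch is handled separately here because shifting a 32-bit value by 32 is undefined in C; in the F* setting the same case is simply the identity, as the commit message notes.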
Route an assignment to a struct member whose field is an unsigned bit-field through `Pulse.Lib.C.BitField.mask_uW width rhs` before the store, so the value stored in the range-refined cell satisfies its `< pow2 width` refinement. This is C unsigned modular truncation on write, and mirrors how unsigned arithmetic overflow is handled via the width-selected `add_wrap`/`sub_wrap`/`mul_wrap` helpers.

Adds `bitfield_member_mask` (recognizes a bit-field LHS and picks the width + helper), `get_bitfield_mask_fn` (machine width -> `mask_uW`), and the `FieldT::bit_width` accessor they use.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
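The C behaviour that this commit reproduces can be observed in plain C. The following is a minimal sketch, assuming a hypothetical `struct packed` whose widths (`a : 3`, `nibble : 4`) are inferred from the `v % 8` and `v % 16` postconditions used in the tests; it contains no PAL or Pulse code.

```c
#include <assert.h>

/* Hypothetical struct; widths chosen to match the v % 8 and v % 16 checks. */
struct packed {
    unsigned int a : 3;
    unsigned int nibble : 4;
};

/* Plain C stores: converting to the bit-field keeps only the low bits. */
static void set_a(struct packed *s, unsigned int v) {
    s->a = v;
}

static void set_nibble(struct packed *s, unsigned int v) {
    s->nibble = v;
}

/* Checks the truncation property for one input value. */
static void check_truncation(unsigned int v) {
    struct packed s = {0, 0};
    set_a(&s, v);
    set_nibble(&s, v);
    assert(s.a == v % 8u);
    assert(s.nibble == v % 16u);
}
```

Calling `check_truncation` with values above 15 exercises the out-of-range case that the masked store is meant to cover.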
Add a `bitfields` test exercising reads (via pointer, by value, and an ordinary member alongside bit-fields), masked writes with truncation postconditions (`s->a == v % 8`, `s->nibble == v % 16`), an in-range constant write, and `_Bool` / `unsigned char` bit-fields.

Document the encoding in the internals guide and list `Pulse.Lib.C.BitField` among the support libraries.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Collaborator
Author
Should work now.
Add support for unsigned bit-fields in structs
What this adds
Support for unsigned (and `_Bool`) bit-fields in C structs.

Before this change PAL silently mistranslated these: the `: N` width was dropped and each bit-field became a full-width, separately-addressable field. That is wrong: C bit-fields are not addressable (`&s.a` is illegal) and share storage units, so they do not fit PAL's per-field `ref T` + `pts_to` model.
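As an illustration of the kind of declaration this covers, here is a hypothetical struct with reads through a pointer and by value. The struct and function names are illustrative only, and the code is plain C without PAL annotations.

```c
#include <assert.h>

/* Hypothetical example: unsigned and _Bool bit-fields next to an ordinary member. */
struct flags {
    unsigned int mode : 3;   /* values 0..7 */
    unsigned int level : 4;  /* values 0..15 */
    _Bool ready : 1;         /* values 0..1 */
    unsigned int count;      /* ordinary, addressable member */
};

/* Read through a pointer. */
static unsigned int get_mode(const struct flags *f) {
    return f->mode;
}

/* Read from a by-value copy. */
static unsigned int get_level(struct flags f) {
    return f.level;
}

/* Every read is within the range implied by the declared width. */
static void check_ranges(const struct flags *f) {
    assert(f->mode < 8u);
    assert(f->level < 16u);
}

/* unsigned int *p = &f->mode;   would not compile: a bit-field has no address */
/* unsigned int *q = &f->count;  is fine: ordinary members are addressable     */
```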
Encoding: refined machine cell + mask-on-write
Each unsigned bit-field is modeled as an ordinary scalar record slot whose backing cell is the declared underlying machine type refined to the width: `unsigned int a : N` becomes `(v:UInt32.t{UInt32.v v < pow2 N})` (`UInt8/16/64.t` for the other underlying widths).
- Reads: the stored value already has the bit-field's logical unsigned type and is used directly (via subtyping) in returns, specs, and arithmetic.
- Writes: `s->a = e` stores `Pulse.Lib.C.BitField.mask_uW N e`, so the stored value satisfies the cell's `< pow2 N` refinement. This is exactly C unsigned modular truncation (`e % 2^N`), and the masked value's postcondition (`mask == e % pow2 N`) makes the truncation provable in user specs (`_ensures(s->a == v % 8)`).
- Default: the default value is `0` (in range for every `N >= 1`).

Physical bit packing / storage sharing is intentionally not modeled: each bit-field is an independent value. This is sound because a bit-field's backing cell is never C-address-exposed (`&s->a` is illegal C and never appears in valid input).
Why this design over `FStar.UInt.uint_t n`
The natural alternative is to store each bit-field as the parametric, exactly-`n`-bit integer `FStar.UInt.uint_t n` (= `x:int{0 <= x < pow2 n}`). We chose the refined machine cell instead, for three reasons:
- It is the faithful model of C. A C bit-field genuinely is an `unsigned int` (or `unsigned char`, ...) whose value happens to be range-limited. `(v:UInt32.t{UInt32.v v < pow2 N})` says exactly that. `uint_t n` is a mathematical `nat`, a value type C has no notion of.
- Reads are conversion-free. Because the cell already is a `UInt32.t`, a read is directly usable wherever the field's `unsigned int` value is expected (returns, specifications, and integer-promoted arithmetic) with no cast. With `uint_t n` every read site would need a `uint_to_t` conversion back to a machine type, adding proof obligations and noise to generated code.
- Consistency with how PAL already handles unsigned overflow. PAL lowers unsigned `+`/`-`/`*` to width-selected, total library helpers (`add_wrap`/`sub_wrap`/`mul_wrap`) that keep values as machine ints and encode modular ("wrapping") semantics in their postconditions. The mask-on-write design mirrors this one-to-one: a width-selected total helper (`mask_uW`) that keeps the value a machine int and states its modular result (`v % pow2 N`) in the postcondition. `uint_t n` would introduce a value representation that nothing else in PAL uses.
Why a refinement is still needed
There is no machine type for arbitrary widths (1, 3, 7, 31, ...); the smallest is `UInt8`. A bare `UInt32.t` only bounds the value `< pow2 32`. The refinement is what narrows a machine word down to the actual `N` bits, so it carries the real invariant regardless of which storage width backs the field.
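To make the role of the refinement concrete, the predicate `UInt32.v v < pow2 n` can be rendered as a C check on a 32-bit backing cell. This is a hypothetical sketch: `fits_width_u32` is an illustrative name, and the `1 <= n <= 32` range is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical C rendering of the refinement `UInt32.v v < pow2 n`
   on a 32-bit backing cell, for 1 <= n <= 32. */
static bool fits_width_u32(uint32_t v, unsigned n) {
    assert(n >= 1 && n <= 32);
    if (n == 32) {
        return true;  /* every uint32_t is below 2^32 */
    }
    return v < ((uint32_t)1 << n);
}
```

For example, the value 8 is a perfectly valid `uint32_t` but fails the check for `n = 3`, which is the gap between the storage type and the field that the refinement closes.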
Why the mask uses `rem`, not `logand`
`mask_uW` truncates by modular arithmetic (`UW.rem v (pow2 n)`) rather than a bitwise `logand` + `logand_mask` lemma. This is both simpler and more consistent: the `< pow2 n` bound falls out of "mod by a positive" and the exact value `v % pow2 n` is the `rem` postcondition, with no bitwise reasoning, and it uses the same modular primitives as the `add_wrap`/`sub_wrap`/`mul_wrap` overflow helpers.
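The two formulations compute the same value; the choice only affects how the result is proved. A small C sketch of both, with hypothetical function names and restricted to `1 <= n <= 31` so that the shift is well defined:

```c
#include <assert.h>
#include <stdint.h>

/* Truncation by remainder: v mod 2^n, for 1 <= n <= 31. */
static uint32_t low_bits_rem(uint32_t v, unsigned n) {
    assert(n >= 1 && n <= 31);
    return v % ((uint32_t)1 << n);
}

/* Truncation by bitwise AND with the mask 2^n - 1, for 1 <= n <= 31. */
static uint32_t low_bits_and(uint32_t v, unsigned n) {
    assert(n >= 1 && n <= 31);
    return v & (((uint32_t)1 << n) - 1u);
}
```

With the remainder form, the bound `result < 2^n` is a direct property of `%`; with the AND form, the same bound needs a separate fact about masks, which is the bitwise reasoning the PR avoids.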
Scope and rejections
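- Unsigned and `_Bool` bit-fields are supported.
- Signed bit-fields are rejected: reads involve sign-extension and their out-of-range writes are implementation-defined (C11 6.3.1.3p3), so they are not portable to bake into a verification model.
- Union bit-fields are rejected (they would need the value encoding layered on the union variant model).
- Anonymous / zero-width bit-fields (`int : 3;`, `int : 0;`) are padding and are skipped entirely: no record slot, no accessors.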
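The difference between the supported and the rejected case shows up in plain C. The sketch below uses a hypothetical `struct narrow` and stores the same value into a 3-bit unsigned and a 3-bit signed field; only the unsigned result is fixed by the standard.

```c
#include <assert.h>

/* Hypothetical struct contrasting an unsigned and a signed 3-bit field. */
struct narrow {
    unsigned int u : 3;  /* holds 0..7 */
    signed int s : 3;    /* holds -4..3 on two's-complement targets */
};

/* Store the same value into both fields. */
static void store_both(struct narrow *n, int v) {
    n->u = (unsigned int)v;  /* well-defined: the value modulo 8 */
    n->s = v;                /* implementation-defined if v is out of range */
}

/* Only the unsigned result can be asserted exactly and portably. */
static void check_roundtrip(int v) {
    struct narrow n = {0, 0};
    store_both(&n, v);
    assert(n.u == ((unsigned int)v) % 8u);
    /* Range check assumes a two's-complement target that wraps the value;
       the exact signed result is deliberately not pinned down. */
    assert(n.s >= -4 && n.s <= 3);
}
```

On common compilers storing 5 into the signed field typically reads back as a negative number, but that outcome is a property of the implementation, not of C itself.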
What it gives us
- Truncation on write that is provable in user specifications (`_ensures(s->nibble == v % 16)`).
- An encoding consistent with PAL's existing unsigned-overflow handling, keeping the generated Pulse idiomatic.
Commits
- `ir`: model unsigned bit-fields as refined machine cells (IR variant, passes, emit field/representation helpers).
- `frontend`: parse unsigned bit-fields; reject signed/union; skip padding.
- `pulse`: add `Pulse.Lib.C.BitField` truncation helpers (`mask_u8/16/32/64`).
- `emit`: mask unsigned bit-field writes (C truncation on store).
- `test+docs`: `bitfields` test + internals documentation.