Skip to content

Harden PathMap algebra and serialized layouts#45

Open
MesTTo wants to merge 3 commits into
Adam-Vandervorst:masterfrom
MesTTo:pr/pathmap-hardening
Open

Harden PathMap algebra and serialized layouts#45
MesTTo wants to merge 3 commits into
Adam-Vandervorst:masterfrom
MesTTo:pr/pathmap-hardening

Conversation

@MesTTo

@MesTTo MesTTo commented Jun 23, 2026

Copy link
Copy Markdown

Adds hardening changes without syntax changes or feature additions.

The branch fixes DenseByteNode composite operand selection and adds rejection paths for malformed serialized offsets and invalid line-list node layouts.

Validation:

  • Targeted tests for each changed path
  • cargo test --lib --tests with default features
  • jscpd on touched files

Note: cargo test --no-default-features still fails in the upstream baseline path, so it is not used as branch validation.

@MesTTo MesTTo marked this pull request as ready for review June 23, 2026 10:23
@adamv-symbolica

Copy link
Copy Markdown

The changes to LineListNode seem to hint at a bug-fix, can you give an explanation of the bug? The test for it is quite impenetrable to me.

@MesTTo

MesTTo commented Jun 23, 2026

Copy link
Copy Markdown
Author

The LineListNode change is not fixing a newly observed runtime failure by itself... It is guarding an existing layout invariant... LineListNode is written around a fixed in-memory layout. The key byte capacity is chosen so the node stays at 64 bytes with slim_ptrs on x86_64, and 48 bytes without slim_ptrs because of the DynBox header. If that size changes, the node can stop matching the assumptions behind KEY_BYTES_CNT and the cache-line/list-node packing. The old test only checked this at runtime when that test was selected; the new const assertion makes layout drift a compile-time failure. the test is not meant to prove a semantic transformation; it is just the serialised form of the malformed-offset case

Comment thread src/dense_byte_node.rs
Comment on lines 701 to +1881
@@ -1850,7 +1854,12 @@ impl<V: Clone + Send + Sync, A: Allocator, Cf: CoFree<V=V, A=A>, OtherCf: CoFree
if new_mask > 0 {
AlgebraicResult::Identity(new_mask)
} else {
AlgebraicResult::Element(Self::new(None, self.val().cloned()))
let val = if val_mask & SELF_IDENT > 0 {
self.val().cloned()
} else {
other.val().cloned()
};
AlgebraicResult::Element(Self::new(None, val))
}
},
(AlgebraicResult::Identity(rec_mask), AlgebraicResult::None) => {
@@ -1864,7 +1873,12 @@ impl<V: Clone + Send + Sync, A: Allocator, Cf: CoFree<V=V, A=A>, OtherCf: CoFree
if new_mask > 0 {
AlgebraicResult::Identity(new_mask)
} else {
AlgebraicResult::Element(Self::new(self.rec().cloned(), None))
let rec = if rec_mask & SELF_IDENT > 0 {
self.rec().cloned()
} else {
other.rec().cloned()
};
AlgebraicResult::Element(Self::new(rec, None))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I meant DenseNode -- i.e. the motivation behind these.

@MesTTo

MesTTo commented Jun 24, 2026

Copy link
Copy Markdown
Author

ah right, denseNode not linelist... that's the actual fix, sorry.

it's in combine_algebraic_results. when you combine two cofrees and one half (rec or val) comes back as Identity(mask), the mask says which operand the result equals: SELF_IDENT for self, COUNTER_IDENT for other. once the combined cofree isn't identical to either operand (new_mask == 0) it builds a fresh Element, and for the surviving half the old code just cloned self.val() / self.rec().

that's only right when the surviving half is identity-to-self. in a commutative op the half can be identity-to-other instead (SELF_IDENT cleared, COUNTER_IDENT set), and then cloning self's half drops the wrong operand's value/child into the node. so the fix reads the bit: mask & SELF_IDENT > 0 takes self, else takes other. same for the rec arm.

the giveaway is the (rec_el, val_el) arm right below... it already picks the operand by index (0 => self, 1 => other), the two identity-collapse arms just weren't doing the same. this makes them consistent.

the line 698 hunk in node_get_payloads is the same accounting on the read side: a 1-byte value request landing on a cofree that still has a rec link wasn't flagging that the rec half is there.

no change for the self-identity case, it only diverges when the surviving half matched other.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants