Remove global rmem page slab by ianks · Pull Request #392 · msgpack/msgpack-ruby

ianks · 2026-06-10T17:30:13Z

The notable change here is the removal of the `msgpack_rmem_*` routines, which serve as a mechanism for efficiently providing chunks of memory for decoding work.

Why is it not ractor safe?

The old page-recycling slab is a process-global msgpack_rmem_t mutated through an unsynchronized bitmask, so parallel Ractors would race on it.
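To make the race concrete, here is a minimal sketch of a bitmask-based page slab. The type and function names (`sketch_slab_t`, `sketch_slab_alloc`, `sketch_slab_free`) and the 32-page layout are assumptions made for illustration; they are not taken from the msgpack-ruby source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch of a page-recycling slab. Names and sizes are
 * illustrative only, not copied from msgpack-ruby. */
#define SKETCH_PAGE_SIZE  4096
#define SKETCH_PAGE_COUNT 32

typedef struct {
    uint32_t free_mask; /* bit i set => page i is free */
    unsigned char pages[SKETCH_PAGE_COUNT][SKETCH_PAGE_SIZE];
} sketch_slab_t;

/* One instance shared by the whole process: this is what makes it racy. */
static sketch_slab_t g_slab = { .free_mask = 0xffffffffu };

static void *sketch_slab_alloc(sketch_slab_t *s)
{
    for (unsigned i = 0; i < SKETCH_PAGE_COUNT; i++) {
        uint32_t bit = (uint32_t)1 << i;
        if (s->free_mask & bit) {   /* (1) read the mask */
            /* A second thread can pass the same test right here ... */
            s->free_mask &= ~bit;   /* (2) non-atomic read-modify-write */
            return s->pages[i];     /* ... so both callers get page i. */
        }
    }
    return NULL; /* slab exhausted; a real allocator would fall back to malloc */
}

static void sketch_slab_free(sketch_slab_t *s, void *p)
{
    ptrdiff_t off = (unsigned char *)p - &s->pages[0][0];
    assert(off >= 0 && off % SKETCH_PAGE_SIZE == 0);
    size_t i = (size_t)off / SKETCH_PAGE_SIZE;
    assert(i < SKETCH_PAGE_COUNT);
    s->free_mask |= (uint32_t)1 << i; /* same unsynchronized update */
}
```

Between steps (1) and (2) nothing stops another thread from observing the same free bit, so two callers can be handed the same page. Under a single global lock this is harmless; with Ractors running in parallel it is not.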

How did you address this?

Drop the global slab entirely and use plain xmalloc/xfree. Modern arena-based mallocs (e.g. jemalloc) are good at recycling and avoiding thread contention, so maintaining a custom slab allocator is not worth it.
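The replacement can be sketched as a pair of thin wrappers with no shared state. This is only an illustration: `sketch_xmalloc` is a stand-in for Ruby's `xmalloc` (which raises rather than returning NULL on failure), and `sketch_page_alloc`, `sketch_page_free` and the 4096-byte page size are hypothetical names and values, not the PR's actual code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096 /* illustrative page size (assumption) */

/* Stand-in for Ruby's xmalloc: never returns NULL. Here we abort on
 * failure instead of raising a Ruby exception. */
static void *sketch_xmalloc(size_t size)
{
    assert(size > 0);
    void *p = malloc(size);
    if (p == NULL) {
        fputs("out of memory\n", stderr);
        abort();
    }
    return p;
}

/* No global mask, no shared bookkeeping: each call goes straight to the
 * system allocator, which handles its own thread safety. */
static void *sketch_page_alloc(void)
{
    return sketch_xmalloc(SKETCH_PAGE_SIZE);
}

static void sketch_page_free(void *page)
{
    free(page);
}
```

The design trade-off is that page recycling is delegated to the allocator's own arenas and caches instead of being managed by the extension.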

Did you try alternatives?

Yes:

Per-instance slab on the factory - turned out to be slightly slower, and more memory intensive

No:

Lock-free concurrent linked list - too much work for something that may not show much improvement over malloc(3)

Perf (local HTTP requests)

⚠️ TAKE THESE WITH A BIG GRAIN OF SALT, THE ERROR BARS ARE BIG
⚠️ THE GOAL OF THIS PR IS RACTOR COMPATIBILITY NOT PERF

Single-threaded, jemalloc 5.3, decode-heavy workload

On a realistic HTTP request benchmark, bare xmalloc is perhaps a touch faster, but the difference is mostly noise:

| warm, paired: no-slab (this PR) vs. slab (not Ractor-safe) | Δ | 95% CI |
|---|---|---|
| wall time | −0.1% | −0.8% to +4.9% |
| CPU | −0.3% | −1.9% to +3.9% |
| allocations | identical | n/a |

RSS Impact

Memory usage seems fine as well. Bare xmalloc (this PR) holds more resident memory during sustained decoding as the decay time increases, while the peak is essentially unchanged (this is expected, and not a problem).

| sustained decode | slab (today) | xmalloc (this PR) |
|---|---|---|
| `dirty_decay_ms:10000` (default) | 43 MiB | 102 MiB |
| `dirty_decay_ms:1000` | 33 MiB | 34 MiB |
| peak (live working set) | ~113 MiB | ~114 MiB |
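
The `dirty_decay_ms` values in the table are jemalloc options. The PR does not say how they were set for these runs; as an illustration only, jemalloc can read its options from a program-level `malloc_conf` string as well as from the `MALLOC_CONF` environment variable. A minimal sketch, assuming a jemalloc build without a symbol prefix is linked into the process:

```c
/* jemalloc reads this global at startup. Builds configured with a symbol
 * prefix expose it under the prefixed name instead (assumption: no prefix
 * here). Without jemalloc linked in, this is just an unused string. */
const char *malloc_conf = "dirty_decay_ms:1000";
```

A shorter decay time returns unused dirty pages to the operating system sooner, which is consistent with the lower sustained RSS in the second row.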

Microbenchmarks (`ruby --yjit`)

Generated by /tmp/msgpack_format_yjit_pr_body.py from raw benchmark/ips output; table values were parsed, not hand-entered.

Ruby: ruby 4.0.4 (2026-05-12 revision b89eb1bcbf) +YJIT +PRISM [arm64-darwin25]
Command per ref: bundle exec rake compile && bundle exec ruby --yjit -Ilib -Iext <generated bench>
Benchmark config: benchmark/ips warmup 15s, measurement 60s per case
Started: 2026-06-15T22:14:36Z; finished: 2026-06-15T22:29:47Z
origin/master: 09c914d
PR branch: b82fefc

| benchmark | origin/master | PR branch | delta |
|---|---|---|---|
| pack-plain | 5.729M i/s (±2.7%) | 5.349M i/s (±1.3%) | -6.6% |
| pack-structured | 2.873M i/s (±2.8%) | 2.787M i/s (±1.2%) | -3.0% |
| pack-extended | 2.077M i/s (±1.8%) | 1.913M i/s (±2.1%) | -7.9% |
| unpack-plain | 4.809M i/s (±1.1%) | 4.553M i/s (±1.4%) | -5.3% |
| unpack-structured | 1.100M i/s (±1.6%) | 1.058M i/s (±1.2%) | -3.8% |
| unpack-extended | 1.523M i/s (±3.2%) | 1.462M i/s (±2.6%) | -4.0% |
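
The delta column matches the relative change in iterations per second of the PR branch against origin/master, so a negative value means fewer iterations per second on the PR branch. A small sketch of that arithmetic follows; the function name `delta_percent` is made up for illustration and is not part of the generator script.

```c
#include <assert.h>

/* Relative change of the candidate against the baseline, in percent.
 * Negative means the candidate completed fewer iterations per second. */
static double delta_percent(double baseline_ips, double candidate_ips)
{
    assert(baseline_ips > 0.0);
    return (candidate_ips - baseline_ips) / baseline_ips * 100.0;
}
```

For the pack-plain row, `delta_percent(5.729, 5.349)` gives roughly -6.63, which rounds to the -6.6% shown in the table.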

Raw benchmark output

origin/master

Warming up --------------------------------------
          pack-plain   500.901k i/100ms
     pack-structured   274.579k i/100ms
       pack-extended   201.774k i/100ms
        unpack-plain   437.897k i/100ms
   unpack-structured   108.509k i/100ms
     unpack-extended   149.952k i/100ms
Calculating -------------------------------------
          pack-plain      5.729M (± 2.7%) i/s -    343.618M in  60.030490s
     pack-structured      2.873M (± 2.8%) i/s -    172.436M in  60.066053s
       pack-extended      2.077M (± 1.8%) i/s -    124.696M in  60.046588s
        unpack-plain      4.809M (± 1.1%) i/s -    288.574M in  60.010534s
   unpack-structured      1.100M (± 1.6%) i/s -     66.082M in  60.098069s
     unpack-extended      1.523M (± 3.2%) i/s -     91.321M in  60.032914s

PR branch

Warming up --------------------------------------
          pack-plain   419.123k i/100ms
     pack-structured   261.892k i/100ms
       pack-extended   186.273k i/100ms
        unpack-plain   422.506k i/100ms
   unpack-structured   103.562k i/100ms
     unpack-extended   144.592k i/100ms
Calculating -------------------------------------
          pack-plain      5.349M (± 1.3%) i/s -    321.048M in  60.029030s
     pack-structured      2.787M (± 1.2%) i/s -    167.349M in  60.053708s
       pack-extended      1.913M (± 2.1%) i/s -    114.744M in  60.000081s
        unpack-plain      4.553M (± 1.4%) i/s -    273.361M in  60.055928s
   unpack-structured      1.058M (± 1.2%) i/s -     63.484M in  60.031311s
     unpack-extended      1.462M (± 2.6%) i/s -     87.767M in  60.056025s

byroot · 2026-06-13T06:30:29Z

I've always been dubious about the benefit of the memory arena, so I wouldn't mind getting rid of it.

That being said, MessagePack needs way more changes than this to be usable in Ractors.

#390 is on my TODO list, I'll come around to it eventually.

The page-recycling slab was a process-global msgpack_rmem_t mutated through an unsynchronized bitmask, so concurrent packing or unpacking could race on it. Drop the slab and serve rmem pages from xmalloc/xfree instead. Modern arena-based mallocs are good at recycling these allocations without maintaining process-global mutable state in msgpack-ruby.

ianks · 2026-06-15T22:54:46Z

@byroot Added micro-benchmarks to the PR body. On an M4 Pro with jemalloc, it's slightly faster to just use plain xmalloc 🤷🏻

Also, I've removed all Ractor references from this PR, so it should be independently mergeable.

Hope you are well!

ianks · 2026-06-15T23:31:45Z

> That being said, MessagePack needs way more changes than this to be usable in Ractors.

After looking through the C code a bit more, I didn't find any remaining global mutable state that would make rb_ext_ractor_safe(true) obviously unsafe. Curious if you know of anything?

The non-shareable MessagePack::DefaultFactory is a separate ergonomics concern IME, which can be addressed in another PR.

I'm tempted to add rb_ext_ractor_safe(true) back if you agree the rest of the C code is Ractor-safe.

ianks force-pushed the ractor-safe-rmem branch from cc53acc to d1c9fce Compare June 10, 2026 20:43

ianks changed the title ~~Make the C extension Ractor-safe~~ Make the C extension Ractor-safe by removing msgpack_rmem_* slab allocator Jun 10, 2026

ianks marked this pull request as ready for review June 10, 2026 22:06

ianks force-pushed the ractor-safe-rmem branch 3 times, most recently from bb0785b to b82fefc Compare June 11, 2026 02:46

ianks force-pushed the ractor-safe-rmem branch from b82fefc to 335b893 Compare June 15, 2026 22:53

ianks force-pushed the ractor-safe-rmem branch from 335b893 to b0bc81c Compare June 15, 2026 22:55

ianks changed the title ~~Make the C extension Ractor-safe by removing msgpack_rmem_* slab allocator~~ Remove global rmem page slab Jun 15, 2026

ianks wants to merge 1 commit into msgpack:master from ianks:ractor-safe-rmem
