bench: lock-manager probe + deadlock-detect knob + sizing fix #27

Merged
gburd merged 3 commits into master from perf/lock-scaling
Jun 20, 2026

@gburd (Collaborator) commented Jun 20, 2026

bench: lock-manager probe, deadlock-detection knob, and sizing fix

Benchmark tooling and a fix that came out of the lock-manager scaling
investigation. No engine change — the engine win it found is in #28.

  • fix: the TPROC drivers opened environments with default lock-region
    sizing (~1000 locks) and a tiny log buffer, so a batched bulk load or
    many-thread run failed mid-run with ENOMEM. The drivers now size the lock
    subsystem (200k locks/objects/lockers) and the log buffer (16MB) when
    those subsystems are enabled.
  • lock_bench: a direct lock-manager probe — each thread allocates its
    own locker and loops lock_get/lock_put on per-thread (no-conflict) or
    shared read objects, bypassing the access methods and buffer pool so the
    lock subsystem's own scaling is measured in isolation. This is what exposed
    the global-locker-mutex bottleneck that btree-bound workloads hide.
  • -D N knob: A/B-test BDB's detect-on-every-conflict default (N = 0)
    against a background detector that runs every N ms (N > 0).

Verified: clean build; the TPROC drivers populate and run at scale 5 and 50;
the probe runs across thread counts.

gburd added 3 commits June 19, 2026 21:09
The TPROC drivers opened the environment with default lock-region sizing
(~1000 locks/objects/lockers) and a tiny default log buffer.  A batched
bulk load or a many-thread run exhausts those entries and fails mid-run
with ENOMEM (BDB2055 'Lock table is out of available lock entries',
BDB1501 'Logging region out of memory'), and an unchecked failure during
populate could leave a partially built environment that crashes on reuse.

Size the lock subsystem (200k locks/objects/lockers) and the log buffer
(16MB) when the corresponding subsystems are enabled.  Verified populate +
run at scale 5 and 50.
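
A minimal sketch of the conditional sizing described above, not the drivers' actual code. The `SUBSYS_*` flags, `struct env_sizing`, and the function names are hypothetical; the figures (200k entries, 16MB) come from the commit message. The `USE_BDB` section assumes the stock Berkeley DB C API setters, which must be called before `DB_ENV->open`.

```c
#include <stdint.h>

/* Hypothetical flags for this sketch, standing in for DB_INIT_LOCK / DB_INIT_LOG. */
enum { SUBSYS_LOCK = 1, SUBSYS_LOG = 2 };

/* A zero field means "leave the library default alone". */
struct env_sizing {
    uint32_t max_locks;
    uint32_t max_objects;
    uint32_t max_lockers;
    uint32_t log_buf_bytes;
};

/* Pick sizes only for the subsystems the driver actually enables. */
struct env_sizing plan_env_sizing(unsigned subsystems)
{
    struct env_sizing s = {0, 0, 0, 0};

    if (subsystems & SUBSYS_LOCK) {
        s.max_locks = 200000;
        s.max_objects = 200000;
        s.max_lockers = 200000;
    }
    if (subsystems & SUBSYS_LOG)
        s.log_buf_bytes = 16u * 1024u * 1024u;
    return s;
}

#ifdef USE_BDB
#include <db.h>

/* Apply the plan to an environment handle that has not been opened yet. */
int apply_env_sizing(DB_ENV *env, const struct env_sizing *s)
{
    int ret;

    if (s->max_locks != 0) {
        if ((ret = env->set_lk_max_locks(env, s->max_locks)) != 0)
            return ret;
        if ((ret = env->set_lk_max_objects(env, s->max_objects)) != 0)
            return ret;
        if ((ret = env->set_lk_max_lockers(env, s->max_lockers)) != 0)
            return ret;
    }
    if (s->log_buf_bytes != 0 &&
        (ret = env->set_lg_bsize(env, s->log_buf_bytes)) != 0)
        return ret;
    return 0;
}
#endif
```

Checking each setter's return value before opening the environment is one way to avoid the partially built environment mentioned above.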

Add a -D N toggle to the shared harness: 0 (default) keeps BDB's
detect-on-every-conflict behavior; N>0 disables inline detection and runs a
background deadlock detector every N ms instead.  Lets a run A/B the cost of
synchronous vs periodic deadlock detection.

Measurement tooling only; no engine change.  (A/B on a 12-core box found the
two modes within noise on the contended debit/credit workload.)
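
A rough sketch of how a `-D N` value could map onto the two detection modes; the harness's real code may differ. `struct detect_cfg` and the function names are hypothetical. The `USE_BDB` section assumes the stock Berkeley DB calls `DB_ENV->set_lk_detect` (inline detection on every conflict) and `DB_ENV->lock_detect` (one explicit detector pass), and assumes that inline detection is off unless `set_lk_detect` is called.

```c
/* Hypothetical representation of the -D N knob for this sketch. */
struct detect_cfg {
    int inline_detect;   /* 1: detect on every conflict (N == 0) */
    unsigned period_ms;  /* >0: background detector interval in ms */
};

/* Map the knob value to a mode; negative values are treated like 0. */
struct detect_cfg detect_cfg_from_knob(long n)
{
    struct detect_cfg c;

    c.inline_detect = (n <= 0);
    c.period_ms = (n > 0) ? (unsigned)n : 0u;
    return c;
}

#ifdef USE_BDB
#include <db.h>

/* Call before DB_ENV->open: enable inline detection only in the N == 0 mode. */
int apply_detect_cfg(DB_ENV *env, const struct detect_cfg *c)
{
    if (c->inline_detect)
        return env->set_lk_detect(env, DB_LOCK_DEFAULT);
    return 0; /* periodic mode: leave inline detection off */
}

/*
 * One detector pass. In periodic mode a background thread would call this,
 * sleep for period_ms, and repeat until the run ends.
 */
int detector_tick(DB_ENV *env, int *rejected)
{
    return env->lock_detect(env, 0, DB_LOCK_DEFAULT, rejected);
}
#endif
```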

A micro-benchmark that exercises the lock subsystem in isolation: each
thread allocates its own locker and loops lock_get/lock_put on either
per-thread (distinct, no-conflict) or shared read objects, with no access
method or buffer pool in the path.  This exposes lock-manager scaling that
btree-bound workloads (e.g. scale_bench rrand) hide behind page cache
misses.  On a 24-thread box it shows the per-op global locker mutex
plateauing throughput at ~8 threads.
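
The per-thread loop might look roughly like the sketch below; it is not lock_bench's source. `enum probe_mode`, `probe_object_id`, and `probe_loop` are hypothetical names, and using read mode for both object layouts is an assumption. The `USE_BDB` section uses the stock Berkeley DB calls `lock_id`, `lock_get`, `lock_put`, and `lock_id_free`.

```c
#include <stdint.h>

/* Hypothetical modes for this sketch: one object per thread, or one for all. */
enum probe_mode { PROBE_DISTINCT, PROBE_SHARED };

/*
 * Choose the object a thread locks. Shared mode always uses id 0; distinct
 * mode uses thread_idx + 1 so it never collides with the shared id.
 */
uint32_t probe_object_id(enum probe_mode mode, uint32_t thread_idx)
{
    return (mode == PROBE_SHARED) ? 0u : thread_idx + 1u;
}

#ifdef USE_BDB
#include <string.h>
#include <db.h>

/* Body of one benchmark thread: its own locker, then a tight get/put loop. */
int probe_loop(DB_ENV *env, enum probe_mode mode, uint32_t thread_idx,
               uint64_t iters)
{
    uint32_t obj_id = probe_object_id(mode, thread_idx);
    u_int32_t locker;
    DB_LOCK lock;
    DBT obj;
    uint64_t i;
    int ret;

    if ((ret = env->lock_id(env, &locker)) != 0)
        return ret;

    memset(&obj, 0, sizeof(obj));
    obj.data = &obj_id;
    obj.size = sizeof(obj_id);

    /* No access method or buffer pool is involved, only the lock manager. */
    for (i = 0; i < iters && ret == 0; i++) {
        ret = env->lock_get(env, locker, 0, &obj, DB_LOCK_READ, &lock);
        if (ret == 0)
            ret = env->lock_put(env, &lock);
    }

    (void)env->lock_id_free(env, locker);
    return ret;
}
#endif
```

Because read locks are compatible with each other, neither layout makes a thread wait on another thread's lock, so any throughput plateau reflects synchronization inside the lock manager rather than lock conflicts.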
@gburd changed the title from "bench: lock-subsystem sizing fix + deadlock-detection knob" to "perf(lock): shared locker latch on lock-get hot path (+ bench fixes/tooling)" Jun 20, 2026
@gburd force-pushed the perf/lock-scaling branch from b3190af to ec259f1 June 20, 2026 19:15
@gburd changed the title from "perf(lock): shared locker latch on lock-get hot path (+ bench fixes/tooling)" to "bench: lock-manager probe + deadlock-detect knob + sizing fix" Jun 20, 2026
@gburd merged commit c3a36a2 into master Jun 20, 2026
36 of 39 checks passed
@gburd deleted the perf/lock-scaling branch June 20, 2026 19:24