bench: lock-manager probe + deadlock-detect knob + sizing fix #27
Merged
The TPROC drivers opened the environment with default lock-region sizing (~1000 locks/objects/lockers) and a tiny default log buffer. A batched bulk load or a many-thread run exhausts those entries and fails mid-run with ENOMEM (BDB2055 'Lock table is out of available lock entries', BDB1501 'Logging region out of memory'), and an unchecked failure during populate could leave a partially built environment that crashes on reuse. Size the lock subsystem (200k locks/objects/lockers) and the log buffer (16MB) when the corresponding subsystems are enabled. Verified populate + run at scale 5 and 50.
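A minimal sketch of the sizing rule described above, written in Python for brevity rather than taken from the drivers. The function name and dictionary keys are hypothetical; only the values (200k lock entries, 16MB log buffer) come from the commit message. In the Berkeley DB C API the corresponding settings are `DB_ENV->set_lk_max_locks`, `DB_ENV->set_lk_max_objects`, `DB_ENV->set_lk_max_lockers`, and `DB_ENV->set_lg_bsize`, which are configured before `DB_ENV->open`.

```python
LOCK_ENTRIES = 200_000                 # locks, objects, and lockers (from the commit message)
LOG_BUFFER_BYTES = 16 * 1024 * 1024    # 16MB log buffer (from the commit message)


def region_sizing(locking_enabled: bool, logging_enabled: bool) -> dict:
    """Return the sizing overrides to apply before opening the environment.

    Hypothetical helper: a subsystem that is not enabled gets no overrides,
    mirroring "when the corresponding subsystems are enabled".
    """
    sizing = {}
    if locking_enabled:
        sizing["max_locks"] = LOCK_ENTRIES
        sizing["max_objects"] = LOCK_ENTRIES
        sizing["max_lockers"] = LOCK_ENTRIES
    if logging_enabled:
        sizing["log_buffer_bytes"] = LOG_BUFFER_BYTES
    return sizing


if __name__ == "__main__":
    print(region_sizing(locking_enabled=True, logging_enabled=True))
```

Gating each override on its subsystem keeps a driver that opens the environment without locking or logging from touching settings it does not use.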
Add a -D N toggle to the shared harness: 0 (default) keeps BDB's detect-on-every-conflict behavior; N>0 disables inline detection and runs a background deadlock detector every N ms instead. Lets a run A/B the cost of synchronous vs periodic deadlock detection. Measurement tooling only; no engine change. (A/B on a 12-core box found the two modes within noise on the contended debit/credit workload.)
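The toggle's control flow can be sketched as follows. This is an illustration in Python under stated assumptions, not the harness code: `parse_detect_interval`, `PeriodicDetector`, and `configure_detection` are hypothetical names, and the `detect` callback is a stub standing in for whatever the harness actually calls. In the Berkeley DB C API, inline detection is enabled with `DB_ENV->set_lk_detect` and a single explicit pass is run with `DB_ENV->lock_detect`.

```python
import argparse
import threading


def parse_detect_interval(argv):
    """Parse -D N (milliseconds); 0 means inline detection on every conflict."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-D", dest="detect_ms", type=int, default=0)
    args = parser.parse_args(argv)
    if args.detect_ms < 0:
        parser.error("-D expects a non-negative interval in ms")
    return args.detect_ms


class PeriodicDetector:
    """Calls `detect` every `interval_ms` on a background thread until stopped."""

    def __init__(self, interval_ms, detect):
        self._interval = interval_ms / 1000.0
        self._detect = detect
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self._interval):
            self._detect()

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()


def configure_detection(detect_ms, detect):
    """Return ("inline", None) for 0, else ("periodic", running detector)."""
    if detect_ms == 0:
        return "inline", None
    detector = PeriodicDetector(detect_ms, detect)
    detector.start()
    return "periodic", detector
```

A run would call `configure_detection(parse_detect_interval(sys.argv[1:]), detect)` once at startup and call `stop()` on the returned detector, if any, before closing the environment.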
A micro-benchmark that exercises the lock subsystem in isolation: each thread allocates its own locker and loops lock_get/lock_put on either per-thread (distinct, no-conflict) or shared read objects, with no access method or buffer pool in the path. This exposes lock-manager scaling that btree-bound workloads (e.g. scale_bench rrand) hide behind page cache misses. On a 24-thread box it shows the per-op global locker mutex plateauing throughput at ~8 threads.
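The shape of such a probe can be sketched with a toy, in-memory lock manager. Everything here is an assumption for illustration: `ToyLockManager` and `run_probe` are hypothetical and are not Berkeley DB or the lock_bench source. The toy takes one global mutex on every operation to mimic a per-op global locker mutex, and models only shared (read) locks, so no request ever blocks on a conflict.

```python
import threading
import time


class ToyLockManager:
    """In-memory stand-in for a lock manager; NOT Berkeley DB."""

    def __init__(self):
        self._mutex = threading.Lock()   # single global mutex taken on every op
        self._next_locker = 0
        self._holders = {}               # object name -> read locks currently held

    def lock_id(self):
        with self._mutex:
            self._next_locker += 1
            return self._next_locker

    def lock_get(self, locker, obj):
        with self._mutex:
            self._holders[obj] = self._holders.get(obj, 0) + 1
        return (locker, obj)

    def lock_put(self, handle):
        _, obj = handle
        with self._mutex:
            self._holders[obj] -= 1
            if self._holders[obj] == 0:
                del self._holders[obj]

    def held(self):
        with self._mutex:
            return sum(self._holders.values())


def run_probe(manager, threads, iters, shared):
    """Run `threads` workers, each doing `iters` get/put pairs.

    Returns (total_pairs, elapsed_seconds).
    """
    def worker(index):
        locker = manager.lock_id()                    # one locker per thread
        obj = "shared" if shared else f"obj-{index}"  # shared vs per-thread object
        for _ in range(iters):
            manager.lock_put(manager.lock_get(locker, obj))

    pool = [threading.Thread(target=worker, args=(i,)) for i in range(threads)]
    start = time.perf_counter()
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return threads * iters, time.perf_counter() - start


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        pairs, secs = run_probe(ToyLockManager(), n, 5_000, shared=False)
        print(f"{n} threads: {pairs / secs:,.0f} get/put pairs per second")
```

CPython's global interpreter lock serializes these threads, so the printed rates say nothing about Berkeley DB itself. The sketch only shows the probe's structure: one locker per thread, a tight get/put loop, and a switch between per-thread and shared objects.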
bench: lock-manager probe, deadlock-detection knob, and sizing fix
Benchmark tooling and a fix that came out of the lock-manager scaling
investigation. No engine change — the engine win it found is in #28.
The TPROC drivers opened the environment with default lock-region
sizing (~1000 locks) and a tiny log buffer, so a batched bulk load or
many-thread run fails mid-run with ENOMEM. Size the lock subsystem and log
buffer when those subsystems are enabled.
lock_bench: a direct lock-manager probe. Each thread allocates its own
locker and loops lock_get/lock_put on per-thread (no-conflict) or shared
read objects, bypassing the access methods and buffer pool so the
lock subsystem's own scaling is measured in isolation. This is what exposed
the global-locker-mutex bottleneck that btree-bound workloads hide.
-D N knob: A/B BDB's detect-on-every-conflict default against a
background detector every N ms.
Verified: clean build; the TPROC drivers populate and run at scale; the probe
runs across thread counts.