perf(lock): shared locker latch on lock-get hot path (2.7x at 24t) by gburd · Pull Request #28 · berkeleydb/libdb

gburd · 2026-06-20T19:15:41Z

perf(lock): take the locker mutex shared on the lock-get hot path

Every DB_ENV->lock_get/lock_put resolves its locker through
__lock_getlocker_int under the region-global locker mutex mtx_lockers.
On the lock-get path that lookup is create=0 — a read-only walk of a
locker hash bucket — yet it was held exclusive, serializing every lock
acquisition across all cores even with objects fully partitioned (240-way)
and zero lock conflict.

Fix

Make mtx_lockers a DB_MUTEX_SHARED latch and take it shared for the
read-only locker lookup on the hot path (__lock_get_api). Locker
create/free, the deadlock detector's locker-list walk, failchk, and stat keep
it exclusive, so a reader never runs concurrently with a writer.

Measured (`lab/bench/lock_bench` distinct, no conflict, 24-thread box)

| threads | master (ops/s) | this branch (ops/s) | upper bound* (ops/s) |
|---|---|---|---|
| 1 | 1.38M | 1.27M | 1.87M |
| 8 | 3.03M | 6.36M (2.1×) | 10.1M |
| 24 | 2.60M | 7.01M (2.7×) | 16.1M |

Master plateaus and declines past 8 threads; the shared latch scales to
24 threads. *Upper bound = removing the mutex entirely (unsafe, diagnostic
only); the shared latch captures roughly half of it, and the rest needs
partitioning the locker hash (deferred as more invasive). Single-thread
throughput drops ~8% (1.38M to 1.27M, because a shared latch costs more than
a plain mutex when uncontended), which is dwarfed by the multi-core gain.
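The ratios quoted above can be rechecked from the table with a few lines of arithmetic. The sketch below only recomputes figures already in the table; the `row` struct and the helper names are hypothetical, and treating "this branch / upper bound" as the captured fraction is one possible reading of "roughly half", not necessarily the author's definition.

```c
#include <stddef.h>
#include <stdio.h>

/* Throughput in millions of ops/s, copied from the table in the description. */
struct row {
    int threads;
    double master, branch, bound;
};

static const struct row rows[] = {
    {  1, 1.38, 1.27, 1.87 },
    {  8, 3.03, 6.36, 10.1 },
    { 24, 2.60, 7.01, 16.1 },
};

/* Speedup of the shared-latch branch over master. */
static double speedup(const struct row *r)
{
    return r->branch / r->master;
}

/* Fraction of the no-mutex upper bound that the branch reaches. */
static double bound_fraction(const struct row *r)
{
    return r->branch / r->bound;
}

/* Percentage change in throughput relative to master (negative = slower). */
static double pct_change(const struct row *r)
{
    return 100.0 * (r->branch - r->master) / r->master;
}

/* Print one line per thread count. */
static void print_report(void)
{
    for (size_t i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
        printf("%2d threads: speedup %.2fx, change %+.1f%%, %.0f%% of bound\n",
            rows[i].threads, speedup(&rows[i]), pct_change(&rows[i]),
            100.0 * bound_fraction(&rows[i]));
}
```

With these inputs the speedups come out at about 2.10 and 2.70 for 8 and 24 threads, the single-thread change at about -8%, and the branch reaches about 63% of the bound at 8 threads and about 44% at 24.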

No regression on real workloads: rrand unchanged (btree-bound), tproc_b
flat (deadlock/disk-bound) — helps where the bottleneck is, costs nothing
elsewhere.

Verified

TCL lock001/002/003 (incl. the multi-process test), txn001/002,
test001, ssi001/002 pass; concurrent shared read-lock acquisition runs
clean; clean build (gcc via Nix, Apple clang).

The probe and benchmark fixes used to find this are in #27.

Every DB_ENV->lock_get / lock_put resolves its locker through __lock_getlocker_int under the region-global locker mutex (mtx_lockers). On the lock-get path the lookup is create=0 -- a read-only walk of the locker hash bucket -- yet it was held *exclusive*, serializing every lock acquisition across all cores even when objects are fully partitioned and there is no lock conflict.

Make mtx_lockers a DB_MUTEX_SHARED latch and take it in shared mode for the read-only locker lookup on the hot path (__lock_get_api). Locker create, free, the deadlock detector's locker-list walk, failchk, and stat continue to hold it exclusive, so they never run concurrently with a reader.

Measured with lab/bench/lock_bench (distinct mode, no lock conflict, on a 24-thread box): master plateaus and then declines past 8 threads (~3.0M ops/s peak, 2.6M at 24t); the shared latch scales to 7.0M at 24t -- 2.1x at 8 threads, 2.7x at 24. It captures roughly half the upper bound of removing the mutex entirely; the remainder is the shared latch's own reference-count cache line, which would require partitioning the locker hash to recover (left for later -- this is the low-risk 80/20). A small single-thread regression (~8%) reflects the shared latch's slightly higher uncontended cost and is dwarfed by the multi-core gain.

Verified: TCL lock001/002/003 (incl. multi-process), txn001/002, test001, ssi001/002 pass; concurrent shared read-lock acquisition (lock_bench shared) runs clean.

gburd mentioned this pull request Jun 20, 2026

bench: lock-manager probe + deadlock-detect knob + sizing fix #27

Merged

gburd merged commit 8f207cf into master Jun 20, 2026
36 of 39 checks passed

gburd deleted the perf/lock-shared-latch branch June 20, 2026 19:24

gburd merged 1 commit into master from perf/lock-shared-latch
