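Adaptive Locking: Spin Briefly on Low-Contention Short Critical Sections, Fall Back to a Sleeping Mutex Under High Contention or Long Critical Sections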

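In highly concurrent systems, choosing the right synchronization primitive is critical. Spinlocks and mutexes each have trade-offs: a spinlock busy-waits, which avoids context-switch overhead but burns CPU; a mutex sleeps, which saves CPU but adds syscall and scheduling latency. The core idea of an adaptive locking strategy is to spin briefly on low-contention paths with short critical sections to improve tail latency, and to fall back quickly to a sleeping mutex under high contention or long critical sections, balancing throughput against latency.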

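The strategy comes out of practical engineering observation. When perf top shows pthread_mutex_lock consuming 60% of CPU time, switching to a pure spinlock tends to saturate every core, while a pure mutex lets tail latency balloon when critical sections last only nanoseconds. A spinlock acquires the lock with an atomic CAS (LOCK CMPXCHG on x86) and no syscall; each failed attempt costs roughly 40-80 ns of cache-line bouncing. An uncontended mutex costs only about 25-50 ns, but a contended one pays for a futex (FUTEX_WAIT) syscall plus a 3-5 μs context switch. glibc offers this hybrid directly: a mutex of type PTHREAD_MUTEX_ADAPTIVE_NP first tries an atomic acquire, then spins for a bounded number of iterations (capped by the glibc.pthread.mutex_spin_count tunable, default 100), and only then sleeps in futex. The default mutex type does not spin, so the adaptive type has to be requested explicitly.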
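As a minimal sketch of asking glibc for that spin-then-sleep behavior, the following assumes a glibc-based Linux system where the non-portable PTHREAD_MUTEX_ADAPTIVE_NP mutex type is available. The helper name adaptive_mutex_init is hypothetical and only wraps standard pthread attribute calls.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: initialise *m as a glibc adaptive mutex.
 * Returns 0 on success or the error code reported by pthread. */
static int adaptive_mutex_init(pthread_mutex_t *m) {
    assert(m != NULL);
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    /* Non-portable glibc extension: spin briefly before sleeping in futex. */
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ADAPTIVE_NP);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

Callers then use the ordinary pthread_mutex_lock and pthread_mutex_unlock; only the initialisation differs from a default mutex.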

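Benchmarks support this view. With 4 threads contending for a 100 ns critical section, the spinlock delivers more ops/sec but keeps the CPU at 100%; the mutex delivers slightly fewer ops/sec but leaves the CPU idle for other work. Under heavy contention with 8 threads, the mutex beats the spinlock, because spinning burns power without making progress. PostgreSQL takes a hybrid approach: spinlocks protect nanosecond-scale lookups, while locks that put waiters to sleep, such as LWLocks, cover longer, millisecond-scale work like I/O. Redis reportedly uses a spinlock for its micro-queues. The Linux kernel mutex supports optimistic spinning (CONFIG_MUTEX_SPIN_ON_OWNER): a waiter spins as long as the lock owner is running on a CPU, combining the strengths of both primitives.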

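To put the strategy into practice, here are actionable parameters and a checklist. First, implement the adaptive spinlock as a C11 atomic_compare_exchange_weak loop, issue a pause instruction before each retry to reduce power draw, and back off exponentially so that waiters do not all retry at once (the thundering herd problem). Threshold parameters: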

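Spin iteration limit: 50-200, lower on machines with few cores and higher on machines with many (on a single core, set it to 1 and go straight to the mutex).
Pauses per iteration: 1-4; the Intel PAUSE instruction is a hint that optimizes spin-wait loops.
Fallback condition: the iteration limit is exceeded, or need_resched() is set (kernel code only; user space has no direct equivalent); then sleep in futex.
Lock alignment: __attribute__((aligned(64))) to avoid false sharing.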
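Two of these parameters can be illustrated in a few lines. The sketch below assumes 64-byte cache lines (common on x86-64, but not universal); padded_lock_t, next_backoff, and the cap passed to it are hypothetical names and values chosen for illustration. It uses the standard C11 alignas in place of the GCC attribute.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>

#define CACHE_LINE 64  /* assumption: 64-byte cache lines */

/* The lock word gets a cache line to itself, so unrelated data
 * placed next to it cannot cause false sharing. */
typedef struct {
    alignas(CACHE_LINE) atomic_int lock;
} padded_lock_t;

static_assert(sizeof(padded_lock_t) == CACHE_LINE, "lock fills one cache line");
static_assert(alignof(padded_lock_t) == CACHE_LINE, "lock starts on a line boundary");

/* Exponential backoff: the number of pause hints to issue before the
 * next attempt doubles after every failed attempt, up to cap.
 * Start from 1, since doubling 0 would never grow. */
static inline unsigned next_backoff(unsigned current, unsigned cap) {
    unsigned doubled = current * 2;
    return doubled > cap ? cap : doubled;
}
```

A spin loop would start with a backoff of 1, issue that many pause hints, retry the CAS, and call next_backoff on failure. The cap bounds how long a waiter can go without rechecking the lock.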

Code template (user space):

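#include <stdatomic.h>
#include <immintrin.h>  // _mm_pause (x86 only)

// lock: 0 = free, 1 = held, 2 = held with possible sleepers
typedef struct { atomic_int lock; } adaptive_lock_t;

void adaptive_lock(adaptive_lock_t *l) {
    int expected = 0;
    if (atomic_compare_exchange_strong(&l->lock, &expected, 1)) return;  // fast path
    for (int spins = 0; spins < 100; spins++) {  // bounded spin; tune the limit
        for (int p = 0; p < 4; p++) _mm_pause();  // PAUSE hint, not POSIX pause()
        expected = 0;  // a failed CAS overwrites expected, so reset it
        if (atomic_compare_exchange_weak(&l->lock, &expected, 1)) return;
    }
    // fallback: mark the lock contended, then sleep until it is released
    while (atomic_exchange(&l->lock, 2) != 0)
        futex_wait(&l->lock, 2);  // pseudocode for the FUTEX_WAIT syscall
}

void adaptive_unlock(adaptive_lock_t *l) {
    if (atomic_exchange(&l->lock, 0) == 2)
        futex_wake(&l->lock, 1);  // pseudocode for FUTEX_WAKE, wakes one sleeper
}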
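The futex call in the template is pseudocode. One way to make the idea self-contained on Linux is sketched below, using the raw futex syscall and a three-state lock word (0 = free, 1 = held, 2 = held with possible sleepers) so that unlock knows when a wake-up is needed. This is an illustrative sketch, not a production lock: the names futex_call, cpu_relax, SPIN_LIMIT, and PAUSES_PER_SPIN are hypothetical, and it assumes that an _Atomic uint32_t has the same representation as a plain uint32_t, which holds for GCC and Clang on common targets.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* state: 0 = free, 1 = held, 2 = held and a waiter may be asleep */
typedef struct { _Atomic uint32_t state; } adaptive_lock_t;

enum { SPIN_LIMIT = 100, PAUSES_PER_SPIN = 4 };  /* starting values, tune by benchmark */

static inline void cpu_relax(void) {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause();        /* x86 PAUSE hint */
#elif defined(__aarch64__)
    __asm__ volatile("yield");     /* closest AArch64 hint */
#endif
}

/* Thin wrapper over the raw syscall; glibc exports no futex() function. */
static long futex_call(_Atomic uint32_t *addr, int op, uint32_t val) {
    return syscall(SYS_futex, (void *)addr, op, val, NULL, NULL, 0);
}

void adaptive_lock(adaptive_lock_t *l) {
    uint32_t expected = 0;
    /* Fast path: uncontended acquire with a single CAS. */
    if (atomic_compare_exchange_strong_explicit(&l->state, &expected, 1,
            memory_order_acquire, memory_order_relaxed))
        return;
    /* Bounded spin phase: read first, attempt the CAS only when free. */
    for (int spins = 0; spins < SPIN_LIMIT; spins++) {
        for (int p = 0; p < PAUSES_PER_SPIN; p++) cpu_relax();
        expected = 0;
        if (atomic_load_explicit(&l->state, memory_order_relaxed) == 0 &&
            atomic_compare_exchange_weak_explicit(&l->state, &expected, 1,
                memory_order_acquire, memory_order_relaxed))
            return;
    }
    /* Slow path: mark the lock contended and sleep while it stays 2.
     * If the exchange returns 0, we took the lock (left in state 2). */
    while (atomic_exchange_explicit(&l->state, 2, memory_order_acquire) != 0)
        futex_call(&l->state, FUTEX_WAIT_PRIVATE, 2);
}

void adaptive_unlock(adaptive_lock_t *l) {
    uint32_t prev = atomic_exchange_explicit(&l->state, 0, memory_order_release);
    assert(prev != 0);  /* unlocking a free lock is a caller bug */
    if (prev == 2)      /* someone may be asleep: wake one waiter */
        futex_call(&l->state, FUTEX_WAKE_PRIVATE, 1);
}
```

A thread that acquires the lock through the slow path leaves the state at 2, so its unlock issues one possibly unnecessary wake. That is the price of never losing a wake-up with a single lock word.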

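Single-core detection: if sysconf(_SC_NPROCESSORS_ONLN) == 1, skip spinning and use a plain pthread mutex (PTHREAD_MUTEX_INITIALIZER), since on one core the lock holder cannot make progress while a waiter spins.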
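The core-count check can be folded into the choice of spin limit. The sketch below is one hypothetical policy, not a recommendation from the sources: the linear ramp and the function names pick_spin_limit and default_spin_limit are assumptions, chosen only to stay inside the 50-200 range given above and to return 1 on a single core.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical policy: 1 on a single core (effectively no spinning),
 * otherwise 50 for 2 cores, growing by 10 per extra core, capped at 200. */
static int pick_spin_limit(long ncpu) {
    if (ncpu <= 1) return 1;   /* also covers sysconf failure (-1) */
    long limit = 50 + 10 * (ncpu - 2);
    if (limit > 200) limit = 200;
    assert(limit >= 50);
    return (int)limit;
}

/* Query the number of online processors and derive a starting limit. */
static int default_spin_limit(void) {
    return pick_spin_limit(sysconf(_SC_NPROCESSORS_ONLN));
}
```

The result is only a starting point; the benchmark-driven tuning described in the checklist should decide the final value.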

Monitoring and diagnostics checklist:

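perf stat -e context-switches,cache-misses,cycles: many context switches with low CPU usage points to mutex overhead, so try spinning; many cache misses with the CPU at 100% points to spin bouncing, so shard the lock or switch to a mutex.
strace -c: a futex call count that works out to more than 10^6 per second indicates a hot lock; consider a lock-free design or sharding.
/proc/PID/status: a high voluntary_ctxt_switches count means the mutex is sleeping as expected; a high nonvoluntary_ctxt_switches count means threads are being preempted while spinning.
Tail latency P99: if it exceeds 1 ms and is attributable to the mutex, raise the spin limit to 150.
Threshold tuning: benchmark with hold_time = 50 ns, 1 μs, and 10 μs, measure throughput and tail latency, and pick the best spin_limit.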
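The context-switch counters in this checklist can also be sampled from inside the process, which is convenient in a benchmark harness. The sketch below uses getrusage, whose ru_nvcsw and ru_nivcsw fields report voluntary and involuntary context switches; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Snapshot of the calling process's context-switch counters. */
typedef struct {
    long voluntary;    /* thread blocked, e.g. sleeping in futex */
    long involuntary;  /* thread was preempted, e.g. while spinning */
} ctx_switches_t;

/* Fill *out from getrusage; returns 0 on success, -1 on failure. */
static int read_ctx_switches(ctx_switches_t *out) {
    assert(out != NULL);
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0) return -1;
    out->voluntary = ru.ru_nvcsw;
    out->involuntary = ru.ru_nivcsw;
    return 0;
}
```

Taking one snapshot before and one after a benchmark run and subtracting gives the per-run counts to compare across spin_limit values.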

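Risks and rollback: a user-space spinner can be preempted at any time, and a long spin (>10 ms) is disastrous. To deal with priority inversion, use a priority-inheritance (PI) mutex, configured with pthread_mutexattr_setprotocol and PTHREAD_PRIO_INHERIT. Deploy with a default of spin=64, and roll back to a pure mutex if CPU usage exceeds 80%. jemalloc is an example: its mutexes use PTHREAD_MUTEX_ADAPTIVE_NP where available, and its spin loops use a spin_t iteration counter with exponential backoff.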
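A priority-inheritance mutex is set up through the mutex attribute's protocol. The sketch below assumes a POSIX system that supports PTHREAD_PRIO_INHERIT (Linux with glibc does); the helper name pi_mutex_init is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: initialise *m as a priority-inheritance mutex.
 * Returns 0 on success or the error code reported by pthread. */
static int pi_mutex_init(pthread_mutex_t *m) {
    assert(m != NULL);
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    /* While a higher-priority thread waits, the holder inherits its priority. */
    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

Priority inheritance only matters when threads run at different real-time priorities, so it is worth enabling only for locks shared across such threads.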

Finally, sources: the How Tech article "Spinlocks vs. Mutexes" (benchmark code and heuristics); the Linux man pages pthread_spin_lock(3) and futex(2); and the mutex and spin implementations in the jemalloc source.