This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

[RFC] [PATCH 0/2] nptl: Support TP futexes in pthread mutex & rwlock

From: Waiman Long <longman at redhat dot com>

To: GLIBC Devel <libc-alpha at sourceware dot org>

Cc: Torvald Riegel <triegel at redhat dot com>, Carlos O'Donell <codonell at redhat dot com>, Thomas Gleixner <tglx at linutronix dot de>, Waiman Long <longman at redhat dot com>

Date: Wed, 25 Oct 2017 10:10:02 -0400

Throughput-Optimized (TP) futex is a new futex type that is being
proposed to be merged into the Linux kernel:

  v6: https://lkml.org/lkml/2017/3/22/654

This patchset enhances the pthread_mutex and pthread_rwlock APIs to
support the new TP futexes when they are available in the running
kernel.

For mutex, a user can designate the use of TP futexes by using

  pthread_mutexattr_setprotocol(attr, PTHREAD_THROUGHPUT_NP);

For rwlock, a user can designate the use of TP futexes by using

  pthread_rwlockattr_setkind_np(attr, PTHREAD_RWLOCK_USE_TP_FUTEX_NP);

A locking microbenchmark was run on a 2-socket 40-core Intel Gold 6148
system (HT on) for 5s with a 4.14 based Linux kernel (with TP futex
patch). A modified glibc (v2.26) with TP futex support was used for
measuring locking performance with standard mutex and rwlock versus
the TP futex versions.

For mutex, the total locking ops (in 5s) with various numbers of
locking threads and critical section loads were as follows:

  threads/Load  pthread-mutex  pthread-tp-mutex  % change
  ------------  -------------  ----------------  --------
     10/10       41,758,248       60,517,142       44.9%
     20/10       53,331,622       60,349,448       13.2%
     40/10       47,683,927       52,437,558       10.0%
     80/10       30,632,089       56,019,758       82.9%
     10/50       27,155,380       42,259,331       55.6%
     20/50       34,292,402       36,979,265        7.8%
     40/50       30,510,507       37,317,158       22.3%
     80/50       17,800,497       41,251,263      131.7%

The TP version of the mutex completed more locking ops than the
standard mutex in every configuration tested, with gains ranging
from 7.8% to 131.7%.
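For readers who want to reproduce this kind of measurement, a locking
loop of the general shape described above might look like the sketch
below. It is not the benchmark that produced the numbers in this mail:
the names (struct bench, run_mutex_bench) are hypothetical, it uses a
default pthread mutex, and treating "load" as a count of busy-loop
iterations inside the critical section is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical harness; names and structure are illustrative only. */
struct bench {
    pthread_mutex_t lock;
    atomic_int stop;        /* set to 1 when the measurement window ends */
    int load;               /* assumed: busy-loop iterations under the lock */
    unsigned long shared;   /* counter protected by the lock */
};

struct worker {
    struct bench *b;
    unsigned long ops;      /* lock/unlock pairs completed by this thread */
};

static void *worker_main(void *arg)
{
    struct worker *w = arg;
    struct bench *b = w->b;

    while (!atomic_load(&b->stop)) {
        pthread_mutex_lock(&b->lock);
        for (volatile int i = 0; i < b->load; i++)
            ;                       /* simulated critical-section work */
        b->shared++;
        pthread_mutex_unlock(&b->lock);
        w->ops++;
    }
    return NULL;
}

/* Run nthreads lockers for run_ms milliseconds and return the total number
   of locking ops completed, or 0 if setup fails. */
static unsigned long run_mutex_bench(int nthreads, int load, long run_ms)
{
    struct bench b = { .load = load, .shared = 0 };
    struct worker *w = calloc((size_t)nthreads, sizeof *w);
    pthread_t *tid = calloc((size_t)nthreads, sizeof *tid);
    struct timespec ts = { .tv_sec = run_ms / 1000,
                           .tv_nsec = (run_ms % 1000) * 1000000L };
    unsigned long total = 0;
    int started = 0;

    if (w == NULL || tid == NULL || pthread_mutex_init(&b.lock, NULL) != 0) {
        free(w);
        free(tid);
        return 0;
    }
    atomic_init(&b.stop, 0);

    /* Start as many workers as possible; stop early if creation fails. */
    for (; started < nthreads; started++) {
        w[started].b = &b;
        if (pthread_create(&tid[started], NULL, worker_main, &w[started]) != 0)
            break;
    }

    nanosleep(&ts, NULL);           /* measurement window */
    atomic_store(&b.stop, 1);

    for (int i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        total += w[i].ops;
    }
    assert(total == b.shared);      /* every op incremented the counter once */

    pthread_mutex_destroy(&b.lock);
    free(w);
    free(tid);
    return total;
}
```

Under these assumptions, run_mutex_bench(10, 10, 5000) would use the
same thread count, load value and duration as the 10/10 row, although
the absolute numbers would not be comparable with the table.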
For rwlock, with mixed locking threads doing an equal number of read
and write locks, the total locking ops (in 5s) with various numbers of
locking threads and a fixed critical section load of 10 were as
follows:

  threads  pthread-rwlock  pthread-rwlock  pthread-tp-rwlock
             (prefer-R)      (prefer-W)
  -------  --------------  --------------  -----------------
    10       25,365,336         994,356        59,150,680
    20        8,055,288       1,012,284        59,891,268
    40       17,750,188       1,269,368        56,844,898
    80       12,267,932       1,667,678        57,855,304

With separate reader and writer locking threads, the total locking ops
(in 5s) were as follows:

  threads     pthread-rwlock  pthread-rwlock  pthread-tp-rwlock
                (prefer-R)      (prefer-W)
  ----------  --------------  --------------  -----------------
   5 readers    27,450,747         100,971           668,412
   5 writers       168,557      12,456,028        32,128,358
  10 readers    34,341,123           4,151         3,428,028
  10 writers        15,880      15,030,632        30,132,666
  20 readers    34,935,993              20         7,707,514
  20 writers            26      18,557,431         8,983,738
  40 readers    25,636,457              41         6,500,411
  40 writers            40      22,634,557         5,584,316

Lock starvation happened for the standard glibc rwlock when the number
of contending threads was 40 or more. The TP futex version of the
rwlock, however, was fairer and hence suffered a bit performance-wise
when the number of contending threads was large.
Waiman Long (2):
  nptl: Enable pthread mutex to use the TP futex
  nptl: Enable pthread rwlock to use the TP futex

 ChangeLog                                    |  26 +++
 nptl/pthreadP.h                              |  18 ++
 nptl/pthread_mutex_init.c                    |  27 +++
 nptl/pthread_mutex_lock.c                    |  49 +++++-
 nptl/pthread_mutex_timedlock.c               |  52 +++++-
 nptl/pthread_mutex_trylock.c                 |  20 ++-
 nptl/pthread_mutex_unlock.c                  |  20 ++-
 nptl/pthread_mutexattr_setprotocol.c         |   1 +
 nptl/pthread_rwlock_rdlock.c                 |   5 +-
 nptl/pthread_rwlock_timedrdlock.c            |   5 +-
 nptl/pthread_rwlock_timedwrlock.c            |   5 +-
 nptl/pthread_rwlock_tp.c                     | 235 +++++++++++++++++++++++++++
 nptl/pthread_rwlock_tryrdlock.c              |   5 +
 nptl/pthread_rwlock_trywrlock.c              |   5 +
 nptl/pthread_rwlock_unlock.c                 |  14 ++
 nptl/pthread_rwlock_wrlock.c                 |   5 +-
 nptl/pthread_rwlockattr_setkind_np.c         |  21 ++-
 sysdeps/nptl/pthread.h                       |   4 +
 sysdeps/unix/sysv/linux/hppa/pthread.h       |   3 +-
 sysdeps/unix/sysv/linux/lowlevellock-futex.h |   9 +
 20 files changed, 502 insertions(+), 27 deletions(-)
 create mode 100644 nptl/pthread_rwlock_tp.c

--
1.8.3.1