TSX lock elision for glibc v3

TSX lock elision for glibc v3

Andi Kleen-3
History:
v1: Initial post
v2: Remove IFUNC use.
v3: Nested trylock aborts by default now.
    Trylock enables elision on non-upgraded lock.
    Various bug fixes.
    New initializers, remove explicit new lock types in external interface.
    Add example of abort hook to manual.
    Fix bugs and clean up the configuration parser.
    Fix bug in lock busy handling.
    Fix tst-elision2 which was disabled by mistake earlier.
    I kept the hle.h compat header because removing it would have broken
    my own build system.

Lock elision using TSX is a technique to optimize lock scaling.
It allows existing locks to run in parallel using hardware memory
transactions. New instructions (RTM) are used to control
memory transactions.

The full series is available at
http://github.com/andikleen/glibc
git://github.com/andikleen/glibc rtm-devel4

An overview of lock elision is available at
http://halobates.de/adding-lock-elision-to-linux.pdf

See http://software.intel.com/file/41604 for the full
TSX specification. Running TSX requires either new hardware with TSX
support, or using the SDE emulator
http://software.intel.com/en-us/articles/intel-software-development-emulator/

This patchkit implements a simple adaptive lock elision algorithm based
on RTM. It enables elision for the pthread mutexes and rwlocks.
The algorithm keeps track of whether a mutex successfully elides,
and stops eliding for some time when it does not.
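
A minimal sketch of the adaptation idea, written as a plain C model rather
than the glibc code. The names model_lock, model_acquire, always_abort and
always_commit are hypothetical, and try_transaction stands in for the real
_xbegin() path so the model runs without TSX hardware. It assumes a single
skip count of 3, matching the default retry values in elision-conf.c, and it
leaves out the retry loop for retryable aborts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed skip count; the patch uses per-cause tunables defaulting to 3.  */
enum { SKIP_AFTER_ABORT = 3 };

struct model_lock
{
  short skip;   /* > 0: upcoming acquisitions that skip elision */
  int elided;   /* acquisitions that ran as a transaction */
  int locked;   /* acquisitions that fell back to the real lock */
};

/* Acquire L once.  TRY_TRANSACTION is a hypothetical stand-in for
   starting a hardware transaction and finding the lock free.  */
static void
model_acquire (struct model_lock *l, bool (*try_transaction) (void))
{
  assert (l != NULL && try_transaction != NULL);
  if (l->skip <= 0)
    {
      if (try_transaction ())
        {
          l->elided++;
          return;
        }
      /* Abort: stop eliding for the next few acquisitions.  */
      l->skip = SKIP_AFTER_ABORT;
    }
  else
    l->skip--;

  /* Fall back to the normal lock.  */
  l->locked++;
}

/* Hypothetical transaction outcomes used to drive the model.  */
static bool always_abort (void) { return false; }
static bool always_commit (void) { return true; }
```

After one abort the model takes the normal lock for the next three
acquisitions before it tries a transaction again.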

When the CPU supports RTM the elision path is automatically tried,
otherwise any elision is disabled.

The adaptation algorithm and its tuning are currently preliminary.

I cannot post performance numbers at this point.

The user can also tune this by setting the mutex type and environment
variables.

The lock transactions have an abort hook mechanism to hook into the abort
path. This is quite useful for some debugging, so I kept this
functionality.
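
As an illustration of what such a hook could do with its argument, here is a
small sketch that decodes an RTM abort status word. The bit layout is taken
from the hle.h header in patch 1/9; the function name describe_abort and the
output format are hypothetical, and how a hook gets registered is not shown
in this part of the series, so the sketch only covers the decoding step.

```c
#include <assert.h>
#include <stdio.h>

/* Bit layout as defined in the hle.h header of this series.  */
#define XABORT_EXPLICIT (1u << 0)
#define XABORT_RETRY    (1u << 1)
#define XABORT_CONFLICT (1u << 2)
#define XABORT_CAPACITY (1u << 3)
#define XABORT_CODE(x)  (((x) >> 24) & 0xff)

/* Write a one-line description of STATUS into BUF and return BUF.
   Hypothetical helper, not part of the patch.  */
static const char *
describe_abort (unsigned status, char *buf, size_t len)
{
  assert (buf != NULL && len > 0);
  snprintf (buf, len,
            "explicit=%d retry=%d conflict=%d capacity=%d code=0x%02x",
            !!(status & XABORT_EXPLICIT),
            !!(status & XABORT_RETRY),
            !!(status & XABORT_CONFLICT),
            !!(status & XABORT_CAPACITY),
            XABORT_CODE (status));
  return buf;
}
```

An explicit abort with code 0xff is what the lock path issues when it finds
the lock busy inside the transaction, so that combination is worth watching
for when debugging.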

The mutexes can be configured at runtime with the PTHREAD_MUTEX
environment variable.  This will force a specific lock type for all
mutexes in the program that do not have another type set explicitly.
This can be done without modifying the program.
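
The parser in elision-conf.c (patch 1/9) accepts the values adaptive,
elision and none, where the first two may be followed by ':' and a list of
name=value pairs such as retry_lock_busy=5. The following is a simplified,
self-contained sketch of parsing that name=value list. The struct tunables
and the function parse_tunables are hypothetical and cover only two of the
four tunables; this is not the parser from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical subset of the tunables in struct elision_config.  */
struct tunables
{
  int retry_lock_busy;
  int retry_try_xbegin;
};

/* Parse "name=value[,name=value...]" into T.
   Returns 0 on success, -1 on a syntax error or unknown name.  */
static int
parse_tunables (const char *s, struct tunables *t)
{
  assert (s != NULL && t != NULL);
  while (*s)
    {
      const char *eq = strchr (s, '=');
      size_t nlen;
      int *field;
      char *end;
      long val;

      if (eq == NULL)
        return -1;              /* missing '=' */
      nlen = (size_t) (eq - s);

      if (nlen == strlen ("retry_lock_busy")
          && !strncmp (s, "retry_lock_busy", nlen))
        field = &t->retry_lock_busy;
      else if (nlen == strlen ("retry_try_xbegin")
               && !strncmp (s, "retry_try_xbegin", nlen))
        field = &t->retry_try_xbegin;
      else
        return -1;              /* unknown tunable */

      val = strtol (eq + 1, &end, 0);
      if (end == eq + 1)
        return -1;              /* missing number */
      *field = (int) val;

      s = end;
      if (*s == ',' || *s == ':')
        s++;                    /* separator before the next pair */
      else if (*s)
        return -1;              /* garbage after number */
    }
  return 0;
}
```

With this shape, a setting like PTHREAD_MUTEX=adaptive:retry_lock_busy=5
would hand the text after the first ':' to the pair parser.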

Currently elision is enabled by default on systems that support RTM,
unless explicitly disabled either in the program or by the user.
Given more experience we can decide if that is a good idea, or if it
should be opt-in.

Limitations that may be fixable (but it's unclear if it's worth it):
-------------------------------------------------------------------
- Adaptive enabled mutexes don't track the owner, so pthread_mutex_destroy
  will not detect a busy mutex.
- Unlocking an unlocked mutex will currently result in a crash
  (see above).
- No elision support for recursive or error check mutexes.
  Recursive may be possible, error check is unlikely.
- Some obscure cases can also fall back to non-elision.
- Internal locks in glibc (like malloc or stdio) do not elide at this
  point.

Changing these semantics would be possible, but has some runtime cost. For now
I decided not to make any expensive changes, but to wait for more testing feedback.

To be fixed:
------------
- The default tuning parameters may be revised.


[PATCH 1/9] Add the low level infrastructure for pthreads lock elision with TSX

Andi Kleen-3
From: Andi Kleen <[hidden email]>

Lock elision using TSX is a technique to optimize lock scaling.
It allows locks to run in parallel using hardware support for
a transactional execution mode in upcoming Intel CPUs.
See http://software.intel.com/file/41604 for the full
specification.

This patch implements a simple adaptive lock elision algorithm based
on RTM. It enables elision for the pthread mutexes and rwlocks.
The algorithm keeps track of whether a mutex successfully elides,
and stops eliding for some time when it does not.

When the CPU supports RTM the elision path is automatically tried,
otherwise any elision is disabled.

The adaptation algorithm and its tuning are currently preliminary.

The code adds some checks to the lock fast paths. Micro-benchmarks
show little to no difference without RTM.

Lock elision can be enabled/disabled using environment variables.
It can also be enabled or disabled using new lock types for
mutexes and rwlocks. The adaptation parameters are also tunable.

This patch implements the low level "lll_" code for lock elision.
Follow-on patches hook this into the pthread implementation.

Changes with the RTM mutexes:
-----------------------------
Lock elision in pthreads is generally compatible with existing programs.
There are some obscure exceptions, which are expected to be uncommon.
See the manual for more details.

- A broken program that unlocks a free lock will crash.
  There are ways around this with some tradeoffs (more code in hot paths)
  This will also happen on systems without RTM with the patchkit.
  I'm still undecided on what approach to take here; have to wait for testing reports.
- pthread_mutex_destroy of a locked mutex will not return EBUSY but 0.
- A mutex appears free when elided.
  pthread_mutex_lock(mutex);
  if (pthread_mutex_trylock(mutex) != 0) do_something
  will not do something when the lock is elided.
  However, note that if the check is an assert it works as expected because the
  assert failure aborts and the region is re-executed non-transactionally,
  with the old behaviour.
  The same change applies to write locks for rwlocks.
  [This is now only done for mutexes that have elision explicitly enabled;
   standard mutexes abort in this situation.]
- There's also a similar situation with trylock outside the mutex,
  "knowing" that the mutex must be held due to some other condition.
  In this case an assert failure cannot be recovered. This situation is
  usually an existing bug in the program.
- The same applies to the rwlocks. Some of the return values change
  (for example there is no EDEADLK for an elided lock, unless it aborts.
   However, when elided it will also never deadlock, of course.)
- Timing changes, so broken programs that make assumptions about specific timing
  may expose already existing latent problems.  Note that these broken programs will
  break in other situations too (loaded system, new faster hardware, compiler
  optimizations etc.)
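
The nested trylock case from the list above can be probed with a small test
helper. This is a sketch using only standard pthread calls; the function
name nested_trylock_result is hypothetical. On a default, non-elided mutex
the inner trylock returns EBUSY. Under the explicit-elision behaviour
described above it could return 0 instead, which is why the helper releases
the mutex a second time in that case.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Lock M, then trylock it again from the same thread.
   Returns the trylock result, or -1 if the outer lock failed.  */
static int
nested_trylock_result (pthread_mutex_t *m)
{
  int ret;

  assert (m != NULL);
  if (pthread_mutex_lock (m) != 0)
    return -1;

  ret = pthread_mutex_trylock (m);
  if (ret == 0)
    /* Only reachable if the mutex appeared free, as described for
       explicitly elided mutexes: release the inner acquisition.  */
    pthread_mutex_unlock (m);

  pthread_mutex_unlock (m);
  return ret;
}
```

A program that relies on the EBUSY result here is the kind of code that
changes behaviour once elision is explicitly enabled for the mutex.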

Currently elision is enabled by default on systems that support RTM,
unless explicitly disabled either in the program or by the user.

This patch implements the basic infrastructure for elision.

Open issues:
- XTEST or not XTEST in unlock, see above.
- Adaptation for rwlocks
- Condition variables don't use elision so far
- Adaptation tuning

2013-01-22  Andi Kleen  <[hidden email]>
            Hongjiu Lu  <[hidden email]>

        * nptl-init.c (__pthread_force_elision): Add.
        * pthreadP.h (__pthread_force_elision): Add.
        * sysdeps/unix/sysv/linux/i386/lowlevellock.h (__lll_timedwait_tid,
          lll_timedlock_elision, __lll_lock_elision, __lll_unlock_elision,
          __lll_trylock_elision, lll_lock_elision, lll_unlock_elision,
          lll_trylock_elision): Add.
        * sysdeps/unix/sysv/linux/x86/Makefile: Imply x86
        * sysdeps/unix/sysv/linux/x86/elision-conf.c: New file.
        * sysdeps/unix/sysv/linux/x86/elision-conf.h: New file.
        * sysdeps/unix/sysv/linux/x86/elision-lock.c: New file.
        * sysdeps/unix/sysv/linux/x86/elision-timed.c: New file.
        * sysdeps/unix/sysv/linux/x86/elision-trylock.c: New file.
        * sysdeps/unix/sysv/linux/x86/elision-unlock.c: New file.
        * sysdeps/unix/sysv/linux/x86_64/lowlevellock.h (__lll_timedwait_tid,
          lll_timedlock_elision, __lll_lock_elision, __lll_unlock_elision,
          __lll_trylock_elision, lll_lock_elision, lll_unlock_elision,
          lll_trylock_elision): Add.
        * nptl/sysdeps/unix/sysv/linux/x86/hle.h: New file.
        * elision-conf.h: New file.
---
 nptl/elision-conf.h                                |    1 +
 nptl/nptl-init.c                                   |    1 +
 nptl/pthreadP.h                                    |    2 +
 nptl/sysdeps/unix/sysv/linux/i386/lowlevellock.h   |   22 ++
 nptl/sysdeps/unix/sysv/linux/x86/Makefile          |    3 +
 nptl/sysdeps/unix/sysv/linux/x86/elision-conf.c    |  227 ++++++++++++++++++++
 nptl/sysdeps/unix/sysv/linux/x86/elision-conf.h    |   54 +++++
 nptl/sysdeps/unix/sysv/linux/x86/elision-lock.c    |   94 ++++++++
 nptl/sysdeps/unix/sysv/linux/x86/elision-timed.c   |    8 +
 nptl/sysdeps/unix/sysv/linux/x86/elision-trylock.c |   73 +++++++
 nptl/sysdeps/unix/sysv/linux/x86/elision-unlock.c  |   32 +++
 nptl/sysdeps/unix/sysv/linux/x86/hle.h             |   71 ++++++
 nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h |   23 ++
 13 files changed, 611 insertions(+), 0 deletions(-)
 create mode 100644 nptl/elision-conf.h
 create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/Makefile
 create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/elision-conf.c
 create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/elision-conf.h
 create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/elision-lock.c
 create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/elision-timed.c
 create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/elision-trylock.c
 create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/elision-unlock.c
 create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/hle.h

diff --git a/nptl/elision-conf.h b/nptl/elision-conf.h
new file mode 100644
index 0000000..40a8c17
--- /dev/null
+++ b/nptl/elision-conf.h
@@ -0,0 +1 @@
+/* empty */
diff --git a/nptl/nptl-init.c b/nptl/nptl-init.c
index 19e6616..cc80549 100644
--- a/nptl/nptl-init.c
+++ b/nptl/nptl-init.c
@@ -36,6 +36,7 @@
 #include <lowlevellock.h>
 #include <kernel-features.h>
 
+int __pthread_force_elision attribute_hidden;
 
 /* Size and alignment of static TLS block.  */
 size_t __static_tls_size;
diff --git a/nptl/pthreadP.h b/nptl/pthreadP.h
index 993a79e..17973b2 100644
--- a/nptl/pthreadP.h
+++ b/nptl/pthreadP.h
@@ -571,6 +571,8 @@ extern void __free_stacks (size_t limit) attribute_hidden;
 
 extern void __wait_lookup_done (void) attribute_hidden;
 
+extern int __pthread_force_elision attribute_hidden;
+
 #ifdef SHARED
 # define PTHREAD_STATIC_FN_REQUIRE(name)
 #else
diff --git a/nptl/sysdeps/unix/sysv/linux/i386/lowlevellock.h b/nptl/sysdeps/unix/sysv/linux/i386/lowlevellock.h
index f51f650..d2ef7de 100644
--- a/nptl/sysdeps/unix/sysv/linux/i386/lowlevellock.h
+++ b/nptl/sysdeps/unix/sysv/linux/i386/lowlevellock.h
@@ -429,6 +429,12 @@ LLL_STUB_UNWIND_INFO_END
        : "memory");      \
      result; })
 
+extern int __lll_timedlock_elision (int *futex, short *try_lock,
+ const struct timespec *timeout,
+ int private) attribute_hidden;
+
+#define lll_timedlock_elision(futex, try_lock, timeout, private) \
+  __lll_timedlock_elision(&(futex), &(try_lock), timeout, private)
 
 #define lll_robust_timedlock(futex, timeout, id, private) \
   ({ int result, ignore1, ignore2, ignore3;      \
@@ -582,6 +588,22 @@ extern int __lll_timedwait_tid (int *tid, const struct timespec *abstime)
       }      \
     __result; })
 
+extern int __lll_lock_elision (int *futex, short *try_lock, int private)
+  attribute_hidden;
+
+extern int __lll_unlock_elision(int *lock, int private)
+  attribute_hidden;
+
+extern int __lll_trylock_elision(int *lock, short *try_lock, int upgrade)
+  attribute_hidden;
+
+#define lll_lock_elision(futex, try_lock, private) \
+  __lll_lock_elision (&(futex), &(try_lock), private)
+#define lll_unlock_elision(futex, private) \
+  __lll_unlock_elision (&(futex), private)
+#define lll_trylock_elision(futex, try_lock, upgrade) \
+  __lll_trylock_elision(&(futex), &(try_lock), upgrade)
+
 #endif  /* !__ASSEMBLER__ */
 
 #endif /* lowlevellock.h */
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/Makefile b/nptl/sysdeps/unix/sysv/linux/x86/Makefile
new file mode 100644
index 0000000..61b7552
--- /dev/null
+++ b/nptl/sysdeps/unix/sysv/linux/x86/Makefile
@@ -0,0 +1,3 @@
+libpthread-sysdep_routines += init-arch
+libpthread-sysdep_routines += elision-lock elision-unlock elision-timed \
+      elision-trylock
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.c b/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.c
new file mode 100644
index 0000000..f37c552
--- /dev/null
+++ b/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.c
@@ -0,0 +1,227 @@
+/* elision-conf.c: Lock elision tunable parameters.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>. */
+#include <pthreadP.h>
+#include <sys/fcntl.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <init-arch.h>
+#include "elision-conf.h"
+
+struct elision_config __elision_aconf =
+  {
+    .retry_lock_busy = 3,
+    .retry_lock_internal_abort = 3,
+    .retry_try_xbegin = 3,
+    .retry_trylock_internal_abort = 3,
+  };
+
+struct tune
+{
+  const char *name;
+  unsigned offset;
+  int len;
+};
+
+#define FIELD(x) { #x, offsetof(struct elision_config, x), sizeof(#x)-1 }
+
+static const struct tune tunings[] =
+  {
+    FIELD(retry_lock_busy),
+    FIELD(retry_lock_internal_abort),
+    FIELD(retry_try_xbegin),
+    FIELD(retry_trylock_internal_abort),
+    {}
+  };
+
+#define PAIR(x) x, sizeof (x)-1
+
+static void
+complain (const char *msg, int len)
+{
+  INTERNAL_SYSCALL_DECL (err);
+  INTERNAL_SYSCALL (write, err, 3, 2, (char *)msg, len);
+}
+
+static void
+elision_aconf_setup(const char *s)
+{
+  int i;
+
+  while (*s)
+    {
+      for (i = 0; tunings[i].name; i++)
+ {
+  int nlen = tunings[i].len;
+
+  if (!strncmp (tunings[i].name, s, nlen))
+    {
+      char *end;
+      int val;
+
+      if (s[nlen] != '=')
+ {
+    complain (PAIR("pthreads: invalid PTHREAD_MUTEX syntax: missing =\n"));
+  return;
+ }
+      s += nlen + 1;
+      val = strtoul (s, &end, 0);
+      if (end == s)
+ {
+    complain (PAIR("pthreads: invalid PTHREAD_MUTEX syntax: missing number\n"));
+  return;
+ }
+      *(int *)(((char *)&__elision_aconf) + tunings[i].offset) = val;
+      s = end;
+      if (*s == ',' || *s == ':')
+ s++;
+      else if (*s)
+ {
+    complain (PAIR("pthreads: invalid PTHREAD_MUTEX syntax: garbage after number\n"));
+  return;
+ }
+      break;
+    }
+ }
+      if (!tunings[i].name)
+       {
+    complain (PAIR("pthreads: invalid PTHREAD_MUTEX syntax: unknown tunable\n"));
+  return;
+ }
+    }
+}
+
+int __rwlock_rtm_enabled attribute_hidden;
+int __rwlock_rtm_read_retries attribute_hidden = 3;
+int __elision_available attribute_hidden;
+
+#define PAIR(x) x, sizeof (x)-1
+
+static char *
+next_env_entry (char first, char ***position)
+{
+  char **current = *position;
+  char *result = NULL;
+
+  while (*current != NULL)
+    {
+      if ((*current)[0] == first)
+ {
+  result = *current;
+  *position = ++current;
+  break;
+ }
+
+      ++current;
+    }
+
+  return result;
+}
+
+static inline void
+match (const char *line, const char *var, int len, const char **res)
+{
+  if (!strncmp (line, var, len))
+    *res = line + len;
+}
+
+static void
+elision_mutex_init (const char *s)
+{
+  if (!s)
+    return;
+  if (!strncmp (s, "adaptive", 8) && (s[8] == 0 || s[8] == ':'))
+    {
+      __pthread_force_elision = __elision_available;
+      if (s[8] == ':')
+ elision_aconf_setup (s + 9);
+    }
+  else if (!strncmp (s, "elision", 7) && (s[7] == 0 || s[7] == ':'))
+    {
+      __pthread_force_elision = __elision_available;
+      if (s[7] == ':')
+        elision_aconf_setup (s + 8);
+    }
+  else if (!strncmp (s, "none", 4) && s[4] == 0)
+    __pthread_force_elision = 0;
+  else
+    complain (PAIR("pthreads: Unknown setting for PTHREAD_MUTEX\n"));
+}
+
+static void
+elision_rwlock_init (const char *s)
+{
+  if (!s)
+    {
+      __rwlock_rtm_enabled = __elision_available;
+      return;
+    }
+  if (!strncmp (s, "elision", 7))
+    {
+      __rwlock_rtm_enabled = __elision_available;
+      if (s[7] == ':')
+        {
+          char *end;
+  int n;
+
+          n = strtoul (s + 8, &end, 0);
+  if (end == s + 8)
+    complain (PAIR("pthreads: Bad retry number for PTHREAD_RWLOCK\n"));
+          else
+    __rwlock_rtm_read_retries = n;
+ }
+    }
+  else if (!strncmp(s, "none", 4) && s[4] == 0)
+    __rwlock_rtm_enabled = 0;
+  else
+    complain (PAIR("pthreads: Unknown setting for PTHREAD_RWLOCK\n"));
+}
+
+static void
+elision_init (int argc __attribute__ ((unused)),
+      char **argv  __attribute__ ((unused)),
+      char **environ)
+{
+  char *envline;
+  const char *mutex = NULL, *rwlock = NULL;
+
+  __pthread_force_elision = 1;
+  __elision_available = 1;
+
+  while ((envline = next_env_entry ('P', &environ)) != NULL)
+    {
+      match (envline, PAIR("PTHREAD_MUTEX="), &mutex);
+      match (envline, PAIR("PTHREAD_RWLOCK="), &rwlock);
+    }
+
+  elision_mutex_init (mutex);
+  elision_rwlock_init (rwlock);
+}
+
+#ifdef SHARED
+# define INIT_SECTION ".init_array"
+#else
+# define INIT_SECTION ".preinit_array"
+#endif
+
+void (*const init_array []) (int, char **, char **)
+  __attribute__ ((section (INIT_SECTION), aligned (sizeof (void *)))) =
+{
+  &elision_init
+};
+
+__pthread_abort_hook_t __tsx_abort_hook attribute_hidden;
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.h b/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.h
new file mode 100644
index 0000000..a0017b5
--- /dev/null
+++ b/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.h
@@ -0,0 +1,54 @@
+/* elision-conf.h: Lock elision tunable parameters.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>. */
+#ifndef _ELISION_CONF_H
+#define _ELISION_CONF_H 1
+
+#include <pthread.h>
+#include <cpuid.h>
+#include <time.h>
+
+/* Should make sure there is no false sharing on this */
+
+struct elision_config
+{
+  int retry_lock_busy;
+  int retry_lock_internal_abort;
+  int retry_try_xbegin;
+  int retry_trylock_internal_abort;
+};
+
+extern struct elision_config __elision_aconf attribute_hidden;
+
+extern __pthread_abort_hook_t __tsx_abort_hook;
+
+extern int __rwlock_rtm_enabled;
+extern int __elision_available;
+
+extern int __pthread_mutex_timedlock_nortm (pthread_mutex_t *mutex, const struct timespec *);
+extern int __pthread_mutex_timedlock_rtm (pthread_mutex_t *mutex, const struct timespec *);
+extern int __pthread_mutex_timedlock (pthread_mutex_t *mutex, const struct timespec *);
+extern int __pthread_mutex_lock_nortm (pthread_mutex_t *mutex);
+extern int __pthread_mutex_lock_rtm (pthread_mutex_t *mutex);
+extern int __pthread_mutex_lock (pthread_mutex_t *mutex);
+extern int __pthread_mutex_trylock_nortm (pthread_mutex_t *);
+extern int __pthread_mutex_trylock_rtm (pthread_mutex_t *);
+extern int __pthread_mutex_trylock (pthread_mutex_t *);
+
+#define SUPPORTS_ELISION 1
+
+#endif
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/elision-lock.c b/nptl/sysdeps/unix/sysv/linux/x86/elision-lock.c
new file mode 100644
index 0000000..5ec82c4
--- /dev/null
+++ b/nptl/sysdeps/unix/sysv/linux/x86/elision-lock.c
@@ -0,0 +1,94 @@
+/* elision-lock.c: Elided pthread mutex lock.
+   Copyright (C) 2011, 2012, 2013 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>. */
+#include <pthread.h>
+#include "pthreadP.h"
+#include "lowlevellock.h"
+#include "hle.h"
+#include "elision-conf.h"
+
+#if !defined(LLL_LOCK) && !defined(EXTRAARG)
+/* Make sure the configuration code is always linked in for static
+   libraries. */
+#include "elision-conf.c"
+#endif
+
+#ifndef EXTRAARG
+#define EXTRAARG
+#endif
+#ifndef LLL_LOCK
+#define LLL_LOCK(a,b) lll_lock(a,b), 0
+#endif
+
+#define aconf __elision_aconf
+
+/* Adaptive lock using transactions.
+   By default the lock region is run as a transaction, and when it
+   aborts or the lock is busy the lock adapts itself. */
+
+int
+__lll_lock_elision (int *futex, short *try_lock, EXTRAARG int private)
+{
+  if (*try_lock <= 0)
+    {
+      unsigned status;
+      int try_xbegin;
+
+      for (try_xbegin = aconf.retry_try_xbegin;
+   try_xbegin > 0;
+   try_xbegin--)
+ {
+  if ((status = _xbegin()) == _XBEGIN_STARTED)
+    {
+      if (*futex == 0)
+ return 0;
+
+      /* Lock was busy. Fall back to normal locking.
+ Could also _xend here but xabort with 0xff code
+ is more visible in the profiler. */
+      _xabort (0xff);
+    }
+
+  if (__tsx_abort_hook)
+    __tsx_abort_hook(status);
+
+  if (!(status & _XABORT_RETRY))
+    {
+      if ((status & _XABORT_EXPLICIT) && _XABORT_CODE (status) == 0xff)
+        {
+  if (*try_lock != aconf.retry_lock_busy)
+    *try_lock = aconf.retry_lock_busy;
+ }
+      /* Internal abort. There is no chance for retry.
+ Use the normal locking and next time use lock.
+ Be careful to avoid writing to the lock. */
+      else if (*try_lock != aconf.retry_lock_internal_abort)
+ *try_lock = aconf.retry_lock_internal_abort;
+      break;
+    }
+ }
+    }
+  else
+    {
+      /* Use a normal lock until the threshold counter runs out.
+ Lost updates possible. */
+      (*try_lock)--;
+    }
+
+  /* Use a normal lock here */
+  return LLL_LOCK ((*futex), private);
+}
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/elision-timed.c b/nptl/sysdeps/unix/sysv/linux/x86/elision-timed.c
new file mode 100644
index 0000000..1cad4779
--- /dev/null
+++ b/nptl/sysdeps/unix/sysv/linux/x86/elision-timed.c
@@ -0,0 +1,8 @@
+#include <time.h>
+#include "elision-conf.h"
+#include "lowlevellock.h"
+#define __lll_lock_elision __lll_timedlock_elision
+#define EXTRAARG const struct timespec *t,
+#undef LLL_LOCK
+#define LLL_LOCK(a, b) lll_timedlock(a, t, b)
+#include "elision-lock.c"
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/elision-trylock.c b/nptl/sysdeps/unix/sysv/linux/x86/elision-trylock.c
new file mode 100644
index 0000000..2d06677
--- /dev/null
+++ b/nptl/sysdeps/unix/sysv/linux/x86/elision-trylock.c
@@ -0,0 +1,73 @@
+/* elision-trylock.c: Lock eliding trylock for pthreads.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>. */
+#include <pthread.h>
+#include <pthreadP.h>
+#include <lowlevellock.h>
+#include "hle.h"
+#include "elision-conf.h"
+
+#define aconf __elision_aconf
+
+/* Try to elide a futex trylock. FUTEX is the futex variable. TRY_LOCK is the
+   adaptation counter in the mutex. UPGRADED is != 0 when this is for an
+   automatically upgraded lock.  */
+
+int
+__lll_trylock_elision (int *futex, short *try_lock, int upgraded)
+{
+  /* Only try a transaction if it's worth it */
+  if (*try_lock <= 0)
+    {
+      unsigned status;
+
+      /* When this could be a nested trylock that is not explicitely
+ declared an elided lock abort. This makes us follow POSIX
+ paper semantics. */
+      if (upgraded)
+        _xabort (0xfd);
+
+      if ((status = _xbegin()) == _XBEGIN_STARTED)
+ {
+  if (*futex == 0)
+    return 0;
+
+  /* Lock was busy. Fall back to normal locking.
+     Could also _xend here but xabort with 0xff code
+     is more visible in the profiler. */
+  _xabort (0xff);
+ }
+
+      if (__tsx_abort_hook)
+ __tsx_abort_hook (status);
+
+      if (!(status & _XABORT_RETRY))
+        {
+          /* Internal abort. No chance for retry. For future
+             locks don't try speculation for some time. */
+          if (*try_lock != aconf.retry_trylock_internal_abort)
+            *try_lock = aconf.retry_trylock_internal_abort;
+        }
+    }
+  else
+    {
+      /* Lost updates are possible, but harmless. */
+      (*try_lock)--;
+    }
+
+  return lll_trylock (*futex);
+}
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/elision-unlock.c b/nptl/sysdeps/unix/sysv/linux/x86/elision-unlock.c
new file mode 100644
index 0000000..0e74c8e
--- /dev/null
+++ b/nptl/sysdeps/unix/sysv/linux/x86/elision-unlock.c
@@ -0,0 +1,32 @@
+/* elision-unlock.c: Commit an elided pthread lock.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+#include "pthreadP.h"
+#include "lowlevellock.h"
+#include "hle.h"
+
+int
+__lll_unlock_elision(int *lock, int private)
+{
+  /* When the lock was free we're in a transaction.
+     When you crash here you unlocked a free lock. */
+  if (*lock == 0)
+    _xend();
+  else
+    lll_unlock ((*lock), private);
+  return 0;
+}
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/hle.h b/nptl/sysdeps/unix/sysv/linux/x86/hle.h
new file mode 100644
index 0000000..866a6ee
--- /dev/null
+++ b/nptl/sysdeps/unix/sysv/linux/x86/hle.h
@@ -0,0 +1,71 @@
+/* Shared RTM header. Emulate TSX intrinsics for compilers and assemblers
+   that do not support the intrinsics and instructions yet. */
+#ifndef _HLE_H
+#define _HLE_H 1
+
+#ifdef __ASSEMBLER__
+
+.macro XBEGIN target
+ .byte 0xc7,0xf8
+ .long \target-1f
+1:
+.endm
+
+.macro XEND
+ .byte 0x0f,0x01,0xd5
+.endm
+
+.macro XABORT code
+ .byte 0xc6,0xf8,\code
+.endm
+
+.macro XTEST
+ .byte 0x0f,0x01,0xd6
+.endm
+
+#endif
+
+/* Official RTM intrinsics interface matching gcc/icc, but works
+   on older gcc compatible compilers and binutils.
+   We should somehow detect if the compiler supports it, because
+   it may be able to generate slightly better code. */
+
+#define _XBEGIN_STARTED (~0u)
+#define _XABORT_EXPLICIT (1 << 0)
+#define _XABORT_RETRY (1 << 1)
+#define _XABORT_CONFLICT (1 << 2)
+#define _XABORT_CAPACITY (1 << 3)
+#define _XABORT_DEBUG (1 << 4)
+#define _XABORT_NESTED (1 << 5)
+#define _XABORT_CODE(x) (((x) >> 24) & 0xff)
+
+#ifndef __ASSEMBLER__
+
+#define __force_inline __attribute__((__always_inline__)) inline
+
+static __force_inline int _xbegin(void)
+{
+  int ret = _XBEGIN_STARTED;
+  asm volatile (".byte 0xc7,0xf8 ; .long 0" : "+a" (ret) :: "memory");
+  return ret;
+}
+
+static __force_inline void _xend(void)
+{
+  asm volatile (".byte 0x0f,0x01,0xd5" ::: "memory");
+}
+
+static __force_inline void _xabort(const unsigned int status)
+{
+  asm volatile (".byte 0xc6,0xf8,%P0" :: "i" (status) : "memory");
+}
+
+static __force_inline int _xtest(void)
+{
+  unsigned char out;
+  asm volatile (".byte 0x0f,0x01,0xd6 ; setnz %0" : "=r" (out) :: "memory");
+  return out;
+}
+
+#endif
+#endif
diff --git a/nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h b/nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h
index 6722294..98e7358 100644
--- a/nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h
+++ b/nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h
@@ -426,6 +426,13 @@ LLL_STUB_UNWIND_INFO_END
        : "memory", "cx", "cc", "r10", "r11");      \
      result; })
 
+extern int __lll_timedlock_elision (int *futex, short *try_lock,
+ const struct timespec *timeout,
+ int private) attribute_hidden;
+
+#define lll_timedlock_elision(futex, try_lock, timeout, private) \
+  __lll_timedlock_elision(&(futex), &(try_lock), timeout, private)
+
 #define lll_robust_timedlock(futex, timeout, id, private) \
   ({ int result, ignore1, ignore2, ignore3;      \
      __asm __volatile (LOCK_INSTR "cmpxchgl %1, %4\n\t"      \
@@ -596,6 +603,22 @@ extern int __lll_timedwait_tid (int *tid, const struct timespec *abstime)
       }      \
     __result; })
 
+extern int __lll_lock_elision (int *futex, short *try_lock, int private)
+  attribute_hidden;
+
+extern int __lll_unlock_elision(int *lock, int private)
+  attribute_hidden;
+
+extern int __lll_trylock_elision(int *lock, short *try_lock, int upgraded)
+  attribute_hidden;
+
+#define lll_lock_elision(futex, try_lock, private) \
+  __lll_lock_elision (&(futex), &(try_lock), private)
+#define lll_unlock_elision(futex, private) \
+  __lll_unlock_elision (&(futex), private)
+#define lll_trylock_elision(futex, try_lock, upgraded) \
+  __lll_trylock_elision(&(futex), &(try_lock), upgraded)
+
 #endif  /* !__ASSEMBLER__ */
 
 #endif /* lowlevellock.h */
--
1.7.7.6


[PATCH 2/9] Add external interface changes: new lock types for pthread_mutex_t

Andi Kleen-3
From: Andi Kleen <[hidden email]>

Add a new PTHREAD_MUTEX_INIT_NP() macro that allows setting generic
mutex type/flags combinations. Expose PTHREAD_MUTEX_ELISION_NP
and PTHREAD_MUTEX_NO_ELISION_NP flags. In addition also expose
the existing PTHREAD_MUTEX_PSHARED_NP flag.

Similar flags are defined for the rwlocks.

This allows programs to set elision in a fine-grained manner.
See the manual for more details.

Recursive, PI, and error checking mutexes do not elide.
Recursive mutexes could be implemented at some point, but are not currently.
The initializer currently allows setting illegal combinations.

2013-01-22  Andi Kleen  <[hidden email]>

        * pthreadP.h: Add elision types.
          (PTHREAD_MUTEX_TYPE_EL): Add.
        * sysdeps/pthread/pthread.h: Add elision initializers.
          (PTHREAD_MUTEX_ELISION_NP, PTHREAD_MUTEX_NO_ELISION_NP,
           PTHREAD_MUTEX_PSHARED_NP): Add new flags.
          (__PTHREAD_SPINS): Add.
          Update mutex initializers.
          (PTHREAD_RWLOCK_ELISION_NP, PTHREAD_RWLOCK_NO_ELISION_NP): Add.
---
 nptl/pthreadP.h                |   17 +++++++++++-
 nptl/sysdeps/pthread/pthread.h |   51 ++++++++++++++++++++++++++++++++--------
 2 files changed, 56 insertions(+), 12 deletions(-)

diff --git a/nptl/pthreadP.h b/nptl/pthreadP.h
index 17973b2..2e3eb9b 100644
--- a/nptl/pthreadP.h
+++ b/nptl/pthreadP.h
@@ -60,7 +60,7 @@
 /* Internal mutex type value.  */
 enum
 {
-  PTHREAD_MUTEX_KIND_MASK_NP = 3,
+  PTHREAD_MUTEX_KIND_MASK_NP = 7,
   PTHREAD_MUTEX_ROBUST_NORMAL_NP = 16,
   PTHREAD_MUTEX_ROBUST_RECURSIVE_NP
   = PTHREAD_MUTEX_ROBUST_NORMAL_NP | PTHREAD_MUTEX_RECURSIVE_NP,
@@ -93,12 +93,25 @@ enum
   PTHREAD_MUTEX_PP_ERRORCHECK_NP
   = PTHREAD_MUTEX_PRIO_PROTECT_NP | PTHREAD_MUTEX_ERRORCHECK_NP,
   PTHREAD_MUTEX_PP_ADAPTIVE_NP
-  = PTHREAD_MUTEX_PRIO_PROTECT_NP | PTHREAD_MUTEX_ADAPTIVE_NP
+  = PTHREAD_MUTEX_PRIO_PROTECT_NP | PTHREAD_MUTEX_ADAPTIVE_NP,
+  PTHREAD_MUTEX_ELISION_FLAGS_NP
+  = PTHREAD_MUTEX_ELISION_NP | PTHREAD_MUTEX_NO_ELISION_NP,
+
+  PTHREAD_MUTEX_TIMED_ELISION_NP =
+  PTHREAD_MUTEX_TIMED_NP | PTHREAD_MUTEX_ELISION_NP,
+  PTHREAD_MUTEX_TIMED_NO_ELISION_NP =
+  PTHREAD_MUTEX_TIMED_NP | PTHREAD_MUTEX_NO_ELISION_NP,
+  PTHREAD_MUTEX_ADAPTIVE_ELISION_NP =
+  PTHREAD_MUTEX_ADAPTIVE_NP | PTHREAD_MUTEX_ELISION_NP,
+  PTHREAD_MUTEX_ADAPTIVE_NO_ELISION_NP =
+  PTHREAD_MUTEX_ADAPTIVE_NP | PTHREAD_MUTEX_NO_ELISION_NP
 };
 #define PTHREAD_MUTEX_PSHARED_BIT 128
 
 #define PTHREAD_MUTEX_TYPE(m) \
   ((m)->__data.__kind & 127)
+#define PTHREAD_MUTEX_TYPE_EL(m) \
+  ((m)->__data.__kind & (127|PTHREAD_MUTEX_ELISION_FLAGS_NP))
 
 #if LLL_PRIVATE == 0 && LLL_SHARED == 128
 # define PTHREAD_MUTEX_PSHARED(m) \
diff --git a/nptl/sysdeps/pthread/pthread.h b/nptl/sysdeps/pthread/pthread.h
index 10bcb80..5afadaa 100644
--- a/nptl/sysdeps/pthread/pthread.h
+++ b/nptl/sysdeps/pthread/pthread.h
@@ -44,7 +44,12 @@ enum
   PTHREAD_MUTEX_TIMED_NP,
   PTHREAD_MUTEX_RECURSIVE_NP,
   PTHREAD_MUTEX_ERRORCHECK_NP,
-  PTHREAD_MUTEX_ADAPTIVE_NP
+  PTHREAD_MUTEX_ADAPTIVE_NP,
+
+  PTHREAD_MUTEX_ELISION_NP    = 1024,
+  PTHREAD_MUTEX_NO_ELISION_NP = 2048,
+  PTHREAD_MUTEX_PSHARED_NP    = 128
+
 #if defined __USE_UNIX98 || defined __USE_XOPEN2K8
   ,
   PTHREAD_MUTEX_NORMAL = PTHREAD_MUTEX_TIMED_NP,
@@ -83,27 +88,50 @@ enum
 
 
 /* Mutex initializers.  */
+#if __PTHREAD_MUTEX_HAVE_ELISION == 1
+#define __PTHREAD_SPINS 0, 0
+#elif __PTHREAD_MUTEX_HAVE_ELISION == 2
+#define __PTHREAD_SPINS { 0, 0 }
+#else
+#define __PTHREAD_SPINS 0
+#endif
+
 #ifdef __PTHREAD_MUTEX_HAVE_PREV
 # define PTHREAD_MUTEX_INITIALIZER \
-  { { 0, 0, 0, 0, 0, 0, { 0, 0 } } }
+  { { 0, 0, 0, 0, 0, __PTHREAD_SPINS, { 0, 0 } } }
 # ifdef __USE_GNU
 #  define PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP \
-  { { 0, 0, 0, 0, PTHREAD_MUTEX_RECURSIVE_NP, 0, { 0, 0 } } }
+  { { 0, 0, 0, 0, PTHREAD_MUTEX_RECURSIVE_NP, __PTHREAD_SPINS, { 0, 0 } } }
 #  define PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP \
-  { { 0, 0, 0, 0, PTHREAD_MUTEX_ERRORCHECK_NP, 0, { 0, 0 } } }
+  { { 0, 0, 0, 0, PTHREAD_MUTEX_ERRORCHECK_NP, __PTHREAD_SPINS, { 0, 0 } } }
+#  define PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP \
+  { { 0, 0, 0, 0, PTHREAD_MUTEX_ADAPTIVE_NP, __PTHREAD_SPINS, { 0, 0 } } }
 #  define PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP \
-  { { 0, 0, 0, 0, PTHREAD_MUTEX_ADAPTIVE_NP, 0, { 0, 0 } } }
+  { { 0, 0, 0, 0, PTHREAD_MUTEX_ADAPTIVE_NP, __PTHREAD_SPINS, { 0, 0 } } }
+
+/* Generic initializer allowing to specify type and additional flags.
+   Valid types are
+   PTHREAD_MUTEX_TIMED_NP, PTHREAD_MUTEX_ADAPTIVE_NP, ...
+   Valid flags are
+   PTHREAD_MUTEX_PSHARED_NP, PTHREAD_MUTEX_ELISION_NP, PTHREAD_MUTEX_NO_ELISION_NP.
+   Both are or'ed together. Some combinations are not legal. */
+#  define PTHREAD_MUTEX_INIT_NP(type) \
+   { { 0, 0, 0, 0, (type), __PTHREAD_SPINS, { 0, 0 } } }
 # endif
 #else
 # define PTHREAD_MUTEX_INITIALIZER \
-  { { 0, 0, 0, 0, 0, { 0 } } }
+  { { 0, 0, 0, 0, 0, { __PTHREAD_SPINS } } }
 # ifdef __USE_GNU
 #  define PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP \
-  { { 0, 0, 0, PTHREAD_MUTEX_RECURSIVE_NP, 0, { 0 } } }
+  { { 0, 0, 0, PTHREAD_MUTEX_RECURSIVE_NP, 0, { __PTHREAD_SPINS } } }
 #  define PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP \
-  { { 0, 0, 0, PTHREAD_MUTEX_ERRORCHECK_NP, 0, { 0 } } }
+  { { 0, 0, 0, PTHREAD_MUTEX_ERRORCHECK_NP, 0, { __PTHREAD_SPINS } } }
 #  define PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP \
-  { { 0, 0, 0, PTHREAD_MUTEX_ADAPTIVE_NP, 0, { 0 } } }
+  { { 0, 0, 0, PTHREAD_MUTEX_ADAPTIVE_NP, 0, { __PTHREAD_SPINS } } }
+/* Generic initializer allowing to specify type and additional flags. */
+#  define PTHREAD_MUTEX_INIT_NP(type) \
+  { { 0, 0, 0, (type), 0, { __PTHREAD_SPINS } } }
+
 # endif
 #endif
 
@@ -115,7 +143,10 @@ enum
   PTHREAD_RWLOCK_PREFER_READER_NP,
   PTHREAD_RWLOCK_PREFER_WRITER_NP,
   PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP,
-  PTHREAD_RWLOCK_DEFAULT_NP = PTHREAD_RWLOCK_PREFER_READER_NP
+  PTHREAD_RWLOCK_DEFAULT_NP = PTHREAD_RWLOCK_PREFER_READER_NP,
+
+  PTHREAD_RWLOCK_ELISION_NP = (1U << 7),
+  PTHREAD_RWLOCK_NO_ELISION_NP = (1U << 8)
 };
 
 /* Define __PTHREAD_RWLOCK_INT_FLAGS_SHARED to 1 if pthread_rwlock_t
--
1.7.7.6

[PATCH 3/9] Add elision to pthread_mutex_{try,timed,un,}lock

Andi Kleen-3
From: Andi Kleen <[hidden email]>

Add elision paths to the basic mutex locks.

The normal lock path checks for RTM and upgrades the lock
to an eliding lock when RTM is available. Trylocks cannot upgrade a lock
automatically, so they check for elision every time.

On x86 the 4-byte __spins field in the mutex is split into two shorts.
The new __elision short stores the lock elision adaptation state,
separate from the adaptive spin state.

Condition variables currently do not support elision.

Recursive mutexes and condition variables may be supported at some point,
but are not in the current implementation. Also, "trylock" will
not automatically enable elision unless some other lock call
has already been made on the lock.

This version does not use IFUNC, so every lock has one
additional check for elision. Benchmarking showed the overhead
to be negligible.
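
The difference between the upgrading lock path and the non-upgrading
trylock path can be sketched as below. This is a simplified model of
the FORCE_ELISION and DO_ELISION macros in force-elision.h from this
patch, not the macros themselves: the EX_ constants and function names
are hypothetical, and the force and smp arguments stand in for
__pthread_force_elision and __is_smp.

```c
#include <assert.h>

/* Hypothetical copies of the flag values used in this series.  */
enum
{
  EX_ELISION    = 1024,  /* PTHREAD_MUTEX_ELISION_NP */
  EX_NO_ELISION = 2048,  /* PTHREAD_MUTEX_NO_ELISION_NP */
  EX_UPGRADED   = 4096,  /* PTHREAD_MUTEX_UPGRADED_ELISION_NP */
  EX_ELISION_FLAGS = EX_ELISION | EX_NO_ELISION
};

/* Lock path: if elision is forced and the user set neither elision
   flag, mark the lock as eliding and remember that it was upgraded.
   Returns the (possibly modified) kind word.  */
static int
ex_force_elision (int kind, int force, int smp)
{
  if (force && smp && (kind & EX_ELISION_FLAGS) == 0)
    kind |= EX_ELISION | EX_UPGRADED;
  return kind;
}

/* Trylock path: only decide whether to try elision this time.
   The kind word is never modified here.  */
static int
ex_do_elision (int kind, int force, int smp)
{
  return force && smp && (kind & EX_NO_ELISION) == 0;
}
```

A lock whose kind already carries PTHREAD_MUTEX_NO_ELISION_NP is left
alone by both paths in this model.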

2013-01-22  Andi Kleen  <[hidden email]>
            Hongjiu Lu  <[hidden email]>

        * pthread_mutex_lock.c (adaptive_lock): Add.
        (__pthread_mutex_lock): Add lock elision support.
        * pthread_mutex_timedlock.c (pthread_mutex_timedlock): Ditto.
        * pthread_mutex_trylock.c (__pthread_mutex_trylock): Ditto.
        * pthread_mutex_unlock.c (__pthread_mutex_unlock_usercnt): Ditto.
        * sysdeps/unix/sysv/linux/pthread_mutex_cond_lock.c: Ditto.
        * sysdeps/unix/sysv/linux/x86/bits/pthreadtypes.h: Ditto.
        * sysdeps/unix/sysv/linux/x86/Makefile: New file.
        * sysdeps/unix/sysv/linux/x86/force-elision.h: New file.
        * sysdeps/unix/sysv/linux/x86/pthread_mutex_cond_lock.c: Ditto.
        * sysdeps/unix/sysv/linux/x86/pthread_mutex_lock.c: Ditto.
        * sysdeps/unix/sysv/linux/x86/pthread_mutex_timedlock.c: Ditto.
        * sysdeps/unix/sysv/linux/x86/pthread_mutex_trylock.c: Ditto.
        * sysdeps/unix/sysv/linux/x86/pthread_mutex_unlock.c: Ditto.
        * sysdeps/pthread/pthread.h (PTHREAD_MUTEX_UPGRADED_ELISION_NP): Add.
---
 nptl/pthread_mutex_lock.c                          |  121 +++++++++++++++-----
 nptl/pthread_mutex_timedlock.c                     |   43 +++++++-
 nptl/pthread_mutex_trylock.c                       |   46 +++++++-
 nptl/pthread_mutex_unlock.c                        |   26 ++++-
 nptl/sysdeps/pthread/pthread.h                     |    1 +
 .../unix/sysv/linux/pthread_mutex_cond_lock.c      |    5 +
 .../unix/sysv/linux/x86/bits/pthreadtypes.h        |   13 ++-
 nptl/sysdeps/unix/sysv/linux/x86/force-elision.h   |   34 ++++++
 .../unix/sysv/linux/x86/pthread_mutex_cond_lock.c  |    5 +
 .../unix/sysv/linux/x86/pthread_mutex_lock.c       |   23 ++++
 .../unix/sysv/linux/x86/pthread_mutex_timedlock.c  |   23 ++++
 .../unix/sysv/linux/x86/pthread_mutex_trylock.c    |   23 ++++
 .../unix/sysv/linux/x86/pthread_mutex_unlock.c     |    2 +
 13 files changed, 326 insertions(+), 39 deletions(-)
 create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/force-elision.h
 create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_cond_lock.c
 create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_lock.c
 create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_timedlock.c
 create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_trylock.c
 create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_unlock.c

diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
index fbedfd7..e3639f1 100644
--- a/nptl/pthread_mutex_lock.c
+++ b/nptl/pthread_mutex_lock.c
@@ -25,6 +25,14 @@
 #include <lowlevellock.h>
 #include <stap-probe.h>
 
+#ifndef lll_lock_elision
+#define lll_lock_elision(lock, try_lock, private) ({ \
+      lll_lock (lock, private); 0; })
+#endif
+
+#ifndef lll_trylock_elision
+#define lll_trylock_elision(a,t,u) lll_trylock(a)
+#endif
 
 #ifndef LLL_MUTEX_LOCK
 # define LLL_MUTEX_LOCK(mutex) \
@@ -34,12 +42,50 @@
 # define LLL_ROBUST_MUTEX_LOCK(mutex, id) \
   lll_robust_lock ((mutex)->__data.__lock, id, \
    PTHREAD_ROBUST_MUTEX_PSHARED (mutex))
+# define LLL_MUTEX_LOCK_ELISION(mutex) \
+  lll_lock_elision ((mutex)->__data.__lock, (mutex)->__data.__spins, \
+   PTHREAD_MUTEX_PSHARED (mutex))
 #endif
 
+#ifndef ENABLE_ELISION
+#define ENABLE_ELISION 0
+#endif
+
+#ifndef FORCE_ELISION
+#define FORCE_ELISION(m, s)
+#endif
 
 static int __pthread_mutex_lock_full (pthread_mutex_t *mutex)
      __attribute_noinline__;
 
+static inline __attribute__((always_inline)) void
+adaptive_lock (pthread_mutex_t *mutex)
+{
+  if (! __is_smp)
+    return;
+  if (LLL_MUTEX_TRYLOCK (mutex) != 0)
+    {
+      int cnt = 0;
+      int max_cnt = MIN (MAX_ADAPTIVE_COUNT, mutex->__data.__spins * 2 + 10);
+      do
+        {
+  if (cnt++ >= max_cnt)
+    {
+      LLL_MUTEX_LOCK (mutex);
+      break;
+    }
+
+#ifdef BUSY_WAIT_NOP
+  BUSY_WAIT_NOP;
+#endif
+ }
+      while (LLL_MUTEX_TRYLOCK (mutex) != 0);
+
+      mutex->__data.__spins += (cnt - mutex->__data.__spins) / 8;
+    }
+  assert (mutex->__data.__owner == 0);
+}
+
 
 int
 __pthread_mutex_lock (mutex)
@@ -47,26 +93,40 @@ __pthread_mutex_lock (mutex)
 {
   assert (sizeof (mutex->__size) >= sizeof (mutex->__data));
 
-  unsigned int type = PTHREAD_MUTEX_TYPE (mutex);
+  unsigned int type = PTHREAD_MUTEX_TYPE_EL (mutex);
 
   LIBC_PROBE (mutex_entry, 1, mutex);
 
-  if (__builtin_expect (type & ~PTHREAD_MUTEX_KIND_MASK_NP, 0))
+  if (__builtin_expect (type & ~(PTHREAD_MUTEX_KIND_MASK_NP
+ | PTHREAD_MUTEX_ELISION_FLAGS_NP), 0))
     return __pthread_mutex_lock_full (mutex);
 
   pid_t id = THREAD_GETMEM (THREAD_SELF, tid);
 
-  if (__builtin_expect (type, PTHREAD_MUTEX_TIMED_NP)
-      == PTHREAD_MUTEX_TIMED_NP)
+  /* A switch would be likely faster */
+
+  if (__builtin_expect (type == PTHREAD_MUTEX_TIMED_NP, 1))
     {
+      FORCE_ELISION (mutex, goto elision);
     simple:
       /* Normal mutex.  */
       LLL_MUTEX_LOCK (mutex);
       assert (mutex->__data.__owner == 0);
     }
+  else if (__builtin_expect (type == PTHREAD_MUTEX_TIMED_ELISION_NP, 1))
+    {
+  elision: __attribute__((unused))
+      if (!ENABLE_ELISION)
+        goto simple;
+      /* Don't record owner or users for elision case. */
+      int ret = LLL_MUTEX_LOCK_ELISION (mutex);
+      assert (mutex->__data.__owner == 0);
+      return ret;
+    }
   else if (__builtin_expect (type == PTHREAD_MUTEX_RECURSIVE_NP, 1))
     {
       /* Recursive mutex.  */
+    recursive:
 
       /* Check whether we already hold the mutex.  */
       if (mutex->__data.__owner == id)
@@ -89,35 +149,36 @@ __pthread_mutex_lock (mutex)
     }
   else if (__builtin_expect (type == PTHREAD_MUTEX_ADAPTIVE_NP, 1))
     {
-      if (! __is_smp)
- goto simple;
-
-      if (LLL_MUTEX_TRYLOCK (mutex) != 0)
- {
-  int cnt = 0;
-  int max_cnt = MIN (MAX_ADAPTIVE_COUNT,
-     mutex->__data.__spins * 2 + 10);
-  do
-    {
-      if (cnt++ >= max_cnt)
- {
-  LLL_MUTEX_LOCK (mutex);
-  break;
- }
-
-#ifdef BUSY_WAIT_NOP
-      BUSY_WAIT_NOP;
-#endif
-    }
-  while (LLL_MUTEX_TRYLOCK (mutex) != 0);
-
-  mutex->__data.__spins += (cnt - mutex->__data.__spins) / 8;
- }
-      assert (mutex->__data.__owner == 0);
+      FORCE_ELISION (mutex, goto elision_adaptive);
+  adaptive:
+      adaptive_lock (mutex);
+    }
+  else if (type == PTHREAD_MUTEX_TIMED_ELISION_NP)
+    goto elision;
+  else if (type == PTHREAD_MUTEX_TIMED_NO_ELISION_NP)
+    goto simple;
+  else if (type == PTHREAD_MUTEX_ADAPTIVE_NO_ELISION_NP)
+    goto adaptive;
+  else if (type == PTHREAD_MUTEX_ADAPTIVE_ELISION_NP)
+    {
+  elision_adaptive: __attribute__((unused))
+      if (!ENABLE_ELISION)
+ goto adaptive;
+      if (!lll_trylock_elision (mutex->__data.__lock, mutex->__data.__elision, 0))
+        return 0;
+      adaptive_lock (mutex);
+      /* No owner for elision */
+      return 0;
+    }
+  else if (PTHREAD_MUTEX_TYPE (mutex) == PTHREAD_MUTEX_RECURSIVE_NP)
+    {
+      /* In case the user set the elision flags here.
+         Elision not supported so far. */
+      goto recursive;
     }
   else
     {
-      assert (type == PTHREAD_MUTEX_ERRORCHECK_NP);
+      assert (PTHREAD_MUTEX_TYPE (mutex) == PTHREAD_MUTEX_ERRORCHECK_NP);
       /* Check whether we already hold the mutex.  */
       if (__builtin_expect (mutex->__data.__owner == id, 0))
  return EDEADLK;
diff --git a/nptl/pthread_mutex_timedlock.c b/nptl/pthread_mutex_timedlock.c
index 3a36424..d0bc7a1 100644
--- a/nptl/pthread_mutex_timedlock.c
+++ b/nptl/pthread_mutex_timedlock.c
@@ -25,6 +25,21 @@
 
 #include <stap-probe.h>
 
+#ifndef lll_timedlock_elision
+#define lll_timedlock_elision(a,dummy,b,c) lll_timedlock(a, b, c)
+#endif
+
+#ifndef lll_trylock_elision
+#define lll_trylock_elision(a,t,u) lll_trylock(a)
+#endif
+
+#ifndef ENABLE_ELISION
+#define ENABLE_ELISION 0
+#endif
+
+#ifndef FORCE_ELISION
+#define FORCE_ELISION(m, s)
+#endif
 
 int
 pthread_mutex_timedlock (mutex, abstime)
@@ -40,10 +55,12 @@ pthread_mutex_timedlock (mutex, abstime)
   /* We must not check ABSTIME here.  If the thread does not block
      abstime must not be checked for a valid value.  */
 
-  switch (__builtin_expect (PTHREAD_MUTEX_TYPE (mutex),
+  switch (__builtin_expect (PTHREAD_MUTEX_TYPE_EL (mutex),
     PTHREAD_MUTEX_TIMED_NP))
     {
       /* Recursive mutex.  */
+    case PTHREAD_MUTEX_RECURSIVE_NP|PTHREAD_MUTEX_ELISION_NP:
+    case PTHREAD_MUTEX_RECURSIVE_NP|PTHREAD_MUTEX_NO_ELISION_NP:
     case PTHREAD_MUTEX_RECURSIVE_NP:
       /* Check whether we already hold the mutex.  */
       if (mutex->__data.__owner == id)
@@ -78,13 +95,37 @@ pthread_mutex_timedlock (mutex, abstime)
       /* FALLTHROUGH */
 
     case PTHREAD_MUTEX_TIMED_NP:
+      FORCE_ELISION (mutex, goto elision);
+    case PTHREAD_MUTEX_TIMED_NO_ELISION_NP:
     simple:
       /* Normal mutex.  */
       result = lll_timedlock (mutex->__data.__lock, abstime,
       PTHREAD_MUTEX_PSHARED (mutex));
       break;
 
+    case PTHREAD_MUTEX_TIMED_ELISION_NP:
+    elision: __attribute__((unused))
+      /* Don't record ownership */
+      if (!ENABLE_ELISION)
+ goto simple;
+      return lll_timedlock_elision (mutex->__data.__lock,
+    mutex->__data.__spins,
+    abstime,
+    PTHREAD_MUTEX_PSHARED (mutex));
+
+
+    case PTHREAD_MUTEX_ADAPTIVE_ELISION_NP:
+    adaptive_elision: __attribute__((unused))
+      if (ENABLE_ELISION
+  && !lll_trylock_elision (mutex->__data.__lock, mutex->__data.__elision, 0))
+        return 0;
+      goto adaptive;
+
     case PTHREAD_MUTEX_ADAPTIVE_NP:
+      FORCE_ELISION (mutex, goto adaptive_elision);
+
+    case PTHREAD_MUTEX_ADAPTIVE_NO_ELISION_NP:
+    adaptive:
       if (! __is_smp)
  goto simple;
 
diff --git a/nptl/pthread_mutex_trylock.c b/nptl/pthread_mutex_trylock.c
index 8f5279d..4702409 100644
--- a/nptl/pthread_mutex_trylock.c
+++ b/nptl/pthread_mutex_trylock.c
@@ -22,6 +22,20 @@
 #include "pthreadP.h"
 #include <lowlevellock.h>
 
+#ifndef lll_trylock_elision
+#define lll_trylock_elision(a,t,u) lll_trylock(a)
+#endif
+
+#ifndef ENABLE_ELISION
+#define ENABLE_ELISION 0
+#endif
+
+#ifndef DO_ELISION
+#define DO_ELISION(m) 0
+#endif
+
+/* We don't force elision in trylock, because this can lead to inconsistent
+   lock state if the lock was actually busy. */
 
 int
 __pthread_mutex_trylock (mutex)
@@ -30,10 +44,12 @@ __pthread_mutex_trylock (mutex)
   int oldval;
   pid_t id = THREAD_GETMEM (THREAD_SELF, tid);
 
-  switch (__builtin_expect (PTHREAD_MUTEX_TYPE (mutex),
+  switch (__builtin_expect (PTHREAD_MUTEX_TYPE_EL (mutex),
     PTHREAD_MUTEX_TIMED_NP))
     {
       /* Recursive mutex.  */
+    case PTHREAD_MUTEX_RECURSIVE_NP|PTHREAD_MUTEX_ELISION_NP:
+    case PTHREAD_MUTEX_RECURSIVE_NP|PTHREAD_MUTEX_NO_ELISION_NP:
     case PTHREAD_MUTEX_RECURSIVE_NP:
       /* Check whether we already hold the mutex.  */
       if (mutex->__data.__owner == id)
@@ -57,10 +73,29 @@ __pthread_mutex_trylock (mutex)
  }
       break;
 
-    case PTHREAD_MUTEX_ERRORCHECK_NP:
+    case PTHREAD_MUTEX_TIMED_ELISION_NP:
+    case PTHREAD_MUTEX_ADAPTIVE_ELISION_NP:
+    elision:
+      if (!ENABLE_ELISION)
+ goto normal;
+      if (lll_trylock_elision (mutex->__data.__lock,
+       mutex->__data.__elision,
+       mutex->__data.__kind
+       & PTHREAD_MUTEX_UPGRADED_ELISION_NP) != 0)
+        break;
+      /* Don't record the ownership. */
+      return 0;
+
     case PTHREAD_MUTEX_TIMED_NP:
     case PTHREAD_MUTEX_ADAPTIVE_NP:
-      /* Normal mutex.  */
+      if (DO_ELISION (mutex))
+ goto elision;
+      /*FALL THROUGH*/
+    case PTHREAD_MUTEX_ADAPTIVE_NO_ELISION_NP:
+      /*FALL THROUGH*/
+    case PTHREAD_MUTEX_TIMED_NO_ELISION_NP:
+    case PTHREAD_MUTEX_ERRORCHECK_NP:
+    normal:
       if (lll_trylock (mutex->__data.__lock) != 0)
  break;
 
@@ -378,4 +413,9 @@ __pthread_mutex_trylock (mutex)
 
   return EBUSY;
 }
+
+#ifndef __pthread_mutex_trylock
+#ifndef pthread_mutex_trylock
 strong_alias (__pthread_mutex_trylock, pthread_mutex_trylock)
+#endif
+#endif
diff --git a/nptl/pthread_mutex_unlock.c b/nptl/pthread_mutex_unlock.c
index c0249f7..90b39df 100644
--- a/nptl/pthread_mutex_unlock.c
+++ b/nptl/pthread_mutex_unlock.c
@@ -23,6 +23,14 @@
 #include <lowlevellock.h>
 #include <stap-probe.h>
 
+#ifndef lll_unlock_elision
+#define lll_unlock_elision(a,b) ({ lll_unlock (a,b); 0; })
+#endif
+
+#ifndef ENABLE_ELISION
+#define ENABLE_ELISION 0
+#endif
+
 static int
 internal_function
 __pthread_mutex_unlock_full (pthread_mutex_t *mutex, int decr)
@@ -34,8 +42,9 @@ __pthread_mutex_unlock_usercnt (mutex, decr)
      pthread_mutex_t *mutex;
      int decr;
 {
-  int type = PTHREAD_MUTEX_TYPE (mutex);
-  if (__builtin_expect (type & ~PTHREAD_MUTEX_KIND_MASK_NP, 0))
+  int type = PTHREAD_MUTEX_TYPE_EL (mutex);
+  if (__builtin_expect (type &
+ ~(PTHREAD_MUTEX_KIND_MASK_NP|PTHREAD_MUTEX_ELISION_FLAGS_NP), 0))
     return __pthread_mutex_unlock_full (mutex, decr);
 
   if (__builtin_expect (type, PTHREAD_MUTEX_TIMED_NP)
@@ -55,6 +64,15 @@ __pthread_mutex_unlock_usercnt (mutex, decr)
 
       return 0;
     }
+  else if (__builtin_expect (type == PTHREAD_MUTEX_TIMED_ELISION_NP, 1)
+           || (type == PTHREAD_MUTEX_ADAPTIVE_ELISION_NP))
+    {
+      if (!ENABLE_ELISION)
+ goto normal;
+      /* Don't reset the owner/users fields for elision */
+      return lll_unlock_elision (mutex->__data.__lock,
+      PTHREAD_MUTEX_PSHARED (mutex));
+    }
   else if (__builtin_expect (type == PTHREAD_MUTEX_RECURSIVE_NP, 1))
     {
       /* Recursive mutex.  */
@@ -66,7 +84,9 @@ __pthread_mutex_unlock_usercnt (mutex, decr)
  return 0;
       goto normal;
     }
-  else if (__builtin_expect (type == PTHREAD_MUTEX_ADAPTIVE_NP, 1))
+  type &= ~PTHREAD_MUTEX_ELISION_FLAGS_NP;
+  if (__builtin_expect (type == PTHREAD_MUTEX_ADAPTIVE_NP, 1) ||
+      __builtin_expect (type == PTHREAD_MUTEX_TIMED_NP, 1))
     goto normal;
   else
     {
diff --git a/nptl/sysdeps/pthread/pthread.h b/nptl/sysdeps/pthread/pthread.h
index 5afadaa..7fe7866 100644
--- a/nptl/sysdeps/pthread/pthread.h
+++ b/nptl/sysdeps/pthread/pthread.h
@@ -48,6 +48,7 @@ enum
 
   PTHREAD_MUTEX_ELISION_NP    = 1024,
   PTHREAD_MUTEX_NO_ELISION_NP = 2048,
+  PTHREAD_MUTEX_UPGRADED_ELISION_NP = 4096,
   PTHREAD_MUTEX_PSHARED_NP    = 128
 
 #if defined __USE_UNIX98 || defined __USE_XOPEN2K8
diff --git a/nptl/sysdeps/unix/sysv/linux/pthread_mutex_cond_lock.c b/nptl/sysdeps/unix/sysv/linux/pthread_mutex_cond_lock.c
index b417da5..082196a 100644
--- a/nptl/sysdeps/unix/sysv/linux/pthread_mutex_cond_lock.c
+++ b/nptl/sysdeps/unix/sysv/linux/pthread_mutex_cond_lock.c
@@ -2,6 +2,11 @@
 
 #define LLL_MUTEX_LOCK(mutex) \
   lll_cond_lock ((mutex)->__data.__lock, PTHREAD_MUTEX_PSHARED (mutex))
+
+/* Not actually elided so far. Needed? */
+#define LLL_MUTEX_LOCK_ELISION(mutex)  \
+  ({ lll_cond_lock ((mutex)->__data.__lock, PTHREAD_MUTEX_PSHARED (mutex)); 0; })
+
 #define LLL_MUTEX_TRYLOCK(mutex) \
   lll_cond_trylock ((mutex)->__data.__lock)
 #define LLL_ROBUST_MUTEX_LOCK(mutex, id) \
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/bits/pthreadtypes.h b/nptl/sysdeps/unix/sysv/linux/x86/bits/pthreadtypes.h
index ccd896c..1852e07 100644
--- a/nptl/sysdeps/unix/sysv/linux/x86/bits/pthreadtypes.h
+++ b/nptl/sysdeps/unix/sysv/linux/x86/bits/pthreadtypes.h
@@ -101,14 +101,23 @@ typedef union
        binary compatibility.  */
     int __kind;
 #ifdef __x86_64__
-    int __spins;
+    short __spins;
+    short __elision;
     __pthread_list_t __list;
 # define __PTHREAD_MUTEX_HAVE_PREV 1
+# define __PTHREAD_MUTEX_HAVE_ELISION   1
 #else
     unsigned int __nusers;
     __extension__ union
     {
-      int __spins;
+      struct
+      {
+        short __espins;
+ short __elision;
+# define __spins d.__espins
+# define __elision d.__elision
+# define __PTHREAD_MUTEX_HAVE_ELISION   2
+      } d;
       __pthread_slist_t __list;
     };
 #endif
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/force-elision.h b/nptl/sysdeps/unix/sysv/linux/x86/force-elision.h
new file mode 100644
index 0000000..c08bded
--- /dev/null
+++ b/nptl/sysdeps/unix/sysv/linux/x86/force-elision.h
@@ -0,0 +1,34 @@
+/* force-elision.h: Automatic enabling of elision for mutexes
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>. */
+
+/* Check for elision on this lock without upgrading */
+#define DO_ELISION(m) \
+  (__pthread_force_elision \
+   && (m->__data.__kind & PTHREAD_MUTEX_NO_ELISION_NP) == 0 \
+   && __is_smp) \
+
+/* Automatically enable elision for existing user lock kinds */
+#define FORCE_ELISION(m, s) \
+  if (__pthread_force_elision \
+      && (m->__data.__kind & PTHREAD_MUTEX_ELISION_FLAGS_NP) == 0 \
+      && __is_smp) \
+    { \
+      (m)->__data.__kind |= PTHREAD_MUTEX_ELISION_NP \
+    | PTHREAD_MUTEX_UPGRADED_ELISION_NP; \
+      s; \
+    }
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_cond_lock.c b/nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_cond_lock.c
new file mode 100644
index 0000000..7681f63
--- /dev/null
+++ b/nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_cond_lock.c
@@ -0,0 +1,5 @@
+/* The cond lock is not actually elided yet, but we still need to handle
+   already elided locks and have a working ENABLE_ELISION. */
+#include "elision-conf.h"
+#define ENABLE_ELISION (__elision_available != 0)
+#include "sysdeps/unix/sysv/linux/pthread_mutex_cond_lock.c"
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_lock.c b/nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_lock.c
new file mode 100644
index 0000000..818804d
--- /dev/null
+++ b/nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_lock.c
@@ -0,0 +1,23 @@
+/* Elided version of pthread_mutex_lock.
+   Copyright (C) 2011, 2012, 2013 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>. */
+#include "elision-conf.h"
+#include "force-elision.h"
+#include "init-arch.h"
+
+#define ENABLE_ELISION (__elision_available != 0)
+#include "nptl/pthread_mutex_lock.c"
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_timedlock.c b/nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_timedlock.c
new file mode 100644
index 0000000..e939a99
--- /dev/null
+++ b/nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_timedlock.c
@@ -0,0 +1,23 @@
+/* Elided version of pthread_mutex_timedlock.
+   Copyright (C) 2011, 2012, 2013 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>. */
+#include "elision-conf.h"
+#include "force-elision.h"
+#include "init-arch.h"
+
+#define ENABLE_ELISION (__elision_available != 0)
+#include "nptl/pthread_mutex_timedlock.c"
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_trylock.c b/nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_trylock.c
new file mode 100644
index 0000000..604624f
--- /dev/null
+++ b/nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_trylock.c
@@ -0,0 +1,23 @@
+/* Elided version of pthread_mutex_trylock.
+   Copyright (C) 2011, 2012, 2013 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>. */
+#include "elision-conf.h"
+#include "force-elision.h"
+#include "init-arch.h"
+
+#define ENABLE_ELISION (__elision_available != 0)
+#include "nptl/pthread_mutex_trylock.c"
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_unlock.c b/nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_unlock.c
new file mode 100644
index 0000000..3b29b28
--- /dev/null
+++ b/nptl/sysdeps/unix/sysv/linux/x86/pthread_mutex_unlock.c
@@ -0,0 +1,2 @@
+#define ENABLE_ELISION 1
+#include "nptl/pthread_mutex_unlock.c"
--
1.7.7.6

[PATCH 4/9] Support setting elision in pthread_mutexattr_settype

Andi Kleen-3
From: Andi Kleen <[hidden email]>

2013-01-22 Andi Kleen  <[hidden email]>

        * pthread_mutexattr_settype.c (__pthread_mutexattr_settype):
        Support elision flags.
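
The check added to __pthread_mutexattr_settype can be modelled as a
small standalone function. This is a sketch under assumptions: the EX_
constants copy values from the series, ex_settype_check is a
hypothetical name, and only the validation is shown, not the storing
of the kind in the attribute.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical copies of the values used in this series.  */
enum
{
  EX_MUTEX_NORMAL     = 0,     /* same value as PTHREAD_MUTEX_TIMED_NP */
  EX_MUTEX_ADAPTIVE   = 3,
  EX_MUTEX_ELISION    = 1024,
  EX_MUTEX_NO_ELISION = 2048,
  EX_MUTEX_ELISION_FLAGS = EX_MUTEX_ELISION | EX_MUTEX_NO_ELISION
};

/* Returns 0 if kind is acceptable, EINVAL otherwise.  */
static int
ex_settype_check (int kind)
{
  /* Validate the base type with the elision flags masked out.  */
  int mkind = kind & ~EX_MUTEX_ELISION_FLAGS;

  if (mkind < EX_MUTEX_NORMAL || mkind > EX_MUTEX_ADAPTIVE)
    return EINVAL;
  /* Asking for elision and no-elision at the same time is rejected.  */
  if ((kind & EX_MUTEX_ELISION_FLAGS) == EX_MUTEX_ELISION_FLAGS)
    return EINVAL;
  return 0;
}
```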
---
 nptl/pthread_mutexattr_settype.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/nptl/pthread_mutexattr_settype.c b/nptl/pthread_mutexattr_settype.c
index 7b476e9..cd4cec8 100644
--- a/nptl/pthread_mutexattr_settype.c
+++ b/nptl/pthread_mutexattr_settype.c
@@ -26,8 +26,11 @@ __pthread_mutexattr_settype (attr, kind)
      int kind;
 {
   struct pthread_mutexattr *iattr;
+  int mkind = kind & ~PTHREAD_MUTEX_ELISION_FLAGS_NP;
 
-  if (kind < PTHREAD_MUTEX_NORMAL || kind > PTHREAD_MUTEX_ADAPTIVE_NP)
+  if (mkind < PTHREAD_MUTEX_NORMAL || mkind > PTHREAD_MUTEX_ADAPTIVE_NP)
+    return EINVAL;
+  if ((kind & PTHREAD_MUTEX_ELISION_FLAGS_NP) == PTHREAD_MUTEX_ELISION_FLAGS_NP)
     return EINVAL;
 
   iattr = (struct pthread_mutexattr *) attr;
--
1.7.7.6

[PATCH 5/9] Add lock elision to rwlocks

Andi Kleen-3
From: Andi Kleen <[hidden email]>

This code is very similar to the main pthread mutex code, but we have
to implement part of it in assembler. IFUNC is used to avoid any
fast path overhead on non-RTM systems for the lock functions (but not
for unlock).

The patch is large because there are so many combinations
of reader and writer locks, but it is fairly repetitive.

The basic elision algorithm is very similar to that for normal mutexes:
a wrapper checks whether the lock is free (both reader and writer).
We have to check both reader and writer because separating them
would require putting state on different cache lines.

The algorithm is very basic currently and may be tuned
more in the future.

There is no adaptation at this point; the elision will not automatically
disable itself. The programmer or user can disable it explicitly, though.

New lock type flags are exposed to force-enable or force-disable elision
for specific rwlocks.

We use some padding in the internal rwlock as an elision flag.
This is 8 bytes on 64-bit currently, but could be made smaller later
without compatibility problems (or to share some bytes with an
adaptation count). 32-bit uses one byte.
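
The wrapper's "is the lock free" test described above can be sketched
as a plain predicate. This is an illustration only: struct ex_rwlock,
its field names and both functions are hypothetical, and the real code
is in assembler and would run this test inside the hardware
transaction so that a later reader or writer conflicts with it.

```c
#include <assert.h>

/* Hypothetical, reduced view of an rwlock for illustration.  */
struct ex_rwlock
{
  unsigned int nr_readers;   /* number of active readers */
  int writer;                /* nonzero while a writer holds the lock */
  unsigned char elide;       /* elision flag kept in former padding */
};

/* The lock is free only if there is neither a reader nor a writer.  */
static int
ex_rwlock_is_free (const struct ex_rwlock *rw)
{
  return rw->nr_readers == 0 && rw->writer == 0;
}

/* Elision is attempted only when RTM is available, the lock is
   flagged for elision, and the lock is currently free.  Otherwise
   the caller falls back to the normal locking path.  */
static int
ex_rwlock_should_elide (const struct ex_rwlock *rw, int rtm_available)
{
  return rtm_available && rw->elide && ex_rwlock_is_free (rw);
}
```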

2013-01-22  Andi Kleen <[hidden email]>
            Hongjiu Lu <[hidden email]>

        * pthread_rwlock_init.c (__pthread_rwlock_init): Support elision.
        * pthread_rwlockattr_setkind_np.c (pthread_rwlockattr_setkind_np): Support elision.
        * sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_rdlock.S: dito.
        * sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_timedrdlock.S: dito
        * sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_timedwrlock.S: dito.
        * sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_unlock.S: dito.
        * sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_wrlock.S: dito.
        * sysdeps/unix/sysv/linux/internaltypes.h (struct pthread_condattr):
        (struct pthread_rwlockattr): Support elision flags in internal rwlock.
        * sysdeps/unix/sysv/linux/lowlevelrwlock.sym: Add ELIDED.
        * sysdeps/unix/sysv/linux/x86/bits/pthreadtypes.h: Add elision flags.
        * sysdeps/unix/sysv/linux/x86/pthread_rwlock_tryrdlock.c: New file.
        * sysdeps/unix/sysv/linux/x86/pthread_rwlock_trywrlock.c: Ditto.
        * sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_rdlock.S: Handle elision.
        * sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_timedrdlock.S: Ditto.
        * sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_timedwrlock.S: Ditto.
        * sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_unlock.S: Ditto.
        * sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_wrlock.S: Ditto.
---
 nptl/pthread_rwlock_init.c                         |    4 +
 nptl/pthread_rwlock_tryrdlock.c                    |    2 +
 nptl/pthread_rwlock_trywrlock.c                    |    2 +
 nptl/pthread_rwlockattr_setkind_np.c               |   12 +++-
 .../sysv/linux/i386/i486/pthread_rwlock_rdlock.S   |   87 ++++++++++++++++++++
 .../linux/i386/i486/pthread_rwlock_timedrdlock.S   |   81 ++++++++++++++++++
 .../linux/i386/i486/pthread_rwlock_timedwrlock.S   |   63 ++++++++++++++-
 .../sysv/linux/i386/i486/pthread_rwlock_unlock.S   |   19 ++++
 .../sysv/linux/i386/i486/pthread_rwlock_wrlock.S   |   71 ++++++++++++++++
 nptl/sysdeps/unix/sysv/linux/internaltypes.h       |    4 +-
 nptl/sysdeps/unix/sysv/linux/lowlevelrwlock.sym    |    5 +
 .../unix/sysv/linux/x86/bits/pthreadtypes.h        |    6 +-
 .../unix/sysv/linux/x86/pthread_rwlock_tryrdlock.c |   54 ++++++++++++
 .../unix/sysv/linux/x86/pthread_rwlock_trywrlock.c |   57 +++++++++++++
 .../unix/sysv/linux/x86_64/pthread_rwlock_rdlock.S |   72 ++++++++++++++++
 .../sysv/linux/x86_64/pthread_rwlock_timedrdlock.S |   75 +++++++++++++++++
 .../sysv/linux/x86_64/pthread_rwlock_timedwrlock.S |   52 ++++++++++++
 .../unix/sysv/linux/x86_64/pthread_rwlock_unlock.S |   17 ++++-
 .../unix/sysv/linux/x86_64/pthread_rwlock_wrlock.S |   47 +++++++++++
 19 files changed, 724 insertions(+), 6 deletions(-)
 create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/pthread_rwlock_tryrdlock.c
 create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/pthread_rwlock_trywrlock.c

diff --git a/nptl/pthread_rwlock_init.c b/nptl/pthread_rwlock_init.c
index 16bfe2d..f3680fc 100644
--- a/nptl/pthread_rwlock_init.c
+++ b/nptl/pthread_rwlock_init.c
@@ -68,6 +68,10 @@ __pthread_rwlock_init (rwlock, attr)
       header.private_futex));
 #endif
 
+#ifdef __PTHREAD_RWLOCK_ELIDING
+  rwlock->__data.__eliding = iattr->eliding;
+#endif
+
   return 0;
 }
 strong_alias (__pthread_rwlock_init, pthread_rwlock_init)
diff --git a/nptl/pthread_rwlock_tryrdlock.c b/nptl/pthread_rwlock_tryrdlock.c
index 935ac87..896809b 100644
--- a/nptl/pthread_rwlock_tryrdlock.c
+++ b/nptl/pthread_rwlock_tryrdlock.c
@@ -45,4 +45,6 @@ __pthread_rwlock_tryrdlock (pthread_rwlock_t *rwlock)
 
   return result;
 }
+#ifndef __pthread_rwlock_tryrdlock
 strong_alias (__pthread_rwlock_tryrdlock, pthread_rwlock_tryrdlock)
+#endif
diff --git a/nptl/pthread_rwlock_trywrlock.c b/nptl/pthread_rwlock_trywrlock.c
index 01754ae..6d8a8c4 100644
--- a/nptl/pthread_rwlock_trywrlock.c
+++ b/nptl/pthread_rwlock_trywrlock.c
@@ -38,4 +38,6 @@ __pthread_rwlock_trywrlock (pthread_rwlock_t *rwlock)
 
   return result;
 }
+#ifndef __pthread_rwlock_trywrlock
 strong_alias (__pthread_rwlock_trywrlock, pthread_rwlock_trywrlock)
+#endif
diff --git a/nptl/pthread_rwlockattr_setkind_np.c b/nptl/pthread_rwlockattr_setkind_np.c
index 64bd341..833e1f0 100644
--- a/nptl/pthread_rwlockattr_setkind_np.c
+++ b/nptl/pthread_rwlockattr_setkind_np.c
@@ -27,12 +27,22 @@ pthread_rwlockattr_setkind_np (attr, pref)
 {
   struct pthread_rwlockattr *iattr;
 
+  iattr = (struct pthread_rwlockattr *) attr;
+
+  iattr->eliding = 0;
+  if (pref & (PTHREAD_RWLOCK_ELISION_NP|PTHREAD_RWLOCK_NO_ELISION_NP))
+    {
+      if (pref & PTHREAD_RWLOCK_ELISION_NP)
+        iattr->eliding = 1;
+      if (pref & PTHREAD_RWLOCK_NO_ELISION_NP)
+ iattr->eliding = -1;
+      pref &= ~(PTHREAD_RWLOCK_ELISION_NP|PTHREAD_RWLOCK_NO_ELISION_NP);
+    }
   if (pref != PTHREAD_RWLOCK_PREFER_READER_NP
       && pref != PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP
       && __builtin_expect  (pref != PTHREAD_RWLOCK_PREFER_WRITER_NP, 0))
     return EINVAL;
 
-  iattr = (struct pthread_rwlockattr *) attr;
 
   iattr->lockkind = pref;
 
diff --git a/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_rdlock.S b/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_rdlock.S
index 6c46ba6..7cc683a 100644
--- a/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_rdlock.S
+++ b/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_rdlock.S
@@ -21,9 +21,16 @@
 #include <lowlevelrwlock.h>
 #include <pthread-errnos.h>
 #include <kernel-features.h>
+#include <hle.h>
 
 #include <stap-probe.h>
 
+#ifdef PIC
+#define MO(x) x##@GOTOFF(%edx)
+#else
+#define MO(x) x
+#endif
+
  .text
 
  .globl __pthread_rwlock_rdlock
@@ -43,6 +50,84 @@ __pthread_rwlock_rdlock:
 
  LIBC_PROBE (rdlock_entry, 1, %ebx)
 
+#ifdef PIC
+ SETUP_PIC_REG(dx)
+ addl $_GLOBAL_OFFSET_TABLE_,%edx
+#endif
+
+ cmpl  $0,MO(__elision_available)
+ jz    not_elided_rdlock
+
+ cmpb  $0,ELIDING(%ebx)
+ js    not_elided_rdlock
+ jnz   2f
+ /* zero: use default */
+
+ cmpl $0,MO(__rwlock_rtm_enabled)
+ jz    not_elided_rdlock
+
+2:
+ mov   MO(__rwlock_rtm_read_retries),%ecx
+
+try_trans_rdlock:
+ XBEGIN abort_rdlock
+
+ /* Lock writer/reader free? */
+ cmpl  $0,WRITER(%ebx)
+ jnz   1f
+ cmpl  $0,NR_READERS(%ebx)
+ jnz   1f
+
+ /* Lock is free. Run with transaction */
+ xor   %eax,%eax
+
+ pop   %ebx
+ cfi_adjust_cfa_offset(-4)
+ pop   %esi
+ cfi_adjust_cfa_offset(-4)
+ ret
+
+ /* Lock is not free. Run */
+1: XABORT 0xff
+ jmp not_elided_rdlock
+
+ /* Abort happened. */
+abort_rdlock:
+ cmpl  $0,MO(__tsx_abort_hook)
+ jz    1f
+ push  %ecx
+ cfi_adjust_cfa_offset(4)
+ push  %eax
+ cfi_adjust_cfa_offset(4)
+ call  *MO(__tsx_abort_hook)
+ pop   %eax
+ cfi_adjust_cfa_offset(-4)
+ pop   %ecx
+ xor   %esi, %esi /* needed? */
+ movl  12(%esp), %ebx
+#ifdef PIC
+ SETUP_PIC_REG(dx)
+ addl $_GLOBAL_OFFSET_TABLE_,%edx
+#endif
+
+1:
+
+ testl $_XABORT_CONFLICT,%eax
+ jz    not_elided_rdlock
+
+ /* For a reader that aborts due to a conflict, retry speculation
+   a limited number of times. This way when some reader aborts
+   because the reader count is written the other readers will
+   still elide, at the cost of retrying the speculation. */
+
+ dec   %ecx
+ jnz   try_trans_rdlock
+
+ /* Otherwise we just fall back directly to the lock.
+   Here's the place to add more adaptation. */
+
+not_elided_rdlock:
+
  /* Get the lock.  */
  movl $1, %edx
  xorl %eax, %eax
@@ -188,5 +273,7 @@ __pthread_rwlock_rdlock:
  cfi_endproc
  .size __pthread_rwlock_rdlock,.-__pthread_rwlock_rdlock
 
+#ifndef __pthread_rwlock_rdlock
 strong_alias (__pthread_rwlock_rdlock, pthread_rwlock_rdlock)
 hidden_def (__pthread_rwlock_rdlock)
+#endif
diff --git a/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_timedrdlock.S b/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_timedrdlock.S
index 1908f6f..a88d191 100644
--- a/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_timedrdlock.S
+++ b/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_timedrdlock.S
@@ -21,7 +21,13 @@
 #include <lowlevelrwlock.h>
 #include <pthread-errnos.h>
 #include <kernel-features.h>
+#include <hle.h>
 
+#ifdef PIC
+#define MO(x) x##@GOTOFF(%edx)
+#else
+#define MO(x) x
+#endif
 
  .text
 
@@ -48,6 +54,80 @@ pthread_rwlock_timedrdlock:
  movl 28(%esp), %ebp
  movl 32(%esp), %edi
 
+#ifdef PIC
+ SETUP_PIC_REG(dx)
+ addl $_GLOBAL_OFFSET_TABLE_,%edx
+#endif
+
+ cmpl  $0,MO(__elision_available)
+ jz    not_elided_trdlock
+
+ cmpb  $0,ELIDING(%ebp)
+ js    not_elided_trdlock
+ jnz   2f
+ /* zero: use default */
+
+ cmpl  $0,MO(__rwlock_rtm_enabled)
+ jz    not_elided_trdlock
+
+2:
+ mov   MO(__rwlock_rtm_read_retries),%ecx
+
+try_trans_trdlock:
+ XBEGIN abort_trdlock
+
+ /* Lock writer/reader free? */
+ cmpl  $0,WRITER(%ebp)
+ jnz   1f
+ cmpl  $0,NR_READERS(%ebp)
+ jnz   1f
+
+ /* Lock is free. Run with transaction */
+ xor   %eax,%eax
+ jmp   77f
+
+
+ /* Lock is not free. Run */
+1: XABORT 0xff
+ jmp not_elided_trdlock
+
+ /* Abort happened. */
+abort_trdlock:
+ cmpl  $0,MO(__tsx_abort_hook)
+ jz    1f
+ push  %ecx
+ cfi_adjust_cfa_offset(4)
+ push  %eax
+ cfi_adjust_cfa_offset(4)
+ call  *MO(__tsx_abort_hook)
+ pop   %eax
+ cfi_adjust_cfa_offset(-4)
+ pop   %ecx
+ movl  32(%esp), %edi
+#ifdef PIC
+ SETUP_PIC_REG(dx)
+ addl $_GLOBAL_OFFSET_TABLE_,%edx
+#endif
+
+1:
+
+ testl $_XABORT_CONFLICT,%eax
+ jz    not_elided_trdlock
+
+ /* For a reader that aborts due to a conflict, retry speculation
+   a limited number of times. This way when some reader aborts
+   because the reader count is written the other readers will
+   still elide, at the cost of retrying the speculation. */
+
+ dec   %ecx
+ jnz   try_trans_trdlock
+
+ /* Otherwise we just fall back directly to the lock.
+   Here's the place to add more adaptation. */
+
+not_elided_trdlock:
+
+
  /* Get the lock.  */
  movl $1, %edx
  xorl %eax, %eax
@@ -158,6 +238,7 @@ pthread_rwlock_timedrdlock:
 
 7: movl %edx, %eax
 
+77:
  addl $8, %esp
  cfi_adjust_cfa_offset(-8)
  popl %ebp
diff --git a/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_timedwrlock.S b/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_timedwrlock.S
index e0fc809..4e8b307 100644
--- a/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_timedwrlock.S
+++ b/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_timedwrlock.S
@@ -21,7 +21,13 @@
 #include <lowlevelrwlock.h>
 #include <pthread-errnos.h>
 #include <kernel-features.h>
+#include <hle.h>
 
+#ifdef PIC
+#define MO(x) x##@GOTOFF(%edx)
+#else
+#define MO(x) x
+#endif
 
  .text
 
@@ -48,6 +54,61 @@ pthread_rwlock_timedwrlock:
  movl 28(%esp), %ebp
  movl 32(%esp), %edi
 
+#ifdef PIC
+ SETUP_PIC_REG(dx)
+ addl $_GLOBAL_OFFSET_TABLE_,%edx
+#endif
+
+ cmpl  $0,MO(__elision_available)
+ jz    not_elided_twrlock
+
+ cmpb  $0,ELIDING(%ebp)
+ js    not_elided_twrlock
+ jnz   try_trans_wrlock
+ /* zero: use default */
+
+ cmpl $0,MO(__rwlock_rtm_enabled)
+ jz not_elided_twrlock
+
+try_trans_wrlock:
+ XBEGIN abort_twrlock
+
+ /* Lock writer free? */
+ cmpl  $0,WRITER(%ebp)
+ jnz   1f
+ cmpl  $0,NR_READERS(%ebp)
+ jnz   1f
+
+ /* Lock is free. Run with transaction */
+ xor   %eax,%eax
+
+ jmp   77f
+
+ /* Lock is not free. Run */
+1: XABORT 0xff
+ jmp   not_elided_twrlock
+
+ /* Abort happened. */
+abort_twrlock:
+ cmpl  $0,MO(__tsx_abort_hook)
+ jz    1f
+ push  %eax
+ cfi_adjust_cfa_offset(4)
+ call  *MO(__tsx_abort_hook)
+ pop   %eax
+ cfi_adjust_cfa_offset(-4)
+ mov   32(%esp), %edi
+#ifdef PIC
+ SETUP_PIC_REG(dx)
+ addl $_GLOBAL_OFFSET_TABLE_,%edx
+#endif
+
+1:
+ /* Otherwise we just fall back directly to the lock.
+   Here's the place to add more adaptation. */
+
+not_elided_twrlock:
+
  /* Get the lock.  */
  movl $1, %edx
  xorl %eax, %eax
@@ -156,7 +217,7 @@ pthread_rwlock_timedwrlock:
 
 7: movl %edx, %eax
 
- addl $8, %esp
+77: addl $8, %esp
  cfi_adjust_cfa_offset(-8)
  popl %ebp
  cfi_adjust_cfa_offset(-4)
diff --git a/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_unlock.S b/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_unlock.S
index 708e31c..3169743 100644
--- a/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_unlock.S
+++ b/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_unlock.S
@@ -20,6 +20,7 @@
 #include <lowlevellock.h>
 #include <lowlevelrwlock.h>
 #include <kernel-features.h>
+#include <hle.h>
 
 
  .text
@@ -38,6 +39,24 @@ __pthread_rwlock_unlock:
 
  movl 12(%esp), %edi
 
+ /* Is lock free? */
+ cmpl $0,WRITER(%edi)
+ jnz  1f
+ cmpl $0,NR_READERS(%edi)
+ jnz  1f
+
+ /* Looks free. Assume transaction.
+   If you crash here you unlocked a free lock. */
+ XEND
+ xor  %eax,%eax
+
+ pop %edi
+ cfi_adjust_cfa_offset(-4)
+ pop %ebx
+ cfi_adjust_cfa_offset(-4)
+ ret
+
+1:
  /* Get the lock.  */
  movl $1, %edx
  xorl %eax, %eax
diff --git a/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_wrlock.S b/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_wrlock.S
index 6ea17f7..f59469f 100644
--- a/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_wrlock.S
+++ b/nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_rwlock_wrlock.S
@@ -21,9 +21,16 @@
 #include <lowlevelrwlock.h>
 #include <pthread-errnos.h>
 #include <kernel-features.h>
+#include <hle.h>
 
 #include <stap-probe.h>
 
+#ifdef PIC
+#define MO(x) x@GOTOFF(%edx)
+#else
+#define MO(x) x
+#endif
+
  .text
 
  .globl __pthread_rwlock_wrlock
@@ -43,6 +50,68 @@ __pthread_rwlock_wrlock:
 
  LIBC_PROBE (wrlock_entry, 1, %ebx)
 
+#ifdef PIC
+ SETUP_PIC_REG(dx)
+ addl $_GLOBAL_OFFSET_TABLE_,%edx
+#endif
+
+ cmpl  $0,MO(__elision_available)
+ jz    not_elided_wrlock
+
+ cmpb  $0,ELIDING(%ebx)
+ js    not_elided_wrlock
+ jnz   try_trans_wrlock
+ /* zero: use default */
+
+ cmpl    $0,MO(__rwlock_rtm_enabled)
+ jz      not_elided_wrlock
+
+try_trans_wrlock:
+ XBEGIN abort_wrlock
+
+ /* Lock writer free? */
+ /* Readers are checked too: a writer needs the lock completely free */
+ cmpl  $0,WRITER(%ebx)
+ jnz   1f
+ cmpl  $0,NR_READERS(%ebx)
+ jnz   1f
+
+ /* Lock is free. Run with transaction */
+ xor   %eax,%eax
+
+ pop   %ebx
+ cfi_adjust_cfa_offset(-4)
+ pop   %esi
+ cfi_adjust_cfa_offset(-4)
+ ret
+
+ /* Lock is not free. Run */
+1: XABORT 0xff
+ jmp   not_elided_wrlock
+
+ /* Abort happened. */
+abort_wrlock:
+ cmpl  $0,MO(__tsx_abort_hook)
+ jz    1f
+ push  %eax
+ cfi_adjust_cfa_offset(4)
+ call  *MO(__tsx_abort_hook)
+ pop   %eax
+ cfi_adjust_cfa_offset(-4)
+ xorl %esi, %esi /* needed? */
+ movl 12(%esp), %ebx
+#ifdef PIC
+ SETUP_PIC_REG(dx)
+ addl $_GLOBAL_OFFSET_TABLE_,%edx
+#endif
+
+1:
+
+ /* Otherwise we just fall back directly to the lock.
+   Here's the place to add more adaptation. */
+
+not_elided_wrlock:
+
  /* Get the lock.  */
  movl $1, %edx
  xorl %eax, %eax
@@ -179,5 +248,7 @@ __pthread_rwlock_wrlock:
  cfi_endproc
  .size __pthread_rwlock_wrlock,.-__pthread_rwlock_wrlock
 
+#ifndef __pthread_rwlock_wrlock
 strong_alias (__pthread_rwlock_wrlock, pthread_rwlock_wrlock)
 hidden_def (__pthread_rwlock_wrlock)
+#endif
diff --git a/nptl/sysdeps/unix/sysv/linux/internaltypes.h b/nptl/sysdeps/unix/sysv/linux/internaltypes.h
index 699a618..5d76107 100644
--- a/nptl/sysdeps/unix/sysv/linux/internaltypes.h
+++ b/nptl/sysdeps/unix/sysv/linux/internaltypes.h
@@ -86,7 +86,9 @@ struct pthread_condattr
 struct pthread_rwlockattr
 {
   int lockkind;
-  int pshared;
+  short pshared;
+  char  eliding;
+  char  pad;
 };
 
 
diff --git a/nptl/sysdeps/unix/sysv/linux/lowlevelrwlock.sym b/nptl/sysdeps/unix/sysv/linux/lowlevelrwlock.sym
index f50b25b..fcbcd5a 100644
--- a/nptl/sysdeps/unix/sysv/linux/lowlevelrwlock.sym
+++ b/nptl/sysdeps/unix/sysv/linux/lowlevelrwlock.sym
@@ -3,6 +3,10 @@
 #include <bits/pthreadtypes.h>
 #include <bits/wordsize.h>
 
+#ifndef __PTHREAD_RWLOCK_ELIDING
+#define __eliding __shared
+#endif
+
 --
 
 MUTEX offsetof (pthread_rwlock_t, __data.__lock)
@@ -14,3 +18,4 @@ WRITERS_QUEUED offsetof (pthread_rwlock_t, __data.__nr_writers_queued)
 FLAGS offsetof (pthread_rwlock_t, __data.__flags)
 WRITER offsetof (pthread_rwlock_t, __data.__writer)
 PSHARED offsetof (pthread_rwlock_t, __data.__shared)
+ELIDING offsetof (pthread_rwlock_t, __data.__eliding)
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/bits/pthreadtypes.h b/nptl/sysdeps/unix/sysv/linux/x86/bits/pthreadtypes.h
index 1852e07..21b127d 100644
--- a/nptl/sysdeps/unix/sysv/linux/x86/bits/pthreadtypes.h
+++ b/nptl/sysdeps/unix/sysv/linux/x86/bits/pthreadtypes.h
@@ -183,12 +183,13 @@ typedef union
     unsigned int __nr_writers_queued;
     int __writer;
     int __shared;
-    unsigned long int __pad1;
+    long __eliding;
     unsigned long int __pad2;
     /* FLAGS must stay at this position in the structure to maintain
        binary compatibility.  */
     unsigned int __flags;
 # define __PTHREAD_RWLOCK_INT_FLAGS_SHARED 1
+# define __PTHREAD_RWLOCK_ELIDING 1
   } __data;
 # else
   struct
@@ -203,8 +204,9 @@ typedef union
        binary compatibility.  */
     unsigned char __flags;
     unsigned char __shared;
-    unsigned char __pad1;
+    char __eliding;
     unsigned char __pad2;
+# define __PTHREAD_RWLOCK_ELIDING 2
     int __writer;
   } __data;
 # endif
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/pthread_rwlock_tryrdlock.c b/nptl/sysdeps/unix/sysv/linux/x86/pthread_rwlock_tryrdlock.c
new file mode 100644
index 0000000..3aed148
--- /dev/null
+++ b/nptl/sysdeps/unix/sysv/linux/x86/pthread_rwlock_tryrdlock.c
@@ -0,0 +1,54 @@
+/* pthread_rwlock_tryrdlock: Lock eliding version of pthreads rwlock_tryrdlock.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>. */
+#include <pthread.h>
+#include <pthreadP.h>
+#include <hle.h>
+#include "elision-conf.h"
+#include "init-arch.h"
+
+#define __pthread_rwlock_tryrdlock __full_pthread_rwlock_tryrdlock
+#include <nptl/pthread_rwlock_tryrdlock.c>
+#undef __pthread_rwlock_tryrdlock
+
+int
+__pthread_rwlock_tryrdlock(pthread_rwlock_t *rwlock)
+{
+  unsigned status;
+
+  if ((rwlock->__data.__eliding > 0 && __elision_available)
+      || (rwlock->__data.__eliding == 0 && __rwlock_rtm_enabled))
+    {
+      if ((status = _xbegin()) == _XBEGIN_STARTED)
+ {
+  if (rwlock->__data.__writer == 0
+      && rwlock->__data.__nr_readers == 0)
+    return 0;
+  /* Lock was busy. Fall back to normal locking.
+     Could also _xend here but xabort with 0xff code
+     is more visible in the profiler. */
+  _xabort (0xff);
+ }
+      /* Aborts come here */
+      if (__tsx_abort_hook)
+ __tsx_abort_hook (status);
+    }
+
+  return __full_pthread_rwlock_tryrdlock (rwlock);
+}
+
+strong_alias(__pthread_rwlock_tryrdlock, pthread_rwlock_tryrdlock);
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/pthread_rwlock_trywrlock.c b/nptl/sysdeps/unix/sysv/linux/x86/pthread_rwlock_trywrlock.c
new file mode 100644
index 0000000..a813003
--- /dev/null
+++ b/nptl/sysdeps/unix/sysv/linux/x86/pthread_rwlock_trywrlock.c
@@ -0,0 +1,57 @@
+/* pthread_rwlock_trywrlock: Lock eliding version of pthreads rwlock trywrlock.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>. */
+#include <pthread.h>
+#include <pthreadP.h>
+#include <hle.h>
+#include "elision-conf.h"
+#include "init-arch.h"
+
+#define __pthread_rwlock_trywrlock __full_pthread_rwlock_trywrlock
+#include <nptl/pthread_rwlock_trywrlock.c>
+#undef __pthread_rwlock_trywrlock
+
+int
+__pthread_rwlock_trywrlock(pthread_rwlock_t *rwlock)
+{
+  unsigned status;
+  int elision = 0;
+
+  if (rwlock->__data.__eliding == 0 && __rwlock_rtm_enabled)
+    {
+      _xabort (0xfd);
+      elision = 1;
+    }
+  if (elision || (rwlock->__data.__eliding > 0 && __elision_available))
+    {
+      if ((status = _xbegin()) == _XBEGIN_STARTED)
+ {
+  if (rwlock->__data.__writer == 0
+      && rwlock->__data.__nr_readers == 0)
+    return 0;
+  /* Lock was busy. Fall back to normal locking.
+     Could also _xend here but xabort with 0xff code
+     is more visible in the profiler. */
+  _xabort (0xff);
+ }
+      if (__tsx_abort_hook)
+ __tsx_abort_hook (status);
+    }
+
+  return __full_pthread_rwlock_trywrlock (rwlock);
+}
+strong_alias(__pthread_rwlock_trywrlock, pthread_rwlock_trywrlock);
diff --git a/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_rdlock.S b/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_rdlock.S
index 7681818..de764c0 100644
--- a/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_rdlock.S
+++ b/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_rdlock.S
@@ -22,6 +22,7 @@
 #include <pthread-errnos.h>
 #include <kernel-features.h>
 #include <stap-probe.h>
+#include <hle.h>
 
  .text
 
@@ -33,6 +34,75 @@ __pthread_rwlock_rdlock:
 
  LIBC_PROBE (rdlock_entry, 1, %rdi)
 
+ cmpl  $0,__elision_available(%rip)
+ jz    not_elided_rdlock
+
+ cmpq  $0,ELIDING(%rdi)
+ js    not_elided_rdlock
+ jnz   2f
+ /* zero: use default */
+
+ cmpl  $0,__rwlock_rtm_enabled(%rip)
+ jz    not_elided_rdlock
+
+2:
+ mov   __rwlock_rtm_read_retries(%rip),%esi
+
+try_trans_rdlock:
+ XBEGIN abort_rdlock
+
+ /* Lock reader/writer free? */
+ cmpl  $0,WRITER(%rdi)
+ jnz   1f
+ cmpl  $0,NR_READERS(%rdi)
+ jnz   1f
+
+ /* Lock is free. Run with transaction */
+ xor   %eax,%eax
+ ret
+
+ /* Lock is not free. Run */
+1: XABORT 0xff
+ jmp not_elided_rdlock
+
+ /* Abort happened. */
+abort_rdlock:
+ /* Call the abort hook */
+ cmpq  $0,__tsx_abort_hook(%rip)
+ jz    1f
+
+ push  %rdi
+ cfi_adjust_cfa_offset(8)
+ push  %rsi
+ cfi_adjust_cfa_offset(8)
+ push  %rax
+ cfi_adjust_cfa_offset(8)
+ mov   %eax,%edi
+ call  *__tsx_abort_hook(%rip)
+ pop   %rax
+ cfi_adjust_cfa_offset(-8)
+ pop   %rsi
+ cfi_adjust_cfa_offset(-8)
+ pop   %rdi
+ cfi_adjust_cfa_offset(-8)
+
+1:
+
+ testl $_XABORT_CONFLICT,%eax
+ jz    not_elided_rdlock
+
+ /* For a reader that aborts due to a conflict, retry speculation
+   a limited number of times. This way when some reader aborts
+   because the reader count is written the other readers will
+   still elide, at the cost of retrying the speculation. */
+
+ dec   %esi
+ jnz   try_trans_rdlock
+
+ /* Otherwise we just fall back directly to the lock.
+   Here's the place to add more adaptation. */
+
+not_elided_rdlock:
  xorq %r10, %r10
 
  /* Get the lock.  */
@@ -173,5 +243,7 @@ __pthread_rwlock_rdlock:
  cfi_endproc
  .size __pthread_rwlock_rdlock,.-__pthread_rwlock_rdlock
 
+#ifndef __pthread_rwlock_rdlock
 strong_alias (__pthread_rwlock_rdlock, pthread_rwlock_rdlock)
 hidden_def (__pthread_rwlock_rdlock)
+#endif
diff --git a/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_timedrdlock.S b/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_timedrdlock.S
index 57fe1e9..f45b296 100644
--- a/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_timedrdlock.S
+++ b/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_timedrdlock.S
@@ -21,6 +21,7 @@
 #include <lowlevelrwlock.h>
 #include <pthread-errnos.h>
 #include <kernel-features.h>
+#include <hle.h>
 
  .text
 
@@ -29,6 +30,80 @@
  .align 16
 pthread_rwlock_timedrdlock:
  cfi_startproc
+
+ cmpl  $0,__elision_available(%rip)
+ jz    not_elided_timedrdlock
+
+ cmpq  $0,ELIDING(%rdi)
+ js    not_elided_timedrdlock
+ jnz   2f
+ /* zero: use default */
+
+ cmpl  $0,__rwlock_rtm_enabled(%rip)
+ jz    not_elided_timedrdlock
+
+2:
+ mov   __rwlock_rtm_read_retries(%rip),%ecx
+
+try_trans_timedrdlock:
+ XBEGIN abort_timedrdlock
+
+ /* Lock reader/writer free? */
+ cmpl  $0,WRITER(%rdi)
+ jnz   1f
+ cmpl  $0,NR_READERS(%rdi)
+ jnz   1f
+
+ /* Lock is free. Run with transaction */
+ xor   %eax,%eax
+ ret
+
+ /* Lock is not free. Run */
+1: XABORT 0xff
+ jmp not_elided_timedrdlock
+
+ /* Abort happened. */
+abort_timedrdlock:
+ /* Call the abort hook */
+ cmpq  $0,__tsx_abort_hook(%rip)
+ jz    1f
+ push  %rcx
+ cfi_adjust_cfa_offset(8)
+ push  %rdi
+ cfi_adjust_cfa_offset(8)
+ push  %rsi
+ cfi_adjust_cfa_offset(8)
+ push  %rax
+ cfi_adjust_cfa_offset(8)
+ mov   %eax,%edi
+ call  *__tsx_abort_hook(%rip)
+ pop   %rax
+ cfi_adjust_cfa_offset(-8)
+ pop   %rsi
+ cfi_adjust_cfa_offset(-8)
+ pop   %rdi
+ cfi_adjust_cfa_offset(-8)
+ pop   %rcx
+ cfi_adjust_cfa_offset(-8)
+
+1:
+
+ testl $_XABORT_CONFLICT,%eax
+ jz    not_elided_timedrdlock
+
+ /* For a reader that aborts due to a conflict, retry speculation
+   a limited number of times. This way when some reader aborts
+   because the reader count is written the other readers will
+   still elide, at the cost of retrying the speculation. */
+
+ dec   %ecx
+ jnz   try_trans_timedrdlock
+
+ /* Otherwise we just fall back directly to the lock.
+   Here's the place to add more adaptation. */
+
+not_elided_timedrdlock:
+
  pushq %r12
  cfi_adjust_cfa_offset(8)
  cfi_rel_offset(%r12, 0)
diff --git a/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_timedwrlock.S b/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_timedwrlock.S
index 391be17..063ca64 100644
--- a/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_timedwrlock.S
+++ b/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_timedwrlock.S
@@ -21,6 +21,7 @@
 #include <lowlevelrwlock.h>
 #include <pthread-errnos.h>
 #include <kernel-features.h>
+#include <hle.h>
 
  .text
 
@@ -29,6 +30,57 @@
  .align 16
 pthread_rwlock_timedwrlock:
  cfi_startproc
+
+ cmpl  $0,__elision_available(%rip)
+ jz    not_elided_timedwrlock
+
+ cmpq  $0,ELIDING(%rdi)
+ js    not_elided_timedwrlock
+ jnz   2f
+ /* zero: use default */
+
+ cmpl  $0,__rwlock_rtm_enabled(%rip)
+ jz    not_elided_timedwrlock
+
+2:
+ XBEGIN abort
+
+ /* Lock free? */
+ cmpl  $0,NR_READERS(%rdi)
+ jnz   1f
+ cmpl  $0,WRITER(%rdi)
+ jnz   1f
+
+ /* Lock is free. Run with transaction */
+ xor   %rax, %rax
+ ret
+
+ /* Lock is not free. Run */
+1: XABORT 0xff
+ jmp   not_elided_timedwrlock
+
+ /* Abort happened. */
+abort:
+
+ /* Call abort hook */
+ cmpq  $0,__tsx_abort_hook(%rip)
+ jz    1f
+ push  %rdi
+ cfi_adjust_cfa_offset(8)
+ push  %rsi
+ cfi_adjust_cfa_offset(8)
+ mov   %eax,%edi
+ call  *__tsx_abort_hook(%rip)
+ pop   %rsi
+ cfi_adjust_cfa_offset(-8)
+ pop   %rdi
+ cfi_adjust_cfa_offset(-8)
+
+1:
+
+ /* No retries here */
+
+not_elided_timedwrlock:
  pushq %r12
  cfi_adjust_cfa_offset(8)
  cfi_rel_offset(%r12, 0)
diff --git a/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_unlock.S b/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_unlock.S
index 86dda05..a9876f2 100644
--- a/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_unlock.S
+++ b/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_unlock.S
@@ -20,7 +20,7 @@
 #include <lowlevellock.h>
 #include <lowlevelrwlock.h>
 #include <kernel-features.h>
-
+#include <hle.h>
 
  .text
 
@@ -29,6 +29,21 @@
  .align 16
 __pthread_rwlock_unlock:
  cfi_startproc
+
+ /* Is lock free? */
+ cmpl $0,WRITER(%rdi)
+ jnz  1f
+ cmpl $0,NR_READERS(%rdi)
+ jnz  1f
+
+ /* Looks free. Assume transaction.
+   If you crash here you unlocked a free lock. */
+ XEND
+ xor  %rax,%rax
+ ret
+
+1:
+
  /* Get the lock.  */
  movl $1, %esi
  xorl %eax, %eax
diff --git a/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_wrlock.S b/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_wrlock.S
index 734bee3..90942d5 100644
--- a/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_wrlock.S
+++ b/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_wrlock.S
@@ -22,6 +22,7 @@
 #include <pthread-errnos.h>
 #include <kernel-features.h>
 #include <stap-probe.h>
+#include <hle.h>
 
  .text
 
@@ -33,6 +34,50 @@ __pthread_rwlock_wrlock:
 
  LIBC_PROBE (wrlock_entry, 1, %rdi)
 
+ cmpl  $0,__elision_available(%rip)
+ jz    not_elided_wrlock
+
+ cmpq  $0,ELIDING(%rdi)
+ js    not_elided_wrlock
+ jnz   2f
+ /* zero: use default */
+
+ cmpl  $0,__rwlock_rtm_enabled(%rip)
+ jz    not_elided_wrlock
+
+2:
+ XBEGIN abort_wrlock
+
+ /* Lock free? */
+ cmpl  $0,WRITER(%rdi)
+ jnz   1f
+ cmpl  $0,NR_READERS(%rdi)
+ jnz   1f
+
+ /* Lock is free. Run with transaction */
+ xor   %eax,%eax
+ ret
+
+ /* Lock is not free. End transaction */
+1: XABORT 0xff
+ jmp   not_elided_wrlock
+
+ /* Abort happened. */
+abort_wrlock:
+
+ /* Call the abort hook */
+ cmpq  $0,__tsx_abort_hook(%rip)
+ jz    1f
+ push  %rdi
+ cfi_adjust_cfa_offset(8)
+ mov   %eax,%edi
+ call  *__tsx_abort_hook(%rip)
+ pop   %rdi
+ cfi_adjust_cfa_offset(-8)
+
+1:
+
+not_elided_wrlock:
  xorq %r10, %r10
 
  /* Get the lock.  */
@@ -161,5 +206,7 @@ __pthread_rwlock_wrlock:
  cfi_endproc
  .size __pthread_rwlock_wrlock,.-__pthread_rwlock_wrlock
 
+#ifndef __pthread_rwlock_wrlock
 strong_alias (__pthread_rwlock_wrlock, pthread_rwlock_wrlock)
 hidden_def (__pthread_rwlock_wrlock)
+#endif
--
1.7.7.6

[PATCH 6/9] Add __pthread_set_abort_hook export

Andi Kleen-3
From: Andi Kleen <[hidden email]>

For some debugging scenarios it is useful to catch aborts. Looking
at the abort code can be the only way to get information out of a
memory transaction.

This exports a __pthread_set_elision_abort_hook function that sets
this abort hook.
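
For readers who want a feel for how such a hook might be used, here is a
self-contained C model. It is a sketch under assumptions: model_set_abort_hook
stands in for the setter declared in pthread.h by this patch (which also
returns the old hook), the status bit values are defined locally following
the RTM conventions, and the counters are plain variables, so a real hook
called from several threads would need atomic updates.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the hook type in the patch: called with the abort status.  */
typedef void (*model_abort_hook_t) (unsigned);

static model_abort_hook_t model_hook;

/* Stand-in for the exported setter: installs HOOK, returns the old one.  */
static model_abort_hook_t
model_set_abort_hook (model_abort_hook_t hook)
{
  model_abort_hook_t old = model_hook;
  model_hook = hook;
  return old;
}

/* What the lock code does on its abort path, in model form.  */
static void
model_report_abort (unsigned status)
{
  /* A successfully started transaction is never reported.  */
  assert (status != ~0u);
  if (model_hook != NULL)
    model_hook (status);
}

/* Locally defined status bits, following the RTM conventions.  */
#define MODEL_XABORT_EXPLICIT (1u << 0)
#define MODEL_XABORT_CONFLICT (1u << 2)
#define MODEL_XABORT_CODE(s)  (((s) >> 24) & 0xff)

static unsigned long model_busy, model_conflicts, model_other;

/* Example hook: classify aborts.  Code 0xff is what the lock code
   passes to XABORT when it finds the lock busy.  */
static void
model_count_aborts (unsigned status)
{
  if ((status & MODEL_XABORT_EXPLICIT) && MODEL_XABORT_CODE (status) == 0xff)
    model_busy++;
  else if (status & MODEL_XABORT_CONFLICT)
    model_conflicts++;
  else
    model_other++;
}
```

Keeping the previous hook returned by the setter allows a debugging
library to chain to it or restore it when done.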

Requires adding a GLIBC_2.18 namespace.

Separate patch because it changes the external ABI and tags
everything with GLIBC_2.18.  Backports that do not want to change
that can leave this patch out. tst-elision2 -- which tests this
hook -- will then automatically disable itself.

See the manual for an example.

2013-01-22  Andi Kleen  <[hidden email]>

        * Versions.def (GLIBC_2.18): Add for pthread.
        * Versions: Add __pthread_set_abort_hook.
        * sysdeps/pthread/pthread.h (__pthread_set_abort_hook,
         __pthread_abort_hook_t): Add.
        * sysdeps/unix/sysv/linux/i386/nptl/libpthread.abilist: Add 2.18.
        * sysdeps/unix/sysv/linux/x86_64/64/nptl/libpthread.abilist: dito.
        * sysdeps/unix/sysv/linux/x86/abort-hook.c: New file.
        * sysdeps/unix/sysv/linux/x86/Makefile: Add abort-hook.
        * sysdeps/unix/sysv/linux/x86/elision-conf.c (__tsx_abort_hook):
        Remove.
        * sysdeps/unix/sysv/linux/x86/elision-conf.h (SUPPORTS_ABORT_HOOK):
        Add.
---
 Versions.def                                       |    1 +
 nptl/Versions                                      |    4 ++
 nptl/sysdeps/pthread/pthread.h                     |   11 +++++++
 nptl/sysdeps/unix/sysv/linux/x86/Makefile          |    2 +-
 nptl/sysdeps/unix/sysv/linux/x86/abort-hook.c      |   32 ++++++++++++++++++++
 nptl/sysdeps/unix/sysv/linux/x86/elision-conf.c    |    1 -
 nptl/sysdeps/unix/sysv/linux/x86/elision-conf.h    |    1 +
 .../unix/sysv/linux/i386/nptl/libpthread.abilist   |    3 ++
 .../sysv/linux/x86_64/64/nptl/libpthread.abilist   |    3 ++
 9 files changed, 56 insertions(+), 2 deletions(-)
 create mode 100644 nptl/sysdeps/unix/sysv/linux/x86/abort-hook.c

diff --git a/Versions.def b/Versions.def
index 3c9e0ae..899d6ee 100644
--- a/Versions.def
+++ b/Versions.def
@@ -99,6 +99,7 @@ libpthread {
   GLIBC_2.6
   GLIBC_2.11
   GLIBC_2.12
+  GLIBC_2.18
   GLIBC_PRIVATE
 }
 libresolv {
diff --git a/nptl/Versions b/nptl/Versions
index 6a10375..607face 100644
--- a/nptl/Versions
+++ b/nptl/Versions
@@ -252,6 +252,10 @@ libpthread {
     pthread_setname_np; pthread_getname_np;
   };
 
+  GLIBC_2.18 {
+    __pthread_set_elision_abort_hook;
+  };
+
   GLIBC_PRIVATE {
     __pthread_initialize_minimal;
     __pthread_clock_gettime; __pthread_clock_settime;
diff --git a/nptl/sysdeps/pthread/pthread.h b/nptl/sysdeps/pthread/pthread.h
index 7fe7866..d691b8f 100644
--- a/nptl/sysdeps/pthread/pthread.h
+++ b/nptl/sysdeps/pthread/pthread.h
@@ -1170,6 +1170,17 @@ extern int pthread_atfork (void (*__prepare) (void),
    void (*__child) (void)) __THROW;
 
 
+#ifdef __USE_GNU
+typedef void (*__pthread_abort_hook_t) (unsigned);
+
+/* Set an abort hook HOOK that is called when a lock transaction aborts.
+   The HOOK is called with the system specific transaction abort status.
+   Returns the old hook. */
+
+extern __pthread_abort_hook_t
+__pthread_set_elision_abort_hook (__pthread_abort_hook_t hook);
+#endif
+
 #ifdef __USE_EXTERN_INLINES
 /* Optimizations.  */
 __extern_inline int
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/Makefile b/nptl/sysdeps/unix/sysv/linux/x86/Makefile
index 61b7552..e7b9612 100644
--- a/nptl/sysdeps/unix/sysv/linux/x86/Makefile
+++ b/nptl/sysdeps/unix/sysv/linux/x86/Makefile
@@ -1,3 +1,3 @@
 libpthread-sysdep_routines += init-arch
 libpthread-sysdep_routines += elision-lock elision-unlock elision-timed \
-      elision-trylock
+      elision-trylock abort-hook
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/abort-hook.c b/nptl/sysdeps/unix/sysv/linux/x86/abort-hook.c
new file mode 100644
index 0000000..0e2dec0
--- /dev/null
+++ b/nptl/sysdeps/unix/sysv/linux/x86/abort-hook.c
@@ -0,0 +1,32 @@
+/* abort-hook.c: Abort debugging support code.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>. */
+#include <pthreadP.h>
+#include "elision-conf.h"
+
+__pthread_abort_hook_t __tsx_abort_hook attribute_hidden;
+
+/* Allow user programs to hook into the abort handler.
+   This is useful for debugging situations where you need to get
+   information out of a transaction. */
+
+__pthread_abort_hook_t __pthread_set_elision_abort_hook(__pthread_abort_hook_t hook)
+{
+  __pthread_abort_hook_t old = __tsx_abort_hook;
+  __tsx_abort_hook = hook;
+  return old;
+}
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.c b/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.c
index f37c552..a188cb6 100644
--- a/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.c
+++ b/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.c
@@ -224,4 +224,3 @@ void (*const init_array []) (int, char **, char **)
   &elision_init
 };
 
-__pthread_abort_hook_t __tsx_abort_hook attribute_hidden;
diff --git a/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.h b/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.h
index a0017b5..1d97bfc 100644
--- a/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.h
+++ b/nptl/sysdeps/unix/sysv/linux/x86/elision-conf.h
@@ -50,5 +50,6 @@ extern int __pthread_mutex_trylock_rtm (pthread_mutex_t *);
 extern int __pthread_mutex_trylock (pthread_mutex_t *);
 
 #define SUPPORTS_ELISION 1
+#define SUPPORTS_ABORT_HOOK 1
 
 #endif
diff --git a/sysdeps/unix/sysv/linux/i386/nptl/libpthread.abilist b/sysdeps/unix/sysv/linux/i386/nptl/libpthread.abilist
index 827114f..b3f6df9 100644
--- a/sysdeps/unix/sysv/linux/i386/nptl/libpthread.abilist
+++ b/sysdeps/unix/sysv/linux/i386/nptl/libpthread.abilist
@@ -174,6 +174,9 @@ GLIBC_2.12
  pthread_mutexattr_getrobust F
  pthread_mutexattr_setrobust F
  pthread_setname_np F
+GLIBC_2.18
+ GLIBC_2.18 A
+ __pthread_set_elision_abort_hook F
 GLIBC_2.2
  GLIBC_2.2 A
  __open64 F
diff --git a/sysdeps/unix/sysv/linux/x86_64/64/nptl/libpthread.abilist b/sysdeps/unix/sysv/linux/x86_64/64/nptl/libpthread.abilist
index 7c33f35..ef30b70 100644
--- a/sysdeps/unix/sysv/linux/x86_64/64/nptl/libpthread.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/64/nptl/libpthread.abilist
@@ -8,6 +8,9 @@ GLIBC_2.12
  pthread_mutexattr_getrobust F
  pthread_mutexattr_setrobust F
  pthread_setname_np F
+GLIBC_2.18
+ GLIBC_2.18 A
+ __pthread_set_elision_abort_hook F
 GLIBC_2.2.5
  GLIBC_2.2.5 A
  _IO_flockfile F
--
1.7.7.6
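
For readers who want to try the swap-and-return-old pattern used by the hook setter without an RTM machine, here is a minimal sketch. All names in it (abort_hook_t, set_abort_hook, simulate_abort, count_simulated_aborts) are hypothetical local stand-ins modeled on the patch, not the glibc symbols, and the abort itself is only simulated by calling through the hook.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for __pthread_abort_hook_t from the patch.  */
typedef void (*abort_hook_t) (unsigned);

/* Hypothetical stand-in for the hidden hook variable.  */
static abort_hook_t current_hook;

/* Install HOOK and return the previously installed hook (may be NULL).  */
static abort_hook_t
set_abort_hook (abort_hook_t hook)
{
  abort_hook_t old = current_hook;
  current_hook = hook;
  return old;
}

static int abort_count;
static unsigned last_status;

/* Example hook: record how often it ran and the last status it saw.  */
static void
count_aborts (unsigned status)
{
  abort_count++;
  last_status = status;
}

/* Assumption: stands in for the library's abort path, which would call
   the hook with the system specific abort status.  */
static void
simulate_abort (unsigned status)
{
  if (current_hook != NULL)
    current_hook (status);
}

/* Install the counting hook, simulate N aborts, restore the old hook,
   and return how many aborts the hook observed.  */
static int
count_simulated_aborts (int n, unsigned status)
{
  abort_hook_t old;
  int i;

  abort_count = 0;
  old = set_abort_hook (count_aborts);
  for (i = 0; i < n; i++)
    simulate_abort (status);
  set_abort_hook (old);
  return abort_count;
}
```

Returning the old hook lets a debugging helper install its own hook temporarily and put the previous one back afterwards, which is what count_simulated_aborts does here.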


[PATCH 7/9] Extend the test suite for TSX and add some new tests to test elision

Andi Kleen-3
From: Andi Kleen <[hidden email]>

- Some of the existing mutex tests do not work with lock elision
semantics. We run them with PTHREAD_MUTEX=none PTHREAD_RWLOCK=none,
but do not otherwise modify them.

This is expected behaviour, related to RTM lock elision's inability
to track the owner of a lock (see the original description for more
details). These changes are not expected to break any programs; they
mostly affect obscure behaviour.
- I added some ifdefs to these tests and added new variants that
set the defines to test elision behaviour.
- Add new tests that explicitly check for elision.
Note that strictly speaking this violates the TSX spec, which never guarantees
that any transaction succeeds. However, we assume simple transactions will
succeed, and the tests fail if they do not.
- Add tests covering the new lock kinds.
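
The retry idea behind the elision checks can be sketched without RTM hardware. In the sketch below the probe is a hypothetical callback; in the real tests that role is played by _xtest () inside the critical section. The names probe_fn, round_reaches, struct flaky and flaky_probe are made up for illustration; only ITER and MAXTRY mirror the values in the test harness.

```c
#include <assert.h>
#include <stdbool.h>

#define ITER 10     /* Probes per round, as in the test harness.  */
#define MAXTRY 100  /* Rounds before giving up, as in the test harness.  */

/* Hypothetical probe: reports whether one lock/unlock pair ran as a
   transaction.  */
typedef bool (*probe_fn) (void *state);

/* Run up to MAXTRY rounds of ITER probes.  Return true as soon as one
   round sees exactly EXPECTED successes, false if no round does.  */
static bool
round_reaches (probe_fn probe, void *state, int expected)
{
  int attempt, i;

  for (attempt = 0; attempt < MAXTRY; attempt++)
    {
      int txn = 0;

      for (i = 0; i < ITER; i++)
        if (probe (state))
          txn++;
      if (txn == expected)
        return true;
    }
  return false;
}

/* Simulated probe: the first ABORTS_LEFT calls fail, like sporadic
   aborts caused by interrupts; later calls succeed.  */
struct flaky
{
  int aborts_left;
};

static bool
flaky_probe (void *state)
{
  struct flaky *f = state;

  if (f->aborts_left > 0)
    {
      f->aborts_left--;
      return false;
    }
  return true;
}
```

With a few early aborts the first round falls short and a later round passes. If every probe aborts, all MAXTRY rounds fail and the caller reports an error, which corresponds to the "fail if not" case above.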

2013-01-10  Andi Kleen  <[hidden email]>

        * Makefile: Add elision tests. Disable elision for some existing
        tests.
        * tst-abstime.c: Fix for elision.
        * tst-abstime1b.c: New file.
        * tst-abstime1c.c: Likewise.
        * tst-elision-common.c: New file.
        * tst-elision1.c: Likewise.
        * tst-elision1b.c: Likewise.
        * tst-elision2.c: Likewise.
        * tst-elision3.c: Likewise.
        * tst-initializers2-c89.c: Likewise.
        * tst-initializers2-c99.c: Likewise.
        * tst-initializers2-gnu89.c: Likewise.
        * tst-initializers2-gnu99.c: Likewise.
        * tst-initializers2.c: Likewise.
        * tst-mutex5.c: Add ELIDED ifdef.
        * tst-mutex5b.c: New file.
        * tst-mutex5c.c: Likewise.
        * tst-mutex5d.c: Likewise.
        * tst-mutex5e.c: Likewise.
        * tst-mutex7b.c: Likewise.
        * tst-mutex7c.c: Likewise.
        * tst-mutex7d.c: Likewise.
        * tst-mutex7e.c: Likewise.
        * tst-mutex8.c: Add NAME and ELIDED ifdefs.
        * tst-mutex8b.c: New file.
        * tst-mutex8c.c: Likewise.
        * tst-mutex8d.c: Likewise.
        * tst-mutex8e.c: Likewise.
        * tst-mutex8f.c: Likewise.
        * tst-mutex8g.c: Likewise.
---
 nptl/Makefile                  |   26 ++++-
 nptl/tst-abstime.c             |    8 +-
 nptl/tst-abstime1b.c           |    2 +
 nptl/tst-abstime1c.c           |    2 +
 nptl/tst-elision-common.c      |  242 ++++++++++++++++++++++++++++++++++++++++
 nptl/tst-elision1.c            |  121 ++++++++++++++++++++
 nptl/tst-elision1b.c           |    1 +
 nptl/tst-elision2.c            |   88 +++++++++++++++
 nptl/tst-initializers2-c89.c   |    1 +
 nptl/tst-initializers2-c99.c   |    1 +
 nptl/tst-initializers2-gnu89.c |    1 +
 nptl/tst-initializers2-gnu99.c |    1 +
 nptl/tst-initializers2.c       |   50 ++++++++
 nptl/tst-mutex5.c              |    2 +
 nptl/tst-mutex5b.c             |    6 +
 nptl/tst-mutex5c.c             |    2 +
 nptl/tst-mutex5d.c             |    2 +
 nptl/tst-mutex5e.c             |    6 +
 nptl/tst-mutex7b.c             |    2 +
 nptl/tst-mutex7c.c             |    2 +
 nptl/tst-mutex7d.c             |    2 +
 nptl/tst-mutex7e.c             |    2 +
 nptl/tst-mutex8.c              |   37 ++++++-
 nptl/tst-mutex8b.c             |    7 +
 nptl/tst-mutex8c.c             |    7 +
 nptl/tst-mutex8d.c             |    3 +
 nptl/tst-mutex8e.c             |    3 +
 nptl/tst-mutex8f.c             |    3 +
 nptl/tst-mutex8g.c             |    5 +
 29 files changed, 627 insertions(+), 8 deletions(-)
 create mode 100644 nptl/tst-abstime1b.c
 create mode 100644 nptl/tst-abstime1c.c
 create mode 100644 nptl/tst-elision-common.c
 create mode 100644 nptl/tst-elision1.c
 create mode 100644 nptl/tst-elision1b.c
 create mode 100644 nptl/tst-elision2.c
 create mode 100644 nptl/tst-initializers2-c89.c
 create mode 100644 nptl/tst-initializers2-c99.c
 create mode 100644 nptl/tst-initializers2-gnu89.c
 create mode 100644 nptl/tst-initializers2-gnu99.c
 create mode 100644 nptl/tst-initializers2.c
 create mode 100644 nptl/tst-mutex5b.c
 create mode 100644 nptl/tst-mutex5c.c
 create mode 100644 nptl/tst-mutex5d.c
 create mode 100644 nptl/tst-mutex5e.c
 create mode 100644 nptl/tst-mutex7b.c
 create mode 100644 nptl/tst-mutex7c.c
 create mode 100644 nptl/tst-mutex7d.c
 create mode 100644 nptl/tst-mutex7e.c
 create mode 100644 nptl/tst-mutex8b.c
 create mode 100644 nptl/tst-mutex8c.c
 create mode 100644 nptl/tst-mutex8d.c
 create mode 100644 nptl/tst-mutex8e.c
 create mode 100644 nptl/tst-mutex8f.c
 create mode 100644 nptl/tst-mutex8g.c

diff --git a/nptl/Makefile b/nptl/Makefile
index 6af4b37..b53077e 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -204,6 +204,9 @@ tests = tst-typesizes \
  tst-attr1 tst-attr2 tst-attr3 \
  tst-mutex1 tst-mutex2 tst-mutex3 tst-mutex4 tst-mutex5 tst-mutex6 \
  tst-mutex7 tst-mutex8 tst-mutex9 tst-mutex5a tst-mutex7a \
+ tst-mutex8b tst-mutex8c tst-mutex8d tst-mutex8e tst-mutex8f \
+ tst-mutex5b tst-mutex5c tst-mutex5d tst-mutex5e \
+ tst-mutex7b tst-mutex7c tst-mutex7d tst-mutex7e \
  tst-mutexpi1 tst-mutexpi2 tst-mutexpi3 tst-mutexpi4 tst-mutexpi5 \
  tst-mutexpi5a tst-mutexpi6 tst-mutexpi7 tst-mutexpi7a tst-mutexpi8 \
  tst-mutexpi9 \
@@ -262,10 +265,12 @@ tests = tst-typesizes \
  tst-context1 \
  tst-sched1 \
  tst-backtrace1 \
- tst-abstime \
+ tst-abstime tst-abstime1b tst-abstime1c \
  tst-vfork1 tst-vfork2 tst-vfork1x tst-vfork2x \
  tst-getpid1 tst-getpid2 tst-getpid3 \
- tst-initializers1 $(patsubst %,tst-initializers1-%,c89 gnu89 c99 gnu99)
+ tst-initializers1 $(patsubst %,tst-initializers1-%,c89 gnu89 c99 gnu99) \
+ tst-initializers2 $(patsubst %,tst-initializers2-%,c89 gnu89 c99 gnu99) \
+ tst-elision1 tst-elision1b tst-elision2 tst-elision3
 xtests = tst-setuid1 tst-setuid1-static tst-mutexpp1 tst-mutexpp6 tst-mutexpp10
 test-srcs = tst-oddstacklimit
 
@@ -436,11 +441,28 @@ CFLAGS-tst-initializers1-c89.c = $(CFLAGS-tst-initializers1-<)
 CFLAGS-tst-initializers1-c99.c = $(CFLAGS-tst-initializers1-<)
 CFLAGS-tst-initializers1-gnu89.c = $(CFLAGS-tst-initializers1-<)
 CFLAGS-tst-initializers1-gnu99.c = $(CFLAGS-tst-initializers1-<)
+CFLAGS-tst-initializers2.c = -W -Wall -Werror
+CFLAGS-tst-initializers2-< = $(CFLAGS-tst-initializers2.c) \
+     $(patsubst tst-initializers2-%.c,-std=%,$<)
+CFLAGS-tst-initializers2-c89.c = $(CFLAGS-tst-initializers2-<)
+CFLAGS-tst-initializers2-c99.c = $(CFLAGS-tst-initializers2-<)
+CFLAGS-tst-initializers2-gnu89.c = $(CFLAGS-tst-initializers2-<)
+CFLAGS-tst-initializers2-gnu99.c = $(CFLAGS-tst-initializers2-<)
 
 tst-cancel7-ARGS = --command "exec $(host-test-program-cmd)"
 tst-cancelx7-ARGS = $(tst-cancel7-ARGS)
 tst-umask1-ARGS = $(objpfx)tst-umask1.temp
 
+tst-abstime-ENV = PTHREAD_MUTEX=none
+tst-mutex5-ENV = PTHREAD_MUTEX=none
+tst-mutex8-ENV = PTHREAD_MUTEX=none
+tst-mutex8-static-ENV = PTHREAD_MUTEX=none
+tst-mutex8f-ENV = PTHREAD_MUTEX=none
+tst-mutex8g-ENV = PTHREAD_MUTEX=elision
+tst-elision1b-ENV = PTHREAD_MUTEX=none PTHREAD_RWLOCK=none
+# disable adaptation for abort hook test
+tst-elision2-ENV = PTHREAD_MUTEX=elision:retry_lock_internal_abort=0:retry_trylock_internal_abort=0
+
 $(objpfx)tst-atfork2: $(libdl) $(shared-thread-library)
 LDFLAGS-tst-atfork2 = -rdynamic
 tst-atfork2-ENV = MALLOC_TRACE=$(objpfx)tst-atfork2.mtrace
diff --git a/nptl/tst-abstime.c b/nptl/tst-abstime.c
index 99fc7c1..3c91c66 100644
--- a/nptl/tst-abstime.c
+++ b/nptl/tst-abstime.c
@@ -21,9 +21,13 @@
 #include <semaphore.h>
 #include <stdio.h>
 
+#ifndef MUTEX_TYPE
+#define MUTEX_TYPE PTHREAD_MUTEX_INITIALIZER
+#endif
+
 static pthread_cond_t c = PTHREAD_COND_INITIALIZER;
-static pthread_mutex_t m1 = PTHREAD_MUTEX_INITIALIZER;
-static pthread_mutex_t m2 = PTHREAD_MUTEX_INITIALIZER;
+static pthread_mutex_t m1 = MUTEX_TYPE;
+static pthread_mutex_t m2 = MUTEX_TYPE;
 static pthread_rwlock_t rw1 = PTHREAD_RWLOCK_INITIALIZER;
 static pthread_rwlock_t rw2 = PTHREAD_RWLOCK_INITIALIZER;
 static sem_t sem;
diff --git a/nptl/tst-abstime1b.c b/nptl/tst-abstime1b.c
new file mode 100644
index 0000000..9a2953a
--- /dev/null
+++ b/nptl/tst-abstime1b.c
@@ -0,0 +1,2 @@
+#define MUTEX_TYPE PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_TIMED_NP|PTHREAD_MUTEX_NO_ELISION_NP)
+#include "tst-abstime.c"
diff --git a/nptl/tst-abstime1c.c b/nptl/tst-abstime1c.c
new file mode 100644
index 0000000..22ae9a3
--- /dev/null
+++ b/nptl/tst-abstime1c.c
@@ -0,0 +1,2 @@
+#define MUTEX_TYPE PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_ADAPTIVE_NP|PTHREAD_MUTEX_NO_ELISION_NP)
+#include "tst-abstime.c"
diff --git a/nptl/tst-elision-common.c b/nptl/tst-elision-common.c
new file mode 100644
index 0000000..8ddf391
--- /dev/null
+++ b/nptl/tst-elision-common.c
@@ -0,0 +1,242 @@
+/* tst-elision-common: Elision test harness.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <stdio.h>
+
+#define CPUID_FEATURE_RTM (1U << 11)
+
+static int
+cpu_has_rtm (void)
+{
+  if (__get_cpuid_max (0, NULL) >= 7)
+    {
+      unsigned a, b, c, d;
+
+      __cpuid_count (7, 0, a, b, c, d);
+      if (b & CPUID_FEATURE_RTM)
+        return 1;
+    }
+  return 0;
+}
+
+#define ITER 10
+#define MAXTRY 100
+
+pthread_mutex_t lock;
+
+#ifndef TRYLOCK_ONLY
+static int
+pthread_mutex_timedlock_wrapper(pthread_mutex_t *l)
+{
+  struct timespec wait = { 0, 0 };
+  return pthread_mutex_timedlock (l, &wait);
+}
+#endif
+
+/* Note this test program will fail when single stepped.
+   It also assumes that simple transactions always work. There is no
+   guarantee in the architecture that this is the case. We do some
+   retries to handle random abort cases like interrupts. But it's
+   not fully guaranteed. However when this fails it is somewhat worrying. */
+
+int
+run_mutex (int expected, const char *name, int force)
+{
+  int i;
+  int try = 0;
+  int txn __attribute__((unused));
+  int err;
+
+#ifndef TRYLOCK_ONLY
+  TESTLOCK(lock, pthread_mutex_lock, pthread_mutex_unlock, force);
+  TESTLOCK(lock, pthread_mutex_timedlock_wrapper, pthread_mutex_unlock, force);
+  TESTLOCK(lock, pthread_mutex_trylock, pthread_mutex_unlock, force);
+#else
+  TESTTRYLOCK(lock, pthread_mutex_lock, pthread_mutex_trylock, pthread_mutex_unlock, force);
+#endif
+
+  err = pthread_mutex_destroy (&lock);
+  if (err != 0)
+    {
+      printf ("destroy for %s failed: %d\n", name, err);
+      return 1;
+    }
+  return 0;
+}
+
+static int
+run_mutex_init (int iter, const char *name, int type, int has_type, int force)
+{
+  pthread_mutexattr_t attr;
+
+  pthread_mutexattr_init (&attr);
+  pthread_mutexattr_settype (&attr, type);
+  pthread_mutex_init (&lock, has_type ? &attr : NULL);
+  return run_mutex (iter, name, force);
+}
+
+/* This assumes elision is enabled by default. If that changes change
+   the first arguments of the default cases to 0. */
+
+int
+mutex_test (void)
+{
+  int ret = 0;
+
+  lock = (pthread_mutex_t) PTHREAD_MUTEX_INITIALIZER;
+  ret += run_mutex (ITER, "default initializer timed", 0);
+  lock = (pthread_mutex_t) PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_TIMED_NP|
+                                                 PTHREAD_MUTEX_ELISION_NP);
+  ret += run_mutex (ITER, "timed initializer elision", 1);
+  lock = (pthread_mutex_t) PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_TIMED_NP|
+ PTHREAD_MUTEX_NO_ELISION_NP);
+  ret += run_mutex (0, "timed initializer no elision", 2);
+
+  lock = (pthread_mutex_t) PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP;
+  ret += run_mutex (ITER, "adaptive initializer default", 0);
+  lock = (pthread_mutex_t) PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_ADAPTIVE_NP|
+ PTHREAD_MUTEX_ELISION_NP);
+  ret += run_mutex (ITER, "adaptive initializer elision", 1);
+  lock = (pthread_mutex_t) PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_ADAPTIVE_NP|
+ PTHREAD_MUTEX_NO_ELISION_NP);
+  ret += run_mutex (0, "adaptive initializer no elision", 2);
+
+  ret += run_mutex_init (ITER, "timed init default", 0, 0, 0);
+  ret += run_mutex_init (ITER, "timed init elision",
+                         PTHREAD_MUTEX_TIMED_NP|PTHREAD_MUTEX_ELISION_NP, 1, 1);
+  ret += run_mutex_init (0, "timed init no elision",
+ PTHREAD_MUTEX_TIMED_NP|PTHREAD_MUTEX_NO_ELISION_NP, 1, 2);
+
+  ret += run_mutex_init (ITER, "adaptive init default", 0, 0, 0);
+  ret += run_mutex_init (ITER, "adaptive init elision",
+                         PTHREAD_MUTEX_ADAPTIVE_NP|PTHREAD_MUTEX_ELISION_NP, 1, 1);
+  ret += run_mutex_init (0, "adaptive init no elision",
+         PTHREAD_MUTEX_ADAPTIVE_NP|PTHREAD_MUTEX_NO_ELISION_NP,
+  1, 2);
+
+  return ret;
+}
+
+pthread_rwlock_t rwlock;
+
+#ifndef TRYLOCK_ONLY
+static int
+pthread_rwlock_timedwrlock_wrapper(pthread_rwlock_t *l)
+{
+  struct timespec wait = { 0, 0 };
+  return pthread_rwlock_timedwrlock (l, &wait);
+}
+
+static int
+pthread_rwlock_timedrdlock_wrapper(pthread_rwlock_t *l)
+{
+  struct timespec wait = { 0, 0 };
+  return pthread_rwlock_timedrdlock (l, &wait);
+}
+#endif
+
+int
+run_rwlock (int expected, const char *name, int force)
+{
+  int i;
+  int try = 0;
+  int txn __attribute__((unused));
+  int err;
+
+#ifndef TRYLOCK_ONLY
+  TESTLOCK(rwlock, pthread_rwlock_rdlock, pthread_rwlock_unlock, force);
+  TESTLOCK(rwlock, pthread_rwlock_wrlock, pthread_rwlock_unlock, force);
+  TESTLOCK(rwlock, pthread_rwlock_rdlock, pthread_rwlock_unlock, force);
+  TESTLOCK(rwlock, pthread_rwlock_tryrdlock, pthread_rwlock_unlock, force);
+  TESTLOCK(rwlock, pthread_rwlock_trywrlock, pthread_rwlock_unlock, force);
+  TESTLOCK(rwlock, pthread_rwlock_timedrdlock_wrapper,
+   pthread_rwlock_unlock, force);
+  TESTLOCK(rwlock, pthread_rwlock_timedwrlock_wrapper,
+   pthread_rwlock_unlock, force);
+#else
+  TESTTRYLOCK(rwlock, pthread_rwlock_wrlock, pthread_rwlock_trywrlock,
+      pthread_rwlock_unlock, force);
+#endif
+
+  err = pthread_rwlock_destroy (&rwlock);
+  if (err != 0)
+    {
+      printf ("pthread_rwlock_destroy for %s failed: %d\n", name, err);
+      return 1;
+    }
+  return 0;
+}
+
+int
+run_rwlock_attr (int iter, const char *name, int type, int force)
+{
+  pthread_rwlockattr_t attr;
+  pthread_rwlockattr_init (&attr);
+  pthread_rwlockattr_setkind_np (&attr, type);
+  pthread_rwlock_init (&rwlock, &attr);
+  return run_rwlock (iter, name, force);
+}
+
+int
+run_rwlock_attr_set (int iter, const char *extra, int flag, int force)
+{
+  char str[100];
+  int ret = 0;
+
+  snprintf(str, sizeof str, "rwlock attr prefer reader %s", extra);
+  ret += run_rwlock_attr (ITER, str,
+                          PTHREAD_RWLOCK_PREFER_READER_NP | flag, force);
+  snprintf(str, sizeof str, "rwlock attr prefer writer %s", extra);
+  ret += run_rwlock_attr (ITER, str,
+                          PTHREAD_RWLOCK_PREFER_WRITER_NP | flag, force);
+  snprintf(str, sizeof str, "rwlock attr prefer writer non recursive %s", extra);
+  ret += run_rwlock_attr (ITER, str,
+          PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP | flag, force);
+  return ret;
+}
+
+
+int
+rwlock_test (void)
+{
+  int ret = 0;
+
+  pthread_rwlock_init (&rwlock, NULL);
+  ret += run_rwlock (ITER, "rwlock created", 0);
+
+  rwlock = (pthread_rwlock_t)PTHREAD_RWLOCK_INITIALIZER;
+  ret += run_rwlock (ITER, "rwlock initialized", 0);
+
+  rwlock = (pthread_rwlock_t)PTHREAD_RWLOCK_WRITER_NONRECURSIVE_INITIALIZER_NP;
+  ret += run_rwlock (ITER, "rwlock initialized writer non recursive", 0);
+
+  rwlock = (pthread_rwlock_t)PTHREAD_RWLOCK_WRITER_NONRECURSIVE_INITIALIZER_NP;
+  ret += run_rwlock (ITER, "rwlock initialized writer non recursive", 0);
+
+#ifdef PTHREAD_RWLOCK_WRITER_INITIALIZER_NP
+  // XXX includes are missing PTHREAD_RWLOCK_WRITER_INITIALIZER_NP
+  rwlock = (pthread_rwlock_t)PTHREAD_RWLOCK_WRITER_INITIALIZER_NP;
+  ret += run_rwlock (ITER, "rwlock initialized writer", 0);
+#endif
+
+  ret += run_rwlock_attr_set (ITER, "", 0, 0);
+  ret += run_rwlock_attr_set (ITER, "eliding", PTHREAD_RWLOCK_ELISION_NP, 1);
+  ret += run_rwlock_attr_set (ITER, "not eliding", PTHREAD_RWLOCK_NO_ELISION_NP, 2);
+
+  return ret;
+}
diff --git a/nptl/tst-elision1.c b/nptl/tst-elision1.c
new file mode 100644
index 0000000..aad06ec
--- /dev/null
+++ b/nptl/tst-elision1.c
@@ -0,0 +1,121 @@
+/* tst-elision1: Test basic elision success.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <elision-conf.h>
+#if (defined(__i386__) || defined(__x86_64__)) && defined(SUPPORTS_ELISION)
+#include <pthread.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdio.h>
+#include <hle.h>
+
+int disabled;
+int forced;
+
+int
+check (const char *name, const char *lock, int try, int txn, int max,
+       int override)
+{
+  int should_run = 1;
+
+  if (override == 0)
+    should_run = disabled == 0;
+  else if (override == 1)
+    should_run = 1;
+  else if (override == 2)
+    should_run = 0;
+
+  /* forced noop right now, so not tested. But test if the defaults change */
+  if (!should_run)
+    {
+      if (txn != 0)
+        {
+          printf ("%s %s transaction run unexpected\n", name, lock);
+          return 1;
+        }
+    }
+  else
+    {
+      if (try == max)
+        {
+          printf ("%s %s no transactions when expected\n", name, lock);
+          return 1;
+        }
+    }
+  return 0;
+}
+
+#define TESTLOCK(l, lock, unlock, force)\
+  do \
+    { \
+      txn = 0; \
+      for (i = 0; i < ITER; i++) \
+        { \
+          lock (&l); \
+          if (_xtest ()) \
+            txn++; \
+          unlock (&l); \
+        } \
+    } \
+  while (try++ < MAXTRY && txn != expected); \
+  if (check (name, #lock, try, txn, MAXTRY, force)) \
+    return 1;
+
+#include "tst-elision-common.c"
+
+int
+do_test (void)
+{
+  if (!cpu_has_rtm ())
+    {
+      printf ("elision test requires RTM capable CPU. not tested\n");
+      return 0;
+    }
+
+  char *s = getenv ("PTHREAD_MUTEX");
+  if (s)
+    {
+      char *o = getenv ("PTHREAD_RWLOCK");
+      if (!o || strcmp (o, s))
+        {
+          puts("PTHREAD_MUTEX and PTHREAD_RWLOCK must match for test!\n");
+          return 1;
+        }
+      if (!strcmp (s, "none"))
+        disabled = 1;
+      if (!strcmp (s, "elision"))
+        forced = 1;
+    }
+
+  if (mutex_test ())
+    return 1;
+
+  if (rwlock_test ())
+    return 1;
+
+  return 0;
+}
+#else
+int do_test (void)
+{
+  return 0;
+}
+#endif
+
+#define TEST_FUNCTION do_test ()
+#include "../test-skeleton.c"
diff --git a/nptl/tst-elision1b.c b/nptl/tst-elision1b.c
new file mode 100644
index 0000000..9f5ec3d
--- /dev/null
+++ b/nptl/tst-elision1b.c
@@ -0,0 +1 @@
+#include "tst-elision1.c"
diff --git a/nptl/tst-elision2.c b/nptl/tst-elision2.c
new file mode 100644
index 0000000..09717b6
--- /dev/null
+++ b/nptl/tst-elision2.c
@@ -0,0 +1,88 @@
+/* tst-elision2: Test TSX abort hook.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#if defined(__i386__) || defined(__x86_64__)
+#include <pthread.h>
+#include <stdio.h>
+#include <hle.h>
+#include <cpuid.h>
+
+FILE *null;
+int abort_count;
+
+void do_abort(unsigned code)
+{
+  abort_count++;
+
+  /* Do something here that clobbers registers */
+  fprintf (null, "abort %x %d\n", code, abort_count);
+}
+
+#define TESTLOCK(l, lock, unlock, force) \
+   do \
+    { \
+      abort_count = 0; \
+      fprintf (null, "start %s %s\n", #lock, name); \
+      for (i = 0; i < ITER; i++) \
+        { \
+          lock (&l); \
+          _xabort(1); \
+          unlock (&l); \
+        } \
+    } \
+  while (try++ < MAXTRY && abort_count != ITER);\
+  if (abort_count != ITER && force != 2) \
+    { \
+      printf ("%s %s abort hook did not work %d\n", name, #lock, abort_count); \
+      return 1; \
+    }
+
+#include "tst-elision-common.c"
+
+int
+do_test (void)
+{
+  if (!cpu_has_rtm ())
+    {
+      printf ("elision test requires RTM capable CPU. not tested\n");
+      return 0;
+    }
+
+  null = fopen ("/dev/null", "w");
+  if (!null)
+    null = stdout;
+
+  __pthread_set_elision_abort_hook (do_abort);
+
+  if (mutex_test ())
+    return 1;
+
+  if (rwlock_test ())
+    return 1;
+
+  return 0;
+}
+#else
+int do_test (void)
+{
+  return 0;
+}
+#endif
+
+#define TEST_FUNCTION do_test ()
+#include "../test-skeleton.c"
diff --git a/nptl/tst-initializers2-c89.c b/nptl/tst-initializers2-c89.c
new file mode 100644
index 0000000..1fb8af6
--- /dev/null
+++ b/nptl/tst-initializers2-c89.c
@@ -0,0 +1 @@
+#include "tst-initializers2.c"
diff --git a/nptl/tst-initializers2-c99.c b/nptl/tst-initializers2-c99.c
new file mode 100644
index 0000000..1fb8af6
--- /dev/null
+++ b/nptl/tst-initializers2-c99.c
@@ -0,0 +1 @@
+#include "tst-initializers2.c"
diff --git a/nptl/tst-initializers2-gnu89.c b/nptl/tst-initializers2-gnu89.c
new file mode 100644
index 0000000..1fb8af6
--- /dev/null
+++ b/nptl/tst-initializers2-gnu89.c
@@ -0,0 +1 @@
+#include "tst-initializers2.c"
diff --git a/nptl/tst-initializers2-gnu99.c b/nptl/tst-initializers2-gnu99.c
new file mode 100644
index 0000000..1fb8af6
--- /dev/null
+++ b/nptl/tst-initializers2-gnu99.c
@@ -0,0 +1 @@
+#include "tst-initializers2.c"
diff --git a/nptl/tst-initializers2.c b/nptl/tst-initializers2.c
new file mode 100644
index 0000000..42a5033
--- /dev/null
+++ b/nptl/tst-initializers2.c
@@ -0,0 +1,50 @@
+/* Copyright (C) 2005, 2006, 2007, 2012 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* We test the code under conditions outside of glibc.  */
+#undef _LIBC
+
+#include <pthread.h>
+
+/* Test initializers for elided locks */
+
+pthread_mutex_t mtx_timed_elision = PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_TIMED_NP|
+  PTHREAD_MUTEX_ELISION_NP);
+pthread_mutex_t mtx_timed_no_elision = PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_TIMED_NP|
+     PTHREAD_MUTEX_NO_ELISION_NP);
+pthread_mutex_t mtx_adaptive_elision = PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_ADAPTIVE_NP|
+     PTHREAD_MUTEX_ELISION_NP);
+pthread_mutex_t mtx_adaptive_no_elision = PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_ADAPTIVE_NP|
+ PTHREAD_MUTEX_NO_ELISION_NP);
+
+int
+main (void)
+{
+  if (mtx_timed_elision.__data.__kind !=
+      (PTHREAD_MUTEX_TIMED_NP|PTHREAD_MUTEX_ELISION_NP))
+    return 1;
+  if (mtx_timed_no_elision.__data.__kind !=
+      (PTHREAD_MUTEX_TIMED_NP|PTHREAD_MUTEX_NO_ELISION_NP))
+    return 2;
+  if (mtx_adaptive_elision.__data.__kind !=
+      (PTHREAD_MUTEX_ADAPTIVE_NP|PTHREAD_MUTEX_ELISION_NP))
+    return 3;
+  if (mtx_adaptive_no_elision.__data.__kind !=
+      (PTHREAD_MUTEX_ADAPTIVE_NP|PTHREAD_MUTEX_NO_ELISION_NP))
+    return 4;
+  return 0;
+}
diff --git a/nptl/tst-mutex5.c b/nptl/tst-mutex5.c
index f19cd8c..4410a47 100644
--- a/nptl/tst-mutex5.c
+++ b/nptl/tst-mutex5.c
@@ -85,6 +85,7 @@ do_test (void)
       return 1;
     }
 
+#ifndef ELIDED
   if (pthread_mutex_trylock (&m) == 0)
     {
       puts ("mutex_trylock succeeded");
@@ -186,6 +187,7 @@ do_test (void)
       puts ("final mutex_unlock failed");
       return 1;
     }
+#endif
 
   if (pthread_mutex_destroy (&m) != 0)
     {
diff --git a/nptl/tst-mutex5b.c b/nptl/tst-mutex5b.c
new file mode 100644
index 0000000..af9cca8
--- /dev/null
+++ b/nptl/tst-mutex5b.c
@@ -0,0 +1,6 @@
+#define TYPE PTHREAD_MUTEX_TIMED_NP|PTHREAD_MUTEX_ELISION_NP
+#include <elision-conf.h>
+#ifdef SUPPORTS_ELISION
+#define ELIDED 1
+#endif
+#include "tst-mutex5.c"
diff --git a/nptl/tst-mutex5c.c b/nptl/tst-mutex5c.c
new file mode 100644
index 0000000..ae92173
--- /dev/null
+++ b/nptl/tst-mutex5c.c
@@ -0,0 +1,2 @@
+#define TYPE PTHREAD_MUTEX_TIMED_NP|PTHREAD_MUTEX_NO_ELISION_NP
+#include "tst-mutex5.c"
diff --git a/nptl/tst-mutex5d.c b/nptl/tst-mutex5d.c
new file mode 100644
index 0000000..328c4c7
--- /dev/null
+++ b/nptl/tst-mutex5d.c
@@ -0,0 +1,2 @@
+#define TYPE PTHREAD_MUTEX_ADAPTIVE_NP|PTHREAD_MUTEX_NO_ELISION_NP
+#include "tst-mutex5.c"
diff --git a/nptl/tst-mutex5e.c b/nptl/tst-mutex5e.c
new file mode 100644
index 0000000..bf5a109
--- /dev/null
+++ b/nptl/tst-mutex5e.c
@@ -0,0 +1,6 @@
+#define TYPE PTHREAD_MUTEX_ADAPTIVE_NP|PTHREAD_MUTEX_ELISION_NP
+#include <elision-conf.h>
+#ifdef SUPPORTS_ELISION
+#define ELIDED 1
+#endif
+#include "tst-mutex5.c"
diff --git a/nptl/tst-mutex7b.c b/nptl/tst-mutex7b.c
new file mode 100644
index 0000000..be39fe2
--- /dev/null
+++ b/nptl/tst-mutex7b.c
@@ -0,0 +1,2 @@
+#define TYPE PTHREAD_MUTEX_TIMED_NP|PTHREAD_MUTEX_ELISION_NP
+#include "tst-mutex7.c"
diff --git a/nptl/tst-mutex7c.c b/nptl/tst-mutex7c.c
new file mode 100644
index 0000000..ab23c42
--- /dev/null
+++ b/nptl/tst-mutex7c.c
@@ -0,0 +1,2 @@
+#define TYPE PTHREAD_MUTEX_TIMED_NP|PTHREAD_MUTEX_NO_ELISION_NP
+#include "tst-mutex7.c"
diff --git a/nptl/tst-mutex7d.c b/nptl/tst-mutex7d.c
new file mode 100644
index 0000000..1b67110
--- /dev/null
+++ b/nptl/tst-mutex7d.c
@@ -0,0 +1,2 @@
+#define TYPE PTHREAD_MUTEX_ADAPTIVE_NP|PTHREAD_MUTEX_NO_ELISION_NP
+#include "tst-mutex7.c"
diff --git a/nptl/tst-mutex7e.c b/nptl/tst-mutex7e.c
new file mode 100644
index 0000000..eab165e
--- /dev/null
+++ b/nptl/tst-mutex7e.c
@@ -0,0 +1,2 @@
+#define TYPE PTHREAD_MUTEX_ADAPTIVE_NP|PTHREAD_MUTEX_ELISION_NP
+#include "tst-mutex7.c"
diff --git a/nptl/tst-mutex8.c b/nptl/tst-mutex8.c
index 72dc9d4..e7446a5 100644
--- a/nptl/tst-mutex8.c
+++ b/nptl/tst-mutex8.c
@@ -23,6 +23,9 @@
 #include <stdio.h>
 #include <stdlib.h>
 
+#ifndef NAME
+#define NAME "normal"
+#endif
 
 static pthread_mutex_t *m;
 static pthread_barrier_t b;
@@ -93,6 +96,8 @@ tf (void *arg)
 static int
 check_type (const char *mas, pthread_mutexattr_t *ma)
 {
+  int e __attribute__((unused));
+
   if (pthread_mutex_init (m, ma) != 0)
     {
       printf ("1st mutex_init failed for %s\n", mas);
@@ -117,7 +122,8 @@ check_type (const char *mas, pthread_mutexattr_t *ma)
       return 1;
     }
 
-  int e = pthread_mutex_destroy (m);
+#ifndef ELIDED
+  e = pthread_mutex_destroy (m);
   if (e == 0)
     {
       printf ("mutex_destroy of self-locked mutex succeeded for %s\n", mas);
@@ -129,6 +135,7 @@ check_type (const char *mas, pthread_mutexattr_t *ma)
       mas);
       return 1;
     }
+#endif
 
   if (pthread_mutex_unlock (m) != 0)
     {
@@ -142,6 +149,7 @@ check_type (const char *mas, pthread_mutexattr_t *ma)
       return 1;
     }
 
+#ifndef ELIDED
   e = pthread_mutex_destroy (m);
   if (e == 0)
     {
@@ -155,6 +163,7 @@ mutex_destroy of self-trylocked mutex did not return EBUSY %s\n",
       mas);
       return 1;
     }
+#endif
 
   if (pthread_mutex_unlock (m) != 0)
     {
@@ -189,6 +198,7 @@ mutex_destroy of self-trylocked mutex did not return EBUSY %s\n",
       return 1;
     }
 
+#ifndef ELIDED
   e = pthread_mutex_destroy (m);
   if (e == 0)
     {
@@ -201,6 +211,7 @@ mutex_destroy of self-trylocked mutex did not return EBUSY %s\n",
 mutex_destroy of condvar-used mutex did not return EBUSY for %s\n", mas);
       return 1;
     }
+#endif
 
   done = true;
   if (pthread_cond_signal (&c) != 0)
@@ -259,6 +270,7 @@ mutex_destroy of condvar-used mutex did not return EBUSY for %s\n", mas);
       return 1;
     }
 
+#ifndef ELIDED
   e = pthread_mutex_destroy (m);
   if (e == 0)
     {
@@ -273,6 +285,7 @@ mutex_destroy of condvar-used mutex did not return EBUSY for %s\n", mas);
       mas);
       return 1;
     }
+#endif
 
   if (pthread_cancel (th) != 0)
     {
@@ -304,6 +317,7 @@ mutex_destroy of condvar-used mutex did not return EBUSY for %s\n", mas);
 static int
 do_test (void)
 {
+  pthread_mutexattr_t ma;
   pthread_mutex_t mm;
   m = &mm;
 
@@ -319,10 +333,25 @@ do_test (void)
       return 1;
     }
 
-  puts ("check normal mutex");
-  int res = check_type ("normal", NULL);
+#ifdef TYPE
+  if (pthread_mutexattr_init (&ma) != 0)
+    {
+      puts ("0th mutexattr_init failed");
+      return 1;
+    }
+  if (pthread_mutexattr_settype (&ma, TYPE) != 0)
+    {
+      puts ("0th mutexattr_settype failed");
+      return 1;
+    }
+
+  puts ("check " NAME " mutex");
+  int res = check_type (NAME, &ma);
+#else
+  puts ("check " NAME " mutex");
+  int res = check_type (NAME, NULL);
+#endif
 
-  pthread_mutexattr_t ma;
   if (pthread_mutexattr_init (&ma) != 0)
     {
       puts ("1st mutexattr_init failed");
diff --git a/nptl/tst-mutex8b.c b/nptl/tst-mutex8b.c
new file mode 100644
index 0000000..ed8570d
--- /dev/null
+++ b/nptl/tst-mutex8b.c
@@ -0,0 +1,7 @@
+#define NAME "timed elided"
+#include <elision-conf.h>
+#ifdef SUPPORTS_ELISION
+#define ELIDED 1
+#endif
+#define TYPE PTHREAD_MUTEX_TIMED_NP|PTHREAD_MUTEX_ELISION_NP
+#include "tst-mutex8.c"
diff --git a/nptl/tst-mutex8c.c b/nptl/tst-mutex8c.c
new file mode 100644
index 0000000..12c41a9
--- /dev/null
+++ b/nptl/tst-mutex8c.c
@@ -0,0 +1,7 @@
+#define NAME "adaptive elided"
+#include <elision-conf.h>
+#ifdef SUPPORTS_ELISION
+#define ELIDED 1
+#endif
+#define TYPE PTHREAD_MUTEX_ADAPTIVE_NP|PTHREAD_MUTEX_ELISION_NP
+#include "tst-mutex8.c"
diff --git a/nptl/tst-mutex8d.c b/nptl/tst-mutex8d.c
new file mode 100644
index 0000000..e58d281
--- /dev/null
+++ b/nptl/tst-mutex8d.c
@@ -0,0 +1,3 @@
+#define NAME "timed not elided"
+#define TYPE PTHREAD_MUTEX_TIMED_NP|PTHREAD_MUTEX_NO_ELISION_NP
+#include "tst-mutex8.c"
diff --git a/nptl/tst-mutex8e.c b/nptl/tst-mutex8e.c
new file mode 100644
index 0000000..ef0f82f
--- /dev/null
+++ b/nptl/tst-mutex8e.c
@@ -0,0 +1,3 @@
+#define NAME "adaptive not elided"
+#define TYPE PTHREAD_MUTEX_ADAPTIVE_NP|PTHREAD_MUTEX_NO_ELISION_NP
+#include "tst-mutex8.c"
diff --git a/nptl/tst-mutex8f.c b/nptl/tst-mutex8f.c
new file mode 100644
index 0000000..9d203d2
--- /dev/null
+++ b/nptl/tst-mutex8f.c
@@ -0,0 +1,3 @@
+#define NAME "adaptive"
+#define TYPE PTHREAD_MUTEX_ADAPTIVE_NP
+#include "tst-mutex8.c"
diff --git a/nptl/tst-mutex8g.c b/nptl/tst-mutex8g.c
new file mode 100644
index 0000000..78f1395
--- /dev/null
+++ b/nptl/tst-mutex8g.c
@@ -0,0 +1,5 @@
+#include <elision-conf.h>
+#ifdef SUPPORTS_ELISION
+#define ELIDED 1
+#endif
+#include "tst-mutex8.c"
--
1.7.7.6


[PATCH 8/9] Fix tst-mutexpi8

Andi Kleen-3
In reply to this post by Andi Kleen-3
From: Andi Kleen <[hidden email]>

tst-mutexpi8 didn't actually test PI because tst-mutex8 didn't have any
code to enable priority inheritance. Add the needed code in an ifdef.

This also fixes it for lock elision because priority inheritance turns
off elision, so we don't need to explicitly disable it.

2013-01-22  Andi Kleen  <[hidden email]>

        * tst-mutex8.c: Check for ENABLE_PI
        * tst-mutexpi8.c:  Set ENABLE_PI.
---
 nptl/tst-mutex8.c   |   22 +++++++++++++++++++++-
 nptl/tst-mutexpi8.c |    1 +
 2 files changed, 22 insertions(+), 1 deletions(-)

diff --git a/nptl/tst-mutex8.c b/nptl/tst-mutex8.c
index e7446a5..765c048 100644
--- a/nptl/tst-mutex8.c
+++ b/nptl/tst-mutex8.c
@@ -344,7 +344,13 @@ do_test (void)
       puts ("0th mutexattr_settype failed");
       return 1;
     }
-
+#ifdef ENABLE_PI
+  if (pthread_mutexattr_setprotocol (&ma, PTHREAD_PRIO_INHERIT))
+    {
+      puts ("pthread_mutexattr_setprotocol2 failed");
+      return 1;
+    }
+#endif
   puts ("check " NAME " mutex");
   int res = check_type (NAME, &ma);
 #else
@@ -362,6 +368,13 @@ do_test (void)
       puts ("1st mutexattr_settype failed");
       return 1;
     }
+#ifdef ENABLE_PI
+  if (pthread_mutexattr_setprotocol (&ma, PTHREAD_PRIO_INHERIT))
+    {
+      puts ("pthread_mutexattr_setprotocol2 failed");
+      return 1;
+    }
+#endif
   puts ("check recursive mutex");
   res |= check_type ("recursive", &ma);
   if (pthread_mutexattr_destroy (&ma) != 0)
@@ -380,6 +393,13 @@ do_test (void)
       puts ("2nd mutexattr_settype failed");
       return 1;
     }
+#ifdef ENABLE_PI
+  if (pthread_mutexattr_setprotocol (&ma, PTHREAD_PRIO_INHERIT))
+    {
+      puts ("pthread_mutexattr_setprotocol3 failed");
+      return 1;
+    }
+#endif
   puts ("check error-checking mutex");
   res |= check_type ("error-checking", &ma);
   if (pthread_mutexattr_destroy (&ma) != 0)
diff --git a/nptl/tst-mutexpi8.c b/nptl/tst-mutexpi8.c
index cea6030..4aae694 100644
--- a/nptl/tst-mutexpi8.c
+++ b/nptl/tst-mutexpi8.c
@@ -1,2 +1,3 @@
+#define TYPE PTHREAD_MUTEX_TIMED_NP
 #define ENABLE_PI 1
 #include "tst-mutex8.c"
--
1.7.7.6


[PATCH 9/9] Add manual for lock elision

Andi Kleen-3
In reply to this post by Andi Kleen-3
From: Andi Kleen <[hidden email]>

pthreads are not described in the documentation, but I decided to document
lock elision there at least.

2013-01-22  Andi Kleen  <[hidden email]>

        * manual/Makefile: Add elision.texi.
        * manual/debug.texi: Link to elision.
        * manual/elision.texi: New file.
        * manual/intro.texi: Link to elision.
        * manual/lang.texi: Ditto.
---
 manual/Makefile     |    2 +-
 manual/debug.texi   |    2 +-
 manual/elision.texi |  337 +++++++++++++++++++++++++++++++++++++++++++++++++++
 manual/intro.texi   |    3 +
 manual/lang.texi    |    2 +-
 5 files changed, 343 insertions(+), 3 deletions(-)
 create mode 100644 manual/elision.texi

diff --git a/manual/Makefile b/manual/Makefile
index c1a304c..abcc744 100644
--- a/manual/Makefile
+++ b/manual/Makefile
@@ -42,7 +42,7 @@ chapters = $(addsuffix .texi, \
        message search pattern io stdio llio filesys \
        pipe socket terminal syslog math arith time \
        resource setjmp signal startup process job nss \
-       users sysinfo conf crypt debug)
+       users sysinfo conf crypt debug elision)
 add-chapters = $(wildcard $(foreach d, $(add-ons), ../$d/$d.texi))
 appendices = lang.texi header.texi install.texi maint.texi platform.texi \
      contrib.texi
diff --git a/manual/debug.texi b/manual/debug.texi
index b2bcb31..722e660 100644
--- a/manual/debug.texi
+++ b/manual/debug.texi
@@ -1,5 +1,5 @@
 @node Debugging Support
-@c @node Debugging Support, , Cryptographic Functions, Top
+@c @node Debugging Support, Lock elision, Cryptographic Functions, Top
 @c %MENU% Functions to help debugging applications
 @chapter Debugging support
 
diff --git a/manual/elision.texi b/manual/elision.texi
new file mode 100644
index 0000000..a1cf839
--- /dev/null
+++ b/manual/elision.texi
@@ -0,0 +1,337 @@
+@node Lock elision, Language Features, Debugging Support, Top
+@c %MENU% Lock elision
+@chapter Lock elision
+
+@c create the bizarre situation that lock elision is documented, but pthreads isn't
+
+This chapter describes the lock elision implementation for pthread
+locks.
+
+@menu
+* Lock elision introduction:: What is lock elision?
+* Semantic differences of elided locks::
+* Tuning lock elision::
+* Setting elision for individual @code{pthread_mutex_t}::
+* Setting @code{pthread_mutex_t} elision using environment variables::
+* Setting elision for individual @code{pthread_rwlock_t}::
+* Setting @code{pthread_rwlock_t} elision using environment variables::
+* Abort hooks::
+@end menu
+
+@node Lock elision introduction
+@section Lock elision introduction
+
+Lock elision is a technique to improve lock scaling. It runs
+lock regions in parallel using hardware support for a transactional execution
+mode. The lock region is executed speculatively, and as long
+as there is no conflict or other reason for a transaction abort the lock
+region will execute in parallel. If a transaction abort occurs, any
+side effect of the speculative execution is undone, the lock is taken
+for real and the lock region re-executed. This improves scalability
+of the program because locks do not need to wait for each other.
+
+The standard @code{pthread_mutex_t} mutexes and @code{pthread_rwlock_t} rwlocks
+can be transparently elided by the C library.
+
+Lock elision may lower performance if transaction aborts occur too frequently.
+In this case it is recommended to use a PMU profiler to find the causes for
+the aborts first and try to eliminate them. If that is not possible
+elision can be disabled for a specific lock or for the whole program.
+Alternatively elision can be disabled completely, and only enabled for
+specific locks that are known to be elision friendly.
+
+The default locks are adaptive. The lock library decides whether elision
+is profitable based on the abort rates, and automatically disables
+elision for a lock when it aborts too often. After some time elision
+is retried, in case the workload changed.
+
+Lock elision is currently supported for default (timed) mutexes and for
+adaptive mutexes. Other lock types do not elide. Condition variables
+also do not elide. This may change in future versions.
+
+@node Semantic differences of elided locks
+@section Semantic differences of elided locks
+
+Elided locks have some semantic differences from classic locks. These differences
+are only visible when the lock is successfully elided. Since elision may always
+fail, a program cannot rely on any of these semantics.
+
+@itemize
+@item
+Elided locks always behave like read-write locks.
+
+@item
+Mutexes and write rwlocks can be locked recursively inside the lock region.
+This behavior is visible through @code{pthread_mutex_trylock}. This
+behavior is not enabled by default for default timed locks, only
+for locks that have been explicitly marked for elision with
+@code{PTHREAD_MUTEX_ELISION_NP}. The default locks will abort
+elision for nested trylocks.
+
+@smallexample
+pthread_mutex_lock (&lock);
+if (pthread_mutex_trylock (&lock) == 0)
+      /* with elision we come here */
+else
+      /* with no elision we always come here */
+@end smallexample
+
+And also through @code{pthread_mutex_timedlock}. This behavior is unconditional
+for elided locks.
+
+@smallexample
+pthread_mutex_lock (&lock);
+if (pthread_mutex_timedlock (&lock, &timeout) == 0)
+     /* With elision we always come here */
+else
+     /* With no elision we always come here because timeout happens. */
+@end smallexample
+
+Similar semantic changes apply to @code{pthread_rwlock_trywrlock} and
+@code{pthread_rwlock_timedwrlock}.
+
+@item
+@code{pthread_mutex_destroy} does not return an error when the lock is locked
+and will clear the lock state.
+
+@item
+@code{pthread_mutex_t} and @code{pthread_rwlock_t} appear free from other threads.
+
+This can be visible through trylock or timedlock.
+In most cases checking this is an existing latent race in the program, but there may
+be rare cases when it is not.
+
+@item
+@code{EAGAIN} and @code{EDEADLK} in rwlocks will not happen under elision.
+
+@item
+@code{pthread_mutex_unlock} does not return an error when unlocking a free lock.
+
+@item
+Elision changes timing because locks now run in parallel.
+Timing differences may expose latent race bugs in the program. Programs using time based synchronization
+(as opposed to using data dependencies) may change behavior.
+
+@end itemize
+
+@node Tuning lock elision
+@section Tuning lock elision
+
+Critical regions may need some tuning to get the benefit of lock elision.
+This is based on the abort rates, which can be determined by a PMU profiler
+(e.g. perf on GNU/Linux systems). When the abort rate is too high lock
+scaling will not improve. Generally lock elision tuning should be done
+only based on profile feedback.
+
+Most of these optimizations will improve performance even without lock elision
+because they will minimize cache line bouncing between threads or make
+lock regions smaller.
+
+Common causes of transactional aborts:
+
+@itemize
+@item
+Non-elidable operations such as system calls, I/O, and CPU exceptions.
+
+Try to move them out of the critical section when they are common. Note that these often happen at program startup only.
+@item
+Global statistics counters
+
+Global statistics variables tend to cause conflicts. Either disable them, make them per-thread, or as a last resort sample
+(do not update on every operation).
+@item
+False sharing of variables or data structures causing conflicts with other threads
+
+Add padding as needed.
+@item
+Other conflicts on the same cache lines with other threads
+
+Minimize conflicts with other threads. This may require changes to the data structures.
+@item
+Capacity overflow
+
+The memory transaction used for lock elision has a limited capacity. Make the critical region smaller
+or move operations that do not need to be protected by the lock outside.
+
+@item
+Rewriting already set flags
+
+Setting flags or variables in shared objects that are already set may cause conflicts. Add a check
+to only write when the value changed.
+@end itemize
+
+@node Setting elision for individual @code{pthread_mutex_t}
+@section Setting elision for individual @code{pthread_mutex_t}
+
+Elision can be explicitly disabled or enabled for each @code{pthread_mutex_t} in the program.
+This overrides any other defaults set by environment variables for this lock.
+
+@code{pthread_mutex_t} initializers for use in variable initializations.
+
+@itemize
+@item
+PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_TIMED_NP|PTHREAD_MUTEX_ELISION_NP)
+Force lock elision for a (default) timed mutex.
+
+@item
+PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_TIMED_NP|PTHREAD_MUTEX_NO_ELISION_NP)
+Force no lock elision for a (default) timed mutex.
+
+@item
+PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_ADAPTIVE_NP|PTHREAD_MUTEX_ELISION_NP)
+Force lock elision for an adaptive mutex.
+
+@item
+PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_ADAPTIVE_NP|PTHREAD_MUTEX_NO_ELISION_NP)
+Force no lock elision for an adaptive mutex.
+@end itemize
+
+@smallexample
+/* Disable lock elision for mylock */
+pthread_mutex_t mylock = PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_TIMED_NP|PTHREAD_MUTEX_NO_ELISION_NP);
+@end smallexample
+
+The lock type can also be set at runtime using @code{pthread_mutexattr_settype} and @code{pthread_mutex_init}.
+
+@smallexample
+/* Force lock elision for a dynamically allocated mutex */
+pthread_mutexattr_t attr;
+pthread_mutexattr_init (&attr);
+pthread_mutexattr_settype (&attr, PTHREAD_MUTEX_TIMED_NP|PTHREAD_MUTEX_ELISION_NP);
+pthread_mutex_init (&object->mylock, &attr);
+@end smallexample
+
+@code{pthread_mutexattr_gettype} will return the additional flags too.
+
+@node Setting @code{pthread_mutex_t} elision using environment variables
+@section Setting @code{pthread_mutex_t} elision using environment variables
+The elision of @code{pthread_mutex_t} mutexes can be configured at runtime with the @code{PTHREAD_MUTEX}
+environment variable.  This will force a specific lock type for all
+mutexes in the program that do not have another type set explicitly.
+An explicitly set lock type will override the environment variable.
+
+@smallexample
+# run myprogram with no elision
+PTHREAD_MUTEX=none myprogram
+@end smallexample
+
+The default depends on the C library build configuration and whether the hardware
+supports lock elision.
+
+@itemize
+@item    
+@code{PTHREAD_MUTEX=elision}
+Use elided mutexes, unless explicitly disabled in the program.
+    
+@item
+@code{PTHREAD_MUTEX=none}
+Don't use elided mutexes, unless explicitly enabled in the program.
+@end itemize
+
+Additional tunables can be configured through the environment variable,
+like this:
+@code{PTHREAD_MUTEX=adaptive:retry_lock_busy=10,retry_lock_internal_abort=20}
+Note these parameters do not constitute an ABI and may change or disappear
+at any time as the lock elision algorithm evolves.
+
+Currently supported parameters are:
+    
+@itemize
+@item
+retry_lock_busy
+How many lock attempts skip the transaction after the lock was seen as busy.
+    
+@item
+retry_lock_internal_abort
+How many lock attempts skip the transaction after an internal abort was seen.
+
+@item    
+retry_try_xbegin
+How often to retry the transaction on external aborts.
+
+@item
+retry_trylock_internal_abort
+How often to retry the transaction on internal aborts during trylock.
+This setting is also used for adaptive locks.
+
+@end itemize
+
+@node Setting elision for individual @code{pthread_rwlock_t}
+@section Setting elision for individual @code{pthread_rwlock_t}
+
+Elision can be explicitly disabled or enabled for each @code{pthread_rwlock_t} in the program.
+This overrides any other defaults set by environment variables for this lock.
+
+Valid flags are @code{PTHREAD_RWLOCK_ELISION_NP} to force elision and @code{PTHREAD_RWLOCK_NO_ELISION_NP}
+to disable elision. These can be ored with other rwlock types.
+
+@smallexample
+/* Force no lock elision for a dynamically allocated rwlock */
+pthread_rwlockattr_t rwattr;
+pthread_rwlockattr_init (&rwattr);
+pthread_rwlockattr_settype (&rwattr, PTHREAD_RWLOCK_NO_ELISION_NP);
+pthread_rwlock_init (&object->myrwlock, &rwattr);
+@end smallexample
+
+@node Setting @code{pthread_rwlock_t} elision using environment variables
+@section Setting @code{pthread_rwlock_t} elision using environment variables
+The elision of @code{pthread_rwlock_t} rwlocks can be configured at
+runtime with the @code{PTHREAD_RWLOCK} environment variable.
+This will force a specific lock type for all
+rwlocks in the program that do not have another type set explicitly.
+An explicitly set lock type will override the environment variable.
+
+@smallexample
+# run myprogram with no elision
+PTHREAD_RWLOCK=none myprogram
+@end smallexample
+
+The default depends on the C library build configuration and whether the hardware
+supports lock elision.
+
+@itemize
+@item    
+@code{PTHREAD_RWLOCK=elision}
+Use elided rwlocks, unless explicitly disabled in the program.
+    
+@item
+@code{PTHREAD_RWLOCK=none}
+Don't use elided rwlocks, unless explicitly enabled in the program.
+@end itemize
+
+@node Abort hooks
+@section Abort hooks
+@cindex abort hooks for lock elision
+@comment pthread.h
+@comment GNU
+@deftypefun {__pthread_abort_hook_t} __pthread_set_elision_abort_hook (__pthread_abort_hook_t @var{hook})
+
+For some debugging situations it can be useful to call custom code on all transaction aborts.
+The C library allows setting a hook that is called from all of its transaction abort handlers.
+
+The hook can be set with the @code{__pthread_set_elision_abort_hook} function. It consists of a callback
+@var{hook} that gets the CPU specific abort code as first argument. @code{__pthread_set_elision_abort_hook} returns
+the previous hook. Passing NULL for @var{hook} removes the hook.
+
+With TSX this hook can be used to pass up to one byte of information out of a transaction using
+the _xabort() intrinsic (there is no other way to do this).
+
+@smallexample
+enum { BAD_COND1 };
+
+void my_abort_hook (unsigned status)
+@{
+  if ((status & _XABORT_EXPLICIT) && _XABORT_CODE(status) == BAD_COND1)
+    printf("bad condition1 happened\n");
+@}
+
+__pthread_set_elision_abort_hook (my_abort_hook);
+pthread_mutex_lock(&lock);
+if (bad condition in lock)
+    _xabort(BAD_COND1);
+pthread_mutex_unlock(&lock);
+@end smallexample
+
+@code{__pthread_set_elision_abort_hook} is a GNU extension.
+@end deftypefun
+
diff --git a/manual/intro.texi b/manual/intro.texi
index deaf089..3af44c6 100644
--- a/manual/intro.texi
+++ b/manual/intro.texi
@@ -703,6 +703,9 @@ information about the hardware and software configuration your program
 is executing under.
 
 @item
+@ref{Lock elision} describes elided locks in pthreads.
+
+@item
 @ref{System Configuration}, tells you how you can get information about
 various operating system limits.  Most of these parameters are provided for
 compatibility with POSIX.
diff --git a/manual/lang.texi b/manual/lang.texi
index ee04e23..72e06b0 100644
--- a/manual/lang.texi
+++ b/manual/lang.texi
@@ -1,6 +1,6 @@
 @c This node must have no pointers.
 @node Language Features
-@c @node Language Features, Library Summary, , Top
+@c @node Language Features, Library Summary, Lock elision, Top
 @c %MENU% C language features provided by the library
 @appendix C Language Facilities in the Library
 
--
1.7.7.6
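
To make the nested trylock example from the manual text concrete, the following is a small sketch using only standard POSIX calls. The function name nested_trylock_result is invented for illustration. On a C library without elision the nested trylock is expected to fail with EBUSY; per the manual text above, a result of 0 would only be possible when the lock is actually elided and nested trylock is not set to abort.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper, not from the patch: lock a default mutex and
   then try to lock it again from the same thread.  Returns the result
   of the nested pthread_mutex_trylock call, or -1 if the initial
   lock failed.  */
int
nested_trylock_result (void)
{
  pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
  int r;

  if (pthread_mutex_lock (&lock) != 0)
    return -1;

  r = pthread_mutex_trylock (&lock);
  if (r == 0)
    /* The nested acquisition succeeded, so release it as well.  */
    pthread_mutex_unlock (&lock);

  pthread_mutex_unlock (&lock);
  pthread_mutex_destroy (&lock);
  return r;
}
```

A program that branches on this result depends on exactly the behavior the manual says cannot be relied upon, so the sketch is meant for experimentation only.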


Re: TSX lock elision for glibc v3

Andi Kleen-3
In reply to this post by Andi Kleen-3
Andi Kleen <[hidden email]> writes:
>
> The full series is available at
> http://github.com/andikleen/glibc
> git://github.com/andikleen/glibc rtm-devel4

Actually

git://github.com/andikleen/glibc rtm-devel5 now
rtm-devel4 was v2

-Andi


--
[hidden email] -- Speaking for myself only

Re: [PATCH 1/9] Add the low level infrastructure for pthreads lock elision with TSX

Florian Weimer-5
In reply to this post by Andi Kleen-3
On 01/22/2013 10:39 PM, Andi Kleen wrote:

> +      /* When this could be a nested trylock that is not explicitely
> + declared an elided lock abort. This makes us follow POSIX
> + paper semantics. */
> +      if (upgraded)
> +        _xabort (0xfd);
> +
> +      if ((status = _xbegin()) == _XBEGIN_STARTED)
> + {
> +  if (*futex == 0)
> +    return 0;
> +
> +  /* Lock was busy. Fall back to normal locking.
> +     Could also _xend here but xabort with 0xff code
> +     is more visible in the profiler. */
> +  _xabort (0xff);
> + }

Shouldn't 0xfd, 0xff be #defines or enum constants?
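
As a rough sketch of what that could look like (all enum, macro and function names below are hypothetical and not taken from the patch or any existing header; the status layout assumed is the RTM one, with bit 0 flagging an explicit abort and bits 31:24 carrying the _xabort argument):

```c
/* Hypothetical names for the two abort codes quoted above.  */
enum elision_abort_code
{
  ELISION_ABORT_NESTED_TRYLOCK = 0xfd,  /* nested trylock on a non-upgraded lock */
  ELISION_ABORT_LOCK_BUSY = 0xff        /* lock was busy inside the transaction */
};

/* Local stand-ins mirroring _XABORT_EXPLICIT and _XABORT_CODE, so the
   sketch compiles without RTM support in the compiler.  */
#define STATUS_EXPLICIT (1u << 0)
#define STATUS_CODE(s) (((s) >> 24) & 0xffu)

/* Return the explicit abort code carried in STATUS, or -1 if the
   abort was not caused by an explicit xabort.  */
int
explicit_abort_code (unsigned status)
{
  if (!(status & STATUS_EXPLICIT))
    return -1;
  return (int) STATUS_CODE (status);
}
```

With names like these, the quoted code would read _xabort (ELISION_ABORT_LOCK_BUSY), and an abort handler or profiler script could compare against the same constants.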

--
Florian Weimer / Red Hat Product Security Team

Re: [PATCH 1/9] Add the low level infrastructure for pthreads lock elision with TSX

Torvald Riegel-4
On Wed, 2013-01-23 at 09:16 +0100, Florian Weimer wrote:

> On 01/22/2013 10:39 PM, Andi Kleen wrote:
>
> > +      /* When this could be a nested trylock that is not explicitely
> > + declared an elided lock abort. This makes us follow POSIX
> > + paper semantics. */
> > +      if (upgraded)
> > +        _xabort (0xfd);
> > +
> > +      if ((status = _xbegin()) == _XBEGIN_STARTED)
> > + {
> > +  if (*futex == 0)
> > +    return 0;
> > +
> > +  /* Lock was busy. Fall back to normal locking.
> > +     Could also _xend here but xabort with 0xff code
> > +     is more visible in the profiler. */
> > +  _xabort (0xff);
> > + }
>
> Shouldn't 0xfd, 0xff be #defines or enum constants?

Agreed.

Andi, you've previously said that those constants would become part of
an ABI.  Is there any common header / specification / ... for this ABI
yet?  Where do you think this will get specified properly?  Or will this
remain a convention?

This is also somewhat related to the approach you took for txn-assert.
Those constants (eg, 0xff) are already used in GCC libitm too, so it
might make sense to find a common place for them at some point in time.


Torvald