RE: Robust mutex problem on MIPS.

RE: Robust mutex problem on MIPS.

Kaz Kylheku-3
Earlier today, I wrote:
> In the deadlock situation,  the futex has somehow taken on the value
> 0x80000000 (the FUTEX_WAITERS value).

In fact, this appears to be an arch-independent bug in the low-level
robust lock routine.

(Haven't looked at the latest CVS, just 2.5: maybe I'm fixing something
that's already fixed!)

I'm trying a fix right now in
nptl/sysdeps/unix/sysv/linux/lowlevelrobustlock.c

Basically, it's possible for the futex to be zero when the routine is
entered, since on another processor, the lock could just have been
released.

The routine will then flip the lock to ``oldval | FUTEX_WAITERS'', which
is just FUTEX_WAITERS, since oldval is zero.

But at the bottom of the loop, the compare swap operation tries to
acquire the lock from a 0 value. Both threads execute that loop and are
hosed.
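To make the sequence concrete, here is a rough single-threaded sketch of what one pass through the loop does to the lock word, before and after the patch below. It is not the glibc code: it uses C11 atomics in place of the glibc atomic macros, only reports whether the thread would go on to call futex wait rather than making the system call, and leaves out the FUTEX_OWNER_DIED check. The names SKETCH_FUTEX_WAITERS, step_result, step_unpatched, step_patched and try_acquire are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Same value as the FUTEX_WAITERS bit quoted above; defined locally
   (hypothetical name) so the sketch builds without kernel headers.  */
#define SKETCH_FUTEX_WAITERS 0x80000000u

/* Outcome of the top half of one loop iteration.  */
struct step_result
{
  bool would_wait;      /* true if the thread would call futex wait */
  uint32_t wait_value;  /* the value it would wait on */
};

/* Unpatched logic: always try to set the waiters bit, then wait.
   A failed compare-and-swap corresponds to the `continue' path.  */
static struct step_result
step_unpatched (_Atomic uint32_t *futex, uint32_t oldval)
{
  uint32_t newval = oldval | SKETCH_FUTEX_WAITERS;
  if (oldval != newval)
    {
      uint32_t expected = oldval;
      if (!atomic_compare_exchange_strong (futex, &expected, newval))
        return (struct step_result) { false, 0 };
    }
  return (struct step_result) { true, newval };
}

/* Patched logic: if oldval carries no owner TID, leave the word alone
   and fall through to the acquire attempt at the bottom of the loop.  */
static struct step_result
step_patched (_Atomic uint32_t *futex, uint32_t oldval)
{
  uint32_t newval = oldval | SKETCH_FUTEX_WAITERS;
  if (newval == SKETCH_FUTEX_WAITERS)
    return (struct step_result) { false, 0 };
  return step_unpatched (futex, oldval);
}

/* Bottom of the loop: try to take the lock from the value 0.
   Returns the previous value of the word; 0 means the lock was acquired.  */
static uint32_t
try_acquire (_Atomic uint32_t *futex, uint32_t tid)
{
  uint32_t expected = 0;
  atomic_compare_exchange_strong (futex, &expected,
                                  tid | SKETCH_FUTEX_WAITERS);
  return expected;
}

/* Walk both variants through the "lock was just released" case.  */
void
robust_lock_sketch_check (void)
{
  _Atomic uint32_t word = 0;

  /* Unpatched: the word becomes FUTEX_WAITERS with no owner, the thread
     would sleep on it, and no later acquire from 0 can succeed.  */
  struct step_result r = step_unpatched (&word, 0);
  assert (r.would_wait && r.wait_value == SKETCH_FUTEX_WAITERS);
  assert (try_acquire (&word, 1234) == SKETCH_FUTEX_WAITERS);

  /* Patched: the word stays 0 and the acquire goes through.  */
  atomic_store (&word, 0);
  struct step_result p = step_patched (&word, 0);
  assert (!p.would_wait);
  assert (try_acquire (&word, 1234) == 0);
}
```

Calling robust_lock_sketch_check () runs both cases. With the unpatched step the word ends up holding only the waiters bit, which matches the 0x80000000 value seen in the deadlock.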

Index: glibc/glibc-2.5/nptl/sysdeps/unix/sysv/linux/lowlevelrobustlock.c
===================================================================
--- glibc.orig/glibc-2.5/nptl/sysdeps/unix/sysv/linux/lowlevelrobustlock.c	2006-03-01 16:25:10.000000000 -0800
+++ glibc/glibc-2.5/nptl/sysdeps/unix/sysv/linux/lowlevelrobustlock.c	2007-05-26 15:45:45.504886376 -0700
@@ -36,11 +36,15 @@
  return oldval;
 
       int newval = oldval | FUTEX_WAITERS;
-      if (oldval != newval
-          && atomic_compare_and_exchange_bool_acq (futex, newval, oldval))
-        continue;
 
-      lll_futex_wait (futex, newval);
+      /* Do not switch an unowned lock to FUTEX_WAITERS! */
+      if (newval != FUTEX_WAITERS) {
+        if (oldval != newval
+            && atomic_compare_and_exchange_bool_acq (futex, newval, oldval))
+          continue;
+
+        lll_futex_wait (futex, newval);
+      }
     }
   while ((oldval = atomic_compare_and_exchange_val_acq (futex,
                                                        tid | FUTEX_WAITERS,
RE: Robust mutex problem on MIPS.

Kaz Kylheku-3
[hidden email] wrote:
> Earlier today, I wrote:
>> In the deadlock situation,  the futex has somehow taken on the value
>> 0x80000000 (the FUTEX_WAITERS value).
>
> But at the bottom of the loop, the compare swap operation tries to
> acquire the lock from a 0 value. Both threads execute that loop and are
> hosed.

Ah, I see in CVS that it was fixed on May 7, in a slightly different
way.
Very good.