[Bug nptl/2306] New: deferred cancellation fires during signal handler execution


986882896 at qq dot com
Distro: Debian unstable, glibc package 2.3.5-13, gcc Debian-4.0.2-8 (also
exhibited on FC3 on x86-64)

(first let me note that this behavior does not occur on i686-linux-gnu, using
the same package versions)

This problem was found while testing Asterisk, both the SVN trunk and all
current 1.2 releases.

Asterisk has a number of modules that when loaded create a 'monitor thread' that
performs various background tasks. When the user requests the module to be
unloaded, we follow these steps:

         pthread_cancel(monitor_thread);
         pthread_kill(monitor_thread, SIGURG);
         pthread_join(monitor_thread, NULL);

The idea is to place a pending cancellation on the thread (we do not change the
default cancellation state of the threads we create), then send it a signal to
break it out of any blocking calls it may be in, and then wait for it to die.
The signal handler for SIGURG does _nothing_ except re-enable itself using
'signal(SIGURG, urg_handler)'.
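
For reference, the shutdown sequence described above can be sketched as a small
standalone program fragment. This is only an illustration, not the Asterisk
code: the thread function idle_monitor() and the wrapper stop_monitor_demo()
are hypothetical names, and the monitor is assumed to block in sleep(), which
is a cancellation point. The thread is left joinable (the default).

```c
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

/* Handler does nothing except re-install itself, as described in the report. */
static void urg_handler(int num)
{
    signal(num, urg_handler);
}

/* Hypothetical monitor thread: blocks in sleep(), a cancellation point. */
static void *idle_monitor(void *arg)
{
    (void)arg;
    for (;;)
        sleep(1);
    return NULL; /* not reached */
}

/* Returns 0 if the thread was joined and reported PTHREAD_CANCELED. */
int stop_monitor_demo(void)
{
    pthread_t monitor_thread;
    void *result = NULL;
    int rc;

    signal(SIGURG, urg_handler);

    /* Default attributes: joinable, cancellation enabled and deferred. */
    rc = pthread_create(&monitor_thread, NULL, idle_monitor, NULL);
    assert(rc == 0);

    pthread_cancel(monitor_thread);             /* leave a pending cancellation */
    pthread_kill(monitor_thread, SIGURG);       /* interrupt any blocking call */
    rc = pthread_join(monitor_thread, &result); /* wait for the thread to die */
    if (rc != 0)
        return rc;

    return result == PTHREAD_CANCELED ? 0 : -1;
}
```

The return value of pthread_kill() is deliberately ignored in this sketch,
since the thread may already have acted on the cancellation by the time the
signal is sent.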

What we are seeing is a segfault, with a stack trace that looks like this:

(gdb) bt
#0  0x00002aaab6e78842 in _Unwind_DeleteException () from /lib/libgcc_s.so.1
#1  0x00002aaab6e79824 in _Unwind_Backtrace () from /lib/libgcc_s.so.1
#2  0x00002aaab6e7990c in _Unwind_ForcedUnwind () from /lib/libgcc_s.so.1
#3  0x00002aaaaacd0f60 in __pthread_unwind () from /lib/libpthread.so.0
#4  0x00002aaaaaccb260 in sigcancel_handler () from /lib/libpthread.so.0
#5  <signal handler called>
#6  0x00002aaaab34b075 in sigaction () from /lib/libc.so.6
#7  0x00002aaaab34adb1 in ssignal () from /lib/libc.so.6
#8  0x000000000048eb5c in urg_handler (num=23) at asterisk.c:717
#9  0x00002aaaab34aed0 in killpg () from /lib/libc.so.6
#10 0x0000000000000000 in ?? ()
(gdb)

As best I can tell, it appears that during the execution of the SIGURG handler,
the deferred cancellation took effect. I cannot find any documentation stating
that signal() is a cancellation point, but if that is so, we'll have to find
some sort of workaround (although this does not happen in a 32-bit environment).

--
           Summary: deferred cancellation fires during signal handler
                    execution
           Product: glibc
           Version: 2.3.5
            Status: NEW
          Severity: normal
          Priority: P2
         Component: nptl
        AssignedTo: drepper at redhat dot com
        ReportedBy: kpfleming at digium dot com
                CC: glibc-bugs at sources dot redhat dot com
  GCC host triplet: x86_64-linux-gnu


http://sourceware.org/bugzilla/show_bug.cgi?id=2306

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.

[Bug nptl/2306] deferred cancellation fires during signal handler execution

986882896 at qq dot com

------- Additional Comments From decimal at us dot ibm dot com  2006-02-09 19:58 -------
Can you reproduce this with a small testcase?

Are you using LinuxThreads or NPTL?

--


http://sourceware.org/bugzilla/show_bug.cgi?id=2306


[Bug nptl/2306] deferred cancellation fires during signal handler execution

986882896 at qq dot com

------- Additional Comments From kpfleming at digium dot com  2006-02-10 20:08 -------
We are using NPTL, and no, I cannot create a simple testcase...

However, this issue can be closed, because it appears to be a bug in our code.
We are setting the thread to DETACHED, but then trying to pthread_join() on it
before unloading the module that created it. On a 32-bit distro, this seems to
work fine (although it is contrary to the documentation), but on the 64-bit
system it segfaults (presumably because the module has been freed from memory
while the thread is still executing). The backtrace, while consistent, appears
to be misleading at best :-)
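
One way to guard against this kind of mistake is to request the joinable state
explicitly and check the attributes before creating a thread that will later
be joined. The following is a minimal sketch under that assumption, not the
fix that went into Asterisk; attr_is_joinable(), short_monitor() and
create_and_join_demo() are hypothetical names. It avoids calling
pthread_join() on a detached thread altogether, since the result of doing so
is not something a program can rely on.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical helper: returns 1 if attr would create a joinable thread. */
int attr_is_joinable(const pthread_attr_t *attr)
{
    int state;

    if (pthread_attr_getdetachstate(attr, &state) != 0)
        return 0;
    return state == PTHREAD_CREATE_JOINABLE;
}

/* Hypothetical monitor thread that finishes immediately. */
static void *short_monitor(void *arg)
{
    return arg;
}

/* Creates an explicitly joinable thread and joins it.
 * Returns 0 on success, a pthread error number or -1 otherwise. */
int create_and_join_demo(void)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc;

    rc = pthread_attr_init(&attr);
    assert(rc == 0);

    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
    if (!attr_is_joinable(&attr)) {
        /* Refuse to create a thread we intend to join but could not. */
        pthread_attr_destroy(&attr);
        return -1;
    }

    rc = pthread_create(&tid, &attr, short_monitor, NULL);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;

    /* Safe: the thread was created joinable and is joined exactly once. */
    return pthread_join(tid, NULL);
}
```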

--
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID


http://sourceware.org/bugzilla/show_bug.cgi?id=2306
