[Bug kprobes/2068] New: return probe on __switch_to triggers BUG_ON

glaubitz at physik dot fu-berlin.de
Josh Stone reported a failure as follows:
-----
...
'kernel.function("__switch_to").return'.
This one is a problem with kretprobes only, as all of my other probes in
__switch_to behaved just fine, even in the middle of the function.
Running this gave "Kernel BUG at kprobes:449" (the full dump is included
below).  The line mentioned is in trampoline_probe_handler:

    BUG_ON(!orig_ret_address || (orig_ret_address == trampoline_address));
-----
The problem is probably that kretprobe_instance objects are hashed by the
current task pointer.  Upon entry to __switch_to(), the object is placed on the
list for the "prev" task, but upon return it's sought on the list for the "next"
task.
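
To make the hypothesis concrete, here is a toy userspace model in Python. It is only a sketch of the idea, not the kprobes data structure: the class name, the task labels "prev" and "next", and the address 0x1234 are all made up for illustration. It shows that an instance filed under one task at entry cannot be found if the lookup at return uses a different task.

```python
from collections import defaultdict


class ToyReturnProbeTable:
    """Toy stand-in for a per-task hash of pending return-probe instances."""

    def __init__(self):
        # task label -> stack of saved real return addresses
        self.buckets = defaultdict(list)

    def on_entry(self, current_task, real_return_address):
        # At function entry, file the instance under whoever is "current".
        self.buckets[current_task].append(real_return_address)

    def on_return(self, current_task):
        # At function return, search only the bucket of whoever is "current" now.
        bucket = self.buckets[current_task]
        return bucket.pop() if bucket else None


# Case 1: "current" changes between entry and return (the hypothesis).
table = ToyReturnProbeTable()
table.on_entry("prev", 0x1234)
missed = table.on_return("next")

# Case 2: "current" is the same at entry and return.
table = ToyReturnProbeTable()
table.on_entry("next", 0x1234)
found = table.on_return("next")

print(missed, hex(found))  # None 0x1234
```

In this model a failed lookup just returns None. If the hypothesis held, the analogous condition in the kernel would be the missing orig_ret_address that the BUG_ON checks for.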

If this is indeed the problem, then:
1. Return probes on __switch_to should be blacklisted in the SystemTap
translator and kprobes unless and until a fix is found.
2. A fix in kprobes would presumably require kprobes to notice (either at
registration time or at function entry) that we're probing __switch_to().  Upon
function entry, we'd have to invoke architecture-specific (and potentially
version-dependent) code to grab the next-task pointer out of the arg list and
hash on that.

By the way, context_switch() should NOT be a problem, because it's inline and
kprobes doesn't support return probes on inline functions.

--
           Summary: return probe on __switch_to triggers BUG_ON
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P1
         Component: kprobes
        AssignedTo: systemtap at sources dot redhat dot com
        ReportedBy: jkenisto at us dot ibm dot com


http://sourceware.org/bugzilla/show_bug.cgi?id=2068

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug kprobes/2068] return probe on __switch_to triggers BUG_ON

glaubitz at physik dot fu-berlin.de

------- Additional Comments From jkenisto at us dot ibm dot com  2005-12-19 23:41 -------
My previous analysis is incorrect.  First of all, return probes on __switch_to()
seem to work fine in the kprobes and SystemTap tests that I've run.  

Second, as Roland pointed out even before I created this PR, the stack switch
happens before __switch_to() is called.  So the value of current is the same
(i.e., the same as next_p) on both entry and return, and the same hash list is
used both times.

Maybe Josh can provide a repeat-by script that demonstrates the problem.
Otherwise, I'll change this to WORKSFORME.

--
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |joshua dot i dot stone at
                   |                            |intel dot com
             Status|NEW                         |ASSIGNED
           Priority|P1                          |P2



[Bug kprobes/2068] return probe on __switch_to triggers BUG_ON

glaubitz at physik dot fu-berlin.de

------- Additional Comments From joshua dot i dot stone at intel dot com  2005-12-20 19:38 -------
I can repeat this without fail with this simple command:

# stap -e 'probe kernel.function("__switch_to").return{}'

I am running the 2.6.9-24.ELsmp kernel, on x86_64.  I tried this also on the
i686 kernel, and did not trigger the BUG_ON, so it may be specific to x86_64.


[Bug kprobes/2068] return probe on __switch_to triggers BUG_ON

glaubitz at physik dot fu-berlin.de

------- Additional Comments From jkenisto at us dot ibm dot com  2005-12-20 23:43 -------
Some ppc64 data, only partially analyzed...
I tracked __switch_to calls and returns on a ppc64 system.  Here __switch_to
does change the value of current (it matches prev on entry and new on return),
but the BUG_ON never gets triggered.

I also consistently see more calls than returns; and it's apparently not
entirely (or even mostly) due to running out of kretprobe_instance objects or
the entry probe getting installed before the return probe.  Here are results
from various runs:

run#  maxactive  nmissed  ncalls  nret  ncalls-(nret+nmissed)
1     5          112      209     92    5
2     50         0        293     267   26
3     50         0        406     364   42
4     50         0        759     717   42
5     50         0        1581    1535  46
6     50         173      7308    7055  80

Following the activity on a particular CPU, I occasionally see multiple
consecutive calls with no intervening returns.
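
For anyone rechecking the numbers, the last column can be recomputed from the other three with a few lines of Python. The figures are copied from the runs listed above; the function name `unaccounted` is just a label for this sketch.

```python
# (run, maxactive, nmissed, ncalls, nret), copied from the runs reported above
runs = [
    (1, 5, 112, 209, 92),
    (2, 50, 0, 293, 267),
    (3, 50, 0, 406, 364),
    (4, 50, 0, 759, 717),
    (5, 50, 0, 1581, 1535),
    (6, 50, 173, 7308, 7055),
]


def unaccounted(nmissed, ncalls, nret):
    """Calls that neither produced a return nor were counted as missed."""
    return ncalls - (nret + nmissed)


gaps = [unaccounted(nmissed, ncalls, nret)
        for _run, _maxactive, nmissed, ncalls, nret in runs]

for (run, *_rest), gap in zip(runs, gaps):
    print(f"run {run}: {gap} calls unaccounted for")
```

The gap is nonzero in every run, including runs 2 through 5 where nmissed is 0, which is why running out of kretprobe_instance objects does not explain it on its own.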


[Bug kprobes/2068] return probe on __switch_to triggers BUG_ON

glaubitz at physik dot fu-berlin.de

------- Additional Comments From jkenisto at us dot ibm dot com  2005-12-21 00:35 -------
Comment #1 ("works for me") refers only to i686.  Had my blinders on.

Like Josh, I can see this fail (insmod -> hung system) on x86_64.  I'm using a
hand-written C module; no need to involve SystemTap.


[Bug kprobes/2068] return probe on __switch_to triggers BUG_ON

glaubitz at physik dot fu-berlin.de


--
           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|systemtap at sources dot    |jkenisto at us dot ibm dot
                   |redhat dot com              |com

