Systemtap caching behavior

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Systemtap caching behavior

Mikhail Zhukovskiy
Hello,

I have a question about systemtap caching behavior.

According to the man page, the results of pass 3 and pass 4 (the translation to C and the compilation of the kernel module) should be cached between runs, however this is not what I am observing. Could you please point out what I’m doing wrong?

We are attempting to debug 1-2 minute long load spikes, with an stap script triggered when a load spike is detected. If the results of the compilation steps are not cached and compilation is necessary on every stap invocation, the event may be over before data capture begins.

I am attaching my invocation of stap and a demonstration that all compilation steps re-occur on subsequent runs.

Thank you very much for your help!

Mike

vagrant@dns4-01-us1:~/.systemtap$ sudo rm -rf ~/.systemtap

vagrant@dns4-01-us1:~$ cat flame-kernel.sh
#!/bin/bash

 /home/vagrant/systemtapbin/bin/stap -k -s 32 --all-modules -v -D MAXBACKTRACE=100 -D MAXSTRINGLEN=4096 -D MAXMAPENTRIES=10240 -D DMAXACTION_INTERRUPTIBLE=500 -D MAXACTION=100000 -D STP_OVERLOAD_THRESHOLD=5000000000LL flame-kernel.stp  > out.stap-stacks

vagrant@dns4-01-us1:~$ sudo ./flame-kernel.sh
Pass 1: parsed user script and 476 library scripts using 113060virt/85440res/4612shr/81280data kb, in 350usr/20sys/377real ms.
Pass 2: analyzed script: 3 probes, 7 functions, 4 embeds, 1 global using 250420virt/224400res/6408shr/218640data kb, in 1900usr/300sys/3033real ms.
Pass 3: translated to C into "/tmp/stapy6Mjkr/stap_11482_src.c" using 250420virt/224584res/6536shr/218640data kb, in 1230usr/70sys/1304real ms.
Pass 4: compiled C into "stap_11482.ko" in 16830usr/1230sys/18702real ms.
Pass 5: starting run.
WARNING: Missing unwind data for a module, rerun with 'stap -d kernel'
WARNING: Missing unwind data for a module, rerun with 'stap -d (unknown; retry with -DDEBUG_UNWIND)'
Pass 5: run completed in 10usr/200sys/30546real ms.
Keeping temporary directory "/tmp/stapy6Mjkr"

vagrant@dns4-01-us1:~$ find ~/.systemtap -name *.ko
/home/vagrant/.systemtap/cache/b4/typequery_b43a87dffeccfe937c15e5ca0b8cccf7_874.ko
/home/vagrant/.systemtap/cache/ea/typequery_ea54c2c3c9c4e050b149b086326979b2_877.ko


vagrant@dns4-01-us1:~$ sudo ./flame-kernel.sh
Pass 1: parsed user script and 476 library scripts using 113060virt/85560res/4732shr/81280data kb, in 330usr/40sys/371real ms.
Pass 2: analyzed script: 3 probes, 7 functions, 4 embeds, 1 global using 250408virt/224372res/6392shr/218628data kb, in 790usr/210sys/998real ms.
Pass 3: translated to C into "/tmp/stapEXDMjx/stap_13078_src.c" using 250408virt/224556res/6520shr/218628data kb, in 1220usr/70sys/1299real ms.
Pass 4: compiled C into "stap_13078.ko" in 8790usr/360sys/8817real ms.
Pass 5: starting run.
WARNING: Missing unwind data for a module, rerun with 'stap -d kernel'
Pass 5: run completed in 10usr/200sys/30549real ms.
Keeping temporary directory "/tmp/stapEXDMjx"
vagrant@dns4-01-us1:~$
Reply | Threaded
Open this post in threaded view
|

Re: Systemtap caching behavior

Mark Wielaard
On Sat, Jan 12, 2019 at 07:59:14PM -0500, Mikhail Zhukovskiy wrote:
> I have a question about systemtap caching behavior.
>
> According to the man page, the results of pass 3 and pass 4 (the translation to C and the compilation of the kernel module) should be cached between runs, however this is not what I am observing. Could you please point out what I’m doing wrong?
> [...]
>  /home/vagrant/systemtapbin/bin/stap -k -s 32 --all-modules -v -D MAXBACKTRACE=100 -D MAXSTRINGLEN=4096 -D MAXMAPENTRIES=10240 -D DMAXACTION_INTERRUPTIBLE=500 -D MAXACTION=100000 -D STP_OVERLOAD_THRESHOLD=5000000000LL flame-kernel.stp  > out.stap-stacks

You are using -k:

       -k     Keep  the temporary directory after all processing.  This may be
              useful in order to examine the generated C code, or to reuse the
              compiled kernel object.

> Keeping temporary directory "/tmp/stapy6Mjkr"

This bypasses the cache.

Cheers,

Mark
Reply | Threaded
Open this post in threaded view
|

Re: Systemtap caching behavior

Mikhail Zhukovskiy
Thank you Mark, that did the trick!

One more, perhaps naive question. What I am trying to do is generate fame graphs per http://brendangregg.com/FlameGraphs/cpuflamegraphs.html . At the moment, I have succeeded in making the flame graphs in which the call stacks begin with entry_SYSCALL_64_fastpath. If I wanted to include the user space calls that made these system calls, would my next step be to re-compile the application that we are profiling (the BIND DNS daemon) with one of the gcc debug flags (https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html)?

Is there anything else that I would need to do?

Thank you,
Mike

> On Jan 13, 2019, at 4:58 AM, Mark Wielaard <[hidden email]> wrote:
>
> On Sat, Jan 12, 2019 at 07:59:14PM -0500, Mikhail Zhukovskiy wrote:
>> I have a question about systemtap caching behavior.
>>
>> According to the man page, the results of pass 3 and pass 4 (the translation to C and the compilation of the kernel module) should be cached between runs, however this is not what I am observing. Could you please point out what I’m doing wrong?
>> [...]
>> /home/vagrant/systemtapbin/bin/stap -k -s 32 --all-modules -v -D MAXBACKTRACE=100 -D MAXSTRINGLEN=4096 -D MAXMAPENTRIES=10240 -D DMAXACTION_INTERRUPTIBLE=500 -D MAXACTION=100000 -D STP_OVERLOAD_THRESHOLD=5000000000LL flame-kernel.stp  > out.stap-stacks
>
> You are using -k:
>
>       -k     Keep  the temporary directory after all processing.  This may be
>              useful in order to examine the generated C code, or to reuse the
>              compiled kernel object.
>
>> Keeping temporary directory "/tmp/stapy6Mjkr"
>
> This bypasses the cache.
>
> Cheers,
>
> Mark

Reply | Threaded
Open this post in threaded view
|

Re: Systemtap caching behavior

Mark Wielaard
On Sun, Jan 13, 2019 at 09:58:39AM -0500, Mikhail Zhukovskiy wrote:
> One more, perhaps naive question. What I am trying to do is generate fame graphs per http://brendangregg.com/FlameGraphs/cpuflamegraphs.html . At the moment, I have succeeded in making the flame graphs in which the call stacks begin with entry_SYSCALL_64_fastpath. If I wanted to include the user space calls that made these system calls, would my next step be to re-compile the application that we are profiling (the BIND DNS daemon) with one of the gcc debug flags (https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html)?

For just backtraces in user space on amd64 you shouldn't need to rebuild
userspace, since amd64 includes unwind tables by default. So backtraces
done from user space probes should work as is. But if you are unwinding
from kernel space into user space you might need the kernel debuginfo
installed (since the kernel normally doesn't include unwind tables,
so the will need to come from the .debug_frame).

Cheers,

Mark
Reply | Threaded
Open this post in threaded view
|

Re: Systemtap caching behavior

Mikhail Zhukovskiy
Hi Mark,

Again, thank you for all your help!

To give you a bit of background on my problem, after patching the kernel from 3.13 to 4.4, we started seeing periodic 1-2 minute events where the BIND DNS daemon goes into “uninterruptible sleep” and becomes unresponsive. My effort here is about figuring what what system calls BIND is waiting on when that happens.

I have the kernel -dbgsym package installed (doing this on Ubuntu).

vagrant@dns4-01-us1:~$ dpkg -l | grep dbgsym
ii  linux-image-4.4.0-133-generic-dbgsym 4.4.0-133.159~14.04.1             amd64        Linux kernel debug image for version 4.4.0 on 64 bit x86 SMP

Here is the link to all my source code and a complete run.

https://gist.github.com/misha354/15f74ad50c4fe67fe5ae2b9e046fe2f6




I am attaching the resulting flame graph. As you can see, the base of every call stack is entry_SYSCALL_64_fastpath(). I am hoping to see the BIND functions that are calling it. I’m not sure what other steps I need to take make that happen.


Thank you,
Mike


> On Jan 13, 2019, at 11:22 AM, Mark Wielaard <[hidden email]> wrote:
>
> On Sun, Jan 13, 2019 at 09:58:39AM -0500, Mikhail Zhukovskiy wrote:
>> One more, perhaps naive question. What I am trying to do is generate fame graphs per https://urldefense.proofpoint.com/v2/url?u=http-3A__brendangregg.com_FlameGraphs_cpuflamegraphs.html&d=DwIBAg&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=1tDlnkfhOa3yKtPzIEgoBg8s2PlcIN5CQWpRGelmq-Y&m=ghzS2Dvh4MdfpRduK1G2gYXsy3n1CeV0-TRZLzr2Odo&s=P9ySoA2ptCOLPVUIQ7-82Zh_l0HeYpYklWFn7UETF34&e= . At the moment, I have succeeded in making the flame graphs in which the call stacks begin with entry_SYSCALL_64_fastpath. If I wanted to include the user space calls that made these system calls, would my next step be to re-compile the application that we are profiling (the BIND DNS daemon) with one of the gcc debug flags (https://urldefense.proofpoint.com/v2/url?u=https-3A__gcc.gnu.org_onlinedocs_gcc_Debugging-2DOptions.html-29-3F&d=DwIBAg&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=1tDlnkfhOa3yKtPzIEgoBg8s2PlcIN5CQWpRGelmq-Y&m=ghzS2Dvh4MdfpRduK1G2gYXsy3n1CeV0-TRZLzr2Odo&s=4NvgYM4LhMwJgO7Kgl8GAYMaWHCpMDmVHTg-m366who&e=
>
> For just backtraces in user space on amd64 you shouldn't need to rebuild
> userspace, since amd64 includes unwind tables by default. So backtraces
> done from user space probes should work as is. But if you are unwinding
> from kernel space into user space you might need the kernel debuginfo
> installed (since the kernel normally doesn't include unwind tables,
> so the will need to come from the .debug_frame).
>
> Cheers,
>
> Mark


stap-kernel.svg (708K) Download Attachment