Interoperability of LTTng and LTTV with SystemTAP

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

Interoperability of LTTng and LTTV with SystemTAP

Mathieu Desnoyers-2
Hi,

I just had a discussion with Frank Ch. Eigler from the SystemTAP team on the
topic "how can the tools cooperate well together".

SystemTAP is especially made for running user defined scripts at specific points
in the kernel. I has the ability to add/remove scripts and instrumentation
dynamically from a running kernel.

LTTng is a kernel tracer designed to have the minimum impact on the system while
having a design solid enough that it can trace NMI handlers (no locking). It
supports native C types and alignment. It can register new instrumentation sites
from dynamically loaded modules on-the-fly.

LTTV is a modular text/graphical trace analysis tool. You can think of it as
framework offered for plugins to interact together to show information. Simple
plugins can show generically some information (detailed event list, for
instance), while more sophisticated plugins can show a particular representation
of the system (each process state (user mode, system call, interrupted,
in trap, ...) evolving through time).

While the SystemTAP project focus on pre-processing of the information, LTT
focus on raw data copy (limiting the impact at probe site) at the cost of more
data to be transferred and analysed afterward.

I think that both approaches are complementary. I am trying to see where
integrated analysis of pre-processed and post-processed information could be
interesting, but I'm sure there are areas where it is.

If you have application ideas, feel free to express yourself.


-- end of the "discussion initiation" part, now with the technical part :

Frank asked me, at the end of our IRC discussion, if I could send some numbers
about how much information is transferred. Here is the first result : a trace
taken on a system used for start/stop of web broser (mozilla) and doing a "find"
on the root of the system. I plan to do the same on heavily loaded
system (I am trying to get an heavy commercial application for this) and very
heavily loaded system (I will try ping -f and a few others) as soon as I have
the time.

I have also been asked about the impact of the copy of the information to disk
on the system behavior. The percentage of cpu time used is not relevant in this
first usage scenario because the system was most of the time idle. Still, about
1.79% (cpu 0) and 1.89% (cpu 1) of cpu time has been used by the disk writer
daemon, which is not much considering that the system usage was 6.78% for cpu 0
and 9.79% for cpu 1. A heavily loaded system will give us a better insight.

However, I would say that probe effect at the call site is much more important
than the effect of a much lower priority disk writer daemon. For the probe
effect, the microbenchmarks I made tells that logging an event takes about
220 ns.

Here are the results. Note that I have taken a short trace (15 seconds) just
because I was in a hurry before preparing a presentation at that moment.


<begin>

LTTng 0.4.3

Usage scenarios

Scenario 1 (low event rate)

Web browsing and using find (low event rate)
12MB trace, duration : 15.079714935 seconds
2MB per-cpu buffers used, written to disk 1MB at a time by the daemon.

The machine is a 3Ghz Pentium 4 with hyperthread. Each virtual CPU is seen as
a physical one by the kernel, except for the scheduler.

All the instrumentation set of LTTng is active. See
http://ltt.polymtl.ca/packages/LinuxTraceToolkitViewer-0.6.9-10102005.tar.gz
directory facilities for the description of the structures of events logged.


trace :

(units are in seconds)

real time :
start 16591.856649974
end 16606.936364909

trace duration : 15.079714935

(idle time is the time spent doing nothing (syscall mode) at process 0.
idle time on cpu 0 : 14.056580720
idle time on cpu 1 : 13.692558631

cpu 0 : total CPU time used : 1.023134215
        cpu 0 is used at 6.78%
cpu 1 : total CPU time used : 1.387156304
        cpu 1 is used at 9.19%

lttd (cpu time) (note : the daemon is currently non multithreaded)

cpu 0 :

usermode : 0.000098216
trap 14 (page fault) : 0.000192750 (count : 258)
irqs :
0 (timer) : 0.000157657
16 (audio) : 0.000079536
system calls :
none (unknown at the beginning of the trace) : 0.000799137
4 (write) : 0.016877408
54 (ioctl) : 0.000019114
168 (poll) : 0.000766053

Total cpu time accountable to the process :
usermode + traps + system calls
0.017953541


cpu 1 :

usermode : 0.000094188
trap 14 (page fault) : 0.000761918 (count : 771)
irqs : none
system calls :
4 (write) : 0.017939435
54 (ioctl) : 0.000021459
168 (poll) : 0.007352545

Total cpu time accountable to the process :
usermode + traps + system calls
0.026169545


ratio of time for write / cpu time :
cpu 0 : 1.75%
cpu 1 : 1.89%

ratio of time for write / real time :
cpu 0 : 0.12%
cpu 1 : 0.17%


overall event count (for the whole 15.079714935s trace):

events count :  651851

detail per event type (40 different event types) :

state_dump_facility_load :  10
wait :  326
fork :  24
exit :  24
free :  24
kernel_thread :  4
syscall_exit :  75564
trap_exit :  30149
syscall_entry :  75562
call :  84
recvmsg :  152
close :  1915
wakeup :  14614
schedchange :  21141
soft_irq_entry :  31313
softirq :  29477
soft_irq_exit :  31313
tasklet_entry :  1714
tasklet_exit :  1714
expired :  2563
trap_entry :  30149
irq_entry :  16169
irq_exit :  16169
packet_in :  6
packet_out :  7
sendmsg :  3237
select :  213105
write :  9974
page_alloc :  8661
read :  19378
set_itimer :  1232
ioctl :  7066
buf_wait_start :  20
buf_wait_end :  19
open :  1826
poll :  5762
page_free :  5
exec :  17
seek :  1345
shm_create :  2
create :  15

The per record data amount per record is statistically
12MB/651851 = 19.30 bytes per event


<end>



OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
Reply | Threaded
Open this post in threaded view
|

Re: Interoperability of LTTng and LTTV with SystemTAP

Tom Zanussi
Mathieu Desnoyers writes:
 > Hi,
 >
 > I just had a discussion with Frank Ch. Eigler from the SystemTAP team on the
 > topic "how can the tools cooperate well together".
 >
 > SystemTAP is especially made for running user defined scripts at specific points
 > in the kernel. I has the ability to add/remove scripts and instrumentation
 > dynamically from a running kernel.
 >
 > LTTng is a kernel tracer designed to have the minimum impact on the system while
 > having a design solid enough that it can trace NMI handlers (no locking). It
 > supports native C types and alignment. It can register new instrumentation sites
 > from dynamically loaded modules on-the-fly.
 >
 > LTTV is a modular text/graphical trace analysis tool. You can think of it as
 > framework offered for plugins to interact together to show information. Simple
 > plugins can show generically some information (detailed event list, for
 > instance), while more sophisticated plugins can show a particular representation
 > of the system (each process state (user mode, system call, interrupted,
 > in trap, ...) evolving through time).
 >
 > While the SystemTAP project focus on pre-processing of the information, LTT
 > focus on raw data copy (limiting the impact at probe site) at the cost of more
 > data to be transferred and analysed afterward.
 >
 > I think that both approaches are complementary. I am trying to see where
 > integrated analysis of pre-processed and post-processed information could be
 > interesting, but I'm sure there are areas where it is.
 >
 > If you have application ideas, feel free to express yourself.
 >

One area that seems tailor-made for this type of interaction would be
to use systemtap probes to detect 'interesting' conditions in the
trace data, at which point the buffered event stream to that point
could be dumped to disk for further analysis, leaving a detailed trail
of information as to what exactly led to a condition i.e. an
intelligent flight recorder.  'Interesting' conditions could comprise
a single event occurrence, but since probes can maintain state over
multiple events, more complex combinations of events ('compound'
events) could be detected as well.

Tom



Reply | Threaded
Open this post in threaded view
|

Re: Interoperability of LTTng and LTTV with SystemTAP

Tom Zanussi
Tom Zanussi writes:
 > Mathieu Desnoyers writes:
 >  > Hi,
 >  >
 >  > I just had a discussion with Frank Ch. Eigler from the SystemTAP team on the
 >  > topic "how can the tools cooperate well together".
 >  >
 >  > SystemTAP is especially made for running user defined scripts at specific points
 >  > in the kernel. I has the ability to add/remove scripts and instrumentation
 >  > dynamically from a running kernel.
 >  >
 >  > LTTng is a kernel tracer designed to have the minimum impact on the system while
 >  > having a design solid enough that it can trace NMI handlers (no locking). It
 >  > supports native C types and alignment. It can register new instrumentation sites
 >  > from dynamically loaded modules on-the-fly.
 >  >
 >  > LTTV is a modular text/graphical trace analysis tool. You can think of it as
 >  > framework offered for plugins to interact together to show information. Simple
 >  > plugins can show generically some information (detailed event list, for
 >  > instance), while more sophisticated plugins can show a particular representation
 >  > of the system (each process state (user mode, system call, interrupted,
 >  > in trap, ...) evolving through time).
 >  >
 >  > While the SystemTAP project focus on pre-processing of the information, LTT
 >  > focus on raw data copy (limiting the impact at probe site) at the cost of more
 >  > data to be transferred and analysed afterward.
 >  >
 >  > I think that both approaches are complementary. I am trying to see where
 >  > integrated analysis of pre-processed and post-processed information could be
 >  > interesting, but I'm sure there are areas where it is.
 >  >
 >  > If you have application ideas, feel free to express yourself.
 >  >
 >
 > One area that seems tailor-made for this type of interaction would be
 > to use systemtap probes to detect 'interesting' conditions in the
 > trace data, at which point the buffered event stream to that point
 > could be dumped to disk for further analysis, leaving a detailed trail
 > of information as to what exactly led to a condition i.e. an
 > intelligent flight recorder.  'Interesting' conditions could comprise
 > a single event occurrence, but since probes can maintain state over
 > multiple events, more complex combinations of events ('compound'
 > events) could be detected as well.
 >

One other thought along these lines that I forgot to mention was that
it might also be useful for probes to be able to trigger tracing as
needed e.g. a probe notices some strange pattern in the data currently
being aggregated, and in response basically says, 'Hmm, this is
strange, let's turn on tracing and see if we can't get some more
detailed info about what's going on'.  The event mask could also be
dynamically changed based on the conditions detected by the probe,
etc...

Tom


Reply | Threaded
Open this post in threaded view
|

Re: Interoperability of LTTng and LTTV with SystemTAP

Jose R. Santos
In reply to this post by Mathieu Desnoyers-2
Mathieu Desnoyers wrote:

>Hi,
>
>I just had a discussion with Frank Ch. Eigler from the SystemTAP team on the
>topic "how can the tools cooperate well together".
>
>SystemTAP is especially made for running user defined scripts at specific points
>in the kernel. I has the ability to add/remove scripts and instrumentation
>dynamically from a running kernel.
>
>LTTng is a kernel tracer designed to have the minimum impact on the system while
>having a design solid enough that it can trace NMI handlers (no locking). It
>supports native C types and alignment. It can register new instrumentation sites
>from dynamically loaded modules on-the-fly.
>
>LTTV is a modular text/graphical trace analysis tool. You can think of it as
>framework offered for plugins to interact together to show information. Simple
>plugins can show generically some information (detailed event list, for
>instance), while more sophisticated plugins can show a particular representation
>of the system (each process state (user mode, system call, interrupted,
>in trap, ...) evolving through time).
>
>While the SystemTAP project focus on pre-processing of the information, LTT
>focus on raw data copy (limiting the impact at probe site) at the cost of more
>data to be transferred and analysed afterward.
>
>I think that both approaches are complementary. I am trying to see where
>integrated analysis of pre-processed and post-processed information could be
>interesting, but I'm sure there are areas where it is.
>
>If you have application ideas, feel free to express yourself.
>  
>

We have a team that is working on implementing a kernel event trace tool
using Systemtap.  The project does not have a post-processor and I think
it would be really cool to create a program that converts the Systemtap
trace output into something the other visualizers can use.  The first
one that came to mind was LTTV.  While the overhead right now is quite
high compared to LTTing, the target of this trace utility is for
situations were parching a kernel is not an option.  We now have several
trace hooks implemented and expect to release the code to the mailing
list with in the next couple of days.

We plan on investing a significant portion of our time next year
addressing some of the performance issues but performance will probably
not be as fast as LTTing.  Designing a common set of trace hooks for
both the Systemtap and LTTing implementations will help people move
between one and the other with out a significant learning curve.  As
long as the visualizer can handle both trace formats, I believe that
both projects can complement each other.

-JRS
Reply | Threaded
Open this post in threaded view
|

Re: Interoperability of LTTng and LTTV with SystemTAP (heavy usage)

Mathieu Desnoyers-2
In reply to this post by Mathieu Desnoyers-2
Hi,

Here is, as promised, the rest of the heavy usage scenario requested. It has
been kindly generated by Autodesk partners.

You will find these results much more interesting than the previous ones.

I would be interested to know if SystemTAP can handle an average load of 58K
events per second.

Heavy usage scenario : Trace from Autodesk, 64 bits x86_64, 4 CPUs

621 MB trace. Duration : 9m10s.


Units are in seconds

total :
wall time : 546.479691385 (9m10s)

total cpu time available :
4*546.479691385 = 2185.918765540

total idle time (sum for 4 cpus): 1135.045529154
(time spent in process 0 in "unknown" syscall)

Total cpu time used :
2185.918765540 - 1135.045529154 = 1050.873236386
System load : 1050.873236386/2185.918765540 = 48.07%


lttd :
(note : in the trace, lttd is : pid 11413)

overall cpu time in system call : 3.141004863
  note : in this trace, usermode time is accounted as system call time due
  to a bug in instrumentation of system calls. Autodesk has been informed of
  the problem. Instrumentation set provided by Autodesk is available there :
  http://ltt.polymtl.ca/ > Third party contributions.
overall cpu time in traps : 0.00223371


Time used by lttd :

Percentage of wall time :
3.143238573/2185.918765540 * 100% = 0.144%
Percentage of cpu time :
3.143238573/1050.873236386 *100% = 0.299%


overall event count (for the whole 9m10s on the 4 CPUs)

events count : 31946639

detail per event type

Statistic for  'event_types' :

state_dump_facility_load :  11
wait :  17759
free :  3017
exit :  3017
fork :  3014
soft_irq_entry :  2964643
softirq :  2188180
soft_irq_exit :  2964643
tasklet_entry :  523479
tasklet_exit :  523479
expired :  1061734
wakeup :  1711432
schedchange :  2620332
syscall_entry :  5198 (should be more than that)
write :  322312
read :  614706
syscall_exit :  5180 (should be more than that)
trap_entry :  3799122
trap_exit :  3799122
open :  139309
close :  166203
irq_exit :  886808
packet_in :  145148
packet_out :  122118
page_alloc :  387151
page_free :  3536
irq_entry :  886808
sendmsg :  138882
select :  4715579
recvmsg :  27091
ioctl :  272526
set_itimer :  13725
buf_wait_end :  1485
buf_wait_start :  1484
poll :  811642
seek :  84544
call :  2
shm_create :  1
create :  1567
signal :  7707
exec :  2725
page_wait_end :  109
page_wait_start :  109


Event rate :

over wall time :
31946639/546.479691385 = 58459 events/s

over combined available cpu time :
31946639/2185.918765540 = 14614 events/s


Mathieu Desnoyers


Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Reply | Threaded
Open this post in threaded view
|

Re: LTTng very heavy system usage scenario

Mathieu Desnoyers-2
Well, I finally managed to finish my brand new LTTng+LTTV implementation. This
third usage scenario has been made with it.

I tried something that always turns out to make tracers dead : a loopback ping
flood (ping -f 127.0.0.1). Here are the results :-)

Mathieu


Usage scenario 3

Very heavy system usage (ping flood on the loopback pseudo interface)

Tests made on a 3Ghz pentium 4, with hyperthreading enabled.

Trace taken with LTTng 0.5.0

No event has been lost while tracing.

(units are in seconds)

Trace duration : 12.785209485
Total cpu time available (approximately : it's hyperthreading...) : 25.57s
Trace size : 176M

Trace data rate : 13.76MB/s

idle cpu time : 7.297325362s (sum of the two HT cpus)
System load : (25.57-7.297325362)/25.57 = 71.5%


lttd cpu time (process 4773 in the trace)

usermode : 3.613249328
traps : none
irqs :
21 (3com 3c940 10/100/1000Base-T) : 0.094693456
0 (timer) : 0.015632051
19 (USB and IDE) : 0.000242789
system calls :
unknown (beginning of trace) : 0.000024347
175 (rt_sigprocmask) : 0.352830699
3 (read) : 0.697699884
142 (newselect) : 2.055746440
4 (write) : 1.522248853
54 (ioctl) : 0.000014157

Total cpu time accountable to process :
usermode + traps + system calls
8.241813708


ratio of time for write / cpu time used :
45.10%

ratio of time for write / cpu time available (doesn't mean much with HT) :
32.23%


Overall information on the trace

event count 9588836

Event rate : 749994 events per second

detail per event type (38 event types)

Statistic for  'event_types' :

state_dump_facility_load :  10
exit :  3
wait :  17
free :  3
fork :  3
schedchange :  42076
syscall_entry :  2297497
write :  427383
wakeup :  37268
soft_irq_entry :  309692
softirq :  25105
soft_irq_exit :  309692
read :  166746
syscall_exit :  2297497
expired :  258
irq_exit :  170318
packet_in :  259254
irq_entry :  170318
tasklet_entry :  62
tasklet_exit :  62
packet_out :  408715
trap_exit :  2448
call :  386428
recvmsg :  258803
close :  93
trap_entry :  2448
select :  1406394
buf_wait_start :  19
page_alloc :  478581
buf_wait_end :  19
create :  8
sendmsg :  127608
poll :  3449
ioctl :  443
open :  79
set_itimer :  11
page_free :  3
exec :  4
seek :  19



 
OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
Reply | Threaded
Open this post in threaded view
|

Re: LTTng very heavy system usage scenario : probes

Mathieu Desnoyers-2
Oh, and I almost forgot the most important :

Probe execution time at probe site : 220ns/event

220ns*9588836 = 2.11s

Ratio of probe time / cpu time used
2.11/(25.57-7.297325362) = 11.55%

Ratio of probe time / cpu time available
2.11/25.57 = 8.25%



* Mathieu Desnoyers ([hidden email]) wrote:

> Well, I finally managed to finish my brand new LTTng+LTTV implementation. This
> third usage scenario has been made with it.
>
> I tried something that always turns out to make tracers dead : a loopback ping
> flood (ping -f 127.0.0.1). Here are the results :-)
>
> Mathieu
>
>
> Usage scenario 3
>
> Very heavy system usage (ping flood on the loopback pseudo interface)
>
> Tests made on a 3Ghz pentium 4, with hyperthreading enabled.
>
> Trace taken with LTTng 0.5.0
>
> No event has been lost while tracing.
>
> (units are in seconds)
>
> Trace duration : 12.785209485
> Total cpu time available (approximately : it's hyperthreading...) : 25.57s
> Trace size : 176M
>
> Trace data rate : 13.76MB/s
>
> idle cpu time : 7.297325362s (sum of the two HT cpus)
> System load : (25.57-7.297325362)/25.57 = 71.5%
>
>
> lttd cpu time (process 4773 in the trace)
>
> usermode : 3.613249328
> traps : none
> irqs :
> 21 (3com 3c940 10/100/1000Base-T) : 0.094693456
> 0 (timer) : 0.015632051
> 19 (USB and IDE) : 0.000242789
> system calls :
> unknown (beginning of trace) : 0.000024347
> 175 (rt_sigprocmask) : 0.352830699
> 3 (read) : 0.697699884
> 142 (newselect) : 2.055746440
> 4 (write) : 1.522248853
> 54 (ioctl) : 0.000014157
>
> Total cpu time accountable to process :
> usermode + traps + system calls
> 8.241813708
>
>
> ratio of time for write / cpu time used :
> 45.10%
>
> ratio of time for write / cpu time available (doesn't mean much with HT) :
> 32.23%
>
>
> Overall information on the trace
>
> event count 9588836
>
> Event rate : 749994 events per second
>
> detail per event type (38 event types)
>
> Statistic for  'event_types' :
>
> state_dump_facility_load :  10
> exit :  3
> wait :  17
> free :  3
> fork :  3
> schedchange :  42076
> syscall_entry :  2297497
> write :  427383
> wakeup :  37268
> soft_irq_entry :  309692
> softirq :  25105
> soft_irq_exit :  309692
> read :  166746
> syscall_exit :  2297497
> expired :  258
> irq_exit :  170318
> packet_in :  259254
> irq_entry :  170318
> tasklet_entry :  62
> tasklet_exit :  62
> packet_out :  408715
> trap_exit :  2448
> call :  386428
> recvmsg :  258803
> close :  93
> trap_entry :  2448
> select :  1406394
> buf_wait_start :  19
> page_alloc :  478581
> buf_wait_end :  19
> create :  8
> sendmsg :  127608
> poll :  3449
> ioctl :  443
> open :  79
> set_itimer :  11
> page_free :  3
> exec :  4
> seek :  19
>
>
>
>  
> OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
> Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
>
OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
Reply | Threaded
Open this post in threaded view
|

Re: [ltt-dev] Re: Interoperability of LTTng and LTTV with SystemTAP (heavy usage)

Mathieu Desnoyers-2
In reply to this post by Mathieu Desnoyers-2
Time used for probe sites in heavy usage scenario (scenario 2) :

31946639 events * 220ns = 7.02s

Probe time / used cpu time
7.02/1050.873236386*100% = 0.67%

Probe time / available cpu time
7.02/2185.918765540*100% = 0.32%


Mathieu

* Mathieu Desnoyers ([hidden email]) wrote:

> Hi,
>
> Here is, as promised, the rest of the heavy usage scenario requested. It has
> been kindly generated by Autodesk partners.
>
> You will find these results much more interesting than the previous ones.
>
> I would be interested to know if SystemTAP can handle an average load of 58K
> events per second.
>
> Heavy usage scenario : Trace from Autodesk, 64 bits x86_64, 4 CPUs
>
> 621 MB trace. Duration : 9m10s.
>
>
> Units are in seconds
>
> total :
> wall time : 546.479691385 (9m10s)
>
> total cpu time available :
> 4*546.479691385 = 2185.918765540
>
> total idle time (sum for 4 cpus): 1135.045529154
> (time spent in process 0 in "unknown" syscall)
>
> Total cpu time used :
> 2185.918765540 - 1135.045529154 = 1050.873236386
> System load : 1050.873236386/2185.918765540 = 48.07%
>
>
> lttd :
> (note : in the trace, lttd is : pid 11413)
>
> overall cpu time in system call : 3.141004863
>   note : in this trace, usermode time is accounted as system call time due
>   to a bug in instrumentation of system calls. Autodesk has been informed of
>   the problem. Instrumentation set provided by Autodesk is available there :
>   http://ltt.polymtl.ca/ > Third party contributions.
> overall cpu time in traps : 0.00223371
>
>
> Time used by lttd :
>
> Percentage of wall time :
> 3.143238573/2185.918765540 * 100% = 0.144%
> Percentage of cpu time :
> 3.143238573/1050.873236386 *100% = 0.299%
>
>
> overall event count (for the whole 9m10s on the 4 CPUs)
>
> events count : 31946639
>
> detail per event type
>
> Statistic for  'event_types' :
>
> state_dump_facility_load :  11
> wait :  17759
> free :  3017
> exit :  3017
> fork :  3014
> soft_irq_entry :  2964643
> softirq :  2188180
> soft_irq_exit :  2964643
> tasklet_entry :  523479
> tasklet_exit :  523479
> expired :  1061734
> wakeup :  1711432
> schedchange :  2620332
> syscall_entry :  5198 (should be more than that)
> write :  322312
> read :  614706
> syscall_exit :  5180 (should be more than that)
> trap_entry :  3799122
> trap_exit :  3799122
> open :  139309
> close :  166203
> irq_exit :  886808
> packet_in :  145148
> packet_out :  122118
> page_alloc :  387151
> page_free :  3536
> irq_entry :  886808
> sendmsg :  138882
> select :  4715579
> recvmsg :  27091
> ioctl :  272526
> set_itimer :  13725
> buf_wait_end :  1485
> buf_wait_start :  1484
> poll :  811642
> seek :  84544
> call :  2
> shm_create :  1
> create :  1567
> signal :  7707
> exec :  2725
> page_wait_end :  109
> page_wait_start :  109
>
>
> Event rate :
>
> over wall time :
> 31946639/546.479691385 = 58459 events/s
>
> over combined available cpu time :
> 31946639/2185.918765540 = 14614 events/s
>
>
> Mathieu Desnoyers
>
>
> Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
> OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
> _______________________________________________
> ltt-dev mailing list
> [hidden email]
> http://www.listserv.shafik.org/listserv/listinfo/ltt-dev
>
OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
Reply | Threaded
Open this post in threaded view
|

Re: Interoperability of LTTng and LTTV with SystemTAP

Marcelo Tosatti
In reply to this post by Mathieu Desnoyers-2
Hi Mathieu,

On Tue, Dec 13, 2005 at 09:25:15PM -0500, Mathieu Desnoyers wrote:

<snip>

> Frank asked me, at the end of our IRC discussion, if I could send some numbers
> about how much information is transferred. Here is the first result : a trace
> taken on a system used for start/stop of web broser (mozilla) and doing a "find"
> on the root of the system. I plan to do the same on heavily loaded
> system (I am trying to get an heavy commercial application for this) and very
> heavily loaded system (I will try ping -f and a few others) as soon as I have
> the time.
>
> I have also been asked about the impact of the copy of the information to disk
> on the system behavior. The percentage of cpu time used is not relevant in this
> first usage scenario because the system was most of the time idle. Still, about
> 1.79% (cpu 0) and 1.89% (cpu 1) of cpu time has been used by the disk writer
> daemon, which is not much considering that the system usage was 6.78% for cpu 0
> and 9.79% for cpu 1. A heavily loaded system will give us a better insight.

It might be interesting to measure the impact of tracing on the
performance of synthetic workloads (preferably ones which are
meaningful/mimic behaviour of real loads).

Ie. the overhead of tracing under different loads.

No?

> However, I would say that probe effect at the call site is much more important
> than the effect of a much lower priority disk writer daemon.

It depends really. The log disk writes could interfere badly with
sequential disk read/write streams, and workloads dominated by such
kind of accesses would suffer more from that than from the CPU
effect.

> For the probe effect, the microbenchmarks I made tells that logging an event takes about
> 220 ns.

And blows a few cachelines?

> Here are the results. Note that I have taken a short trace (15 seconds) just
> because I was in a hurry before preparing a presentation at that moment.

Have you tried to use any sort of PMC hardware to measure the effects
more precisely (including cache effects)?

Best wishes

Reply | Threaded
Open this post in threaded view
|

Re: Interoperability of LTTng and LTTV with SystemTAP

Mathieu Desnoyers-2
* Marcelo Tosatti ([hidden email]) wrote:
>
> It might be interesting to measure the impact of tracing on the
> performance of synthetic workloads (preferably ones which are
> meaningful/mimic behaviour of real loads).
>
> Ie. the overhead of tracing under different loads.
>
> No?
>

Absolutely. This is what you will find inside the subsequent emails of the
thread.

Usage scenario 2 : 4 CPU x86_64, with 48% workload. Used by Autodesk for
compiling and rendering.

Usage scenario 3 : 1 CPU HT Pentium 4, running a ping flood on the loopback
pseudo interface.


> > However, I would say that probe effect at the call site is much more important
> > than the effect of a much lower priority disk writer daemon.
>
> It depends really. The log disk writes could interfere badly with
> sequential disk read/write streams, and workloads dominated by such
> kind of accesses would suffer more from that than from the CPU
> effect.
>

Well, I hope that people will use the right tool for the right job. As the trace
from Autodesk demonstrates, it is only 621MB long for 9m10s running on 4
CPUs.

If the disk I/O is really a problem, one has the choice to use bigger memory
buffers. Someone in this situation could start the tracer with 512MB of memory
buffers dedicated to tracing and only launch the disk writer daemon once the
tracing is over.

LTTng also comes with a "flight recorder mode", which can keep the last bits of
information of a running system. It can be useful to investigate a crash or
simply to keep the information before a specific condition triggers a trace
stop.

We can think of other solutions too : installing disks specifically used for
tracing or sending the data remotely via network.

If someone really wants to investigate logging impact on disk I/O, feel free
to do it, but I think that more important performance issues exist with
tracing, like the probe effect.


> > For the probe effect, the microbenchmarks I made tells that logging an event takes about
> > 220 ns.
>
> And blows a few cachelines?
>

There will always be a need of a few cachelines for tracing control information,
writing events to memory and holding the supplementary instructions for
logging.

> > Here are the results. Note that I have taken a short trace (15 seconds) just
> > because I was in a hurry before preparing a presentation at that moment.
>
> Have you tried to use any sort of PMC hardware to measure the effects
> more precisely (including cache effects)?
>

The only testing that I have done is to use the cpu TSC to know of much time is
spent where. I use the cpu_khz value to transform that into seconds.

I see that PAPI and perfctr is an interesting investigation field, especially
for cache effects. I will try it as soon as I get the time. It makes me think
that periodically saving these hardware counters with a trace could lead to very
interesting analysis.

Thanks!


Mathieu



OpenPGP public key:              http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint:     8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
Reply | Threaded
Open this post in threaded view
|

Re: Interoperability of LTTng and LTTV with SystemTAP (heavy usage)

Martin Hunt
In reply to this post by Mathieu Desnoyers-2
On Thu, 2005-12-15 at 20:04 -0500, Mathieu Desnoyers wrote:
> Hi,
>
> Here is, as promised, the rest of the heavy usage scenario requested. It has
> been kindly generated by Autodesk partners.

> I would be interested to know if SystemTAP can handle an average load of 58K
> events per second.

A simple probe that logs a small amount of data typically takes <
5usecs, which is 200K events per second. I've seen 780K events on my
dual-processor Xeon. However, that uses almost all the cpu time.

> Heavy usage scenario : Trace from Autodesk, 64 bits x86_64, 4 CPUs
>
> 621 MB trace. Duration : 9m10s.


That's only 1.1MB/sec. With relayfs or procfs, I get 40MB/sec until the
buffers are filled, with sustained rates between 10-20 MB/sec depending
on disk type.


> Time used by lttd :
>
> Percentage of wall time :
> 3.143238573/2185.918765540 * 100% = 0.144%
> Percentage of cpu time :
> 3.143238573/1050.873236386 *100% = 0.299%
>
>
> overall event count (for the whole 9m10s on the 4 CPUs)
>
> events count : 31946639

For systemtap: Worst case for a small probe, 5usec/event, so that's
160 sec cpu time.

Percent of wall time: 7.3
Percent of cpu time: 15.2

That's worst case for a simple trace. I actually benchmark kprobes at
1.4usec and a systemtap script that does a printf of a small string at
2.5usecs. So a best case would be half the above numbers.

Martin