per-entity statistics

Martin Peschke
Hi,

I am interested in SCSI-related statistics. I have been looking
through the available tapsets. I have some questions and I
would appreciate your thoughts.

First, certain data, like request latencies, are most useful
when provided per device, or per LUN respectively, I guess.
My question is what kind of systemtap variable would suit this
requirement best.

Tapsets like http://sourceware.org/ml/systemtap/2005-q3/msg00507.html
use systemtap's associative arrays for similar purposes.
I could imagine an array in order to be able to store data for each
LUN separately.

Would this approach work if I want to maintain more than a single
counter per LUN? Looks like going this path might result in an
array of arrays, say, an array of latency histograms. Is this
feasible?

Would it be better to maintain separate global variables for
each LUN? Something like:

global latencies_lun1
global latencies_lun2
global latencies_lun3
etc.

But this would require the script, i.e. the number of global arrays,
to be adapted for each measurement. Has someone played with doing this
in an automated fashion? A number of entities as input, and out comes
an appropriate script.

This leads to my other questions, which are about selecting LUNs
one wants to do measurements for. The scripts I have come across
so far and which are capable of providing per-entity statistics,
e.g. per-process statistics, measure all entities, or all processes
respectively.

Assuming I want to measure 3 LUNs attached through adapter A, but
I don't want to strain my system measuring all the other 23 LUNs
attached through other adapters, how could I do that? I guess
my probe would need to check every time it fires whether
it is active for a LUN to be measured before continuing.

I assume the LUNs to be measured would be described using some array.
Then my probe would try to match LUNs using that array.
That wouldn't buy much efficiency, would it? Any other idea?

Besides, is there a good way to pass LUN selections to a
systemtap script? Or would it be best, as mentioned above,
to regenerate a script with the selection stuff being
built in?

Martin

Re: per-entity statistics

Frank Ch. Eigler

Martin Peschke <[hidden email]> writes:

> [...]  First, certain data, like request latencies, is most useful
> if being provided per device, or per LUN respectively, I guess.
> [...] I could imagine an array in order to be able to store data for
> each LUN separately.  Would this approach work if I want to maintain
> more than a single counter per LUN?

> Looks like going this path might result in an array of arrays, say,
> an array of latency histograms. Is this feasible?

Systemtap arrays, like those in awk, are indexed by a tuple of key
values.  You can use device numbers, lun numbers, names, in any
consistent combination for each array.  Each array value can be either
a string, number, or statistics.

If you need to track statistics on more than one level of grouping at
a time, then use two or more arrays, with a different set of keys.
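
For anyone more at home in C than in awk, the effect of feeding values into a statistics array keyed by a (dev, lun) tuple can be pictured roughly as below. This is only a sketch of the idea, not how systemtap implements its arrays; `stat_slot`, `stat_lookup`, `stat_add` and the fixed table size are names and choices invented for this illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical illustration: one running aggregate per (dev, lun) key. */
#define MAX_SLOTS 64

struct stat_slot {
    int     used;
    int64_t dev, lun;               /* the key tuple */
    int64_t count, sum, min, max;   /* the aggregate */
};

static struct stat_slot slots[MAX_SLOTS];

/* Find the slot for (dev, lun), creating it on first use.
 * Slots are filled in order and never freed, so the first unused
 * slot means the key is new. Returns NULL when the table is full. */
static struct stat_slot *stat_lookup(int64_t dev, int64_t lun)
{
    for (size_t i = 0; i < MAX_SLOTS; i++) {
        if (slots[i].used && slots[i].dev == dev && slots[i].lun == lun)
            return &slots[i];
        if (!slots[i].used) {
            slots[i].used = 1;
            slots[i].dev = dev;
            slots[i].lun = lun;
            slots[i].min = INT64_MAX;
            slots[i].max = INT64_MIN;
            return &slots[i];
        }
    }
    return NULL;
}

/* Rough counterpart of adding one sample to array[dev, lun].
 * Returns 0 on success, -1 if the table is full. */
static int stat_add(int64_t dev, int64_t lun, int64_t value)
{
    struct stat_slot *s = stat_lookup(dev, lun);
    if (!s)
        return -1;
    s->count++;
    s->sum += value;
    if (value < s->min) s->min = value;
    if (value > s->max) s->max = value;
    return 0;
}
```

Tracking a second level of grouping (say, per LUN only) would simply mean a second table with a different key, which is the "two or more arrays" approach.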

> Would it be better to maintain separate global variables for each
> LUN? Something like:
>
> global latencies_lun1
> global latencies_lun2
> global latencies_lun3

That's not how I'd do it.  Instead, something like:

##### scsi-latencies.stp

global lun_latencies # [lun:number]
global devlun_latencies # [dev:number, lun:number]

probe scsi.latencies = kernel.function("....") {
  if (! scsi_lun_selected ($lun)) next
  lat = EXPR
  lun_latencies[$lun] <<< lat
  devlun_latencies[$dev,$lun] <<< lat
}

probe end {
  # report?
}

##### scsi-latencies-selected.stp

function scsi_lun_selected(id) { return 1 }



> [...]  Assuming I want to measure 3 LUNs attached through adapter A
> [...]  I guess, my probe would need to check every time it has fired
> whether it's being active for a LUN to be measured before
> continuing.

That's right.
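
Such a per-event check need not be expensive. As a rough sketch in plain C (not systemtap script), assuming LUN numbers stay below 256, a small bitmap gives a constant-time membership test. `lun_select`, `lun_is_selected` and `MAX_LUN` are made-up names for this illustration.

```c
#include <stdint.h>

/* Assumption for this sketch: LUN numbers are small (below 256). */
#define MAX_LUN 256u

static uint64_t selected_luns[MAX_LUN / 64];

/* Mark a LUN as selected; done once, before measuring starts. */
static void lun_select(unsigned lun)
{
    if (lun < MAX_LUN)
        selected_luns[lun / 64] |= UINT64_C(1) << (lun % 64);
}

/* Per-event check: one bounds test, one load, one shift and mask. */
static int lun_is_selected(unsigned lun)
{
    if (lun >= MAX_LUN)
        return 0;
    return (int)((selected_luns[lun / 64] >> (lun % 64)) & 1u);
}
```

The cost of the check is then independent of how many LUNs are selected, so the remaining overhead is mostly the probe firing at all.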

> [...]  Besides, is there a good way to pass LUN selections to a
> systemtap script? [...]

We don't yet support the equivalent of command line arguments, so the
script would indeed have to be modified.  However, if one structures
the overall script into a few parts like above, one of them could contain
only the default lun-selection function.  The end-user could then override
just that function using an invocation like this:

stap -I scsi-tapset/  -e '
   function scsi_lun_selected(id) { return id==1 }
   probe scsi.latencies {}
'

- FChE

Re: per-entity statistics

Jose R. Santos
In reply to this post by Martin Peschke
Martin Peschke wrote:

>[...] Assuming I want to measure 3 LUNs attached through adapter A, but
>I don't want to strain my system measuring all the other 23 LUNs
>attached through other adapters, how could I do that? [...]

Hi Martin,

This is one of the scenarios we are designing our trace tool for.  While
it does not meet your requirement to probe only a selected number of
devices, something like that would be easy to hack into the current
implementation since we already take the SCSI host number, channel, lun and
id.  We also have hooks for the IO schedulers and system calls and have
plans to put hooks into SCSI drivers.  For some of the workloads that we
want to use this tool for, we need to be able to measure latencies from the
moment the IO was submitted by the application.

-JRS

Re: per-entity statistics

Jose R. Santos
In reply to this post by Frank Ch. Eigler
Frank Ch. Eigler wrote:

>Martin Peschke <[hidden email]> writes:
>
>
>That's not how I'd do it.  Instead, something like:
>
>##### scsi-latencies.stp
>
>global lun_latencies # [lun:number]
>global devlun_latencies # [dev:number, lun:number]
>
>probe scsi.latencies = kernel.function("....") {
>  if (! scsi_lun_selected ($lun)) next
>  lat = EXPR
>  lun_latencies[$lun] <<< lat
>  devlun_latencies[$dev,$lun] <<< lat
>}
>
>probe end {
>  # report?
>}
>
>##### scsi-latencies-selected.stp
>
>function scsi_lun_selected(id) { return 1 }
>  
>
FYI

Unless you only have a single SCSI card with just one channel, just
getting the LUN is not enough. You need the SCSI host number, channel,
lun, and ID. Here is a bit of code from the trace tool that shows how to
get that information using SystemTap.

function log_scsi_iodone_extra(var:long)
%{
        /* var carries the address of the struct scsi_cmnd being completed */
        struct scsi_cmnd *cmd = (struct scsi_cmnd *)((long)THIS->var);
        long long scsi_info;

        /* pack host, channel, lun and id into one value, 8 bits each */
        scsi_info = ((cmd->device->host->host_no & 0xFF) << 24) |
                    ((cmd->device->channel & 0xFF) << 16) |
                    ((cmd->device->lun & 0xFF) << 8) |
                    (cmd->device->id & 0xFF);

        /* scsi_info|data_direction|cmd_identifier| */
        _stp_printf("%lld|%d|%d", scsi_info, cmd->sc_data_direction, cmd->pid);
%}
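
On the post-processing side, the packed value can be split back into its four fields. The following is a minimal sketch in standalone C, assuming the same 8-bits-per-field layout as above; `struct scsi_addr`, `scsi_info_pack` and `scsi_info_unpack` are hypothetical names and not part of the trace tool.

```c
#include <stdint.h>

/* Hypothetical helper type: the four parts of a SCSI address. */
struct scsi_addr {
    unsigned host, channel, lun, id;
};

/* Same layout as the hook above: host<<24 | channel<<16 | lun<<8 | id. */
static int64_t scsi_info_pack(struct scsi_addr a)
{
    return ((int64_t)(a.host    & 0xFF) << 24) |
           ((int64_t)(a.channel & 0xFF) << 16) |
           ((int64_t)(a.lun     & 0xFF) <<  8) |
            (int64_t)(a.id      & 0xFF);
}

/* Inverse operation, as a post-processing script might do it. */
static struct scsi_addr scsi_info_unpack(int64_t info)
{
    struct scsi_addr a;
    a.host    = (unsigned)((info >> 24) & 0xFF);
    a.channel = (unsigned)((info >> 16) & 0xFF);
    a.lun     = (unsigned)((info >>  8) & 0xFF);
    a.id      = (unsigned)( info        & 0xFF);
    return a;
}
```

For example, host 2, channel 0, lun 5, id 3 packs to 0x02000503 and unpacks to the same four numbers, as long as each of them fits in 8 bits.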


Hope it helps

-JRS

Re: per-entity statistics

Martin Peschke
Jose R. Santos wrote:

> Unless you only have a single SCSI card with just one channel, just
> getting the LUN is not enough. You need the SCSI host number, channel,
> lun, and ID. Here is a bit of code from the trace tool that shows how to
> get that information using SystemTap.
>
> function log_scsi_iodone_extra(var:long)
> %{
> struct scsi_cmnd *cmd = (struct scsi_cmnd *)((long)THIS->var);
> long long scsi_info;
>
> scsi_info = ((cmd->device->host->host_no & 0xFF) << 24) |
> ((cmd->device->channel & 0xFF) << 16) |
> ((cmd->device->lun & 0xFF) << 8) |
> (cmd->device->id & 0xFF);

Well, it is obvious why you have masked, shifted and or'ed
those numbers. But one might easily get into trouble doing this.
Think about storage area networks and lots of devices being
attached to a single Linux system. For now this code works in
most environments because the Linux SCSI stack does its
own compact LUN enumeration. But it's not airtight.
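
To make the concern concrete, here is a small standalone C sketch. `pack8` mirrors the 8-bit masking quoted above; `pack16` is a purely hypothetical wider layout (16 bits per field is just as arbitrary a choice, only roomier). Both function names are invented for this example.

```c
#include <stdint.h>

/* 8 bits per field, as in the quoted code. */
static uint32_t pack8(unsigned host, unsigned channel,
                      unsigned lun, unsigned id)
{
    return ((uint32_t)(host    & 0xFF) << 24) |
           ((uint32_t)(channel & 0xFF) << 16) |
           ((uint32_t)(lun     & 0xFF) <<  8) |
            (uint32_t)(id      & 0xFF);
}

/* Hypothetical alternative: 16 bits per field in a 64-bit value. */
static uint64_t pack16(unsigned host, unsigned channel,
                       unsigned lun, unsigned id)
{
    return ((uint64_t)(host    & 0xFFFF) << 48) |
           ((uint64_t)(channel & 0xFFFF) << 32) |
           ((uint64_t)(lun     & 0xFFFF) << 16) |
            (uint64_t)(id      & 0xFFFF);
}
```

With the 8-bit layout, LUN 1 and LUN 257 on the same host, channel and id yield the same packed value, so their statistics would be merged silently. The wider layout keeps them apart, but only pushes the limit further out.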

Martin

Re: per-entity statistics

Martin Peschke
In reply to this post by Jose R. Santos
Jose R. Santos wrote:
> This is one of the scenarios we are designing our trace tool for.  While
> it does not meet your requirement to probe only a selected number of
> devices, something like that would be easy to hack into the current
> implementation since we already take the SCSI host number, channel, lun and
> id.  We also have hooks for the IO schedulers and system calls and have
> plans to put hooks into SCSI drivers.  For some of the workloads that we
> want to use this tool for, we need to be able to measure latencies from the
> moment the IO was submitted by the application.

The reason why I want probes to be able to select devices to be
scrutinized is to make sure that gathering statistics impacts
performance as little as possible.

Other ways to reduce probe overhead for simple latency measurements
include reducing the frequency of data being gathered and reported
to userspace, and reducing the amount of data sampled for each event,
I guess. Systemtap's statistics seem to fit these requirements
quite well, as long as the instant aggregation doesn't become more
expensive than some arithmetic operations or comparisons.
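
To illustrate what "some arithmetic operations" could mean here, the following is a rough C sketch of in-place aggregation into a power-of-two histogram. It is an assumption about the general technique, not systemtap's actual implementation; `latency_hist`, `log2_bucket` and `hist_add` are names invented for this example.

```c
#include <stdint.h>

/* Bucket 0 holds the value 0; bucket b (1..64) holds values in
 * the range [2^(b-1), 2^b - 1]. */
static uint64_t latency_hist[65];

/* Number of significant bits in v, i.e. the bucket index. */
static unsigned log2_bucket(uint64_t v)
{
    unsigned b = 0;
    while (v) {
        v >>= 1;
        b++;
    }
    return b;
}

/* Per-event work: a few shifts and one increment, nothing is
 * copied to userspace until the histogram is reported. */
static void hist_add(uint64_t latency)
{
    latency_hist[log2_bucket(latency)]++;
}
```

A latency of 1000 (in whatever unit is measured) lands in bucket 10, together with everything else between 512 and 1023.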

It might be feasible to add some device selection code to the trace
tool. However, I feel the trace tool would still do more than needed
for simple latency measurements, even if it supported device selection.

The trace tool might be advantageous when placing several time related
probes in order to measure an entire stack of delay components.
I am not sure what the universal answer regarding the performance
analysis question is, and whether there can be one at all.
I think, I will just try to hack up some prototype doing device
selection and using systemtap-style statistics.

Thanks for your thoughts, anyway. I am curious to see how the trace
tool performs and what kind of hiccups it will detect.

Martin

Re: per-entity statistics

Jose R. Santos
In reply to this post by Martin Peschke
Martin Peschke wrote:

>Jose R. Santos wrote:
>> Unless you only have a single SCSI card with just one channel, just
>> getting the LUN is not enough. You need the SCSI host number, channel,
>> lun, and ID. Here is a bit of code from the trace tool that shows how to
>> get that information using SystemTap.
>>
>> function log_scsi_iodone_extra(var:long)
>> %{
>> struct scsi_cmnd *cmd = (struct scsi_cmnd *)((long)THIS->var);
>> long long scsi_info;
>>
>> scsi_info = ((cmd->device->host->host_no & 0xFF) << 24) |
>> ((cmd->device->channel & 0xFF) << 16) |
>> ((cmd->device->lun & 0xFF) << 8) |
>> (cmd->device->id & 0xFF);
>
>Well, it is obvious why you have masked, shifted and or'ed
>those numbers. But one might easily get into trouble doing this.
>Think about storage area networks and lots of devices being
>attached to a single Linux system. For now this code works in
>most environments because the Linux SCSI stack does its
>own compact LUN enumeration. But it's not airtight.
>
>Martin
>
>  
>
I'll admit that I chose these based on my limited experience with
storage devices. While connecting a large number of devices to a single
Linux system is not rare, this is usually (in my experience) not done
through a single SCSI channel. It would seem odd to me for someone to
configure more than 256 SCSI devices on a single SCSI channel, but it
could happen. Have you seen environments where this is not the case?

I don't have a problem changing this (especially now that we do not have
post-processing scripts that depend on the structure of the data).

-JRS

Re: per-entity statistics

Li Guanglei
In reply to this post by Martin Peschke

> Besides, is there a good way to pass LUN selections to a
> systemtap script? Or would it be best, as mentioned above,
> to regenerate a script with the selection stuff being
> built in?
>
> Martin
>

We could use the -D option of stap to pass user-defined values into
stap scripts. Here is an example:

---- var.stp ------
function logvar()
%{
         _stp_printf("LUN: %d\n", LUN);
%}

probe end
{
         logvar()
}
---- end of var.stp ----

You can run this script by:

stap -g -D LUN=110 var.stp

then it will print:
LUN: 110

The Linux Kernel Event Trace tool I released does very limited
filtering in each probe. The only filtering is based on pid, and
some of the probes don't even filter by pid. One reason for the limited
filtering is that we initially wanted this tool to gather as much
info as possible, so that the post-processing tool could work on a
larger set of data to dig out more info. The idea is something like:
gather once, postprocess and analyze multiple times.

But anyway, how the tool should work needs to be based on the practice
and experience of the users of this tool, just like you :-). And I
will be glad if you could give more feedback about it, such as
additional filters or more event hooks.

Thanks.



Re: per-entity statistics

Martin Peschke
In reply to this post by Jose R. Santos
Jose R. Santos wrote:
> While connecting a large number of devices to a single
> Linux system is not rare, this is usually (in my experience) not done
> through a single SCSI channel. It would seem odd to me for someone to
> configure more than 256 SCSI devices on a single SCSI channel, but it
> could happen. Have you seen environments where this is not the case?

Good point.
It's probably fine to leave it the way it is until someone who has
actually built such an insane setup complains :)

Martin