stp_exit change

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

stp_exit change

Frank Ch. Eigler
Hi -

Can you please justify your change to runtime/io.c and
tapset/logging.stp?  Moving that line of code to stp_exit may be
problematic because this function is invoked from more places than the
exit() script function.  In particular, this change interferes with
STAP_SESSION_ERROR state handling.

- FChE

Reply | Threaded
Open this post in threaded view
|

Re: stp_exit change

Martin Hunt
This change was part of some error handling cleanup I have been meaning
to check in for a while. However after checking in that one change I
realized things have changed some and I will probably go in a slightly
different direction.

Currently the translator emits lots of code like this:
        atomic_set (& session_state, STAP_SESSION_ERROR);
        _stp_exit ();
and
        atomic_set (&session_state, STAP_SESSION_ERROR);
        _stp_error ("foo broke");
and sometimes we want to exit for normal reasons (not an error)
        atomic_set (& session_state, STAP_SESSION_STOPPING);
        _stp_exit ();

Within the runtime, sometimes unexpected errors happen and there is
either no return value for the function of the translator doesn't check
it. Or maybe the runtime function just wants to print out an exact error
message about what happened. These functions call stp_error or stp_exit.
They should really set the session state variable too.

So cleaning this up looks easy enough,
void _stp_exit (int err)
{
        if (err)
                atomic_set (&session_state, STAP_SESSION_ERROR);
        else
                atomic_set (&session_state, STAP_SESSION_STOPPING);
        _stp_exit_flag = 1;
}

void _stp_error (const char *fmt, ...)
{
        va_list args;
        va_start(args, fmt);
        _stp_vlog (ERROR, NULL, 0, fmt, args);
        va_end(args);
        _stp_exit(1);
}

I recommend eliminating _stp_softerror() unless you can explain the
difference between it and _stp_warn().

Currently, setting session_state to either STAP_SESSION_ERROR or
STAP_SESSION_STOPPING has the same effect; it stops probes except the
end probe from executing. STAP_SESSION_ERROR seems to have no other use.
Do you have plans for it?  Should stap exit with an error code if it is
set?




Reply | Threaded
Open this post in threaded view
|

Re: stp_exit change

Frank Ch. Eigler
Hi -


hunt wrote:

> This change was part of some error handling cleanup I have been meaning
> to check in for a while. [...]

What did you have in mind?


> Currently the translator emits lots of code like this:
> atomic_set (& session_state, STAP_SESSION_ERROR);
> _stp_exit ();
> and
> atomic_set (&session_state, STAP_SESSION_ERROR);
> _stp_error ("foo broke");
> and sometimes we want to exit for normal reasons (not an error)
> atomic_set (& session_state, STAP_SESSION_STOPPING);
> _stp_exit ();

Yes, from within the context of those calls, I believe they generally
make sense.

> Within the runtime, sometimes unexpected errors happen and there is
> either no return value for the function of the translator doesn't
> check it.

These both sound like bugs.

> Or maybe the runtime function just wants to print out an exact error
> message about what happened. These functions call stp_error or
> stp_exit.

Examples would be helpful.  Keep in mind the difference between
initialization-time vs probe-time errors.

> They should really set the session state variable too.

On the contrary, I would rather the runtime assume as little as
possible about translator code generation patterns.


> [...]  I recommend eliminating _stp_softerror() unless you can
> explain the difference between it and _stp_warn().

Certainly: the "ERROR" vs "WARN" "stp_vlog" channel, plus the
"WARNING" vs "ERROR" message from stpd.  The recently introduced "soft
error" concept is different from a warning, in that a script may
tolerate some limiting number of the former (and unwinds the offending
probe), and does not treat the latter any differently from a printf.


> Currently, setting session_state to either STAP_SESSION_ERROR or
> STAP_SESSION_STOPPING has the same effect; it stops probes except
> the end probe from executing. [...]

ERROR state should block "end" probes too.  An inspection of the
source suggests that it in fact does, unless something like your
stp_exit change broke it.


> Should stap exit with an error code if it is set?

Yes, that would be appropriate.  If for example the runtime's
"module_exit" function expected a return value from the
systemtap-generated shutdown code, this information could get passed
back in an orderly way.


- FChE
Reply | Threaded
Open this post in threaded view
|

Re: stp_exit change

Martin Hunt
On Wed, 2006-01-11 at 14:34 -0500, Frank Ch. Eigler wrote:
>
> > Within the runtime, sometimes unexpected errors happen and there is
> > either no return value for the function of the translator doesn't
> > check it.
>
> These both sound like bugs.

In at least one case I know of, yes. Other times, it may be OK. For
example, something like:

x = get_value();
...
int get_value() {
   ...
   if (something really bad and unexpected happens) {
        stp_error("bad mojo")
        return 0;
   }
   return val;
}

> On the contrary, I would rather the runtime assume as little as
> possible about translator code generation patterns.

I think we consistently disagree on this point. While the majority of
the runtime code should be self-contained and not care at all about the
translator, there is a need for functions that aid in the interface
between the two. For some reason you used to remind me over and over
that the runtime was supposed to be solely for the use of the
translator. So I have been surprised you haven't been sticking all kinds
of little functions and code chunks into it.

> > [...]  I recommend eliminating _stp_softerror() unless you can
> > explain the difference between it and _stp_warn().
>
> Certainly: the "ERROR" vs "WARN" "stp_vlog" channel, plus the
The channel is the same.
> "WARNING" vs "ERROR" message from stpd.  The recently introduced "soft
> error" concept is different from a warning, in that a script may
> tolerate some limiting number of the former (and unwinds the offending
> probe), and does not treat the latter any differently from a printf.

I think an error that is ignored is a warning. Errors are always fatal.
Which leads us back into the discussion about 1379.


Reply | Threaded
Open this post in threaded view
|

Re: stp_exit change

Frank Ch. Eigler
Hi -

hunt wrote:

> [...]
> int get_value() {
>    ...
>    if (something really bad and unexpected happens) {
> stp_error("bad mojo")
> return 0;
>    }
>    return val;
> }
> [...]

What are some existing examples of this?  How is the caller supposed
to know that something "fatal" occurred if not by rc?


> > On the contrary, I would rather the runtime assume as little as
> > possible about translator code generation patterns.
>
> I think we consistently disagree on this point. [...]  there is a
> need for functions that aid in the interface between the two.

It's a question of magnitude, of details.

> For some reason you used to remind me over and over that the runtime
> was supposed to be solely for the use of the translator. [...]

This idea is not contradictory with the other one.  The runtime API
needs to serve the translator, but not be dependent on its vagaries,
as far as reasonable and possible.  That makes for a more proper
layered design.


> [...] I think an error that is ignored is a warning.  Errors are
> always fatal. [...]

Except, as I said, the soft errors are not ignored.  They are counted
toward a limit, and they cause an unwinding of the probe script.  If
you believe there should be no such thing as a soft error, how exactly
do you propose systemtap handle timed-out locks?

- FChE
Reply | Threaded
Open this post in threaded view
|

Re: stp_exit change

Martin Hunt
On Wed, 2006-01-11 at 15:33 -0500, Frank Ch. Eigler wrote:

> Hi -
>
> hunt wrote:
>
> > [...]
> > int get_value() {
> >    ...
> >    if (something really bad and unexpected happens) {
> > stp_error("bad mojo")
> > return 0;
> >    }
> >    return val;
> > }
> > [...]
>
> What are some existing examples of this?  How is the caller supposed
> to know that something "fatal" occurred if not by rc?

One example, is all the map sets.
  _stp_map_set_isi (global_x, l->__tmp6, l->__tmp7, l->__tmp9);

needs to be changed to something like
  rc = _stp_map_set_isi (global_x, l->__tmp6, l->__tmp7, l->__tmp9);
  if (rc) {
    atomic_set (&session_state, STAP_SESSION_ERROR);
    if (rc == -1)
        _stp_error("map overflow at %s\n", probe_point);
    else
        _stp_error("internal error: bad args to _stp_map_set\n");
    goto out;
  }

This will not help our compile times or performance and it will have no
advantage over letting the runtime directly call stp_error(). Yes,
things continue on through the probe like the map set happened
successfully, but the worst thing that could happen is that something
later tries to read that same map node and it gets 0.

An alternative would use stp_error() calls in the runtime, but still
have the translator check rc and exit the probe immediately.
  if (_stp_map_set_isi (global_x, l->__tmp6, l->__tmp7, l->__tmp9))
    goto out;



Reply | Threaded
Open this post in threaded view
|

Re: stp_exit change

Frank Ch. Eigler
Hi -

> > > [...]
> > > int get_value() {
> > >    ...
> > >    if (something really bad and unexpected happens) {
> > > stp_error("bad mojo")
> > > return 0;
> > >    }
> > >    return val;
> > > }
> > > [...]
> >
> > What are some existing examples of this?  How is the caller supposed
> > to know that something "fatal" occurred if not by rc?

> One example, is all the map sets.
>   _stp_map_set_isi (global_x, l->__tmp6, l->__tmp7, l->__tmp9);

My question was about code *within the runtime* that wanted to call
stp_error, so this is not an example of that.

This is code generated by the translator, and it fails to check an
existing return code.  I think this used to work, sigh.

>   rc = _stp_map_set_isi (global_x, l->__tmp6, l->__tmp7, l->__tmp9);
>   if (rc) {
>     atomic_set (&session_state, STAP_SESSION_ERROR);
>     if (rc == -1)
>         _stp_error("map overflow at %s\n", probe_point);
>     else
>         _stp_error("internal error: bad args to _stp_map_set\n");
>     goto out;
>   }

That is not how the translator models errors within ordinary
statements.  Instead, it sets the context's "last_error" pointer, and
does a goto to the appropriate unwinding label.  This allows
consistent error reporting, and association with the script site.


> This will not help our compile times or performance and it will have
> no advantage over letting the runtime directly call
> stp_error(). [...]

That would presume that the runtime has enough information about the
context and the intentions of the user - something that's not true
e.g. with the "soft error" concept.  It also mixes policy and
mechanism concepts.


- FChE
Reply | Threaded
Open this post in threaded view
|

RE: stp_exit change

bibo,mao-2
In reply to this post by Frank Ch. Eigler
I do not know whether we can delete atomic_set (&session_state, STAP_SESSION_STOPPING) sentence from runtime/io.c and add this sentence into file tapset/logging.stp like this:
function exit () %{
   atomic_set (&session_state, STAP_SESSION_STOPPING);
    _stp_exit ();
%}


Regards
Bibo,mao

>-----Original Message-----
>From: [hidden email] [mailto:[hidden email]]
>On Behalf Of Frank Ch. Eigler
>Sent: 2006年1月11日 6:19
>To: Martin Hunt
>Cc: [hidden email]
>Subject: stp_exit change
>
>Hi -
>
>Can you please justify your change to runtime/io.c and
>tapset/logging.stp?  Moving that line of code to stp_exit may be
>problematic because this function is invoked from more places than the
>exit() script function.  In particular, this change interferes with
>STAP_SESSION_ERROR state handling.
>
>- FChE
Reply | Threaded
Open this post in threaded view
|

RE: stp_exit change

Martin Hunt
On Mon, 2006-01-16 at 12:49 +0800, Mao, Bibo wrote:
> I do not know whether we can delete atomic_set (&session_state, STAP_SESSION_STOPPING) sentence from runtime/io.c and add this sentence into file tapset/logging.stp like this:
> function exit () %{
>    atomic_set (&session_state, STAP_SESSION_STOPPING);
>     _stp_exit ();
> %}
That is exactly what the code did before I changed it. I was working
under the assumption that tapsets certainly shouldn't be mucking around
in translator internal state. Frank argued that he prefers that to
having the runtime do it and reverted the change back to what you have
above.

Martin