Re: dejagnu version update? [CORRECTION: not a regression in DejaGnu; GDB testsuite bug]

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: dejagnu version update? [CORRECTION: not a regression in DejaGnu; GDB testsuite bug]

Sourceware - gdb list mailing list
[cc-ing Joel as the originator, and <[hidden email]>]

On Tue, 9 Jun 2020, Jacob Bachmeyer wrote:

> >> I ran a quick bisection and the culprit turned out to be:
> >>
> >> ba60272a5ac6f6a7012acca03f596a6ed003f044 is the first bad commit
> >> commit ba60272a5ac6f6a7012acca03f596a6ed003f044
> >> Author: Jacob Bachmeyer <[hidden email]>
> >> Date:   Mon May 25 08:40:46 2020 -0600
> >>
> >>      Establish a default C compiler by evaluating [find_gcc] if no
> >> other compiler is given.
> >>
> >>  So this regression has to be fixed before any new release is made.
> >
> > I will look into this.  So far, I have confirmed that the problem does
> > occur and that setting the "compiler" board_info key in
> > baseboards/unix.exp seems to avoid it.  It looks like the default is
> > not being selected in all cases where it should be used.  I still need
> > to find the exact problem to write a regression test to isolate this
> > bug and make it stay squashed.
>
> After further efforts, and a few hours wasted trying to figure out why
> additional tracing code added to default_target_compile was not
> producing output, I eventually decided to just make
> default_target_compile throw a Tcl error -- and I did not get a Tcl
> error when running the tests...
>
> That is "very interesting" and a quick grep -R reveals the culprit:  the
> GDB testsuite is overriding default_target_compile in
> gdb/testsuite/lib/future.exp.  Comments indicate that that file was
> intended to temporarily bundle certain features with the GDB testsuite
> that had not yet been merged into upstream DejaGnu -- in 2004.  Now
> DejaGnu core is continuing to develop, leaving the old code copied into
> the GDB testsuite broken, and making this NOTOURBUG.

 Well, as the name suggests (regrettably there's no adequate documentation
there) this procedure is there to be overridden.  The way it's done in GDB
might perhaps be non-standard, as the standard way AFAICS would be by
providing a `gdb_compile' handler, but I think this is tangential, and the
chosen solution is there possibly because DejaGnu had no `${tool}_compile'
knob back then (I haven't checked).  That does not really matter though,
as the net effect would be the same.

> In short, this is not a regression in DejaGnu; this is code duplicated
> into the GDB testsuite that was slated for removal from that location
> years ago and needs to be removed from the GDB testsuite, or at least
> made conditional on running under a version of DejaGnu old enough to
> require it, if such versions are still supported for running the GDB
> testsuite.  If that code has added features not present in upstream
> DejaGnu over the years, now is the time to get those patches in.

 Well, it clearly is a functional regression as the interface has changed,
even if not a formal one, and I would consider it a release stopper.

 As it happens there are about 2 users of DejaGnu there, which means any
fatal regression would have been easily caught in the course of change
verification (and running binutils/GDB and GCC test suites natively is
about as easy as it can be: `configure && make && make check'), and indeed
should have, and then sorted with the respective communities if indeed the
interface change is unavoidable, with a transition period so that the
users can prepare changes to their frameworks, including backports to
various release branches if applicable, before the old interface is
removed from DejaGnu.  My suggestion would be to revert the change, get
the details sorted with GDB people and only reapply it when all parties
are ready.

 Anyway, I have completed the verification I have volunteered to do and
therefore I'm done with my part.  Please sort it between the communities.  
I have other stuff to do I have committed to.

  Maciej
Reply | Threaded
Open this post in threaded view
|

GDB testsuite overrides default_target_compile and breaks

Sourceware - gdb list mailing list
[previous Subject:  Re: dejagnu version update? [CORRECTION: not a
regression in DejaGnu; GDB testsuite bug] ]
[adding Tom Tromey to CC list per advice from Rob Savoye]
[adding main DejaGnu mailing list to CC list]

In brief for those new to this, the GDB testsuite is currently broken
when run with the current DejaGnu Git master.  The problem was initially
reported off-list and traced to a copy of default_target_compile in the
GDB testsuite.  A recent change to DejaGnu internals included a change
to that procedure upstream.  Cleanup enabled by that change breaks the
old copy of the code in the GDB testsuite.

What is needed in brief:
(1) Merge the features of default_target_compile that the GDB testsuite
depends on upstream for 1.6.3.
(2) Find the actual extensibility that GDB needs here and add that
support to the default_target_compile rewrite slated for 1.6.4.

Maciej W. Rozycki wrote:

> [cc-ing Joel as the originator, and <[hidden email]>]
>
> On Tue, 9 Jun 2020, Jacob Bachmeyer wrote:
>  
>>>> I ran a quick bisection and the culprit turned out to be:
>>>>
>>>> ba60272a5ac6f6a7012acca03f596a6ed003f044 is the first bad commit
>>>> commit ba60272a5ac6f6a7012acca03f596a6ed003f044
>>>> Author: Jacob Bachmeyer <[hidden email]>
>>>> Date:   Mon May 25 08:40:46 2020 -0600
>>>>
>>>>      Establish a default C compiler by evaluating [find_gcc] if no
>>>> other compiler is given.
>>>>
>>>>  So this regression has to be fixed before any new release is made.
>>>>        
>>> I will look into this.  So far, I have confirmed that the problem does
>>> occur and that setting the "compiler" board_info key in
>>> baseboards/unix.exp seems to avoid it.  It looks like the default is
>>> not being selected in all cases where it should be used.  I still need
>>> to find the exact problem to write a regression test to isolate this
>>> bug and make it stay squashed.
>>>      
>> After further efforts, and a few hours wasted trying to figure out why
>> additional tracing code added to default_target_compile was not
>> producing output, I eventually decided to just make
>> default_target_compile throw a Tcl error -- and I did not get a Tcl
>> error when running the tests...
>>
>> That is "very interesting" and a quick grep -R reveals the culprit:  the
>> GDB testsuite is overriding default_target_compile in
>> gdb/testsuite/lib/future.exp.  Comments indicate that that file was
>> intended to temporarily bundle certain features with the GDB testsuite
>> that had not yet been merged into upstream DejaGnu -- in 2004.  Now
>> DejaGnu core is continuing to develop, leaving the old code copied into
>> the GDB testsuite broken, and making this NOTOURBUG.
>>    
>
>  Well, as the name suggests (regrettably there's no adequate documentation
> there) this procedure is there to be overridden.

No, *_hook indicates a procedure that is there to be overridden, but not
by testsuites.  Overriding procedures inside DejaGnu is generally for
testing lab customization, because a site can update their init files
when they update DejaGnu, but testsuites need to work with multiple
versions of DejaGnu.

>   The way it's done in GDB
> might perhaps be non-standard, as the standard way AFAICS would be by
> providing a `gdb_compile' handler, but I think this is tangential, and the
> chosen solution is there possibly because DejaGnu had no `${tool}_compile'
> knob back then (I haven't checked).  That does not really matter though,
> as the net effect would be the same.
>  

DejaGnu does not have a ${tool}_compile -- that would be a legitimate
procedure for a testsuite to define and use, and I believe that the GDB
testsuite does so.  There is a ${target}_compile customization point
that is intended for testing labs to override for specific target
boards.  (It is not quite correctly handled in target_compile at the
moment, but this is slated to be fixed for 1.6.4.)  The
default_target_compile procedure is intended to be the default handler
for target_compile for targets that do not have their own special
compilation procedures.

>> In short, this is not a regression in DejaGnu; this is code duplicated
>> into the GDB testsuite that was slated for removal from that location
>> years ago and needs to be removed from the GDB testsuite, or at least
>> made conditional on running under a version of DejaGnu old enough to
>> require it, if such versions are still supported for running the GDB
>> testsuite.  If that code has added features not present in upstream
>> DejaGnu over the years, now is the time to get those patches in.
>>    
>
>  Well, it clearly is a functional regression as the interface has changed,
> even if not a formal one, and I would consider it a release stopper.
>  

The changes were to an internal component and broke an out-of-tree copy
of same.  I have to draw a line somewhere, and "monkeypatching the core
is not supported and can break" is necessary for DejaGnu to develop at
all.  Add that comments in gdb/testsuite/lib/future.exp clearly state
(from 2004! 16 years ago!) that that entire file was supposed to only
temporarily exist until its features were merged upstream and I say that
now is the time to pick that dropped ball back up and get those features
merged upstream.

>  As it happens there are about 2 users of DejaGnu there, which means any
> fatal regression would have been easily caught in the course of change
> verification (and running binutils/GDB and GCC test suites natively is
> about as easy as it can be: `configure && make && make check'), and indeed
> should have, and then sorted with the respective communities if indeed the
> interface change is unavoidable, with a transition period so that the
> users can prepare changes to their frameworks, including backports to
> various release branches if applicable, before the old interface is
> removed from DejaGnu.

There will certainly be a lengthy deprecation process for the planned
future API cleanups; indeed, the difference between the last release in
1.x and 2.0 is planned to be the removal of compatibility shims, but
that is still very far off.

>   My suggestion would be to revert the change, get
> the details sorted with GDB people and only reapply it when all parties
> are ready.
>  

The actual change that broke the out-of-tree copy of
default_target_compile was cleaning up the board files to remove the
now-optional definition of the "compiler" board_info key.  (Some
testsuites do not need to compile code and running find_gcc emits
useless "verbose noise" in those cases.)  A patch to revert this clean
up (but not the addition of [find_gcc] as default in
default_target_compile) may be reasonable on the release branches for
the remaining 1.6.x releases, since it would not significantly affect
the core, but the change stays on the master branch.  Monkeypatching the
DejaGnu core is not supported and cannot be supported long-term.  (In
the future, default_target_compile may eventually disappear from the
global namespace as seen by testsuites.  If GDB continues to override
it, the rename will end up deleting the compatibility shim while the
internal calls still go to the internal copy of the procedure.  Then it
all blows up with a Tcl error in DejaGnu 2.0 when the compatibility shim
is removed.)

GDB seems to have also extended default_target_compile, so a discussion
on this is needed because there seems to be a need for some kind of
language extensibility feature to allow new language-default-selector
options to be defined without overriding the entire
default_target_compile procedure (which could also break in a test lab
that uses a custom ${target}_compile procedure for a certain board).  
Since default_target_compile is slated to be rewritten between 1.6.3 and
1.6.4, now is a very good time to get additional features that it needs
to have into the DejaGnu core.  The API is now documented as
"target_compile" in Git master.  Backwards-compatible extensions (like
adding new language default options) are possible, of course.

If I understand correctly, merging the features currently carried in
gdb/testsuite/lib/future.exp upstream will cause all existing GDB
releases to correctly use the upstream version of
default_target_compile, making the partial reversion patch unnecessary.

>  Anyway, I have completed the verification I have volunteered to do and
> therefore I'm done with my part.

Thank you for your help.  [To Maciej W. Rozycki:  The requests for
further efforts in this email are for other people who work more closely
with GDB.]



-- Jacob
Reply | Threaded
Open this post in threaded view
|

Re: GDB testsuite overrides default_target_compile and breaks

Tom Tromey-4
I trimmed the cc list down to just gdb and dejagnu.

Jacob> What is needed in brief:
Jacob> (1) Merge the features of default_target_compile that the GDB
Jacob> testsuite depends on upstream for 1.6.3.
Jacob> (2) Find the actual extensibility that GDB needs here and add that
Jacob> support to the default_target_compile rewrite slated for 1.6.4.

Sounds reasonable to me.

Jacob> The changes were to an internal component and broke an out-of-tree
Jacob> copy of same.  I have to draw a line somewhere, and "monkeypatching
Jacob> the core is not supported and can break" is necessary for DejaGnu to
Jacob> develop at all.

FWIW this makes sense to me.  I think it's never been a great idea to
have this override in gdb.  Dejagnu wasn't very actively maintained for
a long time, and had low engagement from gdb developers, so it was more
like a convenient way to move forward.

Jacob> Monkeypatching the DejaGnu core is not supported and cannot be
Jacob> supported long-term.

gdb does it other spots as well.  For example, check-test-names.exp
overrides log_summary and reset_vars that gdb can show duplicated test
names, and gdb.exp renames "cd" to avoid problems with relative log file
names.

Jacob> GDB seems to have also extended default_target_compile, so a
Jacob> discussion on this is needed because there seems to be a need for some
Jacob> kind of language extensibility feature to allow new
Jacob> language-default-selector options to be defined without overriding the
Jacob> entire default_target_compile procedure

Yes, that would be good to have.  I think this has been patched when a
new language is added to gdb; changing this function to work in some
extensible way would be a nice improvement... while there are no
concrete plans I know of to add new languages to gdb, one never knows
when someone may show up with patches.

Jacob> If I understand correctly, merging the features currently carried in
Jacob> gdb/testsuite/lib/future.exp upstream will cause all existing GDB
Jacob> releases to correctly use the upstream version of
Jacob> default_target_compile, making the partial reversion patch
Jacob> unnecessary.

Yes, assuming that gdb's detection code continues to work.  It does
this:

    if {[info procs find_gnatmake] == ""} {
        rename gdb_find_gnatmake find_gnatmake
        set use_gdb_compile 1
    }
    # ... sequence of such checks ...
    if {$use_gdb_compile} {
        catch {rename default_target_compile {}}
        rename gdb_default_target_compile default_target_compile
    }

Anyway, I compared gdb_default_target_compile with
default_target_compile.  I think there are two kinds of changes.

One is the addition of new languages, namely Rust and Go.  This seems
relatively straightforward.

The other is the addition of "early_flags":

        if {[regexp "^early_flags=" $i]} {
            regsub "^early_flags=" $i "" tmp
            append early_flags " $tmp"
        }
...
    switch $type {
        "object" {
            set opts "$early_flags $add_flags $sources"
        }
        ... etc

This seems to be needed to adapt to some distro-specific compiler
issues.  One example is:

                if { [test_compiler_info "gcc-*"] || [test_compiler_info "clang-*"] } {
                    # Undo debian's change in the default.
                    # Put it at the front to not override any user-provided
                    # value, and to make sure it appears in front of all the
                    # shlibs!
                    lappend new_options "early_flags=-Wl,--no-as-needed"
                }

I can write some patches to merge this stuff in.

Tom
Reply | Threaded
Open this post in threaded view
|

Re: GDB testsuite overrides default_target_compile and breaks

Rob Savoye
On 6/19/20 6:57 AM, Tom Tromey wrote:

> FWIW this makes sense to me.  I think it's never been a great idea to
> have this override in gdb.  Dejagnu wasn't very actively maintained for
> a long time, and had low engagement from gdb developers, so it was more
> like a convenient way to move forward.

  Post Cygnus, DejaGnu has never had any funding, and working on it
rarely got any support from management or clients. It's been relying on
volunteer labor for over 20 years... I totally understand GDB developers
extending things for GDB as the most efficient way to get their job
done. But while there is some activity, it'd be nice to migrate those
improvements where possible to DejaGnu for the next release. Thanks for
being open to getting this cleaned up sooner than later.

        - rob -
Reply | Threaded
Open this post in threaded view
|

Re: GDB testsuite overrides default_target_compile and breaks

Sourceware - gdb list mailing list
In reply to this post by Tom Tromey-4
Tom Tromey wrote:

> Jacob> What is needed in brief:
> Jacob> (1) Merge the features of default_target_compile that the GDB
> Jacob> testsuite depends on upstream for 1.6.3.
> Jacob> (2) Find the actual extensibility that GDB needs here and add that
> Jacob> support to the default_target_compile rewrite slated for 1.6.4.
>
> Sounds reasonable to me.
>
> Jacob> The changes were to an internal component and broke an out-of-tree
> Jacob> copy of same.  I have to draw a line somewhere, and "monkeypatching
> Jacob> the core is not supported and can break" is necessary for DejaGnu to
> Jacob> develop at all.
>
> FWIW this makes sense to me.  I think it's never been a great idea to
> have this override in gdb.  Dejagnu wasn't very actively maintained for
> a long time, and had low engagement from gdb developers, so it was more
> like a convenient way to move forward.
>  

 From how it was explained to me, that may have been the only way
forward at the time.

> Jacob> Monkeypatching the DejaGnu core is not supported and cannot be
> Jacob> supported long-term.
>
> gdb does it other spots as well.  For example, check-test-names.exp
> overrides log_summary and reset_vars that gdb can show duplicated test
> names, and gdb.exp renames "cd" to avoid problems with relative log file
> names.
>  

I will look into these and see what we can do to move generally useful
functionality upstream.  (There are many features in the GDB testsuite
that I eventually want in DejaGnu, like parallel testing, although the
core implementation of that might be somewhat different as I want it to
be as close to transparent as possible.)

Quick checks on those examples suggest that they will not cause problems
(until the overridden procedures disappear into an internal namespace)
because check-test-names.exp wraps log_summary and reset_vars instead of
replacing them with out-of-tree copies.  Wrapping "cd" avoids problems
for similar reasons (technically that one is monkeypatching Tcl itself
rather than DejaGnu) but DejaGnu's own initialization procedures should
probably ensure that the log file is active with an absolute file name.

I might have to add a "monkeypatch the monkeypatches" module, loaded
after the init files, to maintain backwards compatibility with the
recent GDB releases, but that is a last resort.

> Jacob> GDB seems to have also extended default_target_compile, so a
> Jacob> discussion on this is needed because there seems to be a need for some
> Jacob> kind of language extensibility feature to allow new
> Jacob> language-default-selector options to be defined without overriding the
> Jacob> entire default_target_compile procedure
>
> Yes, that would be good to have.  I think this has been patched when a
> new language is added to gdb; changing this function to work in some
> extensible way would be a nice improvement... while there are no
> concrete plans I know of to add new languages to gdb, one never knows
> when someone may show up with patches.
>  

The rewritten default_target_compile will be less of an "if forest" and
more table-driven.  Language defaults then become entries in an internal
table, so all we need is an API call for init files to add entries to
that table.

> Jacob> If I understand correctly, merging the features currently carried in
> Jacob> gdb/testsuite/lib/future.exp upstream will cause all existing GDB
> Jacob> releases to correctly use the upstream version of
> Jacob> default_target_compile, making the partial reversion patch
> Jacob> unnecessary.
>
> Yes, assuming that gdb's detection code continues to work.  It does
> this:
>
>     if {[info procs find_gnatmake] == ""} {
>         rename gdb_find_gnatmake find_gnatmake
>         set use_gdb_compile 1
>     }
>     # ... sequence of such checks ...
>     if {$use_gdb_compile} {
>         catch {rename default_target_compile {}}
>         rename gdb_default_target_compile default_target_compile
>     }
>  

That code is testing the existence of API procedures to decide if it
wants to override an internal procedure.  There are no plans to remove
find_* prior to 2.0; even if a better API ("testsuite tool"?) is
introduced, compatibility shims will remain.  (Now that I think about
it, even 2.0 may still carry a "compatibility mode" for the 1.x API.)

> Anyway, I compared gdb_default_target_compile with
> default_target_compile.  I think there are two kinds of changes.
>
> [...]
>
> I can write some patches to merge this stuff in.
>  

Many thanks for the patch set.  I will merge them and add documentation
and tests over the weekend or so.


-- Jacob