[PATCH] PR ld/21233: Avoid sweeping forced-undefined symbols in section GC

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

[PATCH] PR ld/21233: Avoid sweeping forced-undefined symbols in section GC

Maciej W. Rozycki-3
Complement commit 902e9fc76a0e ("PR ld/20828: Move symbol version
processing ahead of GC symbol sweep"), commit b531344c34b0 ("PR
ld/20828: Reorder the symbol sweep stage of section GC") and commit
81ff47b3a546 ("PR ld/20828: Fix linker script symbols wrongly forced
local with section GC"), and prevent symbols forcibly entered in the
output file with the use of the `--undefined=' or `--require-defined='
linker command line options or the EXTERN linker script command from
being swept in section garbage collection and consequently recorded in
the dynamic symbol table as local entries.  This happens in certain
circumstances, where a symbol reference also exists in one of the static
input files, however only in a section which is garbage-collected and
does not make it to the output file, and the symbol is defined in a
dynamic object present in the link.

For example with the `i386-linux' target and the `pr21233.s' and
`pr21233-l.s' sources, and the `pr21233.ld' linker script included with
this change we get:

$ as -o pr21233-l.o pr21233-l.s
$ ld -shared -T pr21233.ld -o libpr21233.so pr21233-l.o
$ as -o pr21233.o pr21233.s
$ ld --gc-sections -e foo --require-defined=bar -T pr21233.ld -o pr21233 pr21233.o libpr21233.so
$ readelf --dyn-syms pr21233

Symbol table '.dynsym' contains 2 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 OBJECT  LOCAL  DEFAULT  UND bar
$

which makes the run-time `bar' dependency of the `pr21233' executable
different from its corresponding link-time dependency, i.e. the presence
of `libpr21233.so' and its `bar' symbol is required at the link time,
however at the run time a copy of `libpr21233.so' without `bar' will do.
Similarly with `--undefined=' and EXTERN which do not actually require
the reference to the symbol requested to be satisfied with a definition
at the link time, however once the definition has been pulled at the
link time, so it should at the dynamic load time.

Additionally with the `mips-linux' target we get:

$ ld --gc-sections -e foo --require-defined=bar -T pr21233.ld -o pr21233 pr21233.o libpr21233.so
ld: BFD (GNU Binutils) 2.28.51.20170324 assertion fail .../bfd/elfxx-mips.c:3861
$

as the target is not prepared to handle such a local dynamic symbol.

With this change in effect we get:

$ readelf --dyn-syms pr21233

Symbol table '.dynsym' contains 2 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 OBJECT  GLOBAL DEFAULT  UND bar
$

instead, for both targets.

        ld/
        PR ld/21233
        * ldlang.c (insert_undefined): Set `mark' for ELF symbols.
        * testsuite/ld-elf/pr21233.sd: New test.
        * testsuite/ld-elf/pr21233-l.sd: New test.
        * testsuite/ld-elf/pr21233.ld: New test linker script.
        * testsuite/ld-elf/pr21233-e.ld: New test linker script.
        * testsuite/ld-elf/pr21233.s: New test source.
        * testsuite/ld-elf/pr21233-l.s: New test source.
        * testsuite/ld-elf/shared.exp: Run the new tests.
---
 NB I haven't checked if a dynamic executable with an undefined unused
local symbol is going to succeed or fail execution with the dynamic
loader, but in any case I see our current semantics as inconsistent.

 And as I previously noted in the course of addressing PR ld/20828 in any
case there's no point in having a local symbol (other than perhaps for
special cases some psABIs may define) in the dynamic symbol table.  This
brings us with two potential solutions: either retaining the global scope
for the symbol or removing the symbol altogether.  I think the former
approach is more consistent with the definitions of the command-line
options and the linker script command concerned, which is why I chose it
over the latter.

 There have been no regressions across my usual targets with this change
applied.  The new test cases do however fail across a number of targets.
The cause is their handling of copy relocations.  As `bar' is a dynamic
data symbol, space in the `.dynbss' section is allocated and a copy
relocation produced.  As `.dynbss' is discarded from output the relocation
is lost, and the local copy of the `bar' symbol converted to an absolute
symbol, e.g. with `m68k-linux':

$ readelf --dyn-syms pr21233

Symbol table '.dynsym' contains 2 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     1 OBJECT  GLOBAL DEFAULT  ABS bar
$

(NB this variant of `bar' is similarly entered as a local symbol if the
executable is linked without the change discussed here).  The exact list
of targets affected I have identified is as follows:

aarch64-linux
am33_2.0-linux
arc-linux
arm-eabi
arm-linuxeabi
arm-netbsdelf
crisv32-linux
m68k-elf
m68k-linux
mn10300-elf
nios2-linux
vax-linux
vax-netbsdelf

Additionally `bfin-elf' and `bfin-uclinux' fail with:

ld: the bfin target does not currently support the generation of copy relocations
ld: failed to set dynamic section sizes: File format not recognized

 I have identified the cause of this phenomenon to be the reverse order
`elf_gc_sweep' and `_bfd_elf_link_check_relocs' are called in on these
targets, causing `->non_got_ref' to be set for the symbol even though the
reference is later discarded.  The possibility to change the order has
been introduced with commit d968975277ba ("Check ELF relocs after opening
all input files"), using CHECK_RELOCS_AFTER_OPEN_INPUT, after a discussion
started at <https://www.sourceware.org/ml/binutils/2016-04/msg00295.html>
which indicates the intent to gradually swap the order for all targets.
After the initial change for x86 this has only been since done for SH (cf
PR ld/17739), i.e. I gather we are still in the middle of the move.

 Which brings me a question to our general maintainers: which of the
following 3 options shall I pick for the purpose of this test case:

1. Leave the new failures as they are and let maintainers handle them as
   they find need or time; there may be more to be done for individual
   targets beyond the lone change to CHECK_RELOCS_AFTER_OPEN_INPUT.

2. File a PR referring to commit d968975277ba and its discussion and KFAIL
   the affected test cases for the problematic targets.

3. Tweak the tests to accept absolute symbols (and exclude `bfin-*-*' from
   testing as a hopeless case).  I'd rather avoid this option as I see
   such an outcome as invalid, defeating the purpose of the command-line
   options and the linker script command concerned.

 NB I have verified that using `-z nocopyreloc' does not change anything
for most of these problematic targets, so this is not a viable workaround.

  Maciej

binutils-ld-insert-undefined-mark.diff
Index: binutils/ld/ldlang.c
===================================================================
--- binutils.orig/ld/ldlang.c 2017-03-22 15:26:43.651680031 +0000
+++ binutils/ld/ldlang.c 2017-03-22 18:44:47.798267237 +0000
@@ -3429,6 +3429,8 @@ insert_undefined (const char *name)
     {
       h->type = bfd_link_hash_undefined;
       h->u.undef.abfd = NULL;
+      if (is_elf_hash_table (link_info.hash))
+ ((struct elf_link_hash_entry *) h)->mark = 1;
       bfd_link_add_undef (link_info.hash, h);
     }
 }
Index: binutils/ld/testsuite/ld-elf/pr21233-e.ld
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ binutils/ld/testsuite/ld-elf/pr21233-e.ld 2017-03-22 18:44:54.976834149 +0000
@@ -0,0 +1,2 @@
+EXTERN (bar)
+INCLUDE pr21233.ld
Index: binutils/ld/testsuite/ld-elf/pr21233-l.s
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ binutils/ld/testsuite/ld-elf/pr21233-l.s 2017-03-22 18:44:54.994117791 +0000
@@ -0,0 +1,6 @@
+ .data
+ .globl bar
+ .type bar, %object
+bar:
+ .byte 1
+ .size bar, . - bar
Index: binutils/ld/testsuite/ld-elf/pr21233-l.sd
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ binutils/ld/testsuite/ld-elf/pr21233-l.sd 2017-03-22 19:18:22.818303465 +0000
@@ -0,0 +1,6 @@
+# Make sure global `bar' is present in the dynamic symbol table, e.g.:
+#    Num:    Value  Size Type    Bind   Vis      Ndx Name
+#      1: 00000000     1 OBJECT  GLOBAL DEFAULT    5 bar
+#...
+ *[0-9]+: +[0-9a-f]+ +1 +OBJECT +GLOBAL +DEFAULT +[0-9]+ +bar
+#pass
Index: binutils/ld/testsuite/ld-elf/pr21233.ld
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ binutils/ld/testsuite/ld-elf/pr21233.ld 2017-03-22 18:44:55.060208515 +0000
@@ -0,0 +1,17 @@
+SECTIONS
+{
+  .hash : { *(.hash) }
+  .dynsym : { *(.dynsym) }
+  .dynstr : { *(.dynstr) }
+  .rel.dyn : { *(.rel.dyn) }
+  .text : { *(.text) }
+  .dynamic : { *(.dynamic) }
+  .data : { *(.data) }
+  .symtab : { *(.symtab) }
+  .strtab : { *(.strtab) }
+  .shstrtab : { *(.shstrtab) }
+  .plt : { *(.plt) }
+  .got.plt : { *(.got.plt) }
+  .got : { *(.got) }
+  /DISCARD/ : { *(*) }
+}
Index: binutils/ld/testsuite/ld-elf/pr21233.s
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ binutils/ld/testsuite/ld-elf/pr21233.s 2017-03-22 18:44:55.080529486 +0000
@@ -0,0 +1,8 @@
+ .text
+ .globl foo
+ .type foo, %function
+foo:
+ .size foo, . - foo
+
+ .data
+ .dc.a bar
Index: binutils/ld/testsuite/ld-elf/pr21233.sd
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ binutils/ld/testsuite/ld-elf/pr21233.sd 2017-03-22 18:44:55.092782869 +0000
@@ -0,0 +1,9 @@
+# Make sure the `bar' reference is global rather than local
+# in the dynamic symbol table, e.g.:
+#    Num:    Value  Size Type    Bind   Vis      Ndx Name
+#      1: 00000000     0 OBJECT  GLOBAL DEFAULT  UND bar
+# vs:
+#      1: 00000000     0 OBJECT  LOCAL  DEFAULT  UND bar
+#...
+ *[0-9]+: +[0-9a-f]+ +0 +OBJECT +GLOBAL +DEFAULT +UND +bar
+#pass
Index: binutils/ld/testsuite/ld-elf/shared.exp
===================================================================
--- binutils.orig/ld/testsuite/ld-elf/shared.exp 2017-03-22 15:26:44.259343252 +0000
+++ binutils/ld/testsuite/ld-elf/shared.exp 2017-03-22 18:54:52.902130284 +0000
@@ -115,6 +115,45 @@ if { [check_gc_sections_available] } {
     {{objdump -p pr20828-v.od}} \
     "pr20828-v-2"]]
 }
+# PR ld/21233 check for correct dynamic symbol table entries where:
+# - a symbol has been defined in a shared library used in the link,
+# - the symbol has been referenced from a section swept in garbage collection,
+# - the symbol has also been forced to be entered in the output file as an
+#   undefined symbol, either with a command-line option or a linker script
+#   command.
+# Verify that the undefined symbol is global rather than local.
+if { [check_gc_sections_available] } {
+    run_ld_link_tests [list \
+ [list \
+    "PR ld/21233 dynamic symbols with section GC\
+     (auxiliary shared library)" \
+    "$LFLAGS -shared -T pr21233.ld" "" "$AFLAGS_PIC" \
+    {pr21233-l.s} \
+    {{readelf --dyn-syms pr21233-l.sd}} \
+    "libpr21233.so"] \
+ [list \
+    "PR ld/21233 dynamic symbols with section GC (--undefined)" \
+    "$LFLAGS --gc-sections -e foo --undefined=bar -T pr21233.ld" \
+    "tmpdir/libpr21233.so" "" \
+    {pr21233.s} \
+    {{readelf --dyn-syms pr21233.sd}} \
+    "pr21233-1"] \
+ [list \
+    "PR ld/21233 dynamic symbols with section GC (--require-defined)" \
+    "$LFLAGS --gc-sections -e foo --require-defined=bar\
+     -T pr21233.ld" \
+    "tmpdir/libpr21233.so" "" \
+    {pr21233.s} \
+    {{readelf --dyn-syms pr21233.sd}} \
+    "pr21233-2"] \
+ [list \
+    "PR ld/21233 dynamic symbols with section GC (EXTERN)" \
+    "$LFLAGS --gc-sections -e foo -T pr21233-e.ld" \
+    "tmpdir/libpr21233.so" "" \
+    {pr21233.s} \
+    {{readelf --dyn-syms pr21233.sd}} \
+    "pr21233-3"]]
+}
 
 # Check to see if the C compiler works
 if { [which $CC] == 0 } {
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] PR ld/21233: Avoid sweeping forced-undefined symbols in section GC

Alan Modra-3
On Mon, Mar 27, 2017 at 12:39:07PM +0100, Maciej W. Rozycki wrote:
> PR ld/21233
> * ldlang.c (insert_undefined): Set `mark' for ELF symbols.
> * testsuite/ld-elf/pr21233.sd: New test.
> * testsuite/ld-elf/pr21233-l.sd: New test.
> * testsuite/ld-elf/pr21233.ld: New test linker script.
> * testsuite/ld-elf/pr21233-e.ld: New test linker script.
> * testsuite/ld-elf/pr21233.s: New test source.
> * testsuite/ld-elf/pr21233-l.s: New test source.
> * testsuite/ld-elf/shared.exp: Run the new tests.

OK.

[snip]

>  I have identified the cause of this phenomenon to be the reverse order
> `elf_gc_sweep' and `_bfd_elf_link_check_relocs' are called in on these
> targets, causing `->non_got_ref' to be set for the symbol even though the
> reference is later discarded.  The possibility to change the order has
> been introduced with commit d968975277ba ("Check ELF relocs after opening
> all input files"), using CHECK_RELOCS_AFTER_OPEN_INPUT, after a discussion
> started at <https://www.sourceware.org/ml/binutils/2016-04/msg00295.html>
> which indicates the intent to gradually swap the order for all targets.
> After the initial change for x86 this has only been since done for SH (cf
> PR ld/17739), i.e. I gather we are still in the middle of the move.

Well, powerpc hasn't changed where check_relocs is called, so this
isn't the whole story.  ie. The powerpc backend code shows that it is
possible to get this case right without delaying check_relocs.

>  Which brings me a question to our general maintainers: which of the
> following 3 options shall I pick for the purpose of this test case:
>
> 1. Leave the new failures as they are and let maintainers handle them as
>    they find need or time; there may be more to be done for individual
>    targets beyond the lone change to CHECK_RELOCS_AFTER_OPEN_INPUT.

Like this, I think.  Your testcase demonstrate a bug in the affected
backends.  Best make it visible.

--
Alan Modra
Australia Development Lab, IBM
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] PR ld/21233: Avoid sweeping forced-undefined symbols in section GC

Maciej W. Rozycki-3
On Tue, 4 Apr 2017, Alan Modra wrote:

> > PR ld/21233
> > * ldlang.c (insert_undefined): Set `mark' for ELF symbols.
> > * testsuite/ld-elf/pr21233.sd: New test.
> > * testsuite/ld-elf/pr21233-l.sd: New test.
> > * testsuite/ld-elf/pr21233.ld: New test linker script.
> > * testsuite/ld-elf/pr21233-e.ld: New test linker script.
> > * testsuite/ld-elf/pr21233.s: New test source.
> > * testsuite/ld-elf/pr21233-l.s: New test source.
> > * testsuite/ld-elf/shared.exp: Run the new tests.
>
> OK.

 Committed, thanks for your review!

> >  I have identified the cause of this phenomenon to be the reverse order
> > `elf_gc_sweep' and `_bfd_elf_link_check_relocs' are called in on these
> > targets, causing `->non_got_ref' to be set for the symbol even though the
> > reference is later discarded.  The possibility to change the order has
> > been introduced with commit d968975277ba ("Check ELF relocs after opening
> > all input files"), using CHECK_RELOCS_AFTER_OPEN_INPUT, after a discussion
> > started at <https://www.sourceware.org/ml/binutils/2016-04/msg00295.html>
> > which indicates the intent to gradually swap the order for all targets.
> > After the initial change for x86 this has only been since done for SH (cf
> > PR ld/17739), i.e. I gather we are still in the middle of the move.
>
> Well, powerpc hasn't changed where check_relocs is called, so this
> isn't the whole story.  ie. The powerpc backend code shows that it is
> possible to get this case right without delaying check_relocs.

 Right, I haven't investigated PowerPC, or indeed any target that already
passes these test cases, however I suspect that just like MIPS PowerPC
does not require copy relocations in the first place for simple static
symbol references (i.e. pointers) from data, which usually just end up
being deferred to the dynamic load time in the form of dynamic relocations
(e.g. R_MIPS_REL32 for the MIPS target, whether it uses the original SVR4
psABI or the PLT/copy-reloc extension).  These do not set `->non_got_ref'
at all.

 I suspect some targets might not have a psABI as simple as that however
and might require (or just want) a copy relocation even where the only
reference is a static pointer from data, and these might indeed rely on
the relative order `elf_gc_sweep' and `_bfd_elf_link_check_relocs' are
called in.

 NB the avoidance of such odd cases is what I gather is implied by your
very observation made in the thread of discussion referred, specifically
here: <https://www.sourceware.org/ml/binutils/2016-04/msg00318.html>.

 Of course it might be a plain backend bug too.

> >  Which brings me a question to our general maintainers: which of the
> > following 3 options shall I pick for the purpose of this test case:
> >
> > 1. Leave the new failures as they are and let maintainers handle them as
> >    they find need or time; there may be more to be done for individual
> >    targets beyond the lone change to CHECK_RELOCS_AFTER_OPEN_INPUT.
>
> Like this, I think.  Your testcase demonstrate a bug in the affected
> backends.  Best make it visible.

 Committed unchanged then.  Any issue with backporting it to 2.28, as per
the situation with the problem report?

  Maciej
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] PR ld/21233: Avoid sweeping forced-undefined symbols in section GC

Hans-Peter Nilsson
In reply to this post by Maciej W. Rozycki-3
> Date: Mon, 27 Mar 2017 12:39:07 +0100
> From: "Maciej W. Rozycki" <[hidden email]>

>  Which brings me a question to our general maintainers: which of the
> following 3 options shall I pick for the purpose of this test case:

> 2. File a PR referring to commit d968975277ba and its discussion and KFAIL
>    the affected test cases for the problematic targets.

(or rather xfail; I don't know if there's a way to kfail with
run_ld_link_tests and I don't think we use that in binutils.)

I'm not a general maintainer, but FWIW my preference would have
been this, to xfail the failing parts and also add affected
maintainers on the ticket.

Thus this commit for "my" domain.  The first part passed, so I
had to split up the list.  I can then deal with the actual bug
at my discretion without being nagged by any autotesters that
make use of a regression-free status to trig on any fail.

Incidentally, arm-unknown-eabi also fails.

To those interested: the run_ld_link_tests source shows how to
add xfails for a target or use this as an example.  (Not my
preferred test-driver function, I prefer to iterate on
run_dump_test *.d files; with a driver in place sometimes I only
have to add that one file with a descriptive comment inside.)

As a sanity check, I made sure mips-linux still passed all
tests.

diff --git a/ld/ChangeLog b/ld/ChangeLog
index d8b9a22..3cf9141 100644
--- a/ld/ChangeLog
+++ b/ld/ChangeLog
@@ -1,3 +1,9 @@
+2017-04-05  Hans-Peter Nilsson  <[hidden email]>
+
+ PR ld/21233
+ * testsuite/ld-elf/shared.exp: Xfail all PR21233 tests but the
+ first test for cris*-*-*.
+
 2017-04-04  Maciej W. Rozycki  <[hidden email]>
 
  PR ld/21233
diff --git a/ld/testsuite/ld-elf/shared.exp b/ld/testsuite/ld-elf/shared.exp
index be30ec0..291f9e1 100644
--- a/ld/testsuite/ld-elf/shared.exp
+++ b/ld/testsuite/ld-elf/shared.exp
@@ -130,7 +130,8 @@ if { [check_gc_sections_available] } {
     "$LFLAGS -shared -T pr21233.ld" "" "$AFLAGS_PIC" \
     {pr21233-l.s} \
     {{readelf --dyn-syms pr21233-l.sd}} \
-    "libpr21233.so"] \
+    "libpr21233.so"]]
+    run_ld_link_tests [list \
  [list \
     "PR ld/21233 dynamic symbols with section GC (--undefined)" \
     "$LFLAGS --gc-sections -e foo --undefined=bar -T pr21233.ld" \
@@ -152,7 +153,7 @@ if { [check_gc_sections_available] } {
     "tmpdir/libpr21233.so" "" \
     {pr21233.s} \
     {{readelf --dyn-syms pr21233.sd}} \
-    "pr21233-3"]]
+     "pr21233-3"]] "cris*-*-*"
 }
 
 # Check to see if the C compiler works

brgds, H-P
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] PR ld/21233: Avoid sweeping forced-undefined symbols in section GC

Alan Modra-3
In reply to this post by Maciej W. Rozycki-3
On Tue, Apr 04, 2017 at 11:43:31PM +0100, Maciej W. Rozycki wrote:

> On Tue, 4 Apr 2017, Alan Modra wrote:
> > >  I have identified the cause of this phenomenon to be the reverse order
> > > `elf_gc_sweep' and `_bfd_elf_link_check_relocs' are called in on these
> > > targets, causing `->non_got_ref' to be set for the symbol even though the
> > > reference is later discarded.  The possibility to change the order has
> > > been introduced with commit d968975277ba ("Check ELF relocs after opening
> > > all input files"), using CHECK_RELOCS_AFTER_OPEN_INPUT, after a discussion
> > > started at <https://www.sourceware.org/ml/binutils/2016-04/msg00295.html>
> > > which indicates the intent to gradually swap the order for all targets.
> > > After the initial change for x86 this has only been since done for SH (cf
> > > PR ld/17739), i.e. I gather we are still in the middle of the move.
> >
> > Well, powerpc hasn't changed where check_relocs is called, so this
> > isn't the whole story.  ie. The powerpc backend code shows that it is
> > possible to get this case right without delaying check_relocs.
>
>  Right, I haven't investigated PowerPC, or indeed any target that already
> passes these test cases, however I suspect that just like MIPS PowerPC
> does not require copy relocations in the first place for simple static
> symbol references (i.e. pointers) from data, which usually just end up
> being deferred to the dynamic load time in the form of dynamic relocations
> (e.g. R_MIPS_REL32 for the MIPS target, whether it uses the original SVR4
> psABI or the PLT/copy-reloc extension).  These do not set `->non_got_ref'
> at all.
>
>  I suspect some targets might not have a psABI as simple as that however
> and might require (or just want) a copy relocation even where the only
> reference is a static pointer from data, and these might indeed rely on
> the relative order `elf_gc_sweep' and `_bfd_elf_link_check_relocs' are
> called in.

The test has a reference from .data to "bar" that sets non_got_ref and
pointer_equality_needed, increments plt.refcount in case "bar" turns
out to be a function, and records the need for a possible dynamic
reloc in dyn_relocs, in check_relocs on powerpc64le.  .data is then
garbage collected, which removes the dyn_reloc entry and decrements
plt.refcount too (*) in gc_sweep_hook but can't remove flags like
non_got_ref.  ppc64_elf_adjust_dynamic_symbol does clear non_got_ref
later, and doesn't make a nonsense entry in .dynbss with copy relocs.

*) plt.refcount isn't decremented in this case on ppc64le.  Bug!  So
   it was worth looking at the testcase.  :)

>  NB the avoidance of such odd cases is what I gather is implied by your
> very observation made in the thread of discussion referred, specifically
> here: <https://www.sourceware.org/ml/binutils/2016-04/msg00318.html>.

True.

>  Of course it might be a plain backend bug too.

Also likely true.

> > >  Which brings me a question to our general maintainers: which of the
> > > following 3 options shall I pick for the purpose of this test case:
> > >
> > > 1. Leave the new failures as they are and let maintainers handle them as
> > >    they find need or time; there may be more to be done for individual
> > >    targets beyond the lone change to CHECK_RELOCS_AFTER_OPEN_INPUT.
> >
> > Like this, I think.  Your testcase demonstrate a bug in the affected
> > backends.  Best make it visible.
>
>  Committed unchanged then.  Any issue with backporting it to 2.28, as per
> the situation with the problem report?

I think backporting should omit the testcase, to avoid unduly worrying
users.

--
Alan Modra
Australia Development Lab, IBM
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] PR ld/21233: Avoid sweeping forced-undefined symbols in section GC

Maciej W. Rozycki-3
In reply to this post by Hans-Peter Nilsson
On Wed, 5 Apr 2017, Hans-Peter Nilsson wrote:

> > 2. File a PR referring to commit d968975277ba and its discussion and KFAIL
> >    the affected test cases for the problematic targets.
>
> (or rather xfail; I don't know if there's a way to kfail with
> run_ld_link_tests and I don't think we use that in binutils.)

 Of course there is, see commit 3807734dbe48 for a somewhat non-trivial
example.

 AFAIK XFAIL is unsuitable as it marks a problem with the test environment
rather than a bug in the component being tested.  KFAIL OTOH forces you to
create a PR, which prompts you to have the issue recorded in the bug
tracker (also giving a chance for someone else to pick it up) rather than
papered over and then forgotten.

 KFAIL's usage is indeed much more prominent in GDB than in binutils.

> I'm not a general maintainer, but FWIW my preference would have
> been this, to xfail the failing parts and also add affected
> maintainers on the ticket.

 On second thoughts there can be multiple bugs here, quite likely one or
more per BFD backend.  So it doesn't really scale well.  And any active
maintainer will notice the regression in their routine runs; XFAIL/KFAIL
may OTOH be missed (my test scripts certainly do not pay attention to
these on the basis of them being expected/known; I'd have to go through
full logs to catch them).

> To those interested: the run_ld_link_tests source shows how to
> add xfails for a target or use this as an example.  (Not my
> preferred test-driver function, I prefer to iterate on
> run_dump_test *.d files; with a driver in place sometimes I only
> have to add that one file with a descriptive comment inside.)

 I prefer `run_dump_test' too where feasible, but it is not suitable for
some test cases, not at least without bending backwards.

> As a sanity check, I made sure mips-linux still passed all
> tests.

 Thanks.

> diff --git a/ld/testsuite/ld-elf/shared.exp b/ld/testsuite/ld-elf/shared.exp
> index be30ec0..291f9e1 100644
> --- a/ld/testsuite/ld-elf/shared.exp
> +++ b/ld/testsuite/ld-elf/shared.exp
> @@ -152,7 +153,7 @@ if { [check_gc_sections_available] } {
>      "tmpdir/libpr21233.so" "" \
>      {pr21233.s} \
>      {{readelf --dyn-syms pr21233.sd}} \
> -    "pr21233-3"]]
> +     "pr21233-3"]] "cris*-*-*"

 Indentation issue here.

  Maciej
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] PR ld/21233: Avoid sweeping forced-undefined symbols in section GC

Maciej W. Rozycki-3
In reply to this post by Alan Modra-3
On Wed, 5 Apr 2017, Alan Modra wrote:

> >  I suspect some targets might not have a psABI as simple as that however
> > and might require (or just want) a copy relocation even where the only
> > reference is a static pointer from data, and these might indeed rely on
> > the relative order `elf_gc_sweep' and `_bfd_elf_link_check_relocs' are
> > called in.
>
> The test has a reference from .data to "bar" that sets non_got_ref and
> pointer_equality_needed, increments plt.refcount in case "bar" turns
> out to be a function, and records the need for a possible dynamic
> reloc in dyn_relocs, in check_relocs on powerpc64le.  .data is then
> garbage collected, which removes the dyn_reloc entry and decrements
> plt.refcount too (*) in gc_sweep_hook but can't remove flags like
> non_got_ref.  ppc64_elf_adjust_dynamic_symbol does clear non_got_ref
> later, and doesn't make a nonsense entry in .dynbss with copy relocs.
>
> *) plt.refcount isn't decremented in this case on ppc64le.  Bug!  So
>    it was worth looking at the testcase.  :)

 Fair enough.  I'm glad you've found my test case useful. :)

> >  Committed unchanged then.  Any issue with backporting it to 2.28, as per
> > the situation with the problem report?
>
> I think backporting should omit the testcase, to avoid unduly worrying
> users.

 I thought so as well as I was asking my question, but didn't want to
suggest the answer. ;)  I've backported the fix itself only then to 2.28,
and closed the PR now.

 Thanks for your input.

  Maciej
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] PR ld/21233: Avoid sweeping forced-undefined symbols in section GC

Hans-Peter Nilsson
In reply to this post by Maciej W. Rozycki-3
> Date: Wed, 5 Apr 2017 17:03:22 +0100
> From: "Maciej W. Rozycki" <[hidden email]>

> On Wed, 5 Apr 2017, Hans-Peter Nilsson wrote:
>
> > > 2. File a PR referring to commit d968975277ba and its discussion and KFAIL
> > >    the affected test cases for the problematic targets.
> >
> > (or rather xfail; I don't know if there's a way to kfail with
> > run_ld_link_tests and I don't think we use that in binutils.)
>
>  Of course there is, see commit 3807734dbe48 for a somewhat non-trivial
> example.
>
>  AFAIK XFAIL is unsuitable as it marks a problem with the test environment
> rather than a bug in the component being tested.  KFAIL OTOH forces you to
> create a PR, which prompts you to have the issue recorded in the bug
> tracker (also giving a chance for someone else to pick it up) rather than
> papered over and then forgotten.

Sure.  Since it's your test and I felt compelled to fix my
indentation gotcha (for which I blame emacs) I changed to kfail.

JFTR, still argumenting for my preference for "2" above:

> > I'm not a general maintainer, but FWIW my preference would have
> > been this, to xfail the failing parts and also add affected
> > maintainers on the ticket.
>
>  On second thoughts there can be multiple bugs here, quite likely one or
> more per BFD backend.

But it's probably a horse rather than a zebra. :)

>  So it doesn't really scale well.  And any active
> maintainer will notice the regression in their routine runs; XFAIL/KFAIL
> may OTOH be missed (my test scripts certainly do not pay attention to
> these on the basis of them being expected/known; I'd have to go through
> full logs to catch them).

Keeping it simple, as well as the test being tied to a specific
PR, says to log it in that PR - at least until the issue
has been analyzed for that target.

> > To those interested: the run_ld_link_tests source shows how to
> > add xfails for a target or use this as an example.  (Not my
> > preferred test-driver function, I prefer to iterate on
> > run_dump_test *.d files; with a driver in place sometimes I only
> > have to add that one file with a descriptive comment inside.)
>
>  I prefer `run_dump_test' too where feasible, but it is not suitable for
> some test cases, not at least without bending backwards.

Sure; at a glance it didn't look like bending was required, but
then I haven't really looked close into these test-cases.

> > As a sanity check, I made sure mips-linux still passed all
> > tests.
>
>  Thanks.
>
> > diff --git a/ld/testsuite/ld-elf/shared.exp b/ld/testsuite/ld-elf/shared.exp
> > index be30ec0..291f9e1 100644
> > --- a/ld/testsuite/ld-elf/shared.exp
> > +++ b/ld/testsuite/ld-elf/shared.exp
> > @@ -152,7 +153,7 @@ if { [check_gc_sections_available] } {
> >      "tmpdir/libpr21233.so" "" \
> >      {pr21233.s} \
> >      {{readelf --dyn-syms pr21233.sd}} \
> > -    "pr21233-3"]]
> > +     "pr21233-3"]] "cris*-*-*"
>
>  Indentation issue here.

Committed as above.  Unfortunately the kfails required splitting
the test-list.  Sanity-checked mips-linux again.  It's not lost
on me that the invested effort, keeping this up, will soon be on
par with fixing the actual issue...

  PR ld/21233
        * testsuite/ld-elf/shared.exp: Change xfails to kfails and fix
        indentation issue introduced with last commit.

diff --git a/ld/testsuite/ld-elf/shared.exp b/ld/testsuite/ld-elf/shared.exp
index 291f9e1..4859170 100644
--- a/ld/testsuite/ld-elf/shared.exp
+++ b/ld/testsuite/ld-elf/shared.exp
@@ -131,6 +131,9 @@ if { [check_gc_sections_available] } {
     {pr21233-l.s} \
     {{readelf --dyn-syms pr21233-l.sd}} \
     "libpr21233.so"]]
+
+    setup_kfail "cris*-*-*" "ld/21233"
+
     run_ld_link_tests [list \
  [list \
     "PR ld/21233 dynamic symbols with section GC (--undefined)" \
@@ -138,7 +141,11 @@ if { [check_gc_sections_available] } {
     "tmpdir/libpr21233.so" "" \
     {pr21233.s} \
     {{readelf --dyn-syms pr21233.sd}} \
-    "pr21233-1"] \
+    "pr21233-1"]]
+
+    setup_kfail "cris*-*-*" "ld/21233"
+
+    run_ld_link_tests [list \
  [list \
     "PR ld/21233 dynamic symbols with section GC (--require-defined)" \
     "$LFLAGS --gc-sections -e foo --require-defined=bar\
@@ -146,14 +153,18 @@ if { [check_gc_sections_available] } {
     "tmpdir/libpr21233.so" "" \
     {pr21233.s} \
     {{readelf --dyn-syms pr21233.sd}} \
-    "pr21233-2"] \
+    "pr21233-2"]]
+
+    setup_kfail "cris*-*-*" "ld/21233"
+
+    run_ld_link_tests [list \
  [list \
     "PR ld/21233 dynamic symbols with section GC (EXTERN)" \
     "$LFLAGS --gc-sections -e foo -T pr21233-e.ld" \
     "tmpdir/libpr21233.so" "" \
     {pr21233.s} \
     {{readelf --dyn-syms pr21233.sd}} \
-     "pr21233-3"]] "cris*-*-*"
+    "pr21233-3"]]
 }
 
 # Check to see if the C compiler works

brgds, H-P
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] PR ld/21233: Avoid sweeping forced-undefined symbols in section GC

Christophe Lyon-2
In reply to this post by Maciej W. Rozycki-3
On 5 April 2017 at 00:43, Maciej W. Rozycki <[hidden email]> wrote:

> On Tue, 4 Apr 2017, Alan Modra wrote:
>
>> >     PR ld/21233
>> >     * ldlang.c (insert_undefined): Set `mark' for ELF symbols.
>> >     * testsuite/ld-elf/pr21233.sd: New test.
>> >     * testsuite/ld-elf/pr21233-l.sd: New test.
>> >     * testsuite/ld-elf/pr21233.ld: New test linker script.
>> >     * testsuite/ld-elf/pr21233-e.ld: New test linker script.
>> >     * testsuite/ld-elf/pr21233.s: New test source.
>> >     * testsuite/ld-elf/pr21233-l.s: New test source.
>> >     * testsuite/ld-elf/shared.exp: Run the new tests.
>>
>> OK.
>
>  Committed, thanks for your review!
>

Hi,

I've noticed that these tests fail on aarch64 and arm targets:

./ld/ld.sum:FAIL: PR ld/21233 dynamic symbols with section GC (--undefined)
./ld/ld.sum:FAIL: PR ld/21233 dynamic symbols with section GC
(--require-defined)
./ld/ld.sum:FAIL: PR ld/21233 dynamic symbols with section GC (EXTERN)

Is it a known problem?

Thanks,

Christophe


>> >  I have identified the cause of this phenomenon to be the reverse order
>> > `elf_gc_sweep' and `_bfd_elf_link_check_relocs' are called in on these
>> > targets, causing `->non_got_ref' to be set for the symbol even though the
>> > reference is later discarded.  The possibility to change the order has
>> > been introduced with commit d968975277ba ("Check ELF relocs after opening
>> > all input files"), using CHECK_RELOCS_AFTER_OPEN_INPUT, after a discussion
>> > started at <https://www.sourceware.org/ml/binutils/2016-04/msg00295.html>
>> > which indicates the intent to gradually swap the order for all targets.
>> > After the initial change for x86 this has only been since done for SH (cf
>> > PR ld/17739), i.e. I gather we are still in the middle of the move.
>>
>> Well, powerpc hasn't changed where check_relocs is called, so this
>> isn't the whole story.  ie. The powerpc backend code shows that it is
>> possible to get this case right without delaying check_relocs.
>
>  Right, I haven't investigated PowerPC, or indeed any target that already
> passes these test cases, however I suspect that just like MIPS PowerPC
> does not require copy relocations in the first place for simple static
> symbol references (i.e. pointers) from data, which usually just end up
> being deferred to the dynamic load time in the form of dynamic relocations
> (e.g. R_MIPS_REL32 for the MIPS target, whether it uses the original SVR4
> psABI or the PLT/copy-reloc extension).  These do not set `->non_got_ref'
> at all.
>
>  I suspect some targets might not have a psABI as simple as that however
> and might require (or just want) a copy relocation even where the only
> reference is a static pointer from data, and these might indeed rely on
> the relative order `elf_gc_sweep' and `_bfd_elf_link_check_relocs' are
> called in.
>
>  NB the avoidance of such odd cases is what I gather is implied by your
> very observation made in the thread of discussion referred, specifically
> here: <https://www.sourceware.org/ml/binutils/2016-04/msg00318.html>.
>
>  Of course it might be a plain backend bug too.
>
>> >  Which brings me a question to our general maintainers: which of the
>> > following 3 options shall I pick for the purpose of this test case:
>> >
>> > 1. Leave the new failures as they are and let maintainers handle them as
>> >    they find need or time; there may be more to be done for individual
>> >    targets beyond the lone change to CHECK_RELOCS_AFTER_OPEN_INPUT.
>>
>> Like this, I think.  Your testcase demonstrate a bug in the affected
>> backends.  Best make it visible.
>
>  Committed unchanged then.  Any issue with backporting it to 2.28, as per
> the situation with the problem report?
>
>   Maciej
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] PR ld/21233: Avoid sweeping forced-undefined symbols in section GC

Maciej W. Rozycki-3
On Wed, 19 Apr 2017, Christophe Lyon wrote:

> I've noticed that these tests fail on aarch64 and arm targets:
>
> ./ld/ld.sum:FAIL: PR ld/21233 dynamic symbols with section GC (--undefined)
> ./ld/ld.sum:FAIL: PR ld/21233 dynamic symbols with section GC
> (--require-defined)
> ./ld/ld.sum:FAIL: PR ld/21233 dynamic symbols with section GC (EXTERN)
>
> Is it a known problem?

 I guess it depends to whom.  It was certainly discussed and symptoms of
the problem with the respective backends (though not its cause) identified
in this thread, and pieces of that consideration were even present in the
part you quoted.

  Maciej
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] PR ld/21233: Avoid sweeping forced-undefined symbols in section GC

Christophe Lyon-2
On 19 April 2017 at 14:52, Maciej W. Rozycki <[hidden email]> wrote:

> On Wed, 19 Apr 2017, Christophe Lyon wrote:
>
>> I've noticed that these tests fail on aarch64 and arm targets:
>>
>> ./ld/ld.sum:FAIL: PR ld/21233 dynamic symbols with section GC (--undefined)
>> ./ld/ld.sum:FAIL: PR ld/21233 dynamic symbols with section GC
>> (--require-defined)
>> ./ld/ld.sum:FAIL: PR ld/21233 dynamic symbols with section GC (EXTERN)
>>
>> Is it a known problem?
>
>  I guess it depends to whom.  It was certainly discussed and symptoms of
> the problem with the respective backends (though not its cause) identified
> in this thread, and pieces of that consideration were even present in the
> part you quoted.
>

:-) Indeed, it looks like I missed a few parts.
I'm not following binutils closely, but I'm annoyed with seeing my binutils
Jenkins job red :-) (so, I do not plan to fix the problem myself).

I should turn 'make check' into 'make check+compare with previous'
to make sure to catch regressions...

Thanks,

Christophe