Range lists, zero-length functions, linker gc

Range lists, zero-length functions, linker gc

Sourceware - gdb list mailing list
It is being discussed on llvm-dev
(https://lists.llvm.org/pipermail/llvm-dev/2020-May/141885.html https://groups.google.com/forum/#!topic/llvm-dev/i0DFx6YSqDA)
what linkers should do regarding relocations referencing dropped functions (due
to section group rules, --gc-sections, /DISCARD/, etc) in .debug_*

As an example:

   __attribute__((section(".text.x"))) void f1() { }
   __attribute__((section(".text.x"))) void f2() { }
   int main() { }

Some .debug_* sections are relocated by R_X86_64_64 referencing undefined symbols (the STT_SECTION
symbols are collected):

   0x00000043:   DW_TAG_subprogram [2]
                   ###### relocated by .text.x + 0x10
                   DW_AT_low_pc [DW_FORM_addr]     (0x0000000000000010 ".text.x")
                   DW_AT_high_pc [DW_FORM_data4]   (0x00000006)
                   DW_AT_frame_base [DW_FORM_exprloc]      (DW_OP_reg6 RBP)
                   DW_AT_linkage_name [DW_FORM_strp]       ( .debug_str[0x0000002c] = "_Z2f2v")
                   DW_AT_name [DW_FORM_strp]       ( .debug_str[0x00000033] = "f2")


With ld --gc-sections:

* DW_AT_low_pc [DW_FORM_addr] in .debug_info are resolved to 0 + addend
   This can cause overlapping address ranges with normal text sections. {{overlap}}
* [beginning address offset, ending address offset) in .debug_ranges are resolved to 1 (ignoring addend).
   See bfd/reloc.c (behavior introduced in
   https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=e4067dbb2a3368dbf908b39c5435c84d51abc9f3 )

   [0, 0) cannot be used because it terminates the list entry.
   [-1, -1) cannot be used because -1 represents a base address selection entry which will affect
     subsequent address offset pairs.
* .debug_loc address offset pairs have a similar problem to .debug_ranges
* In DWARF v5, the abnormal values can be in a separate section .debug_addr
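
To make the [0, 0) and [-1, -1) constraints concrete, here is a minimal sketch of how a consumer might walk a pre-DWARF v5 .debug_ranges list. It assumes 64-bit addresses and already-relocated (begin, end) pairs; the names `Range` and `decode_range_list` are hypothetical and not taken from any real consumer.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using Range = std::pair<std::uint64_t, std::uint64_t>;  // [begin, end)

// Decode one pre-DWARF v5 range list from relocated (begin, end) pairs.
// `cu_base` is the base address of the compile unit.
std::vector<Range> decode_range_list(const std::vector<Range>& raw,
                                     std::uint64_t cu_base) {
    const std::uint64_t kMax = UINT64_MAX;  // the "-1" of the discussion
    std::vector<Range> out;
    std::uint64_t base = cu_base;
    for (const Range& e : raw) {
        if (e.first == 0 && e.second == 0) break;  // end-of-list entry
        if (e.first == kMax) {                     // base address selection
            base = e.second;                       // affects later pairs
            continue;
        }
        if (e.first == e.second) continue;         // empty range, no effect
        out.push_back({base + e.first, base + e.second});
    }
    return out;
}
```

With this reading, a discarded function resolved to (0, 0) hides every pair after it, (-1, -1) silently rebases every pair after it, and (1, 1) is skipped as an empty range.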

---

To save your time, I have a summary of the discussions. I am eager to know what you think
of the ideas from binutils/gdb/elfutils's perspective.

* {{reserved_address}} Paul Robinson wants to propose that DWARF v6 reserves a special address.
   All (undef + addend) in .debug_* are resolved to -1.

   We have to ignore the addend. With __attribute__((section(".text.x"))),
   the address offset pair may be something like [.text.x + 16, .text.x + 24)
   I have to resolve the whole (.text.x + 16) to the special value.

   (undef + addend) in pre-DWARF v5 .debug_loc and .debug_ranges are resolved to -2
   (0 and -1 cannot be used due to the reasons above).

* Refined formula for a relocated value in a non-SHF_ALLOC section:

    if is_defined(sym)
       return addr(sym) + addend
    if relocated_section is .debug_ranges or .debug_loc
       return -2   # addend is intentionally ignored

    // Every DWARF v5 section falls here
    return -1  {{zero}}

* {{zero}} Can we resolve (undef + addend) to 0?

   https://lists.llvm.org/pipermail/llvm-dev/2020-May/141967.html

   > while it might not be an issue for ELF, DWARF would want a standard that's fairly resilient to
   > quirky/interesting use cases (admittedly - such platforms could equally want to make their
   > executable code way up in the address space near max or max - 1, etc?).

   Question: is address 0 meaningful for code in some binary formats?

* {{overlap}} The current situation (GNU ld, gold, LLD): (undef + addend) in .debug_* are resolved to addend.
   For an address offset pair like [.text + 0, .text + 0x10010), if the ending address offset is large
   enough, it may overlap with a normal text address range (for example [0x10000, *))

   This can cause problems in debuggers. How does gdb solve the problem?

* {{nonalloc}} Linkers resolve (undef + addend) in non-SHF_ALLOC sections to
   `addend`. For non-debug sections (open-ended), do we need to resolve such
   values to `base` or `base+addend` where base is customizable?
   (https://lists.llvm.org/pipermail/llvm-dev/2020-May/141956.html )
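
The refined formula above can be written out as a small C++ sketch. This is only an illustration under stated assumptions: `Symbol` and `resolve_non_alloc` are hypothetical names, addresses are 64-bit, and -1 / -2 are represented as the two largest unsigned values. It is not how any existing linker is implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical view of a relocation target: `defined` is false when the
// section that defined the symbol was discarded.
struct Symbol {
    bool defined;
    std::uint64_t addr;
};

// Value written for a relocation in a non-SHF_ALLOC section, following
// the proposal in this thread.
std::uint64_t resolve_non_alloc(const Symbol& sym, std::int64_t addend,
                                const std::string& relocated_section) {
    if (sym.defined)
        return sym.addr + static_cast<std::uint64_t>(addend);
    // Pre-DWARF v5 lists: 0 would terminate the list and -1 would select
    // a new base address, so -2 is used. The addend is ignored on purpose.
    if (relocated_section == ".debug_ranges" ||
        relocated_section == ".debug_loc")
        return static_cast<std::uint64_t>(-2);
    // Every other .debug_* section, including all DWARF v5 sections.
    return static_cast<std::uint64_t>(-1);
}
```

For the [.text.x + 16, .text.x + 24) pair mentioned earlier, both ends of a discarded function would come out as -2 in .debug_ranges, giving an empty range that is neither a terminator nor a base address selection entry.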

Re: Range lists, zero-length functions, linker gc

Sourceware - gdb list mailing list
On 2020-05-31, Fangrui Song wrote:

>It is being discussed on llvm-dev
>(https://lists.llvm.org/pipermail/llvm-dev/2020-May/141885.html https://groups.google.com/forum/#!topic/llvm-dev/i0DFx6YSqDA)
>what linkers should do regarding relocations referencing dropped functions (due
>to section group rules, --gc-sections, /DISCARD/, etc) in .debug_*
>
>As an example:
>
>  __attribute__((section(".text.x"))) void f1() { }
>  __attribute__((section(".text.x"))) void f2() { }
>  int main() { }
>
>Some .debug_* sections are relocated by R_X86_64_64 referencing undefined symbols (the STT_SECTION
>symbols are collected):
>
>  0x00000043:   DW_TAG_subprogram [2]
>                  ###### relocated by .text.x + 10
>                  DW_AT_low_pc [DW_FORM_addr]     (0x0000000000000010 ".text.x")
>                  DW_AT_high_pc [DW_FORM_data4]   (0x00000006)
>                  DW_AT_frame_base [DW_FORM_exprloc]      (DW_OP_reg6 RBP)
>                  DW_AT_linkage_name [DW_FORM_strp]       ( .debug_str[0x0000002c] = "_Z2f2v")
>                  DW_AT_name [DW_FORM_strp]       ( .debug_str[0x00000033] = "f2")
>
>
>With ld --gc-sections:
>
>* DW_AT_low_pc [DW_FORM_addr] in .debug_info are resolved to 0 + addend
>  This can cause overlapping address ranges with normal text sections. {{overlap}}
>* [beginning address offset, ending address offset) in .debug_ranges are resolved to 1 (ignoring addend).
>  See bfd/reloc.c (behavior introduced in
>  https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=e4067dbb2a3368dbf908b39c5435c84d51abc9f3 )
>
>  [0, 0) cannot be used because it terminates the list entry.
>  [-1, -1) cannot be used because -1 represents a base address selection entry which will affect
>    subsequent address offset pairs.
>* .debug_loc address offset pairs have similar problem to .debug_ranges
>* In DWARF v5, the abnormal values can be in a separate section .debug_addr
>
>---
>
>To save your time, I have a summary of the discussions. I am eager to know what you think
>of the ideas from binutils/gdb/elfutils's perspective.
>
>* {{reserved_address}} Paul Robinson wants to propose that DWARF v6 reserves a special address.
>  All (undef + addend) in .debug_* are resolved to -1.
>
>  We have to ignore the addend. With __attribute__((section(".text.x"))),
>  the address offset pair may be something like [.text.x + 16, .text.x + 24)
>  I have to resolve the whole (.text.x + 16) to the special value.
>
>  (undef + addend) in pre-DWARF v5 .debug_loc and .debug_ranges are resolved to -2
>  (0 and -1 cannot be used due to the reasons above).
>
>* Refined formula for a relocated value in a non-SHF_ALLOC section:
>
>   if is_defined(sym)
>      return addr(sym) + addend
>   if relocated_section is .debug_ranges or .debug_loc
>      return -2   # addend is intentionally ignored
>
>   // Every DWARF v5 section falls here
>   return -1  {{zero}}
>
>* {{zero}} Can we resolve (undef + addend) to 0?
>
>  https://lists.llvm.org/pipermail/llvm-dev/2020-May/141967.html
>
>  > while it might not be an issue for ELF, DWARF would want a standard that's fairly resilient to
>  > quirky/interesting use cases (admittedly - such platforms could equally want to make their
>  > executable code way up in the address space near max or max - 1, etc?).
>
>  Question: is address 0 meaningful for code in some binary formats?
>
>* {{overlap}} The current situation (GNU ld, gold, LLD): (undef + addend) in .debug_* are resolved to addend.
>  For an address offset pair like [.text + 0, .text + 0x10010), if the ending address offset is large
>  enough, it may overlap with a normal text address range (for example [0x10000, *))
>
>  This can cause problems in debuggers. How does gdb solve the problem?
>
>* {{nonalloc}} Linkers resolve (undef + addend) in non-SHF_ALLOC sections to
>  `addend`. For non-debug sections (open-ended), do we have needs resolving such
>  values to `base` or `base+addend` where base is customizable?
>  (https://lists.llvm.org/pipermail/llvm-dev/2020-May/141956.html )

Forgot to mention

* {{compatibility}} Do we need an option if we change the computed value of (undef + addend) to
   -2 (.debug_loc,.debug_ranges)/-1 (other .debug_*)
   (or 0 (other .debug_*), but it might not be nice to some binary formats {{reserved_address}})

   https://lists.llvm.org/pipermail/llvm-dev/2020-May/141958.html

   > If we end up blessing it as part of the DWARF spec, we probably
   > wouldn't want it to be user-configurable for the .debug_ sections, so
   > I'd hesitate to add that configurability to the linker lest we have to
   > revoke it to conform to DWARF (breaking flag compatibility with
   > previous versions of the linker, etc). Admittedly we'll be breaking
   > output compatibility with this change regardless, so potentially
   > having the flag as an escape hatch could be useful.

   I hope we don't need to have a linker option. But if some not-so-old
   versions of gdb / binutils programs / elfutils programs can't cope
   with  -2/-1/0 {{reserved_address}}, we may have to invent a linker option.

   I hope GNU ld, gold and LLD can have a compatible option.
   (As an LLD contributor, I'd be happy to implement the option in LLD)

Re: Range lists, zero-length functions, linker gc

Sourceware - gdb list mailing list
On 2020-05-31, Mark Wielaard wrote:

>Hi,
>
>On Sun, May 31, 2020 at 11:55:06AM -0700, Fangrui Song via Elfutils-devel wrote:
>> what linkers should do regarding relocations referencing dropped
>> functions (due to section group rules, --gc-sections, /DISCARD/,
>> etc) in .debug_*
>>
>> As an example:
>>
>>   __attribute__((section(".text.x"))) void f1() { }
>>   __attribute__((section(".text.x"))) void f2() { }
>>   int main() { }
>>
>> Some .debug_* sections are relocated by R_X86_64_64 referencing
>> undefined symbols (the STT_SECTION symbols are collected):
>>
>>   0x00000043:   DW_TAG_subprogram [2]
>>                   ###### relocated by .text.x + 10
>>                   DW_AT_low_pc [DW_FORM_addr]     (0x0000000000000010 ".text.x")
>>                   DW_AT_high_pc [DW_FORM_data4]   (0x00000006)
>>                   DW_AT_frame_base [DW_FORM_exprloc]      (DW_OP_reg6 RBP)
>>                   DW_AT_linkage_name [DW_FORM_strp]       ( .debug_str[0x0000002c] = "_Z2f2v")
>>                   DW_AT_name [DW_FORM_strp]       ( .debug_str[0x00000033] = "f2")
>>
>>
>> With ld --gc-sections:
>>
>> * DW_AT_low_pc [DW_FORM_addr] in .debug_info are resolved to 0 +
>>   addend. This can cause overlapping address ranges with normal text
>>   sections. {{overlap}}
>> * [beginning address offset, ending address offset) in .debug_ranges
>>   are resolved to 1 (ignoring addend).  See bfd/reloc.c (behavior
>>   introduced in
>>   https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=e4067dbb2a3368dbf908b39c5435c84d51abc9f3 )
>>
>>   [0, 0) cannot be used because it terminates the list entry.
>>   [-1, -1) cannot be used because -1 represents a base address
>>   selection entry which will affect subsequent address offset
>>   pairs.
>> * .debug_loc address offset pairs have similar problem to .debug_ranges
>> * In DWARF v5, the abnormal values can be in a separate section .debug_addr
>>
>> ---
>>
>> I am eager to know what you think
>> of the ideas from binutils/gdb/elfutils's perspective.
>
>I think this is a producer problem. If a (code) section can be totally
>dropped then the associated (.debug) sections should have been
>generated together with that (code) section in a COMDAT group. That
>way when the linker drops that section, all the associated sections in
>that COMDAT group will get dropped with it. If you don't do that, then
>the DWARF is malformed and there is not much a consumer can do about
>it.
>
>Said otherwise, I don't think it is correct for the linker (with
>--gc-sections) to drop any sections that have references to it
>(through relocation symbols) from other (.debug) sections.

I would love if we could solve the problem using ELF features, but
putting DW_TAG_subprogram in the same section group is not an
unqualified win
(https://lists.llvm.org/pipermail/llvm-dev/2020-May/141926.html)
(Cost: sizeof(Elf64_Shdr) = 64, Elf_Word for the entry in .group, plus
a string in .strtab unless you use the string ".debug_info"
(reusing the string requires https://sourceware.org/bugzilla/show_bug.cgi?id=25380))

According to Peter Smith in the thread
https://groups.google.com/forum/#!msg/generic-abi/A-1rbP8hFCA/EDA7Sf3KBwAJ ,
Arm Compiler 5 splits up DWARF v3 debugging information and puts these sections
into comdat groups:

"This approach did produce significantly more debug information than gcc
  did. For small microcontroller projects this wasn't a problem. For
  larger feature phone problems we had to put a lot of work into keeping
  the linker's memory usage down as many of our customers at the time were
  using 32-bit Windows machines with a default maximum virtual memory of 2Gb."

See Ben, Ali and others' comments in the thread. Fragmented .debug_* may
not be practical.

Re: Range lists, zero-length functions, linker gc

Sourceware - gdb list mailing list
On 2020-06-03, Alan Modra wrote:

>On Tue, Jun 02, 2020 at 11:06:10AM -0700, David Blaikie via Binutils wrote:
>> On Tue, Jun 2, 2020 at 9:50 AM Mark Wielaard <[hidden email]> wrote:
>> > where I
>> > would argue the compiler simply needs to make sure that if it generates
>> > code in separate sections it also should create the DWARF separate
>> > section (groups).
>>
>> I don't think that's practical - the overhead, I believe, is too high.
>> Headers for each section contribution (ELF headers but DWARF headers
>> moreso - having a separate .debug_addr, .debug_line, etc section for
>> each function would be very expensive) would make for very large
>> object files.
>
>With a little linker magic I don't see the necessity of duplicating
>the DWARF headers.  Taking .debug_line as an example, a compiler could
>emit the header, opcode, directory and file tables to a .debug_line
>section with line statements for function foo emitted to
>.debug_line.foo and for bar to .debug_line.bar, trusting that the
>linker will combine these sections in order to create an output
>.debug_line section.  If foo code is excluded then .debug_line.foo
>info will also be dropped if section groups are used.
>
>--
>Alan Modra
>Australia Development Lab, IBM

sizeof(Elf64_Shdr) = 64.

If we create a .debug_line fragment and a .debug_info fragment for a
function, we waste 128 bytes.

https://sourceware.org/pipermail/binutils/2020-May/111361.html

> .debug_line.bar

We should use the unique linkage feature https://sourceware.org/bugzilla/show_bug.cgi?id=25380
otherwise we also waste lots of bytes for the .debug_*.* section names.
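
To put numbers on the section header cost, here is a rough sketch. `Shdr64` mirrors the field layout of Elf64_Shdr so the example does not depend on a platform header; `header_overhead` is a hypothetical helper. It counts section headers only and leaves out the .group entries and section name strings mentioned earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same fields, in the same order, as Elf64_Shdr in the ELF specification.
struct Shdr64 {
    std::uint32_t sh_name;
    std::uint32_t sh_type;
    std::uint64_t sh_flags;
    std::uint64_t sh_addr;
    std::uint64_t sh_offset;
    std::uint64_t sh_size;
    std::uint32_t sh_link;
    std::uint32_t sh_info;
    std::uint64_t sh_addralign;
    std::uint64_t sh_entsize;
};
static_assert(sizeof(Shdr64) == 64, "Elf64_Shdr is 64 bytes");

// Bytes of section headers added when every function gets
// `fragments_per_function` extra debug sections of its own.
constexpr std::size_t header_overhead(std::size_t functions,
                                      std::size_t fragments_per_function) {
    return functions * fragments_per_function * sizeof(Shdr64);
}
```

One function with a .debug_line fragment and a .debug_info fragment gives `header_overhead(1, 2)`, which is the 128 bytes quoted above. An object file with 10000 such functions would carry 1280000 bytes of section headers before any DWARF content.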

Re: Range lists, zero-length functions, linker gc

Sourceware - gdb list mailing list
On Fri, Jun 19, 2020 at 5:00 AM Mark Wielaard <[hidden email]> wrote:

>
> Hi,
>
> On Tue, 2020-06-02 at 11:06 -0700, David Blaikie via Elfutils-devel wrote:
> > > I do think combining Split DWARF and LTO might not be the best
> > > solution. When doing LTO you probably want something like GCC Early
> > > Debug, which is like Split DWARF, but different, because the Early
> > > Debug simply doesn't contain any address (ranges) yet (not even through
> > > indirection like .debug_addr).
> >
> > I don't think Early Debug fits here - it seems like it was
> > specifically for DWARF that doesn't refer to any code (eg: function
> > declarations and type definitions). I don't see how it could be used
> > for the actual address-referencing DWARF needed to describe function
> > definitions.
>
> I think that is kind of the point of Early Debug. Only use DWARF (at
> first) for address/range-less data like types and program scope
> entries, but don't emit anything (in DWARF format) for things that
> might need adjustments during link/LTO phase. The problem with using
> DWARF with address (ranges) during early object creation is that the
> linker isn't capable to rewrite the DWARF. You'll need a linker plugin
> that calls back into the compiler to do the actual LTO and emit the
> actual DWARF containing address/ranges (which can then link back to the
> already emitted DWARF types/program scope/etc during the Early Debug
> phase). I think the issue you are describing is actually that you do
> use DWARF to describe function definitions (not just the declarations)
> too early. If you aren't sure yet which addresses will be used DWARF
> isn't really the appropriate (temporary) debug format.

Sorry, I think we keep talking around each other. Not sure if we can
reach a good consensus or shared understanding on this topic.

DWARF in unlinked object files has been a fairly well used temporary
debug format for a long time - and the DWARF spec has done a lot to
ensure it is compatible with ELF in both object files and linkers
forever, basically? So I don't think it'd be suitable to say "DWARF
isn't an appropriate intermediate debug format to use between
compilers and linkers". In the sense that I don't think either the
DWARF committee members, producers, or consumers would agree with this
sentiment.

> > > > > > & again the overhead of all those separate contributions, headers,
> > > > > > etc, turns out to be not very desirable in any case.
> > > > >
> > > > > Yes, I agree with that. But as said earlier, maybe the compiler
> > > > > shouldn't have generated to code/data in the first place?
> > > >
> > > > In the (especially) C++ compilation model, I don't believe that's
> > > > possible - inline functions, templates, etc, require duplication -
> > > > unless you have a more complicated build process that can gather the
> > > > potential duplication, then fan back out again to compile, etc.
> > > > ThinLTO does some of this - at a cost of a more complicated build
> > > > system, etc.
> > >
> > > It might be useful for the original discussion to have a few more
> > > concrete examples to show when you might have unused code that the
> > > linker might want to discard, but where the compiler could only produce
> > > DWARF in one big blob. Apart of the -ffunction-sections case,
> >
> > Function sections, inline functions, function templates are core examples.
>
> I understand the function sections case, but can you give actual
> examples of an inline function or function template source code and how
> a DWARF producer generates DWARF for that? Maybe some simple source
> code we can put through gcc or clang to see how they (mis)handle it.
> Not being a compiler architect I am not sure I understand why those
> cannot be expressed correctly.

oh, sure! sorry.

a simple case of inline functions being deduplicated looks like this:

a.cpp:
inline void f1() { }
void f2() {
  f1();
}

b.cpp:
inline void f1() { }
void f2();
int main() {
  f1();
  f2();
}

This actually demonstrates a slightly different behavior of bfd and
gold: When the comdats are the same size (I'm told that's the
heuristic) and the local symbol names the DWARF uses to refer to the
functions (f1 in this case) match, both DWARF descriptions are
resolved to point to the same deduplicated copy of 'f1', eg:

BFD and Gold both produce this DWARF (uninteresting attributes have
been omitted):

DW_TAG_compile_unit [1] *
  DW_AT_name [DW_FORM_strp]     ( .debug_str[0x00000065] = "a.cpp")
  DW_AT_low_pc [DW_FORM_addr]   (0x0000000000000000)
  DW_AT_ranges [DW_FORM_sec_offset]     (0x00000000
     [0x0000000000401110, 0x000000000040111b)
     [0x0000000000401120, 0x0000000000401126))
  DW_TAG_subprogram [2]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000401110)
    DW_AT_high_pc [DW_FORM_data4]       (0x0000000b)
    DW_AT_name [DW_FORM_strp]   ( .debug_str[0x0000009d] = "f2")
  DW_TAG_subprogram [2]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000401120)
    DW_AT_high_pc [DW_FORM_data4]       (0x00000006)
    DW_AT_name [DW_FORM_strp]   ( .debug_str[0x000000a7] = "f1")
DW_TAG_compile_unit [1] *
  DW_AT_name [DW_FORM_strp]     ( .debug_str[0x000000aa] = "b.cpp")
  DW_AT_low_pc [DW_FORM_addr]   (0x0000000000000000)
  DW_AT_ranges [DW_FORM_sec_offset]     (0x00000030
     [0x0000000000401130, 0x0000000000401142)
     [0x0000000000401120, 0x0000000000401126))
  DW_TAG_subprogram [2]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000401130)
    DW_AT_high_pc [DW_FORM_data4]       (0x00000012)
    DW_AT_name [DW_FORM_strp]   ( .debug_str[0x000000b0] = "main")
  DW_TAG_subprogram [3]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000401120)
    DW_AT_high_pc [DW_FORM_data4]       (0x00000006)
    DW_AT_name [DW_FORM_strp]   ( .debug_str[0x000000a7] = "f1")

Now you have two CUs that have overlapping ranges, which is
interesting - if not strictly invalid (DWARF being permissive and
all). Though I think the size heuristic is risky - it's possible that
'f1' was optimized differently in the two compilations and just
happened to end up with the same size - but the DWARF descriptions may
be incorrect for the other version of the function (eg: one compiler
chose to put a constant in one register, the other compiler used
another register - same instruction sequence length, but the DWARF
would be different and incorrect to mismatch like that)

If you end up with different function lengths (which is common enough
in larger programs - different other definitions may be available,
different inlining heuristics about overall object size, etc, may kick
in) then you get BFD and Gold's current tombstoning behavior:

DW_TAG_compile_unit [1] *
  DW_AT_name [DW_FORM_strp]     ( .debug_str[0x00000065] = "a.cpp")
  DW_AT_low_pc [DW_FORM_addr]   (0x0000000000000000)
  DW_AT_ranges [DW_FORM_sec_offset]     (0x00000000
     [0x0000000000401110, 0x000000000040111b)
     [0x0000000000401120, 0x000000000040112b))
  DW_TAG_subprogram [2]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000401110)
    DW_AT_high_pc [DW_FORM_data4]       (0x0000000b)
    DW_AT_name [DW_FORM_strp]   ( .debug_str[0x0000009d] = "f2")
  DW_TAG_subprogram [2]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000401120)
    DW_AT_high_pc [DW_FORM_data4]       (0x0000000b)
    DW_AT_name [DW_FORM_strp]   ( .debug_str[0x000000a7] = "f1")
DW_TAG_compile_unit [1] *
  DW_AT_name [DW_FORM_strp]     ( .debug_str[0x000000aa] = "b.cpp")
  DW_AT_low_pc [DW_FORM_addr]   (0x0000000000000000)
  DW_AT_ranges [DW_FORM_sec_offset]     (0x00000030
     [0x0000000000401130, 0x0000000000401142)
     [0x0000000000000001, 0x0000000000000001))
  DW_TAG_subprogram [2]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000401130)
    DW_AT_high_pc [DW_FORM_data4]       (0x00000012)
    DW_AT_name [DW_FORM_strp]   ( .debug_str[0x000000b0] = "main")
  DW_TAG_subprogram [3]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
    DW_AT_high_pc [DW_FORM_data4]       (0x00000006)
    DW_AT_name [DW_FORM_strp]   ( .debug_str[0x000000a7] = "f1")

In this case BFD uses the tombstone value 0 in most sections, but uses
1 in debug_ranges to ensure it doesn't produce the 0,0 that would end
the range list early (this workaround is incomplete and should also be
applied to debug_loc which is terminated by 0,0 too - but GCC (and
Clang) doesn't produce any inter-function location lists, so this
doesn't present a problem in practice/for now, except for dumping
tools which end up seeing "holes" in debug_loc that would otherwise be
dumpable)

Gold's behavior in this case is a little different, using the 0+addend approach:

DW_TAG_compile_unit [1] *
  DW_AT_name [DW_FORM_strp]     ( .debug_str[0x00000065] = "a.cpp")
  DW_AT_low_pc [DW_FORM_addr]   (0x0000000000000000)
  DW_AT_ranges [DW_FORM_sec_offset]     (0x00000000
     [0x0000000000400540, 0x000000000040054b)
     [0x0000000000400550, 0x000000000040055b))
  DW_TAG_subprogram [2]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000400540)
    DW_AT_high_pc [DW_FORM_data4]       (0x0000000b)
    DW_AT_name [DW_FORM_strp]   ( .debug_str[0x0000009d] = "f2")
  DW_TAG_subprogram [2]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000400550)
    DW_AT_high_pc [DW_FORM_data4]       (0x0000000b)
    DW_AT_name [DW_FORM_strp]   ( .debug_str[0x000000a7] = "f1")
DW_TAG_compile_unit [1] *
  DW_AT_name [DW_FORM_strp]     ( .debug_str[0x000000aa] = "b.cpp")
  DW_AT_low_pc [DW_FORM_addr]   (0x0000000000000000)
  DW_AT_ranges [DW_FORM_sec_offset]     (0x00000030
     [0x0000000000400560, 0x0000000000400572)
     [0x0000000000000000, 0x0000000000000006))
  DW_TAG_subprogram [2]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000400560)
    DW_AT_high_pc [DW_FORM_data4]       (0x00000012)
    DW_AT_name [DW_FORM_strp]   ( .debug_str[0x000000b0] = "main")
  DW_TAG_subprogram [3]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
    DW_AT_high_pc [DW_FORM_data4]       (0x00000006)
    DW_AT_name [DW_FORM_strp]   ( .debug_str[0x000000a7] = "f1")

I introduced an ODR violation here (by modifying a.cpp's f1 to call f2
- thus making a.cpp's f1 a different length from b.cpp's f1) just as
an easy way to demonstrate the "different lengths" issue - but this
could arise from valid code that was differently optimized in the two
translation units.
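
To illustrate why the 0+addend result can confuse a consumer, here is a small sketch of the overlap check a debugger-like tool might perform. `AddrRange` and `overlaps` are hypothetical names, and ranges are half-open [begin, end) as in the dumps above.

```cpp
#include <cassert>
#include <cstdint>

struct AddrRange {
    std::uint64_t begin;
    std::uint64_t end;  // exclusive
};

// True when two half-open ranges share at least one address.
// Empty ranges, such as BFD's [1, 1) tombstone, never overlap anything.
bool overlaps(const AddrRange& a, const AddrRange& b) {
    if (a.begin >= a.end || b.begin >= b.end) return false;
    return a.begin < b.end && b.begin < a.end;
}
```

In the gold output above, the dead f1 range [0x0, 0x6) does not collide with live code at 0x400560, so nothing goes wrong here. If live text started near address 0, or the addend were large, as in the earlier [.text + 0, .text + 0x10010) versus [0x10000, *) example, the same check would report an overlap and a lookup by address could pick the discarded function.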

& yeah - on an LLVM thread we did dabble with what it'd look like to
use comdats without whole separate units to put these together - and
it's possible, though that doesn't apply to Split DWARF (can't piece
together the debug_addr section either - since it'd throw off the
indexes used from the Split DWARF file) - and still adds extra section
overhead. Did prototype debug_ranges/debug_rnglist comdat assembling
(so the CU's range list wouldn't have entries for the
deduplicated/gc'd functions) (but again, more ELF sections - for
little gain in linked debug info size for the cost in intermediate
object size)


> > > where I
> > > would argue the compiler simply needs to make sure that if it generates
> > > code in separate sections it also should create the DWARF separate
> > > section (groups).
> >
> > I don't think that's practical - the overhead, I believe, is too high.
> > Headers for each section contribution (ELF headers but DWARF headers
> > moreso - having a separate .debug_addr, .debug_line, etc section for
> > each function would be very expensive) would make for very large
> > object files.
>
> I see your point, but maybe this shouldn't be handled by the linker
> then, but maybe have a linker plugin so the compiler can fixup the
> DWARF (or generate it later).

This sounds like it'd still be fairly intrusive (architecturally) and
expensive (both from a software complexity and linking time/memory
usage/etc). I'm not ruling it out as a possibility - and I'm
interested in dabbling with this kind of deduplication purely
academically (my users use Split DWARF, so there's no opportunity
there to fix this - so my interest in in-.o/linked executable DWARF is
limited to personal interest). I'm curious about just how expensive
the ELF sections would be, what sort of custom scheme might be used
instead (I could imagine a content-aware feature that might be more
terse than generic ELF sections, but not especially invasive (wouldn't
require parsing or rewriting DWARF DIEs, etc)). That's being discussed
in the LLVM community - but I don't expect it'll be soon, nor
pervasively used even if it is built.

So I come back to Split DWARF making this fairly well impossible to
implement without a tombstone value, so far as I can imagine/think of.
And function sections at least making it very expensive to implement
(either in terms of object size and/or significant changes to the
nature of linking DWARF). And this being a pretty well established use
case/feature for decades now - that has some relatively small
drawbacks in certain narrow cases (zero length functions, zero or low
address values that are valid in some use cases), I think adding an
explicit tombstone is necessary in some cases and beneficial if not
strictly necessary in others.

- Dave

>
> Cheers,
>
> Mark