Update LTO plugin interface


Update LTO plugin interface

H.J. Lu-30
Hi,

Here is a proposal to update the LTO plugin interface.  Any comments?

Thanks.

--
H.J.
---
Goal:  We should preserve the same linker command line order as if
there were no IR.
Problem:
        a. LTO may generate extra symbol references which aren't in IR.
        b. This was worked around with the -pass-through hack, but that doesn't
preserve the link command line order.

Proposal:
        a. Remove -pass-through hack in GCC.
        b. Compiler plugin controls what linker uses to generate the final executable:
                i. The linker command line order should be the same, with or without LTO.
        c. Add a cmdline bit field to
        struct ld_plugin_input_file
        {
           const char *name;
           int fd;
           off_t offset;
           off_t filesize;
           void *handle;
           unsigned int cmdline : 1;
        };
        It is used by linker to tell plugin that the input file comes from
linker command line.
        d. 2 stage linker:
                i. Stage 1: Normal symbol resolution with plugin.
                ii. Stage 2:
                        1) Call the "all symbols read" handler to get the final linker inputs.
                        2) Discard all previous inputs.
                        3) Generate the final executable with inputs from plugin.
        e. Compiler plugin:
                i. For a file, which comes from the linker command line and isn't
claimed by plugin, save it in the linker pass-through list in the same
order as it comes in.
                ii. For the first file claimed by plugin,  remember the last
pass-through linker input.
                iii. The "all symbols read" handler adds input files to the linker
in the order:
                        1) All linker input files on the linker pass-through list up to the
first file claimed by plugin.
                        2) All linker input files generated by plugin.
                        3) The rest of linker input files on the linker pass-through list.
        f. Limitation:
                i. All files claimed by plugin are grouped together.  Any archives
between files claimed by plugin are placed after all linker input
files generated by plugin when passed to linker.
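
The ordering rule in steps e.i to e.iii can be sketched in C as follows. This is a minimal illustration only, not the code on the hjl/lto branch: `struct link_order`, `record_input` and `final_inputs` are hypothetical names, and the `cmdline` and `claimed` flags are passed in directly instead of coming from the real plugin callbacks.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FILES 64

/* Hypothetical plugin-side bookkeeping; not part of the real plugin API. */
struct link_order {
    const char *pass_through[MAX_FILES]; /* unclaimed command-line files, in order */
    size_t n_pass_through;
    size_t split;     /* pass-through entries recorded before the first claimed file */
    int seen_claimed; /* nonzero once the plugin has claimed any file */
};

/* Steps e.i and e.ii: record one input as the linker offers it. */
void record_input(struct link_order *lo, const char *name,
                  int cmdline, int claimed)
{
    if (claimed) {
        if (!lo->seen_claimed) {
            lo->seen_claimed = 1;
            lo->split = lo->n_pass_through; /* remember the last pass-through input */
        }
        return;
    }
    if (cmdline) { /* files that did not come from the command line are not saved */
        assert(lo->n_pass_through < MAX_FILES);
        lo->pass_through[lo->n_pass_through++] = name;
    }
}

/* Step e.iii: emit the final input order; returns the number of entries written. */
size_t final_inputs(const struct link_order *lo,
                    const char *const *generated, size_t n_generated,
                    const char **out, size_t cap)
{
    size_t split = lo->seen_claimed ? lo->split : lo->n_pass_through;
    size_t n = 0, i;

    for (i = 0; i < split && n < cap; i++)            /* 1) before first claimed file */
        out[n++] = lo->pass_through[i];
    for (i = 0; i < n_generated && n < cap; i++)      /* 2) files generated by plugin */
        out[n++] = generated[i];
    for (i = split; i < lo->n_pass_through && n < cap; i++) /* 3) the rest */
        out[n++] = lo->pass_through[i];
    return n;
}
```

Feeding it an unclaimed object, a claimed IR object, another unclaimed object and a second claimed IR object yields the first unclaimed object, the generated file, then the second unclaimed object, which is the grouping limitation described in item f.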

Re: Update LTO plugin interface

Basile Starynkevitch
On Wed, 1 Dec 2010 10:18:58 -0800
"H.J. Lu" <[hidden email]> wrote:

> Here is a proposal to update LTO plugin interface.  

How should we parse the above sentence?

Is it about an interface to plugin inside binutils to support LTO?

Is it about an interface for GCC plugins to help them be more LTO friendly?

Is it about an interface inside the GOLD linker to dlopen the LTO plugin provided with GCC sources?

Cheers.

--
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***

Re: Update LTO plugin interface

H.J. Lu-30
On Wed, Dec 1, 2010 at 10:28 AM, Basile Starynkevitch
<[hidden email]> wrote:

> On Wed, 1 Dec 2010 10:18:58 -0800
> "H.J. Lu" <[hidden email]> wrote:
>
>> Here is a proposal to update LTO plugin interface.
>
> How should we parse the above sentence?
>
> Is it about an interface to plugin inside binutils to support LTO?
>
> Is it about an interface for GCC plugins to help them be more LTO friendly?
>
> Is it about an interface inside the GOLD linker to dlopen the LTO plugin provided with GCC sources?
>

It is about external linker plugin API as specified at

http://gcc.gnu.org/wiki/whopr/driver

--
H.J.

Re: Update LTO plugin interface

Ian Lance Taylor-3
In reply to this post by H.J. Lu-30
"H.J. Lu" <[hidden email]> writes:

> b. Compiler plugin controls what linker uses to generate the final executable:
> i. The linker command line order should be the same, with or without LTO.
> c. Add a cmdline bit field to
> struct ld_plugin_input_file
> {
>   const char *name;
>   int fd;
>   off_t offset;
>   off_t filesize;
>   void *handle;
>   unsigned int cmdline : 1;
> };

Just make it an int.  But I don't see why this is needed.  The plugin
already knows the files that it passed to add_input_file and
add_input_library.  Why does it need the linker to report back where the
file came from?  Why doesn't the plugin just keep track?

Ian

Re: Update LTO plugin interface

H.J. Lu-30
On Wed, Dec 1, 2010 at 10:54 AM, Ian Lance Taylor <[hidden email]> wrote:

> "H.J. Lu" <[hidden email]> writes:
>
>>       b. Compiler plugin controls what linker uses to generate the final executable:
>>               i. The linker command line order should be the same, with or without LTO.
>>       c. Add a cmdline bit field to
>>       struct ld_plugin_input_file
>>       {
>>          const char *name;
>>          int fd;
>>          off_t offset;
>>          off_t filesize;
>>          void *handle;
>>          unsigned int cmdline : 1;
>>       };
>
> Just make it an int.  But I don't see why this is needed.  The plugin
> already knows the files that it passed to add_input_file and
> add_input_library.  Why does it need to linker to report back where the
> file came from?  Why doesn't the plugin just keep track?
>

It is used to keep the same linker command line order. With LTO,
linker should use

crtX.o *trans*.o -lbar -lgcc -lc ... crtX.o

instead of

crtX.o -lbar -lgcc -lc ... crtX.o  *trans*.o

to generate the final executable.  The two orders may generate different
executables.
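
Why the position of the generated object matters can be shown with a toy model of left-to-right archive scanning. This is a rough sketch under simplifying assumptions: symbols are bits in a mask, an archive contributes a definition only if that symbol is undefined when the archive is reached, and archive members have no references of their own. The names `struct input` and `link_once` are made up for the illustration, and `__udivdi3` merely stands in for some libcall that LTO might introduce.

```c
#include <assert.h>
#include <stddef.h>

/* Symbols modelled as bits; purely illustrative. */
enum { SYM_MAIN = 1 << 0, SYM_UDIVDI3 = 1 << 1 };

struct input {
    const char *name;
    int is_archive;      /* archives only satisfy symbols that are already undefined */
    unsigned defines;    /* symbols this input can define */
    unsigned references; /* symbols this input refers to (ignored for archives) */
};

/* One left-to-right pass; returns the mask of symbols left undefined. */
unsigned link_once(const struct input *in, size_t n)
{
    unsigned defined = 0, undefined = 0;

    assert(in != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (in[i].is_archive) {
            /* Pull only the members that resolve a currently undefined symbol. */
            unsigned pulled = in[i].defines & undefined;
            if (pulled == 0)
                continue;
            defined |= pulled;
        } else {
            defined |= in[i].defines;
            undefined |= in[i].references;
        }
        undefined &= ~defined;
    }
    return undefined;
}
```

In this model, placing the generated object before the archive leaves nothing undefined, while placing it after the archive leaves the libcall unresolved, because the archive has already been scanned by then.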

--
H.J.

Re: Update LTO plugin interface

Jan Hubicka-2
> On Wed, Dec 1, 2010 at 10:54 AM, Ian Lance Taylor <[hidden email]> wrote:
> > "H.J. Lu" <[hidden email]> writes:
> >
> >>       b. Compiler plugin controls what linker uses to generate the final executable:
> >>               i. The linker command line order should be the same, with or without LTO.
> >>       c. Add a cmdline bit field to
> >>       struct ld_plugin_input_file
> >>       {
> >>          const char *name;
> >>          int fd;
> >>          off_t offset;
> >>          off_t filesize;
> >>          void *handle;
> >>          unsigned int cmdline : 1;
> >>       };
> >
> > Just make it an int.  But I don't see why this is needed.  The plugin
> > already knows the files that it passed to add_input_file and
> > add_input_library.  Why does it need to linker to report back where the
> > file came from?  Why doesn't the plugin just keep track?
> >
>
> It is used to keep the same linker command line order. With LTO,
> linker should use
>
> crtX.o *trans*.o -lbar -lgcc -lc ... crtX.o
>
> instead of
>
> crtX.o -lbar -lgcc -lc ... crtX.o  *trans*.o
>
> to generate final executable.  2 orders may generate different
> executables.

Hmm and when I have something like

crtX.o non-lto1.o lto1.o non-lto2.o lto2.o .... crtX.o
and then the linker plugin produces ltrans0.o combining both lto1.o and lto2.o, how
will we deal with non-lto2.o?

If we get into extending linker plugin interface, it would be great if we would
do something about COMDAT.  We now have RESOLVED and RESOLVED_IRONLY, while the
problem is that all non-hidden COMDAT symbols get RESOLVED that pretty much
fixes them in the output library.

I would propose adding RESOLVED_IRDYNAMIC for cases where symbol was resolved
IRONLY except that it is externally visible to dynamic linker.  We can then allow
compiler to optimize this symbol out (same way as IRONLY) if it knows it may or
may not be exported - i.e. from COMDAT flag or via -fwhole-program.

Honza
>
> --
> H.J.

Re: Update LTO plugin interface

H.J. Lu-30
2010/12/1 Jan Hubicka <[hidden email]>:

>> On Wed, Dec 1, 2010 at 10:54 AM, Ian Lance Taylor <[hidden email]> wrote:
>> > "H.J. Lu" <[hidden email]> writes:
>> >
>> >>       b. Compiler plugin controls what linker uses to generate the final executable:
>> >>               i. The linker command line order should be the same, with or without LTO.
>> >>       c. Add a cmdline bit field to
>> >>       struct ld_plugin_input_file
>> >>       {
>> >>          const char *name;
>> >>          int fd;
>> >>          off_t offset;
>> >>          off_t filesize;
>> >>          void *handle;
>> >>          unsigned int cmdline : 1;
>> >>       };
>> >
>> > Just make it an int.  But I don't see why this is needed.  The plugin
>> > already knows the files that it passed to add_input_file and
>> > add_input_library.  Why does it need to linker to report back where the
>> > file came from?  Why doesn't the plugin just keep track?
>> >
>>
>> It is used to keep the same linker command line order. With LTO,
>> linker should use
>>
>> crtX.o *trans*.o -lbar -lgcc -lc ... crtX.o
>>
>> instead of
>>
>> crtX.o -lbar -lgcc -lc ... crtX.o  *trans*.o
>>
>> to generate final executable.  2 orders may generate different
>> executables.
>
> Hmm and when I have something like
>
> ctrX.o non-lto1.o lto1.o non-lto2.o lto2.o .... crtX.o
> and then linker plugin produce ltrans0.o combining both lto1.o and lto2.o, ho
> we will deal with non-lto2.o?
>

My current implementation groups all LTO files together and linker will see

crtX.o non-lto1.o ltrans0.o non-lto2.o .... crtX.o

--
H.J.

Re: Update LTO plugin interface

Ian Lance Taylor-3
In reply to this post by H.J. Lu-30
"H.J. Lu" <[hidden email]> writes:

> On Wed, Dec 1, 2010 at 10:54 AM, Ian Lance Taylor <[hidden email]> wrote:
>> "H.J. Lu" <[hidden email]> writes:
>>
>>>       b. Compiler plugin controls what linker uses to generate the final executable:
>>>               i. The linker command line order should be the same, with or without LTO.
>>>       c. Add a cmdline bit field to
>>>       struct ld_plugin_input_file
>>>       {
>>>          const char *name;
>>>          int fd;
>>>          off_t offset;
>>>          off_t filesize;
>>>          void *handle;
>>>          unsigned int cmdline : 1;
>>>       };
>>
>> Just make it an int.  But I don't see why this is needed.  The plugin
>> already knows the files that it passed to add_input_file and
>> add_input_library.  Why does it need to linker to report back where the
>> file came from?  Why doesn't the plugin just keep track?
>>
>
> It is used to keep the same linker command line order. With LTO,
> linker should use
>
> crtX.o *trans*.o -lbar -lgcc -lc ... crtX.o
>
> instead of
>
> crtX.o -lbar -lgcc -lc ... crtX.o  *trans*.o
>
> to generate final executable.  2 orders may generate different
> executables.

I'm sorry, I'm missing something.  What does adding that bit have to do
with keeping the same linker command line order?

Is your concern that when the plugin adds a new input file to the link,
that new input file does not cause additional objects to be pulled out
of archives later in the link?  At least in gold, what matters for that
is when the plugin calls the add_input_file or add_input_library
callback.  In gold it would be fairly difficult to have that work any
other way.

Ian

Re: Update LTO plugin interface

H.J. Lu-30
On Wed, Dec 1, 2010 at 11:12 AM, Ian Lance Taylor <[hidden email]> wrote:

> "H.J. Lu" <[hidden email]> writes:
>
>> On Wed, Dec 1, 2010 at 10:54 AM, Ian Lance Taylor <[hidden email]> wrote:
>>> "H.J. Lu" <[hidden email]> writes:
>>>
>>>>       b. Compiler plugin controls what linker uses to generate the final executable:
>>>>               i. The linker command line order should be the same, with or without LTO.
>>>>       c. Add a cmdline bit field to
>>>>       struct ld_plugin_input_file
>>>>       {
>>>>          const char *name;
>>>>          int fd;
>>>>          off_t offset;
>>>>          off_t filesize;
>>>>          void *handle;
>>>>          unsigned int cmdline : 1;
>>>>       };
>>>
>>> Just make it an int.  But I don't see why this is needed.  The plugin
>>> already knows the files that it passed to add_input_file and
>>> add_input_library.  Why does it need to linker to report back where the
>>> file came from?  Why doesn't the plugin just keep track?
>>>
>>
>> It is used to keep the same linker command line order. With LTO,
>> linker should use
>>
>> crtX.o *trans*.o -lbar -lgcc -lc ... crtX.o
>>
>> instead of
>>
>> crtX.o -lbar -lgcc -lc ... crtX.o  *trans*.o
>>
>> to generate final executable.  2 orders may generate different
>> executables.
>
> I'm sorry, I'm missing something.  What does adding that bit have to do
> with keeping the same linker command line order?

We don't want to put all unclaimed files passed to plugin back to linker.
On Linux,

[hjl@gnu-6 gcc-lto]$ cat /usr/lib/libc.so
/* GNU ld script
   Use the shared library, but some functions are only in
   the static library, so try that secondarily.  */
OUTPUT_FORMAT(elf32-i386)
GROUP ( /lib/libc.so.6 /usr/lib/libc_nonshared.a  AS_NEEDED (
/lib/ld-linux.so.2 ) )
[hjl@gnu-6 gcc-lto]$

The linker should use /usr/lib/libc.so, not /lib/libc.so.6,
/usr/lib/libc_nonshared.a, and /lib/ld-linux.so.2, for the final link.
With the new cmdline field, the plugin can pass only those unclaimed
files that come from the linker command line back to the linker for the
final link.
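
As a rough sketch of how such a flag could be used, assume the linker reports the script itself with cmdline set and every file named inside the script with cmdline clear. `struct seen_file`, `report_script` and `keep_cmdline` are hypothetical names for this illustration, and the struct is a simplified stand-in for the proposed ld_plugin_input_file.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the proposed struct; only the fields needed here. */
struct seen_file {
    const char *name;
    int cmdline; /* 1: named on the linker command line, 0: named by a script */
};

/* Linker side (assumed behaviour): report the script, then the files it names. */
size_t report_script(const char *script,
                     const char *const *members, size_t n_members,
                     struct seen_file *out, size_t cap)
{
    size_t n = 0, i;

    assert(cap >= n_members + 1);
    out[n].name = script;
    out[n].cmdline = 1;
    n++;
    for (i = 0; i < n_members; i++) {
        out[n].name = members[i];
        out[n].cmdline = 0;
        n++;
    }
    return n;
}

/* Plugin side: of the unclaimed files, keep only those from the command line. */
size_t keep_cmdline(const struct seen_file *seen, size_t n,
                    const char **kept, size_t cap)
{
    size_t k = 0, i;

    for (i = 0; i < n && k < cap; i++)
        if (seen[i].cmdline)
            kept[k++] = seen[i].name;
    return k;
}
```

With the libc.so script shown above, the plugin would see four files but hand only /usr/lib/libc.so back for the final link.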

> Is your concern that when the plugin adds a new input file to the link,
> that new input file does not cause additional objects to be pulled out
> of archives later in the link?  At least in gold, what matters for that
> is when the plugin calls the add_input_file or add_input_library
> callback.  In gold it would be fairly difficult to have that work any
> other way.
>

Please try the testcase in

http://sourceware.org/bugzilla/show_bug.cgi?id=12248#c5

with gold.

--
H.J.

Re: Update LTO plugin interface

Ian Lance Taylor-3
"H.J. Lu" <[hidden email]> writes:

> We don't want to put all unclaimed files passed to plugin back to linker.
> On Linux,
>
> [hjl@gnu-6 gcc-lto]$ cat /usr/lib/libc.so
> /* GNU ld script
>    Use the shared library, but some functions are only in
>    the static library, so try that secondarily.  */
> OUTPUT_FORMAT(elf32-i386)
> GROUP ( /lib/libc.so.6 /usr/lib/libc_nonshared.a  AS_NEEDED (
> /lib/ld-linux.so.2 ) )
> [hjl@gnu-6 gcc-lto]$
>
> Linker should use /usr/lib/libc.so, not /lib/libc.so.6,
> /usr/lib/libc_nonshared.a,
> /lib/ld-linux.so.2,  for final linker.  With the new cmdline field,
> plugin can only pass
> those unclaimed files from linker command line back to linker for the
> final link.

Thanks, at least now I understand what the new field means: it is true
for a file explicitly named on the command line, false for a file named
in a linker script.

Are you planning to have the plugin claim all files, even linker
scripts, and then pass only the command line files back to the linker?

Ian

Re: Update LTO plugin interface

H.J. Lu-30
On Wed, Dec 1, 2010 at 12:37 PM, Ian Lance Taylor <[hidden email]> wrote:

> "H.J. Lu" <[hidden email]> writes:
>
>> We don't want to put all unclaimed files passed to plugin back to linker.
>> On Linux,
>>
>> [hjl@gnu-6 gcc-lto]$ cat /usr/lib/libc.so
>> /* GNU ld script
>>    Use the shared library, but some functions are only in
>>    the static library, so try that secondarily.  */
>> OUTPUT_FORMAT(elf32-i386)
>> GROUP ( /lib/libc.so.6 /usr/lib/libc_nonshared.a  AS_NEEDED (
>> /lib/ld-linux.so.2 ) )
>> [hjl@gnu-6 gcc-lto]$
>>
>> Linker should use /usr/lib/libc.so, not /lib/libc.so.6,
>> /usr/lib/libc_nonshared.a,
>> /lib/ld-linux.so.2,  for final linker.  With the new cmdline field,
>> plugin can only pass
>> those unclaimed files from linker command line back to linker for the
>> final link.
>
> Thanks, at least now I understand what the new field means: it is true
> for a file explicitly named on the command line, false for a file named
> in a linker script.
>
> Are you planning to have the plugin claim all files, even linker
> scripts, and then pass only the command line files back to the linker?
>

Plugin will keep the same claim strategy.  For those that aren't claimed by
the plugin, the plugin will save them and pass them back to the linker only
if they are specified on the command line.


--
H.J.

Re: Update LTO plugin interface

Ian Lance Taylor-3
"H.J. Lu" <[hidden email]> writes:

> On Wed, Dec 1, 2010 at 12:37 PM, Ian Lance Taylor <[hidden email]> wrote:
>
>> Are you planning to have the plugin claim all files, even linker
>> scripts, and then pass only the command line files back to the linker?
>>
>
> Plugin will keep the same claim strategy.  For those aren't claimed by
> plugin, plugin will save and pass them back to linker only if they are
> specified at command line.

Just to be clear, that does not make sense as written.  If the plugin
does not claim a file, it should not then pass it back to the linker.

In fact, if the plugin claims all files, then as far as I can see your
new ld_plugin_input_file field is not required.  And if the plugin does
not claim all files, I don't see how this can work.

Ian

Re: Update LTO plugin interface

H.J. Lu-30
On Wed, Dec 1, 2010 at 12:55 PM, Ian Lance Taylor <[hidden email]> wrote:

> "H.J. Lu" <[hidden email]> writes:
>
>> On Wed, Dec 1, 2010 at 12:37 PM, Ian Lance Taylor <[hidden email]> wrote:
>>
>>> Are you planning to have the plugin claim all files, even linker
>>> scripts, and then pass only the command line files back to the linker?
>>>
>>
>> Plugin will keep the same claim strategy.  For those aren't claimed by
>> plugin, plugin will save and pass them back to linker only if they are
>> specified at command line.
>
> Just to be clear, that does not make sense as written.  If the plugin
> does not claim a file, it should not then pass it back to the linker.

API has

typedef
enum ld_plugin_status
(*ld_plugin_claim_file_handler) (
  const struct ld_plugin_input_file *file, int *claimed);

For a linker script, archive, DSO or object file without IR,
*claimed will be set to 0, and the plugin will save it and pass it back to
the linker later if it is specified on the command line.

> In fact, if the plugin claims all files, then as far as I can see your
> new ld_plugin_input_file field is not required.  And if the plugin does
> not claim all files, I don't see how this can work.

Stage 2 linker should:

1. Discard all previous inputs.
2. Generate the final executable with inputs from plugin, which include
linker script, archive, DSO and object file without IR specified at
command line as well as trans files from LTO.

My implementation is available on hjl/lto branch at

http://git.kernel.org/?p=devel/binutils/hjl/x86.git;a=summary
http://git.kernel.org/?p=devel/gcc/hjl/x86.git;a=summary


--
H.J.

Re: Update LTO plugin interface

Ian Lance Taylor-3
"H.J. Lu" <[hidden email]> writes:

> On Wed, Dec 1, 2010 at 12:55 PM, Ian Lance Taylor <[hidden email]> wrote:
>> "H.J. Lu" <[hidden email]> writes:
>>
>>> On Wed, Dec 1, 2010 at 12:37 PM, Ian Lance Taylor <[hidden email]> wrote:
>>>
>>>> Are you planning to have the plugin claim all files, even linker
>>>> scripts, and then pass only the command line files back to the linker?
>>>>
>>>
>>> Plugin will keep the same claim strategy.  For those aren't claimed by
>>> plugin, plugin will save and pass them back to linker only if they are
>>> specified at command line.
>>
>> Just to be clear, that does not make sense as written.  If the plugin
>> does not claim a file, it should not then pass it back to the linker.
>
> API has
>
> typedef
> enum ld_plugin_status
> (*ld_plugin_claim_file_handler) (
>   const struct ld_plugin_input_file *file, int *claimed);
>
> For linker script, archive, DSO and object file without IR,
> *claimed will return 0 and plugin will save and pass it back to
> linker later in  if it is specified at command line.

I don't understand what you wrote, so I am going to write what I think
happens.

The claim_file handler is an interface provided by the plugin itself.
The plugin will register it via LDPT_REGISTER_CLAIM_FILE_HOOK.  The
linker proper will call it for each input file.

In the case of the LTO plugin, this is the static function
claim_file_handler in lto-plugin.c.

If the plugin registers a claim_file handler, and, when the linker calls
it, it returns with *claimed == 0, then the linker will process the file
as it normally does.  Since the file will already have been processed,
it does not make sense for the plugin to then pass it back to the
linker.  The effect would be similar to listing the file twice on the
command line.
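
The claim_file flow described above can be modelled roughly as follows. The types here are local stand-ins so the sketch compiles without plugin-api.h; they are not the real declarations. The `has_ir` field is a hypothetical shortcut (a real plugin would inspect the file contents), and `offer_inputs` is an invented name for the linker's loop.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins, not the real plugin API types. */
enum status { STATUS_OK = 0 };

struct input_file {
    const char *name;
    int has_ir; /* hypothetical: a real plugin would look inside the file */
};

typedef enum status (*claim_file_handler)(const struct input_file *file,
                                          int *claimed);

/* Plugin side: claim only the files that carry IR. */
enum status claim_ir_files(const struct input_file *file, int *claimed)
{
    assert(file != NULL && claimed != NULL);
    *claimed = file->has_ir;
    return STATUS_OK;
}

/* Linker side: offer every input to the handler and process the unclaimed
   ones as usual.  Returns how many files the linker processed itself. */
size_t offer_inputs(const struct input_file *files, size_t n,
                    claim_file_handler handler,
                    const char **processed, size_t cap)
{
    size_t done = 0, i;

    for (i = 0; i < n; i++) {
        int claimed = 0;

        if (handler(&files[i], &claimed) != STATUS_OK)
            break;
        if (!claimed && done < cap)
            processed[done++] = files[i].name; /* normal processing happens here */
    }
    return done;
}
```

Because every unclaimed file already lands in `processed`, handing the same file back to the linker later would amount to processing it twice in this model.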


>> In fact, if the plugin claims all files, then as far as I can see your
>> new ld_plugin_input_file field is not required.  And if the plugin does
>> not claim all files, I don't see how this can work.
>
> Stage 2 linker should:
>
> 1. Discard all previous inputs.

How is this step done?


> My implementation is available on hjl/lto branch at

Thanks, but I don't see any changes to gold there, so I don't see what
you have done to change the plugin interface.

Ian

Re: Update LTO plugin interface

Richard Biener
On Wed, Dec 1, 2010 at 10:28 PM, Ian Lance Taylor <[hidden email]> wrote:

> "H.J. Lu" <[hidden email]> writes:
>
>> On Wed, Dec 1, 2010 at 12:55 PM, Ian Lance Taylor <[hidden email]> wrote:
>>> "H.J. Lu" <[hidden email]> writes:
>>>
>>>> On Wed, Dec 1, 2010 at 12:37 PM, Ian Lance Taylor <[hidden email]> wrote:
>>>>
>>>>> Are you planning to have the plugin claim all files, even linker
>>>>> scripts, and then pass only the command line files back to the linker?
>>>>>
>>>>
>>>> Plugin will keep the same claim strategy.  For those aren't claimed by
>>>> plugin, plugin will save and pass them back to linker only if they are
>>>> specified at command line.
>>>
>>> Just to be clear, that does not make sense as written.  If the plugin
>>> does not claim a file, it should not then pass it back to the linker.
>>
>> API has
>>
>> typedef
>> enum ld_plugin_status
>> (*ld_plugin_claim_file_handler) (
>>   const struct ld_plugin_input_file *file, int *claimed);
>>
>> For linker script, archive, DSO and object file without IR,
>> *claimed will return 0 and plugin will save and pass it back to
>> linker later in  if it is specified at command line.
>
> I don't understand what you wrote, so I am going to write what I think
> happens.
>
> The claim_file handler is an interface provided by the plugin itself.
> The plugin will register it via LDPT_REGISTER_CLAIM_FILE_HOOK.  The
> linker proper will call it for each input file.
>
> In the case of the LTO plugin, this is the static function
> claim_file_handler in lto-plugin.c.
>
> If the plugin registers a claim_file handler, and, when the linker calls
> it, it returns with *claimed == 0, then the linker will process the file
> as it normally does.  Since the file will already have been processed,
> it does not make sense for the plugin to then pass it back to the
> linker.  The effect would be similar to listing the file twice on the
> command line.

The basic problem is that if lto-plugin claims a file and provides a symtab
to the linker the link-time optimization might change that, including
adding new undefined symbols (think of libcalls).  The linker needs
to re-process even not-claimed static archives (such as libgcc) to
resolve those new undefs.  We hack around this by adding another
-lgcc at the end of the command-line, but that does change linker
resolution as the link order does matter.

Basically we need to trigger a complete re-link with the claimed
object files substituted for the link-time optimized ones.

Richard.

>
>>> In fact, if the plugin claims all files, then as far as I can see your
>>> new ld_plugin_input_file field is not required.  And if the plugin does
>>> not claim all files, I don't see how this can work.
>>
>> Stage 2 linker should:
>>
>> 1. Discard all previous inputs.
>
> How is this step done?
>
>
>> My implementation is available on hjl/lto branch at
>
> Thanks, but I don't see any changes to gold there, so I don't see what
> you have done to change the plugin interface.
>
> Ian
>

Re: Update LTO plugin interface

H.J. Lu-30
In reply to this post by Ian Lance Taylor-3
On Wed, Dec 1, 2010 at 1:28 PM, Ian Lance Taylor <[hidden email]> wrote:

> "H.J. Lu" <[hidden email]> writes:
>
>> On Wed, Dec 1, 2010 at 12:55 PM, Ian Lance Taylor <[hidden email]> wrote:
>>> "H.J. Lu" <[hidden email]> writes:
>>>
>>>> On Wed, Dec 1, 2010 at 12:37 PM, Ian Lance Taylor <[hidden email]> wrote:
>>>>
>>>>> Are you planning to have the plugin claim all files, even linker
>>>>> scripts, and then pass only the command line files back to the linker?
>>>>>
>>>>
>>>> Plugin will keep the same claim strategy.  For those aren't claimed by
>>>> plugin, plugin will save and pass them back to linker only if they are
>>>> specified at command line.
>>>
>>> Just to be clear, that does not make sense as written.  If the plugin
>>> does not claim a file, it should not then pass it back to the linker.
>>
>> API has
>>
>> typedef
>> enum ld_plugin_status
>> (*ld_plugin_claim_file_handler) (
>>   const struct ld_plugin_input_file *file, int *claimed);
>>
>> For linker script, archive, DSO and object file without IR,
>> *claimed will return 0 and plugin will save and pass it back to
>> linker later in  if it is specified at command line.
>
> I don't understand what you wrote, so I am going to write what I think
> happens.
>
> The claim_file handler is an interface provided by the plugin itself.
> The plugin will register it via LDPT_REGISTER_CLAIM_FILE_HOOK.  The
> linker proper will call it for each input file.
>
> In the case of the LTO plugin, this is the static function
> claim_file_handler in lto-plugin.c.
>
> If the plugin registers a claim_file handler, and, when the linker calls
> it, it returns with *claimed == 0, then the linker will process the file
> as it normally does.  Since the file will already have been processed,
> it does not make sense for the plugin to then pass it back to the
> linker.  The effect would be similar to listing the file twice on the
> command line.

That is what "Discard all previous inputs" in stage 2 linking is for.

>
>>> In fact, if the plugin claims all files, then as far as I can see your
>>> new ld_plugin_input_file field is not required.  And if the plugin does
>>> not claim all files, I don't see how this can work.
>>
>> Stage 2 linker should:
>>
>> 1. Discard all previous inputs.
>
> How is this step done?

For GNU linker, I mark all sections in a bfd file, which
will be sent back from plugin, with SEC_EXCLUDE. I also
free and recreate the output hash table.

>
>> My implementation is available on hjl/lto branch at
>
> Thanks, but I don't see any changes to gold there, so I don't see what
> you have done to change the plugin interface.
>

My changes should be visible now.


--
H.J.

Re: Update LTO plugin interface

H.J. Lu-30
In reply to this post by Richard Biener
On Wed, Dec 1, 2010 at 1:33 PM, Richard Guenther
<[hidden email]> wrote:

> On Wed, Dec 1, 2010 at 10:28 PM, Ian Lance Taylor <[hidden email]> wrote:
>> "H.J. Lu" <[hidden email]> writes:
>>
>>> On Wed, Dec 1, 2010 at 12:55 PM, Ian Lance Taylor <[hidden email]> wrote:
>>>> "H.J. Lu" <[hidden email]> writes:
>>>>
>>>>> On Wed, Dec 1, 2010 at 12:37 PM, Ian Lance Taylor <[hidden email]> wrote:
>>>>>
>>>>>> Are you planning to have the plugin claim all files, even linker
>>>>>> scripts, and then pass only the command line files back to the linker?
>>>>>>
>>>>>
>>>>> Plugin will keep the same claim strategy.  For those aren't claimed by
>>>>> plugin, plugin will save and pass them back to linker only if they are
>>>>> specified at command line.
>>>>
>>>> Just to be clear, that does not make sense as written.  If the plugin
>>>> does not claim a file, it should not then pass it back to the linker.
>>>
>>> API has
>>>
>>> typedef
>>> enum ld_plugin_status
>>> (*ld_plugin_claim_file_handler) (
>>>   const struct ld_plugin_input_file *file, int *claimed);
>>>
>>> For linker script, archive, DSO and object file without IR,
>>> *claimed will return 0 and plugin will save and pass it back to
>>> linker later in  if it is specified at command line.
>>
>> I don't understand what you wrote, so I am going to write what I think
>> happens.
>>
>> The claim_file handler is an interface provided by the plugin itself.
>> The plugin will register it via LDPT_REGISTER_CLAIM_FILE_HOOK.  The
>> linker proper will call it for each input file.
>>
>> In the case of the LTO plugin, this is the static function
>> claim_file_handler in lto-plugin.c.
>>
>> If the plugin registers a claim_file handler, and, when the linker calls
>> it, it returns with *claimed == 0, then the linker will process the file
>> as it normally does.  Since the file will already have been processed,
>> it does not make sense for the plugin to then pass it back to the
>> linker.  The effect would be similar to listing the file twice on the
>> command line.
>
> The basic problem is that if lto-plugin claims a file and provides a symtab
> to the linker the link-time optimization might change that, including
> adding new undefined symbols (think of libcalls).  The linker needs
> to re-process even not-claimed static archives (such as libgcc) to
> resolve those new undefs.  We hack around this by adding another
> -lgcc at the end of the command-line, but that does change linker
> resolution as the link order does matter.
>
> Basically we need to trigger a complete re-link with the claimed
> object files substituted for the link-time optimized ones.
>

That is what my implementation does.


--
H.J.

Re: Update LTO plugin interface

Ian Lance Taylor-3
In reply to this post by H.J. Lu-30
"H.J. Lu" <[hidden email]> writes:

> On Wed, Dec 1, 2010 at 1:28 PM, Ian Lance Taylor <[hidden email]> wrote:
>> "H.J. Lu" <[hidden email]> writes:
>>> For linker script, archive, DSO and object file without IR,
>>> *claimed will return 0 and plugin will save and pass it back to
>>> linker later in  if it is specified at command line.
>>
>> I don't understand what you wrote, so I am going to write what I think
>> happens.
>>
>> The claim_file handler is an interface provided by the plugin itself.
>> The plugin will register it via LDPT_REGISTER_CLAIM_FILE_HOOK.  The
>> linker proper will call it for each input file.
>>
>> In the case of the LTO plugin, this is the static function
>> claim_file_handler in lto-plugin.c.
>>
>> If the plugin registers a claim_file handler, and, when the linker calls
>> it, it returns with *claimed == 0, then the linker will process the file
>> as it normally does.  Since the file will already have been processed,
>> it does not make sense for the plugin to then pass it back to the
>> linker.  The effect would be similar to listing the file twice on the
>> command line.
>
> That is what "Discard all previous inputs" in stage 2 linking is for.

But what does that mean?  Are you saying that the linker interface to
the plugin should change to work that way?  If we do that, then we
should change other aspects of the plugin interface as well.  It could
probably become quite a bit simpler.


The only reason we would ever need to do a complete relink is if the LTO
plugin can introduce arbitrary new symbol references.  Is that ever
possible?  If it is, we need to rethink the whole approach.  If the LTO
plugin can introduce arbitrary new symbol references, that means that
LTO plugin can cause arbitrary objects to be pulled in from archives.
And that means that if we only run the plugin once, we are losing
possible optimizations, because the plugin will never see those new objects.


My suspicion is that the LTO plugin can only introduce a small bounded
set of new symbol references, namely those which we assume can be
satisfied from -lc or -lgcc.  Is that true?

Ian

Re: Update LTO plugin interface

Cary Coutant-2
In reply to this post by Jan Hubicka-2
> If we get into extending the linker plugin interface, it would be great if we
> did something about COMDAT.  We now have RESOLVED and RESOLVED_IRONLY, but the
> problem is that all non-hidden COMDAT symbols get RESOLVED, which pretty much
> fixes them in the output library.
>
> I would propose adding RESOLVED_IRDYNAMIC for cases where symbol was resolved
> IRONLY except that it is externally visible to dynamic linker.  We can then allow
> compiler to optimize this symbol out (same way as IRONLY) if it knows it may or
> may not be exported - i.e. from COMDAT flag or via -fwhole-program.

(This is off the main topic...)

Actually, we have PREVAILING_DEF and PREVAILING_DEF_IRONLY, plus
RESOLVED_IR, RESOLVED_EXEC, and RESOLVED_DYN. If the symbol was
resolved elsewhere, we don't have any way to say whether it was IRONLY
or not, and that's a problem for common symbols, because there really
is no prevailing def -- the linker just allocates the space itself.
Currently, gold picks one of the common symbols and calls it the
prevailing def, but the one it picks might not actually be the largest
one. I'd prefer to add something like COMMON and COMMON_IRONLY as
possible resolutions.
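
To make the common-symbol problem concrete, here is a minimal sketch in C. It assumes a toy model: `struct common_def`, `common_alloc_size` and `pick_first` are hypothetical names, and the "first seen" policy is only an assumption for illustration, not a description of how gold actually chooses. It shows that the space the linker allocates is the maximum over all common definitions, so whichever single definition is reported as prevailing may not match it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one common definition of a symbol. */
struct common_def {
    int input_index;  /* position of the defining file on the command line */
    size_t size;      /* size requested by that file */
};

/* The linker allocates the space itself: the largest requested size wins. */
size_t common_alloc_size(const struct common_def *defs, size_t n)
{
    size_t max = 0;
    assert(defs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (defs[i].size > max)
            max = defs[i].size;
    return max;
}

/* Assumed policy, for illustration only: report the first definition seen
   as the "prevailing def".  Its size can be smaller than the allocation. */
const struct common_def *pick_first(const struct common_def *defs, size_t n)
{
    return n > 0 ? &defs[0] : NULL;
}
```

With definitions of sizes 4, 16 and 8, the allocation is 16 bytes while the first-seen definition is only 4 bytes, which is the mismatch a separate COMMON / COMMON_IRONLY resolution would avoid.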

I'm not sure if you're talking about that, or about real COMDAT
groups. As far as gold is concerned, it picks one COMDAT group and
throws the rest of them away, but for the one it picks, you'll get
either PREVAILING_DEF or PREVAILING_DEF_IRONLY. That should tell the
compiler what it needs to know.

I'm also not sure what you mean by "resolved IRONLY except that it is
externally visible to the dynamic linker." If we're building a shared
library, and the symbol is exported, it's not going to be IRONLY, and
I don't see how it would be valid to optimize it out. If we're
building an executable with --export-dynamic, same thing.

-cary

Re: Update LTO plugin interface

Cary Coutant-2
In reply to this post by Ian Lance Taylor-3
>> That is what "Discard all previous inputs" in stage 2 linking is for.
>
> But what does that mean?  Are you saying that the linker interface to
> the plugin should change to work that way?  If we do that, then we
> should change other aspects of the plugin interface as well.  It could
> probably become quite a bit simpler.
>
> The only reason we would ever need to do a complete relink is if the LTO
> plugin can introduce arbitrary new symbol references.  Is that ever
> possible?  If it is, we need to rethink the whole approach.  If the LTO
> plugin can introduce arbitrary new symbol references, that means that
> LTO plugin can cause arbitrary objects to be pulled in from archives.
> And that means that if we only run the plugin once, we are losing
> possible optimizations, because the plugin will never see those new objects.
>
> My suspicion is that the LTO plugin can only introduce a small bounded
> set of new symbol references, namely those which we assume can be
> satisfied from -lc or -lgcc.  Is that true?

Exactly. The plugin API was designed for this model -- if you want to
start the link all over again, you may as well stick with the collect2
approach and enhance it to deal with archives of IR files.

The plugin API, as implemented in gold (not sure about gnu ld), does
maintain the original order of input files as far as symbol binding is
concerned. When IR files are claimed, the plugin provides the list of
symbols defined and referenced, and the linker builds the symbol table
as if those files were linked in at that particular spot in the
command line. When the compiler provides real definitions of those
symbols later, the real definitions simply replace the "placeholders"
that were left in the linker's symbol table. The only aspect of link
order that isn't maintained is the physical order of the sections in
memory.
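
The placeholder model described above can be sketched roughly as follows. This is a toy symbol table under stated assumptions, not gold's implementation: `struct sym`, `lookup`, `define` and `replace_placeholder` are hypothetical names, and "first definition wins" stands in for the full symbol resolution rules. The point is that a claimed IR file fixes the binding at its command-line position, and the later real definition only swaps out the placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model; not the plugin-api.h or gold types. */
enum def_kind { DEF_NONE, DEF_PLACEHOLDER, DEF_REAL };

struct sym {
    const char *name;
    enum def_kind kind;
    int input_index;  /* command-line position of the defining input */
};

#define MAX_SYMS 16
static struct sym table[MAX_SYMS];
static size_t nsyms;

/* Find a symbol by name, creating an undefined entry if it is new. */
struct sym *lookup(const char *name)
{
    for (size_t i = 0; i < nsyms; i++)
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    assert(nsyms < MAX_SYMS);
    table[nsyms] = (struct sym){ name, DEF_NONE, -1 };
    return &table[nsyms++];
}

/* First pass: inputs are visited in command-line order and the first
   definition wins.  A claimed IR file contributes a placeholder. */
void define(const char *name, int input_index, int from_ir)
{
    struct sym *s = lookup(name);
    if (s->kind == DEF_NONE) {
        s->kind = from_ir ? DEF_PLACEHOLDER : DEF_REAL;
        s->input_index = input_index;
    }
}

/* After code generation: the real definition replaces the placeholder.
   The binding (input_index) chosen in the first pass is left untouched. */
int replace_placeholder(const char *name)
{
    struct sym *s = lookup(name);
    if (s->kind != DEF_PLACEHOLDER)
        return 0;
    s->kind = DEF_REAL;
    return 1;
}
```

If an IR file at position 0 and an archive member at position 1 both define `foo`, the symbol stays bound to position 0 after the placeholder is replaced, just as it would be without LTO.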

As Ian noted, if the compiler introduces new references that weren't
there before, the new references must be from a limited set of
libcalls that the backend can introduce, and those should all be
resolved with an extra pass through -lc or -lgcc. That's not exactly
pretty, but I don't see how it destroys the notion of link order --
the only way those new symbols could have been resolved differently is
if a user library interposed definitions for the libcall, and those
certainly can't be what the compiler intended to bind to. In PR 12248,
I think it's questionable to claim that the compiler-introduced call
to __udivdi3 should not resolve to the version in libgcc. Sure, I
understand it's useful for library developers while debugging and
testing, but an ordinary user certainly can't count on his own
definition of that routine to get called -- the compiler might
generate the division inline, or call a different specialized version.
All of these routines are outside the user's namespace, and we should
be able to optimize without regard for what the user's libraries might
contain.

An improvement could be for the claim file handler to determine what
libcalls might be introduced and add them to the list of referenced
symbols so that the linker can bring in the definitions in the
original pass through the input files -- any that end up not being
referenced can be garbage collected. Alternatively, we could do a
whole-archive link of the library that contains the libcalls, again
discarding unreferenced routines via garbage collection. Neither of
these requires a change to the API.
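
A rough sketch of the first idea, under stated assumptions: the candidate list below is illustrative only (the real set depends on the target and backend), and `add_speculative_refs`, `mark_used` and `collect_garbage` are hypothetical helpers, not part of the plugin API. The claim step references every candidate libcall up front, and anything the generated code never calls is dropped afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative candidates only; the real set is target-dependent. */
static const char *const candidate_libcalls[] = {
    "__udivdi3", "__divdi3", "memcpy"
};
#define NUM_LIBCALLS (sizeof candidate_libcalls / sizeof candidate_libcalls[0])

struct pulled_def {
    const char *name;
    int referenced;  /* nonzero once generated code really calls it */
};

/* Claim time: reference every candidate so the linker can bring the
   definitions in during the original pass over the inputs. */
size_t add_speculative_refs(struct pulled_def *out, size_t cap)
{
    size_t n = 0;
    assert(out != NULL);
    for (size_t i = 0; i < NUM_LIBCALLS && n < cap; i++)
        out[n++] = (struct pulled_def){ candidate_libcalls[i], 0 };
    return n;
}

/* After code generation: record a libcall the compiler actually emitted.
   Returns 1 if the name was among the speculative references. */
int mark_used(struct pulled_def *defs, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(defs[i].name, name) == 0) {
            defs[i].referenced = 1;
            return 1;
        }
    return 0;
}

/* Garbage collection: keep only definitions that ended up referenced.
   Returns the new count. */
size_t collect_garbage(struct pulled_def *defs, size_t n)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++)
        if (defs[i].referenced)
            defs[kept++] = defs[i];
    return kept;
}
```

For example, if only `__udivdi3` turns out to be called, the other speculative definitions are discarded and the output contains just that one routine.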

-cary