2.25 freeze status

Siddhesh Poyarekar-8
Hi,

The release date of 1 Feb is upon us and there are 3 release blockers
that haven't been resolved yet:

- global-dynamic TLS broken on aarch64 and others
- Fix for bug 20019 causes linker errors for shared libraries using
  longjmp
- tunables: insecure environment variables passed to subprocesses with
  AT_SECURE

I've got the tunables one and I hope to post a patch by tonight, or by
tomorrow morning at the latest.  What is the status on the other two?
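
For context on the third blocker, here is a minimal sketch of the general idea of not handing tunables to a child of a secure-mode process. It is not the patch referred to above. The helper names `scrub_env_var` and `prepare_child_env` are hypothetical, and the `secure` flag is passed in by the caller so that the sketch stays portable (on Linux it could be derived from `getauxval (AT_SECURE)`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: drop every "NAME=value" entry for NAME from a
   NULL-terminated environment array, compacting it in place.
   Returns the number of entries removed.  */
size_t
scrub_env_var (char **envp, const char *name)
{
  assert (envp != NULL && name != NULL);
  size_t len = strlen (name);
  size_t removed = 0;
  char **out = envp;

  for (char **in = envp; *in != NULL; ++in)
    {
      if (strncmp (*in, name, len) == 0 && (*in)[len] == '=')
        ++removed;              /* Skip the matching entry.  */
      else
        *out++ = *in;           /* Keep everything else, in order.  */
    }
  *out = NULL;
  return removed;
}

/* Hypothetical policy: scrub GLIBC_TUNABLES only when the process runs
   in secure mode (the condition the kernel reports via AT_SECURE).  */
size_t
prepare_child_env (char **envp, int secure)
{
  return secure ? scrub_env_var (envp, "GLIBC_TUNABLES") : 0;
}
```

Calling `prepare_child_env (envp, 1)` on an array that holds a `GLIBC_TUNABLES=...` entry removes that entry and leaves the others in their original order; with `secure` set to 0 the array is left untouched.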

Also, assuming that these fixes go in next week, would arch maintainers
like another week to run a final set of tests before I cut the release
branch?  If not then I'll simply cut the release 2 days after the
release blockers are resolved.  If yes then I'll cut the release 7 days
after the current release blockers are resolved.

Does that sound good?

Siddhesh

Re: 2.25 freeze status

Florian Weimer-5
On 01/27/2017 05:06 AM, Siddhesh Poyarekar wrote:
> Hi,
>
> The release date of 1 Feb is upon us and there are 3 release blockers
> that haven't been resolved yet:
>
> - global-dynamic TLS broken on aarch64 and others

Fix is known (revert part of a faulty commit), it just needs review.

> - Fix for bug 20019 causes linker errors for shared libraries using
>   longjmp

This is bug 21041 and the situation doesn't look so good.

The fix for bug 20019 goes against the published guidelines for IFUNC
resolvers: If IFUNC resolvers must not depend on relocations, then the
fix breaks valid usage scenarios (if there are no relocations involved,
it does not matter if the IFUNC resolver object has been relocated or not).

On the other hand, most IFUNC resolvers in glibc need relocations, so
the guidelines are wrong as far as current glibc usage goes.

We don't have consensus yet as a project on which way to go: Fix the glibc
IFUNC resolvers (perhaps by changing the IFUNC resolver signature), or
relax the requirement on IFUNC resolvers so that the glibc usage becomes
valid.  There aren't patches yet which could be committed, either.
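
To make the guideline concrete, here is a minimal sketch of an IFUNC resolver that stays within it: the resolver reads no global data and calls no external function, so on typical PIC targets it should not need any relocation to have been applied before it runs. All names (`add_generic`, `resolve_add`, `add_numbers`, `check_add_numbers`) are hypothetical, and the preprocessor guard with its plain-function fallback is an assumption added for toolchains without IFUNC support.

```c
#include <assert.h>  /* Also makes __GLIBC__ visible when building on glibc.  */

typedef int add_fn (int, int);

/* The single candidate implementation in this sketch.  */
static int
add_generic (int a, int b)
{
  return a + b;
}

#if defined (__GNUC__) && defined (__ELF__) && defined (__GLIBC__)
/* Resolver: runs at relocation time, before main.  It only returns the
   address of a local function and touches nothing else.  */
static add_fn *
resolve_add (void)
{
  return add_generic;
}

/* add_numbers is bound to whatever resolve_add returns.  */
int add_numbers (int, int) __attribute__ ((ifunc ("resolve_add")));
#else
/* Fallback without IFUNC: an ordinary direct call.  */
int
add_numbers (int a, int b)
{
  return add_generic (a, b);
}
#endif

/* Small self-check that can be called from main.  */
void
check_add_numbers (void)
{
  assert (add_numbers (2, 3) == 5);
}
```

A resolver of the kind most glibc IFUNCs use would instead read CPU feature data from a global object, which is exactly the dependency on relocations that the guidelines rule out.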

We have not received reports from our testing in Fedora rawhide that bug
21041 is a regular occurrence in practice.  So perhaps we should not
treat this as a blocker.

Thanks,
Florian

Re: 2.25 freeze status

Szabolcs Nagy-2
On 27/01/17 07:47, Florian Weimer wrote:
> On 01/27/2017 05:06 AM, Siddhesh Poyarekar wrote:
>> Hi,
>>
>> The release date of 1 Feb is upon us and there are 3 release blockers
>> that haven't been resolved yet:
>>
>> - global-dynamic TLS broken on aarch64 and others
>
> Fix is known (revert part of a faulty commit), it just needs review.

i think the hunk mentioned in
https://sourceware.org/bugzilla/show_bug.cgi?id=20915
should be just reverted without further review.

writing to the dtv of other threads is neither
necessary nor correct.

concurrency notes and comments can be discussed later
(without blocking the release), it's unreasonable to
leave such a known issue in the code.
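
The point about ownership can be sketched with a toy model in which each thread brings its own DTV up to date lazily, on its next TLS access, and loading a module never writes into another thread's vector. This is a simplified illustration, not glibc's data structures: `thread_model`, `load_module_model` and `tls_get_addr_model` are hypothetical names, and the fixed-size storage is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MODULES 8

/* Toy per-thread state: a DTV-like vector plus backing storage.  */
struct thread_model
{
  size_t dtv_generation;            /* Generation this thread last saw.  */
  void *dtv_slot[MAX_MODULES];      /* NULL until the owner first uses it.  */
  char storage[MAX_MODULES][16];    /* Stand-in for the TLS blocks.  */
};

/* Toy loader state, shared by all threads.  */
static size_t global_generation;
static size_t module_count;

/* Loading a module only bumps the global generation.  It does not
   touch the DTV of any thread.  */
size_t
load_module_model (void)
{
  assert (module_count < MAX_MODULES);
  ++global_generation;
  return module_count++;
}

/* Called by the owning thread only: catch up with the loader and set
   up the slot on first use.  */
void *
tls_get_addr_model (struct thread_model *self, size_t module)
{
  assert (self != NULL && module < module_count);
  if (self->dtv_generation != global_generation)
    self->dtv_generation = global_generation;
  if (self->dtv_slot[module] == NULL)
    self->dtv_slot[module] = self->storage[module];
  return self->dtv_slot[module];
}
```

In this model a thread that never accesses the new module's TLS keeps a NULL slot and a stale generation, and nothing is lost by that, because the slot is filled in by its owner the first time it is needed.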


Re: 2.25 freeze status

Florian Weimer-5
On 01/27/2017 05:27 PM, Szabolcs Nagy wrote:

> On 27/01/17 07:47, Florian Weimer wrote:
>> On 01/27/2017 05:06 AM, Siddhesh Poyarekar wrote:
>>> Hi,
>>>
>>> The release date of 1 Feb is upon us and there are 3 release blockers
>>> that haven't been resolved yet:
>>>
>>> - global-dynamic TLS broken on aarch64 and others
>>
>> Fix is known (revert part of a faulty commit), it just needs review.
>
> i think the hunk mentioned in
> https://sourceware.org/bugzilla/show_bug.cgi?id=20915
> should be just reverted without further review.
>
> writing to the dtv of other threads is neither
> necessary nor correct.
Let's do it then.  Is this patch okay?

Thanks,
Florian


Attachment: bug20915.patch (1K)

Re: 2.25 freeze status

Szabolcs Nagy-2
On 27/01/17 16:46, Florian Weimer wrote:
> On 01/27/2017 05:27 PM, Szabolcs Nagy wrote:
>> i think the hunk mentioned in
>> https://sourceware.org/bugzilla/show_bug.cgi?id=20915
>> should be just reverted without further review.
>
> Let's do it then.  Is this patch okay?
...
> nptl: Do not overwrite the DTV of other threads [BZ #20915]
>
> This reverts part of commit 17af5da98cd2c9ec958421ae2108f877e0945451.
>
> 2017-01-27  Florian Weimer  <[hidden email]>
>
> [BZ #20915]
> * nptl/allocatestack.c (init_one_static_tls): Do not write to the
> DTV of other threads.

looks ok to me.


Re: 2.25 freeze status

H.J. Lu-30
In reply to this post by Siddhesh Poyarekar-8
On Thu, Jan 26, 2017 at 8:06 PM, Siddhesh Poyarekar <[hidden email]> wrote:

> Hi,
>
> The release date of 1 Feb is upon us and there are 3 release blockers
> that haven't been resolved yet:
>
> - global-dynamic TLS broken on aarch64 and others
> - Fix for bug 20019 causes linker errors for shared libraries using
>   longjmp
> - tunables: insecure environment variables passed to subprocesses with
>   AT_SECURE
>
> I've got the tunables one and I hope to post a patch by tonight, latest
> by tomorrow morning.  What is the status on the other two?
>
> Also, assuming that these fixes go in next week, would arch maintainers
> like another week to run a final set of tests before I cut the release
> branch?  If not then I'll simply cut the release 2 days after the
> release blockers are resolved.  If yes then I'll cut the release 7 days
> after the current release blockers are resolved.
>
> Does that sound good?
>
> Siddhesh
I am testing this patch for

https://sourceware.org/bugzilla/show_bug.cgi?id=21081

I'd like to check it in before code freeze.

--
H.J.

Attachment: 0001-Add-VZEROUPPER-to-memset-vec-unaligned-erms.S-BZ-210.patch (1K)

Re: 2.25 freeze status

Carlos O'Donell-6
In reply to this post by Siddhesh Poyarekar-8
On 01/26/2017 11:06 PM, Siddhesh Poyarekar wrote:
> Hi,
>
> The release date of 1 Feb is upon us and there are 3 release blockers
> that haven't been resolved yet:
>
> - global-dynamic TLS broken on aarch64 and others

As Florian and Szabolcs note there is a relatively straightforward fix
for this, and we should just commit it while I continue review.

> - Fix for bug 20019 causes linker errors for shared libraries using
>   longjmp

I think we should remove bug 21041 from the blocker list.

Whatever we might think about the seriousness of the issue, it has existed
for 4 or 5 releases and few people have run into it, which means we have
time to fix it, i.e. it is not a blocker.

> - tunables: insecure environment variables passed to subprocesses with
>   AT_SECURE
>
> I've got the tunables one and I hope to post a patch by tonight, latest
> by tomorrow morning.  What is the status on the other two?

OK.

So I think we're almost done on the blocker front.

> Also, assuming that these fixes go in next week, would arch maintainers
> like another week to run a final set of tests before I cut the release
> branch?  If not then I'll simply cut the release 2 days after the
> release blockers are resolved.  If yes then I'll cut the release 7 days
> after the current release blockers are resolved.

No, I don't think we need the extra time. We should just stick to our
schedule. I'll run multiple arch testing on Monday.

--
Cheers,
Carlos.

Re: 2.25 freeze status

Siddhesh Poyarekar-8
In reply to this post by H.J. Lu-30
On Friday 27 January 2017 11:27 PM, H.J. Lu wrote:

> I am testing this patch for
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=21081
>
> I'd like to check it in before code freeze.
>
>
> 0001-Add-VZEROUPPER-to-memset-vec-unaligned-erms.S-BZ-210.patch
>
>
> From 9097edb85e04c137f226f3d371afff34a4ab17b7 Mon Sep 17 00:00:00 2001
> From: "H.J. Lu" <[hidden email]>
> Date: Tue, 24 Jan 2017 15:58:49 -0800
> Subject: [PATCH] Add VZEROUPPER to memset-vec-unaligned-erms.S [BZ #21081]
>
> Since memset-vec-unaligned-erms.S has VDUP_TO_VEC0_AND_SET_RETURN at
> function entry, memset optimized for AVX2 and AVX512 will always use
> ymm/zmm register. VZEROUPPER should be placed before ret in
>
> L(stosb):
>         movq    %rdx, %rcx
>         movzbl  %sil, %eax
>         movq    %rdi, %rdx
>         rep stosb
>         movq    %rdx, %rax
>         ret
>
> since it can be reached from
>
> L(stosb_more_2x_vec):
>         cmpq    $REP_STOSB_THRESHOLD, %rdx
>         ja      L(stosb)
>
> [BZ #21081]
> * sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
> (L(stosb)): Add VZEROUPPER before ret.
> ---
>  sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
> index ff214f0..704eed9 100644
> --- a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
> +++ b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
> @@ -110,6 +110,8 @@ ENTRY (__memset_erms)
>  ENTRY (MEMSET_SYMBOL (__memset, erms))
>  # endif
>  L(stosb):
> + /* Issue vzeroupper before rep stosb.  */
> + VZEROUPPER
>   movq %rdx, %rcx
>   movzbl %sil, %eax
>   movq %rdi, %rdx
>

Looks good to me.
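
The control-flow argument in the patch description can be sketched in C as a toy model: the entry code dirties the upper vector state, and every path that returns has to clear it. This models the reasoning only, not the assembly. `cpu_model`, `memset_model`, the `patched` flag and the value of `MODEL_STOSB_THRESHOLD` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative value only; the real REP_STOSB_THRESHOLD is a tuning
   constant in the glibc sources.  */
#define MODEL_STOSB_THRESHOLD 2048

/* Toy CPU state: true while the upper halves of the vector registers
   hold stale data, i.e. until a VZEROUPPER is executed.  */
struct cpu_model
{
  bool upper_dirty;
};

/* Model of the memset control flow.  PATCHED selects whether the
   stosb path clears the upper state, as the patch above does.  */
void *
memset_model (struct cpu_model *cpu, void *dst, int c, size_t n,
              bool patched)
{
  assert (cpu != NULL);
  unsigned char *p = dst;

  cpu->upper_dirty = true;          /* Entry: broadcast the byte.  */

  if (n > MODEL_STOSB_THRESHOLD)
    {
      /* Models L(stosb).  */
      if (patched)
        cpu->upper_dirty = false;   /* VZEROUPPER  */
      for (size_t i = 0; i < n; ++i)
        p[i] = (unsigned char) c;   /* Stands in for rep stosb.  */
      return dst;
    }

  /* Models the vector store paths, which already end in VZEROUPPER.  */
  for (size_t i = 0; i < n; ++i)
    p[i] = (unsigned char) c;
  cpu->upper_dirty = false;
  return dst;
}
```

With `patched` set to false, a call with `n` above the threshold returns with `upper_dirty` still true, which is the situation BZ #21081 describes; with `patched` set to true both paths return with it cleared.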

Siddhesh

Re: 2.25 freeze status

Phil Blundell-3
In reply to this post by Carlos O'Donell-6
On Sat, 2017-01-28 at 19:00 -0500, Carlos O'Donell wrote:
>
> Whatever we might think about the seriousness of the issue, it has
> existed
> for 4 or 5 releases and few people have run into it, which means we
> have
> time to fix it i.e. not a blocker.

That's not an entirely accurate characterisation.  Bug 20019 has indeed
existed for several releases without anybody noticing, but bug 21041 is
new.

The effect of the older bug 20019 is that affected programs would be at
risk of crashing if they called longjmp().  But the effect of the newer
bug 21041 is that such programs cannot start up at all, even if they
never in fact call longjmp() at runtime.  (It's not clear to me whether
there is a group of binaries which do in fact call longjmp and were
previously working due to some quirk of ordering but are now also being
prevented from loading.)

It's slightly debatable which is worse.  The code as it stands today
certainly gives a more meaningful error message (rather than just
branching to address zero) and the corrective action that it suggests,
although not perfect, probably would indeed work around the bug in at
least most cases.  But on the other hand, there do appear to be at
least some programs which were not in fact crashing before and are now
unusable, so this is a functional regression.

One common library that contains relocations against longjmp() but does
not link with libpthread is libpng.  As far as I can tell any
multithreaded library which links itself with libpng is liable to run
into this problem with the code as it stands.
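
For readers wondering why an image library would reference longjmp() at all, here is a minimal sketch of the setjmp/longjmp error-recovery pattern that such libraries use. The names `decoder`, `decode_chunk` and `decode` are hypothetical; this is not libpng's API.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical decoder context holding the recovery point.  */
struct decoder
{
  jmp_buf on_error;
};

/* Hypothetical worker: bails out through longjmp on bad input.  */
static void
decode_chunk (struct decoder *d, const unsigned char *data, size_t len)
{
  assert (d != NULL);
  if (data == NULL || len == 0)
    longjmp (d->on_error, 1);       /* Does not return.  */
  /* ... real decoding would happen here ... */
}

/* Returns 0 on success, 1 if decode_chunk jumped back here.  */
int
decode (struct decoder *d, const unsigned char *data, size_t len)
{
  if (setjmp (d->on_error) != 0)
    return 1;                       /* Reached through longjmp.  */

  decode_chunk (d, data, len);
  return 0;
}
```

A library built this way carries a reference to longjmp() whether or not any particular run ever takes the error path, which is why a load-time check can reject it even when longjmp() would never have been called.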

p.


Re: 2.25 freeze status

Florian Weimer-5
On 01/30/2017 11:25 AM, Phil Blundell wrote:

> On Sat, 2017-01-28 at 19:00 -0500, Carlos O'Donell wrote:
>>
>> Whatever we might think about the seriousness of the issue, it has
>> existed
>> for 4 or 5 releases and few people have run into it, which means we
>> have
>> time to fix it i.e. not a blocker.
>
> That's not an entirely accurate characterisation.  Bug 20019 has indeed
> existed for several releases without anybody noticing, but bug 21041 is
> new.

Phil, thanks for spelling this out explicitly.

> The effect of the older bug 20019 is that affected programs would be at
> risk of crashing if they called longjmp().  But the effect of the newer
> bug 21041 is that such programs cannot start up at all, even if they
> never in fact call longjmp() at runtime.

I share this concern.

Florian


Re: 2.25 freeze status

Carlos O'Donell-6
On 01/30/2017 05:48 AM, Florian Weimer wrote:

> On 01/30/2017 11:25 AM, Phil Blundell wrote:
>> On Sat, 2017-01-28 at 19:00 -0500, Carlos O'Donell wrote:
>>>
>>> Whatever we might think about the seriousness of the issue, it has
>>> existed
>>> for 4 or 5 releases and few people have run into it, which means we
>>> have
>>> time to fix it i.e. not a blocker.
>>
>> That's not an entirely accurate characterisation.  Bug 20019 has indeed
>> existed for several releases without anybody noticing, but bug 21041 is
>> new.
>
> Phil, thanks for spelling this out explicitly.
>
>> The effect of the older bug 20019 is that affected programs would be at
>> risk of crashing if they called longjmp().  But the effect of the newer
>> bug 21041 is that such programs cannot start up at all, even if they
>> never in fact call longjmp() at runtime.
>
> I share this concern.

This is enough concern that I now propose we back out the fix for 20019
until we can find an acceptable fix in glibc 2.26.

H.J.,

Could you please back out the fix for bug 20019?

We will continue to try and fix this in 2.26 with a solution that moves
IFUNC design towards a better documented set of semantics.

commit 0e6d3adc60d8073397af6a320e594d98d7fbedde
Author: H.J. Lu <[hidden email]>
Date:   Fri Oct 28 09:11:55 2016 -0700

    Check IFUNC definition in unrelocated shared library [BZ #20019]
   
    Calling an IFUNC function defined in unrelocated shared library may
    lead to segfault.  This patch issues an error message to request
    relinking the shared library if it references IFUNC function defined
    in the unrelocated shared library.
   
            [BZ #20019]
            * sysdeps/i386/dl-machine.h (elf_machine_rel): Check IFUNC
            definition in unrelocated shared library.
            * sysdeps/x86_64/dl-machine.h (elf_machine_rela): Likewise.
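
The trade-off discussed in this thread can be illustrated with a toy model of binding a reference to an IFUNC whose defining object has not been relocated yet. Without a check, the resolver returns a NULL target and a later call would branch to address zero; with the check, loading fails outright. This is a sketch with hypothetical names (`object_model`, `relocate_model`, `bind_ifunc_model`, `bind_result`), not the dynamic linker's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void target_fn (void);

/* Toy shared object: RELOCATED becomes true once its relocations have
   been applied; until then its GOT-like slot is still NULL.  */
struct object_model
{
  bool relocated;
  target_fn *got_slot;
};

enum bind_result { BIND_OK, BIND_NULL_TARGET, BIND_LOAD_ERROR };

static void
real_target (void)
{
}

/* Resolver that depends on relocated data in its own object.  */
static target_fn *
resolver_model (const struct object_model *definer)
{
  return definer->got_slot;
}

/* Apply the object's relocations.  */
void
relocate_model (struct object_model *obj)
{
  assert (obj != NULL);
  obj->got_slot = real_target;
  obj->relocated = true;
}

/* Bind a reference to an IFUNC defined in DEFINER.  With STRICT set,
   refuse to load if DEFINER is unrelocated (the behaviour added for
   BZ #20019); otherwise bind whatever the resolver returns.  */
enum bind_result
bind_ifunc_model (const struct object_model *definer, bool strict,
                  target_fn **out)
{
  assert (definer != NULL && out != NULL);
  if (strict && !definer->relocated)
    return BIND_LOAD_ERROR;         /* The program does not start.  */
  *out = resolver_model (definer);
  return *out != NULL ? BIND_OK : BIND_NULL_TARGET;
}
```

In the model, `BIND_NULL_TARGET` only hurts a program that actually calls through the pointer, while `BIND_LOAD_ERROR` stops every program that carries the reference. That is the difference between the older and the newer bug as described earlier in the thread.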

--
Cheers,
Carlos.

Re: 2.25 freeze status

H.J. Lu-30
On Mon, Jan 30, 2017 at 10:55 AM, Carlos O'Donell <[hidden email]> wrote:

> On 01/30/2017 05:48 AM, Florian Weimer wrote:
>> On 01/30/2017 11:25 AM, Phil Blundell wrote:
>>> On Sat, 2017-01-28 at 19:00 -0500, Carlos O'Donell wrote:
>>>>
>>>> Whatever we might think about the seriousness of the issue, it has
>>>> existed
>>>> for 4 or 5 releases and few people have run into it, which means we
>>>> have
>>>> time to fix it i.e. not a blocker.
>>>
>>> That's not an entirely accurate characterisation.  Bug 20019 has indeed
>>> existed for several releases without anybody noticing, but bug 21041 is
>>> new.
>>
>> Phil, thanks for spelling this out explicitly.
>>
>>> The effect of the older bug 20019 is that affected programs would be at
>>> risk of crashing if they called longjmp().  But the effect of the newer
>>> bug 21041 is that such programs cannot start up at all, even if they
>>> never in fact call longjmp() at runtime.
>>
>> I share this concern.
>
> This is enough concern that I now propose we back out the fix for 20019
> until we can find an acceptable fix in glibc 2.26.
>
> H.J.,
>
> Could you please back out the fix for bug 20019?
>
> We will continue to try and fix this in 2.26 with a solution that moves
> IFUNC design towards a better documented set of semantics.
>

Since calling longjmp will segfault in this case, shouldn't it be
fixed first by reverting IFUNC implementation in libpthread?


--
H.J.

Re: 2.25 freeze status

Carlos O'Donell-6
On 01/30/2017 02:04 PM, H.J. Lu wrote:

>> H.J.,
>>
>> Could you please back out the fix for bug 20019?
>>
>> We will continue to try and fix this in 2.26 with a solution that moves
>> IFUNC design towards a better documented set of semantics.
>>
>
> Since calling longjmp will segfault in this case, shouldn't it be
> fixed first by reverting IFUNC implementation in libpthread?
 
I believe there is insufficient time to test such a change and verify
that changing a symbol from IFUNC to non-IFUNC does not have other
unintended consequences.

The minimal fix is to revert the changes for bug 20019 and allow programs
to start up and run as expected in the cases where they do not call longjmp.

I would like to see the minimum amount of reversion required to get us
back to a state where applications run again.

We have only a few days until the release deadline and I do not wish
to extend that date.

I understand your desire to fix this correctly, and we will continue this
discussion once master reopens, possibly with a reversion of the IFUNC
change to libpthread.

--
Cheers,
Carlos.

Re: 2.25 freeze status

H.J. Lu-30
On Mon, Jan 30, 2017 at 11:17 AM, Carlos O'Donell <[hidden email]> wrote:

> On 01/30/2017 02:04 PM, H.J. Lu wrote:
>>> H.J.,
>>>
>>> Could you please back out the fix for bug 20019?
>>>
>>> We will continue to try and fix this in 2.26 with a solution that moves
>>> IFUNC design towards a better documented set of semantics.
>>>
>>
>> Since calling longjmp will segfault in this case, shouldn't it be
>> fixed first by reverting IFUNC implementation in libpthread?
>
> I believe there is insufficient time to test that such a change and verify
> it does not have other unintended consequences for changing a symbol from
> IFUNC to non-IFUNC.
>
> The minimal fix is to revert the changes for bug 20019, and allow programs
> to startup, and run as expected in the cases they do not call longjmp.
>
> I would like to see the minimum amount of reversion required to get us
> back to a state where applications run again.
>
> We have only a few days until the release deadline and I do not wish
> to extend that date.
>
> I understand your desire to fix this correctly, and we will continue this
> discussion once master reopens, possibly with a reversion of the IFUNC
> change to libpthread.

I don't think knowingly allowing a program to segfault at random without any
warning is appropriate.  Can't we turn the fatal error into a non-fatal warning?


--
H.J.

Re: 2.25 freeze status

Florian Weimer-5
On 01/30/2017 08:39 PM, H.J. Lu wrote:
> I don't think knowingly allow a program to segfault at random without any
> warning is appropriate.  Can't we turn the fatal error into a non-fatal warning?

We had extensive discussions previously, and the consensus at the time
was that we do not want error detection for corrupt binaries in the
dynamic linker.

I think this is the thread I remember:

   <https://sourceware.org/ml/libc-alpha/2015-07/msg00480.html>

There likely have been others.

Thanks,
Florian

Re: 2.25 freeze status

Carlos O'Donell-6
In reply to this post by H.J. Lu-30
On 01/30/2017 02:39 PM, H.J. Lu wrote:

> On Mon, Jan 30, 2017 at 11:17 AM, Carlos O'Donell <[hidden email]> wrote:
>> On 01/30/2017 02:04 PM, H.J. Lu wrote:
>>>> H.J.,
>>>>
>>>> Could you please back out the fix for bug 20019?
>>>>
>>>> We will continue to try and fix this in 2.26 with a solution that moves
>>>> IFUNC design towards a better documented set of semantics.
>>>>
>>>
>>> Since calling longjmp will segfault in this case, shouldn't it be
>>> fixed first by reverting IFUNC implementation in libpthread?
>>
>> I believe there is insufficient time to test that such a change and verify
>> it does not have other unintended consequences for changing a symbol from
>> IFUNC to non-IFUNC.
>>
>> The minimal fix is to revert the changes for bug 20019, and allow programs
>> to startup, and run as expected in the cases they do not call longjmp.
>>
>> I would like to see the minimum amount of reversion required to get us
>> back to a state where applications run again.
>>
>> We have only a few days until the release deadline and I do not wish
>> to extend that date.
>>
>> I understand your desire to fix this correctly, and we will continue this
>> discussion once master reopens, possibly with a reversion of the IFUNC
>> change to libpthread.
>
> I don't think knowingly allow a program to segfault at random without any
> warning is appropriate.  Can't we turn the fatal error into a non-fatal warning?
 
What is or is not appropriate right now must be in the context of the upcoming
release.

The reversal of the patch is the simplest and most conservative move which
restores the behaviour that allows programs to start.


--
Cheers,
Carlos.

Re: 2.25 freeze status

H.J. Lu-30
On Mon, Jan 30, 2017 at 12:22 PM, Carlos O'Donell <[hidden email]> wrote:

> On 01/30/2017 02:39 PM, H.J. Lu wrote:
>> On Mon, Jan 30, 2017 at 11:17 AM, Carlos O'Donell <[hidden email]> wrote:
>>> On 01/30/2017 02:04 PM, H.J. Lu wrote:
>>>>> H.J.,
>>>>>
>>>>> Could you please back out the fix for bug 20019?
>>>>>
>>>>> We will continue to try and fix this in 2.26 with a solution that moves
>>>>> IFUNC design towards a better documented set of semantics.
>>>>>
>>>>
>>>> Since calling longjmp will segfault in this case, shouldn't it be
>>>> fixed first by reverting IFUNC implementation in libpthread?
>>>
>>> I believe there is insufficient time to test that such a change and verify
>>> it does not have other unintended consequences for changing a symbol from
>>> IFUNC to non-IFUNC.
>>>
>>> The minimal fix is to revert the changes for bug 20019, and allow programs
>>> to startup, and run as expected in the cases they do not call longjmp.
>>>
>>> I would like to see the minimum amount of reversion required to get us
>>> back to a state where applications run again.
>>>
>>> We have only a few days until the release deadline and I do not wish
>>> to extend that date.
>>>
>>> I understand your desire to fix this correctly, and we will continue this
>>> discussion once master reopens, possibly with a reversion of the IFUNC
>>> change to libpthread.
>>
>> I don't think knowingly allow a program to segfault at random without any
>> warning is appropriate.  Can't we turn the fatal error into a non-fatal warning?
>
> What is or is not appropriate right now must be in the context of the upcoming
> release.
>
> The reversal of the patch is the simplest and most conservative move which
> restores the behaviour that allows programs to start.
>

I am not against allowing the bad programs to start.  But silently allowing the
bad programs to crash at random isn't a conservative fix to me.

--
H.J.

Re: 2.25 freeze status

Carlos O'Donell-6
On 01/30/2017 03:38 PM, H.J. Lu wrote:

> On Mon, Jan 30, 2017 at 12:22 PM, Carlos O'Donell <[hidden email]> wrote:
>> On 01/30/2017 02:39 PM, H.J. Lu wrote:
>>> On Mon, Jan 30, 2017 at 11:17 AM, Carlos O'Donell <[hidden email]> wrote:
>>>> On 01/30/2017 02:04 PM, H.J. Lu wrote:
>>>>>> H.J.,
>>>>>>
>>>>>> Could you please back out the fix for bug 20019?
>>>>>>
>>>>>> We will continue to try and fix this in 2.26 with a solution that moves
>>>>>> IFUNC design towards a better documented set of semantics.
>>>>>>
>>>>>
>>>>> Since calling longjmp will segfault in this case, shouldn't it be
>>>>> fixed first by reverting IFUNC implementation in libpthread?
>>>>
>>>> I believe there is insufficient time to test that such a change and verify
>>>> it does not have other unintended consequences for changing a symbol from
>>>> IFUNC to non-IFUNC.
>>>>
>>>> The minimal fix is to revert the changes for bug 20019, and allow programs
>>>> to startup, and run as expected in the cases they do not call longjmp.
>>>>
>>>> I would like to see the minimum amount of reversion required to get us
>>>> back to a state where applications run again.
>>>>
>>>> We have only a few days until the release deadline and I do not wish
>>>> to extend that date.
>>>>
>>>> I understand your desire to fix this correctly, and we will continue this
>>>> discussion once master reopens, possibly with a reversion of the IFUNC
>>>> change to libpthread.
>>>
>>> I don't think knowingly allow a program to segfault at random without any
>>> warning is appropriate.  Can't we turn the fatal error into a non-fatal warning?
>>
>> What is or is not appropriate right now must be in the context of the upcoming
>> release.
>>
>> The reversal of the patch is the simplest and most conservative move which
>> restores the behaviour that allows programs to start.
>>
>
> I am not against allowing the bad programs to start.  But silently allow the
> bad programs to crash at random isn't a conservative fix to me.
 
It isn't a fix at all. We have run out of time to address the issue, and for
the upcoming 2.25 release it would be better that the applications continue
with their existing behaviour, rather than new behaviour that we know we
have to change again.

--
Cheers,
Carlos.

Re: 2.25 freeze status

Carlos O'Donell-6
In reply to this post by Florian Weimer-5
On 01/30/2017 02:53 PM, Florian Weimer wrote:
> On 01/30/2017 08:39 PM, H.J. Lu wrote:
>> I don't think knowingly allow a program to segfault at random without any
>> warning is appropriate.  Can't we turn the fatal error into a non-fatal warning?
>
> We had extensive discussions previously, and the consensus at the
> time was that we do not want error detection for corrupt binaries in
> the dynamic linker.

I would argue it's more a question of error handling:
https://sourceware.org/glibc/wiki/Style_and_Conventions#Error_Handling

I've updated IFUNC to include "Design Goals" where I outline what I think
should be happening:
https://sourceware.org/glibc/wiki/GNU_IFUNC

--
Cheers,
Carlos.

Re: 2.25 freeze status

H.J. Lu-30
In reply to this post by Carlos O'Donell-6
On Mon, Jan 30, 2017 at 12:41 PM, Carlos O'Donell <[hidden email]> wrote:

> On 01/30/2017 03:38 PM, H.J. Lu wrote:
>> On Mon, Jan 30, 2017 at 12:22 PM, Carlos O'Donell <[hidden email]> wrote:
>>> On 01/30/2017 02:39 PM, H.J. Lu wrote:
>>>> On Mon, Jan 30, 2017 at 11:17 AM, Carlos O'Donell <[hidden email]> wrote:
>>>>> On 01/30/2017 02:04 PM, H.J. Lu wrote:
>>>>>>> H.J.,
>>>>>>>
>>>>>>> Could you please back out the fix for bug 20019?
>>>>>>>
>>>>>>> We will continue to try and fix this in 2.26 with a solution that moves
>>>>>>> IFUNC design towards a better documented set of semantics.
>>>>>>>
>>>>>>
>>>>>> Since calling longjmp will segfault in this case, shouldn't it be
>>>>>> fixed first by reverting IFUNC implementation in libpthread?
>>>>>
>>>>> I believe there is insufficient time to test that such a change and verify
>>>>> it does not have other unintended consequences for changing a symbol from
>>>>> IFUNC to non-IFUNC.
>>>>>
>>>>> The minimal fix is to revert the changes for bug 20019, and allow programs
>>>>> to startup, and run as expected in the cases they do not call longjmp.
>>>>>
>>>>> I would like to see the minimum amount of reversion required to get us
>>>>> back to a state where applications run again.
>>>>>
>>>>> We have only a few days until the release deadline and I do not wish
>>>>> to extend that date.
>>>>>
>>>>> I understand your desire to fix this correctly, and we will continue this
>>>>> discussion once master reopens, possibly with a reversion of the IFUNC
>>>>> change to libpthread.
>>>>
>>>> I don't think knowingly allow a program to segfault at random without any
>>>> warning is appropriate.  Can't we turn the fatal error into a non-fatal warning?
>>>
>>> What is or is not appropriate right now must be in the context of the upcoming
>>> release.
>>>
>>> The reversal of the patch is the simplest and most conservative move which
>>> restores the behaviour that allows programs to start.
>>>
>>
>> I am not against allowing the bad programs to start.  But silently allow the
>> bad programs to crash at random isn't a conservative fix to me.
>
> It isn't a fix at all. We have run out of time to address the issue, and for
> the upcoming 2.25 release it would be better that the applications continue
> with their existing behaviour, rather than new behaviour that we know we
> have to change again.

I have a couple of questions:

1. Will this change ever be in 2.26?
2. Will this change be backported to 2.24?


--
H.J.