glibc 2.32 rseq support incompatible with Firefox sandbox

Sourceware - libc-alpha mailing list
I tried to run Firefox on the current glibc development branch
which will become glibc 2.32 in early August.  It fails with SIGSYS
during rseq registration:

Core was generated by `/home/test/firefox/firefox-bin -contentproc -childID 6 -isForBrowser -prefsLen'.
Program terminated with signal SIGSYS, Bad system call.
#0 rseq_register_current_thread ()
at ../sysdeps/unix/sysv/linux/rseq-internal.h:38
38 if (INTERNAL_SYSCALL_ERROR_P (ret))
[Current thread is 1 (Thread 0x7f545a45e640 (LWP 5932))]
(gdb) l
33 if (__rseq_abi.cpu_id != RSEQ_CPU_ID_UNINITIALIZED)
34 __libc_fatal ("glibc fatal error: "
35 "rseq already initialized for this thread\n");
36 ret = INTERNAL_SYSCALL_CALL (rseq, &__rseq_abi, sizeof (struct rseq),
37 0, RSEQ_SIG);
38 if (INTERNAL_SYSCALL_ERROR_P (ret))
39 {
40 const char *msg = NULL;
41
42 switch (INTERNAL_SYSCALL_ERRNO (ret))

(gdb) bt
#0 rseq_register_current_thread ()
at ../sysdeps/unix/sysv/linux/rseq-internal.h:38
#1 start_thread (arg=<optimized out>) at pthread_create.c:390
#2 0x00007f546b86d283 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

This looks like the earlier dlopen problem because during the creation
of new threads, all signals are blocked. This is required for
correctness of the rseq implementation, so that applications cannot
observe a thread state during which rseq is not registered.
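
To illustrate the block-then-set-up pattern: the following is a minimal
user-space sketch, not glibc's actual thread start-up code.  The names
demo_blocked_setup, setup_done and seen_by_handler are made up for the
example, and SIGUSR1 stands in for any asynchronous signal.  A signal
raised while everything is blocked stays pending, so its handler can
only run after the set-up step has finished.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical stand-in for "rseq is registered for this thread".  */
static volatile sig_atomic_t setup_done;
/* What the handler observed: -1 = never ran, 0 = half-initialized, 1 = done.  */
static volatile sig_atomic_t seen_by_handler;

static void on_usr1(int sig)
{
    (void)sig;
    seen_by_handler = setup_done;
}

/* Returns 1 if the handler ran only after set-up completed, 0 if it saw
   the intermediate state, -1 on error or if the handler never ran.  */
int demo_blocked_setup(void)
{
    struct sigaction sa;
    sigset_t all, old;

    seen_by_handler = -1;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    /* Block all signals for the duration of the set-up step.  */
    sigfillset(&all);
    if (sigprocmask(SIG_BLOCK, &all, &old) != 0)
        return -1;

    setup_done = 0;
    raise(SIGUSR1);          /* stays pending because it is blocked */
    setup_done = 1;          /* the "registration" completes here */

    /* Restore the old mask, making sure SIGUSR1 is deliverable so the
       pending signal is handled before sigprocmask returns.  */
    sigdelset(&old, SIGUSR1);
    if (sigprocmask(SIG_SETMASK, &old, NULL) != 0)
        return -1;
    return seen_by_handler;
}
```

The catch discussed in this thread is that the same mask also blocks
SIGSYS, which a seccomp filter using SECCOMP_RET_TRAP relies on.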

I filed a Firefox bug here:

  <https://bugzilla.mozilla.org/show_bug.cgi?id=1651701>

How can we work together to fix this?

Thanks,
Florian


Re: glibc 2.32 rseq support incompatible with Firefox sandbox

Sourceware - libc-alpha mailing list
On Thu, Jul 9, 2020 at 8:04 AM Florian Weimer via Libc-alpha
<[hidden email]> wrote:

>
> I tried to run Firefox on the current glibc development branch
> which will become glibc 2.32 in early August.  It fails with SIGSYS
> during rseq registration:
>
> Core was generated by `/home/test/firefox/firefox-bin -contentproc -childID 6 -isForBrowser -prefsLen'.
> Program terminated with signal SIGSYS, Bad system call.
> #0 rseq_register_current_thread ()
> at ../sysdeps/unix/sysv/linux/rseq-internal.h:38
> 38 if (INTERNAL_SYSCALL_ERROR_P (ret))
> [Current thread is 1 (Thread 0x7f545a45e640 (LWP 5932))]
> (gdb) l
> 33 if (__rseq_abi.cpu_id != RSEQ_CPU_ID_UNINITIALIZED)
> 34 __libc_fatal ("glibc fatal error: "
> 35 "rseq already initialized for this thread\n");
> 36 ret = INTERNAL_SYSCALL_CALL (rseq, &__rseq_abi, sizeof (struct rseq),
> 37 0, RSEQ_SIG);
> 38 if (INTERNAL_SYSCALL_ERROR_P (ret))
> 39 {
> 40 const char *msg = NULL;
> 41
> 42 switch (INTERNAL_SYSCALL_ERRNO (ret))
>
> (gdb) bt
> #0 rseq_register_current_thread ()
> at ../sysdeps/unix/sysv/linux/rseq-internal.h:38
> #1 start_thread (arg=<optimized out>) at pthread_create.c:390
> #2 0x00007f546b86d283 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>
> This looks like the earlier dlopen problem because during the creation
> of new threads, all signals are blocked. This is required for
> correctness of the rseq implementation, so that applications cannot
> observe a thread state during which rseq is not registered.
>
> I filed a Firefox bug here:
>
>   <https://bugzilla.mozilla.org/show_bug.cgi?id=1651701>
>
> How can we work together to fix this?

Initialization may not always work:

https://sourceware.org/bugzilla/show_bug.cgi?id=26203

Can we use a dedicated dummy function pointer:

https://sourceware.org/pipermail/libc-alpha/2020-July/115788.html

to call initialization functions if IFUNC is available?


--
H.J.

Re: glibc 2.32 rseq support incompatible with Firefox sandbox

Sourceware - libc-alpha mailing list
* H. J. Lu:

> Initialization may not always work:
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=26203

Sorry, this is a completely different issue.  For rseq, we use
__libc_early_init, which is always invoked before executing user code.
IFUNC resolvers cannot use rseq reliably because it introduces a
relocation dependency.

Thanks,
Florian


Re: glibc 2.32 rseq support incompatible with Firefox sandbox

Sourceware - libc-alpha mailing list
----- On Jul 9, 2020, at 11:03 AM, Florian Weimer [hidden email] wrote:

> I tried to run Firefox on the current glibc development branch
> which will become glibc 2.32 in early August.  It fails with SIGSYS
> during rseq registration:
>
> Core was generated by `/home/test/firefox/firefox-bin -contentproc -childID 6
> -isForBrowser -prefsLen'.
> Program terminated with signal SIGSYS, Bad system call.
> #0 rseq_register_current_thread ()
> at ../sysdeps/unix/sysv/linux/rseq-internal.h:38
> 38 if (INTERNAL_SYSCALL_ERROR_P (ret))
> [Current thread is 1 (Thread 0x7f545a45e640 (LWP 5932))]
> (gdb) l
> 33 if (__rseq_abi.cpu_id != RSEQ_CPU_ID_UNINITIALIZED)
> 34 __libc_fatal ("glibc fatal error: "
> 35 "rseq already initialized for this thread\n");
> 36 ret = INTERNAL_SYSCALL_CALL (rseq, &__rseq_abi, sizeof (struct rseq),
> 37 0, RSEQ_SIG);
> 38 if (INTERNAL_SYSCALL_ERROR_P (ret))
> 39 {
> 40 const char *msg = NULL;
> 41
> 42 switch (INTERNAL_SYSCALL_ERRNO (ret))
>
> (gdb) bt
> #0 rseq_register_current_thread ()
> at ../sysdeps/unix/sysv/linux/rseq-internal.h:38
> #1 start_thread (arg=<optimized out>) at pthread_create.c:390
> #2 0x00007f546b86d283 in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>
> This looks like the earlier dlopen problem because during the creation
> of new threads, all signals are blocked. This is required for
> correctness of the rseq implementation, so that applications cannot
> observe a thread state during which rseq is not registered.
>
> I filed a Firefox bug here:
>
>  <https://bugzilla.mozilla.org/show_bug.cgi?id=1651701>
>
> How can we work together to fix this?

I suspect it's just the sandbox doing its job, and not being aware
of the existence of rseq.

In the gecko-dev git tree, grepping for other system calls used around
pthread creation, e.g. get/set_robust_list, I see:

testing/web-platform/tests/tools/docker/seccomp.json
147:        "get_robust_list",
305:        "set_robust_list",

security/sandbox/linux/SandboxFilter.cpp:
637:#ifdef __NR_set_robust_list
638:      case __NR_set_robust_list:

So I think we should just teach those sandboxes about rseq in the same
way they allow set_robust_list.
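
A hypothetical sketch of what that could look like, modeled on the
#ifdef/case pattern visible in the grep output above.  This is not the
actual SandboxFilter.cpp code: thread_setup_policy and the verdict enum
are names invented for illustration, and a real policy covers far more
system calls.

```c
#include <sys/syscall.h>

enum verdict { VERDICT_ALLOW, VERDICT_TRAP };

/* Decide what to do with a system call made during thread set-up.
   Anything not listed falls through to the trap (SIGSYS) path.  */
enum verdict thread_setup_policy(long nr)
{
    switch (nr) {
#ifdef __NR_set_robust_list
    case __NR_set_robust_list:
#endif
#ifdef __NR_rseq
    /* Guarded because older kernel headers do not define it.  */
    case __NR_rseq:
#endif
        return VERDICT_ALLOW;
    default:
        return VERDICT_TRAP;
    }
}
```

The #ifdef guard keeps the policy buildable against kernel headers that
predate rseq, the same way the existing set_robust_list entry is guarded.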

Thanks,

Mathieu




>
> Thanks,
> Florian

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

Re: glibc 2.32 rseq support incompatible with Firefox sandbox

Szabolcs Nagy-2
The 07/09/2020 17:03, Florian Weimer via Libc-alpha wrote:

> I tried to run Firefox on the current glibc development branch
> which will become glibc 2.32 in early August.  It fails with SIGSYS
> during rseq registration:
>
> Core was generated by `/home/test/firefox/firefox-bin -contentproc -childID 6 -isForBrowser -prefsLen'.
> Program terminated with signal SIGSYS, Bad system call.
> #0 rseq_register_current_thread ()
> at ../sysdeps/unix/sysv/linux/rseq-internal.h:38
> 38 if (INTERNAL_SYSCALL_ERROR_P (ret))
> [Current thread is 1 (Thread 0x7f545a45e640 (LWP 5932))]
> (gdb) l
> 33 if (__rseq_abi.cpu_id != RSEQ_CPU_ID_UNINITIALIZED)
> 34 __libc_fatal ("glibc fatal error: "
> 35 "rseq already initialized for this thread\n");
> 36 ret = INTERNAL_SYSCALL_CALL (rseq, &__rseq_abi, sizeof (struct rseq),
> 37 0, RSEQ_SIG);
> 38 if (INTERNAL_SYSCALL_ERROR_P (ret))
> 39 {
> 40 const char *msg = NULL;
> 41
> 42 switch (INTERNAL_SYSCALL_ERRNO (ret))
>
> (gdb) bt
> #0 rseq_register_current_thread ()
> at ../sysdeps/unix/sysv/linux/rseq-internal.h:38
> #1 start_thread (arg=<optimized out>) at pthread_create.c:390
> #2 0x00007f546b86d283 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>
> This looks like the earlier dlopen problem because during the creation
> of new threads, all signals are blocked. This is required for
> correctness of the rseq implementation, so that applications cannot
> observe a thread state during which rseq is not registered.
>
> I filed a Firefox bug here:
>
>   <https://bugzilla.mozilla.org/show_bug.cgi?id=1651701>
>
> How can we work together to fix this?

so SIGSYS is raised because rseq is not an allowed
syscall but treated with SECCOMP_RET_TRAP in the
sandbox.

the sandbox expects to handle the SIGSYS but if the
signal is blocked then the process is terminated.
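
a minimal sketch of this failure mode, under stated assumptions: it
is not the firefox or chromium sandbox code, getppid is only a
stand-in for rseq, run_trapped_syscall is a made-up name, and the
filter skips the architecture check that a real filter needs.  a
forked child installs a SECCOMP_RET_TRAP filter plus a SIGSYS
handler, optionally blocks SIGSYS, and makes the trapped call.

```c
#define _GNU_SOURCE
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/prctl.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t sigsys_handled;

static void on_sigsys(int sig)
{
    (void)sig;
    sigsys_handled = 1;
}

/* Runs in the child and never returns.  Exit codes: 0 = SIGSYS handler
   ran, 2 = set-up failed, 3 = call went through without a signal.  */
static void child(int block_sigsys)
{
    struct sock_filter insns[] = {
        /* Load the syscall number (demo only: no arch check).  */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* Trap getppid, allow everything else.  */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_getppid, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_TRAP),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = sizeof(insns) / sizeof(insns[0]),
        .filter = insns,
    };
    struct rlimit no_core = { 0, 0 };
    struct sigaction sa;
    sigset_t set;

    setrlimit(RLIMIT_CORE, &no_core);   /* do not leave a core file behind */

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigsys;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSYS, &sa, NULL) != 0)
        _exit(2);

    sigemptyset(&set);
    sigaddset(&set, SIGSYS);
    if (sigprocmask(block_sigsys ? SIG_BLOCK : SIG_UNBLOCK, &set, NULL) != 0)
        _exit(2);

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0
        || prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) != 0)
        _exit(2);

    syscall(SYS_getppid);               /* hits SECCOMP_RET_TRAP */
    _exit(sigsys_handled ? 0 : 3);
}

/* Returns 0 if the child's SIGSYS handler ran, the terminating signal
   number if the child was killed, and -1 on any other outcome (for
   example when seccomp filters are unavailable).  */
int run_trapped_syscall(int block_sigsys)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        child(block_sigsys);
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 0;
    return -1;
}
```

on a kernel with seccomp filter support, run_trapped_syscall(0)
should report that the handler ran, while run_trapped_syscall(1)
should report death by SIGSYS, matching the core dump at the top
of the thread.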

this sandbox design is broken: blocking signals is
valid and new syscalls are added all the time (e.g.
all time related syscalls just got replaced on 32bit
systems).

the libc has a backward abi compatibility guarantee:
a binary that works with an old libc must work with
a new libc too. such sandboxes break this guarantee
creating problems where ppl keep their libc or linux
system downgraded so their binaries work.

new syscalls may be necessary to fix security bugs
so this potentially prevents deploying security
updates too.

if firefox wants to filter syscalls it must coordinate
with the libc and not poke at the linux abi behind
its back.

Re: glibc 2.32 rseq support incompatible with Firefox sandbox

Sourceware - libc-alpha mailing list
On Thu, Jul 9, 2020, 18:34 Szabolcs Nagy <[hidden email]> wrote:

>
> this sandbox design is broken: blocking signals is
> valid


That the design is broken is somewhere between an irrelevant and pointless
observation: it was introduced back in 2012 by Chromium. I'm going to guess
they did it this way because Seccomp-Trap was the only available option to
get a usable browser sandbox until very recent 5.x kernels (which means it
will potentially stay the only way for modern glibc on older kernels?).
Firefox hits it as well because we share large parts of the sandbox
implementation with Chromium.

We discussed the signal blocking issue in a previous thread - glibc wanting
to be more strict with signal blocking during thread setup - and have been
discussing with Chromium people on moving towards the more modern NOTIFY. I
believe they have a prototype implementation in review. We have some dreams
of getting to work on one before the end of the year. In any case we'll be
stuck with this for a while more.

> and new syscalls are added all the time (e.g.
> all time related syscalls just got replaced on 32bit
> systems).

We try to proactively patch these in when we become aware of the need. It's
also possible to add them via prefs so you don't need a browser rebuild,
see: https://wiki.mozilla.org/Security/Sandbox#Customization_Settings

If all we need to do here is allow rseq, then it's not really a problem. If
it's a more fundamental issue with the signal blocking, we'll need to
figure out a workaround until sandboxed browsers can add support for the
entirely new seccomp implementation.

--
GCP


Re: glibc 2.32 rseq support incompatible with Firefox sandbox

Sourceware - libc-alpha mailing list
* Gian-Carlo Pascutto:

> If all we need to do here is allow rseq, then it's not really a
> problem. If it's a more fundamental issue with the signal blocking,
> we'll need to figure out a workaround until sandboxed browsers can add
> support for and add the entirely new seccomp implementation.

rseq and rt_sigprocmask are the only new system calls after clone in
glibc 2.32.  rt_sigprocmask should be fine, so only rseq needs to be
permitted.  It would be best not to deny rseq on specific threads if
it has already succeeded on the main thread.

Thanks,
Florian


Re: glibc 2.32 rseq support incompatible with Firefox sandbox

Sourceware - libc-alpha mailing list
On 7/9/20 2:10 PM, Gian-Carlo Pascutto via Libc-alpha wrote:
> We try to proactively patch these in when we become aware of the need. It's
> also possible to add them via prefs so you don't need a browser rebuild,
> see: https://wiki.mozilla.org/Security/Sandbox#Customization_Settings

Gian-Carlo,

Thanks for this information.

I can confirm that setting 334 (rseq) in security.sandbox.content.syscall_whitelist
is enough to fix the tabs crashing in Firefox in Fedora Rawhide.
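
Note that 334 is the x86_64 value of __NR_rseq; other architectures use
different numbers (arm64 uses 293, for example).  A small sketch for
looking it up from the installed kernel headers rather than hard-coding
it; rseq_syscall_number is a made-up helper name.

```c
#include <sys/syscall.h>

/* Returns the rseq system call number for the target architecture, or
   -1 if the installed kernel headers predate rseq (Linux 4.18).  */
long rseq_syscall_number(void)
{
#ifdef __NR_rseq
    return __NR_rseq;
#else
    return -1;
#endif
}
```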

However, the OpenH264 video playback in the sandbox is still broken in the most
recent update of glibc that I'm testing in Fedora Rawhide.

Florian,

To test this I temporarily synced Rawhide to master (-18), so somewhere between
commit c6aac3bf3663709cdefde5f5d5e9e875d607be5e and
commit c363f834cfcbf5efa5449ef13f62233a6d5b9422 we break the OpenH264 decoding
(I used a local Fedora Rawhide VM for testing on x86_64).

I see this repeating with MOZ_SANDBOX_LOGGING=1:
Sandbox: policy for /dev/shm/org.mozilla.ipc.4406.: 1 -> 47
... seccomp-bpf program ...
###!!! [Child][MessageChannel::SendAndWait] Error: Channel error: cannot send/recv
Sandbox: EOF from pid 4232

Which I don't see in Rawhide's -17 before I do the update (which includes the
most recent 68 commits).

So something else is still causing problems for the sandbox even with 334 in the
syscall_whitelist.

--
Cheers,
Carlos.


Re: glibc 2.32 rseq support incompatible with Firefox sandbox

Sourceware - libc-alpha mailing list
* Carlos O'Donell:

> Florian,
>
> To test this I did temp Rawhide sync to master (-18) and so somewhere between
> commit c6aac3bf3663709cdefde5f5d5e9e875d607be5e and
> commit c363f834cfcbf5efa5449ef13f62233a6d5b9422 we break the OpenH264 decoding
> (I used a local Fedora Rawhide VM for testing on x86_64).
>
> I see this repeating with MOZ_SANDBOX_LOGGING=1:
> Sandbox: policy for /dev/shm/org.mozilla.ipc.4406.: 1 -> 47
> ... seccomp-bpf program ...
> ###!!! [Child][MessageChannel::SendAndWait] Error: Channel error: cannot send/recv
> Sandbox: EOF from pid 4232
>
> Which I don't see in Rawhide's -17 before I do the update (which includes the
> most recent 68 commits).

The OpenH264 plugin has its own, separate sandbox.  Apparently it does
not use the additional system call list for the content sandbox.

What's your test procedure for the OpenH264 plugin?

Thanks,
Florian