[Bug nptl/24776] New: pthread_key_create, pthread_setspecific are incompatible with dlmopen

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

[Bug nptl/24776] New: pthread_key_create, pthread_setspecific are incompatible with dlmopen

fweimer at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=24776

            Bug ID: 24776
           Summary: pthread_key_create, pthread_setspecific are
                    incompatible with dlmopen
           Product: glibc
           Version: 2.30
            Status: NEW
          Severity: normal
          Priority: P2
         Component: nptl
          Assignee: unassigned at sourceware dot org
          Reporter: fweimer at redhat dot com
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---
             Flags: security-

pthread_key_create and pthread_setspecific use the malloc of the respective
namespace.  The memory is freed in __nptl_deallocate_tsd, again using the
malloc of the namespace.

These can be different if the base namespace creates a thread, another
namespace calls pthread_setspecific, and then the thread exits (by returning
from the start function, so via the base namespace).

After some heroic patching, this is where the example in
<https://sourceware.org/ml/libc-help/2019-06/msg00026.html> fails, and that
part will be difficult to address.

--
You are receiving this mail because:
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug nptl/24776] pthread_key_create, pthread_setspecific are incompatible with dlmopen

fweimer at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=24776

--- Comment #1 from Florian Weimer <fweimer at redhat dot com> ---
Created attachment 11888
  --> https://sourceware.org/bugzilla/attachment.cgi?id=11888&action=edit
Baojun Wang's reproducer from the mailing list thread

--
You are receiving this mail because:
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug nptl/24776] pthread_key_create, pthread_setspecific are incompatible with dlmopen

fweimer at redhat dot com
In reply to this post by fweimer at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=24776

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |carlos at redhat dot com

--- Comment #2 from Carlos O'Donell <carlos at redhat dot com> ---
Is it fair to say that registered destructors must run in the same namespace
they were registered under?

We need to record the namespace at the point of registration, and use it again
when the destructor is run.

Running the destructor in the namespace that was present at registration
ensures that the correct free (and any other calls) is used.

--
You are receiving this mail because:
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug nptl/24776] pthread_key_create, pthread_setspecific are incompatible with dlmopen

fweimer at redhat dot com
In reply to this post by fweimer at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=24776

--- Comment #3 from Florian Weimer <fweimer at redhat dot com> ---
For code outside of ld.so, the relevant namespace is always implied by the code
address, so I don't think that's the problem here.

We probably cannot use malloc for allocating the internal data structures.  If
we keep using it, we need to call through a wrapper function in GLRO (or ld.so
in general), to reach into the right namespace.

The user-supplied destructor function is not necessarily in the same namespace
as the internal allocation, either.

--
You are receiving this mail because:
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug nptl/24776] pthread_key_create, pthread_setspecific are incompatible with dlmopen

fweimer at redhat dot com
In reply to this post by fweimer at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=24776

--- Comment #4 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Florian Weimer from comment #3)
> We probably cannot use malloc for allocating the internal data structures.
> If we keep using it, we need to call through a wrapper function in GLRO (or
> ld.so in general), to reach into the right namespace.

My opinion is that we may be able to take a hybrid approach to the problem.

I believe that for certain parts of rtld we can and should use an internal
allocator that is isolated from all the allocators in the namespaces. For the
purpose of consistency and accountability this allocator cannot be interposed.
These allocations are almost always related to book-keeping tasks.

For other parts of rtld where rtld acts on *behalf* of another API request, it
seems logical to try and use the allocator from the namespace, tag the
allocation with namespace information, and then release it back using the same
namespace.

Back to the question at hand. If the user supplied destructor calls malloc/free
then as long as we run it in the original namespace the objects will be
correctly  allocated and released in the right place. If the user supplied
destructor calls a shared-rtld function (directly or indirectly) which returns
a pointer that the destructor must free, then the underlying shared-rtld
functions must be of the kind I describe above as "acts on *behalf* of another
API" and so must also come from the enclosing namespace.

One way to solve this is:
* Consider the space used by pthread_setspecific to be an "implementation
detail" and use a new rtld_calloc() function to allocate it from rtld.
* Subsequent frees in __nptl_deallocate_tsd use rtld_free().

So we invert the problem and move the allocations to the shared rtld namespace,
rather than recording where they came from to free them correctly (possibly
costly).

--
You are receiving this mail because:
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug nptl/24776] pthread_key_create, pthread_setspecific are incompatible with dlmopen

fweimer at redhat dot com
In reply to this post by fweimer at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=24776

--- Comment #5 from Florian Weimer <fweimer at redhat dot com> ---
(In reply to Carlos O'Donell from comment #4)
> Back to the question at hand. If the user supplied destructor calls
> malloc/free then as long as we run it in the original namespace the objects

Note that there is no concept of “running code in a namespace”.  The code *is*
the namespace (at least until we gain support for a FDPIC architecture, where
things could be different).

> will be correctly  allocated and released in the right place. If the user
> supplied destructor calls a shared-rtld function (directly or indirectly)
> which returns a pointer that the destructor must free, then the underlying
> shared-rtld functions must be of the kind I describe above as "acts on
> *behalf* of another API" and so must also come from the enclosing namespace.

True, but I think we should just not add such a function that uses the generic
free.  There can only be one rtld, so performing tasks in rtld in the non-base
namespace will always be problematic.  See what we have to do with dlerror to
make things work properly.

> One way to solve this is:
> * Consider the space used by pthread_setspecific to be an "implementation
> detail" and use a new rtld_calloc() function to allocate it from rtld.
> * Subsequent frees in __nptl_deallocate_tsd use rtld_free().

Another approach would introduce per-link map thread destructors.  Which one is
better is hard to tell.  There are likely interactions between dlmopen and
threads that are much harder to solve anyway.

--
You are receiving this mail because:
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug nptl/24776] pthread_key_create, pthread_setspecific are incompatible with dlmopen

fweimer at redhat dot com
In reply to this post by fweimer at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=24776

wangbj at gmail dot com <wangbj at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wangbj at gmail dot com

--- Comment #6 from wangbj at gmail dot com <wangbj at gmail dot com> ---
Sorry for hijacking the thread. I guess the issue might related to
ld-linux.so's malloc. I know ld-linux.so itself can call *malloc* families
(including free, realloc..), even as a freestanding object. but it also makes
*malloc* (families) a weak symbol:


$ readelf -s /lib/x86_64-linux-gnu/ld-2.27.so | grep tls_get_addr
    27: 00000000000199f0    61 FUNC    GLOBAL DEFAULT   12
__tls_get_addr@@GLIBC_2.3

$ readelf -s /lib/x86_64-linux-gnu/ld-2.27.so  | grep WEAK
     9: 000000000001b620    52 FUNC    WEAK   DEFAULT   12 free@@GLIBC_2.2.5
    10: 000000000001b7e0   135 FUNC    WEAK   DEFAULT   12 realloc@@GLIBC_2.2.5
    17: 000000000001b5e0    56 FUNC    WEAK   DEFAULT   12 calloc@@GLIBC_2.2.5
    29: 000000000001b4b0   292 FUNC    WEAK   DEFAULT   12 malloc@@GLIBC_2.2.5


Now that even `dlmopen` is called, there's only a single copy of `ld-linux.so`,
so when the new linker namespace calls `__tls_get_addr`, which in turn could
call `malloc`, ld-linux.so resolves the `malloc` into the default namespace's
malloc. because `malloc@@GLIBC_2.2.5` could have been resolved by ld-linux.so
already previously.

So it looks to me the isolation is not strong enough: it is very easy to jail
break the isolation when TLS is involved. wouldn't ld-linux.so provide malloc
families as WEAK symbol, the isolation could be stronger. Hence IMHO,
ld-linux.so should not try to re-resolve the malloc family symbols.

Back to my issue attached by Florian, my program got stack overflowed by
500+MB:

    frame #40114: 0x00007ffff7deea28 ld-linux-x86-64.so.2`__tls_get_addr at
tls_get_addr.S:55
    frame #40115: 0x00007ffff7ef0fa2   // <---- DSO in new linker ns calls
__tls_get_addr, via std::io::_eprintln (rust code)
    frame #40116: 0x00007ffff764e90a libc.so.6`arena_get2(size=576,
avoid_arena=0x00007ffff6338770) at arena.c:888 // <----- calls get_nprocs
    frame #40117: 0x00007ffff765354d libc.so.6`tcache_init at arena.c:879
    frame #40118: 0x00007ffff7653530 libc.so.6`tcache_init at malloc.c:2986
    frame #40119: 0x00007ffff76541cb libc.so.6`__GI___libc_malloc at
malloc.c:2983
    frame #40120: 0x00007ffff76541b0 libc.so.6`__GI___libc_malloc(bytes=160) at
malloc.c:3042. // <---- resolved to default namespace instead of the new one.
    frame #40121: 0x00007ffff7de7b90 ld-linux-x86-64.so.2`tls_get_addr_tail at
dl-tls.c:594
    frame #40122: 0x00007ffff7de7b6c ld-linux-x86-64.so.2`tls_get_addr_tail at
dl-tls.c:607
    frame #40123: 0x00007ffff7de7b5e
ld-linux-x86-64.so.2`tls_get_addr_tail(ti=0x00007ffff7f227e0,
dtv=0x000000000060a090, the_map=0x0000000000602360) at dl-tls.c:787
    frame #40124: 0x00007ffff7deea28 ld-linux-x86-64.so.2`__tls_get_addr at
tls_get_addr.S:55
    frame #40125: 0x00007ffff7ef0fa2


From the call stack you can see __tls_get_addr -> __tls_get_addr_tail -> malloc
resolved into the default namespace's malloc (sorry might not obvious to you
guys because the full /proc/<pid>/maps is rather long), the malloc calls to
arena_get2 -> get_nprocs, get_nrpocs might call SYS_open, trying to get the
information from process. Now that's an issue for me, because my DSO (in the
new linker ns) actually intercept `SYS_open`, and doing something like `printf`
(or std::io::_eprint in rust):

000000000001af50 <std::io::stdio::_eprint>:
   1af50:       53                      push   %rbx
   1af51:       48 81 ec c0 00 00 00    sub    $0xc0,%rsp
   1af58:       0f 10 07                movups (%rdi),%xmm0
   1af5b:       0f 10 4f 10             movups 0x10(%rdi),%xmm1
   1af5f:       0f 10 57 20             movups 0x20(%rdi),%xmm2
   1af63:       0f 29 94 24 a0 00 00    movaps %xmm2,0xa0(%rsp)
   1af6a:       00
   1af6b:       0f 29 8c 24 90 00 00    movaps %xmm1,0x90(%rsp)
   1af72:       00
   1af73:       0f 29 84 24 80 00 00    movaps %xmm0,0x80(%rsp)
   1af7a:       00
   1af7b:       48 8d 05 d5 fa 01 00    lea    0x1fad5(%rip),%rax        #
3aa57 <str.2+0x5b7>
   1af82:       48 89 84 24 b0 00 00    mov    %rax,0xb0(%rsp)
   1af89:       00
   1af8a:       48 c7 84 24 b8 00 00    movq   $0x6,0xb8(%rsp)
   1af91:       00 06 00 00 00
   1af96:       48 8d 3d 43 18 03 00    lea    0x31843(%rip),%rdi        #
4c7e0 <_GLOBAL_OFFSET_TABLE_+0x80>
   1af9d:       e8 de d0 fe ff          callq  8080 <__tls_get_addr@plt>
   1afa2:       48 83 b8 40 00 00 00    cmpq   $0x1,0x40(%rax).  // <---
0x00007ffff7ef0fa2 from the stack frame.

Which caused the huge stack overflow. That's the reason why patching
`get_nprocs` solved my issue, I'd have thought. As mentioned earlier, if
ld-linux.so resolved `malloc` to the new linker namespace version, it wouldn't
have any issue (because my DSO only intercepts syscalls from *default*
namespace).

It may sounds like this particular issue only applies to my use case, but I
really think the WEAK malloc families symbols really makes the isolation much
weaker, do you guys think it would be a reasonable request to make the malloc
families symbols from ld-linux.so PRIVATE?

--
You are receiving this mail because:
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug nptl/24776] pthread_key_create, pthread_setspecific are incompatible with dlmopen

fweimer at redhat dot com
In reply to this post by fweimer at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=24776

--- Comment #7 from wangbj at gmail dot com <wangbj at gmail dot com> ---
It could be even more nuanced than that: __tls_get_addr could cause mallocing
memory from the default namespace, while free from the new namespace would end
up freeing memory allocated from the default namespace, resulting memory
corruption, I've seen this happen sporadically, though haven't found a solid
instance it indeed happened, as neither gdb nor lldb is friendly to core dumps
with multiple linker namespace.

--
You are receiving this mail because:
You are on the CC list for the bug.
Reply | Threaded
Open this post in threaded view
|

[Bug nptl/24776] pthread_key_create, pthread_setspecific are incompatible with dlmopen

fweimer at redhat dot com
In reply to this post by fweimer at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=24776

--- Comment #8 from Florian Weimer <fweimer at redhat dot com> ---
(In reply to [hidden email] from comment #7)
> It could be even more nuanced than that: __tls_get_addr could cause
> mallocing memory from the default namespace, while free from the new
> namespace would end up freeing memory allocated from the default namespace,
> resulting memory corruption, I've seen this happen sporadically, though
> haven't found a solid instance it indeed happened, as neither gdb nor lldb
> is friendly to core dumps with multiple linker namespace.

This is exactly what happens if the other (easier) issues are fixed, and the
scope of this bug report.  It's currently not easy to see this because of the
other issues (bug 24772, bug 24773, if I recall correctly).

--
You are receiving this mail because:
You are on the CC list for the bug.