[Bug nptl/22853] New: Heap address of pthread_create thread is aligned.

[Bug nptl/22853] New: Heap address of pthread_create thread is aligned.

glaubitz at physik dot fu-berlin.de
https://sourceware.org/bugzilla/show_bug.cgi?id=22853

            Bug ID: 22853
           Summary: Heap address of pthread_create thread is aligned.
           Product: glibc
           Version: unspecified
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: nptl
          Assignee: unassigned at sourceware dot org
          Reporter: blackzert at gmail dot com
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---

When a thread created with pthread_create first calls malloc, glibc creates a
new heap region for it. The size of this heap depends on compile-time
configuration; in my case (x86-64) HEAP_MAX_SIZE is 64MB.

In the function malloc/arena.c:new_heap there is a condition under which the
heap will be aligned to HEAP_MAX_SIZE. This means the address of the created
heap will be aligned to HEAP_MAX_SIZE, which is 64MB in my case.

Now we can compute the probability that an attacker guesses the heap address.

2^48 - 4096 is the size of a user-mode task. 64MB is 2^26, so there are
48 - 26 = 22 bits, i.e. 2^22 possible 64MB-aligned heap slots in the address
space.
On many Linux distributions mmap_rnd_bits is 28 (tunable via
/proc/sys/vm/mmap_rnd_bits). On such systems the 8 high bits are fixed:
48 bits - (mmap_rnd_bits + PAGE_SHIFT) = 8, so we know the 8 high bits of
mmap_base - the address will start with 0x7f.
What does this mean? It means that if an application creates a thread and calls
malloc from that new thread, there are only 2^14 possible values for the
address of that heap.
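The arithmetic above can be sketched as a short calculation (a sketch, assuming the x86-64 defaults quoted: a 48-bit user address space, HEAP_MAX_SIZE of 64MB, mmap_rnd_bits of 28 and PAGE_SHIFT of 12):

```python
# Sketch of the entropy arithmetic for the 64MB-aligned thread heap
# (x86-64 defaults as described above; purely illustrative).
VA_BITS = 48            # user-mode virtual address bits
HEAP_BITS = 26          # HEAP_MAX_SIZE = 64MB = 2**26
MMAP_RND_BITS = 28      # typical mmap randomization
PAGE_SHIFT = 12         # 4 KB pages

# All 64MB-aligned slots in the user address space.
total_slots = 1 << (VA_BITS - HEAP_BITS)

# High bits left fixed by the limited mmap randomization.
fixed_hi_bits = VA_BITS - (MMAP_RND_BITS + PAGE_SHIFT)

# Slots an attacker actually has to guess among.
guessable_slots = total_slots >> fixed_hi_bits

print(total_slots)      # 2**22 = 4194304
print(fixed_hi_bits)    # 8
print(guessable_slots)  # 2**14 = 16384
```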

Proof of Concept:

First, a program that prints the address returned by malloc from a new thread:

#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>

static void *first(void *x)
{
        void *ptr = malloc(8);
        if (ptr == NULL)
        {
                printf("Failed to alloc\n");
                return NULL;
        }
        printf("%p\n", ptr);
        return NULL;
}

int main(void)
{
        int res;
        pthread_t one;

        res = pthread_create(&one, NULL, &first, NULL);
        if (res)
        {
                printf("Failed to create thread %d\n", res);
                return -1;
        }
        pthread_join(one, NULL);
        return 0;
}

Now execute it many times and build a histogram with Python:

import subprocess

d = {}
for i in range(1000000):
    output = subprocess.check_output('./thread_stack_heap_hysto')
    key = int(output, 16)
    d[key] = d.get(key, 0) + 1

print('total: ', len(d))
for key in d:
    print(hex(key), d[key])


and the result:
$ python get_hysto.py
total:  16385


16385 is 0x4001; the extra 1 is there because the kernel's maximum user
address is one page less than 2^48.
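A quick sanity check on that count (a sketch; the numbers come from the run above):

```python
# 2**14 guessable 64MB slots, plus one extra because the top of the
# user address space is one page short of 2**48.
expected = 2 ** 14 + 1
print(expected, hex(expected))  # 16385 0x4001
```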

In summary, the heap should not be aligned on a 64-megabyte boundary; this
behaviour allows an attacker to brute-force the heap of a pthread-created
thread.

--
You are receiving this mail because:
You are on the CC list for the bug.

[Bug nptl/22853] Heap address of pthread_create thread is aligned.

https://sourceware.org/bugzilla/show_bug.cgi?id=22853

Ilya Smith <blackzert at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |blackzert at gmail dot com


[Bug nptl/22853] Heap address of pthread_create thread is aligned.

https://sourceware.org/bugzilla/show_bug.cgi?id=22853

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fweimer at redhat dot com
              Flags|                            |security-

--- Comment #1 from Florian Weimer <fweimer at redhat dot com> ---
Flagging as security- because this is just hardening.


[Bug nptl/22853] Heap address of pthread_create thread is aligned.

https://sourceware.org/bugzilla/show_bug.cgi?id=22853

--- Comment #2 from Ilya Smith <blackzert at gmail dot com> ---
Hello,

Can you please explain what exactly this hardening is?
If this is security hardening, then this should be a security bug;
but if you mean something different, please explain.

From my point of view this bug is purely about security, because it leads to
an ASLR bypass.


[Bug nptl/22853] Heap address of pthread_create thread is aligned.

https://sourceware.org/bugzilla/show_bug.cgi?id=22853

Andreas Schwab <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Alias|                            |CVE-2019-1010025


[Bug nptl/22853] Heap address of pthread_create thread is aligned.

https://sourceware.org/bugzilla/show_bug.cgi?id=22853

Yann Droneaud <ydroneaud at opteya dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ydroneaud at opteya dot com


[Bug nptl/22853] Heap address of pthread_create thread is aligned.

https://sourceware.org/bugzilla/show_bug.cgi?id=22853

--- Comment #3 from Florian Weimer <fweimer at redhat dot com> ---
(In reply to Ilya Smith from comment #2)
> Hello,
>
> Can you please explain me what exactly this hardening is?
> If this hardening of security, this should be a security bug,
> But if you think something different, please explain me.
>
> From my point of view this bug only about security because lead to ASLR
> bypass.

ASLR bypass itself is not a vulnerability.  It may simplify exploitation of
another, unrelated vulnerability.


[Bug nptl/22853] Heap address of pthread_create thread is aligned.

https://sourceware.org/bugzilla/show_bug.cgi?id=22853

Adhemerval Zanella <adhemerval.zanella at linaro dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |adhemerval.zanella at linaro dot o
                   |                            |rg

--- Comment #4 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
My understanding of the reporter's intention is to point out that the malloc
heap alignment restriction lowers the total entropy available from the kernel,
which is an implementation detail of glibc. But I agree with Florian's
reasoning that this is a hardening feature.


[Bug nptl/22853] Heap address of pthread_create thread is aligned.

https://sourceware.org/bugzilla/show_bug.cgi?id=22853

Ismael Ripoll <iripoll at disca dot upv.es> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iripoll at disca dot upv.es

--- Comment #5 from Ismael Ripoll <iripoll at disca dot upv.es> ---

This patch solves the weakness discovered by Ilya Smith. The problem
is important on x86_64, but we think it is SEVERE on other systems
(4 ASLR bits is like no ASLR).

Ptmalloc aligns the arenas to a very large value. This large alignment
greatly reduces the ASLR entropy of all data dynamically allocated from
thread code. Allocations carried out from main() are not affected,
unless brk cannot expand the heap.

The impact of the alignment depends greatly on the architecture.  The
expected entropy should be at least that of mmap() on each system,
but the actual entropy is shown in the following table:

  +----------+--------+-----------------+
  | System   |  Mmap  |  Thread malloc  |
  +----------+--------+-----------------+
  | x86_64   |   28   |        14       |
  | i386     |    8   |         0       |  
  | x32      |    8   |         0       |
  | ARM      |    8   |         0       |
  | AARCH    |   18   |         4       |
  | PPC64    |   14   |         4       |
  | s390(64) |   11   |         4       |
  | s390     |   11   |         3       |
  +----------+--------+-----------------+

As can be seen, all systems but x86_64 are severely affected by
this weakness.

This patch removes the need to align arenas to HEAP_MAX_SIZE by
changing the macro heap_for_ptr(ptr). Arenas are randomized with the
same entropy as the rest of the mmapped objects. The entropy used to
randomize the threads' arenas is obtained from the ASLR value of the
libraries, by using the address of __arena_rnd:

__arena_rnd = ((unsigned long)&__arena_rnd) & (HEAP_MAX_SIZE-1) &
                   ~(pagesize-1);

This way, if the user disables ASLR (via randomize_va_space, setarch
-R, or when running under gdb), the entropy of the arena is automatically
disabled as well.
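The base-recovery arithmetic behind the patched heap_for_ptr() can be sketched as follows (the concrete __arena_rnd and base values here are hypothetical, chosen only to illustrate the masking):

```python
# heap_for_ptr() after the patch: every heap base sits at the same random
# page-aligned offset (__arena_rnd) from a HEAP_MAX_SIZE boundary, so the
# base is still recoverable from any pointer into the heap.
HEAP_MAX_SIZE = 64 * 1024 * 1024        # 2**26
PAGE_SIZE = 4096

arena_rnd = 0x1AB000                    # hypothetical: page-aligned, < HEAP_MAX_SIZE
heap_base = 0x7F2340000000 + arena_rnd  # hypothetical 64MB-aligned base + offset

def heap_for_ptr(ptr, rnd=arena_rnd):
    # ((ptr - rnd) & ~(HEAP_MAX_SIZE - 1)) | rnd, as in the patch.
    return ((ptr - rnd) & ~(HEAP_MAX_SIZE - 1)) | rnd

# Any pointer inside the 64MB heap region maps back to its base.
for offset in (0, 8, PAGE_SIZE, HEAP_MAX_SIZE - 1):
    assert heap_for_ptr(heap_base + offset) == heap_base
print(hex(heap_for_ptr(heap_base + 12345)))
```

Subtracting the offset first restores the old power-of-two masking, and or-ing it back re-applies the randomization, which is why the lookup stays a couple of ALU operations.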

Summary of the patch features:

*- Does not change the allocation policy.
*- Does not add new data structures (only one long variable).
*- The runtime overhead is almost undetectable (just 2 more CPU
   instructions per free: one "add" and one "or"). malloc is not
   affected.
*- It is fully backward compatible.
*- It completely restores the ASLR entropy of threads' heaps.

With the patch, the entropy on x86_64 is 28 bits (14 more than the
current one), and on i386 it is 8 bits (8 more), which is the same
entropy as mallocs made from the main thread.

Checked on x86_64-linux-gnu and i386-linux-gnu

Authors:
 Ismael Ripoll-Ripoll <[hidden email]>
 Hector Marco-Gisbert <[hidden email]>

---
 malloc/arena.c | 29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/malloc/arena.c b/malloc/arena.c
index cecdb7f4c4..8dd7b9f028 100644
--- a/malloc/arena.c
+++ b/malloc/arena.c
@@ -48,7 +48,7 @@

 /* A heap is a single contiguous memory region holding (coalesceable)
    malloc_chunks.  It is allocated with mmap() and always starts at an
-   address aligned to HEAP_MAX_SIZE.  */
+   address aligned to arena_rnd.  */

 typedef struct _heap_info
 {
@@ -122,10 +122,15 @@ int __malloc_initialized = -1;
         ptr = arena_get2 ((size), NULL);                                     \
   } while (0)

-/* find the heap and corresponding arena for a given ptr */
-
+/* find the heap and corresponding arena for a given ptr.  Note that
+   heap_info is not HEAP_MAX_SIZE aligned any more. But a random
+   offset from the expected alignment, known by the process. This way
+   it is fast to get the head of the area whereas it is ASLR compatible.
+*/
+static unsigned long __arena_rnd;
 #define heap_for_ptr(ptr) \
-  ((heap_info *) ((unsigned long) (ptr) & ~(HEAP_MAX_SIZE - 1)))
+  ((heap_info *) ((((unsigned long) ptr-__arena_rnd) & ~(HEAP_MAX_SIZE - 1)) \
+                  |  __arena_rnd))
 #define arena_for_chunk(ptr) \
   (chunk_main_arena (ptr) ? &main_arena : heap_for_ptr (ptr)->ar_ptr)

@@ -293,6 +298,11 @@ ptmalloc_init (void)

   __malloc_initialized = 0;

+  size_t pagesize = GLRO (dl_pagesize);
+  /* Get the entropy from the already existing ASLR. */
+  __arena_rnd = ((unsigned long)&__arena_rnd) & (HEAP_MAX_SIZE-1) &
+                ~(pagesize-1);
+
 #ifdef SHARED
   /* In case this libc copy is in a non-default namespace, never use brk.
      Likewise if dlopened from statically linked program.  */
@@ -439,7 +449,7 @@ dump_heap (heap_info *heap)
 /* If consecutive mmap (0, HEAP_MAX_SIZE << 1, ...) calls return decreasing
    addresses as opposed to increasing, new_heap would badly fragment the
    address space.  In that case remember the second HEAP_MAX_SIZE part
-   aligned to HEAP_MAX_SIZE from last mmap (0, HEAP_MAX_SIZE << 1, ...)
+   aligned to arena_rnd from last mmap (0, HEAP_MAX_SIZE << 1, ...)
    call (if it is already aligned) and try to reuse it next time.  We need
    no locking for it, as kernel ensures the atomicity for us - worst case
    we'll call mmap (addr, HEAP_MAX_SIZE, ...) for some value of addr in
@@ -490,6 +500,11 @@ new_heap (size_t size, size_t top_pad)
         {
           p2 = (char *) (((unsigned long) p1 + (HEAP_MAX_SIZE - 1))
                          & ~(HEAP_MAX_SIZE - 1));
+          /* The heap_info is at a random offset from the alignment to
+             HEAP_MAX_SIZE. */
+          p2 = (char *) ((unsigned long) p2 | __arena_rnd);
+          if (p1 + HEAP_MAX_SIZE <= p2)
+            p2 -= HEAP_MAX_SIZE;
           ul = p2 - p1;
           if (ul)
             __munmap (p1, ul);
@@ -500,12 +515,12 @@ new_heap (size_t size, size_t top_pad)
       else
         {
           /* Try to take the chance that an allocation of only HEAP_MAX_SIZE
-             is already aligned. */
+             is already aligned to __arena_rnd. */
           p2 = (char *) MMAP (0, HEAP_MAX_SIZE, PROT_NONE, MAP_NORESERVE);
           if (p2 == MAP_FAILED)
             return 0;

-          if ((unsigned long) p2 & (HEAP_MAX_SIZE - 1))
+          if (((unsigned long) p2 & (HEAP_MAX_SIZE - 1)) != __arena_rnd)
             {
               __munmap (p2, HEAP_MAX_SIZE);
               return 0;
--
2.20.1


[Bug nptl/22853] Heap address of pthread_create thread is aligned.

https://sourceware.org/bugzilla/show_bug.cgi?id=22853

Salvatore Bonaccorso <carnil at debian dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |carnil at debian dot org


[Bug nptl/22853] Heap address of pthread_create thread is aligned.

https://sourceware.org/bugzilla/show_bug.cgi?id=22853

--- Comment #6 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
(In reply to Ismael Ripoll from comment #5)

> [full patch text from comment #5 quoted]

Thanks for the work. However, patch discussion happens on the mailing list
[1]. Could you re-send it to libc-alpha?

It seems the alignment restriction was originally added to simplify the
heap_for_ptr macro, and I think this should be noted in the patch description
(stating that there is no other restriction on arena alignment).

Also, would it be possible to add a testcase? I am not sure which approach
would be best (a mailing-list discussion might give some ideas). Maybe
parametrize the mmap entropy information you compiled into arch-specific
files (similar to stackinfo.h), create N threads that each run M allocations,
record the output, and compare the histogram against the expected entropy
within a certain margin.

Finally, please also include the results of bench-malloc-{simple,thread}
with and without the patch.

[1] https://sourceware.org/mailman/listinfo/libc-alpha
