[PATCH] Update tcache double-free check

Sourceware - libc-alpha mailing list
Hello,

As we discussed, I've attached here the patch for updating the
double-free check in the tcache. The patch passes all of malloc's
existing tests (including the double free tests), and it was tested to
work as expected both with and without entropy.

Once again, I apologize for sending the patch as an attachment instead
of inlined in the body of the mail itself (same Gmail issues as
before).

I am aware that there might be whitespace issues with the patch;
please feel free to fix them on your end if possible.
Thanks,
Eyal Itkin.

[Attachment: 0001-Update-tcache-double-free-check.patch (4K)]

Re: [PATCH] Update tcache double-free check

On 7/24/20 9:37 AM, Eyal Itkin via Libc-alpha wrote:

> Hello,
>
> As we discussed, I've attached here the patch for updating the
> double-free check in the tcache. The patch passes all of malloc's
> existing tests (including the double free tests), and it was tested to
> work as expected both with and without entropy.
>
> Once again, I apologize for sending the patch as an attachment instead
> of inlined in the body of the mail itself (same Gmail issues as
> before).
>
> I am aware that there might be whitespace issues with the patch,
> please feel free to fix them on your end if possible.

Do we have any concerns about having all threads call getrandom()?

If even *one* thread succeeded at calling getrandom(), could we
use that to make ~tcache more random?

There is certainly a tradeoff here between the number of calls
to getrandom() and the values we use in the key. I haven't
measured the numbers to decide on this yet.

Do your changes show up in benchtests/bench-malloc-{simple,thread}.c?

--
Cheers,
Carlos.

Re: [PATCH] Update tcache double-free check

I've gone over the compiled binaries before/after the patch, and the
change on my x64 binary is as follows: no change to tcache_get() or
tcache_put(), and an addition of 2 asm instructions to _int_free(). If
we had only used ~tcache, this would have been reduced to an addition
of a single asm instruction in _int_free().

The patch's impact on benchtests/bench-malloc-{simple,thread} is
negligible, and is less than the natural noise I get on my GCP
instance between two consecutive benchmark executions.

Since the call to getrandom() happens only in tcache_init(), hence once
per thread, I'm fine with it being called for all threads, but that is
really for you to decide.

There is a possible fix to make sure that the entropy we get from
getrandom() is accumulated, thus helping all future threads that
might need it. We won't be able to update the keys of previous threads
(if they lacked entropy), but we can enrich future threads with the
entropy even if their own getrandom() call failed. This solution also
lets us use all of the entropy received from getrandom() even when we
get fewer bytes than requested.

The solution for such an entropy accumulation is quite simple:

static __thread size_t tcache_key = 0;
static size_t tcache_entropy = 0;

void tcache_init()
{
      ...
      /* Attempt to get a random key for the double-free checks.  */
      size_t local_entropy = 0;
      getrandom (&local_entropy, sizeof (local_entropy), GRND_NONBLOCK);
      /* Always accumulate it with the previously seen entropy.  */
      tcache_entropy ^= local_entropy;
      /* Always fall back to ~tcache in case we didn't have any
         entropy.  */
      tcache_key = tcache_entropy ^ ~((size_t) tcache);
}

As can be seen, if every call to getrandom() failed and we received
nothing from it, we would still use ~tcache, just like before. In
addition, this use of ~tcache also folds ASLR entropy into tcache_key,
so we lose nothing by keeping it.

Please advise me of the best way to proceed.

On Sat, Jul 25, 2020 at 12:05 AM Carlos O'Donell <[hidden email]> wrote:

>
> On 7/24/20 9:37 AM, Eyal Itkin via Libc-alpha wrote:
> > Hello,
> >
> > As we discussed, I've attached here the patch for updating the
> > double-free check in the tcache. The patch passes all of malloc's
> > existing tests (including the double free tests), and it was tested to
> > work as expected both with and without entropy.
> >
> > Once again, I apologize for sending the patch as an attachment instead
> > of inlined in the body of the mail itself (same Gmail issues as
> > before).
> >
> > I am aware that there might be whitespace issues with the patch,
> > please feel free to fix them on your end if possible.
>
> Do we have any concerns having all threads call getrandom()?
>
> If even *one* thread succeeded at calling getrandom() could we
> use that to make ~tcache more random?
>
> There is certainly a tradeoff here between the number of calls
> to getrandom() and the values we use in the key. I haven't
> measured the numbers to decide on this yet.
>
> Do your changes show up in benchtests/bench-malloc-{simple,thread}.c?
>
> --
> Cheers,
> Carlos.
>

Re: [PATCH] Update tcache double-free check

On 7/25/20 6:39 AM, Eyal Itkin wrote:

> I've gone over the compiled binaries before/after the patch, and the
> change on my x64 binary is as follows: no change to tcache_get() or
> tcache_put(), and an addition of 2 asm instructions to _int_free(). If
> we would have only used ~tcache this would have been reduced to an
> addition of a single asm instruction in _int_free().
>
> The patch's impact on benchtests/bench-malloc-(simple,thread) is
> negligible, and is less than the natural noise I get on my GCP
> instance between two consecutive benchmark executions.
>
> Since the call to getrandom() is only on tcache_init(), hence once per
> thread, I'm fine with it being called for all threads, but it is
> really for you to decide.
>
> There is a possible fix to make sure that the entropy we get from
> getrandom() will be accumulated, thus helping all future threads that
> might need it. We won't be able to update the keys of previous threads
> (if they lacked entropy), but we can enrich future threads with the
> entropy even if they failed to use getrandom(). This solution will
> also enable us to use all of the received entropy from getrandom()
> even when we get less than the wanted amount of bytes.
>
> The solution for such an entropy accumulation is quite simple:
>
> static __thread size_t tcache_key = 0;
> static size_t tcache_entropy = 0;
>
> void tcache_init()
> {
>       ...
>       /* Attempt to get a random key for the double-free checks.  */
>       size_t local_entropy = 0;
>       getrandom (&local_entropy, sizeof(local_entropy), GRND_NONBLOCK);
>       /* always accumulate it with the past seen entropy */
>       tcache_entropy ^= local_entropy;
>       /* always use the basic alternative of ~tcache in case we didn't
> have any entropy.  */
>       tcache_key = tcache_entropy ^ ~((size_t) tcache);
> }
>
> As can be seen, in case that all calls to getrandom() failed, and we
> received nothing from it, we would still use ~tcache, just like
> before. In addition, this use of "~tcache" also adds the entropy from
> ASLR to the tcache_key, so we don't lose anything from using it.
>
> Please advise me of the best way to proceed.

I like the accumulation of entropy. I'm a little worried about thread
startup and shutdown costs, but I haven't measured that yet. I'm also
busy getting the release out the door (I'm the release manager).
If you have thread startup/shutdown cost measurements, those would be
interesting for me to see as a reviewer.

I'd like to see Florian's opinion on this.

I fully expect that if you flesh this out you'll have to handle the
getrandom failure case, and the atomics required to avoid the data
race with multiple threads and global state. However, hold off on
that until we get a bit more consensus on whether we really want to go
to this level of complexity. It would be good to keep the entropy we
have in the event the system is under some kind of entropy starvation
at a later point (I've seen at least one of these for TCP sequence
numbers, read-and-drop from /dev/random, etc.).

--
Cheers,
Carlos.

Re: [PATCH] Update tcache double-free check

I've updated the patch to perform an atomic update operation on the
global entropy state, so as to avoid races when multiple threads are
initialized simultaneously. The patch now accumulates entropy between
threads, while still using the fallback of ~tcache to handle cases in
which no entropy is available yet.
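
Since the updated patch is attached rather than inlined, here is a rough
sketch of the shape of that change, using C11 atomics purely for
illustration (the actual patch would use glibc's internal atomic
primitives, and the names below are hypothetical, not taken from the
patch):

#include <stdatomic.h>
#include <stddef.h>
#include <sys/random.h>

static _Atomic size_t tcache_entropy;   /* process-wide accumulator */
static __thread size_t tcache_key;      /* per-thread double-free key */

static void
tcache_init_key (const void *tcache)
{
  size_t local_entropy = 0;

  /* Best effort: ignore the return value; on failure we just fold in 0.  */
  getrandom (&local_entropy, sizeof (local_entropy), GRND_NONBLOCK);

  /* Atomically fold the new bytes into the global accumulator so that
     concurrently initializing threads cannot lose each other's entropy.  */
  size_t previous = atomic_fetch_xor_explicit (&tcache_entropy, local_entropy,
                                               memory_order_relaxed);
  size_t accumulated = previous ^ local_entropy;

  /* ~tcache remains the fallback: even with zero accumulated entropy the
     key is still never a mapped heap address.  */
  tcache_key = accumulated ^ ~(size_t) tcache;
}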

As Carlos mentioned earlier, I guess you will want to discuss this
patch before integrating it. Also, feel free to update the patch if
needed in case I missed some whitespace or long-line coding style
issues.

Thanks again,
Eyal


On Sun, Jul 26, 2020 at 12:07 AM Carlos O'Donell <[hidden email]> wrote:

>
> On 7/25/20 6:39 AM, Eyal Itkin wrote:
> > I've gone over the compiled binaries before/after the patch, and the
> > change on my x64 binary is as follows: no change to tcache_get() or
> > tcache_put(), and an addition of 2 asm instructions to _int_free(). If
> > we would have only used ~tcache this would have been reduced to an
> > addition of a single asm instruction in _int_free().
> >
> > The patch's impact on benchtests/bench-malloc-(simple,thread) is
> > negligible, and is less than the natural noise I get on my GCP
> > instance between two consecutive benchmark executions.
> >
> > Since the call to getrandom() is only on tcache_init(), hence once per
> > thread, I'm fine with it being called for all threads, but it is
> > really for you to decide.
> >
> > There is a possible fix to make sure that the entropy we get from
> > getrandom() will be accumulated, thus helping all future threads that
> > might need it. We won't be able to update the keys of previous threads
> > (if they lacked entropy), but we can enrich future threads with the
> > entropy even if they failed to use getrandom(). This solution will
> > also enable us to use all of the received entropy from getrandom()
> > even when we get less than the wanted amount of bytes.
> >
> > The solution for such an entropy accumulation is quite simple:
> >
> > static __thread size_t tcache_key = 0;
> > static size_t tcache_entropy = 0;
> >
> > void tcache_init()
> > {
> >       ...
> >       /* Attempt to get a random key for the double-free checks.  */
> >       size_t local_entropy = 0;
> >       getrandom (&local_entropy, sizeof(local_entropy), GRND_NONBLOCK);
> >       /* always accumulate it with the past seen entropy */
> >       tcache_entropy ^= local_entropy;
> >       /* always use the basic alternative of ~tcache in case we didn't
> > have any entropy.  */
> >       tcache_key = tcache_entropy ^ ~((size_t) tcache);
> > }
> >
> > As can be seen, in case that all calls to getrandom() failed, and we
> > received nothing from it, we would still use ~tcache, just like
> > before. In addition, this use of "~tcache" also adds the entropy from
> > ASLR to the tcache_key, so we don't lose anything from using it.
> >
> > Please advise me of the best way to proceed.
>
> I like the accumulation of entropy. I'm a little worried about thread
> startup and shutdown costs, but I haven't measured that yet. I'm also
> busy getting the release out the door (I'm the release manager).
> If you have thread startup/shutdown costs that would be interesting
> for me to see as a reviewer.
>
> I'd like to see Florian's opinion on this.
>
> I fully expect that if you flesh this out you'll have to handle the
> getrandom failure case, and the atomics required to avoid the data
> race with multiple threads and global state. However, hold off on
> that until we get a big more consensus if we really want to go to
> this level of complexity. It would be good to keep the entropy we
> have in the event the system is under some kind of entropy starvation
> at a later point (I've seen at least one of these for TCP sequence
> numbers, read and drop from /dev/random etc.)
>
> --
> Cheers,
> Carlos.
>

[Attachment: 0001-Update-tcache-double-free-check.patch (4K)]

Re: [PATCH] Update tcache double-free check

On 8/10/20 9:07 AM, Eyal Itkin wrote:
> Updated the patch to perform an atomic update operation on the global
> entropy state, so to avoid races when multiple threads are initialized
> simultaneously. The patch now accumulates entropy between threads,
> while still using the backup case of ~tcache to take care of cases in
> which no entropy was yet to be available.
>
> As Carlos mentioned earlier, I guess you will want to discuss this
> patch before integrating it. Also, feel free to update the patch if
> needed in case I missed some whitespace / long line coding style.

Thank you for putting this together. I need to spend some time thinking
more deeply about this and considering where the right balance might lie
between a per-process value and a per-thread value, particularly with
respect to the tradeoff between code maintenance and security.

Do you have any strong opinions on the use of a per-thread vs. per-process
value?

This patch is third on my queue.

My queue is currently:
- NSS configuration reloading (DJ Delorie)
- DSO sorting DFS (Chung-Lin Tang)
- Tcache double-free check (Eyal Itkin)

--
Cheers,
Carlos.

Re: [PATCH] Update tcache double-free check

The design of this feature started by stating that "any arbitrary
value" will suffice, which is indeed enough to protect against a
double-free vulnerability when the attacker has no control over the
content of the memory that is going to be free()ed twice.

However, if we look at a proper threat model, we have the following:
1. An attacker that can't control the content of the memory to be free()ed
2. An attacker that has full/partial control over the content of the
memory to be free()ed

When examining the double-free check, we can see that a double-free
operation will only be caught if the EXACT key matches. This basically
means that if the attacker can modify even a single bit in the stored
key, the check will be bypassed, even without the attacker knowing the
original key.

This means that the only advantage of using a random value is to avoid
using a constant arbitrary value that might collide (repeatedly) with
specific values of the program, which would demand slower checks in the
tcache to rule out the possibility of a double free. In that sense, a
double free across threads will never be caught, as the freeing thread
still checks its own tcache struct and will never find the other
thread's buffer inside its own struct.

Usually, random values make a defense stronger because the attacker
has to guess each and every one of the bits. However, after performing
this analysis, it seems we have the opposite case here: if the attacker
can modify even a single bit, a double-free operation will be possible,
as the check will be bypassed.
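
To make this concrete, here is a small self-contained sketch (not glibc
code; the structure only loosely mimics malloc.c's tcache_entry, and the
key value is hypothetical) showing that the detection path is entered
only on an exact key match, so flipping any single bit of the stored key
before the second free() skips the check entirely:

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-in for the tcache entry in malloc.c.  */
typedef struct tcache_entry
{
  struct tcache_entry *next;
  size_t key;                   /* set on tcache_put, tested on free */
} tcache_entry;

static size_t tcache_key = 0xdeadbeefUL;   /* hypothetical per-thread key */

/* Mimics the gate in _int_free(): only when e->key matches exactly do we
   scan the bin for a double free.  */
static bool
double_free_suspected (const tcache_entry *e)
{
  return e->key == tcache_key;
}

int
main (void)
{
  tcache_entry chunk = { .next = NULL, .key = tcache_key };
  printf ("untampered chunk suspected: %d\n", double_free_suspected (&chunk));

  /* An attacker flips a single bit of the stored key (e.g. via a heap
     overflow or use-after-free write): the scan is skipped and the
     double free goes through.  */
  chunk.key ^= 1;
  printf ("tampered chunk suspected:   %d\n", double_free_suspected (&chunk));
  return 0;
}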

Due to that, I recommend using a random value per process, following
the original scheme of getrandom() with a fallback of ~tcache. Keep in
mind that the ASLR bits will supply some randomness, that ~tcache will
never be a value we expect to see in memory, and that it won't ever be
a mapped memory address that could be abused by a use-after-free
vulnerability the way the original key ("tcache") was exploited. I
don't see any value in a per-thread random key, as again, the random
value will only help us avoid collisions with values from the program's
memory.
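
As a rough sketch of that per-process scheme (hypothetical names, not
the actual patch; a real implementation would also publish the global
value atomically):

#include <stddef.h>
#include <sys/types.h>
#include <sys/random.h>

static size_t tcache_process_key;   /* 0 until getrandom() succeeds */

static size_t
compute_tcache_key (const void *tcache)
{
  if (tcache_process_key == 0)
    {
      size_t r = 0;
      if (getrandom (&r, sizeof r, GRND_NONBLOCK) == (ssize_t) sizeof r
          && r != 0)
        tcache_process_key = r;
    }

  /* Every thread shares the same random key; only an entropy-starved
     process degrades to the old ~tcache behaviour.  */
  return tcache_process_key != 0 ? tcache_process_key : ~(size_t) tcache;
}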

The overall scheme for accumulating randomness between threads might be
useful in the future for a cookie-style security check; however, I've
convinced myself that it won't be needed in this case.

Once a decision is made, please notify me and I'll update the patch
accordingly.


On Mon, Aug 10, 2020 at 4:12 PM Carlos O'Donell <[hidden email]> wrote:

>
> On 8/10/20 9:07 AM, Eyal Itkin wrote:
> > Updated the patch to perform an atomic update operation on the global
> > entropy state, so to avoid races when multiple threads are initialized
> > simultaneously. The patch now accumulates entropy between threads,
> > while still using the backup case of ~tcache to take care of cases in
> > which no entropy was yet to be available.
> >
> > As Carlos mentioned earlier, I guess you will want to discuss this
> > patch before integrating it. Also, feel free to update the patch if
> > needed in case I missed some whitespace / long line coding style.
>
> Thank you for putting this together. I need to spend some time thinking
> more deeply on this and considering where the right balance might lie
> between a per-process value and a per-thread value. Particularly with
> respect to the tradeoff between maintaining the code and security.
>
> Do you have any strong opinions on the use of a per-thread vs. per-process
> value?
>
> This patch is third on my queue.
>
> My queue is currently:
> - NSS configuration reloading (DJ Delorie)
> - DSO sorting DFS (Chung-Lin Tang)
> - Tcache double-free check (Eyal Itkin)
>
> --
> Cheers,
> Carlos.
>

Re: [PATCH] Update tcache double-free check

On 8/10/20 9:35 AM, Eyal Itkin wrote:
> The overall scheme for accumulating random between threads might be
> useful in the future, for a cookie-style security check, however I
> convinced myself that it won't be needed in this case.

Your analysis seems sensible to me. In the case of a cookie-style check
I think we can and *should* do this with some of the chunk metadata.
This is something Florian proposed a couple of years ago, but the
difficulty is that it's straight on the hot path, so you have to
reorder the operations to get performance as good as before. Given that
you're already touching the cache line with the metadata, there is a lot
you can hide, e.g. an xor of the size with a cookie.
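
Roughly, the idea is something like the following sketch (field and
function names here are only stand-ins, not an actual malloc.c change):

#include <stddef.h>

static size_t heap_cookie;   /* assume it is seeded once from getrandom() */

struct chunk_header
{
  size_t prev_size;
  size_t scrambled_size;     /* size ^ heap_cookie, written on allocation */
};

/* On allocation: hide the cookie in the size word we already write, so
   there is no extra cache-line traffic.  */
static void
chunk_set_size (struct chunk_header *c, size_t size)
{
  c->scrambled_size = size ^ heap_cookie;
}

/* On free: recover the size and validate it as usual (alignment, sane
   bounds); a corrupted word decodes to garbage and fails those checks.  */
static size_t
chunk_get_size (const struct chunk_header *c)
{
  return c->scrambled_size ^ heap_cookie;
}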

You're right, though, that the code you've written will be useful to
others.

I still need to review the various versions we have and get consensus.

--
Cheers,
Carlos.

Re: [PATCH] Update tcache double-free check

On 8/10/20 9:12 AM, Carlos O'Donell wrote:

> On 8/10/20 9:07 AM, Eyal Itkin wrote:
>> Updated the patch to perform an atomic update operation on the global
>> entropy state, so to avoid races when multiple threads are initialized
>> simultaneously. The patch now accumulates entropy between threads,
>> while still using the backup case of ~tcache to take care of cases in
>> which no entropy was yet to be available.
>>
>> As Carlos mentioned earlier, I guess you will want to discuss this
>> patch before integrating it. Also, feel free to update the patch if
>> needed in case I missed some whitespace / long line coding style.
>
> Thank you for putting this together. I need to spend some time thinking
> more deeply on this and considering where the right balance might lie
> between a per-process value and a per-thread value. Particularly with
> respect to the tradeoff between maintaining the code and security.
>
> Do you have any strong opinions on the use of a per-thread vs. per-process
> value?
>
> This patch is third on my queue.
>
> My queue is currently:
> - NSS configuration reloading (DJ Delorie)
> - DSO sorting DFS (Chung-Lin Tang)
> - Tcache double-free check (Eyal Itkin)
>

I got through the NSS configuration patches.

I'm working again on DSO sorting.

LPC 2020 is interrupting this week.

Thanks for your patience.

--
Cheers,
Carlos.

Re: [PATCH] Update tcache double-free check

Ping. Is there any update on this subject?


On Wed, Aug 26, 2020 at 11:40 PM Carlos O'Donell <[hidden email]> wrote:

>
> On 8/10/20 9:12 AM, Carlos O'Donell wrote:
> > On 8/10/20 9:07 AM, Eyal Itkin wrote:
> >> Updated the patch to perform an atomic update operation on the global
> >> entropy state, so to avoid races when multiple threads are initialized
> >> simultaneously. The patch now accumulates entropy between threads,
> >> while still using the backup case of ~tcache to take care of cases in
> >> which no entropy was yet to be available.
> >>
> >> As Carlos mentioned earlier, I guess you will want to discuss this
> >> patch before integrating it. Also, feel free to update the patch if
> >> needed in case I missed some whitespace / long line coding style.
> >
> > Thank you for putting this together. I need to spend some time thinking
> > more deeply on this and considering where the right balance might lie
> > between a per-process value and a per-thread value. Particularly with
> > respect to the tradeoff between maintaining the code and security.
> >
> > Do you have any strong opinions on the use of a per-thread vs. per-process
> > value?
> >
> > This patch is third on my queue.
> >
> > My queue is currently:
> > - NSS configuration reloading (DJ Delorie)
> > - DSO sorting DFS (Chung-Lin Tang)
> > - Tcache double-free check (Eyal Itkin)
> >
>
> I got through the NSS configuration patches.
>
> I'm working again on DSO sorting.
>
> LPC 2020 is interrupting this week.
>
> Thanks for your patience.
>
> --
> Cheers,
> Carlos.
>

Re: [PATCH] Update tcache double-free check

On 10/3/20 5:04 AM, Eyal Itkin wrote:
> Ping. Is there any update on this subject?

Not yet.

My queue is:
- DSO sorting patches.
- Tcache double-free
- nsswitch (again)

I've been blocked by other deliverables.

I've reviewed some other patches, but they are
easier to reason about and review ;-)

--
Cheers,
Carlos.

Re: [PATCH] Update tcache double-free check

Hi, the amount of email I receive each day from this mailing list is
really making it hard for me to keep track of my personal and other
work-related email (not to mention a "zero inbox" policy). Having it
for more than two months now while the patch is being examined is a bit
too much.

I'm unsubscribing from this list; when there is any update about the
proposed patch, please let me know via the ongoing thread (which
contains my email address), or by creating a new thread to which I'm
added explicitly.

Thanks,
Eyal.


On Sun, Oct 4, 2020 at 10:41 PM Carlos O'Donell <[hidden email]> wrote:

>
> On 10/3/20 5:04 AM, Eyal Itkin wrote:
> > Ping. Is there any update on this subject?
>
> Not yet.
>
> My queue is:
> - DSO sorting patches.
> - Tcache double-free
> - nsswitch (again)
>
> I've been blocked by other deliverables.
>
> I've reviewed some other patches, but they are
> easier to reason about and review ;-)
>
> --
> Cheers,
> Carlos.
>