[PATCH] explicit_bzero final

classic Classic list List threaded Threaded
21 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[PATCH] explicit_bzero final

Zack Weinberg-2
I've addressed Joseph's comments, moved the bits/string2.h inline to
string.h proper to get it out of the way of Wilco's work removing
bits/string2.h (this also means that it can be a proper inline rather
than a macro, although the necessary #if conditional is pretty ugly),
and squashed all three changesets together.

This is the last call for feedback.  I intend to check this in on
Wednesday the 14th (forty-eight hours from now) if there are no
hard objections.

zw

---
New string function explicit_bzero (from OpenBSD).

explicit_bzero(s, n) is the same as memset(s, 0, n), except that the
compiler is not allowed to delete a call to explicit_bzero even if the
memory pointed to by 's' is dead after the call.  We achieve this effect
by defining it to call memset() and then a second function,

    extern void __glibc_read_memory (const void *, size_t)
       __attribute_noinline__;

which does nothing -- but the compiler is prevented from knowing that
it does nothing, and so the pointer "escapes" and the memory is not
treated as dead.  (Concretely, __glibc_read_memory is forced
out-of-line, defined in a file containing nothing else, and comments
in both string/read_memory.c and string/Makefile document that it must
not be subject to link-time optimization.)

By exposing __glibc_read_memory to external callers, we can write
a fortify wrapper for explicit_bzero in terms of __memset_chk and
__glibc_read_memory.  (There is no out-of-line __bzero_chk defined in
libc.so, so I see no need to add an out-of-line __explicit_bzero_chk
either.)  We can also add a bits/string2.h-style inline that optimizes
explicit_bzero to memset + __glibc_read_memory, thus allowing the compiler
to use intrinsic code generation for memset.  I think this is worth doing
because it partially addresses the problem pointed out in the manual,
where explicit_bzero might _cause_ sensitive data to be copied out of
registers onto the stack.  With memset visible to the compiler, it will
just clear a scratch area on the stack and then call __glibc_read_memory;
it won't copy the sensitive data onto the stack first.  (But it still
doesn't erase the sensitive data _in_ the registers, so the warning in
the manual is still appropriate.)

There are two new tests: test-explicit_bzero.c verifies the
visible semantics in the same way as the existing test-bzero.c,
and tst-xbzero-opt.c verifies the not-being-optimized-out property.
The latter is conceptually based on a test written by Matthew Dempsky
for the OpenBSD regression suite.

The crypt() implementation has an immediate use for this new feature.
As this is in libcrypt, there is a GLIBC_PRIVATE, impl-namespace alias
for explicit_bzero (even though it doesn't get used, thanks to the
aforementioned optimization).  I did not find any other places that needed
explicit_bzero, but all the glue is in place should we want to use it in
libc proper in the future.  The legacy DES implementation wasn't bothering
to clear its buffers, so I added that, mostly for consistency's sake.

        * string/explicit_bzero.c, string/read_memory.c: New routines.
        * string/test-explicit_bzero.c, string/tst-xbzero-opt.c: New tests.
        * string/Makefile (routines, strop-tests, tests): Add them.
        * string/test-memset.c: Add ifdeffage for testing explicit_bzero.

        * string/string.h [__USE_MISC]: Declare explicit_bzero and
        __glibc_read_memory.  Define explicit_zero as an inline
        when possible.
        * string/bits/string3.h: Fortify explicit_bzero.
        * include/string.h: Declare __internal_glibc_read_memory,
        __explicit_bzero, and __internal_explicit_bzero.
        * debug/tst-chk1.c: Test fortification of explicit_bzero.

        * manual/string.texi: Document explicit_bzero.
        * NEWS: Mention addition of explicit_bzero.

        * crypt/crypt-entry.c (__crypt_r): Clear key-dependent intermediate
        data before returning, using explicit_bzero.
        * crypt/md5-crypt.c (__md5_crypt_r): Likewise.
        * crypt/sha256-crypt.c (__sha256_crypt_r): Likewise.
        * crypt/sha512-crypt.c (__sha512_crypt_r): Likewise.

        * string/Versions [GLIBC_2.25]:
        Add explicit_bzero and __glibc_read_memory.
        [GLIBC_PRIVATE]: Add __explicit_bzero.
        * sysdeps/arm/nacl/libc.abilist
        * sysdeps/unix/sysv/linux/aarch64/libc.abilist
        * sysdeps/unix/sysv/linux/alpha/libc.abilist
        * sysdeps/unix/sysv/linux/arm/libc.abilist
        * sysdeps/unix/sysv/linux/hppa/libc.abilist
        * sysdeps/unix/sysv/linux/i386/libc.abilist
        * sysdeps/unix/sysv/linux/ia64/libc.abilist
        * sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
        * sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
        * sysdeps/unix/sysv/linux/microblaze/libc.abilist
        * sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
        * sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
        * sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
        * sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
        * sysdeps/unix/sysv/linux/nios2/libc.abilist
        * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
        * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
        * sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist
        * sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist
        * sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
        * sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
        * sysdeps/unix/sysv/linux/sh/libc.abilist
        * sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
        * sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
        * sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libc.abilist
        * sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libc.abilist
        * sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist
        * sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
        * sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist:
        Add entries for explicit_bzero and __glibc_read_memory.
---
 NEWS                                               |   6 +
 crypt/crypt-entry.c                                |  11 +
 crypt/md5-crypt.c                                  |   8 +-
 crypt/sha256-crypt.c                               |  14 +-
 crypt/sha512-crypt.c                               |  14 +-
 debug/tst-chk1.c                                   |  28 ++
 include/string.h                                   |  16 ++
 manual/string.texi                                 | 101 +++++++
 string/Makefile                                    |  12 +-
 string/Versions                                    |  11 +
 string/bits/string3.h                              |   7 +
 string/explicit_bzero.c                            |  34 +++
 string/read_memory.c                               |  47 ++++
 string/string.h                                    |  22 ++
 string/test-explicit_bzero.c                       |  20 ++
 string/test-memset.c                               |  10 +-
 string/tst-xbzero-opt.c                            | 291 +++++++++++++++++++++
 sysdeps/arm/nacl/libc.abilist                      |   2 +
 sysdeps/unix/sysv/linux/aarch64/libc.abilist       |   2 +
 sysdeps/unix/sysv/linux/alpha/libc.abilist         |   2 +
 sysdeps/unix/sysv/linux/arm/libc.abilist           |   2 +
 sysdeps/unix/sysv/linux/hppa/libc.abilist          |   2 +
 sysdeps/unix/sysv/linux/i386/libc.abilist          |   2 +
 sysdeps/unix/sysv/linux/ia64/libc.abilist          |   2 +
 sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist |   2 +
 sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist   |   2 +
 sysdeps/unix/sysv/linux/microblaze/libc.abilist    |   2 +
 .../unix/sysv/linux/mips/mips32/fpu/libc.abilist   |   2 +
 .../unix/sysv/linux/mips/mips32/nofpu/libc.abilist |   2 +
 .../unix/sysv/linux/mips/mips64/n32/libc.abilist   |   2 +
 .../unix/sysv/linux/mips/mips64/n64/libc.abilist   |   2 +
 sysdeps/unix/sysv/linux/nios2/libc.abilist         |   2 +
 .../sysv/linux/powerpc/powerpc32/fpu/libc.abilist  |   2 +
 .../linux/powerpc/powerpc32/nofpu/libc.abilist     |   2 +
 .../sysv/linux/powerpc/powerpc64/libc-le.abilist   |   2 +
 .../unix/sysv/linux/powerpc/powerpc64/libc.abilist |   2 +
 sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist  |   2 +
 sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist  |   2 +
 sysdeps/unix/sysv/linux/sh/libc.abilist            |   2 +
 sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist |   2 +
 sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist |   2 +
 .../sysv/linux/tile/tilegx/tilegx32/libc.abilist   |   2 +
 .../sysv/linux/tile/tilegx/tilegx64/libc.abilist   |   2 +
 sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist  |   2 +
 sysdeps/unix/sysv/linux/x86_64/64/libc.abilist     |   2 +
 sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist    |   2 +
 46 files changed, 688 insertions(+), 22 deletions(-)
 create mode 100644 string/explicit_bzero.c
 create mode 100644 string/read_memory.c
 create mode 100644 string/test-explicit_bzero.c
 create mode 100644 string/tst-xbzero-opt.c

diff --git a/NEWS b/NEWS
index 64ed6593cb..43a5c4e538 100644
--- a/NEWS
+++ b/NEWS
@@ -84,6 +84,12 @@ Version 2.25
 * The functions strfromd, strfromf, and strfroml, from ISO/IEC TS 18661-1:2014,
   are added to libc.  They convert a floating-point number into string.
 
+* The function explicit_bzero, from OpenBSD, has been added to libc.  It is
+  intended to be used instead of memset() to erase sensitive data after use;
+  the compiler will not optimize out calls to explicit_bzero even if they
+  are "unnecessary" (in the sense that no _correct_ program can observe the
+  effects of the memory clear).
+
 * On ColdFire, MicroBlaze, Nios II and SH3, the float_t type is now defined
   to float instead of double.  This does not affect the ABI of any libraries
   that are part of the GNU C Library, but may affect the ABI of other
diff --git a/crypt/crypt-entry.c b/crypt/crypt-entry.c
index a7dfccaa36..2d72691ceb 100644
--- a/crypt/crypt-entry.c
+++ b/crypt/crypt-entry.c
@@ -141,6 +141,17 @@ __crypt_r (const char *key, const char *salt,
    * And convert back to 6 bit ASCII
    */
   _ufc_output_conversion_r (res[0], res[1], salt, data);
+
+#ifdef _LIBC
+  /*
+   * Erase key-dependent intermediate data.  Data dependent only on
+   * the salt is not considered sensitive.
+   */
+  __explicit_bzero (ktab, sizeof (ktab));
+  __explicit_bzero (data->keysched, sizeof (data->keysched));
+  __explicit_bzero (res, sizeof (res));
+#endif
+
   return data->crypt_3_buf;
 }
 weak_alias (__crypt_r, crypt_r)
diff --git a/crypt/md5-crypt.c b/crypt/md5-crypt.c
index 2243bc7aed..617ccd3ac1 100644
--- a/crypt/md5-crypt.c
+++ b/crypt/md5-crypt.c
@@ -288,13 +288,13 @@ __md5_crypt_r (const char *key, const char *salt, char *buffer, int buflen)
 #ifndef USE_NSS
   __md5_init_ctx (&ctx);
   __md5_finish_ctx (&ctx, alt_result);
-  memset (&ctx, '\0', sizeof (ctx));
-  memset (&alt_ctx, '\0', sizeof (alt_ctx));
+  __explicit_bzero (&ctx, sizeof (ctx));
+  __explicit_bzero (&alt_ctx, sizeof (alt_ctx));
 #endif
   if (copied_key != NULL)
-    memset (copied_key, '\0', key_len);
+    __explicit_bzero (copied_key, key_len);
   if (copied_salt != NULL)
-    memset (copied_salt, '\0', salt_len);
+    __explicit_bzero (copied_salt, salt_len);
 
   free (free_key);
   return buffer;
diff --git a/crypt/sha256-crypt.c b/crypt/sha256-crypt.c
index d768234879..53f00c041e 100644
--- a/crypt/sha256-crypt.c
+++ b/crypt/sha256-crypt.c
@@ -371,16 +371,16 @@ __sha256_crypt_r (const char *key, const char *salt, char *buffer, int buflen)
 #ifndef USE_NSS
   __sha256_init_ctx (&ctx);
   __sha256_finish_ctx (&ctx, alt_result);
-  memset (&ctx, '\0', sizeof (ctx));
-  memset (&alt_ctx, '\0', sizeof (alt_ctx));
+  __explicit_bzero (&ctx, sizeof (ctx));
+  __explicit_bzero (&alt_ctx, sizeof (alt_ctx));
 #endif
-  memset (temp_result, '\0', sizeof (temp_result));
-  memset (p_bytes, '\0', key_len);
-  memset (s_bytes, '\0', salt_len);
+  __explicit_bzero (temp_result, sizeof (temp_result));
+  __explicit_bzero (p_bytes, key_len);
+  __explicit_bzero (s_bytes, salt_len);
   if (copied_key != NULL)
-    memset (copied_key, '\0', key_len);
+    __explicit_bzero (copied_key, key_len);
   if (copied_salt != NULL)
-    memset (copied_salt, '\0', salt_len);
+    __explicit_bzero (copied_salt, salt_len);
 
   free (free_key);
   free (free_pbytes);
diff --git a/crypt/sha512-crypt.c b/crypt/sha512-crypt.c
index f404c88b20..86b63542ca 100644
--- a/crypt/sha512-crypt.c
+++ b/crypt/sha512-crypt.c
@@ -393,16 +393,16 @@ __sha512_crypt_r (const char *key, const char *salt, char *buffer, int buflen)
 #ifndef USE_NSS
   __sha512_init_ctx (&ctx);
   __sha512_finish_ctx (&ctx, alt_result);
-  memset (&ctx, '\0', sizeof (ctx));
-  memset (&alt_ctx, '\0', sizeof (alt_ctx));
+  __explicit_bzero (&ctx, sizeof (ctx));
+  __explicit_bzero (&alt_ctx, sizeof (alt_ctx));
 #endif
-  memset (temp_result, '\0', sizeof (temp_result));
-  memset (p_bytes, '\0', key_len);
-  memset (s_bytes, '\0', salt_len);
+  __explicit_bzero (temp_result, sizeof (temp_result));
+  __explicit_bzero (p_bytes, key_len);
+  __explicit_bzero (s_bytes, salt_len);
   if (copied_key != NULL)
-    memset (copied_key, '\0', key_len);
+    __explicit_bzero (copied_key, key_len);
   if (copied_salt != NULL)
-    memset (copied_salt, '\0', salt_len);
+    __explicit_bzero (copied_salt, salt_len);
 
   free (free_key);
   free (free_pbytes);
diff --git a/debug/tst-chk1.c b/debug/tst-chk1.c
index 478c2fbb81..e87a279fb2 100644
--- a/debug/tst-chk1.c
+++ b/debug/tst-chk1.c
@@ -160,6 +160,10 @@ do_test (void)
   if (memcmp (buf, "aabcdabc\0\0", 10))
     FAIL ();
 
+  explicit_bzero (buf + 6, 4);
+  if (memcmp (buf, "aabcda\0\0\0\0", 10))
+    FAIL ();
+
   strcpy (buf + 4, "EDCBA");
   if (memcmp (buf, "aabcEDCBA", 10))
     FAIL ();
@@ -201,6 +205,10 @@ do_test (void)
   if (memcmp (buf, "aabcdabc\0\0", 10))
     FAIL ();
 
+  explicit_bzero (buf + 6, l0 + 4);
+  if (memcmp (buf, "aabcda\0\0\0\0", 10))
+    FAIL ();
+
   strcpy (buf + 4, str1 + 5);
   if (memcmp (buf, "aabcEDCBA", 10))
     FAIL ();
@@ -256,6 +264,10 @@ do_test (void)
   if (memcmp (a.buf1, "aabcdabc\0\0", 10))
     FAIL ();
 
+  explicit_bzero (a.buf1 + 6, l0 + 4);
+  if (memcmp (a.buf1, "aabcda\0\0\0\0", 10))
+    FAIL ();
+
 #if __USE_FORTIFY_LEVEL < 2
   /* The following tests are supposed to crash with -D_FORTIFY_SOURCE=2
      and sufficient GCC support, as the string operations overflow
@@ -345,6 +357,14 @@ do_test (void)
   bzero (buf + 9, l0 + 2);
   CHK_FAIL_END
 
+  CHK_FAIL_START
+  explicit_bzero (buf + 9, 2);
+  CHK_FAIL_END
+
+  CHK_FAIL_START
+  explicit_bzero (buf + 9, l0 + 2);
+  CHK_FAIL_END
+
   CHK_FAIL_START
   strcpy (buf + 5, str1 + 5);
   CHK_FAIL_END
@@ -454,6 +474,14 @@ do_test (void)
   bzero (a.buf1 + 9, l0 + 2);
   CHK_FAIL_END
 
+  CHK_FAIL_START
+  explicit_bzero (a.buf1 + 9, 2);
+  CHK_FAIL_END
+
+  CHK_FAIL_START
+  explicit_bzero (a.buf1 + 9, l0 + 2);
+  CHK_FAIL_END
+
 # if __USE_FORTIFY_LEVEL >= 2
 #  define O 0
 # else
diff --git a/include/string.h b/include/string.h
index e145bfdb4c..56d02deb83 100644
--- a/include/string.h
+++ b/include/string.h
@@ -100,6 +100,22 @@ extern __typeof (memmem) __memmem;
 libc_hidden_proto (__memmem)
 libc_hidden_proto (__ffs)
 
+/* explicit_bzero is used in libcrypt.  */
+extern __typeof (explicit_bzero) __explicit_bzero;
+extern __typeof (explicit_bzero) __internal_explicit_bzero;
+libc_hidden_proto (__internal_explicit_bzero)
+extern __typeof (__glibc_read_memory) __internal_glibc_read_memory;
+libc_hidden_proto (__internal_glibc_read_memory)
+/* Honor string.h inlines when present.  */
+#if __GNUC_PREREQ (3,4) \
+  && ((defined __extern_always_inline \
+       && defined __OPTIMIZE__ && !defined __OPTIMIZE_SIZE__ \
+       && !defined __NO_INLINE__ && !defined __NO_STRING_INLINES) \
+      || (__USE_FORTIFY_LEVEL > 0 && defined __fortify_function))
+# define __explicit_bzero(s,n) explicit_bzero (s,n)
+# define __internal_explicit_bzero(s,n) explicit_bzero (s,n)
+#endif
+
 libc_hidden_builtin_proto (memchr)
 libc_hidden_builtin_proto (memcpy)
 libc_hidden_builtin_proto (mempcpy)
diff --git a/manual/string.texi b/manual/string.texi
index 1986357ee8..faa75ba501 100644
--- a/manual/string.texi
+++ b/manual/string.texi
@@ -34,6 +34,8 @@ too.
 * Search Functions::            Searching for a specific element or substring.
 * Finding Tokens in a String::  Splitting a string into tokens by looking
  for delimiters.
+* Erasing Sensitive Data::      Clearing memory which contains sensitive
+                                 data, after it's no longer needed.
 * strfry::                      Function for flash-cooking a string.
 * Trivial Encryption::          Obscuring data.
 * Encode Binary Data::          Encoding and Decoding of Binary Data.
@@ -2404,6 +2406,105 @@ contains no '/' bytes, then "." is returned.  The prototype for this
 function can be found in @file{libgen.h}.
 @end deftypefun
 
+@node Erasing Sensitive Data
+@section Erasing Sensitive Data
+
+Sensitive data, such as cryptographic keys, should be erased from
+memory after use, to reduce the risk that a bug will expose it to the
+outside world.  However, compiler optimizations may determine that an
+erasure operation is ``unnecessary,'' and remove it from the generated
+code, because no @emph{correct} program could access the variable or
+heap object containing the sensitive data after it's deallocated.
+Since erasure is a precaution against bugs, this optimization is
+inappropriate.
+
+The function @code{explicit_bzero} erases a block of memory, and
+guarantees that the compiler will not remove the erasure as
+``unnecessary.''
+
+@smallexample
+@group
+#include <string.h>
+
+extern void encrypt (const char *key, const char *in,
+                     char *out, size_t n);
+extern void genkey (const char *phrase, char *key);
+
+void encrypt_with_phrase (const char *phrase, const char *in,
+                          char *out, size_t n)
+@{
+  char key[16];
+  genkey (phrase, key);
+  encrypt (key, in, out, n);
+  explicit_bzero (key, 16);
+@}
+@end group
+@end smallexample
+
+@noindent
+In this example, if @code{memset}, @code{bzero}, or a hand-written
+loop had been used, the compiler might remove them as ``unnecessary.''
+
+@strong{Warning:} @code{explicit_bzero} does not guarantee that
+sensitive data is @emph{completely} erased from the computer's memory.
+There may be copies in temporary storage areas, such as registers and
+``scratch'' stack space; since these are invisible to the source code,
+a library function cannot erase them.
+
+Also, @code{explicit_bzero} only operates on RAM.  If a sensitive data
+object never needs to have its address taken other than to call
+@code{explicit_bzero}, it might be stored entirely in CPU registers
+@emph{until} the call to @code{explicit_bzero}.  Then it will be
+copied into RAM, the copy will be erased, and the original will remain
+intact.  Data in RAM is more likely to be exposed by a bug than data
+in registers, so this creates a brief window where the data is at
+greater risk of exposure than it would have been if the program didn't
+try to erase it at all.  @Theglibc{}'s implementation of
+@code{explicit_bzero} contains a hack that can prevent the data from
+being copied to RAM in this situation, but it may not always work.
+
+Declaring sensitive variables as @code{volatile} will make both the
+above problems @emph{worse}; a @code{volatile} variable will be stored
+in memory for its entire lifetime, and the compiler will make
+@emph{more} copies of it than it would otherwise have.  Attempting to
+erase a normal variable ``by hand'' through a
+@code{volatile}-qualified pointer doesn't work at all---because the
+variable itself is not @code{volatile}, some compilers will ignore the
+qualification on the pointer and remove the erasure anyway.
+
+Having said all that, in most situations, using @code{explicit_bzero}
+is better than not using it.  At present, the only way to do a more
+thorough job is to write the entire sensitive operation in assembly
+language.  We anticipate that future compilers will recognize calls to
+@code{explicit_bzero} and take appropriate steps to erase all the
+copies of the affected data, whereever they may be.
+
+@comment string.h
+@comment BSD
+@deftypefun void explicit_bzero (void *@var{block}, size_t @var{len})
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+
+@code{explicit_bzero} writes zero into @var{len} bytes of memory
+beginning at @var{block}, just as @code{bzero} would.  The zeroes are
+always written, even if the compiler could determine that this is
+``unnecessary'' because no correct program could read them back.
+
+@strong{Note:} The @emph{only} optimization that @code{explicit_bzero}
+disables is removal of ``unnecessary'' writes to memory.  The compiler
+can perform all the other optimizations that it could for a call to
+@code{memset}.  For instance, it may replace the function call with
+inline memory writes, and it may assume that @var{block} cannot be a
+null pointer.
+
+@strong{Portability Note:} This function first appeared in OpenBSD 5.5
+and has not been standardized.  Other systems may provide the same
+functionality under a different name, such as @code{explicit_memset},
+@code{memset_s}, or @code{SecureZeroMemory}.
+
+@Theglibc{} declares this function in @file{string.h}, but on other
+systems it may be in @file{strings.h} instead.
+@end deftypefun
+
 @node strfry
 @section strfry
 
diff --git a/string/Makefile b/string/Makefile
index 69d3f802fb..0b6486e509 100644
--- a/string/Makefile
+++ b/string/Makefile
@@ -41,20 +41,26 @@ routines := strcat strchr strcmp strcoll strcpy strcspn \
      addsep replace) \
    envz basename \
    strcoll_l strxfrm_l string-inlines memrchr \
-   xpg-strerror strerror_l
+   xpg-strerror strerror_l explicit_bzero
+
+# Attention future hackers trying to enable link-time optimization for
+# glibc: this file *must not* be subject to LTO.  It is added separately
+# to 'routines' to document this.  See comments in this file for details.
+routines += read_memory
 
 strop-tests := memchr memcmp memcpy memmove mempcpy memset memccpy \
    stpcpy stpncpy strcat strchr strcmp strcpy strcspn \
    strlen strncmp strncpy strpbrk strrchr strspn memmem \
    strstr strcasestr strnlen strcasecmp strncasecmp \
-   strncat rawmemchr strchrnul bcopy bzero memrchr
+   strncat rawmemchr strchrnul bcopy bzero memrchr \
+   explicit_bzero
 tests := tester inl-tester noinl-tester testcopy test-ffs \
    tst-strlen stratcliff tst-svc tst-inlcall \
    bug-strncat1 bug-strspn1 bug-strpbrk1 tst-bswap \
    tst-strtok tst-strxfrm bug-strcoll1 tst-strfry \
    bug-strtok1 $(addprefix test-,$(strop-tests)) \
    bug-envz1 tst-strxfrm2 tst-endian tst-svc2 \
-   tst-strtok_r bug-strcoll2 tst-cmp
+   tst-strtok_r bug-strcoll2 tst-cmp tst-xbzero-opt
 
 xtests = tst-strcoll-overflow
 
diff --git a/string/Versions b/string/Versions
index 475c1fdb64..73bb1cbf72 100644
--- a/string/Versions
+++ b/string/Versions
@@ -82,4 +82,15 @@ libc {
   }
   GLIBC_2.24 {
   }
+  GLIBC_2.25 {
+    # used by inlines in string.h and bits/string3.h
+    __glibc_read_memory;
+
+    # e*
+    explicit_bzero;
+  }
+  GLIBC_PRIVATE {
+    # used by libcrypt
+    __explicit_bzero;
+  }
 }
diff --git a/string/bits/string3.h b/string/bits/string3.h
index 8f13b65fa6..c1fcbd9e10 100644
--- a/string/bits/string3.h
+++ b/string/bits/string3.h
@@ -102,6 +102,13 @@ __NTH (bzero (void *__dest, size_t __len))
 {
   (void) __builtin___memset_chk (__dest, '\0', __len, __bos0 (__dest));
 }
+
+__fortify_function void
+__NTH (explicit_bzero (void *__dest, size_t __len))
+{
+  (void) __builtin___memset_chk (__dest, '\0', __len, __bos0 (__dest));
+  __glibc_read_memory (__dest, __len);
+}
 #endif
 
 __fortify_function char *
diff --git a/string/explicit_bzero.c b/string/explicit_bzero.c
new file mode 100644
index 0000000000..571595d1b5
--- /dev/null
+++ b/string/explicit_bzero.c
@@ -0,0 +1,34 @@
+/* Erasure of sensitive data.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <features.h>
+#undef __USE_STRING_INLINES
+#define __NO_STRING_INLINES
+#include <string.h>
+
+/* Set LEN bytes of S to 0.  The compiler will not delete a call to
+   this function, even if S is dead after the call.  */
+void
+__internal_explicit_bzero (void *s, size_t len)
+{
+  memset (s, '\0', len);
+  __internal_glibc_read_memory (s, len);
+}
+libc_hidden_def (__internal_explicit_bzero)
+strong_alias (__internal_explicit_bzero, __explicit_bzero)
+weak_alias (__internal_explicit_bzero, explicit_bzero)
diff --git a/string/read_memory.c b/string/read_memory.c
new file mode 100644
index 0000000000..8e08a46dcd
--- /dev/null
+++ b/string/read_memory.c
@@ -0,0 +1,47 @@
+/* "Read" a range of memory, from the caller's perspective.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+
+/* This function is an optimization fence.  It doesn't do anything,
+   but the compiler doesn't know that, and must assume that it reads
+   the block of memory pointed to by S and extending for LEN bytes.
+   This prevents any earlier write to that memory from being optimized
+   away.  For instance, explicit_bzero uses this function to ensure
+   that its erasure of a piece of sensitive data is preserved.
+
+   In order to achieve the desired effect, this function must never,
+   under any circumstances, be inlined or subjected to inter-
+   procedural optimization.  string.h declares this function with
+   attributes that, in conjunction with the no-op asm insert, are
+   sufficient to prevent problems in the current (2016) generation of
+   compilers, but *only if* this file is *not* compiled with -flto.
+   At present, this is not an issue since glibc is never compiled with
+   -flto, but should that ever change, this file must be excepted.
+
+   The 'volatile' below is technically not necessary but is included
+   for explicitness.  */
+
+void
+internal_function
+__internal_glibc_read_memory(const void *s, size_t len)
+{
+  asm volatile ("");
+}
+libc_hidden_def (__internal_glibc_read_memory)
+strong_alias (__internal_glibc_read_memory, __glibc_read_memory)
diff --git a/string/string.h b/string/string.h
index b103e64912..1ef33101f8 100644
--- a/string/string.h
+++ b/string/string.h
@@ -453,6 +453,28 @@ extern void bcopy (const void *__src, void *__dest, size_t __n)
 /* Set N bytes of S to 0.  */
 extern void bzero (void *__s, size_t __n) __THROW __nonnull ((1));
 
+/* As bzero, but the compiler will not delete a call to this
+   function, even if S is dead after the call.  */
+extern void explicit_bzero (void *__s, size_t __n) __THROW __nonnull ((1));
+
+/* Optimization fence, used by the inline versions of explicit_bzero
+   below and in bits/string3.h.  */
+extern void __glibc_read_memory (const void *__s, size_t __n)
+     __THROW __nonnull ((1)) __attribute_noinline__;
+
+# if __GNUC_PREREQ (3,4) && defined __extern_always_inline \
+  && defined __OPTIMIZE__ && !defined __OPTIMIZE_SIZE__ \
+  && !defined __NO_INLINE__ && !defined __NO_STRING_INLINES \
+  && (__USE_FORTIFY_LEVEL == 0 || !defined __fortify_function)
+
+__extern_always_inline void
+__NTH (explicit_bzero (void *__s, size_t __n))
+{
+  memset (__s, '\0', __n);
+  __glibc_read_memory (__s, __n);
+}
+# endif
+
 /* Compare N bytes of S1 and S2 (same as memcmp).  */
 extern int bcmp (const void *__s1, const void *__s2, size_t __n)
      __THROW __attribute_pure__ __nonnull ((1, 2));
diff --git a/string/test-explicit_bzero.c b/string/test-explicit_bzero.c
new file mode 100644
index 0000000000..5a4543b41a
--- /dev/null
+++ b/string/test-explicit_bzero.c
@@ -0,0 +1,20 @@
+/* Test and measure explicit_bzero.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+#define TEST_EXPLICIT_BZERO
+#define TEST_BZERO
+#include "test-memset.c"
diff --git a/string/test-memset.c b/string/test-memset.c
index fee3bdf37f..7ca4f2076c 100644
--- a/string/test-memset.c
+++ b/string/test-memset.c
@@ -19,7 +19,11 @@
 
 #define TEST_MAIN
 #ifdef TEST_BZERO
-# define TEST_NAME "bzero"
+# ifdef TEST_EXPLICIT_BZERO
+#  define TEST_NAME "explicit_bzero"
+# else
+#  define TEST_NAME "bzero"
+# endif
 #else
 # ifndef WIDE
 #  define TEST_NAME "memset"
@@ -56,7 +60,11 @@ void builtin_bzero (char *, size_t);
 
 IMPL (simple_bzero, 0)
 IMPL (builtin_bzero, 0)
+#ifdef TEST_EXPLICIT_BZERO
+IMPL (explicit_bzero, 1)
+#else
 IMPL (bzero, 1)
+#endif
 
 void
 simple_bzero (char *s, size_t n)
diff --git a/string/tst-xbzero-opt.c b/string/tst-xbzero-opt.c
new file mode 100644
index 0000000000..a777432e1c
--- /dev/null
+++ b/string/tst-xbzero-opt.c
@@ -0,0 +1,291 @@
+/* Test that explicit_bzero block clears are not optimized out.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This test is conceptually based on a test designed by Matthew
+   Dempsky for the OpenBSD regression suite:
+   <openbsd>/src/regress/lib/libc/explicit_bzero/explicit_bzero.c.
+   The basic idea is, we have a function that contains a
+   block-clearing operation (not necessarily explicit_bzero), after
+   which the block is dead, in the compiler-jargon sense.  Execute
+   that function while running on a user-allocated alternative
+   stack. Then we have another pointer to the memory region affected
+   by the block clear -- namely, the original allocation for the
+   alternative stack -- and can find out whether it actually happened.
+
+   The OpenBSD test uses sigaltstack and SIGUSR1 to get onto an
+   alternative stack.  This causes a number of awkward problems; some
+   operating systems (e.g. Solaris and OSX) wipe the signal stack upon
+   returning to the normal stack, there's no way to be sure that other
+   processes running on the same system will not interfere, and the
+   signal stack is very small so it's not safe to call printf there.
+   This implementation instead uses the <ucontext.h> coroutine
+   interface.  The coroutine stack is still too small to safely use
+   printf, but we know the OS won't erase it, so we can do all the
+   checks and printing from the normal stack.
+
+   If this test begins to fail with a new compiler, investigate
+   whether __glibc_read_memory is still doing its job.  */
+
+#define _GNU_SOURCE 1
+
+#include <errno.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <ucontext.h>
+#include <unistd.h>
+
+/* A byte pattern that is unlikely to occur by chance: the first 16
+   prime numbers (OEIS A000040).  */
+static const unsigned char test_pattern[16] =
+{
+  2, 3, 5, 7,  11, 13, 17, 19,  23, 29, 31, 37,  41, 43, 47, 53
+};
+
+/* Immediately after each subtest returns, we call swapcontext to get
+   back onto the main stack.  That call might itself overwrite the
+   test pattern, so we fill a modest-sized buffer with copies of it
+   and check whether any of them survived.  */
+
+#define PATTERN_SIZE (sizeof test_pattern)
+#define PATTERN_REPS 32
+#define TEST_BUFFER_SIZE (PATTERN_SIZE * PATTERN_REPS)
+
+/* There are three subtests, two of which are sanity checks.
+   Each test follows this sequence:
+
+     main                      coroutine
+     ----                      --------
+     advance cur_subtest
+     swap
+                               call setup function
+                                 prepare test buffer
+                                 swap
+     verify that buffer
+     was filled in
+     swap
+                                 possibly clear buffer
+                                 return
+                               swap
+     check buffer again,
+     according to test
+     expectation
+
+   In the "no_clear" case, we don't do anything to the test buffer
+   between preparing it and letting it go out of scope, and we expect
+   to find it.  This confirms that the test buffer does get filled in
+   and we can find it from the stack buffer.  In the "ordinary_clear"
+   case, we clear it using memset, and we expect to find it.  This
+   confirms that the compiler can optimize out block clears in this
+   context; if it can't, the real test might be succeeding for the
+   wrong reason.  Finally, the "explicit_clear" case uses
+   explicit_bzero and expects _not_ to find the test buffer, which is
+   the real test.  */
+
+static ucontext_t uc_main, uc_co;
+
+/* Always check the test buffer immediately after filling it; this
+   makes externally visible side effects depend on the buffer existing
+   and having been filled in.  */
+static void
+prepare_test_buffer (unsigned char *buf)
+{
+  for (unsigned int i = 0; i < PATTERN_REPS; i++)
+    memcpy (buf + i*PATTERN_SIZE, test_pattern, PATTERN_SIZE);
+
+  if (swapcontext (&uc_co, &uc_main))
+    abort ();
+}
+
+static void
+setup_no_clear (void)
+{
+  unsigned char buf[TEST_BUFFER_SIZE];
+  prepare_test_buffer (buf);
+}
+
+static void
+setup_ordinary_clear (void)
+{
+  unsigned char buf[TEST_BUFFER_SIZE];
+  prepare_test_buffer (buf);
+  memset (buf, 0, TEST_BUFFER_SIZE);
+}
+
+static void
+setup_explicit_clear (void)
+{
+  unsigned char buf[TEST_BUFFER_SIZE];
+  prepare_test_buffer (buf);
+  explicit_bzero (buf, TEST_BUFFER_SIZE);
+}
+
+enum test_expectation { EXPECT_NONE, EXPECT_SOME, EXPECT_ALL };
+struct subtest
+{
+  void (*setup_subtest) (void);
+  const char *label;
+  enum test_expectation expected;
+};
+static const struct subtest *cur_subtest;
+
+static const struct subtest subtests[] =
+{
+  { setup_no_clear,       "no clear",       EXPECT_SOME },
+  { setup_ordinary_clear, "ordinary clear", EXPECT_SOME },
+  { setup_explicit_clear, "explicit clear", EXPECT_NONE },
+  { 0,                    0,                -1          }
+};
+
+static void
+test_coroutine (void)
+{
+  while (cur_subtest->setup_subtest)
+    {
+      cur_subtest->setup_subtest ();
+      if (swapcontext (&uc_co, &uc_main))
+ abort ();
+    }
+}
+
+/* All the code above this point runs on the coroutine stack.
+   All the code below this point runs on the main stack.  */
+
+static int test_status;
+static unsigned char *co_stack_buffer;
+static size_t co_stack_size;
+
+static unsigned int
+count_test_patterns (unsigned char *buf, size_t bufsiz)
+{
+  unsigned char *first = memmem (buf, bufsiz, test_pattern, PATTERN_SIZE);
+  if (!first)
+    return 0;
+  unsigned int cnt = 0;
+  for (unsigned int i = 0; i < PATTERN_REPS; i++)
+    {
+      unsigned char *p = first + i*PATTERN_SIZE;
+      if (p + PATTERN_SIZE - buf > bufsiz)
+ break;
+      if (memcmp (p, test_pattern, PATTERN_SIZE) == 0)
+ cnt++;
+    }
+  return cnt;
+}
+
+static void
+check_test_buffer (enum test_expectation expected,
+   const char *label, const char *stage)
+{
+  unsigned int cnt = count_test_patterns (co_stack_buffer, co_stack_size);
+  switch (expected)
+    {
+    case EXPECT_NONE:
+      if (cnt == 0)
+ printf ("PASS: %s/%s: expected 0 got %d\n", label, stage, cnt);
+      else
+ {
+  printf ("FAIL: %s/%s: expected 0 got %d\n", label, stage, cnt);
+  test_status = 1;
+ }
+      break;
+
+    case EXPECT_SOME:
+      if (cnt > 0)
+ printf ("PASS: %s/%s: expected some got %d\n", label, stage, cnt);
+      else
+ {
+  printf ("FAIL: %s/%s: expected some got 0\n", label, stage);
+  test_status = 1;
+ }
+      break;
+
+     case EXPECT_ALL:
+      if (cnt == PATTERN_REPS)
+ printf ("PASS: %s/%s: expected %d got %d\n", label, stage,
+ PATTERN_REPS, cnt);
+      else
+ {
+  printf ("FAIL: %s/%s: expected %d got %d\n", label, stage,
+  PATTERN_REPS, cnt);
+  test_status = 1;
+ }
+      break;
+
+    default:
+      printf ("ERROR: %s/%s: invalid value for 'expected' = %d\n",
+      label, stage, (int)expected);
+      test_status = 1;
+    }
+}
+
+static void
+test_loop (void)
+{
+  cur_subtest = subtests;
+  while (cur_subtest->setup_subtest)
+    {
+      if (swapcontext (&uc_main, &uc_co))
+ abort ();
+      check_test_buffer (EXPECT_ALL, cur_subtest->label, "prepare");
+      if (swapcontext (&uc_main, &uc_co))
+ abort ();
+      check_test_buffer (cur_subtest->expected, cur_subtest->label, "test");
+      cur_subtest++;
+    }
+  /* Terminate the coroutine.  */
+  if (swapcontext (&uc_main, &uc_co))
+    abort ();
+}
+
+int
+do_test (void)
+{
+  size_t page_alignment = sysconf (_SC_PAGESIZE);
+  if (page_alignment < sizeof (void *))
+    page_alignment = sizeof (void *);
+
+  co_stack_size = SIGSTKSZ + TEST_BUFFER_SIZE;
+  if (co_stack_size < page_alignment * 4)
+    co_stack_size = page_alignment * 4;
+
+  void *p;
+  int err = posix_memalign (&p, page_alignment, co_stack_size);
+  if (err || !p)
+    {
+      printf ("ERROR: allocating alt stack: %s\n", strerror (err));
+      return 2;
+    }
+  co_stack_buffer = p;
+
+  if (getcontext (&uc_co))
+    {
+      printf ("ERROR: allocating coroutine context: %s\n", strerror (err));
+      return 2;
+    }
+  uc_co.uc_stack.ss_sp   = co_stack_buffer;
+  uc_co.uc_stack.ss_size = co_stack_size;
+  uc_co.uc_link          = &uc_main;
+  makecontext (&uc_co, test_coroutine, 0);
+
+  test_loop ();
+  return test_status;
+}
+
+#include <support/test-driver.c>
diff --git a/sysdeps/arm/nacl/libc.abilist b/sysdeps/arm/nacl/libc.abilist
index 0a39f4d13e..8583317ad3 100644
--- a/sysdeps/arm/nacl/libc.abilist
+++ b/sysdeps/arm/nacl/libc.abilist
@@ -1843,6 +1843,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 gnu_dev_major F
diff --git a/sysdeps/unix/sysv/linux/aarch64/libc.abilist b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
index 8d3435f210..e5bdd2138c 100644
--- a/sysdeps/unix/sysv/linux/aarch64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
@@ -2090,6 +2090,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/alpha/libc.abilist b/sysdeps/unix/sysv/linux/alpha/libc.abilist
index 2381616b71..693a97217f 100644
--- a/sysdeps/unix/sysv/linux/alpha/libc.abilist
+++ b/sysdeps/unix/sysv/linux/alpha/libc.abilist
@@ -2001,6 +2001,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/arm/libc.abilist b/sysdeps/unix/sysv/linux/arm/libc.abilist
index e2dbf6fee7..51b21726d6 100644
--- a/sysdeps/unix/sysv/linux/arm/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arm/libc.abilist
@@ -91,6 +91,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/hppa/libc.abilist b/sysdeps/unix/sysv/linux/hppa/libc.abilist
index 231f039e69..da8d858217 100644
--- a/sysdeps/unix/sysv/linux/hppa/libc.abilist
+++ b/sysdeps/unix/sysv/linux/hppa/libc.abilist
@@ -1855,6 +1855,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/i386/libc.abilist b/sysdeps/unix/sysv/linux/i386/libc.abilist
index 5d0159348b..ea34f48b0a 100644
--- a/sysdeps/unix/sysv/linux/i386/libc.abilist
+++ b/sysdeps/unix/sysv/linux/i386/libc.abilist
@@ -2013,6 +2013,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/ia64/libc.abilist b/sysdeps/unix/sysv/linux/ia64/libc.abilist
index 00ef4d77ab..df0e999d9a 100644
--- a/sysdeps/unix/sysv/linux/ia64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/ia64/libc.abilist
@@ -1877,6 +1877,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
index 7de0b0e9bd..f5fda4820d 100644
--- a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
@@ -92,6 +92,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
index 93e1124592..1e5ce204ae 100644
--- a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
@@ -1969,6 +1969,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/microblaze/libc.abilist b/sysdeps/unix/sysv/linux/microblaze/libc.abilist
index da70083bdd..4a2b238a60 100644
--- a/sysdeps/unix/sysv/linux/microblaze/libc.abilist
+++ b/sysdeps/unix/sysv/linux/microblaze/libc.abilist
@@ -2090,6 +2090,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
index 570bbf6791..a48d869967 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
@@ -1944,6 +1944,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
index 5022e5a186..de591fc28d 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
@@ -1942,6 +1942,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
index ca9f91924f..673393b340 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
@@ -1940,6 +1940,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
index 2a090db5c6..cfc03d9018 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
@@ -1935,6 +1935,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/nios2/libc.abilist b/sysdeps/unix/sysv/linux/nios2/libc.abilist
index d10bb32513..58d148dffa 100644
--- a/sysdeps/unix/sysv/linux/nios2/libc.abilist
+++ b/sysdeps/unix/sysv/linux/nios2/libc.abilist
@@ -2131,6 +2131,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
index a7e688bed9..4d6efdd6b6 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
@@ -1973,6 +1973,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
index 8c32c33076..a0f8608119 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
@@ -1978,6 +1978,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist
index 2a13e49a9e..a097b1c317 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist
@@ -2178,6 +2178,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist
index 4f29a6db6e..0e7fac196f 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist
@@ -92,6 +92,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
index 1e9d4f4d91..ad93e221e0 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
@@ -1973,6 +1973,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
index 6b3d9320e1..7b53df33fb 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
@@ -1874,6 +1874,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/sh/libc.abilist b/sysdeps/unix/sysv/linux/sh/libc.abilist
index 93b79d3758..425ae6cc28 100644
--- a/sysdeps/unix/sysv/linux/sh/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sh/libc.abilist
@@ -1859,6 +1859,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
index a29b0c149b..cad6a4706d 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
@@ -1965,6 +1965,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
index 8ac06e9b01..55c0ea6f7d 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
@@ -1903,6 +1903,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libc.abilist b/sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libc.abilist
index 208e7425ff..b95c3d4023 100644
--- a/sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libc.abilist
@@ -2097,6 +2097,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libc.abilist b/sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libc.abilist
index 3d7836d546..6fde0c43b0 100644
--- a/sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libc.abilist
@@ -2097,6 +2097,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist b/sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist
index 208e7425ff..b95c3d4023 100644
--- a/sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist
+++ b/sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist
@@ -2097,6 +2097,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
index 32cd1e07b4..1f77aadc9a 100644
--- a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
@@ -1854,6 +1854,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
diff --git a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
index 4ff7acf300..f528b85665 100644
--- a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
@@ -2097,6 +2097,8 @@ GLIBC_2.23 fts64_set F
 GLIBC_2.24 GLIBC_2.24 A
 GLIBC_2.24 quick_exit F
 GLIBC_2.25 GLIBC_2.25 A
+GLIBC_2.25 __glibc_read_memory F
+GLIBC_2.25 explicit_bzero F
 GLIBC_2.25 getentropy F
 GLIBC_2.25 getrandom F
 GLIBC_2.25 strfromd F
--
2.11.0

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] explicit_bzero final

Paul Eggert
A couple of very minor nits in the documentation:

> +Having said all that, in most situations, using @code{explicit_bzero}

Please remove second comma. Otherwise "in most situations" can be misread as
applying to "Having said all that".

> by defining it to call memset() and then a second function,
> ...
> +  intended to be used instead of memset() to erase sensitive data after use;

In the commit message and NEWS file, please use just 'memset' and not
'memset()', as the comments talk about the function, not about argument-free
calls to the function.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] explicit_bzero final

Florian Weimer-5
In reply to this post by Zack Weinberg-2
On 12/13/2016 12:06 AM, Zack Weinberg wrote:
> By exposing __glibc_read_memory to external callers, we can write
> a fortify wrapper for explicit_bzero in terms of __memset_chk and
> __glibc_read_memory.

I think this is quite wrong.

We already know that explicit_bzero will need special treatment from the
compiler, but the above may hide the fortified version.

I need to talk to some GCC people to see if the above is just plain
ugly, or something that can actively interfere with predictable
properties of a fortified call to explicit_bzero.

Thanks,
Florian
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] explicit_bzero final

Zack Weinberg-2
On Tue, Dec 13, 2016 at 2:02 AM, Florian Weimer <[hidden email]> wrote:
> On 12/13/2016 12:06 AM, Zack Weinberg wrote:
>> By exposing __glibc_read_memory to external callers, we can write
>> a fortify wrapper for explicit_bzero in terms of __memset_chk and
>> __glibc_read_memory.
>
> I need to talk to some GCC people to see if the above is just plain ugly, or
> something that can actively interfere with predictable properties of a
> fortified call to explicit_bzero.

It's probably good to loop in some GCC people, but off the top of my
head, I think it should be sufficient to have it treat a call to
'__glibc_read_memory' as triggering the same special behavior as a
call to 'explicit_bzero'.  Of course that would mean
__glibc_read_memory can't be used for other cases where we need a
"those writes are not dead" optimization fence, so maybe I should
rename it __explicit_bzero_fence or something like that.

zw
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] explicit_bzero final

Jeff Law
On 12/13/2016 08:53 AM, Zack Weinberg wrote:

> On Tue, Dec 13, 2016 at 2:02 AM, Florian Weimer <[hidden email]> wrote:
>> On 12/13/2016 12:06 AM, Zack Weinberg wrote:
>>> By exposing __glibc_read_memory to external callers, we can write
>>> a fortify wrapper for explicit_bzero in terms of __memset_chk and
>>> __glibc_read_memory.
>>
>> I need to talk to some GCC people to see if the above is just plain ugly, or
>> something that can actively interfere with predictable properties of a
>> fortified call to explicit_bzero.
>
> It's probably good to loop in some GCC people, but off the top of my
> head, I think it should be sufficient to have it treat a call to
> '__glibc_read_memory' as triggering the same special behavior as a
> call to 'explicit_bzero'.  Of course that would mean
> __glibc_read_memory can't be used for other cases where we need a
> "those writes are not dead" optimization fence, so maybe I should
> rename it __explicit_bzero_fence or something like that.
It's going to be hard for GCC to recognize the memset_chk+read_memory as
an explict_bzero.  It'll have to infer it from the code pattern, which
is inherently prone to misses.

I'd also worry that this style definition may be prone to goofs.  The
keys being that if LTO were to see inside __glibc_read_memory and
realize that the reads are dead and removes the reads.

That would in turn allow DSE to come along and potentially realize that
the memory written by builtin__memset_chk is never read -- and GCC would
wipe out the call to builtin__memset_chk.  That, of course, leaves the
secret data in memory and available for someone to try and extract.

I think a proper explicit_bzero is probably a better choice.  We ought
to be able to tag it properly and ensure that it never goes away.

Jeff
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] explicit_bzero final

Zack Weinberg-2
On Tue, Dec 13, 2016 at 5:46 PM, Jeff Law <[hidden email]> wrote:

> On 12/13/2016 08:53 AM, Zack Weinberg wrote:
>>
>> It's probably good to loop in some GCC people, but off the top of my
>> head, I think it should be sufficient to have it treat a call to
>> '__glibc_read_memory' as triggering the same special behavior as a
>> call to 'explicit_bzero'.  Of course that would mean
>> __glibc_read_memory can't be used for other cases where we need a
>> "those writes are not dead" optimization fence, so maybe I should
>> rename it __explicit_bzero_fence or something like that.
>
> It's going to be hard for GCC to recognize the memset_chk+read_memory as an
> explict_bzero.  It'll have to infer it from the code pattern, which is
> inherently prone to misses.

Maybe I don't understand what you have in mind, but I don't think you
have to infer anything from the code pattern.  Just treat
__explicit_bzero_fence the same as explicit_bzero itself.  The
arguments are the same and it won't be used for anything else.

I have no idea how hard this would be to implement in present-day GCC,
but the way I imagine explicit_bzero support working is: there's an
__attribute__ for variables (let's call it "sensitive") that triggers
a forward taint analysis: every data object transitively data or
control dependent on the "sensitive" variable also becomes
"sensitive".  This has no effect on data objects whose lifetime
extends beyond the function in which they were declared, but if they
don't, they are erased when their lifetime ends.

Then, the effect of the explicit_bzero and __explicit_bzero_fence
builtins just is to tag the variable whose address they receive with
the "sensitive" attribute; the explicit_bzero builtin also behaves as
__builtin_memset(s, 0, n) (this is in case the sensitive variable is
allocated on the heap or otherwise survives -- it still needs to be
erased when the programmer said it should be erased).  This might need
some sort of _backward_ data flow analysis to figure out which
source-code variable corresponds to the SSA name received by the
builtin, I dunno.

There's probably some wacky corner cases I haven't thought of, but I
don't see why in-glibc expansion of explicit_bzero to memset+fence
should be a problem.

From the glibc side of things, this inline (with or without
fortification) is highly desirable for two reasons.  Most important,
this is how we mitigate a problem with variables whose address is only
taken in a call to explicit_bzero.  With naive code generation, that
variable might live in CPU registers for its entire lifetime, but then
get copied into RAM just so explicit_bzero can erase it again, which
opens a window where the sensitive data is _more_ exposed than if the
programmer hadn't tried to erase it at all.  With the inline, the data
doesn't get copied out of registers.  (It doesn't get _erased_ from
the registers, but that was true either way.)

We also have a nasty interaction between internal PLT bypass and
ifuncs which means that a hypothetical __explicit_bzero_chk is
ridiculously difficult to implement.  I tried that once already and it
did not go well.

> I'd also worry that this style definition may be prone to goofs.  The keys
> being that if LTO were to see inside __glibc_read_memory and realize that
> the reads are dead and removes the reads.

Right now, LTO into glibc will have all kinds of other negative
consequences (Joseph can explain in more detail) and we don't even try
to support it.  I've carefully documented in the source code that, if
we ever do try to support LTO, __glibc_read_memory (/
__explicit_bzero_fence) must remain opaque to the compiler.

zw
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] explicit_bzero final

Zack Weinberg-2
Let me add that I will be traveling for the last two weeks of December
and I really don't want this to miss 2.25, so please let's get this
hashed out as quickly as possible.

zw
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] explicit_bzero final

Florian Weimer-5
In reply to this post by Zack Weinberg-2
On 12/14/2016 02:04 AM, Zack Weinberg wrote:
> We also have a nasty interaction between internal PLT bypass and
> ifuncs which means that a hypothetical __explicit_bzero_chk is
> ridiculously difficult to implement.  I tried that once already and it
> did not go well.

I'm looking into this aspect right now.

Florian
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] explicit_bzero final

Jeff Law
In reply to this post by Zack Weinberg-2
On 12/13/2016 06:04 PM, Zack Weinberg wrote:

> On Tue, Dec 13, 2016 at 5:46 PM, Jeff Law <[hidden email]> wrote:
>> On 12/13/2016 08:53 AM, Zack Weinberg wrote:
>>>
>>> It's probably good to loop in some GCC people, but off the top of my
>>> head, I think it should be sufficient to have it treat a call to
>>> '__glibc_read_memory' as triggering the same special behavior as a
>>> call to 'explicit_bzero'.  Of course that would mean
>>> __glibc_read_memory can't be used for other cases where we need a
>>> "those writes are not dead" optimization fence, so maybe I should
>>> rename it __explicit_bzero_fence or something like that.
>>
>> It's going to be hard for GCC to recognize the memset_chk+read_memory as an
>> explict_bzero.  It'll have to infer it from the code pattern, which is
>> inherently prone to misses.
>
> Maybe I don't understand what you have in mind, but I don't think you
> have to infer anything from the code pattern.  Just treat
> __explicit_bzero_fence the same as explicit_bzero itself.  The
> arguments are the same and it won't be used for anything else.
>
> I have no idea how hard this would be to implement in present-day GCC,
> but the way I imagine explicit_bzero support working is: there's an
> __attribute__ for variables (let's call it "sensitive") that triggers
> a forward taint analysis: every data object transitively data or
> control dependent on the "sensitive" variable also becomes
> "sensitive".  This has no effect on data objects whose lifetime
> extends beyond the function in which they were declared, but if they
> don't, they are erased when their lifetime ends.
The concept of tainting variables and tracking that property seems
distinct from making sure that the compiler properly handles
explicit_bzero.  And just to be clear, I see that property as having value.



>
> Then, the effect of the explicit_bzero and __explicit_bzero_fence
> builtins just is to tag the variable whose address they receive with
> the "sensitive" attribute; the explicit_bzero builtin also behaves as
> __builtin_memset(s, 0, n) (this is in case the sensitive variable is
> allocated on the heap or otherwise survives -- it still needs to be
> erased when the programmer said it should be erased).  This might need
> some sort of _backward_ data flow analysis to figure out which
> source-code variable corresponds to the SSA name received by the
> builtin, I dunno.
Interesting.  I hadn't thought about using these builtins to seed the
dataflow analysis.


>
> There's probably some wacky corner cases I haven't thought of, but I
> don't see why in-glibc expansion of explicit_bzero to memset+fence
> should be a problem.
My worry is ensuring that the compiler doesn't ever remove these stores
as dead.  And the easiest way I see to do that is to recognize the
explicit_bzero explicitly.  Otherwise I have to walk the use->def chain
to pair the fence with the bzero and then tag the bzero.  While that
ought to work, it just seems easier to recognize the bzero from the
start as special.


>
>> I'd also worry that this style definition may be prone to goofs.  The keys
>> being that if LTO were to see inside __glibc_read_memory and realize that
>> the reads are dead and removes the reads.
>
> Right now, LTO into glibc will have all kinds of other negative
> consequences (Joseph can explain in more detail) and we don't even try
> to support it.  I've carefully documented in the source code that, if
> we ever do try to support LTO, __glibc_read_memory (/
> __explicit_bzero_fence) must remain opaque to the compiler.
Understood, WRT LTO today.  I'm not worried about today, I'm worried
about what happens in the future when the assumptions we're making today
about what the compiler can and can not do may no longer hold.
Jeff
>
> zw
>

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] explicit_bzero final

Florian Weimer-5
In reply to this post by Florian Weimer-5
On 12/14/2016 02:15 PM, Florian Weimer wrote:
> On 12/14/2016 02:04 AM, Zack Weinberg wrote:
>> We also have a nasty interaction between internal PLT bypass and
>> ifuncs which means that a hypothetical __explicit_bzero_chk is
>> ridiculously difficult to implement.  I tried that once already and it
>> did not go well.
>
> I'm looking into this aspect right now.

This patch on top of yours implements proper explict_bzero and
__explicit_bzero_chk symbols.  I put it through build-many-glibcs, and
it results in the expected ABI everywhere.

The attached patch also fixes the compiler barrier, and removes some
unnecessary conditionals from include/string.h (because we know that a
current GCC is used for compiling glibc itself, so checking for that in
an internal header does not make sense).

It is up to the architecture maintainers to add optimized
implementations.  For explicit_bzero, this is often quite
straightforward because many architectures already have an optimized
bzero, and explicit_bzero can just be an alias.  But there is no
__bzero_chk.  IFUNC is transitive: if something is an IFUNC, all callers
in libc.so have to be an IFUNC, too.  Implementing __bzero_chk on top of
__memset_chk requires a tiny bit of argument adjustment, but in
practice, this means that all memset implementations need to be
duplicated, and a new IFUNC resolver has to be added.

Nevertheless, I think this is the way to proceed.

Thanks,
Florian


explicit_bzero.patch (30K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] explicit_bzero final

Zack Weinberg-2
On Wed, Dec 14, 2016 at 12:05 PM, Florian Weimer <[hidden email]> wrote:

> On 12/14/2016 02:15 PM, Florian Weimer wrote:
>> On 12/14/2016 02:04 AM, Zack Weinberg wrote:
>>>
>>> We also have a nasty interaction between internal PLT bypass and
>>> ifuncs which means that a hypothetical __explicit_bzero_chk is
>>> ridiculously difficult to implement.  I tried that once already and it
>>> did not go well.
>>
>> I'm looking into this aspect right now.
>
> This patch on top of yours implements proper explict_bzero and
> __explicit_bzero_chk symbols.  I put it through build-many-glibcs, and it
> results in the expected ABI everywhere.

I appreciate your having tried this; the patch has a number of
correctable problems (see below) but it does demonstrate that
__explicit_bzero_chk is not a lost cause.

The remaining question in my mind is whether, in the case where a
variable's address is only taken in a call to explicit_bzero, we
should give up on the "hack to prevent the data being copied to
memory" for the sake of hypothetical future GCC support.  That hack, I
remind you, is the inline expansion to memset+__glibc_read_memory.  We
made a huge fuss over that case in the manual, and a couple of people
were prepared to veto explicit_bzero altogether if we didn't do
something about it.  I am very reluctant to give it up, especially as
I'm still not convinced it's a problem for the compiler (see reply to
Jeff).

--- a/crypt/crypt-entry.c
+++ b/crypt/crypt-entry.c
@@ -142,15 +142,13 @@ __crypt_r (const char *key, const char *salt,
    */
   _ufc_output_conversion_r (res[0], res[1], salt, data);

-#ifdef _LIBC
   /*
    * Erase key-dependent intermediate data.  Data dependent only on
    * the salt is not considered sensitive.
    */
-  __explicit_bzero (ktab, sizeof (ktab));
-  __explicit_bzero (data->keysched, sizeof (data->keysched));
-  __explicit_bzero (res, sizeof (res));
-#endif
+  explicit_bzero (ktab, sizeof (ktab));
+  explicit_bzero (data->keysched, sizeof (data->keysched));
+  explicit_bzero (res, sizeof (res));

The _LIBC ifdeffage is vestigial, but should probably be left alone in
a patch that isn't about that.

libcrypt really does need to refer to __explicit_bzero, not
explicit_bzero.  Joseph can explain better than I can, but the
fundamental constraint is that the implementation of a standardized
function ('crypt' is POSIX) is not allowed to refer to nonstandard
user-namespace symbols.  This change should have triggered
linknamespace failures.

+/* This is the generic definition of __explicit_bzero_chk.  The
+   __explicit_bzero_chk symbol is used as the implementation of
+   explicit_bzero throughout glibc.  If this file is overriden by an
+   architecture, both __explicit_bzero_chk and
+   __explicit_bzero_chk_internal have to be defined (the latter not as
+   an IFUNC).  */

This file is not in sysdeps/generic, so it cannot be overridden (or is
that no longer the case? If so, why do we still have sysdeps/generic?)
and I don't think we need the capability to override it.  Better we
should get libc-internal references to memset going to the proper ifunc
for the architecture.

+  /* Compiler barrier.  */
+  asm volatile ("" ::: "memory");
+}

I do not understand why you have reverted to an older, inferior
compiler barrier.  This was extensively hashed out quite some time ago.

--- a/include/string.h
+++ b/include/string.h
@@ -100,20 +100,15 @@ extern __typeof (memmem) __memmem;
 libc_hidden_proto (__memmem)
 libc_hidden_proto (__ffs)

-/* explicit_bzero is used in libcrypt.  */
-extern __typeof (explicit_bzero) __explicit_bzero;
-extern __typeof (explicit_bzero) __internal_explicit_bzero;
-libc_hidden_proto (__internal_explicit_bzero)
-extern __typeof (__glibc_read_memory) __internal_glibc_read_memory;
-libc_hidden_proto (__internal_glibc_read_memory)
-/* Honor string.h inlines when present.  */
-#if __GNUC_PREREQ (3,4)                            \
-  && ((defined __extern_always_inline                    \
-       && defined __OPTIMIZE__ && !defined __OPTIMIZE_SIZE__        \
-       && !defined __NO_INLINE__ && !defined __NO_STRING_INLINES)    \
-      || (__USE_FORTIFY_LEVEL > 0 && defined __fortify_function))
-# define __explicit_bzero(s,n) explicit_bzero (s,n)
-# define __internal_explicit_bzero(s,n) explicit_bzero (s,n)
+#if IS_IN (libc)
+/* Avoid hidden reference to IFUNC symbol __explicit_bzero_chk.  */
+void __explicit_bzero_chk_internal (void *, size_t, size_t)
+  __THROW __nonnull ((1)) attribute_hidden;
+# define explicit_bzero(buf, len) \
+  __explicit_bzero_chk_internal (buf, len, __bos0 (buf))
+#elif !IS_IN (nonlib)
+void __explicit_bzero_chk (void *, size_t, size_t) __THROW __nonnull ((1));
+# define explicit_bzero(buf, len) __explicit_bzero_chk (buf, len, __bos0 (buf))
 #endif

Oh, I see why you're not getting linknamespace failures from the
libcrypt change: you've implicitly fortified all those calls, which
has the side effect of making them use an impl-namespace symbol.  It
makes sense as a testing strategy, but it doesn't feel like the right
move for the committed patch (better to leave that to an all-or-nothing
"fortify libc internally" switch, ne?)

What we _could_ do is

#if IS_IN (libc)
# define explicit_bzero(s, n) __internal_explicit_bzero (s, n)
#else
# define explicit_bzero(s, n) __explicit_bzero (s, n)
#endif

which would allow libcrypt's source code to use the unmangled names.
I think that's something we're trying to do for other string
functions, so perhaps it makes sense.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] explicit_bzero final

Joseph Myers
On Wed, 14 Dec 2016, Zack Weinberg wrote:

> This file is not in sysdeps/generic, so it cannot be overridden (or is
> that no longer the case? If so, why do we still have sysdeps/generic?)

Files outside sysdeps/generic can be overridden.

Internal headers used from more than one directory and also overridden by
some configurations belong in sysdeps/generic: include/ comes before
sysdeps so isn't suitable for headers that get overridden, while if the
header is in some other directory it's not visible for compilations within
another directory.  There may be other cases that need to be in
sysdeps/generic as well.

--
Joseph S. Myers
[hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] explicit_bzero final

Zack Weinberg-2
In reply to this post by Jeff Law
On Wed, Dec 14, 2016 at 11:58 AM, Jeff Law <[hidden email]> wrote:

> On 12/13/2016 06:04 PM, Zack Weinberg wrote:
>> On Tue, Dec 13, 2016 at 5:46 PM, Jeff Law <[hidden email]> wrote:
>>>
>>> It's going to be hard for GCC to recognize the memset_chk+read_memory as
>>> an explict_bzero.  It'll have to infer it from the code pattern, which is
>>> inherently prone to misses.
>>
>> Maybe I don't understand what you have in mind, but I don't think you
>> have to infer anything from the code pattern.  Just treat
>> __explicit_bzero_fence the same as explicit_bzero itself.  The
>> arguments are the same and it won't be used for anything else.
>>
>> I have no idea how hard this would be to implement in present-day GCC,
>> but the way I imagine explicit_bzero support working is: there's an
>> __attribute__ for variables (let's call it "sensitive") that triggers
>> a forward taint analysis: every data object transitively data or
>> control dependent on the "sensitive" variable also becomes
>> "sensitive".  This has no effect on data objects whose lifetime
>> extends beyond the function in which they were declared, but if they
>> don't, they are erased when their lifetime ends.
>
> The concept of tainting variables and tracking that property seems distinct
> from making sure that the compiler properly handles explicit_bzero.  And
> just to be clear, I see that property as having value.

The thing is, if you do the taint tracking, that's _all_ you need to
do, I think.

>> There's probably some wacky corner cases I haven't thought of, but I
>> don't see why in-glibc expansion of explicit_bzero to memset+fence
>> should be a problem.
>
> My worry is ensuring that the compiler doesn't ever remove these stores as
> dead.

You don't have to worry about the stores generated by memset+fence
because they will be redundant to stores generated by the taint
tracker.

Let's recap the motivating example for the inline, since it's been
awhile and Jeff may not have seen it:

```
typedef __SIZE_TYPE__ size_t;

extern void *memset (void *, int, size_t);
extern void explicit_bzero (void *, size_t);
extern void __explicit_bzero_fence (const void *, size_t);

#ifdef INLINE_EXPLICIT_BZERO
extern inline __attribute__ ((gnu_inline)) void
explicit_bzero(void *p, size_t n)
{
  memset (p, 0, n);
  __explicit_bzero_fence (p, n);
}
#endif

// 128 bits of key material
struct k { unsigned long long lo, hi; };

extern struct k get_key ();
extern void do_encrypt (struct k, char *, const char *, size_t);

void
encrypt (char *out, const char *in, size_t n)
{
  struct k key = get_key ();
  do_encrypt (key, out, in, n);
#ifndef NO_CLEAR
  explicit_bzero (&key, sizeof (struct k));
#endif
}
```

If you compile this on x86-64 with gcc 6.2 at -DNO_CLEAR -O2 -S, you
will see that the `key` variable lives entirely in registers.  If you
compile it with neither `-DNO_CLEAR` nor `-DINLINE_EXPLICIT_BZERO`,
`key` is spilled to the stack immediately after the call to `get_key`
(it's more obvious that this is what's happening if you turn off RTL
scheduling). This is what we want to avoid.  With
`-DINLINE_EXPLICIT_BZERO` ... the spill still happens,
disappointingly, but it _could have_ been eliminated, because those
stack slots are not read again before they are overwritten with zeroes
by the memset.  (LLVM 3.8 does eliminate the spill.)

Anyway, right before lowering to RTL, the IR with -DINLINE_EXPLICIT_BZERO is

encrypt (char * out, const char * in, size_t n)
{
  struct k key;

  <bb 2>:
  key = get_key ();
  do_encrypt (key, out_3(D), in_4(D), n_5(D));
  memset (&key, 0, 16);
  __explicit_bzero_fence (&key, 16);
  key ={v} {CLOBBER};
  return;
}

With a compiler that implemented the taint analysis I am imagining, we
would have instead

encrypt (char * out, const char * in, size_t n)
{
  struct k key __attribute__ ((tainted));

  <bb 2>:
  key = get_key ();
  do_encrypt (key, out_3(D), in_4(D), n_5(D));
  memset (&key, 0, 16);
  key = {0, 0};
  USE (key);
  key ={v} {CLOBBER};
  return;
}

where "USE (key)" has the semantics of RTL (use ...) - I don't
remember what the tree equivalent is.  No assembly generated, but
preceding stores are preserved.

Now, hopefully the RTL optimizers will figure out that the memset is
unnecessary and remove it, but even if they don't, the codegen is
still correct.

(Alas, looking at the assembly here has made me realize that this is
maybe harder to implement than I thought; it probably can't be done
just by slapping in some writes and USEs at tree level; it needs to
work out which of the _hard registers_ and _stack slots_ are tainted,
and generate the appropriate clears, _after_ register allocation.  And
it needs to assume that function calls _don't_ clobber call-clobbered
registers.  Oh my aching head.)

> Understood, WRT LTO today.  I'm not worried about today, I'm worried about
> what happens in the future when the assumptions we're making today about
> what the compiler can and can not do may no longer hold.

The thing is that until substantial work is done on both libc and the
compiler, we can't _allow_ LTO to see into libc.  This is not the only
place where we rely on LTO not seeing through a function-call boundary
for correctness.

zw
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] explicit_bzero final

Jeff Law
On 12/14/2016 04:11 PM, Zack Weinberg wrote:
> On Wed, Dec 14, 2016 at 11:58 AM, Jeff Law <[hidden email]> wrote:
>>
>> The concept of tainting variables and tracking that property seems distinct
>> from making sure that the compiler properly handles explicit_bzero.  And
>> just to be clear, I see that property as having value.
>
> The thing is, if you do the taint tracking, that's _all_ you need to
> do, I think.
Perhaps.  This is the first time I've really thought about it.

>
> Let's recap the motivating example for the inline, since it's been
> awhile and Jeff may not have seen it:
Thanks.  I certainly haven't followed the discussion prior to yesterday.

>
> ```
> typedef __SIZE_TYPE__ size_t;
>
> extern void *memset (void *, int, size_t);
> extern void explicit_bzero (void *, size_t);
> extern void __explicit_bzero_fence (const void *, size_t);
>
> #ifdef INLINE_EXPLICIT_BZERO
> extern inline __attribute__ ((gnu_inline)) void
> explicit_bzero(void *p, size_t n)
> {
>   memset (p, 0, n);
>   __explicit_bzero_fence (p, n);
> }
> #endif
>
> // 128 bits of key material
> struct k { unsigned long long lo, hi; };
>
> extern struct k get_key ();
> extern void do_encrypt (struct k, char *, const char *, size_t);
>
> void
> encrypt (char *out, const char *in, size_t n)
> {
>   struct k key = get_key ();
>   do_encrypt (key, out, in, n);
> #ifndef NO_CLEAR
>   explicit_bzero (&key, sizeof (struct k));
> #endif
> }
> ```
>
> If you compile this on x86-64 with gcc 6.2 at -DNO_CLEAR -O2 -S, you
> will see that the `key` variable lives entirely in registers.  If you
> compile it with neither `-DNO_CLEAR` nor `-DINLINE_EXPLICIT_BZERO`,
> `key` is spilled to the stack immediately after the call to `get_key`
> (it's more obvious that this is what's happening if you turn off RTL
> scheduling). This is what we want to avoid.  With
> `-DINLINE_EXPLICIT_BZERO` ... the spill still happens,
> disappointingly, but it _could have_ been eliminated, because those
> stack slots are not read again before they are overwritten with zeroes
> by the memset.  (LLVM 3.8 does eliminate the spill.)
Right.  Just to correct a bit on terminology, it's not spilled to the
stack, but the object has to be addressable due to the call to
explicit_bzero.  That forces the object into the stack.  It might still
wind up in the stack due to other reasons (it's an aggregate and thus
not subject to live in registers).

Once you inline the call to explicit_bzero, then the compiler has the
opportunity to remove the addressable property, but it probably can't
due to the use in the memset and bzero_fence calls.

As it happens I'm looking at a slew of DSE issues right now and can give
some insights.  Note the following assumes my improvements and does not
necessarily represent what you can do with the trunk right now.

DSE fails to track the assignment key = get_key().  It just doesn't even
try.  If that's hacked around (easy) and we start tracking the memory
for KEY.  Then we have the use in the call to do_encrypt.  That's a
read, plain and simple.  But let's assume we can finesse past that.  We
then see the call to memset and that kills the memory for KEY.  So we
can actually track the memory with a little work.  There's no stores
that can actually be removed, but we can track the memory.

Note this is all at the higher gimple level.  Tracking it in the RTL DSE
pass would be tougher, but in theory possible and perhaps more useful
since the argument passing artifacts are exposed.

But the fact remains that as long as KEY is addressable you're going to
have this class of problem and with the explicit_bzero proposal anything
passed to it will be considered addressable and is likely going to drop
into memory, which is precisely what you do not want.

Rather than explicit_bzero what you may want is something more like a
clobbering assignment.




> Anyway, right before lowering to RTL, the IR with -DINLINE_EXPLICIT_BZERO is
>
> encrypt (char * out, const char * in, size_t n)
> {
>   struct k key;
>
>   <bb 2>:
>   key = get_key ();
>   do_encrypt (key, out_3(D), in_4(D), n_5(D));
>   memset (&key, 0, 16);
>   __explicit_bzero_fence (&key, 16);
>   key ={v} {CLOBBER};
>   return;
> }
Right, that's about what I'd expect.

>
> With a compiler that implemented the taint analysis I am imagining, we
> would have instead
>
> encrypt (char * out, const char * in, size_t n)
> {
>   struct k key __attribute__ ((tainted));
>
>   <bb 2>:
>   key = get_key ();
>   do_encrypt (key, out_3(D), in_4(D), n_5(D));
>   memset (&key, 0, 16);
>   key = {0, 0};
>   USE (key);
>   key ={v} {CLOBBER};
>   return;
> }
So if you get to this point, in particular the key = {0, 0} assignment,
then what's the point of the explicit_bzero/memset call?  It's dead.
What you're describing with key = {0, 0} is an empty CONSTRUCTOR
assignment.  And I *think* those won't force your object to be addressable.

FWIW, if you had this, there's a good chance the compiler would pick up
the fact that the memset call is dead, but I really don't see the point
in having the memset call at that point.

>
> (Alas, looking at the assembly here has made me realize that this is
> maybe harder to implement than I thought; it probably can't be done
> just by slapping in some writes and USEs at tree level; it needs to
> work out which of the _hard registers_ and _stack slots_ are tainted,
> and generate the appropriate clears, _after_ register allocation.  And
> it needs to assume that function calls _don't_ clobber call-clobbered
> registers.  Oh my aching head.)
Getting it 100% is tough for precisely those reasons.  But if we just
have a clobbering assignment rather than messing with explicit_bzero,
then what you get out of the gimple side will be as close to correct at
that point as you can get and the object is much more likely to end up
in registers than memory.

So why not drop explicit_bzero and instead we define something to get
you an empty CONSTRUCTOR assignment.  I think those have the semantics
you want without forcing your object to be addressable.



Then you just have to find a way to deal with the argument passing and
register allocation artifacts, and that's going to be hard I suspect.

jeff
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] explicit_bzero final

Florian Weimer-5
In reply to this post by Zack Weinberg-2
On 12/14/2016 11:28 PM, Zack Weinberg wrote:

> The remaining question in my mind is whether, in the case where a
> variable's address is only taken in a call to explicit_bzero, we
> should give up on the "hack to prevent the data being copied to
> memory" for the sake of hypothetical future GCC support.  That hack, I
> remind you, is the inline expansion to memset+__glibc_read_memory.  We
> made a huge fuss over that case in the manual, and a couple of people
> were prepared to veto explicit_bzero altogether if we didn't do
> something about it.

Those people were mistaken.  I can't see how you can prevent DSE on
scalars with current GCC.  I gave that example to show that compiler
support was needed to implement memset_s (or a full implementation of
explicit_zero).  It was never intended as something that is particularly
relevant to real-world code.

> I am very reluctant to give it up, especially as
> I'm still not convinced it's a problem for the compiler (see reply to
> Jeff).

It's a problem because its semantics are not defined in any sensible
way.  Even explicit_bzero is impossible to specify properly in C
standardese due to lack of an observable effect on the language level
(and so is memset_s), but at least it's reasonably obvious what is
intended.  Instead of your tracking idea, GCC could just clear all stack
memory and callee-saved registers upon exit from a function which calls
explicit_bzero.  It's still an approximation, but any implementation
retrofitted on an existing compiler will be.
       

> -#ifdef _LIBC
>    /*
>     * Erase key-dependent intermediate data.  Data dependent only on
>     * the salt is not considered sensitive.
>     */
> -  __explicit_bzero (ktab, sizeof (ktab));
> -  __explicit_bzero (data->keysched, sizeof (data->keysched));
> -  __explicit_bzero (res, sizeof (res));
> -#endif
> +  explicit_bzero (ktab, sizeof (ktab));
> +  explicit_bzero (data->keysched, sizeof (data->keysched));
> +  explicit_bzero (res, sizeof (res));
>
> The _LIBC ifdeffage is vestigial, but should probably be left alone in
> a patch that isn't about that.

Huh?  I think it's a new addition.

> +/* This is the generic definition of __explicit_bzero_chk.  The
> +   __explicit_bzero_chk symbol is used as the implementation of
> +   explicit_bzero throughout glibc.  If this file is overriden by an
> +   architecture, both __explicit_bzero_chk and
> +   __explicit_bzero_chk_internal have to be defined (the latter not as
> +   an IFUNC).  */
>
> This file is not in sysdeps/generic, so it cannot be overridden (or is
> that no longer the case? If so, why do we still have sysdeps/generic?)
> and I don't think we need the capability to override it.  Better we
> should get libc-internal references to memset going to the proper ifunc
> for the architecture.

It can be overridden, and it has to be, otherwise you end up with
multiple definitions of the same symbol when linking libc.so.

Two symbols are needed because on some architectures, hidden references
to IFUNC symbols are not supported because calls to hidden functions are
always direct and do not use the GOT.

> +  /* Compiler barrier.  */
> +  asm volatile ("" ::: "memory");
> +}
>
> I do not understand why you have reverted to an older, inferior
> compiler barrier.  This was extensively hashed out quite some time ago.

I did that because asm volatile ("") is essentially a NOP in this
context.  It certainly does not prevent DSE of the preceding memset.  In
contrast, the memory clobber ensures that if the thing-to-be-zeroed is
in memory.

> Oh, I see why you're not getting linknamespace failures from the
> libcrypt change: you've implicitly fortified all those calls, which
> has the side effect of making them use an impl-namespace symbol.  It
> makes sense as a testing strategy, but it doesn't feel like the right
> move for the committed patch (better to leave that to an all-or-nothing
> "fortify libc internally" switch, ne?)

I have no firm opinion on that.  It avoids adding another GLIBC_PRIVATE
symbol, and I think the current set of symbols is also compatible with
future IFUNC-based optimizations.

> What we _could_ do is
>
> #if IS_IN (libc)
> # define explicit_bzero(s, n) __internal_explicit_bzero (s, n)
> #else
> # define explicit_bzero(s, n) __explicit_bzero (s, n)
> #endif
>
> which would allow libcrypt's source code to use the unmangled names.
> I think that's something we're trying to do for other string
> functions, so perhaps it makes sense.

Other string functions use asm aliases, which also apply to GCC built-ins.

Thanks,
Florian
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] explicit_bzero final

Florian Weimer-5
On 12/15/2016 10:08 AM, Florian Weimer wrote:
> It's a problem because its semantics are not defined in any sensible
> way.  Even explicit_bzero is impossible to specify properly in C
> standardese due to lack of an observable effect on the language level
> (and so is memset_s), but at least it's reasonably obvious what is
> intended.  Instead of your tracking idea, GCC could just clear all stack
> memory and callee-saved registers upon exit from a function which calls
> explicit_bzero.  It's still an approximation, but any implementation
> retrofitted on an existing compiler will be.

I meant “caller-saved registers”, of course.

Florian
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] explicit_bzero final

Zack Weinberg-2
In reply to this post by Jeff Law
On 12/15/2016 12:25 AM, Jeff Law wrote:
> On 12/14/2016 04:11 PM, Zack Weinberg wrote:
...

>>
>> If you compile this on x86-64 with gcc 6.2 at -DNO_CLEAR -O2 -S, you
>> will see that the `key` variable lives entirely in registers.  If you
>> compile it with neither `-DNO_CLEAR` nor `-DINLINE_EXPLICIT_BZERO`,
>> `key` is spilled to the stack immediately after the call to `get_key`
>> (it's more obvious that this is what's happening if you turn off RTL
>> scheduling). This is what we want to avoid.  With
>> `-DINLINE_EXPLICIT_BZERO` ... the spill still happens,
>> disappointingly, but it _could have_ been eliminated, because those
>> stack slots are not read again before they are overwritten with zeroes
>> by the memset.  (LLVM 3.8 does eliminate the spill.)
>
> Right.  Just to correct a bit on terminology, it's not spilled to the
> stack, but the object has to be addressable due to the call to
> explicit_bzero.  That forces the object into the stack.  It might still
> wind up in the stack due to other reasons (it's an aggregate and thus
> not subject to live in registers).

Right.  I suspect that sensitive data objects pretty much always wind up
in memory anyway, but people _were_ really worried about the case where
explicit_bzero is the only thing that makes one addressable.

...
> But the fact remains that as long as KEY is addressable you're going to
> have this class of problem and with the explicit_bzero proposal anything
> passed to it will be considered addressable and is likely going to drop
> into memory, which is precisely what you do not want.
>
> Rather than explicit_bzero what you may want is something more like a
> clobbering assignment.

To be clear, explicit_bzero itself is not going away.  It is already in
use in a wide variety of applications, and substitutes (such as the
hypothetical __attribute__((sensitive))) will not become widespread
quickly enough to avoid adding it to glibc.  I've been working on this
patch off and on for _two years_, and I picked it up from someone else
who'd given up on the process after roughly that long himself.  I fully
intend to commit it in some form tomorrow evening; I consider missing
2.25 unacceptable.

What I'd hope we could do with compiler-side smarts is _convert_
explicit_bzero to __attribute__((sensitive)) or a clobbering assignment
or whatever the most convenient compiler-side representation winds up
being, so that artifacts of glibc's implementation become irrelevant.

>> With a compiler that implemented the taint analysis I am imagining, we
>> would have instead
>>
>> encrypt (char * out, const char * in, size_t n)
>> {
>>   struct k key __attribute__ ((tainted));
>>
>>   <bb 2>:
>>   key = get_key ();
>>   do_encrypt (key, out_3(D), in_4(D), n_5(D));
>>   memset (&key, 0, 16);
>>   key = {0, 0};
>>   USE (key);
>>   key ={v} {CLOBBER};
>>   return;
>> }
>
> So if you get to this point, in particular the key = {0, 0} assignment,
> then what's the point of the explicit_bzero/memset call?  It's dead.
> What you're describing with key = {0, 0} is an empty CONSTRUCTOR
> assignment.  And I *think* those won't force your object to be addressable.
>
> FWIW, if you had this, there's a good chance the compiler would pick up
> the fact that the memset call is dead, but I really don't see the point
> in having the memset call at that point.

This is in the hypothetical where we keep the inline expansion in glibc,
gcc recognizes __explicit_bzero_fence, and gcc doesn't try to
pattern-match the preceding memset.  So the __explicit_bzero_fence got
replaced with the clobbering assignment, but the memset call was already
there.

I'm a lot less attached to the inline expansion now that I realize it
doesn't keep 'key' from getting written to the stack in GCC 6, and
Florian has figured out how to make __explicit_bzero_chk happen.

> So why not drop explicit_bzero and instead we define something to get
> you an empty CONSTRUCTOR assignment.  I think those have the semantics
> you want without forcing your object to be addressable.

Again, explicit_bzero itself (or another name for the same operation) is
the interface applications are using right now.  It's not going away.
But __builtin_explicit_bzero could turn into that empty CONSTRUCTOR
assignment, couldn't it?

> Then you just have to find a way to deal with the argument passing and
> register allocation artifacts, and that's going to be hard I suspect.

Yeah.  But hey, at least it won't involve reload, right?

zw

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] explicit_bzero final

Zack Weinberg-2
In reply to this post by Florian Weimer-5
On 12/15/2016 04:08 AM, Florian Weimer wrote:

> On 12/14/2016 11:28 PM, Zack Weinberg wrote:
>
>> The remaining question in my mind is whether, in the case where a
>> variable's address is only taken in a call to explicit_bzero, we
>> should give up on the "hack to prevent the data being copied to
>> memory" for the sake of hypothetical future GCC support.  That hack, I
>> remind you, is the inline expansion to memset+__glibc_read_memory.  We
>> made a huge fuss over that case in the manual, and a couple of people
>> were prepared to veto explicit_bzero altogether if we didn't do
>> something about it.
>
> Those people were mistaken.  I can't see how you can prevent DSE on
> scalars with current GCC.  I gave that example to show that compiler
> support was needed to implement memset_s (or a full implementation of
> explicit_zero).  It was never intended as something that is particularly
> relevant to real-world code.

Hmm, OK.  I'm going to send a new patch which merges yours and mine and
adjusts the manual accordingly.  I really want to get this landed
tomorrow, before I go out of town -- otherwise it's going to miss 2.25,
and this has been dragging on _quite_ long enough.

>> -#ifdef _LIBC
>>    /*
>>     * Erase key-dependent intermediate data.  Data dependent only on
>>     * the salt is not considered sensitive.
>>     */
>> -  __explicit_bzero (ktab, sizeof (ktab));
>> -  __explicit_bzero (data->keysched, sizeof (data->keysched));
>> -  __explicit_bzero (res, sizeof (res));
>> -#endif
>> +  explicit_bzero (ktab, sizeof (ktab));
>> +  explicit_bzero (data->keysched, sizeof (data->keysched));
>> +  explicit_bzero (res, sizeof (res));
>>
>> The _LIBC ifdeffage is vestigial, but should probably be left alone in
>> a patch that isn't about that.
>
> Huh?  I think it's a new addition.

Oh, huh, you're right.  I don't remember why I did that.  Certainly
nobody's gonna be merging this back with UFC, so it can go.

>> +/* This is the generic definition of __explicit_bzero_chk.  The
>> +   __explicit_bzero_chk symbol is used as the implementation of
>> +   explicit_bzero throughout glibc.  If this file is overriden by an
>> +   architecture, both __explicit_bzero_chk and
>> +   __explicit_bzero_chk_internal have to be defined (the latter not as
>> +   an IFUNC).  */
>>
>> This file is not in sysdeps/generic, so it cannot be overridden (or is
>> that no longer the case? If so, why do we still have sysdeps/generic?)
>> and I don't think we need the capability to override it.  Better we
>> should get libc-internal references to memset going to the proper ifunc
>> for the architecture.
>
> It can be overridden, and it has to be, otherwise you end up with
> multiple definitions of the same symbol when linking libc.so.

??? Your patch doesn't have any overrides of it.

> Two symbols are needed because on some architectures, hidden references
> to IFUNC symbols are not supported because calls to hidden functions are
> always direct and do not use the GOT.

Right, I remember that this was a problem before.

>> +  /* Compiler barrier.  */
>> +  asm volatile ("" ::: "memory");
>> +}
>>
>> I do not understand why you have reverted to an older, inferior
>> compiler barrier.  This was extensively hashed out quite some time ago.
>
> I did that because asm volatile ("") is essentially a NOP in this
> context.  It certainly does not prevent DSE of the preceding memset.  In
> contrast, the memory clobber ensures that if the thing-to-be-zeroed is
> in memory.

I meant as opposed to the out-of-line __glibc_read_memory.  I suppose an
all-memory clobber is not going to generate worse code than that as long
as explicit_bzero.c remains separately compiled, and it's less likely to
break (but _will_ generate worse code) if LTO-into-glibc happens before
compiler support for explicit_bzero, so I can live with it.

>> Oh, I see why you're not getting linknamespace failures from the
>> libcrypt change: you've implicitly fortified all those calls, which
>> has the side effect of making them use an impl-namespace symbol.  It
>> makes sense as a testing strategy, but it doesn't feel like the right
>> move for the committed patch (better to leave that to an all-or-nothing
>> "fortify libc internally" switch, ne?)
>
> I have no firm opinion on that.  It avoids adding another GLIBC_PRIVATE
> symbol, and I think the current set of symbols is also compatible with
> future IFUNC-based optimizations.

I think I'm going to leave it your way, both because it'll get the patch
done faster, and because the crypt code is awfully dusty and not at all
performance-critical.

zw
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] explicit_bzero final

Florian Weimer-5
On 12/16/2016 12:31 AM, Zack Weinberg wrote:

>>> +/* This is the generic definition of __explicit_bzero_chk.  The
>>> +   __explicit_bzero_chk symbol is used as the implementation of
>>> +   explicit_bzero throughout glibc.  If this file is overriden by an
>>> +   architecture, both __explicit_bzero_chk and
>>> +   __explicit_bzero_chk_internal have to be defined (the latter not as
>>> +   an IFUNC).  */
>>>
>>> This file is not in sysdeps/generic, so it cannot be overridden (or is
>>> that no longer the case? If so, why do we still have sysdeps/generic?)
>>> and I don't think we need the capability to override it.  Better we
>>> should get libc-internal references to memset going to the proper ifunc
>>> for the architecture.
>> It can be overridden, and it has to be, otherwise you end up with
>> multiple definitions of the same symbol when linking libc.so.
> ??? Your patch doesn't have any overrides of it.

I meant it has to be overridden if you want to introduce
architecture-specific implementations of explicit_bzero.  The patch
doesn't contain any, but should be compatible with future development.

Due the transitive nature of IFUNCs and the desire to avoid indirect
calls, all memset/__memset_chk variants need an
explicit_bzero/__explicit_bzero_chk wrapper, so this is quite involved.

Thanks,
Florian

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] explicit_bzero final

Jeff Law
In reply to this post by Zack Weinberg-2
On 12/15/2016 04:21 PM, Zack Weinberg wrote:

>>
>> Right.  Just to correct a bit on terminology, it's not spilled to the
>> stack, but the object has to be addressable due to the call to
>> explicit_bzero.  That forces the object into the stack.  It might still
>> wind up in the stack due to other reasons (it's an aggregate and thus
>> not subject to live in registers).
>
> Right.  I suspect that sensitive data objects pretty much always wind up
> in memory anyway, but people _were_ really worried about the case where
> explicit_bzero is the only thing that makes one addressable.
Agreed on both accounts.

>
> To be clear, explicit_bzero itself is not going away.  It is already in
> use in a wide variety of applications, and substitutes (such as the
> hypothetical __attribute__((sensitive))) will not become widespread
> quickly enough to avoid adding it to glibc.  I've been working on this
> patch off and on for _two years_, and I picked it up from someone else
> who'd given up on the process after roughly that long himself.  I fully
> intend to commit it in some form tomorrow evening; I consider missing
> 2.25 unacceptable.
Understood.  And my position is that explicit_bzero is inherently
flawed.  You really need direct compiler support.

So while it's not going away and it's an incremental improvement over
nothing, it comes with a cost.  Namely that some objects which
previously weren't addressable become addressable and are now sitting in
memory waiting to be extracted.


>
> What I'd hope we could do with compiler-side smarts is _convert_
> explicit_bzero to __attribute__((sensitive)) or a clobbering assignment
> or whatever the most convenient compiler-side representation winds up
> being, so that artifacts of glibc's implementation become irrelevant.
We can likely infer something passed to explicit_bzero is sensitive and
build a web to capture whatever DECLs potentially feed the explicit_bzero.

So we convert the fence to the CONSTRUCTOR assignment.  That's fine.
Then we'd want DSE to eliminate the memset, then recompute addressability.

>
>> Then you just have to find a way to deal with the argument passing and
>> register allocation artifacts, and that's going to be hard I suspect.
>
> Yeah.  But hey, at least it won't involve reload, right?
Depends on your target :-)

You've also got things like setjmp, caller saves, etc which can stuff
registers into memory that you'll want to deal with.  But at least the
surface is getting smaller.

jeff
12