[Bug regex/18986] New: ERE '0|()0|\1|0' causes regexec undefined behavior

[Bug regex/18986] New: ERE '0|()0|\1|0' causes regexec undefined behavior

macro@linux-mips.org
https://sourceware.org/bugzilla/show_bug.cgi?id=18986

            Bug ID: 18986
           Summary: ERE '0|()0|\1|0' causes regexec undefined behavior
           Product: glibc
           Version: 2.22
            Status: NEW
          Severity: normal
          Priority: P2
         Component: regex
          Assignee: unassigned at sourceware dot org
          Reporter: eggert at gnu dot org
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---
             Flags: security+

Created attachment 8621
  --> https://sourceware.org/bugzilla/attachment.cgi?id=8621&action=edit
Test program illustrating the bug.

This bug report was inspired by an assertion failure in GNU grep:

http://bugs.gnu.org/21513

I tracked it down to undefined behavior in glibc.  Sometimes the behavior
causes a core dump, sometimes the wrong answer, sometimes the right answer.  I
will attach a C program that illustrates the problem for me: compile and run
it, and typically it outputs "match (incorrect)"; it should output either
"regcomp returns REG_ESUBREG (arguably correct)" or "no match (arguably
correct)".

I have fixed the bug in the Gnulib version of the regex code, here:

http://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=5513b40999149090987a0341c018d05d3eea1272

so when somebody backports Gnulib fixes into Glibc, Glibc should pick up the
bug fix as a part of that process.

--
You are receiving this mail because:
You are on the CC list for the bug.

[Bug regex/18986] ERE '0|()0|\1|0' causes regexec undefined behavior

macro@linux-mips.org
https://sourceware.org/bugzilla/show_bug.cgi?id=18986

Hanno Boeck <hanno at hboeck dot de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hanno at hboeck dot de

[Bug regex/18986] ERE '0|()0|\1|0' causes regexec undefined behavior

macro@linux-mips.org
https://sourceware.org/bugzilla/show_bug.cgi?id=18986

Arkadiusz Miskiewicz <arekm at maven dot pl> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |arekm at maven dot pl

[Bug regex/18986] ERE '0|()0|\1|0' causes regexec undefined behavior

macro@linux-mips.org
https://sourceware.org/bugzilla/show_bug.cgi?id=18986

--- Comment #1 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, master has been updated
       via  eb04c21373e2a2885f3d52ff192b0499afe3c672 (commit)
      from  b11643c21c5c9d67a69c8ae952e5231ce002e7f1 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=eb04c21373e2a2885f3d52ff192b0499afe3c672

commit eb04c21373e2a2885f3d52ff192b0499afe3c672
Author: Adhemerval Zanella <[hidden email]>
Date:   Wed Dec 20 09:47:44 2017 -0200

    posix: Sync gnulib regex implementation

    This patch syncs the regex implementation with gnulib (commit 0ee5212).
    Only two changes in GLIBC regex testing are required:

      1. posix/bug-regex28.c: as previously discussed [1] the change of
         expected results on the pattern should be safe.

      2. posix/PCRE.tests: the ERE (a)|\1 is malformed (in the sense that
         the \1 doesn't mean anything) and, although current GLIBC accepts
         it, it has undefined behavior.  This patch removes the specific test.

    This sync contains some patches from thread 'Regex: Make libc regex
    more usable outside GLIBC.' [2] which have been pushed upstream in
    gnulib.  This patch also fixes some regex issues (BZ #23233,
    BZ #21163, BZ #18986, BZ #13762).  I did not add testcases for
    either #23233 or #13762 because I couldn't think of a simple way to
    trigger the expected failure paths.

    Checked on x86_64-linux-gnu and i686-linux-gnu.

        [BZ #23233]
        [BZ #21163]
        [BZ #18986]
        [BZ #13762]
        * posix/Makefile (tests): Add bug-regex37 and bug-regex38.
        * posix/PCRE.tests: Remove invalid test.
        * posix/bug-regex28.c: Fix expected values for used syntax.
        * posix/bug-regex37.c: New file.
        * posix/bug-regex38.c: Likewise.
        * posix/regcomp.c: Sync with gnulib.
        * posix/regex.c: Likewise.
        * posix/regex.h: Likewise.
        * posix/regex_internal.c: Likewise.
        * posix/regex_internal.h: Likewise.
        * posix/regexec.c: Likewise.

    [1] https://sourceware.org/ml/libc-alpha/2017-12/msg00807.html
    [2] https://sourceware.org/ml/libc-alpha/2017-12/msg00237.html

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog                                       |   18 +
 posix/Makefile                                  |    3 +-
 posix/PCRE.tests                                |   13 -
 posix/bug-regex28.c                             |   46 +-
 wcsmbs/test-char-types.c => posix/bug-regex37.c |   11 +-
 wcsmbs/test-char-types.c => posix/bug-regex38.c |   11 +-
 posix/regcomp.c                                 |  597 +++++++++------
 posix/regex.c                                   |   21 +-
 posix/regex.h                                   |  335 +++++----
 posix/regex_internal.c                          |  295 ++++----
 posix/regex_internal.h                          |  442 ++++++++----
 posix/regexec.c                                 |  936 ++++++++++++-----------
 12 files changed, 1557 insertions(+), 1171 deletions(-)
 copy wcsmbs/test-char-types.c => posix/bug-regex37.c (78%)
 copy wcsmbs/test-char-types.c => posix/bug-regex38.c (78%)


[Bug regex/18986] ERE '0|()0|\1|0' causes regexec undefined behavior

macro@linux-mips.org
https://sourceware.org/bugzilla/show_bug.cgi?id=18986

Adhemerval Zanella <adhemerval.zanella at linaro dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |adhemerval.zanella at linaro dot org
         Resolution|---                         |FIXED
           Assignee|unassigned at sourceware dot org   |adhemerval.zanella at linaro dot org

--- Comment #2 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
Fixed by eb04c21373e2a2885f3d52ff192b0499afe3c672 (2.28).
