[Bug regex/25814] New: Consecutive + operators accepted but have no effect except consuming more memory

Sourceware - glibc-bugs-regex mailing list
https://sourceware.org/bugzilla/show_bug.cgi?id=25814

            Bug ID: 25814
           Summary: Consecutive + operators accepted but have no effect
                    except consuming more memory
           Product: glibc
           Version: 2.27
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: regex
          Assignee: unassigned at sourceware dot org
          Reporter: dpmendenhall at gmail dot com
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---

Created attachment 12452
  --> https://sourceware.org/bugzilla/attachment.cgi?id=12452&action=edit
test program

The attached test program just takes a regex pattern and a string at the
command line.

$ gcc -o regex regex.c
$ ./regex 0+ 0
pattern: 0+
string: 0

regex matched
$ ./regex 0++ 0
pattern: 0++
string: 0

regex matched
$ ./regex 0++++++++++++++++++++++++++++++++++ 0
pattern: 0++++++++++++++++++++++++++++++++++
string: 0
<hangs consuming all system memory>

I'm not even sure what consecutive + operators are supposed to mean, so I don't
know why "0++" accepts "0".

I tested this against the bionic/NetBSD regex implementation and compilation of
"0++" fails with REG_BADRPT, which makes more sense.

--
You are receiving this mail because:
You are on the CC list for the bug.

--- Comment #1 from Andreas Schwab <[hidden email]> ---
RE_CONTEXT_INVALID_DUP is only part of RE_SYNTAX_POSIX_BASIC, but not
RE_SYNTAX_POSIX_EXTENDED.

--- Comment #2 from Andreas Schwab <[hidden email]> ---
POSIX says that multiple adjacent duplication symbols produce undefined results
(both BRE and ERE).

--- Comment #3 from David Mendenhall <dpmendenhall at gmail dot com> ---
Created attachment 12453
  --> https://sourceware.org/bugzilla/attachment.cgi?id=12453&action=edit
Same as regex.c, but BRE syntax

--- Comment #4 from David Mendenhall <dpmendenhall at gmail dot com> ---
> RE_CONTEXT_INVALID_DUP is only part of RE_SYNTAX_POSIX_BASIC, but not RE_SYNTAX_POSIX_EXTENDED

The same behavior is reproducible with basic syntax. See regex-basic.c
attached.

$ gcc -o regex-basic regex-basic.c
$ ./regex-basic 0\\+ 0
pattern: 0\+
string: 0

regex matched
$ ./regex-basic 0\\+\\+ 0
pattern: 0\+\+
string: 0

regex matched
./regex-basic
0\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+
0
pattern: 0\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+
string: 0
<hangs consuming all system memory>

--- Comment #5 from David Mendenhall <dpmendenhall at gmail dot com> ---
> POSIX says that multiple adjacent duplication symbols produce undefined results (both BRE and ERE).

Thanks for pointing this out. I was unaware.

So do you think this bug should be closed as INVALID or WONTFIX, or is there
value in investigating the excessive memory consumption and/or rejecting on
compilation like bionic does?

--- Comment #6 from Andreas Schwab <[hidden email]> ---
0\+ is not a valid BRE.

--- Comment #7 from David Mendenhall <dpmendenhall at gmail dot com> ---
Really? The regex.h comments suggest otherwise.

#define RE_SYNTAX_POSIX_BASIC                                           \
  (_RE_SYNTAX_POSIX_COMMON | RE_BK_PLUS_QM | RE_CONTEXT_INVALID_DUP)

/* If this bit is not set, then + and ? are operators, and \+ and \? are
     literals.
   If set, then \+ and \? are operators and + and ? are literals.  */
# define RE_BK_PLUS_QM (RE_BACKSLASH_ESCAPE_IN_LISTS << 1)

David Mendenhall <dpmendenhall at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #8 from David Mendenhall <dpmendenhall at gmail dot com> ---
Dup of 20095

*** This bug has been marked as a duplicate of bug 20095 ***
