[Bug regex/1149] character class with range doesn't match half-width kana in SJIS locale

Martin.Jansa at gmail dot com

------- Additional Comments From kimura dot koichi at canon dot co dot jp  2006-02-01 04:48 -------
(In reply to comment #2)
I think I have found the source of the problem.
Here is a patch.

--- regcomp.c.1~ 2005-07-18 11:51:43.000000000 +0900
+++ regcomp.c 2006-02-01 13:26:41.078750000 +0900
@@ -397,9 +397,13 @@ re_compile_fastmap_iter (bufp, init_stat
  }
 # else
       if (dfa->mb_cur_max > 1)
- for (i = 0; i < SBC_MAX; ++i)
-  if (__btowc (i) == WEOF)
-    re_set_fastmap (fastmap, icase, i);
+                  for (i = 0; i < SBC_MAX; ++i) {
+    wint_t wc;
+    wc = __btowc (i);
+
+    if (wc == WEOF || wc >= SBC_MAX)
+      re_set_fastmap (fastmap, icase, i);
+  }
 # endif /* not _LIBC */
     }
   for (i = 0; i < cset->nmbchars; ++i)

--- regexec.c.1~ 2005-07-18 11:51:42.000000000 +0900
+++ regexec.c 2006-02-01 13:26:44.016250000 +0900
@@ -3715,6 +3715,7 @@ check_node_accept_bytes (dfa, node_idx,
   const re_token_t *node = dfa->nodes + node_idx;
   int char_len, elem_len;
   int i;
+  wchar_t wc;
 
   if (BE (node->type == OP_UTF8_PERIOD, 0))
     {
@@ -3784,7 +3785,8 @@ check_node_accept_bytes (dfa, node_idx,
     }
 
   elem_len = re_string_elem_size_at (input, str_idx);
-  if ((elem_len <= 1 && char_len <= 1) || char_len == 0)
+  wc = __btowc(*(input->mbs+str_idx));
+  if (((elem_len <= 1 && char_len <= 1) || char_len == 0) && (wc != WEOF && wc < SBC_MAX))
     return 0;
 
   if (node->type == COMPLEX_BRACKET)

This patch covers only the non-_LIBC part, since I could not follow the flow of the _LIBC part.
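
As a rough illustration (not part of the patch), the test added in the regcomp.c hunk can be tried on its own. In Shift_JIS the half-width katakana are single bytes in the range 0xA1-0xDF, and on implementations where wchar_t holds Unicode code points btowc() maps them to U+FF61-U+FF9F, which is neither WEOF nor below SBC_MAX. The sketch below assumes SBC_MAX is 256 and uses the standard btowc() in place of the internal __btowc(); needs_fastmap_entry() and count_fastmap_bytes() are hypothetical helper names, not functions from regcomp.c.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <wchar.h>

/* Assumed stand-in for the SBC_MAX constant used in regcomp.c:
   the number of distinct single-byte values. */
#define SBC_MAX 256

/* The condition the patched loop applies to the result of btowc():
   true when the byte is not a valid single-byte character (WEOF), or
   when its wide value does not fit below SBC_MAX.  */
static int
needs_fastmap_entry (wint_t wc)
{
  return wc == WEOF || wc >= SBC_MAX;
}

/* Hypothetical helper: count the bytes that the patched test would
   select in the named locale.  Returns -1 if the locale cannot be
   set (for example, when it is not installed).  */
static int
count_fastmap_bytes (const char *locale_name)
{
  int i, count = 0;

  assert (locale_name != NULL);
  if (setlocale (LC_CTYPE, locale_name) == NULL)
    return -1;

  for (i = 0; i < SBC_MAX; ++i)
    if (needs_fastmap_entry (btowc (i)))
      ++count;

  return count;
}
```

Calling count_fastmap_bytes() with the name of an installed Shift_JIS locale (the name varies by system, "ja_JP.SJIS" being one common spelling) should then include the half-width kana bytes in the count, whereas the original test, which checks only for WEOF, would skip them.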

--


http://sourceware.org/bugzilla/show_bug.cgi?id=1149
