[Bug locale/22668] New: LC_COLLATE: the last character of ellipsis is not ordered correctly

glaubitz at physik dot fu-berlin.de
https://sourceware.org/bugzilla/show_bug.cgi?id=22668

            Bug ID: 22668
           Summary: LC_COLLATE: the last character of ellipsis is not
                    ordered correctly
           Product: glibc
           Version: 2.26
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: locale
          Assignee: unassigned at sourceware dot org
          Reporter: hanataka.shinya at gmail dot com
  Target Milestone: ---

Created attachment 10709
  --> https://sourceware.org/bugzilla/attachment.cgi?id=10709&action=edit
LC_COLLATE ellipsis bug fixing patch

LC_COLLATE: the last character of ellipsis is not ordered correctly.

If the locale data uses an ellipsis in LC_COLLATE and another character is
ordered after it, the last character of the ellipsis range is ordered last.

A simplified example:

LC_COLLATE
order_start forward;forward;forward;forward,position
<U0041> <U0041>;IGNORE;IGNORE;IGNORE  # A
...         ...;IGNORE;IGNORE;IGNORE
<U0044> <U0044>;IGNORE;IGNORE;IGNORE  # D
<U0045> <U0045>;IGNORE;IGNORE;IGNORE  # E
...         ...;IGNORE;IGNORE;IGNORE
<U005A> <U005A>;IGNORE;IGNORE;IGNORE  # Z

order_end

END LC_COLLATE

This rule should yield the order A B C D E F ... Z,
but the result with glibc 2.26 is A B C E F ... Z D.
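
One way to observe the resulting order from a program is to sort the letters
with strcoll(). The following is only a minimal sketch: the helper names
compare_collated and sort_by_collation are made up for illustration, and it
assumes the test locale has already been compiled with localedef and selected
through the environment.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: orders two strings by the LC_COLLATE rules of the
   locale currently in effect (hypothetical helper name).  */
static int compare_collated(const void *a, const void *b)
{
  const char *const *sa = a;
  const char *const *sb = b;
  return strcoll(*sa, *sb);
}

/* Sort an array of strings in place using the active collation order
   (hypothetical helper name).  */
static void sort_by_collation(const char **items, size_t n)
{
  assert(items != NULL);
  qsort(items, n, sizeof items[0], compare_collated);
}
```

A caller would run setlocale (LC_COLLATE, "") first so that the locale chosen
in the environment takes effect, then pass the letters "A" through "Z" and
print the sorted array. Without that call the program stays in the "C" locale
and strcoll() compares like strcmp().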

After checking the source code of localedef, I found a problem in
handle_ellipsis () of locale/program/ld-collate.c. The insertion cursor
position is temporarily set inside the ellipsis range in this function, but it
does not seem to have been restored.
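
The effect of an unrestored cursor can be reproduced with a toy model. The
sketch below is not the ld-collate.c code: struct order_list,
insert_after_cursor, expand_ellipsis and build_order are hypothetical names,
and the spec string (where '.' stands for an ellipsis line) is an invented
shorthand. It only assumes that new symbols are linked in after a cursor and
that ellipsis expansion moves that cursor back to the start of the range.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of an ordered symbol list with an insertion cursor.  */
struct node { char sym; struct node *next; };

struct order_list {
  struct node pool[64];
  size_t used;
  struct node *head;
  struct node *cursor;   /* new symbols are linked in right after this */
};

static struct node *insert_after_cursor(struct order_list *l, char sym)
{
  assert(l->used < 64);
  struct node *n = &l->pool[l->used++];
  n->sym = sym;
  if (l->cursor == NULL) { n->next = l->head; l->head = n; }
  else { n->next = l->cursor->next; l->cursor->next = n; }
  l->cursor = n;
  return n;
}

/* Fill in the symbols strictly between startp and endp.  When
   restore_cursor is 0 the cursor is left on the last generated symbol,
   which models the reported behaviour.  */
static void expand_ellipsis(struct order_list *l, struct node *startp,
                            struct node *endp, int restore_cursor)
{
  l->cursor = startp;
  for (char c = startp->sym + 1; c < endp->sym; c++)
    insert_after_cursor(l, c);
  if (restore_cursor)
    l->cursor = endp;    /* move the cursor to the end of the range */
}

/* Build the order for a spec such as "A.DE.Z" and write it to out,
   which must have room for every symbol plus the terminating NUL.  */
static void build_order(const char *spec, int restore_cursor, char *out)
{
  struct order_list l = { .used = 0, .head = NULL, .cursor = NULL };
  struct node *last = NULL;
  int pending = 0;
  for (const char *p = spec; *p != '\0'; p++) {
    if (*p == '.') { pending = 1; continue; }
    struct node *n = insert_after_cursor(&l, *p);
    if (pending && last != NULL)
      expand_ellipsis(&l, last, n, restore_cursor);
    pending = 0;
    last = n;
  }
  for (struct node *n = l.head; n != NULL; n = n->next)
    *out++ = n->sym;
  *out = '\0';
}
```

With restore_cursor set to 0, the spec "A.DE.Z" places D after Z in this
model, the same shape as the order reported above. With restore_cursor set
to 1 it yields A through Z in sequence.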

The attached patch fixes this problem, and works fine for me.

--
You are receiving this mail because:
You are on the CC list for the bug.

[Bug locale/22668] LC_COLLATE: the last character of ellipsis is not ordered correctly

Sourceware - glibc-bugs mailing list
https://sourceware.org/bugzilla/show_bug.cgi?id=22668

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |carlos at redhat dot com
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2020-06-29

--- Comment #1 from Carlos O'Donell <carlos at redhat dot com> ---
I confirm that this fix corrects the issue.

I've reviewed the code in question and indeed the right solution is to move the
cursor to the end of the ellipsis sequence which is pointed to by endp.

I actually ended up triggering this when working on C.UTF-8. I saw the end
pointer in all long ellipsis sequences getting the wrong sorting value because
they didn't get any weights (they were being unlinked).

--
You are receiving this mail because:
You are on the CC list for the bug.