[Bug localedata/14094] New: Update locale data to Unicode 6.1

glaubitz at physik dot fu-berlin.de
http://sourceware.org/bugzilla/show_bug.cgi?id=14094

             Bug #: 14094
           Summary: Update locale data to Unicode 6.1
           Product: glibc
           Version: 2.15
            Status: NEW
          Severity: normal
          Priority: P2
         Component: localedata
        AssignedTo: [hidden email]
        ReportedBy: [hidden email]
                CC: [hidden email]
    Classification: Unclassified


The Unicode locale data (the character map and the LC_CTYPE information)
should be updated to Unicode 6.1; the character map is currently based on 6.0
and LC_CTYPE on 5.0.  The update should be properly automated, with wiki
documentation added describing how to do future updates.  I identified the
following tasks at
<http://sourceware.org/ml/libc-alpha/2012-05/msg00590.html>:

* Ensure the character type data in localedata/locales/i18n can be
  properly reproduced from Unicode 5.0 data using gen-unicode-ctype.c,
  adapting gen-unicode-ctype.c as needed to replicate any changes that
  may have been made without using that program.

* Update the character type data to Unicode 6.1, removing any local
  hacks from gen-unicode-ctype.c that are no longer needed.
  (10646:2012, corresponding to Unicode 6.1, appears to be in
  publication stage so should be out very soon.)

* Ensure the character data in localedata/charmaps/UTF-8 can be
  reproduced in some automated fashion from Unicode 6.0, locating any
  previously used automation for this, or creating new automation if
  none can be found.

* Update the character data to Unicode 6.1, removing any local hacks
  in the automation from the previous step.

* Document thoroughly on the wiki how the automation works and how to
  do updates to new Unicode versions.
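
As a rough illustration of what "reproduced in some automated fashion" could
mean for the charmap, here is a minimal Python sketch that turns one
UnicodeData.txt record into a charmap-style line.  The function name
charmap_line and the column widths are assumptions made for this example; it
is not the tool that produced the existing file, and it does not handle range
records, surrogates, or characters whose name field is "<control>".

```python
def charmap_line(unicode_data_line):
    """Turn one UnicodeData.txt record into a charmap-style line.

    Hypothetical helper for illustration only.
    """
    fields = unicode_data_line.strip().split(";")
    code_point = int(fields[0], 16)
    name = fields[1]

    # UTF-8 byte sequence of the character, written as /xNN escapes.
    utf8 = "".join("/x%02x" % b for b in chr(code_point).encode("utf-8"))

    # Assumed convention: 4 hex digits up to U+FFFF, 8 digits above that.
    if code_point <= 0xFFFF:
        symbol = "<U%04X>" % code_point
    else:
        symbol = "<U%08X>" % code_point

    # Column widths are a guess at the layout, not a specification.
    return "%-11s %-12s %s" % (symbol, utf8, name)


if __name__ == "__main__":
    print(charmap_line("0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"))
    print(charmap_line("20AC;EURO SIGN;Sc;0;ET;;;;;N;;;;;"))
```

Regenerating the whole file this way and diffing it against the checked-in
copy would show which entries, if any, were edited by hand.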

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
[Bug localedata/14094] Update locale data to Unicode 6.1

glaubitz at physik dot fu-berlin.de
http://sourceware.org/bugzilla/show_bug.cgi?id=14094

Rich Felker <bugdal at aerifal dot cx> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugdal at aerifal dot cx

--- Comment #1 from Rich Felker <bugdal at aerifal dot cx> 2012-05-11 03:25:47 UTC ---
One of the major "local hacks" can be eliminated, and many other problems
fixed at the same time, by basing class alpha on the Unicode "Alphabetic"
property (from DerivedCoreProperties.txt) instead of just the L* categories.
Right now there are many languages whose letters are considered non-alphabetic
by glibc because they're in category Mn or Mc or even Cf. There are "local
hacks" to fix this for maybe one or two languages, but using the right Unicode
property would fix it for all languages.
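
To make the difference concrete, here is a small Python sketch that parses
lines in the DerivedCoreProperties.txt format and lists the code points that
carry "Alphabetic" but would be missed by a test on general category L*.  The
names parse_property, alpha_by_category and SAMPLE are made up for this
example, and the sample lines are written in the file's format for
illustration rather than copied from a particular Unicode release; the
category lookup uses whatever Unicode version the Python interpreter ships.

```python
import unicodedata


def parse_property(lines, wanted="Alphabetic"):
    """Collect code points that carry `wanted` from lines in
    DerivedCoreProperties.txt format ("XXXX..YYYY ; Property # comment")."""
    members = set()
    for line in lines:
        data = line.split("#", 1)[0].strip()  # drop trailing comment
        if not data:
            continue
        fields = [f.strip() for f in data.split(";")]
        if len(fields) < 2 or fields[1] != wanted:
            continue
        first, _, last = fields[0].partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start  # single code point or range
        members.update(range(start, end + 1))
    return members


def alpha_by_category(cp):
    """The narrower test: general category starts with 'L'."""
    return unicodedata.category(chr(cp)).startswith("L")


# Illustrative sample in the file's format (an assumption, not a verbatim copy).
SAMPLE = """\
0041..005A    ; Alphabetic # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
093E..0940    ; Alphabetic # Mc   [3] DEVANAGARI VOWEL SIGN AA..DEVANAGARI VOWEL SIGN II
0941..0948    ; Alphabetic # Mn   [8] DEVANAGARI VOWEL SIGN U..DEVANAGARI VOWEL SIGN AI
""".splitlines()

if __name__ == "__main__":
    alphabetic = parse_property(SAMPLE)
    missed = sorted(cp for cp in alphabetic if not alpha_by_category(cp))
    for cp in missed:
        print("U+%04X %s %s" % (cp, unicodedata.category(chr(cp)),
                                unicodedata.name(chr(cp))))
```

Run against the full DerivedCoreProperties.txt, the same comparison would give
the list of characters that the L*-only rule currently leaves out of class
alpha.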
