[PATCH v12 0/6][BZ 10871] Month names in alternative grammatical case

Rafal Luzynski

Thank you Zack, Carlos, and Dmitry for your reviews.

For completeness, this is the 12th version of my patches.  Not much has
changed: mostly the changelog and commit messages.

Details of the changes since version 11 [1], apart from the commit
messages:

Patch 1/6: More tests for strptime() added, in English, German, and French.
Patch 3/6: A test for strptime() in Spanish added.  A test in Russian uses
    the Unicode (UTF-8) characters directly rather than hexadecimal
    sequences.
Patch 5/6: Minor reword.
Patch 6/6: This patch was previously not numbered.  It can be treated
    as part of this series or as a separate patch.  This new version
    adds a test for strptime() that actually uses the nominative case
    of a month name.

The missing patches are available in my GitHub repo [2], which is up to
date this time.

I'd like to receive at least one "OK" review before I push to stable.

Regards,

Rafal


[1] https://sourceware.org/ml/libc-alpha/2018-01/msg00258.html
[2] https://github.com/rluzynski/glibc

[PATCH v12 1/6] Implement alternative month names (bug 10871).

Rafal Luzynski
Some languages (Slavic, Baltic, etc.) require a genitive case of the
month name when formatting a full date (with the day number) while
they require a nominative case when referring to the month standalone.
This requirement cannot be fulfilled without providing two forms for
each month name.  From now on it is specified that the
nl_langinfo(MON_1) series (up to MON_12) and strftime("%B") generate
the month names in the grammatical form used when the month is a part
of a complete date.
If the grammatical form used when the month is named by itself is needed,
the new values nl_langinfo(ALTMON_1) (up to ALTMON_12) and
strftime("%OB") are supported.  This new feature is optional so the
languages which do not need it or do not yet provide the updated
locales simply do not use it and their behaviour is unchanged.
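
For illustration only (this is not part of the patch): a minimal sketch
of how a caller might use the two forms, assuming a glibc that already
includes this series.  The helper names month_in_date, month_standalone,
and format_month are hypothetical.  The format string is passed as a
parameter rather than as a literal only so that compilers which do not
yet know about %OB have nothing to warn about.

```c
#define _GNU_SOURCE   /* ALTMON_1 .. ALTMON_12 are GNU extensions.  */
#include <assert.h>
#include <langinfo.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: full month name in the form used inside a
   complete date.  MONTH is 0-based like tm_mon.  */
static const char *
month_in_date (int month)
{
  assert (month >= 0 && month <= 11);
  return nl_langinfo (MON_1 + month);
}

/* Hypothetical helper: full month name in the form used when the
   month is named by itself.  */
static const char *
month_standalone (int month)
{
  assert (month >= 0 && month <= 11);
  return nl_langinfo (ALTMON_1 + month);
}

/* Hypothetical helper: format only the month with FMT, which is
   expected to be "%B" or "%OB".  Returns the number of bytes written,
   or 0 if BUF is too small.  */
static size_t
format_month (const char *fmt, int month, char *buf, size_t size)
{
  struct tm tm;

  assert (month >= 0 && month <= 11);
  memset (&tm, 0, sizeof tm);
  tm.tm_mon = month;
  tm.tm_mday = 1;
  return strftime (buf, size, fmt, &tm);
}
```

In the C locale both forms are identical, so month_standalone (10) and
format_month ("%OB", 10, ...) both yield "November".  A difference only
appears after setlocale() selects a locale whose localedata defines
alt_mon.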

        [BZ #10871]
        * locale/C-time.c (_nl_C_LC_TIME): Add alternative month names,
        define them as the same as primary full month names explicitly.
        * locale/categories.def (LC_TIME): Add alt_mon and wide-alt_mon.
        * locale/langinfo.h (__ALTMON_1, __ALTMON_2, __ALTMON_3, __ALTMON_4,
        __ALTMON_5, __ALTMON_6, __ALTMON_7, __ALTMON_8, __ALTMON_9, __ALTMON_10,
        __ALTMON_11, __ALTMON_12, _NL_WALTMON_1, _NL_WALTMON_2, _NL_WALTMON_3,
        _NL_WALTMON_4, _NL_WALTMON_5, _NL_WALTMON_6, _NL_WALTMON_7,
        _NL_WALTMON_8, _NL_WALTMON_9, _NL_WALTMON_10, _NL_WALTMON_11,
        _NL_WALTMON_12): New enum constants.
        [__USE_GNU] (ALTMON_1, ALTMON_2, ALTMON_3, ALTMON_4, ALTMON_5, ALTMON_6,
        ALTMON_7, ALTMON_8, ALTMON_9, ALTMON_10, ALTMON_11, ALTMON_12): New
        macros.
        * locale/programs/ld-time.c (struct locale_time_t): Add alt_mon,
        walt_mon, and alt_mon_defined members.
        (time_output): Output alt_mon and walt_mon members.
        (time_read): Read them, initialize them as copies of mon and wmon
        respectively if they are missing, initialize alt_mon_defined.
        * locale/programs/locfile-kw.gperf (alt_mon): Define.
        * locale/programs/locfile-token.h (tok_alt_mon): New enum constant.
        * localedata/tst-langinfo.c (map): Add tests for the new constants
        ALTMON_1 .. ALTMON_12.
        * time/Makefile [$(run-built-tests) = yes] (LOCALES): Add fr_FR.UTF-8
        and pl_PL.UTF-8.
        * time/strftime_l.c (f_altmonth): New macro.
        (__strftime_internal): Handle %OB format.
        * time/strptime_l.c [_LIBC] (alt_month_name): New macro.
        (__strptime_internal): Handle %OB format.
        * time/tst-strptime.c (day_tests): Add tests to parse different forms
        of month names including the new %OB format specifier.
---
 locale/C-time.c                  | 28 ++++++++++++++++++++--
 locale/categories.def            |  2 ++
 locale/langinfo.h                | 50 ++++++++++++++++++++++++++++++++++++++--
 locale/programs/ld-time.c        | 21 +++++++++++++++++
 locale/programs/locfile-kw.gperf |  1 +
 locale/programs/locfile-token.h  |  1 +
 localedata/tst-langinfo.c        | 12 ++++++++++
 time/Makefile                    |  3 ++-
 time/strftime_l.c                | 11 +++++++--
 time/strptime_l.c                | 32 +++++++++++++++++++++----
 time/tst-strptime.c              | 12 ++++++++++
 11 files changed, 162 insertions(+), 11 deletions(-)

diff --git a/locale/C-time.c b/locale/C-time.c
index 1f1ee01..73bc700 100644
--- a/locale/C-time.c
+++ b/locale/C-time.c
@@ -30,7 +30,7 @@ const struct __locale_data _nl_C_LC_TIME attribute_hidden =
   { NULL, }, /* no cached data */
   UNDELETABLE,
   0,
-  111,
+  135,
   {
     { .string = "Sun" },
     { .string = "Mon" },
@@ -142,6 +142,30 @@ const struct __locale_data _nl_C_LC_TIME attribute_hidden =
     { .string = "" },
     { .string = "%a %b %e %H:%M:%S %Z %Y" },
     { .wstr = (const uint32_t *) L"%a %b %e %H:%M:%S %Z %Y" },
-    { .string = _nl_C_codeset }
+    { .string = _nl_C_codeset },
+    { .string = "January" },
+    { .string = "February" },
+    { .string = "March" },
+    { .string = "April" },
+    { .string = "May" },
+    { .string = "June" },
+    { .string = "July" },
+    { .string = "August" },
+    { .string = "September" },
+    { .string = "October" },
+    { .string = "November" },
+    { .string = "December" },
+    { .wstr = (const uint32_t *) L"January" },
+    { .wstr = (const uint32_t *) L"February" },
+    { .wstr = (const uint32_t *) L"March" },
+    { .wstr = (const uint32_t *) L"April" },
+    { .wstr = (const uint32_t *) L"May" },
+    { .wstr = (const uint32_t *) L"June" },
+    { .wstr = (const uint32_t *) L"July" },
+    { .wstr = (const uint32_t *) L"August" },
+    { .wstr = (const uint32_t *) L"September" },
+    { .wstr = (const uint32_t *) L"October" },
+    { .wstr = (const uint32_t *) L"November" },
+    { .wstr = (const uint32_t *) L"December" }
   }
 };
diff --git a/locale/categories.def b/locale/categories.def
index 47947f7..3cbb4e6 100644
--- a/locale/categories.def
+++ b/locale/categories.def
@@ -249,6 +249,8 @@ DEFINE_CATEGORY
   DEFINE_ELEMENT (_DATE_FMT,                "date_fmt",            opt, string)
   DEFINE_ELEMENT (_NL_W_DATE_FMT,           "wide-date_fmt",       opt, wstring)
   DEFINE_ELEMENT (_NL_TIME_CODESET,    "time-codeset",   std, string)
+  DEFINE_ELEMENT (ALTMON_1,       "alt_mon",       opt, stringarray, 12, 12)
+  DEFINE_ELEMENT (_NL_WALTMON_1,  "wide-alt_mon",  opt, wstringarray, 12, 12)
   ), NO_POSTLOAD)
 
 
diff --git a/locale/langinfo.h b/locale/langinfo.h
index 28a0a73..65374dd 100644
--- a/locale/langinfo.h
+++ b/locale/langinfo.h
@@ -100,7 +100,8 @@ enum
   ABMON_12,
 #define ABMON_12 ABMON_12
 
-  /* Long month names.  */
+  /* Long month names, in the grammatical form used when the month
+     is a part of a complete date.  */
   MON_1, /* January */
 #define MON_1 MON_1
   MON_2,
@@ -189,7 +190,8 @@ enum
   _NL_WABMON_11,
   _NL_WABMON_12,
 
-  /* Long month names.  */
+  /* Long month names, in the grammatical form used when the month
+     is a part of a complete date.  */
   _NL_WMON_1, /* January */
   _NL_WMON_2,
   _NL_WMON_3,
@@ -231,6 +233,50 @@ enum
 
   _NL_TIME_CODESET,
 
+  /* Long month names, in the grammatical form used when the month
+     is named by itself.  */
+  __ALTMON_1, /* January */
+  __ALTMON_2,
+  __ALTMON_3,
+  __ALTMON_4,
+  __ALTMON_5,
+  __ALTMON_6,
+  __ALTMON_7,
+  __ALTMON_8,
+  __ALTMON_9,
+  __ALTMON_10,
+  __ALTMON_11,
+  __ALTMON_12,
+#ifdef __USE_GNU
+# define ALTMON_1 __ALTMON_1
+# define ALTMON_2 __ALTMON_2
+# define ALTMON_3 __ALTMON_3
+# define ALTMON_4 __ALTMON_4
+# define ALTMON_5 __ALTMON_5
+# define ALTMON_6 __ALTMON_6
+# define ALTMON_7 __ALTMON_7
+# define ALTMON_8 __ALTMON_8
+# define ALTMON_9 __ALTMON_9
+# define ALTMON_10 __ALTMON_10
+# define ALTMON_11 __ALTMON_11
+# define ALTMON_12 __ALTMON_12
+#endif
+
+  /* Long month names, in the grammatical form used when the month
+     is named by itself.  */
+  _NL_WALTMON_1, /* January */
+  _NL_WALTMON_2,
+  _NL_WALTMON_3,
+  _NL_WALTMON_4,
+  _NL_WALTMON_5,
+  _NL_WALTMON_6,
+  _NL_WALTMON_7,
+  _NL_WALTMON_8,
+  _NL_WALTMON_9,
+  _NL_WALTMON_10,
+  _NL_WALTMON_11,
+  _NL_WALTMON_12,
+
   _NL_NUM_LC_TIME, /* Number of indices in LC_TIME category.  */
 
   /* LC_COLLATE category: text sorting.
diff --git a/locale/programs/ld-time.c b/locale/programs/ld-time.c
index 67d055a..4186448 100644
--- a/locale/programs/ld-time.c
+++ b/locale/programs/ld-time.c
@@ -91,6 +91,9 @@ struct locale_time_t
   const char *date_fmt;
   const uint32_t *wdate_fmt;
   int alt_digits_defined;
+  const char *alt_mon[12];
+  const uint32_t *walt_mon[12];
+  int alt_mon_defined;
   unsigned char week_ndays;
   uint32_t week_1stday;
   unsigned char week_1stweek;
@@ -639,6 +642,15 @@ time_output (struct localedef_t *locale, const struct charmap_t *charmap,
   add_locale_string (&file, time->date_fmt);
   add_locale_wstring (&file, time->wdate_fmt);
   add_locale_string (&file, charmap->code_set_name);
+
+  /* The alt'mons.  */
+  for (n = 0; n < 12; ++n)
+    add_locale_string (&file, time->alt_mon[n] ?: "");
+
+  /* The wide character alt'mons.  */
+  for (n = 0; n < 12; ++n)
+    add_locale_wstring (&file, time->walt_mon[n] ?: empty_wstr);
+
   write_locale_data (output_path, LC_TIME, "LC_TIME", &file);
 }
 
@@ -782,6 +794,7 @@ time_read (struct linereader *ldfile, struct localedef_t *result,
   STRARR_ELEM (mon, 12, 12);
   STRARR_ELEM (am_pm, 2, 2);
   STRARR_ELEM (alt_digits, 0, 100);
+  STRARR_ELEM (alt_mon, 12, 12);
 
  case tok_era:
   /* Ignore the rest of the line if we don't need the input of
@@ -934,6 +947,14 @@ time_read (struct linereader *ldfile, struct localedef_t *result,
     lr_error (ldfile, _("\
 %1$s: definition does not end with `END %1$s'"), "LC_TIME");
   lr_ignore_rest (ldfile, now->tok == tok_lc_time);
+
+  /* If alt_mon was not specified, make it a copy of mon.  */
+  if (!ignore_content && !time->alt_mon_defined)
+    {
+      memcpy (time->alt_mon, time->mon, sizeof (time->mon));
+      memcpy (time->walt_mon, time->wmon, sizeof (time->wmon));
+      time->alt_mon_defined = 1;
+    }
   return;
 
  default:
diff --git a/locale/programs/locfile-kw.gperf b/locale/programs/locfile-kw.gperf
index c74c1f2..dad7f21 100644
--- a/locale/programs/locfile-kw.gperf
+++ b/locale/programs/locfile-kw.gperf
@@ -148,6 +148,7 @@ first_workday,          tok_first_workday,          0
 cal_direction,          tok_cal_direction,          0
 timezone,               tok_timezone,               0
 date_fmt,               tok_date_fmt,               0
+alt_mon,                tok_alt_mon,                0
 LC_MESSAGES,            tok_lc_messages,            0
 yesexpr,                tok_yesexpr,                0
 noexpr,                 tok_noexpr,                 0
diff --git a/locale/programs/locfile-token.h b/locale/programs/locfile-token.h
index f02325d..d49da5e 100644
--- a/locale/programs/locfile-token.h
+++ b/locale/programs/locfile-token.h
@@ -186,6 +186,7 @@ enum token_t
   tok_cal_direction,
   tok_timezone,
   tok_date_fmt,
+  tok_alt_mon,
   tok_lc_messages,
   tok_yesexpr,
   tok_noexpr,
diff --git a/localedata/tst-langinfo.c b/localedata/tst-langinfo.c
index 8c3667c..0d33e75 100644
--- a/localedata/tst-langinfo.c
+++ b/localedata/tst-langinfo.c
@@ -50,6 +50,18 @@ struct map
   VAL (ABMON_8),
   VAL (ABMON_9),
   VAL (ALT_DIGITS),
+  VAL (ALTMON_1),
+  VAL (ALTMON_10),
+  VAL (ALTMON_11),
+  VAL (ALTMON_12),
+  VAL (ALTMON_2),
+  VAL (ALTMON_3),
+  VAL (ALTMON_4),
+  VAL (ALTMON_5),
+  VAL (ALTMON_6),
+  VAL (ALTMON_7),
+  VAL (ALTMON_8),
+  VAL (ALTMON_9),
   VAL (AM_STR),
   VAL (CRNCYSTR),
   VAL (CURRENCY_SYMBOL),
diff --git a/time/Makefile b/time/Makefile
index 264eed9..2deb025 100644
--- a/time/Makefile
+++ b/time/Makefile
@@ -48,7 +48,8 @@ tests := test_time clocktest tst-posixtz tst-strptime tst_wcsftime \
 include ../Rules
 
 ifeq ($(run-built-tests),yes)
-LOCALES := de_DE.ISO-8859-1 en_US.ISO-8859-1 ja_JP.EUC-JP
+LOCALES := de_DE.ISO-8859-1 en_US.ISO-8859-1 ja_JP.EUC-JP fr_FR.UTF-8 \
+   pl_PL.UTF-8
 include ../gen-locales.mk
 
 $(objpfx)tst-ftime_l.out: $(gen-locales)
diff --git a/time/strftime_l.c b/time/strftime_l.c
index 18651ff..ac5d28f 100644
--- a/time/strftime_l.c
+++ b/time/strftime_l.c
@@ -492,6 +492,9 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
 # define f_month \
   ((const CHAR_T *) (tp->tm_mon < 0 || tp->tm_mon > 11     \
      ? "?" : _NL_CURRENT (LC_TIME, NLW(MON_1) + tp->tm_mon)))
+# define f_altmonth \
+  ((const CHAR_T *) (tp->tm_mon < 0 || tp->tm_mon > 11     \
+     ? "?" : _NL_CURRENT (LC_TIME, NLW(ALTMON_1) + tp->tm_mon)))
 # define ampm \
   ((const CHAR_T *) _NL_CURRENT (LC_TIME, tp->tm_hour > 11      \
  ? NLW(PM_STR) : NLW(AM_STR)))
@@ -507,6 +510,7 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
    ? "?" : month_name[tp->tm_mon])
 #  define a_wkday f_wkday
 #  define a_month f_month
+#  define f_altmonth f_month
 #  define ampm (L_("AMPM") + 2 * (tp->tm_hour > 11))
 
   size_t aw_len = 3;
@@ -785,7 +789,7 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
 #endif
 
  case L_('B'):
-  if (modifier != 0)
+  if (modifier == L_('E'))
     goto bad_format;
   if (change_case)
     {
@@ -793,7 +797,10 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
       to_lowcase = 0;
     }
 #if defined _NL_CURRENT || !HAVE_STRFTIME
-  cpy (STRLEN (f_month), f_month);
+  if (modifier == L_('O'))
+    cpy (STRLEN (f_altmonth), f_altmonth);
+  else
+    cpy (STRLEN (f_month), f_month);
   break;
 #else
   goto underlying_strftime;
diff --git a/time/strptime_l.c b/time/strptime_l.c
index 7d4758e..39cf38d 100644
--- a/time/strptime_l.c
+++ b/time/strptime_l.c
@@ -124,6 +124,8 @@ extern const struct __locale_data _nl_C_LC_TIME attribute_hidden;
   (&_nl_C_LC_TIME.values[_NL_ITEM_INDEX (ABDAY_1)].string)
 # define month_name (&_nl_C_LC_TIME.values[_NL_ITEM_INDEX (MON_1)].string)
 # define ab_month_name (&_nl_C_LC_TIME.values[_NL_ITEM_INDEX (ABMON_1)].string)
+# define alt_month_name \
+  (&_nl_C_LC_TIME.values[_NL_ITEM_INDEX (ALTMON_1)].string)
 # define HERE_D_T_FMT (_nl_C_LC_TIME.values[_NL_ITEM_INDEX (D_T_FMT)].string)
 # define HERE_D_FMT (_nl_C_LC_TIME.values[_NL_ITEM_INDEX (D_FMT)].string)
 # define HERE_AM_STR (_nl_C_LC_TIME.values[_NL_ITEM_INDEX (AM_STR)].string)
@@ -319,10 +321,9 @@ __strptime_internal (const char *rp, const char *fmt, struct tm *tmp,
       while (*fmt >= '0' && *fmt <= '9')
  ++fmt;
 
-#ifndef _NL_CURRENT
-      /* We need this for handling the `E' modifier.  */
+      /* In some cases, modifiers are handled by adjusting state and
+         then restarting the switch statement below.  */
     start_over:
-#endif
 
       /* Make back up of current processing pointer.  */
       rp_backup = rp;
@@ -423,13 +424,32 @@ __strptime_internal (const char *rp, const char *fmt, struct tm *tmp,
      ab_month_name[cnt]))
  decided_longest = loc;
     }
+#ifdef _LIBC
+  /* Now check the alt month.  */
+  trp = rp;
+  if (match_string (_NL_CURRENT (LC_TIME, ALTMON_1 + cnt), trp)
+      && trp > rp_longest)
+    {
+      rp_longest = trp;
+      cnt_longest = cnt;
+      if (s.decided == not
+  && strcmp (_NL_CURRENT (LC_TIME, ALTMON_1 + cnt),
+     alt_month_name[cnt]))
+ decided_longest = loc;
+    }
+#endif
  }
 #endif
       if (s.decided != loc
   && (((trp = rp, match_string (month_name[cnt], trp))
        && trp > rp_longest)
       || ((trp = rp, match_string (ab_month_name[cnt], trp))
-  && trp > rp_longest)))
+  && trp > rp_longest)
+#ifdef _LIBC
+      || ((trp = rp, match_string (alt_month_name[cnt], trp))
+  && trp > rp_longest)
+#endif
+      ))
  {
   rp_longest = trp;
   cnt_longest = cnt;
@@ -1015,6 +1035,10 @@ __strptime_internal (const char *rp, const char *fmt, struct tm *tmp,
  case 'O':
   switch (*fmt++)
     {
+    case 'B':
+      /* Match month name.  Reprocess as plain 'B'.  */
+      fmt--;
+      goto start_over;
     case 'd':
     case 'e':
       /* Match day of month using alternate numeric symbols.  */
diff --git a/time/tst-strptime.c b/time/tst-strptime.c
index 34ad797..62ecb7c 100644
--- a/time/tst-strptime.c
+++ b/time/tst-strptime.c
@@ -51,6 +51,18 @@ static const struct
     6, 0, 0, 1 },
   { "ja_JP.EUC-JP", "2001 20 \xb7\xee", "%Y %U %a", 1, 140, 4, 21 },
   { "ja_JP.EUC-JP", "2001 21 \xb7\xee", "%Y %W %a", 1, 140, 4, 21 },
+  /* Most of the languages do not need the declension of the month names
+     and do not distinguish between %B and %OB.  */
+  { "en_US.ISO-8859-1", "November 17, 2017", "%B %e, %Y", 5, 320, 10, 17 },
+  { "de_DE.ISO-8859-1", "18. Nov 2017", "%d. %b %Y", 6, 321, 10, 18 },
+  { "fr_FR.UTF-8", "19 novembre 2017", "%d %OB %Y", 0, 322, 10, 19 },
+  /* Some languages do need the declension of the month names.  */
+  { "pl_PL.UTF-8", "21 lis 2017", "%d %b %Y", 2, 324, 10, 21 },
+  { "pl_PL.UTF-8", "22 LIS 2017", "%d %B %Y", 3, 325, 10, 22 },
+  /* TODO: Use the genitive case here as soon as it is added to localedata.  */
+  { "pl_PL.UTF-8", "23 listopad 2017", "%d %B %Y", 4, 326, 10, 23 },
+  /* The nominative case is incorrect here but it is parseable.  */
+  { "pl_PL.UTF-8", "24 listopad 2017", "%d %OB %Y", 5, 327, 10, 24 },
 };
 
 
--
2.7.5

[PATCH v12 3/6] Abbreviated alternative month names (%Ob) also added (bug 10871).

Rafal Luzynski
All the previous changes are also repeated to support abbreviated
alternative month names.  In most languages which have declension and
need nominative/genitive month names the abbreviated forms for both
cases are the same.  An example where they do differ is May in Russian:
this name is too short to be abbreviated so even the abbreviated form
features the declension suffixes.
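
Again purely as an illustration outside the patch: a sketch of the
abbreviated variant, assuming a glibc with this series applied.  The
names format_abbrev_month and parse_date are hypothetical helpers.  As
in the strptime_l.c change, %Ob, %OB and %Oh are parsed like their plain
counterparts, so either spelling accepts either form of the name.

```c
#define _GNU_SOURCE   /* strptime() and the %Ob extension.  */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: format only the abbreviated month with FMT,
   which is expected to be "%b" or "%Ob".  MONTH is 0-based like
   tm_mon.  Returns the number of bytes written, or 0 on failure.  */
static size_t
format_abbrev_month (const char *fmt, int month, char *buf, size_t size)
{
  struct tm tm;

  assert (month >= 0 && month <= 11);
  memset (&tm, 0, sizeof tm);
  tm.tm_mon = month;
  tm.tm_mday = 1;
  return strftime (buf, size, fmt, &tm);
}

/* Hypothetical helper: parse TEXT according to FMT.  Returns 1 and
   fills *OUT only if the whole string was consumed, otherwise
   returns 0 and leaves *OUT untouched.  */
static int
parse_date (const char *text, const char *fmt, struct tm *out)
{
  struct tm tm;
  const char *end;

  memset (&tm, 0, sizeof tm);
  end = strptime (text, fmt, &tm);
  if (end == NULL || *end != '\0')
    return 0;
  *out = tm;
  return 1;
}
```

In the C locale the abbreviated alternative names equal the ordinary
abbreviations, so "%Ob" formats May as "May" and "%d %Ob %Y" parses
"26 Nov 2017" with tm_mon set to 10.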

        [BZ #10871]
        * locale/C-time.c (_nl_C_LC_TIME): Add abbreviated alternative month
        names, define them as the same as abbreviated month names explicitly.
        * locale/categories.def (LC_TIME): Add ab_alt_mon and wide-ab_alt_mon.
        * locale/langinfo.h: (_NL_ABALTMON_1, _NL_ABALTMON_2, _NL_ABALTMON_3,
        _NL_ABALTMON_4, _NL_ABALTMON_5, _NL_ABALTMON_6, _NL_ABALTMON_7,
        _NL_ABALTMON_8, _NL_ABALTMON_9, _NL_ABALTMON_10, _NL_ABALTMON_11,
        _NL_ABALTMON_12, _NL_WABALTMON_1, _NL_WABALTMON_2, _NL_WABALTMON_3,
        _NL_WABALTMON_4, _NL_WABALTMON_5, _NL_WABALTMON_6, _NL_WABALTMON_7,
        _NL_WABALTMON_8, _NL_WABALTMON_9, _NL_WABALTMON_10, _NL_WABALTMON_11,
        _NL_WABALTMON_12): New enum constants.
        * locale/programs/ld-time.c (struct locale_time_t): Add ab_alt_mon,
        wab_alt_mon, and ab_alt_mon_defined members.
        (time_output): Output ab_alt_mon and wab_alt_mon members.
        (time_read): Read them, initialize them as copies of abmon and wabmon
        respectively if they are missing, initialize ab_alt_mon_defined.
        * locale/programs/locfile-kw.gperf (ab_alt_mon): Define.
        * locale/programs/locfile-token.h (tok_ab_alt_mon): New enum constant.
        * time/Makefile [$(run-built-tests) = yes] (LOCALES): Add es_ES.UTF-8
        and ru_RU.UTF-8.
        * time/strftime_l.c (a_altmonth, aam_len): New macros.
        [!COMPILE_WIDE] (ABALTMON_1): New macro.
        (__strftime_internal): Handle %Ob and %Oh formats.
        * time/strptime_l.c [_LIBC] (ab_alt_month_name): New macro.
        (__strptime_internal): Handle %Ob and %Oh formats.
        * time/tst-strptime.c (day_tests): Add more tests to parse different
        forms of month names including the new %Ob format specifier.
---
 locale/C-time.c                  | 28 ++++++++++++++++++++++++++--
 locale/categories.def            |  6 ++++--
 locale/langinfo.h                | 36 ++++++++++++++++++++++++++++++++++--
 locale/programs/ld-time.c        | 19 +++++++++++++++++++
 locale/programs/locfile-kw.gperf |  1 +
 locale/programs/locfile-token.h  |  1 +
 time/Makefile                    |  2 +-
 time/strftime_l.c                | 14 ++++++++++++--
 time/strptime_l.c                | 18 ++++++++++++++++++
 time/tst-strptime.c              |  9 +++++++++
 10 files changed, 125 insertions(+), 9 deletions(-)

diff --git a/locale/C-time.c b/locale/C-time.c
index 73bc700..e2b3b17 100644
--- a/locale/C-time.c
+++ b/locale/C-time.c
@@ -30,7 +30,7 @@ const struct __locale_data _nl_C_LC_TIME attribute_hidden =
   { NULL, }, /* no cached data */
   UNDELETABLE,
   0,
-  135,
+  159,
   {
     { .string = "Sun" },
     { .string = "Mon" },
@@ -166,6 +166,30 @@ const struct __locale_data _nl_C_LC_TIME attribute_hidden =
     { .wstr = (const uint32_t *) L"September" },
     { .wstr = (const uint32_t *) L"October" },
     { .wstr = (const uint32_t *) L"November" },
-    { .wstr = (const uint32_t *) L"December" }
+    { .wstr = (const uint32_t *) L"December" },
+    { .string = "Jan" },
+    { .string = "Feb" },
+    { .string = "Mar" },
+    { .string = "Apr" },
+    { .string = "May" },
+    { .string = "Jun" },
+    { .string = "Jul" },
+    { .string = "Aug" },
+    { .string = "Sep" },
+    { .string = "Oct" },
+    { .string = "Nov" },
+    { .string = "Dec" },
+    { .wstr = (const uint32_t *) L"Jan" },
+    { .wstr = (const uint32_t *) L"Feb" },
+    { .wstr = (const uint32_t *) L"Mar" },
+    { .wstr = (const uint32_t *) L"Apr" },
+    { .wstr = (const uint32_t *) L"May" },
+    { .wstr = (const uint32_t *) L"Jun" },
+    { .wstr = (const uint32_t *) L"Jul" },
+    { .wstr = (const uint32_t *) L"Aug" },
+    { .wstr = (const uint32_t *) L"Sep" },
+    { .wstr = (const uint32_t *) L"Oct" },
+    { .wstr = (const uint32_t *) L"Nov" },
+    { .wstr = (const uint32_t *) L"Dec" }
   }
 };
diff --git a/locale/categories.def b/locale/categories.def
index 3cbb4e6..56c5f88 100644
--- a/locale/categories.def
+++ b/locale/categories.def
@@ -249,8 +249,10 @@ DEFINE_CATEGORY
   DEFINE_ELEMENT (_DATE_FMT,                "date_fmt",            opt, string)
   DEFINE_ELEMENT (_NL_W_DATE_FMT,           "wide-date_fmt",       opt, wstring)
   DEFINE_ELEMENT (_NL_TIME_CODESET,    "time-codeset",   std, string)
-  DEFINE_ELEMENT (ALTMON_1,       "alt_mon",       opt, stringarray, 12, 12)
-  DEFINE_ELEMENT (_NL_WALTMON_1,  "wide-alt_mon",  opt, wstringarray, 12, 12)
+  DEFINE_ELEMENT (ALTMON_1,        "alt_mon",         opt, stringarray,  12, 12)
+  DEFINE_ELEMENT (_NL_WALTMON_1,   "wide-alt_mon",    opt, wstringarray, 12, 12)
+  DEFINE_ELEMENT (_NL_ABALTMON_1,  "ab_alt_mon",      opt, stringarray,  12, 12)
+  DEFINE_ELEMENT (_NL_WABALTMON_1, "wide-ab_alt_mon", opt, wstringarray, 12, 12)
   ), NO_POSTLOAD)
 
 
diff --git a/locale/langinfo.h b/locale/langinfo.h
index 65374dd..a50cc9b 100644
--- a/locale/langinfo.h
+++ b/locale/langinfo.h
@@ -74,7 +74,8 @@ enum
   DAY_7, /* Saturday */
 #define DAY_7 DAY_7
 
-  /* Abbreviated month names.  */
+  /* Abbreviated month names, in the grammatical form used when the month
+     is a part of a complete date.  */
   ABMON_1, /* Jan */
 #define ABMON_1 ABMON_1
   ABMON_2,
@@ -176,7 +177,8 @@ enum
   _NL_WDAY_6, /* Friday */
   _NL_WDAY_7, /* Saturday */
 
-  /* Abbreviated month names.  */
+  /* Abbreviated month names, in the grammatical form used when the month
+     is a part of a complete date.  */
   _NL_WABMON_1, /* Jan */
   _NL_WABMON_2,
   _NL_WABMON_3,
@@ -277,6 +279,36 @@ enum
   _NL_WALTMON_11,
   _NL_WALTMON_12,
 
+  /* Abbreviated month names, in the grammatical form used when the month
+     is named by itself.  */
+  _NL_ABALTMON_1, /* Jan */
+  _NL_ABALTMON_2,
+  _NL_ABALTMON_3,
+  _NL_ABALTMON_4,
+  _NL_ABALTMON_5,
+  _NL_ABALTMON_6,
+  _NL_ABALTMON_7,
+  _NL_ABALTMON_8,
+  _NL_ABALTMON_9,
+  _NL_ABALTMON_10,
+  _NL_ABALTMON_11,
+  _NL_ABALTMON_12,
+
+  /* Abbreviated month names, in the grammatical form used when the month
+     is named by itself.  */
+  _NL_WABALTMON_1, /* Jan */
+  _NL_WABALTMON_2,
+  _NL_WABALTMON_3,
+  _NL_WABALTMON_4,
+  _NL_WABALTMON_5,
+  _NL_WABALTMON_6,
+  _NL_WABALTMON_7,
+  _NL_WABALTMON_8,
+  _NL_WABALTMON_9,
+  _NL_WABALTMON_10,
+  _NL_WABALTMON_11,
+  _NL_WABALTMON_12,
+
   _NL_NUM_LC_TIME, /* Number of indices in LC_TIME category.  */
 
   /* LC_COLLATE category: text sorting.
diff --git a/locale/programs/ld-time.c b/locale/programs/ld-time.c
index 4186448..a755792 100644
--- a/locale/programs/ld-time.c
+++ b/locale/programs/ld-time.c
@@ -94,6 +94,9 @@ struct locale_time_t
   const char *alt_mon[12];
   const uint32_t *walt_mon[12];
   int alt_mon_defined;
+  const char *ab_alt_mon[12];
+  const uint32_t *wab_alt_mon[12];
+  int ab_alt_mon_defined;
   unsigned char week_ndays;
   uint32_t week_1stday;
   unsigned char week_1stweek;
@@ -651,6 +654,14 @@ time_output (struct localedef_t *locale, const struct charmap_t *charmap,
   for (n = 0; n < 12; ++n)
     add_locale_wstring (&file, time->walt_mon[n] ?: empty_wstr);
 
+  /* The ab'alt'mons.  */
+  for (n = 0; n < 12; ++n)
+    add_locale_string (&file, time->ab_alt_mon[n] ?: "");
+
+  /* The wide character ab'alt'mons.  */
+  for (n = 0; n < 12; ++n)
+    add_locale_wstring (&file, time->wab_alt_mon[n] ?: empty_wstr);
+
   write_locale_data (output_path, LC_TIME, "LC_TIME", &file);
 }
 
@@ -795,6 +806,7 @@ time_read (struct linereader *ldfile, struct localedef_t *result,
   STRARR_ELEM (am_pm, 2, 2);
   STRARR_ELEM (alt_digits, 0, 100);
   STRARR_ELEM (alt_mon, 12, 12);
+  STRARR_ELEM (ab_alt_mon, 12, 12);
 
  case tok_era:
   /* Ignore the rest of the line if we don't need the input of
@@ -955,6 +967,13 @@ time_read (struct linereader *ldfile, struct localedef_t *result,
       memcpy (time->walt_mon, time->wmon, sizeof (time->wmon));
       time->alt_mon_defined = 1;
     }
+  /* The same for abbreviated versions.  */
+  if (!ignore_content && !time->ab_alt_mon_defined)
+    {
+      memcpy (time->ab_alt_mon, time->abmon, sizeof (time->abmon));
+      memcpy (time->wab_alt_mon, time->wabmon, sizeof (time->wabmon));
+      time->ab_alt_mon_defined = 1;
+    }
   return;
 
  default:
diff --git a/locale/programs/locfile-kw.gperf b/locale/programs/locfile-kw.gperf
index dad7f21..6bf2f60 100644
--- a/locale/programs/locfile-kw.gperf
+++ b/locale/programs/locfile-kw.gperf
@@ -149,6 +149,7 @@ cal_direction,          tok_cal_direction,          0
 timezone,               tok_timezone,               0
 date_fmt,               tok_date_fmt,               0
 alt_mon,                tok_alt_mon,                0
+ab_alt_mon,             tok_ab_alt_mon,             0
 LC_MESSAGES,            tok_lc_messages,            0
 yesexpr,                tok_yesexpr,                0
 noexpr,                 tok_noexpr,                 0
diff --git a/locale/programs/locfile-token.h b/locale/programs/locfile-token.h
index d49da5e..e3cd18e 100644
--- a/locale/programs/locfile-token.h
+++ b/locale/programs/locfile-token.h
@@ -187,6 +187,7 @@ enum token_t
   tok_timezone,
   tok_date_fmt,
   tok_alt_mon,
+  tok_ab_alt_mon,
   tok_lc_messages,
   tok_yesexpr,
   tok_noexpr,
diff --git a/time/Makefile b/time/Makefile
index 2deb025..0db1206 100644
--- a/time/Makefile
+++ b/time/Makefile
@@ -49,7 +49,7 @@ include ../Rules
 
 ifeq ($(run-built-tests),yes)
 LOCALES := de_DE.ISO-8859-1 en_US.ISO-8859-1 ja_JP.EUC-JP fr_FR.UTF-8 \
-   pl_PL.UTF-8
+   es_ES.UTF-8 pl_PL.UTF-8 ru_RU.UTF-8
 include ../gen-locales.mk
 
 $(objpfx)tst-ftime_l.out: $(gen-locales)
diff --git a/time/strftime_l.c b/time/strftime_l.c
index ac5d28f..c71f9f4 100644
--- a/time/strftime_l.c
+++ b/time/strftime_l.c
@@ -106,6 +106,7 @@ extern char *tzname[];
 # define UCHAR_T unsigned char
 # define L_(Str) Str
 # define NLW(Sym) Sym
+# define ABALTMON_1 _NL_ABALTMON_1
 
 # if !defined STDC_HEADERS && !defined HAVE_MEMCPY
 #  define MEMCPY(d, s, n) bcopy ((s), (d), (n))
@@ -492,6 +493,9 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
 # define f_month \
   ((const CHAR_T *) (tp->tm_mon < 0 || tp->tm_mon > 11     \
      ? "?" : _NL_CURRENT (LC_TIME, NLW(MON_1) + tp->tm_mon)))
+# define a_altmonth \
+  ((const CHAR_T *) (tp->tm_mon < 0 || tp->tm_mon > 11     \
+     ? "?" : _NL_CURRENT (LC_TIME, NLW(ABALTMON_1) + tp->tm_mon)))
 # define f_altmonth \
   ((const CHAR_T *) (tp->tm_mon < 0 || tp->tm_mon > 11     \
      ? "?" : _NL_CURRENT (LC_TIME, NLW(ALTMON_1) + tp->tm_mon)))
@@ -501,6 +505,7 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
 
 # define aw_len STRLEN (a_wkday)
 # define am_len STRLEN (a_month)
+# define aam_len STRLEN (a_altmonth)
 # define ap_len STRLEN (ampm)
 #else
 # if !HAVE_STRFTIME
@@ -510,11 +515,13 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
    ? "?" : month_name[tp->tm_mon])
 #  define a_wkday f_wkday
 #  define a_month f_month
+#  define a_altmonth a_month
 #  define f_altmonth f_month
 #  define ampm (L_("AMPM") + 2 * (tp->tm_hour > 11))
 
   size_t aw_len = 3;
   size_t am_len = 3;
+  size_t aam_len = 3;
   size_t ap_len = 2;
 # endif
 #endif
@@ -779,10 +786,13 @@ __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
       to_uppcase = 1;
       to_lowcase = 0;
     }
-  if (modifier != 0)
+  if (modifier == L_('E'))
     goto bad_format;
 #if defined _NL_CURRENT || !HAVE_STRFTIME
-  cpy (am_len, a_month);
+  if (modifier == L_('O'))
+    cpy (aam_len, a_altmonth);
+  else
+    cpy (am_len, a_month);
   break;
 #else
   goto underlying_strftime;
diff --git a/time/strptime_l.c b/time/strptime_l.c
index 39cf38d..cd901c2 100644
--- a/time/strptime_l.c
+++ b/time/strptime_l.c
@@ -126,6 +126,8 @@ extern const struct __locale_data _nl_C_LC_TIME attribute_hidden;
 # define ab_month_name (&_nl_C_LC_TIME.values[_NL_ITEM_INDEX (ABMON_1)].string)
 # define alt_month_name \
   (&_nl_C_LC_TIME.values[_NL_ITEM_INDEX (ALTMON_1)].string)
+# define ab_alt_month_name \
+  (&_nl_C_LC_TIME.values[_NL_ITEM_INDEX (_NL_ABALTMON_1)].string)
 # define HERE_D_T_FMT (_nl_C_LC_TIME.values[_NL_ITEM_INDEX (D_T_FMT)].string)
 # define HERE_D_FMT (_nl_C_LC_TIME.values[_NL_ITEM_INDEX (D_FMT)].string)
 # define HERE_AM_STR (_nl_C_LC_TIME.values[_NL_ITEM_INDEX (AM_STR)].string)
@@ -437,6 +439,18 @@ __strptime_internal (const char *rp, const char *fmt, struct tm *tmp,
      alt_month_name[cnt]))
  decided_longest = loc;
     }
+  trp = rp;
+  if (match_string (_NL_CURRENT (LC_TIME, _NL_ABALTMON_1 + cnt),
+    trp)
+      && trp > rp_longest)
+    {
+      rp_longest = trp;
+      cnt_longest = cnt;
+      if (s.decided == not
+  && strcmp (_NL_CURRENT (LC_TIME, _NL_ABALTMON_1 + cnt),
+     alt_month_name[cnt]))
+ decided_longest = loc;
+    }
 #endif
  }
 #endif
@@ -448,6 +462,8 @@ __strptime_internal (const char *rp, const char *fmt, struct tm *tmp,
 #ifdef _LIBC
       || ((trp = rp, match_string (alt_month_name[cnt], trp))
   && trp > rp_longest)
+      || ((trp = rp, match_string (ab_alt_month_name[cnt], trp))
+  && trp > rp_longest)
 #endif
       ))
  {
@@ -1035,7 +1051,9 @@ __strptime_internal (const char *rp, const char *fmt, struct tm *tmp,
  case 'O':
   switch (*fmt++)
     {
+    case 'b':
     case 'B':
+    case 'h':
       /* Match month name.  Reprocess as plain 'B'.  */
       fmt--;
       goto start_over;
diff --git a/time/tst-strptime.c b/time/tst-strptime.c
index 62ecb7c..2eac5a2 100644
--- a/time/tst-strptime.c
+++ b/time/tst-strptime.c
@@ -56,6 +56,7 @@ static const struct
   { "en_US.ISO-8859-1", "November 17, 2017", "%B %e, %Y", 5, 320, 10, 17 },
   { "de_DE.ISO-8859-1", "18. Nov 2017", "%d. %b %Y", 6, 321, 10, 18 },
   { "fr_FR.UTF-8", "19 novembre 2017", "%d %OB %Y", 0, 322, 10, 19 },
+  { "es_ES.UTF-8", "20 de nov de 2017", "%d de %Ob de %Y", 1, 323, 10, 20 },
   /* Some languages do need the declension of the month names.  */
   { "pl_PL.UTF-8", "21 lis 2017", "%d %b %Y", 2, 324, 10, 21 },
   { "pl_PL.UTF-8", "22 LIS 2017", "%d %B %Y", 3, 325, 10, 22 },
@@ -63,6 +64,14 @@ static const struct
   { "pl_PL.UTF-8", "23 listopad 2017", "%d %B %Y", 4, 326, 10, 23 },
   /* The nominative case is incorrect here but it is parseable.  */
   { "pl_PL.UTF-8", "24 listopad 2017", "%d %OB %Y", 5, 327, 10, 24 },
+  { "pl_PL.UTF-8", "25 lis 2017", "%d %Ob %Y", 6, 328, 10, 25 },
+  /* ноя - pronounce: 'noya' - "Nov" (abbreviated "November") in Russian.  */
+  { "ru_RU.UTF-8", "26 ноя 2017", "%d %b %Y", 0, 329, 10, 26 },
+  /* TODO: Add an example of "may"/"maya" (5th month, May) using %Ob in
+     Russian when the localedata is updated.  Without the genitive forms
+     in localedata the word "maya" is ambiguous and may be mistaken for
+     "mart" (March).
+   */
 };
 
 
--
2.7.5
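
The first strptime_l.c hunk above extends the existing rule that the parser keeps the longest month name matching at the current position. A minimal standalone sketch of that rule follows; it is not the glibc code. The function name match_month_longest and the Polish tables (modelled on patch 6/6) are assumptions made for illustration, and unlike glibc's match_string the comparison here is case-sensitive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N_MONTHS 12

/* Illustrative tables: genitive (used in a complete date), nominative
   (standalone), and abbreviated Polish month names, encoded as UTF-8.  */
static const char *const pl_mon[N_MONTHS] = {
  "stycznia", "lutego", "marca", "kwietnia", "maja", "czerwca",
  "lipca", "sierpnia", "września", "października", "listopada", "grudnia"
};
static const char *const pl_alt_mon[N_MONTHS] = {
  "styczeń", "luty", "marzec", "kwiecień", "maj", "czerwiec",
  "lipiec", "sierpień", "wrzesień", "październik", "listopad", "grudzień"
};
static const char *const pl_abmon[N_MONTHS] = {
  "sty", "lut", "mar", "kwi", "maj", "cze",
  "lip", "sie", "wrz", "paź", "lis", "gru"
};
static const char *const *const pl_tables[] = { pl_mon, pl_alt_mon, pl_abmon };

/* Try every name in every table against the start of INPUT and keep the
   longest one that matches.  Returns the number of bytes consumed and
   stores the 0-based month in *MONTH; returns 0 and leaves *MONTH
   untouched when nothing matches.  */
static size_t
match_month_longest (const char *input,
                     const char *const *const tables[], size_t n_tables,
                     int *month)
{
  size_t best_len = 0;

  assert (input != NULL && tables != NULL && month != NULL);
  for (int cnt = 0; cnt < N_MONTHS; ++cnt)
    for (size_t t = 0; t < n_tables; ++t)
      {
        const char *name = tables[t][cnt];
        size_t len = strlen (name);

        /* Only a strictly longer match replaces the current best.  */
        if (len > best_len && strncmp (input, name, len) == 0)
          {
            best_len = len;
            *month = cnt;
          }
      }
  return best_len;
}
```

With these tables, "maja 2017" matches both "maj" and "maja"; the longer, genitive form wins, so four bytes are consumed and " 2017" is left for the rest of the format.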

[PATCH v12 5/6] Documentation to the above changes (bug 10871).

Rafal Luzynski
In reply to this post by Rafal Luzynski
        [BZ #10871]
        * manual/locale.texi: Document ALTMON_1..12 constants for
        nl_langinfo.  Specify when to use ALTMON instead of MON.
        * manual/time.texi (strftime, strptime): Document GNU extension
        permitting O modifier with %B and %b.  Specify when to use
        %OB instead of %B.
---
 ChangeLog          |  9 +++++++++
 NEWS               | 24 ++++++++++++++++++++++++
 manual/locale.texi | 26 +++++++++++++++++++++++---
 manual/time.texi   | 35 +++++++++++++++++++++++++++--------
 4 files changed, 83 insertions(+), 11 deletions(-)

diff --git a/NEWS b/NEWS
index 75bf467..c70d8a9 100644
--- a/NEWS
+++ b/NEWS
@@ -69,6 +69,30 @@ Major new features:
   collation ordering.  Previous glibc versions used locale-specific
   ordering, the change might break systems that relied on that.
 
+* Support for two grammatical forms of month name has been added.
+  In a call to strftime, the "%B" and "%b" format specifiers will now
+  produce the grammatical form required when the month is used as part
+  of a complete date.  New "%OB" and "%Ob" specifiers produce the form
+  required when the month is named by itself.  For instance, in Greek
+  and in many Slavic and Baltic languages, "%B" will produce the month
+  in genitive case, and "%OB" will produce the month in nominative case.
+
+  In a call to strptime, "%B", "%b", "%h", "%OB", "%Ob", and "%Oh"
+  are all valid and will all accept any known form of month
+  name---standalone or complete, abbreviated or full.  In a call to
+  nl_langinfo, the query constants MON_1..12 and ABMON_1..12 return
+  the strings used by "%B" and "%b", respectively.  New query
+  constants ALTMON_1..12 and _NL_ABALTMON_1..12 return the strings
+  used by "%OB" and "%Ob", respectively.
+
+  In a locale definition file, use "alt_mon" and "ab_alt_mon" to
+  define the strings for %OB and %Ob, respectively; these have the
+  same syntax as "mon" and "ab_mon".
+
+  This feature is currently a GNU extension, but it is expected to
+  be added to the next revision of POSIX, and it is also already
+  available on some BSD-derived operating systems.
+
 Deprecated and removed features, and other changes affecting compatibility:
 
 * Support for statically linked applications which call dlopen is deprecated
diff --git a/manual/locale.texi b/manual/locale.texi
index 60ad2a1..059db75 100644
--- a/manual/locale.texi
+++ b/manual/locale.texi
@@ -923,7 +923,7 @@ corresponds to Sunday.
 @itemx DAY_5
 @itemx DAY_6
 @itemx DAY_7
-Similar to @code{ABDAY_1} etc., but here the return value is the
+Similar to @code{ABDAY_1} etc.,@: but here the return value is the
 unabbreviated weekday name.
 @item ABMON_1
 @itemx ABMON_2
@@ -937,7 +937,8 @@ unabbreviated weekday name.
 @itemx ABMON_10
 @itemx ABMON_11
 @itemx ABMON_12
-The return value is abbreviated name of the month.  @code{ABMON_1}
+The return value is the abbreviated name of the month, in the grammatical
+form used when the month forms part of a complete date.  @code{ABMON_1}
 corresponds to January.
 @item MON_1
 @itemx MON_2
@@ -951,8 +952,27 @@ corresponds to January.
 @itemx MON_10
 @itemx MON_11
 @itemx MON_12
-Similar to @code{ABMON_1} etc., but here the month names are not abbreviated.
+Similar to @code{ABMON_1} etc.,@: but here the month names are not abbreviated.
 Here the first value @code{MON_1} also corresponds to January.
+@item ALTMON_1
+@itemx ALTMON_2
+@itemx ALTMON_3
+@itemx ALTMON_4
+@itemx ALTMON_5
+@itemx ALTMON_6
+@itemx ALTMON_7
+@itemx ALTMON_8
+@itemx ALTMON_9
+@itemx ALTMON_10
+@itemx ALTMON_11
+@itemx ALTMON_12
+Similar to @code{MON_1} etc.,@: but here the month names are in the grammatical
+form used when the month is named by itself.  The @code{strftime} functions
+use these month names for the format specifier @code{OB}.
+
+Note that not all languages need two different forms of the month names,
+so the strings returned for @code{MON_@dots{}} and @code{ALTMON_@dots{}}
+may or may not be the same, depending on the locale.
 @item AM_STR
 @itemx PM_STR
 The return values are strings which can be used in the representation of time
diff --git a/manual/time.texi b/manual/time.texi
index 33aa221..2a5afd9 100644
--- a/manual/time.texi
+++ b/manual/time.texi
@@ -1346,8 +1346,13 @@ example, @code{%Ex} might yield a date format based on the Japanese
 Emperors' reigns.
 
 @item O
-Use the locale's alternate numeric symbols for numbers.  This modifier
-applies only to numeric format specifiers.
+With all format specifiers that produce numbers: use the locale's
+alternate numeric symbols.
+
+With @code{%B} and @code{%b}: use the grammatical form for month names
+that is appropriate when the month is named by itself, rather than
+the form that is appropriate when the month is used as part of a
+complete date.  This is a GNU extension.
 @end table
 
 If the format supports the modifier but no alternate representation
@@ -1365,13 +1370,21 @@ The abbreviated weekday name according to the current locale.
 The full weekday name according to the current locale.
 
 @item %b
-The abbreviated month name according to the current locale.
+The abbreviated month name according to the current locale, in the
+grammatical form used when the month is part of a complete date.
+As a GNU extension, the @code{O} modifier can be used (@code{%Ob})
+to get the grammatical form used when the month is named by itself.
 
 @item %B
-The full month name according to the current locale.
+The full month name according to the current locale, in the
+grammatical form used when the month is part of a complete date.
+As a GNU extension, the @code{O} modifier can be used (@code{%OB})
+to get the grammatical form used when the month is named by itself.
 
-Using @code{%B} together with @code{%d} produces grammatically
-incorrect results for some locales.
+Note that not all languages need two different forms of the month
+names, so the text produced by @code{%B} and @code{%OB}, and by
+@code{%b} and @code{%Ob}, may or may not be the same, depending on
+the locale.
 
 @item %c
 The preferred calendar time representation for the current locale.
@@ -1778,8 +1791,14 @@ the full name.
 @item %b
 @itemx %B
 @itemx %h
-The month name according to the current locale, in abbreviated form or
-the full name.
+A month name according to the current locale.  All three specifiers
+will recognize both abbreviated and full month names.  If the
+locale provides two different grammatical forms of month names,
+all three specifiers will recognize both forms.
+
+As a GNU extension, the @code{O} modifier can be used with these
+specifiers; it has no effect, as both grammatical forms of month
+names are recognized.
 
 @item %c
 The date and time representation for the current locale.
--
2.7.5
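
The difference between "%B" and "%OB" documented above can be observed from a program. The sketch below is an illustration only, assuming a C library that accepts the O modifier with B (glibc with this series applied, or the BSD-derived systems mentioned in NEWS). The helper names format_tm and month_forms_differ are hypothetical. The format is passed at run time through a thin wrapper so that a compiler's static format checking, which may not know about "%OB", is not involved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Thin wrapper around strftime taking the format as a run-time value.  */
static size_t
format_tm (char *buf, size_t size, const char *fmt, const struct tm *tm)
{
  assert (buf != NULL && fmt != NULL && tm != NULL);
  return strftime (buf, size, fmt, tm);
}

/* Return 1 if the current LC_TIME locale has distinct in-date ("%B") and
   standalone ("%OB") full month names for TM, 0 if both are identical,
   and -1 if formatting failed.  */
static int
month_forms_differ (const struct tm *tm)
{
  char in_date[128];
  char standalone[128];

  if (format_tm (in_date, sizeof in_date, "%B", tm) == 0
      || format_tm (standalone, sizeof standalone, "%OB", tm) == 0)
    return -1;
  return strcmp (in_date, standalone) != 0;
}
```

A caller would use "%d %B %Y" for a complete date and "%OB %Y" for something like a calendar header. In the "C" locale both forms are the same English name, so month_forms_differ reports 0 there; a difference only shows up in a locale whose data defines alt_mon separately.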

[PATCH v12 6/6] pl_PL: Add alternative month names (bug 10871).

Rafal Luzynski
In reply to this post by Rafal Luzynski
        [BZ #10871]
        * localedata/locales/pl_PL: Alternative month names added,
        primary month names are genitive now.
        * time/tst-strptime.c (day_tests): Actually use a genitive case
        of a month name in Polish language.
---
 localedata/locales/pl_PL | 14 +++++++++++++-
 time/tst-strptime.c      |  3 +--
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/localedata/locales/pl_PL b/localedata/locales/pl_PL
index 2a8d09d..632a1b3 100644
--- a/localedata/locales/pl_PL
+++ b/localedata/locales/pl_PL
@@ -175,7 +175,7 @@ abmon   "sty";"lut";/
         "lip";"sie";/
         "wrz";"pa<U017A>";/
         "lis";"gru"
-mon     "stycze<U0144>";/
+alt_mon "stycze<U0144>";/
         "luty";/
         "marzec";/
         "kwiecie<U0144>";/
@@ -187,6 +187,18 @@ mon     "stycze<U0144>";/
         "pa<U017A>dziernik";/
         "listopad";/
         "grudzie<U0144>"
+mon     "stycznia";/
+        "lutego";/
+        "marca";/
+        "kwietnia";/
+        "maja";/
+        "czerwca";/
+        "lipca";/
+        "sierpnia";/
+        "wrze<U015B>nia";/
+        "pa<U017A>dziernika";/
+        "listopada";/
+        "grudnia"
 d_t_fmt "%a, %-d %b %Y, %T"
 d_fmt   "%d.%m.%Y"
 t_fmt   "%T"
diff --git a/time/tst-strptime.c b/time/tst-strptime.c
index 2eac5a2..49dfbe9 100644
--- a/time/tst-strptime.c
+++ b/time/tst-strptime.c
@@ -60,8 +60,7 @@ static const struct
   /* Some languages do need the declension of the month names.  */
   { "pl_PL.UTF-8", "21 lis 2017", "%d %b %Y", 2, 324, 10, 21 },
   { "pl_PL.UTF-8", "22 LIS 2017", "%d %B %Y", 3, 325, 10, 22 },
-  /* TODO: Use the genitive case here as soon as it is added to localedata.  */
-  { "pl_PL.UTF-8", "23 listopad 2017", "%d %B %Y", 4, 326, 10, 23 },
+  { "pl_PL.UTF-8", "23 listopada 2017", "%d %B %Y", 4, 326, 10, 23 },
   /* The nominative case is incorrect here but it is parseable.  */
   { "pl_PL.UTF-8", "24 listopad 2017", "%d %OB %Y", 5, 327, 10, 24 },
   { "pl_PL.UTF-8", "25 lis 2017", "%d %Ob %Y", 6, 328, 10, 25 },
--
2.7.5
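
The table entries in time/tst-strptime.c pair a locale, an input string, and a format with the expected broken-down date. A reduced sketch of such a check follows; it is not the test from the patch. The struct parse_case and run_parse_case names are hypothetical, only tm_mon and tm_mday are compared, and a locale that is not installed is reported as skipped rather than failed.

```c
#define _XOPEN_SOURCE 700   /* For strptime.  */
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

struct parse_case
{
  const char *locale;   /* Locale to select before parsing.  */
  const char *input;    /* Text to parse.  */
  const char *format;   /* strptime format.  */
  int mon;              /* Expected tm_mon (0-based).  */
  int mday;             /* Expected tm_mday.  */
};

/* Return 1 if the case parses completely with the expected month and day,
   0 if it does not, and -1 if the locale cannot be selected.  The global
   locale is reset to "C" before returning.  */
static int
run_parse_case (const struct parse_case *c)
{
  struct tm tm;

  assert (c != NULL);
  if (setlocale (LC_ALL, c->locale) == NULL)
    return -1;

  memset (&tm, 0, sizeof tm);
  const char *end = strptime (c->input, c->format, &tm);
  int ok = (end != NULL && *end == '\0'
            && tm.tm_mon == c->mon && tm.tm_mday == c->mday);

  setlocale (LC_ALL, "C");
  return ok;
}
```

For example, the case { "pl_PL.UTF-8", "23 listopada 2017", "%d %B %Y", 10, 23 } is expected to pass only once the locale data contains the genitive forms; on a system without that locale it is skipped.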

Re: [PATCH v12 0/6][BZ 10871] Month names in alternative grammatical case

Rafal Luzynski
In reply to this post by Rafal Luzynski
I have uploaded all these patches (and the previous series as well)
to bugzilla:

https://sourceware.org/bugzilla/show_bug.cgi?id=10871

This should facilitate tracking changes between the versions
of the patches.

Regards,

Rafal

Re: [PATCH v12 0/6][BZ 10871] Month names in alternative grammatical case

Carlos O'Donell-6
In reply to this post by Rafal Luzynski
On 01/12/2018 12:12 AM, Rafal Luzynski wrote:

> [PATCH v12 0/6][BZ 10871] Month names in alternative grammatical case
>
> Thank you Zack, Carlos, and Dmitry for your reviews.
>
> For completeness, this is the 12th version of my patches.  Not much has
> been changed, mostly changelog and commit messages.
>
> More details of the changes since the version 11, [1] except the commit
> messages:
>
> Patch 1/6: More tests for strptime() added, in English, German, and French.
> Patch 3/6: A test for strptime() in Spanish added.  A test in Russian uses
>     the Unicode (UTF-8) characters directly rather than hexadecimal
>     sequences.
> Patch 5/6: Minor reword.
> Patch 6/6: This patch previously has not been numbered.  It can be treated
>     as part of this series or as a separate patch.  This new version
>     adds a test for strptime() actually using a nominative case of a month
>     name.
>
> The missing patches are available in my github repo [2] which is up to
> date this time.
>
> I'd like to receive at least one "OK" review before I push to stable.

I'm going to install these patches, test them out, and give my OK.

I want to just verify the static binary case that Florian asked about
so we can document the result.


--
Cheers,
Carlos.

Re: [PATCH v12 0/6][BZ 10871] Month names in alternative grammatical case

Rafal Luzynski
14.01.2018 04:53 Carlos O'Donell <[hidden email]> wrote:
> [...]
> I'm going to install these patches and test it out and give my OK.

Thank you for your effort.  Please note that you need to update
some locale data and use the corresponding locale in order to see
any difference.  But it's probably also worth testing without the
locale data updated in order to verify that there is no difference
if the locale data is not changed.

You can also use some of my test programs. [1] Feel free to ask if
you need some hints about how to verify if the date is correct in
an inflected language.

> I want to just verify the static binary case that Florian asked about
> so we can document the result.

I made a simple test and it confirms that the static binary is unable
to load any new locale data and it just falls back to a default
builtin "C" locale.  My tests were not thorough, but I guess
that this is caused by the sanity checks in locale/loadlocale.c,
which make sure that the locale data is neither too short nor too long.
Also I noticed that a binary segfaults if it assumes that the newlocale()
function call is always successful and does not verify that the result
is nonzero.  But this can be treated as an application bug.
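
The application-side check mentioned above can be sketched as follows. This is only an illustration of testing the newlocale() result before use, assuming the POSIX.1-2008 interfaces newlocale, strftime_l and freelocale are available; the helper name format_in_locale and the fallback to the "C" locale are choices made for this example, not something the patches require.

```c
#define _POSIX_C_SOURCE 200809L   /* For newlocale, strftime_l, freelocale.  */
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <time.h>

/* Format TM with FMT using the locale named LOCALE_NAME.  If that locale
   cannot be loaded, fall back to the builtin "C" locale instead of passing
   a null locale_t on.  Returns the number of bytes written, or 0 on
   failure.  */
static size_t
format_in_locale (char *buf, size_t size, const char *fmt,
                  const struct tm *tm, const char *locale_name)
{
  assert (buf != NULL && fmt != NULL && tm != NULL && locale_name != NULL);

  locale_t loc = newlocale (LC_ALL_MASK, locale_name, (locale_t) 0);
  if (loc == (locale_t) 0)
    /* Loading failed, for instance because the data is missing or was
       rejected; try the locale that is always available.  */
    loc = newlocale (LC_ALL_MASK, "C", (locale_t) 0);
  if (loc == (locale_t) 0)
    return 0;

  size_t n = strftime_l (buf, size, fmt, tm, loc);
  freelocale (loc);
  return n;
}
```

A program written this way degrades to English month names when the requested locale cannot be loaded, rather than crashing.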

A workaround for the problem would be to deliver the old locale data
and put their directory name in the LOCPATH environment variable.

A similar problem has been reported as bug 19084 [2] and the answer
was that it cannot be fixed for statically linked binaries.

Regards,

Rafal

[1] https://github.com/rluzynski/months-test
[2] https://sourceware.org/bugzilla/show_bug.cgi?id=19084

Re: [PATCH v12 5/6] Documentation to the above changes (bug 10871).

Rical Jasan
In reply to this post by Rafal Luzynski
Sorry I haven't had time to review sooner, but I was able to make some
time today.

On 01/12/2018 12:18 AM, Rafal Luzynski wrote:

> [BZ #10871]
> * manual/locale.texi: Document ALTMON_1..12 constants for
> nl_langinfo.  Specify when to use ALTMON instead of MON.
> * manual/time.texi (strftime, strptime): Document GNU extension
> permitting O modifier with %B and %b.  Specify when to use
> %OB instead of %B.
> ---
>  ChangeLog          |  9 +++++++++
>  NEWS               | 24 ++++++++++++++++++++++++
>  manual/locale.texi | 26 +++++++++++++++++++++++---
>  manual/time.texi   | 35 +++++++++++++++++++++++++++--------
>  4 files changed, 83 insertions(+), 11 deletions(-)
>
> diff --git a/NEWS b/NEWS
> index 75bf467..c70d8a9 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -69,6 +69,30 @@ Major new features:
>    collation ordering.  Previous glibc versions used locale-specific
>    ordering, the change might break systems that relied on that.
>  
> +* Support for two grammatical forms of month name has been added.

"month names"

> +  In a call to strftime, the "%B" and "%b" format specifiers will now
> +  produce the grammatical form required when the month is used as part
> +  of a complete date.  New "%OB" and "%Ob" specifiers produce the form
> +  required when the month is named by itself.  For instance, in Greek
> +  and in many Slavic and Baltic languages, "%B" will produce the month
> +  in genitive case, and "%OB" will produce the month in nominative case.
> +
> +  In a call to strptime, "%B", "%b", "%h", "%OB", "%Ob", and "%Oh"
> +  are all valid and will all accept any known form of month
> +  name---standalone or complete, abbreviated or full.  In a call to
> +  nl_langinfo, the query constants MON_1..12 and ABMON_1..12 return
> +  the strings used by "%B" and "%b", respectively.  New query
> +  constants ALTMON_1..12 and _NL_ABALTMON_1..12 return the strings

It seems odd not to have ABALTMON_*.  Unfortunately I didn't get to
reviewing this sooner, and I don't want to block this, and another
developer has OK'd it [1], but I wanted to throw in my 2 cents.

ABALTMON_* is both intuitive and consistent, and I wonder how many
people are going to try using it, only to have to go look up
instructions on the _NL_* variant, which isn't documented at all...

...which brings up another question: why are we announcing a reserved
name (i.e., "_*") as available for general use (and not documenting it)?

> +  used by "%OB" and "%Ob", respectively.
> +
> +  In a locale definition file, use "alt_mon" and "ab_alt_mon" to
> +  define the strings for %OB and %Ob, respectively; these have the
> +  same syntax as "mon" and "ab_mon".
> +
> +  This feature is currently a GNU extension, but it is expected to
> +  be added to the next revision of POSIX, and it is also already
> +  available on some BSD-derived operating systems.
> +
>  Deprecated and removed features, and other changes affecting compatibility:
>  
>  * Support for statically linked applications which call dlopen is deprecated
> diff --git a/manual/locale.texi b/manual/locale.texi
> index 60ad2a1..059db75 100644
> --- a/manual/locale.texi
> +++ b/manual/locale.texi
> @@ -923,7 +923,7 @@ corresponds to Sunday.
>  @itemx DAY_5
>  @itemx DAY_6
>  @itemx DAY_7
> -Similar to @code{ABDAY_1} etc., but here the return value is the
> +Similar to @code{ABDAY_1} etc.,@: but here the return value is the

There shouldn't be a colon immediately following the comma, but there
should be a comma between ABDAY_1 and etc.: "Similar to @code{ABDAY_1},
etc., but..."  (Note the colon can follow the period because the period
is a part of the abbreviation.)  The comma and colon essentially serve
the same purpose, so it's somewhat redundant to use the two punctuation
marks together like that, and I don't think "but" will generally ever
follow a colon, so the comma is probably more appropriate.  I could see
an em-dash being used, though.  There are several occurrences of this.

>  unabbreviated weekday name.
>  @item ABMON_1
>  @itemx ABMON_2
> @@ -937,7 +937,8 @@ unabbreviated weekday name.
>  @itemx ABMON_10
>  @itemx ABMON_11
>  @itemx ABMON_12
> -The return value is abbreviated name of the month.  @code{ABMON_1}
> +The return value is the abbreviated name of the month, in the grammatical

Good catch.

> +form used when the month forms part of a complete date.  @code{ABMON_1}
>  corresponds to January.
>  @item MON_1
>  @itemx MON_2
> @@ -951,8 +952,27 @@ corresponds to January.
>  @itemx MON_10
>  @itemx MON_11
>  @itemx MON_12
> -Similar to @code{ABMON_1} etc., but here the month names are not abbreviated.
> +Similar to @code{ABMON_1} etc.,@: but here the month names are not abbreviated.
>  Here the first value @code{MON_1} also corresponds to January.
> +@item ALTMON_1
> +@itemx ALTMON_2
> +@itemx ALTMON_3
> +@itemx ALTMON_4
> +@itemx ALTMON_5
> +@itemx ALTMON_6
> +@itemx ALTMON_7
> +@itemx ALTMON_8
> +@itemx ALTMON_9
> +@itemx ALTMON_10
> +@itemx ALTMON_11
> +@itemx ALTMON_12
> +Similar to @code{MON_1} etc.,@: but here the month names are in the grammatical
> +form used when the month is named by itself.  The @code{strftime} functions
> +use these month names for the format specifier @code{OB}.

An "(@pxref{Formatting Calendar Time})" would be good at the end there.

Also, I think it should read "conversion specifier @code{%OB}."  In the
strftime description, "B" is called the "format specifier", "O" is an
"optional modifier", and "%OB" would be an example of a conversion
specifier.

> +
> +Note that not all languages need two different forms of the month names,
> +so the strings returned for @code{MON_@dots{}} and @code{ALTMON_@dots{}}
> +may or may not be the same, depending on the locale.
>  @item AM_STR
>  @itemx PM_STR
>  The return values are strings which can be used in the representation of time
> diff --git a/manual/time.texi b/manual/time.texi
> index 33aa221..2a5afd9 100644
> --- a/manual/time.texi
> +++ b/manual/time.texi
> @@ -1346,8 +1346,13 @@ example, @code{%Ex} might yield a date format based on the Japanese
>  Emperors' reigns.
>  
>  @item O
> -Use the locale's alternate numeric symbols for numbers.  This modifier
> -applies only to numeric format specifiers.
> +With all format specifiers that produce numbers: use the locale's
> +alternate numeric symbols.
> +
> +With @code{%B} and @code{%b}: use the grammatical form for month names

And "h"?  What about: "With the format specifiers that produce month
names:"?

> +that is appropriate when the month is named by itself, rather than
> +the form that is appropriate when the month is used as part of a
> +complete date.  This is a GNU extension.
>  @end table
>  
>  If the format supports the modifier but no alternate representation
> @@ -1365,13 +1370,21 @@ The abbreviated weekday name according to the current locale.
>  The full weekday name according to the current locale.
>  
>  @item %b
> -The abbreviated month name according to the current locale.
> +The abbreviated month name according to the current locale, in the
> +grammatical form used when the month is part of a complete date.
> +As a GNU extension, the @code{O} modifier can be used (@code{%Ob})
> +to get the grammatical form used when the month is named by itself.
>  
>  @item %B
> -The full month name according to the current locale.
> +The full month name according to the current locale, in the
> +grammatical form used when the month is part of a complete date.
> +As a GNU extension, the @code{O} modifier can be used (@code{%OB})
> +to get the grammatical form used when the month is named by itself.
>  
> -Using @code{%B} together with @code{%d} produces grammatically
> -incorrect results for some locales.
> +Note that not all languages need two different forms of the month
> +names, so the text produced by @code{%B} and @code{%OB}, and by
> +@code{%b} and @code{%Ob}, may or may not be the same, depending on
> +the locale.
>  
>  @item %c
>  The preferred calendar time representation for the current locale.
> @@ -1778,8 +1791,14 @@ the full name.
>  @item %b
>  @itemx %B
>  @itemx %h
> -The month name according to the current locale, in abbreviated form or
> -the full name.
> +A month name according to the current locale.  All three specifiers
> +will recognize both abbreviated and full month names.  If the
> +locale provides two different grammatical forms of month names,
> +all three specifiers will recognize both forms.
> +
> +As a GNU extension, the @code{O} modifier can be used with these
> +specifiers; it has no effect, as both grammatical forms of month
> +names are recognized.
>  
>  @item %c
>  The date and time representation for the current locale.

The rest sounds good.

The ABALTMON issue aside, I'm glad to see this patch/bugfix has finally
gained enough consensus to go in.  Congratulations.  :)

Rical

[1] https://sourceware.org/ml/libc-alpha/2018-01/msg00409.html

Re: [PATCH v12 5/6] Documentation to the above changes (bug 10871).

Carlos O'Donell-6
On 01/14/2018 07:42 PM, Rical Jasan wrote:
> It seems odd not to have ABALTMON_*.  Unfortunately I didn't get to
> reviewing this sooner, and I don't want to block this, and another
> developer has OK'd it [1], but I wanted to throw in my 2 cents.
I asked the same thing during the review, see:
https://www.sourceware.org/ml/libc-alpha/2018-01/msg00408.html

There is no reason we can't add it in the future.

Perhaps a note about this in the documentation might explain
why the expected define is not present?

--
Cheers,
Carlos.

Re: [PATCH v12 5/6] Documentation to the above changes (bug 10871).

Rical Jasan
On 01/14/2018 10:28 PM, Carlos O'Donell wrote:

> On 01/14/2018 07:42 PM, Rical Jasan wrote:
>> It seems odd not to have ABALTMON_*.  Unfortunately I didn't get to
>> reviewing this sooner, and I don't want to block this, and another
>> developer has OK'd it [1], but I wanted to throw in my 2 cents.
> I asked the same thing during the review, see:
> https://www.sourceware.org/ml/libc-alpha/2018-01/msg00408.html
>
> There is no reason we can't add it in the future.
>
> Perhaps a note about this in the documentation might explain
> why the expected define is not present?

That would be fine with me.  I think it deserves a mention because the
feature is implemented, and I imagine anybody taking advantage of the
bugfix or using %OB, et al., will naturally be interested in, if not
looking for, the abbreviated equivalent.  Better to have complete
documentation that gets updated later than no documentation at all.

Should the full _NL_ABALTMON list be documented alongside ALTMON, or do
you think another paragraph in the ALTMON description is a sufficient
shim?  If ABALTMON is expected to be added in 2.28 because of how close
to the 2.27 release this went in, I'd prefer the latter, perhaps even
with a note that ABALTMON is expected to supersede the
currently-available _NL_ABALTMON, but if ABALTMON is intended to be
deferred until standardization, I think the former is more appropriate,
with no mention of ABALTMON.

Rical

Re: [PATCH v12 5/6] Documentation to the above changes (bug 10871).

Rafal Luzynski
In reply to this post by Rical Jasan
15.01.2018 04:42 Rical Jasan <[hidden email]> wrote:
>
>
> Sorry I haven't had time to review sooner, but I was able to make some
> time today.

Thank you for your review.

> On 01/12/2018 12:18 AM, Rafal Luzynski wrote:
> > [BZ #10871]
> > * manual/locale.texi: Document ALTMON_1..12 constants for
> > nl_langinfo. Specify when to use ALTMON instead of MON.
> > * manual/time.texi (strftime, strptime): Document GNU extension
> > permitting O modifier with %B and %b. Specify when to use
> > %OB instead of %B.
> > ---
> > ChangeLog | 9 +++++++++
> > NEWS | 24 ++++++++++++++++++++++++
> > manual/locale.texi | 26 +++++++++++++++++++++++---
> > manual/time.texi | 35 +++++++++++++++++++++++++++--------
> > 4 files changed, 83 insertions(+), 11 deletions(-)
> >
> > diff --git a/NEWS b/NEWS
> > index 75bf467..c70d8a9 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -69,6 +69,30 @@ Major new features:
> > collation ordering. Previous glibc versions used locale-specific
> > ordering, the change might break systems that relied on that.
> >
> > +* Support for two grammatical forms of month name has been added.
>
> "month names"

OK, I'm going to apply this locally.  This is why I ask other people
to review, or even write from scratch, the documentation: details
like "month(s) name(s)" are sometimes confusing to me as a non-native speaker.

>
> > + In a call to strftime, the "%B" and "%b" format specifiers will now
> > + produce the grammatical form required when the month is used as part
> > + of a complete date. New "%OB" and "%Ob" specifiers produce the form
> > + required when the month is named by itself. For instance, in Greek
> > + and in many Slavic and Baltic languages, "%B" will produce the month
> > + in genitive case, and "%OB" will produce the month in nominative case.
> > +
> > + In a call to strptime, "%B", "%b", "%h", "%OB", "%Ob", and "%Oh"
> > + are all valid and will all accept any known form of month
> > + name---standalone or complete, abbreviated or full. In a call to
> > + nl_langinfo, the query constants MON_1..12 and ABMON_1..12 return
> > + the strings used by "%B" and "%b", respectively. New query
> > + constants ALTMON_1..12 and _NL_ABALTMON_1..12 return the strings
>
> It seems odd not to have ABALTMON_*. Unfortunately I didn't get to
> reviewing this sooner, and I don't want to block this, and another
> developer has OK'd it [1], but I wanted to throw in my 2 cents.
>
> ABALTMON_* is both intuitive and consistent, and I wonder how many
> people are going to try using it, only to have to go look up
> instructions on the _NL_* variant, which isn't documented at all...

Carlos has already answered this.  Well, the same question could be
asked about all the other _NL_* constants, especially since they cannot be
changed in the future due to ABI compatibility.  They are not covered in
the manual, but they are documented in the header file.

As I wrote before, the idea to standardize ABALTMON_* is very new.
Can we assume already that it will be eventually accepted even if it
takes many years and define it as a public GNU extension already?
I'm kinda tempted to say yes.  With _NL_ABALTMON_* the situation is
more complex because... here I need to tell a longer story.  First of
all, we should discourage using nl_langinfo() to display month names.
Programmers should use strftime("%OB") and strftime("%Ob") even if
nl_langinfo() produces the same results more efficiently.  nl_langinfo()
should be considered a low-level API used to implement strftime().
I think I saw this recommendation somewhere so I think the idea is not
new.  But if they really want to use nl_langinfo() they should use ALTMON_*
for full month names if it is available (MON_* if it is not) and
_NL_ABALTMON_* for short month names (ABMON_* if it is not).  Should we
recommend using an undocumented feature?
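
For the full month names, the "ALTMON_* if it is available, MON_* if it is not" rule can be sketched as below. This is an illustration under assumptions: it relies on the C library exposing ALTMON_1..ALTMON_12 as macros when _GNU_SOURCE is defined, as glibc's langinfo.h is expected to do with this series, and it deliberately leaves out the abbreviated names because the _NL_ABALTMON_* constants are not public. The function names are hypothetical.

```c
#define _GNU_SOURCE   /* Assumed to expose ALTMON_1..ALTMON_12 on glibc.  */
#include <assert.h>
#include <langinfo.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Explicit tables, so that no assumption is made about the constants
   having consecutive values.  */
static const nl_item mon_items[12] = {
  MON_1, MON_2, MON_3, MON_4, MON_5, MON_6,
  MON_7, MON_8, MON_9, MON_10, MON_11, MON_12
};
#ifdef ALTMON_1
static const nl_item altmon_items[12] = {
  ALTMON_1, ALTMON_2, ALTMON_3, ALTMON_4, ALTMON_5, ALTMON_6,
  ALTMON_7, ALTMON_8, ALTMON_9, ALTMON_10, ALTMON_11, ALTMON_12
};
#endif

/* Return the full name of MONTH (0..11) in the form used when the month
   is named by itself, falling back to MON_* when ALTMON_* is unavailable
   or empty.  The result points to storage owned by the C library.  */
static const char *
standalone_month_name (int month)
{
  assert (month >= 0 && month < 12);
#ifdef ALTMON_1
  const char *alt = nl_langinfo (altmon_items[month]);
  if (alt != NULL && alt[0] != '\0')
    return alt;
#endif
  return nl_langinfo (mon_items[month]);
}

/* Return nonzero if the current locale uses a different full name for
   MONTH inside a complete date than when the month stands alone.  */
static int
standalone_name_differs (int month)
{
  char in_date[128];

  assert (month >= 0 && month < 12);
  /* Copy first: a later nl_langinfo call may reuse the same storage.  */
  snprintf (in_date, sizeof in_date, "%s", nl_langinfo (mon_items[month]));
  return strcmp (in_date, standalone_month_name (month)) != 0;
}
```

As said above, strftime with "%OB" remains the preferred interface; this only shows what the fallback looks like for code that really has to call nl_langinfo().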

But OTOH I was told not to declare even the ALTMON_* series as public
because it is not yet published by POSIX (even if it is accepted to
be published in the future).  ABALTMON_* has not even been filed with POSIX yet.

Sorry, I'm short of time so I am not providing links here.

> ...which brings up another question: why are we announcing a reserved
> name (i.e., "_*") as available for general use (and not documenting it)?

Actually they are public symbols even if unofficial and undocumented.
Would it be better to remove them from ChangeLog?  AFAIK the purpose is
to help anyone who investigates in the future who introduced these
symbols and why.

> > [...]
> > diff --git a/manual/locale.texi b/manual/locale.texi
> > index 60ad2a1..059db75 100644
> > --- a/manual/locale.texi
> > +++ b/manual/locale.texi
> > @@ -923,7 +923,7 @@ corresponds to Sunday.
> > @itemx DAY_5
> > @itemx DAY_6
> > @itemx DAY_7
> > -Similar to @code{ABDAY_1} etc., but here the return value is the
> > +Similar to @code{ABDAY_1} etc.,@: but here the return value is the
>
> There shouldn't be a colon immediately following the comma, but there
> should be a comma between ABDAY_1 and etc.: "Similar to @code{ABDAY_1},
> etc., but..." [...]

AFAIK it's not a colon: "@:" is a Texinfo command that forces a regular space.
Otherwise a sentence-ending space would be inserted automatically because of
the "dot-space" sequence.

> > [...]
> > +@item ALTMON_1
> > +@itemx ALTMON_2
> > +@itemx ALTMON_3
> > +@itemx ALTMON_4
> > +@itemx ALTMON_5
> > +@itemx ALTMON_6
> > +@itemx ALTMON_7
> > +@itemx ALTMON_8
> > +@itemx ALTMON_9
> > +@itemx ALTMON_10
> > +@itemx ALTMON_11
> > +@itemx ALTMON_12
> > +Similar to @code{MON_1} etc.,@: but here the month names are in the grammatical
> > +form used when the month is named by itself. The @code{strftime} functions
> > +use these month names for the format specifier @code{OB}.
>
> An "(@pxref{Formatting Calendar Time})" would be good at the end there.

It is already in other places in this document.  If you really need it here
then I'd appreciate a full example which I can just copy & paste.

> Also, I think it should read "conversion specifier @code{%OB}." In the
> strftime description, "B" is called the "format specifier", "O" is an
> "optional modifier", and "%OB" would be an example of a conversion
> specifier.

Again, I'm not enough of an expert to determine which term is correct here.
I would appreciate corrections from native English speakers, but I'd
like to end up with a consistent version.

> > [...]
> > diff --git a/manual/time.texi b/manual/time.texi
> > index 33aa221..2a5afd9 100644
> > --- a/manual/time.texi
> > +++ b/manual/time.texi
> > @@ -1346,8 +1346,13 @@ example, @code{%Ex} might yield a date format based on the Japanese
> > Emperors' reigns.
> >
> > @item O
> > -Use the locale's alternate numeric symbols for numbers. This modifier
> > -applies only to numeric format specifiers.
> > +With all format specifiers that produce numbers: use the locale's
> > +alternate numeric symbols.
> > +
> > +With @code{%B} and @code{%b}: use the grammatical form for month names
>
> And "h"? What about: "With the format specifiers that produce month
> names:"?

The documentation of "%h" says it works exactly the same as "%b", and I read
that as a sufficient explanation.  I can add "%h" here if you insist.
As a reader I don't like documents which refer to other documents
(which may in turn refer to yet other documents, etc.), so I wouldn't like to
see "With the format specifiers that produce month names" (and please, readers,
read this document again if you have already forgotten which format
specifiers produce month names) :-)  So, again, for the moment I'm not
changing anything, and I'll add "%h" if you insist.
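
For readers following along, the equivalence of "%h" and "%b" can be
checked directly.  The sketch below is only an illustration, not part of
the patch: format_month() and b_and_h_agree() are hypothetical helper
names, and it assumes a C library whose strftime() treats "%h" as a
synonym for "%b", as the documentation mentioned above says.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: format only the month of a broken-down time.
   MONTH is 0-based (0 = January), as in struct tm.  Returns the number
   of bytes written, or 0 if the result did not fit into BUF.  */
size_t
format_month (char *buf, size_t size, const char *fmt, int month)
{
  struct tm tm;

  memset (&tm, 0, sizeof tm);
  tm.tm_mon = month;
  tm.tm_mday = 1;
  return strftime (buf, size, fmt, &tm);
}

/* Hypothetical helper: return 1 if "%b" and "%h" produce identical,
   non-empty output for MONTH in the current locale, 0 otherwise.  */
int
b_and_h_agree (int month)
{
  char b[64], h[64];
  size_t nb = format_month (b, sizeof b, "%b", month);
  size_t nh = format_month (h, sizeof h, "%h", month);

  return nb > 0 && nb == nh && strcmp (b, h) == 0;
}
```

The same format_month() helper can be given "%B", or "%OB" on a glibc
with this patch series applied, to compare the two grammatical forms in
a locale that distinguishes them.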

> [...]
> The ABALTMON issue aside, I'm glad to see this patch/bugfix finally have
> gained enough consensus to be going in. Congratulations. :)
>
> Rical

Thank you again for your review.  I believe it will help me finalize
this task.

Can I suggest that if there are no issues beyond the documentation
(yes, I know that whether we add _NL_ABALTMON_* or __ABALTMON_* and
ABALTMON_* is a serious API issue) then please let's commit this ASAP
to make sure we have those remaining 2 weeks to announce the change
to the outer world and let's polish the documentation within this
period?

Regards,

Rafal

Re: [PATCH v12 5/6] Documentation to the above changes (bug 10871).

Rafal Luzynski
In reply to this post by Rical Jasan
15.01.2018 09:30 Rical Jasan <[hidden email]> wrote:

>
>
> On 01/14/2018 10:28 PM, Carlos O'Donell wrote:
> > On 01/14/2018 07:42 PM, Rical Jasan wrote:
> >> It seems odd not to have ABALTMON_*. Unfortunately I didn't get to
> >> reviewing this sooner, and I don't want to block this, and another
> >> developer has OK'd it [1], but I wanted to throw in my 2 cents.
> > I asked the same thing during the review, see:
> > https://www.sourceware.org/ml/libc-alpha/2018-01/msg00408.html
> >
> > There is no reason we can't add it in the future.
> >
> > Perhaps a note about this in the documentation might explain
> > why the expected define is not present?
>
> That would be fine with me. I think it deserves a mention because the
> feature is implemented, and I imagine anybody taking advantage of the
> bugfix or using %OB, et al., will naturally be interested in, if not
> looking for, the abbreviated equivalent. Better to have complete
> documentation that gets updated later than no documentation at all.

Again, as we are short of time, I'd appreciate a complete excerpt of
the documentation which I can just copy & paste.  Or, maybe better, please
polish the documentation later, after I commit, so that you get full
credit in the commit message and a changelog entry. :-)

> Should the full _NL_ABALTMON list be documented alongside ALTMON, or do
> you think another paragraph in the ALTMON description is a sufficient
> shim? If ABALTMON is expected to be added in 2.28 because of how close
> to the 2.27 release this went in, I'd prefer the latter, perhaps even
> with a note that ABALTMON is expected to supersede the
> currently-available _NL_ABALTMON, but if ABALTMON is intended to be
> deferred until standardization, I think the former is more appropriate,
> with no mention of ABALTMON.

No, please don't defer this to 2.28.  This set of patches has already missed
about 2 release deadlines and I think it deserves to be included
in the new releases of major Linux distros which I expect to be released
this year.  Regarding POSIX standardization: since ALTMON_* was
accepted in ~2010 and is still not published, I assume that ABALTMON_*
will wait another ~10 years.  (I'm trying not to be ironic;
having worked on this issue in glibc alone for about 2.5 years, I really
understand the hard work behind standardization.)

Regards,

Rafal

[PING][PATCH v12 5/6] Documentation to the above changes (bug 10871).

Rafal Luzynski
In reply to this post by Rafal Luzynski
15.01.2018 12:46 Rafal Luzynski <[hidden email]> wrote:

>
> 15.01.2018 04:42 Rical Jasan <[hidden email]> wrote:
> > [...]
> > On 01/12/2018 12:18 AM, Rafal Luzynski wrote:
> > > [...]
> > > diff --git a/NEWS b/NEWS
> > > index 75bf467..c70d8a9 100644
> > > --- a/NEWS
> > > +++ b/NEWS
> > > @@ -69,6 +69,30 @@ Major new features:
> > > collation ordering. Previous glibc versions used locale-specific
> > > ordering, the change might break systems that relied on that.
> > >
> > > +* Support for two grammatical forms of month name has been added.
> >
> > "month names"
>
> OK, I'm going to apply this locally. That's why I'm asking other people
> to review or even write the documentation from scratch because the details
> like "month(s) name(s)" are sometimes confusing for me as a foreigner.

I have applied this change locally but nothing else.  Should I apply
more changes?

PING - I'm getting worried about my patches.  Carlos, anybody?

BTW, my github repo is up to date at the moment:

https://github.com/rluzynski/glibc

Regards,

Rafal

Re: [PING][PATCH v12 5/6] Documentation to the above changes (bug 10871).

Joseph Myers
On Thu, 18 Jan 2018, Rafal Luzynski wrote:

> PING - I'm getting worried about my patches.  Carlos, anybody?

I'm concerned more generally that we still have multiple complicated
architecture-independent patches, especially this one and C11 threads,
pending review for 2.27, as the architecture validation for 2.27 can't
really start until those patches have either been accepted or postponed to
2.28.  (Architecture maintainers of course can and should do preliminary
testing and fixing issues found, but we should resolve the major
architecture-independent patches before any near-final validation for the
release.)

--
Joseph S. Myers
[hidden email]

Re: [PATCH v12 5/6] Documentation to the above changes (bug 10871).

Rical Jasan
In reply to this post by Rafal Luzynski
On 01/15/2018 03:46 AM, Rafal Luzynski wrote:
> Can I suggest that if there are no issues beyond the documentation
> (yes, I know that whether we add _NL_ABALTMON_* or __ABALTMON_* and
> ABALTMON_* is a serious API issue) then please let's commit this ASAP
> to make sure we have those remaining 2 weeks to announce the change
> to the outer world and let's polish the documentation within this
> period?

Here's a patch with my suggestions for the documentation; it should
apply on top of yours if you'd like to merge it in (or I can push it
later; I don't have a strong opinion).

I mention that an ABALTMON equivalent for %Ob isn't provided, but is
expected, and that _NL_ABALTMON may be used in the meantime.  I'm not
sure if we want to say it quite that way, but it's a start.
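
As a rough illustration of the nl_langinfo() side of this (a sketch, not
part of the patch): month_in_date() and month_standalone() are
hypothetical helper names.  The code assumes glibc, where ALTMON_1 is
expected to be visible as a macro under _GNU_SOURCE once this series is
applied, and where the MON_* and ALTMON_* items have consecutive values.
On a C library without ALTMON_1 it simply falls back to MON_1.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE 1  /* Assumed to expose ALTMON_1 etc. in <langinfo.h>.  */
#endif
#include <langinfo.h>

/* Hypothetical helper: month name in the form used inside a complete
   date.  MONTH is 0-based (0 = January).  Relies on MON_1 ... MON_12
   being consecutive, which is an assumption about the C library.  */
const char *
month_in_date (int month)
{
  return nl_langinfo (MON_1 + month);
}

/* Hypothetical helper: month name in the form used when the month is
   named by itself.  Falls back to the MON_* form when the C library
   does not provide ALTMON_1 as a macro.  */
const char *
month_standalone (int month)
{
#ifdef ALTMON_1
  return nl_langinfo (ALTMON_1 + month);
#else
  return nl_langinfo (MON_1 + month);
#endif
}
```

In a locale that does not need two forms, both helpers are expected to
return the same string.  No abbreviated counterpart is sketched here,
because the public name for it is exactly what is still under discussion.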

That was my mistake about "@:" being a colon.  That's what I get for
reviewing with the Texinfo manual out of hand.  :)  I think Texinfo
handles the comma after the period properly, as I couldn't tell a
difference in the rendered output either way, but it doesn't hurt to
give it hints, so I left it in.  I did add commas before "etc." though.

I agree that if it's documentation minutiae holding this up, the patches
should go in.

Rical

diff --git a/manual/locale.texi b/manual/locale.texi
index 059db75c1c..19b1cfc421 100644
--- a/manual/locale.texi
+++ b/manual/locale.texi
@@ -923,7 +923,7 @@ corresponds to Sunday.
 @itemx DAY_5
 @itemx DAY_6
 @itemx DAY_7
-Similar to @code{ABDAY_1} etc.,@: but here the return value is the
+Similar to @code{ABDAY_1}, etc.,@: but here the return value is the
 unabbreviated weekday name.
 @item ABMON_1
 @itemx ABMON_2
@@ -952,7 +952,7 @@ corresponds to January.
 @itemx MON_10
 @itemx MON_11
 @itemx MON_12
-Similar to @code{ABMON_1} etc.,@: but here the month names are not abbreviated.
+Similar to @code{ABMON_1}, etc.,@: but here the month names are not abbreviated.
 Here the first value @code{MON_1} also corresponds to January.
 @item ALTMON_1
 @itemx ALTMON_2
@@ -966,13 +966,19 @@ Here the first value @code{MON_1} also corresponds to January.
 @itemx ALTMON_10
 @itemx ALTMON_11
 @itemx ALTMON_12
-Similar to @code{MON_1} etc.,@: but here the month names are in the grammatical
+Similar to @code{MON_1}, etc.,@: but here the month names are in the grammatical
 form used when the month is named by itself.  The @code{strftime} functions
-use these month names for the format specifier @code{OB}.
+use these month names for the conversion specifier @code{%OB}
+(@pxref{Formatting Calendar Time}).
 
 Note that not all languages need two different forms of the month names,
 so the strings returned for @code{MON_@dots{}} and @code{ALTMON_@dots{}}
 may or may not be the same, depending on the locale.
+
+@strong{NB:} @code{ABALTMON_@dots{}} constants corresponding to the @code{%Ob}
+conversion specifier are not currently provided, but are expected to be in a
+future release.  In the meantime, it is possible to use
+@code{_NL_ABALTMON_@dots{}}.
 @item AM_STR
 @itemx PM_STR
 The return values are strings which can be used in the representation of time
diff --git a/manual/time.texi b/manual/time.texi
index 2a5afd9e56..6c3d5e9ab2 100644
--- a/manual/time.texi
+++ b/manual/time.texi
@@ -1349,7 +1349,7 @@ Emperors' reigns.
 With all format specifiers that produce numbers: use the locale's
 alternate numeric symbols.
 
-With @code{%B} and @code{%b}: use the grammatical form for month names
+With @code{%B}, @code{%b}, and @code{%h}: use the grammatical form for month names
 that is appropriate when the month is named by itself, rather than
 the form that is appropriate when the month is used as part of a
 complete date.  This is a GNU extension.

Re: [PATCH v12 5/6] Documentation to the above changes (bug 10871).

Carlos O'Donell-6
On 01/20/2018 12:36 AM, Rical Jasan wrote:

> On 01/15/2018 03:46 AM, Rafal Luzynski wrote:
>> Can I suggest that if there are no issues beyond the documentation
>> (yes, I know that whether we add _NL_ABALTMON_* or __ABALTMON_* and
>> ABALTMON_* is a serious API issue) then please let's commit this ASAP
>> to make sure we have those remaining 2 weeks to announce the change
>> to the outer world and let's polish the documentation within this
>> period?
> Here's a patch with my suggestions for the documentation; it should
> apply on top of yours if you'd like to merge it in (or I can push it
> later; I don't have a strong opinion).
>
> I mention that an ABALTMON equivalent for %Ob isn't provided, but is
> expected, and that _NL_ABALTMON may be used in the meantime.  I'm not
> sure if we want to say it quite that way, but it's a start.
>
> That was my mistake about "@:" being a colon.  That's what I get for
> reviewing with the Texinfo manual out of hand.  :)  I think Texinfo
> handles the comma after the period properly, as I couldn't tell a
> difference in the rendered output either way, but it doesn't hurt to
> give it hints, so I left it in.  I did add commas before "etc." though.
>
> I agree that if it's documentation minutiae holding this up, the patches
> should go in.

This looks good to me.

Reviewed-by: Carlos O'Donell <[hidden email]>

> diff --git a/manual/locale.texi b/manual/locale.texi
> index 059db75c1c..19b1cfc421 100644
> --- a/manual/locale.texi
> +++ b/manual/locale.texi
> @@ -923,7 +923,7 @@ corresponds to Sunday.
>  @itemx DAY_5
>  @itemx DAY_6
>  @itemx DAY_7
> -Similar to @code{ABDAY_1} etc.,@: but here the return value is the
> +Similar to @code{ABDAY_1}, etc.,@: but here the return value is the
>  unabbreviated weekday name.
>  @item ABMON_1
>  @itemx ABMON_2
> @@ -952,7 +952,7 @@ corresponds to January.
>  @itemx MON_10
>  @itemx MON_11
>  @itemx MON_12
> -Similar to @code{ABMON_1} etc.,@: but here the month names are not abbreviated.
> +Similar to @code{ABMON_1}, etc.,@: but here the month names are not abbreviated.
>  Here the first value @code{MON_1} also corresponds to January.
>  @item ALTMON_1
>  @itemx ALTMON_2
> @@ -966,13 +966,19 @@ Here the first value @code{MON_1} also corresponds to January.
>  @itemx ALTMON_10
>  @itemx ALTMON_11
>  @itemx ALTMON_12
> -Similar to @code{MON_1} etc.,@: but here the month names are in the grammatical
> +Similar to @code{MON_1}, etc.,@: but here the month names are in the grammatical
>  form used when the month is named by itself.  The @code{strftime} functions
> -use these month names for the format specifier @code{OB}.
> +use these month names for the conversion specifier @code{%OB}
> +(@pxref{Formatting Calendar Time}).
>  
>  Note that not all languages need two different forms of the month names,
>  so the strings returned for @code{MON_@dots{}} and @code{ALTMON_@dots{}}
>  may or may not be the same, depending on the locale.
> +
> +@strong{NB:} @code{ABALTMON_@dots{}} constants corresponding to the @code{%Ob}
> +conversion specifier are not currently provided, but are expected to be in a
> +future release.  In the meantime, it is possible to use
> +@code{_NL_ABALTMON_@dots{}}.
>  @item AM_STR
>  @itemx PM_STR
>  The return values are strings which can be used in the representation of time
> diff --git a/manual/time.texi b/manual/time.texi
> index 2a5afd9e56..6c3d5e9ab2 100644
> --- a/manual/time.texi
> +++ b/manual/time.texi
> @@ -1349,7 +1349,7 @@ Emperors' reigns.
>  With all format specifiers that produce numbers: use the locale's
>  alternate numeric symbols.
>  
> -With @code{%B} and @code{%b}: use the grammatical form for month names
> +With @code{%B}, @code{%b}, and @code{%h}: use the grammatical form for month names
>  that is appropriate when the month is named by itself, rather than
>  the form that is appropriate when the month is used as part of a
>  complete date.  This is a GNU extension.


--
Cheers,
Carlos.

Re: [PATCH v12 0/6][BZ 10871] Month names in alternative grammatical case

Carlos O'Donell-6
In reply to this post by Rafal Luzynski
On 01/14/2018 07:03 AM, Rafal Luzynski wrote:
> A workaround for the problem would be to deliver the old locale data
> and put their directory name in the LOCPATH environment variable.
>
> A similar problem has been reported as a bug 19084 [2] and the answer
> was that it cannot be fixed for statically linked binaries.

OK, I have finished reviewing these patches again and testing some builds.

I agree that today we cannot support statically linked applications using new data.

Florian Weimer fairly clearly stated that he didn't object to this.

v12 + Rical's suggestions are good.

Please commit v12 :-)

--
Cheers,
Carlos.

Re: [PING][PATCH v12 5/6] Documentation to the above changes (bug 10871).

Carlos O'Donell-6
In reply to this post by Joseph Myers
On 01/18/2018 05:35 AM, Joseph Myers wrote:

> On Thu, 18 Jan 2018, Rafal Luzynski wrote:
>
>> PING - I'm getting worried about my patches.  Carlos, anybody?
>
> I'm concerned more generally that we still have multiple complicated
> architecture-independent patches, especially this one and C11 threads,
> pending review for 2.27, as the architecture validation for 2.27 can't
> really start until those patches have either been accepted or postponed to
> 2.28.  (Architecture maintainers of course can and should do preliminary
> testing and fixing issues found, but we should resolve the major
> architecture-independent patches before any near-final validation for the
> release.)
 
This work is now complete. Rafal should commit this ASAP.

I think C11 threads should be deferred until 2.28 opens, and I will continue review.

With those two out of the way all that remains is fixing the x86 bug.

We should stick to discussing release blockers:
https://sourceware.org/glibc/wiki/Release/2.27#Release_blockers.3F

All "Desirable this release" should be deferred to 2.28.

--
Cheers,
Carlos.

Re: [PATCH v12 0/6][BZ 10871] Month names in alternative grammatical case

Rafal Luzynski
In reply to this post by Carlos O'Donell-6
20.01.2018 09:41 Carlos O'Donell <[hidden email]> wrote:
> [...]
> v12 + Rical's suggestions are good.
>
> Please commit v12 :-)

Thank you Carlos and everyone else for your help.
I will apply Rical's patch on top of my patches locally,
squash it with the patches which have not been posted to this
list because they just update ChangeLog and generated files,
and then push everything on Monday morning.  I hope this is OK.

In the meantime, this weekend I will try to ping translators
and as many projects as possible in order to coordinate the
locale data update and the adoption of the new features in other
projects.

Regards,

Rafal