[PATCH v3 0/2] Fix up soft-fp issue


Zong Li-2
These patches fix the following soft-fp issues:
1. We found a wrong division result when testing RV32.
  The same problem exists in libgcc. Because some math test cases in the glibc
  testsuite link against libgcc, it has to be fixed in GCC as well;
  otherwise some math test cases fail (see the end).
  We expect to submit this change to GCC, to update op-4.h there, after
  glibc accepts this patch.
2. RV32 needs new soft-fp macro implementations for 128-bit support.

These patches were verified by passing all math test cases in the glibc testsuite.

Changes in v3:
 - In op-8.h, fix the coding style for statements split across multiple lines.

Changes in v2:
 - In op-4.h, fix the problem in the FRAC_SUB macros themselves, not at the use site.
 - In op-8.h, rename variables to follow the naming rules.

Math test case failures without the GCC modification:
FAIL: math/test-double-ldouble-div
FAIL: math/test-float128-atan2
FAIL: math/test-float128-cacos
FAIL: math/test-float128-cacosh
FAIL: math/test-float128-casin
FAIL: math/test-float128-casinh
FAIL: math/test-float128-catan
FAIL: math/test-float128-catanh
FAIL: math/test-float128-clog
FAIL: math/test-float128-clog10
FAIL: math/test-float128-ctan
FAIL: math/test-float128-ctanh
FAIL: math/test-float128-erfc
FAIL: math/test-float128-finite-atan2
FAIL: math/test-float128-finite-cacos
FAIL: math/test-float128-finite-cacosh
FAIL: math/test-float128-finite-casin
FAIL: math/test-float128-finite-casinh
FAIL: math/test-float128-finite-catan
FAIL: math/test-float128-finite-catanh
FAIL: math/test-float128-finite-clog
FAIL: math/test-float128-finite-clog10
FAIL: math/test-float128-finite-ctan
FAIL: math/test-float128-finite-ctanh
FAIL: math/test-float128-finite-erfc
FAIL: math/test-float128-finite-j1
FAIL: math/test-float128-finite-lgamma
FAIL: math/test-float128-finite-tanh
FAIL: math/test-float128-finite-tgamma
FAIL: math/test-float128-j1
FAIL: math/test-float128-lgamma
FAIL: math/test-float128-tanh
FAIL: math/test-float128-tgamma
FAIL: math/test-float32x-float128-div
FAIL: math/test-float32x-float64x-div
FAIL: math/test-float64-float128-div
FAIL: math/test-float64-float64x-div
FAIL: math/test-float64x-atan2
FAIL: math/test-float64x-cacos
FAIL: math/test-float64x-cacosh
FAIL: math/test-float64x-casin
FAIL: math/test-float64x-casinh
FAIL: math/test-float64x-catan
FAIL: math/test-float64x-catanh
FAIL: math/test-float64x-clog
FAIL: math/test-float64x-clog10
FAIL: math/test-float64x-ctan
FAIL: math/test-float64x-ctanh
FAIL: math/test-float64x-erfc
FAIL: math/test-float64x-finite-atan2
FAIL: math/test-float64x-finite-cacos
FAIL: math/test-float64x-finite-cacosh
FAIL: math/test-float64x-finite-casin
FAIL: math/test-float64x-finite-casinh
FAIL: math/test-float64x-finite-catan
FAIL: math/test-float64x-finite-catanh
FAIL: math/test-float64x-finite-clog
FAIL: math/test-float64x-finite-clog10
FAIL: math/test-float64x-finite-ctan
FAIL: math/test-float64x-finite-ctanh
FAIL: math/test-float64x-finite-erfc
FAIL: math/test-float64x-finite-j1
FAIL: math/test-float64x-finite-lgamma
FAIL: math/test-float64x-finite-tanh
FAIL: math/test-float64x-finite-tgamma
FAIL: math/test-float64x-float128-div
FAIL: math/test-float64x-j1
FAIL: math/test-float64x-lgamma
FAIL: math/test-float64x-tanh
FAIL: math/test-float64x-tgamma
FAIL: math/test-ifloat128-atan2
FAIL: math/test-ifloat128-cacos
FAIL: math/test-ifloat128-cacosh
FAIL: math/test-ifloat128-casin
FAIL: math/test-ifloat128-casinh
FAIL: math/test-ifloat128-catan
FAIL: math/test-ifloat128-catanh
FAIL: math/test-ifloat128-clog
FAIL: math/test-ifloat128-clog10
FAIL: math/test-ifloat128-ctan
FAIL: math/test-ifloat128-ctanh
FAIL: math/test-ifloat128-erfc
FAIL: math/test-ifloat128-j1
FAIL: math/test-ifloat128-lgamma
FAIL: math/test-ifloat128-tanh
FAIL: math/test-ifloat128-tgamma
FAIL: math/test-ifloat64x-atan2
FAIL: math/test-ifloat64x-cacos
FAIL: math/test-ifloat64x-cacosh
FAIL: math/test-ifloat64x-casin
FAIL: math/test-ifloat64x-casinh
FAIL: math/test-ifloat64x-catan
FAIL: math/test-ifloat64x-catanh
FAIL: math/test-ifloat64x-clog
FAIL: math/test-ifloat64x-clog10
FAIL: math/test-ifloat64x-ctan
FAIL: math/test-ifloat64x-ctanh
FAIL: math/test-ifloat64x-erfc
FAIL: math/test-ifloat64x-j1
FAIL: math/test-ifloat64x-lgamma
FAIL: math/test-ifloat64x-tanh
FAIL: math/test-ifloat64x-tgamma
FAIL: math/test-ildouble-atan2
FAIL: math/test-ildouble-cacos
FAIL: math/test-ildouble-cacosh
FAIL: math/test-ildouble-casin
FAIL: math/test-ildouble-casinh
FAIL: math/test-ildouble-catan
FAIL: math/test-ildouble-catanh
FAIL: math/test-ildouble-clog
FAIL: math/test-ildouble-clog10
FAIL: math/test-ildouble-ctan
FAIL: math/test-ildouble-ctanh
FAIL: math/test-ildouble-erfc
FAIL: math/test-ildouble-j1
FAIL: math/test-ildouble-lgamma
FAIL: math/test-ildouble-tanh
FAIL: math/test-ildouble-tgamma
FAIL: math/test-ldouble-atan2
FAIL: math/test-ldouble-cacos
FAIL: math/test-ldouble-cacosh
FAIL: math/test-ldouble-casin
FAIL: math/test-ldouble-casinh
FAIL: math/test-ldouble-catan
FAIL: math/test-ldouble-catanh
FAIL: math/test-ldouble-clog
FAIL: math/test-ldouble-clog10
FAIL: math/test-ldouble-ctan
FAIL: math/test-ldouble-ctanh
FAIL: math/test-ldouble-erfc
FAIL: math/test-ldouble-finite-atan2
FAIL: math/test-ldouble-finite-cacos
FAIL: math/test-ldouble-finite-cacosh
FAIL: math/test-ldouble-finite-casin
FAIL: math/test-ldouble-finite-casinh
FAIL: math/test-ldouble-finite-catan
FAIL: math/test-ldouble-finite-catanh
FAIL: math/test-ldouble-finite-clog
FAIL: math/test-ldouble-finite-clog10
FAIL: math/test-ldouble-finite-ctan
FAIL: math/test-ldouble-finite-ctanh
FAIL: math/test-ldouble-finite-erfc
FAIL: math/test-ldouble-finite-j1
FAIL: math/test-ldouble-finite-lgamma
FAIL: math/test-ldouble-finite-tanh
FAIL: math/test-ldouble-finite-tgamma
FAIL: math/test-ldouble-j1
FAIL: math/test-ldouble-lgamma
FAIL: math/test-ldouble-tanh
FAIL: math/test-ldouble-tgamma

Zong Li (2):
  soft-fp: Use temporary variable in FP_FRAC_SUB_3/FP_FRAC_SUB_4
  soft-fp: Add implementation for 128 bit self-contained

 ChangeLog      |  9 ++++++
 soft-fp/op-4.h | 63 ++++++++++++++++++++++-------------------
 soft-fp/op-8.h | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 132 insertions(+), 28 deletions(-)

--
2.7.4

[PATCH v3 1/2] soft-fp: Use temporary variable in FP_FRAC_SUB_3/FP_FRAC_SUB_4

Zong Li-2
FRAC_SUB_3(R, X, Y) and FRAC_SUB_4(R, X, Y) reference both X[N] and
Y[N] after R[N] has been set. If X or Y has the same address as R,
the result of the calculation is wrong, because the original value of
X or Y has been overwritten.

In glibc, there are two places where FRAC_SUB is used with overlapping
arguments. The first is _FP_DIV_MEAT_N_loop in op-common.h, which uses
the source _FP_DIV_MEAT_N_loop_u as the destination. This macro is only
used when N is one (_FP_DIV_MEAT_1_loop), so _FP_FRAC_SUB_##wc expands
to _FP_FRAC_SUB_1 there. It still works because _FP_FRAC_SUB_1 has no
overlap problem in its implementation.
The second place is _FP_DIV_MEAT_4_udiv, where the original value of
X##_f[0] is overwritten before the calculation.

FRAC_SUB_1 and FRAC_SUB_2 do not refer to the source after the
destination has been set, so they have no problem.

After this modification, the soft-float tests in the glibc testsuite
pass on RV32.
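
To make the overlap concrete, here is a minimal sketch in function form,
assuming 32-bit words. The names word_t, sub3_old, sub3_new and
sub3_alias_check are illustrative only and are not part of soft-fp; the
bodies follow the shape of the old and new __FP_FRAC_SUB_3. Words are
stored least significant first. Computing 2^32 - 1 in place (R aliasing X)
loses the borrow with the old ordering, because the comparison reads a word
that has already been overwritten.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t word_t; /* assumption: stands in for a 32-bit _FP_W_TYPE */

/* Old ordering: each r[i] is stored before x[i] is read again.  */
void sub3_old(word_t *r, const word_t *x, const word_t *y)
{
  word_t c1, c2;
  r[0] = x[0] - y[0];
  c1 = r[0] > x[0];            /* always 0 when r == x */
  r[1] = x[1] - y[1];
  c2 = r[1] > x[1];
  r[1] -= c1;
  c2 |= c1 && (y[1] == x[1]);
  r[2] = x[2] - y[2] - c2;
}

/* New ordering: low words go to temporaries and r is written last.  */
void sub3_new(word_t *r, const word_t *x, const word_t *y)
{
  word_t tmp[2];
  word_t c1, c2;
  tmp[0] = x[0] - y[0];
  c1 = tmp[0] > x[0];          /* borrow out of word 0 */
  tmp[1] = x[1] - y[1];
  c2 = tmp[1] > x[1];
  tmp[1] -= c1;
  c2 |= c1 && (y[1] == x[1]);  /* borrow out of word 1 */
  r[2] = x[2] - y[2] - c2;
  r[1] = tmp[1];
  r[0] = tmp[0];
}

/* Self-check: subtract 1 from 2^32 in place with both variants.  */
void sub3_alias_check(void)
{
  const word_t y[3] = {1u, 0u, 0u};
  word_t a[3] = {0u, 1u, 0u};
  word_t b[3] = {0u, 1u, 0u};
  sub3_old(a, a, y);
  sub3_new(b, b, y);
  assert(a[1] == 1u);          /* old: borrow into word 1 was lost */
  assert(b[0] == 0xFFFFFFFFu && b[1] == 0u && b[2] == 0u);
}
```

With separate destination storage both variants agree; only the in-place
call distinguishes them.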

        * soft-fp/op-4.h (_FP_FRAC_SUB_3, _FP_FRAC_SUB_4): Use temporary
        variable to avoid overlap arguments.
---
 ChangeLog      |  5 +++++
 soft-fp/op-4.h | 63 ++++++++++++++++++++++++++++++++--------------------------
 2 files changed, 40 insertions(+), 28 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index b798b63..0cced2b 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2018-10-31  Zong Li  <[hidden email]>
+
+ * soft-fp/op-4.h (_FP_FRAC_SUB_3, _FP_FRAC_SUB_4): Use temporary
+ variable to avoid overlap arguments.
+
 2018-10-31  Samuel Thibault  <[hidden email]>
 
  * manual/errno.texi (EIEIO): Document how translators should
diff --git a/soft-fp/op-4.h b/soft-fp/op-4.h
index 01b87d0..b429801 100644
--- a/soft-fp/op-4.h
+++ b/soft-fp/op-4.h
@@ -696,39 +696,46 @@
 #endif
 
 #ifndef __FP_FRAC_SUB_3
-# define __FP_FRAC_SUB_3(r2, r1, r0, x2, x1, x0, y2, y1, y0) \
-  do \
-    { \
-      _FP_W_TYPE __FP_FRAC_SUB_3_c1, __FP_FRAC_SUB_3_c2; \
-      r0 = x0 - y0; \
-      __FP_FRAC_SUB_3_c1 = r0 > x0; \
-      r1 = x1 - y1; \
-      __FP_FRAC_SUB_3_c2 = r1 > x1; \
-      r1 -= __FP_FRAC_SUB_3_c1; \
-      __FP_FRAC_SUB_3_c2 |= __FP_FRAC_SUB_3_c1 && (y1 == x1); \
-      r2 = x2 - y2 - __FP_FRAC_SUB_3_c2; \
-    } \
+# define __FP_FRAC_SUB_3(r2, r1, r0, x2, x1, x0, y2, y1, y0)    \
+  do                                                            \
+    {                                                           \
+      _FP_W_TYPE __FP_FRAC_SUB_3_tmp[2];                        \
+      _FP_W_TYPE __FP_FRAC_SUB_3_c1, __FP_FRAC_SUB_3_c2;        \
+      __FP_FRAC_SUB_3_tmp[0] = x0 - y0;                         \
+      __FP_FRAC_SUB_3_c1 = __FP_FRAC_SUB_3_tmp[0] > x0;         \
+      __FP_FRAC_SUB_3_tmp[1] = x1 - y1;                         \
+      __FP_FRAC_SUB_3_c2 = __FP_FRAC_SUB_3_tmp[1] > x1;         \
+      __FP_FRAC_SUB_3_tmp[1] -= __FP_FRAC_SUB_3_c1;             \
+      __FP_FRAC_SUB_3_c2 |= __FP_FRAC_SUB_3_c1 && (y1 == x1);   \
+      r2 = x2 - y2 - __FP_FRAC_SUB_3_c2;                        \
+      r1 = __FP_FRAC_SUB_3_tmp[1];                              \
+      r0 = __FP_FRAC_SUB_3_tmp[0];                              \
+    }                                                           \
   while (0)
 #endif
 
 #ifndef __FP_FRAC_SUB_4
 # define __FP_FRAC_SUB_4(r3, r2, r1, r0, x3, x2, x1, x0, y3, y2, y1, y0) \
-  do \
-    { \
-      _FP_W_TYPE __FP_FRAC_SUB_4_c1, __FP_FRAC_SUB_4_c2; \
-      _FP_W_TYPE __FP_FRAC_SUB_4_c3; \
-      r0 = x0 - y0; \
-      __FP_FRAC_SUB_4_c1 = r0 > x0; \
-      r1 = x1 - y1; \
-      __FP_FRAC_SUB_4_c2 = r1 > x1; \
-      r1 -= __FP_FRAC_SUB_4_c1; \
-      __FP_FRAC_SUB_4_c2 |= __FP_FRAC_SUB_4_c1 && (y1 == x1); \
-      r2 = x2 - y2; \
-      __FP_FRAC_SUB_4_c3 = r2 > x2; \
-      r2 -= __FP_FRAC_SUB_4_c2; \
-      __FP_FRAC_SUB_4_c3 |= __FP_FRAC_SUB_4_c2 && (y2 == x2); \
-      r3 = x3 - y3 - __FP_FRAC_SUB_4_c3; \
-    } \
+  do                                                                     \
+    {                                                                    \
+      _FP_W_TYPE __FP_FRAC_SUB_4_tmp[3];                                 \
+      _FP_W_TYPE __FP_FRAC_SUB_4_c1, __FP_FRAC_SUB_4_c2;                 \
+      _FP_W_TYPE __FP_FRAC_SUB_4_c3;                                     \
+      __FP_FRAC_SUB_4_tmp[0] = x0 - y0;                                  \
+      __FP_FRAC_SUB_4_c1 = __FP_FRAC_SUB_4_tmp[0] > x0;                  \
+      __FP_FRAC_SUB_4_tmp[1] = x1 - y1;                                  \
+      __FP_FRAC_SUB_4_c2 = __FP_FRAC_SUB_4_tmp[1] > x1;                  \
+      __FP_FRAC_SUB_4_tmp[1] -= __FP_FRAC_SUB_4_c1;                      \
+      __FP_FRAC_SUB_4_c2 |= __FP_FRAC_SUB_4_c1 && (y1 == x1);            \
+      __FP_FRAC_SUB_4_tmp[2] = x2 - y2;                                  \
+      __FP_FRAC_SUB_4_c3 = __FP_FRAC_SUB_4_tmp[2] > x2;                  \
+      __FP_FRAC_SUB_4_tmp[2] -= __FP_FRAC_SUB_4_c2;                      \
+      __FP_FRAC_SUB_4_c3 |= __FP_FRAC_SUB_4_c2 && (y2 == x2);            \
+      r3 = x3 - y3 - __FP_FRAC_SUB_4_c3;                                 \
+      r2 = __FP_FRAC_SUB_4_tmp[2];                                       \
+      r1 = __FP_FRAC_SUB_4_tmp[1];                                       \
+      r0 = __FP_FRAC_SUB_4_tmp[0];                                       \
+    }                                                                    \
   while (0)
 #endif
 
--
2.7.4

[PATCH v3 2/2] soft-fp: Add implementation for 128 bit self-contained

Zong Li-2
In reply to this post by Zong Li-2
This only adds the implementation needed when building the RV32 port.

These macros are used when the following conditions hold at the same
time: soft-fp fma, ldbl-128, and a 32-bit _FP_W_TYPE_SIZE. The RISC-V
32-bit port is the first port that uses all three together.

The build flow in this situation is:
When building soft-fp/s_fmal.c, __fmal uses FP_FMA_Q.
_FP_W_TYPE_SIZE is defined as 32 in sysdeps/riscv/sfp-machine.h,
so FP_FMA_Q is defined as _FP_FMA (Q, 4, 8, R, X, Y, Z) in
soft-fp/quad.h.

The relevant part of soft-fp/quad.h:
 #if _FP_W_TYPE_SIZE < 64
 # define FP_FMA_Q(R, X, Y, Z)    _FP_FMA (Q, 4, 8, R, X, Y, Z)
 #else
 # define FP_FMA_Q(R, X, Y, Z)    _FP_FMA (Q, 2, 4, R, X, Y, Z)
 #endif

Finally, _FP_FMA (fs, wc, dwc, R, X, Y, Z) uses the
_FP_FRAC_HIGHBIT_DW_##dwc macro, which expands to
_FP_FRAC_HIGHBIT_DW_8, but _FP_FRAC_HIGHBIT_DW_8 is not
implemented in soft-fp/op-8.h. Only _FP_FRAC_HIGHBIT_DW_1,
_FP_FRAC_HIGHBIT_DW_2 and _FP_FRAC_HIGHBIT_DW_4 exist in
soft-fp/op-*.h.

After this modification, the soft-float tests in the glibc testsuite
pass on RV32.
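
The expansion chain described above can be sketched with stand-in macros.
Everything named *_SKETCH or HIGHBIT_DW_* here, along with W_TYPE_SIZE,
selected_dw_words and check_selection, is hypothetical and only mimics how
the word count is token-pasted onto a helper name; the real soft-fp macros
take different arguments and do real work.

```c
#include <assert.h>

/* Hypothetical stand-ins for the per-width helpers in soft-fp/op-*.h.
   Each one just reports which helper was selected.  */
#define HIGHBIT_DW_1(X) 1
#define HIGHBIT_DW_2(X) 2
#define HIGHBIT_DW_4(X) 4
#define HIGHBIT_DW_8(X) 8 /* without this line the 32-bit case fails to build */

/* Like _FP_FMA: the double-width word count is pasted onto the name.  */
#define FMA_SKETCH(wc, dwc, X) HIGHBIT_DW_##dwc (X)

/* Like quad.h: the word counts depend on the word size.  */
#define W_TYPE_SIZE 32 /* assumption: a 32-bit port */
#if W_TYPE_SIZE < 64
# define FMA_Q_SKETCH(X) FMA_SKETCH (4, 8, X)
#else
# define FMA_Q_SKETCH(X) FMA_SKETCH (2, 4, X)
#endif

/* Returns the word count of the helper that the pasting selected.  */
int selected_dw_words(void)
{
  return FMA_Q_SKETCH (0);
}

/* Self-check: with 32-bit words the 8-word helper is the one used.  */
void check_selection(void)
{
  assert(selected_dw_words() == 8);
}
```

Deleting the HIGHBIT_DW_8 definition in this sketch turns the call into a
reference to an undeclared identifier, which mirrors the kind of build
failure the missing _FP_FRAC_HIGHBIT_DW_8 would cause.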

        * soft-fp/op-8.h (_FP_FRAC_SET_8, _FP_FRAC_ADD_8, _FP_FRAC_SUB_8)
        (_FP_FRAC_CLZ_8, _FP_MINFRAC_8, _FP_FRAC_NEGP_8, _FP_FRAC_ZEROP_8)
        (_FP_FRAC_HIGHBIT_DW_8, _FP_FRAC_COPY_4_8, _FP_FRAC_COPY_8_4)
        (__FP_FRAC_SET_8): Add implementation for RV32 use.
---
 ChangeLog      |  4 +++
 soft-fp/op-8.h | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 92 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index 0cced2b..5c41833 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -2,6 +2,10 @@
 
  * soft-fp/op-4.h (_FP_FRAC_SUB_3, _FP_FRAC_SUB_4): Use temporary
  variable to avoid overlap arguments.
+ * soft-fp/op-8.h (_FP_FRAC_SET_8, _FP_FRAC_ADD_8, _FP_FRAC_SUB_8)
+ (_FP_FRAC_CLZ_8, _FP_MINFRAC_8, _FP_FRAC_NEGP_8, _FP_FRAC_ZEROP_8)
+ (_FP_FRAC_HIGHBIT_DW_8, _FP_FRAC_COPY_4_8, _FP_FRAC_COPY_8_4)
+ (__FP_FRAC_SET_8): Add implementation for RV32 use.
 
 2018-10-31  Samuel Thibault  <[hidden email]>
 
diff --git a/soft-fp/op-8.h b/soft-fp/op-8.h
index ffed258..4871c49 100644
--- a/soft-fp/op-8.h
+++ b/soft-fp/op-8.h
@@ -35,6 +35,7 @@
 /* We need just a few things from here for op-4, if we ever need some
    other macros, they can be added.  */
 #define _FP_FRAC_DECL_8(X) _FP_W_TYPE X##_f[8]
+#define _FP_FRAC_SET_8(X, I)    __FP_FRAC_SET_8 (X, I)
 #define _FP_FRAC_HIGH_8(X) (X##_f[7])
 #define _FP_FRAC_LOW_8(X) (X##_f[0])
 #define _FP_FRAC_WORD_8(X, w) (X##_f[w])
@@ -147,4 +148,91 @@
     } \
   while (0)
 
+#define _FP_FRAC_ADD_8(R, X, Y)                                             \
+  do                                                                        \
+    {                                                                       \
+      _FP_W_TYPE _FP_FRAC_ADD_8_c = 0;                                      \
+      _FP_I_TYPE _FP_FRAC_ADD_8_i;                                          \
+      for (_FP_FRAC_ADD_8_i = 0; _FP_FRAC_ADD_8_i < 8; ++_FP_FRAC_ADD_8_i)  \
+        {                                                                   \
+          R##_f[_FP_FRAC_ADD_8_i]                                           \
+            = (X##_f[_FP_FRAC_ADD_8_i] + Y##_f[_FP_FRAC_ADD_8_i]            \
+               + _FP_FRAC_ADD_8_c);                                         \
+          _FP_FRAC_ADD_8_c                                                  \
+            = (_FP_FRAC_ADD_8_c                                             \
+               ? R##_f[_FP_FRAC_ADD_8_i] <= X##_f[_FP_FRAC_ADD_8_i]         \
+               : R##_f[_FP_FRAC_ADD_8_i] < X##_f[_FP_FRAC_ADD_8_i]);        \
+        }                                                                   \
+    }                                                                       \
+  while (0)
+
+#define _FP_FRAC_SUB_8(R, X, Y)                                             \
+  do                                                                        \
+    {                                                                       \
+      _FP_W_TYPE _FP_FRAC_SUB_8_tmp[8];                                     \
+      _FP_W_TYPE _FP_FRAC_SUB_8_c = 0;                                      \
+      _FP_I_TYPE _FP_FRAC_SUB_8_i;                                          \
+      for (_FP_FRAC_SUB_8_i = 0; _FP_FRAC_SUB_8_i < 8; ++_FP_FRAC_SUB_8_i)  \
+        {                                                                   \
+          _FP_FRAC_SUB_8_tmp[_FP_FRAC_SUB_8_i]                              \
+            = (X##_f[_FP_FRAC_SUB_8_i] - Y##_f[_FP_FRAC_SUB_8_i]            \
+               - _FP_FRAC_SUB_8_c);                                         \
+          _FP_FRAC_SUB_8_c                                                  \
+            = (_FP_FRAC_SUB_8_c                                             \
+               ? (_FP_FRAC_SUB_8_tmp[_FP_FRAC_SUB_8_i]                      \
+                  >= X##_f[_FP_FRAC_SUB_8_i])                               \
+               : (_FP_FRAC_SUB_8_tmp[_FP_FRAC_SUB_8_i]                      \
+                  > X##_f[_FP_FRAC_SUB_8_i]));                              \
+        }                                                                   \
+      for (_FP_FRAC_SUB_8_i = 0; _FP_FRAC_SUB_8_i < 8; ++_FP_FRAC_SUB_8_i)  \
+        R##_f[_FP_FRAC_SUB_8_i] = _FP_FRAC_SUB_8_tmp[_FP_FRAC_SUB_8_i];     \
+    }                                                                       \
+  while (0)
+
+#define _FP_FRAC_CLZ_8(R, X)                                                \
+  do                                                                        \
+    {                                                                       \
+      _FP_I_TYPE _FP_FRAC_CLZ_8_i;                                          \
+      for (_FP_FRAC_CLZ_8_i = 7; _FP_FRAC_CLZ_8_i > 0; _FP_FRAC_CLZ_8_i--)  \
+        if (X##_f[_FP_FRAC_CLZ_8_i])                                        \
+          break;                                                            \
+      __FP_CLZ ((R), X##_f[_FP_FRAC_CLZ_8_i]);                              \
+      (R) += _FP_W_TYPE_SIZE * (7 - _FP_FRAC_CLZ_8_i);                      \
+    }                                                                       \
+  while (0)
+
+#define _FP_MINFRAC_8   0, 0, 0, 0, 0, 0, 0, 1
+
+#define _FP_FRAC_NEGP_8(X)      ((_FP_WS_TYPE) X##_f[7] < 0)
+#define _FP_FRAC_ZEROP_8(X)                                             \
+  ((X##_f[0] | X##_f[1] | X##_f[2] | X##_f[3]                           \
+    | X##_f[4] | X##_f[5] | X##_f[6] | X##_f[7]) == 0)
+#define _FP_FRAC_HIGHBIT_DW_8(fs, X)                                    \
+  (_FP_FRAC_HIGH_DW_##fs (X) & _FP_HIGHBIT_DW_##fs)
+
+#define _FP_FRAC_COPY_4_8(D, S)                           \
+  do                                                      \
+    {                                                     \
+      D##_f[0] = S##_f[0];                                \
+      D##_f[1] = S##_f[1];                                \
+      D##_f[2] = S##_f[2];                                \
+      D##_f[3] = S##_f[3];                                \
+    }                                                     \
+  while (0)
+
+#define _FP_FRAC_COPY_8_4(D, S)                           \
+  do                                                      \
+    {                                                     \
+      D##_f[0] = S##_f[0];                                \
+      D##_f[1] = S##_f[1];                                \
+      D##_f[2] = S##_f[2];                                \
+      D##_f[3] = S##_f[3];                                \
+      D##_f[4] = D##_f[5] = D##_f[6] = D##_f[7]= 0;       \
+    }                                                     \
+  while (0)
+
+#define __FP_FRAC_SET_8(X, I7, I6, I5, I4, I3, I2, I1, I0)             \
+  (X##_f[7] = I7, X##_f[6] = I6, X##_f[5] = I5, X##_f[4] = I4,         \
+   X##_f[3] = I3, X##_f[2] = I2, X##_f[1] = I1, X##_f[0] = I0)
+
 #endif /* !SOFT_FP_OP_8_H */
--
2.7.4
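
The borrow handling in _FP_FRAC_SUB_8 above can be sketched as a plain
function, assuming 32-bit words. The names word_t, NWORDS, sub_words and
sub_words_check are illustrative only. As in the macro, results go to a
temporary array first, so the destination may be the same storage as a
source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t word_t; /* assumption: stands in for a 32-bit _FP_W_TYPE */
enum { NWORDS = 8 };

/* r = x - y over NWORDS words, least significant word first.  */
void sub_words(word_t *r, const word_t *x, const word_t *y)
{
  word_t tmp[NWORDS];
  word_t borrow = 0;
  for (size_t i = 0; i < NWORDS; ++i)
    {
      tmp[i] = x[i] - y[i] - borrow;
      /* With an incoming borrow, a result equal to x[i] also means the
         subtraction wrapped, hence >= instead of >.  */
      borrow = borrow ? (tmp[i] >= x[i]) : (tmp[i] > x[i]);
    }
  for (size_t i = 0; i < NWORDS; ++i)
    r[i] = tmp[i];
}

/* Self-check: 2^224 - 1 computed in place ripples a borrow through
   seven words.  */
void sub_words_check(void)
{
  word_t x[NWORDS] = {0u, 0u, 0u, 0u, 0u, 0u, 0u, 1u};
  const word_t y[NWORDS] = {1u, 0u, 0u, 0u, 0u, 0u, 0u, 0u};
  sub_words(x, x, y);
  for (size_t i = 0; i < 7; ++i)
    assert(x[i] == 0xFFFFFFFFu);
  assert(x[7] == 0u);
}
```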


Re: [PATCH v3 1/2] soft-fp: Use temporary variable in FP_FRAC_SUB_3/FP_FRAC_SUB_4

Joseph Myers
In reply to this post by Zong Li-2
On Thu, 1 Nov 2018, Zong Li wrote:

> FRAC_SUB_3(R, X, Y) and FRAC_SUB_4(R, X, Y) reference both X[N] and
> Y[N] after R[N] has been set. If X or Y has the same address as R,
> the result of the calculation is wrong, because the original value of
> X or Y has been overwritten.
>
> In glibc, there are two places where FRAC_SUB is used with overlapping
> arguments. The first is _FP_DIV_MEAT_N_loop in op-common.h, which uses
> the source _FP_DIV_MEAT_N_loop_u as the destination. This macro is only
> used when N is one (_FP_DIV_MEAT_1_loop), so _FP_FRAC_SUB_##wc expands
> to _FP_FRAC_SUB_1 there. It still works because _FP_FRAC_SUB_1 has no
> overlap problem in its implementation.
> The second place is _FP_DIV_MEAT_4_udiv, where the original value of
> X##_f[0] is overwritten before the calculation.
>
> FRAC_SUB_1 and FRAC_SUB_2 do not refer to the source after the
> destination has been set, so they have no problem.

Thanks, I've committed this patch.

(The #if 0 version of __FP_FRAC_SUB_2 would violate sequence point rules
if rl and xl were the same - "((rl = xl - yl) > xl)" is not valid in that
case - but as it's #if 0 that's not a particular concern.  The generic C
version of sub_ddmmss seems to get this right, and the assembly versions
mostly use earlyclobbers to deal with this.)
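
A minimal sketch of the pattern that avoids the problem, assuming 32-bit
words: the low difference is computed into a temporary before it is
compared, so no object is both modified and read within one expression.
SUB_2_SAFE, word_t, sub2_in_place and sub2_check are hypothetical names for
illustration, not the soft-fp or longlong.h definitions.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t word_t; /* assumption: 32-bit machine word */

/* (rh:rl) = (xh:xl) - (yh:yl).  The low word goes through a temporary,
   so rl may name the same object as xl.  */
#define SUB_2_SAFE(rh, rl, xh, xl, yh, yl)      \
  do                                            \
    {                                           \
      word_t lo_ = (xl) - (yl);                 \
      (rh) = (xh) - (yh) - (lo_ > (xl));        \
      (rl) = lo_;                               \
    }                                           \
  while (0)

/* In-place use: destination and first source are the same objects.  */
void sub2_in_place(word_t *hi, word_t *lo, word_t yh, word_t yl)
{
  SUB_2_SAFE (*hi, *lo, *hi, *lo, yh, yl);
}

/* Self-check: 2^32 - 1 leaves a zero high word and an all-ones low word.  */
void sub2_check(void)
{
  word_t hi = 1u, lo = 0u;
  sub2_in_place(&hi, &lo, 0u, 1u);
  assert(hi == 0u && lo == 0xFFFFFFFFu);
}
```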

--
Joseph S. Myers
[hidden email]

Re: [PATCH v3 2/2] soft-fp: Add implementation for 128 bit self-contained

Joseph Myers
In reply to this post by Zong Li-2
Thanks, committed.

--
Joseph S. Myers
[hidden email]

Re: [PATCH v3 1/2] soft-fp: Use temporary variable in FP_FRAC_SUB_3/FP_FRAC_SUB_4

Zong Li-2
In reply to this post by Joseph Myers
On Fri, Nov 2, 2018 at 1:41 AM, Joseph Myers <[hidden email]> wrote:

>
> On Thu, 1 Nov 2018, Zong Li wrote:
>
> > FRAC_SUB_3(R, X, Y) and FRAC_SUB_4(R, X, Y) reference both X[N] and
> > Y[N] after R[N] has been set. If X or Y has the same address as R,
> > the result of the calculation is wrong, because the original value of
> > X or Y has been overwritten.
> >
> > In glibc, there are two places where FRAC_SUB is used with overlapping
> > arguments. The first is _FP_DIV_MEAT_N_loop in op-common.h, which uses
> > the source _FP_DIV_MEAT_N_loop_u as the destination. This macro is only
> > used when N is one (_FP_DIV_MEAT_1_loop), so _FP_FRAC_SUB_##wc expands
> > to _FP_FRAC_SUB_1 there. It still works because _FP_FRAC_SUB_1 has no
> > overlap problem in its implementation.
> > The second place is _FP_DIV_MEAT_4_udiv, where the original value of
> > X##_f[0] is overwritten before the calculation.
> >
> > FRAC_SUB_1 and FRAC_SUB_2 do not refer to the source after the
> > destination has been set, so they have no problem.
>
> Thanks, I've committed this patch.
>
> (The #if 0 version of __FP_FRAC_SUB_2 would violate sequence point rules
> if rl and xl were the same - "((rl = xl - yl) > xl)" is not valid in that
> case - but as it's #if 0 that's not a particular concern.  The generic C
> version of sub_ddmmss seems to get this right, and the assembly versions
> mostly use earlyclobbers to deal with this.)
>

OK, I see. Sorry if this is a naive question, but I am curious why the
#if 0 version is present in glibc, and when it would be used.

Re: [PATCH v3 2/2] soft-fp: Add implementation for 128 bit self-contained

Kito Cheng-2
In reply to this post by Joseph Myers
Hi Joseph:

(Apologies if you receive this mail twice; I forgot to switch to plain text
mode in Gmail.)

This patch and the patch for soft-fp/op-4.h [1] are also needed for libgcc
(libgcc/soft-fp/op-4.h, libgcc/soft-fp/op-8.h). Could you commit those
two patches to GCC too?

Thanks :)

[1] https://www.sourceware.org/ml/libc-alpha/2018-11/msg00010.html

On Fri, Nov 2, 2018 at 2:22 AM Joseph Myers <[hidden email]> wrote:
> Thanks, committed.
>
> --
> Joseph S. Myers
> [hidden email]

Re: [PATCH v3 1/2] soft-fp: Use temporary variable in FP_FRAC_SUB_3/FP_FRAC_SUB_4

Joseph Myers
In reply to this post by Zong Li-2
On Fri, 2 Nov 2018, Zong Li wrote:

> OK, I see. Sorry if this is a naive question, but I am curious why the
> #if 0 version is present in glibc, and when it would be used.

You would have to research the state of the code when first added to glibc
and the Linux kernel, remembering that there may well have been no
community discussion of it at the time.

--
Joseph S. Myers
[hidden email]

Re: [PATCH v3 2/2] soft-fp: Add implementation for 128 bit self-contained

Joseph Myers
In reply to this post by Kito Cheng-2
On Fri, 2 Nov 2018, Kito Cheng wrote:

> This patch and the patch for soft-fp/op-4.h [1] are also needed for libgcc
> (libgcc/soft-fp/op-4.h, libgcc/soft-fp/op-8.h). Could you commit those
> two patches to GCC too?

I advise testing and proposing a GCC patch to update *all* the files
coming from soft-fp to their current versions in glibc (not just these two
patches).

--
Joseph S. Myers
[hidden email]

Re: [PATCH v3 2/2] soft-fp: Add implementation for 128 bit self-contained

Kito Cheng-2
Hi Joseph:

> I advise testing and proposing a GCC patch to update *all* the files
> coming from soft-fp to their current versions in glibc (not just these two
> patches).

Sounds good to me. I've compared the soft-fp directories in GCC and
glibc; fortunately there are only four differences.
The changes in glibc/soft-fp are listed below, each with its
corresponding git SHA-1. I have not listed files whose only change
is an updated copyright year.

I can test the following changes on riscv* and nds32 targets, but I am
not sure whether we should also test other architectures before sending a patch.

soft-fp/op-4.h
  - soft-fp: Use temporary variable in FP_FRAC_SUB_3/FP_FRAC_SUB_4
    - ff48ea6787526d7e669af93ce2681b911d39675c
soft-fp/op-8.h
  - soft-fp: Add implementation for 128 bit self-contained
    - af1d5782c1e3a635fdd13d6688be64de7759857c
soft-fp/op-common.h
- Add FP_TRUNC_COOKED
  - Add narrowing multiply functions.
    - 69a01461ee1417578d2ba20aac935828b50f1118

soft-fp/extended.h
soft-fp/half.h
soft-fp/single.h
soft-fp/double.h
soft-fp/quad.h
- __attribute__ ((packed)) removed
  - Do not use packed structures in soft-fp.
    - 049375e2b5fc707436fd5d80337c253beededb2d

Re: [PATCH v3 2/2] soft-fp: Add implementation for 128 bit self-contained

Joseph Myers
On Sat, 3 Nov 2018, Kito Cheng wrote:

> I can test the following changes on riscv* and nds32 targets, but I am
> not sure whether we should also test other architectures before sending a patch.

I think testing on one architecture suffices - but the patch must update
*all* soft-fp files in GCC that come from glibc with copies of the glibc
versions, so that they become byte-identical to the glibc versions again -
not just the files with changes other than copyright dates.

--
Joseph S. Myers
[hidden email]

Re: [PATCH v3 2/2] soft-fp: Add implementation for 128 bit self-contained

Kito Cheng-2
Hi Joseph:

> I think testing on one architecture suffices - but the patch must update
> *all* soft-fp files in GCC that come from glibc with copies of the glibc
> versions, so that they become byte-identical to the glibc versions again -
> not just the files with changes other than copyright dates.

OK, I'll prepare the patch and run the tests in the next few days. Just one
more question: two files, Makefile and testit.c, do not exist in GCC's
current soft-fp.
Should we also copy those two files, even though they are unused in GCC, to
make sure it is byte-identical to the glibc version?

Thanks :)
On Tue, Nov 6, 2018 at 12:17 AM Joseph Myers <[hidden email]> wrote:

>
> On Sat, 3 Nov 2018, Kito Cheng wrote:
>
> > I can test the following changes on riscv* and nds32 targets, but I am
> > not sure whether we should also test other architectures before sending a patch.
>
> I think testing on one architecture suffices - but the patch must update
> *all* soft-fp files in GCC that come from glibc with copies of the glibc
> versions, so that they become byte-identical to the glibc versions again -
> not just the files with changes other than copyright dates.
>
> --
> Joseph S. Myers
> [hidden email]

Re: [PATCH v3 2/2] soft-fp: Add implementation for 128 bit self-contained

Joseph Myers
On Tue, 6 Nov 2018, Kito Cheng wrote:

> Hi Joseph:
>
> > I think testing on one architecture suffices - but the patch must update
> > *all* soft-fp files in GCC that come from glibc with copies of the glibc
> > versions, so that they become byte-identical to the glibc versions again -
> > not just the files with changes other than copyright dates.
>
> OK, I'll prepare the patch and run the tests in the next few days. Just one
> more question: two files, Makefile and testit.c, do not exist in GCC's
> current soft-fp.
> Should we also copy those two files, even though they are unused in GCC, to
> make sure it is byte-identical to the glibc version?

No.  It's the subset of files that are present in both places that should
be byte-identical between them.

--
Joseph S. Myers
[hidden email]