[PATCH 0/4] Use __copysign in sincos wherever possible

classic Classic list List threaded Threaded
17 messages Options
Reply | Threaded
Open this post in threaded view
|

[PATCH 0/4] Use __copysign in sincos wherever possible

Siddhesh Poyarekar-9
Hi,

The following set of patches is another set of cleanups to sin/cos functions
and their helpers.  The key change is to replace ternary conditions in the code
with copysign wherever applicable.  This results in improving performance of
the SPEC2006 test tonto by 4.5%.

Siddhesh Poyarekar (4):
  consolidate sign checks for slow2
  Check n instead of k1 to decide on sign of sin/cos result
  Use copysign instead of ternary conditions for positive constants
  Use copysign instead of ternary for some sin/cos input ranges

 sysdeps/ieee754/dbl-64/s_sin.c | 80 ++++++++++++++++++++++--------------------
 1 file changed, 41 insertions(+), 39 deletions(-)

--
2.7.4

Reply | Threaded
Open this post in threaded view
|

[PATCH 1/4] consolidate sign checks for slow2

Siddhesh Poyarekar-9
Simplify the code a bit by consolidating sign checks in slow1 and
slow2 into __sin at the higher level.

Siddhesh

        * sysdeps/ieee754/dbl-64/s_sin.c (slow1): Consolidate sign
        check from here...
        (slow2): ... and here...
        (__sin): ... to here.
---
 sysdeps/ieee754/dbl-64/s_sin.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c
index 18f1789..d60feb4 100644
--- a/sysdeps/ieee754/dbl-64/s_sin.c
+++ b/sysdeps/ieee754/dbl-64/s_sin.c
@@ -478,7 +478,8 @@ __sin (double x)
   else if (k < 0x3feb6000)
     {
       res = do_sin (x, 0, &cor);
-      retval = (res == res + 1.096 * cor) ? (m > 0 ? res : -res) : slow1 (x);
+      retval = (res == res + 1.096 * cor) ? res : slow1 (x);
+      retval = m > 0 ? retval : -retval;
     } /*   else  if (k < 0x3feb6000)    */
 
 /*----------------------- 0.855469  <|x|<2.426265  ----------------------*/
@@ -487,7 +488,8 @@ __sin (double x)
 
       t = hp0 - fabs (x);
       res = do_cos (t, hp1, &cor);
-      retval = (res == res + 1.020 * cor) ? ((m > 0) ? res : -res) : slow2 (x);
+      retval = (res == res + 1.020 * cor) ? res : slow2 (x);
+      retval = m > 0 ? retval : -retval;
     } /*   else  if (k < 0x400368fd)    */
 
 #ifndef IN_SINCOS
@@ -650,13 +652,13 @@ slow1 (double x)
 
   res = do_sin_slow (x, 0, 0, &cor);
   if (res == res + cor)
-    return (x > 0) ? res : -res;
+    return res;
 
   __dubsin (fabs (x), 0, w);
   if (w[0] == w[0] + 1.000000005 * w[1])
-    return (x > 0) ? w[0] : -w[0];
+    return w[0];
 
-  return (x > 0) ? __mpsin (x, 0, false) : -__mpsin (-x, 0, false);
+  return __mpsin (fabs (x), 0, false);
 }
 
 /**************************************************************************/
@@ -672,16 +674,16 @@ slow2 (double x)
   double t = hp0 - fabs (x);
   res = do_cos_slow (t, hp1, 0, &cor);
   if (res == res + cor)
-    return (x > 0) ? res : -res;
+    return res;
 
   y = fabs (x) - hp0;
   y1 = y - hp1;
   y2 = (y - y1) - hp1;
   __docos (y1, y2, w);
   if (w[0] == w[0] + 1.000000005 * w[1])
-    return (x > 0) ? w[0] : -w[0];
+    return w[0];
 
-  return (x > 0) ? __mpsin (x, 0, false) : -__mpsin (-x, 0, false);
+  return __mpsin (fabs (x), 0, false);
 }
 
 /***************************************************************************/
--
2.7.4

Reply | Threaded
Open this post in threaded view
|

[PATCH 2/4] Check n instead of k1 to decide on sign of sin/cos result

Siddhesh Poyarekar-9
In reply to this post by Siddhesh Poyarekar-9
For k1 in 1 and 3, n can only have values of 0 and 2, so checking k1 &
2 is equivalent to checking n & 2.  We prefer the latter so that we
don't use k1 for anything other than selecting the quadrant in
do_sincos_1, thus dropping it completely.

        * sysdeps/ieee754/dbl-64/s_sin.c (do_sincos_1): Check N
        instead of K1.
---
 sysdeps/ieee754/dbl-64/s_sin.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c
index d60feb4..ea41a7c 100644
--- a/sysdeps/ieee754/dbl-64/s_sin.c
+++ b/sysdeps/ieee754/dbl-64/s_sin.c
@@ -353,7 +353,7 @@ do_sincos_1 (double a, double da, double x, int4 n, int4 k)
     case 3:
       res = do_cos (a, da, &cor);
       cor = (cor > 0) ? 1.025 * cor + eps : 1.025 * cor - eps;
-      retval = ((res == res + cor) ? ((k1 & 2) ? -res : res)
+      retval = ((res == res + cor) ? ((n & 2) ? -res : res)
  : sloww2 (a, da, x, n));
       break;
     }
--
2.7.4

Reply | Threaded
Open this post in threaded view
|

[PATCH 3/4] Use copysign instead of ternary conditions for positive constants

Siddhesh Poyarekar-9
In reply to this post by Siddhesh Poyarekar-9
This is the first very simple substitution of ternary conditions for
correction adjustments with __copysign for positive constants.

        * sysdeps/ieee754/dbl-64/s_sin.c (do_cos_slow): use copysign
        instead of ternary condition.
        (do_sin_slow): Likewise.
        (do_sincos_1): Likewise.
        (do_sincos_2): Likewise.
        (__cos): Likewise.
        (sloww): Likewise.
        (sloww1): Likewise.
        (sloww2): Likewise.
        (bsloww): Likewise.
        (bsloww1): Likewise.
        (bsloww2): Likewise.
---
 sysdeps/ieee754/dbl-64/s_sin.c | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c
index ea41a7c..e4333a4 100644
--- a/sysdeps/ieee754/dbl-64/s_sin.c
+++ b/sysdeps/ieee754/dbl-64/s_sin.c
@@ -198,7 +198,7 @@ do_cos_slow (double x, double dx, double eps, double *corp)
   cor = cor + ((cs - y) - e1 * x1);
   res = y + cor;
   cor = (y - res) + cor;
-  cor = 1.0005 * cor + ((cor > 0) ? eps : -eps);
+  cor = 1.0005 * cor + __copysign (eps, cor);
   *corp = cor;
   return res;
 }
@@ -258,7 +258,7 @@ do_sin_slow (double x, double dx, double eps, double *corp)
   cor = cor + ((sn - y) + c1 * x1);
   res = y + cor;
   cor = (y - res) + cor;
-  cor = 1.0005 * cor + ((cor > 0) ? eps : -eps);
+  cor = 1.0005 * cor + __copysign (eps, cor);
   *corp = cor;
   return res;
 }
@@ -337,13 +337,13 @@ do_sincos_1 (double a, double da, double x, int4 n, int4 k)
  {
   /* Taylor series.  */
   res = TAYLOR_SIN (xx, a, da, cor);
-  cor = (cor > 0) ? 1.02 * cor + eps : 1.02 * cor - eps;
+  cor = 1.02 * cor + __copysign (eps, cor);
   retval = (res == res + cor) ? res : sloww (a, da, x, k);
  }
       else
  {
   res = do_sin (a, da, &cor);
-  cor = (cor > 0) ? 1.035 * cor + eps : 1.035 * cor - eps;
+  cor = 1.035 * cor + __copysign (eps, cor);
   retval = ((res == res + cor) ? ((a > 0) ? res : -res)
     : sloww1 (a, da, x, k));
  }
@@ -352,7 +352,7 @@ do_sincos_1 (double a, double da, double x, int4 n, int4 k)
     case 1:
     case 3:
       res = do_cos (a, da, &cor);
-      cor = (cor > 0) ? 1.025 * cor + eps : 1.025 * cor - eps;
+      cor = 1.025 * cor + __copysign (eps, cor);
       retval = ((res == res + cor) ? ((n & 2) ? -res : res)
  : sloww2 (a, da, x, n));
       break;
@@ -411,13 +411,13 @@ do_sincos_2 (double a, double da, double x, int4 n, int4 k)
  {
   /* Taylor series.  */
   res = TAYLOR_SIN (xx, a, da, cor);
-  cor = (cor > 0) ? 1.02 * cor + eps : 1.02 * cor - eps;
+  cor = 1.02 * cor + __copysign (eps, cor);
   retval = (res == res + cor) ? res : bsloww (a, da, x, n);
  }
       else
  {
   res = do_sin (a, da, &cor);
-  cor = (cor > 0) ? 1.035 * cor + eps : 1.035 * cor - eps;
+  cor = 1.035 * cor + __copysign (eps, cor);
   retval = ((res == res + cor) ? ((a > 0) ? res : -res)
     : bsloww1 (a, da, x, n));
  }
@@ -426,7 +426,7 @@ do_sincos_2 (double a, double da, double x, int4 n, int4 k)
     case 1:
     case 3:
       res = do_cos (a, da, &cor);
-      cor = (cor > 0) ? 1.025 * cor + eps : 1.025 * cor - eps;
+      cor = 1.025 * cor + __copysign (eps, cor);
       retval = ((res == res + cor) ? ((n & 2) ? -res : res)
  : bsloww2 (a, da, x, n));
       break;
@@ -573,13 +573,13 @@ __cos (double x)
       if (xx < 0.01588)
  {
   res = TAYLOR_SIN (xx, a, da, cor);
-  cor = (cor > 0) ? 1.02 * cor + 1.0e-31 : 1.02 * cor - 1.0e-31;
+  cor = 1.02 * cor + __copysign (1.0e-31, cor);
   retval = (res == res + cor) ? res : sloww (a, da, x, 1);
  }
       else
  {
   res = do_sin (a, da, &cor);
-  cor = (cor > 0) ? 1.035 * cor + 1.0e-31 : 1.035 * cor - 1.0e-31;
+  cor = 1.035 * cor + __copysign (1.0e-31, cor);
   retval = ((res == res + cor) ? ((a > 0) ? res : -res)
     : sloww1 (a, da, x, 1));
  }
@@ -705,7 +705,7 @@ sloww (double x, double dx, double orig, int k)
 
   double eps = fabs (orig) * 3.1e-30;
 
-  cor = 1.0005 * cor + ((cor > 0) ? eps : -eps);
+  cor = 1.0005 * cor + __copysign (eps, cor);
 
   if (res == res + cor)
     return res;
@@ -714,7 +714,7 @@ sloww (double x, double dx, double orig, int k)
   da = (x > 0) ? dx : -dx;
   __dubsin (a, da, w);
   eps = fabs (orig) * 1.1e-30;
-  cor = 1.000000001 * w[1] + ((w[1] > 0) ? eps : -eps);
+  cor = 1.000000001 * w[1] + __copysign (eps, w[1]);
 
   if (w[0] == w[0] + cor)
     return (x > 0) ? w[0] : -w[0];
@@ -740,7 +740,7 @@ sloww (double x, double dx, double orig, int k)
   dx = (a > 0) ? da : -da;
   __dubsin (x, dx, w);
   eps = fabs (orig) * 1.1e-40;
-  cor = 1.000000001 * w[1] + ((w[1] > 0) ? eps : -eps);
+  cor = 1.000000001 * w[1] + __copysign (eps, w[1]);
 
   if (w[0] == w[0] + cor)
     return (a > 0) ? w[0] : -w[0];
@@ -770,7 +770,7 @@ sloww1 (double x, double dx, double orig, int k)
   __dubsin (fabs (x), dx, w);
 
   double eps = 1.1e-30 * fabs (orig);
-  cor = 1.000000005 * w[1] + ((w[1] > 0) ? eps : -eps);
+  cor = 1.000000005 * w[1] + __copysign (eps, w[1]);
 
   if (w[0] == w[0] + cor)
     return (x > 0) ? w[0] : -w[0];
@@ -800,7 +800,7 @@ sloww2 (double x, double dx, double orig, int n)
   __docos (fabs (x), dx, w);
 
   double eps = 1.1e-30 * fabs (orig);
-  cor = 1.000000005 * w[1] + ((w[1] > 0) ? eps : -eps);
+  cor = 1.000000005 * w[1] + __copysign (eps, w[1]);
 
   if (w[0] == w[0] + cor)
     return (n & 2) ? -w[0] : w[0];
@@ -823,14 +823,14 @@ bsloww (double x, double dx, double orig, int n)
   double res, cor, w[2], a, da;
 
   res = TAYLOR_SLOW (x, dx, cor);
-  cor = 1.0005 * cor + ((cor > 0) ? 1.1e-24 : -1.1e-24);
+  cor = 1.0005 * cor + __copysign (1.1e-24, cor);
   if (res == res + cor)
     return res;
 
   a = fabs (x);
   da = (x > 0) ? dx : -dx;
   __dubsin (a, da, w);
-  cor = 1.000000001 * w[1] + ((w[1] > 0) ? 1.1e-24 : -1.1e-24);
+  cor = 1.000000001 * w[1] + __copysign (1.1e-24, w[1]);
 
   if (w[0] == w[0] + cor)
     return (x > 0) ? w[0] : -w[0];
@@ -858,7 +858,7 @@ bsloww1 (double x, double dx, double orig, int n)
   dx = (x > 0) ? dx : -dx;
   __dubsin (fabs (x), dx, w);
 
-  cor = 1.000000005 * w[1] + ((w[1] > 0) ? 1.1e-24 : -1.1e-24);
+  cor = 1.000000005 * w[1] + __copysign (1.1e-24, w[1]);
 
   if (w[0] == w[0] + cor)
     return (x > 0) ? w[0] : -w[0];
@@ -886,7 +886,7 @@ bsloww2 (double x, double dx, double orig, int n)
   dx = (x > 0) ? dx : -dx;
   __docos (fabs (x), dx, w);
 
-  cor = 1.000000005 * w[1] + ((w[1] > 0) ? 1.1e-24 : -1.1e-24);
+  cor = 1.000000005 * w[1] + __copysign (1.1e-24, w[1]);
 
   if (w[0] == w[0] + cor)
     return (n & 2) ? -w[0] : w[0];
--
2.7.4

Reply | Threaded
Open this post in threaded view
|

[PATCH 4/4] Use copysign instead of ternary for some sin/cos input ranges

Siddhesh Poyarekar-9
In reply to this post by Siddhesh Poyarekar-9
These are remaining cases where we can deduce and conclude that the
sign of the result should be the same as the sign of the input being
checked.  For example, for sin(x), the sign of the result is the same
as the result itself for x < pi.  Likewise, for sine values where x
after range reduction falls into this range and its sign is preserved.

        * sysdeps/ieee754/dbl-64/s_sin.c (do_sincos_1): Use copysign
        instead of ternary condition.
        (do_sincos_2): Likewise.
        (__sin): Likewise.
        (__cos): Likewise.
        (slow): Likewise.
        (sloww): Likewise.
        (sloww1): Likewise.
        (bsloww): Likewise.
        (bsloww1): Likewise.
---
 sysdeps/ieee754/dbl-64/s_sin.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c
index e4333a4..40d538d 100644
--- a/sysdeps/ieee754/dbl-64/s_sin.c
+++ b/sysdeps/ieee754/dbl-64/s_sin.c
@@ -344,7 +344,7 @@ do_sincos_1 (double a, double da, double x, int4 n, int4 k)
  {
   res = do_sin (a, da, &cor);
   cor = 1.035 * cor + __copysign (eps, cor);
-  retval = ((res == res + cor) ? ((a > 0) ? res : -res)
+  retval = ((res == res + cor) ? __copysign (res, a)
     : sloww1 (a, da, x, k));
  }
       break;
@@ -418,7 +418,7 @@ do_sincos_2 (double a, double da, double x, int4 n, int4 k)
  {
   res = do_sin (a, da, &cor);
   cor = 1.035 * cor + __copysign (eps, cor);
-  retval = ((res == res + cor) ? ((a > 0) ? res : -res)
+  retval = ((res == res + cor) ? __copysign (res, a)
     : bsloww1 (a, da, x, n));
  }
       break;
@@ -479,7 +479,7 @@ __sin (double x)
     {
       res = do_sin (x, 0, &cor);
       retval = (res == res + 1.096 * cor) ? res : slow1 (x);
-      retval = m > 0 ? retval : -retval;
+      retval = __copysign (retval, x);
     } /*   else  if (k < 0x3feb6000)    */
 
 /*----------------------- 0.855469  <|x|<2.426265  ----------------------*/
@@ -489,7 +489,7 @@ __sin (double x)
       t = hp0 - fabs (x);
       res = do_cos (t, hp1, &cor);
       retval = (res == res + 1.020 * cor) ? res : slow2 (x);
-      retval = m > 0 ? retval : -retval;
+      retval = __copysign (retval, x);
     } /*   else  if (k < 0x400368fd)    */
 
 #ifndef IN_SINCOS
@@ -580,7 +580,7 @@ __cos (double x)
  {
   res = do_sin (a, da, &cor);
   cor = 1.035 * cor + __copysign (1.0e-31, cor);
-  retval = ((res == res + cor) ? ((a > 0) ? res : -res)
+  retval = ((res == res + cor) ? __copysign (res, a)
     : sloww1 (a, da, x, 1));
  }
 
@@ -634,9 +634,9 @@ slow (double x)
 
   __dubsin (fabs (x), 0, w);
   if (w[0] == w[0] + 1.000000001 * w[1])
-    return (x > 0) ? w[0] : -w[0];
+    return __copysign (w[0], x);
 
-  return (x > 0) ? __mpsin (x, 0, false) : -__mpsin (-x, 0, false);
+  return __copysign (__mpsin (fabs (x), 0, false), x);
 }
 
 /*******************************************************************************/
@@ -717,7 +717,7 @@ sloww (double x, double dx, double orig, int k)
   cor = 1.000000001 * w[1] + __copysign (eps, w[1]);
 
   if (w[0] == w[0] + cor)
-    return (x > 0) ? w[0] : -w[0];
+    return __copysign (w[0], x);
 
   t = (orig * hpinv + toint);
   xn = t - toint;
@@ -743,7 +743,7 @@ sloww (double x, double dx, double orig, int k)
   cor = 1.000000001 * w[1] + __copysign (eps, w[1]);
 
   if (w[0] == w[0] + cor)
-    return (a > 0) ? w[0] : -w[0];
+    return __copysign (w[0], a);
 
   return k ? __mpcos (orig, 0, true) : __mpsin (orig, 0, true);
 }
@@ -764,7 +764,7 @@ sloww1 (double x, double dx, double orig, int k)
   res = do_sin_slow (x, dx, 3.1e-30 * fabs (orig), &cor);
 
   if (res == res + cor)
-    return (x > 0) ? res : -res;
+    return __copysign (res, x);
 
   dx = (x > 0 ? dx : -dx);
   __dubsin (fabs (x), dx, w);
@@ -773,7 +773,7 @@ sloww1 (double x, double dx, double orig, int k)
   cor = 1.000000005 * w[1] + __copysign (eps, w[1]);
 
   if (w[0] == w[0] + cor)
-    return (x > 0) ? w[0] : -w[0];
+    return __copysign (w[0], x);
 
   return (k == 1) ? __mpcos (orig, 0, true) : __mpsin (orig, 0, true);
 }
@@ -833,7 +833,7 @@ bsloww (double x, double dx, double orig, int n)
   cor = 1.000000001 * w[1] + __copysign (1.1e-24, w[1]);
 
   if (w[0] == w[0] + cor)
-    return (x > 0) ? w[0] : -w[0];
+    return __copysign (w[0], x);
 
   return (n & 1) ? __mpcos (orig, 0, true) : __mpsin (orig, 0, true);
 }
@@ -861,7 +861,7 @@ bsloww1 (double x, double dx, double orig, int n)
   cor = 1.000000005 * w[1] + __copysign (1.1e-24, w[1]);
 
   if (w[0] == w[0] + cor)
-    return (x > 0) ? w[0] : -w[0];
+    return __copysign (w[0], x);
 
   return (n & 1) ? __mpcos (orig, 0, true) : __mpsin (orig, 0, true);
 }
--
2.7.4

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 0/4] Use __copysign in sincos wherever possible

Carlos O'Donell-6
In reply to this post by Siddhesh Poyarekar-9
On 09/27/2016 01:49 PM, Siddhesh Poyarekar wrote:

> Hi,
>
> The following set of patches is another set of cleanups to sin/cos functions
> and their helpers.  The key change is to replace ternary conditions in the code
> with copysign wherever applicable.  This results in improving performance of
> the SPEC2006 test tonto by 4.5%.
>
> Siddhesh Poyarekar (4):
>   consolidate sign checks for slow2
>   Check n instead of k1 to decide on sign of sin/cos result
>   Use copysign instead of ternary conditions for positive constants
>   Use copysign instead of ternary for some sin/cos input ranges
>
>  sysdeps/ieee754/dbl-64/s_sin.c | 80 ++++++++++++++++++++++--------------------
>  1 file changed, 41 insertions(+), 39 deletions(-)
>

That's quiet a good improvement.

Is that a mean percentage improvement over some number of consecutive runs?

I'm just wondering if the 4.5% is noise, or a real performance gain.

--
Cheers,
Carlos.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 0/4] Use __copysign in sincos wherever possible

Siddhesh Poyarekar-9
On Tuesday 27 September 2016 12:10 PM, Carlos O'Donell wrote:
> That's quiet a good improvement.
>
> Is that a mean percentage improvement over some number of consecutive runs?
>
> I'm just wondering if the 4.5% is noise, or a real performance gain.

It has been consistently repeatable in our tests on multiple computers
of the same configuration.

Siddhesh
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 1/4] consolidate sign checks for slow2

Carlos O'Donell-6
In reply to this post by Siddhesh Poyarekar-9
On 09/27/2016 01:49 PM, Siddhesh Poyarekar wrote:
> Simplify the code a bit by consolidating sign checks in slow1 and
> slow2 into __sin at the higher level.
>
> Siddhesh
>
> * sysdeps/ieee754/dbl-64/s_sin.c (slow1): Consolidate sign
> check from here...
> (slow2): ... and here...
> (__sin): ... to here.

LGTM.

--
Cheers,
Carlos.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 4/4] Use copysign instead of ternary for some sin/cos input ranges

Maurizio Manfredini
In reply to this post by Siddhesh Poyarekar-9


On 9/27/2016 7:49 PM, Siddhesh Poyarekar wrote:
> @@ -479,7 +479,7 @@ __sin (double x)
>      {
>        res = do_sin (x, 0, &cor);
>        retval = (res == res + 1.096 * cor) ? res : slow1 (x);
> -      retval = m > 0 ? retval : -retval;
> +      retval = __copysign (retval, x);
>      } /*   else  if (k < 0x3feb6000)    */
>
The ternary uses m, __copysign uses x. It could be correct (I didn't
check the full context), but just in case...
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 2/4] Check n instead of k1 to decide on sign of sin/cos result

Carlos O'Donell-6
In reply to this post by Siddhesh Poyarekar-9
On 09/27/2016 01:49 PM, Siddhesh Poyarekar wrote:
> For k1 in 1 and 3, n can only have values of 0 and 2, so checking k1 &
> 2 is equivalent to checking n & 2.  We prefer the latter so that we
> don't use k1 for anything other than selecting the quadrant in
> do_sincos_1, thus dropping it completely.

This is only true of the quadrant shift K is restricted to 0 or 1, which
is true of the code today, but may not be true tomorrow.

In which case you need `assert (k == 0 || k == 1);` and a comment
about this function only working for a shift of 0 or 1.

OK to checkin if you add the assert and comment.

> * sysdeps/ieee754/dbl-64/s_sin.c (do_sincos_1): Check N
> instead of K1.
> ---
>  sysdeps/ieee754/dbl-64/s_sin.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c
> index d60feb4..ea41a7c 100644
> --- a/sysdeps/ieee754/dbl-64/s_sin.c
> +++ b/sysdeps/ieee754/dbl-64/s_sin.c
> @@ -353,7 +353,7 @@ do_sincos_1 (double a, double da, double x, int4 n, int4 k)
>      case 3:
>        res = do_cos (a, da, &cor);
>        cor = (cor > 0) ? 1.025 * cor + eps : 1.025 * cor - eps;
> -      retval = ((res == res + cor) ? ((k1 & 2) ? -res : res)
> +      retval = ((res == res + cor) ? ((n & 2) ? -res : res)
>   : sloww2 (a, da, x, n));
>        break;
>      }
>

Start with:
(k1 & 2)

Expand in:
k1 = (n + k) & 3
(((n + k) & 3) & 2)

Simplify:
(n + k) & 2

Assume k1 is 1 or 3:
(n + k) & 3 = 1
(n + k) & 3 = 3

Requires the following ranges:
n = 0, k = 1
n = 1, k = 0

n = 0, k = 3
n = 1, k = 2
n = 2, k = 1
n = 3, k = 0


Produces the following result:
n = 0, k = 1, (n + k) & 2 = 0, n & 2 = 0; PASS
n = 1, k = 0, (n + k) & 2 = 0, n & 2 = 0; PASS

n = 0, k = 3, (n + k) & 2 = 2, n & 2 = 0; FAIL
n = 1, k = 2, (n + k) & 2 = 2, n & 2 = 0; FAIL
n = 2, k = 1, (n + k) & 2 = 2, n & 2 = 2; PASS
n = 3, k = 0, (n + k) & 2 = 2, n & 2 = 2; PASS

Thus the substitution you make is invalid for case where K is 2 or 3,
but that's never used in the code.

--
Cheers,
Carlos.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 3/4] Use copysign instead of ternary conditions for positive constants

Carlos O'Donell-6
In reply to this post by Siddhesh Poyarekar-9
On 09/27/2016 01:49 PM, Siddhesh Poyarekar wrote:

> This is the first very simple substitution of ternary conditions for
> correction adjustments with __copysign for positive constants.
>
> * sysdeps/ieee754/dbl-64/s_sin.c (do_cos_slow): use copysign
> instead of ternary condition.
> (do_sin_slow): Likewise.
> (do_sincos_1): Likewise.
> (do_sincos_2): Likewise.
> (__cos): Likewise.
> (sloww): Likewise.
> (sloww1): Likewise.
> (sloww2): Likewise.
> (bsloww): Likewise.
> (bsloww1): Likewise.
> (bsloww2): Likewise.

LGTM.

--
Cheers,
Carlos.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 4/4] Use copysign instead of ternary for some sin/cos input ranges

Carlos O'Donell-6
In reply to this post by Maurizio Manfredini
On 09/28/2016 08:59 AM, Manfred wrote:

>
>
> On 9/27/2016 7:49 PM, Siddhesh Poyarekar wrote:
>> @@ -479,7 +479,7 @@ __sin (double x)
>>      {
>>        res = do_sin (x, 0, &cor);
>>        retval = (res == res + 1.096 * cor) ? res : slow1 (x);
>> -      retval = m > 0 ? retval : -retval;
>> +      retval = __copysign (retval, x);
>>      }                /*   else  if (k < 0x3feb6000)    */
>>
> The ternary uses m, __copysign uses x. It could be correct (I didn't check the full context), but just in case...

In both those cases X < pi, and therefore the result has the same sign as X.

--
Cheers,
Carlos.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 4/4] Use copysign instead of ternary for some sin/cos input ranges

Carlos O'Donell-6
In reply to this post by Siddhesh Poyarekar-9
On 09/27/2016 01:49 PM, Siddhesh Poyarekar wrote:

> These are remaining cases where we can deduce and conclude that the
> sign of the result should be the same as the sign of the input being
> checked.  For example, for sin(x), the sign of the result is the same
> as the result itself for x < pi.  Likewise, for sine values where x
> after range reduction falls into this range and its sign is preserved.
>
> * sysdeps/ieee754/dbl-64/s_sin.c (do_sincos_1): Use copysign
> instead of ternary condition.
> (do_sincos_2): Likewise.
> (__sin): Likewise.
> (__cos): Likewise.
> (slow): Likewise.
> (sloww): Likewise.
> (sloww1): Likewise.
> (bsloww): Likewise.
> (bsloww1): Likewise.

LGTM.

> ---
>  sysdeps/ieee754/dbl-64/s_sin.c | 26 +++++++++++++-------------
>  1 file changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c
> index e4333a4..40d538d 100644
> --- a/sysdeps/ieee754/dbl-64/s_sin.c
> +++ b/sysdeps/ieee754/dbl-64/s_sin.c
> @@ -344,7 +344,7 @@ do_sincos_1 (double a, double da, double x, int4 n, int4 k)
>   {
>    res = do_sin (a, da, &cor);
>    cor = 1.035 * cor + __copysign (eps, cor);
> -  retval = ((res == res + cor) ? ((a > 0) ? res : -res)
> +  retval = ((res == res + cor) ? __copysign (res, a)

OK.

>      : sloww1 (a, da, x, k));
>   }
>        break;
> @@ -418,7 +418,7 @@ do_sincos_2 (double a, double da, double x, int4 n, int4 k)
>   {
>    res = do_sin (a, da, &cor);
>    cor = 1.035 * cor + __copysign (eps, cor);
> -  retval = ((res == res + cor) ? ((a > 0) ? res : -res)
> +  retval = ((res == res + cor) ? __copysign (res, a)

OK.

>      : bsloww1 (a, da, x, n));
>   }
>        break;
> @@ -479,7 +479,7 @@ __sin (double x)
>      {
>        res = do_sin (x, 0, &cor);
>        retval = (res == res + 1.096 * cor) ? res : slow1 (x);
> -      retval = m > 0 ? retval : -retval;
> +      retval = __copysign (retval, x);

OK.

>      } /*   else  if (k < 0x3feb6000)    */
>  
>  /*----------------------- 0.855469  <|x|<2.426265  ----------------------*/
> @@ -489,7 +489,7 @@ __sin (double x)
>        t = hp0 - fabs (x);
>        res = do_cos (t, hp1, &cor);
>        retval = (res == res + 1.020 * cor) ? res : slow2 (x);
> -      retval = m > 0 ? retval : -retval;
> +      retval = __copysign (retval, x);

OK.

>      } /*   else  if (k < 0x400368fd)    */
>  
>  #ifndef IN_SINCOS
> @@ -580,7 +580,7 @@ __cos (double x)
>   {
>    res = do_sin (a, da, &cor);
>    cor = 1.035 * cor + __copysign (1.0e-31, cor);
> -  retval = ((res == res + cor) ? ((a > 0) ? res : -res)
> +  retval = ((res == res + cor) ? __copysign (res, a)

OK.

>      : sloww1 (a, da, x, 1));
>   }
>  
> @@ -634,9 +634,9 @@ slow (double x)
>  
>    __dubsin (fabs (x), 0, w);
>    if (w[0] == w[0] + 1.000000001 * w[1])
> -    return (x > 0) ? w[0] : -w[0];
> +    return __copysign (w[0], x);

OK.

>  
> -  return (x > 0) ? __mpsin (x, 0, false) : -__mpsin (-x, 0, false);
> +  return __copysign (__mpsin (fabs (x), 0, false), x);

OK.

>  }
>  
>  /*******************************************************************************/
> @@ -717,7 +717,7 @@ sloww (double x, double dx, double orig, int k)
>    cor = 1.000000001 * w[1] + __copysign (eps, w[1]);
>  
>    if (w[0] == w[0] + cor)
> -    return (x > 0) ? w[0] : -w[0];
> +    return __copysign (w[0], x);

OK.

>  
>    t = (orig * hpinv + toint);
>    xn = t - toint;
> @@ -743,7 +743,7 @@ sloww (double x, double dx, double orig, int k)
>    cor = 1.000000001 * w[1] + __copysign (eps, w[1]);
>  
>    if (w[0] == w[0] + cor)
> -    return (a > 0) ? w[0] : -w[0];
> +    return __copysign (w[0], a);

OK.

>  
>    return k ? __mpcos (orig, 0, true) : __mpsin (orig, 0, true);
>  }
> @@ -764,7 +764,7 @@ sloww1 (double x, double dx, double orig, int k)
>    res = do_sin_slow (x, dx, 3.1e-30 * fabs (orig), &cor);
>  
>    if (res == res + cor)
> -    return (x > 0) ? res : -res;
> +    return __copysign (res, x);

OK.

>  
>    dx = (x > 0 ? dx : -dx);
>    __dubsin (fabs (x), dx, w);
> @@ -773,7 +773,7 @@ sloww1 (double x, double dx, double orig, int k)
>    cor = 1.000000005 * w[1] + __copysign (eps, w[1]);
>  
>    if (w[0] == w[0] + cor)
> -    return (x > 0) ? w[0] : -w[0];
> +    return __copysign (w[0], x);

OK.

>  
>    return (k == 1) ? __mpcos (orig, 0, true) : __mpsin (orig, 0, true);
>  }
> @@ -833,7 +833,7 @@ bsloww (double x, double dx, double orig, int n)
>    cor = 1.000000001 * w[1] + __copysign (1.1e-24, w[1]);
>  
>    if (w[0] == w[0] + cor)
> -    return (x > 0) ? w[0] : -w[0];
> +    return __copysign (w[0], x);

OK.

>  
>    return (n & 1) ? __mpcos (orig, 0, true) : __mpsin (orig, 0, true);
>  }
> @@ -861,7 +861,7 @@ bsloww1 (double x, double dx, double orig, int n)
>    cor = 1.000000005 * w[1] + __copysign (1.1e-24, w[1]);
>  
>    if (w[0] == w[0] + cor)
> -    return (x > 0) ? w[0] : -w[0];
> +    return __copysign (w[0], x);

OK.

>  
>    return (n & 1) ? __mpcos (orig, 0, true) : __mpsin (orig, 0, true);
>  }
>


--
Cheers,
Carlos.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 4/4] Use copysign instead of ternary for some sin/cos input ranges

Siddhesh Poyarekar-8
In reply to this post by Maurizio Manfredini
On Wednesday 28 September 2016 05:59 AM, Manfred wrote:
> The ternary uses m, __copysign uses x. It could be correct (I didn't
> check the full context), but just in case...

m essentially tracks the sign bit of x, so both checks are equivalent.

Siddhesh
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 2/4] Check n instead of k1 to decide on sign of sin/cos result

Siddhesh Poyarekar-9
In reply to this post by Carlos O'Donell-6
On Wednesday 28 September 2016 12:54 PM, Carlos O'Donell wrote:

> On 09/27/2016 01:49 PM, Siddhesh Poyarekar wrote:
>> For k1 in 1 and 3, n can only have values of 0 and 2, so checking k1 &
>> 2 is equivalent to checking n & 2.  We prefer the latter so that we
>> don't use k1 for anything other than selecting the quadrant in
>> do_sincos_1, thus dropping it completely.
>
> This is only true of the quadrant shift K is restricted to 0 or 1, which
> is true of the code today, but may not be true tomorrow.
>
> In which case you need `assert (k == 0 || k == 1);` and a comment
> about this function only working for a shift of 0 or 1.
>
> OK to checkin if you add the assert and comment.

do_sincos_1 with K values other than 0 or 1 does not make sense in the
context of sin/cos because they're only 1 quadrant apart.  To be clear,
the previous logic was:

    "Compute sine for the value and based on the new rotated quadrant
     (k1) negate the value if we're in the fourth quadrant."

With the change, the logic is:

    "Compute sine for the value and negate it if we were either (1) in
     the fourth quadrant or (2) we actually wanted the cosine and were
     in the third quadrant."

which should be more intuitive when you visualize the quadrants. I can
update the patch description with the above.  I can add a comment to
do_sincos_* functions explicitly stating that K can only be either 0 and
1.  I can even make K an enum {CURRENT_QUADRANT = 0, NEXT_QUADRANT = 1}
to make it explicit in code that we don't want any other values there.
The assert is an added instruction that I want to avoid, but maybe it'll
get optimized away given that the functions are inlined anyway.

What do you prefer?

Siddhesh
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 2/4] Check n instead of k1 to decide on sign of sin/cos result

Carlos O'Donell-6
On 09/28/2016 05:57 PM, Siddhesh Poyarekar wrote:

> On Wednesday 28 September 2016 12:54 PM, Carlos O'Donell wrote:
>> On 09/27/2016 01:49 PM, Siddhesh Poyarekar wrote:
>>> For k1 in 1 and 3, n can only have values of 0 and 2, so checking k1 &
>>> 2 is equivalent to checking n & 2.  We prefer the latter so that we
>>> don't use k1 for anything other than selecting the quadrant in
>>> do_sincos_1, thus dropping it completely.
>>
>> This is only true of the quadrant shift K is restricted to 0 or 1, which
>> is true of the code today, but may not be true tomorrow.
>>
>> In which case you need `assert (k == 0 || k == 1);` and a comment
>> about this function only working for a shift of 0 or 1.
>>
>> OK to checkin if you add the assert and comment.
>
> do_sincos_1 with K values other than 0 or 1 does not make sense in the
> context of sin/cos because they're only 1 quadrant apart.  To be clear,
> the previous logic was:
>
>     "Compute sine for the value and based on the new rotated quadrant
>      (k1) negate the value if we're in the fourth quadrant."
>
> With the change, the logic is:
>
>     "Compute sine for the value and negate it if we were either (1) in
>      the fourth quadrant or (2) we actually wanted the cosine and were
>      in the third quadrant."
>
> which should be more intuitive when you visualize the quadrants. I can
> update the patch description with the above.  I can add a comment to
> do_sincos_* functions explicitly stating that K can only be either 0 and
> 1.  I can even make K an enum {CURRENT_QUADRANT = 0, NEXT_QUADRANT = 1}
> to make it explicit in code that we don't want any other values there.
> The assert is an added instruction that I want to avoid, but maybe it'll
> get optimized away given that the functions are inlined anyway.
>
> What do you prefer?

I like where you are going with this.

The simplest solution I think is:

* Make K a bool (C99 bool type), which should resolve the issue.
  The compiler will always only use 0 or 1 and the result works.
* Update with the description above.

OK with that.

--
Cheers,
Carlos.
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH 2/4] Check n instead of k1 to decide on sign of sin/cos result

Siddhesh Poyarekar-9
On Friday 30 September 2016 06:07 AM, Carlos O'Donell wrote:
> I like where you are going with this.
>
> The simplest solution I think is:
>
> * Make K a bool (C99 bool type), which should resolve the issue.
>   The compiler will always only use 0 or 1 and the result works.

Ack, I made that a separate patch and pushed it (see other [committed]
email)

> * Update with the description above.

Done and pushed.

Thanks,
Siddhesh