nearbyint(double) on aarch64 vs. riscv

Sourceware - libc-alpha mailing list
Hi,

I'm curious about the codegen for nearbyint() on aarch64.
From a build made with build-many-glibc.py I see:

000000000003c8e8 <nearbyint>:
   3c8e8: frinti d0, d0
   3c8ec: ret

vs. rv64imafdc

0000000000030ff8 <nearbyint>:
   30ff8: *frflags* a4
   30ffc: feq.d a5,fa0,fa0
   31000: fabs.d fa5,fa0
   31004: beqz a5,31026 <nearbyint+0x2e>
   31006: auipc a5,0x43
   3100a: fld fa4,-694(a5) # 73d50 <factor+0x28>
   3100e: flt.d a5,fa5,fa4
   31012: beqz a5,31024 <nearbyint+0x2c>
   31014: fcvt.l.d a5,fa0
   31018: fcvt.d.l fa5,a5
   3101c: fsgnj.d fa0,fa5,fa0
   31020: *fsflags* a4
   31024: ret
   31026: fadd.d fa0,fa0,fa0
   3102a: ret

So RISC-V is using the conversion instructions and also saving and restoring
the FP exception flags (frflags / fsflags) around the math code.

AArch64 uses the FRINTI instruction, which per [1] can generate exceptions (or
at least set flags in FPSR). Isn't the code supposed to wrap
__builtin_nearbyint() with feholdexcept() / fesetenv()?

I'm assuming FPU flags are callee-preserved, since the caller may not know what
goes down the rabbit hole?

[1] ARM compiler asm user guide https://developer.arm.com/documentation/dui0802/a/

Re: nearbyint(double) on aarch64 vs. riscv

Sourceware - libc-alpha mailing list


On 20/07/2020 16:00, Vineet Gupta via Libc-alpha wrote:

> Hi,
>
> I'm curious about the codegen for nearbyint() on aarch64.
> From a build made with build-many-glibc.py I see:
>
> 000000000003c8e8 <nearbyint>:
>    3c8e8: frinti d0, d0
>    3c8ec: ret
>
> vs. rv64imafdc
>
> 0000000000030ff8 <nearbyint>:
>    30ff8: *frflags* a4
>    30ffc: feq.d a5,fa0,fa0
>    31000: fabs.d fa5,fa0
>    31004: beqz a5,31026 <nearbyint+0x2e>
>    31006: auipc a5,0x43
>    3100a: fld fa4,-694(a5) # 73d50 <factor+0x28>
>    3100e: flt.d a5,fa5,fa4
>    31012: beqz a5,31024 <nearbyint+0x2c>
>    31014: fcvt.l.d a5,fa0
>    31018: fcvt.d.l fa5,a5
>    3101c: fsgnj.d fa0,fa5,fa0
>    31020: *fsflags* a4
>    31024: ret
>    31026: fadd.d fa0,fa0,fa0
>    3102a: ret
>
> So RISC-V is using the conversion instructions and also saving and restoring
> the FP exception flags (frflags / fsflags) around the math code.
>
> AArch64 uses the FRINTI instruction, which per [1] can generate exceptions (or
> at least set flags in FPSR). Isn't the code supposed to wrap
> __builtin_nearbyint() with feholdexcept() / fesetenv()?

If you check the ARM Architecture Reference Manual for the ARMv8-A architecture
profile, the 'frinti' description for scalar (C7.2.143) defines its operation as:

  CheckFPAdvSIMDEnabled64();
  bits(datasize) result;
  bits(datasize) operand = V[n];
  result = FPRoundInt(operand, FPCR, rounding, FALSE);
  V[d] = result;

And later FPRoundInt is defined as:

  bits(N) FPRoundInt(bits(N) op, FPCRType fpcr, FPRounding rounding, boolean exact)
    [...]
    else
    // extract integer component
    int_result = RoundDown(value);
    error = value - Real(int_result);

    // Convert integer value into an equivalent real value
    real_result = Real(int_result);

    // Re-encode as a floating-point value, result is always exact
    if real_result == 0.0 then
      result = FPZero(sign);
    else
      result = FPRound(real_result, fpcr, FPRounding_ZERO);

    // Generate inexact exceptions
    if error != 0.0 && exact then
      FPProcessException(FPExc_Inexact, fpcr);

AFAIK RoundDown does not generate any exception by definition.  The FPRound
operation might generate an FP exception; however, since the argument used is
always an integer, none of underflow, inexact, or invalid operation can occur.

An inexact exception is generated only if 'exact' is true, and frinti
explicitly sets it to false (unlike frintx).

I am not sure exactly why the documentation does state that a floating-point
exception can be generated by frinti.

>
> I'm assuming FPU flags are callee-preserved, since the caller may not know what
> goes down the rabbit hole?
>
> [1] ARM compiler asm user guide https://developer.arm.com/documentation/dui0802/a/
>


Re: nearbyint(double) on aarch64 vs. riscv

Joseph Myers
On Mon, 20 Jul 2020, Adhemerval Zanella via Libc-alpha wrote:

> I am not sure exactly why the documentation does state that a floating-point
> exception can be generated by frinti.

Because the "invalid" exception would be generated for signaling NaNs, and
the architecture-specific "input denormal" exception might also be
generated by the FPUnpack part of the operation.

--
Joseph S. Myers
[hidden email]

Re: nearbyint(double) on aarch64 vs. riscv

Sourceware - libc-alpha mailing list
On 7/20/20 12:52 PM, Joseph Myers wrote:
> On Mon, 20 Jul 2020, Adhemerval Zanella via Libc-alpha wrote:
>
>> I am not sure exactly why the documentation does state that a floating-point
>> exception can be generated by frinti.
>
> Because the "invalid" exception would be generated for signaling NaNs, and
> the architecture-specific "input denormal" exception might also be
> generated by the FPUnpack part of the operation.

But even for these arcane cases, isn't nearbyint() expected to "hold" FP
exceptions? It would be really easy to test/break this even with existing math tests.


Re: nearbyint(double) on aarch64 vs. riscv

Joseph Myers
On Tue, 21 Jul 2020, Vineet Gupta via Libc-alpha wrote:

> On 7/20/20 12:52 PM, Joseph Myers wrote:
> > On Mon, 20 Jul 2020, Adhemerval Zanella via Libc-alpha wrote:
> >
> >> I am not sure exactly why the documentation does state that a floating-point
> >> exception can be generated by frinti.
> >
> > Because the "invalid" exception would be generated for signaling NaNs, and
> > the architecture-specific "input denormal" exception might also be
> > generated by the FPUnpack part of the operation.
>
> But even for these arcane cases, isn't nearbyint() expected to "hold" FP
> exceptions? It would be really easy to test/break this even with existing math tests.

math/test-nearbyint-except-2.c verifies that nearbyint works when traps on
"inexact" are enabled.  The generic libm tests cover the case of default
exception handling, including for sNaN arguments.  There are no
expectations that glibc does anything in particular with non-IEEE
exceptions such as "input denormal".

"hold" is not an IEEE concept.  Floating-point exceptions are signaled,
which, with default exception handling, generally results in a flag being
raised and a default result returned.  The requirements on nearbyint
relate to its observable behavior, which in standard C means the return
value and (standard) exception flags raised when it returns.  The
requirements include that an sNaN argument results in a qNaN return value
with "invalid" raised, but that noninteger (finite) arguments do not
result in "inexact" raised.  Whether the implementation just uses an
instruction with the relevant effect, or whether it has to save and
restore exception state internally, doesn't matter as long as the observed
behavior is as expected.

--
Joseph S. Myers
[hidden email]