faster exp10f code

Zimmermann Paul
Hi,

The three patches referenced below implement faster exp10f code, inspired by
the expf code. On x86_64 the mean time per call is about 2.5 times lower than
with the current code. Here are the "make bench" results (I had to add an
entry for exp10f):

before new code:
  "exp10f": {
   "": {
    "duration": 3.40525e+09,
    "iterations": 6.4368e+07,
    "max": 65906.3,
    "min": 24.056,
    "mean": 52.9029
   }
  }

with new code:
  "exp10f": {
   "": {
    "duration": 3.30806e+09,
    "iterations": 1.59728e+08,
    "max": 39557.8,
    "min": 16.104,
    "mean": 20.7106
   }
  }
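
To make the comparison concrete, here is a minimal sketch in C that recomputes the mean from the totals and derives the speedup. The struct and function names are hypothetical (they are not part of the glibc benchtests), the numbers are copied from the two result blocks above, and the time unit is left unspecified because the output does not state it.

```c
/* Hypothetical summary of one "make bench" entry; the field names
   mirror the JSON keys in the results above.  */
struct bench_entry
{
  double duration;    /* total time over all iterations */
  double iterations;  /* number of timed calls */
  double mean;        /* reported mean time per call */
};

/* Values copied from the "before new code" and "with new code" blocks.  */
const struct bench_entry exp10f_before = { 3.40525e+09, 6.4368e+07, 52.9029 };
const struct bench_entry exp10f_after = { 3.30806e+09, 1.59728e+08, 20.7106 };

/* Recompute the mean time per call as duration / iterations.  */
double
mean_from_totals (const struct bench_entry *e)
{
  return e->duration / e->iterations;
}

/* Speedup expressed as the ratio of the reported means (old / new).  */
double
mean_speedup (const struct bench_entry *before,
              const struct bench_entry *after)
{
  return before->mean / after->mean;
}
```

With these inputs the recomputed means agree closely with the reported ones, and mean_speedup (&exp10f_before, &exp10f_after) comes out at roughly 2.55. The "min" values (24.056 and 16.104) give a smaller ratio of about 1.5, so the gain depends on which statistic is compared.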

I disabled the code in math/w_exp10f_compat.c by adding #if 0 ... #endif
around it, since I don't know how to properly do it.

Paul Zimmermann

[1] https://homepages.loria.fr/PZimmermann/glibc-contrib/0001-added-new-code-for-exp10f-inspired-from-expf-and-reu.patch
[2] https://homepages.loria.fr/PZimmermann/glibc-contrib/0002-e_exp10f.c-cleaned-up-added-error-analysis-and-comme.patch
[3] https://homepages.loria.fr/PZimmermann/glibc-contrib/0003-integrate-new-exp10f-code.patch

Re: faster exp10f code

Joseph Myers
On Mon, 6 Apr 2020, paul zimmermann wrote:

> I disabled the code in math/w_exp10f_compat.c by adding #if 0 ... #endif
> around it, since I don't know how to properly do it.

I'd expect symbol versioning like for expf.  So

#if LIBM_SVID_COMPAT && SHLIB_COMPAT (libm, GLIBC_2_0, GLIBC_2_32)

in w_exp10f_compat.c; math/w_exp10f.c using symbol versioning with the
GLIBC_2_32 version; sysdeps/ieee754/w_expf.c overriding that with a
dummy file; e_expf.c using a GLIBC_2_32 symbol version; math/Versions
being updated with the new symbol version; and all the ABI test
baselines being updated as well.

Everything apart from the benchtests will need to go in a single commit
(thus a single patch in the submission), rather than having later patches
fix up problems with earlier ones, on the principle of bisectability.  
Benchtests can go in a separate commit.

--
Joseph S. Myers

Re: faster exp10f code

Joseph Myers
Also, you'll need to take special care that architectures with
architecture-specific versions of exp10f (i386, ia64, m68k) do get the new
symbol version as expected.  Again, see how expf is handled for an example
(note the special .symver handling in sysdeps/ia64/fpu/e_expf.S - and the
wrapper handling done in commit f7a0b063e7fc81d0eff1e8b2b169876bdfb4cc44).

--
Joseph S. Myers

Re: faster exp10f code

Zimmermann Paul
In reply to this post by Joseph Myers
Dear Joseph,

Thank you for your comments. Since I'm new to glibc, I'm not sure I understand
everything. Anyway, I've prepared a new cumulative patch at [1]. However, I get
a compile error (multiple definition of __exp10f) that I don't know how to get
rid of.

Paul

PS: if you prefer, we can continue the discussion off-list.

[1] https://homepages.loria.fr/PZimmermann/glibc-contrib/0001-cumulated-patch-for-new-exp10f-code-v2.patch

Re: faster exp10f code

Sourceware - libc-alpha mailing list


On 08/04/2020 07:15, paul zimmermann wrote:

> Dear Joseph,
>
> Thank you for your comments. Since I'm new to glibc, I'm not sure I understand
> everything. Anyway, I've prepared a new cumulative patch at [1]. However, I get
> a compile error (multiple definition of __exp10f) that I don't know how to get
> rid of.
>
> Paul
>
> PS: if you prefer, we can continue the discussion off-list.
>
> [1] https://homepages.loria.fr/PZimmermann/glibc-contrib/0001-cumulated-patch-for-new-exp10f-code-v2.patch
>

I can help you organize it; I will take a look.