Hello to all,
We are the maintainers of the crlibm project, which aims at developping a modern, correctly rounded version of the libm licenced under the LGPL. Project homepage is https://lipforge.ens-lyon.fr/projects/crlibm/ I feel that this project is now achieving sufficient maturity to be a candidate for gradual inclusion in the glibc. I would appreciate any feedback on this question, and probably some practical advice. Correct rounding means that the value returned by a libm function is fully and uniquely specified, to the best mathematical possible. Its main advantage is therefore to improve numerical portability between heterogeneous systems. The drawbacks are a small degradation in performance (theoretically a few percents in average) and a larger degradation in code size. The pdf files distributed along with CRLibm details this issue and gives more references. As far as I observe, the libm currently used in glibc is a derivative from IBM libultim, at least for architectures with double-precision by default, like powerPC and x86_64 using the sse2 FP unit. This is a very good choice, as it was the first efficient implementation of a double-precision correctly rounded library. However, libultim has several drawbacks: - impredictible worst-case performance with some calls in the million cycle range (see below) - obscure code, - some large tables of precomputed values (up to 43 KB for arctan, 13KB for exp, etc) CRLibm has a sounder theoretical basis, which allows in particular to provide much more acceptable worst-case execution time (several hundred times lower, see below). We have an emphasis on the proof of the correct rounding property and in general on documentation. We also limit the table size to 4KB altogether for each function (whether this is a good idea can be discussed - several years ago, it seemed so). All this has some performance cost, which we are slowly catching up. Appended are some performance comparisons. Besides, CRLibm is currently actively maintained, and in any case should be more future-proof than libultim. For instance, unlike libultim, the CRLibm distribution includes all the files that generate the tables and other constants, so current tradeoffs can be changed as technology evolves. It also includes a battery of selftests. Another example is that our code is designed to use FMA efficiently (on power/powerPC and ia64) as soon as there is clean and efficient compiler support for it. The code also attempts to expose parallelism (which current gcc fails to exploit efficiently, but this is bound to improve). We also have more correctly rounded functions that libultim (currently log2 and log10, expm1 and log1p, and the trigpi functions mentionned in the current draft of the revised IEEE-754 standard), and we keep adding new ones. There are other more prospective features, like machine-checkable proofs of some of the code, partial support of directed rounding modes and interval arithmetic functions, etc, all of which is probably currently irrelevant to the glibc. If there is interest, I would like some advice on the procedure to follow. I appreciate that we will have to submit patches. I have many questions on this process: should we add a separate directory and where, which functions should we submit first, should first improve average performance at the expense of larger tables, what should we do with the autotest and documentation, etc. I also appreciate that it will take some work to clean up crlibm and bring it to GNU standards. We would also appreciate pointers on relevant discussions on the list archive. Thank you for your interest, and many thanks in advance, Florent de Dinechin and Christoph Lauter Here is a sample of performance results. The actual input values used for the worst-case timing tests are available in the crlibm distribution. The random generators used for average case are by definition controversial, but they were chosen in good faith, not to distort the comparison in favor of crlibm. For example our asin and acos implementations are faster than libultim's on a random distribution of the floating-point numbers between -1 and 1, but slower on a normal distribution of the reals between -1 and 1 (used below). Tests on an Opteron ( Linux 2.6.17-2-amd64 gcc (GCC) 4.1.2 20060901 (prerelease) (Debian 4.1.1-13) ) exp avg time max time default libm 141 1105468 CRLibm 184 1210 log avg time max time default libm 210 150919 CRLibm 146 903 (using double-extended arithmetic) CRLibm 266 1459 (using standard SSE2 doubles) sin avg time max time default libm 147 1018622 CRLibm 171 3895 cos avg time max time default libm 152 028302 CRLibm 171 3752 tan avg time max time default libm 244 1101230 CRLibm 280 9949 asin avg time max time default libm 107 1018823 CRLibm 297 2383 acos avg time max time default libm 83 1018786 CRLibm 322 2360 atan avg time max time default libm 144 100066 CRLibm 243 4724 log10 avg time max time default libm 249 1061 NOT correctly rounded CRLibm 321 1706 log2 avg time max time default libm 174 274 NOT correctly rounded CRLibm 320 1663 The same, on an IBM power5 server (time units are arbitrary) log avg time max time default libm 61 23471 CRLibm 57 307 exp avg time max time default libm 41 25019 CRLibm 42 242 sin avg time max time default libm 37 132435 CRLibm 43 1910 cos avg time max time default libm 38 134045 CRLibm 44 1946 tan avg time max time default libm 65 141885 CRLibm 72 4671 asin avg time max time default libm 33 132912 CRLibm 50 465 acos avg time max time default libm 34 132798 CRLibm 58 433 atan avg time max time default libm 35 18785 CRLibm 53 2315 log10 avg time max time default libm 65 153 NOT correctly rounded CRLibm 65 355 log2 avg time max time default libm 47 59 NOT correctly rounded CRLibm 65 351 PS2 : while browsing the code I just noticed a probably harmless bug line 27 of sysdeps/ieee754/dbl-64/sincos.tbl : 40 should be 440) |
On Tue, 6 Feb 2007, Florent de Dinechin wrote:
> We are the maintainers of the crlibm project, which aims at developping a > modern, correctly rounded version of the libm licenced under the LGPL. > Project homepage is https://lipforge.ens-lyon.fr/projects/crlibm/ > > I feel that this project is now achieving sufficient maturity to be a > candidate for gradual inclusion in the glibc. I would appreciate any feedback > on this question, and probably some practical advice. The following are just my views, I don't speak for the glibc maintainers. First, I expect it will be necessary to follow the FSF copyright assignment process. While various parts of glibc at present, including mathematical functions, have non-FSF copyrights, they are less likely to be accepted for new code; see <http://sourceware.org/ml/libc-alpha/2007-01/msg00026.html>. Second, I'd like to suggest that you consider contributing the code to the libgcc-math <http://gcc.gnu.org/wiki/libgcc-math> project *as well as* to glibc. This aims to collect implementations of mathematical functions for use from GCC-compiled code beyond the scope of glibc; for example, vectorized functions, normal functions compiled with more efficient ABIs (SSE argument passing on x86) and C99 functions for use by the C++ and Fortran runtime libraries on platforms whose non-glibc libm is missing those functions. Such functions are specifically beyond the scope of what is considered appropriate for inclusion in glibc <http://sourceware.org/ml/libc-alpha/2005-07/msg00008.html>, so if you are interested in this then you should discuss it on [hidden email] and not on libc-alpha. By assigning to the FSF and using the LGPL+exception license used for soft-fp from the start, the code can be used in both places; you should aim for making it possible to have identical source files in both places, as that's best in terms of FSF policies. > should first improve average > performance at the expense of larger tables, If doing so makes the average case performance better than the existing implementations, that would make a clearer case for including this code as a replacement for the current code. There are trade-offs involved between speed, size and accuracy of mathematical functions. glibc has to strike a middle ground for general users. Other projects such as libgcc-math may collect different versions tuned for different trade-offs and allow the user to select them at compile time or link time. The glibc bug database <http://sources.redhat.com/bugzilla/> has 57 open bugs in the "math" component, mostly suspended until someone comes forward with patches for them. Perhaps your new implementations fix some of them? (Many of those bugs relate to float and long double versions of the functions; as I understand it, your methods could generate proven correctly rounded functions for float and probably but not proven correctly rounded versions for at least IEEE extended and quad long double.) -- Joseph S. Myers [hidden email] |
Free forum by Nabble | Edit this page |