|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
628fbbef |
| 25-Jul-2022 |
Tue Ly <[email protected]> |
[libc] Use nearest_integer instructions to improve expm1f performance.
Use nearest_integer instructions to improve expf performance.
Performance tests with CORE-MATH's perf tool:
Before the patch:
[libc] Use nearest_integer instructions to improve expm1f performance.
Use nearest_integer instructions to improve expf performance.
Performance tests with CORE-MATH's perf tool:
Before the patch: ``` $ ./perf.sh expm1f LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a GNU libc version: 2.31 GNU libc release: stable CORE-MATH reciprocal throughput : 10.096 System LIBC reciprocal throughput : 44.036 LIBC reciprocal throughput : 11.575
$ ./perf.sh expm1f --latency LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a GNU libc version: 2.31 GNU libc release: stable CORE-MATH latency : 42.239 System LIBC latency : 122.815 LIBC latency : 50.122 ``` After the patch: ``` $ ./perf.sh expm1f LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a GNU libc version: 2.31 GNU libc release: stable CORE-MATH reciprocal throughput : 10.046 System LIBC reciprocal throughput : 43.899 LIBC reciprocal throughput : 9.179
$ ./perf.sh expm1f --latency LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a GNU libc version: 2.31 GNU libc release: stable CORE-MATH latency : 42.078 System LIBC latency : 120.488 LIBC latency : 41.528 ```
Reviewed By: zimmermann6
Differential Revision: https://reviews.llvm.org/D130502
show more ...
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
484319f4 |
| 09-Apr-2022 |
Tue Ly <[email protected]> |
[libc] Make expm1f correctly rounded when the targets have no FMA instructions.
Add another exceptional value and fix the case when |x| is small.
Performance tests with CORE-MATH project scripts: W
[libc] Make expm1f correctly rounded when the targets have no FMA instructions.
Add another exceptional value and fix the case when |x| is small.
Performance tests with CORE-MATH project scripts: With FMA instructions on Ryzen 1700: ``` $ ./perf.sh expm1f LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE-MATH reciprocal throughput : 15.362 System LIBC reciprocal throughput : 53.194 LIBC reciprocal throughput : 14.595 $ ./perf.sh expm1f --latency LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE-MATH latency : 57.755 System LIBC latency : 147.020 LIBC latency : 60.269 ``` Without FMA instructions: ``` $ ./perf.sh expm1f LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE-MATH reciprocal throughput : 15.362 System LIBC reciprocal throughput : 53.300 LIBC reciprocal throughput : 18.020 $ ./perf.sh expm1f --latency LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a CORE-MATH latency : 57.758 System LIBC latency : 147.025 LIBC latency : 70.304 ```
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D123440
show more ...
|
| #
614567a7 |
| 08-May-2022 |
Tue Ly <[email protected]> |
[libc] Automatically add -mfma flag for architectures supporting FMA.
Detect if the architecture supports FMA instructions and if the targets depend on fma.
Reviewed By: gchatelet
Differential Rev
[libc] Automatically add -mfma flag for architectures supporting FMA.
Detect if the architecture supports FMA instructions and if the targets depend on fma.
Reviewed By: gchatelet
Differential Revision: https://reviews.llvm.org/D123615
show more ...
|
| #
c5f8a0a1 |
| 07-Apr-2022 |
Tue Ly <[email protected]> |
[libc] Add support for x86-64 targets that do not have FMA instructions.
Make FMA flag checks more accurate for x86-64 targets, and refactor polyeval to use multiply and add instead when FMA instruc
[libc] Add support for x86-64 targets that do not have FMA instructions.
Make FMA flag checks more accurate for x86-64 targets, and refactor polyeval to use multiply and add instead when FMA instructions are not available.
Reviewed By: michaelrj, sivachandra
Differential Revision: https://reviews.llvm.org/D123335
show more ...
|
| #
a5466f04 |
| 27-Mar-2022 |
Tue Ly <[email protected]> |
[libc] Improve the performance of expm1f.
Improve the performance of expm1f: - Rearrange the selection logic for different cases to improve the overall throughput. - Use the same degree-4 polynomial
[libc] Improve the performance of expm1f.
Improve the performance of expm1f: - Rearrange the selection logic for different cases to improve the overall throughput. - Use the same degree-4 polynomial for large inputs as `expf` (https://reviews.llvm.org/D122418), reduced from a degree-7 polynomial.
Performance benchmark using perf tool from CORE-MATH project (https://gitlab.inria.fr/core-math/core-math/-/tree/master): Before this patch: ``` $ ./perf.sh expm1f
CORE-MATH reciprocal throughput : 15.362 System LIBC reciprocal throughput : 53.288 LIBC reciprocal throughput : 54.572
$ ./perf.sh expm1f --latency
CORE-MATH latency : 57.759 System LIBC latency : 147.146 LIBC latency : 118.057 ```
After this patch: ``` $ ./perf.sh expm1f
CORE-MATH reciprocal throughput : 15.359 System LIBC reciprocal throughput : 53.188 LIBC reciprocal throughput : 14.600
$ ./perf.sh expm1f --latency
CORE-MATH latency : 57.774 System LIBC latency : 147.119 LIBC latency : 60.280
```
Reviewed By: michaelrj, santoshn, zimmermann6
Differential Revision: https://reviews.llvm.org/D122538
show more ...
|
| #
64af346b |
| 14-Mar-2022 |
Tue Ly <[email protected]> |
[libc] Implement expm1f function that is correctly rounded for all rounding modes.
Implement expm1f function that is correctly rounded for all rounding modes. This is based on expf implementation.
[libc] Implement expm1f function that is correctly rounded for all rounding modes.
Implement expm1f function that is correctly rounded for all rounding modes. This is based on expf implementation.
From exhaustive testings, using expf implementation, and subtract 1.0 before rounding the final result to single precision gives correctly rounded results for all |x| > 2^-4 with 1 exception. When |x| < 2^-25, we use x + x^2 (implemented with a single fma). And for 2^-25 <= |x| <= 2^-4, we use a single degree-8 minimax polynomial generated by Sollya.
Reviewed By: sivachandra, zimmermann6
Differential Revision: https://reviews.llvm.org/D121574
show more ...
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
08aa40b9 |
| 11-Dec-2021 |
Tue Ly <[email protected]> |
[libc] Add ADD_FMA_FLAG macro to add -mfma flag to functions that requires it.
Add ADD_FMA_FLAG macro to add -mfma flag to functions that requires it.
Reviewed By: sivachandra
Differential Revisio
[libc] Add ADD_FMA_FLAG macro to add -mfma flag to functions that requires it.
Add ADD_FMA_FLAG macro to add -mfma flag to functions that requires it.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D115572
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc1 |
|
| #
1c92911e |
| 19-Nov-2021 |
Michael Jones <[email protected]> |
[libc] apply new lint rules
This patch applies the lint rules described in the previous patch. There was also a significant amount of effort put into manually fixing things, since all of the templat
[libc] apply new lint rules
This patch applies the lint rules described in the previous patch. There was also a significant amount of effort put into manually fixing things, since all of the templated functions, or structs defined in /spec, were not updated and had to be handled manually.
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D114302
show more ...
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2 |
|
| #
c120edc7 |
| 05-Aug-2021 |
Michael Jones <[email protected]> |
[libc][nfc] move ctype_utils and FPUtils to __support
Some ctype functions are called from other libc functions (e.g. isspace is used in atoi). By moving ctype_utils.h to __support it becomes easier
[libc][nfc] move ctype_utils and FPUtils to __support
Some ctype functions are called from other libc functions (e.g. isspace is used in atoi). By moving ctype_utils.h to __support it becomes easier to include just the implementations of these functions. For these reasons the implementation for isspace was moved into ctype_utils as well.
FPUtils was moved to simplify the build order, and to clarify which files are a part of the actual libc.
Many files were modified to accomodate these changes, mostly changing the #include paths.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D107600
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
| #
4e5f8b4d |
| 23-Apr-2021 |
Tue Ly <[email protected]> |
[libc] Add implementation of expm1f.
Use expm1f(x) = exp(x) - 1 for |x| > ln(2). For |x| <= ln(2), divide it into 3 subintervals: [-ln2, -1/8], [-1/8, 1/8], [1/8, ln2] and use a degree-6 polynomial
[libc] Add implementation of expm1f.
Use expm1f(x) = exp(x) - 1 for |x| > ln(2). For |x| <= ln(2), divide it into 3 subintervals: [-ln2, -1/8], [-1/8, 1/8], [1/8, ln2] and use a degree-6 polynomial approximation generated by Sollya's fpminmax for each interval. Errors < 1.5 ULPs when we use fma to evaluate the polynomials.
Differential Revision: https://reviews.llvm.org/D101134
show more ...
|