History log of /llvm-project-15.0.7/libc/src/math/generic/expm1f.cpp (Results 1 – 10 of 10)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 628fbbef 25-Jul-2022 Tue Ly <[email protected]>

[libc] Use nearest_integer instructions to improve expm1f performance.

Use nearest_integer instructions to improve expf performance.

Performance tests with CORE-MATH's perf tool:

Before the patch:

[libc] Use nearest_integer instructions to improve expm1f performance.

Use nearest_integer instructions to improve expf performance.

Performance tests with CORE-MATH's perf tool:

Before the patch:
```
$ ./perf.sh expm1f
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH reciprocal throughput : 10.096
System LIBC reciprocal throughput : 44.036
LIBC reciprocal throughput : 11.575

$ ./perf.sh expm1f --latency
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH latency : 42.239
System LIBC latency : 122.815
LIBC latency : 50.122
```
After the patch:
```
$ ./perf.sh expm1f
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH reciprocal throughput : 10.046
System LIBC reciprocal throughput : 43.899
LIBC reciprocal throughput : 9.179

$ ./perf.sh expm1f --latency
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH latency : 42.078
System LIBC latency : 120.488
LIBC latency : 41.528
```

Reviewed By: zimmermann6

Differential Revision: https://reviews.llvm.org/D130502

show more ...


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 484319f4 09-Apr-2022 Tue Ly <[email protected]>

[libc] Make expm1f correctly rounded when the targets have no FMA instructions.

Add another exceptional value and fix the case when |x| is small.

Performance tests with CORE-MATH project scripts:
W

[libc] Make expm1f correctly rounded when the targets have no FMA instructions.

Add another exceptional value and fix the case when |x| is small.

Performance tests with CORE-MATH project scripts:
With FMA instructions on Ryzen 1700:
```
$ ./perf.sh expm1f
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
CORE-MATH reciprocal throughput : 15.362
System LIBC reciprocal throughput : 53.194
LIBC reciprocal throughput : 14.595
$ ./perf.sh expm1f --latency
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
CORE-MATH latency : 57.755
System LIBC latency : 147.020
LIBC latency : 60.269
```
Without FMA instructions:
```
$ ./perf.sh expm1f
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
CORE-MATH reciprocal throughput : 15.362
System LIBC reciprocal throughput : 53.300
LIBC reciprocal throughput : 18.020
$ ./perf.sh expm1f --latency
LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a
CORE-MATH latency : 57.758
System LIBC latency : 147.025
LIBC latency : 70.304
```

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D123440

show more ...


# 614567a7 08-May-2022 Tue Ly <[email protected]>

[libc] Automatically add -mfma flag for architectures supporting FMA.

Detect if the architecture supports FMA instructions and if
the targets depend on fma.

Reviewed By: gchatelet

Differential Rev

[libc] Automatically add -mfma flag for architectures supporting FMA.

Detect if the architecture supports FMA instructions and if
the targets depend on fma.

Reviewed By: gchatelet

Differential Revision: https://reviews.llvm.org/D123615

show more ...


# c5f8a0a1 07-Apr-2022 Tue Ly <[email protected]>

[libc] Add support for x86-64 targets that do not have FMA instructions.

Make FMA flag checks more accurate for x86-64 targets, and refactor
polyeval to use multiply and add instead when FMA instruc

[libc] Add support for x86-64 targets that do not have FMA instructions.

Make FMA flag checks more accurate for x86-64 targets, and refactor
polyeval to use multiply and add instead when FMA instructions are not
available.

Reviewed By: michaelrj, sivachandra

Differential Revision: https://reviews.llvm.org/D123335

show more ...


# a5466f04 27-Mar-2022 Tue Ly <[email protected]>

[libc] Improve the performance of expm1f.

Improve the performance of expm1f:
- Rearrange the selection logic for different cases to improve the overall
throughput.
- Use the same degree-4 polynomial

[libc] Improve the performance of expm1f.

Improve the performance of expm1f:
- Rearrange the selection logic for different cases to improve the overall
throughput.
- Use the same degree-4 polynomial for large inputs as `expf`
(https://reviews.llvm.org/D122418), reduced from a degree-7 polynomial.

Performance benchmark using perf tool from CORE-MATH project
(https://gitlab.inria.fr/core-math/core-math/-/tree/master):
Before this patch:
```
$ ./perf.sh expm1f

CORE-MATH reciprocal throughput : 15.362
System LIBC reciprocal throughput : 53.288
LIBC reciprocal throughput : 54.572

$ ./perf.sh expm1f --latency

CORE-MATH latency : 57.759
System LIBC latency : 147.146
LIBC latency : 118.057
```

After this patch:
```
$ ./perf.sh expm1f

CORE-MATH reciprocal throughput : 15.359
System LIBC reciprocal throughput : 53.188
LIBC reciprocal throughput : 14.600

$ ./perf.sh expm1f --latency

CORE-MATH latency : 57.774
System LIBC latency : 147.119
LIBC latency : 60.280

```

Reviewed By: michaelrj, santoshn, zimmermann6

Differential Revision: https://reviews.llvm.org/D122538

show more ...


# 64af346b 14-Mar-2022 Tue Ly <[email protected]>

[libc] Implement expm1f function that is correctly rounded for all rounding modes.

Implement expm1f function that is correctly rounded for all rounding modes. This is based on expf implementation.

[libc] Implement expm1f function that is correctly rounded for all rounding modes.

Implement expm1f function that is correctly rounded for all rounding modes. This is based on expf implementation.

From exhaustive testings, using expf implementation, and subtract 1.0 before rounding the final result to single precision
gives correctly rounded results for all |x| > 2^-4 with 1 exception. When |x| < 2^-25, we use x + x^2 (implemented with a
single fma). And for 2^-25 <= |x| <= 2^-4, we use a single degree-8 minimax polynomial generated by Sollya.

Reviewed By: sivachandra, zimmermann6

Differential Revision: https://reviews.llvm.org/D121574

show more ...


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 08aa40b9 11-Dec-2021 Tue Ly <[email protected]>

[libc] Add ADD_FMA_FLAG macro to add -mfma flag to functions that requires it.

Add ADD_FMA_FLAG macro to add -mfma flag to functions that requires it.

Reviewed By: sivachandra

Differential Revisio

[libc] Add ADD_FMA_FLAG macro to add -mfma flag to functions that requires it.

Add ADD_FMA_FLAG macro to add -mfma flag to functions that requires it.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D115572

show more ...


Revision tags: llvmorg-13.0.1-rc1
# 1c92911e 19-Nov-2021 Michael Jones <[email protected]>

[libc] apply new lint rules

This patch applies the lint rules described in the previous patch. There
was also a significant amount of effort put into manually fixing things,
since all of the templat

[libc] apply new lint rules

This patch applies the lint rules described in the previous patch. There
was also a significant amount of effort put into manually fixing things,
since all of the templated functions, or structs defined in /spec, were
not updated and had to be handled manually.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D114302

show more ...


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2
# c120edc7 05-Aug-2021 Michael Jones <[email protected]>

[libc][nfc] move ctype_utils and FPUtils to __support

Some ctype functions are called from other libc functions (e.g. isspace
is used in atoi). By moving ctype_utils.h to __support it becomes easier

[libc][nfc] move ctype_utils and FPUtils to __support

Some ctype functions are called from other libc functions (e.g. isspace
is used in atoi). By moving ctype_utils.h to __support it becomes easier
to include just the implementations of these functions. For these
reasons the implementation for isspace was moved into
ctype_utils as well.

FPUtils was moved to simplify the build order, and to clarify which
files are a part of the actual libc.

Many files were modified to accomodate these changes, mostly changing
the #include paths.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D107600

show more ...


Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# 4e5f8b4d 23-Apr-2021 Tue Ly <[email protected]>

[libc] Add implementation of expm1f.

Use expm1f(x) = exp(x) - 1 for |x| > ln(2).
For |x| <= ln(2), divide it into 3 subintervals: [-ln2, -1/8], [-1/8, 1/8], [1/8, ln2]
and use a degree-6 polynomial

[libc] Add implementation of expm1f.

Use expm1f(x) = exp(x) - 1 for |x| > ln(2).
For |x| <= ln(2), divide it into 3 subintervals: [-ln2, -1/8], [-1/8, 1/8], [1/8, ln2]
and use a degree-6 polynomial approximation generated by Sollya's fpminmax for each interval.
Errors < 1.5 ULPs when we use fma to evaluate the polynomials.

Differential Revision: https://reviews.llvm.org/D101134

show more ...