|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5 |
|
| #
aff679a4 |
| 08-Jun-2022 |
Thomas Lively <[email protected]> |
[WebAssembly] Implement remaining relaxed SIMD instructions
Add codegen, intrinsics, and builtins for the i16x8.relaxed_q15mulr_s, i16x8.dot_i8x16_i7x16_s, and i32x4.dot_i8x16_i7x16_add_s instructions. These are the last instructions from the relaxed SIMD proposal[1] that had not been implemented.
[1]: https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md
Differential Revision: https://reviews.llvm.org/D127170
|
| #
576b8245 |
| 07-Jun-2022 |
Thomas Lively <[email protected]> |
[WebAssembly][NFC] RelaxedBinary tablegen multiclass for relaxed SIMD
Refactor the tablegen definitions for relaxed SIMD min/max instructions to use a shared RelaxedBinary multiclass modeled on the existing SIMDBinary multiclass. A future commit will add further instruction definitions that use RelaxedBinary.
Also rename the SIMD_RELAXED_CONVERT multiclass to RelaxedConvert to better fit existing naming conventions.
Reviewed By: aheejin
Differential Revision: https://reviews.llvm.org/D127157
|
|
Revision tags: llvmorg-14.0.4 |
|
| #
82a13d05 |
| 17-May-2022 |
Thomas Lively <[email protected]> |
[WebAssembly] Update relaxed SIMD opcodes and names
to reflect the latest state of the proposal: https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md#binary-format. Moves code around to match the instruction order from the proposal, but the only functional changes are to the names and opcodes.
Reviewed By: aheejin
Differential Revision: https://reviews.llvm.org/D125726
|
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
7e8913d7 |
| 16-Mar-2022 |
Thomas Lively <[email protected]> |
[WebAssembly] Fix names of SIMD instructions containing '_zero'
Fix the instruction names to match the WebAssembly spec:
- `i32x4.trunc_sat_zero_f64x2_{s,u}` => `i32x4.trunc_sat_f64x2_{s,u}_zero`
- `f32x4.demote_zero_f64x2` => `f32x4.demote_f64x2_zero`
Also rename related things like intrinsics, builtins, and test functions to match.
Reviewed By: aheejin
Differential Revision: https://reviews.llvm.org/D121661
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
2a4a229d |
| 14-Dec-2021 |
Jing Bao <[email protected]> |
[WebAssembly] Custom optimization for truncate
When possible, optimize TRUNCATE to generate Wasm SIMD narrow instructions (i16x8.narrow_i32x4_u, i8x16.narrow_i16x8_u), rather than generate lots of extract_lane and replace_lane.
Closes #50350.
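The narrow instructions saturate rather than wrap, so they only match a plain truncation when every lane already fits in the narrower unsigned range. The sketch below models one way to bridge that gap: mask off the high bits of each lane first, then apply the saturating narrow. This is an illustrative model, not the code from the commit; the helper names `to_signed`, `narrow_u`, and `truncate_via_narrow` are hypothetical, and the mask-then-narrow strategy is an assumption about how such a lowering could work.

```python
def to_signed(value: int, bits: int) -> int:
    """Reinterpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def narrow_u(lane: int, src_bits: int, dst_bits: int) -> int:
    """Model of an unsigned saturating narrow: the source lane is read as
    signed and clamped to the range [0, 2**dst_bits - 1]."""
    signed = to_signed(lane, src_bits)
    return max(0, min(signed, (1 << dst_bits) - 1))


def truncate_via_narrow(lanes, src_bits: int = 32, dst_bits: int = 16):
    """Truncate each lane by masking the high bits and then narrowing.
    After masking, no lane can saturate, so the result equals truncation."""
    mask = (1 << dst_bits) - 1
    return [narrow_u(lane & mask, src_bits, dst_bits) for lane in lanes]


lanes = [0, 1, 0x12345678, 0xFFFFFFFF, 0x80000000, 0x0001FFFF]
print(truncate_via_narrow(lanes) == [lane & 0xFFFF for lane in lanes])
```

Without the mask, `narrow_u(0x12345678, 32, 16)` saturates to `0xFFFF` instead of producing the truncated value `0x5678`, which is why the narrow alone is not a drop-in replacement for TRUNCATE.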
|
|
Revision tags: llvmorg-13.0.1-rc1 |
|
| #
fb67f3d9 |
| 28-Oct-2021 |
Thomas Lively <[email protected]> |
[WebAssembly] Add prototype relaxed float to int trunc instructions
Add i32x4.relaxed_trunc_f32x4_s, i32x4.relaxed_trunc_f32x4_u, i32x4.relaxed_trunc_f64x2_s_zero, i32x4.relaxed_trunc_f64x2_u_zero.
These are only exposed as builtins, and require user opt-in.
Differential Revision: https://reviews.llvm.org/D112186
|
| #
e1fb1340 |
| 20-Oct-2021 |
Zhi An Ng <[email protected]> |
[WebAssembly] Add prototype relaxed float min max instructions
Add relaxed f32x4.min, f32x4.max, f64x2.min, and f64x2.max instructions. These are only exposed as builtins, and require user opt-in.
Differential Revision: https://reviews.llvm.org/D112146
|
| #
2542bfa4 |
| 20-Oct-2021 |
Zhi An Ng <[email protected]> |
[WebAssembly] Add prototype relaxed swizzle instructions
Add i8x16 relaxed_swizzle instructions. These are only exposed as builtins, and require user opt-in.
Differential Revision: https://reviews.llvm.org/D112022
|
| #
da079428 |
| 16-Oct-2021 |
Zhi An Ng <[email protected]> |
[WebAssembly] Add prototype relaxed laneselect instructions
Add i8x16, i16x8, i32x4, i64x2 laneselect instructions. These are only exposed as builtins, and require user opt-in.
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
| #
2f519825 |
| 23-Sep-2021 |
Thomas Lively <[email protected]> |
[WebAssembly] Add prototype relaxed SIMD fma/fms instructions
Add experimental clang builtins, LLVM intrinsics, and backend definitions for the new {f32x4,f64x2}.{fma,fms} instructions in the relaxed SIMD proposal: https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md. Do not allow these instructions to be selected without explicit user opt-in.
Differential Revision: https://reviews.llvm.org/D110295
|
|
Revision tags: llvmorg-13.0.0-rc3 |
|
| #
fec47492 |
| 01-Sep-2021 |
Thomas Lively <[email protected]> |
[WebAssembly] Lower v2f32 to v2f64 extending loads with promote_low
Previously extra wide v4f32 to v4f64 extending loads would be legalized to v2f32 to v2f64 extending loads, which would then be scalarized by legalization. (v2f32 to v2f64 extending loads not produced by legalization were already being emitted correctly.) Instead, mark v2f32 to v2f64 extending loads as legal and explicitly lower them using promote_low. This regresses the addressing modes supported for the extloads not produced by legalization, but that's a fine trade off for now.
Differential Revision: https://reviews.llvm.org/D108496
|
|
Revision tags: llvmorg-13.0.0-rc2 |
|
| #
88962cea |
| 20-Aug-2021 |
Thomas Lively <[email protected]> |
[WebAssembly] Restore builtins and intrinsics for pmin/pmax
Partially reverts 85157c007903, which had removed these builtins and intrinsics in favor of normal codegen patterns. It turns out that the patterns can be split over multiple basic blocks, however, which means that DAG ISel is not able to select them to the pmin/pmax instructions. To make sure the SIMD intrinsics generate the correct instructions in these cases, reintroduce the clang builtins and corresponding LLVM intrinsics, but keep the normal pattern matching as well.
Differential Revision: https://reviews.llvm.org/D108387
|
| #
b69374ca |
| 19-Aug-2021 |
Thomas Lively <[email protected]> |
[WebAssembly] Legalize vector types by widening
The default legalization of unsupported vector types is to promote the integers in each lane, which leads to extra sign or zero extending and masking when moving data into and out of vectors. Switch our preferred type legalization from the default to vector widening, which keeps the data in the low lanes of the vector rather than in the low bits of each lane. The unused high lanes can be ignored.
Half-wide vectors are now loaded from memory into the low 64 bits of the v128 rather than spread out among the lanes. As a result, v128.load64_splat is a much more common operation, so add new patterns to support it.
Differential Revision: https://reviews.llvm.org/D107502
|
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init |
|
| #
33786576 |
| 27-Jul-2021 |
Thomas Lively <[email protected]> |
[WebAssembly] Codegen for extmul SIMD instructions
Replace the clang builtins and LLVM intrinsics for the SIMD extmul instructions with normal codegen patterns.
Differential Revision: https://reviews.llvm.org/D106724
|
| #
85157c00 |
| 23-Jul-2021 |
Thomas Lively <[email protected]> |
[WebAssembly] Codegen for pmin and pmax
Replace the clang builtins and LLVM intrinsics for {f32x4,f64x2}.{pmin,pmax} with standard codegen patterns. Since wasm_simd128.h uses an integer vector as the standard single vector type, the IR for the pmin and pmax intrinsic functions contains bitcasts that would not be there otherwise. Add extra codegen patterns that can still select the pmin and pmax instructions in the presence of these bitcasts.
Differential Revision: https://reviews.llvm.org/D106612
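For context on what the "standard codegen patterns" have to match: the WebAssembly SIMD specification defines pmin lane-wise as `b < a ? b : a` and pmax as `a < b ? b : a`, that is, a single compare feeding a select. The sketch below models that per-lane behavior on Python floats. The function names are chosen here for illustration and do not correspond to any LLVM or clang API.

```python
import math


def pmin(a: float, b: float) -> float:
    """Pseudo-minimum of one lane: b if b < a, otherwise a."""
    return b if b < a else a


def pmax(a: float, b: float) -> float:
    """Pseudo-maximum of one lane: b if a < b, otherwise a."""
    return b if a < b else a


nan = float("nan")
# Any comparison with NaN is false, so the first operand is returned.
print(math.isnan(pmin(nan, 1.0)), pmin(1.0, nan))
# -0.0 < 0.0 is false, so signed zeros are not ordered and `a` wins.
print(math.copysign(1.0, pmin(0.0, -0.0)))
```

Because the result depends on operand order when a NaN or a signed zero is involved, the compare and select must be matched exactly as written for the instruction to be a valid replacement.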
|
| #
39c0e4af |
| 23-Jul-2021 |
Thomas Lively <[email protected]> |
[WebAssembly][NFC] Simplify SIMD bitconvert pattern
Differential Revision: https://reviews.llvm.org/D106680
|
| #
8af333cf |
| 21-Jul-2021 |
Thomas Lively <[email protected]> |
[WebAssembly] Replace @llvm.wasm.popcnt with @llvm.ctpop.v16i8
Use the standard target-independent intrinsic to take advantage of standard optimizations.
Differential Revision: https://reviews.llvm.org/D106506
|
| #
1a57ee12 |
| 21-Jul-2021 |
Thomas Lively <[email protected]> |
[WebAssembly] Codegen for v128.load{32,64}_zero
Replace the experimental clang builtins and LLVM intrinsics for these instructions with normal instruction selection patterns. The wasm_simd128.h intrinsics header was already using portable code for the corresponding intrinsics, so now it produces the correct instructions.
Differential Revision: https://reviews.llvm.org/D106400
|
| #
4a4229f7 |
| 14-Jul-2021 |
Thomas Lively <[email protected]> |
[WebAssembly] Codegen for v128.storeX_lane instructions
Replace the experimental clang builtins and LLVM intrinsics for these instructions with normal codegen patterns. Resolves PR50435.
Differential Revision: https://reviews.llvm.org/D106019
|
| #
970e0900 |
| 14-Jul-2021 |
Thomas Lively <[email protected]> |
[WebAssembly] Codegen for v128.loadX_lane instructions
Replace the experimental clang builtin and LLVM intrinsics for these instructions with normal codegen patterns. Resolves PR50433.
Differential Revision: https://reviews.llvm.org/D105950
|
| #
cbabfc63 |
| 12-Jul-2021 |
Thomas Lively <[email protected]> |
[WebAssembly] Custom combines for f32x4.demote_zero_f64x2
Replace the clang builtin function and LLVM intrinsic for f32x4.demote_zero_f64x2 with combines from normal SDNodes. Also add missing combines for i32x4.trunc_sat_zero_f64x2_{s,u}, which share the same pattern.
Differential Revision: https://reviews.llvm.org/D105755
|
| #
e5220104 |
| 10-Jul-2021 |
Thomas Lively <[email protected]> |
[WebAssembly] Custom combines for f64x2.promote_low_f32x4
Replace the clang builtin function and LLVM intrinsic previously used to select the f64x2.promote_low_f32x4 instruction with custom combines from standard SelectionDAG nodes. Implement the new combines to share code with the similar combines for f64x2.convert_low_i32x4_{s,u}. Resolves PR50232.
Differential Revision: https://reviews.llvm.org/D105675
|
| #
f8c5a4c6 |
| 08-Jul-2021 |
Thomas Lively <[email protected]> |
[WebAssembly] Optimize out shift masks
WebAssembly's shift instructions implicitly mask the shift count, so optimize out redundant explicit masks of the shift count. For vector shifts, this currently only works if the mask is applied before splatting the shift count, but this should be addressed in a future commit. Resolves PR49655.
Differential Revision: https://reviews.llvm.org/D105600
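The reasoning behind this optimization can be checked with a small model. WebAssembly's `i32.shl` takes the shift count modulo 32, so masking the count with 31 beforehand cannot change the result. The sketch below is a Python model of that scalar semantics only; `wasm_i32_shl` and the two wrapper functions are hypothetical names used for illustration, not LLVM code.

```python
MASK32 = 0xFFFFFFFF


def wasm_i32_shl(value: int, count: int) -> int:
    """Model of Wasm i32.shl: the shift count is taken modulo 32."""
    return (value << (count % 32)) & MASK32


def shl_with_explicit_mask(value: int, count: int) -> int:
    """Source-level pattern: the program masks the count before shifting."""
    return wasm_i32_shl(value, count & 31)


def shl_without_mask(value: int, count: int) -> int:
    """After the optimization: rely on the instruction's implicit mask."""
    return wasm_i32_shl(value, count)


same = all(
    shl_with_explicit_mask(v, c) == shl_without_mask(v, c)
    for v in (0, 1, 0x80000001, MASK32)
    for c in range(100)
)
print(same)
```

The same argument applies to the other shift instructions and lane widths, with the modulus equal to the lane width in bits.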
|
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
| #
3067520b |
| 27-Apr-2021 |
Craig Topper <[email protected]> |
[SelectionDAG] Use a VTSDNode to store the saturation width for FP_TO_SINT_SAT/FP_TO_UINT_SAT
Previously we used an i32 constant to store the saturation width, but i32 isn't legal on RISCV64. This wasn't a big deal to fix, but it is extra work for the type legalizer.
This patch uses a VTSDNode to store the type similar to SEXT_INREG. This makes it opaque to the type legalizer.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D101262
|
| #
e657c84f |
| 19-Apr-2021 |
Thomas Lively <[email protected]> |
[WebAssembly] Use v128.const instead of splats for constants
We previously used splats instead of v128.const to materialize vector constants because V8 did not support v128.const. Now that V8 supports v128.const, we can use v128.const instead. Although this increases code size, it should also increase performance (or at least require fewer engine-side optimizations), so it is an appropriate change to make.
Differential Revision: https://reviews.llvm.org/D100716
|