| 7ab2d0bd | 22-Oct-2017 |
Jan Vesely <[email protected]> |
shared: Implement aligned vector stores (vstorea_half)
Float version passes newly posted piglit tests on turks, float and double pass on carrizo.
v2: scalar vstorea_half
v3: fix typo
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <[email protected]>
llvm-svn: 316291
|
| 0c21c7c7 | 12-Aug-2013 |
Aaron Watry <[email protected]> |
Add intN vloadN() implementations for address spaces 3 and 4
Not hooked up to R600 yet due to current lack of support, at least on EG.
Signed-off-by: Aaron Watry <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
llvm-svn: 188181
|
| 99a2f3b2 | 16-Jul-2013 |
Aaron Watry <[email protected]> |
Fix and re-enable R600 vload/vstore assembly
The assembly optimizations were making unsafe assumptions about which address spaces had which identifiers.
Also, fix vload/vstore with 64-bit pointers. This was broken previously on Radeon SI.
This version still only has assembly versions of int/uint 2/4/8/16 for global loads and stores on R600, but it does it in a way that would be very easily extended to private/local/constant and could also be handled easily on other architectures.
v2:
1) Leave v[load|store]_impl.ll in generic/lib
2) Remove vload_if.ll and vstore_if.ll interfaces
3) Fix address+offset calculations
4) Remove offset from assembly arg list
llvm-svn: 186416
|
| 64b3bbae | 26-Jun-2013 |
Tom Stellard <[email protected]> |
libclc: Add assembly versions of vstore for global [u]int4/8/16
The assembly should be generic, but at least currently R600 only supports 32-bit stores of [u]int1/4, and I believe that only global is well-supported.
R600 lowers the 8/16 component stores to multiple 4-component stores.
The unoptimized C versions of the other functions are left in place.
Patch by: Aaron Watry
llvm-svn: 185009
|
| 922ac056 | 26-Jun-2013 |
Tom Stellard <[email protected]> |
libclc: Add assembly versions of vload for global int4/8/16
The assembly should be generic, but at least currently R600 only supports 32-bit loads of int1/4, and I believe that only global is well-supported.
R600 lowers the 8/16 component vector loads to multiple 4-component loads.
The unoptimized C versions of the other functions are left in place.
Patch by: Aaron Watry
llvm-svn: 185008
|