| #
4e18d71c |
| 28-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: Generate kernel parameter allocation with right size
Before this change we miscounted the number of function parameters.
llvm-svn: 276960
|
| #
79a947c2 |
| 27-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: Add basic support for kernel launches
llvm-svn: 276863
|
| #
57793596 |
| 25-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: Load GPU kernels
We embed the PTX code into the host IR as a global variable and compile it at run-time into a GPU kernel.
llvm-svn: 276645
|
| #
13c78e4d |
| 25-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: Emit data-transfer code
Also factor out getArraySize() to avoid code duplication and reorder some function arguments to indicate the direction in which data is transferred.
llvm-svn: 276636
|
| #
7287aedd |
| 25-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: Complete code to allocate and free device arrays
At the beginning of each SCoP, we allocate device arrays for all arrays used on the GPU and we free such arrays after the SCoP has been executed.
llvm-svn: 276635
|
| #
fa7b0802 |
| 25-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: initialize GPU context and simplify the corresponding GPURuntime interface.
There is no need to expose the selected device at the moment. We also pass back pointers as return values, as this simplifies the interface.
llvm-svn: 276623
|
| #
8ed5e599 |
| 25-Jul-2016 |
Tobias Grosser <[email protected]> |
IslNodeBuilder: Make finalize() virtual
This allows the finalization routine of the IslNodeBuilder to be overridden by derived classes. While here, we also drop the unnecessary 'Scop' suffix and the unnecessary 'Scop' parameter.
llvm-svn: 276622
|
| #
9a18d559 |
| 24-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: Optimize kernel IR before generating assembly code
We optimize the kernel _after_ dumping the IR we generate to make the dumped IR easier to read and independent of possible changes in the general-purpose LLVM optimizers.
llvm-svn: 276551
|
| #
e1a98343 |
| 24-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: Verify kernel IR before generating assembly
llvm-svn: 276550
|
| #
74dc3cb4 |
| 22-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: Generate PTX assembly code for the kernel modules
Run the NVPTX backend over the GPUModule IR and write the resulting assembly code in a string.
To work correctly, it is important to invalidate analysis results that still reference the IR in the kernel module. Hence, this change clears all references to dominators, loop info, and scalar evolution.
Finally, the NVPTX backend has trouble generating code for various special floating point types (not surprising), but also for uncommon integer types. This commit does not resolve these issues, but pulls the problematic test cases out into separate files so that they can be XFAILed individually and resolved one by one in future (not immediate) changes.
llvm-svn: 276396
|
| #
edb885cb |
| 21-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: generate code for ScopStatements
This change introduces the actual compute code in the GPU kernels. To ensure that all values referenced from the statements in the GPU kernel are indeed available, we scan all ScopStmts in the GPU kernel for references to llvm::Values that are not yet covered by already modeled outer loop iterators, parameters, or array base pointers, and also pass these additional llvm::Values to the GPU kernel.
For arrays used in the GPU kernel we introduce a new ScopArrayInfo object, which is referenced by the newly generated access functions within the GPU kernel and which is used to help with code generation.
llvm-svn: 276270
|
| #
2d58a64e |
| 19-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: Bail out of scops with hoisted invariant loads
This is currently not supported and will only be added later. Also update the test cases to ensure no invariant code hoisting is applied.
llvm-svn: 275987
|
| #
5260c041 |
| 19-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: Emit in-kernel synchronization statements
We use this opportunity to further classify the different user statements that can arise and add TODOs for the ones not yet implemented.
llvm-svn: 275957
|
| #
59ab0705 |
| 19-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: generate control flow within the kernel
llvm-svn: 275956
|
| #
c84a1995 |
| 19-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: add scop parameters to kernel arguments
llvm-svn: 275955
|
| #
f6044bd0 |
| 19-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: add host iterators to kernel arguments
llvm-svn: 275954
|
| #
472f9654 |
| 19-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: add intrinsic functions to obtain a kernel's thread and block ids
llvm-svn: 275953
|
| #
32837fe3 |
| 19-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: create kernel function skeleton
Create for each kernel a separate LLVM-IR module containing a single function marked as kernel function and taking one pointer for each array referenced by this kernel. Add debugging output to verify the kernels are generated correctly.
llvm-svn: 275952
|
| #
b9fc860a |
| 18-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: collect array references
Initialize the list of references to a GPU array to ensure that the arrays that need to be passed to kernel calls are computed correctly. The same information is also necessary to compute synchronization correctly. As the functionality to compute these references is already available, all that is left for us to do is to connect it so that the array reference information is actually computed.
llvm-svn: 275798
|
| #
1fb9b64d |
| 18-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: Pull implementation out of class definition
This will allow us to see the full class definition even after we add non-trivial implementations of the different member functions.
llvm-svn: 275797
|
| #
38fc0aed |
| 18-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: Create host control flow
Create LLVM-IR for all host-side control flow of a given GPU AST. We implement this by introducing a new GPUNodeBuilder class derived from IslNodeBuilder. The IslNodeBuilder will take care of generating all general-purpose AST nodes, but we provide our own createUser implementation to handle the different GPU-specific user statements. For now, we just skip any user statement and only generate a host-code skeleton, but in subsequent commits we will add handling of normal ScopStmts performing computations, kernel calls, as well as host-device data transfers. We will also introduce run-time check generation and LICM in subsequent commits.
llvm-svn: 275783
|
| #
20251734 |
| 15-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: Format statements scheduled on the host ourselves
Otherwise ppcg would try to call into pet functionality that is not available, which obviously will cause trouble. As we can easily print these statements ourselves, we just do so.
llvm-svn: 275579
|
| #
2341fe9e |
| 15-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: Use schedule whole components for scheduler
This option increases the scalability of the scheduler and allows us to remove the 'gisting' workaround we introduced in r275565 to handle a more complicated test case. Another benefit of this option is that the generated code looks a lot more streamlined.
Thanks to Sven Verdoolaege for reminding me of this option.
llvm-svn: 275573
|
| #
e4725437 |
| 15-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: Drop domain constraints from flow dependences
This works around a shortcoming of the isl scheduler, which does not terminate, even for some smaller test cases, when domain constraints are part of the flow dependences.
llvm-svn: 275565
|
| #
6293ba69 |
| 15-Jul-2016 |
Tobias Grosser <[email protected]> |
GPGPU: Add memory reference tag ids to tagged accesses
It seems we forgot to actually add the memory access ids to the tagged accesses, and instead just tagged the accesses with empty isl_ids. This issue was found by inspection, and without code generation it is difficult to test by itself. We fix it for now without a test case and expect our code generation tests to cover it later on.
llvm-svn: 275557
|