| 52e082b6 | 08-Oct-2019 |
Alex Zinenko <[email protected]> |
GPUToCUDA: emit addressof directly instead of wrapping it into a getter function
Originally, the CUBIN getter function was introduced as a mechanism to circumvent the absence of globals in the LLVM
GPUToCUDA: emit addressof directly instead of wrapping it into a getter function
Originally, the CUBIN getter function was introduced as a mechanism to circumvent the absence of globals in the LLVM dialect. It would allocate memory and populate it with the CUBIN data. LLVM dialect now supports globals and they are already used to store CUBIN data, making the getter function a trivial address computation of a global. Emit the address computation directly at the place of `gpu.launch_func` instead of putting it in a function and calling it. This simplifies the conversion flow and prepares it for using the DialectConversion infrastructure.
PiperOrigin-RevId: 273496221
show more ...
|
| 16af5924 | 08-Oct-2019 |
Alex Zinenko <[email protected]> |
Fuse GenerateCubinAccessors pass into LaunchFunctToCuda
Now that the accessor function is a trivial getter of the global variable, it makes less sense to have the getter generation as a separate pas
Fuse GenerateCubinAccessors pass into LaunchFunctToCuda
Now that the accessor function is a trivial getter of the global variable, it makes less sense to have the getter generation as a separate pass. Move the getter generation into the lowering of `gpu.launch_func` to CUDA calls. This change is mostly code motion, but the process can be simplified further by generating the addressof inplace instead of using a call. This is will be done in a follow-up.
PiperOrigin-RevId: 273492517
show more ...
|