Lines Matching refs:GPU
58 $ clang++ axpy.cu -o axpy --cuda-gpu-arch=<GPU arch> \
81 * ``<GPU arch>`` -- the `compute capability
82 <https://developer.nvidia.com/cuda-gpus>`_ of your GPU. For example, if you
83 want to run your program on a GPU with compute capability of 3.5, specify
101 GPU hardware allows for more control over numerical operations than most CPUs,
242 * For each GPU architecture ``arch`` that we're compiling for, do:
248 ``S_arch``, containing GPU machine code (SASS) for ``arch``.
262 * For each GPU architecture ``arch`` that we're compiling for, do:
288 host compilation and during device compilation for each GPU architecture.)
503 on a CPU isn't necessarily fast on a GPU. We've made a number of changes to
504 LLVM to make it generate good GPU code. Among these changes are:
534 control flow transfer in GPU is more expensive. More aggressive unrolling and