|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2 |
|
| #
9727c77d |
| 25-Apr-2022 |
David Green <[email protected]> |
[NFC] Rename Instrinsic to Intrinsic
|
|
Revision tags: llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2 |
|
| #
be672934 |
| 22-Feb-2022 |
Jakub Chlanda <[email protected]> |
[NVPTX] Add more FMA intriniscs/builtins
This patch adds builtins/intrinsics for the following variants of FMA:
- f16, f16x2 - rn - rn_ftz - rn_sat - rn_ftz_sat - rn_relu - rn_ftz_relu
[NVPTX] Add more FMA intriniscs/builtins
This patch adds builtins/intrinsics for the following variants of FMA:
- f16, f16x2 - rn - rn_ftz - rn_sat - rn_ftz_sat - rn_relu - rn_ftz_relu - bf16, bf16x2 - rn - rn_relu
ptxas (Cuda compilation tools, release 11.0, V11.0.194) is happy with the generated assembly.
Differential Revision: https://reviews.llvm.org/D118977
show more ...
|
| #
e0dc4ac2 |
| 22-Feb-2022 |
Jakub Chlanda <[email protected]> |
[NVPTX] Expose float tys min, max, abs, neg as builtins
Adds support for the following builtins:
- abs, neg: - .bf16, - .bf16x2 - min, max - {.ftz}{.NaN}{.xorsign.abs}.f16 - {.ftz}{.NaN}{.xorsign.a
[NVPTX] Expose float tys min, max, abs, neg as builtins
Adds support for the following builtins:
- abs, neg: - .bf16, - .bf16x2 - min, max - {.ftz}{.NaN}{.xorsign.abs}.f16 - {.ftz}{.NaN}{.xorsign.abs}.f16x2 - {.NaN}{.xorsign.abs}.bf16 - {.NaN}{.xorsign.abs}.bf16x2 - {.ftz}{.NaN}{.xorsign.abs}.f32
Differential Revision: https://reviews.llvm.org/D117887
show more ...
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
| #
f631173d |
| 30-Sep-2021 |
Kazu Hirata <[email protected]> |
[llvm] Migrate from arg_operands to args (NFC)
Note that arg_operands is considered a legacy name. See llvm/include/llvm/IR/InstrTypes.h for details.
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1 |
|
| #
6f6e9a86 |
| 02-Aug-2021 |
Roman Lebedev <[email protected]> |
[BasicTTIImpl][LoopUnroll] getUnrollingPreferences(): emit ORE remark when advising against unrolling due to a call in a loop
I'm not sure this is the best way to approach this, but the situation is
[BasicTTIImpl][LoopUnroll] getUnrollingPreferences(): emit ORE remark when advising against unrolling due to a call in a loop
I'm not sure this is the best way to approach this, but the situation is rather not very detectable unless we explicitly call it out when refusing to advise to unroll.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D107271
show more ...
|
|
Revision tags: llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
| #
3489c2d7 |
| 29-Apr-2021 |
Daniil Fukalov <[email protected]> |
[TTI] NFC: Change getTypeLegalizationCost to return InstructionCost.
This patch migrates the TTI cost interfaces to return an InstructionCost.
See this patch for the introduction of the type: https
[TTI] NFC: Change getTypeLegalizationCost to return InstructionCost.
This patch migrates the TTI cost interfaces to return an InstructionCost.
See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html
Reviewed By: sdesmalen, kparzysz
Differential Revision: https://reviews.llvm.org/D101533
show more ...
|
| #
4f42d873 |
| 14-Apr-2021 |
Sander de Smalen <[email protected]> |
[TTI] NFC: Change getArithmeticInstrCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.
See this patch for the introduction of the type: https:/
[TTI] NFC: Change getArithmeticInstrCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.
See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D100317
show more ...
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
| #
2a6c8715 |
| 03-Jun-2020 |
Sebastian Neubauer <[email protected]> |
[InstCombine] Move target-specific inst combining
For a long time, the InstCombine pass handled target specific intrinsics. Having target specific code in general passes was noted as an area for imp
[InstCombine] Move target-specific inst combining
For a long time, the InstCombine pass handled target specific intrinsics. Having target specific code in general passes was noted as an area for improvement for a long time.
D81728 moves most target specific code out of the InstCombine pass. Applying the target specific combinations in an extra pass would probably result in inferior optimizations compared to the current fixed-point iteration, therefore the InstCombine pass resorts to newly introduced functions in the TargetTransformInfo when it encounters unknown intrinsics. The patch should not have any effect on generated code (under the assumption that code never uses intrinsics from a foreign target).
This introduces three new functions: TargetTransformInfo::instCombineIntrinsic TargetTransformInfo::simplifyDemandedUseBitsIntrinsic TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic
A few target specific parts are left in the InstCombine folder, where it makes sense to share code. The largest left-over part in InstCombineCalls.cpp is the code shared between arm and aarch64.
This allows to move about 3000 lines out from InstCombine to the targets.
Differential Revision: https://reviews.llvm.org/D81728
show more ...
|
| #
e541e1b7 |
| 10-Jul-2020 |
Sidharth Baveja <[email protected]> |
[NFC] Separate Peeling Properties into its own struct (re-land after minor fix)
Summary: This patch separates the peeling specific parameters from the UnrollingPreferences, and creates a new struct
[NFC] Separate Peeling Properties into its own struct (re-land after minor fix)
Summary: This patch separates the peeling specific parameters from the UnrollingPreferences, and creates a new struct called PeelingPreferences. Functions which used the UnrollingPreferences struct for peeling have been updated to use the PeelingPreferences struct.
Author: sidbav (Sidharth Baveja)
Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel), anhtuyen (Anh Tuyen Tran), nikic (Nikita Popov)
Reviewed By: Meinersbur (Michael Kruse)
Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D80580
show more ...
|
| #
0b39d2d7 |
| 08-Jul-2020 |
Nikita Popov <[email protected]> |
Revert "[NFC] Separate Peeling Properties into its own struct"
This reverts commit 0369dc98f958a1ca2ec05f1897f091129bb16e8a.
Many failing tests.
|
| #
0369dc98 |
| 08-Jul-2020 |
Sidharth Baveja <[email protected]> |
[NFC] Separate Peeling Properties into its own struct
Summary: This patch makes the peeling properties of the loop accessible by other loop transformations.
Author: sidbav (Sidharth Baveja)
Review
[NFC] Separate Peeling Properties into its own struct
Summary: This patch makes the peeling properties of the loop accessible by other loop transformations.
Author: sidbav (Sidharth Baveja)
Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel)
Reviewed By: Meinersbur (Michael Kruse)
Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D80580
show more ...
|
| #
6965af43 |
| 08-Jul-2020 |
Anh Tuyen Tran <[email protected]> |
Revert "[NFC] Separate Peeling Properties into its own struct"
This reverts commit fead250b439bbd4ec0f21e6a52d0c174e5fcdf5a.
|
| #
fead250b |
| 08-Jul-2020 |
Anh Tuyen Tran <[email protected]> |
[NFC] Separate Peeling Properties into its own struct
Summary: This patch makes the peeling properties of the loop accessible by other loop transformations.
Author: sidbav (Sidharth Baveja)
Review
[NFC] Separate Peeling Properties into its own struct
Summary: This patch makes the peeling properties of the loop accessible by other loop transformations.
Author: sidbav (Sidharth Baveja)
Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel)
Reviewed By: Meinersbur (Michael Kruse)
Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D80580
show more ...
|
|
Revision tags: llvmorg-10.0.1-rc1 |
|
| #
40574fef |
| 28-Apr-2020 |
Sam Parker <[email protected]> |
[NFC][CostModel] Add TargetCostKind to relevant APIs
Make the kind of cost explicit throughout the cost model which, apart from making the cost clear, will allow the generic parts to calculate bette
[NFC][CostModel] Add TargetCostKind to relevant APIs
Make the kind of cost explicit throughout the cost model which, apart from making the cost clear, will allow the generic parts to calculate better costs. It will also allow some backends to approximate and correlate the different costs if they wish. Another benefit is that it will also help simplify the cost model around immediate and intrinsic costs, where we currently have multiple APIs.
RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/141263.html
Differential Revision: https://reviews.llvm.org/D79002
show more ...
|
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3 |
|
| #
5d986953 |
| 11-Dec-2019 |
Reid Kleckner <[email protected]> |
[IR] Split out target specific intrinsic enums into separate headers
This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build
[IR] Split out target specific intrinsic enums into separate headers
This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size. - Incremental step towards decoupling target intrinsics.
The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work.
Part of PR34259
Reviewers: efriedma, echristo, MaskRay
Reviewed By: echristo, MaskRay
Differential Revision: https://reviews.llvm.org/D71320
show more ...
|
| #
be7a1070 |
| 08-Dec-2019 |
David Green <[email protected]> |
[ARM] Teach the Arm cost model that a Shift can be folded into other instructions
This attempts to teach the cost model in Arm that code such as: %s = shl i32 %a, 3 %a = and i32 %s, %b Can under
[ARM] Teach the Arm cost model that a Shift can be folded into other instructions
This attempts to teach the cost model in Arm that code such as: %s = shl i32 %a, 3 %a = and i32 %s, %b Can under Arm or Thumb2 become: and r0, r1, r2, lsl #3
So the cost of the shift can essentially be free. To do this without trying to artificially adjust the cost of the "and" instruction, it needs to get the users of the shl and check if they are a type of instruction that the shift can be folded into. And so it needs to have access to the actual instruction in getArithmeticInstrCost, which if available is added as an extra parameter much like getCastInstrCost.
We otherwise limit it to shifts with a single user, which should hopefully handle most of the cases. The list of instruction that the shift can be folded into include ADC, ADD, AND, BIC, CMP, EOR, MVN, ORR, ORN, RSB, SBC and SUB. This translates to Add, Sub, And, Or, Xor and ICmp.
Differential Revision: https://reviews.llvm.org/D70966
show more ...
|
|
Revision tags: llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init |
|
| #
fa1a4e4d |
| 11-Jul-2019 |
Benjamin Kramer <[email protected]> |
[NVPTX] Use atomicrmw fadd instead of intrinsics
AutoUpgrade the old intrinsics to atomicrmw fadd.
llvm-svn: 365796
|
|
Revision tags: llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1 |
|
| #
2946cd70 |
| 19-Jan-2019 |
Chandler Carruth <[email protected]> |
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the ne
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
llvm-svn: 351636
show more ...
|
|
Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2 |
|
| #
b3bde2ea |
| 17-Nov-2017 |
David Blaikie <[email protected]> |
Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, n
Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around).
llvm-svn: 318490
show more ...
|
|
Revision tags: llvmorg-5.0.1-rc1, llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1 |
|
| #
66d9bdbc |
| 28-Jun-2017 |
Geoff Berry <[email protected]> |
[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI.
Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper
Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, j
[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI.
Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper
Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, javed.absar, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D34531
llvm-svn: 306554
show more ...
|
|
Revision tags: llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1, llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2, llvmorg-4.0.0-rc1 |
|
| #
2c96c433 |
| 11-Jan-2017 |
Mohammed Agabaria <[email protected]> |
[X86] updating TTI costs for arithmetic instructions on X86\SLM arch.
updated instructions: pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd.
special optimiz
[X86] updating TTI costs for arithmetic instructions on X86\SLM arch.
updated instructions: pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd.
special optimization case which replaces pmulld with pmullw\pmulhw\pshuf seq. In case if the real operands bitwidth <= 16.
Differential Revision: https://reviews.llvm.org/D28104
llvm-svn: 291657
show more ...
|
|
Revision tags: llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1 |
|
| #
3e121158 |
| 28-Sep-2016 |
Artem Belevich <[email protected]> |
[NVPTX] Added intrinsics for atom.gen.{sys|cta}.* instructions.
These are only available on sm_60+ GPUs.
Differential Revision: https://reviews.llvm.org/D24943
llvm-svn: 282607
|
|
Revision tags: llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1 |
|
| #
a466cc33 |
| 07-Jul-2016 |
Justin Bogner <[email protected]> |
NVPTX: Remove the legacy ptx intrinsics
- Rename the ptx.read.* intrinsics to nvvm.read.ptx.sreg.* - some but not all of these registers were already accessible via the nvvm name. - Rename ptx.b
NVPTX: Remove the legacy ptx intrinsics
- Rename the ptx.read.* intrinsics to nvvm.read.ptx.sreg.* - some but not all of these registers were already accessible via the nvvm name. - Rename ptx.bar.sync nvvm.bar.sync, to match nvvm.bar0.
There's a fair amount of code motion here, but it's all very mechanical.
llvm-svn: 274769
show more ...
|
|
Revision tags: llvmorg-3.8.1, llvmorg-3.8.1-rc1, llvmorg-3.8.0, llvmorg-3.8.0-rc3, llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1, llvmorg-3.7.1, llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1, llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3 |
|
| #
93205eb9 |
| 05-Aug-2015 |
Chandler Carruth <[email protected]> |
[TTI] Make the cost APIs in TargetTransformInfo consistently use 'int' rather than 'unsigned' for their costs.
For something like costs in particular there is a natural "negative" value, that of sav
[TTI] Make the cost APIs in TargetTransformInfo consistently use 'int' rather than 'unsigned' for their costs.
For something like costs in particular there is a natural "negative" value, that of savings or saved cost. As a consequence, there is a lot of code that subtracts or creates negative values based on cost, all of which is prone to awkwardness or bugs when dealing with an unsigned type. Similarly, we *never* want these values to wrap, as that would cause Very Bad code generation (likely percieved as an infinite loop as we try to emit over 2^32 instructions or some such insanity).
All around 'int' seems a much better fit for these basic metrics. I've added asserts to ensure that at least the TTI interface never returns negative numbers here. If we ever have a use case for negative numbers, we can remove this, but this way a bug where someone used '-1' to produce a 'very large' cost will be caught by the assert.
This passes all tests, and is also UBSan clean.
No functional change intended.
Differential Revision: http://reviews.llvm.org/D11741
llvm-svn: 244080
show more ...
|
|
Revision tags: llvmorg-3.7.0-rc2, llvmorg-3.7.0-rc1 |
|
| #
4c8ca53f |
| 13-Jul-2015 |
Mark Heffernan <[email protected]> |
Enable partial and runtime loop unrolling for NVPTX.
Enable partial and runtime loop unrolling for NVPTX backend via TTI::UnrollingPreferences with a small threshold. This partially unrolls small lo
Enable partial and runtime loop unrolling for NVPTX.
Enable partial and runtime loop unrolling for NVPTX backend via TTI::UnrollingPreferences with a small threshold. This partially unrolls small loops which are often unrolled by the PTX to SASS compiler and unrolling earlier can be beneficial.
llvm-svn: 242049
show more ...
|