MachineBlockPlacement.cpp - OpenGrok history log for /llvm-project-15.0.7/llvm/lib/CodeGen/MachineBlockPlacement.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 9e6d1f4b	17-Jul-2022	Kazu Hirata <[email protected]>	[CodeGen] Qualify auto variables in for loops (NFC)
Revision tags: llvmorg-14.0.6
# 1e67385d	17-Jun-2022	Mingming Liu <[email protected]>	[MachineBlockPlacementStats] Added check for "-filter-print-funcs" option to the machine-block-placement-stats. Differential Revision: https://reviews.llvm.org/D128019
# b7d09557	17-Jun-2022	Mingming Liu <[email protected]>	Revert "[MachineBlockPlacementStats] Add check for `-filter-print-funcs` option to machine-block-placement stats." This reverts commit 46d45df4516e9a5bc43460429cd02cd04a85db1a. Going to add differential revision link to commit message and re-commit.
# 46d45df4	17-Jun-2022	Mingming Liu <[email protected]>	[MachineBlockPlacementStats] Add check for `-filter-print-funcs` option to machine-block-placement stats.
Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 989f1c72	15-Mar-2022	serge-sans-paille <[email protected]>	Cleanup codegen includes This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# a278250b	10-Mar-2022	Nico Weber <[email protected]>	Revert "Cleanup codegen includes" This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169
# 7f230fee	07-Mar-2022	serge-sans-paille <[email protected]>	Cleanup codegen includes after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169
Revision tags: llvmorg-14.0.0-rc2
# bcdc0477	01-Mar-2022	spupyrev <[email protected]>	speeding up ext-tsp for huge instances Differential Revision: https://reviews.llvm.org/D120780
Revision tags: llvmorg-14.0.0-rc1
# dee058c6	05-Feb-2022	Hongtao Yu <[email protected]>	[CSSPGO] Turn on ext-tsp by default for CSSPGO. I'm seeing ext-tsp helps CSSPGO for our intern large benchmarks so I'm turning on it for CSSPGO. For non-CS AutoFDO, ext-tsp doesn't seem to help, probably because of lower profile counts quality. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D119048
Revision tags: llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 73d92faa	01-Dec-2021	Nicholas Guy <[email protected]>	[CodeGen] Emit alignment "Max Skip" operand The current AsmPrinter has support to emit the "Max Skip" operand (the 3rd of .p2align), however has no support for it to actually be specified. Adding MaxBytesForAlignment to MachineBasicBlock provides this capability on a per-block basis. Leaving the value as default (0) causes no observable differences in behaviour. Differential Revision: https://reviews.llvm.org/D114590
Revision tags: llvmorg-13.0.1-rc1
# f573f686	08-Nov-2021	spupyrev <[email protected]>	ext-tsp basic block layout A new basic block ordering improving existing MachineBlockPlacement. The algorithm tries to find a layout of nodes (basic blocks) of a given CFG optimizing jump locality and thus processor I-cache utilization. This is achieved via increasing the number of fall-through jumps and co-locating frequently executed nodes together. The name follows the underlying optimization problem, Extended-TSP, which is a generalization of classical (maximum) Traveling Salesmen Problem. The algorithm is a greedy heuristic that works with chains (ordered lists) of basic blocks. Initially all chains are isolated basic blocks. On every iteration, we pick a pair of chains whose merging yields the biggest increase in the ExtTSP value, which models how i-cache "friendly" a specific chain is. A pair of chains giving the maximum gain is merged into a new chain. The procedure stops when there is only one chain left, or when merging does not increase ExtTSP. In the latter case, the remaining chains are sorted by density in decreasing order. An important aspect is the way two chains are merged. Unlike earlier algorithms (e.g., based on the approach of Pettis-Hansen), two chains, X and Y, are first split into three, X1, X2, and Y. Then we consider all possible ways of gluing the three chains (e.g., X1YX2, X1X2Y, X2X1Y, X2YX1, YX1X2, YX2X1) and choose the one producing the largest score. This improves the quality of the final result (the search space is larger) while keeping the implementation sufficiently fast. Differential Revision: https://reviews.llvm.org/D113424
# 3678326d	07-Dec-2021	Nico Weber <[email protected]>	Revert "ext-tsp basic block layout" This reverts commit c68f71eb37c2b6ffcf29e865d443a910e73083bd. Breaks tests on arm hosts, see comments on https://reviews.llvm.org/D113424
# c68f71eb	08-Nov-2021	spupyrev <[email protected]>	ext-tsp basic block layout A new basic block ordering improving existing MachineBlockPlacement. The algorithm tries to find a layout of nodes (basic blocks) of a given CFG optimizing jump locality and thus processor I-cache utilization. This is achieved via increasing the number of fall-through jumps and co-locating frequently executed nodes together. The name follows the underlying optimization problem, Extended-TSP, which is a generalization of classical (maximum) Traveling Salesmen Problem. The algorithm is a greedy heuristic that works with chains (ordered lists) of basic blocks. Initially all chains are isolated basic blocks. On every iteration, we pick a pair of chains whose merging yields the biggest increase in the ExtTSP value, which models how i-cache "friendly" a specific chain is. A pair of chains giving the maximum gain is merged into a new chain. The procedure stops when there is only one chain left, or when merging does not increase ExtTSP. In the latter case, the remaining chains are sorted by density in decreasing order. An important aspect is the way two chains are merged. Unlike earlier algorithms (e.g., based on the approach of Pettis-Hansen), two chains, X and Y, are first split into three, X1, X2, and Y. Then we consider all possible ways of gluing the three chains (e.g., X1YX2, X1X2Y, X2X1Y, X2YX1, YX1X2, YX2X1) and choose the one producing the largest score. This improves the quality of the final result (the search space is larger) while keeping the implementation sufficiently fast. Differential Revision: https://reviews.llvm.org/D113424
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3
# c9fca53a	10-Sep-2021	Kazu Hirata <[email protected]>	[CodeGen, Target] Use pred_empty and succ_empty (NFC)
Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1
# 50b62731	29-Jul-2021	Guozhi Wei <[email protected]>	[MBP] findBestLoopTopHelper should exit if OldTop is not a chain header Function findBestLoopTopHelper tries to find a new loop top block which can also fall through to OldTop, but it's impossible if OldTop is not a chain header, so it should exit immediately. Differential Revision: https://reviews.llvm.org/D106329
Revision tags: llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# d8aba75a	07-May-2021	Fangrui Song <[email protected]>	Internalize some cl::opt global variables or move them under namespace llvm
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3
# cd880442	28-Jan-2021	Nicholas Guy <[email protected]>	[CodeGen][AArch64] Add TargetInstrInfo hook to modify the TailDuplicateSize default threshold Different targets might handle branch performance differently, so this patch allows for targets to specify the TailDuplicateSize threshold. Said threshold defines how small a branch can be and still be duplicated to generate straight-line code instead. This patch also specifies said override values for the AArch64 subtarget. Differential Revision: https://reviews.llvm.org/D95631
Revision tags: llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1
# 7bc76fd0	31-Dec-2020	Kazu Hirata <[email protected]>	[CodeGen] Construct SmallVector with iterator ranges (NFC)
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2
# 687e80be	16-Dec-2020	Guozhi Wei <[email protected]>	[MBP] Add whole chain to BlockFilterSet instead of individual BB Currently we add individual BB to BlockFilterSet if its frequency satisfies LoopFreq / Freq <= LoopToColdBlockRatio LoopFreq is edge frequency from outside to loop header. LoopToColdBlockRatio is a command line parameter. It doesn't make sense since we always layout whole chain, not individual BBs. It may also cause a tricky problem. Sometimes it is possible that the LoopFreq of an inner loop is smaller than LoopFreq of outer loop. So a BB can be in BlockFilterSet of inner loop, but not in BlockFilterSet of outer loop, like .cold in the test case. So it is added to the chain of inner loop. When work on the outer loop, .cold is not added to BlockFilterSet, so the edge to successor .problem is not counted in UnscheduledPredecessors of .problem chain. But other blocks in the inner loop are added BlockFilterSet, so the whole inner loop chain can be layout, and markChainSuccessors is called to decrease UnscheduledPredecessors of following chains. markChainSuccessors calls markBlockSuccessors for every BB, even it is not in BlockFilterSet, like .cold, so .problem chain's UnscheduledPredecessors is decreased, but this edge was not counted on in fillWorkLists, so .problem chain's UnscheduledPredecessors becomes 0 when it still has an unscheduled predecessor .pred! And it causes problems in following various successor BB selection algorithms. Differential Revision: https://reviews.llvm.org/D89088
# d50d7c37	14-Dec-2020	Guozhi Wei <[email protected]>	[MBP] Prevent rotating a chain contains entry block The entry block should always be the first BB in a function. So we should not rotate a chain contains the entry block. Differential Revision: https://reviews.llvm.org/D92882
# ee5b5b7a	14-Dec-2020	Kazu Hirata <[email protected]>	[CodeGen] Use llvm::erase_value (NFC)
# a553ac97	05-Dec-2020	Kazu Hirata <[email protected]>	[CodeGen] llvm::erase_if (NFC)
Revision tags: llvmorg-11.0.1-rc1
# 68403af0	22-Nov-2020	Kazu Hirata <[email protected]>	[MBP] Remove unused declaration shouldPredBlockBeOutlined (NFC) The function was introduced on Jun 12, 2016 in commit 071d0f180794f7819c44026815614ce8fa00a3bd. Its definition was removed on Mar 2, 2017 in commit 1393761e0ca3fe8271245762f78daf4d5208cd77.
# e42f6c0a	23-Oct-2020	Han Shen <[email protected]>	Revert "[MBP] Add whole chain to BlockFilterSet instead of individual BB" This reverts commit adfb5415010fbbc009a4a6298cfda7a6ed4fa6d4. This is reverted because it caused an chrome error: https://crbug.com/1140168
# adfb5415	14-Oct-2020	Guozhi Wei <[email protected]>	[MBP] Add whole chain to BlockFilterSet instead of individual BB Currently we add individual BB to BlockFilterSet if its frequency satisfies LoopFreq / Freq <= LoopToColdBlockRatio LoopFreq is edge frequency from outside to loop header. LoopToColdBlockRatio is a command line parameter. It doesn't make sense since we always layout whole chain, not individual BBs. It may also cause a tricky problem. Sometimes it is possible that the LoopFreq of an inner loop is smaller than LoopFreq of outer loop. So a BB can be in BlockFilterSet of inner loop, but not in BlockFilterSet of outer loop, like .cold in the test case. So it is added to the chain of inner loop. When work on the outer loop, .cold is not added to BlockFilterSet, so the edge to successor .problem is not counted in UnscheduledPredecessors of .problem chain. But other blocks in the inner loop are added BlockFilterSet, so the whole inner loop chain can be layout, and markChainSuccessors is called to decrease UnscheduledPredecessors of following chains. markChainSuccessors calls markBlockSuccessors for every BB, even it is not in BlockFilterSet, like .cold, so .problem chain's UnscheduledPredecessors is decreased, but this edge was not counted on in fillWorkLists, so .problem chain's UnscheduledPredecessors becomes 0 when it still has an unscheduled predecessor .pred! And it causes problems in following various successor BB selection algorithms. Differential Revision: https://reviews.llvm.org/D89088