|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
6678f8e5 |
| 27-Jun-2022 |
Yuanfang Chen <[email protected]> |
[ubsan] Using metadata instead of prologue data for function sanitizer
Information in the function `Prologue Data` is intentionally opaque. When a function with `Prologue Data` is duplicated. The se
[ubsan] Using metadata instead of prologue data for function sanitizer
Information in the function `Prologue Data` is intentionally opaque. When a function with `Prologue Data` is duplicated. The self (global value) references inside `Prologue Data` is still pointing to the original function. This may cause errors like `fatal error: error in backend: Cannot represent a difference across sections`.
This patch detaches the information from function `Prologue Data` and attaches it to a function metadata node.
This and D116130 fix https://github.com/llvm/llvm-project/issues/49689.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D115844
show more ...
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2 |
|
| #
705a4c14 |
| 08-Dec-2020 |
Hongtao Yu <[email protected]> |
[CSSPGO] Pseudo probe encoding and emission.
This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdY
[CSSPGO] Pseudo probe encoding and emission.
This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s
Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections. The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead.
**ELF object emission**
The binary data to emit are organized as two ELF sections, i.e, the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each fo which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as a module-level metadata during the compilation and is serialized into the object file during object emission.
Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication. A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool.
The format of `.pseudo_probe_desc` section looks like:
``` .section .pseudo_probe_desc,"",@progbits .quad 6309742469962978389 // Func GUID .quad 4294967295 // Func Hash .byte 9 // Length of func name .ascii "_Z5funcAi" // Func name .quad 7102633082150537521 .quad 138828622701 .byte 12 .ascii "_Z8funcLeafi" .quad 446061515086924981 .quad 4294967295 .byte 9 .ascii "_Z5funcBi" .quad -2016976694713209516 .quad 72617220756 .byte 7 .ascii "_Z3fibi" ```
For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format :
``` FUNCTION BODY (one for each outlined function present in the text section) GUID (uint64) GUID of the function NPROBES (ULEB128) Number of probes originating from this function. NUM_INLINED_FUNCTIONS (ULEB128) Number of callees inlined into this function, aka number of first-level inlinees PROBE RECORDS A list of NPROBES entries. Each entry contains: INDEX (ULEB128) TYPE (uint4) 0 - block probe, 1 - indirect call, 2 - direct call ATTRIBUTE (uint3) reserved ADDRESS_TYPE (uint1) 0 - code address, 1 - address delta CODE_ADDRESS (uint64 or ULEB128) code address or address delta, depending on ADDRESS_TYPE INLINED FUNCTION RECORDS A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined callees. Each record contains: INLINE SITE GUID of the inlinee (uint64) ID of the callsite probe (ULEB128) FUNCTION BODY A FUNCTION BODY entry describing the inlined function. ```
To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree based on the debug metadata for each outlined function in the form of a trie tree. A tree root is the outlined function. Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node and the tree path starting from the root all the way down to the tree node is the inline context of the probes. The emission happens on the whole tree top-down recursively. Probes of a tree node will be emitted altogether with their direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings, the address is encoded as a delta from the previous probe except for the first probe. Variant-sized integer encoding, aka LEB128, is used for address delta and probe index.
**Assembling**
Pseudo probes can be printed as assembly directives alternatively. This allows for good assembly code readability and also provides a view of how optimizations and pseudo probes affect each other, especially helpful for diff time assembly analysis.
A pseudo probe directive has the following operands in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudoprobe` section in the object file.
A example assembly looks like:
``` foo2: # @foo2 # %bb.0: # %bb0 pushq %rax testl %edi, %edi .pseudoprobe 837061429793323041 1 0 0 je .LBB1_1 # %bb.2: # %bb2 .pseudoprobe 837061429793323041 6 2 0 callq foo .pseudoprobe 837061429793323041 3 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq .LBB1_1: # %bb1 .pseudoprobe 837061429793323041 5 1 0 callq *%rsi .pseudoprobe 837061429793323041 2 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq # -- End function .section .pseudo_probe_desc,"",@progbits .quad 6699318081062747564 .quad 72617220756 .byte 3 .ascii "foo" .quad 837061429793323041 .quad 281547593931412 .byte 4 .ascii "foo2" ```
With inlining turned on, the assembly may look different around %bb2 with an inlined probe:
``` # %bb.2: # %bb2 .pseudoprobe 837061429793323041 3 0 .pseudoprobe 6699318081062747564 1 0 @ 837061429793323041:6 .pseudoprobe 837061429793323041 4 0 popq %rax retq ```
**Disassembling**
We have a disassembling tool (llvm-profgen) that can display disassembly alongside with pseudo probes. So far it only supports ELF executable file.
An example disassembly looks like:
``` 00000000002011a0 <foo2>: 2011a0: 50 push rax 2011a1: 85 ff test edi,edi [Probe]: FUNC: foo2 Index: 1 Type: Block 2011a3: 74 02 je 2011a7 <foo2+0x7> [Probe]: FUNC: foo2 Index: 3 Type: Block [Probe]: FUNC: foo2 Index: 4 Type: Block [Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6 2011a5: 58 pop rax 2011a6: c3 ret [Probe]: FUNC: foo2 Index: 2 Type: Block 2011a7: bf 01 00 00 00 mov edi,0x1 [Probe]: FUNC: foo2 Index: 5 Type: IndirectCall 2011ac: ff d6 call rsi [Probe]: FUNC: foo2 Index: 4 Type: Block 2011ae: 58 pop rax 2011af: c3 ret ```
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D91878
show more ...
|
| #
7ead5f5a |
| 10-Dec-2020 |
Mitch Phillips <[email protected]> |
Revert "[CSSPGO] Pseudo probe encoding and emission."
This reverts commit b035513c06d1cba2bae8f3e88798334e877523e1.
Reason: Broke the ASan buildbots: http://lab.llvm.org:8011/#/builders/5/builds/
Revert "[CSSPGO] Pseudo probe encoding and emission."
This reverts commit b035513c06d1cba2bae8f3e88798334e877523e1.
Reason: Broke the ASan buildbots: http://lab.llvm.org:8011/#/builders/5/builds/2269
show more ...
|
| #
b035513c |
| 08-Dec-2020 |
Hongtao Yu <[email protected]> |
[CSSPGO] Pseudo probe encoding and emission.
This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdY
[CSSPGO] Pseudo probe encoding and emission.
This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s
Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections. The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead.
**ELF object emission**
The binary data to emit are organized as two ELF sections, i.e, the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each fo which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as a module-level metadata during the compilation and is serialized into the object file during object emission.
Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication. A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool.
The format of `.pseudo_probe_desc` section looks like:
``` .section .pseudo_probe_desc,"",@progbits .quad 6309742469962978389 // Func GUID .quad 4294967295 // Func Hash .byte 9 // Length of func name .ascii "_Z5funcAi" // Func name .quad 7102633082150537521 .quad 138828622701 .byte 12 .ascii "_Z8funcLeafi" .quad 446061515086924981 .quad 4294967295 .byte 9 .ascii "_Z5funcBi" .quad -2016976694713209516 .quad 72617220756 .byte 7 .ascii "_Z3fibi" ```
For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format :
``` FUNCTION BODY (one for each outlined function present in the text section) GUID (uint64) GUID of the function NPROBES (ULEB128) Number of probes originating from this function. NUM_INLINED_FUNCTIONS (ULEB128) Number of callees inlined into this function, aka number of first-level inlinees PROBE RECORDS A list of NPROBES entries. Each entry contains: INDEX (ULEB128) TYPE (uint4) 0 - block probe, 1 - indirect call, 2 - direct call ATTRIBUTE (uint3) reserved ADDRESS_TYPE (uint1) 0 - code address, 1 - address delta CODE_ADDRESS (uint64 or ULEB128) code address or address delta, depending on ADDRESS_TYPE INLINED FUNCTION RECORDS A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined callees. Each record contains: INLINE SITE GUID of the inlinee (uint64) ID of the callsite probe (ULEB128) FUNCTION BODY A FUNCTION BODY entry describing the inlined function. ```
To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree based on the debug metadata for each outlined function in the form of a trie tree. A tree root is the outlined function. Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node and the tree path starting from the root all the way down to the tree node is the inline context of the probes. The emission happens on the whole tree top-down recursively. Probes of a tree node will be emitted altogether with their direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings, the address is encoded as a delta from the previous probe except for the first probe. Variant-sized integer encoding, aka LEB128, is used for address delta and probe index.
**Assembling**
Pseudo probes can be printed as assembly directives alternatively. This allows for good assembly code readability and also provides a view of how optimizations and pseudo probes affect each other, especially helpful for diff time assembly analysis.
A pseudo probe directive has the following operands in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudoprobe` section in the object file.
A example assembly looks like:
``` foo2: # @foo2 # %bb.0: # %bb0 pushq %rax testl %edi, %edi .pseudoprobe 837061429793323041 1 0 0 je .LBB1_1 # %bb.2: # %bb2 .pseudoprobe 837061429793323041 6 2 0 callq foo .pseudoprobe 837061429793323041 3 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq .LBB1_1: # %bb1 .pseudoprobe 837061429793323041 5 1 0 callq *%rsi .pseudoprobe 837061429793323041 2 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq # -- End function .section .pseudo_probe_desc,"",@progbits .quad 6699318081062747564 .quad 72617220756 .byte 3 .ascii "foo" .quad 837061429793323041 .quad 281547593931412 .byte 4 .ascii "foo2" ```
With inlining turned on, the assembly may look different around %bb2 with an inlined probe:
``` # %bb.2: # %bb2 .pseudoprobe 837061429793323041 3 0 .pseudoprobe 6699318081062747564 1 0 @ 837061429793323041:6 .pseudoprobe 837061429793323041 4 0 popq %rax retq ```
**Disassembling**
We have a disassembling tool (llvm-profgen) that can display disassembly alongside with pseudo probes. So far it only supports ELF executable file.
An example disassembly looks like:
``` 00000000002011a0 <foo2>: 2011a0: 50 push rax 2011a1: 85 ff test edi,edi [Probe]: FUNC: foo2 Index: 1 Type: Block 2011a3: 74 02 je 2011a7 <foo2+0x7> [Probe]: FUNC: foo2 Index: 3 Type: Block [Probe]: FUNC: foo2 Index: 4 Type: Block [Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6 2011a5: 58 pop rax 2011a6: c3 ret [Probe]: FUNC: foo2 Index: 2 Type: Block 2011a7: bf 01 00 00 00 mov edi,0x1 [Probe]: FUNC: foo2 Index: 5 Type: IndirectCall 2011ac: ff d6 call rsi [Probe]: FUNC: foo2 Index: 4 Type: Block 2011ae: 58 pop rax 2011af: c3 ret ```
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D91878
show more ...
|
|
Revision tags: llvmorg-11.0.1-rc1 |
|
| #
6861d938 |
| 14-Nov-2020 |
Roman Lebedev <[email protected]> |
Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"
See discussion in https://bugs.llvm.org/show_bug.cgi?id=45073 / https://reviews.llvm.org/D66324#2334485 the imp
Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"
See discussion in https://bugs.llvm.org/show_bug.cgi?id=45073 / https://reviews.llvm.org/D66324#2334485 the implementation is known-broken for certain inputs, the bugreport was up for a significant amount of timer, and there has been no activity to address it. Therefore, just completely rip out all of misexpect handling.
I suspect, fixing it requires redesigning the internals of MD_misexpect. Should anyone commit to fixing the implementation problem, starting from clean slate may be better anyways.
This reverts commit 7bdad08429411e7d0ecd58cd696b1efe3cff309e, and some of it's follow-ups, that don't stand on their own.
show more ...
|
| #
5c31b8b9 |
| 31-Oct-2020 |
Arthur Eubanks <[email protected]> |
Revert "Use uint64_t for branch weights instead of uint32_t"
This reverts commit 10f2a0d662d8d72eaac48d3e9b31ca8dc90df5a4.
More uint64_t overflows.
|
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6 |
|
| #
10f2a0d6 |
| 30-Sep-2020 |
Arthur Eubanks <[email protected]> |
Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64
Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights.
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D88609
show more ...
|
| #
2a4e704c |
| 27-Oct-2020 |
Nico Weber <[email protected]> |
Revert "Use uint64_t for branch weights instead of uint32_t"
This reverts commit e5766f25c62c185632e3a75bf45b313eadab774b. Makes clang assert when building Chromium, see https://crbug.com/1142813 fo
Revert "Use uint64_t for branch weights instead of uint32_t"
This reverts commit e5766f25c62c185632e3a75bf45b313eadab774b. Makes clang assert when building Chromium, see https://crbug.com/1142813 for a repro.
show more ...
|
| #
e5766f25 |
| 30-Sep-2020 |
Arthur Eubanks <[email protected]> |
Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64
Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights.
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D88609
show more ...
|
| #
d4c667c9 |
| 23-Oct-2020 |
Duncan P. N. Exon Smith <[email protected]> |
Avoid unnecessary uses of `MDNode::getTemporary`, NFC
This is a long-delayed follow-up to 5e5b85098dbeaea2cfa5d01695b5d2982634d7dd.
`TempMDNode` includes a bunch of machinery for RAUW, and should o
Avoid unnecessary uses of `MDNode::getTemporary`, NFC
This is a long-delayed follow-up to 5e5b85098dbeaea2cfa5d01695b5d2982634d7dd.
`TempMDNode` includes a bunch of machinery for RAUW, and should only be used when necessary. RAUW wasn't being used in any of these cases... it was just a placeholder for a self-reference.
Where the real node was using `MDNode::getDistinct`, just replace the temporary argument with `nullptr`.
Where the real node was using `MDNode::get`, the `replaceOperandWith` call was "promoting" the node to a distinct one implicitly due to self-reference detection in `MDNode::handleChangedOperand`. The `TempMDNode` was serving a purpose by delaying uniquing, but it's way simpler to just call `MDNode::getDistinct` in the first place.
Note that using a self-reference at all in these places is a hold-over from before `distinct` metadata existed. It was an old trick to create distinct nodes. It would be intrusive to change, including bitcode upgrades, etc., and it's harmless so I'm not sure there's much value in removing it from existing schemas. After this commit it still has a tiny memory cost (in the extra metadata operand) but no more overhead in construction.
Differential Revision: https://reviews.llvm.org/D90079
show more ...
|
|
Revision tags: llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1 |
|
| #
ba2e72c5 |
| 28-Mar-2020 |
Benjamin Kramer <[email protected]> |
[MDBuilder] Don't use stable sort for sorting integers.
|
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5 |
|
| #
7bdad084 |
| 11-Sep-2019 |
Petr Hosek <[email protected]> |
Reland "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"
This patch contains the basic functionality for reporting potentially incorrect usage of __builtin_expect() by
Reland "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"
This patch contains the basic functionality for reporting potentially incorrect usage of __builtin_expect() by comparing the developer's annotation against a collected PGO profile. A more detailed proposal and discussion appears on the CFE-dev mailing list (http://lists.llvm.org/pipermail/cfe-dev/2019-July/062971.html) and a prototype of the initial frontend changes appear here in D65300
We revised the work in D65300 by moving the misexpect check into the LLVM backend, and adding support for IR and sampling based profiles, in addition to frontend instrumentation.
We add new misexpect metadata tags to those instructions directly influenced by the llvm.expect intrinsic (branch, switch, and select) when lowering the intrinsics. The misexpect metadata contains information about the expected target of the intrinsic so that we can check against the correct PGO counter when emitting diagnostics, and the compiler's values for the LikelyBranchWeight and UnlikelyBranchWeight. We use these branch weight values to determine when to emit the diagnostic to the user.
A future patch should address the comment at the top of LowerExpectIntrisic.cpp to hoist the LikelyBranchWeight and UnlikelyBranchWeight values into a shared space that can be accessed outside of the LowerExpectIntrinsic pass. Once that is done, the misexpect metadata can be updated to be smaller.
In the long term, it is possible to reconstruct portions of the misexpect metadata from the existing profile data. However, we have avoided this to keep the code simple, and because some kind of metadata tag will be required to identify which branch/switch/select instructions are influenced by the use of llvm.expect
Patch By: paulkirth Differential Revision: https://reviews.llvm.org/D66324
llvm-svn: 371635
show more ...
|
| #
57256af3 |
| 11-Sep-2019 |
Dmitri Gribenko <[email protected]> |
Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"
This reverts commit r371584. It introduced a dependency from compiler-rt to llvm/include/ADT, which is problema
Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"
This reverts commit r371584. It introduced a dependency from compiler-rt to llvm/include/ADT, which is problematic for multiple reasons.
One is that it is a novel dependency edge, which needs cross-compliation machinery for llvm/include/ADT (yes, it is true that right now compiler-rt included only header-only libraries, however, if we allow compiler-rt to depend on anything from ADT, other libraries will eventually get used).
Secondly, depending on ADT from compiler-rt exposes ADT symbols from compiler-rt, which would cause ODR violations when Clang is built with the profile library.
llvm-svn: 371598
show more ...
|
| #
394a8ed8 |
| 11-Sep-2019 |
Petr Hosek <[email protected]> |
clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM
This patch contains the basic functionality for reporting potentially incorrect usage of __builtin_expect() by comparing
clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM
This patch contains the basic functionality for reporting potentially incorrect usage of __builtin_expect() by comparing the developer's annotation against a collected PGO profile. A more detailed proposal and discussion appears on the CFE-dev mailing list (http://lists.llvm.org/pipermail/cfe-dev/2019-July/062971.html) and a prototype of the initial frontend changes appear here in D65300
We revised the work in D65300 by moving the misexpect check into the LLVM backend, and adding support for IR and sampling based profiles, in addition to frontend instrumentation.
We add new misexpect metadata tags to those instructions directly influenced by the llvm.expect intrinsic (branch, switch, and select) when lowering the intrinsics. The misexpect metadata contains information about the expected target of the intrinsic so that we can check against the correct PGO counter when emitting diagnostics, and the compiler's values for the LikelyBranchWeight and UnlikelyBranchWeight. We use these branch weight values to determine when to emit the diagnostic to the user.
A future patch should address the comment at the top of LowerExpectIntrisic.cpp to hoist the LikelyBranchWeight and UnlikelyBranchWeight values into a shared space that can be accessed outside of the LowerExpectIntrinsic pass. Once that is done, the misexpect metadata can be updated to be smaller.
In the long term, it is possible to reconstruct portions of the misexpect metadata from the existing profile data. However, we have avoided this to keep the code simple, and because some kind of metadata tag will be required to identify which branch/switch/select instructions are influenced by the use of llvm.expect
Patch By: paulkirth Differential Revision: https://reviews.llvm.org/D66324
llvm-svn: 371584
show more ...
|
|
Revision tags: llvmorg-9.0.0-rc4 |
|
| #
7d1757ab |
| 10-Sep-2019 |
Petr Hosek <[email protected]> |
Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"
This reverts commit r371484: this broke sanitizer-x86_64-linux-fast bot.
llvm-svn: 371488
|
| #
a10802fd |
| 10-Sep-2019 |
Petr Hosek <[email protected]> |
clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM
This patch contains the basic functionality for reporting potentially incorrect usage of __builtin_expect() by comparing
clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM
This patch contains the basic functionality for reporting potentially incorrect usage of __builtin_expect() by comparing the developer's annotation against a collected PGO profile. A more detailed proposal and discussion appears on the CFE-dev mailing list (http://lists.llvm.org/pipermail/cfe-dev/2019-July/062971.html) and a prototype of the initial frontend changes appear here in D65300
We revised the work in D65300 by moving the misexpect check into the LLVM backend, and adding support for IR and sampling based profiles, in addition to frontend instrumentation.
We add new misexpect metadata tags to those instructions directly influenced by the llvm.expect intrinsic (branch, switch, and select) when lowering the intrinsics. The misexpect metadata contains information about the expected target of the intrinsic so that we can check against the correct PGO counter when emitting diagnostics, and the compiler's values for the LikelyBranchWeight and UnlikelyBranchWeight. We use these branch weight values to determine when to emit the diagnostic to the user.
A future patch should address the comment at the top of LowerExpectIntrisic.cpp to hoist the LikelyBranchWeight and UnlikelyBranchWeight values into a shared space that can be accessed outside of the LowerExpectIntrinsic pass. Once that is done, the misexpect metadata can be updated to be smaller.
In the long term, it is possible to reconstruct portions of the misexpect metadata from the existing profile data. However, we have avoided this to keep the code simple, and because some kind of metadata tag will be required to identify which branch/switch/select instructions are influenced by the use of llvm.expect
Patch By: paulkirth Differential Revision: https://reviews.llvm.org/D66324
llvm-svn: 371484
show more ...
|
|
Revision tags: llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1 |
|
| #
efd94c56 |
| 23-Apr-2019 |
Fangrui Song <[email protected]> |
Use llvm::stable_sort
While touching the code, simplify if feasible.
llvm-svn: 358996
|
|
Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1 |
|
| #
043a0873 |
| 19-Jan-2019 |
Johannes Doerfert <[email protected]> |
[NFC] Fix unused variable warnings in Release builds
llvm-svn: 351641
|
| #
2946cd70 |
| 19-Jan-2019 |
Chandler Carruth <[email protected]> |
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the ne
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
llvm-svn: 351636
show more ...
|
| #
18251842 |
| 19-Jan-2019 |
Johannes Doerfert <[email protected]> |
AbstractCallSite -- A unified interface for (in)direct and callback calls
An abstract call site is a wrapper that allows to treat direct, indirect, and callback calls the same. If an abstract ca
AbstractCallSite -- A unified interface for (in)direct and callback calls
An abstract call site is a wrapper that allows to treat direct, indirect, and callback calls the same. If an abstract call site represents a direct or indirect call site it behaves like a stripped down version of a normal call site object. The abstract call site can also represent a callback call, thus the fact that the initially called function (=broker) may invoke a third one (=callback callee). In this case, the abstract call side hides the middle man, hence the broker function. The result is a representation of the callback call, inside the broker, but in the context of the original instruction that invoked the broker.
Again, there are up to three functions involved when we talk about callback call sites. The caller (1), which invokes the broker function. The broker function (2), that may or may not invoke the callback callee. And finally the callback callee (3), which is the target of the callback call.
The abstract call site will handle the mapping from parameters to arguments depending on the semantic of the broker function. However, it is important to note that the mapping is often partial. Thus, some arguments of the call/invoke instruction are mapped to parameters of the callee while others are not. At the same time, arguments of the callback callee might be unknown, thus "null" if queried.
This patch introduces also !callback metadata which describe how a callback broker maps from parameters to arguments. This metadata is directly created by clang for known broker functions, provided through source code attributes by the user, or later deduced by analyses.
For motivation and additional information please see the corresponding talk (slides/video) https://llvm.org/devmtg/2018-10/talk-abstracts.html#talk20 as well as the LCPC paper http://compilers.cs.uni-saarland.de/people/doerfert/par_opt_lcpc18.pdf
Differential Revision: https://reviews.llvm.org/D54498
llvm-svn: 351627
show more ...
|
|
Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2 |
|
| #
3083105b |
| 15-Aug-2018 |
George Burgess IV <[email protected]> |
[Metadata] Replace a SmallVector with an array; NFC
MDNode::get takes an ArrayRef, so these should be equivalent.
llvm-svn: 339824
|
|
Revision tags: llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2 |
|
| #
5f8f34e4 |
| 01-May-2018 |
Adrian Prantl <[email protected]> |
Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they ar
Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all.
Patch produced by
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46290
llvm-svn: 331272
show more ...
|
|
Revision tags: llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3 |
|
| #
4a381b44 |
| 13-Feb-2018 |
Ivan A. Kosarev <[email protected]> |
[IR] Fix creating mutable versions of TBAA access tags
Due to a typo in D41565, mutable TBAA tags created with createMutableTBAAAccessTag() lose their base types. This patch fixes that typo and upda
[IR] Fix creating mutable versions of TBAA access tags
Due to a typo in D41565, mutable TBAA tags created with createMutableTBAAAccessTag() lose their base types. This patch fixes that typo and updates tests respectively.
Differential Revision: https://reviews.llvm.org/D42364
llvm-svn: 325008
show more ...
|
|
Revision tags: llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1 |
|
| #
4d0ff0c7 |
| 17-Jan-2018 |
Ivan A. Kosarev <[email protected]> |
[Transforms] Support making mutable versions of new-format TBAA access tags
Differential Revision: https://reviews.llvm.org/D41565
llvm-svn: 322650
|
| #
bdf20261 |
| 09-Jan-2018 |
Easwaran Raman <[email protected]> |
Add a pass to generate synthetic function entry counts.
Summary: This pass synthesizes function entry counts by traversing the callgraph and using the relative block frequencies of the callsites. Th
Add a pass to generate synthetic function entry counts.
Summary: This pass synthesizes function entry counts by traversing the callgraph and using the relative block frequencies of the callsites. The intended use of these counts is in inlining to determine hot/cold callsites in the absence of profile information.
The pass is split into two files with the code that propagates the counts in a callgraph in a Utils file. I plan to add support for propagation in the thinlto link phase and the propagation code will be shared and hence this split. I did not add support to the old PM since hot callsite determination in inlining is not possible in old PM (although we could use hot callee heuristic with synthetic counts in the old PM it is not worth the effort tuning it)
Reviewers: davidxl, silvas
Subscribers: mgorny, mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D41604
llvm-svn: 322110
show more ...
|