| #
7aaf024d |
| 01-Feb-2022 |
Fangrui Song <[email protected]> |
[BitcodeWriter] Fix cases of some functions
`WriteIndexToFile` is used by external projects so I do not touch it.
|
| #
85dfe19b |
| 01-Feb-2022 |
Fangrui Song <[email protected]> |
[ModuleUtils] Move EmbedBufferInModule to LLVMTransformsUtils
D116542 adds EmbedBufferInModule which introduces a layer violation (https://llvm.org/docs/CodingStandards.html#library-layering). See 2
[ModuleUtils] Move EmbedBufferInModule to LLVMTransformsUtils
D116542 adds EmbedBufferInModule which introduces a layer violation (https://llvm.org/docs/CodingStandards.html#library-layering). See 2d5f857a1eaf5f7a806d12953c79b96ed8952da8 for detail.
EmbedBufferInModule does not use BitcodeWriter functionality and should be moved LLVMTransformsUtils. While here, change the function case to the prevailing convention.
It seems that EmbedBufferInModule just follows the steps of EmbedBitcodeInModule. EmbedBitcodeInModule calls WriteBitcodeToFile but has IR update operations which ideally should be refactored to another library.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D118666
show more ...
|
| #
4a780aa1 |
| 31-Jan-2022 |
Joseph Huber <[email protected]> |
[LLVM] Resolve layer violation in BitcodeWriter
Summary: The changes introduced in D116542 added a dependency on TransformUtils to use the `appendToCompilerUsed` method. This created a circular depe
[LLVM] Resolve layer violation in BitcodeWriter
Summary: The changes introduced in D116542 added a dependency on TransformUtils to use the `appendToCompilerUsed` method. This created a circular dependency. This patch simply copies the needed function locally to remove the dependency.
show more ...
|
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
551b1774 |
| 03-Dec-2021 |
Joseph Huber <[email protected]> |
[OpenMP] Add a flag for embedding a file into the module
This patch adds support for a flag `-fembed-offload-binary` to embed a file as an ELF section in the output by placing it in a global variabl
[OpenMP] Add a flag for embedding a file into the module
This patch adds support for a flag `-fembed-offload-binary` to embed a file as an ELF section in the output by placing it in a global variable. This can be used to bundle offloading files with the host binary so it can be accessed by the linker. The section is named using the `-fembed-offload-section` option.
Depends on D116541
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D116542
show more ...
|
| #
28bfa57a |
| 18-Jan-2022 |
Chih-Ping Chen <[email protected]> |
[DebugInfo] Add stringLocationExp field to DIStringType
DIStringType is used to encode the debug info of a character object in Fortran. A Fortran deferred-length character object is typically implem
[DebugInfo] Add stringLocationExp field to DIStringType
DIStringType is used to encode the debug info of a character object in Fortran. A Fortran deferred-length character object is typically implemented as a pair of the following two pieces of info: An address of the raw storage of the characters, and the length of the object. The stringLocationExp field contains the DIExpression to get to the raw storage.
This patch also enables the emission of DW_AT_data_location attribute in a DW_TAG_string_type debug info entry based on stringLocationExp in DIStringType.
A test is also added to ensure that the bitcode reader is backward compatible with the old DIStringType format.
Differential Revision: https://reviews.llvm.org/D117586
show more ...
|
| #
aa97bc11 |
| 21-Jan-2022 |
Nikita Popov <[email protected]> |
[NFC] Remove uses of PointerType::getElementType()
Instead use either Type::getPointerElementType() or Type::getNonOpaquePointerElementType().
This is part of D117885, in preparation for deprecatin
[NFC] Remove uses of PointerType::getElementType()
Instead use either Type::getPointerElementType() or Type::getNonOpaquePointerElementType().
This is part of D117885, in preparation for deprecating the API.
show more ...
|
| #
62b16825 |
| 30-Dec-2021 |
Roman Lebedev <[email protected]> |
[Opaqueptrs][IR Serialization] Improve inlineasm [de]serialization
The bitcode reader expected that the pointers are typed, so that it can extract the function type for the assembly so `bitc::CST_CO
[Opaqueptrs][IR Serialization] Improve inlineasm [de]serialization
The bitcode reader expected that the pointers are typed, so that it can extract the function type for the assembly so `bitc::CST_CODE_INLINEASM` did not explicitly store said function type.
I'm not really sure how the upgrade path will look for existing bitcode, but i think we can easily support opaque pointers going forward, by simply storing the function type.
Reviewed By: #opaque-pointers, nikic
Differential Revision: https://reviews.llvm.org/D116341
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2 |
|
| #
5dc8aaac |
| 10-Aug-2021 |
Sami Tolvanen <[email protected]> |
[llvm][IR] Add no_cfi constant
With Control-Flow Integrity (CFI), the LowerTypeTests pass replaces function references with CFI jump table references, which is a problem for low-level code that need
[llvm][IR] Add no_cfi constant
With Control-Flow Integrity (CFI), the LowerTypeTests pass replaces function references with CFI jump table references, which is a problem for low-level code that needs the address of the actual function body.
For example, in the Linux kernel, the code that sets up interrupt handlers needs to take the address of the interrupt handler function instead of the CFI jump table, as the jump table may not even be mapped into memory when an interrupt is triggered.
This change adds the no_cfi constant type, which wraps function references in a value that LowerTypeTestsModule::replaceCfiUses does not replace.
Link: https://github.com/ClangBuiltLinux/linux/issues/1353
Reviewed By: nickdesaulniers, pcc
Differential Revision: https://reviews.llvm.org/D108478
show more ...
|
| #
09a704c5 |
| 10-Dec-2021 |
Mingming Liu <[email protected]> |
[LTO] Ignore unreachable virtual functions in WPD in hybrid LTO.
Differential Revision: https://reviews.llvm.org/D115492
|
| #
ca2f5389 |
| 04-Dec-2021 |
Kazu Hirata <[email protected]> |
[CodeGen] Use range-based for loops (NFC)
|
| #
f6bce30c |
| 21-Nov-2021 |
Kazu Hirata <[email protected]> |
[llvm] Use range-based for loops (NFC)
|
| #
7ca14f60 |
| 18-Nov-2021 |
Kazu Hirata <[email protected]> |
[llvm] Use range-based for loops (NFC)
|
| #
05963a3d |
| 09-Nov-2021 |
Arthur Eubanks <[email protected]> |
Revert "[DebugInfo] Enforce implicit constraints on `distinct` MDNodes"
This reverts commit ee7652569854af567ba83e5255d70e80cc8619a1.
Causes crashes, see comments in D104827.
|
| #
ee765256 |
| 04-Nov-2021 |
Scott Linder <[email protected]> |
[DebugInfo] Enforce implicit constraints on `distinct` MDNodes
Add UNIQUED and DISTINCT properties in Metadata.def and use them to implement restrictions on the `distinct` property of MDNodes:
* DI
[DebugInfo] Enforce implicit constraints on `distinct` MDNodes
Add UNIQUED and DISTINCT properties in Metadata.def and use them to implement restrictions on the `distinct` property of MDNodes:
* DIExpression can currently be parsed from IR or read from bitcode as `distinct`, but this property is silently dropped when printing to IR. This causes accepted IR to fail to round-trip. As DIExpression appears inline at each use in the canonical form of IR, it cannot actually be `distinct` anyway, as there is no syntax to describe it. * Similarly, DIArgList is conceptually always uniqued. It is currently restricted to only appearing in contexts where there is no syntax for `distinct`, but for consistency it is treated equivalently to DIExpression in this patch. * DICompileUnit is already restricted to always being `distinct`, but along with adding general support for the inverse restriction I went ahead and described this in Metadata.def and updated the parser to be general. Future nodes which have this restriction can share this support.
The new UNIQUED property applies to DIExpression and DIArgList, and forbids them to be `distinct`. It also implies they are canonically printed inline at each use, rather than via MDNode ID.
The new DISTINCT property applies to DICompileUnit, and requires it to be `distinct`.
A potential alternative change is to forbid the non-inline syntax for DIExpression entirely, as is done with DIArgList implicitly by requiring it appear in the context of a function. For example, we would forbid:
!named = !{!0} !0 = !DIExpression()
Instead we would only accept the equivalent inlined version:
!named = !{!DIExpression()}
This essentially removes the ability to create a `distinct` DIExpression by construction, as there is no syntax for `distinct` inline. If this patch is accepted as-is, the result would be that the non-canonical version is accepted, but the following would be an error and produce a diagnostic:
!named = !{!0} ; error: 'distinct' not allowed for !DIExpression() !0 = distinct !DIExpression()
Also update some documentation to consistently use the inline syntax for DIExpression, and to describe the restrictions on `distinct` for nodes where applicable.
Reviewed By: StephenTozer, t-tye
Differential Revision: https://reviews.llvm.org/D104827
show more ...
|
| #
89b57061 |
| 08-Oct-2021 |
Reid Kleckner <[email protected]> |
Move TargetRegistry.(h|cpp) from Support to MC
This moves the registry higher in the LLVM library dependency stack. Every client of the target registry needs to link against MC anyway to actually us
Move TargetRegistry.(h|cpp) from Support to MC
This moves the registry higher in the LLVM library dependency stack. Every client of the target registry needs to link against MC anyway to actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
show more ...
|
| #
40ec1c0f |
| 07-Oct-2021 |
Itay Bookstein <[email protected]> |
[IR][NFC] Rename getBaseObject to getAliaseeObject
To better reflect the meaning of the now-disambiguated {GlobalValue, GlobalAlias}::getBaseObject after breaking off GlobalIFunc::getResolverFunctio
[IR][NFC] Rename getBaseObject to getAliaseeObject
To better reflect the meaning of the now-disambiguated {GlobalValue, GlobalAlias}::getBaseObject after breaking off GlobalIFunc::getResolverFunction (D109792), the function is renamed to getAliaseeObject.
show more ...
|
| #
05392466 |
| 24-Sep-2021 |
Arthur Eubanks <[email protected]> |
Reland [IR] Increase max alignment to 4GB
Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bi
Reland [IR] Increase max alignment to 4GB
Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
show more ...
|
| #
569346f2 |
| 06-Oct-2021 |
Arthur Eubanks <[email protected]> |
Revert "Reland [IR] Increase max alignment to 4GB"
This reverts commit 8d64314ffea55f2ad94c1b489586daa8ce30f451.
|
| #
8d64314f |
| 24-Sep-2021 |
Arthur Eubanks <[email protected]> |
Reland [IR] Increase max alignment to 4GB
Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bi
Reland [IR] Increase max alignment to 4GB
Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
show more ...
|
| #
72cf8b60 |
| 06-Oct-2021 |
Arthur Eubanks <[email protected]> |
Revert "[IR] Increase max alignment to 4GB"
This reverts commit df84c1fe78130a86445d57563dea742e1b85156a.
Breaks some bots
|
| #
df84c1fe |
| 24-Sep-2021 |
Arthur Eubanks <[email protected]> |
[IR] Increase max alignment to 4GB
Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are
[IR] Increase max alignment to 4GB
Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
show more ...
|
| #
3081de8c |
| 05-Oct-2021 |
Kazu Hirata <[email protected]> |
[llvm] Migrate from getNumArgOperands to arg_size (NFC)
Note that getNumArgOperands is considered a legacy name. See llvm/include/llvm/IR/InstrTypes.h for details.
|
| #
a7b4ce9c |
| 30-Sep-2021 |
Arthur Eubanks <[email protected]> |
[NFC][AttributeList] Replace index_begin/end with an iterator
We expose the fact that we rely on unsigned wrapping to iterate through all indexes. This can be confusing. Rather, keeping it as an imp
[NFC][AttributeList] Replace index_begin/end with an iterator
We expose the fact that we rely on unsigned wrapping to iterate through all indexes. This can be confusing. Rather, keeping it as an implementation detail through an iterator is less confusing and is less code.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D110885
show more ...
|
| #
20faf789 |
| 27-Sep-2021 |
modimo <[email protected]> |
[ThinLTO] Add noRecurse and noUnwind thinlink function attribute propagation
Thinlink provides an opportunity to propagate function attributes across modules, enabling additional propagation opportu
[ThinLTO] Add noRecurse and noUnwind thinlink function attribute propagation
Thinlink provides an opportunity to propagate function attributes across modules, enabling additional propagation opportunities.
This change propagates (currently default off, turn on with `disable-thinlto-funcattrs=1`) noRecurse and noUnwind based off of function summaries of the prevailing functions in bottom-up call-graph order. Testing on clang self-build: 1. There's a 35-40% increase in noUnwind functions due to the additional propagation opportunities. 2. Throughput is measured at 10-15% increase in thinlink time which itself is 1.5% of E2E link time.
Implementation-wise this adds the following summary function attributes: 1. noUnwind: function is noUnwind 2. mayThrow: function contains a non-call instruction that `Instruction::mayThrow` returns true on (e.g. windows SEH instructions) 3. hasUnknownCall: function contains calls that don't make it into the summary call-graph thus should not be propagated from (e.g. indirect for now, could add no-opt functions as well)
Testing: Clang self-build passes and 2nd stage build passes check-all ninja check-all with newly added tests passing
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D36850
show more ...
|
| #
96cb97c4 |
| 24-Sep-2021 |
Teresa Johnson <[email protected]> |
[ThinLTO] Update combined index for SamplePGO indirect calls to locals
In ThinLTO for locals we normally compute the GUID from the name after prepending the source path to get a unique global id. Sa
[ThinLTO] Update combined index for SamplePGO indirect calls to locals
In ThinLTO for locals we normally compute the GUID from the name after prepending the source path to get a unique global id. SamplePGO indirect call profiles contain the target GUID without this uniquification, however (unless compiling with -funique-internal-linkage-names).
In order to correctly handle the call edges added to the combined index for these indirect calls, during importing and bitcode writing we consult a map of original to full GUID to identify the actual callee. However, for a large application this was consuming a lot of compile time as we need to do this repeatedly (especially during importing where we may traverse call edges multiple times).
To fix this implement a suggestion in one of the FIXME comments, and actually modify the call edges during a single traversal after the index is built to perform the fixups once. I combined this fixup with the dead code analysis performed on the index in order to avoid adding an additional walk of the index. The dead code analysis is the first analysis performed on the index.
This reduced the time required for a large thin link with SamplePGO by about 20%.
No new test added, but I confirmed that there are existing tests that will fail when no fixup is performed.
Differential Revision: https://reviews.llvm.org/D110374
show more ...
|