|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1 |
| #
09dcfe68 |
| 09-Oct-2019 |
Wei Mi <[email protected]> |
[SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format
Currently for Text, Binary and ExtBinary format profiles, when we compile a module with samplefdo,
[SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format
Currently for Text, Binary and ExtBinary format profiles, when we compile a module with samplefdo, even if there is no function showing up in the profile, we have to load all the function profiles from the profile input. That is a waste of compile time.
CompactBinary format profile has already had the support of loading function profiles on demand. In this patch, we add the support to load profile on demand for ExtBinary format. It will work no matter the sections in ExtBinary format profile are compressed or not. Experiment shows it reduces the time to compile a server benchmark by 30%.
When profile remapping and loading function profiles on demand are both used, extra work needs to be done so that the loading on demand process will take the name remapping into consideration. It will be addressed in a follow-up patch.
Differential Revision: https://reviews.llvm.org/D68601
llvm-svn: 374233
show more ...
|
| #
5f7e822d |
| 18-Sep-2019 |
Wei Mi <[email protected]> |
[SampleFDO] Minimize performance impact when profile-sample-accurate is enabled.
We can save memory and reduce binary size significantly by enabling ProfileSampleAccurate. However when ProfileSample
[SampleFDO] Minimize performance impact when profile-sample-accurate is enabled.
We can save memory and reduce binary size significantly by enabling ProfileSampleAccurate. However when ProfileSampleAccurate is true, function without sample will be regarded as cold and this could potentially cause performance regression.
To minimize the potential negative performance impact, we want to be a little conservative here saying if a function shows up in the profile, no matter as outline instance, inline instance or call targets, treat the function as not being cold. This will handle the cases such as most callsites of a function are inlined in sampled binary (thus outline copy don't get any sample) but not inlined in current build (because of source code drift, imprecise debug information, or the callsites are all cold individually but not cold accumulatively...), so that the outline function showing up as cold in sampled binary will actually not be cold after current build. After the change, such function will be treated as not cold even profile-sample-accurate is enabled.
At the same time we lower the hot criteria of callsiteIsHot check when profile-sample-accurate is enabled. callsiteIsHot is used to determined whether a callsite is hot and qualified for early inlining. When profile-sample-accurate is enabled, functions without profile will be regarded as cold and much less inlining will happen in CGSCC inlining pass, so we can worry less about size increase and be aggressive to allow more early inlining to happen for warm callsites and it is helpful for performance overall.
Differential Revision: https://reviews.llvm.org/D67561
llvm-svn: 372232
show more ...
|