|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4 |
|
| #
a19fcc2b |
| 24-May-2022 |
Walter Erquinigo <[email protected]> |
[trace][intelpt] Support system-wide tracing [14] - Decode per cpu
This is the final functional patch to support intel pt decoding per cpu. It works by doing the following:
- First, all context switches are split by tid and sorted in order. This produces a list of continuous executions per thread per core.
- Then, all intel pt subtraces are split by PSB boundaries and assigned to individual thread continuous executions on the same core by doing simple TSC-based comparisons.
- With this, we have, per thread, a sorted list of continuous executions, each one with a list of intel pt subtraces. Up to this point, this is really fast because no instructions were actually decoded.
- Then, each thread can be decoded by traversing its continuous executions and intel pt subtraces. An advantage of having these continuous executions is that we can identify if a continuous execution doesn't have intel pt data, and thus has a gap in it. We can later do more sophisticated comparisons to identify if within a continuous execution there are gaps.
I'm adding a test as well.
Differential Revision: https://reviews.llvm.org/D126394
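The first two steps described in this commit message can be illustrated with a rough sketch. This is not the code from the patch: every type and function name below (`ContextSwitch`, `PsbBlock`, `ContinuousExecution`, `BuildExecutions`, `AssignPsbBlocks`) is hypothetical, and the pairing rule (a switch-in followed by a switch-out of the same tid on the same cpu) is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical types for illustration; they are not LLDB's actual classes.
struct ContextSwitch {
  uint64_t tsc;   // timestamp of the context switch
  int cpu;        // core on which it happened
  int tid;        // thread involved
  bool switch_in; // true: thread starts running, false: thread stops
};

struct PsbBlock {
  uint64_t tsc; // timestamp associated with the PSB-delimited subtrace
  int cpu;      // core whose trace buffer contains it
};

struct ContinuousExecution {
  int cpu;
  uint64_t start_tsc;
  uint64_t end_tsc;
  std::vector<PsbBlock> psb_blocks; // empty means a gap in the trace
};

// Step 1: sort the context switches and, per tid, pair every switch-in with
// the following switch-out on the same cpu.
std::map<int, std::vector<ContinuousExecution>>
BuildExecutions(std::vector<ContextSwitch> switches) {
  std::stable_sort(switches.begin(), switches.end(),
                   [](const ContextSwitch &a, const ContextSwitch &b) {
                     return a.tsc < b.tsc;
                   });
  std::map<int, std::vector<ContinuousExecution>> per_thread;
  std::map<int, ContextSwitch> pending; // tid -> unmatched switch-in
  for (const ContextSwitch &cs : switches) {
    if (cs.switch_in) {
      pending[cs.tid] = cs;
      continue;
    }
    auto it = pending.find(cs.tid);
    if (it != pending.end() && it->second.cpu == cs.cpu) {
      per_thread[cs.tid].push_back({cs.cpu, it->second.tsc, cs.tsc, {}});
      pending.erase(it);
    }
  }
  return per_thread;
}

// Step 2: attach each PSB block to the execution on the same cpu whose
// [start_tsc, end_tsc) interval contains the block's TSC. No instruction
// decoding is needed for this step.
void AssignPsbBlocks(std::map<int, std::vector<ContinuousExecution>> &per_thread,
                     const std::vector<PsbBlock> &blocks) {
  for (auto &entry : per_thread)
    for (ContinuousExecution &exec : entry.second)
      for (const PsbBlock &block : blocks)
        if (block.cpu == exec.cpu && block.tsc >= exec.start_tsc &&
            block.tsc < exec.end_tsc)
          exec.psb_blocks.push_back(block);
}
```

In this sketch, an execution whose `psb_blocks` list stays empty corresponds to the "gap" case mentioned above.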
|
| #
b97b082c |
| 15-Jun-2022 |
Walter Erquinigo <[email protected]> |
Fix failures
https://lab.llvm.org/buildbot/#/builders/17/builds/23269 breaks because we are doing some asm calls that only work on x86
https://lab.llvm.org/buildbot/#/builders/68/builds/34092/steps/6/logs/stdio breaks because some comparators were being done incorrectly.
|
| #
fc5ef57c |
| 19-May-2022 |
Walter Erquinigo <[email protected]> |
[trace][intelpt] Support system-wide tracing [12] - Support multi-core trace load and save
This diff is massive, but it's because it connects the client with lldb-server and also ensures that the postmortem case works.
- Flatten the postmortem trace schema. The reason is that the schema has become quite complex due to the new multicore case, which defeats the original purpose of having a schema that could work for every trace plug-in. At this point, it's better that each trace plug-in defines its own full schema. This means that the only common field is "type".
  -- Because of this new approach, I merged the "common" trace load and saving functionalities into the IntelPT one. This simplified the code quite a bit. If we eventually implement another trace plug-in, we can see then what we could reuse.
  -- The new schema, which is flattened, now has better comments and is parsed better. A change I made was to disallow hex addresses, because they are a bit error prone. I'm now asking for addresses to be printed in decimal.
  -- Renamed "intel" to "GenuineIntel" in the schema because that's what you see in /proc/cpuinfo.
- Implemented reading the context switch trace data buffer. I had to do some refactors to do that cleanly.
  -- A major change that I made here was to simplify the perf_event circular buffer reading logic. It was too complex. Maybe the original Intel author had something different in mind.
- Implemented all the necessary bits to read trace.json files with per-core data.
- Implemented all the necessary bits to save per-core trace sessions to disk.
- Added a test that ensures that parsing and saving to disk works.
Differential Revision: https://reviews.llvm.org/D126015
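As a rough illustration of what reading a circular buffer involves, here is a minimal sketch. It is not the logic from this patch: `ReadCircularBuffer` is a hypothetical helper, and it assumes that `head` is a monotonically increasing count of bytes written (the way the kernel documents `data_head` for perf_event ring buffers), so the oldest surviving byte sits at `head % size` once the buffer has wrapped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the buffer contents in chronological order.
// Assumption: `head` is the total number of bytes ever written to `ring`.
std::vector<uint8_t> ReadCircularBuffer(const std::vector<uint8_t> &ring,
                                        uint64_t head) {
  std::vector<uint8_t> out;
  const size_t size = ring.size();
  if (size == 0)
    return out;
  if (head < size) {
    // The buffer never wrapped: only the first `head` bytes are valid.
    out.assign(ring.begin(),
               ring.begin() + static_cast<std::ptrdiff_t>(head));
    return out;
  }
  // The buffer wrapped: the oldest data starts at the write position.
  const std::ptrdiff_t start = static_cast<std::ptrdiff_t>(head % size);
  out.insert(out.end(), ring.begin() + start, ring.end());
  out.insert(out.end(), ring.begin(), ring.begin() + start);
  return out;
}
```

Returning a linearized copy keeps the caller free of wrap-around arithmetic, at the cost of one extra copy of the data.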
|
| #
1f2d49a8 |
| 18-May-2022 |
Walter Erquinigo <[email protected]> |
[trace][intelpt] Support system-wide tracing [10] - Return warnings and tsc information from lldb-server.
- Add a warnings field in the jLLDBGetState response, for warnings to be delivered to the client for troubleshooting. This removes the need to silently log lldb-server's llvm::Errors without exposing them easily to the user.
- Simplify the tscPerfZeroConversion struct and schema. It used to extend a base abstract class, but I doubt that we'll ever add other conversion mechanisms because all modern kernels support perf zero. It is also the one that is supposed to work with the timestamps produced by the context switch trace, so expecting it is imperative.
- Force tsc collection for cpu tracing.
- Add a test checking that tscPerfZeroConversion is returned by the GetState request.
- Add a pre-check for cpu tracing that makes sure that perf zero values are available.
Differential Revision: https://reviews.llvm.org/D125932
|
| #
7b73de9e |
| 29-Apr-2022 |
Walter Erquinigo <[email protected]> |
[trace][intelpt] Support system-wide tracing [3] - Refactor IntelPTThreadTrace
I'm refactoring IntelPTThreadTrace into IntelPTSingleBufferTrace so that it can trace both single threads and single cores. In this diff I'm basically renaming the class, moving it to its own file, and removing all the pieces that are not used along with some basic cleanup.
Differential Revision: https://reviews.llvm.org/D124648
|
|
Revision tags: llvmorg-14.0.3 |
|
| #
5de0a3e9 |
| 27-Apr-2022 |
Walter Erquinigo <[email protected]> |
[trace][intelpt] Support system-wide tracing [1] - Add a method for accessing the list of logical core ids
In order to open perf events per core, we need to first get the list of core ids available in the system. So I'm adding a function that does that by parsing /proc/cpuinfo. That seems to be the simplest and most portable way to do that.
Besides that, I made a few refactors and renames to reflect better that the cpu info that we use in lldb-server comes from procfs.
Differential Revision: https://reviews.llvm.org/D124573
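A minimal sketch of the idea, assuming the usual Linux x86 layout in which each logical core is introduced by a line of the form `processor : N`. The function name `ParseLogicalCoreIds` is hypothetical and is not the function added by this patch; it takes the file contents as a string so it can be exercised without reading /proc.

```cpp
#include <cassert>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: extracts the ids from "processor : N" lines of a
// /proc/cpuinfo-style text. Malformed lines are skipped.
std::vector<int> ParseLogicalCoreIds(const std::string &cpuinfo) {
  std::vector<int> ids;
  std::istringstream stream(cpuinfo);
  std::string line;
  while (std::getline(stream, line)) {
    // Only lines whose key is "processor" carry a logical core id.
    if (line.compare(0, 9, "processor") != 0)
      continue;
    const std::string::size_type colon = line.find(':');
    if (colon == std::string::npos)
      continue;
    try {
      // std::stoi skips the whitespace that follows the colon.
      ids.push_back(std::stoi(line.substr(colon + 1)));
    } catch (const std::exception &) {
      // Not a number: ignore the line.
    }
  }
  return ids;
}
```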
|
|
Revision tags: llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
9b79187c |
| 22-Mar-2022 |
Jakob Johnson <[email protected]> |
[trace][intelpt] Server side changes for TSC to wall time conversion
Update the response schema of the TraceGetState packet and add an Intel PT specific response structure that contains the TSC conversion, if it exists. The IntelPTCollector loads the TSC conversion and caches it to prevent unnecessary calls to perf_event_open. Move the TSC conversion calculation from Perf.h to TraceIntelPTGDBRemotePackets.h to remove the dependency on Linux specific headers.
Differential Revision: https://reviews.llvm.org/D122246
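For context, the conversion formula documented in the perf_event_open man page for the `time_mult`, `time_shift` and `time_zero` fields of `perf_event_mmap_page` can be sketched as follows. The struct name `PerfTscConversion` and its method are hypothetical stand-ins, not the structure defined in this patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical struct; the fields mirror time_mult, time_shift and time_zero
// from perf_event_mmap_page.
struct PerfTscConversion {
  uint32_t time_mult;
  uint16_t time_shift;
  uint64_t time_zero;

  // Converts a raw TSC value to nanoseconds in the perf clock domain.
  // The value is split into quotient and remainder so that the
  // multiplication is less likely to overflow 64 bits.
  uint64_t ToNanos(uint64_t tsc) const {
    const uint64_t quot = tsc >> time_shift;
    const uint64_t rem = tsc & ((uint64_t{1} << time_shift) - 1);
    return time_zero + quot * time_mult + ((rem * time_mult) >> time_shift);
  }
};
```

Because the three parameters are plain integers, a struct like this has no dependency on Linux headers, which fits the motivation given above for moving the calculation out of Perf.h.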
|
| #
45d9aab7 |
| 21-Mar-2022 |
Jakob Johnson <[email protected]> |
Fix e6c84f82b87576a57d1fa1c7e8c289d3d4fa7ab1
Failed buildbot: https://lab.llvm.org/buildbot/#/builders/17/builds/19490
Only run perf event tsc conversion test on x86_64.
|
| #
d1375285 |
| 21-Mar-2022 |
Jakob Johnson <[email protected]> |
Fix e6c84f82b87576a57d1fa1c7e8c289d3d4fa7ab1
Failed buildbot: https://lab.llvm.org/buildbot/#/builders/68/builds/29250
Use toString() to consume the Error
|
| #
e6c84f82 |
| 15-Mar-2022 |
Jakob Johnson <[email protected]> |
Add thin wrapper for perf_event_open API
- Add PerfEvent class to handle creating ring buffers and handle the resources associated with a perf_event
- Refactor IntelPT collection code to use this new API
- Add TSC to timestamp conversion logic with unittest
Differential Revision: https://reviews.llvm.org/D121734
|