|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6 |
|
| #
74fb903b |
| 05-Mar-2025 |
Thomas Falcon <[email protected]> |
perf script: Fix output type for dynamically allocated core PMU's
This patch was originally posted here:
https://lore.kernel.org/all/[email protected]/
I have rebased on top of Arnaldo's patch here:
https://lore.kernel.org/all/Z2XCi3PgstSrV0SE@x1/
The original commit message: " perf script output may show different fields on different core PMU's that exist on heterogeneous platforms. For example,
perf record -e "{cpu_core/mem-loads-aux/,cpu_core/event=0xcd,\
umask=0x01,ldlat=3,name=MEM_UOPS_RETIRED.LOAD_LATENCY/}:upp"\
-c10000 -W -d -a -- sleep 1
perf script:
chromium-browse 46572 [002] 544966.882384: 10000 cpu_core/MEM_UOPS_RETIRED.LOAD_LATENCY/: 7ffdf1391b0c 10268100142 \
|OP LOAD|LVL L1 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK N/A 5 7 0 7fad7c47425d [unknown] (/usr/lib64/libglib-2.0.so.0.8000.3)
perf record -e cpu_atom/event=0xd0,umask=0x05,ldlat=3,\
name=MEM_UOPS_RETIRED.LOAD_LATENCY/upp -c10000 -W -d -a -- sleep 1
perf script:
gnome-control-c 534224 [023] 544951.816227: 10000 cpu_atom/MEM_UOPS_RETIRED.LOAD_LATENCY/: 7f0aaaa0aae0 [unknown] (/usr/lib64/libglib-2.0.so.0.8000.3)
Some fields, such as data_src, are not included by default.
The cause is that while one PMU may be assigned a type such as PERF_TYPE_RAW, other core PMU's are dynamically allocated at boot time. If this value does not match an existing PERF_TYPE_X value, output_type(perf_event_attr.type) will return OUTPUT_TYPE_OTHER.
Instead search for a core PMU with a matching perf_event_attr type and, if one is found, return PERF_TYPE_RAW to match output of other core PMU's. "
Suggested-by: Kan Liang <[email protected]> Suggested-by: Ian Rogers <[email protected]> Signed-off-by: Thomas Falcon <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
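The fallback described in this commit message can be sketched roughly as follows. This is an illustrative sketch, not the code from the patch: the enum constants mirror the well-known `perf_type_id` numbering, while `struct core_pmu`, `is_core_pmu_type()`, `output_type_for()`, `OUT_TYPE_OTHER`, and the PMU type id `10` are hypothetical names and values chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the kernel's perf_type_id values. */
enum { TYPE_HARDWARE = 0, TYPE_SOFTWARE = 1, TYPE_TRACEPOINT = 2,
       TYPE_HW_CACHE = 3, TYPE_RAW = 4, TYPE_BREAKPOINT = 5, TYPE_MAX = 6 };

/* Hypothetical output class for attr types with no fixed meaning. */
enum { OUT_TYPE_OTHER = TYPE_MAX + 1 };

/* Hypothetical record of a core PMU and its (possibly dynamic) type id. */
struct core_pmu {
	const char *name;
	unsigned int type;
};

/* Return true if `type` belongs to one of the known core PMUs. */
static bool is_core_pmu_type(const struct core_pmu *pmus, size_t n,
			     unsigned int type)
{
	for (size_t i = 0; i < n; i++)
		if (pmus[i].type == type)
			return true;
	return false;
}

/* Map an attr type to an output class, treating every core PMU as RAW. */
static int output_type_for(const struct core_pmu *pmus, size_t n,
			   unsigned int type)
{
	if (type < TYPE_MAX)
		return (int)type;		/* fixed, well-known type */
	if (is_core_pmu_type(pmus, n, type))
		return TYPE_RAW;		/* dynamic core PMU: same fields as RAW */
	return OUT_TYPE_OTHER;			/* anything else keeps the old behavior */
}
```

With a table such as `{ "cpu_core", 4 }, { "cpu_atom", 10 }`, both PMUs resolve to the same output class, so both would get the same default fields.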
|
| #
1e66dcff |
| 04-Mar-2025 |
Leo Yan <[email protected]> |
perf script: Add not taken event for branch stack
The branch stack has an existing field for printing mispredict; extend the field to print events and add support for the not-taken event.
Reviewed-by: Ian Rogers <[email protected]> Reviewed-by: James Clark <[email protected]> Signed-off-by: Leo Yan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
|
| #
2b747a86 |
| 04-Mar-2025 |
Leo Yan <[email protected]> |
perf script: Make printing flags reliable
Add a check for the generated string of flags. Print out the raw number if the string generation fails.
Use the SAMPLE_FLAGS_STR_ALIGNED_SIZE macro to replace the value '21'.
Reviewed-by: Ian Rogers <[email protected]> Reviewed-by: James Clark <[email protected]> Signed-off-by: Leo Yan <[email protected]> Reviewed-by: Adrian Hunter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
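The check-then-fall-back pattern this commit describes might look like the sketch below. It is an illustration under assumptions: the flag bits, the names `flags_to_str()` and `format_flags()`, and the exact output format are hypothetical; only the column width of 21 comes from the commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Width named in the commit message; the macro name is shortened here. */
#define FLAGS_STR_ALIGNED_SIZE 21

/* Hypothetical flag bits, for illustration only. */
#define FLAG_BRANCH 0x1u
#define FLAG_CALL   0x2u

/* Render flags as text. Returns the length written, or -1 on failure. */
static int flags_to_str(unsigned int flags, char *buf, size_t sz)
{
	const char *name;
	int n;

	if (flags == FLAG_BRANCH)
		name = "jmp";
	else if (flags == (FLAG_BRANCH | FLAG_CALL))
		name = "call";
	else
		return -1;			/* unknown combination */

	n = snprintf(buf, sz, "%s", name);
	return (n < 0 || (size_t)n >= sz) ? -1 : n;
}

/* Format one column: the decoded string, or the raw number on failure. */
static int format_flags(unsigned int flags, char *out, size_t outsz)
{
	char str[FLAGS_STR_ALIGNED_SIZE];

	if (flags_to_str(flags, str, sizeof(str)) < 0)
		return snprintf(out, outsz, "  raw flags:0x%x ", flags);
	return snprintf(out, outsz, "  %-*s ", FLAGS_STR_ALIGNED_SIZE, str);
}
```

The point of the design is that a failed conversion never leaves the column empty or prints an uninitialized buffer; the reader still sees the numeric value.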
|
|
Revision tags: v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13 |
|
| #
dc6d2bc2 |
| 13-Jan-2025 |
Ian Rogers <[email protected]> |
perf sample: Make user_regs and intr_regs optional
The struct dump_regs contains 512 bytes of cache_regs, meaning the two values in perf_sample contribute 1088 bytes of its total 1384 bytes size. Initializing this much memory has a cost reported by Tavian Barnes <[email protected]> as about 2.5% when running `perf script --itrace=i0`: https://lore.kernel.org/lkml/d841b97b3ad2ca8bcab07e4293375fb7c32dfce7.1736618095.git.tavianator@tavianator.com/
Adrian Hunter <[email protected]> replied that the zero initialization was necessary and couldn't simply be removed.
This patch aims to strike a middle ground of still zeroing the perf_sample, but removing 79% of its size by making user_regs and intr_regs optional pointers to zalloc-ed memory. To support the allocation, accessors are created for user_regs and intr_regs. To support correct cleanup, perf_sample__init and perf_sample__exit functions are created and added throughout the code base.
Signed-off-by: Ian Rogers <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
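The lazily allocated register areas described above can be sketched as follows. This is a simplified illustration, not the perf implementation: `struct sample_sketch`, `struct regs_dump_sketch`, and the three functions are hypothetical stand-ins for perf_sample, its register dump, the accessors, and the init/exit pair; `calloc` stands in for zalloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for the large register dump: 64 * 8 = 512 bytes. */
struct regs_dump_sketch {
	uint64_t cache_regs[64];
};

/* The sample only carries pointers, so zeroing it stays cheap. */
struct sample_sketch {
	uint64_t ip;
	struct regs_dump_sketch *user_regs;	/* NULL until first requested */
	struct regs_dump_sketch *intr_regs;	/* NULL until first requested */
};

static void sample_init(struct sample_sketch *s)
{
	memset(s, 0, sizeof(*s));
}

/* Accessor: allocate zeroed storage on first use. May return NULL. */
static struct regs_dump_sketch *sample_user_regs(struct sample_sketch *s)
{
	if (!s->user_regs)
		s->user_regs = calloc(1, sizeof(*s->user_regs));
	return s->user_regs;
}

/* Release whatever the accessors allocated. */
static void sample_exit(struct sample_sketch *s)
{
	free(s->user_regs);
	free(s->intr_regs);
	s->user_regs = NULL;
	s->intr_regs = NULL;
}
```

Samples that never touch registers pay only for two NULL pointers, which is the trade-off the commit message describes.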
|
|
Revision tags: v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4 |
|
| #
efff5add |
| 20-Dec-2024 |
Arnaldo Carvalho de Melo <[email protected]> |
perf script: Cache the output type
Right now every time we need to figure out the type of an evsel for output purposes we do a quick sequence of ifs, but there are new cases where there is a need to do more complex iterations over multiple data structures, so allow for caching this operation in a hole of 'struct evsel'.
This should really be done in the evsel->priv area that 'perf script' sets up, but more work is needed to make sure that it is allocated when we need it; right now it is only used conditionally. Add some comments so that we move this to that 'perf script' specific area when the conditions are in place for that.
Acked-by: Thomas Falcon <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Link: https://lore.kernel.org/lkml/Z2XCi3PgstSrV0SE@x1 Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
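The caching idea can be illustrated with a small memoization sketch. Everything here is hypothetical (`struct evsel_sketch`, the `-1` sentinel, the `lookups` counter, and the placeholder classification); it only shows the compute-once, reuse-afterwards shape the message describes.

```c
#include <assert.h>

/* Hypothetical event selector with a spare slot for the cached result. */
struct evsel_sketch {
	unsigned int attr_type;
	int cached_output_type;		/* -1 means "not computed yet" */
};

static int lookups;			/* counts how often the slow path runs */

/* Placeholder for the potentially expensive classification. */
static int compute_output_type(unsigned int attr_type)
{
	lookups++;
	return attr_type < 6 ? (int)attr_type : 7;
}

/* Return the output type, computing it only on the first call. */
static int evsel_output_type(struct evsel_sketch *e)
{
	if (e->cached_output_type < 0)
		e->cached_output_type = compute_output_type(e->attr_type);
	return e->cached_output_type;
}
```

The sentinel must be a value the classification can never return, otherwise a legitimately computed result would be recomputed on every call.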
|
|
Revision tags: v6.13-rc3, v6.13-rc2, v6.13-rc1 |
|
| #
dc7be5e4 |
| 19-Nov-2024 |
Ian Rogers <[email protected]> |
perf script: Move perf_sample__sprintf_flags to trace-event-scripting.c
perf_sample__sprintf_flags is used in the python C code and so needs to be in the util library rather than a builtin.
Signed-off-by: Ian Rogers <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: Mark Rutland <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Howard Chu <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: James Clark <[email protected]> Cc: Ilya Leoshkevich <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Dapeng Mi <[email protected]> Cc: Kan Liang <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Veronika Molnarova <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
| #
1ff2ca39 |
| 19-Nov-2024 |
Ian Rogers <[email protected]> |
perf script: Move script_fetch_insn to trace-event-scripting.c
Add native_arch as a parameter to script_fetch_insn rather than relying on the builtin-script value that won't be initialized for the dlfilter and python Context use cases. Assume both of those cases are running natively.
Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Dapeng Mi <[email protected]> Cc: Howard Chu <[email protected]> Cc: Ilya Leoshkevich <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Veronika Molnarova <[email protected]> Cc: Weilin Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
| #
04051b4a |
| 19-Nov-2024 |
Ian Rogers <[email protected]> |
perf script: Move script_spec code to trace-event-scripting.c
The script_spec code is referenced in util/trace-event-scripting but the list was in builtin-script, accessed via a function that required a stub function in python.c. Move all the logic to trace-event-scripting, with lookup and foreach functions exposed for builtin-script's benefit.
Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Dapeng Mi <[email protected]> Cc: Howard Chu <[email protected]> Cc: Ilya Leoshkevich <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Veronika Molnarova <[email protected]> Cc: Weilin Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
| #
9557d156 |
| 19-Nov-2024 |
Ian Rogers <[email protected]> |
perf stat: Move stat_config into config.c
stat_config is accessed by config.c via helper functions, but declared in builtin-stat. Move to util/config.c so that stub functions aren't needed in python.c which doesn't link against the builtin files.
To avoid name conflicts change builtin-script to use the same stat_config as builtin-stat. Rename local variables in tests to avoid shadow declaration warnings.
Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Dapeng Mi <[email protected]> Cc: Howard Chu <[email protected]> Cc: Ilya Leoshkevich <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Veronika Molnarova <[email protected]> Cc: Weilin Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
| #
d927e30c |
| 19-Nov-2024 |
Ian Rogers <[email protected]> |
perf script: Move find_scripts to browser/scripts.c
The only use of find_scripts is in browser/scripts.c but the definition in builtin causes linking problems requiring a stub in python.c. Move the function to allow the stub to be removed.
Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Dapeng Mi <[email protected]> Cc: Howard Chu <[email protected]> Cc: Ilya Leoshkevich <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Veronika Molnarova <[email protected]> Cc: Weilin Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
| #
f76f94dc |
| 19-Nov-2024 |
Ian Rogers <[email protected]> |
perf script: Use openat for directory iteration
Rewrite the directory iteration to use openat so that large character arrays aren't needed. GCC warns about potential buffer overflows in these arrays when the code exists in a single C file.
Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Dapeng Mi <[email protected]> Cc: Howard Chu <[email protected]> Cc: Ilya Leoshkevich <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Veronika Molnarova <[email protected]> Cc: Weilin Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
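A minimal sketch of openat-based iteration, assuming a POSIX.1-2008 system. The helper `count_entries_at()` is hypothetical and is not the function from the patch; it only shows how a directory can be opened relative to a parent descriptor instead of by building a path in a fixed-size buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Count the entries (excluding "." and "..") of directory `name`,
 * resolved relative to `parent_fd`. Returns -1 on error.
 */
static int count_entries_at(int parent_fd, const char *name)
{
	int fd = openat(parent_fd, name, O_RDONLY | O_DIRECTORY);
	struct dirent *ent;
	DIR *dir;
	int count = 0;

	if (fd < 0)
		return -1;

	dir = fdopendir(fd);
	if (!dir) {
		close(fd);
		return -1;
	}

	while ((ent = readdir(dir)) != NULL) {
		if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
			continue;
		count++;
	}

	closedir(dir);	/* also closes fd */
	return count;
}
```

For nested directories, a caller could pass `dirfd(dir)` and `ent->d_name` to the same helper, so no concatenated path string is ever needed.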
|
| #
702c7a4a |
| 19-Nov-2024 |
Ian Rogers <[email protected]> |
perf script: Move scripting_max_stack out of builtin
scripting_max_stack is used in util code which is linked into the python module. Move the variable declaration to util/trace-event-scripting.c to avoid conditional compilation.
Signed-off-by: Ian Rogers <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Dapeng Mi <[email protected]> Cc: Howard Chu <[email protected]> Cc: Ilya Leoshkevich <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Veronika Molnarova <[email protected]> Cc: Weilin Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
| #
c46d634a |
| 18-Nov-2024 |
Ian Rogers <[email protected]> |
perf evsel: Add/use accessor for tp_format
Add an accessor function for tp_format. Rather than search and replace each use, try to use a variable and reuse it. Add additional NULL checks when accessing/using the value. Make sure the PTR_ERR is nulled out on the error path in evsel__newtp_idx.
Reviewed-by: Namhyung Kim <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Dominique Martinet <[email protected]> Cc: Ilkka Koskinen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steinar H. Gunderson <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: Thomas Falcon <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Yang Li <[email protected]> Cc: Ze Gao <[email protected]> Cc: Zixian Cai <[email protected]> Cc: zhaimingbing <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Revision tags: v6.12, v6.12-rc7 |
|
| #
35de42cd |
| 05-Nov-2024 |
Yicong Yang <[email protected]> |
perf build: Include libtraceevent headers directly indicated by pkg-config
Currently libtraceevent is found by pkg-config, which gives the include path as:
[root@localhost tmp]# pkg-config --cflags libtraceevent
-I/usr/local/include/traceevent
So we should include the libtraceevent headers directly without "traceevent/" prefix. Update all the users.
Fixes: 0f0e1f445690 ("perf build: Use pkg-config for feature check for libtrace{event,fs}") Suggested-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/linux-perf-users/[email protected]/ Signed-off-by: Yicong Yang <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
|
|
Revision tags: v6.12-rc6, v6.12-rc5 |
|
| #
edff8dad |
| 25-Oct-2024 |
Graham Woodward <[email protected]> |
perf arm-spe: Correctly set sample flags
Set flags on all synthesized instruction and branch samples.
Signed-off-by: Graham Woodward <[email protected]> Reviewed-by: James Clark <[email protected]> Tested-by: Leo Yan <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
|
|
Revision tags: v6.12-rc4 |
|
| #
37b77ae9 |
| 17-Oct-2024 |
Ian Rogers <[email protected]> |
perf stat: Change color to threshold in print_metric
Colors carry no meaning in CSV and JSON output, so switch to a threshold enum value that the standard output can convert to a color. The CSV and JSON output will be updated in later changes.
Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Yicong Yang <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Will Deacon <[email protected]> Cc: James Clark <[email protected]> Cc: Mike Leach <[email protected]> Cc: Leo Yan <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tim Chen <[email protected]> Cc: John Garry <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
|
|
Revision tags: v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11 |
|
| #
edf3ce0e |
| 09-Sep-2024 |
Kan Liang <[email protected]> |
perf env: Find correct branch counter info on hybrid
No event is printed in the "Branch Counter" column on hybrid machines.
For example,
$ perf record -e "{cpu_core/branch-instructions/pp,cpu_core/branches/}:S" -j any,counter
$ perf report --total-cycles
# Branch counter abbr list:
# cpu_core/branch-instructions/pp = A
# cpu_core/branches/ = B
# '-' No event occurs
# '+' Event occurrences may be lost due to branch counter saturated
#
# Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles  Branch Counter
# ...............  ..............  ...........  ..........  ..............
           44.54%          727.1K        0.00%           1  |+   |+   |
           36.31%          592.7K        0.00%           2  |+   |+   |
           17.83%          291.1K        0.00%           1  |+   |+   |
The branch counter information (br_cntr_width and br_cntr_nr) in the perf_env is retrieved from the CPU_PMU_CAPS. However, the CPU_PMU_CAPS is not available on hybrid machines. Without the width information, the number of occurrences of an event cannot be calculated.
For a hybrid machine, the caps information should be retrieved from the PMU_CAPS, and stored in the perf_env->pmu_caps.
Add a perf_env__find_br_cntr_info() to return the correct branch counter information from the corresponding fields.
Committer notes:
While testing I couldn't see those "Branch counter" columns enabled by pressing 'B' on the TUI. After I reported it to the list, Kan explained the situation:
<quote Kan Liang> For a hybrid client, the "Branch Counter" feature is only supported starting from the just released Lunar Lake. Perf falls back to only "ANY" on your Raptor Lake.
The "The branch counter is not available" message is expected.
Here is the 'perf evlist' result from my Lunar Lake machine,
# perf evlist -v
cpu_core/branch-instructions/pp: type: 4 (cpu_core), size: 136, config: 0xc4 (branch-instructions), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|READ|PERIOD|BRANCH_STACK|IDENTIFIER, read_format: ID|GROUP|LOST, disabled: 1, freq: 1, enable_on_exec: 1, precise_ip: 2, sample_id_all: 1, exclude_guest: 1, branch_sample_type: ANY|COUNTERS
#
</quote>
Fixes: 6f9d8d1de2c61288 ("perf script: Add branch counters") Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
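To see why the width information matters, consider the sketch below, which unpacks per-event occurrence counts from a single packed word. The packing layout (counter `idx` occupying bits `idx * width` upward) and the function names are assumptions made for illustration; the commit message only states that the count cannot be calculated without the width.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extract counter `idx` from a packed word in which each counter
 * occupies `width` bits (0 < width < 64). Layout is assumed.
 */
static unsigned int br_cntr_get(uint64_t packed, unsigned int idx,
				unsigned int width)
{
	uint64_t mask = (1ULL << width) - 1;

	return (unsigned int)((packed >> (idx * width)) & mask);
}

/* A counter at its maximum value may have lost occurrences ('+'). */
static int br_cntr_saturated(uint64_t packed, unsigned int idx,
			     unsigned int width)
{
	return br_cntr_get(packed, idx, width) == (1ULL << width) - 1;
}
```

With a width of 2, the word `0xE` (binary 1110) holds the value 2 in counter 0 and the saturated value 3 in counter 1. With a width of 0 or an unknown width, no such decoding is possible, which matches the empty column reported on hybrid machines.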
|
|
Revision tags: v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7 |
|
| #
bf0db8c7 |
| 29-Feb-2024 |
Andi Kleen <[email protected]> |
perf script: Minimize "not reaching sample" for '-F +brstackinsn'
In some situations 'perf script -F +brstackinsn' sees a lot of "not reaching sample" messages.
This happens when the last LBR block before the sample contains a branch that is not in the LBR, and the instruction dumping stops.
$ perf record -b emacs -Q --batch '()'
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.396 MB perf.data (443 samples) ]
$ perf script -F +brstackinsn
...
00007f0ab2d171a4 insn: 41 0f 94 c0
00007f0ab2d171a8 insn: 83 fa 01
00007f0ab2d171ab insn: 74 d3 # PRED 6 cycles [313] 1.00 IPC
00007f0ab2d17180 insn: 45 84 c0
00007f0ab2d17183 insn: 74 28
... not reaching sample ...
$ perf script -F +brstackinsn | grep -c reach
136
$
This is a problem for further analysis that wants to see the full code up to the sample.
There are two common cases where the message is bogus:
- The LBR only logs taken branches, but the branch might be a conditional branch that is not taken (that is the most common case actually)
- The LBR sampling uses a filter ignoring some branches, but the perf script check checks for all branches.
This patch fixes these two conditions, by only checking for conditional branches, as well as checking the perf_event_attr's branch filter attributes.
For the test case above it fixes all the messages:
$ ./perf script -F +brstackinsn | grep -c reach
0
Note that there are still conditions when the message is hit -- sometimes there can be an unconditional branch that misses the LBR update before the sample -- but they are much rarer now.
Signed-off-by: Andi Kleen <[email protected]> Reviewed-by: Adrian Hunter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
| #
6f9d8d1d |
| 13-Aug-2024 |
Kan Liang <[email protected]> |
perf script: Add branch counters
It's useful to print the branch counter information for each jump in the brstackinsn when it's available.
Add a new field 'brcntr' to display the branch counter information.
By default, the abbreviation will be used to indicate the branch counter. In the verbose mode, the real event name is shown.
$ perf script -F +brstackinsn,+brcntr
# Branch counter abbr list:
# branch-instructions:ppp = A
# branch-misses = B
# '-' No event occurs
# '+' Event occurrences may be lost due to branch counter saturated
tchain_edit 332203 3366329.405674: 53030 branch-instructions:ppp: 401781 f3+0x2c (home/sdp/test/tchain_edit)
f3+31:
0000000000401774 insn: eb 04 br_cntr: AA # PRED 5 cycles [5]
000000000040177a insn: 81 7d fc 0f 27 00 00
0000000000401781 insn: 7e e3 br_cntr: A # PRED 1 cycles [6] 2.00 IPC
0000000000401766 insn: 8b 45 fc
0000000000401769 insn: 83 e0 01
000000000040176c insn: 85 c0
000000000040176e insn: 74 06 br_cntr: A # PRED 1 cycles [7] 4.00 IPC
0000000000401776 insn: 83 45 fc 01
000000000040177a insn: 81 7d fc 0f 27 00 00
0000000000401781 insn: 7e e3 br_cntr: A # PRED 7 cycles [14] 0.43 IPC
$ perf script -F +brstackinsn,+brcntr -v
tchain_edit 332203 3366329.405674: 53030 branch-instructions:ppp: 401781 f3+0x2c (/home/sdp/os.linux.perf.test-suite/kernels/lbr_kernel/tchain_edit)
f3+31:
0000000000401774 insn: eb 04 br_cntr: branch-instructions:ppp 2 branch-misses 0 # PRED 5 cycles [5]
000000000040177a insn: 81 7d fc 0f 27 00 00
0000000000401781 insn: 7e e3 br_cntr: branch-instructions:ppp 1 branch-misses 0 # PRED 1 cycles [6] 2.00 IPC
0000000000401766 insn: 8b 45 fc
0000000000401769 insn: 83 e0 01
000000000040176c insn: 85 c0
000000000040176e insn: 74 06 br_cntr: branch-instructions:ppp 1 branch-misses 0 # PRED 1 cycles [7] 4.00 IPC
0000000000401776 insn: 83 45 fc 01
000000000040177a insn: 81 7d fc 0f 27 00 00
0000000000401781 insn: 7e e3 br_cntr: branch-instructions:ppp 1 branch-misses 0 # PRED 7 cycles [14] 0.43 IPC
Originally-by: Tinghao Zhang <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Signed-off-by: Kan Liang <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
| #
2fa28ccb |
| 12-Aug-2024 |
Ian Rogers <[email protected]> |
perf script: Use perf_tool__init()
Use perf_tool__init() so that more uses of 'struct perf_tool' can be const and do not rely on perf_tool__fill_defaults().
Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ilkka Koskinen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
| #
30f29bae |
| 12-Aug-2024 |
Ian Rogers <[email protected]> |
perf tool: Constify tool pointers
The tool pointer (to a struct largely of function pointers) is passed around but is unchanged except at initialization. Change parameter and variable types to be const to lower the possibilities of what could happen with a tool.
Reviewed-by: Adrian Hunter <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Tested-by: Adrian Hunter <[email protected]> Tested-by: Leo Yan <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ilkka Koskinen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
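As a rough illustration of what a const tool pointer buys, the sketch below defines a callback table once and passes it around read-only. All names (`struct demo_ops`, `demo_dispatch()`, and so on) are hypothetical and are not taken from the perf sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event and callback table, loosely modelled on a "tool". */
struct demo_event {
	int type;
	int payload;
};

struct demo_ops {
	int (*handle)(const struct demo_ops *ops, const struct demo_event *ev);
	int scale;
};

static int demo_scale_payload(const struct demo_ops *ops,
			      const struct demo_event *ev)
{
	return ops->scale * ev->payload;
}

/* Filled in once at definition time and never modified afterwards. */
static const struct demo_ops demo_default_ops = {
	.handle = demo_scale_payload,
	.scale = 2,
};

/*
 * The const qualifier documents, and lets the compiler enforce, that
 * dispatching an event cannot change the callbacks. An assignment such
 * as "ops->scale = 0;" inside this function would fail to compile.
 */
static int demo_dispatch(const struct demo_ops *ops,
			 const struct demo_event *ev)
{
	assert(ops != NULL && ops->handle != NULL);
	return ops->handle(ops, ev);
}

/* Usage: build an event and send it through the shared const table. */
static int demo_dispatch_payload(int payload)
{
	struct demo_event ev = { .type = 0, .payload = payload };

	return demo_dispatch(&demo_default_ops, &ev);
}
```

Because the table is const, a reader of any function that receives it can rule out the possibility that the callbacks were swapped along the way.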
|
| #
e6b56ae7 |
| 19-Jul-2024 |
Martin Liška <[email protected]> |
perf script: add --addr2line option
Similarly to other subcommands (like report, top), it would be handy to provide a path for addr2line command.
Signed-off-by: Martin Liska <[email protected]> Cc: Ian Rogers <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
| #
1a8c2e01 |
| 07-May-2024 |
Ian Rogers <[email protected]> |
perf mem-info: Add reference count checking
Add reference count checking and switch 'struct mem_info' usage to use accessor functions.
Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
| #
ad3003a6 |
| 07-May-2024 |
Ian Rogers <[email protected]> |
perf mem-info: Move mem-info out of mem-events and symbol
Move mem-info to its own header rather than having it split between mem-events and symbol.
Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
| #
ee756ef7 |
| 04-May-2024 |
Ian Rogers <[email protected]> |
perf dso: Add reference count checking and accessor functions
Add reference count checking to struct dso, this can help with implementing correct reference counting discipline. To avoid RC_CHK_ACCESS everywhere, add accessor functions for the variables in struct dso.
The majority of the change is mechanical in nature and not easy to split up.
Committer testing:
'perf test' up to this patch shows no regressions.
But:
  util/symbol.c: In function ‘dso__load_bfd_symbols’:
  util/symbol.c:1683:9: error: too few arguments to function ‘dso__set_adjust_symbols’
   1683 |         dso__set_adjust_symbols(dso);
        |         ^~~~~~~~~~~~~~~~~~~~~~~
  In file included from util/symbol.c:21:
  util/dso.h:268:20: note: declared here
    268 | static inline void dso__set_adjust_symbols(struct dso *dso, bool val)
        |                    ^~~~~~~~~~~~~~~~~~~~~~~
  make[6]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:106: /tmp/tmp.ZWHbQftdN6/util/symbol.o] Error 1
    MKDIR   /tmp/tmp.ZWHbQftdN6/tests/workloads/
  make[6]: *** Waiting for unfinished jobs....
This was updated:
  -     symbols__fixup_end(&dso->symbols, false);
  -     symbols__fixup_duplicate(&dso->symbols);
  -     dso->adjust_symbols = 1;
  +     symbols__fixup_end(dso__symbols(dso), false);
  +     symbols__fixup_duplicate(dso__symbols(dso));
  +     dso__set_adjust_symbols(dso);
But not build tested with BUILD_NONDISTRO and libbfd devel files installed (binutils-devel on fedora).
Add the missing argument:
        symbols__fixup_end(dso__symbols(dso), false);
        symbols__fixup_duplicate(dso__symbols(dso));
  -     dso__set_adjust_symbols(dso);
  +     dso__set_adjust_symbols(dso, true);
Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ahelenia Ziemiańska <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Changbin Du <[email protected]> Cc: Chengen Du <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Dima Kogan <[email protected]> Cc: Ilkka Koskinen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Yanteng Si <[email protected]> Cc: zhaimingbing <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
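The accessor pattern this commit describes can be sketched as follows. This is a simplified model under stated assumptions: `struct demo_dso` and the `demo_dso__*` functions are hypothetical, the counter is a plain non-atomic int, and perf's real checking layer behind `RC_CHK_ACCESS` is not reproduced. The point is only that callers go through get/put and accessor functions instead of touching members directly, and that the setter takes an explicit value, which is the argument the build error above reports as missing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical object; the real struct dso has many more members. */
struct demo_dso {
	int refcnt;          /* simplified: not atomic */
	bool adjust_symbols;
	const char *name;
};

/* Allocate an object that starts with a single reference. */
static struct demo_dso *demo_dso__new(const char *name)
{
	struct demo_dso *dso = calloc(1, sizeof(*dso));

	if (dso) {
		dso->refcnt = 1;
		dso->name = name;
	}
	return dso;
}

/* Take an extra reference; every get must be paired with a put. */
static struct demo_dso *demo_dso__get(struct demo_dso *dso)
{
	if (dso)
		dso->refcnt++;
	return dso;
}

/* Drop a reference. Returns true when this was the last one and the
 * object has been freed. */
static bool demo_dso__put(struct demo_dso *dso)
{
	if (!dso)
		return false;
	assert(dso->refcnt > 0);
	if (--dso->refcnt == 0) {
		free(dso);
		return true;
	}
	return false;
}

/* Accessors keep member access in one place, so a checking layer could
 * later be added here without touching every caller. */
static inline bool demo_dso__adjust_symbols(const struct demo_dso *dso)
{
	return dso->adjust_symbols;
}

static inline void demo_dso__set_adjust_symbols(struct demo_dso *dso, bool val)
{
	dso->adjust_symbols = val;
}
```

A caller would write `demo_dso__set_adjust_symbols(dso, true)` where older code assigned `dso->adjust_symbols = 1` directly.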
|