|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2 |
|
| #
ee8aef2d |
| 07-Feb-2025 |
Kan Liang <[email protected]> |
perf tools: Add skip check in tool_pmu__event_to_str()
Some topdown related metrics may fail on hybrid machines.
$ perf stat -M tma_frontend_bound Cannot resolve IDs for tma_frontend_bound: cpu_
perf tools: Add skip check in tool_pmu__event_to_str()
Some topdown related metrics may fail on hybrid machines.
$ perf stat -M tma_frontend_bound Cannot resolve IDs for tma_frontend_bound: cpu_atom@TOPDOWN_FE_BOUND.ALL@ / (8 * cpu_atom@CPU_CLK_UNHALTED.CORE@)
In the find_tool_events(), the tool_pmu__event_to_str() is used to compare the tool_events. It only checks the event name, no PMU or arch. So the tool_events[TOOL_PMU__EVENT_SLOTS] is set to true, because the p-core Topdown metrics has "slots" event. The tool_events is shared. So when parsing the e-core metrics, the "slots" is automatically added.
The "slots" event as a tool event should only be available on arm64. It has a different meaning on X86. The tool_pmu__skip_event() intends handle the case. Apply it for tool_pmu__event_to_str() as well.
There is a lack of sanity check in the expr__get_id(). Add the check.
Closes: https://lore.kernel.org/lkml/[email protected]/ Fixes: 069057239a67 ("perf tool_pmu: Move expr literals to tool_pmu") Signed-off-by: Kan Liang <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7 |
|
| #
1d18ebcf |
| 08-Nov-2024 |
Levi Yun <[email protected]> |
perf expr: Initialize is_test value in expr__ctx_new()
When expr_parse_ctx is allocated by expr_ctx_new(), expr_scanner_ctx->is_test isn't initialize, so it has garbage value. this can affects the r
perf expr: Initialize is_test value in expr__ctx_new()
When expr_parse_ctx is allocated by expr_ctx_new(), expr_scanner_ctx->is_test isn't initialize, so it has garbage value. this can affects the result of expr__parse() return when it parses non-exist event literal according to garbage value.
Use calloc instead of malloc in expr_ctx_new() to fix this.
Fixes: 3340a08354ac286e ("perf pmu-events: Fix testing with JEVENTS_ARCH=all") Reviewed-by: Ian Rogers <[email protected]> Reviewed-by: James Clark <[email protected]> Signed-off-by: Levi Yun <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
| #
494c403f |
| 07-Nov-2024 |
Ian Rogers <[email protected]> |
perf header: Pass a perf_cpu rather than a PMU to get_cpuid_str
On ARM the cpuid is dependent on the core type of the CPU in question. The PMU was passed for the sake of the CPU map but this means i
perf header: Pass a perf_cpu rather than a PMU to get_cpuid_str
On ARM the cpuid is dependent on the core type of the CPU in question. The PMU was passed for the sake of the CPU map but this means in places a temporary PMU is created just to pass a CPU value. Just pass the CPU and fix up the callers.
As there are no longer PMU users in header.h, shuffle forward declarations earlier to work around build failures.
Reviewed-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Tested-by: Xu Yang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Zong-You Xie <[email protected]> Cc: Benjamin Gray <[email protected]> Cc: Bibo Mao <[email protected]> Cc: Clément Le Goffic <[email protected]> Cc: Dima Kogan <[email protected]> Cc: Dr. David Alan Gilbert <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yicong Yang <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2 |
|
| #
06905723 |
| 02-Oct-2024 |
Ian Rogers <[email protected]> |
perf tool_pmu: Move expr literals to tool_pmu
Add the expr literals like "#smt_on" as tool events, this allows stat events to give the values. On my laptop with hyperthreading enabled:
``` $ perf s
perf tool_pmu: Move expr literals to tool_pmu
Add the expr literals like "#smt_on" as tool events, this allows stat events to give the values. On my laptop with hyperthreading enabled:
``` $ perf stat -e "has_pmem,num_cores,num_cpus,num_cpus_online,num_dies,num_packages,smt_on,system_tsc_freq" true
Performance counter stats for 'true':
0 has_pmem 8 num_cores 16 num_cpus 16 num_cpus_online 1 num_dies 1 num_packages 1 smt_on 2,496,000,000 system_tsc_freq
0.001113637 seconds time elapsed
0.001218000 seconds user 0.000000000 seconds sys ```
And with hyperthreading disabled: ``` $ perf stat -e "has_pmem,num_cores,num_cpus,num_cpus_online,num_dies,num_packages,smt_on,system_tsc_freq" true
Performance counter stats for 'true':
0 has_pmem 8 num_cores 16 num_cpus 8 num_cpus_online 1 num_dies 1 num_packages 0 smt_on 2,496,000,000 system_tsc_freq
0.000802115 seconds time elapsed
0.000000000 seconds user 0.000806000 seconds sys ```
As zero matters for these values, in stat-display should_skip_zero_counter only skip the zero value if it is not the first aggregation index.
The tool event implementations are used in expr but not evaluated as events for simplicity. Also core_wide isn't made a tool event as it requires command line parameters.
Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3 |
|
| #
9aa61d8e |
| 05-Jun-2024 |
Clément Le Goffic <[email protected]> |
perf: parse-events: Fix compilation error while defining DEBUG_PARSER
Compiling perf tool with 'DEBUG_PARSER=1' leads to errors:
$> make -C tools/perf PARSER_DEBUG=1 NO_LIBTRACEEVENT=1 ... CC
perf: parse-events: Fix compilation error while defining DEBUG_PARSER
Compiling perf tool with 'DEBUG_PARSER=1' leads to errors:
$> make -C tools/perf PARSER_DEBUG=1 NO_LIBTRACEEVENT=1 ... CC util/expr-flex.o CC util/expr.o util/parse-events.c:33:12: error: redundant redeclaration of ‘parse_events_debug’ [-Werror=redundant-decls] 33 | extern int parse_events_debug; | ^~~~~~~~~~~~~~~~~~ In file included from util/parse-events.c:18: util/parse-events-bison.h:43:12: note: previous declaration of ‘parse_events_debug’ with type ‘int’ 43 | extern int parse_events_debug; | ^~~~~~~~~~~~~~~~~~ util/expr.c:27:12: error: redundant redeclaration of ‘expr_debug’ [-Werror=redundant-decls] 27 | extern int expr_debug; | ^~~~~~~~~~ In file included from util/expr.c:11: util/expr-bison.h:43:12: note: previous declaration of ‘expr_debug’ with type ‘int’ 43 | extern int expr_debug; | ^~~~~~~~~~ cc-1: all warnings being treated as errors
Remove extern declaration from the parse-envents.c file as there is a conflict with the ones generated using bison and yacc tools from the file parse-events.[ly].
Signed-off-by: Clément Le Goffic <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: John Garry <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4 |
|
| #
6dd76680 |
| 09-Feb-2024 |
Ian Rogers <[email protected]> |
perf expr: Fix "has_event" function for metric style events
Events in metrics cannot use '/' as a separator, it would be recognized as a divide, so they use '@'. The '@' is recognized in the metricg
perf expr: Fix "has_event" function for metric style events
Events in metrics cannot use '/' as a separator, it would be recognized as a divide, so they use '@'. The '@' is recognized in the metricgroups code and changed to '/', do the same in the has_event function so that the parsing is only tried without the @s.
Fixes: 4a4a9bf9075f ("perf expr: Add has_event function") Signed-off-by: Ian Rogers <[email protected]> Reviewed-by: Kan Liang <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: James Clark <[email protected]> Cc: Kaige Ye <[email protected]> Cc: John Garry <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2 |
|
| #
3d0f5f45 |
| 13-Sep-2023 |
James Clark <[email protected]> |
perf pmu: Move pmu__find_core_pmu() to pmus.c
pmu__find_core_pmu() more logically belongs in pmus.c because it iterates over all PMUs, so move it to pmus.c
At the same time rename it to perf_pmus__
perf pmu: Move pmu__find_core_pmu() to pmus.c
pmu__find_core_pmu() more logically belongs in pmus.c because it iterates over all PMUs, so move it to pmus.c
At the same time rename it to perf_pmus__find_core_pmu() to match the naming convention in this file.
list_prepare_entry() can't be used in perf_pmus__scan_core() anymore now that it's called from the same compilation unit. This is with -O2 (specifically -O1 -ftree-vrp -finline-functions -finline-small-functions) which allow the bounds of the array access to be determined at compile time. list_prepare_entry() subtracts the offset of the 'list' member in struct perf_pmu from &core_pmus, which isn't a struct perf_pmu. The compiler sees that pmu results in &core_pmus - 8 and refuses to compile. At runtime this works because list_for_each_entry_continue() always adds the offset back again before dereferencing ->next, but it's technically undefined behavior. With -fsanitize=undefined an additional warning is generated.
Using list_first_entry_or_null() to get the first entry here avoids doing &core_pmus - 8 but has the same result and fixes both the compile warning and the undefined behavior warning. There are other uses of list_prepare_entry() in pmus.c, but the compiler doesn't seem to be able to see that they can also be called with &core_pmus, so I won't change any at this time.
Signed-off-by: James Clark <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Reviewed-by: John Garry <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Will Deacon <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Haixin Yu <[email protected]> Cc: Kan Liang <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
show more ...
|
|
Revision tags: v6.6-rc1 |
|
| #
f0005f17 |
| 30-Aug-2023 |
Ian Rogers <[email protected]> |
perf metric: Add #num_cpus_online literal
Returns the number of CPUs online, unlike #num_cpus that returns the number present.
Add a test of the property.
This will be used in future Intel metrics
perf metric: Add #num_cpus_online literal
Returns the number of CPUs online, unlike #num_cpus that returns the number present.
Add a test of the property.
This will be used in future Intel metrics.
Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v6.5, v6.5-rc7 |
|
| #
9d5da30e |
| 16-Aug-2023 |
James Clark <[email protected]> |
perf jevents: Add a new expression builtin strcmp_cpuid_str()
This will allow writing formulas that are conditional on a specific CPU type or CPU version. It calls through to the existing strcmp_cpu
perf jevents: Add a new expression builtin strcmp_cpuid_str()
This will allow writing formulas that are conditional on a specific CPU type or CPU version. It calls through to the existing strcmp_cpuid_str() function in Perf which has a default weak version, and an arch specific version for x86 and arm64.
The function takes an 'ID' type value, which is a string. But in this case Arm CPU IDs are hex numbers prefixed with '0x'. metric.py assumes strings are only used by event names, and that they can't start with a number ('0'), so an additional change has to be made to the regex to convert hex numbers back to 'ID' types.
Signed-off-by: James Clark <[email protected]> Reviewed-by: John Garry <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Haixin Yu <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nick Forrington <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: Sohom Datta <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v6.5-rc6, v6.5-rc5, v6.5-rc4 |
|
| #
c7e97f21 |
| 28-Jul-2023 |
Namhyung Kim <[email protected]> |
perf build: Include generated header files properly
The flex and bison generate header files from the source. When user specified a build directory with O= option, it'd generate files under the dir
perf build: Include generated header files properly
The flex and bison generate header files from the source. When user specified a build directory with O= option, it'd generate files under the directory. The build command has -I option to specify the header include directory.
But the -I option only affects the files included like <...>. Let's change the flex and bison headers to use it instead of "...".
Fixes: 80eeb67fe577aa76 ("perf jevents: Program to convert JSON file") Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Anup Sharma <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4 |
|
| #
4a4a9bf9 |
| 23-Jun-2023 |
Ian Rogers <[email protected]> |
perf expr: Add has_event function
Some events are dependent on firmware/kernel enablement. Allow such events to be detected when the metric is parsed so that the metric's event parsing doesn't fail.
perf expr: Add has_event function
Some events are dependent on firmware/kernel enablement. Allow such events to be detected when the metric is parsed so that the metric's event parsing doesn't fail.
Signed-off-by: Ian Rogers <[email protected]> Tested-by: Namhyung Kim <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Sohom Datta <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Edward Baker <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Kan Liang <[email protected]> Cc: Zhengjun Xing <[email protected]> Cc: John Garry <[email protected]> Cc: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
show more ...
|
|
Revision tags: v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7 |
|
| #
a77f8184 |
| 12-Apr-2023 |
Arnaldo Carvalho de Melo <[email protected]> |
perf expr: Use zfree() to reduce chances of use after free
Do defensive programming by using zfree() to initialize freed pointers to NULL, so that eventual use after free result in a NULL pointer de
perf expr: Use zfree() to reduce chances of use after free
Do defensive programming by using zfree() to initialize freed pointers to NULL, so that eventual use after free result in a NULL pointer deref instead of more subtle behaviour.
Also remove one NULL test before free(), as it accepts a NULL arg and we get one line shaved not doing it explicitely.
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v6.3-rc6, v6.3-rc5, v6.3-rc4 |
|
| #
c3bf86f1 |
| 24-Mar-2023 |
Ian Rogers <[email protected]> |
perf metrics: Add has_pmem literal
Add literal so that if nvdimms aren't installed we can record fewer events. The file detection mechanism was suggested by Dan Williams <[email protected]>
perf metrics: Add has_pmem literal
Add literal so that if nvdimms aren't installed we can record fewer events. The file detection mechanism was suggested by Dan Williams <[email protected]> in:
https://lore.kernel.org/linux-perf-users/[email protected]/
Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Dan Williams <[email protected]> Cc: Edward Baker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2 |
|
| #
207f7df7 |
| 19-Feb-2023 |
Ian Rogers <[email protected]> |
perf expr: Make the online topology accessible globally
Knowing the topology of online CPUs is useful for more than just expr literals. Move to a global function that caches the value. An additional
perf expr: Make the online topology accessible globally
Knowing the topology of online CPUs is useful for more than just expr literals. Move to a global function that caches the value. An additional upside is that this may also avoid computing the CPU topology in some situations.
Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexandre Torgue <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Maxime Coquelin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5 |
|
| #
acef233b |
| 17-Jan-2023 |
Jing Zhang <[email protected]> |
perf pmu: Add #slots literal support for arm64
The slots in each architecture may be different, so add #slots literal to obtain the slots of different architectures, and the #slots can be applied in
perf pmu: Add #slots literal support for arm64
The slots in each architecture may be different, so add #slots literal to obtain the slots of different architectures, and the #slots can be applied in the metric. Currently, The #slots just support for arm64, and other architectures will return NAN.
On arm64, the value of slots is from the register PMMIR_EL1.SLOT, which I can read in /sys/bus/event_source/device/armv8_pmuv3_*/caps/slots. PMMIR_EL1.SLOT might read as zero if the PMU version is lower than ID_AA64DFR0_EL1_PMUVer_V3P4 or the STALL_SLOT event is not implemented.
Reviewed-by: John Garry <[email protected]> Signed-off-by: Jing Zhang <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andrew Kilroy <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Shuai Xue <[email protected]> Cc: Will Deacon <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: Zhuo Song <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5 |
|
| #
bd560973 |
| 09-Nov-2022 |
Ian Rogers <[email protected]> |
perf expr: Tidy hashmap dependency
hashmap.h comes from libbpf but isn't installed with its headers. Always use the header file of the code in util. Change the hashmap.h dependency in expr.h to a fo
perf expr: Tidy hashmap dependency
hashmap.h comes from libbpf but isn't installed with its headers. Always use the header file of the code in util. Change the hashmap.h dependency in expr.h to a forward declaration, add the necessary header file includes in the C files.
Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masahiro Yamada <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Nicolas Schier <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
| #
c302378b |
| 09-Nov-2022 |
Eduard Zingerman <[email protected]> |
libbpf: Hashmap interface update to allow both long and void* keys/values
An update for libbpf's hashmap interface from void* -> void* to a polymorphic one, allowing both long and void* keys and val
libbpf: Hashmap interface update to allow both long and void* keys/values
An update for libbpf's hashmap interface from void* -> void* to a polymorphic one, allowing both long and void* keys and values.
This simplifies many use cases in libbpf as hashmaps there are mostly integer to integer.
Perf copies hashmap implementation from libbpf and has to be updated as well.
Changes to libbpf, selftests/bpf and perf are packed as a single commit to avoid compilation issues with any future bisect.
Polymorphic interface is acheived by hiding hashmap interface functions behind auxiliary macros that take care of necessary type casts, for example:
#define hashmap_cast_ptr(p) \ ({ \ _Static_assert((p) == NULL || sizeof(*(p)) == sizeof(long),\ #p " pointee should be a long-sized integer or a pointer"); \ (long *)(p); \ })
bool hashmap_find(const struct hashmap *map, long key, long *value);
#define hashmap__find(map, key, value) \ hashmap_find((map), (long)(key), hashmap_cast_ptr(value))
- hashmap__find macro casts key and value parameters to long and long* respectively - hashmap_cast_ptr ensures that value pointer points to a memory of appropriate size.
This hack was suggested by Andrii Nakryiko in [1]. This is a follow up for [2].
[1] https://lore.kernel.org/bpf/CAEf4BzZ8KFneEJxFAaNCCFPGqp20hSpS2aCj76uRk3-qZUH5xg@mail.gmail.com/ [2] https://lore.kernel.org/bpf/[email protected]/T/#m65b28f1d6d969fcd318b556db6a3ad499a42607d
Signed-off-by: Eduard Zingerman <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
show more ...
|
|
Revision tags: v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1 |
|
| #
715b824f |
| 04-Oct-2022 |
Ian Rogers <[email protected]> |
perf expr: Remove jevents case workaround
jevents.py no longer lowercases metrics and altering the case can cause hashmap lookups to fail, so remove.
Signed-off-by: Ian Rogers <[email protected]>
perf expr: Remove jevents case workaround
jevents.py no longer lowercases metrics and altering the case can cause hashmap lookups to fail, so remove.
Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4 |
|
| #
1725e9cd |
| 31-Aug-2022 |
Ian Rogers <[email protected]> |
perf metrics: Wire up core_wide
Pass state necessary for core_wide into the expression parser. Add system_wide and user_requested_cpu_list to perf_stat_config to make it available at display time. e
perf metrics: Wire up core_wide
Pass state necessary for core_wide into the expression parser. Add system_wide and user_requested_cpu_list to perf_stat_config to make it available at display time. evlist isn't used as the evlist__create_maps, that computes user_requested_cpus, needs the list of events which is generated by the metric.
Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
| #
09b73fe9 |
| 31-Aug-2022 |
Ian Rogers <[email protected]> |
perf smt: Compute SMT from topology
The topology records sibling threads. Rather than computing SMT using siblings in sysfs, reuse the values in topology. This only applies when the file smt/active
perf smt: Compute SMT from topology
The topology records sibling threads. Rather than computing SMT using siblings in sysfs, reuse the values in topology. This only applies when the file smt/active isn't available.
Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
| #
1a6abdde |
| 31-Aug-2022 |
Ian Rogers <[email protected]> |
perf expr: Move the scanner_ctx into the parse_ctx
We currently maintain the two independently and copy from one to the other. This is a burden when additional scanner context values are necessary,
perf expr: Move the scanner_ctx into the parse_ctx
We currently maintain the two independently and copy from one to the other. This is a burden when additional scanner context values are necessary, so combine them.
Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8 |
|
| #
bc2373a5 |
| 18-Jul-2022 |
Kan Liang <[email protected]> |
perf tsc: Add arch TSC frequency information
The TSC frequency information is required for the event metrics with the literal, system_tsc_freq. For the newer Intel platform, the TSC frequency inform
perf tsc: Add arch TSC frequency information
The TSC frequency information is required for the event metrics with the literal, system_tsc_freq. For the newer Intel platform, the TSC frequency information can be retrieved from the CPUID leaf 0x15. If the TSC frequency information isn't present the /proc/cpuinfo approach is used.
Refactor cpuid() for this use. Note, the previous stack pushing/popping approach was broken on x86-64 that has stack red zones that would be clobbered.
Committer testing:
Before:
$ perf record sleep 0.0001 [ perf record: Woken up 1 times to write data ] $ perf report --header-only |& grep cpuid # cpuid : AuthenticAMD,25,33,0 $
After the patch:
$ perf record sleep 0.0001 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.002 MB perf.data (8 samples) ] $ perf report --header-only |& grep cpuid # cpuid : AuthenticAMD,25,33,0 $
Signed-off-by: Kan Liang <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexandre Torgue <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Maxime Coquelin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ian Rogers <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3 |
|
| #
c0dd9455 |
| 26-Nov-2021 |
Ian Rogers <[email protected]> |
perf pmu-events: Don't lower case MetricExpr
This patch changes MetricExpr to be written out in the same case. This enables events in metrics to use modifiers like 'G' which currently yield parse er
perf pmu-events: Don't lower case MetricExpr
This patch changes MetricExpr to be written out in the same case. This enables events in metrics to use modifiers like 'G' which currently yield parse errors when made lower case. To keep tests passing the literal #smt_on is compared in a non-case sensitive way - #SMT_on is present in at least SkylakeX metrics.
Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
| #
f56ef30a |
| 24-Nov-2021 |
Ian Rogers <[email protected]> |
perf expr: Add debug logging for literals
Useful for diagnosing problems with metrics.
Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc:
perf expr: Add debug logging for literals
Useful for diagnosing problems with metrics.
Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] [ Fixed up perf_cpu conflict, i.e. we need to append ".cpu" to cpu__max_present_cpu() result ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
| #
6d18804b |
| 05-Jan-2022 |
Ian Rogers <[email protected]> |
perf cpumap: Give CPUs their own type
A common problem is confusing CPU map indices with the CPU, by wrapping the CPU with a struct then this is avoided. This approach is similar to atomic_t.
Commi
perf cpumap: Give CPUs their own type
A common problem is confusing CPU map indices with the CPU, by wrapping the CPU with a struct then this is avoided. This approach is similar to atomic_t.
Committer notes:
To make it build with BUILD_BPF_SKEL=1 these files needed the conversions to 'struct perf_cpu' usage:
tools/perf/util/bpf_counter.c tools/perf/util/bpf_counter_cgroup.c tools/perf/util/bpf_ftrace.c
Also perf_env__get_cpu() was removed back in "perf cpumap: Switch cpu_map__build_map to cpu function".
Additionally these needed to be fixed for the ARM builds to complete:
tools/perf/arch/arm/util/cs-etm.c tools/perf/arch/arm64/util/pmu.c
Suggested-by: John Garry <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Vineet Singh <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|