|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2 |
|
| #
a93a620c |
| 05-Dec-2024 |
Ian Rogers <[email protected]> |
perf test expr: Fix system_tsc_freq for only x86
The refactoring of tool PMU events to have a PMU then adding the expr literals to the tool PMU made it so that the literal system_tsc_freq was only s
perf test expr: Fix system_tsc_freq for only x86
The refactoring of tool PMU events to have a PMU then adding the expr literals to the tool PMU made it so that the literal system_tsc_freq was only supported on x86. Update the test expectations to match - namely the parsing is x86 specific and only yields a non-zero value on Intel.
Fixes: 609aa2667f67 ("perf tool_pmu: Switch to standard pmu functions and json descriptions") Reported-by: Athira Rajeev <[email protected]> Closes: https://lore.kernel.org/linux-perf-users/[email protected]/ Co-developed-by: Athira Rajeev <[email protected]> Tested-by: Namhyung Kim <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc1, v6.12, v6.12-rc7 |
|
| #
494c403f |
| 07-Nov-2024 |
Ian Rogers <[email protected]> |
perf header: Pass a perf_cpu rather than a PMU to get_cpuid_str
On ARM the cpuid is dependent on the core type of the CPU in question. The PMU was passed for the sake of the CPU map but this means i
perf header: Pass a perf_cpu rather than a PMU to get_cpuid_str
On ARM the cpuid is dependent on the core type of the CPU in question. The PMU was passed for the sake of the CPU map but this means in places a temporary PMU is created just to pass a CPU value. Just pass the CPU and fix up the callers.
As there are no longer PMU users in header.h, shuffle forward declarations earlier to work around build failures.
Reviewed-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Tested-by: Xu Yang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Zong-You Xie <[email protected]> Cc: Benjamin Gray <[email protected]> Cc: Bibo Mao <[email protected]> Cc: Clément Le Goffic <[email protected]> Cc: Dima Kogan <[email protected]> Cc: Dr. David Alan Gilbert <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yicong Yang <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
| #
7463ee17 |
| 07-Nov-2024 |
Ian Rogers <[email protected]> |
perf header: Avoid transitive PMU includes
Currently satisfied via header.h. Note, pmu.h includes parse-events.h.
Reviewed-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <iroger
perf header: Avoid transitive PMU includes
Currently satisfied via header.h. Note, pmu.h includes parse-events.h.
Reviewed-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Tested-by: Xu Yang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Zong-You Xie <[email protected]> Cc: Benjamin Gray <[email protected]> Cc: Bibo Mao <[email protected]> Cc: Clément Le Goffic <[email protected]> Cc: Dima Kogan <[email protected]> Cc: Dr. David Alan Gilbert <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yicong Yang <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2 |
|
| #
3d0f5f45 |
| 13-Sep-2023 |
James Clark <[email protected]> |
perf pmu: Move pmu__find_core_pmu() to pmus.c
pmu__find_core_pmu() more logically belongs in pmus.c because it iterates over all PMUs, so move it to pmus.c
At the same time rename it to perf_pmus__
perf pmu: Move pmu__find_core_pmu() to pmus.c
pmu__find_core_pmu() more logically belongs in pmus.c because it iterates over all PMUs, so move it to pmus.c
At the same time rename it to perf_pmus__find_core_pmu() to match the naming convention in this file.
list_prepare_entry() can't be used in perf_pmus__scan_core() anymore now that it's called from the same compilation unit. This is with -O2 (specifically -O1 -ftree-vrp -finline-functions -finline-small-functions) which allow the bounds of the array access to be determined at compile time. list_prepare_entry() subtracts the offset of the 'list' member in struct perf_pmu from &core_pmus, which isn't a struct perf_pmu. The compiler sees that pmu results in &core_pmus - 8 and refuses to compile. At runtime this works because list_for_each_entry_continue() always adds the offset back again before dereferencing ->next, but it's technically undefined behavior. With -fsanitize=undefined an additional warning is generated.
Using list_first_entry_or_null() to get the first entry here avoids doing &core_pmus - 8 but has the same result and fixes both the compile warning and the undefined behavior warning. There are other uses of list_prepare_entry() in pmus.c, but the compiler doesn't seem to be able to see that they can also be called with &core_pmus, so I won't change any at this time.
Signed-off-by: James Clark <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Reviewed-by: John Garry <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Will Deacon <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Haixin Yu <[email protected]> Cc: Kan Liang <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
show more ...
|
|
Revision tags: v6.6-rc1 |
|
| #
a1ebf771 |
| 04-Sep-2023 |
James Clark <[email protected]> |
perf test: Add a test for strcmp_cpuid_str() expression
Test that the new expression builtin returns a match when the current escaped CPU ID is given, and that it doesn't match when "0x0" is given.
perf test: Add a test for strcmp_cpuid_str() expression
Test that the new expression builtin returns a match when the current escaped CPU ID is given, and that it doesn't match when "0x0" is given.
The CPU ID in test__expr() has to be changed to perf_pmu__getcpuid() which returns the CPU ID string, rather than the raw CPU ID that get_cpuid() returns because that can't be used with strcmp_cpuid_str(). It doesn't affect the is_intel test because both versions contain "Intel".
Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Chen Zhongjin <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Haixin Yu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Liam Howlett <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Jihong <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
| #
d19a353c |
| 04-Sep-2023 |
James Clark <[email protected]> |
perf test: Check result of has_event(cycles) test
Currently the function always returns 0, so even when the has_event() test fails, the test still passes. Fix it by returning ret instead.
Reviewed-
perf test: Check result of has_event(cycles) test
Currently the function always returns 0, so even when the has_event() test fails, the test still passes. Fix it by returning ret instead.
Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Chen Zhongjin <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Haixin Yu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Liam Howlett <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Jihong <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
| #
f0005f17 |
| 30-Aug-2023 |
Ian Rogers <[email protected]> |
perf metric: Add #num_cpus_online literal
Returns the number of CPUs online, unlike #num_cpus that returns the number present.
Add a test of the property.
This will be used in future Intel metrics
perf metric: Add #num_cpus_online literal
Returns the number of CPUs online, unlike #num_cpus that returns the number present.
Add a test of the property.
This will be used in future Intel metrics.
Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4 |
|
| #
4a4a9bf9 |
| 23-Jun-2023 |
Ian Rogers <[email protected]> |
perf expr: Add has_event function
Some events are dependent on firmware/kernel enablement. Allow such events to be detected when the metric is parsed so that the metric's event parsing doesn't fail.
perf expr: Add has_event function
Some events are dependent on firmware/kernel enablement. Allow such events to be detected when the metric is parsed so that the metric's event parsing doesn't fail.
Signed-off-by: Ian Rogers <[email protected]> Tested-by: Namhyung Kim <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Sohom Datta <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Edward Baker <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Kan Liang <[email protected]> Cc: Zhengjun Xing <[email protected]> Cc: John Garry <[email protected]> Cc: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
show more ...
|
|
Revision tags: v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3 |
|
| #
6f765bbb |
| 19-May-2023 |
Ian Rogers <[email protected]> |
perf expr: Make the evaluation of & and | logical and lazy
Currently the & and | operators are only used in metric thresholds like (from the tma_retiring metric):
tma_retiring > 0.7 | tma_heavy_ope
perf expr: Make the evaluation of & and | logical and lazy
Currently the & and | operators are only used in metric thresholds like (from the tma_retiring metric):
tma_retiring > 0.7 | tma_heavy_operations > 0.1
Thresholds are always computed when present, but a lack of events may mean the threshold can't be computed. This happens with the option --metric-no-threshold for say the metric tma_retiring on Tigerlake model CPUs.
To fully compute the threshold tma_heavy_operations is needed and it needs the extra events of IDQ.MS_UOPS, UOPS_DECODED.DEC0, cpu/UOPS_DECODED.DEC0,cmask=1/ and IDQ.MITE_UOPS. So --metric-no-threshold is a useful option to reduce the number of events needed and potentially multiplexing of events.
Rather than just fail threshold computations like this, we may know a result from just the left or right-hand side. So, for tma_retiring if its value is "> 0.7" we know it is over the threshold. This allows the metric to have the threshold coloring, when possible, without all the counters being programmed.
Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Edward Baker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Weilin Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v6.4-rc2, v6.4-rc1 |
|
| #
2a939c86 |
| 02-May-2023 |
Ian Rogers <[email protected]> |
perf metric: Change divide by zero and !support events behavior
Division by zero causes expression parsing to fail and no metric to be generated. This can mean for short running benchmarks metrics a
perf metric: Change divide by zero and !support events behavior
Division by zero causes expression parsing to fail and no metric to be generated. This can mean for short running benchmarks metrics are not shown. Change the behavior to make the value nan, which gets shown like:
''' $ perf stat -M TopdownL2 true
Performance counter stats for 'true':
1,031,492 INST_RETIRED.ANY # nan % tma_fetch_bandwidth # nan % tma_heavy_operations # nan % tma_light_operations 29,304 CPU_CLK_UNHALTED.REF_XCLK # nan % tma_fetch_latency # nan % tma_branch_mispredicts # nan % tma_machine_clears # nan % tma_core_bound # nan % tma_memory_bound 2,658,319 IDQ_UOPS_NOT_DELIVERED.CORE 11,167 EXE_ACTIVITY.BOUND_ON_STORES 262,058 EXE_ACTIVITY.1_PORTS_UTIL <not counted> BR_MISP_RETIRED.ALL_BRANCHES (0.00%) <not counted> INT_MISC.RECOVERY_CYCLES_ANY (0.00%) <not counted> CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE (0.00%) <not counted> CPU_CLK_UNHALTED.THREAD (0.00%) <not counted> UOPS_RETIRED.RETIRE_SLOTS (0.00%) <not counted> CYCLE_ACTIVITY.STALLS_MEM_ANY (0.00%) <not counted> UOPS_RETIRED.MACRO_FUSED (0.00%) <not counted> IDQ_UOPS_NOT_DELIVERED.CYCLES_0_UOPS_DELIV.CORE (0.00%) <not counted> EXE_ACTIVITY.2_PORTS_UTIL (0.00%) <not counted> CYCLE_ACTIVITY.STALLS_TOTAL (0.00%) <not counted> MACHINE_CLEARS.COUNT (0.00%) <not counted> UOPS_ISSUED.ANY (0.00%)
0.002864879 seconds time elapsed
0.003012000 seconds user 0.000000000 seconds sys '''
When events aren't supported a count of 0 can be confusing and make metrics look meaningful. Change these to be nan also which, with the next change, gets shown like:
''' $ perf stat true Performance counter stats for 'true':
1.25 msec task-clock:u # 0.387 CPUs utilized 0 context-switches:u # 0.000 /sec 0 cpu-migrations:u # 0.000 /sec 46 page-faults:u # 36.702 K/sec 255,942 cycles:u # 0.204 GHz (88.66%) 123,046 instructions:u # 0.48 insn per cycle 28,301 branches:u # 22.580 M/sec 2,489 branch-misses:u # 8.79% of all branches 4,719 CPU_CLK_UNHALTED.REF_XCLK:u # 3.765 M/sec # nan % tma_frontend_bound # nan % tma_retiring # nan % tma_backend_bound # nan % tma_bad_speculation 344,855 IDQ_UOPS_NOT_DELIVERED.CORE:u # 275.147 M/sec <not supported> INT_MISC.RECOVERY_CYCLES_ANY:u <not counted> CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE:u (0.00%) <not counted> CPU_CLK_UNHALTED.THREAD:u (0.00%) <not counted> UOPS_RETIRED.RETIRE_SLOTS:u (0.00%) <not counted> UOPS_ISSUED.ANY:u (0.00%)
0.003238142 seconds time elapsed
0.000000000 seconds user 0.003434000 seconds sys '''
Ensure that nan metric values are quoted as nan isn't a valid number in JSON.
Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Edward Baker <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2 |
|
| #
207f7df7 |
| 19-Feb-2023 |
Ian Rogers <[email protected]> |
perf expr: Make the online topology accessible globally
Knowing the topology of online CPUs is useful for more than just expr literals. Move to a global function that caches the value. An additional
perf expr: Make the online topology accessible globally
Knowing the topology of online CPUs is useful for more than just expr literals. Move to a global function that caches the value. An additional upside is that this may also avoid computing the CPU topology in some situations.
Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexandre Torgue <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Maxime Coquelin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5 |
|
| #
bd560973 |
| 09-Nov-2022 |
Ian Rogers <[email protected]> |
perf expr: Tidy hashmap dependency
hashmap.h comes from libbpf but isn't installed with its headers. Always use the header file of the code in util. Change the hashmap.h dependency in expr.h to a fo
perf expr: Tidy hashmap dependency
hashmap.h comes from libbpf but isn't installed with its headers. Always use the header file of the code in util. Change the hashmap.h dependency in expr.h to a forward declaration, add the necessary header file includes in the C files.
Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masahiro Yamada <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Nicolas Schier <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
| #
c302378b |
| 09-Nov-2022 |
Eduard Zingerman <[email protected]> |
libbpf: Hashmap interface update to allow both long and void* keys/values
An update for libbpf's hashmap interface from void* -> void* to a polymorphic one, allowing both long and void* keys and val
libbpf: Hashmap interface update to allow both long and void* keys/values
An update for libbpf's hashmap interface from void* -> void* to a polymorphic one, allowing both long and void* keys and values.
This simplifies many use cases in libbpf as hashmaps there are mostly integer to integer.
Perf copies hashmap implementation from libbpf and has to be updated as well.
Changes to libbpf, selftests/bpf and perf are packed as a single commit to avoid compilation issues with any future bisect.
Polymorphic interface is acheived by hiding hashmap interface functions behind auxiliary macros that take care of necessary type casts, for example:
#define hashmap_cast_ptr(p) \ ({ \ _Static_assert((p) == NULL || sizeof(*(p)) == sizeof(long),\ #p " pointee should be a long-sized integer or a pointer"); \ (long *)(p); \ })
bool hashmap_find(const struct hashmap *map, long key, long *value);
#define hashmap__find(map, key, value) \ hashmap_find((map), (long)(key), hashmap_cast_ptr(value))
- hashmap__find macro casts key and value parameters to long and long* respectively - hashmap_cast_ptr ensures that value pointer points to a memory of appropriate size.
This hack was suggested by Andrii Nakryiko in [1]. This is a follow up for [2].
[1] https://lore.kernel.org/bpf/CAEf4BzZ8KFneEJxFAaNCCFPGqp20hSpS2aCj76uRk3-qZUH5xg@mail.gmail.com/ [2] https://lore.kernel.org/bpf/[email protected]/T/#m65b28f1d6d969fcd318b556db6a3ad499a42607d
Signed-off-by: Eduard Zingerman <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
show more ...
|
|
Revision tags: v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1 |
|
| #
4b65fc7b |
| 04-Oct-2022 |
Ian Rogers <[email protected]> |
perf expr: Allow a double if expression
Some TMA metrics have double if expressions like:
( CPU_CLK_UNHALTED.THREAD / 2 ) * ( 1 + CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE / CPU_CLK_UNHALTED.REF_XCLK )
perf expr: Allow a double if expression
Some TMA metrics have double if expressions like:
( CPU_CLK_UNHALTED.THREAD / 2 ) * ( 1 + CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE / CPU_CLK_UNHALTED.REF_XCLK ) ) if #core_wide < 1 else ( CPU_CLK_UNHALTED.THREAD_ANY / 2 ) if #SMT_on else CPU_CLK_UNHALTED.THREAD
This currently fails to parse as the left hand side if expression needs to be in parentheses. By allowing the if expression to have a right hand side that is an if expression we can parse the expression above, with left to right evaluation order that matches languages like Python.
Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4 |
|
| #
f0c4b97a |
| 31-Aug-2022 |
Ian Rogers <[email protected]> |
perf test: Add basic core_wide expression test
Add basic test for coverage, similar to #smt_on.
Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander
perf test: Add basic core_wide expression test
Add basic test for coverage, similar to #smt_on.
Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
| #
09b73fe9 |
| 31-Aug-2022 |
Ian Rogers <[email protected]> |
perf smt: Compute SMT from topology
The topology records sibling threads. Rather than computing SMT using siblings in sysfs, reuse the values in topology. This only applies when the file smt/active
perf smt: Compute SMT from topology
The topology records sibling threads. Rather than computing SMT using siblings in sysfs, reuse the values in topology. This only applies when the file smt/active isn't available.
Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
| #
1a6abdde |
| 31-Aug-2022 |
Ian Rogers <[email protected]> |
perf expr: Move the scanner_ctx into the parse_ctx
We currently maintain the two independently and copy from one to the other. This is a burden when additional scanner context values are necessary,
perf expr: Move the scanner_ctx into the parse_ctx
We currently maintain the two independently and copy from one to the other. This is a burden when additional scanner context values are necessary, so combine them.
Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8 |
|
| #
6923397c |
| 18-Jul-2022 |
Ian Rogers <[email protected]> |
perf test: Add test for #system_tsc_freq in metrics
The value should be non-zero on Intel while zero on everything else.
Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers
perf test: Add test for #system_tsc_freq in metrics
The value should be non-zero on Intel while zero on everything else.
Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexandre Torgue <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Maxime Coquelin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1 |
|
| #
e5287e6d |
| 27-May-2022 |
Ian Rogers <[email protected]> |
perf expr: Allow exponents on floating point values
Pass the optional exponent component through to strtod that already supports it. We already have exponents in ScaleUnit and so this adds uniformit
perf expr: Allow exponents on floating point values
Pass the optional exponent component through to strtod that already supports it. We already have exponents in ScaleUnit and so this adds uniformity.
Reported-by: Zhengjun Xing <[email protected]> Reviewed-By: Kajol Jain <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4 |
|
| #
6c481031 |
| 29-Nov-2021 |
Thomas Richter <[email protected]> |
perf test: Fix 'Simple expression parser' test on arch without CPU die topology info
Some platforms do not have CPU die support, for example s390.
Commit Cc: Ian Rogers <[email protected]> Fixes:
perf test: Fix 'Simple expression parser' test on arch without CPU die topology info
Some platforms do not have CPU die support, for example s390.
Commit Cc: Ian Rogers <[email protected]> Fixes: fdf1e29b6118c18f ("perf expr: Add metric literals for topology.") fails on s390:
# perf test -Fv 7 ... # FAILED tests/expr.c:173 #num_dies >= #num_packages ---- end ---- Simple expression parser: FAILED! #
Investigating this issue leads to these functions:
build_cpu_topology() +--> has_die_topology(void) { struct utsname uts;
if (uname(&uts) < 0) return false; if (strncmp(uts.machine, "x86_64", 6)) return false; .... }
which always returns false on s390. The caller build_cpu_topology() checks has_die_topology() return value. On false the the struct cpu_topology::die_cpu_list is not contructed and has zero entries. This leads to the failing comparison: #num_dies >= #num_packages. s390 of course has a positive number of packages.
Fix this and check if the function build_cpu_topology() did build up a die_cpus_list. The number of entries in this list should be larger than 0. If the number of list element is zero, the die_cpus_list has not been created and the check in function test__expr():
TEST_ASSERT_VAL("#num_dies >= #num_packages", \ num_dies >= num_packages)
always fails.
Output after:
# perf test -Fv 7 7: Simple expression parser : --- start --- division by zero syntax error ---- end ---- Simple expression parser: Ok #
Fixes: fdf1e29b6118c18f ("perf expr: Add metric literals for topology.") Signed-off-by: Thomas Richter <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] [ Added comment in the added 'if (num_dies)' line about architectures not having die topology ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
|
Revision tags: v5.16-rc3, v5.16-rc2, v5.16-rc1 |
|
| #
9aba0ada |
| 11-Nov-2021 |
Ian Rogers <[email protected]> |
perf expr: Add source_count for aggregating events
Events like uncore_imc/cas_count_read/ on Skylake open multiple events and then aggregate in the metric leader. To determine the average value per
perf expr: Add source_count for aggregating events
Events like uncore_imc/cas_count_read/ on Skylake open multiple events and then aggregate in the metric leader. To determine the average value per event the number of these events is needed. Add a source_count function that returns this value by counting the number of events with the given metric leader. For most events the value is 1 but for uncore_imc/cas_count_read/ it can yield values like 6.
Add a generic test, but manually tested with a test metric that uses the function.
Signed-off-by: Ian Rogers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul A . Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Song Liu <[email protected]> Cc: Wan Jiabing <[email protected]> Cc: Yury Norov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
| #
fdf1e29b |
| 11-Nov-2021 |
Ian Rogers <[email protected]> |
perf expr: Add metric literals for topology.
Allow the number of cpus, cores, dies and packages to be queried by a metric expression.
Signed-off-by: Ian Rogers <[email protected]> Acked-by: Jiri O
perf expr: Add metric literals for topology.
Allow the number of cpus, cores, dies and packages to be queried by a metric expression.
Signed-off-by: Ian Rogers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul A . Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Song Liu <[email protected]> Cc: Wan Jiabing <[email protected]> Cc: Yury Norov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
| #
604ce2f0 |
| 11-Nov-2021 |
Ian Rogers <[email protected]> |
perf test: Add expr test for events with hyphens
An example of such an event is topdown-fe-bound.
Signed-off-by: Ian Rogers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander
perf test: Add expr test for events with hyphens
An example of such an event is topdown-fe-bound.
Signed-off-by: Ian Rogers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul A . Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Song Liu <[email protected]> Cc: Wan Jiabing <[email protected]> Cc: Yury Norov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
| #
33f44bfd |
| 04-Nov-2021 |
Ian Rogers <[email protected]> |
perf test: Rename struct test to test_suite
This is to align with kunit's terminology.
Signed-off-by: Ian Rogers <[email protected]> Tested-by: Sohaib Mohamed <[email protected]> Acked-by: Ji
perf test: Rename struct test to test_suite
This is to align with kunit's terminology.
Signed-off-by: Ian Rogers <[email protected]> Tested-by: Sohaib Mohamed <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Brendan Higgins <[email protected]> Cc: Daniel Latypov <[email protected]> Cc: David Gow <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jin Yao <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|
| #
d68f0365 |
| 04-Nov-2021 |
Ian Rogers <[email protected]> |
perf test: Move each test suite struct to its test
Rather than export test functions, export the test struct. Rename with a suite__ prefix to avoid name collisions.
Committer notes:
Its '&suite__v
perf test: Move each test suite struct to its test
Rather than export test functions, export the test struct. Rename with a suite__ prefix to avoid name collisions.
Committer notes:
Its '&suite__vectors_page', not '&suite__vectors_pages', noticed when cross building to arm (32-bit).
Signed-off-by: Ian Rogers <[email protected]> Tested-by: Sohaib Mohamed <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Brendan Higgins <[email protected]> Cc: Daniel Latypov <[email protected]> Cc: David Gow <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jin Yao <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
show more ...
|