History log of /linux-6.15/tools/perf/util/stat.c (Results 1 – 25 of 148)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1
# 2d9961c6 01-Feb-2025 Ian Rogers <[email protected]>

perf stat: Don't merge counters purely on name

Counter merging was added in commit 942c5593393d ("perf stat: Add
perf_stat_merge_counters()"), however, it merges events with the same
name on differe

perf stat: Don't merge counters purely on name

Counter merging was added in commit 942c5593393d ("perf stat: Add
perf_stat_merge_counters()"), however, it merges events with the same
name on different PMUs regardless of whether the different PMUs are
actually of the same type (ie they differ only in the suffix on the
PMU). For hwmon events there may be a temp1 event on every PMU, but
the PMU names are all unique and don't differ just by a suffix. The
merging is over eager and will merge all the hwmon counters together
meaning an aggregated and very large temp1 value is shown. The same
would be true for say cache events and memory controller events where
the PMUs differ but the event names are the same.

Fix the problem by correctly saying two PMUs alias when they differ
only by suffix.

Note, there is an overlap with evsel's merged_stat with aggregation
and the evsel's metric_leader where aggregation happens for metrics.

Fixes: 942c5593393d ("perf stat: Add perf_stat_merge_counters()")
Signed-off-by: Ian Rogers <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>

show more ...


Revision tags: v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1
# d7d156fc 26-Sep-2024 Ian Rogers <[email protected]>

perf evsel: Remove pmu_name

"evsel->pmu_name" is only ever assigned a strdup of "pmu->name", a
strdup of "evsel->pmu_name" or NULL. As such, prefer to use
"pmu->name" directly and even to directly c

perf evsel: Remove pmu_name

"evsel->pmu_name" is only ever assigned a strdup of "pmu->name", a
strdup of "evsel->pmu_name" or NULL. As such, prefer to use
"pmu->name" directly and even to directly compare PMUs than PMU
names. For safety, add some additional NULL tests.

Acked-by: Namhyung Kim <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
[ Fix arm-spe.c usage of pmu_name and empty PMU name ]
Acked-by: Kan Liang <[email protected]>
Signed-off-by: James Clark <[email protected]>
Cc: Yang Jihong <[email protected]>
Cc: Dominique Martinet <[email protected]>
Cc: Howard Chu <[email protected]>
Cc: Ze Gao <[email protected]>
Cc: Yicong Yang <[email protected]>
Cc: Weilin Wang <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Jing Zhang <[email protected]>
Cc: Yang Li <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: [email protected]
Cc: Athira Rajeev <[email protected]>
Cc: [email protected]
Cc: Sun Haiyong <[email protected]>
Cc: John Garry <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>

show more ...


Revision tags: v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3
# 3e5deb70 02-Feb-2024 Ian Rogers <[email protected]>

perf cpumap: Clean up use of perf_cpu_map__has_any_cpu_or_is_empty

Most uses of what was perf_cpu_map__empty but is now
perf_cpu_map__has_any_cpu_or_is_empty want to do something with the
CPU map if

perf cpumap: Clean up use of perf_cpu_map__has_any_cpu_or_is_empty

Most uses of what was perf_cpu_map__empty but is now
perf_cpu_map__has_any_cpu_or_is_empty want to do something with the
CPU map if it contains CPUs. Replace uses of
perf_cpu_map__has_any_cpu_or_is_empty with other helpers so that CPUs
within the map can be handled.

Reviewed-by: James Clark <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Ghiti <[email protected]>
Cc: Andrew Jones <[email protected]>
Cc: André Almeida <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Atish Patra <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Steinar H. Gunderson <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yang Jihong <[email protected]>
Cc: Yang Li <[email protected]>
Cc: Yanteng Si <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


Revision tags: v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6
# 6f33e6fa 14-Dec-2023 Ian Rogers <[email protected]>

perf stat: Combine the -A/--no-aggr and --no-merge options

The -A or --no-aggr option disables aggregation of core events:

$ perf stat -A -e cycles,data_total -a true

Performance counter stat

perf stat: Combine the -A/--no-aggr and --no-merge options

The -A or --no-aggr option disables aggregation of core events:

$ perf stat -A -e cycles,data_total -a true

Performance counter stats for 'system wide':

CPU0 1,287,665 cycles
CPU1 1,831,681 cycles
CPU2 27,345,998 cycles
CPU3 1,964,799 cycles
CPU4 236,174 cycles
CPU5 3,302,825 cycles
CPU6 9,201,446 cycles
CPU7 1,403,043 cycles
CPU0 110.90 MiB data_total

0.008961761 seconds time elapsed

The --no-merge option disables the aggregation of uncore events:

$ perf stat --no-merge -e cycles,data_total -a true

Performance counter stats for 'system wide':

38,482,778 cycles
15.04 MiB data_total [uncore_imc_free_running_1]
15.00 MiB data_total [uncore_imc_free_running_0]

0.005915155 seconds time elapsed

Having two options confuses users who generally don't appreciate the
difference in PMUs. Keep all the options but make it so they all
disable aggregation both of core and uncore events:

$ perf stat -A -e cycles,data_total -a true

Performance counter stats for 'system wide':

CPU0 85,878 cycles
CPU1 88,179 cycles
CPU2 60,872 cycles
CPU3 3,265,567 cycles
CPU4 82,357 cycles
CPU5 83,383 cycles
CPU6 84,156 cycles
CPU7 220,803 cycles
CPU0 2.38 MiB data_total [uncore_imc_free_running_0]
CPU0 2.38 MiB data_total [uncore_imc_free_running_1]

0.001397205 seconds time elapsed

Update the relevant 'perf stat' man page information.

Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kaige Ye <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


Revision tags: v6.7-rc5, v6.7-rc4
# 923ca62a 29-Nov-2023 Ian Rogers <[email protected]>

libperf cpumap: Rename perf_cpu_map__empty() to perf_cpu_map__has_any_cpu_or_is_empty()

The name perf_cpu_map_empty is misleading as true is also returned
when the map contains an "any" CPU (aka dum

libperf cpumap: Rename perf_cpu_map__empty() to perf_cpu_map__has_any_cpu_or_is_empty()

The name perf_cpu_map_empty is misleading as true is also returned
when the map contains an "any" CPU (aka dummy) map.

Rename to perf_cpu_map__has_any_cpu_or_is_empty(), later changes will
(re)introduce perf_cpu_map__empty() and perf_cpu_map__has_any_cpu().

Reviewed-by: James Clark <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Ghiti <[email protected]>
Cc: Andrew Jones <[email protected]>
Cc: André Almeida <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Atish Patra <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Steinar H. Gunderson <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yang Jihong <[email protected]>
Cc: Yang Li <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


Revision tags: v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4
# 91f88a0a 24-Jul-2023 Ian Rogers <[email protected]>

perf stat: Avoid uninitialized use of perf_stat_config

perf_event__read_stat_config will assign values based on number of
tags and tag values. Initialize the structs to zero before they are
assigned

perf stat: Avoid uninitialized use of perf_stat_config

perf_event__read_stat_config will assign values based on number of
tags and tag values. Initialize the structs to zero before they are
assigned so that no uninitialized values can be seen.

This potential error was reported by GCC with LTO enabled.

Reviewed-by: Nick Desaulniers <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Carsten Haitzler <[email protected]>
Cc: Fangrui Song <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Tom Rix <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Cc: Yang Jihong <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


Revision tags: v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7
# dada1a1f 16-Jun-2023 Namhyung Kim <[email protected]>

perf stat: Show average value on multiple runs

When -r option is used, perf stat runs the command multiple times and
update stats in the evsel->stats.res_stats for global aggregation. But
the value

perf stat: Show average value on multiple runs

When -r option is used, perf stat runs the command multiple times and
update stats in the evsel->stats.res_stats for global aggregation. But
the value is never used and the value it prints at the end is just the
value from the last run. I think we should print the average number of
multiple runs.

Add evlist__copy_res_stats() to update the aggr counter (for display)
using the values in the evsel->stats.res_stats.

Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


Revision tags: v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1
# 25f69c69 02-Mar-2023 Changbin Du <[email protected]>

perf stat: Fix counting when initial delay configured

When creating counters with initial delay configured, the enable_on_exec
field is not set. So we need to enable the counters later. The problem

perf stat: Fix counting when initial delay configured

When creating counters with initial delay configured, the enable_on_exec
field is not set. So we need to enable the counters later. The problem
is, when a workload is specified the target__none() is true. So we also
need to check stat_config.initial_delay.

In this change, we add a new field 'initial_delay' for struct target
which could be shared by other subcommands. And define
target__enable_on_exec() which returns whether enable_on_exec should be
set on normal cases.

Before this fix the event is not counted:

$ ./perf stat -e instructions -D 100 sleep 2
Events disabled
Events enabled

Performance counter stats for 'sleep 2':

<not counted> instructions

1.901661124 seconds time elapsed

0.001602000 seconds user
0.000000000 seconds sys

After fix it works:

$ ./perf stat -e instructions -D 100 sleep 2
Events disabled
Events enabled

Performance counter stats for 'sleep 2':

404,214 instructions

1.901743475 seconds time elapsed

0.001617000 seconds user
0.000000000 seconds sys

Fixes: c587e77e100fa40e ("perf stat: Do not delay the workload with --delay")
Signed-off-by: Changbin Du <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Hui Wang <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


Revision tags: v6.2
# aa0964e3 19-Feb-2023 Ian Rogers <[email protected]>

perf stat: Remove saved_value/runtime_stat

As saved_value/runtime_stat are only written to and not read, remove
the associated logic and writes.

Signed-off-by: Ian Rogers <[email protected]>
Cc: A

perf stat: Remove saved_value/runtime_stat

As saved_value/runtime_stat are only written to and not read, remove
the associated logic and writes.

Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Caleb Biggers <[email protected]>
Cc: Eduard Zingerman <[email protected]>
Cc: Florian Fischer <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jing Zhang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Perry Taylor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


# 8945bef3 19-Feb-2023 Ian Rogers <[email protected]>

perf stat: Add cpu_aggr_map for loop

Rename variables, add a comment and add a cpu_aggr_map__for_each_idx
to aid the readability of the stat-display code. In particular, try to
make sure aggr_idx is

perf stat: Add cpu_aggr_map for loop

Rename variables, add a comment and add a cpu_aggr_map__for_each_idx
to aid the readability of the stat-display code. In particular, try to
make sure aggr_idx is used consistently to differentiate from other
kinds of index.

Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Caleb Biggers <[email protected]>
Cc: Eduard Zingerman <[email protected]>
Cc: Florian Fischer <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jing Zhang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Perry Taylor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


# cc26ffaa 19-Feb-2023 Ian Rogers <[email protected]>

perf stat: Hide runtime_stat

runtime_stat is only shared for the sake of tests that don't care
about its value. Move the definition into stat-shadow.c and have the
tests also use the global version.

perf stat: Hide runtime_stat

runtime_stat is only shared for the sake of tests that don't care
about its value. Move the definition into stat-shadow.c and have the
tests also use the global version.

Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Caleb Biggers <[email protected]>
Cc: Eduard Zingerman <[email protected]>
Cc: Florian Fischer <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jing Zhang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Perry Taylor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


# d74192c7 19-Feb-2023 Ian Rogers <[email protected]>

perf stat: Remove perf_stat_evsel_id

perf_stat_evsel_id was used for hard coded metrics. These have now
migrated to json metrics and so the id values are no longer necessary.

Signed-off-by: Ian Rog

perf stat: Remove perf_stat_evsel_id

perf_stat_evsel_id was used for hard coded metrics. These have now
migrated to json metrics and so the id values are no longer necessary.

Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Caleb Biggers <[email protected]>
Cc: Eduard Zingerman <[email protected]>
Cc: Florian Fischer <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jing Zhang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Perry Taylor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


# c23f5cc0 19-Feb-2023 Ian Rogers <[email protected]>

perf stat: Use metrics for --smi-cost

Rather than parsing events for --smi-cost, use the json metric group
'smi'.

Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <adrian.hunter@int

perf stat: Use metrics for --smi-cost

Rather than parsing events for --smi-cost, use the json metric group
'smi'.

Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Caleb Biggers <[email protected]>
Cc: Eduard Zingerman <[email protected]>
Cc: Florian Fischer <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jing Zhang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Perry Taylor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


# d6964c5b 19-Feb-2023 Ian Rogers <[email protected]>

perf stat: Remove hard coded transaction events

The metric group "transaction" is now present for Intel architectures
so the legacy hard coded approach won't be used. Remove the associated
logic.

S

perf stat: Remove hard coded transaction events

The metric group "transaction" is now present for Intel architectures
so the legacy hard coded approach won't be used. Remove the associated
logic.

Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Caleb Biggers <[email protected]>
Cc: Eduard Zingerman <[email protected]>
Cc: Florian Fischer <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jing Zhang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Perry Taylor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


# 7b86475f 19-Feb-2023 Ian Rogers <[email protected]>

perf stat: Remove topdown event special handling

Now the events are computed from json metrics, the hard coded logic
can be removed.

Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter

perf stat: Remove topdown event special handling

Now the events are computed from json metrics, the hard coded logic
can be removed.

Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Caleb Biggers <[email protected]>
Cc: Eduard Zingerman <[email protected]>
Cc: Florian Fischer <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jing Zhang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Perry Taylor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


Revision tags: v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5
# bd560973 09-Nov-2022 Ian Rogers <[email protected]>

perf expr: Tidy hashmap dependency

hashmap.h comes from libbpf but isn't installed with its
headers. Always use the header file of the code in util. Change the
hashmap.h dependency in expr.h to a fo

perf expr: Tidy hashmap dependency

hashmap.h comes from libbpf but isn't installed with its
headers. Always use the header file of the code in util. Change the
hashmap.h dependency in expr.h to a forward declaration, add the
necessary header file includes in the C files.

Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolas Schier <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


# e5f4afbe 14-Nov-2022 Ian Rogers <[email protected]>

perf pmu: Remove mostly unused 'struct perf_pmu' 'is_hybrid' member

Replace usage with perf_pmu__is_hybrid().

Suggested-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <irogers@

perf pmu: Remove mostly unused 'struct perf_pmu' 'is_hybrid' member

Replace usage with perf_pmu__is_hybrid().

Suggested-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Caleb Biggers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Perry Taylor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Weilin Wang <[email protected]>
Cc: Xin Gao <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


# c302378b 09-Nov-2022 Eduard Zingerman <[email protected]>

libbpf: Hashmap interface update to allow both long and void* keys/values

An update for libbpf's hashmap interface from void* -> void* to a
polymorphic one, allowing both long and void* keys and val

libbpf: Hashmap interface update to allow both long and void* keys/values

An update for libbpf's hashmap interface from void* -> void* to a
polymorphic one, allowing both long and void* keys and values.

This simplifies many use cases in libbpf as hashmaps there are mostly
integer to integer.

Perf copies hashmap implementation from libbpf and has to be
updated as well.

Changes to libbpf, selftests/bpf and perf are packed as a single
commit to avoid compilation issues with any future bisect.

Polymorphic interface is acheived by hiding hashmap interface
functions behind auxiliary macros that take care of necessary
type casts, for example:

#define hashmap_cast_ptr(p) \
({ \
_Static_assert((p) == NULL || sizeof(*(p)) == sizeof(long),\
#p " pointee should be a long-sized integer or a pointer"); \
(long *)(p); \
})

bool hashmap_find(const struct hashmap *map, long key, long *value);

#define hashmap__find(map, key, value) \
hashmap_find((map), (long)(key), hashmap_cast_ptr(value))

- hashmap__find macro casts key and value parameters to long
and long* respectively
- hashmap_cast_ptr ensures that value pointer points to a memory
of appropriate size.

This hack was suggested by Andrii Nakryiko in [1].
This is a follow up for [2].

[1] https://lore.kernel.org/bpf/CAEf4BzZ8KFneEJxFAaNCCFPGqp20hSpS2aCj76uRk3-qZUH5xg@mail.gmail.com/
[2] https://lore.kernel.org/bpf/[email protected]/T/#m65b28f1d6d969fcd318b556db6a3ad499a42607d

Signed-off-by: Eduard Zingerman <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]

show more ...


Revision tags: v6.1-rc4, v6.1-rc3, v6.1-rc2
# 8b76a318 18-Oct-2022 Namhyung Kim <[email protected]>

perf stat: Remove unused perf_counts.aggr field

The aggr field in the struct perf_counts is to keep the aggregated value
in the AGGR_GLOBAL for the old code. But it's not used anymore.

Signed-off-

perf stat: Remove unused perf_counts.aggr field

The aggr field in the struct perf_counts is to keep the aggregated value
in the AGGR_GLOBAL for the old code. But it's not used anymore.

Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


# 91f85f98 18-Oct-2022 Namhyung Kim <[email protected]>

perf stat: Display event stats using aggr counts

Now aggr counts are ready for use. Convert the display routines to use
the aggr counts and update the shadow stat with them. It doesn't need
to agg

perf stat: Display event stats using aggr counts

Now aggr counts are ready for use. Convert the display routines to use
the aggr counts and update the shadow stat with them. It doesn't need
to aggregate counts or collect aliases anymore during the display. Get
rid of now unused struct perf_aggr_thread_value.

Note that there's a difference in the display order among the aggr mode.
For per-core/die/socket/node aggregation, it shows relevant events in
the same unit together, whereas global/thread/no aggregation it shows
the same events for different units together. So it still uses separate
codes to display them due to the ordering.

One more thing to note is that it breaks per-core event display for now.
The next patch will fix it to have identical output as of now.

Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


# 88f1d351 18-Oct-2022 Namhyung Kim <[email protected]>

perf stat: Add perf_stat_process_shadow_stats()

This function updates the shadow stats using the aggregated counts
uniformly since it uses the aggr_counts for the every aggr mode.

It'd have duplica

perf stat: Add perf_stat_process_shadow_stats()

This function updates the shadow stats using the aggregated counts
uniformly since it uses the aggr_counts for the every aggr mode.

It'd have duplicate shadow stats for each items for now since the
display routines will update them once again. But that'd be fine
as it shows the average values and it'd be gone eventually.

Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


# 1d6d2bea 18-Oct-2022 Namhyung Kim <[email protected]>

perf stat: Add perf_stat_process_percore()

The perf_stat_process_percore() is to aggregate counts for an event per-core
even if the aggr_mode is AGGR_NONE. This is enabled when user requested it
on

perf stat: Add perf_stat_process_percore()

The perf_stat_process_percore() is to aggregate counts for an event per-core
even if the aggr_mode is AGGR_NONE. This is enabled when user requested it
on the command line.

To handle that, it keeps the per-cpu counts at first. And then it aggregates
the counts that have the same core id in the aggr->counts and updates the
values for each cpu back.

Later, per-core events will skip one of the CPUs unless percore-show-thread
option is given. In that case, it can simply print all cpu stats with the
updated (per-core) values.

Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


# 942c5593 18-Oct-2022 Namhyung Kim <[email protected]>

perf stat: Add perf_stat_merge_counters()

The perf_stat_merge_counters() is to aggregate the same events in different
PMUs like in case of uncore or hybrid. The same logic is in the stat-display
ro

perf stat: Add perf_stat_merge_counters()

The perf_stat_merge_counters() is to aggregate the same events in different
PMUs like in case of uncore or hybrid. The same logic is in the stat-display
routines but I think it should be handled when it processes the event counters.

As it works on the aggr_counters, it doesn't change the output yet.

Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


# 8f97963e 18-Oct-2022 Namhyung Kim <[email protected]>

perf stat: Reset aggr counts for each interval

The evsel->stats->aggr->count should be reset for interval processing
since we want to use the values directly for display.

Signed-off-by: Namhyung Ki

perf stat: Reset aggr counts for each interval

The evsel->stats->aggr->count should be reset for interval processing
since we want to use the values directly for display.

Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


# ae7e6492 18-Oct-2022 Namhyung Kim <[email protected]>

perf stat: Allocate aggr counts for recorded data

In the process_stat_config_event() it sets the aggr_mode that means the
earlier evlist__alloc_stats() cannot allocate the aggr counts due to the
mis

perf stat: Allocate aggr counts for recorded data

In the process_stat_config_event() it sets the aggr_mode that means the
earlier evlist__alloc_stats() cannot allocate the aggr counts due to the
missing aggr_mode.

Do it after setting the aggr_map using evlist__alloc_aggr_stats().

Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

show more ...


123456