| dabf410d | 10-Oct-2023 |
Yicong Yang <[email protected]> |
hwtracing: hisi_ptt: Optimize the trace data committing
In the current implementation, there're 4*4MiB trace buffer and hardware will fill the buffer one by one. The driver will get notified if one
hwtracing: hisi_ptt: Optimize the trace data committing
In the current implementation, there're 4*4MiB trace buffer and hardware will fill the buffer one by one. The driver will get notified if one buffer is full and then copy data to the AUX buffer. If there's no enough room for the next trace buffer, we'll commit the AUX buffer to the perf core and try to apply a new one. In a typical configuration the AUX buffer will be 16MiB, so we'll commit the data after the whole AUX buffer is occupied. Then the driver cannot apply a new AUX buffer immediately until the committed data is consumed by userspace and then there's room in the AUX buffer again.
This patch tries to optimize this by commit the data after one single trace buffer is filled. Since there's still room in the AUX buffer, driver can apply a new one without failure and don't need to wait for the userspace to consume the data.
Signed-off-by: Yicong Yang <[email protected]> Acked-by: Jonathan Cameron <[email protected]> Signed-off-by: Suzuki K Poulose <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| e0dd27ad | 10-Oct-2023 |
Yicong Yang <[email protected]> |
hwtracing: hisi_ptt: Handle the interrupt in hardirq context
Handle the trace interrupt in the hardirq context, make sure the irq core won't threaded it by declaring IRQF_NO_THREAD and userspace won
hwtracing: hisi_ptt: Handle the interrupt in hardirq context
Handle the trace interrupt in the hardirq context, make sure the irq core won't threaded it by declaring IRQF_NO_THREAD and userspace won't balance it by declaring IRQF_NOBALANCING. Otherwise we may violate the synchronization requirements of the perf core, referenced to the change of arm-ccn PMU commit 0811ef7e2f54 ("bus: arm-ccn: fix PMU interrupt flags").
In the interrupt handler we mainly doing 2 things: - Copy the data from the local DMA buffer to the AUX buffer - Commit the data in the AUX buffer
Signed-off-by: Yicong Yang <[email protected]> Acked-by: Jonathan Cameron <[email protected]> [ Fixed commit description to suppress checkpatch warning ] Signed-off-by: Suzuki K Poulose <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| 6c50384e | 21-Jun-2023 |
Yicong Yang <[email protected]> |
hwtracing: hisi_ptt: Fix potential sleep in atomic context
We're using pci_irq_vector() to obtain the interrupt number and then bind it to the CPU start perf under the protection of spinlock in pmu:
hwtracing: hisi_ptt: Fix potential sleep in atomic context
We're using pci_irq_vector() to obtain the interrupt number and then bind it to the CPU start perf under the protection of spinlock in pmu::start(). pci_irq_vector() might sleep since [1] because it will call msi_domain_get_virq() to get the MSI interrupt number and it needs to acquire dev->msi.data->mutex. Getting a mutex will sleep on contention. So use pci_irq_vector() in an atomic context is problematic.
This patch cached the interrupt number in the probe() and uses the cached data instead to avoid potential sleep.
[1] commit 82ff8e6b78fc ("PCI/MSI: Use msi_get_virq() in pci_get_vector()")
Fixes: ff0de066b463 ("hwtracing: hisi_ptt: Add trace function support for HiSilicon PCIe Tune and Trace device") Reviewed-by: Jonathan Cameron <[email protected]> Signed-off-by: Yicong Yang <[email protected]> Signed-off-by: Suzuki K Poulose <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| 45c90292 | 21-Jun-2023 |
Yicong Yang <[email protected]> |
hwtracing: hisi_ptt: Advertise PERF_PMU_CAP_NO_EXCLUDE for PTT PMU
The PTT trace collects PCIe TLP headers from the PCIe link and don't have the ability to exclude certain context. It doesn't suppor
hwtracing: hisi_ptt: Advertise PERF_PMU_CAP_NO_EXCLUDE for PTT PMU
The PTT trace collects PCIe TLP headers from the PCIe link and don't have the ability to exclude certain context. It doesn't support itrace as well. So replace PERF_PMU_CAP_ITRACE with PERF_PMU_CAP_NO_EXCLUDE. This will greatly save the storage of final data. Tested tracing idle link for ~15s, without this patch we'll collect ~28.682MB data for additional information and with this patch it reduced to ~0.226MB.
Reviewed-by: Jonathan Cameron <[email protected]> Signed-off-by: Yicong Yang <[email protected]> Tested-by: Junhao He <[email protected]> Signed-off-by: Suzuki K Poulose <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| 6373c463 | 21-Jun-2023 |
Yicong Yang <[email protected]> |
hwtracing: hisi_ptt: Export available filters through sysfs
The PTT can only filter the traced TLP headers by the Root Ports or the Requester ID of the Endpoint, which are located on the same PCIe c
hwtracing: hisi_ptt: Export available filters through sysfs
The PTT can only filter the traced TLP headers by the Root Ports or the Requester ID of the Endpoint, which are located on the same PCIe core of the PTT device. The filter value used is derived from the BDF number of the supported Root Port or the Endpoint. It's not friendly enough for the users since it requires the user to be familiar enough with the platform and calculate the filter value manually.
This patch export the available filters through sysfs. Each available filters is presented as an individual file with the name of the BDF number of the related PCIe device. The files are created under $(PTT PMU dir)/available_root_port_filters and $(PTT PMU dir)/available_requester_filters respectively. The filter value can be known by reading the related file.
Then the users can easily know the available filters for trace and get the filter values without calculating.
Reviewed-by: Jonathan Cameron <[email protected]> Signed-off-by: Yicong Yang <[email protected]> Signed-off-by: Suzuki K Poulose <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|