|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1 |
|
| #
3af2e2f6 |
| 17-Sep-2024 |
Narayana Murty N <[email protected]> |
powerpc/pseries/eeh: move pseries_eeh_err_inject() outside CONFIG_DEBUG_FS block
Makes pseries_eeh_err_inject() available even when debugfs is disabled (CONFIG_DEBUG_FS=n). It moves eeh_debugfs_brea
powerpc/pseries/eeh: move pseries_eeh_err_inject() outside CONFIG_DEBUG_FS block
Makes pseries_eeh_err_inject() available even when debugfs is disabled (CONFIG_DEBUG_FS=n). It moves eeh_debugfs_break_device() and eeh_pe_inject_mmio_error() out of the CONFIG_DEBUG_FS block and renames it as eeh_break_device().
Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Fixes: b0e2b828dfca ("powerpc/pseries/eeh: Fix pseries_eeh_err_inject") Signed-off-by: Narayana Murty N <[email protected]> Reviewed-by: Ritesh Harjani (IBM) <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
|
Revision tags: v6.11 |
|
| #
b0e2b828 |
| 09-Sep-2024 |
Narayana Murty N <[email protected]> |
powerpc/pseries/eeh: Fix pseries_eeh_err_inject
VFIO_EEH_PE_INJECT_ERR ioctl is currently failing on pseries due to missing implementation of err_inject eeh_ops for pseries. This patch implements ps
powerpc/pseries/eeh: Fix pseries_eeh_err_inject
VFIO_EEH_PE_INJECT_ERR ioctl is currently failing on pseries due to missing implementation of err_inject eeh_ops for pseries. This patch implements pseries_eeh_err_inject in eeh_ops/pseries eeh_ops. Implements support for injecting MMIO load/store error for testing from user space.
The check on PCI error type (bus type) code is moved to platform code, since the eeh_pe_inject_err can be allowed to more error types depending on platform requirement. Removal of the check for 'type' in eeh_pe_inject_err() doesn't impact PowerNV as pnv_eeh_err_inject() already has an equivalent check in place.
Signed-off-by: Narayana Murty N <[email protected]> Reviewed-by: Vaibhav Jain <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
|
Revision tags: v6.11-rc7 |
|
| #
5b4bc44a |
| 03-Sep-2024 |
Michael Ellerman <[email protected]> |
powerpc: Stop using no_llseek
Since commit 868941b14441 ("fs: remove no_llseek"), no_llseek() is simply defined to be NULL, and a NULL llseek means seeking is unsupported.
So for statically defined
powerpc: Stop using no_llseek
Since commit 868941b14441 ("fs: remove no_llseek"), no_llseek() is simply defined to be NULL, and a NULL llseek means seeking is unsupported.
So for statically defined file_operations, such as all these, there's no need or benefit to set llseek = no_llseek.
Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
|
Revision tags: v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6 |
|
| #
35146ead |
| 24-Jun-2024 |
Shivaprasad G Bhat <[email protected]> |
powerpc/iommu: Move dev_has_iommu_table() to iommu.c
Move function dev_has_iommu_table() to powerpc/kernel/iommu.c as it is going to be used by machine specific iommu code as well in subsequent patc
powerpc/iommu: Move dev_has_iommu_table() to iommu.c
Move function dev_has_iommu_table() to powerpc/kernel/iommu.c as it is going to be used by machine specific iommu code as well in subsequent patches.
Signed-off-by: Shivaprasad G Bhat <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
|
Revision tags: v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6 |
|
| #
d1679b4f |
| 22-Apr-2024 |
Ganesh Goudar <[email protected]> |
powerpc/eeh: Permanently disable the removed device
When a device is hot removed on powernv, the hotplug driver clears the device's state. However, on pseries, if a device is removed by phyp after r
powerpc/eeh: Permanently disable the removed device
When a device is hot removed on powernv, the hotplug driver clears the device's state. However, on pseries, if a device is removed by phyp after reaching the error threshold, the kernel remains unaware, leading to the device not being torn down. This prevents necessary remediation actions like failover.
Permanently disable the device if the presence check fails.
Also, in eeh_dev_check_failure in we may consider the error as false positive if the device is hotpluged out as the get_state call returns EEH_STATE_NOT_SUPPORT and we may end up not clearing the device state, so log the event if the state is not moved to permanent failure state.
Signed-off-by: Ganesh Goudar <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
|
Revision tags: v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5 |
|
| #
1fd02f66 |
| 30-Apr-2022 |
Julia Lawall <[email protected]> |
powerpc: fix typos in comments
Various spelling mistakes in comments. Detected with the help of Coccinelle.
Signed-off-by: Julia Lawall <[email protected]> Reviewed-by: Joel Stanley <[email protected]
powerpc: fix typos in comments
Various spelling mistakes in comments. Detected with the help of Coccinelle.
Signed-off-by: Julia Lawall <[email protected]> Reviewed-by: Joel Stanley <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5 |
|
| #
b616230e |
| 09-Oct-2021 |
Kai Song <[email protected]> |
powerpc/eeh: Fix docstrings in eeh.c
We fix the following warnings when building kernel with W=1: arch/powerpc/kernel/eeh.c:598: warning: Function parameter or member 'function' not described in 'ee
powerpc/eeh: Fix docstrings in eeh.c
We fix the following warnings when building kernel with W=1: arch/powerpc/kernel/eeh.c:598: warning: Function parameter or member 'function' not described in 'eeh_pci_enable' arch/powerpc/kernel/eeh.c:774: warning: Function parameter or member 'edev' not described in 'eeh_set_dev_freset' arch/powerpc/kernel/eeh.c:774: warning: expecting prototype for eeh_set_pe_freset(). Prototype was for eeh_set_dev_freset() instead arch/powerpc/kernel/eeh.c:814: warning: Function parameter or member 'include_passed' not described in 'eeh_pe_reset_full' arch/powerpc/kernel/eeh.c:944: warning: Function parameter or member 'ops' not described in 'eeh_init' arch/powerpc/kernel/eeh.c:1451: warning: Function parameter or member 'include_passed' not described in 'eeh_pe_reset' arch/powerpc/kernel/eeh.c:1526: warning: Function parameter or member 'func' not described in 'eeh_pe_inject_err' arch/powerpc/kernel/eeh.c:1526: warning: Excess function parameter 'function' described in 'eeh_pe_inject_err'
Signed-off-by: Kai Song <[email protected]> Reviewed-by: Daniel Axtens <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
5a72431e |
| 04-Oct-2021 |
Uwe Kleine-König <[email protected]> |
powerpc/eeh: Use dev_driver_string() instead of struct pci_dev->driver->name
Replace pdev->driver->name by dev_driver_string() for the corresponding struct device. This is a step toward removing pc
powerpc/eeh: Use dev_driver_string() instead of struct pci_dev->driver->name
Replace pdev->driver->name by dev_driver_string() for the corresponding struct device. This is a step toward removing pci_dev->driver.
Move the function nearer its only user and instead of the ?: operator use a normal "if" which is more readable.
Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Uwe Kleine-König <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
show more ...
|
|
Revision tags: v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6 |
|
| #
dbf77fed |
| 12-Aug-2021 |
Aneesh Kumar K.V <[email protected]> |
powerpc: rename powerpc_debugfs_root to arch_debugfs_dir
No functional change in this patch. arch_debugfs_dir is the generic kernel name declared in linux/debugfs.h for arch-specific debugfs directo
powerpc: rename powerpc_debugfs_root to arch_debugfs_dir
No functional change in this patch. arch_debugfs_dir is the generic kernel name declared in linux/debugfs.h for arch-specific debugfs directory. Architectures like x86/s390 already use the name. Rename powerpc specific powerpc_debugfs_root to arch_debugfs_dir.
Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4 |
|
| #
5362a4b6 |
| 26-May-2021 |
Nicholas Piggin <[email protected]> |
powerpc: Fix reverse map real-mode address lookup with huge vmalloc
real_vmalloc_addr() does not currently work for huge vmalloc, which is what the reverse map can be allocated with for radix host,
powerpc: Fix reverse map real-mode address lookup with huge vmalloc
real_vmalloc_addr() does not currently work for huge vmalloc, which is what the reverse map can be allocated with for radix host, hash guest.
Extract the hugepage aware equivalent from eeh code into a helper, and convert existing sites including this one to use it.
Fixes: 8abddd968a30 ("powerpc/64s/radix: Enable huge vmalloc mappings") Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7 |
|
| #
f3d03fc7 |
| 02-Feb-2021 |
Yang Li <[email protected]> |
powerpc/eeh: remove unneeded semicolon
Eliminate the following coccicheck warning: ./arch/powerpc/kernel/eeh.c:782:2-3: Unneeded semicolon
Reported-by: Abaci Robot <[email protected]> Signed-
powerpc/eeh: remove unneeded semicolon
Eliminate the following coccicheck warning: ./arch/powerpc/kernel/eeh.c:782:2-3: Unneeded semicolon
Reported-by: Abaci Robot <[email protected]> Signed-off-by: Yang Li <[email protected]> Reviewed-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
5ae5bc12 |
| 12-Apr-2021 |
Mahesh Salgaonkar <[email protected]> |
powerpc/eeh: Fix EEH handling for hugepages in ioremap space.
During the EEH MMIO error checking, the current implementation fails to map the (virtual) MMIO address back to the pci device on radix w
powerpc/eeh: Fix EEH handling for hugepages in ioremap space.
During the EEH MMIO error checking, the current implementation fails to map the (virtual) MMIO address back to the pci device on radix with hugepage mappings for I/O. This results into failure to dispatch EEH event with no recovery even when EEH capability has been enabled on the device.
eeh_check_failure(token) # token = virtual MMIO address addr = eeh_token_to_phys(token); edev = eeh_addr_cache_get_dev(addr); if (!edev) return 0; eeh_dev_check_failure(edev); <= Dispatch the EEH event
In case of hugepage mappings, eeh_token_to_phys() has a bug in virt -> phys translation that results in wrong physical address, which is then passed to eeh_addr_cache_get_dev() to match it against cached pci I/O address ranges to get to a PCI device. Hence, it fails to find a match and the EEH event never gets dispatched leaving the device in failed state.
The commit 33439620680be ("powerpc/eeh: Handle hugepages in ioremap space") introduced following logic to translate virt to phys for hugepage mappings:
eeh_token_to_phys(): + pa = pte_pfn(*ptep); + + /* On radix we can do hugepage mappings for io, so handle that */ + if (hugepage_shift) { + pa <<= hugepage_shift; <= This is wrong + pa |= token & ((1ul << hugepage_shift) - 1); + }
This patch fixes the virt -> phys translation in eeh_token_to_phys() function.
$ cat /sys/kernel/debug/powerpc/eeh_address_cache mem addr range [0x0000040080000000-0x00000400807fffff]: 0030:01:00.1 mem addr range [0x0000040080800000-0x0000040080ffffff]: 0030:01:00.1 mem addr range [0x0000040081000000-0x00000400817fffff]: 0030:01:00.0 mem addr range [0x0000040081800000-0x0000040081ffffff]: 0030:01:00.0 mem addr range [0x0000040082000000-0x000004008207ffff]: 0030:01:00.1 mem addr range [0x0000040082080000-0x00000400820fffff]: 0030:01:00.0 mem addr range [0x0000040082100000-0x000004008210ffff]: 0030:01:00.1 mem addr range [0x0000040082110000-0x000004008211ffff]: 0030:01:00.0
Above is the list of cached io address ranges of pci 0030:01:00.<fn>.
Before this patch:
Tracing 'arg1' of function eeh_addr_cache_get_dev() during error injection clearly shows that 'addr=' contains wrong physical address:
kworker/u16:0-7 [001] .... 108.883775: eeh_addr_cache_get_dev: (eeh_addr_cache_get_dev+0xc/0xf0) addr=0x80103000a510
dmesg shows no EEH recovery messages:
[ 108.563768] bnx2x: [bnx2x_timer:5801(eth2)]MFW seems hanged: drv_pulse (0x9ae) != mcp_pulse (0x7fff) [ 108.563788] bnx2x: [bnx2x_hw_stats_update:870(eth2)]NIG timer max (4294967295) [ 108.883788] bnx2x: [bnx2x_acquire_hw_lock:2013(eth1)]lock_status 0xffffffff resource_bit 0x1 [ 108.884407] bnx2x 0030:01:00.0 eth1: MDC/MDIO access timeout [ 108.884976] bnx2x 0030:01:00.0 eth1: MDC/MDIO access timeout <..>
After this patch:
eeh_addr_cache_get_dev() trace shows correct physical address:
<idle>-0 [001] ..s. 1043.123828: eeh_addr_cache_get_dev: (eeh_addr_cache_get_dev+0xc/0xf0) addr=0x40080bc7cd8
dmesg logs shows EEH recovery getting triggerred:
[ 964.323980] bnx2x: [bnx2x_timer:5801(eth2)]MFW seems hanged: drv_pulse (0x746f) != mcp_pulse (0x7fff) [ 964.323991] EEH: Recovering PHB#30-PE#10000 [ 964.324002] EEH: PE location: N/A, PHB location: N/A [ 964.324006] EEH: Frozen PHB#30-PE#10000 detected <..>
Fixes: 33439620680b ("powerpc/eeh: Handle hugepages in ioremap space") Cc: [email protected] # v5.3+ Reported-by: Dominic DeMarco <[email protected]> Signed-off-by: Mahesh Salgaonkar <[email protected]> Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/161821396263.48361.2796709239866588652.stgit@jupiter
show more ...
|
| #
7a7685ac |
| 14-Mar-2021 |
Michael Ellerman <[email protected]> |
powerpc/eeh: Fix build failure with CONFIG_PROC_FS=n
The build fails with CONFIG_PROC_FS=n:
arch/powerpc/kernel/eeh.c:1571:12: error: ‘proc_eeh_show’ defined but not used 1571 | static int pro
powerpc/eeh: Fix build failure with CONFIG_PROC_FS=n
The build fails with CONFIG_PROC_FS=n:
arch/powerpc/kernel/eeh.c:1571:12: error: ‘proc_eeh_show’ defined but not used 1571 | static int proc_eeh_show(struct seq_file *m, void *v)
Wrap proc_eeh_show() in an ifdef to avoid it.
Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3 |
|
| #
9e857416 |
| 03-Nov-2020 |
Oliver O'Halloran <[email protected]> |
powerpc/eeh: Add a debugfs interface to check if a driver supports recovery
If a PCI device's current driver implements the error handling callbacks EEH can use them to recover the device after an e
powerpc/eeh: Add a debugfs interface to check if a driver supports recovery
If a PCI device's current driver implements the error handling callbacks EEH can use them to recover the device after an error occurs. For devices without the error handling callbacks we recover them by removing the device and re-scanning it so the PCI core puts the device back into a known good state.
Currently there's no way for userspace to determine if the driver supports recovery or not which makes it difficult to write automated tests for EEH. This patch addressing that by adding a debugfs interface for querying if a specific device can be recovered or not.
Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
b5e904b8 |
| 03-Nov-2020 |
Oliver O'Halloran <[email protected]> |
powerpc/eeh: Rework pci_dev lookup in debugfs attributes
Pull the string -> pci_dev lookup stuff into a helper function. No functional change.
Signed-off-by: Oliver O'Halloran <[email protected]> Si
powerpc/eeh: Rework pci_dev lookup in debugfs attributes
Pull the string -> pci_dev lookup stuff into a helper function. No functional change.
Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.10-rc2, v5.10-rc1 |
|
| #
99f6e979 |
| 21-Oct-2020 |
Oliver O'Halloran <[email protected]> |
powerpc/eeh: Fix eeh_dev_check_failure() for PE#0
In commit 269e583357df ("powerpc/eeh: Delete eeh_pe->config_addr") the following simplification was made:
- if (!pe->addr && !pe->config_addr
powerpc/eeh: Fix eeh_dev_check_failure() for PE#0
In commit 269e583357df ("powerpc/eeh: Delete eeh_pe->config_addr") the following simplification was made:
- if (!pe->addr && !pe->config_addr) { + if (!pe->addr) { eeh_stats.no_cfg_addr++; return 0; }
This introduced a bug which causes EEH checking to be skipped for devices in PE#0.
Before the change above the check would always pass since at least one of the two PE addresses would be non-zero in all circumstances. On PowerNV pe->config_addr would be the BDFN of the first device added to the PE. The zero BDFN is reserved for the PHB's root port, but this is fine since for obscure platform reasons the root port is never assigned to PE#0.
Similarly, on pseries pe->addr has always been non-zero for the reasons outlined in commit 42de19d5ef71 ("powerpc/pseries/eeh: Allow zero to be a valid PE configuration address").
We can fix the problem by deleting the block entirely The original purpose of this test was to avoid performing EEH checks on devices that were not on an EEH capable bus. In modern Linux the edev->pe pointer will be NULL for devices that are not on an EEH capable bus. The code block immediately above this one already checks for the edev->pe == NULL case so this test (new and old) is entirely redundant.
Ideally we'd delete eeh_stats.no_cfg_addr too since nothing increments it any more. Unfortunately, that information is exposed via /proc/powerpc/eeh which means it's technically ABI. We could make it hard-coded, but that's a change for another patch.
Fixes: 269e583357df ("powerpc/eeh: Delete eeh_pe->config_addr") Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.9 |
|
| #
269e5833 |
| 07-Oct-2020 |
Oliver O'Halloran <[email protected]> |
powerpc/eeh: Delete eeh_pe->config_addr
The eeh_pe->config_addr field was supposed to be removed in commit 35d64734b643 ("powerpc/eeh: Clean up PE addressing") which made it largely unused. Finish t
powerpc/eeh: Delete eeh_pe->config_addr
The eeh_pe->config_addr field was supposed to be removed in commit 35d64734b643 ("powerpc/eeh: Clean up PE addressing") which made it largely unused. Finish the job.
Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.9-rc8, v5.9-rc7, v5.9-rc6 |
|
| #
35d64734 |
| 18-Sep-2020 |
Oliver O'Halloran <[email protected]> |
powerpc/eeh: Clean up PE addressing
When support for EEH on PowerNV was added a lot of pseries specific code was made "generic" and some of the quirks of pseries EEH came along for the ride. One of
powerpc/eeh: Clean up PE addressing
When support for EEH on PowerNV was added a lot of pseries specific code was made "generic" and some of the quirks of pseries EEH came along for the ride. One of the stranger quirks is eeh_pe containing two types of PE address: pe->addr and pe->config_addr. There reason for this appears to be historical baggage rather than any real requirements.
On pseries EEH PEs are manipulated using RTAS calls. Each EEH RTAS call takes a "PE configuration address" as an input which is used to identify which EEH PE is being manipulated by the call. When initialising the EEH state for a device the first thing we need to do is determine the configuration address for the PE which contains the device so we can enable EEH on that PE. This process is outlined in PAPR which is the modern (i.e post-2003) FW specification for pseries. However, EEH support was first described in the pSeries RISC Platform Architecture (RPA) and although they are mostly compatible EEH is one of the areas where they are not.
The major difference is that RPA doesn't actually have the concept of a PE. On RPA systems the EEH RTAS calls are done on a per-device basis using the same config_addr that would be passed to the RTAS functions to access PCI config space (e.g. ibm,read-pci-config). The config_addr is not identical since the function and config register offsets of the config_addr must be set to zero. EEH operations being done on a per-device basis doesn't make a whole lot of sense when you consider how EEH was implemented on legacy PCI systems.
For legacy PCI(-X) systems EEH was implemented using special PCI-PCI bridges which contained logic to detect errors and freeze the secondary bus when one occurred. This means that the EEH enabled state is shared among all devices behind that EEH bridge. As a result there's no way to implement the per-device control required for the semantics specified by RPA. It can be made to work if we assume that a separate EEH bridge exists for each EEH capable PCI slot and there are no bridges behind those slots. However, RPA also specifies the ibm,configure-bridge RTAS call for re-initalising bridges behind EEH capable slots after they are reset due to an EEH event so that is probably not a valid assumption. This incoherence was fixed in later PAPR, which succeeded RPA. Unfortunately, since Linux EEH support seems to have been implemented based on the RPA spec some of the legacy assumptions were carried over (probably for POWER4 compatibility).
The fix made in PAPR was the introduction of the "PE" concept and redefining the EEH RTAS calls (set-eeh-option, reset-slot, etc) to operate on a per-PE basis so all devices behind an EEH bride would share the same EEH state. The "config_addr" argument to the EEH RTAS calls became the "PE_config_addr" and the OS was required to use the ibm,get-config-addr-info RTAS call to find the correct PE address for the device. When support for the new interfaces was added to Linux it was implemented using something like:
At probe time:
pdn->eeh_config_addr = rtas_config_addr(pdn); pdn->eeh_pe_config_addr = rtas_get_config_addr_info(pdn);
When performing an RTAS call:
config_addr = pdn->eeh_config_addr; if (pdn->eeh_pe_config_addr) config_addr = pdn->eeh_pe_config_addr;
rtas_call(..., config_addr, ...);
In other words, if the ibm,get-config-addr-info RTAS call is implemented and returned a valid result we'd use that as the argument to the EEH RTAS calls. If not, Linux would fall back to using the device's config_addr. Over time these addresses have moved around going from pci_dn to eeh_dev and finally into eeh_pe. Today the users look like this:
config_addr = pe->config_addr; if (pe->addr) config_addr = pe->addr;
rtas_call(..., config_addr, ...);
However, considering the EEH core always operates on a per-PE basis and even on pseries the only per-device operation is the initial call to ibm,set-eeh-option I'm not sure if any of this actually works on an RPA system today. It doesn't make much sense to have the fallback address in a generic structure either since the bulk of the code which reference it is in pseries anyway.
The EEH core makes a token effort to support looking up a PE using the config_addr by having two arguments to eeh_pe_get(). However, a survey of all the callers to eeh_pe_get() shows that all bar one have the config_addr argument hard-coded to zero.The only caller that doesn't is in eeh_pe_tree_insert() which has:
if (!eeh_has_flag(EEH_VALID_PE_ZERO) && !edev->pe_config_addr) return -EINVAL;
pe = eeh_pe_get(hose, edev->pe_config_addr, edev->bdfn);
The third argument (config_addr) is only used if the second (pe->addr) argument is invalid. The preceding check ensures that the call to eeh_pe_get() will never happen if edev->pe_config_addr is invalid so there is no situation where eeh_pe_get() will search for a PE based on the 3rd argument. The check also means that we'll never insert a PE into the tree where pe_config_addr is zero since EEH_VALID_PE_ZERO is never set on pseries. All the users of the fallback address on pseries never actually use the fallback and all the only caller that supplies something for the config_addr argument to eeh_pe_get() never use it either. It's all dead code.
This patch removes the fallback address from eeh_pe since nothing uses it. Specificly, we do this by:
1) Removing pe->config_addr 2) Removing the EEH_VALID_PE_ZERO flag 3) Removing the fallback address argument to eeh_pe_get(). 4) Removing all the checks for pe->addr being zero in the pseries EEH code.
This leaves us with PE's only being identified by what's in their pe->addr field and the EEH core relying on the platform to ensure that eeh_dev's are only inserted into the EEH tree if they're actually inside a PE.
No functional changes, I hope.
Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
395ee2a2 |
| 18-Sep-2020 |
Oliver O'Halloran <[email protected]> |
powerpc/eeh: Move EEH initialisation to an arch initcall
The initialisation of EEH mostly happens in a core_initcall_sync initcall, followed by registering a bus notifier later on in an arch_initcal
powerpc/eeh: Move EEH initialisation to an arch initcall
The initialisation of EEH mostly happens in a core_initcall_sync initcall, followed by registering a bus notifier later on in an arch_initcall. Anything involving initcall dependecies is mostly incomprehensible unless you've spent a while staring at code so here's the full sequence:
ppc_md.setup_arch <-- pci_controllers are created here
...time passes...
core_initcall <-- pci_dns are created from DT nodes core_initcall_sync <-- platforms call eeh_init() postcore_initcall <-- PCI bus type is registered postcore_initcall_sync arch_initcall <-- EEH pci_bus notifier registered subsys_initcall <-- PHBs are scanned here
There's no real requirement to do the EEH setup at the core_initcall_sync level. It just needs to be done after pci_dn's are created and before we start scanning PHBs. Simplify the flow a bit by moving the platform EEH inititalisation to an arch_initcall so we can fold the bus notifier registration into eeh_init().
Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
5d69e46a |
| 18-Sep-2020 |
Oliver O'Halloran <[email protected]> |
powerpc/eeh: Delete eeh_ops->init
No longer used since the platforms perform their EEH initialisation before calling eeh_init().
Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: M
powerpc/eeh: Delete eeh_ops->init
No longer used since the platforms perform their EEH initialisation before calling eeh_init().
Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
d125aedb |
| 18-Sep-2020 |
Oliver O'Halloran <[email protected]> |
powerpc/eeh: Rework EEH initialisation
Drop the EEH register / unregister ops thing and have the platform pass the ops structure into eeh_init() directly. This takes one initcall out of the EEH setu
powerpc/eeh: Rework EEH initialisation
Drop the EEH register / unregister ops thing and have the platform pass the ops structure into eeh_init() directly. This takes one initcall out of the EEH setup path and it means we're only doing EEH setup on the platforms which actually support it. It's also less code and generally easier to follow.
No functional changes.
Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7 |
|
| #
d923ab7a |
| 25-Jul-2020 |
Oliver O'Halloran <[email protected]> |
powerpc/eeh: Rename eeh_{add_to|remove_from}_parent_pe()
The naming of eeh_{add_to|remove_from}_parent_pe() doesn't really reflect what they actually do. If the PE referred to be edev->pe_config_add
powerpc/eeh: Rename eeh_{add_to|remove_from}_parent_pe()
The naming of eeh_{add_to|remove_from}_parent_pe() doesn't really reflect what they actually do. If the PE referred to be edev->pe_config_addr already exists under that PHB then the edev is added to that PE. However, if the PE doesn't exist the a new one is created for the edev.
The bulk of the implementation of eeh_add_to_parent_pe() covers that second case. Similarly, most of eeh_remove_from_parent_pe() is determining when it's safe to delete a PE.
Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
1a303d88 |
| 25-Jul-2020 |
Oliver O'Halloran <[email protected]> |
powerpc/eeh: Remove spurious use of pci_dn in eeh_dump_dev_log
Retrieve the domain, bus, device, and function numbers from the edev.
Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-b
powerpc/eeh: Remove spurious use of pci_dn in eeh_dump_dev_log
Retrieve the domain, bus, device, and function numbers from the edev.
Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
17d2a487 |
| 25-Jul-2020 |
Oliver O'Halloran <[email protected]> |
powerpc/eeh: Pass eeh_dev to eeh_ops->{read|write}_config()
Mechanical conversion of the eeh_ops interfaces to use eeh_dev to reference a specific device rather than pci_dn. No functional changes.
powerpc/eeh: Pass eeh_dev to eeh_ops->{read|write}_config()
Mechanical conversion of the eeh_ops interfaces to use eeh_dev to reference a specific device rather than pci_dn. No functional changes.
Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
0c2c7652 |
| 25-Jul-2020 |
Oliver O'Halloran <[email protected]> |
powerpc/eeh: Pass eeh_dev to eeh_ops->restore_config()
Mechanical conversion of the eeh_ops interfaces to use eeh_dev to reference a specific device rather than pci_dn. No functional changes.
Signe
powerpc/eeh: Pass eeh_dev to eeh_ops->restore_config()
Mechanical conversion of the eeh_ops interfaces to use eeh_dev to reference a specific device rather than pci_dn. No functional changes.
Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|