|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4 |
|
| #
665745f2 |
| 18-Oct-2024 |
Ilpo Järvinen <[email protected]> |
PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller
This mostly reverts the commit b4c7d2076b4e ("PCI/LINK: Remove bandwidth notification"). An upcoming commit extends this driver build
PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller
This mostly reverts the commit b4c7d2076b4e ("PCI/LINK: Remove bandwidth notification"). An upcoming commit extends this driver building PCIe bandwidth controller on top of it.
PCIe bandwidth notifications were first added in the commit e8303bb7a75c ("PCI/LINK: Report degraded links via link bandwidth notification") but later had to be removed. The significant changes compared with the old bandwidth notification driver include:
1) Don't print the notifications into kernel log, just keep the Link Speed cached in struct pci_bus updated. While somewhat unfortunate, the log spam was the source of complaints that eventually lead to the removal of the bandwidth notifications driver (see the links below for further information).
2) Besides the Link Bandwidth Management Interrupt, also enable Link Autonomous Bandwidth Interrupt to cover the other source of bandwidth changes.
3) Handle Link Speed updates robustly. Refresh the cached Link Speed when enabling Bandwidth Notification Interrupts, and solve the race between Link Speed read and LBMS/LABS update in pcie_bwnotif_irq_thread().
4) Use concurrency safe LNKCTL RMW operations.
5) The driver is now called PCIe bwctrl (bandwidth controller) instead of just bandwidth notifications because of increased scope and functionality within the driver.
6) Coexist with the Target Link Speed quirk in pcie_failed_link_retrain(). Provide LBMS counting API for it.
7) Tweaks to variable/functions names for consistency and length reasons.
Bandwidth Notifications enable the cur_bus_speed in the struct pci_bus to keep track PCIe Link Speed changes.
[bhelgaas: This is based on previous work by Alexandru Gagniuc <[email protected]>; see e8303bb7a75c ("PCI/LINK: Report degraded links via link bandwidth notification")]
Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/[email protected]/ Link: https://lore.kernel.org/linux-pci/[email protected]/ Link: https://lore.kernel.org/linux-pci/[email protected]/ Suggested-by: Lukas Wunner <[email protected]> # Building bwctrl on top of bwnotif Signed-off-by: Ilpo Järvinen <[email protected]> [bhelgaas: squash fix to drop IRQF_ONESHOT and convert to hardirq handler: https://lore.kernel.org/r/[email protected]] Signed-off-by: Bjorn Helgaas <[email protected]> Tested-by: Stefan Wahren <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2 |
|
| #
e8afd0d9 |
| 12-May-2023 |
Rongguang Wei <[email protected]> |
PCI: pciehp: Cancel bringup sequence if card is not present
If a PCIe hotplug slot has an Attention Button, the normal hot-add flow is:
- Slot is empty and slot power is off - User inserts card
PCI: pciehp: Cancel bringup sequence if card is not present
If a PCIe hotplug slot has an Attention Button, the normal hot-add flow is:
- Slot is empty and slot power is off - User inserts card in slot and presses Attention Button - OS blinks Power Indicator for 5 seconds - After 5 seconds, OS turns on Power Indicator, turns on slot power, and enumerates the device
Previously, if a user pressed the Attention Button on an *empty* slot, pciehp logged the following messages and blinked the Power Indicator until a second button press:
[0.000] pciehp: Button press: will power on in 5 sec [0.001] # Power Indicator starts blinking [5.001] # 5 second timeout; slot is empty, so we should cancel the request to power on and turn off Power Indicator
[7.000] # Power Indicator still blinking [8.000] # possible card insertion [9.000] pciehp: Button press: canceling request to power on
The first button press incorrectly left the slot in BLINKINGON_STATE, so the second was interpreted as a "cancel power on" event regardless of whether a card was present.
If the slot is empty, turn off the Power Indicator and return from BLINKINGON_STATE to OFF_STATE after 5 seconds, effectively canceling the request to power on. Putting the slot in OFF_STATE also means the second button press will correctly request a slot power on if the slot is occupied.
[bhelgaas: commit log] Link: https://lore.kernel.org/r/[email protected] Fixes: d331710ea78f ("PCI: pciehp: Become resilient to missed events") Suggested-by: Lukas Wunner <[email protected]> Signed-off-by: Rongguang Wei <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Lukas Wunner <[email protected]>
show more ...
|
| #
5054133a |
| 22-May-2023 |
Bjorn Helgaas <[email protected]> |
PCI: pciehp: Simplify Attention Button logging
Previously, pressing the Attention Button always logged two lines, the first from pciehp_ist() and the second from pciehp_handle_button_press():
Att
PCI: pciehp: Simplify Attention Button logging
Previously, pressing the Attention Button always logged two lines, the first from pciehp_ist() and the second from pciehp_handle_button_press():
Attention button pressed Powering on due to button press
Since pciehp_handle_button_press() always logs the more detailed message, remove the generic "Attention button pressed" message. Reword the pciehp_handle_button_press() to be of the form:
Button press: will power on in 5 sec Button press: will power off in 5 sec Button press: canceling request to power on Button press: canceling request to power off Button press: ignoring invalid state %#x
Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Lukas Wunner <[email protected]>
show more ...
|
|
Revision tags: v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6 |
|
| #
8a614499 |
| 17-Sep-2020 |
Lukas Wunner <[email protected]> |
PCI: pciehp: Reduce noisiness on hot removal
When a PCIe card is hot-removed, the Presence Detect State and Data Link Layer Link Active bits often do not clear simultaneously. I've seen delays of u
PCI: pciehp: Reduce noisiness on hot removal
When a PCIe card is hot-removed, the Presence Detect State and Data Link Layer Link Active bits often do not clear simultaneously. I've seen delays of up to 244 msec between the two events with Thunderbolt.
After pciehp has brought down the slot in response to the first event, the other bit may still be set. It's not discernible whether it's set because a new card is already in the slot or if it will soon clear. So pciehp tries to bring up the slot and in the latter case fails with a bunch of messages, some of them at KERN_ERR severity. If the slot is no longer occupied, the messages are false positives and annoy users.
Stuart Hayes reports the following splat on hot removal:
KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Link Up KERN_INFO pcieport 0000:3c:06.0: pciehp: Timeout waiting for Presence Detect KERN_ERR pcieport 0000:3c:06.0: pciehp: link training error: status 0x0001 KERN_ERR pcieport 0000:3c:06.0: pciehp: Failed to check link status
Dongdong Liu complains about a similar splat:
KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Link Down KERN_INFO iommu: Removing device 0000:87:00.0 from group 12 KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Card present KERN_INFO pcieport 0000:80:10.0: Data Link Layer Link Active not set in 1000 msec KERN_ERR pciehp 0000:80:10.0:pcie004: Failed to check link status
Users are particularly irritated to see a bringup attempt even though the slot was explicitly brought down via sysfs. In a perfect world, we could avoid this by setting Link Disable on slot bringdown and re-enabling it upon a Presence Detect State change. In reality however, there are broken hotplug ports which hardwire Presence Detect to zero, see 80696f991424 ("PCI: pciehp: Tolerate Presence Detect hardwired to zero"). Conversely, PCIe r1.0 hotplug ports hardwire Link Active to zero because Link Active Reporting wasn't specified before PCIe r1.1. On unplug, some ports first clear Presence then Link (see Stuart Hayes' splat) whereas others use the inverse order (see Dongdong Liu's splat). To top it off, there are hotplug ports which flap the Presence and Link bits on slot bringup, see 6c35a1ac3da6 ("PCI: pciehp: Tolerate initially unstable link").
pciehp is designed to work with all of these variants. Surplus attempts at slot bringup are a lesser evil than not being able to bring up slots at all. Although we could try to perfect the behavior for specific hotplug controllers, we'd risk breaking others or increasing code complexity.
But we can certainly minimize annoyance by emitting only a single message with KERN_INFO severity if bringup is unsuccessful:
* Drop the "Timeout waiting for Presence Detect" message in pcie_wait_for_presence(). The sole caller of that function, pciehp_check_link_status(), ignores the timeout and carries on. It emits error messages of its own and I don't think this particular message adds much value.
* There's a single error condition in pciehp_check_link_status() which does not emit a message. Adding one allows dropping the "Failed to check link status" message emitted by board_added() if pciehp_check_link_status() returns a non-zero integer.
* Tone down all messages in pciehp_check_link_status() to KERN_INFO severity and rephrase them to look as innocuous as possible. To this end, move the message emitted by pcie_wait_for_link_delay() to its callers.
As a result, Stuart Hayes' splat becomes:
KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Link Up KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Cannot train link: status 0x0001
Dongdong Liu's splat becomes:
KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Card present KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): No link
The messages now merely serve as information that presence or link bits were set a little longer than expected. Bringup failures which are not false positives are still reported, albeit no longer at KERN_ERR severity.
Link: https://lore.kernel.org/linux-pci/[email protected]/ Link: https://lore.kernel.org/linux-pci/[email protected]/ Link: https://lore.kernel.org/r/b45e46fd8a6aa6930aaac9d7718c2e4b787a4e5e.1595935071.git.lukas@wunner.de Reported-by: Stuart Hayes <[email protected]> Reported-by: Dongdong Liu <[email protected]> Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Mika Westerberg <[email protected]>
show more ...
|
|
Revision tags: v5.9-rc5, v5.9-rc4, v5.9-rc3 |
|
| #
df561f66 |
| 23-Aug-2020 |
Gustavo A. R. Silva <[email protected]> |
treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through mar
treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case.
[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
Signed-off-by: Gustavo A. R. Silva <[email protected]>
show more ...
|
|
Revision tags: v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6 |
|
| #
87d0f2a5 |
| 29-Oct-2019 |
Mika Westerberg <[email protected]> |
PCI: pciehp: Prevent deadlock on disconnect
This addresses deadlocks in these common cases in hierarchies containing two switches:
- All involved ports are runtime suspended and they are unplugge
PCI: pciehp: Prevent deadlock on disconnect
This addresses deadlocks in these common cases in hierarchies containing two switches:
- All involved ports are runtime suspended and they are unplugged. This can happen easily if the drivers involved automatically enable runtime PM (xHCI for example does that).
- System is suspended (e.g., closing the lid on a laptop) with a dock + something else connected, and the dock is unplugged while suspended.
These cases lead to the following deadlock:
INFO: task irq/126-pciehp:198 blocked for more than 120 seconds. irq/126-pciehp D 0 198 2 0x80000000 Call Trace: schedule+0x2c/0x80 schedule_timeout+0x246/0x350 wait_for_completion+0xb7/0x140 kthread_stop+0x49/0x110 free_irq+0x32/0x70 pcie_shutdown_notification+0x2f/0x50 pciehp_remove+0x27/0x50 pcie_port_remove_service+0x36/0x50 device_release_driver+0x12/0x20 bus_remove_device+0xec/0x160 device_del+0x13b/0x350 device_unregister+0x1a/0x60 remove_iter+0x1e/0x30 device_for_each_child+0x56/0x90 pcie_port_device_remove+0x22/0x40 pcie_portdrv_remove+0x20/0x60 pci_device_remove+0x3e/0xc0 device_release_driver_internal+0x18c/0x250 device_release_driver+0x12/0x20 pci_stop_bus_device+0x6f/0x90 pci_stop_bus_device+0x31/0x90 pci_stop_and_remove_bus_device+0x12/0x20 pciehp_unconfigure_device+0x88/0x140 pciehp_disable_slot+0x6a/0x110 pciehp_handle_presence_or_link_change+0x263/0x400 pciehp_ist+0x1c9/0x1d0 irq_thread_fn+0x24/0x60 irq_thread+0xeb/0x190 kthread+0x120/0x140
INFO: task irq/190-pciehp:2288 blocked for more than 120 seconds. irq/190-pciehp D 0 2288 2 0x80000000 Call Trace: __schedule+0x2a2/0x880 schedule+0x2c/0x80 schedule_preempt_disabled+0xe/0x10 mutex_lock+0x2c/0x30 pci_lock_rescan_remove+0x15/0x20 pciehp_unconfigure_device+0x4d/0x140 pciehp_disable_slot+0x6a/0x110 pciehp_handle_presence_or_link_change+0x263/0x400 pciehp_ist+0x1c9/0x1d0 irq_thread_fn+0x24/0x60 irq_thread+0xeb/0x190 kthread+0x120/0x140
What happens here is that the whole hierarchy is runtime resumed and the parent PCIe downstream port, which got the hot-remove event, starts removing devices below it, taking pci_lock_rescan_remove() lock. When the child PCIe port is runtime resumed it calls pciehp_check_presence() which ends up calling pciehp_card_present() and pciehp_check_link_active(). Both of these use pcie_capability_read_word(), which notices that the underlying device is already gone and returns PCIBIOS_DEVICE_NOT_FOUND with the capability value set to 0. When pciehp gets this value it thinks that its child device is also hot-removed and schedules its IRQ thread to handle the event.
The deadlock happens when the child's IRQ thread runs and tries to acquire pci_lock_rescan_remove() which is already taken by the parent and the parent waits for the child's IRQ thread to finish.
Prevent this from happening by checking the return value of pcie_capability_read_word() and if it is PCIBIOS_DEVICE_NOT_FOUND stop performing any hot-removal activities.
[bhelgaas: add common scenarios to commit log] Link: https://lore.kernel.org/r/[email protected] Tested-by: Kai-Heng Feng <[email protected]> Signed-off-by: Mika Westerberg <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
show more ...
|
|
Revision tags: v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4 |
|
| #
157c1062 |
| 09-Aug-2019 |
Lukas Wunner <[email protected]> |
PCI: pciehp: Avoid returning prematurely from sysfs requests
A sysfs request to enable or disable a PCIe hotplug slot should not return before it has been carried out. That is sought to be achieved
PCI: pciehp: Avoid returning prematurely from sysfs requests
A sysfs request to enable or disable a PCIe hotplug slot should not return before it has been carried out. That is sought to be achieved by waiting until the controller's "pending_events" have been cleared.
However the IRQ thread pciehp_ist() clears the "pending_events" before it acts on them. If pciehp_sysfs_enable_slot() / _disable_slot() happen to check the "pending_events" after they have been cleared but while pciehp_ist() is still running, the functions may return prematurely with an incorrect return value.
Fix by introducing an "ist_running" flag which must be false before a sysfs request is allowed to return.
Fixes: 32a8cef274fe ("PCI: pciehp: Enable/disable exclusively from IRQ thread") Link: https://lore.kernel.org/linux-pci/[email protected] Link: https://lore.kernel.org/r/4174210466e27eb7e2243dd1d801d5f75baaffd8.1565345211.git.lukas@wunner.de Reported-and-tested-by: Xiongfeng Wang <[email protected]> Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Cc: [email protected] # v4.19+
show more ...
|
| #
4a06c2c3 |
| 05-Sep-2019 |
Bjorn Helgaas <[email protected]> |
PCI: pciehp: Refer to "Indicators" instead of "LEDs" in comments
The PCIe spec doesn't mention "green LEDs" or "amber LEDs". Replace those terms with "Power Indicator" and "Attention Indicator" so
PCI: pciehp: Refer to "Indicators" instead of "LEDs" in comments
The PCIe spec doesn't mention "green LEDs" or "amber LEDs". Replace those terms with "Power Indicator" and "Attention Indicator" so the comments match the spec language. Comment changes only.
Signed-off-by: Bjorn Helgaas <[email protected]>
show more ...
|
| #
9194094b |
| 03-Sep-2019 |
Denis Efremov <[email protected]> |
PCI: pciehp: Remove pciehp_green_led_{on,off,blink}()
Remove pciehp_green_led_{on,off,blink}() and use pciehp_set_indicators() instead, since the code is mostly the same.
[bhelgaas: drop set_power_
PCI: pciehp: Remove pciehp_green_led_{on,off,blink}()
Remove pciehp_green_led_{on,off,blink}() and use pciehp_set_indicators() instead, since the code is mostly the same.
[bhelgaas: drop set_power_indicator() wrapper to reduce the number of interfaces] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Denis Efremov <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
show more ...
|
| #
94719ba0 |
| 03-Sep-2019 |
Denis Efremov <[email protected]> |
PCI: pciehp: Combine adjacent indicator updates
Combine adjacent updates of power and attention indicators into a single pciehp_set_indicators() call. This sends one command to the hotplug controll
PCI: pciehp: Combine adjacent indicator updates
Combine adjacent updates of power and attention indicators into a single pciehp_set_indicators() call. This sends one command to the hotplug controller instead of two.
Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Denis Efremov <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]> Reviewed-by: Lukas Wunner <[email protected]>
show more ...
|
|
Revision tags: v5.3-rc3, v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3, v5.2-rc2, v5.2-rc1 |
|
| #
94dbc956 |
| 07-May-2019 |
Frederick Lawler <[email protected]> |
PCI: pciehp: Log messages with pci_dev, not pcie_device
Log messages with pci_dev, not pcie_device. Factor out common message prefixes with dev_fmt().
Example output change:
- pciehp 0000:00:06
PCI: pciehp: Log messages with pci_dev, not pcie_device
Log messages with pci_dev, not pcie_device. Factor out common message prefixes with dev_fmt().
Example output change:
- pciehp 0000:00:06.0:pcie004: Slot(0) Powering on due to button press + pcieport 0000:00:06.0: pciehp: Slot(0) Powering on due to button press
Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Frederick Lawler <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]>
show more ...
|
|
Revision tags: v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5, v5.1-rc4, v5.1-rc3, v5.1-rc2, v5.1-rc1 |
|
| #
3943af9d |
| 12-Mar-2019 |
Sergey Miroshnichenko <[email protected]> |
PCI: pciehp: Ignore Link State Changes after powering off a slot
During a safe hot remove, the OS powers off the slot, which may cause a Data Link Layer State Changed event. The slot has already be
PCI: pciehp: Ignore Link State Changes after powering off a slot
During a safe hot remove, the OS powers off the slot, which may cause a Data Link Layer State Changed event. The slot has already been set to OFF_STATE, so that event results in re-enabling the device, making it impossible to safely remove it.
Clear out the Presence Detect Changed and Data Link Layer State Changed events when the disabled slot has settled down.
It is still possible to re-enable the device if it remains in the slot after pressing the Attention Button by pressing it again.
Fixes the problem that Micah reported below: an NVMe drive power button may not actually turn off the drive.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203237 Reported-by: Micah Parrish <[email protected]> Tested-by: Micah Parrish <[email protected]> Signed-off-by: Sergey Miroshnichenko <[email protected]> [bhelgaas: changelog, add bugzilla URL] Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Lukas Wunner <[email protected]> Cc: [email protected] # v4.19+
show more ...
|
|
Revision tags: v5.0, v5.0-rc8, v5.0-rc7, v5.0-rc6, v5.0-rc5, v5.0-rc4, v5.0-rc3, v5.0-rc2, v5.0-rc1, v4.20, v4.20-rc7, v4.20-rc6, v4.20-rc5, v4.20-rc4, v4.20-rc3, v4.20-rc2, v4.20-rc1, v4.19, v4.19-rc8, v4.19-rc7, v4.19-rc6, v4.19-rc5, v4.19-rc4, v4.19-rc3 |
|
| #
125450f8 |
| 08-Sep-2018 |
Lukas Wunner <[email protected]> |
PCI: hotplug: Embed hotplug_slot
When the PCI hotplug core and its first user, cpqphp, were introduced in February 2002 with historic commit a8a2069f432c, cpqphp allocated a slot struct for its inte
PCI: hotplug: Embed hotplug_slot
When the PCI hotplug core and its first user, cpqphp, were introduced in February 2002 with historic commit a8a2069f432c, cpqphp allocated a slot struct for its internal use plus a hotplug_slot struct to be registered with the hotplug core and linked the two with pointers: https://git.kernel.org/tglx/history/c/a8a2069f432c
Nowadays, the predominant pattern in the tree is to embed ("subclass") such structures in one another and cast to the containing struct with container_of(). But it wasn't until July 2002 that container_of() was introduced with historic commit ec4f214232cf: https://git.kernel.org/tglx/history/c/ec4f214232cf
pnv_php, introduced in 2016, did the right thing and embedded struct hotplug_slot in its internal struct pnv_php_slot, but all other drivers cargo-culted cpqphp's design and linked separate structs with pointers.
Embedding structs is preferrable to linking them with pointers because it requires fewer allocations, thereby reducing overhead and simplifying error paths. Casting an embedded struct to the containing struct becomes a cheap subtraction rather than a dereference. And having fewer pointers reduces the risk of them pointing nowhere either accidentally or due to an attack.
Convert all drivers to embed struct hotplug_slot in their internal slot struct. The "private" pointer in struct hotplug_slot thereby becomes unused, so drop it.
Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Rafael J. Wysocki <[email protected]> Acked-by: Tyrel Datwyler <[email protected]> # drivers/pci/hotplug/rpa* Acked-by: Sebastian Ott <[email protected]> # drivers/pci/hotplug/s390* Acked-by: Andy Shevchenko <[email protected]> # drivers/platform/x86 Cc: Len Brown <[email protected]> Cc: Scott Murray <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Oliver OHalloran <[email protected]> Cc: Gavin Shan <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Corentin Chary <[email protected]> Cc: Darren Hart <[email protected]>
show more ...
|
| #
4ff3126e |
| 08-Sep-2018 |
Lukas Wunner <[email protected]> |
PCI: pciehp: Rename controller struct members for clarity
Of the members which were just moved from pciehp's slot struct to the controller struct, rename "lock" to "state_lock" and rename "work" to
PCI: pciehp: Rename controller struct members for clarity
Of the members which were just moved from pciehp's slot struct to the controller struct, rename "lock" to "state_lock" and rename "work" to "button_work" for clarity. Perform the rename separately to the unification of the two structs per Sinan's request.
No functional change intended.
Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Sinan Kaya <[email protected]>
show more ...
|
| #
5790a9c7 |
| 18-Sep-2018 |
Lukas Wunner <[email protected]> |
PCI: pciehp: Unify controller and slot structs
pciehp was originally introduced together with shpchp in a single commit, c16b4b14d980 ("PCI Hotplug: Add SHPC and PCI Express hot-plug drivers"): http
PCI: pciehp: Unify controller and slot structs
pciehp was originally introduced together with shpchp in a single commit, c16b4b14d980 ("PCI Hotplug: Add SHPC and PCI Express hot-plug drivers"): https://git.kernel.org/tglx/history/c/c16b4b14d980
shpchp supports up to 31 slots per controller, hence uses separate slot and controller structs. pciehp has a 1:1 relationship between slot and controller and therefore never required this separation. Nevertheless, because much of the code had been copy-pasted between the two drivers, pciehp likewise uses separate structs to this very day.
The artificial separation of data structures adds unnecessary complexity and bloat to pciehp and requires constantly chasing pointers at runtime.
Simplify the driver by merging struct slot into struct controller. Merge the slot constructor pcie_init_slot() and the destructor pcie_cleanup_slot() into the controller counterparts.
No functional change intended.
Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
show more ...
|
| #
80696f99 |
| 08-Sep-2018 |
Lukas Wunner <[email protected]> |
PCI: pciehp: Tolerate Presence Detect hardwired to zero
The WiGig Bus Extension (WBE) specification allows tunneling PCIe over IEEE 802.11. A product implementing this spec is the wil6210 from Wilo
PCI: pciehp: Tolerate Presence Detect hardwired to zero
The WiGig Bus Extension (WBE) specification allows tunneling PCIe over IEEE 802.11. A product implementing this spec is the wil6210 from Wilocity (now part of Qualcomm Atheros). It integrates a PCIe switch with a wireless network adapter:
00.0-+ [1ae9:0101] Upstream Port +-00.0-+ [1ae9:0200] Downstream Port | +-00.0 [168c:0034] Atheros AR9462 Wireless Network Adapter +-02.0 [1ae9:0201] Downstream Port +-03.0 [1ae9:0201] Downstream Port
Wirelessly attached devices presumably appear below the hotplug ports with device ID [1ae9:0201]. Oddly, the Downstream Port [1ae9:0200] leading to the wireless network adapter is likewise Hotplug Capable, but has its Presence Detect State bit hardwired to zero. Even if the Link Active bit is set, Presence Detect is zero, so this cannot be caused by in-band presence detection but only by broken hardware.
pciehp assumes an empty slot if Presence Detect State is zero, regardless of Link Active being one. Consequently, up until v4.18 it removes the wireless network adapter in pciehp_resume(). From v4.19 it already does so in pciehp_probe().
Be lenient towards broken hardware and assume the slot is occupied if Link Active is set: Introduce pciehp_card_present_or_link_active() and use it in lieu of pciehp_get_adapter_status() everywhere, except in pciehp_handle_presence_or_link_change() whose log messages depend on which of Presence Detect State or Link Active is set.
Remove the Presence Detect State check from __pciehp_enable_slot() because it is only called if either of Presence Detect State or Link Active is set.
Caution: There is a possibility that broken hardware exists which has working Presence Detect but hardwires Link Active to one. On such hardware the slot will now incorrectly be considered always occupied. If such hardware is discovered, this commit can be rolled back and a quirk can be added which sets is_hotplug_bridge = 0 for [1ae9:0200].
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200839 Reported-and-tested-by: David Yang <[email protected]> Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Rajat Jain <[email protected]> Cc: Ashok Raj <[email protected]>
show more ...
|
|
Revision tags: v4.19-rc2, v4.19-rc1 |
|
| #
eee6e273 |
| 19-Aug-2018 |
Lukas Wunner <[email protected]> |
PCI: pciehp: Drop hotplug_slot_ops wrappers
pciehp's ->enable_slot, ->disable_slot, ->get_attention_status and ->reset_slot callbacks are currently implemented by wrapper functions that do nothing e
PCI: pciehp: Drop hotplug_slot_ops wrappers
pciehp's ->enable_slot, ->disable_slot, ->get_attention_status and ->reset_slot callbacks are currently implemented by wrapper functions that do nothing else but call down to a backend function. The backends are not called from anywhere else, so drop the wrappers and use the backends directly as callbacks, thereby shaving off a few lines of unnecessary code.
No functional change intended.
Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
show more ...
|
| #
7d4ba523 |
| 19-Aug-2018 |
Lukas Wunner <[email protected]> |
PCI: pciehp: Drop unnecessary includes
Drop the following includes from pciehp source files which no longer use any of the included symbols:
* <linux/sched/signal.h> in pciehp.h <linux/signal.h>
PCI: pciehp: Drop unnecessary includes
Drop the following includes from pciehp source files which no longer use any of the included symbols:
* <linux/sched/signal.h> in pciehp.h <linux/signal.h> in pciehp_hpc.c Added by commit de25968cc87c ("fix more missing includes") to accommodate for a call to signal_pending(). The call was removed by commit 262303fe329a ("pciehp: fix wait command completion").
* <linux/interrupt.h> in pciehp_core.c Added by historic commit f308a2dfbe63 ("PCI: add PCI Express Port Bus Driver subsystem") to accommodate for a call to free_irq(): https://git.kernel.org/tglx/history/c/f308a2dfbe63 The call was removed by commit 407f452b05f9 ("pciehp: remove unnecessary free_irq").
* <linux/time.h> in pciehp_core.c and pciehp_hpc.c Added by commit 34d03419f03b ("PCIEHP: Add Electro Mechanical Interlock (EMI) support to the PCIE hotplug driver."), which was reverted by commit bd3d99c17039 ("PCI: Remove untested Electromechanical Interlock (EMI) support in pciehp.").
* <linux/module.h> in pciehp_ctrl.c, pciehp_hpc.c and pciehp_pci.c Added by historic commit c16b4b14d980 ("PCI Hotplug: Add SHPC and PCI Express hot-plug drivers"): https://git.kernel.org/tglx/history/c/c16b4b14d980 Module-related symbols were neither used back then in those files, nor are they used today.
* <linux/slab.h> in pciehp_ctrl.c Added by commit 5a0e3ad6af86 ("include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h") to accommodate for calls to kmalloc(). The calls were removed by commit 0e94916e6091 ("PCI: pciehp: Handle events synchronously").
* "../pci.h" in pciehp_ctrl.c Added by historic commit 67f4660b72f2 ("PCI: ASPM patch for") to accommodate for usage of the global variable pcie_mch_quirk: https://git.kernel.org/tglx/history/c/67f4660b72f2 The global variable was removed by commit 0ba379ec0fb1 ("PCI: Simplify hotplug mch quirk").
Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
show more ...
|
|
Revision tags: v4.18, v4.18-rc8 |
|
| #
11e87702 |
| 31-Jul-2018 |
Lukas Wunner <[email protected]> |
PCI: pciehp: Differentiate between surprise and safe removal
When removing PCI devices below a hotplug bridge, pciehp marks them as disconnected if the card is no longer present in the slot or it qu
PCI: pciehp: Differentiate between surprise and safe removal
When removing PCI devices below a hotplug bridge, pciehp marks them as disconnected if the card is no longer present in the slot or it quiesces them if the card is still present (by disabling INTx interrupts, bus mastering and SERR# reporting).
To detect whether the card is still present, pciehp checks the Presence Detect State bit in the Slot Status register. The problem with this approach is that even if the card is present, the link to it may be down, and it that case it would be better to mark the devices as disconnected instead of trying to quiesce them. Moreover, if the card in the slot was quickly replaced by another one, the Presence Detect State bit would be set, yet trying to quiesce the new card's devices would be wrong and the correct thing to do is to mark the previous card's devices as disconnected.
Instead of looking at the Presence Detect State bit, it is better to differentiate whether the card was surprise removed versus safely removed (via sysfs or an Attention Button press). On surprise removal, the devices should be marked as disconnected, whereas on safe removal it is correct to quiesce the devices.
The knowledge whether a surprise removal or a safe removal is at hand does exist further up in the call stack: A surprise removal is initiated by pciehp_handle_presence_or_link_change(), a safe removal by pciehp_handle_disable_request().
Pass that information down to pciehp_unconfigure_device() and use it in lieu of the Presence Detect State bit. While there, add kernel-doc to pciehp_unconfigure_device() and pciehp_configure_device().
Tested-by: Alexandru Gagniuc <[email protected]> Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Keith Busch <[email protected]>
show more ...
|
|
Revision tags: v4.18-rc7 |
|
| #
8bb46b07 |
| 28-Jul-2018 |
Lukas Wunner <[email protected]> |
PCI: pciehp: Avoid implicit fallthroughs in switch statements
Per Mika's request, add an explicit break to the last case of switch statements everywhere in pciehp to be more defensive towards future
PCI: pciehp: Avoid implicit fallthroughs in switch statements
Per Mika's request, add an explicit break to the last case of switch statements everywhere in pciehp to be more defensive towards future amendments.
Per Gustavo's request, mark all non-empty implicit fallthroughs with a comment to silence warnings triggered by -Wimplicit-fallthrough=2.
Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Mika Westerberg <[email protected]> Acked-by: Gustavo A. R. Silva <[email protected]>
show more ...
|
|
Revision tags: v4.18-rc6 |
|
| #
83503074 |
| 19-Jul-2018 |
Lukas Wunner <[email protected]> |
PCI: pciehp: Resume to D0 on enable/disable
pciehp's IRQ thread ensures accessibility of the port by runtime resuming its parent to D0. However when the slot is enabled/disabled, the port itself ne
PCI: pciehp: Resume to D0 on enable/disable
pciehp's IRQ thread ensures accessibility of the port by runtime resuming its parent to D0. However when the slot is enabled/disabled, the port itself needs to be in D0 because its secondary bus is accessed in:
pciehp_check_link_status(), pciehp_configure_device() (both called from board_added()) and pciehp_unconfigure_device() (called from remove_board()).
Thus, acquire a runtime PM ref on enable/disablement of the slot.
Yinghai Lu additionally discovered that some SkyLake servers feature a Power Controller for their PCIe hotplug ports (PCIe r3.1, sec 6.7.1.8) which requires the port to be in D0 when invoking
pciehp_power_on_slot() (likewise called from board_added()).
If slot power is turned on while in D3hot, link training later fails: https://lkml.kernel.org/r/[email protected]
The spec is silent about such a requirement, but it seems prudent to assume that any hotplug port with a Power Controller may need this.
The present commit holds a runtime PM ref whenever slot power is turned on and off, but it doesn't keep the port in D0 as long as slot power is on. If vendors determine that's necessary, they need to amend pciehp to acquire a runtime PM ref in pciehp_power_on_slot() and release one in pciehp_power_off_slot().
Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Mika Westerberg <[email protected]> Cc: Ashok Raj <[email protected]> Cc: Keith Busch <[email protected]> Cc: Yinghai Lu <[email protected]>
show more ...
|
| #
d331710e |
| 19-Jul-2018 |
Lukas Wunner <[email protected]> |
PCI: pciehp: Become resilient to missed events
A hotplug port's Slot Status register does not count how often each type of event occurred, it only records the fact *that* an event has occurred.
Pre
PCI: pciehp: Become resilient to missed events
A hotplug port's Slot Status register does not count how often each type of event occurred, it only records the fact *that* an event has occurred.
Previously pciehp queued a work item for each event. But if it missed an event, e.g. removal of a card in-between two back-to-back insertions, it queued up the wrong work item or no work item at all. Commit fad214b0aa72 ("PCI: pciehp: Process all hotplug events before looking for new ones") sought to improve the situation by shrinking the window during which events may be missed.
But Stefan Roese reports unbalanced Card present and Link Up events, suggesting that we're still missing events if they occur very rapidly. Bjorn Helgaas responds that he considers pciehp's event handling "baroque" and calls for its simplification and rationalization: https://lkml.kernel.org/r/[email protected]
It gets worse once a hotplug port is runtime suspended: The port can signal an interrupt while it and its parents are in D3hot, i.e. while it is inaccessible. By the time we've runtime resumed all parents to D0 and read the port's Slot Status register, we may have missed an arbitrary number of events. Event handling therefore needs to be reworked to become resilient to missed events.
Assume that a Presence Detect Changed event has occurred. Consider the following truth table: - Slot is in OFF_STATE and is currently empty. => Do nothing. (The event is trailing a Link Down or we've missed an insertion and subsequent removal.) - Slot is in OFF_STATE and is currently occupied. => Turn the slot on. - Slot is in ON_STATE and is currently empty. => Turn the slot off. - Slot is in ON_STATE and is currently occupied. => Turn the slot off, (Be cautious and assume the card in then back on. the slot isn't the same as before.)
This leads to the following simple algorithm: 1 If the slot is in ON_STATE, turn it off unconditionally. 2 If the slot is currently occupied, turn it on.
Because those actions are now carried out synchronously, rather than by scheduled work items, pciehp reacts to the *current* situation and missed events no longer matter.
Data Link Layer State Changed events can be handled identically to Presence Detect Changed events. Note that in the above truth table, a Link Up trailing a Card present event didn't have to be accounted for: It is filtered out by pciehp_check_link_status().
As for Attention Button Pressed events, PCIe r4.0, sec 6.7.1.5 says: "Once the Power Indicator begins blinking, a 5-second abort interval exists during which a second depression of the Attention Button cancels the operation." In other words, the user can only expect the system to react to a button press after it starts blinking. Missed button presses that occur in-between are irrelevant.
Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Stefan Roese <[email protected]> Cc: Mayurkumar Patel <[email protected]> Cc: Mika Westerberg <[email protected]> Cc: Kenji Kaneshige <[email protected]>
show more ...
|
| #
25c83b84 |
| 19-Jul-2018 |
Lukas Wunner <[email protected]> |
PCI: pciehp: Declare pciehp_enable/disable_slot() static
No callers of pciehp_enable/disable_slot() outside of pciehp_ctrl.c remain, so declare the functions static. For now this requires forward d
PCI: pciehp: Declare pciehp_enable/disable_slot() static
No callers of pciehp_enable/disable_slot() outside of pciehp_ctrl.c remain, so declare the functions static. For now this requires forward declarations. Those can be eliminated by reshuffling functions once the ongoing effort to refactor the driver has settled.
Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
show more ...
|
| #
1656716d |
| 19-Jul-2018 |
Lukas Wunner <[email protected]> |
PCI: pciehp: Drop enable/disable lock
Previously slot enablement and disablement could happen concurrently. But now it's under the exclusive control of the IRQ thread, rendering the locking obsolete
PCI: pciehp: Drop enable/disable lock
Previously slot enablement and disablement could happen concurrently. But now it's under the exclusive control of the IRQ thread, rendering the locking obsolete. Drop it.
Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
show more ...
|
| #
32a8cef2 |
| 19-Jul-2018 |
Lukas Wunner <[email protected]> |
PCI: pciehp: Enable/disable exclusively from IRQ thread
Besides the IRQ thread, there are several other places in the driver which enable or disable the slot:
- pciehp_probe() enables the slot if i
PCI: pciehp: Enable/disable exclusively from IRQ thread
Besides the IRQ thread, there are several other places in the driver which enable or disable the slot:
- pciehp_probe() enables the slot if it's occupied and the pciehp_force module parameter is used.
- pciehp_resume() enables or disables the slot after system sleep.
- pciehp_queue_pushbutton_work() enables or disables the slot after the 5 second delay following an Attention Button press.
- pciehp_sysfs_enable_slot() and pciehp_sysfs_disable_slot() enable or disable the slot on sysfs write.
This requires locking and complicates pciehp's state machine.
A simplification can be achieved by enabling and disabling the slot exclusively from the IRQ thread.
Amend the functions listed above to request slot enable/disablement from the IRQ thread by either synthesizing a Presence Detect Changed event or, in the case of a disable user request (via sysfs or an Attention Button press), submitting a newly introduced force disable request. The latter is needed because the slot shall be forced off despite being occupied. For this force disable request, avoid colliding with Slot Status register bits by using a bit number greater than 16.
For synchronous execution of requests (on sysfs write), wait for the request to finish and retrieve the result. There can only ever be one sysfs write in flight due to the locking in kernfs_fop_write(), hence there is no risk of returning the result of a different sysfs request to user space.
The POWERON_STATE and POWEROFF_STATE is now no longer entered by the above-listed functions, but solely by the IRQ thread when it begins a power transition. Afterwards, it moves to STATIC_STATE. The same applies to canceling the Attention Button work, it likewise becomes an IRQ thread only operation.
An immediate consequence is that the POWERON_STATE and POWEROFF_STATE is never observed by the IRQ thread itself, only by functions called in a different context, such as pciehp_sysfs_enable_slot(). So remove handling of these states from pciehp_handle_button_press() and pciehp_handle_link_change() which are exclusively called from the IRQ thread.
Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
show more ...
|