History log of /linux-6.15/drivers/pci/hotplug/pciehp_ctrl.c (Results 1 – 25 of 114)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4
# 665745f2 18-Oct-2024 Ilpo Järvinen <[email protected]>

PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller

This mostly reverts the commit b4c7d2076b4e ("PCI/LINK: Remove bandwidth
notification"). An upcoming commit extends this driver build

PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller

This mostly reverts the commit b4c7d2076b4e ("PCI/LINK: Remove bandwidth
notification"). An upcoming commit extends this driver building PCIe
bandwidth controller on top of it.

PCIe bandwidth notifications were first added in the commit e8303bb7a75c
("PCI/LINK: Report degraded links via link bandwidth notification") but
later had to be removed. The significant changes compared with the old
bandwidth notification driver include:

1) Don't print the notifications into kernel log, just keep the Link
Speed cached in struct pci_bus updated. While somewhat unfortunate,
the log spam was the source of complaints that eventually lead to
the removal of the bandwidth notifications driver (see the links
below for further information).

2) Besides the Link Bandwidth Management Interrupt, also enable Link
Autonomous Bandwidth Interrupt to cover the other source of bandwidth
changes.

3) Handle Link Speed updates robustly. Refresh the cached Link Speed
when enabling Bandwidth Notification Interrupts, and solve the race
between Link Speed read and LBMS/LABS update in
pcie_bwnotif_irq_thread().

4) Use concurrency safe LNKCTL RMW operations.

5) The driver is now called PCIe bwctrl (bandwidth controller) instead
of just bandwidth notifications because of increased scope and
functionality within the driver.

6) Coexist with the Target Link Speed quirk in pcie_failed_link_retrain().
Provide LBMS counting API for it.

7) Tweaks to variable/functions names for consistency and length reasons.

Bandwidth Notifications enable the cur_bus_speed in the struct pci_bus to
keep track PCIe Link Speed changes.

[bhelgaas: This is based on previous work by Alexandru Gagniuc
<[email protected]>; see e8303bb7a75c ("PCI/LINK: Report degraded links
via link bandwidth notification")]

Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/all/[email protected]/
Link: https://lore.kernel.org/linux-pci/[email protected]/
Link: https://lore.kernel.org/linux-pci/[email protected]/
Suggested-by: Lukas Wunner <[email protected]> # Building bwctrl on top of bwnotif
Signed-off-by: Ilpo Järvinen <[email protected]>
[bhelgaas: squash fix to drop IRQF_ONESHOT and convert to hardirq handler:
https://lore.kernel.org/r/[email protected]]
Signed-off-by: Bjorn Helgaas <[email protected]>
Tested-by: Stefan Wahren <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>

show more ...


Revision tags: v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2
# e8afd0d9 12-May-2023 Rongguang Wei <[email protected]>

PCI: pciehp: Cancel bringup sequence if card is not present

If a PCIe hotplug slot has an Attention Button, the normal hot-add flow is:

- Slot is empty and slot power is off
- User inserts card

PCI: pciehp: Cancel bringup sequence if card is not present

If a PCIe hotplug slot has an Attention Button, the normal hot-add flow is:

- Slot is empty and slot power is off
- User inserts card in slot and presses Attention Button
- OS blinks Power Indicator for 5 seconds
- After 5 seconds, OS turns on Power Indicator, turns on slot power, and
enumerates the device

Previously, if a user pressed the Attention Button on an *empty* slot,
pciehp logged the following messages and blinked the Power Indicator
until a second button press:

[0.000] pciehp: Button press: will power on in 5 sec
[0.001] # Power Indicator starts blinking
[5.001] # 5 second timeout; slot is empty, so we should cancel the
request to power on and turn off Power Indicator

[7.000] # Power Indicator still blinking
[8.000] # possible card insertion
[9.000] pciehp: Button press: canceling request to power on

The first button press incorrectly left the slot in BLINKINGON_STATE, so
the second was interpreted as a "cancel power on" event regardless of
whether a card was present.

If the slot is empty, turn off the Power Indicator and return from
BLINKINGON_STATE to OFF_STATE after 5 seconds, effectively canceling the
request to power on. Putting the slot in OFF_STATE also means the second
button press will correctly request a slot power on if the slot is
occupied.

[bhelgaas: commit log]
Link: https://lore.kernel.org/r/[email protected]
Fixes: d331710ea78f ("PCI: pciehp: Become resilient to missed events")
Suggested-by: Lukas Wunner <[email protected]>
Signed-off-by: Rongguang Wei <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Lukas Wunner <[email protected]>

show more ...


# 5054133a 22-May-2023 Bjorn Helgaas <[email protected]>

PCI: pciehp: Simplify Attention Button logging

Previously, pressing the Attention Button always logged two lines, the
first from pciehp_ist() and the second from pciehp_handle_button_press():

Att

PCI: pciehp: Simplify Attention Button logging

Previously, pressing the Attention Button always logged two lines, the
first from pciehp_ist() and the second from pciehp_handle_button_press():

Attention button pressed
Powering on due to button press

Since pciehp_handle_button_press() always logs the more detailed message,
remove the generic "Attention button pressed" message. Reword the
pciehp_handle_button_press() to be of the form:

Button press: will power on in 5 sec
Button press: will power off in 5 sec
Button press: canceling request to power on
Button press: canceling request to power off
Button press: ignoring invalid state %#x

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Lukas Wunner <[email protected]>

show more ...


Revision tags: v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6
# 8a614499 17-Sep-2020 Lukas Wunner <[email protected]>

PCI: pciehp: Reduce noisiness on hot removal

When a PCIe card is hot-removed, the Presence Detect State and Data Link
Layer Link Active bits often do not clear simultaneously. I've seen delays
of u

PCI: pciehp: Reduce noisiness on hot removal

When a PCIe card is hot-removed, the Presence Detect State and Data Link
Layer Link Active bits often do not clear simultaneously. I've seen delays
of up to 244 msec between the two events with Thunderbolt.

After pciehp has brought down the slot in response to the first event, the
other bit may still be set. It's not discernible whether it's set because
a new card is already in the slot or if it will soon clear. So pciehp
tries to bring up the slot and in the latter case fails with a bunch of
messages, some of them at KERN_ERR severity. If the slot is no longer
occupied, the messages are false positives and annoy users.

Stuart Hayes reports the following splat on hot removal:

KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Link Up
KERN_INFO pcieport 0000:3c:06.0: pciehp: Timeout waiting for Presence Detect
KERN_ERR pcieport 0000:3c:06.0: pciehp: link training error: status 0x0001
KERN_ERR pcieport 0000:3c:06.0: pciehp: Failed to check link status

Dongdong Liu complains about a similar splat:

KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Link Down
KERN_INFO iommu: Removing device 0000:87:00.0 from group 12
KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Card present
KERN_INFO pcieport 0000:80:10.0: Data Link Layer Link Active not set in 1000 msec
KERN_ERR pciehp 0000:80:10.0:pcie004: Failed to check link status

Users are particularly irritated to see a bringup attempt even though the
slot was explicitly brought down via sysfs. In a perfect world, we could
avoid this by setting Link Disable on slot bringdown and re-enabling it
upon a Presence Detect State change. In reality however, there are broken
hotplug ports which hardwire Presence Detect to zero, see 80696f991424
("PCI: pciehp: Tolerate Presence Detect hardwired to zero"). Conversely,
PCIe r1.0 hotplug ports hardwire Link Active to zero because Link Active
Reporting wasn't specified before PCIe r1.1. On unplug, some ports first
clear Presence then Link (see Stuart Hayes' splat) whereas others use the
inverse order (see Dongdong Liu's splat). To top it off, there are hotplug
ports which flap the Presence and Link bits on slot bringup, see
6c35a1ac3da6 ("PCI: pciehp: Tolerate initially unstable link").

pciehp is designed to work with all of these variants. Surplus attempts at
slot bringup are a lesser evil than not being able to bring up slots at
all. Although we could try to perfect the behavior for specific hotplug
controllers, we'd risk breaking others or increasing code complexity.

But we can certainly minimize annoyance by emitting only a single message
with KERN_INFO severity if bringup is unsuccessful:

* Drop the "Timeout waiting for Presence Detect" message in
pcie_wait_for_presence(). The sole caller of that function,
pciehp_check_link_status(), ignores the timeout and carries on. It emits
error messages of its own and I don't think this particular message adds
much value.

* There's a single error condition in pciehp_check_link_status() which
does not emit a message. Adding one allows dropping the "Failed to check
link status" message emitted by board_added() if
pciehp_check_link_status() returns a non-zero integer.

* Tone down all messages in pciehp_check_link_status() to KERN_INFO
severity and rephrase them to look as innocuous as possible. To this
end, move the message emitted by pcie_wait_for_link_delay() to its
callers.

As a result, Stuart Hayes' splat becomes:

KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Link Up
KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Cannot train link: status 0x0001

Dongdong Liu's splat becomes:

KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Card present
KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): No link

The messages now merely serve as information that presence or link bits
were set a little longer than expected. Bringup failures which are not
false positives are still reported, albeit no longer at KERN_ERR severity.

Link: https://lore.kernel.org/linux-pci/[email protected]/
Link: https://lore.kernel.org/linux-pci/[email protected]/
Link: https://lore.kernel.org/r/b45e46fd8a6aa6930aaac9d7718c2e4b787a4e5e.1595935071.git.lukas@wunner.de
Reported-by: Stuart Hayes <[email protected]>
Reported-by: Dongdong Liu <[email protected]>
Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Mika Westerberg <[email protected]>

show more ...


Revision tags: v5.9-rc5, v5.9-rc4, v5.9-rc3
# df561f66 23-Aug-2020 Gustavo A. R. Silva <[email protected]>

treewide: Use fallthrough pseudo-keyword

Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through mar

treewide: Use fallthrough pseudo-keyword

Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <[email protected]>

show more ...


Revision tags: v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6
# 87d0f2a5 29-Oct-2019 Mika Westerberg <[email protected]>

PCI: pciehp: Prevent deadlock on disconnect

This addresses deadlocks in these common cases in hierarchies containing
two switches:

- All involved ports are runtime suspended and they are unplugge

PCI: pciehp: Prevent deadlock on disconnect

This addresses deadlocks in these common cases in hierarchies containing
two switches:

- All involved ports are runtime suspended and they are unplugged. This
can happen easily if the drivers involved automatically enable runtime
PM (xHCI for example does that).

- System is suspended (e.g., closing the lid on a laptop) with a dock +
something else connected, and the dock is unplugged while suspended.

These cases lead to the following deadlock:

INFO: task irq/126-pciehp:198 blocked for more than 120 seconds.
irq/126-pciehp D 0 198 2 0x80000000
Call Trace:
schedule+0x2c/0x80
schedule_timeout+0x246/0x350
wait_for_completion+0xb7/0x140
kthread_stop+0x49/0x110
free_irq+0x32/0x70
pcie_shutdown_notification+0x2f/0x50
pciehp_remove+0x27/0x50
pcie_port_remove_service+0x36/0x50
device_release_driver+0x12/0x20
bus_remove_device+0xec/0x160
device_del+0x13b/0x350
device_unregister+0x1a/0x60
remove_iter+0x1e/0x30
device_for_each_child+0x56/0x90
pcie_port_device_remove+0x22/0x40
pcie_portdrv_remove+0x20/0x60
pci_device_remove+0x3e/0xc0
device_release_driver_internal+0x18c/0x250
device_release_driver+0x12/0x20
pci_stop_bus_device+0x6f/0x90
pci_stop_bus_device+0x31/0x90
pci_stop_and_remove_bus_device+0x12/0x20
pciehp_unconfigure_device+0x88/0x140
pciehp_disable_slot+0x6a/0x110
pciehp_handle_presence_or_link_change+0x263/0x400
pciehp_ist+0x1c9/0x1d0
irq_thread_fn+0x24/0x60
irq_thread+0xeb/0x190
kthread+0x120/0x140

INFO: task irq/190-pciehp:2288 blocked for more than 120 seconds.
irq/190-pciehp D 0 2288 2 0x80000000
Call Trace:
__schedule+0x2a2/0x880
schedule+0x2c/0x80
schedule_preempt_disabled+0xe/0x10
mutex_lock+0x2c/0x30
pci_lock_rescan_remove+0x15/0x20
pciehp_unconfigure_device+0x4d/0x140
pciehp_disable_slot+0x6a/0x110
pciehp_handle_presence_or_link_change+0x263/0x400
pciehp_ist+0x1c9/0x1d0
irq_thread_fn+0x24/0x60
irq_thread+0xeb/0x190
kthread+0x120/0x140

What happens here is that the whole hierarchy is runtime resumed and the
parent PCIe downstream port, which got the hot-remove event, starts
removing devices below it, taking pci_lock_rescan_remove() lock. When the
child PCIe port is runtime resumed it calls pciehp_check_presence() which
ends up calling pciehp_card_present() and pciehp_check_link_active(). Both
of these use pcie_capability_read_word(), which notices that the underlying
device is already gone and returns PCIBIOS_DEVICE_NOT_FOUND with the
capability value set to 0. When pciehp gets this value it thinks that its
child device is also hot-removed and schedules its IRQ thread to handle the
event.

The deadlock happens when the child's IRQ thread runs and tries to acquire
pci_lock_rescan_remove() which is already taken by the parent and the
parent waits for the child's IRQ thread to finish.

Prevent this from happening by checking the return value of
pcie_capability_read_word() and if it is PCIBIOS_DEVICE_NOT_FOUND stop
performing any hot-removal activities.

[bhelgaas: add common scenarios to commit log]
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Kai-Heng Feng <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>

show more ...


Revision tags: v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4
# 157c1062 09-Aug-2019 Lukas Wunner <[email protected]>

PCI: pciehp: Avoid returning prematurely from sysfs requests

A sysfs request to enable or disable a PCIe hotplug slot should not
return before it has been carried out. That is sought to be achieved

PCI: pciehp: Avoid returning prematurely from sysfs requests

A sysfs request to enable or disable a PCIe hotplug slot should not
return before it has been carried out. That is sought to be achieved by
waiting until the controller's "pending_events" have been cleared.

However the IRQ thread pciehp_ist() clears the "pending_events" before
it acts on them. If pciehp_sysfs_enable_slot() / _disable_slot() happen
to check the "pending_events" after they have been cleared but while
pciehp_ist() is still running, the functions may return prematurely
with an incorrect return value.

Fix by introducing an "ist_running" flag which must be false before a sysfs
request is allowed to return.

Fixes: 32a8cef274fe ("PCI: pciehp: Enable/disable exclusively from IRQ thread")
Link: https://lore.kernel.org/linux-pci/[email protected]
Link: https://lore.kernel.org/r/4174210466e27eb7e2243dd1d801d5f75baaffd8.1565345211.git.lukas@wunner.de
Reported-and-tested-by: Xiongfeng Wang <[email protected]>
Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Cc: [email protected] # v4.19+

show more ...


# 4a06c2c3 05-Sep-2019 Bjorn Helgaas <[email protected]>

PCI: pciehp: Refer to "Indicators" instead of "LEDs" in comments

The PCIe spec doesn't mention "green LEDs" or "amber LEDs". Replace those
terms with "Power Indicator" and "Attention Indicator" so

PCI: pciehp: Refer to "Indicators" instead of "LEDs" in comments

The PCIe spec doesn't mention "green LEDs" or "amber LEDs". Replace those
terms with "Power Indicator" and "Attention Indicator" so the comments
match the spec language. Comment changes only.

Signed-off-by: Bjorn Helgaas <[email protected]>

show more ...


# 9194094b 03-Sep-2019 Denis Efremov <[email protected]>

PCI: pciehp: Remove pciehp_green_led_{on,off,blink}()

Remove pciehp_green_led_{on,off,blink}() and use pciehp_set_indicators()
instead, since the code is mostly the same.

[bhelgaas: drop set_power_

PCI: pciehp: Remove pciehp_green_led_{on,off,blink}()

Remove pciehp_green_led_{on,off,blink}() and use pciehp_set_indicators()
instead, since the code is mostly the same.

[bhelgaas: drop set_power_indicator() wrapper to reduce the number of
interfaces]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Denis Efremov <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>

show more ...


# 94719ba0 03-Sep-2019 Denis Efremov <[email protected]>

PCI: pciehp: Combine adjacent indicator updates

Combine adjacent updates of power and attention indicators into a single
pciehp_set_indicators() call. This sends one command to the hotplug
controll

PCI: pciehp: Combine adjacent indicator updates

Combine adjacent updates of power and attention indicators into a single
pciehp_set_indicators() call. This sends one command to the hotplug
controller instead of two.

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Denis Efremov <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
Reviewed-by: Lukas Wunner <[email protected]>

show more ...


Revision tags: v5.3-rc3, v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3, v5.2-rc2, v5.2-rc1
# 94dbc956 07-May-2019 Frederick Lawler <[email protected]>

PCI: pciehp: Log messages with pci_dev, not pcie_device

Log messages with pci_dev, not pcie_device. Factor out common message
prefixes with dev_fmt().

Example output change:

- pciehp 0000:00:06

PCI: pciehp: Log messages with pci_dev, not pcie_device

Log messages with pci_dev, not pcie_device. Factor out common message
prefixes with dev_fmt().

Example output change:

- pciehp 0000:00:06.0:pcie004: Slot(0) Powering on due to button press
+ pcieport 0000:00:06.0: pciehp: Slot(0) Powering on due to button press

Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Frederick Lawler <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>

show more ...


Revision tags: v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5, v5.1-rc4, v5.1-rc3, v5.1-rc2, v5.1-rc1
# 3943af9d 12-Mar-2019 Sergey Miroshnichenko <[email protected]>

PCI: pciehp: Ignore Link State Changes after powering off a slot

During a safe hot remove, the OS powers off the slot, which may cause a
Data Link Layer State Changed event. The slot has already be

PCI: pciehp: Ignore Link State Changes after powering off a slot

During a safe hot remove, the OS powers off the slot, which may cause a
Data Link Layer State Changed event. The slot has already been set to
OFF_STATE, so that event results in re-enabling the device, making it
impossible to safely remove it.

Clear out the Presence Detect Changed and Data Link Layer State Changed
events when the disabled slot has settled down.

It is still possible to re-enable the device if it remains in the slot
after pressing the Attention Button by pressing it again.

Fixes the problem that Micah reported below: an NVMe drive power button may
not actually turn off the drive.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=203237
Reported-by: Micah Parrish <[email protected]>
Tested-by: Micah Parrish <[email protected]>
Signed-off-by: Sergey Miroshnichenko <[email protected]>
[bhelgaas: changelog, add bugzilla URL]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Lukas Wunner <[email protected]>
Cc: [email protected] # v4.19+

show more ...


Revision tags: v5.0, v5.0-rc8, v5.0-rc7, v5.0-rc6, v5.0-rc5, v5.0-rc4, v5.0-rc3, v5.0-rc2, v5.0-rc1, v4.20, v4.20-rc7, v4.20-rc6, v4.20-rc5, v4.20-rc4, v4.20-rc3, v4.20-rc2, v4.20-rc1, v4.19, v4.19-rc8, v4.19-rc7, v4.19-rc6, v4.19-rc5, v4.19-rc4, v4.19-rc3
# 125450f8 08-Sep-2018 Lukas Wunner <[email protected]>

PCI: hotplug: Embed hotplug_slot

When the PCI hotplug core and its first user, cpqphp, were introduced in
February 2002 with historic commit a8a2069f432c, cpqphp allocated a slot
struct for its inte

PCI: hotplug: Embed hotplug_slot

When the PCI hotplug core and its first user, cpqphp, were introduced in
February 2002 with historic commit a8a2069f432c, cpqphp allocated a slot
struct for its internal use plus a hotplug_slot struct to be registered
with the hotplug core and linked the two with pointers:
https://git.kernel.org/tglx/history/c/a8a2069f432c

Nowadays, the predominant pattern in the tree is to embed ("subclass")
such structures in one another and cast to the containing struct with
container_of(). But it wasn't until July 2002 that container_of() was
introduced with historic commit ec4f214232cf:
https://git.kernel.org/tglx/history/c/ec4f214232cf

pnv_php, introduced in 2016, did the right thing and embedded struct
hotplug_slot in its internal struct pnv_php_slot, but all other drivers
cargo-culted cpqphp's design and linked separate structs with pointers.

Embedding structs is preferrable to linking them with pointers because
it requires fewer allocations, thereby reducing overhead and simplifying
error paths. Casting an embedded struct to the containing struct
becomes a cheap subtraction rather than a dereference. And having fewer
pointers reduces the risk of them pointing nowhere either accidentally
or due to an attack.

Convert all drivers to embed struct hotplug_slot in their internal slot
struct. The "private" pointer in struct hotplug_slot thereby becomes
unused, so drop it.

Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Rafael J. Wysocki <[email protected]>
Acked-by: Tyrel Datwyler <[email protected]> # drivers/pci/hotplug/rpa*
Acked-by: Sebastian Ott <[email protected]> # drivers/pci/hotplug/s390*
Acked-by: Andy Shevchenko <[email protected]> # drivers/platform/x86
Cc: Len Brown <[email protected]>
Cc: Scott Murray <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Oliver OHalloran <[email protected]>
Cc: Gavin Shan <[email protected]>
Cc: Gerald Schaefer <[email protected]>
Cc: Corentin Chary <[email protected]>
Cc: Darren Hart <[email protected]>

show more ...


# 4ff3126e 08-Sep-2018 Lukas Wunner <[email protected]>

PCI: pciehp: Rename controller struct members for clarity

Of the members which were just moved from pciehp's slot struct to the
controller struct, rename "lock" to "state_lock" and rename "work" to

PCI: pciehp: Rename controller struct members for clarity

Of the members which were just moved from pciehp's slot struct to the
controller struct, rename "lock" to "state_lock" and rename "work" to
"button_work" for clarity. Perform the rename separately to the
unification of the two structs per Sinan's request.

No functional change intended.

Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Cc: Sinan Kaya <[email protected]>

show more ...


# 5790a9c7 18-Sep-2018 Lukas Wunner <[email protected]>

PCI: pciehp: Unify controller and slot structs

pciehp was originally introduced together with shpchp in a single
commit, c16b4b14d980 ("PCI Hotplug: Add SHPC and PCI Express hot-plug
drivers"):
http

PCI: pciehp: Unify controller and slot structs

pciehp was originally introduced together with shpchp in a single
commit, c16b4b14d980 ("PCI Hotplug: Add SHPC and PCI Express hot-plug
drivers"):
https://git.kernel.org/tglx/history/c/c16b4b14d980

shpchp supports up to 31 slots per controller, hence uses separate slot
and controller structs. pciehp has a 1:1 relationship between slot and
controller and therefore never required this separation. Nevertheless,
because much of the code had been copy-pasted between the two drivers,
pciehp likewise uses separate structs to this very day.

The artificial separation of data structures adds unnecessary complexity
and bloat to pciehp and requires constantly chasing pointers at runtime.

Simplify the driver by merging struct slot into struct controller.
Merge the slot constructor pcie_init_slot() and the destructor
pcie_cleanup_slot() into the controller counterparts.

No functional change intended.

Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>

show more ...


# 80696f99 08-Sep-2018 Lukas Wunner <[email protected]>

PCI: pciehp: Tolerate Presence Detect hardwired to zero

The WiGig Bus Extension (WBE) specification allows tunneling PCIe over
IEEE 802.11. A product implementing this spec is the wil6210 from
Wilo

PCI: pciehp: Tolerate Presence Detect hardwired to zero

The WiGig Bus Extension (WBE) specification allows tunneling PCIe over
IEEE 802.11. A product implementing this spec is the wil6210 from
Wilocity (now part of Qualcomm Atheros). It integrates a PCIe switch
with a wireless network adapter:

00.0-+ [1ae9:0101] Upstream Port
+-00.0-+ [1ae9:0200] Downstream Port
| +-00.0 [168c:0034] Atheros AR9462 Wireless Network Adapter
+-02.0 [1ae9:0201] Downstream Port
+-03.0 [1ae9:0201] Downstream Port

Wirelessly attached devices presumably appear below the hotplug ports
with device ID [1ae9:0201]. Oddly, the Downstream Port [1ae9:0200]
leading to the wireless network adapter is likewise Hotplug Capable,
but has its Presence Detect State bit hardwired to zero. Even if the
Link Active bit is set, Presence Detect is zero, so this cannot be
caused by in-band presence detection but only by broken hardware.

pciehp assumes an empty slot if Presence Detect State is zero,
regardless of Link Active being one. Consequently, up until v4.18 it
removes the wireless network adapter in pciehp_resume(). From v4.19 it
already does so in pciehp_probe().

Be lenient towards broken hardware and assume the slot is occupied if
Link Active is set: Introduce pciehp_card_present_or_link_active()
and use it in lieu of pciehp_get_adapter_status() everywhere, except
in pciehp_handle_presence_or_link_change() whose log messages depend
on which of Presence Detect State or Link Active is set.

Remove the Presence Detect State check from __pciehp_enable_slot()
because it is only called if either of Presence Detect State or Link
Active is set.

Caution: There is a possibility that broken hardware exists which has
working Presence Detect but hardwires Link Active to one. On such
hardware the slot will now incorrectly be considered always occupied.
If such hardware is discovered, this commit can be rolled back and a
quirk can be added which sets is_hotplug_bridge = 0 for [1ae9:0200].

Link: https://bugzilla.kernel.org/show_bug.cgi?id=200839
Reported-and-tested-by: David Yang <[email protected]>
Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Cc: Rajat Jain <[email protected]>
Cc: Ashok Raj <[email protected]>

show more ...


Revision tags: v4.19-rc2, v4.19-rc1
# eee6e273 19-Aug-2018 Lukas Wunner <[email protected]>

PCI: pciehp: Drop hotplug_slot_ops wrappers

pciehp's ->enable_slot, ->disable_slot, ->get_attention_status and
->reset_slot callbacks are currently implemented by wrapper functions
that do nothing e

PCI: pciehp: Drop hotplug_slot_ops wrappers

pciehp's ->enable_slot, ->disable_slot, ->get_attention_status and
->reset_slot callbacks are currently implemented by wrapper functions
that do nothing else but call down to a backend function. The backends
are not called from anywhere else, so drop the wrappers and use the
backends directly as callbacks, thereby shaving off a few lines of
unnecessary code.

No functional change intended.

Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>

show more ...


# 7d4ba523 19-Aug-2018 Lukas Wunner <[email protected]>

PCI: pciehp: Drop unnecessary includes

Drop the following includes from pciehp source files which no longer use
any of the included symbols:

* <linux/sched/signal.h> in pciehp.h
<linux/signal.h>

PCI: pciehp: Drop unnecessary includes

Drop the following includes from pciehp source files which no longer use
any of the included symbols:

* <linux/sched/signal.h> in pciehp.h
<linux/signal.h> in pciehp_hpc.c
Added by commit de25968cc87c ("fix more missing includes") to
accommodate for a call to signal_pending().
The call was removed by commit 262303fe329a ("pciehp: fix wait command
completion").

* <linux/interrupt.h> in pciehp_core.c
Added by historic commit f308a2dfbe63 ("PCI: add PCI Express Port Bus
Driver subsystem") to accommodate for a call to free_irq():
https://git.kernel.org/tglx/history/c/f308a2dfbe63
The call was removed by commit 407f452b05f9 ("pciehp: remove
unnecessary free_irq").

* <linux/time.h> in pciehp_core.c and pciehp_hpc.c
Added by commit 34d03419f03b ("PCIEHP: Add Electro Mechanical
Interlock (EMI) support to the PCIE hotplug driver."),
which was reverted by commit bd3d99c17039 ("PCI: Remove untested
Electromechanical Interlock (EMI) support in pciehp.").

* <linux/module.h> in pciehp_ctrl.c, pciehp_hpc.c and pciehp_pci.c
Added by historic commit c16b4b14d980 ("PCI Hotplug: Add SHPC and PCI
Express hot-plug drivers"):
https://git.kernel.org/tglx/history/c/c16b4b14d980
Module-related symbols were neither used back then in those files,
nor are they used today.

* <linux/slab.h> in pciehp_ctrl.c
Added by commit 5a0e3ad6af86 ("include cleanup: Update gfp.h and
slab.h includes to prepare for breaking implicit slab.h inclusion from
percpu.h") to accommodate for calls to kmalloc().
The calls were removed by commit 0e94916e6091 ("PCI: pciehp: Handle
events synchronously").

* "../pci.h" in pciehp_ctrl.c
Added by historic commit 67f4660b72f2 ("PCI: ASPM patch for") to
accommodate for usage of the global variable pcie_mch_quirk:
https://git.kernel.org/tglx/history/c/67f4660b72f2
The global variable was removed by commit 0ba379ec0fb1 ("PCI: Simplify
hotplug mch quirk").

Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>

show more ...


Revision tags: v4.18, v4.18-rc8
# 11e87702 31-Jul-2018 Lukas Wunner <[email protected]>

PCI: pciehp: Differentiate between surprise and safe removal

When removing PCI devices below a hotplug bridge, pciehp marks them as
disconnected if the card is no longer present in the slot or it qu

PCI: pciehp: Differentiate between surprise and safe removal

When removing PCI devices below a hotplug bridge, pciehp marks them as
disconnected if the card is no longer present in the slot or it quiesces
them if the card is still present (by disabling INTx interrupts, bus
mastering and SERR# reporting).

To detect whether the card is still present, pciehp checks the Presence
Detect State bit in the Slot Status register. The problem with this
approach is that even if the card is present, the link to it may be
down, and it that case it would be better to mark the devices as
disconnected instead of trying to quiesce them. Moreover, if the card
in the slot was quickly replaced by another one, the Presence Detect
State bit would be set, yet trying to quiesce the new card's devices
would be wrong and the correct thing to do is to mark the previous
card's devices as disconnected.

Instead of looking at the Presence Detect State bit, it is better to
differentiate whether the card was surprise removed versus safely
removed (via sysfs or an Attention Button press). On surprise removal,
the devices should be marked as disconnected, whereas on safe removal it
is correct to quiesce the devices.

The knowledge whether a surprise removal or a safe removal is at hand
does exist further up in the call stack: A surprise removal is
initiated by pciehp_handle_presence_or_link_change(), a safe removal by
pciehp_handle_disable_request().

Pass that information down to pciehp_unconfigure_device() and use it in
lieu of the Presence Detect State bit. While there, add kernel-doc to
pciehp_unconfigure_device() and pciehp_configure_device().

Tested-by: Alexandru Gagniuc <[email protected]>
Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Cc: Keith Busch <[email protected]>

show more ...


Revision tags: v4.18-rc7
# 8bb46b07 28-Jul-2018 Lukas Wunner <[email protected]>

PCI: pciehp: Avoid implicit fallthroughs in switch statements

Per Mika's request, add an explicit break to the last case of switch
statements everywhere in pciehp to be more defensive towards future

PCI: pciehp: Avoid implicit fallthroughs in switch statements

Per Mika's request, add an explicit break to the last case of switch
statements everywhere in pciehp to be more defensive towards future
amendments.

Per Gustavo's request, mark all non-empty implicit fallthroughs with a
comment to silence warnings triggered by -Wimplicit-fallthrough=2.

Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Mika Westerberg <[email protected]>
Acked-by: Gustavo A. R. Silva <[email protected]>

show more ...


Revision tags: v4.18-rc6
# 83503074 19-Jul-2018 Lukas Wunner <[email protected]>

PCI: pciehp: Resume to D0 on enable/disable

pciehp's IRQ thread ensures accessibility of the port by runtime resuming
its parent to D0. However when the slot is enabled/disabled, the port
itself ne

PCI: pciehp: Resume to D0 on enable/disable

pciehp's IRQ thread ensures accessibility of the port by runtime resuming
its parent to D0. However when the slot is enabled/disabled, the port
itself needs to be in D0 because its secondary bus is accessed in:

pciehp_check_link_status(),
pciehp_configure_device() (both called from board_added())
and
pciehp_unconfigure_device() (called from remove_board()).

Thus, acquire a runtime PM ref on enable/disablement of the slot.

Yinghai Lu additionally discovered that some SkyLake servers feature a
Power Controller for their PCIe hotplug ports (PCIe r3.1, sec 6.7.1.8)
which requires the port to be in D0 when invoking

pciehp_power_on_slot() (likewise called from board_added()).

If slot power is turned on while in D3hot, link training later fails:
https://lkml.kernel.org/r/[email protected]

The spec is silent about such a requirement, but it seems prudent to
assume that any hotplug port with a Power Controller may need this.

The present commit holds a runtime PM ref whenever slot power is turned
on and off, but it doesn't keep the port in D0 as long as slot power is
on. If vendors determine that's necessary, they need to amend pciehp to
acquire a runtime PM ref in pciehp_power_on_slot() and release one in
pciehp_power_off_slot().

Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Mika Westerberg <[email protected]>
Cc: Ashok Raj <[email protected]>
Cc: Keith Busch <[email protected]>
Cc: Yinghai Lu <[email protected]>

show more ...


# d331710e 19-Jul-2018 Lukas Wunner <[email protected]>

PCI: pciehp: Become resilient to missed events

A hotplug port's Slot Status register does not count how often each type
of event occurred, it only records the fact *that* an event has occurred.

Pre

PCI: pciehp: Become resilient to missed events

A hotplug port's Slot Status register does not count how often each type
of event occurred, it only records the fact *that* an event has occurred.

Previously pciehp queued a work item for each event. But if it missed
an event, e.g. removal of a card in-between two back-to-back insertions,
it queued up the wrong work item or no work item at all. Commit
fad214b0aa72 ("PCI: pciehp: Process all hotplug events before looking
for new ones") sought to improve the situation by shrinking the window
during which events may be missed.

But Stefan Roese reports unbalanced Card present and Link Up events,
suggesting that we're still missing events if they occur very rapidly.
Bjorn Helgaas responds that he considers pciehp's event handling
"baroque" and calls for its simplification and rationalization:
https://lkml.kernel.org/r/[email protected]

It gets worse once a hotplug port is runtime suspended: The port can
signal an interrupt while it and its parents are in D3hot, i.e. while
it is inaccessible. By the time we've runtime resumed all parents to D0
and read the port's Slot Status register, we may have missed an arbitrary
number of events. Event handling therefore needs to be reworked to
become resilient to missed events.

Assume that a Presence Detect Changed event has occurred.
Consider the following truth table:
- Slot is in OFF_STATE and is currently empty. => Do nothing.
(The event is trailing a Link Down or we've
missed an insertion and subsequent removal.)
- Slot is in OFF_STATE and is currently occupied. => Turn the slot on.
- Slot is in ON_STATE and is currently empty. => Turn the slot off.
- Slot is in ON_STATE and is currently occupied. => Turn the slot off,
(Be cautious and assume the card in then back on.
the slot isn't the same as before.)

This leads to the following simple algorithm:
1 If the slot is in ON_STATE, turn it off unconditionally.
2 If the slot is currently occupied, turn it on.

Because those actions are now carried out synchronously, rather than by
scheduled work items, pciehp reacts to the *current* situation and
missed events no longer matter.

Data Link Layer State Changed events can be handled identically to
Presence Detect Changed events. Note that in the above truth table,
a Link Up trailing a Card present event didn't have to be accounted for:
It is filtered out by pciehp_check_link_status().

As for Attention Button Pressed events, PCIe r4.0, sec 6.7.1.5 says:
"Once the Power Indicator begins blinking, a 5-second abort interval
exists during which a second depression of the Attention Button cancels
the operation." In other words, the user can only expect the system to
react to a button press after it starts blinking. Missed button presses
that occur in-between are irrelevant.

Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Cc: Stefan Roese <[email protected]>
Cc: Mayurkumar Patel <[email protected]>
Cc: Mika Westerberg <[email protected]>
Cc: Kenji Kaneshige <[email protected]>

show more ...


# 25c83b84 19-Jul-2018 Lukas Wunner <[email protected]>

PCI: pciehp: Declare pciehp_enable/disable_slot() static

No callers of pciehp_enable/disable_slot() outside of pciehp_ctrl.c
remain, so declare the functions static. For now this requires forward
d

PCI: pciehp: Declare pciehp_enable/disable_slot() static

No callers of pciehp_enable/disable_slot() outside of pciehp_ctrl.c
remain, so declare the functions static. For now this requires forward
declarations. Those can be eliminated by reshuffling functions once the
ongoing effort to refactor the driver has settled.

Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>

show more ...


# 1656716d 19-Jul-2018 Lukas Wunner <[email protected]>

PCI: pciehp: Drop enable/disable lock

Previously slot enablement and disablement could happen concurrently.
But now it's under the exclusive control of the IRQ thread, rendering
the locking obsolete

PCI: pciehp: Drop enable/disable lock

Previously slot enablement and disablement could happen concurrently.
But now it's under the exclusive control of the IRQ thread, rendering
the locking obsolete. Drop it.

Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>

show more ...


# 32a8cef2 19-Jul-2018 Lukas Wunner <[email protected]>

PCI: pciehp: Enable/disable exclusively from IRQ thread

Besides the IRQ thread, there are several other places in the driver
which enable or disable the slot:

- pciehp_probe() enables the slot if i

PCI: pciehp: Enable/disable exclusively from IRQ thread

Besides the IRQ thread, there are several other places in the driver
which enable or disable the slot:

- pciehp_probe() enables the slot if it's occupied and the pciehp_force
module parameter is used.

- pciehp_resume() enables or disables the slot after system sleep.

- pciehp_queue_pushbutton_work() enables or disables the slot after the
5 second delay following an Attention Button press.

- pciehp_sysfs_enable_slot() and pciehp_sysfs_disable_slot() enable or
disable the slot on sysfs write.

This requires locking and complicates pciehp's state machine.

A simplification can be achieved by enabling and disabling the slot
exclusively from the IRQ thread.

Amend the functions listed above to request slot enable/disablement from
the IRQ thread by either synthesizing a Presence Detect Changed event or,
in the case of a disable user request (via sysfs or an Attention Button
press), submitting a newly introduced force disable request. The latter
is needed because the slot shall be forced off despite being occupied.
For this force disable request, avoid colliding with Slot Status register
bits by using a bit number greater than 16.

For synchronous execution of requests (on sysfs write), wait for the
request to finish and retrieve the result. There can only ever be one
sysfs write in flight due to the locking in kernfs_fop_write(), hence
there is no risk of returning the result of a different sysfs request to
user space.

The POWERON_STATE and POWEROFF_STATE is now no longer entered by the
above-listed functions, but solely by the IRQ thread when it begins a
power transition. Afterwards, it moves to STATIC_STATE. The same
applies to canceling the Attention Button work, it likewise becomes an
IRQ thread only operation.

An immediate consequence is that the POWERON_STATE and POWEROFF_STATE is
never observed by the IRQ thread itself, only by functions called in a
different context, such as pciehp_sysfs_enable_slot(). So remove
handling of these states from pciehp_handle_button_press() and
pciehp_handle_link_change() which are exclusively called from the IRQ
thread.

Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>

show more ...


12345