| ce2fa1db | 06-May-2025 |
Alexander Duyck <[email protected]> |
fbnic: Do not allow mailbox to toggle to ready outside fbnic_mbx_poll_tx_ready
We had originally thought to have the mailbox go to ready in the background while we were doing other things. One issue
fbnic: Do not allow mailbox to toggle to ready outside fbnic_mbx_poll_tx_ready
We had originally thought to have the mailbox go to ready in the background while we were doing other things. One issue with this though is that we can't disable it by clearing the ready state without also blocking interrupts or calls to mbx_poll as it will just pop back to life during an interrupt.
In order to prevent that from happening we can pull the code for toggling to ready out of the interrupt path and instead place it in the fbnic_mbx_poll_tx_ready path so that it becomes the only spot where the Rx/Tx can toggle to the ready state. By doing this we can prevent races where we disable the DMA and/or free buffers only to have an interrupt fire and undo what we have done.
Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Alexander Duyck <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Link: https://patch.msgid.link/174654722518.499179.11612865740376848478.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
| 1b34d1c1 | 06-May-2025 |
Alexander Duyck <[email protected]> |
fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context
This change pulls the call to fbnic_fw_xmit_cap_msg out of fbnic_mbx_init_desc_ring and instead places it in the polling function for g
fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context
This change pulls the call to fbnic_fw_xmit_cap_msg out of fbnic_mbx_init_desc_ring and instead places it in the polling function for getting the Tx ready. Doing that we can avoid the potential issue with an interrupt coming in later from the firmware that causes it to get fired in interrupt context.
Fixes: 20d2e88cc746 ("eth: fbnic: Add initial messaging to notify FW of our presence") Signed-off-by: Alexander Duyck <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Link: https://patch.msgid.link/174654721876.499179.9839651602256668493.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
| ab064f60 | 06-May-2025 |
Alexander Duyck <[email protected]> |
fbnic: Improve responsiveness of fbnic_mbx_poll_tx_ready
There were a couple different issues found in fbnic_mbx_poll_tx_ready. Among them were the fact that we were sleeping much longer than we act
fbnic: Improve responsiveness of fbnic_mbx_poll_tx_ready
There were a couple different issues found in fbnic_mbx_poll_tx_ready. Among them were the fact that we were sleeping much longer than we actually needed to as the actual FW could respond in under 20ms. The other issue was that we would just keep polling the mailbox even if the device itself had gone away.
To address the responsiveness issues we can decrease the sleeps to 20ms and use a jiffies based timeout value rather than just counting the number of times we slept and then polled.
To address the hardware going away we can move the check for the firmware BAR being present from where it was and place it inside the loop after the mailbox descriptor ring is initialized and before we sleep so that we just abort and return an error if the device went away during initialization.
With these two changes we see a significant improvement in boot times for the driver.
Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Alexander Duyck <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Link: https://patch.msgid.link/174654721224.499179.2698616208976624755.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
| cdbb2dc3 | 06-May-2025 |
Alexander Duyck <[email protected]> |
fbnic: Cleanup handling of completions
There was an issue in that if we were to shutdown we could be left with a completion in flight as the mailbox went away. To address that I have added an fbnic_
fbnic: Cleanup handling of completions
There was an issue in that if we were to shutdown we could be left with a completion in flight as the mailbox went away. To address that I have added an fbnic_mbx_evict_all_cmpl function that is meant to essentially create a "broken pipe" type response so that all callers will receive an error indicating that the connection has been broken as a result of us shutting down the mailbox.
Fixes: 378e5cc1c6c6 ("eth: fbnic: hwmon: Add completion infrastructure for firmware requests") Signed-off-by: Alexander Duyck <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Link: https://patch.msgid.link/174654720578.499179.380252598204530873.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
| 0f9a959a | 06-May-2025 |
Alexander Duyck <[email protected]> |
fbnic: Actually flush_tx instead of stalling out
The fbnic_mbx_flush_tx function had a number of issues.
First, we were waiting 200ms for the firmware to process the packets. We can drop this to 20
fbnic: Actually flush_tx instead of stalling out
The fbnic_mbx_flush_tx function had a number of issues.
First, we were waiting 200ms for the firmware to process the packets. We can drop this to 20ms and in almost all cases this should be more than enough time. So by changing this we can significantly reduce shutdown time.
Second, we were not making sure that the Tx path was actually shut off. As such we could still have packets added while we were flushing the mailbox. To prevent that we can now clear the ready flag for the Tx side and it should stay down since the interrupt is disabled.
Third, we kept re-reading the tail due to the second issue. The tail should not move after we have started the flush so we can just read it once while we are holding the mailbox Tx lock. By doing that we are guaranteed that the value should be consistent.
Fourth, we were keeping a count of descriptors cleaned due to the second and third issues called out. That count is not a valid reason to be exiting the cleanup, and with the tail only being read once we shouldn't see any cases where the tail moves after the disable so the tracking of count can be dropped.
Fifth, we were using attempts * sleep time to determine how long we would wait in our polling loop to flush out the Tx. This can be very imprecise. In order to tighten up the timing we are shifting over to using a jiffies value of jiffies + 10 * HZ + 1 to determine the jiffies value we should stop polling at as this should be accurate within once sleep cycle for the total amount of time spent polling.
Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Alexander Duyck <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Link: https://patch.msgid.link/174654719929.499179.16406653096197423749.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
| 682a6128 | 06-May-2025 |
Alexander Duyck <[email protected]> |
fbnic: Add additional handling of IRQs
We have two issues that need to be addressed in our IRQ handling.
One is the fact that we can end up double-freeing IRQs in the event of an exception handling
fbnic: Add additional handling of IRQs
We have two issues that need to be addressed in our IRQ handling.
One is the fact that we can end up double-freeing IRQs in the event of an exception handling error such as a PCIe reset/recovery that fails. To prevent that from becoming an issue we can use the msix_vector values to indicate that we have successfully requested/freed the IRQ by only setting or clearing them when we have completed the given action.
The other issue is that we have several potential races in our IRQ path due to us manipulating the mask before the vector has been truly disabled. In order to handle that in the case of the FW mailbox we need to not auto-enable the IRQ and instead will be enabling/disabling it separately. In the case of the PCS vector we can mitigate this by unmapping it and synchronizing the IRQ before we clear the mask.
The general order of operations after this change is now to request the interrupt, poll the FW mailbox to ready, and then enable the interrupt. For the shutdown we do the reverse where we disable the interrupt, flush any pending Tx, and then free the IRQ. I am renaming the enable/disable to request/free to be equivilent with the IRQ calls being used. We may see additions in the future to enable/disable the IRQs versus request/free them for certain use cases.
Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Fixes: 69684376eed5 ("eth: fbnic: Add link detection") Signed-off-by: Alexander Duyck <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Link: https://patch.msgid.link/174654719271.499179.3634535105127848325.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
| 3b12f00d | 06-May-2025 |
Alexander Duyck <[email protected]> |
fbnic: Gate AXI read/write enabling on FW mailbox
In order to prevent the device from throwing spurious writes and/or reads at us we need to gate the AXI fabric interface to the PCIe until such time
fbnic: Gate AXI read/write enabling on FW mailbox
In order to prevent the device from throwing spurious writes and/or reads at us we need to gate the AXI fabric interface to the PCIe until such time as we know the FW is in a known good state.
To accomplish this we use the mailbox as a mechanism for us to recognize that the FW has acknowledged our presence and is no longer sending any stale message data to us.
We start in fbnic_mbx_init by calling fbnic_mbx_reset_desc_ring function, disabling the DMA in both directions, and then invalidating all the descriptors in each ring.
We then poll the mailbox in fbnic_mbx_poll_tx_ready and when the interrupt is set by the FW we pick it up and mark the mailboxes as ready, while also enabling the DMA.
Once we have completed all the transactions and need to shut down we call into fbnic_mbx_clean which will in turn call fbnic_mbx_reset_desc_ring for each ring and shut down the DMA and once again invalidate the descriptors.
Fixes: 3646153161f1 ("eth: fbnic: Add register init to set PCIe/Ethernet device config") Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Alexander Duyck <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Link: https://patch.msgid.link/174654718623.499179.7445197308109347982.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
| 6cbf18a0 | 06-Mar-2025 |
Jakub Kicinski <[email protected]> |
eth: fbnic: support ring size configuration
Support ethtool -g / -G. Leverage the code added for -l / -L to alloc / stop / start / free.
Check parameters against HW min/max but also our own min/max
eth: fbnic: support ring size configuration
Support ethtool -g / -G. Leverage the code added for -l / -L to alloc / stop / start / free.
Check parameters against HW min/max but also our own min/max. Min HW queue is 16 entries, we can't deal with TWQs that small because of the queue waking logic. Add similar contraint on RCQ for symmetry.
We need 3 sizes on Rx, as the NIC does header-data split two separate buffer pools: (1) head page ring - how many empty pages we post for headers (2) payload page ring - how many empty pages we post for payloads (3) completion ring - where NIC produces the Rx descriptors
Acked-by: Joe Damato <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| bfb522f3 | 06-Mar-2025 |
Jakub Kicinski <[email protected]> |
eth: fbnic: fix typo in compile assert
We should be validating the Rx count on the Rx struct, not the Tx struct. There is no real change here, rx_stats and tx_stats are instances of the same struct.
eth: fbnic: fix typo in compile assert
We should be validating the Rx count on the Rx struct, not the Tx struct. There is no real change here, rx_stats and tx_stats are instances of the same struct.
Acked-by: Joe Damato <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| e5cf5107 | 28-Feb-2025 |
Lee Trager <[email protected]> |
eth: fbnic: Update fbnic_tlv_attr_get_string() to work like nla_strscpy()
Allow fbnic_tlv_attr_get_string() to return an error code. In the event the source mailbox attribute is missing return -EINV
eth: fbnic: Update fbnic_tlv_attr_get_string() to work like nla_strscpy()
Allow fbnic_tlv_attr_get_string() to return an error code. In the event the source mailbox attribute is missing return -EINVAL. Like nla_strscpy() return -E2BIG when the source string is larger than the destination string. In this case the amount of data copied is equal to dstsize.
Signed-off-by: Lee Trager <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| e4e7c9be | 21-Feb-2025 |
Mohsin Bashir <[email protected]> |
eth: fbnic: Consolidate PUL_USER CSR section
Move PUL_USER CSRs in the relevant section, update the end boundary address, and remove the redundant definition of end boundary.
Signed-off-by: Mohsin
eth: fbnic: Consolidate PUL_USER CSR section
Move PUL_USER CSRs in the relevant section, update the end boundary address, and remove the redundant definition of end boundary.
Signed-off-by: Mohsin Bashir <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
| 0ec02328 | 11-Feb-2025 |
Jakub Kicinski <[email protected]> |
eth: fbnic: re-sort the objects in the Makefile
Looks like recent commit broke the sort order, fix it.
Acked-by: Joe Damato <[email protected]> Reviewed-by: Alexander Lobakin <aleksander.lobakin@i
eth: fbnic: re-sort the objects in the Makefile
Looks like recent commit broke the sort order, fix it.
Acked-by: Joe Damato <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|