|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3 |
|
| #
c9b19ea6 |
| 15-Apr-2025 |
Marek Szyprowski <[email protected]> |
dma-mapping: avoid potential unused data compilation warning
When CONFIG_NEED_DMA_MAP_STATE is not defined, dma-mapping clients might report unused data compilation warnings for the arguments of dma_unmap_*() calls. Redefine the macros for those calls so that the compiler can see that it is okay when the provided arguments are not used.
Reported-by: Andy Shevchenko <[email protected]> Suggested-by: Jakub Kicinski <[email protected]> Signed-off-by: Marek Szyprowski <[email protected]> Tested-by: Andy Shevchenko <[email protected]> Link: https://lore.kernel.org/r/[email protected]
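The idea can be sketched in plain C outside the kernel. This is only an illustration, not the kernel's actual macro text: the names `demo_unmap_addr`, `demo_unmap_addr_set`, `struct demo_ring_entry` and `demo_release` are hypothetical. It shows how a stub macro can still reference its arguments so that the compiler does not flag them as unused.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver's per-buffer bookkeeping. */
struct demo_ring_entry {
	void *skb;
};

/*
 * Stub variants for a configuration that keeps no unmap state.
 * Each value argument is evaluated and cast to void, so a variable
 * that is only ever passed to these macros still counts as "used".
 * FIELD is intentionally dropped: the field does not exist here.
 */
#define demo_unmap_addr(PTR, FIELD)          ((void)(PTR), 0UL)
#define demo_unmap_addr_set(PTR, FIELD, VAL) \
	do { (void)(PTR); (void)(VAL); } while (0)

/* Returns the stored DMA address, which is always 0 with the stubs. */
static unsigned long demo_release(struct demo_ring_entry *entry,
				  unsigned long mapping)
{
	assert(entry != NULL);
	demo_unmap_addr_set(entry, dma, mapping); /* 'mapping' is referenced */
	return demo_unmap_addr(entry, dma);       /* 'entry' is referenced */
}
```

With empty macro bodies instead, `mapping` would be set but never read in such a build, which is the kind of warning the commit describes.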
|
|
Revision tags: v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3 |
|
| #
be164349 |
| 13-Oct-2024 |
Christoph Hellwig <[email protected]> |
dma-mapping: drop unneeded includes from dma-mapping.h
Back in the day a lot of logic was implemented inline in dma-mapping.h and needed various includes. Most of this has long been moved out of line, so we can drop various includes to improve kernel rebuild times.
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
Revision tags: v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1 |
|
| #
334304ac |
| 19-Jul-2024 |
Christoph Hellwig <[email protected]> |
dma-mapping: don't return errors from dma_set_max_seg_size
A NULL dev->dma_parms indicates either a bus that is not DMA capable or a grave bug in the implementation of the bus code.
There isn't much the driver can do in terms of error handling for either case, so just warn and continue as DMA operations will fail anyway.
Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Robin Murphy <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Acked-by: Ulf Hansson <[email protected]> # For MMC
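A minimal user-space sketch of the "warn and continue" style described above, assuming simplified stand-ins for the kernel structures. `struct demo_device`, `struct demo_dma_parms` and `demo_set_max_seg_size` are hypothetical names, and `fprintf` stands in for whatever warning mechanism the kernel uses.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Hypothetical, simplified mirrors of the kernel structures. */
struct demo_dma_parms {
	unsigned int max_segment_size;
};

struct demo_device {
	struct demo_dma_parms *dma_parms; /* NULL: the bus never set it up */
};

/*
 * Setter without a return value: callers have no error path to write.
 * A missing dma_parms is reported once here and otherwise ignored.
 */
static void demo_set_max_seg_size(struct demo_device *dev, unsigned int size)
{
	assert(dev != NULL);
	if (dev->dma_parms == NULL) {
		fprintf(stderr, "demo: device has no dma_parms, ignoring\n");
		return;
	}
	dev->dma_parms->max_segment_size = size;
}
```

The design choice is that a driver cannot usefully recover from this condition, so forcing every caller to check a return value adds code without adding safety.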
|
| #
560a861a |
| 19-Jul-2024 |
Christoph Hellwig <[email protected]> |
dma-mapping: don't return errors from dma_set_seg_boundary
A NULL dev->dma_parms indicates either a bus that is not DMA capable or a grave bug in the implementation of the bus code.
There isn't much the driver can do in terms of error handling for either case, so just warn and continue as DMA operations will fail anyway.
Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Robin Murphy <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]>
|
| #
c42a0126 |
| 19-Jul-2024 |
Christoph Hellwig <[email protected]> |
dma-mapping: don't return errors from dma_set_min_align_mask
A NULL dev->dma_parms indicates either a bus that is not DMA capable or a grave bug in the implementation of the bus code.
There isn't much the driver can do in terms of error handling for either case, so just warn and continue as DMA operations will fail anyway.
Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Robin Murphy <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]>
|
|
Revision tags: v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9 |
|
| #
a6016aac |
| 09-May-2024 |
Alexander Lobakin <[email protected]> |
dma: fix DMA sync for drivers not calling dma_set_mask*()
There are several reports that the DMA sync shortcut broke non-coherent devices. dev->dma_need_sync is false after the &device allocation and if a driver didn't call dma_set_mask*(), it will still be false even if the device is not DMA-coherent and thus needs synchronizing. Due to historical reasons, there's still a lot of drivers not calling it. Invert the boolean, so that the sync will be performed by default and the shortcut will be enabled only when calling dma_set_mask*().
Reported-by: Steven Price <[email protected]> Closes: https://lore.kernel.org/lkml/[email protected] Reported-by: Marek Szyprowski <[email protected]> Closes: https://lore.kernel.org/lkml/[email protected] Fixes: f406c8e4b770 ("dma: avoid redundant calls for sync operations") Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Tested-by: Steven Price <[email protected]> Tested-by: Marek Szyprowski <[email protected]>
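The effect of inverting the boolean can be sketched in user-space C. This is a simplified model, not the kernel code: `struct demo_device`, `skip_sync`, `demo_set_mask` and `demo_need_sync` are hypothetical names, and a zero-initialized struct stands in for a freshly allocated device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified device; a zeroed struct models a fresh allocation. */
struct demo_device {
	bool coherent;  /* device is DMA-coherent */
	bool skip_sync; /* inverted flag: false (the default) means "do sync" */
};

/* Only an explicit mask setup may enable the shortcut. */
static void demo_set_mask(struct demo_device *dev)
{
	assert(dev != NULL);
	dev->skip_sync = dev->coherent;
}

/* Sync is needed unless the shortcut was explicitly enabled. */
static bool demo_need_sync(const struct demo_device *dev)
{
	return !dev->skip_sync;
}
```

Because the zero value now means "sync", a driver that never calls the mask setup gets the safe behaviour by default, at the cost of missing the optimization.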
|
| #
f406c8e4 |
| 07-May-2024 |
Alexander Lobakin <[email protected]> |
dma: avoid redundant calls for sync operations
Quite often, devices do not need dma_sync operations on x86_64 at least. Indeed, when dev_is_dma_coherent(dev) is true and dev_use_swiotlb(dev) is false, iommu_dma_sync_single_for_cpu() and friends do nothing.
However, indirectly calling them when CONFIG_RETPOLINE=y consumes about 10% of cycles on a cpu receiving packets from softirq at ~100Gbit rate. Even if/when CONFIG_RETPOLINE is not set, there is a cost of about 3%.
Add dev->need_dma_sync boolean and turn it off during the device initialization (dma_set_mask()) depending on the setup: dev_is_dma_coherent() for the direct DMA, !(sync_single_for_device || sync_single_for_cpu) or the new dma_map_ops flag, %DMA_F_CAN_SKIP_SYNC, advertised for non-NULL DMA ops. Then later, if/when swiotlb is used for the first time, the flag is reset back to on, from swiotlb_tbl_map_single().
On iavf, the UDP trafficgen with XDP_DROP in skb mode test shows +3-5% increase for direct DMA.
Suggested-by: Christoph Hellwig <[email protected]> # direct DMA shortcut Co-developed-by: Eric Dumazet <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
|
| #
fe7514b1 |
| 07-May-2024 |
Alexander Lobakin <[email protected]> |
dma: compile-out DMA sync op calls when not used
Some platforms do have DMA, but DMA there is always direct and coherent. Currently, even on such platforms DMA sync operations are compiled and called. Add a new hidden Kconfig symbol, DMA_NEED_SYNC, and set it only when either sync operations are needed or there is DMA ops or swiotlb or DMA debug is enabled. Compile global dma_sync_*() and dma_need_sync() only when it's set, otherwise provide empty inline stubs. The change allows for future optimizations of DMA sync calls depending on runtime conditions.
Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
|
|
Revision tags: v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6 |
|
| #
8ae0e970 |
| 28-Oct-2023 |
Jia He <[email protected]> |
dma-mapping: move dma_addressing_limited() out of line
This patch moves dma_addressing_limited() out of line, serving as a preliminary step to prevent the introduction of a new publicly accessible low-level helper when validating whether all system RAM is mapped within the DMA mapping range.
Suggested-by: Christoph Hellwig <[email protected]> Signed-off-by: Jia He <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
|
|
Revision tags: v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5 |
|
| #
79636caa |
| 01-Aug-2023 |
Petr Tesarik <[email protected]> |
swiotlb: if swiotlb is full, fall back to a transient memory pool
Try to allocate a transient memory pool if no suitable slots can be found and the respective SWIOTLB is allowed to grow. The transient pool is just big enough for this one bounce buffer. It is inserted into a per-device list of transient memory pools, and it is freed again when the bounce buffer is unmapped.
Transient memory pools are kept in an RCU list. A memory barrier is required after adding a new entry, because any address within a transient buffer must be immediately recognized as belonging to the SWIOTLB, even if it is passed to another CPU.
Deletion does not require any synchronization beyond RCU ordering guarantees. After a buffer is unmapped, its physical addresses may no longer be passed to the DMA API, so the memory range of the corresponding stale entry in the RCU list never matches. If the memory range gets allocated again, then it happens only after an RCU quiescent state.
Since bounce buffers can now be allocated from different pools, add a parameter to swiotlb_alloc_pool() to let the caller know which memory pool is used. Add swiotlb_find_pool() to find the memory pool corresponding to an address. This function is now also used by is_swiotlb_buffer(), because a simple boundary check is no longer sufficient.
The logic in swiotlb_alloc_tlb() is taken from __dma_direct_alloc_pages(), simplified and enhanced to use coherent memory pools if needed.
Note that this is not the most efficient way to provide a bounce buffer, but when a DMA buffer can't be mapped, something may (and will) actually break. At that point it is better to make an allocation, even if it may be an expensive operation.
Signed-off-by: Petr Tesarik <[email protected]> Reviewed-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
|
|
Revision tags: v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7 |
|
| #
8c57da28 |
| 12-Jun-2023 |
Catalin Marinas <[email protected]> |
dma: allow dma_get_cache_alignment() to be overridden by the arch code
On arm64, ARCH_DMA_MINALIGN is larger than most cache line size configurations deployed. Allow an architecture to override dma_get_cache_alignment() in order to return a run-time probed value (e.g. cache_line_size()).
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Tested-by: Isaac J. Manjarres <[email protected]> Cc: Robin Murphy <[email protected]> Cc: Will Deacon <[email protected]> Cc: Alasdair Kergon <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Jerry Snitselaar <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Lars-Peter Clausen <[email protected]> Cc: Logan Gunthorpe <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Mark Brown <[email protected]> Cc: Mike Snitzer <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Saravana Kannan <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
|
| #
4ab5f8ec |
| 12-Jun-2023 |
Catalin Marinas <[email protected]> |
mm/slab: decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN
Patch series "mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8", v7.
A series reducing the kmalloc() minimum alignment on arm64 to 8 (from 128).
This patch (of 17):
In preparation for supporting a kmalloc() minimum alignment smaller than the arch DMA alignment, decouple the two definitions. This requires that either the kmalloc() caches are aligned to a (run-time) cache-line size or the DMA API bounces unaligned kmalloc() allocations. Subsequent patches will implement both options.
After this patch, ARCH_DMA_MINALIGN is expected to be used in static alignment annotations and defined by an architecture to be the maximum alignment for all supported configurations/SoCs in a single Image. Architectures opting in to a smaller ARCH_KMALLOC_MINALIGN will need to define its value in the arch headers.
Since ARCH_DMA_MINALIGN is now always defined, adjust the #ifdef in dma_get_cache_alignment() so that there is no change for architectures not requiring a minimum DMA alignment.
Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]> Tested-by: Isaac J. Manjarres <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Robin Murphy <[email protected]> Cc: Alasdair Kergon <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Mark Brown <[email protected]> Cc: Mike Snitzer <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Saravana Kannan <[email protected]> Cc: Will Deacon <[email protected]> Cc: Jerry Snitselaar <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Lars-Peter Clausen <[email protected]> Cc: Logan Gunthorpe <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
|
|
Revision tags: v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2 |
|
| #
9fc18f6d |
| 21-Aug-2022 |
Christoph Hellwig <[email protected]> |
dma-mapping: mark dma_supported static
Now that the remaining users in drivers are gone, this function can be marked static.
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
Revision tags: v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6 |
|
| #
159bf192 |
| 08-Jul-2022 |
Logan Gunthorpe <[email protected]> |
dma-mapping: add flags to dma_map_ops to indicate PCI P2PDMA support
Add a flags member to the dma_map_ops structure with one flag to indicate support for PCI P2PDMA.
Also, add a helper to check if a device supports PCI P2PDMA.
Signed-off-by: Logan Gunthorpe <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
|
| #
a229cc14 |
| 14-Jul-2022 |
John Garry <[email protected]> |
dma-mapping: add dma_opt_mapping_size()
Streaming DMA mapping involving an IOMMU may be much slower for larger total mapping size. This is because every IOMMU DMA mapping requires an IOVA to be allocated and freed. IOVA sizes above a certain limit are not cached, which can have a big impact on DMA mapping performance.
Provide an API for device drivers to know this "optimal" limit, such that they may try to produce mappings which don't exceed it.
Signed-off-by: John Garry <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Acked-by: Martin K. Petersen <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
|
|
Revision tags: v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1 |
|
| #
901c7280 |
| 28-Mar-2022 |
Linus Torvalds <[email protected]> |
Reinstate some of "swiotlb: rework "fix info leak with DMA_FROM_DEVICE""
Halil Pasic points out [1] that, rather than the full revert of that commit (done in bddac7c1e02b), a partial revert that only reverts the problematic case but still keeps some of the cleanups is probably better.
And that partial revert [2] had already been verified by Oleksandr Natalenko to also fix the issue, I had just missed that in the long discussion.
So let's reinstate the cleanups from commit aa6f8dcbab47 ("swiotlb: rework "fix info leak with DMA_FROM_DEVICE""), and effectively only revert the part that caused problems.
Link: https://lore.kernel.org/all/[email protected]/ [1] Link: https://lore.kernel.org/all/[email protected]/ [2] Link: https://lore.kernel.org/all/[email protected]/ [3] Suggested-by: Halil Pasic <[email protected]> Tested-by: Oleksandr Natalenko <[email protected]> Cc: Christoph Hellwig <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
|
| #
bddac7c1 |
| 26-Mar-2022 |
Linus Torvalds <[email protected]> |
Revert "swiotlb: rework "fix info leak with DMA_FROM_DEVICE""
This reverts commit aa6f8dcbab473f3a3c7454b74caa46d36cdc5d13.
It turns out this breaks at least the ath9k wireless driver, and possibly others.
What the ath9k driver does on packet receive is to set up the DMA transfer with:
    int ath_rx_init(..)
    ..
        bf->bf_buf_addr = dma_map_single(sc->dev, skb->data,
                                         common->rx_bufsize,
                                         DMA_FROM_DEVICE);

and then the receive logic (through ath_rx_tasklet()) will fetch incoming packets

    static bool ath_edma_get_buffers(..)
    ..
        dma_sync_single_for_cpu(sc->dev, bf->bf_buf_addr,
                                common->rx_bufsize, DMA_FROM_DEVICE);

        ret = ath9k_hw_process_rxdesc_edma(ah, rs, skb->data);
        if (ret == -EINPROGRESS) {
            /*let device gain the buffer again*/
            dma_sync_single_for_device(sc->dev, bf->bf_buf_addr,
                                       common->rx_bufsize, DMA_FROM_DEVICE);
            return false;
        }

and it's worth noting how that first DMA sync:
dma_sync_single_for_cpu(..DMA_FROM_DEVICE);
is there to make sure the CPU can read the DMA buffer (possibly by copying it from the bounce buffer area, or by doing some cache flush). The iommu correctly turns that into a "copy from bounce buffer" so that the driver can look at the state of the packets.
In the meantime, the device may continue to write to the DMA buffer, but we at least have a snapshot of the state due to that first DMA sync.
But that _second_ DMA sync:
dma_sync_single_for_device(..DMA_FROM_DEVICE);
is telling the DMA mapping that the CPU wasn't interested in the area because the packet wasn't there. In the case of a DMA bounce buffer, that is a no-op.
Note how it's not a sync for the CPU (the "for_device()" part), and it's not a sync for data written by the CPU (the "DMA_FROM_DEVICE" part).
Or rather, it _should_ be a no-op. That's what commit aa6f8dcbab47 broke: it made the code bounce the buffer unconditionally, and changed the DMA_FROM_DEVICE to just unconditionally and illogically be DMA_TO_DEVICE.
[ Side note: purely within the confines of the swiotlb driver it wasn't entirely illogical: The reason it did that odd DMA_FROM_DEVICE -> DMA_TO_DEVICE conversion thing is because inside the swiotlb driver, it uses just a swiotlb_bounce() helper that doesn't care about the whole distinction of who the sync is for - only which direction to bounce.
So it took the "sync for device" to mean that the CPU must have been the one writing, and thought it meant DMA_TO_DEVICE. ]
Also note how the commentary in that commit was wrong, probably due to that whole confusion, claiming that the commit makes the swiotlb code
"bounce unconditionally (that is, also when dir == DMA_TO_DEVICE) in order to avoid synchronising back stale data from the swiotlb buffer"
which is nonsensical for two reasons:
- that "also when dir == DMA_TO_DEVICE" is nonsensical, as that was exactly when it always did - and should do - the bounce.
- since this is a sync for the device (not for the CPU), we're clearly fundamentally not copying back stale data from the bounce buffers at all, because we'd be copying *to* the bounce buffers.
So that commit was just very confused. It confused the direction of the synchronization (to the device, not the cpu) with the direction of the DMA (from the device).
Reported-and-bisected-by: Oleksandr Natalenko <[email protected]> Reported-by: Olha Cherevyk <[email protected]> Cc: Halil Pasic <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Kalle Valo <[email protected]> Cc: Robin Murphy <[email protected]> Cc: Toke Høiland-Jørgensen <[email protected]> Cc: Maxime Bizon <[email protected]> Cc: Johannes Berg <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
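The distinction between "who the sync is for" and "which way the DMA goes" can be written down as a small decision function. This is a sketch of the behaviour the message argues for, not the swiotlb source; the enum and function names are hypothetical, and the treatment of the bidirectional case is an assumption added for completeness.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirrors of the DMA direction and sync-target concepts. */
enum demo_dir { DEMO_BIDIRECTIONAL, DEMO_TO_DEVICE, DEMO_FROM_DEVICE };
enum demo_sync_target { DEMO_SYNC_FOR_CPU, DEMO_SYNC_FOR_DEVICE };

/*
 * Returns true when a sync has to copy data through the bounce buffer.
 *  - sync for CPU:    copy bounce -> original only if the device may have written.
 *  - sync for device: copy original -> bounce only if the CPU may have written.
 */
static bool demo_sync_needs_bounce(enum demo_sync_target target,
				   enum demo_dir dir)
{
	assert(dir == DEMO_BIDIRECTIONAL || dir == DEMO_TO_DEVICE ||
	       dir == DEMO_FROM_DEVICE);
	if (target == DEMO_SYNC_FOR_CPU)
		return dir == DEMO_FROM_DEVICE || dir == DEMO_BIDIRECTIONAL;
	return dir == DEMO_TO_DEVICE || dir == DEMO_BIDIRECTIONAL;
}
```

In this model the ath9k pattern quoted above, a sync for the device with DMA_FROM_DEVICE, returns false, which matches the message's point that it should be a no-op.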
|
|
Revision tags: v5.17, v5.17-rc8, v5.17-rc7 |
|
| #
aa6f8dcb |
| 05-Mar-2022 |
Halil Pasic <[email protected]> |
swiotlb: rework "fix info leak with DMA_FROM_DEVICE"
Unfortunately, we ended up merging an old version of the patch "fix info leak with DMA_FROM_DEVICE" instead of merging the latest one. Christoph (the swiotlb maintainer) asked me to create an incremental fix (after I pointed out the mix-up and asked him for guidance). So here we go.
The main differences between what we got and what was agreed are:
* swiotlb_sync_single_for_device is also required to do an extra bounce
* We decided not to introduce DMA_ATTR_OVERWRITE until we have exploiters
* The implementation of DMA_ATTR_OVERWRITE is flawed: DMA_ATTR_OVERWRITE must take precedence over DMA_ATTR_SKIP_CPU_SYNC
Thus this patch removes DMA_ATTR_OVERWRITE, and makes swiotlb_sync_single_for_device() bounce unconditionally (that is, also when dir == DMA_TO_DEVICE) in order to avoid synchronising back stale data from the swiotlb buffer.
Let me note, that if the size used with dma_sync_* API is less than the size used with dma_[un]map_*, under certain circumstances we may still end up with swiotlb not being transparent. In that sense, this is no perfect fix either.
To get this bullet proof, we would have to bounce the entire mapping/bounce buffer. For that we would have to figure out the starting address, and the size of the mapping in swiotlb_sync_single_for_device(). While this does seem possible, there seems to be no firm consensus on how things are supposed to work.
Signed-off-by: Halil Pasic <[email protected]> Fixes: ddbd89deb7d3 ("swiotlb: fix info leak with DMA_FROM_DEVICE") Cc: [email protected] Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
|
|
Revision tags: v5.17-rc6, v5.17-rc5, v5.17-rc4 |
|
| #
ddbd89de |
| 11-Feb-2022 |
Halil Pasic <[email protected]> |
swiotlb: fix info leak with DMA_FROM_DEVICE
The problem I'm addressing was discovered by the LTP test covering cve-2018-1000204.
A short description of what happens follows:
1) The test case issues a command code 00 (TEST UNIT READY) via the SG_IO interface with: dxfer_len == 524288, dxdfer_dir == SG_DXFER_FROM_DEV and a corresponding dxferp. The peculiar thing about this is that TUR is not reading from the device.
2) In sg_start_req() the invocation of blk_rq_map_user() effectively bounces the user-space buffer, as if the device was to transfer into it. Since commit a45b599ad808 ("scsi: sg: allocate with __GFP_ZERO in sg_build_indirect()") we make sure this first bounce buffer is allocated with GFP_ZERO.
3) For the rest of the story we keep ignoring that we have a TUR, so the device won't touch the buffer we prepare as if we had a DMA_FROM_DEVICE type of situation. My setup uses a virtio-scsi device and the buffer allocated by SG is mapped by the function virtqueue_add_split() which uses DMA_FROM_DEVICE for the "in" sgs (here scatter-gather and not scsi generics). This mapping involves bouncing via the swiotlb (we need swiotlb to do virtio in protected guest like s390 Secure Execution, or AMD SEV).
4) When the SCSI TUR is done, we first copy back the content of the second (that is swiotlb) bounce buffer (which most likely contains some previous IO data), to the first bounce buffer, which contains all zeros. Then we copy back the content of the first bounce buffer to the user-space buffer.
5) The test case detects that the buffer, which it zero-initialized, ain't all zeros and fails.
One can argue that this is an swiotlb problem, because without swiotlb we leak all zeros, and the swiotlb should be transparent in a sense that it does not affect the outcome (if all other participants are well behaved).
Copying the content of the original buffer into the swiotlb buffer is the only way I can think of to make swiotlb transparent in such scenarios. So let's do just that if in doubt, but allow the driver to tell us that the whole mapped buffer is going to be overwritten, in which case we can preserve the old behavior and avoid the performance impact of the extra bounce.
Signed-off-by: Halil Pasic <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
|
|
Revision tags: v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6 |
|
| #
2a047e06 |
| 09-Aug-2021 |
Christoph Hellwig <[email protected]> |
dma-mapping: return an unsigned int from dma_map_sg{,_attrs}
These can only return 0 for failure or the number of entries, so turn the return value into an unsigned int.
Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Logan Gunthorpe <[email protected]>
|
|
Revision tags: v5.14-rc5, v5.14-rc4 |
|
| #
fffe3cc8 |
| 29-Jul-2021 |
Logan Gunthorpe <[email protected]> |
dma-mapping: allow map_sg() ops to return negative error codes
Allow dma_map_sgtable() to pass errors from the map_sg() ops. This will be required for returning appropriate error codes when mapping P2PDMA memory.
Introduce __dma_map_sg_attrs() which will return the raw error code from the map_sg operation (whether it be negative or zero). Then add a dma_map_sg_attrs() wrapper to convert any negative errors to zero to satisfy the existing calling convention.
dma_map_sgtable() defines three error codes that .map_sg implementations are allowed to return: -EINVAL, -ENOMEM and -EIO. The latter of which is a generic return for cases that are passing DMA_MAPPING_ERROR through.
dma_map_sgtable() will convert a zero error return for old map_sg() ops into a -EIO return and return any negative errors as reported.
This allows map_sg implementations to start returning multiple negative error codes. Legacy map_sg implementations can continue to return zero until they are all converted.
Signed-off-by: Logan Gunthorpe <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
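The two calling conventions described above can be sketched in user-space C. This is a simplified model under stated assumptions: the `demo_*` names are hypothetical, the op takes only an entry count, and the sg_table-style wrapper returns 0 on success without recording the mapped count anywhere.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a .map_sg op: >0 entries, 0 (legacy failure), or -errno. */
typedef int (*demo_map_sg_fn)(int nents);

/* Raw call: hands back whatever the op returned, negative or zero included. */
static int demo_raw_map_sg(demo_map_sg_fn op, int nents)
{
	assert(op != NULL);
	return op(nents);
}

/* Legacy-style wrapper: existing callers only ever see 0 on failure. */
static int demo_map_sg(demo_map_sg_fn op, int nents)
{
	int ret = demo_raw_map_sg(op, nents);

	return ret < 0 ? 0 : ret;
}

/* sg_table-style wrapper: 0 on success, negative errno on failure. */
static int demo_map_sgtable(demo_map_sg_fn op, int nents)
{
	int ret = demo_raw_map_sg(op, nents);

	if (ret < 0)
		return ret;  /* pass the op's error code through */
	if (ret == 0)
		return -EIO; /* legacy op that still reports failure as 0 */
	return 0;
}

/* Three example ops used to exercise the wrappers. */
static int demo_op_ok(int nents)     { return nents; }
static int demo_op_legacy(int nents) { (void)nents; return 0; }
static int demo_op_nomem(int nents)  { (void)nents; return -ENOMEM; }
```

Keeping the conversion in the wrappers means an op can be switched to negative error codes without touching callers that still test for zero.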
|
|
Revision tags: v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5 |
|
| #
a7f3d3d3 |
| 26-Mar-2021 |
Heiner Kallweit <[email protected]> |
dma-mapping: add unlikely hint to error path in dma_mapping_error
Zillions of drivers use the unlikely() hint when checking the result of dma_mapping_error(). This is an inline function anyway, so we can move the hint into the function and remove it from drivers over time.
Signed-off-by: Heiner Kallweit <[email protected]> Reviewed-by: Robin Murphy <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
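A rough user-space sketch of moving the branch hint into the helper, assuming a GCC- or Clang-compatible compiler for `__builtin_expect`. All `demo_*` and `DEMO_*` names are hypothetical, the all-ones sentinel is an assumption for illustration, and the returned -1 is a placeholder rather than the kernel's error code.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t demo_dma_addr_t;

/* Assumed sentinel for this sketch: an "all bits set" error address. */
#define DEMO_MAPPING_ERROR (~(demo_dma_addr_t)0)
static_assert(DEMO_MAPPING_ERROR == UINT64_MAX, "sentinel must be all ones");

/* GCC/Clang branch hint; a local, hypothetical spelling of unlikely(). */
#define demo_unlikely(x) __builtin_expect(!!(x), 0)

/* The hint lives inside the helper, so callers need not repeat it. */
static inline int demo_mapping_error(demo_dma_addr_t addr)
{
	if (demo_unlikely(addr == DEMO_MAPPING_ERROR))
		return -1; /* placeholder error value for this sketch */
	return 0;
}
```

A caller can then write a plain `if (demo_mapping_error(addr))` and still get the cold-path layout, since the function is inline.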
|
|
Revision tags: v5.12-rc4, v5.12-rc3 |
|
| #
84fcfbda |
| 12-Mar-2021 |
Wang Qing <[email protected]> |
dma-mapping: remove a pointless empty line in dma_alloc_coherent
Signed-off-by: Wang Qing <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
|
|
Revision tags: v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6 |
|
| #
7d5b5738 |
| 28-Jan-2021 |
Christoph Hellwig <[email protected]> |
dma-mapping: add a dma_alloc_noncontiguous API
Add a new API that returns a potentially virtually non-contiguous sg_table and a DMA address. This API is only properly implemented for dma-iommu and will simply return a contiguous chunk as a fallback.
The intent is that drivers can use this API if either:
- no kernel mapping or only temporary kernel mappings are required. That is as a better replacement for DMA_ATTR_NO_KERNEL_MAPPING
- a kernel mapping is required for cached and DMA mapped pages, but the driver also needs the pages to e.g. map them to userspace. In that sense it is a replacement for some aspects of the recently removed and never fully implemented DMA_ATTR_NON_CONSISTENT
Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Tomasz Figa <[email protected]> Tested-by: Ricardo Ribalda <[email protected]>
|
| #
eedb0b12 |
| 28-Jan-2021 |
Christoph Hellwig <[email protected]> |
dma-mapping: add a dma_mmap_pages helper
Add a helper to map memory allocated using dma_alloc_pages into a user address space, similar to the dma_alloc_attrs function for coherent allocations.
Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Tomasz Figa <[email protected]> Tested-by: Ricardo Ribalda <[email protected]>
|