History log of /linux-6.15/include/linux/dma-mapping.h (Results 1 – 25 of 205)
Revision    Date    Author    Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3
# c9b19ea6 15-Apr-2025 Marek Szyprowski <[email protected]>

dma-mapping: avoid potential unused data compilation warning

When CONFIG_NEED_DMA_MAP_STATE is not defined, dma-mapping clients might
report unused data compilation warnings for dma_unmap_*() call
arguments. Redefine macros for those calls to let the compiler notice that
it is okay when the provided arguments are not used.

Reported-by: Andy Shevchenko <[email protected]>
Suggested-by: Jakub Kicinski <[email protected]>
Signed-off-by: Marek Szyprowski <[email protected]>
Tested-by: Andy Shevchenko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
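
For illustration, the affected client pattern is the dma_unmap_addr()/
dma_unmap_len() state that compiles away entirely when
CONFIG_NEED_DMA_MAP_STATE is unset, so values only consumed through these
macros could look unused to the compiler. A minimal sketch (the ring-entry
structure and names are hypothetical):

    #include <linux/dma-mapping.h>

    struct tx_entry {
            void *buf;
            DEFINE_DMA_UNMAP_ADDR(addr);  /* empty if !CONFIG_NEED_DMA_MAP_STATE */
            DEFINE_DMA_UNMAP_LEN(len);
    };

    static void tx_entry_unmap(struct device *dev, struct tx_entry *e)
    {
            /*
             * In the !CONFIG_NEED_DMA_MAP_STATE build the values fed to
             * dma_unmap_addr()/dma_unmap_len() are dropped, which is what
             * could trigger unused-data warnings in callers before this fix.
             */
            dma_unmap_single(dev, dma_unmap_addr(e, addr),
                             dma_unmap_len(e, len), DMA_TO_DEVICE);
    }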



Revision tags: v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3
# be164349 13-Oct-2024 Christoph Hellwig <[email protected]>

dma-mapping: drop unneeded includes from dma-mapping.h

Back in the day a lot of logic was implemented inline in dma-mapping.h and
needed various includes. Most of this has long been moved out of line,
so we can drop various includes to improve kernel rebuild times.

Signed-off-by: Christoph Hellwig <[email protected]>



Revision tags: v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1
# 334304ac 19-Jul-2024 Christoph Hellwig <[email protected]>

dma-mapping: don't return errors from dma_set_max_seg_size

A NULL dev->dma_parms indicates either a bus that is not DMA capable or a
grave bug in the implementation of the bus code.

There isn't much the driver can do in terms of error handling for either
case, so just warn and continue as DMA operations will fail anyway.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Robin Murphy <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
Acked-by: Ulf Hansson <[email protected]> # For MMC
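
For context, a sketch of what callers look like after this change (and the
analogous dma_set_seg_boundary()/dma_set_min_align_mask() changes below):
the setters are simply called from probe with no error handling, since a
NULL dev->dma_parms now only warns. The helper name and limits here are
made up:

    #include <linux/dma-mapping.h>
    #include <linux/sizes.h>

    static void foo_set_dma_limits(struct device *dev)
    {
            /* No return values to check any more. */
            dma_set_max_seg_size(dev, SZ_64K);
            dma_set_seg_boundary(dev, SZ_4K - 1);
            dma_set_min_align_mask(dev, SZ_4K - 1);
    }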



# 560a861a 19-Jul-2024 Christoph Hellwig <[email protected]>

dma-mapping: don't return errors from dma_set_seg_boundary

A NULL dev->dma_parms indicates either a bus that is not DMA capable or a
grave bug in the implementation of the bus code.

There isn't much the driver can do in terms of error handling for either
case, so just warn and continue as DMA operations will fail anyway.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Robin Murphy <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>



# c42a0126 19-Jul-2024 Christoph Hellwig <[email protected]>

dma-mapping: don't return errors from dma_set_min_align_mask

A NULL dev->dma_parms indicates either a bus that is not DMA capable or a
grave bug in the implementation of the bus code.

There isn't much the driver can do in terms of error handling for either
case, so just warn and continue as DMA operations will fail anyway.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Robin Murphy <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>



Revision tags: v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9
# a6016aac 09-May-2024 Alexander Lobakin <[email protected]>

dma: fix DMA sync for drivers not calling dma_set_mask*()

There are several reports that the DMA sync shortcut broke non-coherent
devices.
dev->dma_need_sync is false after the &device allocation and if a driver
didn't call dma_set_mask*(), it will still be false even if the device
is not DMA-coherent and thus needs synchronizing. Due to historical
reasons, there's still a lot of drivers not calling it.
Invert the boolean, so that the sync will be performed by default and
the shortcut will be enabled only when calling dma_set_mask*().

Reported-by: Steven Price <[email protected]>
Closes: https://lore.kernel.org/lkml/[email protected]
Reported-by: Marek Szyprowski <[email protected]>
Closes: https://lore.kernel.org/lkml/[email protected]
Fixes: f406c8e4b770 ("dma: avoid redundant calls for sync operations")
Signed-off-by: Alexander Lobakin <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Steven Price <[email protected]>
Tested-by: Marek Szyprowski <[email protected]>
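
For reference, the sync shortcut is now (re)enabled only from
dma_set_mask*(); a typical probe-time opt-in looks roughly like this
(driver name hypothetical, error handling abbreviated):

    #include <linux/dma-mapping.h>
    #include <linux/pci.h>

    static int foo_setup_dma(struct pci_dev *pdev)
    {
            int ret;

            /* Calling dma_set_mask*() is what re-evaluates the need for syncs. */
            ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
            if (ret)
                    ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
            return ret;
    }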



# f406c8e4 07-May-2024 Alexander Lobakin <[email protected]>

dma: avoid redundant calls for sync operations

Quite often, devices do not need dma_sync operations on x86_64 at least.
Indeed, when dev_is_dma_coherent(dev) is true and
dev_use_swiotlb(dev) is false, iommu_dma_sync_single_for_cpu()
and friends do nothing.

However, indirectly calling them when CONFIG_RETPOLINE=y consumes about
10% of cycles on a cpu receiving packets from softirq at ~100Gbit rate.
Even if/when CONFIG_RETPOLINE is not set, there is a cost of about 3%.

Add dev->need_dma_sync boolean and turn it off during the device
initialization (dma_set_mask()) depending on the setup:
dev_is_dma_coherent() for the direct DMA, !(sync_single_for_device ||
sync_single_for_cpu) or the new dma_map_ops flag, %DMA_F_CAN_SKIP_SYNC,
advertised for non-NULL DMA ops.
Then later, if/when swiotlb is used for the first time, the flag
is reset back to on, from swiotlb_tbl_map_single().

On iavf, the UDP trafficgen with XDP_DROP in skb mode test shows
+3-5% increase for direct DMA.

Suggested-by: Christoph Hellwig <[email protected]> # direct DMA shortcut
Co-developed-by: Eric Dumazet <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
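
No driver changes are required to benefit; the usual streaming-DMA pattern
simply becomes cheaper when the device does not need syncing. A sketch
(names hypothetical):

    #include <linux/dma-mapping.h>

    static void foo_rx_poll(struct device *dev, dma_addr_t dma, size_t len)
    {
            /*
             * With this change, these calls return early without an indirect
             * call when the device was marked as not needing syncs.
             */
            dma_sync_single_for_cpu(dev, dma, len, DMA_FROM_DEVICE);
            /* ... inspect the buffer ... */
            dma_sync_single_for_device(dev, dma, len, DMA_FROM_DEVICE);
    }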



# fe7514b1 07-May-2024 Alexander Lobakin <[email protected]>

dma: compile-out DMA sync op calls when not used

Some platforms do have DMA, but DMA there is always direct and coherent.
Currently, even on such platforms DMA sync operations are compiled and
called.
Add a new hidden Kconfig symbol, DMA_NEED_SYNC, and set it only when
either sync operations are needed or there is DMA ops or swiotlb
or DMA debug is enabled. Compile global dma_sync_*() and dma_need_sync()
only when it's set, otherwise provide empty inline stubs.
The change allows for future optimizations of DMA sync calls depending
on runtime conditions.

Signed-off-by: Alexander Lobakin <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
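
A simplified sketch of the compile-out pattern described above; the real
header uses additional wrappers, so this only illustrates the idea of empty
inline stubs when CONFIG_DMA_NEED_SYNC is not set:

    #ifdef CONFIG_DMA_NEED_SYNC
    void dma_sync_single_for_cpu(struct device *dev, dma_addr_t addr,
                                 size_t size, enum dma_data_direction dir);
    bool dma_need_sync(struct device *dev, dma_addr_t dma_addr);
    #else
    static inline void dma_sync_single_for_cpu(struct device *dev,
                                               dma_addr_t addr, size_t size,
                                               enum dma_data_direction dir)
    {
    }
    static inline bool dma_need_sync(struct device *dev, dma_addr_t dma_addr)
    {
            return false;
    }
    #endif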



Revision tags: v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6
# 8ae0e970 28-Oct-2023 Jia He <[email protected]>

dma-mapping: move dma_addressing_limited() out of line

This patch moves dma_addressing_limited() out of line, serving as a
preliminary step to prevent the introduction of a new publicly accessible
low-level helper when validating whether all system RAM is mapped within
the DMA mapping range.

Suggested-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jia He <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
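
For illustration, callers are unaffected by the move out of line; a
hypothetical driver check looks the same as before:

    #include <linux/dma-mapping.h>

    static bool foo_needs_bounce(struct device *dev)
    {
            /* True if the DMA mask cannot reach all of system RAM. */
            return dma_addressing_limited(dev);
    }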



Revision tags: v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5
# 79636caa 01-Aug-2023 Petr Tesarik <[email protected]>

swiotlb: if swiotlb is full, fall back to a transient memory pool

Try to allocate a transient memory pool if no suitable slots can be found
and the respective SWIOTLB is allowed to grow. The transient pool is just
big enough for this one bounce buffer. It is inserted into a per-device
list of transient memory pools, and it is freed again when the bounce
buffer is unmapped.

Transient memory pools are kept in an RCU list. A memory barrier is
required after adding a new entry, because any address within a transient
buffer must be immediately recognized as belonging to the SWIOTLB, even if
it is passed to another CPU.

Deletion does not require any synchronization beyond RCU ordering
guarantees. After a buffer is unmapped, its physical addresses may no
longer be passed to the DMA API, so the memory range of the corresponding
stale entry in the RCU list never matches. If the memory range gets
allocated again, then it happens only after a RCU quiescent state.

Since bounce buffers can now be allocated from different pools, add a
parameter to swiotlb_alloc_pool() to let the caller know which memory pool
is used. Add swiotlb_find_pool() to find the memory pool corresponding to
an address. This function is now also used by is_swiotlb_buffer(), because
a simple boundary check is no longer sufficient.

The logic in swiotlb_alloc_tlb() is taken from __dma_direct_alloc_pages(),
simplified and enhanced to use coherent memory pools if needed.

Note that this is not the most efficient way to provide a bounce buffer,
but when a DMA buffer can't be mapped, something may (and will) actually
break. At that point it is better to make an allocation, even if it may be
an expensive operation.

Signed-off-by: Petr Tesarik <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
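
To illustrate the publication ordering described above (illustrative only;
the structure and function names are made up, not the swiotlb code):

    #include <linux/rculist.h>
    #include <linux/spinlock.h>

    struct transient_pool {
            struct list_head node;
            phys_addr_t start;
            size_t size;
    };

    static void publish_pool(struct list_head *pools, spinlock_t *lock,
                             struct transient_pool *pool)
    {
            unsigned long flags;

            spin_lock_irqsave(lock, flags);
            list_add_rcu(&pool->node, pools);
            spin_unlock_irqrestore(lock, flags);

            /*
             * Full barrier before the bounce address derived from this pool
             * is handed out: any CPU that later sees that address must also
             * find the pool when walking the RCU list.
             */
            smp_mb();
    }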



Revision tags: v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7
# 8c57da28 12-Jun-2023 Catalin Marinas <[email protected]>

dma: allow dma_get_cache_alignment() to be overridden by the arch code

On arm64, ARCH_DMA_MINALIGN is larger than most cache line size
configurations deployed. Allow an architecture to override
dma_get_cache_alignment() in order to return a run-time probed value (e.g.
cache_line_size()).

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Tested-by: Isaac J. Manjarres <[email protected]>
Cc: Robin Murphy <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Alasdair Kergon <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Jerry Snitselaar <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Lars-Peter Clausen <[email protected]>
Cc: Logan Gunthorpe <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Mike Snitzer <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Saravana Kannan <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
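
For illustration, a driver-side use of the (now possibly run-time probed)
value; the layout helper here is hypothetical:

    #include <linux/align.h>
    #include <linux/dma-mapping.h>

    static size_t foo_desc_stride(size_t desc_size)
    {
            /* Keep each DMA'd descriptor in its own cache line(s). */
            return ALIGN(desc_size, dma_get_cache_alignment());
    }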



# 4ab5f8ec 12-Jun-2023 Catalin Marinas <[email protected]>

mm/slab: decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN

Patch series "mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8", v7.

A series reducing the kmalloc() minimum alignment on arm64 to 8 (from
128).


This patch (of 17):

In preparation for supporting a kmalloc() minimum alignment smaller than
the arch DMA alignment, decouple the two definitions. This requires that
either the kmalloc() caches are aligned to a (run-time) cache-line size or
the DMA API bounces unaligned kmalloc() allocations. Subsequent patches
will implement both options.

After this patch, ARCH_DMA_MINALIGN is expected to be used in static
alignment annotations and defined by an architecture to be the maximum
alignment for all supported configurations/SoCs in a single Image.
Architectures opting in to a smaller ARCH_KMALLOC_MINALIGN will need to
define its value in the arch headers.

Since ARCH_DMA_MINALIGN is now always defined, adjust the #ifdef in
dma_get_cache_alignment() so that there is no change for architectures not
requiring a minimum DMA alignment.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Tested-by: Isaac J. Manjarres <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Robin Murphy <[email protected]>
Cc: Alasdair Kergon <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Mike Snitzer <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Saravana Kannan <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Jerry Snitselaar <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Lars-Peter Clausen <[email protected]>
Cc: Logan Gunthorpe <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
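
As an example of the static alignment annotations mentioned above (the
structure is hypothetical):

    #include <linux/cache.h>        /* ARCH_DMA_MINALIGN */
    #include <linux/types.h>

    struct foo_cmd {
            /* The device DMAs into this field: give it dedicated cache lines. */
            u8 response[64] __aligned(ARCH_DMA_MINALIGN);
            /* CPU-only fields can follow without special alignment. */
            unsigned int seq;
    };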



Revision tags: v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2
# 9fc18f6d 21-Aug-2022 Christoph Hellwig <[email protected]>

dma-mapping: mark dma_supported static

Now that the remaining users in drivers are gone, this function can be
marked static.

Signed-off-by: Christoph Hellwig <[email protected]>


Revision tags: v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6
# 159bf192 08-Jul-2022 Logan Gunthorpe <[email protected]>

dma-mapping: add flags to dma_map_ops to indicate PCI P2PDMA support

Add a flags member to the dma_map_ops structure with one flag to
indicate support for PCI P2PDMA.

Also, add a helper to check if a device supports PCI P2PDMA.

Signed-off-by: Logan Gunthorpe <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>



# a229cc14 14-Jul-2022 John Garry <[email protected]>

dma-mapping: add dma_opt_mapping_size()

Streaming DMA mapping involving an IOMMU may be much slower for larger
total mapping size. This is because every IOMMU DMA mapping requires an
IOVA to be allocated and freed. IOVA sizes above a certain limit are not
cached, which can have a big impact on DMA mapping performance.

Provide an API for device drivers to know this "optimal" limit, such that
they may try to produce mappings which don't exceed it.

Signed-off-by: John Garry <[email protected]>
Reviewed-by: Damien Le Moal <[email protected]>
Acked-by: Martin K. Petersen <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
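
A sketch of the intended driver-side use (the hardware limit shown is
hypothetical):

    #include <linux/dma-mapping.h>
    #include <linux/minmax.h>

    static size_t foo_max_request_bytes(struct device *dev, size_t hw_limit)
    {
            /* Stay below the size where IOVA allocations stop being cached. */
            return min_t(size_t, hw_limit, dma_opt_mapping_size(dev));
    }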



Revision tags: v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1
# 901c7280 28-Mar-2022 Linus Torvalds <[email protected]>

Reinstate some of "swiotlb: rework "fix info leak with DMA_FROM_DEVICE""

Halil Pasic points out [1] that rather than the full revert of that commit
(reverted in bddac7c1e02b), a partial revert that only reverts the
problematic case but still keeps some of the cleanups is probably
better.

And that partial revert [2] had already been verified by Oleksandr
Natalenko to also fix the issue; I had just missed that in the long
discussion.

So let's reinstate the cleanups from commit aa6f8dcbab47 ("swiotlb:
rework "fix info leak with DMA_FROM_DEVICE""), and effectively only
revert the part that caused problems.

Link: https://lore.kernel.org/all/[email protected]/ [1]
Link: https://lore.kernel.org/all/[email protected]/ [2]
Link: https://lore.kernel.org/all/[email protected]/ [3]
Suggested-by: Halil Pasic <[email protected]>
Tested-by: Oleksandr Natalenko <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>



# bddac7c1 26-Mar-2022 Linus Torvalds <[email protected]>

Revert "swiotlb: rework "fix info leak with DMA_FROM_DEVICE""

This reverts commit aa6f8dcbab473f3a3c7454b74caa46d36cdc5d13.

It turns out this breaks at least the ath9k wireless driver, and
possibly others.

What the ath9k driver does on packet receive is to set up the DMA
transfer with:

    int ath_rx_init(..)
    ..
            bf->bf_buf_addr = dma_map_single(sc->dev, skb->data,
                                             common->rx_bufsize,
                                             DMA_FROM_DEVICE);

and then the receive logic (through ath_rx_tasklet()) will fetch
incoming packets

    static bool ath_edma_get_buffers(..)
    ..
            dma_sync_single_for_cpu(sc->dev, bf->bf_buf_addr,
                                    common->rx_bufsize, DMA_FROM_DEVICE);

            ret = ath9k_hw_process_rxdesc_edma(ah, rs, skb->data);
            if (ret == -EINPROGRESS) {
                    /* let device gain the buffer again */
                    dma_sync_single_for_device(sc->dev, bf->bf_buf_addr,
                                               common->rx_bufsize,
                                               DMA_FROM_DEVICE);
                    return false;
            }

and it's worth noting how that first DMA sync:

dma_sync_single_for_cpu(..DMA_FROM_DEVICE);

is there to make sure the CPU can read the DMA buffer (possibly by
copying it from the bounce buffer area, or by doing some cache flush).
The iommu correctly turns that into a "copy from bounce buffer" so that
the driver can look at the state of the packets.

In the meantime, the device may continue to write to the DMA buffer, but
we at least have a snapshot of the state due to that first DMA sync.

But that _second_ DMA sync:

dma_sync_single_for_device(..DMA_FROM_DEVICE);

is telling the DMA mapping that the CPU wasn't interested in the area
because the packet wasn't there. In the case of a DMA bounce buffer,
that is a no-op.

Note how it's not a sync for the CPU (the "for_device()" part), and it's
not a sync for data written by the CPU (the "DMA_FROM_DEVICE" part).

Or rather, it _should_ be a no-op. That's what commit aa6f8dcbab47
broke: it made the code bounce the buffer unconditionally, and changed
the DMA_FROM_DEVICE to just unconditionally and illogically be
DMA_TO_DEVICE.

[ Side note: purely within the confines of the swiotlb driver it wasn't
entirely illogical: The reason it did that odd DMA_FROM_DEVICE ->
DMA_TO_DEVICE conversion thing is because inside the swiotlb driver,
it uses just a swiotlb_bounce() helper that doesn't care about the
whole distinction of who the sync is for - only which direction to
bounce.

So it took the "sync for device" to mean that the CPU must have been
the one writing, and thought it meant DMA_TO_DEVICE. ]

Also note how the commentary in that commit was wrong, probably due to
that whole confusion, claiming that the commit makes the swiotlb code

"bounce unconditionally (that is, also
when dir == DMA_TO_DEVICE) in order do avoid synchronising back stale
data from the swiotlb buffer"

which is nonsensical for two reasons:

- that "also when dir == DMA_TO_DEVICE" is nonsensical, as that was
exactly when it always did - and should do - the bounce.

- since this is a sync for the device (not for the CPU), we're clearly
fundamentally not copying back stale data from the bounce buffers at
all, because we'd be copying *to* the bounce buffers.

So that commit was just very confused. It confused the direction of the
synchronization (to the device, not the cpu) with the direction of the
DMA (from the device).

Reported-and-bisected-by: Oleksandr Natalenko <[email protected]>
Reported-by: Olha Cherevyk <[email protected]>
Cc: Halil Pasic <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Kalle Valo <[email protected]>
Cc: Robin Murphy <[email protected]>
Cc: Toke Høiland-Jørgensen <[email protected]>
Cc: Maxime Bizon <[email protected]>
Cc: Johannes Berg <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>



Revision tags: v5.17, v5.17-rc8, v5.17-rc7
# aa6f8dcb 05-Mar-2022 Halil Pasic <[email protected]>

swiotlb: rework "fix info leak with DMA_FROM_DEVICE"

Unfortunately, we ended up merging an old version of the patch "fix info
leak with DMA_FROM_DEVICE" instead of merging the latest one. Christoph
(the swiotlb maintainer) asked me to create an incremental fix
(after I pointed out the mix-up and asked him for guidance).
So here we go.

The main differences between what we got and what was agreed are:
* swiotlb_sync_single_for_device is also required to do an extra bounce
* We decided not to introduce DMA_ATTR_OVERWRITE until we have exploiters
* The implementation of DMA_ATTR_OVERWRITE is flawed: DMA_ATTR_OVERWRITE
must take precedence over DMA_ATTR_SKIP_CPU_SYNC

Thus this patch removes DMA_ATTR_OVERWRITE, and makes
swiotlb_sync_single_for_device() bounce unconditionally (that is, also
when dir == DMA_TO_DEVICE) in order to avoid synchronising back stale
data from the swiotlb buffer.

Let me note that if the size used with the dma_sync_* API is less than the
size used with dma_[un]map_*, under certain circumstances we may still
end up with swiotlb not being transparent. In that sense, this is no
perfect fix either.

To get this bullet proof, we would have to bounce the entire
mapping/bounce buffer. For that we would have to figure out the starting
address, and the size of the mapping in
swiotlb_sync_single_for_device(). While this does seem possible, there
seems to be no firm consensus on how things are supposed to work.

Signed-off-by: Halil Pasic <[email protected]>
Fixes: ddbd89deb7d3 ("swiotlb: fix info leak with DMA_FROM_DEVICE")
Cc: [email protected]
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>



Revision tags: v5.17-rc6, v5.17-rc5, v5.17-rc4
# ddbd89de 11-Feb-2022 Halil Pasic <[email protected]>

swiotlb: fix info leak with DMA_FROM_DEVICE

The problem I'm addressing was discovered by the LTP test covering
cve-2018-1000204.

A short description of what happens follows:
1) The test case issues a command code 00 (TEST UNIT READY) via the SG_IO
interface with: dxfer_len == 524288, dxfer_dir == SG_DXFER_FROM_DEV
and a corresponding dxferp. The peculiar thing about this is that TUR
is not reading from the device.
2) In sg_start_req() the invocation of blk_rq_map_user() effectively
bounces the user-space buffer. As if the device was to transfer into
it. Since commit a45b599ad808 ("scsi: sg: allocate with __GFP_ZERO in
sg_build_indirect()") we make sure this first bounce buffer is
allocated with GFP_ZERO.
3) For the rest of the story we keep ignoring that we have a TUR, so the
device won't touch the buffer we prepare as if we had a
DMA_FROM_DEVICE type of situation. My setup uses a virtio-scsi device
and the buffer allocated by SG is mapped by the function
virtqueue_add_split() which uses DMA_FROM_DEVICE for the "in" sgs (here
scatter-gather and not scsi generics). This mapping involves bouncing
via the swiotlb (we need swiotlb to do virtio in protected guest like
s390 Secure Execution, or AMD SEV).
4) When the SCSI TUR is done, we first copy back the content of the second
(that is swiotlb) bounce buffer (which most likely contains some
previous IO data), to the first bounce buffer, which contains all
zeros. Then we copy back the content of the first bounce buffer to
the user-space buffer.
5) The test case detects that the buffer, which it zero-initialized,
ain't all zeros and fails.

One can argue that this is a swiotlb problem, because without swiotlb
we leak all zeros, and the swiotlb should be transparent in a sense that
it does not affect the outcome (if all other participants are well
behaved).

Copying the content of the original buffer into the swiotlb buffer is
the only way I can think of to make swiotlb transparent in such
scenarios. So let's do just that if in doubt, but allow the driver
to tell us that the whole mapped buffer is going to be overwritten,
in which case we can preserve the old behavior and avoid the performance
impact of the extra bounce.

Signed-off-by: Halil Pasic <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>



Revision tags: v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6
# 2a047e06 09-Aug-2021 Christoph Hellwig <[email protected]>

dma-mapping: return an unsigned int from dma_map_sg{,_attrs}

These can only return 0 for failure or the number of entries, so turn
the return value into an unsigned int.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Logan Gunthorpe <[email protected]>
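
For illustration, the calling convention is unchanged, only the type is; a
sketch with hypothetical names:

    #include <linux/dma-mapping.h>
    #include <linux/errno.h>
    #include <linux/scatterlist.h>

    static int foo_map(struct device *dev, struct scatterlist *sgl, int nents)
    {
            unsigned int mapped = dma_map_sg(dev, sgl, nents, DMA_TO_DEVICE);

            if (mapped == 0)        /* zero is the only failure value */
                    return -EIO;
            /* ... program the device with 'mapped' entries ... */
            return 0;
    }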



Revision tags: v5.14-rc5, v5.14-rc4
# fffe3cc8 29-Jul-2021 Logan Gunthorpe <[email protected]>

dma-mapping: allow map_sg() ops to return negative error codes

Allow dma_map_sgtable() to pass errors from the map_sg() ops. This
will be required for returning appropriate error codes when mapping
P2PDMA memory.

Introduce __dma_map_sg_attrs() which will return the raw error code
from the map_sg operation (whether it be negative or zero). Then add a
dma_map_sg_attrs() wrapper to convert any negative errors to zero to
satisfy the existing calling convention.

dma_map_sgtable() defines three error codes that .map_sg implementations
are allowed to return: -EINVAL, -ENOMEM and -EIO, the latter of which
is a generic return for cases that are passing DMA_MAPPING_ERROR
through.

dma_map_sgtable() will convert a zero error return for old map_sg() ops
into a -EIO return and return any negative errors as reported.

This allows map_sg implementations to start returning multiple
negative error codes. Legacy map_sg implementations can continue
to return zero until they are all converted.

Signed-off-by: Logan Gunthorpe <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
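
Caller-side, the sgtable wrapper now reports these errors directly; a
minimal sketch with hypothetical names:

    #include <linux/dma-mapping.h>
    #include <linux/errno.h>
    #include <linux/scatterlist.h>

    static int foo_map_table(struct device *dev, struct sg_table *sgt)
    {
            int ret;

            ret = dma_map_sgtable(dev, sgt, DMA_TO_DEVICE, 0);
            if (ret)                /* -EINVAL, -ENOMEM or -EIO */
                    return ret;
            /* ... hand sgt to the hardware ... */
            dma_unmap_sgtable(dev, sgt, DMA_TO_DEVICE, 0);
            return 0;
    }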



Revision tags: v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5
# a7f3d3d3 26-Mar-2021 Heiner Kallweit <[email protected]>

dma-mapping: add unlikely hint to error path in dma_mapping_error

Zillions of drivers use the unlikely() hint when checking the result of
dma_mapping_error(). This is an inline function anyway, so we can move
the hint into the function and remove it from drivers over time.

Signed-off-by: Heiner Kallweit <[email protected]>
Reviewed-by: Robin Murphy <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
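
Driver-side the check itself is unchanged; the explicit unlikely() can now
be dropped over time. A sketch with hypothetical names:

    #include <linux/dma-mapping.h>
    #include <linux/errno.h>

    static int foo_map_buf(struct device *dev, void *buf, size_t len,
                           dma_addr_t *out)
    {
            dma_addr_t addr = dma_map_single(dev, buf, len, DMA_TO_DEVICE);

            /* No unlikely() needed here; the hint now lives in the helper. */
            if (dma_mapping_error(dev, addr))
                    return -ENOMEM;
            *out = addr;
            return 0;
    }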



Revision tags: v5.12-rc4, v5.12-rc3
# 84fcfbda 12-Mar-2021 Wang Qing <[email protected]>

dma-mapping: remove a pointless empty line in dma_alloc_coherent

Signed-off-by: Wang Qing <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>


Revision tags: v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6
# 7d5b5738 28-Jan-2021 Christoph Hellwig <[email protected]>

dma-mapping: add a dma_alloc_noncontiguous API

Add a new API that returns a potentially virtually non-contiguous sg_table
and a DMA address. This API is only properly implemented for dma-iommu
and will simply return a contiguous chunk as a fallback.

The intent is that drivers can use this API if either:

- no kernel mapping or only temporary kernel mappings are required.
That is as a better replacement for DMA_ATTR_NO_KERNEL_MAPPING
- a kernel mapping is required for cached and DMA mapped pages, but
the driver also needs the pages to e.g. map them to userspace.
In that sense it is a replacement for some aspects of the recently
removed and never fully implemented DMA_ATTR_NON_CONSISTENT

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Tomasz Figa <[email protected]>
Tested-by: Ricardo Ribalda <[email protected]>
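
A usage sketch of the new API together with its vmap/free companions
(sizes and names are made up):

    #include <linux/dma-mapping.h>
    #include <linux/gfp.h>

    static void *foo_alloc_ring(struct device *dev, size_t size,
                                struct sg_table **sgt_out)
    {
            struct sg_table *sgt;
            void *vaddr;

            sgt = dma_alloc_noncontiguous(dev, size, DMA_BIDIRECTIONAL,
                                          GFP_KERNEL, 0);
            if (!sgt)
                    return NULL;

            /* Optional kernel mapping, only if CPU access is needed. */
            vaddr = dma_vmap_noncontiguous(dev, size, sgt);
            if (!vaddr) {
                    dma_free_noncontiguous(dev, size, sgt, DMA_BIDIRECTIONAL);
                    return NULL;
            }
            *sgt_out = sgt;
            return vaddr;
    }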



# eedb0b12 28-Jan-2021 Christoph Hellwig <[email protected]>

dma-mapping: add a dma_mmap_pages helper

Add a helper to map memory allocated using dma_alloc_pages into
a user address space, similar to the dma_alloc_attrs function for
coherent allocations.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Tomasz Figa <[email protected]>
Tested-by: Ricardo Ribalda <[email protected]>
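
A sketch of pairing this with dma_alloc_pages() in a hypothetical driver
mmap path:

    #include <linux/dma-mapping.h>
    #include <linux/mm.h>

    /* 'page' and 'size' are assumed to come from an earlier dma_alloc_pages(). */
    static int foo_mmap_ring(struct device *dev, struct vm_area_struct *vma,
                             struct page *page, size_t size)
    {
            return dma_mmap_pages(dev, vma, size, page);
    }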


