|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3 |
|
| #
1b5695b0 |
| 07-Aug-2024 |
Mike Rapoport (Microsoft) <[email protected]> |
mm: make range-to-target_node lookup facility a part of numa_memblks
The x86 implementation of range-to-target_node lookup (i.e. phys_to_target_node() and memory_add_physaddr_to_nid()) relies on nu
mm: make range-to-target_node lookup facility a part of numa_memblks
The x86 implementation of range-to-target_node lookup (i.e. phys_to_target_node() and memory_add_physaddr_to_nid()) relies on numa_memblks.
Since numa_memblks are now part of the generic code, move these functions from x86 to mm/numa_memblks.c and select CONFIG_NUMA_KEEP_MEMINFO when CONFIG_NUMA_MEMBLKS=y for dax and cxl.
[[email protected]: fix build] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport (Microsoft) <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Tested-by: Zi Yan <[email protected]> # for x86_64 and arm64 Tested-by: Jonathan Cameron <[email protected]> [arm64 + CXL via QEMU] Reviewed-by: Dan Williams <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Andreas Larsson <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: David S. Miller <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: John Paul Adrian Glaubitz <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Rob Herring (Arm) <[email protected]> Cc: Samuel Holland <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
show more ...
|
|
Revision tags: v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2 |
|
| #
0c16c83e |
| 14-Feb-2023 |
Arnd Bergmann <[email protected]> |
dax: cxl: add CXL_REGION dependency
There is already a dependency on CXL_REGION, which depends on CXL_BUS, but since CXL_REGION is a 'bool' symbol, it's possible to configure DAX as built-in even th
dax: cxl: add CXL_REGION dependency
There is already a dependency on CXL_REGION, which depends on CXL_BUS, but since CXL_REGION is a 'bool' symbol, it's possible to configure DAX as built-in even though CXL itself is a loadable module:
x86_64-linux-ld: drivers/dax/cxl.o: in function `cxl_dax_region_probe': cxl.c:(.text+0xb): undefined reference to `to_cxl_dax_region' x86_64-linux-ld: drivers/dax/cxl.o: in function `cxl_dax_region_driver_init': cxl.c:(.init.text+0x10): undefined reference to `__cxl_driver_register' x86_64-linux-ld: drivers/dax/cxl.o: in function `cxl_dax_region_driver_exit': cxl.c:(.exit.text+0x9): undefined reference to `cxl_driver_unregister'
Prevent this with another depndency on the tristate symbol.
Fixes: 09d09e04d2fc ("cxl/dax: Create dax devices for CXL RAM regions") Signed-off-by: Arnd Bergmann <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dan Williams <[email protected]>
show more ...
|
|
Revision tags: v6.2-rc8 |
|
| #
09d09e04 |
| 10-Feb-2023 |
Dan Williams <[email protected]> |
cxl/dax: Create dax devices for CXL RAM regions
While platform firmware takes some responsibility for mapping the RAM capacity of CXL devices present at boot, the OS is responsible for mapping the r
cxl/dax: Create dax devices for CXL RAM regions
While platform firmware takes some responsibility for mapping the RAM capacity of CXL devices present at boot, the OS is responsible for mapping the remainder and hot-added devices. Platform firmware is also responsible for identifying the platform general purpose memory pool, typically DDR attached DRAM, and arranging for the remainder to be 'Soft Reserved'. That reservation allows the CXL subsystem to route the memory to core-mm via memory-hotplug (dax_kmem), or leave it for dedicated access (device-dax).
The new 'struct cxl_dax_region' object allows for a CXL memory resource (region) to be published, but also allow for udev and module policy to act on that event. It also prevents cxl_core.ko from having a module loading dependency on any drivers/dax/ modules.
Tested-by: Fan Ni <[email protected]> Reviewed-by: Dave Jiang <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Link: https://lore.kernel.org/r/167602003896.1924368.10335442077318970468.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <[email protected]>
show more ...
|
| #
e9ee9fe3 |
| 10-Feb-2023 |
Dan Williams <[email protected]> |
dax: Assign RAM regions to memory-hotplug by default
The default mode for device-dax instances is backwards for RAM-regions as evidenced by the fact that it tends to catch end users by surprise. "Wh
dax: Assign RAM regions to memory-hotplug by default
The default mode for device-dax instances is backwards for RAM-regions as evidenced by the fact that it tends to catch end users by surprise. "Where is my memory?". Recall that platforms are increasingly shipping with performance-differentiated memory pools beyond typical DRAM and NUMA effects. This includes HBM (high-bandwidth-memory) and CXL (dynamic interleave, varied media types, and future fabric attached possibilities).
For this reason the EFI_MEMORY_SP (EFI Special Purpose Memory => Linux 'Soft Reserved') attribute is expected to be applied to all memory-pools that are not the general purpose pool. This designation gives an Operating System a chance to defer usage of a memory pool until later in the boot process where its performance properties can be interrogated and administrator policy can be applied.
'Soft Reserved' memory can be anything from too limited and precious to be part of the general purpose pool (HBM), too slow to host hot kernel data structures (some PMEM media), or anything in between. However, in the absence of an explicit policy, the memory should at least be made usable by default. The current device-dax default hides all non-general-purpose memory behind a device interface.
The expectation is that the distribution of users that want the memory online by default vs device-dedicated-access by default follows the Pareto principle. A small number of enlightened users may want to do userspace memory management through a device, but general users just want the kernel to make the memory available with an option to get more advanced later.
Arrange for all device-dax instances not backed by PMEM to default to attaching to the dax_kmem driver. From there the baseline memory hotplug policy (CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE / memhp_default_state=) gates whether the memory comes online or stays offline. Where, if it stays offline, it can be reliably converted back to device-mode where it can be partitioned, or fronted by a userspace allocator.
So, if someone wants device-dax instances for their 'Soft Reserved' memory:
1/ Build a kernel with CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=n or boot with memhp_default_state=offline, or roll the dice and hope that the kernel has not pinned a page in that memory before step 2.
2/ Write a udev rule to convert the target dax device(s) from 'system-ram' mode to 'devdax' mode:
daxctl reconfigure-device $dax -m devdax -f
Cc: Michal Hocko <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Dave Hansen <[email protected]> Reviewed-by: Gregory Price <[email protected]> Tested-by: Fan Ni <[email protected]> Reviewed-by: Dave Jiang <[email protected]> Link: https://lore.kernel.org/r/167602003336.1924368.6809503401422267885.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <[email protected]>
show more ...
|
| #
7dab174e |
| 10-Feb-2023 |
Dan Williams <[email protected]> |
dax/hmem: Move hmem device registration to dax_hmem.ko
In preparation for the CXL region driver to take over the responsibility of registering device-dax instances for CXL regions, move the registra
dax/hmem: Move hmem device registration to dax_hmem.ko
In preparation for the CXL region driver to take over the responsibility of registering device-dax instances for CXL regions, move the registration of "hmem" devices to dax_hmem.ko.
Previously the builtin component of this enabling (drivers/dax/hmem/device.o) would register platform devices for each address range and trigger the dax_hmem.ko module to load and attach device-dax instances to those devices. Now, the ranges are collected from the HMAT and EFI memory map walking, but the device creation is deferred. A new "hmem_platform" device is created which triggers dax_hmem.ko to load and register the platform devices.
Tested-by: Fan Ni <[email protected]> Reviewed-by: Dave Jiang <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Link: https://lore.kernel.org/r/167602002771.1924368.5653558226424530127.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <[email protected]>
show more ...
|
|
Revision tags: v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7 |
|
| #
a870acc1 |
| 23-Nov-2022 |
Paul E. McKenney <[email protected]> |
drivers/dax: Remove "select SRCU"
Now that the SRCU Kconfig option is unconditionally selected, there is no longer any point in selecting it. Therefore, remove the "select SRCU" Kconfig statements.
drivers/dax: Remove "select SRCU"
Now that the SRCU Kconfig option is unconditionally selected, there is no longer any point in selecting it. Therefore, remove the "select SRCU" Kconfig statements.
Signed-off-by: Paul E. McKenney <[email protected]> Cc: Dan Williams <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Dave Jiang <[email protected]> Cc: <[email protected]> Acked-by: Dan Williams <[email protected]> Reviewed-by: John Ogness <[email protected]>
show more ...
|
|
Revision tags: v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4 |
|
| #
afd586f0 |
| 29-Nov-2021 |
Christoph Hellwig <[email protected]> |
dax: remove CONFIG_DAX_DRIVER
CONFIG_DAX_DRIVER only selects CONFIG_DAX now, so remove it.
Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Dan Williams <[email protected]> Link: h
dax: remove CONFIG_DAX_DRIVER
CONFIG_DAX_DRIVER only selects CONFIG_DAX now, so remove it.
Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Dan Williams <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dan Williams <[email protected]>
show more ...
|
|
Revision tags: v5.16-rc3, v5.16-rc2 |
|
| #
83762cb5 |
| 15-Nov-2021 |
Dan Williams <[email protected]> |
dax: Kill DEV_DAX_PMEM_COMPAT
The /sys/class/dax compatibility option has shipped in the kernel for 4 years now which should be sufficient time for tools to abandon the old ABI in favor of the /sys/
dax: Kill DEV_DAX_PMEM_COMPAT
The /sys/class/dax compatibility option has shipped in the kernel for 4 years now which should be sufficient time for tools to abandon the old ABI in favor of the /sys/bus/dax device-model. Delete it now and see if anyone screams.
Since this compatibility option shipped there has been more reports of users being surprised by the compat ABI than surprised by the "new", so the compat infrastructure has outlived its usefulness. Recall that /sys/bus/dax device-model is required for the dax kmem driver which allows PMEM to be used as "System RAM".
The following projects were known to have a dependency on /sys/class/dax and have dropped their dependency as of the listed version:
- ndctl (including libndctl, daxctl, and libdaxctl): v64+ - fio: v3.13+ - pmdk: v1.5.2+
As further evidence this option is no longer needed some distributions have already stopped enabling CONFIG_DEV_DAX_PMEM_COMPAT.
Cc: Ira Weiny <[email protected]> Cc: Dave Jiang <[email protected]> Reported-by: Vishal Verma <[email protected]> Acked-by: Dave Hansen <[email protected]> Reviewed-by: Jane Chu <[email protected]> Link: https://lore.kernel.org/r/163701116195.3784476.726128179293466337.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <[email protected]>
show more ...
|
|
Revision tags: v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5 |
|
| #
a927bd6b |
| 22-Nov-2020 |
Dan Williams <[email protected]> |
mm: fix phys_to_target_node() and memory_add_physaddr_to_nid() exports
The core-mm has a default __weak implementation of phys_to_target_node() to mirror the weak definition of memory_add_physaddr_t
mm: fix phys_to_target_node() and memory_add_physaddr_to_nid() exports
The core-mm has a default __weak implementation of phys_to_target_node() to mirror the weak definition of memory_add_physaddr_to_nid(). That symbol is exported for modules. However, while the export in mm/memory_hotplug.c exported the symbol in the configuration cases of:
CONFIG_NUMA_KEEP_MEMINFO=y CONFIG_MEMORY_HOTPLUG=y
...and:
CONFIG_NUMA_KEEP_MEMINFO=n CONFIG_MEMORY_HOTPLUG=y
...it failed to export the symbol in the case of:
CONFIG_NUMA_KEEP_MEMINFO=y CONFIG_MEMORY_HOTPLUG=n
Not only is that broken, but Christoph points out that the kernel should not be exporting any __weak symbol, which means that memory_add_physaddr_to_nid() example that phys_to_target_node() copied is broken too.
Rework the definition of phys_to_target_node() and memory_add_physaddr_to_nid() to not require weak symbols. Move to the common arch override design-pattern of an asm header defining a symbol to replace the default implementation.
The only common header that all memory_add_physaddr_to_nid() producing architectures implement is asm/sparsemem.h. In fact, powerpc already defines its memory_add_physaddr_to_nid() helper in sparsemem.h. Double-down on that observation and define phys_to_target_node() where necessary in asm/sparsemem.h. An alternate consideration that was discarded was to put this override in asm/numa.h, but that entangles with the definition of MAX_NUMNODES relative to the inclusion of linux/nodemask.h, and requires powerpc to grow a new header.
The dependency on NUMA_KEEP_MEMINFO for DEV_DAX_HMEM_DEVICES is invalid now that the symbol is properly exported / stubbed in all combinations of CONFIG_NUMA_KEEP_MEMINFO and CONFIG_MEMORY_HOTPLUG.
[[email protected]: v4] Link: https://lkml.kernel.org/r/160461461867.1505359.5301571728749534585.stgit@dwillia2-desk3.amr.corp.intel.com [[email protected]: powerpc: fix create_section_mapping compile warning] Link: https://lkml.kernel.org/r/160558386174.2948926.2740149041249041764.stgit@dwillia2-desk3.amr.corp.intel.com
Fixes: a035b6bf863e ("mm/memory_hotplug: introduce default phys_to_target_node() implementation") Reported-by: Randy Dunlap <[email protected]> Reported-by: Thomas Gleixner <[email protected]> Reported-by: kernel test robot <[email protected]> Reported-by: Christoph Hellwig <[email protected]> Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Randy Dunlap <[email protected]> Tested-by: Thomas Gleixner <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Cc: Joao Martins <[email protected]> Cc: Tony Luck <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Stephen Rothwell <[email protected]> Link: https://lkml.kernel.org/r/160447639846.1133764.7044090803980177548.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1 |
|
| #
5ccac54f |
| 13-Oct-2020 |
Dan Williams <[email protected]> |
ACPI: HMAT: attach a device for each soft-reserved range
The hmem enabling in commit cf8741ac57ed ("ACPI: NUMA: HMAT: Register "soft reserved" memory as an "hmem" device") only registered ranges to
ACPI: HMAT: attach a device for each soft-reserved range
The hmem enabling in commit cf8741ac57ed ("ACPI: NUMA: HMAT: Register "soft reserved" memory as an "hmem" device") only registered ranges to the hmem driver for each soft-reservation that also appeared in the HMAT. While this is meant to encourage platform firmware to "do the right thing" and publish an HMAT, the corollary is that platforms that fail to publish an accurate HMAT will strand memory from Linux usage. Additionally, the "efi_fake_mem" kernel command line option enabling will strand memory by default without an HMAT.
Arrange for "soft reserved" memory that goes unclaimed by HMAT entries to be published as raw resource ranges for the hmem driver to consume.
Include a module parameter to disable either this fallback behavior, or the hmat enabling from creating hmem devices. The module parameter requires the hmem device enabling to have unique name in the module namespace: "device_hmem".
The driver depends on the architecture providing phys_to_target_node() which is only x86 via numa_meminfo() and arm64 via a generic memblock implementation.
[[email protected]: require NUMA_KEEP_MEMINFO for phys_to_target_node()] Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Joao Martins <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jia He <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Wei Yang <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Jason Yan <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Vivek Goyal <[email protected]> Link: https://lkml.kernel.org/r/159643098298.4062302.17587338161136144730.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
| #
c01044cc |
| 13-Oct-2020 |
Dan Williams <[email protected]> |
ACPI: HMAT: refactor hmat_register_target_device to hmem_register_device
In preparation for exposing "Soft Reserved" memory ranges without an HMAT, move the hmem device registration to its own compi
ACPI: HMAT: refactor hmat_register_target_device to hmem_register_device
In preparation for exposing "Soft Reserved" memory ranges without an HMAT, move the hmem device registration to its own compilation unit and make the implementation generic.
The generic implementation drops usage acpi_map_pxm_to_online_node() that was translating ACPI proximity domain values and instead relies on numa_map_to_online_node() to determine the numa node for the device.
[[email protected]: CONFIG_DEV_DAX_HMEM_DEVICES should depend on CONFIG_DAX=y] Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Joao Martins <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Jason Yan <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Vivek Goyal <[email protected]> Link: https://lkml.kernel.org/r/159643096584.4062302.5035370788475153738.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lore.kernel.org/r/158318761484.2216124.2049322072599482736.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7 |
|
| #
a6c7f4c6 |
| 07-Nov-2019 |
Dan Williams <[email protected]> |
device-dax: Add a driver for "hmem" devices
Platform firmware like EFI/ACPI may publish "hmem" platform devices. Such a device is a performance differentiated memory range likely reserved for an app
device-dax: Add a driver for "hmem" devices
Platform firmware like EFI/ACPI may publish "hmem" platform devices. Such a device is a performance differentiated memory range likely reserved for an application specific use case. The driver gives access to 100% of the capacity via a device-dax mmap instance by default.
However, if over-subscription and other kernel memory management is desired the resulting dax device can be assigned to the core-mm via the kmem driver.
This consumes "hmem" devices the producer of "hmem" devices is saved for a follow-on patch so that it can reference the new CONFIG_DEV_DAX_HMEM symbol to gate performing the enumeration work.
Reported-by: kbuild test robot <[email protected]> Reviewed-by: Dave Hansen <[email protected]> Signed-off-by: Dan Williams <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
show more ...
|
|
Revision tags: v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3, v5.2-rc2, v5.2-rc1 |
|
| #
ec8f24b7 |
| 19-May-2019 |
Thomas Gleixner <[email protected]> |
treewide: Add SPDX license identifier - Makefile/Kconfig
Add SPDX license identifiers to all Make/Kconfig files which:
- Have no license information of any form
These files fall under the project
treewide: Add SPDX license identifier - Makefile/Kconfig
Add SPDX license identifiers to all Make/Kconfig files which:
- Have no license information of any form
These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
show more ...
|
|
Revision tags: v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5, v5.1-rc4 |
|
| #
67476656 |
| 01-Apr-2019 |
Aneesh Kumar K.V <[email protected]> |
drivers/dax: Allow to include DEV_DAX_PMEM as builtin
This move the dependency to DEV_DAX_PMEM_COMPAT such that only if DEV_DAX_PMEM is built as module we can allow the compat support.
This allows
drivers/dax: Allow to include DEV_DAX_PMEM as builtin
This move the dependency to DEV_DAX_PMEM_COMPAT such that only if DEV_DAX_PMEM is built as module we can allow the compat support.
This allows to test the new code easily in a emulation setup where we often build things without module support.
Cc: <[email protected]> Fixes: 730926c3b099 ("device-dax: Add /sys/class/dax backwards compatibility") Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Dan Williams <[email protected]>
show more ...
|
|
Revision tags: v5.1-rc3, v5.1-rc2, v5.1-rc1, v5.0 |
|
| #
c221c0b0 |
| 25-Feb-2019 |
Dave Hansen <[email protected]> |
device-dax: "Hotplug" persistent memory for use like normal RAM
This is intended for use with NVDIMMs that are physically persistent (physically like flash) so that they can be used as a cost-effect
device-dax: "Hotplug" persistent memory for use like normal RAM
This is intended for use with NVDIMMs that are physically persistent (physically like flash) so that they can be used as a cost-effective RAM replacement. Intel Optane DC persistent memory is one implementation of this kind of NVDIMM.
Currently, a persistent memory region is "owned" by a device driver, either the "Direct DAX" or "Filesystem DAX" drivers. These drivers allow applications to explicitly use persistent memory, generally by being modified to use special, new libraries. (DIMM-based persistent memory hardware/software is described in great detail here: Documentation/nvdimm/nvdimm.txt).
However, this limits persistent memory use to applications which *have* been modified. To make it more broadly usable, this driver "hotplugs" memory into the kernel, to be managed and used just like normal RAM would be.
To make this work, management software must remove the device from being controlled by the "Device DAX" infrastructure:
echo dax0.0 > /sys/bus/dax/drivers/device_dax/unbind
and then tell the new driver that it can bind to the device:
echo dax0.0 > /sys/bus/dax/drivers/kmem/new_id
After this, there will be a number of new memory sections visible in sysfs that can be onlined, or that may get onlined by existing udev-initiated memory hotplug rules.
This rebinding procedure is currently a one-way trip. Once memory is bound to "kmem", it's there permanently and can not be unbound and assigned back to device_dax.
The kmem driver will never bind to a dax device unless the device is *explicitly* bound to the driver. There are two reasons for this: One, since it is a one-way trip, it can not be undone if bound incorrectly. Two, the kmem driver destroys data on the device. Think of if you had good data on a pmem device. It would be catastrophic if you compile-in "kmem", but leave out the "device_dax" driver. kmem would take over the device and write volatile data all over your good data.
This inherits any existing NUMA information for the newly-added memory from the persistent memory device that came from the firmware. On Intel platforms, the firmware has guarantees that require each socket's persistent memory to be in a separate memory-only NUMA node. That means that this patch is not expected to create NUMA nodes, but will simply hotplug memory into existing nodes.
Because NUMA nodes are created, the existing NUMA APIs and tools are sufficient to create policies for applications or memory areas to have affinity for or an aversion to using this memory.
There is currently some metadata at the beginning of pmem regions. The section-size memory hotplug restrictions, plus this small reserved area can cause the "loss" of a section or two of capacity. This should be fixable in follow-on patches. But, as a first step, losing 256MB of memory (worst case) out of hundreds of gigabytes is a good tradeoff vs. the required code to fix this up precisely. This calculation is also the reason we export memory_block_size_bytes().
Signed-off-by: Dave Hansen <[email protected]> Reviewed-by: Dan Williams <[email protected]> Reviewed-by: Keith Busch <[email protected]> Cc: Dave Jiang <[email protected]> Cc: Ross Zwisler <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Michal Hocko <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Huang Ying <[email protected]> Cc: Fengguang Wu <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Yaowei Bai <[email protected]> Cc: Takashi Iwai <[email protected]> Cc: Jerome Glisse <[email protected]> Reviewed-by: Vishal Verma <[email protected]> Signed-off-by: Dan Williams <[email protected]>
show more ...
|
|
Revision tags: v5.0-rc8, v5.0-rc7, v5.0-rc6, v5.0-rc5, v5.0-rc4, v5.0-rc3, v5.0-rc2, v5.0-rc1, v4.20, v4.20-rc7, v4.20-rc6, v4.20-rc5, v4.20-rc4, v4.20-rc3, v4.20-rc2, v4.20-rc1, v4.19, v4.19-rc8, v4.19-rc7, v4.19-rc6, v4.19-rc5, v4.19-rc4, v4.19-rc3, v4.19-rc2, v4.19-rc1, v4.18, v4.18-rc8, v4.18-rc7, v4.18-rc6, v4.18-rc5, v4.18-rc4, v4.18-rc3, v4.18-rc2, v4.18-rc1, v4.17, v4.17-rc7, v4.17-rc6, v4.17-rc5, v4.17-rc4, v4.17-rc3, v4.17-rc2, v4.17-rc1, v4.16, v4.16-rc7, v4.16-rc6, v4.16-rc5, v4.16-rc4, v4.16-rc3, v4.16-rc2, v4.16-rc1, v4.15, v4.15-rc9, v4.15-rc8, v4.15-rc7, v4.15-rc6, v4.15-rc5, v4.15-rc4, v4.15-rc3, v4.15-rc2, v4.15-rc1, v4.14, v4.14-rc8, v4.14-rc7, v4.14-rc6, v4.14-rc5, v4.14-rc4, v4.14-rc3, v4.14-rc2, v4.14-rc1, v4.13, v4.13-rc7, v4.13-rc6, v4.13-rc5, v4.13-rc4, v4.13-rc3, v4.13-rc2 |
|
| #
730926c3 |
| 16-Jul-2017 |
Dan Williams <[email protected]> |
device-dax: Add /sys/class/dax backwards compatibility
On the expectation that some environments may not upgrade libdaxctl (userspace component that depends on the /sys/class/dax hierarchy), provide
device-dax: Add /sys/class/dax backwards compatibility
On the expectation that some environments may not upgrade libdaxctl (userspace component that depends on the /sys/class/dax hierarchy), provide a default / legacy dax_pmem_compat driver. The dax_pmem_compat driver implements the original /sys/class/dax sysfs layout rather than /sys/bus/dax. When userspace is upgraded it can blacklist this module and switch to the dax_pmem driver going forward.
CONFIG_DEV_DAX_PMEM_COMPAT and supporting code will be deleted according to the dax_pmem entry in Documentation/ABI/obsolete/.
Signed-off-by: Dan Williams <[email protected]>
show more ...
|
| #
2080e88a |
| 30-Mar-2018 |
Dan Williams <[email protected]> |
dax: introduce CONFIG_DAX_DRIVER
In support of allowing device-mapper to compile out idle/dead code when there are no dax providers in the system, introduce the DAX_DRIVER symbol. This is selected b
dax: introduce CONFIG_DAX_DRIVER
In support of allowing device-mapper to compile out idle/dead code when there are no dax providers in the system, introduce the DAX_DRIVER symbol. This is selected by all leaf drivers that device-mapper might be layered on top. This allows device-mapper to conditionally 'select DAX' only when a provider is present.
Cc: Martin Schwidefsky <[email protected]> Cc: Heiko Carstens <[email protected]> Reported-by: Bart Van Assche <[email protected]> Reviewed-by: Mike Snitzer <[email protected]> Signed-off-by: Dan Williams <[email protected]>
show more ...
|
|
Revision tags: v4.13-rc1, v4.12, v4.12-rc7, v4.12-rc6, v4.12-rc5, v4.12-rc4, v4.12-rc3, v4.12-rc2, v4.12-rc1 |
|
| #
cf1e2289 |
| 08-May-2017 |
Dan Williams <[email protected]> |
device-dax: kill NR_DEV_DAX
There is no point to ask how many device-dax instances the kernel should support. Since we are already using a dynamic major number, just allow the max number of minors b
device-dax: kill NR_DEV_DAX
There is no point to ask how many device-dax instances the kernel should support. Since we are already using a dynamic major number, just allow the max number of minors by default and be done. This also fixes the fact that the proposed max for the NR_DEV_DAX range was larger than what could be supported by alloc_chrdev_region().
Fixes: ba09c01d2fa8 ("dax: convert to the cdev api") Reported-by: Geert Uytterhoeven <[email protected]> Tested-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Dan Williams <[email protected]>
show more ...
|
| #
74d71a01 |
| 06-May-2017 |
Mike Galbraith <[email protected]> |
device-dax: Tell kbuild DEV_DAX_PMEM depends on DEV_DAX
ERROR: "devm_create_dev_dax" [drivers/dax/dax_pmem.ko] undefined! ERROR: "alloc_dax_region" [drivers/dax/dax_pmem.ko] undefined! ERROR: "dax_r
device-dax: Tell kbuild DEV_DAX_PMEM depends on DEV_DAX
ERROR: "devm_create_dev_dax" [drivers/dax/dax_pmem.ko] undefined! ERROR: "alloc_dax_region" [drivers/dax/dax_pmem.ko] undefined! ERROR: "dax_region_put" [drivers/dax/dax_pmem.ko] undefined!
Signed-off-by: Mike Galbraith <[email protected]> Signed-off-by: Dan Williams <[email protected]>
show more ...
|
|
Revision tags: v4.11, v4.11-rc8, v4.11-rc7 |
|
| #
7b6be844 |
| 11-Apr-2017 |
Dan Williams <[email protected]> |
dax: refactor dax-fs into a generic provider of 'struct dax_device' instances
We want dax capable drivers to be able to publish a set of dax operations [1]. However, we do not want to further abuse
dax: refactor dax-fs into a generic provider of 'struct dax_device' instances
We want dax capable drivers to be able to publish a set of dax operations [1]. However, we do not want to further abuse block_devices to advertise these operations. Instead we will attach these operations to a dax device and add a lookup mechanism to go from block device path to a dax device. A dax capable driver like pmem or brd is responsible for registering a dax device, alongside a block device, and then a dax capable filesystem is responsible for retrieving the dax device by path name if it wants to call dax_operations.
For now, we refactor the dax pseudo-fs to be a generic facility, rather than an implementation detail, of the device-dax use case. Where a "dax device" is just an inode + dax infrastructure, and "Device DAX" is a mapping service layered on top of that base 'struct dax_device'. "Filesystem DAX" is then a mapping service that layers a filesystem on top of that same base device. Filesystem DAX is associated with a block_device for now, but perhaps directly to a dax device in the future, or for new pmem-only filesystems.
[1]: https://lkml.org/lkml/2017/1/19/880
Suggested-by: Christoph Hellwig <[email protected]> Signed-off-by: Dan Williams <[email protected]>
show more ...
|
|
Revision tags: v4.11-rc6 |
|
| #
956a4cd2 |
| 07-Apr-2017 |
Dan Williams <[email protected]> |
device-dax: switch to srcu, fix rcu_read_lock() vs pte allocation
The following warning triggers with a new unit test that stresses the device-dax interface.
=============================== [ ERR
device-dax: switch to srcu, fix rcu_read_lock() vs pte allocation
The following warning triggers with a new unit test that stresses the device-dax interface.
=============================== [ ERR: suspicious RCU usage. ] 4.11.0-rc4+ #1049 Tainted: G O ------------------------------- ./include/linux/rcupdate.h:521 Illegal context switch in RCU read-side critical section!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 0 2 locks held by fio/9070: #0: (&mm->mmap_sem){++++++}, at: [<ffffffff8d0739d7>] __do_page_fault+0x167/0x4f0 #1: (rcu_read_lock){......}, at: [<ffffffffc03fbd02>] dax_dev_huge_fault+0x32/0x620 [dax]
Call Trace: dump_stack+0x86/0xc3 lockdep_rcu_suspicious+0xd7/0x110 ___might_sleep+0xac/0x250 __might_sleep+0x4a/0x80 __alloc_pages_nodemask+0x23a/0x360 alloc_pages_current+0xa1/0x1f0 pte_alloc_one+0x17/0x80 __pte_alloc+0x1e/0x120 __get_locked_pte+0x1bf/0x1d0 insert_pfn.isra.70+0x3a/0x100 ? lookup_memtype+0xa6/0xd0 vm_insert_mixed+0x64/0x90 dax_dev_huge_fault+0x520/0x620 [dax] ? dax_dev_huge_fault+0x32/0x620 [dax] dax_dev_fault+0x10/0x20 [dax] __do_fault+0x1e/0x140 __handle_mm_fault+0x9af/0x10d0 handle_mm_fault+0x16d/0x370 ? handle_mm_fault+0x47/0x370 __do_page_fault+0x28c/0x4f0 trace_do_page_fault+0x58/0x2a0 do_async_page_fault+0x1a/0xa0 async_page_fault+0x28/0x30
Inserting a page table entry may trigger an allocation while we are holding a read lock to keep the device instance alive for the duration of the fault. Use srcu for this keep-alive protection.
Fixes: dee410792419 ("/dev/dax, core: file operations and dax-mmap") Cc: <[email protected]> Signed-off-by: Dan Williams <[email protected]>
show more ...
|
|
Revision tags: v4.11-rc5, v4.11-rc4, v4.11-rc3, v4.11-rc2, v4.11-rc1, v4.10, v4.10-rc8, v4.10-rc7, v4.10-rc6, v4.10-rc5, v4.10-rc4, v4.10-rc3, v4.10-rc2, v4.10-rc1, v4.9, v4.9-rc8, v4.9-rc7, v4.9-rc6, v4.9-rc5, v4.9-rc4, v4.9-rc3 |
|
| #
867dfe34 |
| 25-Oct-2016 |
Arnd Bergmann <[email protected]> |
nvdimm: make CONFIG_NVDIMM_DAX 'bool'
A bugfix just tried to address a randconfig build problem and introduced a variant of the same problem: with CONFIG_LIBNVDIMM=y and CONFIG_NVDIMM_DAX=m, the nvd
nvdimm: make CONFIG_NVDIMM_DAX 'bool'
A bugfix just tried to address a randconfig build problem and introduced a variant of the same problem: with CONFIG_LIBNVDIMM=y and CONFIG_NVDIMM_DAX=m, the nvdimm module now fails to link:
drivers/nvdimm/built-in.o: In function `to_nd_device_type': bus.c:(.text+0x1b5d): undefined reference to `is_nd_dax' drivers/nvdimm/built-in.o: In function `nd_region_notify_driver_action.constprop.2': region_devs.c:(.text+0x6b6c): undefined reference to `is_nd_dax' region_devs.c:(.text+0x6b8c): undefined reference to `to_nd_dax' drivers/nvdimm/built-in.o: In function `nd_region_probe': region.c:(.text+0x70f3): undefined reference to `nd_dax_create' drivers/nvdimm/built-in.o: In function `mode_show': namespace_devs.c:(.text+0xa196): undefined reference to `is_nd_dax' drivers/nvdimm/built-in.o: In function `nvdimm_namespace_common_probe': (.text+0xa55f): undefined reference to `is_nd_dax' drivers/nvdimm/built-in.o: In function `nvdimm_namespace_common_probe': (.text+0xa56e): undefined reference to `to_nd_dax'
This reverts the earlier fix, making NVDIMM_DAX a 'bool' option again as it should be (it gets linked into the libnvdimm module). To fix the original problem, I'm adding a dependency on LIBNVDIMM to DEV_DAX_PMEM, which ensures we can't have that one built-in if the rest is a module.
Fixes: 4e65e9381c7a ("/dev/dax: fix Kconfig dependency build breakage") Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: Ross Zwisler <[email protected]> Signed-off-by: Dan Williams <[email protected]>
show more ...
|
|
Revision tags: v4.9-rc2, v4.9-rc1, v4.8, v4.8-rc8, v4.8-rc7, v4.8-rc6, v4.8-rc5, v4.8-rc4, v4.8-rc3, v4.8-rc2, v4.8-rc1 |
|
| #
ba09c01d |
| 24-Jul-2016 |
Dan Williams <[email protected]> |
dax: convert to the cdev api
A goal of the device-DAX interface is to be able to support many exclusive allocations (partitions) of performance / feature differentiated memory. This count may excee
dax: convert to the cdev api
A goal of the device-DAX interface is to be able to support many exclusive allocations (partitions) of performance / feature differentiated memory. This count may exceed the default minors limit of 256.
As a result of switching to an embedded cdev the inode-to-dax_dev conversion is simplified, as well as reference counting which can switch to the cdev kobject lifetime.
Cc: Al Viro <[email protected]> Signed-off-by: Dan Williams <[email protected]>
show more ...
|
|
Revision tags: v4.7, v4.7-rc7, v4.7-rc6, v4.7-rc5, v4.7-rc4, v4.7-rc3, v4.7-rc2, v4.7-rc1, v4.6 |
|
| #
dee41079 |
| 14-May-2016 |
Dan Williams <[email protected]> |
/dev/dax, core: file operations and dax-mmap
The "Device DAX" core enables dax mappings of performance / feature differentiated memory. An open mapping or file handle keeps the backing struct devic
/dev/dax, core: file operations and dax-mmap
The "Device DAX" core enables dax mappings of performance / feature differentiated memory. An open mapping or file handle keeps the backing struct device live, but new mappings are only possible while the device is enabled. Faults are handled under rcu_read_lock to synchronize with the enabled state of the device.
Similar to the filesystem-dax case the backing memory may optionally have struct page entries. However, unlike fs-dax there is no support for private mappings, or mappings that are not backed by media (see use of zero-page in fs-dax).
Mappings are always guaranteed to match the alignment of the dax_region. If the dax_region is configured to have a 2MB alignment, all mappings are guaranteed to be backed by a pmd entry. Contrast this determinism with the fs-dax case where pmd mappings are opportunistic. If userspace attempts to force a misaligned mapping, the driver will fail the mmap attempt. See dax_dev_check_vma() for other scenarios that are rejected, like MAP_PRIVATE mappings.
Cc: Hannes Reinecke <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Ross Zwisler <[email protected]> Acked-by: "Paul E. McKenney" <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Dan Williams <[email protected]>
show more ...
|
| #
ab68f262 |
| 18-May-2016 |
Dan Williams <[email protected]> |
/dev/dax, pmem: direct access to persistent memory
Device DAX is the device-centric analogue of Filesystem DAX (CONFIG_FS_DAX). It allows memory ranges to be allocated and mapped without need of an
/dev/dax, pmem: direct access to persistent memory
Device DAX is the device-centric analogue of Filesystem DAX (CONFIG_FS_DAX). It allows memory ranges to be allocated and mapped without need of an intervening file system. Device DAX is strict, precise and predictable. Specifically this interface:
1/ Guarantees fault granularity with respect to a given page size (pte, pmd, or pud) set at configuration time.
2/ Enforces deterministic behavior by being strict about what fault scenarios are supported.
For example, by forcing MADV_DONTFORK semantics and omitting MAP_PRIVATE support device-dax guarantees that a mapping always behaves/performs the same once established. It is the "what you see is what you get" access mechanism to differentiated memory vs filesystem DAX which has filesystem specific implementation semantics.
Persistent memory is the first target, but the mechanism is also targeted for exclusive allocations of performance differentiated memory ranges.
This commit is limited to the base device driver infrastructure to associate a dax device with pmem range.
Cc: Jeff Moyer <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Ross Zwisler <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Dan Williams <[email protected]>
show more ...
|