History log of /linux-6.15/fs/proc/kcore.c (Results 1 – 25 of 99)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13
# 0de47f28 15-Jan-2025 Akihiko Odaki <[email protected]>

crash: Use note name macros

Use note name macros to match with the userspace's expectation.

Signed-off-by: Akihiko Odaki <[email protected]>
Acked-by: Baoquan He <[email protected]>
Reviewed-by

crash: Use note name macros

Use note name macros to match with the userspace's expectation.

Signed-off-by: Akihiko Odaki <[email protected]>
Acked-by: Baoquan He <[email protected]>
Reviewed-by: Dave Martin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Kees Cook <[email protected]>

show more ...


Revision tags: v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7
# 605291e2 09-Nov-2024 Omar Sandoval <[email protected]>

proc/kcore: use percpu_rw_semaphore for kclist_lock

The list of memory ranges for /proc/kcore is protected by a
rw_semaphore. We lock it for reading on every read from /proc/kcore.
This is very heav

proc/kcore: use percpu_rw_semaphore for kclist_lock

The list of memory ranges for /proc/kcore is protected by a
rw_semaphore. We lock it for reading on every read from /proc/kcore.
This is very heavy, especially since it is rarely locked for writing.
Since we want to strongly favor read lock performance, convert it to a
percpu_rw_semaphore. I also experimented with percpu_ref and SRCU, but
this change was the simplest and the fastest.

In my benchmark, this reduces the time per read by yet another 20
nanoseconds on top of the previous two changes, from 195 nanoseconds per
read to 175.

Link: https://github.com/osandov/drgn/issues/106
Signed-off-by: Omar Sandoval <[email protected]>
Link: https://lore.kernel.org/r/83a3b235b4bcc3b8aef7c533e0657f4d7d5d35ae.1731115587.git.osandov@fb.com
Signed-off-by: Christian Brauner <[email protected]>

show more ...


# 680e029f 09-Nov-2024 Omar Sandoval <[email protected]>

proc/kcore: don't walk list on every read

We maintain a list of memory ranges for /proc/kcore, which usually has
10-20 entries. Currently, every single read from /proc/kcore walks the
entire list in

proc/kcore: don't walk list on every read

We maintain a list of memory ranges for /proc/kcore, which usually has
10-20 entries. Currently, every single read from /proc/kcore walks the
entire list in order to count the number of entries and compute some
offsets. These values only change when the list of memory ranges
changes, which is very rare (only when memory is hot(un)plugged). We can
cache the values when the list is populated to avoid these redundant
walks.

In my benchmark, this reduces the time per read by another 20
nanoseconds on top of the previous change, from 215 nanoseconds per read
to 195.

Link: https://github.com/osandov/drgn/issues/106
Signed-off-by: Omar Sandoval <[email protected]>
Link: https://lore.kernel.org/r/8d945558b9c9efe74103a34b7780f1cd90d9ce7f.1731115587.git.osandov@fb.com
Signed-off-by: Christian Brauner <[email protected]>

show more ...


# c9136fad 09-Nov-2024 Omar Sandoval <[email protected]>

proc/kcore: mark proc entry as permanent

drgn reads from /proc/kcore to debug the running kernel. For many drgn
scripts, /proc/kcore is actually a bottleneck.

use_pde() and unuse_pde() in prog_reg_

proc/kcore: mark proc entry as permanent

drgn reads from /proc/kcore to debug the running kernel. For many drgn
scripts, /proc/kcore is actually a bottleneck.

use_pde() and unuse_pde() in prog_reg_read() show up hot in profiles.
Since the entry for /proc/kcore can never be removed, this is useless
overhead that can be trivially avoided by marking the entry as
permanent.

In my benchmark, this reduces the time per read by about 20 nanoseconds,
from 235 nanoseconds per read to 215.

Link: https://github.com/osandov/drgn/issues/106
Signed-off-by: Omar Sandoval <[email protected]>
Link: https://lore.kernel.org/r/60873e6afcfda3f08d0456f19e4733612afcf134.1731115587.git.osandov@fb.com
Signed-off-by: Christian Brauner <[email protected]>

show more ...


# 088f2946 21-Nov-2024 Jiri Olsa <[email protected]>

fs/proc/kcore.c: Clear ret value in read_kcore_iter after successful iov_iter_zero

If iov_iter_zero succeeds after failed copy_from_kernel_nofault,
we need to reset the ret value to zero otherwise i

fs/proc/kcore.c: Clear ret value in read_kcore_iter after successful iov_iter_zero

If iov_iter_zero succeeds after failed copy_from_kernel_nofault,
we need to reset the ret value to zero otherwise it will be returned
as final return value of read_kcore_iter.

This fixes objdump -d dump over /proc/kcore for me.

Cc: [email protected]
Cc: Alexander Gordeev <[email protected]>
Fixes: 3d5854d75e31 ("fs/proc/kcore.c: allow translation of physical memory addresses")
Signed-off-by: Jiri Olsa <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Acked-by: Alexander Gordeev <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>

show more ...


Revision tags: v6.12-rc6
# 82e33f24 29-Oct-2024 Mirsad Todorovac <[email protected]>

fs/proc/kcore.c: fix coccinelle reported ERROR instances

Coccinelle complains about the nested reuse of the pointer `iter' with
different pointer type:

./fs/proc/kcore.c:515:26-30: ERROR: invalid r

fs/proc/kcore.c: fix coccinelle reported ERROR instances

Coccinelle complains about the nested reuse of the pointer `iter' with
different pointer type:

./fs/proc/kcore.c:515:26-30: ERROR: invalid reference to the index variable of the iterator on line 499
./fs/proc/kcore.c:534:23-27: ERROR: invalid reference to the index variable of the iterator on line 499
./fs/proc/kcore.c:550:40-44: ERROR: invalid reference to the index variable of the iterator on line 499
./fs/proc/kcore.c:568:27-31: ERROR: invalid reference to the index variable of the iterator on line 499
./fs/proc/kcore.c:581:28-32: ERROR: invalid reference to the index variable of the iterator on line 499
./fs/proc/kcore.c:599:27-31: ERROR: invalid reference to the index variable of the iterator on line 499
./fs/proc/kcore.c:607:38-42: ERROR: invalid reference to the index variable of the iterator on line 499
./fs/proc/kcore.c:614:26-30: ERROR: invalid reference to the index variable of the iterator on line 499

Replacing `struct kcore_list *iter' with `struct kcore_list *tmp' doesn't change the
scope and the functionality is the same and coccinelle seems happy.

NOTE: There was an issue with using `struct kcore_list *pos' as the nested iterator.
The build did not work!

[[email protected]: s/tmp/pos/]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/ [1]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 04d168c6d42d ("fs/proc/kcore.c: remove check of list iterator against head past the loop body")
Signed-off-by: Jakob Koschel <[email protected]>
Signed-off-by: Mirsad Todorovac <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: "Brian Johannesmeyer" <[email protected]>
Cc: Cristiano Giuffrida <[email protected]>
Cc: "Bos, H.J." <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Yang Li <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Hari Bathini <[email protected]>
Cc: Yan Zhen <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


Revision tags: v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2
# 3d5854d7 30-Sep-2024 Alexander Gordeev <[email protected]>

fs/proc/kcore.c: allow translation of physical memory addresses

When /proc/kcore is read an attempt to read the first two pages results in
HW-specific page swap on s390 and another (so called prefix

fs/proc/kcore.c: allow translation of physical memory addresses

When /proc/kcore is read an attempt to read the first two pages results in
HW-specific page swap on s390 and another (so called prefix) pages are
accessed instead. That leads to a wrong read.

Allow architecture-specific translation of memory addresses using
kc_xlate_dev_mem_ptr() and kc_unxlate_dev_mem_ptr() callbacks similarily
to /dev/mem xlate_dev_mem_ptr() and unxlate_dev_mem_ptr() callbacks. That
way an architecture can deal with specific physical memory ranges.

Re-use the existing /dev/mem callback implementation on s390, which
handles the described prefix pages swapping correctly.

For other architectures the default callback is basically NOP. It is
expected the condition (vaddr == __va(__pa(vaddr))) always holds true for
KCORE_RAM memory type.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Alexander Gordeev <[email protected]>
Suggested-by: Heiko Carstens <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


Revision tags: v6.12-rc1, v6.11
# 698e7d16 09-Sep-2024 Yan Zhen <[email protected]>

proc: Fix typo in the comment

The deference here confuses me.

Maybe here want to say that because show_fd_locks() does not dereference
the files pointer, using the stale value of the files pointer

proc: Fix typo in the comment

The deference here confuses me.

Maybe here want to say that because show_fd_locks() does not dereference
the files pointer, using the stale value of the files pointer is safe.

Correctly spelled comments make it easier for the reader to understand
the code.

replace 'deferences' with 'dereferences' in the comment &
replace 'inialized' with 'initialized' in the comment.

Signed-off-by: Yan Zhen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>

show more ...


Revision tags: v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2
# 443cbaf9 24-Jan-2024 Baoquan He <[email protected]>

crash: split vmcoreinfo exporting code out from crash_core.c

Now move the relevant codes into separate files:
kernel/crash_reserve.c, include/linux/crash_reserve.h.

And add config item CRASH_RESERV

crash: split vmcoreinfo exporting code out from crash_core.c

Now move the relevant codes into separate files:
kernel/crash_reserve.c, include/linux/crash_reserve.h.

And add config item CRASH_RESERVE to control its enabling.

And also update the old ifdeffery of CONFIG_CRASH_CORE, including of
<linux/crash_core.h> and config item dependency on CRASH_CORE
accordingly.

And also do renaming as follows:
- arch/xxx/kernel/{crash_core.c => vmcore_info.c}
because they are only related to vmcoreinfo exporting on x86, arm64,
riscv.

And also Remove config item CRASH_CORE, and rely on CONFIG_KEXEC_CORE to
decide if build in crash_core.c.

[[email protected]: remove duplicated include in vmcore_info.c]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Baoquan He <[email protected]>
Signed-off-by: Yang Li <[email protected]>
Acked-by: Hari Bathini <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Pingfan Liu <[email protected]>
Cc: Klara Modin <[email protected]>
Cc: Michael Kelley <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Cc: Yang Li <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


Revision tags: v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2
# e538a582 11-Sep-2023 Adrian Hunter <[email protected]>

proc/kcore: do not try to access unaccepted memory

Support for unaccepted memory was added recently, refer commit
dcdfdd40fa82 ("mm: Add support for unaccepted memory"), whereby a virtual
machine ma

proc/kcore: do not try to access unaccepted memory

Support for unaccepted memory was added recently, refer commit
dcdfdd40fa82 ("mm: Add support for unaccepted memory"), whereby a virtual
machine may need to accept memory before it can be used.

Do not try to access unaccepted memory because it can cause the guest to
fail.

For /proc/kcore, which is read-only and does not support mmap, this means a
read of unaccepted memory will return zeros.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Adrian Hunter <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Borislav Petkov (AMD) <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


Revision tags: v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5
# 17457784 31-Jul-2023 Lorenzo Stoakes <[email protected]>

fs/proc/kcore: reinstate bounce buffer for KCORE_TEXT regions

Some architectures do not populate the entire range categorised by
KCORE_TEXT, so we must ensure that the kernel address we read from is

fs/proc/kcore: reinstate bounce buffer for KCORE_TEXT regions

Some architectures do not populate the entire range categorised by
KCORE_TEXT, so we must ensure that the kernel address we read from is
valid.

Unfortunately there is no solution currently available to do so with a
purely iterator solution so reinstate the bounce buffer in this instance
so we can use copy_from_kernel_nofault() in order to avoid page faults
when regions are unmapped.

This change partly reverts commit 2e1c0170771e ("fs/proc/kcore: avoid
bounce buffer for ktext data"), reinstating the bounce buffer, but adapts
the code to continue to use an iterator.

[[email protected]: correct comment to be strictly correct about reasoning]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 2e1c0170771e ("fs/proc/kcore: avoid bounce buffer for ktext data")
Signed-off-by: Lorenzo Stoakes <[email protected]>
Reported-by: Jiri Olsa <[email protected]>
Closes: https://lore.kernel.org/all/ZHc2fm+9daF6cgCE@krava
Tested-by: Jiri Olsa <[email protected]>
Tested-by: Will Deacon <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Liu Shixin <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Thorsten Leemhuis <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


Revision tags: v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2
# 9e627588 10-May-2023 Azeem Shaikh <[email protected]>

procfs: replace all non-returning strlcpy with strscpy

strlcpy() reads the entire source buffer first. This read may exceed the
destination size limit. This is both inefficient and can lead to lin

procfs: replace all non-returning strlcpy with strscpy

strlcpy() reads the entire source buffer first. This read may exceed the
destination size limit. This is both inefficient and can lead to linear
read overflows if a source string is not NUL-terminated [1]. In an effort
to remove strlcpy() completely [2], replace strlcpy() here with strscpy().
No return values were used, so direct replacement is safe.

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
[2] https://github.com/KSPP/linux/issues/89

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Azeem Shaikh <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Alexey Dobriyan <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Liu Shixin <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


Revision tags: v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4
# 9b2d38b4 29-Aug-2022 Linus Walleij <[email protected]>

fs/proc/kcore.c: Pass a pointer to virt_addr_valid()

The virt_addr_valid() should be passed a pointer, the current
code passing a long unsigned int is just exploiting the
unintentional polymorphism

fs/proc/kcore.c: Pass a pointer to virt_addr_valid()

The virt_addr_valid() should be passed a pointer, the current
code passing a long unsigned int is just exploiting the
unintentional polymorphism of these calls being implemented
as preprocessor macros.

Signed-off-by: Linus Walleij <[email protected]>

show more ...


# 4c91c07c 22-Mar-2023 Lorenzo Stoakes <[email protected]>

mm: vmalloc: convert vread() to vread_iter()

Having previously laid the foundation for converting vread() to an
iterator function, pull the trigger and do so.

This patch attempts to provide minimal

mm: vmalloc: convert vread() to vread_iter()

Having previously laid the foundation for converting vread() to an
iterator function, pull the trigger and do so.

This patch attempts to provide minimal refactoring and to reflect the
existing logic as best we can, for example we continue to zero portions of
memory not read, as before.

Overall, there should be no functional difference other than a performance
improvement in /proc/kcore access to vmalloc regions.

Now we have eliminated the need for a bounce buffer in read_kcore_iter(),
we dispense with it, and try to write to user memory optimistically but
with faults disabled via copy_page_to_iter_nofault(). We already have
preemption disabled by holding a spin lock. We continue faulting in until
the operation is complete.

Additionally, we must account for the fact that at any point a copy may
fail (most likely due to a fault not being able to occur), we exit
indicating fewer bytes retrieved than expected.

[[email protected]: fix sparc64 warning]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: redo Stephen's sparc build fix]
Link: https://lkml.kernel.org/r/8506cbc667c39205e65a323f750ff9c11a463798.1679566220.git.lstoakes@gmail.com
[[email protected]: unbreak uio.h includes]
Link: https://lkml.kernel.org/r/941f88bc5ab928e6656e1e2593b91bf0f8c81e1b.1679511146.git.lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes <[email protected]>
Signed-off-by: Stephen Rothwell <[email protected]>
Reviewed-by: Baoquan He <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Liu Shixin <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


# 46c0d6d0 22-Mar-2023 Lorenzo Stoakes <[email protected]>

fs/proc/kcore: convert read_kcore() to read_kcore_iter()

For the time being we still use a bounce buffer for vread(), however in
the next patch we will convert this to interact directly with the ite

fs/proc/kcore: convert read_kcore() to read_kcore_iter()

For the time being we still use a bounce buffer for vread(), however in
the next patch we will convert this to interact directly with the iterator
and eliminate the bounce buffer altogether.

Link: https://lkml.kernel.org/r/ebe12c8d70eebd71f487d80095605f3ad0d1489c.1679511146.git.lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Baoquan He <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Liu Shixin <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


# 2e1c0170 22-Mar-2023 Lorenzo Stoakes <[email protected]>

fs/proc/kcore: avoid bounce buffer for ktext data

Patch series "convert read_kcore(), vread() to use iterators", v8.

While reviewing Baoquan's recent changes to permit vread() access to
vm_map_ram

fs/proc/kcore: avoid bounce buffer for ktext data

Patch series "convert read_kcore(), vread() to use iterators", v8.

While reviewing Baoquan's recent changes to permit vread() access to
vm_map_ram regions of vmalloc allocations, Willy pointed out [1] that it
would be nice to refactor vread() as a whole, since its only user is
read_kcore() and the existing form of vread() necessitates the use of a
bounce buffer.

This patch series does exactly that, as well as adjusting how we read the
kernel text section to avoid the use of a bounce buffer in this case as
well.

This has been tested against the test case which motivated Baoquan's
changes in the first place [2] which continues to function correctly, as
do the vmalloc self tests.


This patch (of 4):

Commit df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data")
introduced the use of a bounce buffer to retrieve kernel text data for
/proc/kcore in order to avoid failures arising from hardened user copies
enabled by CONFIG_HARDENED_USERCOPY in check_kernel_text_object().

We can avoid doing this if instead of copy_to_user() we use
_copy_to_user() which bypasses the hardening check. This is more
efficient than using a bounce buffer and simplifies the code.

We do so as part an overall effort to eliminate bounce buffer usage in the
function with an eye to converting it an iterator read.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/all/Y8WfDSRkc%[email protected]/ [1]
Link: https://lore.kernel.org/all/[email protected]/T/#u [2]
Link: https://lkml.kernel.org/r/fd39b0bfa7edc76d360def7d034baaee71d90158.1679511146.git.lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Baoquan He <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Liu Shixin <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


# e025ab84 18-Oct-2022 Kefeng Wang <[email protected]>

mm: remove kern_addr_valid() completely

Most architectures (except arm64/x86/sparc) simply return 1 for
kern_addr_valid(), which is only used in read_kcore(), and it calls
copy_from_kernel_nofault()

mm: remove kern_addr_valid() completely

Most architectures (except arm64/x86/sparc) simply return 1 for
kern_addr_valid(), which is only used in read_kcore(), and it calls
copy_from_kernel_nofault() which could check whether the address is a
valid kernel address. So as there is no need for kern_addr_valid(), let's
remove it.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kefeng Wang <[email protected]>
Acked-by: Geert Uytterhoeven <[email protected]> [m68k]
Acked-by: Heiko Carstens <[email protected]> [s390]
Acked-by: Christoph Hellwig <[email protected]>
Acked-by: Helge Deller <[email protected]> [parisc]
Acked-by: Michael Ellerman <[email protected]> [powerpc]
Acked-by: Guo Ren <[email protected]> [csky]
Acked-by: Catalin Marinas <[email protected]> [arm64]
Cc: Alexander Gordeev <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Anton Ivanov <[email protected]>
Cc: <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Chris Zankel <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Dinh Nguyen <[email protected]>
Cc: Greg Ungerer <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Johannes Berg <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Russell King <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: Stefan Kristiansson <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Xuerui Wang <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


# 1eeaa4fd 23-Sep-2022 Liu Shixin <[email protected]>

memory: move hotplug memory notifier priority to same file for easy sorting

The priority of hotplug memory callback is defined in a different file.
And there are some callers using numbers directly

memory: move hotplug memory notifier priority to same file for easy sorting

The priority of hotplug memory callback is defined in a different file.
And there are some callers using numbers directly. Collect them together
into include/linux/memory.h for easy reading. This allows us to sort
their priorities more intuitively without additional comments.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Liu Shixin <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: zefan li <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


# 5d89c224 23-Sep-2022 Liu Shixin <[email protected]>

fs/proc/kcore.c: use hotplug_memory_notifier() directly

Commit 76ae847497bc52 ("Documentation: raise minimum supported version of
GCC to 5.1") updated the minimum gcc version to 5.1. So the problem

fs/proc/kcore.c: use hotplug_memory_notifier() directly

Commit 76ae847497bc52 ("Documentation: raise minimum supported version of
GCC to 5.1") updated the minimum gcc version to 5.1. So the problem
mentioned in f02c69680088 ("include/linux/memory.h: implement
register_hotmemory_notifier()") no longer exist. So we can now switch to
use hotplug_memory_notifier() directly rather than
register_hotmemory_notifier().

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Liu Shixin <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: zefan li <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


Revision tags: v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5
# 04d168c6 29-Apr-2022 Jakob Koschel <[email protected]>

fs/proc/kcore.c: remove check of list iterator against head past the loop body

When list_for_each_entry() completes the iteration over the whole list
without breaking the loop, the iterator value wi

fs/proc/kcore.c: remove check of list iterator against head past the loop body

When list_for_each_entry() completes the iteration over the whole list
without breaking the loop, the iterator value will be a bogus pointer
computed based on the head element.

While it is safe to use the pointer to determine if it was computed based
on the head element, either with list_entry_is_head() or &pos->member ==
head, using the iterator variable after the loop should be avoided.

In preparation to limit the scope of a list iterator to the list traversal
loop, use a dedicated pointer to point to the found element [1].

[[email protected]: reduce scope of `iter']
Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/ [1]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Jakob Koschel <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: "Brian Johannesmeyer" <[email protected]>
Cc: Cristiano Giuffrida <[email protected]>
Cc: "Bos, H.J." <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


Revision tags: v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1
# c6d9eee2 01-Jul-2021 David Hildenbrand <[email protected]>

fs/proc/kcore: use page_offline_(freeze|thaw)

Let's properly synchronize with drivers that set PageOffline().
Unfreeze/thaw every now and then, so drivers that want to set
PageOffline() can make pro

fs/proc/kcore: use page_offline_(freeze|thaw)

Let's properly synchronize with drivers that set PageOffline().
Unfreeze/thaw every now and then, so drivers that want to set
PageOffline() can make progress.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Hildenbrand <[email protected]>
Acked-by: Mike Rapoport <[email protected]>
Reviewed-by: Oscar Salvador <[email protected]>
Cc: Aili Yao <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Alex Shi <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Cc: Jason Wang <[email protected]>
Cc: Jiri Bohac <[email protected]>
Cc: "K. Y. Srinivasan" <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: "Michael S. Tsirkin" <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Steven Price <[email protected]>
Cc: Wei Liu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


# 0daa322b 01-Jul-2021 David Hildenbrand <[email protected]>

fs/proc/kcore: don't read offline sections, logically offline pages and hwpoisoned pages

Let's avoid reading:

1) Offline memory sections: the content of offline memory sections is
stale as the m

fs/proc/kcore: don't read offline sections, logically offline pages and hwpoisoned pages

Let's avoid reading:

1) Offline memory sections: the content of offline memory sections is
stale as the memory is effectively unused by the kernel. On s390x with
standby memory, offline memory sections (belonging to offline storage
increments) are not accessible. With virtio-mem and the hyper-v
balloon, we can have unavailable memory chunks that should not be
accessed inside offline memory sections. Last but not least, offline
memory sections might contain hwpoisoned pages which we can no longer
identify because the memmap is stale.

2) PG_offline pages: logically offline pages that are documented as
"The content of these pages is effectively stale. Such pages should
not be touched (read/write/dump/save) except by their owner.".
Examples include pages inflated in a balloon or unavailble memory
ranges inside hotplugged memory sections with virtio-mem or the hyper-v
balloon.

3) PG_hwpoison pages: Reading pages marked as hwpoisoned can be fatal.
As documented: "Accessing is not safe since it may cause another
machine check. Don't touch!"

Introduce is_page_hwpoison(), adding a comment that it is inherently racy
but best we can really do.

Reading /proc/kcore now performs similar checks as when reading
/proc/vmcore for kdump via makedumpfile: problematic pages are exclude.
It's also similar to hibernation code, however, we don't skip hwpoisoned
pages when processing pages in kernel/power/snapshot.c:saveable_page()
yet.

Note 1: we can race against memory offlining code, especially memory going
offline and getting unplugged: however, we will properly tear down the
identity mapping and handle faults gracefully when accessing this memory
from kcore code.

Note 2: we can race against drivers setting PageOffline() and turning
memory inaccessible in the hypervisor. We'll handle this in a follow-up
patch.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Mike Rapoport <[email protected]>
Reviewed-by: Oscar Salvador <[email protected]>
Cc: Aili Yao <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Alex Shi <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Cc: Jason Wang <[email protected]>
Cc: Jiri Bohac <[email protected]>
Cc: "K. Y. Srinivasan" <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: "Michael S. Tsirkin" <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Steven Price <[email protected]>
Cc: Wei Liu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


# 2711032c 01-Jul-2021 David Hildenbrand <[email protected]>

fs/proc/kcore: pfn_is_ram check only applies to KCORE_RAM

Let's resturcture the code, using switch-case, and checking pfn_is_ram()
only when we are dealing with KCORE_RAM.

Link: https://lkml.kernel

fs/proc/kcore: pfn_is_ram check only applies to KCORE_RAM

Let's resturcture the code, using switch-case, and checking pfn_is_ram()
only when we are dealing with KCORE_RAM.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Mike Rapoport <[email protected]>
Cc: Aili Yao <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Alex Shi <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Cc: Jason Wang <[email protected]>
Cc: Jiri Bohac <[email protected]>
Cc: "K. Y. Srinivasan" <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: "Michael S. Tsirkin" <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Steven Price <[email protected]>
Cc: Wei Liu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


# 3c36b419 01-Jul-2021 David Hildenbrand <[email protected]>

fs/proc/kcore: drop KCORE_REMAP and KCORE_OTHER

Patch series "fs/proc/kcore: don't read offline sections, logically offline pages and hwpoisoned pages", v3.

Looking for places where the kernel migh

fs/proc/kcore: drop KCORE_REMAP and KCORE_OTHER

Patch series "fs/proc/kcore: don't read offline sections, logically offline pages and hwpoisoned pages", v3.

Looking for places where the kernel might unconditionally read
PageOffline() pages, I stumbled over /proc/kcore; turns out /proc/kcore
needs some more love to not touch some other pages we really don't want to
read -- i.e., hwpoisoned ones.

Examples for PageOffline() pages are pages inflated in a balloon, memory
unplugged via virtio-mem, and partially-present sections in memory added
by the Hyper-V balloon.

When reading pages inflated in a balloon, we essentially produce
unnecessary load in the hypervisor; holes in partially present sections in
case of Hyper-V are not accessible and already were a problem for
/proc/vmcore, fixed in makedumpfile by detecting PageOffline() pages. In
the future, virtio-mem might disallow reading unplugged memory -- marked
as PageOffline() -- in some environments, resulting in undefined behavior
when accessed; therefore, I'm trying to identify and rework all these
(corner) cases.

With this series, there is really only access via /dev/mem, /proc/vmcore
and kdb left after I ripped out /dev/kmem. kdb is an advanced corner-case
use case -- we won't care for now if someone explicitly tries to do nasty
things by reading from/writing to physical addresses we better not touch.
/dev/mem is a use case we won't support for virtio-mem, at least for now,
so we'll simply disallow mapping any virtio-mem memory via /dev/mem next.
/proc/vmcore is really only a problem when dumping the old kernel via
something that's not makedumpfile (read: basically never), however, we'll
try sanitizing that as well in the second kernel in the future.

Tested via kcore_dump:
https://github.com/schlafwandler/kcore_dump

This patch (of 6):

Commit db779ef67ffe ("proc/kcore: Remove unused kclist_add_remap()")
removed the last user of KCORE_REMAP.

Commit 595dd46ebfc1 ("vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when
dumping vsyscall user page") removed the last user of KCORE_OTHER.

Let's drop both types. While at it, also drop vaddr in "struct
kcore_list", used by KCORE_REMAP only.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Mike Rapoport <[email protected]>
Cc: "Michael S. Tsirkin" <[email protected]>
Cc: Jason Wang <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Alex Shi <[email protected]>
Cc: Steven Price <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Aili Yao <[email protected]>
Cc: Jiri Bohac <[email protected]>
Cc: "K. Y. Srinivasan" <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Wei Liu <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


Revision tags: v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1
# 5e545df3 15-Dec-2020 Mike Rapoport <[email protected]>

arm: remove CONFIG_ARCH_HAS_HOLES_MEMORYMODEL

ARM is the only architecture that defines CONFIG_ARCH_HAS_HOLES_MEMORYMODEL
which in turn enables memmap_valid_within() function that is intended to
ver

arm: remove CONFIG_ARCH_HAS_HOLES_MEMORYMODEL

ARM is the only architecture that defines CONFIG_ARCH_HAS_HOLES_MEMORYMODEL
which in turn enables memmap_valid_within() function that is intended to
verify existence of struct page associated with a pfn when there are holes
in the memory map.

However, the ARCH_HAS_HOLES_MEMORYMODEL also enables HAVE_ARCH_PFN_VALID
and arch-specific pfn_valid() implementation that also deals with the holes
in the memory map.

The only two users of memmap_valid_within() call this function after
a call to pfn_valid() so the memmap_valid_within() check becomes redundant.

Remove CONFIG_ARCH_HAS_HOLES_MEMORYMODEL and memmap_valid_within() and rely
entirely on ARM's implementation of pfn_valid() that is now enabled
unconditionally.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Greg Ungerer <[email protected]>
Cc: John Paul Adrian Glaubitz <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Meelis Roos <[email protected]>
Cc: Michael Schmitz <[email protected]>
Cc: Russell King <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


1234