Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13

# 0de47f28 | 15-Jan-2025 | Akihiko Odaki <[email protected]>

crash: Use note name macros
Use note name macros to match userspace's expectations.
Signed-off-by: Akihiko Odaki <[email protected]> Acked-by: Baoquan He <[email protected]> Reviewed-by: Dave Martin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]>
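
A minimal sketch of the idea, not the exact patch: when filling an ELF note header, use the note-name macro that userspace tools also consume instead of a hard-coded string. NN_PRSTATUS (the note *name*) is assumed to be the macro introduced by this series; the helper below is hypothetical.

    #include <linux/elf.h>
    #include <linux/string.h>

    /*
     * Hypothetical helper, for illustration only: fill a 64-bit ELF note
     * header for a PRSTATUS note.  NN_PRSTATUS (the note name) is assumed
     * from this series; NT_PRSTATUS (the note type) is the long-standing
     * ELF macro.
     */
    static void fill_prstatus_nhdr(Elf64_Nhdr *nhdr, size_t datasz)
    {
            nhdr->n_namesz = strlen(NN_PRSTATUS) + 1; /* was: strlen("CORE") + 1 */
            nhdr->n_descsz = datasz;
            nhdr->n_type   = NT_PRSTATUS;
    }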

Revision tags: v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7

# 605291e2 | 09-Nov-2024 | Omar Sandoval <[email protected]>

proc/kcore: use percpu_rw_semaphore for kclist_lock
The list of memory ranges for /proc/kcore is protected by a rw_semaphore. We lock it for reading on every read from /proc/kcore. This is very heavy, especially since it is rarely locked for writing. Since we want to strongly favor read lock performance, convert it to a percpu_rw_semaphore. I also experimented with percpu_ref and SRCU, but this change was the simplest and the fastest.
In my benchmark, this reduces the time per read by yet another 20 nanoseconds on top of the previous two changes, from 195 nanoseconds per read to 175.
Link: https://github.com/osandov/drgn/issues/106 Signed-off-by: Omar Sandoval <[email protected]> Link: https://lore.kernel.org/r/83a3b235b4bcc3b8aef7c533e0657f4d7d5d35ae.1731115587.git.osandov@fb.com Signed-off-by: Christian Brauner <[email protected]>
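
The conversion pattern, sketched under the assumption of a statically defined lock; the names are illustrative rather than the exact symbols in fs/proc/kcore.c.

    #include <linux/percpu-rwsem.h>

    /* Illustrative: the real lock guards the kcore_list entries. */
    DEFINE_STATIC_PERCPU_RWSEM(kclist_lock);

    static void reader_path(void)
    {
            percpu_down_read(&kclist_lock);   /* cheap per-CPU fast path */
            /* ... walk the kcore_list entries ... */
            percpu_up_read(&kclist_lock);
    }

    static void writer_path(void)
    {
            percpu_down_write(&kclist_lock);  /* expensive, but rare (hotplug) */
            /* ... modify the list ... */
            percpu_up_write(&kclist_lock);
    }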

# 680e029f | 09-Nov-2024 | Omar Sandoval <[email protected]>

proc/kcore: don't walk list on every read
We maintain a list of memory ranges for /proc/kcore, which usually has 10-20 entries. Currently, every single read from /proc/kcore walks the entire list in order to count the number of entries and compute some offsets. These values only change when the list of memory ranges changes, which is very rare (only when memory is hot(un)plugged). We can cache the values when the list is populated to avoid these redundant walks.
In my benchmark, this reduces the time per read by another 20 nanoseconds on top of the previous change, from 215 nanoseconds per read to 195.
Link: https://github.com/osandov/drgn/issues/106 Signed-off-by: Omar Sandoval <[email protected]> Link: https://lore.kernel.org/r/8d945558b9c9efe74103a34b7780f1cd90d9ce7f.1731115587.git.osandov@fb.com Signed-off-by: Christian Brauner <[email protected]>
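
A rough sketch of the caching idea with made-up variable names; the point is that the totals are recomputed only when the list changes, not on every read.

    #include <linux/kcore.h>
    #include <linux/list.h>

    /* Illustrative names; not the exact fields used in fs/proc/kcore.c. */
    static LIST_HEAD(kcore_head);
    static int kcore_nphdr;          /* cached number of program headers */

    /*
     * Called only when entries are added or removed (memory hotplug), under
     * the write lock, so readers never walk the list just to count it.
     */
    static void update_kcore_cache(void)
    {
            struct kcore_list *m;
            int nphdr = 1;           /* one PT_NOTE header, plus one per entry */

            list_for_each_entry(m, &kcore_head, list)
                    nphdr++;

            kcore_nphdr = nphdr;
            /* offsets and sizes derived from nphdr would be cached here too */
    }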

# c9136fad | 09-Nov-2024 | Omar Sandoval <[email protected]>

proc/kcore: mark proc entry as permanent
drgn reads from /proc/kcore to debug the running kernel. For many drgn scripts, /proc/kcore is actually a bottleneck.
use_pde() and unuse_pde() in prog_reg_read() show up hot in profiles. Since the entry for /proc/kcore can never be removed, this is useless overhead that can be trivially avoided by marking the entry as permanent.
In my benchmark, this reduces the time per read by about 20 nanoseconds, from 235 nanoseconds per read to 215.
Link: https://github.com/osandov/drgn/issues/106 Signed-off-by: Omar Sandoval <[email protected]> Link: https://lore.kernel.org/r/60873e6afcfda3f08d0456f19e4733612afcf134.1731115587.git.osandov@fb.com Signed-off-by: Christian Brauner <[email protected]>
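
A sketch of what "permanent" means in practice, assuming the pde_make_permanent() helper from fs/proc/internal.h; the proc_ops and error handling are simplified.

    #include <linux/errno.h>
    #include <linux/init.h>
    #include <linux/proc_fs.h>
    #include "internal.h"            /* fs/proc-private: pde_make_permanent() */

    static const struct proc_ops kcore_proc_ops = {
            /* .proc_read_iter = read_kcore_iter, ... (elided) */
    };

    static int __init proc_kcore_init(void)
    {
            struct proc_dir_entry *pde;

            pde = proc_create("kcore", S_IRUSR, NULL, &kcore_proc_ops);
            if (!pde)
                    return -ENOMEM;
            /* /proc/kcore can never be removed, so skip the use_pde() /
             * unuse_pde() refcounting on every read by marking it permanent. */
            pde_make_permanent(pde);
            return 0;
    }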

# 088f2946 | 21-Nov-2024 | Jiri Olsa <[email protected]>

fs/proc/kcore.c: Clear ret value in read_kcore_iter after successful iov_iter_zero
If iov_iter_zero succeeds after a failed copy_from_kernel_nofault, we need to reset the ret value to zero; otherwise it will be returned as the final return value of read_kcore_iter.
This fixes objdump -d dump over /proc/kcore for me.
Cc: [email protected] Cc: Alexander Gordeev <[email protected]> Fixes: 3d5854d75e31 ("fs/proc/kcore.c: allow translation of physical memory addresses") Signed-off-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Alexander Gordeev <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
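
The shape of the fix as a hedged sketch of the fallback path; buf/start/tsz/iter roughly mirror read_kcore_iter()'s locals, and the successful-copy path (pushing buf into the iterator) is omitted.

    #include <linux/errno.h>
    #include <linux/uaccess.h>
    #include <linux/uio.h>

    static ssize_t copy_or_zero(void *buf, unsigned long start, size_t tsz,
                                struct iov_iter *iter)
    {
            ssize_t ret = copy_from_kernel_nofault(buf, (void *)start, tsz);

            if (ret) {
                    /* Faulting source: hand userspace zeros instead. */
                    if (iov_iter_zero(tsz, iter) != tsz)
                            return -EFAULT;
                    ret = 0;        /* zero-fill worked; clear the old error */
            }
            return ret;
    }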

Revision tags: v6.12-rc6

# 82e33f24 | 29-Oct-2024 | Mirsad Todorovac <[email protected]>

fs/proc/kcore.c: fix coccinelle reported ERROR instances
Coccinelle complains about the nested reuse of the pointer `iter' with different pointer type:
./fs/proc/kcore.c:515:26-30: ERROR: invalid reference to the index variable of the iterator on line 499
./fs/proc/kcore.c:534:23-27: ERROR: invalid reference to the index variable of the iterator on line 499
./fs/proc/kcore.c:550:40-44: ERROR: invalid reference to the index variable of the iterator on line 499
./fs/proc/kcore.c:568:27-31: ERROR: invalid reference to the index variable of the iterator on line 499
./fs/proc/kcore.c:581:28-32: ERROR: invalid reference to the index variable of the iterator on line 499
./fs/proc/kcore.c:599:27-31: ERROR: invalid reference to the index variable of the iterator on line 499
./fs/proc/kcore.c:607:38-42: ERROR: invalid reference to the index variable of the iterator on line 499
./fs/proc/kcore.c:614:26-30: ERROR: invalid reference to the index variable of the iterator on line 499
Replacing `struct kcore_list *iter' with `struct kcore_list *tmp' doesn't change the scope and the functionality is the same and coccinelle seems happy.
NOTE: There was an issue with using `struct kcore_list *pos' as the nested iterator. The build did not work!
[[email protected]: s/tmp/pos/] Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/ [1] Link: https://lkml.kernel.org/r/[email protected] Fixes: 04d168c6d42d ("fs/proc/kcore.c: remove check of list iterator against head past the loop body") Signed-off-by: Jakob Koschel <[email protected]> Signed-off-by: Mirsad Todorovac <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: "Brian Johannesmeyer" <[email protected]> Cc: Cristiano Giuffrida <[email protected]> Cc: "Bos, H.J." <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Yang Li <[email protected]> Cc: Baoquan He <[email protected]> Cc: Hari Bathini <[email protected]> Cc: Yan Zhen <[email protected]> Cc: Alexander Gordeev <[email protected]> Signed-off-by: Andrew Morton <[email protected]>

Revision tags: v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2

# 3d5854d7 | 30-Sep-2024 | Alexander Gordeev <[email protected]>

fs/proc/kcore.c: allow translation of physical memory addresses
When /proc/kcore is read, an attempt to read the first two pages results in a HW-specific page swap on s390, and other (so-called prefix) pages are accessed instead. That leads to a wrong read.
Allow architecture-specific translation of memory addresses using kc_xlate_dev_mem_ptr() and kc_unxlate_dev_mem_ptr() callbacks, similarly to the /dev/mem xlate_dev_mem_ptr() and unxlate_dev_mem_ptr() callbacks. That way an architecture can deal with specific physical memory ranges.
Re-use the existing /dev/mem callback implementation on s390, which handles the described prefix pages swapping correctly.
For other architectures the default callback is basically NOP. It is expected the condition (vaddr == __va(__pa(vaddr))) always holds true for KCORE_RAM memory type.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Alexander Gordeev <[email protected]> Suggested-by: Heiko Carstens <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
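
A sketch of the callback shape described above; the default is assumed to be an identity-style fallback that architectures such as s390 can override, and the exact macro/guard details may differ from the actual patch.

    #include <linux/mm.h>
    #include <asm/io.h>             /* an arch override could live here */

    /*
     * Assumed default (no-op) translation: architectures that need a
     * different mapping (e.g. s390 prefix pages) provide their own
     * kc_xlate_dev_mem_ptr()/kc_unxlate_dev_mem_ptr() instead.
     */
    #ifndef kc_xlate_dev_mem_ptr
    #define kc_xlate_dev_mem_ptr kc_xlate_dev_mem_ptr
    static inline void *kc_xlate_dev_mem_ptr(phys_addr_t phys)
    {
            return __va(phys);
    }
    #endif

    #ifndef kc_unxlate_dev_mem_ptr
    #define kc_unxlate_dev_mem_ptr kc_unxlate_dev_mem_ptr
    static inline void kc_unxlate_dev_mem_ptr(phys_addr_t phys, void *virt)
    {
            /* nothing to undo for the identity mapping */
    }
    #endif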

Revision tags: v6.12-rc1, v6.11

# 698e7d16 | 09-Sep-2024 | Yan Zhen <[email protected]>

proc: Fix typo in the comment
The 'deference' here is confusing.
Maybe it is meant to say that because show_fd_locks() does not dereference the files pointer, using the stale value of the files pointer is safe.
Correctly spelled comments make it easier for the reader to understand the code.
Replace 'deferences' with 'dereferences' and 'inialized' with 'initialized' in the comments.
Signed-off-by: Yan Zhen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Brauner <[email protected]>

Revision tags: v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2

# 443cbaf9 | 24-Jan-2024 | Baoquan He <[email protected]>

crash: split vmcoreinfo exporting code out from crash_core.c
Now move the relevant code into separate files: kernel/crash_reserve.c and include/linux/crash_reserve.h.
Add a config item, CRASH_RESERVE, to control its enabling.
Also update the old CONFIG_CRASH_CORE ifdeffery, the includes of <linux/crash_core.h>, and the config item dependencies on CRASH_CORE accordingly.
Also rename arch/xxx/kernel/{crash_core.c => vmcore_info.c}, since these files only relate to vmcoreinfo exporting on x86, arm64 and riscv.
Finally, remove the config item CRASH_CORE and rely on CONFIG_KEXEC_CORE to decide whether crash_core.c is built.
[[email protected]: remove duplicated include in vmcore_info.c] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Baoquan He <[email protected]> Signed-off-by: Yang Li <[email protected]> Acked-by: Hari Bathini <[email protected]> Cc: Al Viro <[email protected]> Cc: Eric W. Biederman <[email protected]> Cc: Pingfan Liu <[email protected]> Cc: Klara Modin <[email protected]> Cc: Michael Kelley <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Yang Li <[email protected]> Signed-off-by: Andrew Morton <[email protected]>

Revision tags: v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2

# e538a582 | 11-Sep-2023 | Adrian Hunter <[email protected]>

proc/kcore: do not try to access unaccepted memory
Support for unaccepted memory was added recently, refer commit dcdfdd40fa82 ("mm: Add support for unaccepted memory"), whereby a virtual machine may need to accept memory before it can be used.
Do not try to access unaccepted memory because it can cause the guest to fail.
For /proc/kcore, which is read-only and does not support mmap, this means a read of unaccepted memory will return zeros.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Adrian Hunter <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Baoquan He <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Young <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Lorenzo Stoakes <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
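
A sketch of the read-side check: skip PFNs the guest has not accepted yet and hand back zeros instead. pfn_is_unaccepted_memory() is assumed to be the helper this change relies on, and tsz/iter approximate read_kcore_iter()'s locals.

    #include <linux/mm.h>
    #include <linux/uio.h>

    static bool kcore_skip_unaccepted(unsigned long pfn, size_t tsz,
                                      struct iov_iter *iter)
    {
            if (!pfn_is_unaccepted_memory(pfn))
                    return false;           /* safe to read normally */

            /* Touching unaccepted memory could crash a confidential guest. */
            iov_iter_zero(tsz, iter);
            return true;                    /* caller moves on to the next chunk */
    }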

Revision tags: v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5

# 17457784 | 31-Jul-2023 | Lorenzo Stoakes <[email protected]>

fs/proc/kcore: reinstate bounce buffer for KCORE_TEXT regions
Some architectures do not populate the entire range categorised by KCORE_TEXT, so we must ensure that the kernel address we read from is valid.
Unfortunately there is currently no way to do so with a purely iterator-based approach, so reinstate the bounce buffer in this instance so we can use copy_from_kernel_nofault() in order to avoid page faults when regions are unmapped.
This change partly reverts commit 2e1c0170771e ("fs/proc/kcore: avoid bounce buffer for ktext data"), reinstating the bounce buffer, but adapts the code to continue to use an iterator.
[[email protected]: correct comment to be strictly correct about reasoning] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: 2e1c0170771e ("fs/proc/kcore: avoid bounce buffer for ktext data") Signed-off-by: Lorenzo Stoakes <[email protected]> Reported-by: Jiri Olsa <[email protected]> Closes: https://lore.kernel.org/all/ZHc2fm+9daF6cgCE@krava Tested-by: Jiri Olsa <[email protected]> Tested-by: Will Deacon <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Baoquan He <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Liu Shixin <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Thorsten Leemhuis <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
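
The pattern being reinstated, roughly: copy through a kernel bounce buffer with the no-fault primitive, then feed the iterator. Names and the surrounding loop are simplified.

    #include <linux/errno.h>
    #include <linux/uaccess.h>
    #include <linux/uio.h>

    /*
     * Sketch of the KCORE_TEXT path: 'buf' is the per-open bounce buffer.
     * copy_from_kernel_nofault() tolerates unmapped holes in the text range;
     * a faulting chunk is replaced with zeros rather than erroring out.
     */
    static ssize_t read_ktext_chunk(void *buf, const void *src, size_t tsz,
                                    struct iov_iter *iter)
    {
            if (copy_from_kernel_nofault(buf, src, tsz)) {
                    if (iov_iter_zero(tsz, iter) != tsz)
                            return -EFAULT;
                    return 0;
            }
            if (copy_to_iter(buf, tsz, iter) != tsz)
                    return -EFAULT;
            return 0;
    }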

Revision tags: v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2

# 9e627588 | 10-May-2023 | Azeem Shaikh <[email protected]>

procfs: replace all non-returning strlcpy with strscpy
strlcpy() reads the entire source buffer first. This read may exceed the destination size limit. This is both inefficient and can lead to linear read overflows if a source string is not NUL-terminated [1]. In an effort to remove strlcpy() completely [2], replace strlcpy() here with strscpy(). No return values were used, so direct replacement is safe.
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy [2] https://github.com/KSPP/linux/issues/89
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Azeem Shaikh <[email protected]> Reviewed-by: Kees Cook <[email protected]> Reviewed-by: Alexey Dobriyan <[email protected]> Cc: Baoquan He <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Liu Shixin <[email protected]> Cc: Lorenzo Stoakes <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
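
The mechanical shape of the replacement as a generic sketch; dest/src are placeholders, not the actual call sites in fs/proc.

    #include <linux/string.h>

    static void copy_name(char *dest, size_t destsz, const char *src)
    {
            /* was: strlcpy(dest, src, destsz); -- reads all of src (possible
             * overread if src is not NUL-terminated) and returns strlen(src),
             * which nobody checked here. */
            strscpy(dest, src, destsz);   /* bounded read, always NUL-terminated */
    }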

Revision tags: v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4

# 9b2d38b4 | 29-Aug-2022 | Linus Walleij <[email protected]>

fs/proc/kcore.c: Pass a pointer to virt_addr_valid()
virt_addr_valid() should be passed a pointer; the current code, which passes an unsigned long, is just exploiting the unintentional polymorphism of these calls being implemented as preprocessor macros.
Signed-off-by: Linus Walleij <[email protected]>
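
A before/after sketch of the call-site change; the helper and variable names are illustrative only.

    #include <linux/mm.h>

    static bool addr_ok(unsigned long start)
    {
            /* was: virt_addr_valid(start) -- an integer, which only works
             * because the macro is implemented without type checking. */
            return virt_addr_valid((void *)start);   /* pass an actual pointer */
    }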

# 4c91c07c | 22-Mar-2023 | Lorenzo Stoakes <[email protected]>

mm: vmalloc: convert vread() to vread_iter()
Having previously laid the foundation for converting vread() to an iterator function, pull the trigger and do so.
This patch attempts to provide minimal refactoring and to reflect the existing logic as best we can, for example we continue to zero portions of memory not read, as before.
Overall, there should be no functional difference other than a performance improvement in /proc/kcore access to vmalloc regions.
Now we have eliminated the need for a bounce buffer in read_kcore_iter(), we dispense with it, and try to write to user memory optimistically but with faults disabled via copy_page_to_iter_nofault(). We already have preemption disabled by holding a spin lock. We continue faulting in until the operation is complete.
Additionally, we must account for the fact that at any point a copy may fail (most likely due to a fault not being able to occur); in that case we exit, indicating fewer bytes retrieved than expected.
[[email protected]: fix sparc64 warning] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: redo Stephen's sparc build fix] Link: https://lkml.kernel.org/r/8506cbc667c39205e65a323f750ff9c11a463798.1679566220.git.lstoakes@gmail.com [[email protected]: unbreak uio.h includes] Link: https://lkml.kernel.org/r/941f88bc5ab928e6656e1e2593b91bf0f8c81e1b.1679511146.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]> Reviewed-by: Baoquan He <[email protected]> Cc: Alexander Viro <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Liu Shixin <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
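
A hedged sketch of the optimistic-copy-then-fault-in loop described above; locking, offsets and error propagation are simplified and do not mirror the exact read_kcore_iter()/vread_iter() code.

    #include <linux/mm.h>
    #include <linux/uio.h>

    /* Copy a page to userspace with faults disabled (a spinlock may be held
     * by the caller's real counterpart); on a short copy, fault the
     * destination in and retry.  Returns the number of bytes copied. */
    static size_t copy_page_retry(struct page *page, size_t len,
                                  struct iov_iter *iter)
    {
            size_t done = 0;

            while (done < len) {
                    size_t copied = copy_page_to_iter_nofault(page, done,
                                                              len - done, iter);
                    done += copied;
                    if (done == len)
                            break;
                    /* Destination not faulted in: fault it in (with no
                     * spinlocks held at this point) and try again; if that
                     * also fails, report a partial read. */
                    if (fault_in_iov_iter_writeable(iter, len - done))
                            break;
            }
            return done;
    }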

# 46c0d6d0 | 22-Mar-2023 | Lorenzo Stoakes <[email protected]>

fs/proc/kcore: convert read_kcore() to read_kcore_iter()
For the time being we still use a bounce buffer for vread(), however in the next patch we will convert this to interact directly with the iterator and eliminate the bounce buffer altogether.
Link: https://lkml.kernel.org/r/ebe12c8d70eebd71f487d80095605f3ad0d1489c.1679511146.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Baoquan He <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Liu Shixin <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>

# 2e1c0170 | 22-Mar-2023 | Lorenzo Stoakes <[email protected]>

fs/proc/kcore: avoid bounce buffer for ktext data
Patch series "convert read_kcore(), vread() to use iterators", v8.
While reviewing Baoquan's recent changes to permit vread() access to vm_map_ram regions of vmalloc allocations, Willy pointed out [1] that it would be nice to refactor vread() as a whole, since its only user is read_kcore() and the existing form of vread() necessitates the use of a bounce buffer.
This patch series does exactly that, as well as adjusting how we read the kernel text section to avoid the use of a bounce buffer in this case as well.
This has been tested against the test case which motivated Baoquan's changes in the first place [2] which continues to function correctly, as do the vmalloc self tests.
This patch (of 4):
Commit df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data") introduced the use of a bounce buffer to retrieve kernel text data for /proc/kcore in order to avoid failures arising from hardened user copies enabled by CONFIG_HARDENED_USERCOPY in check_kernel_text_object().
We can avoid doing this if instead of copy_to_user() we use _copy_to_user() which bypasses the hardening check. This is more efficient than using a bounce buffer and simplifies the code.
We do so as part of an overall effort to eliminate bounce buffer usage in the function, with an eye to converting it to an iterator read.
Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/Y8WfDSRkc%[email protected]/ [1] Link: https://lore.kernel.org/all/[email protected]/T/#u [2] Link: https://lkml.kernel.org/r/fd39b0bfa7edc76d360def7d034baaee71d90158.1679511146.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Baoquan He <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Liu Shixin <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
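
The before/after shape, sketched with the surrounding loop and offset handling omitted: _copy_to_user() skips only the hardened-usercopy object check that rejects kernel-text source objects.

    #include <linux/string.h>
    #include <linux/uaccess.h>

    /* Old scheme: bounce through a kernel buffer to satisfy
     * CONFIG_HARDENED_USERCOPY's check_kernel_text_object() check. */
    static int copy_ktext_old(void __user *buffer, const void *src,
                              void *bounce, size_t tsz)
    {
            memcpy(bounce, src, tsz);              /* extra copy */
            return copy_to_user(buffer, bounce, tsz) ? -EFAULT : 0;
    }

    /* New scheme: copy the text directly; no bounce buffer needed. */
    static int copy_ktext_new(void __user *buffer, const void *src, size_t tsz)
    {
            return _copy_to_user(buffer, src, tsz) ? -EFAULT : 0;
    }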

# e025ab84 | 18-Oct-2022 | Kefeng Wang <[email protected]>

mm: remove kern_addr_valid() completely
Most architectures (except arm64/x86/sparc) simply return 1 for kern_addr_valid(), which is only used in read_kcore(), and it calls copy_from_kernel_nofault() which could check whether the address is a valid kernel address. So as there is no need for kern_addr_valid(), let's remove it.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kefeng Wang <[email protected]> Acked-by: Geert Uytterhoeven <[email protected]> [m68k] Acked-by: Heiko Carstens <[email protected]> [s390] Acked-by: Christoph Hellwig <[email protected]> Acked-by: Helge Deller <[email protected]> [parisc] Acked-by: Michael Ellerman <[email protected]> [powerpc] Acked-by: Guo Ren <[email protected]> [csky] Acked-by: Catalin Marinas <[email protected]> [arm64] Cc: Alexander Gordeev <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Anton Ivanov <[email protected]> Cc: <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Chris Zankel <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David S. Miller <[email protected]> Cc: Dinh Nguyen <[email protected]> Cc: Greg Ungerer <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ivan Kokshaysky <[email protected]> Cc: James Bottomley <[email protected]> Cc: Johannes Berg <[email protected]> Cc: Jonas Bonn <[email protected]> Cc: Matt Turner <[email protected]> Cc: Max Filippov <[email protected]> Cc: Michal Simek <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Richard Henderson <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Rich Felker <[email protected]> Cc: Russell King <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Stefan Kristiansson <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Will Deacon <[email protected]> Cc: Xuerui Wang <[email protected]> Cc: Yoshinori Sato <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
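
A sketch of why the separate check is unnecessary: the no-fault copy already fails cleanly on bad kernel addresses, so the explicit validity check adds nothing.

    #include <linux/errno.h>
    #include <linux/uaccess.h>

    /* Before (conceptually):
     *     if (!kern_addr_valid(addr))
     *             return -ENXIO;
     *     copy_from_kernel_nofault(buf, (void *)addr, len);
     *
     * After: just attempt the copy; an invalid address makes it return an
     * error instead of faulting. */
    static int read_kernel_range(void *buf, unsigned long addr, size_t len)
    {
            return copy_from_kernel_nofault(buf, (void *)addr, len) ? -EFAULT : 0;
    }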

# 1eeaa4fd | 23-Sep-2022 | Liu Shixin <[email protected]>

memory: move hotplug memory notifier priority to same file for easy sorting
The priority of hotplug memory callback is defined in a different file. And there are some callers using numbers directly. Collect them together into include/linux/memory.h for easy reading. This allows us to sort their priorities more intuitively without additional comments.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liu Shixin <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Waiman Long <[email protected]> Cc: zefan li <[email protected]> Signed-off-by: Andrew Morton <[email protected]>

# 5d89c224 | 23-Sep-2022 | Liu Shixin <[email protected]>

fs/proc/kcore.c: use hotplug_memory_notifier() directly
Commit 76ae847497bc52 ("Documentation: raise minimum supported version of GCC to 5.1") updated the minimum gcc version to 5.1, so the problem mentioned in f02c69680088 ("include/linux/memory.h: implement register_hotmemory_notifier()") no longer exists. We can now switch to using hotplug_memory_notifier() directly rather than register_hotmemory_notifier().
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liu Shixin <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Waiman Long <[email protected]> Cc: zefan li <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
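
The resulting registration pattern, sketched with an illustrative callback; the priority value (0 here) and the callback body are placeholders, not the exact kcore code.

    #include <linux/init.h>
    #include <linux/memory.h>
    #include <linux/notifier.h>

    /* Illustrative callback: react to memory hotplug events. */
    static int kcore_callback(struct notifier_block *self, unsigned long action,
                              void *arg)
    {
            switch (action) {
            case MEM_ONLINE:
            case MEM_OFFLINE:
                    /* ... mark the KCORE_RAM entries for re-population ... */
                    break;
            }
            return NOTIFY_OK;
    }

    static int __init register_kcore_notifier(void)
    {
            /* hotplug_memory_notifier() builds and registers the
             * notifier_block itself; the explicit priority argument replaces
             * the old magic numbers scattered at call sites. */
            return hotplug_memory_notifier(kcore_callback, 0);
    }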

Revision tags: v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5

# 04d168c6 | 29-Apr-2022 | Jakob Koschel <[email protected]>

fs/proc/kcore.c: remove check of list iterator against head past the loop body
When list_for_each_entry() completes the iteration over the whole list without breaking the loop, the iterator value will be a bogus pointer computed based on the head element.
While it is safe to use the pointer to determine if it was computed based on the head element, either with list_entry_is_head() or &pos->member == head, using the iterator variable after the loop should be avoided.
In preparation to limit the scope of a list iterator to the list traversal loop, use a dedicated pointer to point to the found element [1].
[[email protected]: reduce scope of `iter'] Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/ [1] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jakob Koschel <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: "Brian Johannesmeyer" <[email protected]> Cc: Cristiano Giuffrida <[email protected]> Cc: "Bos, H.J." <[email protected]> Cc: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
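
The pattern in isolation, as a generic sketch: remember the match in a dedicated pointer and never touch the loop iterator after the loop. The matches() predicate is hypothetical.

    #include <linux/kcore.h>
    #include <linux/list.h>

    bool matches(const struct kcore_list *m);   /* hypothetical predicate */

    static struct kcore_list *find_entry(struct list_head *head)
    {
            struct kcore_list *entry = NULL, *pos;

            list_for_each_entry(pos, head, list) {
                    if (matches(pos)) {
                            entry = pos;
                            break;
                    }
            }
            /* 'pos' may be a bogus head-based pointer here; only use 'entry'. */
            return entry;
    }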

Revision tags: v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1

# c6d9eee2 | 01-Jul-2021 | David Hildenbrand <[email protected]>

fs/proc/kcore: use page_offline_(freeze|thaw)
Let's properly synchronize with drivers that set PageOffline(). Unfreeze/thaw every now and then, so drivers that want to set PageOffline() can make progress.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Mike Rapoport <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Cc: Aili Yao <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Alex Shi <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Jason Wang <[email protected]> Cc: Jiri Bohac <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: "Michael S. Tsirkin" <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Steven Price <[email protected]> Cc: Wei Liu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
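
A sketch of the freeze/thaw cadence described above; the chunking interval and loop body are invented for illustration.

    #include <linux/page-flags.h>
    #include <linux/sched.h>

    /* Hold the PageOffline() state stable while inspecting pages, but
     * periodically let drivers that want to set PageOffline() make progress. */
    static void walk_pages(unsigned long start_pfn, unsigned long nr_pages)
    {
            unsigned long pfn;

            page_offline_freeze();
            for (pfn = start_pfn; pfn < start_pfn + nr_pages; pfn++) {
                    if ((pfn - start_pfn) % 1024 == 0 && pfn != start_pfn) {
                            page_offline_thaw();    /* every now and then ... */
                            cond_resched();
                            page_offline_freeze();
                    }
                    /* ... examine pfn under the frozen PageOffline() state ... */
            }
            page_offline_thaw();
    }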

# 0daa322b | 01-Jul-2021 | David Hildenbrand <[email protected]>

fs/proc/kcore: don't read offline sections, logically offline pages and hwpoisoned pages
Let's avoid reading:
1) Offline memory sections: the content of offline memory sections is stale as the memory is effectively unused by the kernel. On s390x with standby memory, offline memory sections (belonging to offline storage increments) are not accessible. With virtio-mem and the hyper-v balloon, we can have unavailable memory chunks that should not be accessed inside offline memory sections. Last but not least, offline memory sections might contain hwpoisoned pages which we can no longer identify because the memmap is stale.
2) PG_offline pages: logically offline pages that are documented as "The content of these pages is effectively stale. Such pages should not be touched (read/write/dump/save) except by their owner.". Examples include pages inflated in a balloon or unavailable memory ranges inside hotplugged memory sections with virtio-mem or the hyper-v balloon.
3) PG_hwpoison pages: Reading pages marked as hwpoisoned can be fatal. As documented: "Accessing is not safe since it may cause another machine check. Don't touch!"
Introduce is_page_hwpoison(), adding a comment that it is inherently racy but best we can really do.
Reading /proc/kcore now performs checks similar to those done when reading /proc/vmcore for kdump via makedumpfile: problematic pages are excluded. It's also similar to the hibernation code; however, we don't skip hwpoisoned pages when processing pages in kernel/power/snapshot.c:saveable_page() yet.
Note 1: we can race against memory offlining code, especially memory going offline and getting unplugged: however, we will properly tear down the identity mapping and handle faults gracefully when accessing this memory from kcore code.
Note 2: we can race against drivers setting PageOffline() and turning memory inaccessible in the hypervisor. We'll handle this in a follow-up patch.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Mike Rapoport <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Cc: Aili Yao <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Alex Shi <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Jason Wang <[email protected]> Cc: Jiri Bohac <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: "Michael S. Tsirkin" <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Steven Price <[email protected]> Cc: Wei Liu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
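
The three checks in sketch form, with names approximate; is_page_hwpoison() is the racy-but-best-effort helper this series introduces. The caller would zero-fill anything this predicate rejects.

    #include <linux/memory_hotplug.h>
    #include <linux/page-flags.h>

    /* Per-PFN policy sketch: anything offline, logically offline
     * (PageOffline) or hwpoisoned is replaced by zeros in the dump. */
    static bool kcore_pfn_readable(unsigned long pfn)
    {
            struct page *page = pfn_to_online_page(pfn);

            if (!page)                      /* offline section: memmap is stale */
                    return false;
            if (PageOffline(page))          /* balloon/virtio-mem etc.: stale */
                    return false;
            if (is_page_hwpoison(page))     /* reading may trigger another MCE */
                    return false;
            return true;
    }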

# 2711032c | 01-Jul-2021 | David Hildenbrand <[email protected]>

fs/proc/kcore: pfn_is_ram check only applies to KCORE_RAM
Let's restructure the code, using switch-case, and check pfn_is_ram() only when we are dealing with KCORE_RAM.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Mike Rapoport <[email protected]> Cc: Aili Yao <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Alex Shi <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Jason Wang <[email protected]> Cc: Jiri Bohac <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: "Michael S. Tsirkin" <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Steven Price <[email protected]> Cc: Wei Liu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>

# 3c36b419 | 01-Jul-2021 | David Hildenbrand <[email protected]>

fs/proc/kcore: drop KCORE_REMAP and KCORE_OTHER
Patch series "fs/proc/kcore: don't read offline sections, logically offline pages and hwpoisoned pages", v3.
Looking for places where the kernel might unconditionally read PageOffline() pages, I stumbled over /proc/kcore; turns out /proc/kcore needs some more love to not touch some other pages we really don't want to read -- i.e., hwpoisoned ones.
Examples for PageOffline() pages are pages inflated in a balloon, memory unplugged via virtio-mem, and partially-present sections in memory added by the Hyper-V balloon.
When reading pages inflated in a balloon, we essentially produce unnecessary load in the hypervisor; holes in partially present sections in case of Hyper-V are not accessible and already were a problem for /proc/vmcore, fixed in makedumpfile by detecting PageOffline() pages. In the future, virtio-mem might disallow reading unplugged memory -- marked as PageOffline() -- in some environments, resulting in undefined behavior when accessed; therefore, I'm trying to identify and rework all these (corner) cases.
With this series, there is really only access via /dev/mem, /proc/vmcore and kdb left after I ripped out /dev/kmem. kdb is an advanced corner-case use case -- we won't care for now if someone explicitly tries to do nasty things by reading from/writing to physical addresses we better not touch. /dev/mem is a use case we won't support for virtio-mem, at least for now, so we'll simply disallow mapping any virtio-mem memory via /dev/mem next. /proc/vmcore is really only a problem when dumping the old kernel via something that's not makedumpfile (read: basically never), however, we'll try sanitizing that as well in the second kernel in the future.
Tested via kcore_dump: https://github.com/schlafwandler/kcore_dump
This patch (of 6):
Commit db779ef67ffe ("proc/kcore: Remove unused kclist_add_remap()") removed the last user of KCORE_REMAP.
Commit 595dd46ebfc1 ("vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user page") removed the last user of KCORE_OTHER.
Let's drop both types. While at it, also drop vaddr in "struct kcore_list", used by KCORE_REMAP only.
Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Mike Rapoport <[email protected]> Cc: "Michael S. Tsirkin" <[email protected]> Cc: Jason Wang <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Alex Shi <[email protected]> Cc: Steven Price <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Aili Yao <[email protected]> Cc: Jiri Bohac <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Wei Liu <[email protected]> Cc: Naoya Horiguchi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>

Revision tags: v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1

# 5e545df3 | 15-Dec-2020 | Mike Rapoport <[email protected]>

arm: remove CONFIG_ARCH_HAS_HOLES_MEMORYMODEL
ARM is the only architecture that defines CONFIG_ARCH_HAS_HOLES_MEMORYMODEL which in turn enables memmap_valid_within() function that is intended to verify existence of struct page associated with a pfn when there are holes in the memory map.
However, the ARCH_HAS_HOLES_MEMORYMODEL also enables HAVE_ARCH_PFN_VALID and arch-specific pfn_valid() implementation that also deals with the holes in the memory map.
The only two users of memmap_valid_within() call this function after a call to pfn_valid() so the memmap_valid_within() check becomes redundant.
Remove CONFIG_ARCH_HAS_HOLES_MEMORYMODEL and memmap_valid_within() and rely entirely on ARM's implementation of pfn_valid() that is now enabled unconditionally.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Greg Ungerer <[email protected]> Cc: John Paul Adrian Glaubitz <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Matt Turner <[email protected]> Cc: Meelis Roos <[email protected]> Cc: Michael Schmitz <[email protected]> Cc: Russell King <[email protected]> Cc: Tony Luck <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>