|
Revision tags: v6.15, v6.15-rc7 |
|
| #
1d6c39c8 |
| 13-May-2025 |
Steven Rostedt <[email protected]> |
ring-buffer: Fix persistent buffer when commit page is the reader page
The ring buffer is made up of sub buffers (sometimes called pages as they are by default PAGE_SIZE). It has the following "page
ring-buffer: Fix persistent buffer when commit page is the reader page
The ring buffer is made up of sub buffers (sometimes called pages as they are by default PAGE_SIZE). It has the following "pages":
"tail page" - this is the page that the next write will write to "head page" - this is the page that the reader will swap the reader page with. "reader page" - This belongs to the reader, where it will swap the head page from the ring buffer so that the reader does not race with the writer.
The writer may end up on the "reader page" if the ring buffer hasn't written more than one page, where the "tail page" and the "head page" are the same.
The persistent ring buffer has meta data that points to where these pages exist so on reboot it can re-create the pointers to the cpu_buffer descriptor. But when the commit page is on the reader page, the logic is incorrect.
The check to see if the commit page is on the reader page checked if the head page was the reader page, which would never happen, as the head page is always in the ring buffer. The correct check would be to test if the commit page is on the reader page. If that's the case, then it can exit out early as the commit page is only on the reader page when there's only one page of data in the buffer. There's no reason to iterate the ring buffer pages to find the "commit page" as it is already found.
To trigger this bug:
# echo 1 > /sys/kernel/tracing/instances/boot_mapped/events/syscalls/sys_enter_fchownat/enable # touch /tmp/x # chown sshd /tmp/x # reboot
On boot up, the dmesg will have: Ring buffer meta [0] is from previous boot! Ring buffer meta [1] is from previous boot! Ring buffer meta [2] is from previous boot! Ring buffer meta [3] is from previous boot! Ring buffer meta [4] commit page not found Ring buffer meta [5] is from previous boot! Ring buffer meta [6] is from previous boot! Ring buffer meta [7] is from previous boot!
Where the buffer on CPU 4 had a "commit page not found" error and that buffer is cleared and reset causing the output to be empty and the data lost.
When it works correctly, it has:
# cat /sys/kernel/tracing/instances/boot_mapped/trace_pipe <...>-1137 [004] ..... 998.205323: sys_enter_fchownat: __syscall_nr=0x104 (260) dfd=0xffffff9c (4294967196) filename=(0xffffc90000a0002c) user=0x3e8 (1000) group=0xffffffff (4294967295) flag=0x0 (0
Cc: [email protected] Cc: Mathieu Desnoyers <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: 5f3b6e839f3ce ("ring-buffer: Validate boot range memory events") Reported-by: Tasos Sahanidis <[email protected]> Tested-by: Tasos Sahanidis <[email protected]> Reviewed-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
|
Revision tags: v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1 |
|
| #
e4d4b867 |
| 02-Apr-2025 |
Steven Rostedt <[email protected]> |
ring-buffer: Use flush_kernel_vmap_range() over flush_dcache_folio()
Some architectures do not have data cache coherency between user and kernel space. For these architectures, the cache needs to be
ring-buffer: Use flush_kernel_vmap_range() over flush_dcache_folio()
Some architectures do not have data cache coherency between user and kernel space. For these architectures, the cache needs to be flushed on both the kernel and user addresses so that user space can see the updates the kernel has made.
Instead of using flush_dcache_folio() and playing with virt_to_folio() within the call to that function, use flush_kernel_vmap_range() which takes the virtual address and does the work for those architectures that need it.
This also fixes a bug where the flush of the reader page only flushed one page. If the sub-buffer order is 1 or more, where the sub-buffer size would be greater than a page, it would miss the rest of the sub-buffer content, as the "reader page" is not just a page, but the size of a sub-buffer.
Link: https://lore.kernel.org/all/CAG48ez3w0my4Rwttbc5tEbNsme6tc0mrSN95thjXUFaJ3aQ6SA@mail.gmail.com/
Cc: [email protected] Cc: Linus Torvalds <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Vincent Donnefort <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Mike Rapoport <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: 117c39200d9d7 ("ring-buffer: Introducing ring-buffer mapping functions"); Suggested-by: Jann Horn <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
|
Revision tags: v6.14 |
|
| #
de48d7ff |
| 17-Mar-2025 |
Jiapeng Chong <[email protected]> |
ring-buffer: Remove the unused variable bmeta
Variable bmeta is not effectively used, so delete it.
kernel/trace/ring_buffer.c:1952:27: warning: variable ‘bmeta’ set but not used.
Link: https://lo
ring-buffer: Remove the unused variable bmeta
Variable bmeta is not effectively used, so delete it.
kernel/trace/ring_buffer.c:1952:27: warning: variable ‘bmeta’ set but not used.
Link: https://lore.kernel.org/[email protected] Reported-by: Abaci Robot <[email protected]> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=19524 Signed-off-by: Jiapeng Chong <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc7, v6.14-rc6 |
|
| #
b6533482 |
| 05-Mar-2025 |
Steven Rostedt <[email protected]> |
tracing: Have persistent trace instances save KASLR offset
There's no reason to save the KASLR offset for the ring buffer itself. That is used by the tracer. Now that the tracer has a way to save da
tracing: Have persistent trace instances save KASLR offset
There's no reason to save the KASLR offset for the ring buffer itself. That is used by the tracer. Now that the tracer has a way to save data in the persistent memory of the ring buffer, have the tracing infrastructure take care of the saving of the KASLR offset.
Cc: Mark Rutland <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Andrew Morton <[email protected]> Link: https://lore.kernel.org/[email protected] Reviewed-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
| #
4af0a9c5 |
| 05-Mar-2025 |
Steven Rostedt <[email protected]> |
ring-buffer: Add ring_buffer_meta_scratch()
Now that there's one meta data at the start of the persistent memory used by the ring buffer, allow the caller to request some memory right after that dat
ring-buffer: Add ring_buffer_meta_scratch()
Now that there's one meta data at the start of the persistent memory used by the ring buffer, allow the caller to request some memory right after that data that it can use as its own persistent memory.
Also fix some white space issues with ring_buffer_alloc().
Cc: Mark Rutland <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Andrew Morton <[email protected]> Link: https://lore.kernel.org/[email protected] Reviewed-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
| #
4009cc31 |
| 05-Mar-2025 |
Steven Rostedt <[email protected]> |
ring-buffer: Add buffer meta data for persistent ring buffer
Instead of just having a meta data at the first page of each sub buffer that has duplicate data, add a new meta page to the entire block
ring-buffer: Add buffer meta data for persistent ring buffer
Instead of just having a meta data at the first page of each sub buffer that has duplicate data, add a new meta page to the entire block of memory that holds the duplicate data and remove it from the sub buffer meta data.
This will open up the extra memory in this first page to be used by the tracer for its own persistent data.
Cc: Mark Rutland <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Andrew Morton <[email protected]> Link: https://lore.kernel.org/[email protected] Reviewed-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
| #
bcba8d4d |
| 05-Mar-2025 |
Steven Rostedt <[email protected]> |
ring-buffer: Use kaslr address instead of text delta
Instead of saving off the text and data pointers and using them to compare with the current boot's text and data pointers, just save off the KASL
ring-buffer: Use kaslr address instead of text delta
Instead of saving off the text and data pointers and using them to compare with the current boot's text and data pointers, just save off the KASLR offset. Then that can be used to figure out how to read the previous boots buffer.
The last_boot_info will now show this offset, but only if it is for a previous boot:
~# cat instances/boot_mapped/last_boot_info 39000000 [kernel]
~# echo function > instances/boot_mapped/current_tracer ~# cat instances/boot_mapped/last_boot_info # Current
If the KASLR offset saved is for the current boot, the last_boot_info will show the value of "current".
Cc: Mark Rutland <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Andrew Morton <[email protected]> Link: https://lore.kernel.org/[email protected] Reviewed-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc5, v6.14-rc4 |
|
| #
c73f0b69 |
| 23-Feb-2025 |
Feng Yang <[email protected]> |
ring-buffer: Fix bytes_dropped calculation issue
The calculation of bytes-dropped and bytes_dropped_nested is reversed. Although it does not affect the final calculation of total_dropped, it should
ring-buffer: Fix bytes_dropped calculation issue
The calculation of bytes-dropped and bytes_dropped_nested is reversed. Although it does not affect the final calculation of total_dropped, it should still be modified.
Link: https://lore.kernel.org/[email protected] Fixes: 6c43e554a2a5 ("ring-buffer: Add ring buffer startup selftest") Signed-off-by: Feng Yang <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13 |
|
| #
3ca4d7af |
| 18-Jan-2025 |
Zhouyi Zhou <[email protected]> |
ring-buffer: Fix typo in comment about header page pointer
Fix typo in comment about header page pointer in function rb_get_reader_page.
Link: https://lore.kernel.org/20250118012352.3430519-1-zhouz
ring-buffer: Fix typo in comment about header page pointer
Fix typo in comment about header page pointer in function rb_get_reader_page.
Link: https://lore.kernel.org/[email protected] Signed-off-by: Zhouyi Zhou <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
| #
97937834 |
| 14-Feb-2025 |
Steven Rostedt <[email protected]> |
ring-buffer: Update pages_touched to reflect persistent buffer content
The pages_touched field represents the number of subbuffers in the ring buffer that have content that can be read. This is used
ring-buffer: Update pages_touched to reflect persistent buffer content
The pages_touched field represents the number of subbuffers in the ring buffer that have content that can be read. This is used in accounting of "dirty_pages" and "buffer_percent" to allow the user to wait for the buffer to be filled to a certain amount before it reads the buffer in blocking mode.
The persistent buffer never updated this value so it was set to zero, and this accounting would take it as it had no content. This would cause user space to wait for content even though there's enough content in the ring buffer that satisfies the buffer_percent.
Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Vincent Donnefort <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: 5f3b6e839f3ce ("ring-buffer: Validate boot range memory events") Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
| #
f5b95f1f |
| 14-Feb-2025 |
Steven Rostedt <[email protected]> |
ring-buffer: Validate the persistent meta data subbuf array
The meta data for a mapped ring buffer contains an array of indexes of all the subbuffers. The first entry is the reader page, and the res
ring-buffer: Validate the persistent meta data subbuf array
The meta data for a mapped ring buffer contains an array of indexes of all the subbuffers. The first entry is the reader page, and the rest of the entries lay out the order of the subbuffers in how the ring buffer link list is to be created.
The validator currently makes sure that all the entries are within the range of 0 and nr_subbufs. But it does not check if there are any duplicates.
While working on the ring buffer, I corrupted this array, where I added duplicates. The validator did not catch it and created the ring buffer link list on top of it. Luckily, the corruption was only that the reader page was also in the writer path and only presented corrupted data but did not crash the kernel. But if there were duplicates in the writer side, then it could corrupt the ring buffer link list and cause a crash.
Create a bitmask array with the size of the number of subbuffers. Then clear it. When walking through the subbuf array checking to see if the entries are within the range, test if its bit is already set in the subbuf_mask. If it is, then there is duplicates and fail the validation. If not, set the corresponding bit and continue.
Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Vincent Donnefort <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: c76883f18e59b ("ring-buffer: Add test if range of boot buffer is valid") Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
| #
9ba0e175 |
| 13-Feb-2025 |
Steven Rostedt <[email protected]> |
ring-buffer: Unlock resize on mmap error
Memory mapping the tracing ring buffer will disable resizing the buffer. But if there's an error in the memory mapping like an invalid parameter, the functio
ring-buffer: Unlock resize on mmap error
Memory mapping the tracing ring buffer will disable resizing the buffer. But if there's an error in the memory mapping like an invalid parameter, the function exits out without re-enabling the resizing of the ring buffer, preventing the ring buffer from being resized after that.
Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Vincent Donnefort <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: 117c39200d9d7 ("ring-buffer: Introducing ring-buffer mapping functions") Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
| #
cd2375a3 |
| 20-Jan-2025 |
Steven Rostedt <[email protected]> |
ring-buffer: Do not allow events in NMI with generic atomic64 cmpxchg()
Some architectures can not safely do atomic64 operations in NMI context. Since the ring buffer relies on atomic64 operations t
ring-buffer: Do not allow events in NMI with generic atomic64 cmpxchg()
Some architectures can not safely do atomic64 operations in NMI context. Since the ring buffer relies on atomic64 operations to do its time keeping, if an event is requested in NMI context, reject it for these architectures.
Cc: [email protected] Cc: Mark Rutland <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Andreas Larsson <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: c84897c0ff592 ("ring-buffer: Remove 32bit timestamp logic") Closes: https://lore.kernel.org/all/[email protected]/ Reported-by: Ludwig Rydberg <[email protected]> Reviewed-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc7 |
|
| #
6e31b759 |
| 10-Jan-2025 |
Jeongjun Park <[email protected]> |
ring-buffer: Make reading page consistent with the code logic
In the loop of __rb_map_vma(), the 's' variable is calculated from the same logic that nr_pages is and they both come from nr_subbufs. B
ring-buffer: Make reading page consistent with the code logic
In the loop of __rb_map_vma(), the 's' variable is calculated from the same logic that nr_pages is and they both come from nr_subbufs. But the relationship is not obvious and there's a WARN_ON_ONCE() around the 's' variable to make sure it never becomes equal to nr_subbufs within the loop. If that happens, then the code is buggy and needs to be fixed.
The 'page' variable is calculated from cpu_buffer->subbuf_ids[s] which is an array of 'nr_subbufs' entries. If the code becomes buggy and 's' becomes equal to or greater than 'nr_subbufs' then this will be an out of bounds hit before the WARN_ON() is triggered and the code exiting safely.
Make the 'page' initialization consistent with the code logic and assign it after the out of bounds check.
Link: https://lore.kernel.org/[email protected] Signed-off-by: Jeongjun Park <[email protected]> [ sdr: rewrote change log ] Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
| #
0568c6eb |
| 08-Jan-2025 |
Vincent Donnefort <[email protected]> |
ring-buffer: Check for empty ring-buffer with rb_num_of_entries()
Currently there are two ways of identifying an empty ring-buffer. One relying on the current status of the commit / reader page (rb_
ring-buffer: Check for empty ring-buffer with rb_num_of_entries()
Currently there are two ways of identifying an empty ring-buffer. One relying on the current status of the commit / reader page (rb_per_cpu_empty()) and the other on the write and read counters (rb_num_of_entries() used in rb_get_reader_page()).
with rb_num_of_entries(). This intends to ease later introduction of ring-buffer writers which are out of the kernel control and with whom, the only information available is through the meta-page counters.
Link: https://lore.kernel.org/[email protected] Signed-off-by: Vincent Donnefort <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc6, v6.13-rc5, v6.13-rc4 |
|
| #
c58a812c |
| 18-Dec-2024 |
Edward Adam Davis <[email protected]> |
ring-buffer: Fix overflow in __rb_map_vma
An overflow occurred when performing the following calculation:
nr_pages = ((nr_subbufs + 1) << subbuf_order) - pgoff;
Add a check before the calculati
ring-buffer: Fix overflow in __rb_map_vma
An overflow occurred when performing the following calculation:
nr_pages = ((nr_subbufs + 1) << subbuf_order) - pgoff;
Add a check before the calculation to avoid this problem.
syzbot reported this as a slab-out-of-bounds in __rb_map_vma:
BUG: KASAN: slab-out-of-bounds in __rb_map_vma+0x9ab/0xae0 kernel/trace/ring_buffer.c:7058 Read of size 8 at addr ffff8880767dd2b8 by task syz-executor187/5836
CPU: 0 UID: 0 PID: 5836 Comm: syz-executor187 Not tainted 6.13.0-rc2-syzkaller-00159-gf932fb9b4074 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/25/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xc3/0x620 mm/kasan/report.c:489 kasan_report+0xd9/0x110 mm/kasan/report.c:602 __rb_map_vma+0x9ab/0xae0 kernel/trace/ring_buffer.c:7058 ring_buffer_map+0x56e/0x9b0 kernel/trace/ring_buffer.c:7138 tracing_buffers_mmap+0xa6/0x120 kernel/trace/trace.c:8482 call_mmap include/linux/fs.h:2183 [inline] mmap_file mm/internal.h:124 [inline] __mmap_new_file_vma mm/vma.c:2291 [inline] __mmap_new_vma mm/vma.c:2355 [inline] __mmap_region+0x1786/0x2670 mm/vma.c:2456 mmap_region+0x127/0x320 mm/mmap.c:1348 do_mmap+0xc00/0xfc0 mm/mmap.c:496 vm_mmap_pgoff+0x1ba/0x360 mm/util.c:580 ksys_mmap_pgoff+0x32c/0x5c0 mm/mmap.c:542 __do_sys_mmap arch/x86/kernel/sys_x86_64.c:89 [inline] __se_sys_mmap arch/x86/kernel/sys_x86_64.c:82 [inline] __x64_sys_mmap+0x125/0x190 arch/x86/kernel/sys_x86_64.c:82 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f
The reproducer for this bug is:
------------------------8<------------------------- #include <fcntl.h> #include <stdlib.h> #include <unistd.h> #include <asm/types.h> #include <sys/mman.h>
int main(int argc, char **argv) { int page_size = getpagesize(); int fd; void *meta;
system("echo 1 > /sys/kernel/tracing/buffer_size_kb"); fd = open("/sys/kernel/tracing/per_cpu/cpu0/trace_pipe_raw", O_RDONLY);
meta = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, page_size * 5); } ------------------------>8-------------------------
Cc: [email protected] Fixes: 117c39200d9d7 ("ring-buffer: Introducing ring-buffer mapping functions") Link: https://lore.kernel.org/[email protected] Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=345e4443a21200874b18 Signed-off-by: Edward Adam Davis <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7 |
|
| #
537affea |
| 07-Nov-2024 |
liujing <[email protected]> |
ring-buffer: Correct a grammatical error in a comment
The word "trace" begins with a consonant sound, so "a" should be used instead of "an".
Link: https://lore.kernel.org/20241107095327.6390-1-liuj
ring-buffer: Correct a grammatical error in a comment
The word "trace" begins with a consonant sound, so "a" should be used instead of "an".
Link: https://lore.kernel.org/[email protected] Signed-off-by: liujing <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
| #
580bb355 |
| 14-Nov-2024 |
Steven Rostedt <[email protected]> |
Revert: "ring-buffer: Do not have boot mapped buffers hook to CPU hotplug"
A crash happened when testing cpu hotplug with respect to the memory mapped ring buffers. It was assumed that the hot plug
Revert: "ring-buffer: Do not have boot mapped buffers hook to CPU hotplug"
A crash happened when testing cpu hotplug with respect to the memory mapped ring buffers. It was assumed that the hot plug code was adding a per CPU buffer that was already created that caused the crash. The real problem was due to ref counting and was fixed by commit 2cf9733891a4 ("ring-buffer: Fix refcount setting of boot mapped buffers").
When a per CPU buffer is created, it will not be created again even with CPU hotplug, so the fix to not use CPU hotplug was a red herring. In fact, it caused only the boot CPU buffer to be created, leaving the other CPU per CPU buffers disabled.
Revert that change as it was not the culprit of the fix it was intended to be.
Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: 912da2c384d5 ("ring-buffer: Do not have boot mapped buffers hook to CPU hotplug") Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2 |
|
| #
0b60a7fb |
| 30-Sep-2024 |
Julia Lawall <[email protected]> |
ring-buffer: Reorganize kerneldoc parameter names
Reorganize kerneldoc parameter names to match the parameter order in the function header.
Problems identified using Coccinelle.
Cc: Masami Hiramat
ring-buffer: Reorganize kerneldoc parameter names
Reorganize kerneldoc parameter names to match the parameter order in the function header.
Problems identified using Coccinelle.
Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Link: https://lore.kernel.org/[email protected] Signed-off-by: Julia Lawall <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
| #
b237e1f7 |
| 15-Oct-2024 |
Petr Pavlu <[email protected]> |
ring-buffer: Limit time with disabled interrupts in rb_check_pages()
The function rb_check_pages() validates the integrity of a specified per-CPU tracing ring buffer. It does so by traversing the un
ring-buffer: Limit time with disabled interrupts in rb_check_pages()
The function rb_check_pages() validates the integrity of a specified per-CPU tracing ring buffer. It does so by traversing the underlying linked list and checking its next and prev links.
To guarantee that the list isn't modified during the check, a caller typically needs to take cpu_buffer->reader_lock. This prevents the check from running concurrently, for example, with a potential reader which can make the list temporarily inconsistent when swapping its old reader page into the buffer.
A problem with this approach is that the time when interrupts are disabled is non-deterministic, dependent on the ring buffer size. This particularly affects PREEMPT_RT because the reader_lock is a raw spinlock which doesn't become sleepable on PREEMPT_RT kernels.
Modify the check so it still attempts to traverse the entire list, but gives up the reader_lock between checking individual pages. Introduce for this purpose a new variable ring_buffer_per_cpu.cnt which is bumped any time the list is modified. The value is used by rb_check_pages() to detect such a change and restart the check.
Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Link: https://lore.kernel.org/[email protected] Signed-off-by: Petr Pavlu <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
| #
09661f75 |
| 15-Oct-2024 |
Petr Pavlu <[email protected]> |
ring-buffer: Fix reader locking when changing the sub buffer order
The function ring_buffer_subbuf_order_set() updates each ring_buffer_per_cpu and installs new sub buffers that match the requested
ring-buffer: Fix reader locking when changing the sub buffer order
The function ring_buffer_subbuf_order_set() updates each ring_buffer_per_cpu and installs new sub buffers that match the requested page order. This operation may be invoked concurrently with readers that rely on some of the modified data, such as the head bit (RB_PAGE_HEAD), or the ring_buffer_per_cpu.pages and reader_page pointers. However, no exclusive access is acquired by ring_buffer_subbuf_order_set(). Modifying the mentioned data while a reader also operates on them can then result in incorrect memory access and various crashes.
Fix the problem by taking the reader_lock when updating a specific ring_buffer_per_cpu in ring_buffer_subbuf_order_set().
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]/ Link: https://lore.kernel.org/linux-trace-kernel/[email protected]/ Link: https://lore.kernel.org/linux-trace-kernel/[email protected]/
Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: 8e7b58c27b3c ("ring-buffer: Just update the subbuffers when changing their allocation order") Signed-off-by: Petr Pavlu <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
| #
912da2c3 |
| 08-Oct-2024 |
Steven Rostedt <[email protected]> |
ring-buffer: Do not have boot mapped buffers hook to CPU hotplug
The boot mapped ring buffer has its buffer mapped at a fixed location found at boot up. It is not dynamic. It cannot grow or be expan
ring-buffer: Do not have boot mapped buffers hook to CPU hotplug
The boot mapped ring buffer has its buffer mapped at a fixed location found at boot up. It is not dynamic. It cannot grow or be expanded when new CPUs come online.
Do not hook fixed memory mapped ring buffers to the CPU hotplug callback, otherwise it can cause a crash when it tries to add the buffer to the memory that is already fully occupied.
Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: be68d63a139bd ("ring-buffer: Add ring_buffer_alloc_range()") Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6 |
|
| #
eb2dcde9 |
| 28-Jun-2024 |
Vincent Donnefort <[email protected]> |
ring-buffer: Align meta-page to sub-buffers for improved TLB usage
Previously, the mapped ring-buffer layout caused misalignment between the meta-page and sub-buffers when the sub-buffer size was no
ring-buffer: Align meta-page to sub-buffers for improved TLB usage
Previously, the mapped ring-buffer layout caused misalignment between the meta-page and sub-buffers when the sub-buffer size was not a multiple of PAGE_SIZE. This prevented hardware with larger TLB entries from utilizing them effectively.
Add a padding with the zero-page between the meta-page and sub-buffers. Also update the ring-buffer map_test to verify that padding.
Link: https://lore.kernel.org/[email protected] Signed-off-by: Vincent Donnefort <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
| #
d0f2d6e9 |
| 15-Aug-2024 |
Steven Rostedt <[email protected]> |
ring-buffer: Add magic and struct size to boot up meta data
Add a magic number as well as save the struct size of the ring_buffer_meta structure in the meta data to also use as validation. Updating
ring-buffer: Add magic and struct size to boot up meta data
Add a magic number as well as save the struct size of the ring_buffer_meta structure in the meta data to also use as validation. Updating the magic number could be used to force a invalidation between kernel versions, and saving the structure size is also a good method to make sure the content is what is expected.
Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Vincent Donnefort <[email protected]> Link: https://lore.kernel.org/[email protected] Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
| #
bca704f6 |
| 15-Aug-2024 |
Steven Rostedt <[email protected]> |
ring-buffer: Don't reset persistent ring-buffer meta saved addresses
The text and data address is saved in the meta data so that it can be used to know the delta of the text and data addresses of th
ring-buffer: Don't reset persistent ring-buffer meta saved addresses
The text and data address is saved in the meta data so that it can be used to know the delta of the text and data addresses of the last boot compared to the text and data addresses of the current boot. The delta is used to convert function pointer entries in the ring buffer to something that can be used by kallsyms (note this only works for built-in functions).
But the saved addresses get reset on boot up. If the buffer is not used and there's another reboot, then the saved text and data addresses will be of the last boot and not that of the boot that created the content in the ring buffer.
To get an idea of the issue:
# trace-cmd start -B boot_mapped -p function # reboot # trace-cmd show -B boot_mapped | tail <...>-1 [000] d..1. 461.983243: native_apic_msr_write <-native_kick_ap <...>-1 [000] d..1. 461.983244: __pfx_native_apic_msr_eoi <-native_kick_ap <...>-1 [000] d..1. 461.983244: reserve_irq_vector_locked <-native_kick_ap <...>-1 [000] d..1. 461.983262: branch_emulate_op <-native_kick_ap <...>-1 [000] d..1. 461.983262: __ia32_sys_ia32_pread64 <-native_kick_ap <...>-1 [000] d..1. 461.983263: native_kick_ap <-__smpboot_create_thread <...>-1 [000] d..1. 461.983263: store_cache_disable <-native_kick_ap <...>-1 [000] d..1. 461.983279: acpi_power_off_prepare <-native_kick_ap <...>-1 [000] d..1. 461.983280: __pfx_acpi_ns_delete_node <-acpi_suspend_enter <...>-1 [000] d..1. 461.983280: __pfx_acpi_os_release_lock <-acpi_suspend_enter # reboot # trace-cmd show -B boot_mapped |tail <...>-1 [000] d..1. 461.983243: 0xffffffffa9669220 <-0xffffffffa965f3db <...>-1 [000] d..1. 461.983244: 0xffffffffa96690f0 <-0xffffffffa965f3db <...>-1 [000] d..1. 461.983244: 0xffffffffa9663fa0 <-0xffffffffa965f3db <...>-1 [000] d..1. 461.983262: 0xffffffffa9672e80 <-0xffffffffa965f3e0 <...>-1 [000] d..1. 461.983262: 0xffffffffa962b940 <-0xffffffffa965f3ec <...>-1 [000] d..1. 461.983263: 0xffffffffa965f540 <-0xffffffffa96e1362 <...>-1 [000] d..1. 461.983263: 0xffffffffa963c940 <-0xffffffffa965f55b <...>-1 [000] d..1. 461.983279: 0xffffffffa9ee30c0 <-0xffffffffa965f59b <...>-1 [000] d..1. 461.983280: 0xffffffffa9f16c10 <-0xffffffffa9ee3157 <...>-1 [000] d..1. 461.983280: 0xffffffffa9ee02e0 <-0xffffffffa9ee3157
By not updating the saved text and data addresses in the meta data at every boot up and only updating them when the buffer is reset, it allows multiple boots to see the same data.
Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Vincent Donnefort <[email protected]> Link: https://lore.kernel.org/[email protected] Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|