|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2 |
|
| #
b1560408 |
| 30-Jul-2024 |
Steven Rostedt <[email protected]> |
tracing: Have format file honor EVENT_FILE_FL_FREED
When eventfs was introduced, special care had to be done to coordinate the freeing of the file meta data with the files that are exposed to user s
tracing: Have format file honor EVENT_FILE_FL_FREED
When eventfs was introduced, special care had to be done to coordinate the freeing of the file meta data with the files that are exposed to user space. The file meta data would have a ref count that is set when the file is created and would be decremented and freed after the last user that opened the file closed it. When the file meta data was to be freed, it would set a flag (EVENT_FILE_FL_FREED) to denote that the file is freed, and any new references made (like new opens or reads) would fail as it is marked freed. This allowed other meta data to be freed after this flag was set (under the event_mutex).
All the files that were dynamically created in the events directory had a pointer to the file meta data and would call event_release() when the last reference to the user space file was closed. This would be the time that it is safe to free the file meta data.
A shortcut was made for the "format" file. It's i_private would point to the "call" entry directly and not point to the file's meta data. This is because all format files are the same for the same "call", so it was thought there was no reason to differentiate them. The other files maintain state (like the "enable", "trigger", etc). But this meant if the file were to disappear, the "format" file would be unaware of it.
This caused a race that could be trigger via the user_events test (that would create dynamic events and free them), and running a loop that would read the user_events format files:
In one console run:
# cd tools/testing/selftests/user_events # while true; do ./ftrace_test; done
And in another console run:
# cd /sys/kernel/tracing/ # while true; do cat events/user_events/__test_event/format; done 2>/dev/null
With KASAN memory checking, it would trigger a use-after-free bug report (which was a real bug). This was because the format file was not checking the file's meta data flag "EVENT_FILE_FL_FREED", so it would access the event that the file meta data pointed to after the event was freed.
After inspection, there are other locations that were found to not check the EVENT_FILE_FL_FREED flag when accessing the trace_event_file. Add a new helper function: event_file_file() that will make sure that the event_mutex is held, and will return NULL if the trace_event_file has the EVENT_FILE_FL_FREED flag set. Have the first reference of the struct file pointer use event_file_file() and check for NULL. Later uses can still use the event_file_data() helper function if the event_mutex is still held and was not released since the event_file_file() call.
Link: https://lore.kernel.org/all/[email protected]/
Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Ajay Kaher <[email protected]> Cc: Ilkka Naulapää <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Al Viro <[email protected]> Cc: Dan Carpenter <[email protected]> Cc: Beau Belgrave <[email protected]> Cc: Florian Fainelli <[email protected]> Cc: Alexey Makhalov <[email protected]> Cc: Vasavi Sirnapalli <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: b63db58e2fa5d ("eventfs/tracing: Add callback for release of an eventfs_inode") Reported-by: Mathias Krause <[email protected]> Tested-by: Mathias Krause <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
|
Revision tags: v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1 |
|
| #
e5c624f0 |
| 07-Sep-2023 |
Steven Rostedt (Google) <[email protected]> |
tracing: Have event inject files inc the trace array ref count
The event inject files add events for a specific trace array. For an instance, if the file is opened and the instance is deleted, readi
tracing: Have event inject files inc the trace array ref count
The event inject files add events for a specific trace array. For an instance, if the file is opened and the instance is deleted, reading or writing to the file will cause a use after free.
Up the ref count of the trace_array when a event inject file is opened.
Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/[email protected]/
Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Zheng Yejian <[email protected]> Fixes: 6c3edaf9fd6a ("tracing: Introduce trace event injection") Tested-by: Linux Kernel Functional Testing <[email protected]> Tested-by: Naresh Kamboju <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
show more ...
|
|
Revision tags: v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3 |
|
| #
c7dce4c5 |
| 16-May-2023 |
Azeem Shaikh <[email protected]> |
tracing: Replace all non-returning strlcpy with strscpy
strlcpy() reads the entire source buffer first. This read may exceed the destination size limit. This is both inefficient and can lead to line
tracing: Replace all non-returning strlcpy with strscpy
strlcpy() reads the entire source buffer first. This read may exceed the destination size limit. This is both inefficient and can lead to linear read overflows if a source string is not NUL-terminated [1]. In an effort to remove strlcpy() completely [2], replace strlcpy() here with strscpy().
No return values were used, so direct replacement with strlcpy is safe.
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy [2] https://github.com/KSPP/linux/issues/89
Signed-off-by: Azeem Shaikh <[email protected]> Acked-by: Masami Hiramatsu (Google) <[email protected]> Reviewed-by: Kees Cook <[email protected]> Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3 |
|
| #
05770dd0 |
| 22-Nov-2021 |
Masami Hiramatsu <[email protected]> |
tracing: Support __rel_loc relative dynamic data location attribute
Add '__rel_loc' new dynamic data location attribute which encodes the data location from the next to the field itself.
The '__dat
tracing: Support __rel_loc relative dynamic data location attribute
Add '__rel_loc' new dynamic data location attribute which encodes the data location from the next to the field itself.
The '__data_loc' is used for encoding the dynamic data location on the trace event record. But '__data_loc' is not useful if the writer doesn't know the event header (e.g. user event), because it records the dynamic data offset from the entry of the record, not the field itself.
This new '__rel_loc' attribute encodes the data location relatively from the next of the field. For example, when there is a record like below (the number in the parentheses is the size of fields)
|header(N)|common(M)|fields(K)|__data_loc(4)|fields(L)|data(G)|
In this case, '__data_loc' field will be
__data_loc = (G << 16) | (N+M+K+4+L)
If '__rel_loc' is used, this will be
|header(N)|common(M)|fields(K)|__rel_loc(4)|fields(L)|data(G)|
where
__rel_loc = (G << 16) | (L)
This case shows L bytes after the '__rel_loc' attribute field, if there is no fields after the __rel_loc field, L must be 0.
This is relatively easy (and no need to consider the kernel header change) when the event data fields are composed by user who doesn't know header and common fields.
Link: https://lkml.kernel.org/r/163757341258.510314.4214431827833229956.stgit@devnote2
Cc: Beau Belgrave <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Tom Zanussi <[email protected]> Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
show more ...
|
|
Revision tags: v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6 |
|
| #
36590c50 |
| 25-Jan-2021 |
Sebastian Andrzej Siewior <[email protected]> |
tracing: Merge irqflags + preempt counter.
The state of the interrupts (irqflags) and the preemption counter are both passed down to tracing_generic_entry_update(). Only one bit of irqflags is actua
tracing: Merge irqflags + preempt counter.
The state of the interrupts (irqflags) and the preemption counter are both passed down to tracing_generic_entry_update(). Only one bit of irqflags is actually required: The on/off state. The complete 32bit of the preemption counter isn't needed. Just whether of the upper bits (softirq, hardirq and NMI) are set and the preemption depth is needed.
The irqflags and the preemption counter could be evaluated early and the information stored in an integer `trace_ctx'. tracing_generic_entry_update() would use the upper bits as the TRACE_FLAG_* and the lower 8bit as the disabled-preemption depth (considering that one must be substracted from the counter in one special cases).
The actual preemption value is not used except for the tracing record. The `irqflags' variable is mostly used only for the tracing record. An exception here is for instance wakeup_tracer_call() or probe_wakeup_sched_switch() which explicilty disable interrupts and use that `irqflags' to save (and restore) the IRQ state and to record the state.
Struct trace_event_buffer has also the `pc' and flags' members which can be replaced with `trace_ctx' since their actual value is not used outside of trace recording.
This will reduce tracing_generic_entry_update() to simply assign values to struct trace_entry. The evaluation of the TRACE_FLAG_* bits is moved to _tracing_gen_ctx_flags() which replaces preempt_count() and local_save_flags() invocations.
As an example, ftrace_syscall_enter() may invoke: - trace_buffer_lock_reserve() -> … -> tracing_generic_entry_update() - event_trigger_unlock_commit() -> ftrace_trace_stack() -> … -> tracing_generic_entry_update() -> ftrace_trace_userstack() -> … -> tracing_generic_entry_update()
In this case the TRACE_FLAG_* bits were evaluated three times. By using the `trace_ctx' they are evaluated once and assigned three times.
A build with all tracers enabled on x86-64 with and without the patch:
text data bss dec hex filename 21970669 17084168 7639260 46694097 2c87ed1 vmlinux.old 21970293 17084168 7639260 46693721 2c87d59 vmlinux.new
text shrank by 379 bytes, data remained constant.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
show more ...
|
|
Revision tags: v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5 |
|
| #
02f4e01c |
| 03-Jan-2020 |
Steven Rostedt (VMware) <[email protected]> |
tracing: Initialize val to zero in parse_entry of inject code
gcc produces a variable may be uninitialized warning for "val" in parse_entry(). This is really a false positive, but the code is subtle
tracing: Initialize val to zero in parse_entry of inject code
gcc produces a variable may be uninitialized warning for "val" in parse_entry(). This is really a false positive, but the code is subtle enough to just initialize val to zero and it's not a fast path to worry about it.
Marked for stable to remove the warning in the stable trees as well.
Cc: [email protected] Fixes: 6c3edaf9fd6a3 ("tracing: Introduce trace event injection") Reported-by: kbuild test robot <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
show more ...
|
|
Revision tags: v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1 |
|
| #
a61f8105 |
| 07-Dec-2019 |
YueHaibing <[email protected]> |
tracing: remove set but not used variable 'buffer'
kernel/trace/trace_events_inject.c: In function trace_inject_entry: kernel/trace/trace_events_inject.c:20:22: warning: variable buffer set but not
tracing: remove set but not used variable 'buffer'
kernel/trace/trace_events_inject.c: In function trace_inject_entry: kernel/trace/trace_events_inject.c:20:22: warning: variable buffer set but not used [-Wunused-but-set-variable]
It is never used, so remove it.
Link: http://lkml.kernel.org/r/[email protected]
Reported-by: Hulk Robot <[email protected]> Acked-by: Cong Wang <[email protected]> Signed-off-by: YueHaibing <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
show more ...
|
| #
6c3edaf9 |
| 30-Nov-2019 |
Cong Wang <[email protected]> |
tracing: Introduce trace event injection
We have been trying to use rasdaemon to monitor hardware errors like correctable memory errors. rasdaemon uses trace events to monitor various hardware error
tracing: Introduce trace event injection
We have been trying to use rasdaemon to monitor hardware errors like correctable memory errors. rasdaemon uses trace events to monitor various hardware errors. In order to test it, we have to inject some hardware errors, unfortunately not all of them provide error injections. MCE does provide a way to inject MCE errors, but errors like PCI error and devlink error don't, it is not easy to add error injection to each of them. Instead, it is relatively easier to just allow users to inject trace events in a generic way so that all trace events can be injected.
This patch introduces trace event injection, where a new 'inject' is added to each tracepoint directory. Users could write into this file with key=value pairs to specify the value of each fields of the trace event, all unspecified fields are set to zero values by default.
For example, for the net/net_dev_queue tracepoint, we can inject:
INJECT=/sys/kernel/debug/tracing/events/net/net_dev_queue/inject echo "" > $INJECT echo "name='test'" > $INJECT echo "name='test' len=1024" > $INJECT cat /sys/kernel/debug/tracing/trace ... <...>-614 [000] .... 36.571483: net_dev_queue: dev= skbaddr=00000000fbf338c2 len=0 <...>-614 [001] .... 136.588252: net_dev_queue: dev=test skbaddr=00000000fbf338c2 len=0 <...>-614 [001] .N.. 208.431878: net_dev_queue: dev=test skbaddr=00000000fbf338c2 len=1024
Triggers could be triggered as usual too:
echo "stacktrace if len == 1025" > /sys/kernel/debug/tracing/events/net/net_dev_queue/trigger echo "len=1025" > $INJECT cat /sys/kernel/debug/tracing/trace ... bash-614 [000] .... 36.571483: net_dev_queue: dev= skbaddr=00000000fbf338c2 len=0 bash-614 [001] .... 136.588252: net_dev_queue: dev=test skbaddr=00000000fbf338c2 len=0 bash-614 [001] .N.. 208.431878: net_dev_queue: dev=test skbaddr=00000000fbf338c2 len=1024 bash-614 [001] .N.1 284.236349: <stack trace> => event_inject_write => vfs_write => ksys_write => do_syscall_64 => entry_SYSCALL_64_after_hwframe
The only thing that can't be injected is string pointers as they require constant string pointers, this can't be done at run time.
Link: http://lkml.kernel.org/r/[email protected]
Cc: Ingo Molnar <[email protected]> Signed-off-by: Cong Wang <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
show more ...
|