|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3 |
|
| #
8f193313 |
| 13-Feb-2025 |
Mateusz Guzik <[email protected]> |
seccomp: avoid the lock trip seccomp_filter_release in common case
Vast majority of threads don't have any seccomp filters, all while the lock taken here is shared between all threads in given proce
seccomp: avoid the lock trip seccomp_filter_release in common case
Vast majority of threads don't have any seccomp filters, all while the lock taken here is shared between all threads in given process and frequently used.
Safety of the check relies on the following: - seccomp_filter_release is only legally called for PF_EXITING threads - SIGNAL_GROUP_EXIT is only ever set with the sighand lock held - PF_EXITING is only ever set with the sighand lock held *or* after SIGNAL_GROUP_EXIT is set *or* the process is single-threaded - seccomp_sync_threads holds the sighand lock and skips all threads if SIGNAL_GROUP_EXIT is set, PF_EXITING threads if not
Resulting reduction of contention gives me a 5% boost in a microbenchmark spawning and killing threads within the same process.
Signed-off-by: Mateusz Guzik <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc2, v6.14-rc1 |
|
| #
e1cec510 |
| 28-Jan-2025 |
Oleg Nesterov <[email protected]> |
seccomp: remove the 'sd' argument from __seccomp_filter()
After the previous change 'sd' is always NULL.
Signed-off-by: Oleg Nesterov <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link
seccomp: remove the 'sd' argument from __seccomp_filter()
After the previous change 'sd' is always NULL.
Signed-off-by: Oleg Nesterov <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]>
show more ...
|
| #
1027cd80 |
| 28-Jan-2025 |
Oleg Nesterov <[email protected]> |
seccomp: remove the 'sd' argument from __secure_computing()
After the previous changes 'sd' is always NULL.
Signed-off-by: Oleg Nesterov <[email protected]> Reviewed-by: Kees Cook <[email protected]> L
seccomp: remove the 'sd' argument from __secure_computing()
After the previous changes 'sd' is always NULL.
Signed-off-by: Oleg Nesterov <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]>
show more ...
|
| #
b37778be |
| 28-Jan-2025 |
Oleg Nesterov <[email protected]> |
seccomp: fix the __secure_computing() stub for !HAVE_ARCH_SECCOMP_FILTER
Depending on CONFIG_HAVE_ARCH_SECCOMP_FILTER, __secure_computing(NULL) will crash or not. This is not consistent/safe, especi
seccomp: fix the __secure_computing() stub for !HAVE_ARCH_SECCOMP_FILTER
Depending on CONFIG_HAVE_ARCH_SECCOMP_FILTER, __secure_computing(NULL) will crash or not. This is not consistent/safe, especially considering that after the previous change __secure_computing(sd) is always called with sd == NULL.
Fortunately, if CONFIG_HAVE_ARCH_SECCOMP_FILTER=n, __secure_computing() has no callers, these architectures use secure_computing_strict(). Yet it make sense make __secure_computing(NULL) safe in this case.
Note also that with this change we can unexport secure_computing_strict() and change the current callers to use __secure_computing(NULL).
Fixes: 8cf8dfceebda ("seccomp: Stub for !HAVE_ARCH_SECCOMP_FILTER") Signed-off-by: Oleg Nesterov <[email protected]> Reviewed-by: Linus Walleij <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]>
show more ...
|
| #
cf6cb56e |
| 02-Feb-2025 |
Eyal Birger <[email protected]> |
seccomp: passthrough uretprobe systemcall without filtering
When attaching uretprobes to processes running inside docker, the attached process is segfaulted when encountering the retprobe.
The reas
seccomp: passthrough uretprobe systemcall without filtering
When attaching uretprobes to processes running inside docker, the attached process is segfaulted when encountering the retprobe.
The reason is that now that uretprobe is a system call the default seccomp filters in docker block it as they only allow a specific set of known syscalls. This is true for other userspace applications which use seccomp to control their syscall surface.
Since uretprobe is a "kernel implementation detail" system call which is not used by userspace application code directly, it is impractical and there's very little point in forcing all userspace applications to explicitly allow it in order to avoid crashing tracked processes.
Pass this systemcall through seccomp without depending on configuration.
Note: uretprobe is currently only x86_64 and isn't expected to ever be supported in i386.
Fixes: ff474a78cef5 ("uprobe: Add uretprobe syscall to speed up return probe") Reported-by: Rafael Buchbinder <[email protected]> Closes: https://lore.kernel.org/lkml/CAHsH6Gs3Eh8DFU0wq58c_LF8A4_+o6z456J7BidmcVY2AqOnHQ@mail.gmail.com/ Link: https://lore.kernel.org/lkml/[email protected]/T/#me2676c378eff2d6a33f3054fed4a5f3afa64e65b Link: https://lore.kernel.org/lkml/[email protected]/ Cc: [email protected] Signed-off-by: Eyal Birger <[email protected]> Link: https://lore.kernel.org/r/[email protected] [kees: minimized changes for easier backporting, tweaked commit log] Signed-off-by: Kees Cook <[email protected]>
show more ...
|
| #
1751f872 |
| 28-Jan-2025 |
Joel Granados <[email protected]> |
treewide: const qualify ctl_tables where applicable
Add the const qualifier to all the ctl_tables in the tree except for watchdog_hardlockup_sysctl, memory_allocation_profiling_sysctls, loadpin_sysc
treewide: const qualify ctl_tables where applicable
Add the const qualifier to all the ctl_tables in the tree except for watchdog_hardlockup_sysctl, memory_allocation_profiling_sysctls, loadpin_sysctl_table and the ones calling register_net_sysctl (./net, drivers/inifiniband dirs). These are special cases as they use a registration function with a non-const qualified ctl_table argument or modify the arrays before passing them on to the registration function.
Constifying ctl_table structs will prevent the modification of proc_handler function pointers as the arrays would reside in .rodata. This is made possible after commit 78eb4ea25cd5 ("sysctl: treewide: constify the ctl_table argument of proc_handlers") constified all the proc_handlers.
Created this by running an spatch followed by a sed command: Spatch: virtual patch
@ depends on !(file in "net") disable optional_qualifier @
identifier table_name != { watchdog_hardlockup_sysctl, iwcm_ctl_table, ucma_ctl_table, memory_allocation_profiling_sysctls, loadpin_sysctl_table }; @@
+ const struct ctl_table table_name [] = { ... };
sed: sed --in-place \ -e "s/struct ctl_table .table = &uts_kern/const struct ctl_table *table = \&uts_kern/" \ kernel/utsname_sysctl.c
Reviewed-by: Song Liu <[email protected]> Acked-by: Steven Rostedt (Google) <[email protected]> # for kernel/trace/ Reviewed-by: Martin K. Petersen <[email protected]> # SCSI Reviewed-by: Darrick J. Wong <[email protected]> # xfs Acked-by: Jani Nikula <[email protected]> Acked-by: Corey Minyard <[email protected]> Acked-by: Wei Liu <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Reviewed-by: Bill O'Donnell <[email protected]> Acked-by: Baoquan He <[email protected]> Acked-by: Ashutosh Dixit <[email protected]> Acked-by: Anna Schumaker <[email protected]> Signed-off-by: Joel Granados <[email protected]>
show more ...
|
|
Revision tags: v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1 |
|
| #
78eb4ea2 |
| 24-Jul-2024 |
Joel Granados <[email protected]> |
sysctl: treewide: constify the ctl_table argument of proc_handlers
const qualify the struct ctl_table argument in the proc_handler function signatures. This is a prerequisite to moving the static ct
sysctl: treewide: constify the ctl_table argument of proc_handlers
const qualify the struct ctl_table argument in the proc_handler function signatures. This is a prerequisite to moving the static ctl_table structs into .rodata data which will ensure that proc_handler function pointers cannot be modified.
This patch has been generated by the following coccinelle script:
``` virtual patch
@r1@ identifier ctl, write, buffer, lenp, ppos; identifier func !~ "appldata_(timer|interval)_handler|sched_(rt|rr)_handler|rds_tcp_skbuf_handler|proc_sctp_do_(hmac_alg|rto_min|rto_max|udp_port|alpha_beta|auth|probe_interval)"; @@
int func( - struct ctl_table *ctl + const struct ctl_table *ctl ,int write, void *buffer, size_t *lenp, loff_t *ppos);
@r2@ identifier func, ctl, write, buffer, lenp, ppos; @@
int func( - struct ctl_table *ctl + const struct ctl_table *ctl ,int write, void *buffer, size_t *lenp, loff_t *ppos) { ... }
@r3@ identifier func; @@
int func( - struct ctl_table * + const struct ctl_table * ,int , void *, size_t *, loff_t *);
@r4@ identifier func, ctl; @@
int func( - struct ctl_table *ctl + const struct ctl_table *ctl ,int , void *, size_t *, loff_t *);
@r5@ identifier func, write, buffer, lenp, ppos; @@
int func( - struct ctl_table * + const struct ctl_table * ,int write, void *buffer, size_t *lenp, loff_t *ppos);
```
* Code formatting was adjusted in xfs_sysctl.c to comply with code conventions. The xfs_stats_clear_proc_handler, xfs_panic_mask_proc_handler and xfs_deprecated_dointvec_minmax where adjusted.
* The ctl_table argument in proc_watchdog_common was const qualified. This is called from a proc_handler itself and is calling back into another proc_handler, making it necessary to change it as part of the proc_handler migration.
Co-developed-by: Thomas Weißschuh <[email protected]> Signed-off-by: Thomas Weißschuh <[email protected]> Co-developed-by: Joel Granados <[email protected]> Signed-off-by: Joel Granados <[email protected]>
show more ...
|
|
Revision tags: v6.10, v6.10-rc7, v6.10-rc6 |
|
| #
bfafe5ef |
| 28-Jun-2024 |
Andrei Vagin <[email protected]> |
seccomp: release task filters when the task exits
Previously, seccomp filters were released in release_task(), which required the process to exit and its zombie to be collected. However, exited thre
seccomp: release task filters when the task exits
Previously, seccomp filters were released in release_task(), which required the process to exit and its zombie to be collected. However, exited threads/processes can't trigger any seccomp events, making it more logical to release filters upon task exits.
This adjustment simplifies scenarios where a parent is tracing its child process. The parent process can now handle all events from a seccomp listening descriptor and then call wait to collect a child zombie.
seccomp_filter_release takes the siglock to avoid races with seccomp_sync_threads. There was an idea to bypass taking the lock by checking PF_EXITING, but it can be set without holding siglock if threads have SIGNAL_GROUP_EXIT. This means it can happen concurently with seccomp_filter_release.
This change also fixes another minor problem. Suppose that a group leader installs the new filter without SECCOMP_FILTER_FLAG_TSYNC, exits, and becomes a zombie. Without this change, SECCOMP_FILTER_FLAG_TSYNC from any other thread can never succeed, seccomp_can_sync_threads() will check a zombie leader and is_ancestor() will fail.
Reviewed-by: Oleg Nesterov <[email protected]> Signed-off-by: Andrei Vagin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Tycho Andersen <[email protected]> Signed-off-by: Kees Cook <[email protected]>
show more ...
|
| #
95036a79 |
| 28-Jun-2024 |
Andrei Vagin <[email protected]> |
seccomp: interrupt SECCOMP_IOCTL_NOTIF_RECV when all users have exited
SECCOMP_IOCTL_NOTIF_RECV promptly returns when a seccomp filter becomes unused, as a filter without users can't trigger any eve
seccomp: interrupt SECCOMP_IOCTL_NOTIF_RECV when all users have exited
SECCOMP_IOCTL_NOTIF_RECV promptly returns when a seccomp filter becomes unused, as a filter without users can't trigger any events.
Previously, event listeners had to rely on epoll to detect when all processes had exited.
The change is based on the 'commit 99cdb8b9a573 ("seccomp: notify about unused filter")' which implemented (E)POLLHUP notifications.
Reviewed-by: Christian Brauner <[email protected]> Signed-off-by: Andrei Vagin <[email protected]> Reviewed-by: Oleg Nesterov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Tycho Andersen <[email protected]> Signed-off-by: Kees Cook <[email protected]>
show more ...
|
|
Revision tags: v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9 |
|
| #
e406737b |
| 08-May-2024 |
Kees Cook <[email protected]> |
seccomp: Constify sysctl subhelpers
The read_actions_logged() and write_actions_logged() helpers called by the sysctl proc handler seccomp_actions_logged_handler() are already expecting their sysctl
seccomp: Constify sysctl subhelpers
The read_actions_logged() and write_actions_logged() helpers called by the sysctl proc handler seccomp_actions_logged_handler() are already expecting their sysctl table argument to be read-only. Actually mark the argument as const in preparation[1] for global constification of the sysctl tables.
Suggested-by: Thomas Weißschuh <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ [1] Reviewed-by: Luis Chamberlain <[email protected]> Reviewed-by: Thomas Weißschuh <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]>
show more ...
|
|
Revision tags: v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1 |
|
| #
e822582e |
| 27-Jun-2023 |
Joel Granados <[email protected]> |
seccomp: Remove the now superfluous sentinel elements from ctl_table array
This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sent
seccomp: Remove the now superfluous sentinel elements from ctl_table array
This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sentinels) which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information Link : https://lore.kernel.org/all/ZO5Yx5JFogGi%[email protected]/)
Remove sentinel element from seccomp_sysctl_table.
Acked-by: Kees Cook <[email protected]> Signed-off-by: Joel Granados <[email protected]>
show more ...
|
| #
4e94ddfe |
| 30-Nov-2023 |
Christian Brauner <[email protected]> |
file: remove __receive_fd()
Honestly, there's little value in having a helper with and without that int __user *ufd argument. It's just messy and doesn't really give us anything. Just expose receive
file: remove __receive_fd()
Honestly, there's little value in having a helper with and without that int __user *ufd argument. It's just messy and doesn't really give us anything. Just expose receive_fd() with that argument and get rid of that helper.
Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jan Kara <[email protected]> Reviewed-by: Jens Axboe <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
show more ...
|
| #
46822860 |
| 17-Aug-2023 |
Kees Cook <[email protected]> |
seccomp: Add missing kerndoc notations
The kerndoc for some struct member and function arguments were missing. Add them.
Cc: Andy Lutomirski <[email protected]> Cc: Will Drewry <[email protected]>
seccomp: Add missing kerndoc notations
The kerndoc for some struct member and function arguments were missing. Add them.
Cc: Andy Lutomirski <[email protected]> Cc: Will Drewry <[email protected]> Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Kees Cook <[email protected]>
show more ...
|
|
Revision tags: v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2 |
|
| #
48a1084a |
| 08-Mar-2023 |
Andrei Vagin <[email protected]> |
seccomp: add the synchronous mode for seccomp_unotify
seccomp_unotify allows more privileged processes do actions on behalf of less privileged processes.
In many cases, the workflow is fully synchr
seccomp: add the synchronous mode for seccomp_unotify
seccomp_unotify allows more privileged processes do actions on behalf of less privileged processes.
In many cases, the workflow is fully synchronous. It means a target process triggers a system call and passes controls to a supervisor process that handles the system call and returns controls to the target process. In this context, "synchronous" means that only one process is running and another one is waiting.
There is the WF_CURRENT_CPU flag that is used to advise the scheduler to move the wakee to the current CPU. For such synchronous workflows, it makes context switches a few times faster.
Right now, each interaction takes 12µs. With this patch, it takes about 3µs.
This change introduce the SECCOMP_USER_NOTIF_FD_SYNC_WAKE_UP flag that it used to enable the sync mode.
Signed-off-by: Andrei Vagin <[email protected]> Acked-by: "Peter Zijlstra (Intel)" <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]>
show more ...
|
| #
4943b66d |
| 08-Mar-2023 |
Andrei Vagin <[email protected]> |
seccomp: don't use semaphore and wait_queue together
The main reason is to use new wake_up helpers that will be added in the following patches. But here are a few other reasons:
* if we use two dif
seccomp: don't use semaphore and wait_queue together
The main reason is to use new wake_up helpers that will be added in the following patches. But here are a few other reasons:
* if we use two different ways, we always need to call them both. This patch fixes seccomp_notify_recv where we forgot to call wake_up_poll in the error path.
* If we use one primitive, we can control how many waiters are woken up for each request. Our goal is to wake up just one that will handle a request. Right now, wake_up_poll can wake up one waiter and up(&match->notif->request) can wake up one more.
Signed-off-by: Andrei Vagin <[email protected]> Acked-by: "Peter Zijlstra (Intel)" <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]>
show more ...
|
|
Revision tags: v6.3-rc1 |
|
| #
02a6b455 |
| 02-Mar-2023 |
Luis Chamberlain <[email protected]> |
seccomp: simplify sysctls with register_sysctl_init()
register_sysctl_paths() is only needed if you have childs (directories) with entries. Just use register_sysctl_init() as it also does the kmemle
seccomp: simplify sysctls with register_sysctl_init()
register_sysctl_paths() is only needed if you have childs (directories) with entries. Just use register_sysctl_init() as it also does the kmemleak check for you.
Acked-by: Kees Cook <[email protected]> Signed-off-by: Luis Chamberlain <[email protected]>
show more ...
|
|
Revision tags: v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3 |
|
| #
0fb0624b |
| 08-Jan-2023 |
Randy Dunlap <[email protected]> |
seccomp: fix kernel-doc function name warning
Move the ACTION_ONLY() macro so that it is not between the kernel-doc notation and the function definition for seccomp_run_filters(), eliminating a kern
seccomp: fix kernel-doc function name warning
Move the ACTION_ONLY() macro so that it is not between the kernel-doc notation and the function definition for seccomp_run_filters(), eliminating a kernel-doc warning:
kernel/seccomp.c:400: warning: expecting prototype for seccomp_run_filters(). Prototype was for ACTION_ONLY() instead
Signed-off-by: Randy Dunlap <[email protected]> Cc: Kees Cook <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Will Drewry <[email protected]> Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6 |
|
| #
c2aa2dfe |
| 03-May-2022 |
Sargun Dhillon <[email protected]> |
seccomp: Add wait_killable semantic to seccomp user notifier
This introduces a per-filter flag (SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV) that makes it so that when notifications are received by the s
seccomp: Add wait_killable semantic to seccomp user notifier
This introduces a per-filter flag (SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV) that makes it so that when notifications are received by the supervisor the notifying process will transition to wait killable semantics. Although wait killable isn't a set of semantics formally exposed to userspace, the concept is searchable. If the notifying process is signaled prior to the notification being received by the userspace agent, it will be handled as normal.
One quirk about how this is handled is that the notifying process only switches to TASK_KILLABLE if it receives a wakeup from either an addfd or a signal. This is to avoid an unnecessary wakeup of the notifying task.
The reasons behind switching into wait_killable only after userspace receives the notification are: * Avoiding unncessary work - Often, workloads will perform work that they may abort (request racing comes to mind). This allows for syscalls to be aborted safely prior to the notification being received by the supervisor. In this, the supervisor doesn't end up doing work that the workload does not want to complete anyways. * Avoiding side effects - We don't want the syscall to be interruptible once the supervisor starts doing work because it may not be trivial to reverse the operation. For example, unmounting a file system may take a long time, and it's hard to rollback, or treat that as reentrant. * Avoid breaking runtimes - Various runtimes do not GC when they are during a syscall (or while running native code that subsequently calls a syscall). If many notifications are blocked, and not picked up by the supervisor, this can get the application into a bad state.
Signed-off-by: Sargun Dhillon <[email protected]> Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.18-rc5 |
|
| #
4cbf6f62 |
| 28-Apr-2022 |
Sargun Dhillon <[email protected]> |
seccomp: Use FIFO semantics to order notifications
Previously, the seccomp notifier used LIFO semantics, where each notification would be added on top of the stack, and notifications were popped off
seccomp: Use FIFO semantics to order notifications
Previously, the seccomp notifier used LIFO semantics, where each notification would be added on top of the stack, and notifications were popped off the top of the stack. This could result one process that generates a large number of notifications preventing other notifications from being handled. This patch moves from LIFO (stack) semantics to FIFO (queue semantics).
Signed-off-by: Sargun Dhillon <[email protected]> Reviewed-by: Christian Brauner (Microsoft) <[email protected]> Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4 |
|
| #
355f841a |
| 09-Feb-2022 |
Eric W. Biederman <[email protected]> |
tracehook: Remove tracehook.h
Now that all of the definitions have moved out of tracehook.h into ptrace.h, sched/signal.h, resume_user_mode.h there is nothing left in tracehook.h so remove it.
Upda
tracehook: Remove tracehook.h
Now that all of the definitions have moved out of tracehook.h into ptrace.h, sched/signal.h, resume_user_mode.h there is nothing left in tracehook.h so remove it.
Update the few files that were depending upon tracehook.h to bring in definitions to use the headers they need directly.
Reviewed-by: Kees Cook <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: "Eric W. Biederman" <[email protected]>
show more ...
|
| #
495ac306 |
| 08-Feb-2022 |
Kees Cook <[email protected]> |
seccomp: Invalidate seccomp mode to catch death failures
If seccomp tries to kill a process, it should never see that process again. To enforce this proactively, switch the mode to something impossi
seccomp: Invalidate seccomp mode to catch death failures
If seccomp tries to kill a process, it should never see that process again. To enforce this proactively, switch the mode to something impossible. If encountered: WARN, reject all syscalls, and attempt to kill the process again even harder.
Cc: Andy Lutomirski <[email protected]> Cc: Will Drewry <[email protected]> Fixes: 8112c4f140fa ("seccomp: remove 2-phase API") Cc: [email protected] Signed-off-by: Kees Cook <[email protected]>
show more ...
|
|
Revision tags: v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13 |
|
| #
d21918e5 |
| 23-Jun-2021 |
Eric W. Biederman <[email protected]> |
signal/seccomp: Dump core when there is only one live thread
Replace get_nr_threads with atomic_read(¤t->signal->live) as that is a more accurate number that is decremented sooner.
Acked-by:
signal/seccomp: Dump core when there is only one live thread
Replace get_nr_threads with atomic_read(¤t->signal->live) as that is a more accurate number that is decremented sooner.
Acked-by: Kees Cook <[email protected]> Link: https://lkml.kernel.org/r/87lf6z6qbd.fsf_-_@disp2133 Signed-off-by: "Eric W. Biederman" <[email protected]>
show more ...
|
| #
307d522f |
| 23-Jun-2021 |
Eric W. Biederman <[email protected]> |
signal/seccomp: Refactor seccomp signal and coredump generation
Factor out force_sig_seccomp from the seccomp signal generation and place it in kernel/signal.c. The function force_sig_seccomp takes
signal/seccomp: Refactor seccomp signal and coredump generation
Factor out force_sig_seccomp from the seccomp signal generation and place it in kernel/signal.c. The function force_sig_seccomp takes a parameter force_coredump to indicate that the sigaction field should be reset to SIGDFL so that a coredump will be generated when the signal is delivered.
force_sig_seccomp is then used to replace both seccomp_send_sigsys and seccomp_init_siginfo.
force_sig_info_to_task gains an extra parameter to force using the default signal action.
With this change seccomp is no longer a special case and there becomes exactly one place do_coredump is called from.
Further it no longer becomes necessary for __seccomp_filter to call do_group_exit.
Acked-by: Kees Cook <[email protected]> Link: https://lkml.kernel.org/r/87r1gr6qc4.fsf_-_@disp2133 Signed-off-by: "Eric W. Biederman" <[email protected]>
show more ...
|
|
Revision tags: v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2 |
|
| #
b4d8a58f |
| 04-Mar-2021 |
Hsuan-Chi Kuo <[email protected]> |
seccomp: Fix setting loaded filter count during TSYNC
The desired behavior is to set the caller's filter count to thread's. This value is reported via /proc, so this fixes the inaccurate count expos
seccomp: Fix setting loaded filter count during TSYNC
The desired behavior is to set the caller's filter count to thread's. This value is reported via /proc, so this fixes the inaccurate count exposed to userspace; it is not used for reference counting, etc.
Signed-off-by: Hsuan-Chi Kuo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Co-developed-by: Wiktor Garbacz <[email protected]> Signed-off-by: Wiktor Garbacz <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Kees Cook <[email protected]> Cc: [email protected] Fixes: c818c03b661c ("seccomp: Report number of loaded filters in /proc/$pid/status")
show more ...
|
| #
0ae71c77 |
| 17-May-2021 |
Rodrigo Campos <[email protected]> |
seccomp: Support atomic "addfd + send reply"
Alban Crequy reported a race condition userspace faces when we want to add some fds and make the syscall return them[1] using seccomp notify.
The proble
seccomp: Support atomic "addfd + send reply"
Alban Crequy reported a race condition userspace faces when we want to add some fds and make the syscall return them[1] using seccomp notify.
The problem is that currently two different ioctl() calls are needed by the process handling the syscalls (agent) for another userspace process (target): SECCOMP_IOCTL_NOTIF_ADDFD to allocate the fd and SECCOMP_IOCTL_NOTIF_SEND to return that value. Therefore, it is possible for the agent to do the first ioctl to add a file descriptor but the target is interrupted (EINTR) before the agent does the second ioctl() call.
This patch adds a flag to the ADDFD ioctl() so it adds the fd and returns that value atomically to the target program, as suggested by Kees Cook[2]. This is done by simply allowing seccomp_do_user_notification() to add the fd and return it in this case. Therefore, in this case the target wakes up from the wait in seccomp_do_user_notification() either to interrupt the syscall or to add the fd and return it.
This "allocate an fd and return" functionality is useful for syscalls that return a file descriptor only, like connect(2). Other syscalls that return a file descriptor but not as return value (or return more than one fd), like socketpair(), pipe(), recvmsg with SCM_RIGHTs, will not work with this flag.
This effectively combines SECCOMP_IOCTL_NOTIF_ADDFD and SECCOMP_IOCTL_NOTIF_SEND into an atomic opteration. The notification's return value, nor error can be set by the user. Upon successful invocation of the SECCOMP_IOCTL_NOTIF_ADDFD ioctl with the SECCOMP_ADDFD_FLAG_SEND flag, the notifying process's errno will be 0, and the return value will be the file descriptor number that was installed.
[1]: https://lore.kernel.org/lkml/CADZs7q4sw71iNHmV8EOOXhUKJMORPzF7thraxZYddTZsxta-KQ@mail.gmail.com/ [2]: https://lore.kernel.org/lkml/202012011322.26DCBC64F2@keescook/
Signed-off-by: Rodrigo Campos <[email protected]> Signed-off-by: Sargun Dhillon <[email protected]> Acked-by: Tycho Andersen <[email protected]> Acked-by: Christian Brauner <[email protected]> Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|