|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1 |
|
| #
caf6bf48 |
| 25-Nov-2024 |
Christian Brauner <[email protected]> |
binfmt_misc: avoid pointless cred reference count bump
file->f_cred already holds a reference count that is stable during the operation.
Link: https://lore.kernel.org/r/20241125-work-cred-v2-11-68b
binfmt_misc: avoid pointless cred reference count bump
file->f_cred already holds a reference count that is stable during the operation.
Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jeff Layton <[email protected]> Reviewed-by: Jens Axboe <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
show more ...
|
| #
51c0bcf0 |
| 25-Nov-2024 |
Christian Brauner <[email protected]> |
tree-wide: s/revert_creds_light()/revert_creds()/g
Rename all calls to revert_creds_light() back to revert_creds().
Link: https://lore.kernel.org/r/[email protected] R
tree-wide: s/revert_creds_light()/revert_creds()/g
Rename all calls to revert_creds_light() back to revert_creds().
Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jeff Layton <[email protected]> Reviewed-by: Jens Axboe <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
show more ...
|
| #
6771e004 |
| 25-Nov-2024 |
Christian Brauner <[email protected]> |
tree-wide: s/override_creds_light()/override_creds()/g
Rename all calls to override_creds_light() back to overrid_creds().
Link: https://lore.kernel.org/r/20241125-work-cred-v2-5-68b9d38bb5b2@kerne
tree-wide: s/override_creds_light()/override_creds()/g
Rename all calls to override_creds_light() back to overrid_creds().
Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jeff Layton <[email protected]> Reviewed-by: Jens Axboe <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
show more ...
|
| #
f905e009 |
| 25-Nov-2024 |
Christian Brauner <[email protected]> |
tree-wide: s/revert_creds()/put_cred(revert_creds_light())/g
Convert all calls to revert_creds() over to explicitly dropping reference counts in preparation for converting revert_creds() to revert_c
tree-wide: s/revert_creds()/put_cred(revert_creds_light())/g
Convert all calls to revert_creds() over to explicitly dropping reference counts in preparation for converting revert_creds() to revert_creds_light() semantics.
Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jeff Layton <[email protected]> Reviewed-by: Jens Axboe <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
show more ...
|
| #
0a670e15 |
| 25-Nov-2024 |
Christian Brauner <[email protected]> |
tree-wide: s/override_creds()/override_creds_light(get_new_cred())/g
Convert all callers from override_creds() to override_creds_light(get_new_cred()) in preparation of making override_creds() not t
tree-wide: s/override_creds()/override_creds_light(get_new_cred())/g
Convert all callers from override_creds() to override_creds_light(get_new_cred()) in preparation of making override_creds() not take a separate reference at all.
Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jeff Layton <[email protected]> Reviewed-by: Jens Axboe <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
show more ...
|
|
Revision tags: v6.12, v6.12-rc7, v6.12-rc6 |
|
| #
b6709dcd |
| 02-Nov-2024 |
Christophe JAILLET <[email protected]> |
fs: binfmt: Fix a typo
A 't' is missing in "binfm_misc". Add it.
Signed-off-by: Christophe JAILLET <[email protected]> Acked-by: Eric W. Biederman <[email protected]> Link: https://
fs: binfmt: Fix a typo
A 't' is missing in "binfm_misc". Add it.
Signed-off-by: Christophe JAILLET <[email protected]> Acked-by: Eric W. Biederman <[email protected]> Link: https://lore.kernel.org/r/34b8c52b67934b293a67558a9a486aea7ba08951.1730539498.git.christophe.jaillet@wanadoo.fr Signed-off-by: Kees Cook <[email protected]>
show more ...
|
| #
3b832035 |
| 27-Nov-2024 |
Christian Brauner <[email protected]> |
Revert "fs: don't block i_writecount during exec"
This reverts commit 2a010c41285345da60cece35575b4e0af7e7bf44.
Rui Ueyama <[email protected]> writes:
> I'm the creator and the maintainer of the mo
Revert "fs: don't block i_writecount during exec"
This reverts commit 2a010c41285345da60cece35575b4e0af7e7bf44.
Rui Ueyama <[email protected]> writes:
> I'm the creator and the maintainer of the mold linker > (https://github.com/rui314/mold). Recently, we discovered that mold > started causing process crashes in certain situations due to a change > in the Linux kernel. Here are the details: > > - In general, overwriting an existing file is much faster than > creating an empty file and writing to it on Linux, so mold attempts to > reuse an existing executable file if it exists. > > - If a program is running, opening the executable file for writing > previously failed with ETXTBSY. If that happens, mold falls back to > creating a new file. > > - However, the Linux kernel recently changed the behavior so that > writing to an executable file is now always permitted > (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2a010c412853). > > That caused mold to write to an executable file even if there's a > process running that file. Since changes to mmap'ed files are > immediately visible to other processes, any processes running that > file would almost certainly crash in a very mysterious way. > Identifying the cause of these random crashes took us a few days. > > Rejecting writes to an executable file that is currently running is a > well-known behavior, and Linux had operated that way for a very long > time. So, I don’t believe relying on this behavior was our mistake; > rather, I see this as a regression in the Linux kernel.
Quoting myself from commit 2a010c412853 ("fs: don't block i_writecount during exec")
> Yes, someone in userspace could potentially be relying on this. It's not > completely out of the realm of possibility but let's find out if that's > actually the case and not guess.
It seems we found out that someone is relying on this obscure behavior. So revert the change.
Link: https://github.com/rui314/mold/issues/1361 Link: https://lore.kernel.org/r/[email protected] Cc: <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2 |
|
| #
2a010c41 |
| 31-May-2024 |
Christian Brauner <[email protected]> |
fs: don't block i_writecount during exec
Back in 2021 we already discussed removing deny_write_access() for executables. Back then I was hesistant because I thought that this might cause issues in u
fs: don't block i_writecount during exec
Back in 2021 we already discussed removing deny_write_access() for executables. Back then I was hesistant because I thought that this might cause issues in userspace. But even back then I had started taking some notes on what could potentially depend on this and I didn't come up with a lot so I've changed my mind and I would like to try this.
Here are some of the notes that I took:
(1) The deny_write_access() mechanism is causing really pointless issues such as [1]. If a thread in a thread-group opens a file writable, then writes some stuff, then closing the file descriptor and then calling execve() they can fail the execve() with ETXTBUSY because another thread in the thread-group could have concurrently called fork(). Multi-threaded libraries such as go suffer from this.
(2) There are userspace attacks that rely on overwriting the binary of a running process. These attacks are _mitigated_ but _not at all prevented_ from ocurring by the deny_write_access() mechanism.
I'll go over some details. The clearest example of such attacks was the attack against runC in CVE-2019-5736 (cf. [3]).
An attack could compromise the runC host binary from inside a _privileged_ runC container. The malicious binary could then be used to take over the host.
(It is crucial to note that this attack is _not_ possible with unprivileged containers. IOW, the setup here is already insecure.)
The attack can be made when attaching to a running container or when starting a container running a specially crafted image. For example, when runC attaches to a container the attacker can trick it into executing itself.
This could be done by replacing the target binary inside the container with a custom binary pointing back at the runC binary itself. As an example, if the target binary was /bin/bash, this could be replaced with an executable script specifying the interpreter path #!/proc/self/exe.
As such when /bin/bash is executed inside the container, instead the target of /proc/self/exe will be executed. That magic link will point to the runc binary on the host. The attacker can then proceed to write to the target of /proc/self/exe to try and overwrite the runC binary on the host.
However, this will not succeed because of deny_write_access(). Now, one might think that this would prevent the attack but it doesn't.
To overcome this, the attacker has multiple ways: * Open a file descriptor to /proc/self/exe using the O_PATH flag and then proceed to reopen the binary as O_WRONLY through /proc/self/fd/<nr> and try to write to it in a busy loop from a separate process. Ultimately it will succeed when the runC binary exits. After this the runC binary is compromised and can be used to attack other containers or the host itself. * Use a malicious shared library annotating a function in there with the constructor attribute making the malicious function run as an initializor. The malicious library will then open /proc/self/exe for creating a new entry under /proc/self/fd/<nr>. It'll then call exec to a) force runC to exit and b) hand the file descriptor off to a program that then reopens /proc/self/fd/<nr> for writing (which is now possible because runC has exited) and overwriting that binary.
To sum up: the deny_write_access() mechanism doesn't prevent such attacks in insecure setups. It just makes them minimally harder. That's all.
The only way back then to prevent this is to create a temporary copy of the calling binary itself when it starts or attaches to containers. So what I did back then for LXC (and Aleksa for runC) was to create an anonymous, in-memory file using the memfd_create() system call and to copy itself into the temporary in-memory file, which is then sealed to prevent further modifications. This sealed, in-memory file copy is then executed instead of the original on-disk binary.
Any compromising write operations from a privileged container to the host binary will then write to the temporary in-memory binary and not to the host binary on-disk, preserving the integrity of the host binary. Also as the temporary, in-memory binary is sealed, writes to this will also fail.
The point is that deny_write_access() is uselss to prevent these attacks.
(3) Denying write access to an inode because it's currently used in an exec path could easily be done on an LSM level. It might need an additional hook but that should be about it.
(4) The MAP_DENYWRITE flag for mmap() has been deprecated a long time ago so while we do protect the main executable the bigger portion of the things you'd think need protecting such as the shared libraries aren't. IOW, we let anyone happily overwrite shared libraries.
(5) We removed all remaining uses of VM_DENYWRITE in [2]. That means: (5.1) We removed the legacy uselib() protection for preventing overwriting of shared libraries. Nobody cared in 3 years. (5.2) We allow write access to the elf interpreter after exec completed treating it on a par with shared libraries.
Yes, someone in userspace could potentially be relying on this. It's not completely out of the realm of possibility but let's find out if that's actually the case and not guess.
Link: https://github.com/golang/go/issues/22315 [1] Link: 49624efa65ac ("Merge tag 'denywrite-for-5.15' of git://github.com/davidhildenbrand/linux") [2] Link: https://unit42.paloaltonetworks.com/breaking-docker-via-runc-explaining-cve-2019-5736 [3] Link: https://lwn.net/Articles/866493 Link: https://github.com/golang/go/issues/22220 Link: https://github.com/golang/go/blob/5bf8c0cf09ee5c7e5a37ab90afcce154ab716a97/src/cmd/go/internal/work/buildid.go#L724 Link: https://github.com/golang/go/blob/5bf8c0cf09ee5c7e5a37ab90afcce154ab716a97/src/cmd/go/internal/work/exec.go#L1493 Link: https://github.com/golang/go/blob/5bf8c0cf09ee5c7e5a37ab90afcce154ab716a97/src/cmd/go/internal/script/cmds.go#L457 Link: https://github.com/golang/go/blob/5bf8c0cf09ee5c7e5a37ab90afcce154ab716a97/src/cmd/go/internal/test/test.go#L1557 Link: https://github.com/golang/go/blob/5bf8c0cf09ee5c7e5a37ab90afcce154ab716a97/src/os/exec/lp_linux_test.go#L61 Link: https://github.com/buildkite/agent/pull/2736 Link: https://github.com/rust-lang/rust/issues/114554 Link: https://bugs.openjdk.org/browse/JDK-8068370 Link: https://github.com/dotnet/runtime/issues/58964 Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
show more ...
|
| #
2c2a3f62 |
| 27-May-2024 |
Jeff Johnson <[email protected]> |
fs: binfmt: add missing MODULE_DESCRIPTION() macros
Fix the 'make W=1' warnings: WARNING: modpost: missing MODULE_DESCRIPTION() in fs/binfmt_misc.o WARNING: modpost: missing MODULE_DESCRIPTION() in
fs: binfmt: add missing MODULE_DESCRIPTION() macros
Fix the 'make W=1' warnings: WARNING: modpost: missing MODULE_DESCRIPTION() in fs/binfmt_misc.o WARNING: modpost: missing MODULE_DESCRIPTION() in fs/binfmt_script.o
Signed-off-by: Jeff Johnson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Al Viro <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
show more ...
|
|
Revision tags: v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5 |
|
| #
16a94965 |
| 04-Oct-2023 |
Jeff Layton <[email protected]> |
fs: convert core infrastructure to new timestamp accessors
Convert the core vfs code to use the new timestamp accessor functions.
Signed-off-by: Jeff Layton <[email protected]> Link: https://lore.
fs: convert core infrastructure to new timestamp accessors
Convert the core vfs code to use the new timestamp accessor functions.
Signed-off-by: Jeff Layton <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Brauner <[email protected]>
show more ...
|
|
Revision tags: v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15 |
|
| #
21ca59b3 |
| 28-Oct-2021 |
Christian Brauner <[email protected]> |
binfmt_misc: enable sandboxed mounts
Enable unprivileged sandboxes to create their own binfmt_misc mounts. This is based on Laurent's work in [1] but has been significantly reworked to fix various i
binfmt_misc: enable sandboxed mounts
Enable unprivileged sandboxes to create their own binfmt_misc mounts. This is based on Laurent's work in [1] but has been significantly reworked to fix various issues we identified in earlier versions.
While binfmt_misc can currently only be mounted in the initial user namespace, binary types registered in this binfmt_misc instance are available to all sandboxes (Either by having them installed in the sandbox or by registering the binary type with the F flag causing the interpreter to be opened right away). So binfmt_misc binary types are already delegated to sandboxes implicitly.
However, while a sandbox has access to all registered binary types in binfmt_misc a sandbox cannot currently register its own binary types in binfmt_misc. This has prevented various use-cases some of which were already outlined in [1] but we have a range of issues associated with this (cf. [3]-[5] below which are just a small sample).
Extend binfmt_misc to be mountable in non-initial user namespaces. Similar to other filesystem such as nfsd, mqueue, and sunrpc we use keyed superblock management. The key determines whether we need to create a new superblock or can reuse an already existing one. We use the user namespace of the mount as key. This means a new binfmt_misc superblock is created once per user namespace creation. Subsequent mounts of binfmt_misc in the same user namespace will mount the same binfmt_misc instance. We explicitly do not create a new binfmt_misc superblock on every binfmt_misc mount as the semantics for load_misc_binary() line up with the keying model. This also allows us to retrieve the relevant binfmt_misc instance based on the caller's user namespace which can be done in a simple (bounded to 32 levels) loop.
Similar to the current binfmt_misc semantics allowing access to the binary types in the initial binfmt_misc instance we do allow sandboxes access to their parent's binfmt_misc mounts if they do not have created a separate binfmt_misc instance.
Overall, this will unblock the use-cases mentioned below and in general will also allow to support and harden execution of another architecture's binaries in tight sandboxes. For instance, using the unshare binary it possible to start a chroot of another architecture and configure the binfmt_misc interpreter without being root to run the binaries in this chroot and without requiring the host to modify its binary type handlers.
Henning had already posted a few experiments in the cover letter at [1]. But here's an additional example where an unprivileged container registers qemu-user-static binary handlers for various binary types in its separate binfmt_misc mount and is then seamlessly able to start containers with a different architecture without affecting the host:
root [lxc monitor] /var/snap/lxd/common/lxd/containers f1 1000000 \_ /sbin/init 1000000 \_ /lib/systemd/systemd-journald 1000000 \_ /lib/systemd/systemd-udevd 1000100 \_ /lib/systemd/systemd-networkd 1000101 \_ /lib/systemd/systemd-resolved 1000000 \_ /usr/sbin/cron -f 1000103 \_ /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only 1000000 \_ /usr/bin/python3 /usr/bin/networkd-dispatcher --run-startup-triggers 1000104 \_ /usr/sbin/rsyslogd -n -iNONE 1000000 \_ /lib/systemd/systemd-logind 1000000 \_ /sbin/agetty -o -p -- \u --noclear --keep-baud console 115200,38400,9600 vt220 1000107 \_ dnsmasq --conf-file=/dev/null -u lxc-dnsmasq --strict-order --bind-interfaces --pid-file=/run/lxc/dnsmasq.pid --liste 1000000 \_ [lxc monitor] /var/lib/lxc f1-s390x 1100000 \_ /usr/bin/qemu-s390x-static /sbin/init 1100000 \_ /usr/bin/qemu-s390x-static /lib/systemd/systemd-journald 1100000 \_ /usr/bin/qemu-s390x-static /usr/sbin/cron -f 1100103 \_ /usr/bin/qemu-s390x-static /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-ac 1100000 \_ /usr/bin/qemu-s390x-static /usr/bin/python3 /usr/bin/networkd-dispatcher --run-startup-triggers 1100104 \_ /usr/bin/qemu-s390x-static /usr/sbin/rsyslogd -n -iNONE 1100000 \_ /usr/bin/qemu-s390x-static /lib/systemd/systemd-logind 1100000 \_ /usr/bin/qemu-s390x-static /sbin/agetty -o -p -- \u --noclear --keep-baud console 115200,38400,9600 vt220 1100000 \_ /usr/bin/qemu-s390x-static /sbin/agetty -o -p -- \u --noclear --keep-baud pts/0 115200,38400,9600 vt220 1100000 \_ /usr/bin/qemu-s390x-static /sbin/agetty -o -p -- \u --noclear --keep-baud pts/1 115200,38400,9600 vt220 1100000 \_ /usr/bin/qemu-s390x-static /sbin/agetty -o -p -- \u --noclear --keep-baud pts/2 115200,38400,9600 vt220 1100000 \_ /usr/bin/qemu-s390x-static /sbin/agetty -o -p -- \u --noclear --keep-baud pts/3 115200,38400,9600 vt220 1100000 \_ /usr/bin/qemu-s390x-static /lib/systemd/systemd-udevd
[1]: https://lore.kernel.org/all/[email protected] [2]: https://discuss.linuxcontainers.org/t/binfmt-misc-permission-denied [3]: https://discuss.linuxcontainers.org/t/lxd-binfmt-support-for-qemu-static-interpreters [4]: https://discuss.linuxcontainers.org/t/3-1-0-binfmt-support-service-in-unprivileged-guest-requires-write-access-on-hosts-proc-sys-fs-binfmt-misc [5]: https://discuss.linuxcontainers.org/t/qemu-user-static-not-working-4-11
Link: https://lore.kernel.org/r/[email protected] (origin) Link: https://lore.kernel.org/r/[email protected] (v1) Cc: Sargun Dhillon <[email protected]> Cc: Serge Hallyn <[email protected]> Cc: Jann Horn <[email protected]> Cc: Henning Schild <[email protected]> Cc: Andrei Vagin <[email protected]> Cc: Al Viro <[email protected]> Cc: Laurent Vivier <[email protected]> Cc: [email protected] Signed-off-by: Laurent Vivier <[email protected]> Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Kees Cook <[email protected]> --- /* v2 */ - Serge Hallyn <[email protected]>: - Use GFP_KERNEL_ACCOUNT for userspace triggered allocations when a new binary type handler is registered. - Christian Brauner <[email protected]>: - Switch authorship to me. I refused to do that earlier even though Laurent said I should do so because I think it's genuinely bad form. But by now I have changed so many things that it'd be unfair to blame Laurent for any potential bugs in here. - Add more comments that explain what's going on. - Rename functions while changing them to better reflect what they are doing to make the code easier to understand. - In the first version when a specific binary type handler was removed either through a write to the entry's file or all binary type handlers were removed by a write to the binfmt_misc mount's status file all cleanup work happened during inode eviction. That includes removal of the relevant entries from entry list. While that works fine I disliked that model after thinking about it for a bit. Because it means that there was a window were someone has already removed a or all binary handlers but they could still be safely reached from load_misc_binary() when it has managed to take the read_lock() on the entries list while inode eviction was already happening. Again, that perfectly benign but it's cleaner to remove the binary handler from the list immediately meaning that ones the write to then entry's file or the binfmt_misc status file returns the binary type cannot be executed anymore. That gives stronger guarantees to the user.
show more ...
|
| #
1c5976ef |
| 28-Oct-2021 |
Christian Brauner <[email protected]> |
binfmt_misc: cleanup on filesystem umount
Currently, registering a new binary type pins the binfmt_misc filesystem. Specifically, this means that as long as there is at least one binary type registe
binfmt_misc: cleanup on filesystem umount
Currently, registering a new binary type pins the binfmt_misc filesystem. Specifically, this means that as long as there is at least one binary type registered the binfmt_misc filesystem survives all umounts, i.e. the superblock is not destroyed. Meaning that a umount followed by another mount will end up with the same superblock and the same binary type handlers. This is a behavior we tend to discourage for any new filesystems (apart from a few special filesystems such as e.g. configfs or debugfs). A umount operation without the filesystem being pinned - by e.g. someone holding a file descriptor to an open file - should usually result in the destruction of the superblock and all associated resources. This makes introspection easier and leads to clearly defined, simple and clean semantics. An administrator can rely on the fact that a umount will guarantee a clean slate making it possible to reinitialize a filesystem. Right now all binary types would need to be explicitly deleted before that can happen.
This allows us to remove the heavy-handed calls to simple_pin_fs() and simple_release_fs() when creating and deleting binary types. This in turn allows us to replace the current brittle pinning mechanism abusing dget() which has caused a range of bugs judging from prior fixes in [2] and [3]. The additional dget() in load_misc_binary() pins the dentry but only does so for the sake to prevent ->evict_inode() from freeing the node when a user removes the binary type and kill_node() is run. Which would mean ->interpreter and ->interp_file would be freed causing a UAF.
This isn't really nicely documented nor is it very clean because it relies on simple_pin_fs() pinning the filesystem as long as at least one binary type exists. Otherwise it would cause load_misc_binary() to hold on to a dentry belonging to a superblock that has been shutdown. Replace that implicit pinning with a clean and simple per-node refcount and get rid of the ugly dget() pinning. A similar mechanism exists for e.g. binderfs (cf. [4]). All the cleanup work can now be done in ->evict_inode().
In a follow-up patch we will make it possible to use binfmt_misc in sandboxes. We will use the cleaner semantics where a umount for the filesystem will cause the superblock and all resources to be deallocated. In preparation for this apply the same semantics to the initial binfmt_misc mount. Note, that this is a user-visible change and as such a uapi change but one that we can reasonably risk. We've discussed this in earlier versions of this patchset (cf. [1]).
The main user and provider of binfmt_misc is systemd. Systemd provides binfmt_misc via autofs since it is configurable as a kernel module and is used by a few exotic packages and users. As such a binfmt_misc mount is triggered when /proc/sys/fs/binfmt_misc is accessed and is only provided on demand. Other autofs on demand filesystems include EFI ESP which systemd umounts if the mountpoint stays idle for a certain amount of time. This doesn't apply to the binfmt_misc autofs mount which isn't touched once it is mounted meaning this change can't accidently wipe binary type handlers without someone having explicitly unmounted binfmt_misc. After speaking to systemd folks they don't expect this change to affect them.
In line with our general policy, if we see a regression for systemd or other users with this change we will switch back to the old behavior for the initial binfmt_misc mount and have binary types pin the filesystem again. But while we touch this code let's take the chance and let's improve on the status quo.
[1]: https://lore.kernel.org/r/[email protected] [2]: commit 43a4f2619038 ("exec: binfmt_misc: fix race between load_misc_binary() and kill_node()" [3]: commit 83f918274e4b ("exec: binfmt_misc: shift filp_close(interp_file) from kill_node() to bm_evict_inode()") [4]: commit f0fe2c0f050d ("binder: prevent UAF for binderfs devices II")
Link: https://lore.kernel.org/r/[email protected] (v1) Cc: Sargun Dhillon <[email protected]> Cc: Serge Hallyn <[email protected]> Cc: Jann Horn <[email protected]> Cc: Henning Schild <[email protected]> Cc: Andrei Vagin <[email protected]> Cc: Al Viro <[email protected]> Cc: Laurent Vivier <[email protected]> Cc: [email protected] Acked-by: Serge Hallyn <[email protected]> Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Kees Cook <[email protected]> --- /* v2 */ - Christian Brauner <[email protected]>: - Add more comments that explain what's going on. - Rename functions while changing them to better reflect what they are doing to make the code easier to understand. - In the first version when a specific binary type handler was removed either through a write to the entry's file or all binary type handlers were removed by a write to the binfmt_misc mount's status file all cleanup work happened during inode eviction. That includes removal of the relevant entries from entry list. While that works fine I disliked that model after thinking about it for a bit. Because it means that there was a window were someone has already removed a or all binary handlers but they could still be safely reached from load_misc_binary() when it has managed to take the read_lock() on the entries list while inode eviction was already happening. Again, that perfectly benign but it's cleaner to remove the binary handler from the list immediately meaning that ones the write to then entry's file or the binfmt_misc status file returns the binary type cannot be executed anymore. That gives stronger guarantees to the user.
show more ...
|
| #
2276e5ba |
| 05-Jul-2023 |
Jeff Layton <[email protected]> |
fs: convert to ctime accessor functions
In later patches, we're going to change how the inode's ctime field is used. Switch to using accessor functions instead of raw accesses of inode->i_ctime.
Re
fs: convert to ctime accessor functions
In later patches, we're going to change how the inode's ctime field is used. Switch to using accessor functions instead of raw accesses of inode->i_ctime.
Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Jeff Layton <[email protected]> Message-Id: <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
show more ...
|
| #
6a46bf55 |
| 02-Nov-2022 |
Liu Shixin <[email protected]> |
binfmt_misc: fix shift-out-of-bounds in check_special_flags
UBSAN reported a shift-out-of-bounds warning:
left shift of 1 by 31 places cannot be represented in type 'int' Call Trace: <TASK> _
binfmt_misc: fix shift-out-of-bounds in check_special_flags
UBSAN reported a shift-out-of-bounds warning:
left shift of 1 by 31 places cannot be represented in type 'int' Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x8d/0xcf lib/dump_stack.c:106 ubsan_epilogue+0xa/0x44 lib/ubsan.c:151 __ubsan_handle_shift_out_of_bounds+0x1e7/0x208 lib/ubsan.c:322 check_special_flags fs/binfmt_misc.c:241 [inline] create_entry fs/binfmt_misc.c:456 [inline] bm_register_write+0x9d3/0xa20 fs/binfmt_misc.c:654 vfs_write+0x11e/0x580 fs/read_write.c:582 ksys_write+0xcf/0x120 fs/read_write.c:637 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x34/0x80 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x4194e1
Since the type of Node's flags is unsigned long, we should define these macros with same type too.
Signed-off-by: Liu Shixin <[email protected]> Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
b42bc9a3 |
| 09-Feb-2022 |
Domenico Andreoli <[email protected]> |
Fix regression due to "fs: move binfmt_misc sysctl to its own file"
Commit 3ba442d5331f ("fs: move binfmt_misc sysctl to its own file") did not go unnoticed, binfmt-support stopped to work on my Deb
Fix regression due to "fs: move binfmt_misc sysctl to its own file"
Commit 3ba442d5331f ("fs: move binfmt_misc sysctl to its own file") did not go unnoticed, binfmt-support stopped to work on my Debian system since v5.17-rc2 (did not check with -rc1).
The existance of the /proc/sys/fs/binfmt_misc is a precondition for attempting to mount the binfmt_misc fs, which in turn triggers the autoload of the binfmt_misc module. Without it, no module is loaded and no binfmt is available at boot.
Building as built-in or manually loading the module and mounting the fs works fine, it's therefore only a matter of interaction with user-space. I could try to improve the Debian systemd configuration but I can't say anything about the other distributions.
This patch restores a working system right after boot.
Fixes: 3ba442d5331f ("fs: move binfmt_misc sysctl to its own file") Signed-off-by: Domenico Andreoli <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Luis Chamberlain <[email protected]> Reviewed-by: Tong Zhang <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
| #
e7f1e883 |
| 29-Jan-2022 |
Tong Zhang <[email protected]> |
binfmt_misc: fix crash when load/unload module
We should unregister the table upon module unload otherwise something horrible will happen when we load binfmt_misc module again. Also note that we sh
binfmt_misc: fix crash when load/unload module
We should unregister the table upon module unload otherwise something horrible will happen when we load binfmt_misc module again. Also note that we should keep value returned by register_sysctl_mount_point() and release it later, otherwise it will leak.
Also, per Christian's comment, to fully restore the old behavior that won't break userspace the check(binfmt_misc_header) should be eliminated.
To reproduce: modprobe binfmt_misc modprobe -r binfmt_misc modprobe binfmt_misc modprobe -r binfmt_misc modprobe binfmt_misc
resulting in
modprobe: can't load module binfmt_misc (kernel/fs/binfmt_misc.ko): Cannot allocate memory
and an unhappy kernel:
binfmt_misc: Failed to create fs/binfmt_misc sysctl mount point binfmt_misc: Failed to create fs/binfmt_misc sysctl mount point BUG: unable to handle page fault for address: fffffbfff8004802 Call Trace: init_misc_binfmt+0x2d/0x1000 [binfmt_misc]
Link: https://lkml.kernel.org/r/[email protected] Fixes: 3ba442d5331f ("fs: move binfmt_misc sysctl to its own file") Signed-off-by: Tong Zhang <[email protected]> Co-developed-by: Christian Brauner<[email protected]> Acked-by: Luis Chamberlain <[email protected]> Cc: Eric Biederman <[email protected]> Cc: Kees Cook <[email protected]> Cc: Iurii Zaikin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
| #
3ba442d5 |
| 22-Jan-2022 |
Luis Chamberlain <[email protected]> |
fs: move binfmt_misc sysctl to its own file
kernel/sysctl.c is a kitchen sink where everyone leaves their dirty dishes, this makes it very difficult to maintain.
To help with this maintenance let's
fs: move binfmt_misc sysctl to its own file
kernel/sysctl.c is a kitchen sink where everyone leaves their dirty dishes, this makes it very difficult to maintain.
To help with this maintenance let's start by moving sysctls to places where they actually belong. The proc sysctl maintainers do not want to know what sysctl knobs you wish to add for your own piece of code, we just care about the core logic.
This moves the binfmt_misc sysctl to its own file to help remove clutter from kernel/sysctl.c.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Luis Chamberlain <[email protected]> Cc: Al Viro <[email protected]> Cc: Amir Goldstein <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Antti Palosaari <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Benjamin LaHaise <[email protected]> Cc: Clemens Ladisch <[email protected]> Cc: David Airlie <[email protected]> Cc: Douglas Gilbert <[email protected]> Cc: Eric Biederman <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Iurii Zaikin <[email protected]> Cc: James E.J. Bottomley <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Jan Kara <[email protected]> Cc: Joel Becker <[email protected]> Cc: John Ogness <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Joseph Qi <[email protected]> Cc: Julia Lawall <[email protected]> Cc: Kees Cook <[email protected]> Cc: Lukas Middendorf <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Martin K. Petersen <[email protected]> Cc: Paul Turner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Phillip Potter <[email protected]> Cc: Qing Wang <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Sebastian Reichel <[email protected]> Cc: Sergey Senozhatsky <[email protected]> Cc: Stephen Kitt <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Tetsuo Handa <[email protected]> Cc: "Theodore Ts'o" <[email protected]> Cc: Xiaoming Ni <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3 |
|
| #
e7850f4d |
| 13-Mar-2021 |
Lior Ribak <[email protected]> |
binfmt_misc: fix possible deadlock in bm_register_write
There is a deadlock in bm_register_write:
First, in the begining of the function, a lock is taken on the binfmt_misc root inode with inode_lo
binfmt_misc: fix possible deadlock in bm_register_write
There is a deadlock in bm_register_write:
First, in the begining of the function, a lock is taken on the binfmt_misc root inode with inode_lock(d_inode(root)).
Then, if the user used the MISC_FMT_OPEN_FILE flag, the function will call open_exec on the user-provided interpreter.
open_exec will call a path lookup, and if the path lookup process includes the root of binfmt_misc, it will try to take a shared lock on its inode again, but it is already locked, and the code will get stuck in a deadlock
To reproduce the bug: $ echo ":iiiii:E::ii::/proc/sys/fs/binfmt_misc/bla:F" > /proc/sys/fs/binfmt_misc/register
backtrace of where the lock occurs (#5): 0 schedule () at ./arch/x86/include/asm/current.h:15 1 0xffffffff81b51237 in rwsem_down_read_slowpath (sem=0xffff888003b202e0, count=<optimized out>, state=state@entry=2) at kernel/locking/rwsem.c:992 2 0xffffffff81b5150a in __down_read_common (state=2, sem=<optimized out>) at kernel/locking/rwsem.c:1213 3 __down_read (sem=<optimized out>) at kernel/locking/rwsem.c:1222 4 down_read (sem=<optimized out>) at kernel/locking/rwsem.c:1355 5 0xffffffff811ee22a in inode_lock_shared (inode=<optimized out>) at ./include/linux/fs.h:783 6 open_last_lookups (op=0xffffc9000022fe34, file=0xffff888004098600, nd=0xffffc9000022fd10) at fs/namei.c:3177 7 path_openat (nd=nd@entry=0xffffc9000022fd10, op=op@entry=0xffffc9000022fe34, flags=flags@entry=65) at fs/namei.c:3366 8 0xffffffff811efe1c in do_filp_open (dfd=<optimized out>, pathname=pathname@entry=0xffff8880031b9000, op=op@entry=0xffffc9000022fe34) at fs/namei.c:3396 9 0xffffffff811e493f in do_open_execat (fd=fd@entry=-100, name=name@entry=0xffff8880031b9000, flags=<optimized out>, flags@entry=0) at fs/exec.c:913 10 0xffffffff811e4a92 in open_exec (name=<optimized out>) at fs/exec.c:948 11 0xffffffff8124aa84 in bm_register_write (file=<optimized out>, buffer=<optimized out>, count=19, ppos=<optimized out>) at fs/binfmt_misc.c:682 12 0xffffffff811decd2 in vfs_write (file=file@entry=0xffff888004098500, buf=buf@entry=0xa758d0 ":iiiii:E::ii::i:CF ", count=count@entry=19, pos=pos@entry=0xffffc9000022ff10) at fs/read_write.c:603 13 0xffffffff811defda in ksys_write (fd=<optimized out>, buf=0xa758d0 ":iiiii:E::ii::i:CF ", count=19) at fs/read_write.c:658 14 0xffffffff81b49813 in do_syscall_64 (nr=<optimized out>, regs=0xffffc9000022ff58) at arch/x86/entry/common.c:46 15 0xffffffff81c0007c in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:120
To solve the issue, the open_exec call is moved to before the write lock is taken by bm_register_write
Link: https://lkml.kernel.org/r/[email protected] Fixes: 948b701a607f1 ("binfmt_misc: add persistent opened binary handler for containers") Signed-off-by: Lior Ribak <[email protected]> Acked-by: Helge Deller <[email protected]> Cc: Al Viro <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1 |
|
| #
2347961b |
| 28-Jan-2020 |
Laurent Vivier <[email protected]> |
binfmt_misc: pass binfmt_misc flags to the interpreter
It can be useful to the interpreter to know which flags are in use.
For instance, knowing if the preserve-argv[0] is in use would allow to ski
binfmt_misc: pass binfmt_misc flags to the interpreter
It can be useful to the interpreter to know which flags are in use.
For instance, knowing if the preserve-argv[0] is in use would allow to skip the pathname argument.
This patch uses an unused auxiliary vector, AT_FLAGS, to add a flag to inform interpreter if the preserve-argv[0] is enabled.
Note by Helge Deller: The real-world user of this patch is qemu-user, which needs to know if it has to preserve the argv[0]. See Debian bug #970460.
Signed-off-by: Laurent Vivier <[email protected]> Reviewed-by: YunQiang Su <[email protected]> URL: http://bugs.debian.org/970460 Signed-off-by: Helge Deller <[email protected]>
show more ...
|
| #
986db2d1 |
| 04-Jun-2020 |
Christoph Hellwig <[email protected]> |
exec: simplify the copy_strings_kernel calling convention
copy_strings_kernel is always used with a single argument, adjust the calling convention to that.
Signed-off-by: Christoph Hellwig <hch@lst
exec: simplify the copy_strings_kernel calling convention
copy_strings_kernel is always used with a single argument, adjust the calling convention to that.
Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Alexander Viro <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
| #
56305aa9 |
| 30-May-2020 |
Eric W. Biederman <[email protected]> |
exec: Compute file based creds only once
Move the computation of creds from prepare_binfmt into begin_new_exec so that the creds need only be computed once. This is just code reorganization no sema
exec: Compute file based creds only once
Move the computation of creds from prepare_binfmt into begin_new_exec so that the creds need only be computed once. This is just code reorganization no semantic changes of any kind are made.
Moving the computation is safe. I have looked through the kernel and verified none of the binfmts look at bprm->cred directly, and that there are no helpers that look at bprm->cred indirectly. Which means that it is not a problem to compute the bprm->cred later in the execution flow as it is not used until it becomes current->cred.
A new function bprm_creds_from_file is added to contain the work that needs to be done. bprm_creds_from_file first computes which file bprm->executable or most likely bprm->file that the bprm->creds will be computed from.
The funciton bprm_fill_uid is updated to receive the file instead of accessing bprm->file. The now unnecessary work needed to reset the bprm->cred->euid, and bprm->cred->egid is removed from brpm_fill_uid. A small comment to document that bprm_fill_uid now only deals with the work to handle suid and sgid files. The default case is already heandled by prepare_exec_creds.
The function security_bprm_repopulate_creds is renamed security_bprm_creds_from_file and now is explicitly passed the file from which to compute the creds. The documentation of the bprm_creds_from_file security hook is updated to explain when the hook is called and what it needs to do. The file is passed from cap_bprm_creds_from_file into get_file_caps so that the caps are computed for the appropriate file. The now unnecessary work in cap_bprm_creds_from_file to reset the ambient capabilites has been removed. A small comment to document that the work of cap_bprm_creds_from_file is to read capabilities from the files secureity attribute and derive capabilities from the fact the user had uid 0 has been added.
Reviewed-by: Kees Cook <[email protected]> Signed-off-by: "Eric W. Biederman" <[email protected]>
show more ...
|
| #
bc2bf338 |
| 18-May-2020 |
Eric W. Biederman <[email protected]> |
exec: Remove recursion from search_binary_handler
Recursion in kernel code is generally a bad idea as it can overflow the kernel stack. Recursion in exec also hides that the code is looping and tha
exec: Remove recursion from search_binary_handler
Recursion in kernel code is generally a bad idea as it can overflow the kernel stack. Recursion in exec also hides that the code is looping and that the loop changes bprm->file.
Instead of recursing in search_binary_handler have the methods that would recurse set bprm->interpreter and return 0. Modify exec_binprm to loop when bprm->interpreter is set. Consolidate all of the reassignments of bprm->file in that loop to make it clear what is going on.
The structure of the new loop in exec_binprm is that all errors return immediately, while successful completion (ret == 0 && !bprm->interpreter) just breaks out of the loop and runs what exec_bprm has always run upon successful completion.
Fail if the an interpreter is being call after execfd has been set. The code has never properly handled an interpreter being called with execfd being set and with reassignments of bprm->file and the assignment of bprm->executable in generic code it has finally become possible to test and fail when if this problematic condition happens.
With the reassignments of bprm->file and the assignment of bprm->executable moved into the generic code add a test to see if bprm->executable is being reassigned.
In search_binary_handler remove the test for !bprm->file. With all reassignments of bprm->file moved to exec_binprm bprm->file can never be NULL in search_binary_handler.
Link: https://lkml.kernel.org/r/[email protected] Acked-by: Linus Torvalds <[email protected]> Reviewed-by: Kees Cook <[email protected]> Signed-off-by: "Eric W. Biederman" <[email protected]>
show more ...
|
| #
b8a61c9e |
| 14-May-2020 |
Eric W. Biederman <[email protected]> |
exec: Generic execfd support
Most of the support for passing the file descriptor of an executable to an interpreter already lives in the generic code and in binfmt_elf. Rework the fields in binfmt_e
exec: Generic execfd support
Most of the support for passing the file descriptor of an executable to an interpreter already lives in the generic code and in binfmt_elf. Rework the fields in binfmt_elf that deal with executable file descriptor passing to make executable file descriptor passing a first class concept.
Move the fd_install from binfmt_misc into begin_new_exec after the new creds have been installed. This means that accessing the file through /proc/<pid>/fd/N is able to see the creds for the new executable before allowing access to the new executables files.
Performing the install of the executables file descriptor after the point of no return also means that nothing special needs to be done on error. The exiting of the process will close all of it's open files.
Move the would_dump from binfmt_misc into begin_new_exec right after would_dump is called on the bprm->file. This makes it obvious this case exists and that no nesting of bprm->file is currently supported.
In binfmt_misc the movement of fd_install into generic code means that it's special error exit path is no longer needed.
Link: https://lkml.kernel.org/r/[email protected] Acked-by: Linus Torvalds <[email protected]> Reviewed-by: Kees Cook <[email protected]> Signed-off-by: "Eric W. Biederman" <[email protected]>
show more ...
|
| #
8b72ca90 |
| 14-May-2020 |
Eric W. Biederman <[email protected]> |
exec: Move the call of prepare_binprm into search_binary_handler
The code in prepare_binary_handler needs to be run every time search_binary_handler is called so move the call into search_binary_han
exec: Move the call of prepare_binprm into search_binary_handler
The code in prepare_binary_handler needs to be run every time search_binary_handler is called so move the call into search_binary_handler itself to make the code simpler and easier to understand.
Link: https://lkml.kernel.org/r/[email protected] Acked-by: Linus Torvalds <[email protected]> Reviewed-by: Kees Cook <[email protected]> Reviewed-by: James Morris <[email protected]> Signed-off-by: "Eric W. Biederman" <[email protected]>
show more ...
|
| #
a16b3357 |
| 16-May-2020 |
Eric W. Biederman <[email protected]> |
exec: Allow load_misc_binary to call prepare_binprm unconditionally
Add a flag preserve_creds that binfmt_misc can set to prevent credentials from being updated. This allows binfmt_misc to always c
exec: Allow load_misc_binary to call prepare_binprm unconditionally
Add a flag preserve_creds that binfmt_misc can set to prevent credentials from being updated. This allows binfmt_misc to always call prepare_binprm. Allowing the credential computation logic to be consolidated.
Not replacing the credentials with the interpreters credentials is safe because because an open file descriptor to the executable is passed to the interpreter. As the interpreter does not need to reopen the executable it is guaranteed to see the same file that exec sees.
Ref: c407c033de84 ("[PATCH] binfmt_misc: improve calculation of interpreter's credentials") Link: https://lkml.kernel.org/r/[email protected] Acked-by: Linus Torvalds <[email protected]> Reviewed-by: Kees Cook <[email protected]> Signed-off-by: "Eric W. Biederman" <[email protected]>
show more ...
|