|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5 |
|
| #
654b33ad |
| 01-Mar-2025 |
Ye Bin <[email protected]> |
proc: fix UAF in proc_get_inode()
Fix race between rmmod and /proc/XXX's inode instantiation.
The bug is that pde->proc_ops don't belong to /proc, it belongs to a module, therefore dereferencing it
proc: fix UAF in proc_get_inode()
Fix race between rmmod and /proc/XXX's inode instantiation.
The bug is that pde->proc_ops don't belong to /proc, it belongs to a module, therefore dereferencing it after /proc entry has been registered is a bug unless use_pde/unuse_pde() pair has been used.
use_pde/unuse_pde can be avoided (2 atomic ops!) because pde->proc_ops never changes so information necessary for inode instantiation can be saved _before_ proc_register() in PDE itself and used later, avoiding pde->proc_ops->... dereference.
rmmod lookup sys_delete_module proc_lookup_de pde_get(de); proc_get_inode(dir->i_sb, de); mod->exit() proc_remove remove_proc_subtree proc_entry_rundown(de); free_module(mod);
if (S_ISREG(inode->i_mode)) if (de->proc_ops->proc_read_iter) --> As module is already freed, will trigger UAF
BUG: unable to handle page fault for address: fffffbfff80a702b PGD 817fc4067 P4D 817fc4067 PUD 817fc0067 PMD 102ef4067 PTE 0 Oops: Oops: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 26 UID: 0 PID: 2667 Comm: ls Tainted: G Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) RIP: 0010:proc_get_inode+0x302/0x6e0 RSP: 0018:ffff88811c837998 EFLAGS: 00010a06 RAX: dffffc0000000000 RBX: ffffffffc0538140 RCX: 0000000000000007 RDX: 1ffffffff80a702b RSI: 0000000000000001 RDI: ffffffffc0538158 RBP: ffff8881299a6000 R08: 0000000067bbe1e5 R09: 1ffff11023906f20 R10: ffffffffb560ca07 R11: ffffffffb2b43a58 R12: ffff888105bb78f0 R13: ffff888100518048 R14: ffff8881299a6004 R15: 0000000000000001 FS: 00007f95b9686840(0000) GS:ffff8883af100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: fffffbfff80a702b CR3: 0000000117dd2000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> proc_lookup_de+0x11f/0x2e0 __lookup_slow+0x188/0x350 walk_component+0x2ab/0x4f0 path_lookupat+0x120/0x660 filename_lookup+0x1ce/0x560 vfs_statx+0xac/0x150 __do_sys_newstat+0x96/0x110 do_syscall_64+0x5f/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e
[[email protected]: don't do 2 atomic ops on the common path] Link: https://lkml.kernel.org/r/3d25ded0-1739-447e-812b-e34da7990dcf@p183 Fixes: 778f3dd5a13c ("Fix procfs compat_ioctl regression") Signed-off-by: Ye Bin <[email protected]> Signed-off-by: Alexey Dobriyan <[email protected]> Cc: Al Viro <[email protected]> Cc: David S. Miller <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2 |
|
| #
5be1fa8a |
| 08-Dec-2024 |
Al Viro <[email protected]> |
Pass parent directory inode and expected name to ->d_revalidate()
->d_revalidate() often needs to access dentry parent and name; that has to be done carefully, since the locking environment varies f
Pass parent directory inode and expected name to ->d_revalidate()
->d_revalidate() often needs to access dentry parent and name; that has to be done carefully, since the locking environment varies from caller to caller. We are not guaranteed that dentry in question will not be moved right under us - not unless the filesystem is such that nothing on it ever gets renamed.
It can be dealt with, but that results in boilerplate code that isn't even needed - the callers normally have just found the dentry via dcache lookup and want to verify that it's in the right place; they already have the values of ->d_parent and ->d_name stable. There is a couple of exceptions (overlayfs and, to less extent, ecryptfs), but for the majority of calls that song and dance is not needed at all.
It's easier to make ecryptfs and overlayfs find and pass those values if there's a ->d_revalidate() instance to be called, rather than doing that in the instances.
This commit only changes the calling conventions; making use of supplied values is left to followups.
NOTE: some instances need more than just the parent - things like CIFS may need to build an entire path from filesystem root, so they need more precautions than the usual boilerplate. This series doesn't do anything to that need - these filesystems have to keep their locking mechanisms (rename_lock loops, use of dentry_path_raw(), private rwsem a-la v9fs).
One thing to keep in mind when using name is that name->name will normally point into the pathname being resolved; the filename in question occupies name->len bytes starting at name->name, and there is NUL somewhere after it, but it the next byte might very well be '/' rather than '\0'. Do not ignore name->len.
Reviewed-by: Jeff Layton <[email protected]> Reviewed-by: Gabriel Krisman Bertazi <[email protected]> Signed-off-by: Al Viro <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7 |
|
| #
4ad5f9a0 |
| 08-Sep-2024 |
Alexey Dobriyan <[email protected]> |
proc: fold kmalloc() + strcpy() into kmemdup()
strcpy() will recalculate string length second time which is unnecessary in this case.
Signed-off-by: Alexey Dobriyan <[email protected]> Link: http
proc: fold kmalloc() + strcpy() into kmemdup()
strcpy() will recalculate string length second time which is unnecessary in this case.
Signed-off-by: Alexey Dobriyan <[email protected]> Link: https://lore.kernel.org/r/90af27c1-0b86-47a6-a6c8-61a58b8aa747@p183 Signed-off-by: Christian Brauner <[email protected]>
show more ...
|
|
Revision tags: v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3 |
|
| #
7f07ee5a |
| 08-Jun-2024 |
Christophe JAILLET <[email protected]> |
proc: Remove usage of the deprecated ida_simple_xx() API
ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove().
Note that the upper limit of ida_
proc: Remove usage of the deprecated ida_simple_xx() API
ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove().
Note that the upper limit of ida_simple_get() is exclusive, but the one of ida_alloc_max() is inclusive. So a -1 has been added when needed.
Signed-off-by: Christophe JAILLET <[email protected]> Link: https://lore.kernel.org/r/ae10003feb87d240163d0854de95f09e1f00be7d.1717855701.git.christophe.jaillet@wanadoo.fr Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
show more ...
|
| #
cda1c8e0 |
| 08-Jun-2024 |
Christophe JAILLET <[email protected]> |
proc: remove usage of the deprecated ida_simple_xx() API
ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove().
Note that the upper limit of ida_
proc: remove usage of the deprecated ida_simple_xx() API
ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove().
Note that the upper limit of ida_simple_get() is exclusive, but the one of ida_alloc_max() is inclusive. So a -1 has been added when needed.
Link: https://lkml.kernel.org/r/ae10003feb87d240163d0854de95f09e1f00be7d.1717855701.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Alistar Popple <[email protected]> Cc: Christian Gromm <[email protected]> Cc: Eddie James <[email protected]> Cc: Jeremy Kerr <[email protected]> Cc: Joel Stanley <[email protected]> Cc: Parthiban Veerasooran <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
show more ...
|
|
Revision tags: v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6 |
|
| #
0d72b928 |
| 07-Aug-2023 |
Jeff Layton <[email protected]> |
fs: pass the request_mask to generic_fillattr
generic_fillattr just fills in the entire stat struct indiscriminately today, copying data from the inode. There is at least one attribute (STATX_CHANGE
fs: pass the request_mask to generic_fillattr
generic_fillattr just fills in the entire stat struct indiscriminately today, copying data from the inode. There is at least one attribute (STATX_CHANGE_COOKIE) that can have side effects when it is reported, and we're looking at adding more with the addition of multigrain timestamps.
Add a request_mask argument to generic_fillattr and have most callers just pass in the value that is passed to getattr. Have other callers (e.g. ksmbd) just pass in STATX_BASIC_STATS. Also move the setting of STATX_CHANGE_COOKIE into generic_fillattr.
Acked-by: Joseph Qi <[email protected]> Reviewed-by: Xiubo Li <[email protected]> Reviewed-by: "Paulo Alcantara (SUSE)" <[email protected]> Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Jeff Layton <[email protected]> Message-Id: <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
show more ...
|
|
Revision tags: v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7 |
|
| #
b99b2588 |
| 31-Jan-2023 |
Chao Yu <[email protected]> |
proc: remove mark_inode_dirty() in .setattr()
procfs' .setattr() has updated i_uid, i_gid and i_mode into proc dirent, we don't need to call mark_inode_dirty() for delayed update, remove it.
Link:
proc: remove mark_inode_dirty() in .setattr()
procfs' .setattr() has updated i_uid, i_gid and i_mode into proc dirent, we don't need to call mark_inode_dirty() for delayed update, remove it.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Chao Yu <[email protected]> Reviewed-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
show more ...
|
|
Revision tags: v6.2-rc6, v6.2-rc5, v6.2-rc4 |
|
| #
b74d24f7 |
| 13-Jan-2023 |
Christian Brauner <[email protected]> |
fs: port ->getattr() to pass mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just
fs: port ->getattr() to pass mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap.
Acked-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Christian Brauner (Microsoft) <[email protected]>
show more ...
|
| #
c1632a0f |
| 13-Jan-2023 |
Christian Brauner <[email protected]> |
fs: port ->setattr() to pass mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just
fs: port ->setattr() to pass mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap.
Acked-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Christian Brauner (Microsoft) <[email protected]>
show more ...
|
|
Revision tags: v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7 |
|
| #
70551977 |
| 10-May-2022 |
Alexey Dobriyan <[email protected]> |
proc: fix dentry/inode overinstantiating under /proc/${pid}/net
When a process exits, /proc/${pid}, and /proc/${pid}/net dentries are flushed. However some leaf dentries like /proc/${pid}/net/arp_c
proc: fix dentry/inode overinstantiating under /proc/${pid}/net
When a process exits, /proc/${pid}, and /proc/${pid}/net dentries are flushed. However some leaf dentries like /proc/${pid}/net/arp_cache aren't. That's because respective PDEs have proc_misc_d_revalidate() hook which returns 1 and leaves dentries/inodes in the LRU.
Force revalidation/lookup on everything under /proc/${pid}/net by inheriting proc_net_dentry_ops.
[[email protected]: coding-style cleanups] Link: https://lkml.kernel.org/r/[email protected] Fixes: c6c75deda813 ("proc: fix lookup in /proc/net subdirectories after setns(2)") Signed-off-by: Alexey Dobriyan <[email protected]> Reported-by: hui li <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
show more ...
|
|
Revision tags: v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1 |
|
| #
6dfbbae1 |
| 22-Jan-2022 |
Muchun Song <[email protected]> |
fs: proc: store PDE()->data into inode->i_private
PDE_DATA(inode) is introduced to get user private data and hide the layout of struct proc_dir_entry. The inode->i_private is used to do the same th
fs: proc: store PDE()->data into inode->i_private
PDE_DATA(inode) is introduced to get user private data and hide the layout of struct proc_dir_entry. The inode->i_private is used to do the same thing as well. Save a copy of user private data to inode-> i_private when proc inode is allocated. This means the user also can get their private data by inode->i_private.
Introduce pde_data() to wrap inode->i_private so that we can remove PDE_DATA() from fs/proc/generic.c and make PTE_DATE() as a wrapper of pde_data(). It will be easier if we decide to remove PDE_DATE() in the future.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Muchun Song <[email protected]> Acked-by: Christian Brauner <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Alexey Gladkov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1 |
|
| #
b793cd9a |
| 07-May-2021 |
Alexey Dobriyan <[email protected]> |
proc: save LOC in __xlate_proc_name()
Can't look at this verbosity anymore.
Link: https://lkml.kernel.org/r/YFYXAp/[email protected] Signed-off-by: Alexey Dobriyan <[email protected]
proc: save LOC in __xlate_proc_name()
Can't look at this verbosity anymore.
Link: https://lkml.kernel.org/r/YFYXAp/[email protected] Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
| #
f4bf74d8 |
| 07-May-2021 |
Colin Ian King <[email protected]> |
fs/proc/generic.c: fix incorrect pde_is_permanent check
Currently the pde_is_permanent() check is being run on root multiple times rather than on the next proc directory entry. This looks like a co
fs/proc/generic.c: fix incorrect pde_is_permanent check
Currently the pde_is_permanent() check is being run on root multiple times rather than on the next proc directory entry. This looks like a copy-paste error. Fix this by replacing root with next.
Addresses-Coverity: ("Copy-paste error") Link: https://lkml.kernel.org/r/[email protected] Fixes: d919b33dafb3 ("proc: faster open/read/close with "permanent" files") Signed-off-by: Colin Ian King <[email protected]> Acked-by: Christian Brauner <[email protected]> Reviewed-by: Alexey Dobriyan <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5 |
|
| #
549c7297 |
| 21-Jan-2021 |
Christian Brauner <[email protected]> |
fs: make helpers idmap mount aware
Extend some inode methods with an additional user namespace argument. A filesystem that is aware of idmapped mounts will receive the user namespace the mount has b
fs: make helpers idmap mount aware
Extend some inode methods with an additional user namespace argument. A filesystem that is aware of idmapped mounts will receive the user namespace the mount has been marked with. This can be used for additional permission checking and also to enable filesystems to translate between uids and gids if they need to. We have implemented all relevant helpers in earlier patches.
As requested we simply extend the exisiting inode method instead of introducing new ones. This is a little more code churn but it's mostly mechanical and doesnt't leave us with additional inode methods.
Link: https://lore.kernel.org/r/[email protected] Cc: Christoph Hellwig <[email protected]> Cc: David Howells <[email protected]> Cc: Al Viro <[email protected]> Cc: [email protected] Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
show more ...
|
| #
0d56a451 |
| 21-Jan-2021 |
Christian Brauner <[email protected]> |
stat: handle idmapped mounts
The generic_fillattr() helper fills in the basic attributes associated with an inode. Enable it to handle idmapped mounts. If the inode is accessed through an idmapped m
stat: handle idmapped mounts
The generic_fillattr() helper fills in the basic attributes associated with an inode. Enable it to handle idmapped mounts. If the inode is accessed through an idmapped mount map it into the mount's user namespace before we store the uid and gid. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before.
Link: https://lore.kernel.org/r/[email protected] Cc: Christoph Hellwig <[email protected]> Cc: David Howells <[email protected]> Cc: Al Viro <[email protected]> Cc: [email protected] Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: James Morris <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
show more ...
|
| #
2f221d6f |
| 21-Jan-2021 |
Christian Brauner <[email protected]> |
attr: handle idmapped mounts
When file attributes are changed most filesystems rely on the setattr_prepare(), setattr_copy(), and notify_change() helpers for initialization and permission checking.
attr: handle idmapped mounts
When file attributes are changed most filesystems rely on the setattr_prepare(), setattr_copy(), and notify_change() helpers for initialization and permission checking. Let them handle idmapped mounts. If the inode is accessed through an idmapped mount map it into the mount's user namespace. Afterwards the checks are identical to non-idmapped mounts. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before.
Helpers that perform checks on the ia_uid and ia_gid fields in struct iattr assume that ia_uid and ia_gid are intended values and have already been mapped correctly at the userspace-kernelspace boundary as we already do today. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before.
Link: https://lore.kernel.org/r/[email protected] Cc: Christoph Hellwig <[email protected]> Cc: David Howells <[email protected]> Cc: Al Viro <[email protected]> Cc: [email protected] Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
show more ...
|
|
Revision tags: v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1 |
|
| #
c6c75ded |
| 16-Dec-2020 |
Alexey Dobriyan <[email protected]> |
proc: fix lookup in /proc/net subdirectories after setns(2)
Commit 1fde6f21d90f ("proc: fix /proc/net/* after setns(2)") only forced revalidation of regular files under /proc/net/
However, /proc/ne
proc: fix lookup in /proc/net subdirectories after setns(2)
Commit 1fde6f21d90f ("proc: fix /proc/net/* after setns(2)") only forced revalidation of regular files under /proc/net/
However, /proc/net/ is unusual in the sense of /proc/net/foo handlers take netns pointer from parent directory which is old netns.
Steps to reproduce:
(void)open("/proc/net/sctp/snmp", O_RDONLY); unshare(CLONE_NEWNET);
int fd = open("/proc/net/sctp/snmp", O_RDONLY); read(fd, &c, 1);
Read will read wrong data from original netns.
Patch forces lookup on every directory under /proc/net .
Link: https://lkml.kernel.org/r/[email protected] Fixes: 1da4d377f943 ("proc: revalidate misc dentries") Signed-off-by: Alexey Dobriyan <[email protected]> Reported-by: "Rantala, Tommi T. (Nokia - FI/Espoo)" <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3 |
|
| #
b24c30c6 |
| 04-Nov-2020 |
Christoph Hellwig <[email protected]> |
proc "seq files": switch to ->read_iter
Implement ->read_iter for all proc "seq files" so that splice works on them.
Suggested-by: Linus Torvalds <[email protected]> Signed-off-by: Chri
proc "seq files": switch to ->read_iter
Implement ->read_iter for all proc "seq files" so that splice works on them.
Suggested-by: Linus Torvalds <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
| #
7cfc630e |
| 04-Nov-2020 |
Greg Kroah-Hartman <[email protected]> |
proc "single files": switch to ->read_iter
Implement ->read_iter for all proc "single files" so that more bionic tests cases can pass when they call splice() on other fun files like /proc/version
S
proc "single files": switch to ->read_iter
Implement ->read_iter for all proc "single files" so that more bionic tests cases can pass when they call splice() on other fun files like /proc/version
Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2 |
|
| #
6814ef2d |
| 19-Apr-2020 |
Alexey Gladkov <[email protected]> |
proc: add option to mount only a pids subset
This allows to hide all files and directories in the procfs that are not related to tasks.
Signed-off-by: Alexey Gladkov <[email protected]> Revi
proc: add option to mount only a pids subset
This allows to hide all files and directories in the procfs that are not related to tasks.
Signed-off-by: Alexey Gladkov <[email protected]> Reviewed-by: Alexey Dobriyan <[email protected]> Reviewed-by: Kees Cook <[email protected]> Signed-off-by: Eric W. Biederman <[email protected]>
show more ...
|
|
Revision tags: v5.7-rc1 |
|
| #
d919b33d |
| 07-Apr-2020 |
Alexey Dobriyan <[email protected]> |
proc: faster open/read/close with "permanent" files
Now that "struct proc_ops" exist we can start putting there stuff which could not fly with VFS "struct file_operations"...
Most of fs/proc/inode.
proc: faster open/read/close with "permanent" files
Now that "struct proc_ops" exist we can start putting there stuff which could not fly with VFS "struct file_operations"...
Most of fs/proc/inode.c file is dedicated to make open/read/.../close reliable in the event of disappearing /proc entries which usually happens if module is getting removed. Files like /proc/cpuinfo which never disappear simply do not need such protection.
Save 2 atomic ops, 1 allocation, 1 free per open/read/close sequence for such "permanent" files.
Enable "permanent" flag for
/proc/cpuinfo /proc/kmsg /proc/modules /proc/slabinfo /proc/stat /proc/sysvipc/* /proc/swaps
More will come once I figure out foolproof way to prevent out module authors from marking their stuff "permanent" for performance reasons when it is not.
This should help with scalability: benchmark is "read /proc/cpuinfo R times by N threads scattered over the system".
N R t, s (before) t, s (after) ----------------------------------------------------- 64 4096 1.582458 1.530502 -3.2% 256 4096 6.371926 6.125168 -3.9% 1024 4096 25.64888 24.47528 -4.6%
Benchmark source:
#include <chrono> #include <iostream> #include <thread> #include <vector>
#include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <unistd.h>
const int NR_CPUS = sysconf(_SC_NPROCESSORS_ONLN); int N; const char *filename; int R;
int xxx = 0;
int glue(int n) { cpu_set_t m; CPU_ZERO(&m); CPU_SET(n, &m); return sched_setaffinity(0, sizeof(cpu_set_t), &m); }
void f(int n) { glue(n % NR_CPUS);
while (*(volatile int *)&xxx == 0) { }
for (int i = 0; i < R; i++) { int fd = open(filename, O_RDONLY); char buf[4096]; ssize_t rv = read(fd, buf, sizeof(buf)); asm volatile ("" :: "g" (rv)); close(fd); } }
int main(int argc, char *argv[]) { if (argc < 4) { std::cerr << "usage: " << argv[0] << ' ' << "N /proc/filename R "; return 1; }
N = atoi(argv[1]); filename = argv[2]; R = atoi(argv[3]);
for (int i = 0; i < NR_CPUS; i++) { if (glue(i) == 0) break; }
std::vector<std::thread> T; T.reserve(N); for (int i = 0; i < N; i++) { T.emplace_back(f, i); }
auto t0 = std::chrono::system_clock::now(); { *(volatile int *)&xxx = 1; for (auto& t: T) { t.join(); } } auto t1 = std::chrono::system_clock::now(); std::chrono::duration<double> dt = t1 - t0; std::cout << dt.count() << ' ';
return 0; }
P.S.: Explicit randomization marker is added because adding non-function pointer will silently disable structure layout randomization.
[[email protected]: coding style fixes] Reported-by: kbuild test robot <[email protected]> Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Al Viro <[email protected]> Cc: Joe Perches <[email protected]> Link: http://lkml.kernel.org/r/20200222201539.GA22576@avx2 Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1 |
|
| #
d56c0d45 |
| 04-Feb-2020 |
Alexey Dobriyan <[email protected]> |
proc: decouple proc from VFS with "struct proc_ops"
Currently core /proc code uses "struct file_operations" for custom hooks, however, VFS doesn't directly call them. Every time VFS expands file_op
proc: decouple proc from VFS with "struct proc_ops"
Currently core /proc code uses "struct file_operations" for custom hooks, however, VFS doesn't directly call them. Every time VFS expands file_operations hook set, /proc code bloats for no reason.
Introduce "struct proc_ops" which contains only those hooks which /proc allows to call into (open, release, read, write, ioctl, mmap, poll). It doesn't contain module pointer as well.
Save ~184 bytes per usage:
add/remove: 26/26 grow/shrink: 1/4 up/down: 1922/-6674 (-4752) Function old new delta sysvipc_proc_ops - 72 +72 ... config_gz_proc_ops - 72 +72 proc_get_inode 289 339 +50 proc_reg_get_unmapped_area 110 107 -3 close_pdeo 227 224 -3 proc_reg_open 289 284 -5 proc_create_data 60 53 -7 rt_cpu_seq_fops 256 - -256 ... default_affinity_proc_fops 256 - -256 Total: Before=5430095, After=5425343, chg -0.09%
Link: http://lkml.kernel.org/r/20191225172228.GA13378@avx2 Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1 |
|
| #
5f6354ea |
| 05-Dec-2019 |
Alexey Dobriyan <[email protected]> |
fs/proc/generic.c: delete useless "len" variable
Pointer to next '/' encodes length of path element and next start position. Subtraction and increment are redundant.
Link: http://lkml.kernel.org/r
fs/proc/generic.c: delete useless "len" variable
Pointer to next '/' encodes length of path element and next start position. Subtraction and increment are redundant.
Link: http://lkml.kernel.org/r/20191004234521.GA30246@avx2 Signed-off-by: Alexey Dobriyan <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
| #
e06689bf |
| 05-Dec-2019 |
Alexey Dobriyan <[email protected]> |
proc: change ->nlink under proc_subdir_lock
Currently gluing PDE into global /proc tree is done under lock, but changing ->nlink is not. Additionally struct proc_dir_entry::nlink is not atomic so u
proc: change ->nlink under proc_subdir_lock
Currently gluing PDE into global /proc tree is done under lock, but changing ->nlink is not. Additionally struct proc_dir_entry::nlink is not atomic so updates can be lost.
Link: http://lkml.kernel.org/r/20190925202436.GA17388@avx2 Signed-off-by: Alexey Dobriyan <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3, v5.2-rc2, v5.2-rc1 |
|
| #
457c8996 |
| 19-May-2019 |
Thomas Gleixner <[email protected]> |
treewide: Add SPDX license identifier for missed files
Add SPDX license identifiers to all files which:
- Have no license information of any form
- Have EXPORT_.*_SYMBOL_GPL inside which was use
treewide: Add SPDX license identifier for missed files
Add SPDX license identifiers to all files which:
- Have no license information of any form
- Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file
These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
show more ...
|