History log of /linux-6.15/fs/proc/base.c (Results 1 – 25 of 605)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14
# 6287fbad 19-Mar-2025 Bart Van Assche <[email protected]>

fs/procfs: fix the comment above proc_pid_wchan()

proc_pid_wchan() used to report kernel addresses to user space but that is
no longer the case today. Bring the comment above proc_pid_wchan() in
sy

fs/procfs: fix the comment above proc_pid_wchan()

proc_pid_wchan() used to report kernel addresses to user space but that is
no longer the case today. Bring the comment above proc_pid_wchan() in
sync with the implementation.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: b2f73922d119 ("fs/proc, core/debug: Don't expose absolute kernel addresses via wchan")
Signed-off-by: Bart Van Assche <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


# dd5bdaf2 17-Mar-2025 Ingo Molnar <[email protected]>

sched/debug: Make CONFIG_SCHED_DEBUG functionality unconditional

All the big Linux distros enable CONFIG_SCHED_DEBUG, because
the various features it provides help not just with kernel
development,

sched/debug: Make CONFIG_SCHED_DEBUG functionality unconditional

All the big Linux distros enable CONFIG_SCHED_DEBUG, because
the various features it provides help not just with kernel
development, but with system administration and user-space
software development as well.

Reflect this reality and enable this functionality
unconditionally.

Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Shrikanth Hegde <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Vincent Guittot <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Segall <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


Revision tags: v6.14-rc7, v6.14-rc6
# 2dc4dbf8 08-Mar-2025 Thomas Gleixner <[email protected]>

posix-timers: Dont iterate /proc/$PID/timers with sighand:: Siglock held

The readout of /proc/$PID/timers holds sighand::siglock with interrupts
disabled. That is required to protect against concurr

posix-timers: Dont iterate /proc/$PID/timers with sighand:: Siglock held

The readout of /proc/$PID/timers holds sighand::siglock with interrupts
disabled. That is required to protect against concurrent modifications of
the task::signal::posix_timers list because the list is not RCU safe.

With the conversion of the timer storage to a RCU protected hlist, this is
not longer required.

The only requirement is to protect the returned entry against a concurrent
free, which is trivial as the timers are RCU protected.

Removing the trylock of sighand::siglock is benign because the life time of
task_struct::signal is bound to the life time of the task_struct itself.

There are two scenarios where this matters:

1) The process is life and not about to be checkpointed

2) The process is stopped via ptrace for checkpointing

#1 is a racy snapshot of the armed timers and nothing can rely on it. It's
not more than debug information and it has been that way before because
sighand lock is dropped when the buffer is full and the restart of
the iteration might find a completely different set of timers.

The task and therefore task::signal cannot be freed as timers_start()
acquired a reference count via get_pid_task().

#2 the process is stopped for checkpointing so nothing can delete or create
timers at this point. Neither can the process exit during the traversal.

If CRIU fails to observe an exit in progress prior to the dissimination
of the timers, then there are more severe problems to solve in the CRIU
mechanics as they can't rely on posix timers being enabled in the first
place.

Therefore replace the lock acquisition with rcu_read_lock() and switch the
timer storage traversal over to seq_hlist_*_rcu().

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Frederic Weisbecker <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


Revision tags: v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2
# 5be1fa8a 08-Dec-2024 Al Viro <[email protected]>

Pass parent directory inode and expected name to ->d_revalidate()

->d_revalidate() often needs to access dentry parent and name; that has
to be done carefully, since the locking environment varies f

Pass parent directory inode and expected name to ->d_revalidate()

->d_revalidate() often needs to access dentry parent and name; that has
to be done carefully, since the locking environment varies from caller
to caller. We are not guaranteed that dentry in question will not be
moved right under us - not unless the filesystem is such that nothing
on it ever gets renamed.

It can be dealt with, but that results in boilerplate code that isn't
even needed - the callers normally have just found the dentry via dcache
lookup and want to verify that it's in the right place; they already
have the values of ->d_parent and ->d_name stable. There is a couple
of exceptions (overlayfs and, to less extent, ecryptfs), but for the
majority of calls that song and dance is not needed at all.

It's easier to make ecryptfs and overlayfs find and pass those values if
there's a ->d_revalidate() instance to be called, rather than doing that
in the instances.

This commit only changes the calling conventions; making use of supplied
values is left to followups.

NOTE: some instances need more than just the parent - things like CIFS
may need to build an entire path from filesystem root, so they need
more precautions than the usual boilerplate. This series doesn't
do anything to that need - these filesystems have to keep their locking
mechanisms (rename_lock loops, use of dentry_path_raw(), private rwsem
a-la v9fs).

One thing to keep in mind when using name is that name->name will normally
point into the pathname being resolved; the filename in question occupies
name->len bytes starting at name->name, and there is NUL somewhere after it,
but it the next byte might very well be '/' rather than '\0'. Do not
ignore name->len.

Reviewed-by: Jeff Layton <[email protected]>
Reviewed-by: Gabriel Krisman Bertazi <[email protected]>
Signed-off-by: Al Viro <[email protected]>

show more ...


# 3ab76c76 10-Jan-2025 xu xin <[email protected]>

ksm: add ksm involvement information for each process

In /proc/<pid>/ksm_stat, add two extra ksm involvement items including
KSM_mergeable and KSM_merge_any. It helps administrators to better know

ksm: add ksm involvement information for each process

In /proc/<pid>/ksm_stat, add two extra ksm involvement items including
KSM_mergeable and KSM_merge_any. It helps administrators to better know
the system's KSM behavior at process level.

ksm_merge_any: yes/no
whether the process'mm is added by prctl() into the candidate list
of KSM or not, and fully enabled at process level.

ksm_mergeable: yes/no
whether any VMAs of the process'mm are currently applicable to KSM.

Purpose
=======
These two items are just to improve the observability of KSM at process
level, so that users can know if a certain process has enabled KSM.

For example, if without these two items, when we look at
/proc/<pid>/ksm_stat and there's no merging pages found, We are not sure
whether it is because KSM was not enabled or because KSM did not
successfully merge any pages.

Although "mg" in /proc/<pid>/smaps indicate VM_MERGEABLE, it's opaque
and not very obvious for non professionals.

[[email protected]: wording tweaks, per David and akpm]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: xu xin <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Tested-by: Mario Casquero <[email protected]>
Cc: Wang Yaxin <[email protected]>
Cc: Yang Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


Revision tags: v6.13-rc1, v6.12, v6.12-rc7
# 6017a158 05-Nov-2024 Thomas Gleixner <[email protected]>

posix-timers: Embed sigqueue in struct k_itimer

To cure the SIG_IGN handling for posix interval timers, the preallocated
sigqueue needs to be embedded into struct k_itimer to prevent life time
races

posix-timers: Embed sigqueue in struct k_itimer

To cure the SIG_IGN handling for posix interval timers, the preallocated
sigqueue needs to be embedded into struct k_itimer to prevent life time
races of all sorts.

Now that the prerequisites are in place, embed the sigqueue into struct
k_itimer and fixup the relevant usage sites.

Aside of preparing for proper SIG_IGN handling, this spares an extra
allocation.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Frederic Weisbecker <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


Revision tags: v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1
# cd3f8467 24-Sep-2024 Lorenzo Stoakes <[email protected]>

mm: refactor mm_access() to not return NULL

mm_access() can return NULL if the mm is not found, but this is handled
the same as an error in all callers, with some translating this into an
-ESRCH err

mm: refactor mm_access() to not return NULL

mm_access() can return NULL if the mm is not found, but this is handled
the same as an error in all callers, with some translating this into an
-ESRCH error.

Only proc_mem_open() returns NULL if no mm is found, however in this case
it is clearer and makes more sense to explicitly handle the error.
Additionally we take the opportunity to refactor the function to eliminate
unnecessary nesting.

Simplify things by simply returning -ESRCH if no mm is found - this both
eliminates confusing use of the IS_ERR_OR_NULL() macro, and simplifies
callers which would return -ESRCH by returning this error directly.

[[email protected]: prefer neater pointer error comparison]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Lorenzo Stoakes <[email protected]>
Suggested-by: Arnd Bergmann <[email protected]>
Cc: Al Viro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


Revision tags: v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3
# be5498ca 03-Jun-2024 Al Viro <[email protected]>

remove pointless includes of <linux/fdtable.h>

some of those used to be needed, some had been cargo-culted for
no reason...

Reviewed-by: Christian Brauner <[email protected]>
Signed-off-by: Al Vir

remove pointless includes of <linux/fdtable.h>

some of those used to be needed, some had been cargo-culted for
no reason...

Reviewed-by: Christian Brauner <[email protected]>
Signed-off-by: Al Viro <[email protected]>

show more ...


# b4dba2ef 30-Aug-2024 Christian Brauner <[email protected]>

proc: store cookie in private data

Store the cookie to detect concurrent seeks on directories in
file->private_data.

Link: https://lore.kernel.org/r/20240830-vfs-file-f_version-v1-14-6d3e4816aa7b@k

proc: store cookie in private data

Store the cookie to detect concurrent seeks on directories in
file->private_data.

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Jeff Layton <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>

show more ...


# 641bb439 09-Aug-2024 Christian Brauner <[email protected]>

fs: move FMODE_UNSIGNED_OFFSET to fop_flags

This is another flag that is statically set and doesn't need to use up
an FMODE_* bit. Move it to ->fop_flags and free up another FMODE_* bit.

(1) mem_op

fs: move FMODE_UNSIGNED_OFFSET to fop_flags

This is another flag that is statically set and doesn't need to use up
an FMODE_* bit. Move it to ->fop_flags and free up another FMODE_* bit.

(1) mem_open() used from proc_mem_operations
(2) adi_open() used from adi_fops
(3) drm_open_helper():
(3.1) accel_open() used from DRM_ACCEL_FOPS
(3.2) drm_open() used from
(3.2.1) amdgpu_driver_kms_fops
(3.2.2) psb_gem_fops
(3.2.3) i915_driver_fops
(3.2.4) nouveau_driver_fops
(3.2.5) panthor_drm_driver_fops
(3.2.6) radeon_driver_kms_fops
(3.2.7) tegra_drm_fops
(3.2.8) vmwgfx_driver_fops
(3.2.9) xe_driver_fops
(3.2.10) DRM_GEM_FOPS
(3.2.11) DEFINE_DRM_GEM_DMA_FOPS
(4) struct memdev sets fmode flags based on type of device opened. For
devices using struct mem_fops unsigned offset is used.

Mark all these file operations as FOP_UNSIGNED_OFFSET and add asserts
into the open helper to ensure that the flag is always set.

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Jeff Layton <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>

show more ...


# 3836b31c 06-Aug-2024 Christian Brauner <[email protected]>

proc: block mounting on top of /proc/<pid>/map_files/*

Entries under /proc/<pid>/map_files/* are ephemeral and may go away
before the process dies. As such allowing them to be used as mount
points c

proc: block mounting on top of /proc/<pid>/map_files/*

Entries under /proc/<pid>/map_files/* are ephemeral and may go away
before the process dies. As such allowing them to be used as mount
points creates the ability to leak mounts that linger until the process
dies with no ability to unmount them until then. Don't allow using them
as mountpoints.

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>

show more ...


# 41e8149c 02-Aug-2024 Adrian Ratiu <[email protected]>

proc: add config & param to block forcing mem writes

This adds a Kconfig option and boot param to allow removing
the FOLL_FORCE flag from /proc/pid/mem write calls because
it can be abused.

The tra

proc: add config & param to block forcing mem writes

This adds a Kconfig option and boot param to allow removing
the FOLL_FORCE flag from /proc/pid/mem write calls because
it can be abused.

The traditional forcing behavior is kept as default because
it can break GDB and some other use cases.

Previously we tried a more sophisticated approach allowing
distributions to fine-tune /proc/pid/mem behavior, however
that got NAK-ed by Linus [1], who prefers this simpler
approach with semantics also easier to understand for users.

Link: https://lore.kernel.org/lkml/CAHk-=wiGWLChxYmUA5HrT5aopZrB7_2VTa0NLZcxORgkUe5tEQ@mail.gmail.com/ [1]
Cc: Doug Anderson <[email protected]>
Cc: Jeff Xu <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Christian Brauner <[email protected]>
Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Adrian Ratiu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>

show more ...


# ed4fb6d7 14-Aug-2024 Felix Moessbauer <[email protected]>

hrtimer: Use and report correct timerslack values for realtime tasks

The timerslack_ns setting is used to specify how much the hardware
timers should be delayed, to potentially dispatch multiple tim

hrtimer: Use and report correct timerslack values for realtime tasks

The timerslack_ns setting is used to specify how much the hardware
timers should be delayed, to potentially dispatch multiple timers in a
single interrupt. This is a performance optimization. Timers of
realtime tasks (having a realtime scheduling policy) should not be
delayed.

This logic was inconsitently applied to the hrtimers, leading to delays
of realtime tasks which used timed waits for events (e.g. condition
variables). Due to the downstream override of the slack for rt tasks,
the procfs reported incorrect (non-zero) timerslack_ns values.

This is changed by setting the timer_slack_ns task attribute to 0 for
all tasks with a rt policy. By that, downstream users do not need to
specially handle rt tasks (w.r.t. the slack), and the procfs entry
shows the correct value of "0". Setting non-zero slack values (either
via procfs or PR_SET_TIMERSLACK) on tasks with a rt policy is ignored,
as stated in "man 2 PR_SET_TIMERSLACK":

Timer slack is not applied to threads that are scheduled under a
real-time scheduling policy (see sched_setscheduler(2)).

The special handling of timerslack on rt tasks in downstream users
is removed as well.

Signed-off-by: Felix Moessbauer <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/[email protected]

show more ...


# 52dea0a1 10-Jun-2024 Thomas Gleixner <[email protected]>

posix-timers: Convert timer list to hlist

No requirement for a real list. Spare a few bytes.

Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Frederic Weisbecker <frederic@kernel.

posix-timers: Convert timer list to hlist

No requirement for a real list. Spare a few bytes.

Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>

show more ...


Revision tags: v6.10-rc2
# c2dc78b8 28-May-2024 Chengming Zhou <[email protected]>

mm/ksm: fix ksm_zero_pages accounting

We normally ksm_zero_pages++ in ksmd when page is merged with zero page,
but ksm_zero_pages-- is done from page tables side, where there is no any
accessing pro

mm/ksm: fix ksm_zero_pages accounting

We normally ksm_zero_pages++ in ksmd when page is merged with zero page,
but ksm_zero_pages-- is done from page tables side, where there is no any
accessing protection of ksm_zero_pages.

So we can read very exceptional value of ksm_zero_pages in rare cases,
such as -1, which is very confusing to users.

Fix it by changing to use atomic_long_t, and the same case with the
mm->ksm_zero_pages.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: e2942062e01d ("ksm: count all zero pages placed by KSM")
Fixes: 6080d19f0704 ("ksm: add ksm zero pages for each process")
Signed-off-by: Chengming Zhou <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Ran Xiaokai <[email protected]>
Cc: Stefan Roesch <[email protected]>
Cc: xu xin <[email protected]>
Cc: Yang Yang <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


Revision tags: v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3
# 47458802 20-Sep-2023 Al Viro <[email protected]>

procfs: move dropping pde and pid from ->evict_inode() to ->free_inode()

that keeps both around until struct inode is freed, making access
to them safe from rcu-pathwalk

Acked-by: Christian Brauner

procfs: move dropping pde and pid from ->evict_inode() to ->free_inode()

that keeps both around until struct inode is freed, making access
to them safe from rcu-pathwalk

Acked-by: Christian Brauner <[email protected]>
Signed-off-by: Al Viro <[email protected]>

show more ...


Revision tags: v6.6-rc2
# 267c068e 12-Sep-2023 Casey Schaufler <[email protected]>

proc: Use lsmids instead of lsm names for attrs

Use the LSM ID number instead of the LSM name to identify which
security module's attibute data should be shown in /proc/self/attr.
The security_[gs]e

proc: Use lsmids instead of lsm names for attrs

Use the LSM ID number instead of the LSM name to identify which
security module's attibute data should be shown in /proc/self/attr.
The security_[gs]etprocattr() functions have been changed to expect
the LSM ID. The change from a string comparison to an integer comparison
in these functions will provide a minor performance improvement.

Cc: [email protected]
Signed-off-by: Casey Schaufler <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Serge Hallyn <[email protected]>
Reviewed-by: Mickael Salaun <[email protected]>
Reviewed-by: John Johansen <[email protected]>
Signed-off-by: Paul Moore <[email protected]>

show more ...


# 63993102 26-Oct-2023 Yang Li <[email protected]>

fs/proc/base.c: remove unneeded semicolon

./fs/proc/base.c:3829:2-3: Unneeded semicolon

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Yang Li <yang

fs/proc/base.c: remove unneeded semicolon

./fs/proc/base.c:3829:2-3: Unneeded semicolon

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Yang Li <[email protected]>
Reported-by: Abaci Robot <[email protected]>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7057
Acked-by: Oleg Nesterov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


# 1df4bd83 23-Oct-2023 Oleg Nesterov <[email protected]>

do_io_accounting: use sig->stats_lock

Rather than lock_task_sighand(), sig->stats_lock was specifically designed
for this type of use.

This way the "if (whole)" branch runs lockless in the likely c

do_io_accounting: use sig->stats_lock

Rather than lock_task_sighand(), sig->stats_lock was specifically designed
for this type of use.

This way the "if (whole)" branch runs lockless in the likely case.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Oleg Nesterov <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


# 23202220 23-Oct-2023 Oleg Nesterov <[email protected]>

do_io_accounting: use __for_each_thread()

Rather than while_each_thread() which should be avoided when possible.

This makes the code more clear and allows the next change.

Link: https://lkml.kerne

do_io_accounting: use __for_each_thread()

Rather than while_each_thread() which should be avoided when possible.

This makes the code more clear and allows the next change.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Oleg Nesterov <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


# 08582d67 09-Oct-2023 Amir Goldstein <[email protected]>

fs: create helper file_user_path() for user displayed mapped file path

Overlayfs uses backing files with "fake" overlayfs f_path and "real"
underlying f_inode, in order to use underlying inode aops

fs: create helper file_user_path() for user displayed mapped file path

Overlayfs uses backing files with "fake" overlayfs f_path and "real"
underlying f_inode, in order to use underlying inode aops for mapped
files and to display the overlayfs path in /proc/<pid>/maps.

In preparation for storing the overlayfs "fake" path instead of the
underlying "real" path in struct backing_file, define a noop helper
file_user_path() that returns f_path for now.

Use the new helper in procfs and kernel logs whenever a path of a
mapped file is displayed to users.

Signed-off-by: Amir Goldstein <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>

show more ...


# 860a2e7f 29-Sep-2023 Alexey Dobriyan <[email protected]>

proc: use initializer for clearing some buffers

Save LOC by using dark magic of initialisation instead of memset().

Those buffer aren't passed to userspace directly so padding is not
an issue.

Lin

proc: use initializer for clearing some buffers

Save LOC by using dark magic of initialisation instead of memset().

Those buffer aren't passed to userspace directly so padding is not
an issue.

Link: https://lkml.kernel.org/r/3821d3a2-6e10-4629-b0d5-9519d828ab72@p183
Signed-off-by: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


# 200d9421 04-Oct-2023 Jeff Layton <[email protected]>

proc: convert to new timestamp accessors

Convert to using the new inode timestamp accessor functions.

Signed-off-by: Jeff Layton <[email protected]>
Link: https://lore.kernel.org/r/20231004185347.

proc: convert to new timestamp accessors

Convert to using the new inode timestamp accessor functions.

Signed-off-by: Jeff Layton <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>

show more ...


Revision tags: v6.6-rc1, v6.5
# 33a98138 24-Aug-2023 Oleg Nesterov <[email protected]>

introduce __next_thread(), fix next_tid() vs exec() race

Patch series "introduce __next_thread(), change next_thread()".

After commit dce8f8ed1de1 ("document while_each_thread(), change
first_tid()

introduce __next_thread(), fix next_tid() vs exec() race

Patch series "introduce __next_thread(), change next_thread()".

After commit dce8f8ed1de1 ("document while_each_thread(), change
first_tid() to use for_each_thread()") + this series

1. We have only one lockless user of next_thread(), task_group_seq_get_next().
I think it should be changed too.

2. We have only one user of task_struct->thread_group, thread_group_empty().
The next patches will change thread_group_empty() and kill ->thread_group.


This patch (of 2):

next_tid(start) does:

rcu_read_lock();
if (pid_alive(start)) {
pos = next_thread(start);
if (thread_group_leader(pos))
pos = NULL;
else
get_task_struct(pos);

it should return pos = NULL when next_thread() wraps to the 1st thread
in the thread group, group leader, and the thread_group_leader() check
tries to detect this case.

But this can race with exec. To simplify, suppose we have a main thread
M and a single sub-thread T, next_tid(T) should return NULL.

Now suppose that T execs. If next_tid(T) is called after T changes the
leadership and before it does release_task() which removes the old leader
from list, then next_thread() returns M and thread_group_leader(M) = F.

Lockless use of next_thread() should be avoided. After this change only
task_group_seq_get_next() does this, and I believe it should be changed
as well.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Oleg Nesterov <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


# dce8f8ed 23-Aug-2023 Oleg Nesterov <[email protected]>

document while_each_thread(), change first_tid() to use for_each_thread()

Add the comment to explain that while_each_thread(g,t) is not rcu-safe
unless g is stable (e.g. current). Even if g is a g

document while_each_thread(), change first_tid() to use for_each_thread()

Add the comment to explain that while_each_thread(g,t) is not rcu-safe
unless g is stable (e.g. current). Even if g is a group leader and thus
can't exit before t, t or another sub-thread can exec and remove g from
the thread_group list.

The only lockless user of while_each_thread() is first_tid() and it is
fine in that it can't loop forever, yet for_each_thread() looks better and
I am going to change while_each_thread/next_thread.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Oleg Nesterov <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


12345678910>>...25