History log of /linux-6.15/include/linux/tick.h (Results 1 – 25 of 96)
Revision Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6
# 3b1596a2 29-Oct-2024 Frederic Weisbecker <[email protected]>

clockevents: Shutdown and unregister current clockevents at CPUHP_AP_TICK_DYING

The way the clockevent devices are finally stopped while a CPU is
offlining is currently chaotic. The layout, in order:

1) tick_sched_timer_dying() stops the tick and the underlying clockevent,
but only in the oneshot case. The periodic tick and its related
clockevent still run.

2) tick_broadcast_offline() detaches and stops the per-cpu oneshot
broadcast and appends it to the released list.

3) Some individual clockevent drivers stop the clockevents (a second time if
the tick is oneshot)

4) Once the CPU is dead, a control CPU remotely detaches and stops
(a 3rd time if oneshot mode) the CPU clockevent and adds it to the
released list.

5) The released list containing the broadcast device released on step 2)
and the remotely detached clockevent from step 4) are unregistered.

These scattered steps can be factorized if the current clockevent is
detached and stopped at the generic layer, directly from the dying CPU:

a) Stop the tick
b) Stop/detach the underlying per-cpu oneshot broadcast clockevent
c) Stop/detach the underlying clockevent
d) Release / unregister the clockevents from b) and c)
e) Release / unregister the remaining clockevents from the dying CPU.
This part could be performed by the dying CPU
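
[ ed: a rough sketch of the resulting dying-CPU path; the function
  names here are assumed for illustration, not taken from the patch: ]

	/* teardown at CPUHP_AP_TICK_DYING, executed on the dying CPU */
	int tick_cpu_dying(unsigned int dying_cpu)
	{
		tick_sched_timer_dying(dying_cpu);	/* a) stop the tick */
		tick_broadcast_offline(dying_cpu);	/* b) stop/detach the broadcast device */
		tick_shutdown(dying_cpu);		/* c) + d) stop, detach and release
							   the current clockevent */
		return 0;
	}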

This way the drivers and the tick layer don't need to care about
clockevent operations during cpuhotplug down. This also unifies the tick
behaviour on offline CPUs between oneshot and periodic modes, avoiding
offline ticks altogether for sanity.

Adopt the simplification.

[ tglx: Remove the WARN_ON() in clockevents_register_device() as that
is called from an upcoming CPU before the CPU is marked online ]

Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/[email protected]



Revision tags: v6.12-rc5
# b5413156 26-Oct-2024 Benjamin Segall <[email protected]>

posix-cpu-timers: Clear TICK_DEP_BIT_POSIX_TIMER on clone

When cloning a new thread, its posix_cputimers are not inherited, and
are cleared by posix_cputimers_init(). However, this does not clear the
tick dependency it creates in tsk->tick_dep_mask, and the handler does
not reach the code to clear the dependency if there were no timers to
begin with.

Thus if a thread has a cputimer running before clone/fork, all
descendants will prevent nohz_full unless they create a cputimer of
their own.

Fix this by entirely clearing the tick_dep_mask in copy_process().
(There is currently no inherited state that needs a tick dependency)
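
[ ed: a minimal sketch of the fix, assuming the usual copy_process()
  layout in kernel/fork.c; tick_dep_mask only exists with
  CONFIG_NO_HZ_FULL: ]

	posix_cputimers_init(&p->posix_cputimers);
	#ifdef CONFIG_NO_HZ_FULL
	/* timers are not inherited, so neither is the parent's tick dependency */
	atomic_set(&p->tick_dep_mask, 0);
	#endif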

Process-wide timers do not have this problem because fork does not copy
signal_struct as a baseline, it creates one from scratch.

Fixes: b78783000d5c ("posix-cpu-timers: Migrate to use new tick dependency mask model")
Signed-off-by: Ben Segall <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Frederic Weisbecker <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/all/[email protected]



Revision tags: v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5
# 9403408e 17-Jun-2024 Christian Loehle <[email protected]>

tick: Remove unused tick_nohz_get_idle_calls()

The function returns the idle calls counter for the current CPU and
therefore usually isn't what the caller wants. It has been unused since
commit 466a2b42d676 ("cpufreq: schedutil: Use idle_calls counter of the
remote CPU").

Signed-off-by: Christian Loehle <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]



Revision tags: v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7
# 2be2a197 29-Feb-2024 Thomas Gleixner <[email protected]>

sched/idle: Conditionally handle tick broadcast in default_idle_call()

The x86 architecture has an idle routine for AMD CPUs which are affected
by erratum 400. On the affected CPUs the local APIC timer stops in the
C1E halt state.

It therefore requires tick broadcasting. The invocation of
tick_broadcast_enter()/exit() from this function violates the RCU
constraints because it can end up in lockdep or tracing, which
rightfully triggers a warning.

tick_broadcast_enter()/exit() must be invoked before ct_cpuidle_enter()
and after ct_cpuidle_exit() in default_idle_call().

Add a static branch conditional invocation of tick_broadcast_enter()/exit()
into this function to allow X86 to replace the AMD specific idle code. It's
guarded by a config switch which will be selected by x86. Otherwise it's
a NOOP.
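
[ ed: a sketch of the shape this takes; the static key name is assumed
  here, not quoted from the patch: ]

	static DEFINE_STATIC_KEY_FALSE(arch_needs_tick_broadcast);

	void __cpuidle default_idle_call(void)
	{
		if (static_branch_unlikely(&arch_needs_tick_broadcast))
			tick_broadcast_enter();	/* before the RCU-idle section */
		ct_cpuidle_enter();
		arch_cpu_idle();
		ct_cpuidle_exit();
		if (static_branch_unlikely(&arch_needs_tick_broadcast))
			tick_broadcast_exit();	/* after RCU is watching again */
	}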

Reported-by: Borislav Petkov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]



Revision tags: v6.8-rc6
# 500f8f9b 25-Feb-2024 Frederic Weisbecker <[email protected]>

tick: Assume timekeeping is correctly handed over upon last offline idle call

The timekeeping duty is handed over from the outgoing CPU on stop
machine, then the oneshot tick is stopped right after. Therefore it's
guaranteed that the current CPU isn't the timekeeper upon its last call
to idle.

Besides, calling tick_nohz_idle_stop_tick() while the dying CPU goes
into idle suggests that the tick is going to be stopped while it is
actually stopped already from the appropriate CPU hotplug state.

Remove the confusing call and the obsolete case handling and convert it
to a sanity check that verifies the above assumption.
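
[ ed: i.e. the obsolete hand-over handling turns into an assertion,
  roughly (placement and exact form assumed): ]

	/* the timekeeping duty must already have been handed over on
	   stop machine before the dying CPU's last call to idle */
	WARN_ON_ONCE(tick_do_timer_cpu == smp_processor_id());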

Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]



# ef8969bb 25-Feb-2024 Frederic Weisbecker <[email protected]>

tick: Move broadcast cancellation up to CPUHP_AP_TICK_DYING

The broadcast shutdown code is executed through a random explicit call
within stop machine from the outgoing CPU.

However the tick broadcast layer is a middleware between the tick
callback and the clocksource, therefore it makes more sense to shut it
down after the tick callback and before the clocksource drivers.

Move it instead to the common tick shutdown CPU hotplug state where
related operations can be ordered from highest to lowest level.

Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]



# 3ad6eb06 25-Feb-2024 Frederic Weisbecker <[email protected]>

tick: Start centralizing tick related CPU hotplug operations

During the CPU offlining process, the various timer tick features are
shut down from scattered places, sometimes from teardown callbacks on
stop machine, sometimes through explicit calls, sometimes from the
control CPU after the CPU died. The reason why these shutdown operations
are spread around is not always clear and it makes the tick lifecycle
hard to follow.

The tick should be shut down in order from highest to lowest level:

On stop machine from the dying CPU (high-level):

1) Hand-over the timekeeping duty (tick_handover_do_timer())
2) Cancel the tick implementation called by the clockevent callback
(tick_cancel_sched_timer())
3) Shutdown broadcasting (tick_offline_cpu() / tick_broadcast_offline())

On stop machine from the dying CPU (low-level):

4) Shutdown clockevents drivers (CPUHP_AP_*_TIMER_STARTING states)

From the control CPU after the CPU died (low-level):

5) Shutdown/unregister/cleanup clockevents for the dead CPU
(tick_cleanup_dead_cpu())

Instead the current order is 2, 4 (both from CPU hotplug states), then
1 and 3 through direct calls. This layout and order don't make much
sense. The operations 1, 2, 3 should be gathered together and in order.

Sort this situation with creating a new TICK shut-down CPU hotplug state
and start with introducing the timekeeping duty hand-over there. The
state must precede hrtimers migration because the tick hrtimer will be
stopped from it in a further patch.
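
[ ed: sketched as a hotplug state table entry; the teardown callback
  name is assumed: ]

	/* kernel/cpu.c: runs on the dying CPU, before hrtimers migration */
	[CPUHP_AP_TICK_DYING] = {
		.name			= "tick:dying",
		.startup.single		= NULL,
		.teardown.single	= tick_cpu_dying,
	},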

Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]



Revision tags: v6.8-rc5
# 31a5c0b7 13-Feb-2024 James Morse <[email protected]>

tick/nohz: Move tick_nohz_full_mask declaration outside the #ifdef

tick_nohz_full_mask lists the CPUs that are nohz_full. This is only needed when
CONFIG_NO_HZ_FULL is defined. tick_nohz_full_cpu() allows a specific CPU to be
tested against the mask, and evaluates to false when CONFIG_NO_HZ_FULL is not
defined.

The resctrl code needs to pick a CPU to run some work on, a new helper prefers
housekeeping CPUs by examining the tick_nohz_full_mask. Hiding the declaration
behind #ifdef CONFIG_NO_HZ_FULL forces all the users to be behind an #ifdef
too.

Move the tick_nohz_full_mask declaration, this lets callers drop the #ifdef,
and guard access to tick_nohz_full_mask with IS_ENABLED() or something like
tick_nohz_full_cpu().
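
[ ed: for example, a caller outside the tick code can now do something
  like the following without any #ifdef; the helper is hypothetical: ]

	#include <linux/cpumask.h>
	#include <linux/tick.h>

	/* prefer a CPU that still takes scheduler ticks */
	static int pick_housekeeping_cpu(const struct cpumask *mask)
	{
		int cpu;

		for_each_cpu(cpu, mask) {
			if (!tick_nohz_full_cpu(cpu))
				return cpu;
		}
		/* every candidate is nohz_full: fall back to the first one */
		return cpumask_first(mask);
	}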

The definition does not need to be moved as any callers should be removed at
compile time unless CONFIG_NO_HZ_FULL is defined.

Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Acked-by: Reinette Chatre <[email protected]> # for resctrl dependency
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Peter Newman <[email protected]>
Tested-by: Carl Worth <[email protected]> # arm64
Link: https://lore.kernel.org/r/[email protected]



Revision tags: v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2
# c02a427f 12-Sep-2023 Xueshi Hu <[email protected]>

tick/nohz: Remove unused tick_nohz_idle_stop_tick_protected()

The last caller was removed by commit 336f560a8917 ("x86/xen: don't
let xen_pv_play_dead() return").

Signed-off-by: Xueshi Hu <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]



Revision tags: v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6
# 58d76682 24-Jan-2023 Joel Fernandes (Google) <[email protected]>

tick/nohz: Fix cpu_is_hotpluggable() by checking with nohz subsystem

For CONFIG_NO_HZ_FULL systems, the tick_do_timer_cpu cannot be offlined.
However, cpu_is_hotpluggable() still returns true for those CPUs. This causes
torture tests that do offlining to end up trying to offline this CPU, causing
test failures. Such failures happen on all architectures.

Fix the repeated error messages thrown by this (even if the hotplug errors are
harmless) by asking the opinion of the nohz subsystem on whether the CPU can be
hotplugged.
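
[ ed: the resulting check is roughly the following; the tick-side
  helper name follows the changelog's description, exact shape
  assumed: ]

	/* drivers/base/cpu.c: also ask the nohz subsystem */
	bool cpu_is_hotpluggable(unsigned int cpu)
	{
		struct device *dev = get_cpu_device(cpu);

		return dev && container_of(dev, struct cpu, dev)->hotpluggable &&
		       tick_nohz_cpu_hotpluggable(cpu);
	}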

[ Apply Frederic Weisbecker feedback on refactoring tick_nohz_cpu_down(). ]

For drivers/base/ portion:
Acked-by: Greg Kroah-Hartman <[email protected]>
Acked-by: Frederic Weisbecker <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Zhouyi Zhou <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: rcu <[email protected]>
Cc: [email protected]
Fixes: 2987557f52b9 ("driver-core/cpu: Expose hotpluggability to the rest of the kernel")
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>



Revision tags: v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4
# f268c373 27-May-2021 Frederic Weisbecker <[email protected]>

tick/nohz: Only check for RCU deferred wakeup on user/guest entry when needed

Checking for and processing RCU-nocb deferred wakeup upon user/guest
entry is only relevant when nohz_full runs on the local CPU, otherwise
the periodic tick should take care of it.

Make sure we don't needlessly pollute these fast paths, as a -3%
performance regression on the will-it-scale.per_process_ops benchmark
has been reported so far.

Fixes: 47b8ff194c1f ("entry: Explicitly flush pending rcuog wakeup before last rescheduling point")
Fixes: 4ae7dc97f726 ("entry/kvm: Explicitly flush pending rcuog wakeup before last rescheduling point")
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Paul E. McKenney <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]



Revision tags: v5.13-rc3, v5.13-rc2
# 1e4ca26d 12-May-2021 Marcelo Tosatti <[email protected]>

tick/nohz: Change signal tick dependency to wake up CPUs of member tasks

Rather than waking up all nohz_full CPUs on the system, only wake up
the target CPUs of member threads of the signal.

Reduces interruptions to nohz_full CPUs.
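
[ ed: a sketch of the targeted wakeup inside the dependency setter;
  the shape and the per-task kick helper are assumed: ]

	struct signal_struct *sig = tsk->signal;
	struct task_struct *t;

	/* kick only CPUs running threads of this signal struct, and only
	   when the dependency bit was not already set */
	if (!atomic_fetch_or(BIT(bit), &sig->tick_dep_mask)) {
		__for_each_thread(sig, t)
			tick_nohz_kick_task(t);
	}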

Signed-off-by: Marcelo Tosatti <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]



# f105dfec 12-May-2021 Peter Zijlstra <[email protected]>

tick/nohz: Evaluate the CPU expression after the static key

When tick_nohz_full_cpu() is called with smp_processor_id(), the latter
is unconditionally evaluated whether the static key is on or off. That
evaluation is unnecessary in the off case, so make sure the cpu
expression is only executed at the last moment.

Illustrate with the following test function:

int tick_nohz_test(void)
{
return tick_nohz_full_cpu(smp_processor_id());
}

The resulting code before was:

mov %gs:0x7eea92d1(%rip),%eax # smp_processor_id() fetch
nopl 0x0(%rax,%rax,1)
xor %eax,%eax
retq
cmpb $0x0,0x29d393a(%rip) # <tick_nohz_full_running>
je tick_nohz_test+0x29 # jump to below eax clear
mov %eax,%eax
bt %rax,0x29d3936(%rip) # <tick_nohz_full_mask>
setb %al
movzbl %al,%eax
retq
xor %eax,%eax
retq

Now it becomes:

nopl 0x0(%rax,%rax,1)
xor %eax,%eax
retq
cmpb $0x0,0x29d3871(%rip) # <tick_nohz_full_running>
je tick_nohz_test+0x29 # jump to below eax clear
mov %gs:0x7eea91f0(%rip),%eax # smp_processor_id() fetch, after static key
mov %eax,%eax
bt %rax,0x29d3866(%rip) # <tick_nohz_full_mask>
setb %al
movzbl %al,%eax
retq
xor %eax,%eax
retq
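
[ ed: the change itself boils down to turning the inline function into
  a macro, so that the cpu argument is only evaluated once the static
  key check has passed; roughly: ]

	#define tick_nohz_full_cpu(_cpu) ({					\
		bool __ret = false;						\
		if (tick_nohz_full_enabled())					\
			__ret = cpumask_test_cpu((_cpu), tick_nohz_full_mask);	\
		__ret;								\
	})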

Signed-off-by: Peter Zijlstra <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]



Revision tags: v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1
# df1e849a 28-Nov-2019 Paul E. McKenney <[email protected]>

rcu: Enable tick for nohz_full CPUs slow to provide expedited QS

An expedited grace period can be stalled by a nohz_full CPU looping
in kernel context. This possibility is currently handled by some
carefully crafted checks in rcu_read_unlock_special() that enlist help
from ksoftirqd when permitted by the scheduler. However, it is exactly
these checks that require the scheduler avoid holding any of its rq or
pi locks across rcu_read_unlock() without also having held them across
the entire RCU read-side critical section.

It would therefore be very nice if expedited grace periods could
handle nohz_full CPUs looping in kernel context without such checks.
This commit therefore adds code to the expedited grace period's wait
and cleanup code that forces the scheduler-clock interrupt on for CPUs
that fail to quickly supply a quiescent state. "Quickly" is currently
a hard-coded single-jiffy delay.

Signed-off-by: Paul E. McKenney <[email protected]>



Revision tags: v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4
# 74c57875 16-Oct-2019 Frederic Weisbecker <[email protected]>

context_tracking: Rename context_tracking_is_enabled() => context_tracking_enabled()

Remove the superfluous "is" in the middle of the name. We want to
standardize the naming so that it can be expanded through suffixes:

context_tracking_enabled()
context_tracking_enabled_cpu()
context_tracking_enabled_this_cpu()

Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Jacek Anaszewski <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J . Wysocki <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Viresh Kumar <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Yauheni Kaliuta <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>



Revision tags: v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2
# 01b4c399 24-Jul-2019 Frederic Weisbecker <[email protected]>

nohz: Add TICK_DEP_BIT_RCU

If a nohz_full CPU is looping in the kernel, the scheduling-clock tick
might nevertheless remain disabled. In !PREEMPT kernels, this can
prevent RCU's attempts to enlist the aid of that CPU's executions of
cond_resched(), which can in turn result in an arbitrarily delayed grace
period and thus an OOM. RCU therefore needs a way to enable a holdout
nohz_full CPU's scheduler-clock interrupt.

This commit therefore provides a new TICK_DEP_BIT_RCU value which RCU can
pass to tick_dep_set_cpu() and friends to force on the scheduler-clock
interrupt for a specified CPU or task. In some cases, rcutorture needs
to turn on the scheduler-clock tick, so this commit also exports the
relevant symbols to GPL-licensed modules.
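
[ ed: usage on the RCU side is then along these lines; a sketch, not
  the exact call sites: ]

	/* force the scheduler-clock interrupt on for a holdout CPU ... */
	tick_dep_set_cpu(cpu, TICK_DEP_BIT_RCU);

	/* ... and let it be turned back off once the CPU has supplied
	   its quiescent state */
	tick_dep_clear_cpu(cpu, TICK_DEP_BIT_RCU);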

Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>



Revision tags: v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3, v5.2-rc2, v5.2-rc1, v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5, v5.1-rc4, v5.1-rc3
# 6f9b83ac 27-Mar-2019 Ulf Hansson <[email protected]>

cpuidle: Export the next timer expiration for CPUs

To be able to predict the sleep duration for a CPU entering idle, it
is essential to know the expiration time of the next timer. Both the
teo and the menu cpuidle governors already use this information for
CPU idle state selection.

Moving forward, a similar prediction needs to be made for a group of
idle CPUs rather than for a single one and the following changes
implement a new genpd governor for that purpose.

In order to support that feature, add a new function called
tick_nohz_get_next_hrtimer() that will return the next hrtimer
expiration time of a given CPU to be invoked after deciding
whether or not to stop the scheduler tick on that CPU.

Make the cpuidle core call tick_nohz_get_next_hrtimer() right
before invoking the ->enter() callback provided by the cpuidle
driver for the given state and store its return value in the
per-CPU struct cpuidle_device, so as to make it available to code
outside of cpuidle.

Note that at the point when cpuidle calls tick_nohz_get_next_hrtimer(),
the governor's ->select() callback has already returned and indicated
whether or not the tick should be stopped, so in fact the value
returned by tick_nohz_get_next_hrtimer() always is the next hrtimer
expiration time for the given CPU, possibly including the tick (if
it hasn't been stopped).
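
[ ed: a sketch of the cpuidle call site described above; the field
  name in struct cpuidle_device is assumed: ]

	/* record the next hrtimer expiry right before ->enter() so that
	   code outside of cpuidle can read it */
	dev->next_hrtimer = tick_nohz_get_next_hrtimer();
	entered_state = cpuidle_enter_state(dev, drv, index);
	dev->next_hrtimer = 0;	/* stale once the CPU is back up */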

Co-developed-by: Lina Iyer <[email protected]>
Co-developed-by: Daniel Lezcano <[email protected]>
Acked-by: Daniel Lezcano <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <[email protected]>



Revision tags: v5.1-rc2
# 1b72d432 21-Mar-2019 Thomas Gleixner <[email protected]>

tick: Remove outgoing CPU from broadcast masks

Valentin reported that unplugging a CPU occasionally results in a warning
in the tick broadcast code which is triggered when an offline CPU is in the
broadcast mask.

This happens because the outgoing CPU is not removing itself from the
broadcast masks, especially not from the broadcast_force_mask. The removal
happens on the control CPU after the outgoing CPU is dead. It's a long
standing issue, but the warning is harmless.

Rework the hotplug mechanism so that the outgoing CPU removes itself from
the broadcast masks after disabling interrupts and removing itself from the
online mask.
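
[ ed: the dying-CPU side then amounts to roughly this; the real code
  also clears further bookkeeping masks: ]

	/* from the outgoing CPU, with interrupts disabled and the CPU
	   already removed from the online mask */
	cpumask_clear_cpu(cpu, tick_broadcast_oneshot_mask);
	cpumask_clear_cpu(cpu, tick_broadcast_force_mask);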

Reported-by: Valentin Schneider <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Valentin Schneider <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]



Revision tags: v5.1-rc1, v5.0, v5.0-rc8, v5.0-rc7, v5.0-rc6, v5.0-rc5, v5.0-rc4, v5.0-rc3, v5.0-rc2, v5.0-rc1, v4.20, v4.20-rc7, v4.20-rc6, v4.20-rc5, v4.20-rc4, v4.20-rc3, v4.20-rc2, v4.20-rc1, v4.19, v4.19-rc8, v4.19-rc7, v4.19-rc6, v4.19-rc5, v4.19-rc4, v4.19-rc3, v4.19-rc2, v4.19-rc1, v4.18, v4.18-rc8, v4.18-rc7, v4.18-rc6, v4.18-rc5, v4.18-rc4, v4.18-rc3, v4.18-rc2, v4.18-rc1, v4.17, v4.17-rc7, v4.17-rc6, v4.17-rc5, v4.17-rc4, v4.17-rc3, v4.17-rc2, v4.17-rc1
# 296bb1e5 05-Apr-2018 Rafael J. Wysocki <[email protected]>

cpuidle: menu: Refine idle state selection for running tick

If the tick isn't stopped, the target residency of the state selected
by the menu governor may be greater than the actual time to the next
tick and that means lost energy.

To avoid that, make tick_nohz_get_sleep_length() return the current
time to the next event (before stopping the tick) in addition to the
estimated one via an extra pointer argument and make menu_select()
use that value to refine the state selection when necessary.
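
[ ed: the resulting interface is along these lines (sketch): ]

	/* returns the estimated sleep length assuming the tick gets
	   stopped; *delta_next is set to the time to the next event in
	   case it does not */
	ktime_t tick_nohz_get_sleep_length(ktime_t *delta_next);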

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>



# 554c8aa8 03-Apr-2018 Rafael J. Wysocki <[email protected]>

sched: idle: Select idle state before stopping the tick

In order to address the issue with short idle duration predictions
by the idle governor after the scheduler tick has been stopped,
reorder the code in cpuidle_idle_call() so that the governor idle
state selection runs before tick_nohz_idle_go_idle() and use the
"nohz" hint returned by cpuidle_select() to decide whether or not
to stop the tick.

This isn't straightforward, because menu_select() invokes
tick_nohz_get_sleep_length() to get the time to the next timer
event and the number returned by the latter comes from
__tick_nohz_idle_stop_tick(). Fortunately, however, it is possible
to compute that number without actually stopping the tick and with
the help of the existing code.

Namely, tick_nohz_get_sleep_length() can be made to call
tick_nohz_next_event(), introduced earlier, to get the time to the
next non-highres timer event. If that happens, tick_nohz_next_event()
need not be called by __tick_nohz_idle_stop_tick() again.

If it turns out that the scheduler tick cannot be stopped going
forward or the next timer event is too close for the tick to be
stopped, tick_nohz_get_sleep_length() can simply return the time to
the next event currently programmed into the corresponding clock
event device.

In addition to knowing the return value of tick_nohz_next_event(),
however, tick_nohz_get_sleep_length() needs to know the time to the
next highres timer event, but with the scheduler tick timer excluded,
which can be computed with the help of hrtimer_get_next_event().

The minimum of that number and the tick_nohz_next_event() return
value is the total time to the next timer event with the assumption
that the tick will be stopped. It can be returned to the idle
governor which can use it for predicting idle duration (under the
assumption that the tick will be stopped) and deciding whether or
not it makes sense to stop the tick before putting the CPU into the
selected idle state.
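
[ ed: put together, the changelog describes a computation of roughly
  this shape; a simplified fragment, with the hrtimer helper from the
  same series: ]

	/* idle duration under the assumption that the tick is stopped */
	ktime_t next_event = tick_nohz_next_event(ts, cpu);
	u64 next_hrtimer = hrtimer_next_event_without(&ts->sched_timer);

	return min_t(u64, next_event, next_hrtimer) - now;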

With the above, the sleep_length field in struct tick_sched is not
necessary any more, so drop it.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=199227
Reported-by: Doug Smythies <[email protected]>
Reported-by: Thomas Ilsche <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Frederic Weisbecker <[email protected]>



Revision tags: v4.16, v4.16-rc7
# 45f1ff59 22-Mar-2018 Rafael J. Wysocki <[email protected]>

cpuidle: Return nohz hint from cpuidle_select()

Add a new pointer argument to cpuidle_select() and to the ->select
cpuidle governor callback to allow a boolean value indicating
whether or not the tick should be stopped before entering the
selected state to be returned from there.
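
[ ed: the signature change is then roughly: ]

	/* the governor reports its tick recommendation through stop_tick */
	int cpuidle_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
			   bool *stop_tick);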

Make the ladder governor ignore that pointer (to preserve its
current behavior) and make the menu governor return 'false' through
it if:
(1) the idle exit latency is constrained at 0, or
(2) the selected state is a polling one, or
(3) the expected idle period duration is within the tick period
range.

In addition to that, the correction factor computations in the menu
governor need to take the possibility that the tick may not be
stopped into account to avoid artificially small correction factor
values. To that end, add a mechanism to record tick wakeups, as
suggested by Peter Zijlstra, and use it to modify the menu_update()
behavior when tick wakeup occurs. Namely, if the CPU is woken up by
the tick and the return value of tick_nohz_get_sleep_length() is not
within the tick boundary, the predicted idle duration is likely too
short, so make menu_update() try to compensate for that by updating
the governor statistics as though the CPU was idle for a long time.

Since the value returned through the new argument pointer of
cpuidle_select() is not used by its caller yet, this change by
itself is not expected to alter the functionality of the code.

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>



Revision tags: v4.16-rc6
# 2aaf709a 15-Mar-2018 Rafael J. Wysocki <[email protected]>

sched: idle: Do not stop the tick upfront in the idle loop

Push the decision whether or not to stop the tick somewhat deeper
into the idle loop.

Stopping the tick upfront leads to unpleasant outcomes in case the
idle governor doesn't agree with the nohz code on the duration of the
upcoming idle period. Specifically, if the tick has been stopped and
the idle governor predicts short idle, the situation is bad regardless
of whether or not the prediction is accurate. If it is accurate, the
tick has been stopped unnecessarily which means excessive overhead.
If it is not accurate, the CPU is likely to spend too much time in
the (shallow, because short idle has been predicted) idle state
selected by the governor [1].

As the first step towards addressing this problem, change the code
to make the tick stopping decision inside of the loop in do_idle().
In particular, do not stop the tick in the cpu_idle_poll() code path.
Also don't do that in tick_nohz_irq_exit() which doesn't really have
enough information on whether or not to stop the tick.

Link: https://marc.info/?l=linux-pm&m=150116085925208&w=2 # [1]
Link: https://tu-dresden.de/zih/forschung/ressourcen/dateien/projekte/haec/powernightmares.pdf
Suggested-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Reviewed-by: Frederic Weisbecker <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>



# 0e776768 05-Apr-2018 Rafael J. Wysocki <[email protected]>

time: tick-sched: Reorganize idle tick management code

Prepare the scheduler tick code for reworking the idle loop to
avoid stopping the tick in some cases.

The idea is to split the nohz idle entry call to decouple the idle
time stats accounting and preparatory work from the actual tick stop
code, in order to later be able to delay the tick stop once we reach
more power-knowledgeable callers.

Move away the tick_nohz_start_idle() invocation from
__tick_nohz_idle_enter(), rename the latter to
__tick_nohz_idle_stop_tick() and define tick_nohz_idle_stop_tick()
as a wrapper around it for calling it from the outside.

Make tick_nohz_idle_enter() only call tick_nohz_start_idle() instead
of calling the entire __tick_nohz_idle_enter(), add another wrapper
disabling and enabling interrupts around tick_nohz_idle_stop_tick()
and make the current callers of tick_nohz_idle_enter() call it too
to retain their current functionality.
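
[ ed: the interrupt-safe wrapper mentioned above has roughly this
  shape; it is the same helper that the later "Remove unused
  tick_nohz_idle_stop_tick_protected()" entry above deletes: ]

	static inline void tick_nohz_idle_stop_tick_protected(void)
	{
		local_irq_disable();
		tick_nohz_idle_stop_tick();
		local_irq_enable();
	}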

Signed-off-by: Rafael J. Wysocki <[email protected]>
Reviewed-by: Frederic Weisbecker <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>



Revision tags: v4.16-rc5, v4.16-rc4, v4.16-rc3
# 22ab8bc0 21-Feb-2018 Frederic Weisbecker <[email protected]>

nohz: Allow to check if remote CPU tick is stopped

This check is racy but provides a good heuristic to determine whether
a CPU may need a remote tick or not.
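
[ ed: a sketch of the added helper; racy by design, per the text
  above: ]

	/* racy snapshot of a remote CPU's tick state, good enough as a
	   heuristic */
	bool tick_nohz_tick_stopped_cpu(int cpu)
	{
		struct tick_sched *ts = per_cpu_ptr(&tick_cpu_sched, cpu);

		return ts->tick_stopped;
	}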

Signed-off-by: Frederic Weisbecker <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Luiz Capitulino <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Wanpeng Li <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>



# a3642983 21-Feb-2018 Frederic Weisbecker <[email protected]>

nohz: Convert tick_nohz_tick_stopped() to bool

It makes this function more self-explanatory about what it does and how
to use it.

Reported-by: Thomas Gleixner <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Luiz Capitulino <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Wanpeng Li <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>


