|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3 |
|
| #
6142be7e |
| 10-Oct-2024 |
Thomas Weißschuh <[email protected]> |
powerpc: Split systemcfg struct definitions out from vdso
The systemcfg data has nothing to do anymore with the vdso. Split it into a dedicated header file.
Signed-off-by: Thomas Weißschuh <thomas.
powerpc: Split systemcfg struct definitions out from vdso
The systemcfg data has nothing to do anymore with the vdso. Split it into a dedicated header file.
Signed-off-by: Thomas Weißschuh <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/all/[email protected]
show more ...
|
| #
1184674d |
| 10-Oct-2024 |
Thomas Weißschuh <[email protected]> |
powerpc: Split systemcfg data out of vdso data page
The systemcfg data only has minimal overlap with the vdso data. Splitting the two avoids mapping the implementation-defined vdso data into /proc/p
powerpc: Split systemcfg data out of vdso data page
The systemcfg data only has minimal overlap with the vdso data. Splitting the two avoids mapping the implementation-defined vdso data into /proc/ppc64/systemcfg. It is also a preparation for the standardization of vdso data storage.
The only field actually used by both systemcfg and vdso is tb_ticks_per_sec and it is only changed once during time_init(). Initialize it in both structures there.
Signed-off-by: Thomas Weißschuh <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/all/[email protected]
show more ...
|
|
Revision tags: v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5 |
|
| #
b76e0d42 |
| 22-Aug-2024 |
Haren Myneni <[email protected]> |
powerpc/pseries: Use correct data types from pseries_hp_errorlog struct
_be32 type is defined for some elements in pseries_hp_errorlog struct but also used them u32 after be32_to_cpu() conversion.
powerpc/pseries: Use correct data types from pseries_hp_errorlog struct
_be32 type is defined for some elements in pseries_hp_errorlog struct but also used them u32 after be32_to_cpu() conversion.
Example: In handle_dlpar_errorlog() hp_elog->_drc_u.drc_index = be32_to_cpu(hp_elog->_drc_u.drc_index);
And later assigned to u32 type dlpar_cpu() - u32 drc_index = hp_elog->_drc_u.drc_index;
This incorrect usage is giving the following warnings and the patch resolve these warnings with the correct assignment.
arch/powerpc/platforms/pseries/dlpar.c:398:53: sparse: sparse: incorrect type in argument 1 (different base types) @@ expected unsigned int [usertype] drc_index @@ got restricted __be32 [usertype] drc_index @@ ... arch/powerpc/platforms/pseries/dlpar.c:418:43: sparse: sparse: incorrect type in assignment (different base types) @@ expected restricted __be32 [usertype] drc_count @@ got unsigned int [usertype] @@
Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Haren Myneni <[email protected]>
v3: - Fix warnings from using incorrect data types in pseries_hp_errorlog struct v2: - Remove pr_info() and TODO comments - Update more information in the commit logs
Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
|
Revision tags: v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1 |
|
| #
d1099e22 |
| 05-Jul-2023 |
Michael Ellerman <[email protected]> |
powerpc/pseries: Honour current SMT state when DLPAR onlining CPUs
Integrate with the generic SMT support, so that when a CPU is DLPAR onlined it is brought up with the correct SMT mode.
Signed-off
powerpc/pseries: Honour current SMT state when DLPAR onlining CPUs
Integrate with the generic SMT support, so that when a CPU is DLPAR onlined it is brought up with the correct SMT mode.
Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
| #
3b3a4d0f |
| 05-Jul-2023 |
Michael Ellerman <[email protected]> |
powerpc/pseries: Initialise CPU hotplug callbacks earlier
As part of the generic HOTPLUG_SMT code, there is support for disabling secondary SMT threads at boot time, by passing "nosmt" on the kernel
powerpc/pseries: Initialise CPU hotplug callbacks earlier
As part of the generic HOTPLUG_SMT code, there is support for disabling secondary SMT threads at boot time, by passing "nosmt" on the kernel command line.
The way that is implemented is the secondary threads are brought partly online, and then taken back offline again. That is done to support x86 CPUs needing certain initialisation done on all threads. However powerpc has similar needs, see commit d70a54e2d085 ("powerpc/powernv: Ignore smt-enabled on Power8 and later").
For that to work the powerpc CPU hotplug callbacks need to be registered before secondary CPUs are brought online, otherwise __cpu_disable() fails due to smp_ops->cpu_disable being NULL.
So split the basic initialisation into pseries_cpu_hotplug_init() which can be called early from setup_arch(). The DLPAR related initialisation can still be done later, because it needs to do allocations.
Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
|
Revision tags: v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2 |
|
| #
857d423c |
| 10-Mar-2023 |
Rob Herring <[email protected]> |
powerpc: Use of_property_present() for testing DT property presence
It is preferred to use typed property access functions (i.e. of_property_read_<type> functions) rather than low-level of_get_prope
powerpc: Use of_property_present() for testing DT property presence
It is preferred to use typed property access functions (i.e. of_property_read_<type> functions) rather than low-level of_get_property/of_find_property functions for reading properties. As part of this, convert of_get_property/of_find_property calls to the recently added of_property_present() helper when we just want to test for presence of a property and nothing more.
Signed-off-by: Rob Herring <[email protected]> [mpe: Drop change in ppc4xx_probe_pci_bridge(), formatting] Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
|
Revision tags: v6.3-rc1, v6.2, v6.2-rc8 |
|
| #
08273c9f |
| 10-Feb-2023 |
Nathan Lynch <[email protected]> |
powerpc/rtas: arch-wide function token lookup conversions
With the tokens for all implemented RTAS functions now available via rtas_function_token(), which is optimal and safe for arbitrary contexts
powerpc/rtas: arch-wide function token lookup conversions
With the tokens for all implemented RTAS functions now available via rtas_function_token(), which is optimal and safe for arbitrary contexts, there is no need to use rtas_token() or cache its result.
Most conversions are trivial, but a few are worth describing in more detail:
* Error injection token comparisons for lockdown purposes are consolidated into a simple predicate: token_is_restricted_errinjct().
* A couple of special cases in block_rtas_call() do not use rtas_token() but perform string comparisons against names in the function table. These are converted to compare against token values instead, which is logically equivalent but less expensive.
* The lookup for the ibm,os-term token can be deferred until needed, instead of caching it at boot to avoid device tree traversal during panic.
* Since rtas_function_token() accesses a read-only data structure without taking any locks, xmon's lookup of set-indicator can be performed as needed instead of cached at startup.
Signed-off-by: Nathan Lynch <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6 |
|
| #
f6aa37c5 |
| 14-Nov-2022 |
Laurent Dufour <[email protected]> |
powerpc/pseries: unregister VPA when hot unplugging a CPU
The VPA should unregister when offlining a CPU. Otherwise there could be a short window where 2 CPUs could share the same VPA.
This happens
powerpc/pseries: unregister VPA when hot unplugging a CPU
The VPA should unregister when offlining a CPU. Otherwise there could be a short window where 2 CPUs could share the same VPA.
This happens because the hypervisor is still keeping the VPA attached to the vCPU even if it became offline.
Here is a potential situation: 1. remove proc A, 2. add proc B. If proc B gets proc A's place in cpu_present_mask, then it registers proc A's VPAs. 3. If proc B is then re-added to the LP, its threads are sharing VPAs with proc A briefly as they come online.
As the hypervisor may check for the VPA's yield_count field oddity, it may detect an unexpected value and kill the LPAR.
Suggested-by: Nathan Lynch <[email protected]> Signed-off-by: Laurent Dufour <[email protected]> Reviewed-by: Nathan Lynch <[email protected]> [mpe: s/cpu_present_map/cpu_present_mask/ in change log] Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4 |
|
| #
6ec4836f |
| 21-Jun-2022 |
Liang He <[email protected]> |
powerpc/pseries: Add missing of_node_put()s in hotplug-cpu.c
In pseries_cpuhp_cache_use_count() and pseries_cpuhp_detach_nodes(), we need carefully hold the reference returned by of_find_next_cache_
powerpc/pseries: Add missing of_node_put()s in hotplug-cpu.c
In pseries_cpuhp_cache_use_count() and pseries_cpuhp_detach_nodes(), we need carefully hold the reference returned by of_find_next_cache_node() and use it to call of_node_put() to keep refcount balance.
Signed-off-by: Liang He <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3 |
|
| #
48b63961 |
| 11-Apr-2022 |
Oscar Salvador <[email protected]> |
powerpc/numa: Associate numa node to its cpu earlier
powerpc is the only platform that do not rely on cpu_up()->try_online_node() to bring up a numa node, and special cases it, instead, deep in its
powerpc/numa: Associate numa node to its cpu earlier
powerpc is the only platform that do not rely on cpu_up()->try_online_node() to bring up a numa node, and special cases it, instead, deep in its own machinery:
dlpar_online_cpu find_and_online_cpu_nid try_online_node
This should not be needed, but the thing is that the try_online_node() from cpu_up() will not apply on the right node, because cpu_to_node() will return the old mapping numa<->cpu that gets set on boot stage for all possible cpus.
That can be seen easily if we try to print out the numa node passed to try_online_node() in cpu_up().
The thing is that the numa<->cpu mapping does not get updated till a much later stage in start_secondary:
start_secondary: set_numa_node(numa_cpu_lookup_table[cpu])
But we do not really care, as we already now the CPU <-> NUMA associativity back in find_and_online_cpu_nid(), so let us make use of that and set the proper numa<->cpu mapping, so cpu_to_node() in cpu_up() returns the right node and try_online_node() can do its work.
Signed-off-by: Oscar Salvador <[email protected]> Tested-by: Geetika Moolchandani <[email protected]> Reviewed-by: Srikar Dronamraju <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1 |
|
| #
3b54c715 |
| 05-Nov-2021 |
Nicholas Piggin <[email protected]> |
powerpc/pseries: use slab context cpumask allocation in CPU hotplug init
Slab is up at this point, using the bootmem allocator triggers a warning. Switch to using the regular cpumask allocator.
Sig
powerpc/pseries: use slab context cpumask allocation in CPU hotplug init
Slab is up at this point, using the bootmem allocator triggers a warning. Switch to using the regular cpumask allocator.
Signed-off-by: Nicholas Piggin <[email protected]> Tested-by: Sachin Sant <[email protected]> Reviewed-by: Nathan Lynch <[email protected]> Reviewed-by: Laurent Dufour <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4 |
|
| #
f9473a65 |
| 27-Sep-2021 |
Nathan Lynch <[email protected]> |
powerpc/pseries/cpuhp: remove obsolete comment from pseries_cpu_die
This comment likely refers to the obsolete DLPAR workflow where some resource state transitions were driven more directly from use
powerpc/pseries/cpuhp: remove obsolete comment from pseries_cpu_die
This comment likely refers to the obsolete DLPAR workflow where some resource state transitions were driven more directly from user space utilities, but it also seems to contradict itself: "Change isolate state to Isolate [...]" is at odds with the preceding sentences, and it does not relate at all to the code that follows.
Remove it to prevent confusion.
Signed-off-by: Nathan Lynch <[email protected]> Reviewed-by: Daniel Henrique Barboza <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
fa2a5dfe |
| 27-Sep-2021 |
Nathan Lynch <[email protected]> |
powerpc/pseries/cpuhp: delete add/remove_by_count code
The core DLPAR code supports two actions (add and remove) and three subtypes of action:
* By DRC index: the action is attempted on a single sp
powerpc/pseries/cpuhp: delete add/remove_by_count code
The core DLPAR code supports two actions (add and remove) and three subtypes of action:
* By DRC index: the action is attempted on a single specified resource. This is the usual case for processors. * By indexed count: the action is attempted on a range of resources beginning at the specified index. This is implemented only by the memory DLPAR code. * By count: the lower layer (CPU or memory) is responsible for locating the specified number of resources to which the action can be applied.
I cannot find any evidence of the "by count" subtype being used by drmgr or qemu for processors. And when I try to exercise this code, the add case does not work:
$ ppc64_cpu --smt ; nproc SMT=8 24 $ printf "cpu remove count 2" > /sys/kernel/dlpar $ nproc 8 $ printf "cpu add count 2" > /sys/kernel/dlpar -bash: printf: write error: Invalid argument $ dmesg | tail -2 pseries-hotplug-cpu: Failed to find enough CPUs (1 of 2) to add dlpar: Could not handle DLPAR request "cpu add count 2" $ nproc 8 $ drmgr -c cpu -a -q 2 # this uses the by-index method Validating CPU DLPAR capability...yes. CPU 1 CPU 17 $ nproc 24
This is because find_drc_info_cpus_to_add() does not increment drc_index appropriately during its search.
This is not hard to fix. But the _by_count() functions also have the property that they attempt to roll back all prior operations if the entire request cannot be satisfied, even though the rollback itself can encounter errors. It's not possible to provide transaction-like behavior at this level, and it's undesirable to have code that can only pretend to do that. Any users of these functions cannot know what the state of the system is in the error case. And the error paths are, to my knowledge, impossible to test without adding custom error injection code.
Summary:
* This code has not worked reliably since its introduction. * There is no evidence that it is used. * It contains questionable rollback behaviors in error paths which are difficult to test.
So let's remove it.
Fixes: ac71380071d1 ("powerpc/pseries: Add CPU dlpar remove functionality") Fixes: 90edf184b9b7 ("powerpc/pseries: Add CPU dlpar add functionality") Fixes: b015f6bc9547 ("powerpc/pseries: Add cpu DLPAR support for drc-info property") Signed-off-by: Nathan Lynch <[email protected]> Tested-by: Daniel Henrique Barboza <[email protected]> Reviewed-by: Daniel Henrique Barboza <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
7edd5c9a |
| 27-Sep-2021 |
Nathan Lynch <[email protected]> |
powerpc/pseries/cpuhp: cache node corrections
On pseries, cache nodes in the device tree can be added and removed by the CPU DLPAR code as well as the partition migration (mobility) code. PowerVM pa
powerpc/pseries/cpuhp: cache node corrections
On pseries, cache nodes in the device tree can be added and removed by the CPU DLPAR code as well as the partition migration (mobility) code. PowerVM partitions in dedicated processor mode typically have L2 and L3 cache nodes.
The CPU DLPAR code has the following shortcomings:
* Cache nodes returned as siblings of a new CPU node by ibm,configure-connector are silently discarded; only the CPU node is added to the device tree.
* Cache nodes which become unreferenced in the processor removal path are not removed from the device tree. This can lead to duplicate nodes when the post-migration device tree update code replaces cache nodes.
This is long-standing behavior. Presumably it has gone mostly unnoticed because the two bugs have the property of obscuring each other in common simple scenarios (e.g. remove a CPU and add it back). Likely you'd notice only if you cared to inspect the device tree or the sysfs cacheinfo information.
Booted with two processors:
$ pwd /sys/firmware/devicetree/base/cpus $ ls -1d */ l2-cache@2010/ l2-cache@2011/ l3-cache@3110/ l3-cache@3111/ PowerPC,POWER9@0/ PowerPC,POWER9@8/ $ lsprop */l2-cache l2-cache@2010/l2-cache 00003110 (12560) l2-cache@2011/l2-cache 00003111 (12561) PowerPC,POWER9@0/l2-cache 00002010 (8208) PowerPC,POWER9@8/l2-cache 00002011 (8209) $ ls /sys/devices/system/cpu/cpu0/cache/ index0 index1 index2 index3
After DLPAR-adding PowerPC,POWER9@10, we see that its associated cache nodes are absent, its threads' L2+L3 cacheinfo is unpopulated, and it is missing a cache level in its sched domain hierarchy:
$ ls -1d */ l2-cache@2010/ l2-cache@2011/ l3-cache@3110/ l3-cache@3111/ PowerPC,POWER9@0/ PowerPC,POWER9@10/ PowerPC,POWER9@8/ $ lsprop PowerPC\,POWER9@10/l2-cache PowerPC,POWER9@10/l2-cache 00002012 (8210) $ ls /sys/devices/system/cpu/cpu16/cache/ index0 index1 $ grep . /sys/kernel/debug/sched/domains/cpu{0,8,16}/domain*/name /sys/kernel/debug/sched/domains/cpu0/domain0/name:SMT /sys/kernel/debug/sched/domains/cpu0/domain1/name:CACHE /sys/kernel/debug/sched/domains/cpu0/domain2/name:DIE /sys/kernel/debug/sched/domains/cpu8/domain0/name:SMT /sys/kernel/debug/sched/domains/cpu8/domain1/name:CACHE /sys/kernel/debug/sched/domains/cpu8/domain2/name:DIE /sys/kernel/debug/sched/domains/cpu16/domain0/name:SMT /sys/kernel/debug/sched/domains/cpu16/domain1/name:DIE
When removing PowerPC,POWER9@8, we see that its cache nodes are left behind:
$ ls -1d */ l2-cache@2010/ l2-cache@2011/ l3-cache@3110/ l3-cache@3111/ PowerPC,POWER9@0/
When DLPAR is combined with VM migration, we can get duplicate nodes. E.g. removing one processor, then migrating, adding a processor, and then migrating again can result in warnings from the OF core during post-migration device tree updates:
Duplicate name in cpus, renamed to "l2-cache@2011#1" Duplicate name in cpus, renamed to "l3-cache@3111#1"
and nodes with duplicated phandles in the tree, making lookup behavior unpredictable:
$ lsprop l[23]-cache@*/ibm,phandle l2-cache@2010/ibm,phandle 00002010 (8208) l2-cache@2011#1/ibm,phandle 00002011 (8209) l2-cache@2011/ibm,phandle 00002011 (8209) l3-cache@3110/ibm,phandle 00003110 (12560) l3-cache@3111#1/ibm,phandle 00003111 (12561) l3-cache@3111/ibm,phandle 00003111 (12561)
Address these issues by:
* Correctly processing siblings of the node returned from dlpar_configure_connector(). * Removing cache nodes in the CPU remove path when it can be determined that they are not associated with other CPUs or caches.
Use the of_changeset API in both cases, which allows us to keep the error handling in this code from becoming more complex while ensuring that the device tree cannot become inconsistent.
Fixes: ac71380071d1 ("powerpc/pseries: Add CPU dlpar remove functionality") Fixes: 90edf184b9b7 ("powerpc/pseries: Add CPU dlpar add functionality") Signed-off-by: Nathan Lynch <[email protected]> Tested-by: Daniel Henrique Barboza <[email protected]> Reviewed-by: Daniel Henrique Barboza <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7 |
|
| #
8b893ef1 |
| 16-Aug-2021 |
Michael Ellerman <[email protected]> |
powerpc/pseries: Fix build error when NUMA=n
As reported by lkp, if NUMA=n we see a build error:
arch/powerpc/platforms/pseries/hotplug-cpu.c: In function 'pseries_cpu_hotplug_init': arch/pow
powerpc/pseries: Fix build error when NUMA=n
As reported by lkp, if NUMA=n we see a build error:
arch/powerpc/platforms/pseries/hotplug-cpu.c: In function 'pseries_cpu_hotplug_init': arch/powerpc/platforms/pseries/hotplug-cpu.c:1022:8: error: 'node_to_cpumask_map' undeclared 1022 | node_to_cpumask_map[node]);
Use cpumask_of_node() which has an empty stub for NUMA=n, and when NUMA=y does a lookup from node_to_cpumask_map[].
Fixes: bd1dd4c5f528 ("powerpc/pseries: Prevent free CPU ids being reused on another node") Reported-by: kernel test robot <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.14-rc6 |
|
| #
8ddc6448 |
| 12-Aug-2021 |
Aneesh Kumar K.V <[email protected]> |
powerpc/pseries: Consolidate different NUMA distance update code paths
The associativity details of the newly added resourced are collected from the hypervisor via "ibm,configure-connector" rtas cal
powerpc/pseries: Consolidate different NUMA distance update code paths
The associativity details of the newly added resourced are collected from the hypervisor via "ibm,configure-connector" rtas call. Update the numa distance details of the newly added numa node after the above call.
Instead of updating NUMA distance every time we lookup a node id from the associativity property, add helpers that can be used during boot which does this only once. Also remove the distance update from node id lookup helpers.
Currently, we duplicate parsing code for ibm,associativity and ibm,associativity-lookup-arrays in the kernel. The associativity array provided by these device tree properties are very similar and hence can use a helper to parse the node id and numa distance details.
Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1 |
|
| #
bd1dd4c5 |
| 29-Apr-2021 |
Laurent Dufour <[email protected]> |
powerpc/pseries: Prevent free CPU ids being reused on another node
When a CPU is hot added, the CPU ids are taken from the available mask from the lower possible set. If that set of values was previ
powerpc/pseries: Prevent free CPU ids being reused on another node
When a CPU is hot added, the CPU ids are taken from the available mask from the lower possible set. If that set of values was previously used for a CPU attached to a different node, it appears to an application as if these CPUs have migrated from one node to another node which is not expected.
To prevent this, it is needed to record the CPU ids used for each node and to not reuse them on another node. However, to prevent CPU hot plug to fail, in the case the CPU ids is starved on a node, the capability to reuse other nodes’ free CPU ids is kept. A warning is displayed in such a case to warn the user.
A new CPU bit mask (node_recorded_ids_map) is introduced for each possible node. It is populated with the CPU onlined at boot time, and then when a CPU is hot plugged to a node. The bits in that mask remain when the CPU is hot unplugged, to remind this CPU ids have been used for this node.
If no id set was found, a retry is made without removing the ids used on the other nodes to try reusing them. This is the way ids have been allocated prior to this patch.
The effect of this patch can be seen by removing and adding CPUs using the Qemu monitor. In the following case, the first CPU from the node 2 is removed, then the first one from the node 1 is removed too. Later, the first CPU of the node 2 is added back. Without that patch, the kernel will number these CPUs using the first CPU ids available which are the ones freed when removing the second CPU of the node 0. This leads to the CPU ids 16-23 to move from the node 1 to the node 2. With the patch applied, the CPU ids 32-39 are used since they are the lowest free ones which have not been used on another node.
At boot time: [root@vm40 ~]# numactl -H | grep cpus node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 node 1 cpus: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 node 2 cpus: 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
Vanilla kernel, after the CPU hot unplug/plug operations: [root@vm40 ~]# numactl -H | grep cpus node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 node 1 cpus: 24 25 26 27 28 29 30 31 node 2 cpus: 16 17 18 19 20 21 22 23 40 41 42 43 44 45 46 47
Patched kernel, after the CPU hot unplug/plug operations: [root@vm40 ~]# numactl -H | grep cpus node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 node 1 cpus: 24 25 26 27 28 29 30 31 node 2 cpus: 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
Signed-off-by: Laurent Dufour <[email protected]> Reviewed-by: Nathan Lynch <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.12, v5.12-rc8 |
|
| #
ed8029d7 |
| 18-Apr-2021 |
Michael Ellerman <[email protected]> |
powerpc/pseries: Stop calling printk in rtas_stop_self()
RCU complains about us calling printk() from an offline CPU:
============================= WARNING: suspicious RCU usage 5.12.0-rc7-02
powerpc/pseries: Stop calling printk in rtas_stop_self()
RCU complains about us calling printk() from an offline CPU:
============================= WARNING: suspicious RCU usage 5.12.0-rc7-02874-g7cf90e481cb8 #1 Not tainted ----------------------------- kernel/locking/lockdep.c:3568 RCU-list traversed in non-reader section!!
other info that might help us debug this:
RCU used illegally from offline CPU! rcu_scheduler_active = 2, debug_locks = 1 no locks held by swapper/0/0.
stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.12.0-rc7-02874-g7cf90e481cb8 #1 Call Trace: dump_stack+0xec/0x144 (unreliable) lockdep_rcu_suspicious+0x124/0x144 __lock_acquire+0x1098/0x28b0 lock_acquire+0x128/0x600 _raw_spin_lock_irqsave+0x6c/0xc0 down_trylock+0x2c/0x70 __down_trylock_console_sem+0x60/0x140 vprintk_emit+0x1a8/0x4b0 vprintk_func+0xcc/0x200 printk+0x40/0x54 pseries_cpu_offline_self+0xc0/0x120 arch_cpu_idle_dead+0x54/0x70 do_idle+0x174/0x4a0 cpu_startup_entry+0x38/0x40 rest_init+0x268/0x388 start_kernel+0x748/0x790 start_here_common+0x1c/0x614
Which happens because by the time we get to rtas_stop_self() we are already offline. In addition the message can be spammy, and is not that helpful for users, so remove it.
Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
29c9a269 |
| 16-Apr-2021 |
Daniel Henrique Barboza <[email protected]> |
powerpc/pseries: Set UNISOLATE on dlpar_cpu_remove() failure
The RTAS set-indicator call, when attempting to UNISOLATE a DRC that is already UNISOLATED or CONFIGURED, returns RTAS_OK and does nothin
powerpc/pseries: Set UNISOLATE on dlpar_cpu_remove() failure
The RTAS set-indicator call, when attempting to UNISOLATE a DRC that is already UNISOLATED or CONFIGURED, returns RTAS_OK and does nothing else for both QEMU and phyp. This gives us an opportunity to use this behavior to signal the hypervisor layer when an error during device removal happens, allowing it to do a proper error handling, while not breaking QEMU/phyp implementations that don't have this support.
This patch introduces this idea by unisolating all CPU DRCs that failed to be removed by dlpar_cpu_remove_by_index(), when handling the PSERIES_HP_ELOG_ID_DRC_INDEX event. This is being done for this event only because its the only CPU removal event QEMU uses, and there's no need at this moment to add this mechanism for phyp only code.
Signed-off-by: Daniel Henrique Barboza <[email protected]> Reviewed-by: David Gibson <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.12-rc7, v5.12-rc6, v5.12-rc5 |
|
| #
d19b3ad0 |
| 23-Mar-2021 |
Daniel Henrique Barboza <[email protected]> |
powerpc/pseries/hotplug-cpu: Show 'last online CPU' error in dlpar_cpu_offline()
One of the reasons that dlpar_cpu_offline can fail is when attempting to offline the last online CPU of the kernel. T
powerpc/pseries/hotplug-cpu: Show 'last online CPU' error in dlpar_cpu_offline()
One of the reasons that dlpar_cpu_offline can fail is when attempting to offline the last online CPU of the kernel. This can be observed in a pseries QEMU guest that has hotplugged CPUs. If the user offlines all other CPUs of the guest, and a hotplugged CPU is now the last online CPU, trying to reclaim it will fail.
The current error message in this situation returns rc with -EBUSY and a generic explanation, e.g.:
pseries-hotplug-cpu: Failed to offline CPU PowerPC,POWER9, rc: -16
EBUSY can be caused by other conditions, such as cpu_hotplug_disable being true. Throwing a more specific error message for this case, instead of just "Failed to offline CPU", makes it clearer that the error is in fact a known error situation instead of other generic/unknown cause.
This patch adds a 'last online' check in dlpar_cpu_offline() to catch the 'last online CPU' offline error, eturning a more informative error message:
pseries-hotplug-cpu: Unable to remove last online CPU PowerPC,POWER9
Signed-off-by: Daniel Henrique Barboza <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6 |
|
| #
01b0f0ea |
| 26-Nov-2020 |
Nicholas Piggin <[email protected]> |
powerpc/64s: Trim offlined CPUs from mm_cpumasks
When offlining a CPU, powerpc/64s does not flush TLBs, rather it just leaves the CPU set in mm_cpumasks, so it continues to receive TLBIEs to manage
powerpc/64s: Trim offlined CPUs from mm_cpumasks
When offlining a CPU, powerpc/64s does not flush TLBs, rather it just leaves the CPU set in mm_cpumasks, so it continues to receive TLBIEs to manage its TLBs.
However the exit_flush_lazy_tlbs() function expects that after returning, all CPUs (except self) have flushed TLBs for that mm, in which case TLBIEL can be used for this flush. This breaks for offline CPUs because they don't get the IPI to flush their TLB. This can lead to stale translations.
Fix this by clearing the CPU from mm_cpumasks, then flushing all TLBs before going offline.
These offlined CPU bits stuck in the cpumask also prevents the cpumask from being trimmed back to local mode, which means continual broadcast IPIs or TLBIEs are needed for TLB flushing. This patch prevents that situation too.
A cast of many were involved in working this out, but in particular Milton, Aneesh, Paul made key discoveries.
Fixes: 0cef77c7798a7 ("powerpc/64s/radix: flush remote CPUs out of single-threaded mm_cpumask") Signed-off-by: Nicholas Piggin <[email protected]> Reviewed-by: Aneesh Kumar K.V <[email protected]> Debugged-by: Milton Miller <[email protected]> Debugged-by: Aneesh Kumar K.V <[email protected]> Debugged-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.10-rc5, v5.10-rc4 |
|
| #
a40fdaf1 |
| 11-Nov-2020 |
Zhang Xiaoxu <[email protected]> |
Revert "powerpc/pseries/hotplug-cpu: Remove double free in error path"
This reverts commit a0ff72f9f5a780341e7ff5e9ba50a0dad5fa1980.
Since the commit b015f6bc9547 ("powerpc/pseries: Add cpu DLPAR s
Revert "powerpc/pseries/hotplug-cpu: Remove double free in error path"
This reverts commit a0ff72f9f5a780341e7ff5e9ba50a0dad5fa1980.
Since the commit b015f6bc9547 ("powerpc/pseries: Add cpu DLPAR support for drc-info property"), the 'cpu_drcs' wouldn't be double freed when the 'cpus' node not found.
So we needn't apply this patch, otherwise, the memory will be leaked.
Fixes: a0ff72f9f5a7 ("powerpc/pseries/hotplug-cpu: Remove double free in error path") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Zhang Xiaoxu <[email protected]> [mpe: Caused by me applying a patch to a function that had changed in the interim] Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2 |
|
| #
39f87561 |
| 19-Aug-2020 |
Michael Ellerman <[email protected]> |
powerpc/smp: Move ppc_md.cpu_die() to smp_ops.cpu_offline_self()
We have smp_ops->cpu_die() and ppc_md.cpu_die(). One of them offlines the current CPU and one offlines another CPU, can you guess whi
powerpc/smp: Move ppc_md.cpu_die() to smp_ops.cpu_offline_self()
We have smp_ops->cpu_die() and ppc_md.cpu_die(). One of them offlines the current CPU and one offlines another CPU, can you guess which is which? Also one is in smp_ops and one is in ppc_md?
So rename ppc_md.cpu_die(), to cpu_offline_self(), because that's what it does. And move it into smp_ops where it belongs.
Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.9-rc1 |
|
| #
801980f6 |
| 11-Aug-2020 |
Michael Roth <[email protected]> |
powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death
For a power9 KVM guest with XIVE enabled, running a test loop where we hotplug 384 vcpus and then unplug them, the following traces can
powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death
For a power9 KVM guest with XIVE enabled, running a test loop where we hotplug 384 vcpus and then unplug them, the following traces can be seen (generally within a few loops) either from the unplugged vcpu:
cpu 65 (hwid 65) Ready to die... Querying DEAD? cpu 66 (66) shows 2 list_del corruption. next->prev should be c00a000002470208, but was c00a000002470048 ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:56! Oops: Exception in kernel mode, sig: 5 [#1] LE SMP NR_CPUS=2048 NUMA pSeries Modules linked in: fuse nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 ... CPU: 66 PID: 0 Comm: swapper/66 Kdump: loaded Not tainted 4.18.0-221.el8.ppc64le #1 NIP: c0000000007ab50c LR: c0000000007ab508 CTR: 00000000000003ac REGS: c0000009e5a17840 TRAP: 0700 Not tainted (4.18.0-221.el8.ppc64le) MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28000842 XER: 20040000 ... NIP __list_del_entry_valid+0xac/0x100 LR __list_del_entry_valid+0xa8/0x100 Call Trace: __list_del_entry_valid+0xa8/0x100 (unreliable) free_pcppages_bulk+0x1f8/0x940 free_unref_page+0xd0/0x100 xive_spapr_cleanup_queue+0x148/0x1b0 xive_teardown_cpu+0x1bc/0x240 pseries_mach_cpu_die+0x78/0x2f0 cpu_die+0x48/0x70 arch_cpu_idle_dead+0x20/0x40 do_idle+0x2f4/0x4c0 cpu_startup_entry+0x38/0x40 start_secondary+0x7bc/0x8f0 start_secondary_prolog+0x10/0x14
or on the worker thread handling the unplug:
pseries-hotplug-cpu: Attempting to remove CPU <NULL>, drc index: 1000013a Querying DEAD? cpu 314 (314) shows 2 BUG: Bad page state in process kworker/u768:3 pfn:95de1 cpu 314 (hwid 314) Ready to die... page:c00a000002577840 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0 flags: 0x5ffffc00000000() raw: 005ffffc00000000 5deadbeef0000100 5deadbeef0000200 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffff7f 0000000000000000 page dumped because: nonzero mapcount Modules linked in: kvm xt_CHECKSUM ipt_MASQUERADE xt_conntrack ... CPU: 0 PID: 548 Comm: kworker/u768:3 Kdump: loaded Not tainted 4.18.0-224.el8.bz1856588.ppc64le #1 Workqueue: pseries hotplug workque pseries_hp_work_fn Call Trace: dump_stack+0xb0/0xf4 (unreliable) bad_page+0x12c/0x1b0 free_pcppages_bulk+0x5bc/0x940 page_alloc_cpu_dead+0x118/0x120 cpuhp_invoke_callback.constprop.5+0xb8/0x760 _cpu_down+0x188/0x340 cpu_down+0x5c/0xa0 cpu_subsys_offline+0x24/0x40 device_offline+0xf0/0x130 dlpar_offline_cpu+0x1c4/0x2a0 dlpar_cpu_remove+0xb8/0x190 dlpar_cpu_remove_by_index+0x12c/0x150 dlpar_cpu+0x94/0x800 pseries_hp_work_fn+0x128/0x1e0 process_one_work+0x304/0x5d0 worker_thread+0xcc/0x7a0 kthread+0x1ac/0x1c0 ret_from_kernel_thread+0x5c/0x80
The latter trace is due to the following sequence:
page_alloc_cpu_dead drain_pages drain_pages_zone free_pcppages_bulk
where drain_pages() in this case is called under the assumption that the unplugged cpu is no longer executing. To ensure that is the case, and early call is made to __cpu_die()->pseries_cpu_die(), which runs a loop that waits for the cpu to reach a halted state by polling its status via query-cpu-stopped-state RTAS calls. It only polls for 25 iterations before giving up, however, and in the trace above this results in the following being printed only .1 seconds after the hotplug worker thread begins processing the unplug request:
pseries-hotplug-cpu: Attempting to remove CPU <NULL>, drc index: 1000013a Querying DEAD? cpu 314 (314) shows 2
At that point the worker thread assumes the unplugged CPU is in some unknown/dead state and procedes with the cleanup, causing the race with the XIVE cleanup code executed by the unplugged CPU.
Fix this by waiting indefinitely, but also making an effort to avoid spurious lockup messages by allowing for rescheduling after polling the CPU status and printing a warning if we wait for longer than 120s.
Fixes: eac1e731b59ee ("powerpc/xive: guest exploitation of the XIVE interrupt controller") Suggested-by: Michael Ellerman <[email protected]> Signed-off-by: Michael Roth <[email protected]> Tested-by: Greg Kurz <[email protected]> Reviewed-by: Thiago Jung Bauermann <[email protected]> Reviewed-by: Greg Kurz <[email protected]> [mpe: Trim oopses in change log slightly for readability] Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1 |
|
| #
a0ff72f9 |
| 19-Sep-2019 |
Nathan Lynch <[email protected]> |
powerpc/pseries/hotplug-cpu: Remove double free in error path
In the unlikely event that the device tree lacks a /cpus node, find_dlpar_cpus_to_add() oddly frees the cpu_drcs buffer it has been pass
powerpc/pseries/hotplug-cpu: Remove double free in error path
In the unlikely event that the device tree lacks a /cpus node, find_dlpar_cpus_to_add() oddly frees the cpu_drcs buffer it has been passed before returning an error. Its only caller also frees the buffer on error.
Remove the less conventional kfree() of a caller-supplied buffer from find_dlpar_cpus_to_add().
Fixes: 90edf184b9b7 ("powerpc/pseries: Add CPU dlpar add functionality") Signed-off-by: Nathan Lynch <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|