History log of /linux-6.15/kernel/irq/affinity.c (Results 1 – 25 of 47)
Revision Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2
# f7b3ea8c 27-Dec-2022 Ming Lei <[email protected]>

genirq/affinity: Move group_cpus_evenly() into lib/

group_cpus_evenly() has become a generic function which can be used by
subsystems other than the interrupt subsystem, so move it into lib/.

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
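
For context, group_cpus_evenly() returns a kcalloc()'ed array of
'numgrps' cpumasks covering all possible CPUs, or NULL on allocation
failure, and the caller owns the array. A minimal sketch of a consumer
outside the interrupt core, assuming the lib/ interface as moved here
(assign_queues_to_cpus() is a hypothetical caller, not part of this
series):

	#include <linux/group_cpus.h>
	#include <linux/slab.h>

	static int assign_queues_to_cpus(unsigned int nr_queues)
	{
		struct cpumask *masks;
		unsigned int i;

		/* one evenly spread group of CPUs per queue */
		masks = group_cpus_evenly(nr_queues);
		if (!masks)
			return -ENOMEM;

		for (i = 0; i < nr_queues; i++)
			pr_debug("queue %u -> CPUs %*pbl\n",
				 i, cpumask_pr_args(&masks[i]));

		kfree(masks);
		return 0;
	}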


# 523f1ea7 27-Dec-2022 Ming Lei <[email protected]>

genirq/affinity: Rename irq_build_affinity_masks as group_cpus_evenly

Map each irq vector into a group, which allows the algorithm to be
abstracted for generic use cases outside of the interrupt core.

Rename irq_build_affinity_masks() as group_cpus_evenly(), so the API can
be reused by blk-mq to build the default queue mapping even though no
irq vectors are involved.

No functional change, just a rename of 'vector' to 'group'.

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


# e7bdd7f0 27-Dec-2022 Ming Lei <[email protected]>

genirq/affinity: Don't pass irq_affinity_desc array to irq_build_affinity_masks

Prepare for abstracting irq_build_affinity_masks() into a public function
for assigning all CPUs evenly into several groups.

Don't pass the irq_affinity_desc array to irq_build_affinity_masks();
instead return a cpumask array, storing each assigned group in one
element of the array.

This makes it possible to provide a generic interface for grouping all
CPUs evenly from a NUMA and CPU locality viewpoint, at the cost of one
extra allocation in irq_build_affinity_masks(), which should be fine
since it is done via GFP_KERNEL and irq_build_affinity_masks() is a slow
path anyway.

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: John Garry <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
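
The shape of the change, sketched (parameter lists abridged; names follow
the changelog rather than the exact diff):

	/* Before: the caller supplies the descriptor array to fill */
	static int irq_build_affinity_masks(unsigned int numvecs,
					    struct irq_affinity_desc *masks);

	/*
	 * After: the function allocates and returns one cpumask per
	 * group, so callers that know nothing about irq descriptors
	 * can consume (and later kfree()) the result.
	 */
	static struct cpumask *irq_build_affinity_masks(unsigned int numvecs);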


# 1f962d91 27-Dec-2022 Ming Lei <[email protected]>

genirq/affinity: Pass affinity managed mask array to irq_build_affinity_masks

Pass affinity managed mask array to irq_build_affinity_masks() so that the
index of the first affinity managed vector is always zero.

This allows the implementation to be simplified a bit.

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: John Garry <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


# cdf07f0e 27-Dec-2022 Ming Lei <[email protected]>

genirq/affinity: Remove the 'firstvec' parameter from irq_build_affinity_masks

The 'firstvec' parameter is always the same as the 'startvec' parameter,
so use 'startvec' directly inside irq_build_affinity_masks().

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: John Garry <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


Revision tags: v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2
# 99248e35 23-Jan-2022 Yury Norov <[email protected]>

genirq/affinity: replace cpumask_weight with cpumask_empty where appropriate

__irq_build_affinity_masks() calls cpumask_weight() to check if
any bit of a given cpumask is set. We can do it more efficiently with
cpumask_empty() because cpumask_empty() stops traversing the cpumask as
soon as it finds the first set bit, while cpumask_weight() counts all
bits unconditionally.

Signed-off-by: Yury Norov <[email protected]>


# 08d835df 31-Mar-2022 Rei Yamamoto <[email protected]>

genirq/affinity: Consider that CPUs on nodes can be unbalanced

If CPUs on a node are offline at boot time, the number of nodes is
different when building affinity masks for present cpus and when building
affinity masks for possible cpus. This causes the following problem:

When the number of vectors is less than the number of nodes, there are
cases where bits of the masks for present cpus are overwritten when
building the masks for possible cpus.

Fix this by excluding CPUs which are not part of the current build mask
(present/possible).

[ tglx: Massaged changelog and added comment ]

Fixes: b82592199032 ("genirq/affinity: Spread IRQs to all available NUMA nodes")
Signed-off-by: Rei Yamamoto <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]


# 911488de 10-Feb-2022 Yury Norov <[email protected]>

genirq/affinity: Replace cpumask_weight() with cpumask_empty() where appropriate

__irq_build_affinity_masks() calls cpumask_weight() to check if any bit of
a given cpumask is set.

This can be done more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds the
first set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
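
The pattern of the change, as a sketch ('nmsk' stands for whichever node
mask the code inspects):

	/* Before: counts every set bit just to test for zero */
	if (!cpumask_weight(nmsk))
		continue;

	/* After: stops at the first set bit */
	if (cpumask_empty(nmsk))
		continue;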


Revision tags: v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5
# 428e2116 03-Aug-2021 Sebastian Andrzej Siewior <[email protected]>

genirq/affinity: Replace deprecated CPU-hotplug functions.

The functions get_online_cpus() and put_online_cpus() have been
deprecated during the CPU hotplug rework. They map directly to
cpus_read_lock() and cpus_read_unlock().

Replace deprecated CPU-hotplug functions with the official version.
The behavior remains unchanged.

Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
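
The mechanical shape of the replacement, sketched (the locked region is
unchanged):

	/* Before (deprecated wrappers): */
	get_online_cpus();
	/* ... read cpu_online_mask, build masks ... */
	put_online_cpus();

	/* After (official API, identical semantics): */
	cpus_read_lock();
	/* ... read cpu_online_mask, build masks ... */
	cpus_read_unlock();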


Revision tags: v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7
# 101f85b5 28-Aug-2019 Ming Lei <[email protected]>

genirq/affinity: Remove const qualifier from node_to_cpumask argument

When CONFIG_CPUMASK_OFFSTACK isn't enabled, 'cpumask_var_t' is defined as

'typedef struct cpumask cpumask_var_t[1]',

so the 'node_to_cpumask' argument of alloc_nodes_vectors() can't be
declared as 'const cpumask_var_t *'.

Fixes the following warning:

kernel/irq/affinity.c: In function '__irq_build_affinity_masks':
alloc_nodes_vectors(numvecs, node_to_cpumask, cpu_mask,
^
kernel/irq/affinity.c:128:13: note: expected 'const struct cpumask (*)[1]' but argument is of type 'struct cpumask (*)[1]'
static void alloc_nodes_vectors(unsigned int numvecs,
^
Fixes: b1a5a73e64e9 ("genirq/affinity: Spread vectors on node according to nr_cpu ratio")
Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
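
The root cause, sketched with illustrative declarations (not the exact
diff):

	/* With CONFIG_CPUMASK_OFFSTACK=n: */
	typedef struct cpumask cpumask_var_t[1];

	/*
	 * A 'cpumask_var_t *' argument then has type
	 * 'struct cpumask (*)[1]'. C performs no implicit conversion
	 * from pointer-to-array to pointer-to-const-array, so passing
	 * the caller's non-const array to a parameter declared
	 * 'const cpumask_var_t *' triggers the warning above. Dropping
	 * the 'const' qualifier avoids it:
	 */
	static void alloc_nodes_vectors(unsigned int numvecs,
					cpumask_var_t *node_to_cpumask
					/* further parameters omitted */);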


Revision tags: v5.3-rc6, v5.3-rc5
# b1a5a73e 16-Aug-2019 Ming Lei <[email protected]>

genirq/affinity: Spread vectors on node according to nr_cpu ratio

Currently __irq_build_affinity_masks() spreads vectors evenly per node,
but when the NUMA nodes have different numbers of CPUs there are cases
where not all vectors get spread, which triggers the warning in the
spreading code.

Improve the spreading algorithm by

- assigning vectors according to the ratio of the number of CPUs on a node
to the number of remaining CPUs.

- running the assignment from smaller nodes to bigger nodes to guarantee
that every active node gets allocated at least one vector.

This ensures that all vectors are spread out. Aside from that, the
spread becomes fairer if the nodes have different numbers of CPUs.

For example, on the following machine:
CPU(s): 16
On-line CPU(s) list: 0-15
Thread(s) per core: 1
Core(s) per socket: 8
Socket(s): 2
NUMA node(s): 2
...
NUMA node0 CPU(s): 0,1,3,5-9,11,13-15
NUMA node1 CPU(s): 2,4,10,12

When a driver requests to allocate 8 vectors, the following spread results:

irq 31, cpu list 2,4
irq 32, cpu list 10,12
irq 33, cpu list 0-1
irq 34, cpu list 3,5
irq 35, cpu list 6-7
irq 36, cpu list 8-9
irq 37, cpu list 11,13
irq 38, cpu list 14-15

So Node 0 now has 6 and Node 1 has 2 vectors assigned. The original
algorithm assigned 4 vectors on each node, which was unfair to Node 0.

[ tglx: Massaged changelog ]

Reported-by: Jon Derrick <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Jon Derrick <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
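
The assignment rule, sketched in pseudo-C (the iterator and helpers are
hypothetical; the real logic lives in alloc_nodes_vectors()):

	unsigned int remaining_ncpus = total_cpus; /* CPUs not yet covered */
	unsigned int remaining_vecs  = numvecs;    /* vectors not yet assigned */

	for_each_node_by_ascending_ncpus(node) {      /* hypothetical iterator */
		unsigned int ncpus = ncpus_on(node);  /* hypothetical helper */
		unsigned int nvecs =
			max(1U, remaining_vecs * ncpus / remaining_ncpus);

		assign_vectors_to_node(node, nvecs);  /* hypothetical */
		remaining_ncpus -= ncpus;
		remaining_vecs  -= nvecs;
	}

Applied to the machine above: node 1 (4 of 16 CPUs) is handled first and
gets max(1, 8 * 4 / 16) = 2 vectors, leaving node 0 with the remaining 6,
which matches the irq list shown.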


# 53c1788b 16-Aug-2019 Ming Lei <[email protected]>

genirq/affinity: Improve __irq_build_affinity_masks()

One invariant of __irq_build_affinity_masks() is that all CPUs in the
specified masks (cpu_mask AND node_to_cpumask for each node) should be
covered during the spread. Even when the number of requested vectors has
been reached, it's still required to spread vectors among the remaining
CPUs. A similar policy is already applied in the 'numvecs <= nodes'
case.

So remove the following check inside the loop:

if (done >= numvecs)
break;

Meanwhile, assign at least one vector to the remaining nodes if
'numvecs' vectors have already been handled.

Also, if the specified cpumask for a numa node is empty, simply do not
spread vectors on that node.

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]


Revision tags: v5.3-rc4, v5.3-rc3
# 491beed3 05-Aug-2019 Ming Lei <[email protected]>

genirq/affinity: Create affinity mask for single vector

Since commit c66d4bd110a1f8 ("genirq/affinity: Add new callback for
(re)calculating interrupt sets"), irq_create_affinity_masks() returns
NULL in the single vector case. This change has caused regressions in
some drivers, such as lpfc.

The problem is that single vector requests can happen in some generic
cases:

1) kdump kernel

2) the irq vector resource is close to exhaustion

If in that situation the affinity mask for a single vector is not created,
every caller has to handle the special case.

There is no reason why the mask cannot be created, so remove the check for
a single vector and create the mask.

Fixes: c66d4bd110a1f8 ("genirq/affinity: Add new callback for (re)calculating interrupt sets")
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]


Revision tags: v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3
# 0e518330 02-Jun-2019 Minwoo Im <[email protected]>

genirq/affinity: Remove unused argument from [__]irq_build_affinity_masks()

The *affd argument is neither used in irq_build_affinity_masks() nor
__irq_build_affinity_masks(). Remove it.

Signed-off-by: Minwoo Im <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Cc: Minwoo Im <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]


Revision tags: v5.2-rc2, v5.2-rc1, v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5, v5.1-rc4, v5.1-rc3, v5.1-rc2, v5.1-rc1, v5.0, v5.0-rc8, v5.0-rc7
# a6a309ed 16-Feb-2019 Thomas Gleixner <[email protected]>

genirq/affinity: Remove the leftovers of the original set support

Now that the NVME driver is converted over to the calc_sets() callback,
the workarounds of the original set support can be removed.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: [email protected]
Cc: Sagi Grimberg <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Keith Busch <[email protected]>
Cc: Sumit Saxena <[email protected]>
Cc: Kashyap Desai <[email protected]>
Cc: Shivasharan Srikanteshwara <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]


# c66d4bd1 16-Feb-2019 Ming Lei <[email protected]>

genirq/affinity: Add new callback for (re)calculating interrupt sets

The interrupt affinity spreading mechanism supports spreading out
affinities for one or more interrupt sets. An interrupt set contains one
or more interrupts. Each set is mapped to a specific functionality of a
device, e.g. general I/O queues and read I/O queues of multiqueue block
devices.

The number of interrupts per set is defined by the driver. It depends on
the total number of available interrupts for the device, which is
determined by the PCI capabilities and the availability of underlying CPU
resources, and the number of queues which the device provides and the
driver wants to instantiate.

The driver passes initial configuration for the interrupt allocation via a
pointer to struct irq_affinity.

Right now the allocation mechanism is complex as it requires a loop in
the driver to determine the maximum number of interrupts which are
provided by the PCI capabilities and the underlying CPU resources. This
loop would have to be replicated in every driver which wants to utilize
this mechanism. That's unwanted code duplication and error-prone.

In order to move this into generic facilities it is required to have a
mechanism which allows the recalculation of the interrupt sets and their
size in the core code. As the core code does not have any knowledge about
the underlying device, a driver specific callback is required in struct
irq_affinity, which can be invoked by the core code. The callback gets
the number of available interrupts as an argument, so the driver can
calculate the corresponding number and size of interrupt sets.

At the moment the struct irq_affinity pointer which is handed in from the
driver and passed through to several core functions is marked 'const', but for
the callback to be able to modify the data in the struct it's required to
remove the 'const' qualifier.

Add the optional callback to struct irq_affinity, which allows drivers to
recalculate the number and size of interrupt sets and remove the 'const'
qualifier.

For simple invocations, which do not supply a callback, a default callback
is installed, which just sets nr_sets to 1 and transfers the number of
spreadable vectors to the set_size array at index 0.

This is for now guarded by a check for nr_sets != 0 to keep the NVME driver
working until it is converted to the callback mechanism.

To make sure that the driver configuration is correct under all
circumstances the callback is invoked even when there are no interrupts
for queues left, i.e. the pre/post requirements already exhaust the
number of available interrupts.

At the PCI layer irq_create_affinity_masks() has to be invoked even for the
case where the legacy interrupt is used. That ensures that the callback is
invoked and the device driver can adjust to that situation.

[ tglx: Fixed the simple case (no sets required). Moved the sanity check
for nr_sets after the invocation of the callback so it catches
broken drivers. Fixed the kernel doc comments for struct
irq_affinity and de-'This patch'-ed the changelog ]

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: [email protected]
Cc: Sagi Grimberg <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Keith Busch <[email protected]>
Cc: Sumit Saxena <[email protected]>
Cc: Kashyap Desai <[email protected]>
Cc: Shivasharan Srikanteshwara <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
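
A driver-side sketch of the callback (the two-set split and all names are
illustrative, not the NVME conversion):

	static void my_calc_sets(struct irq_affinity *affd, unsigned int nvecs)
	{
		/* nvecs = spreadable vectors the core actually obtained */
		affd->nr_sets = 2;
		affd->set_size[0] = nvecs - nvecs / 2; /* e.g. default queues */
		affd->set_size[1] = nvecs / 2;         /* e.g. read queues */
	}

	static struct irq_affinity affd = {
		.pre_vectors = 1,	/* e.g. one admin interrupt */
		.calc_sets   = my_calc_sets,
	};

	/* handed to e.g. pci_alloc_irq_vectors_affinity(..., &affd) */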


# 9cfef55b 16-Feb-2019 Ming Lei <[email protected]>

genirq/affinity: Store interrupt sets size in struct irq_affinity

The interrupt affinity spreading mechanism supports spreading out
affinities for one or more interrupt sets. An interrupt set contains one
or more interrupts. Each set is mapped to a specific functionality of a
device, e.g. general I/O queues and read I/O queues of multiqueue block
devices.

The number of interrupts per set is defined by the driver. It depends on
the total number of available interrupts for the device, which is
determined by the PCI capabilities and the availability of underlying CPU
resources, and the number of queues which the device provides and the
driver wants to instantiate.

The driver passes initial configuration for the interrupt allocation via
a pointer to struct irq_affinity.

Right now the allocation mechanism is complex as it requires a loop in
the driver to determine the maximum number of interrupts which are
provided by the PCI capabilities and the underlying CPU resources. This
loop would have to be replicated in every driver which wants to utilize
this mechanism. That's unwanted code duplication and error-prone.

In order to move this into generic facilities it is required to have a
mechanism which allows the recalculation of the interrupt sets and
their size in the core code. As the core code does not have any
knowledge about the underlying device, a driver specific callback will
be added to struct irq_affinity, which will be invoked by the core
code. The callback will get the number of available interrupts as an
argument, so the driver can calculate the corresponding number and size
of interrupt sets.

To support this, two modifications for the handling of struct irq_affinity
are required:

1) The (optional) interrupt sets size information is contained in a
separate array of integers and struct irq_affinity contains a
pointer to it.

This is cumbersome and as the maximum number of interrupt sets is small,
there is no reason to have separate storage. Moving the size array into
struct irq_affinity avoids indirections and makes the code simpler.

2) At the moment the struct irq_affinity pointer which is handed in from
the driver and passed through to several core functions is marked
'const'.

With the upcoming callback to recalculate the number and size of
interrupt sets, it's necessary to remove the 'const'
qualifier. Otherwise the callback would not be able to update the data.

Implement #1 and store the interrupt sets size in 'struct irq_affinity'.

No functional change.

[ tglx: Fixed the memcpy() size so it won't copy beyond the size of the
source. Fixed the kernel doc comments for struct irq_affinity and
de-'This patch'-ed the changelog ]

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: [email protected]
Cc: Sagi Grimberg <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Keith Busch <[email protected]>
Cc: Sumit Saxena <[email protected]>
Cc: Kashyap Desai <[email protected]>
Cc: Shivasharan Srikanteshwara <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
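
After this change the set sizes live directly in the struct; the shape,
per the changelog (member list abridged, constant name assumed):

	#define IRQ_AFFINITY_MAX_SETS	4

	struct irq_affinity {
		unsigned int	pre_vectors;
		unsigned int	post_vectors;
		unsigned int	nr_sets;
		unsigned int	set_size[IRQ_AFFINITY_MAX_SETS];
	};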


# 0145c30e 16-Feb-2019 Thomas Gleixner <[email protected]>

genirq/affinity: Code consolidation

All information and calculations in the interrupt affinity spreading
code are strictly unsigned int, though the code uses int all over the
place.

Convert it over to unsigned int.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: [email protected]
Cc: Sagi Grimberg <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Keith Busch <[email protected]>
Cc: Sumit Saxena <[email protected]>
Cc: Kashyap Desai <[email protected]>
Cc: Shivasharan Srikanteshwara <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]


Revision tags: v5.0-rc6, v5.0-rc5, v5.0-rc4
# 347253c4 25-Jan-2019 Ming Lei <[email protected]>

genirq/affinity: Move allocation of 'node_to_cpumask' to irq_build_affinity_masks()

'node_to_cpumask' is just a temporary variable for
irq_build_affinity_masks(), so move its allocation into
irq_build_affinity_masks().

No functional change.

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Bjorn Helgaas <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: [email protected]
Cc: Sagi Grimberg <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]


Revision tags: v5.0-rc3, v5.0-rc2, v5.0-rc1, v4.20, v4.20-rc7, v4.20-rc6
# c410abbb 04-Dec-2018 Dou Liyang <[email protected]>

genirq/affinity: Add is_managed to struct irq_affinity_desc

Devices which use managed interrupts usually have two classes of
interrupts:

- Interrupts for multiple device queues
- Interrupts for general device management

Currently both classes are treated the same way, i.e. as managed
interrupts. The general interrupts get the default affinity mask assigned
while the device queue interrupts are spread out over the possible CPUs.

Treating the general interrupts as managed is both a limitation and under
certain circumstances a bug. Assume the following situation:

default_irq_affinity = 4..7

So if CPUs 4-7 are offlined, then the core code will shut down the device
management interrupts because the last CPU in their affinity mask went
offline.

It's also a limitation because it's desired to allow manual placement of
the general device interrupts for various reasons. If they are marked
managed then the interrupt affinity setting from both user and kernel space
is disabled. That limitation was reported by Kashyap and Sumit.

Expand struct irq_affinity_desc with a new bit 'is_managed' which is set
for truly managed interrupts (queue interrupts) and cleared for the general
device interrupts.

[ tglx: Simplify code and massage changelog ]

Reported-by: Kashyap Desai <[email protected]>
Reported-by: Sumit Saxena <[email protected]>
Signed-off-by: Dou Liyang <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
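
The resulting descriptor, per the changelog:

	struct irq_affinity_desc {
		struct cpumask	mask;
		unsigned int	is_managed : 1;
	};

The spreading code sets 'is_managed' only for the queue (truly managed)
vectors; the general management vectors keep it clear, so their affinity
stays changeable from user and kernel space.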


# bec04037 04-Dec-2018 Dou Liyang <[email protected]>

genirq/core: Introduce struct irq_affinity_desc

The interrupt affinity management uses straight cpumask pointers to convey
the automatically assigned affinity masks for managed interrupts. The core
interrupt descriptor allocation also decides, based on the pointer being
non-NULL, whether an interrupt is managed or not.

Devices which use managed interrupts usually have two classes of
interrupts:

- Interrupts for multiple device queues
- Interrupts for general device management

Currently both classes are treated the same way, i.e. as managed
interrupts. The general interrupts get the default affinity mask assigned
while the device queue interrupts are spread out over the possible CPUs.

Treating the general interrupts as managed is both a limitation and under
certain circumstances a bug. Assume the following situation:

default_irq_affinity = 4..7

So if CPUs 4-7 are offlined, then the core code will shut down the device
management interrupts because the last CPU in their affinity mask went
offline.

It's also a limitation because it's desired to allow manual placement of
the general device interrupts for various reasons. If they are marked
managed then the interrupt affinity setting from both user and kernel space
is disabled.

To remedy that situation it's required to convey more information than the
cpumasks through various interfaces related to interrupt descriptor
allocation.

Instead of adding yet another argument, create a new data structure
'irq_affinity_desc' which for now just contains the cpumask. This struct
can be expanded to convey auxiliary information in the next step.

No functional change, just preparatory work.

[ tglx: Simplified logic and clarified changelog ]

Suggested-by: Thomas Gleixner <[email protected]>
Suggested-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Dou Liyang <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
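
The structure starts out as a plain wrapper around the cpumask, per the
changelog, so that bits like 'is_managed' above can be added later
without touching every interface again:

	struct irq_affinity_desc {
		struct cpumask	mask;
	};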


# c2899c34 18-Dec-2018 Thomas Gleixner <[email protected]>

genirq/affinity: Remove excess indentation

Plus other coding style issues which stood out while staring at that code.

Signed-off-by: Thomas Gleixner <[email protected]>


Revision tags: v4.20-rc5, v4.20-rc4, v4.20-rc3, v4.20-rc2, v4.20-rc1
# 6da4b3ab 02-Nov-2018 Jens Axboe <[email protected]>

genirq/affinity: Add support for allocating interrupt sets

A driver may need to allocate multiple sets of MSI/MSI-X interrupts,
and have them appropriately affinitized.

Add support for defining a number of sets in the irq_affinity structure, of
varying sizes, and get each set affinitized correctly across the machine.

[ tglx: Minor changelog tweaks ]

Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
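
The contemporaneous struct irq_affinity, sketched (assuming the
definition of that era; the driver-owned 'sets' pointer was later
replaced by the embedded set_size[] array, see the 2019 commits above):

	struct irq_affinity {
		int	pre_vectors;	/* don't spread these */
		int	post_vectors;	/* don't spread these either */
		int	nr_sets;	/* number of interrupt sets */
		int	*sets;		/* size of each set */
	};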


# 060746d9 02-Nov-2018 Ming Lei <[email protected]>

genirq/affinity: Pass first vector to __irq_build_affinity_masks()

No functional change.

Prepares for support of allocating and affinitizing sets of interrupts,
in which each set of interrupts needs a full two-stage spreading. The
first vector argument is necessary so that the affinitizing starts from
the first vector of each set.

[ tglx: Minor changelog tweaks ]

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: [email protected]
Cc: Hannes Reinecke <[email protected]>
Cc: Keith Busch <[email protected]>
Cc: Sagi Grimberg <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]


# 5c903e10 02-Nov-2018 Ming Lei <[email protected]>

genirq/affinity: Move two stage affinity spreading into a helper function

No functional change. Prepares for supporting allocating and affinitizing
interrupt sets.

[ tglx: Minor changelog tweaks ]

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: [email protected]
Cc: Hannes Reinecke <[email protected]>
Cc: Keith Busch <[email protected]>
Cc: Sagi Grimberg <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
