History log of /linux-6.15/kernel/irq/affinity.c (Results 1 – 25 of 47)
Revision Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2
# f7b3ea8c 27-Dec-2022 Ming Lei <[email protected]>

genirq/affinity: Move group_cpus_evenly() into lib/

group_cpus_evenly() has become a generic function which can be used by
subsystems other than the interrupt subsystem, so move it into lib/.

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
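
For context, group_cpus_evenly() returns a kcalloc()'ed array of
'numgrps' cpumasks covering all possible CPUs, or NULL on allocation
failure, and the caller owns the array. A minimal sketch of a consumer
outside the interrupt core, assuming the lib/ interface as moved here
(assign_queues_to_cpus() is a hypothetical caller, not part of this
series):

	#include <linux/group_cpus.h>
	#include <linux/slab.h>

	static int assign_queues_to_cpus(unsigned int nr_queues)
	{
		struct cpumask *masks;
		unsigned int i;

		/* one evenly spread group of CPUs per queue */
		masks = group_cpus_evenly(nr_queues);
		if (!masks)
			return -ENOMEM;

		for (i = 0; i < nr_queues; i++)
			pr_debug("queue %u -> CPUs %*pbl\n",
				 i, cpumask_pr_args(&masks[i]));

		kfree(masks);
		return 0;
	}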


# 523f1ea7 27-Dec-2022 Ming Lei <[email protected]>

genirq/affinity: Rename irq_build_affinity_masks as group_cpus_evenly

Map each irq vector into a group, which allows the algorithm to be
abstracted for generic use cases outside of the interrupt core.

Rename irq_build_affinity_masks() as group_cpus_evenly(), so the API can
be reused by blk-mq to build the default queue mapping even though no
irq vectors are involved.

No functional change, just a rename of 'vector' to 'group'.

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


# e7bdd7f0 27-Dec-2022 Ming Lei <[email protected]>

genirq/affinity: Don't pass irq_affinity_desc array to irq_build_affinity_masks

Prepare for abstracting irq_build_affinity_masks() into a public function
for assigning all CPUs evenly into several groups.

Don't pass the irq_affinity_desc array to irq_build_affinity_masks();
instead return a cpumask array, storing each assigned group in one
element of the array.

This makes it possible to provide a generic interface for grouping all
CPUs evenly from a NUMA and CPU locality viewpoint, at the cost of one
extra allocation in irq_build_affinity_masks(), which should be fine
since it is done via GFP_KERNEL and irq_build_affinity_masks() is a slow
path anyway.

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: John Garry <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
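
The shape of the change, sketched (parameter lists abridged; names follow
the changelog rather than the exact diff):

	/* Before: the caller supplies the descriptor array to fill */
	static int irq_build_affinity_masks(unsigned int numvecs,
					    struct irq_affinity_desc *masks);

	/*
	 * After: the function allocates and returns one cpumask per
	 * group, so callers that know nothing about irq descriptors
	 * can consume (and later kfree()) the result.
	 */
	static struct cpumask *irq_build_affinity_masks(unsigned int numvecs);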


# 1f962d91 27-Dec-2022 Ming Lei <[email protected]>

genirq/affinity: Pass affinity managed mask array to irq_build_affinity_masks

Pass affinity managed mask array to irq_build_affinity_masks() so that the
index of the first affinity managed vector is always zero.

This allows the implementation to be simplified a bit.

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: John Garry <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


# cdf07f0e 27-Dec-2022 Ming Lei <[email protected]>

genirq/affinity: Remove the 'firstvec' parameter from irq_build_affinity_masks

The 'firstvec' parameter is always the same as the 'startvec' parameter,
so use 'startvec' directly inside irq_build_affinity_masks().

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: John Garry <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


Revision tags: v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2
# 99248e35 23-Jan-2022 Yury Norov <[email protected]>

genirq/affinity: replace cpumask_weight with cpumask_empty where appropriate

__irq_build_affinity_masks() calls cpumask_weight() to check if
any bit of a given cpumask is set. We can do it more efficiently with
cpumask_empty() because cpumask_empty() stops traversing the cpumask as
soon as it finds the first set bit, while cpumask_weight() counts all
bits unconditionally.

Signed-off-by: Yury Norov <[email protected]>


# 08d835df 31-Mar-2022 Rei Yamamoto <[email protected]>

genirq/affinity: Consider that CPUs on nodes can be unbalanced

If CPUs on a node are offline at boot time, the number of nodes is
different when building affinity masks for present cpus and when building
affinity masks for possible cpus. This causes the following problem:

When the number of vectors is less than the number of nodes, there are
cases where bits of the masks for present cpus are overwritten when
building the masks for possible cpus.

Fix this by excluding CPUs which are not part of the current build mask
(present/possible).

[ tglx: Massaged changelog and added comment ]

Fixes: b82592199032 ("genirq/affinity: Spread IRQs to all available NUMA nodes")
Signed-off-by: Rei Yamamoto <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]


# 911488de 10-Feb-2022 Yury Norov <[email protected]>

genirq/affinity: Replace cpumask_weight() with cpumask_empty() where appropriate

__irq_build_affinity_masks() calls cpumask_weight() to check if any bit of
a given cpumask is set.

This can be done more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds the
first set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
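
The pattern of the change, as a sketch ('nmsk' stands for whichever node
mask the code inspects):

	/* Before: counts every set bit just to test for zero */
	if (!cpumask_weight(nmsk))
		continue;

	/* After: stops at the first set bit */
	if (cpumask_empty(nmsk))
		continue;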


Revision tags: v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5
# 428e2116 03-Aug-2021 Sebastian Andrzej Siewior <[email protected]>

genirq/affinity: Replace deprecated CPU-hotplug functions.

The functions get_online_cpus() and put_online_cpus() have been
deprecated during the CPU hotplug rework. They map directly to
cpus_read_lock() and cpus_read_unlock().

Replace deprecated CPU-hotplug functions with the official version.
The behavior remains unchanged.

Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
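
The mechanical shape of the replacement, sketched (the locked region is
unchanged):

	/* Before (deprecated wrappers): */
	get_online_cpus();
	/* ... read cpu_online_mask, build masks ... */
	put_online_cpus();

	/* After (official API, identical semantics): */
	cpus_read_lock();
	/* ... read cpu_online_mask, build masks ... */
	cpus_read_unlock();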


Revision tags: v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7
# 101f85b5 28-Aug-2019 Ming Lei <[email protected]>

genirq/affinity: Remove const qualifier from node_to_cpumask argument

When CONFIG_CPUMASK_OFFSTACK isn't enabled, 'cpumask_var_t' is defined as

'typedef struct cpumask cpumask_var_t[1]',

so the 'node_to_cpumask' argument of alloc_nodes_vectors() can't be
declared as 'const cpumask_var_t *'.

Fixes the following warning:

kernel/irq/affinity.c: In function '__irq_build_affinity_masks':
alloc_nodes_vectors(numvecs, node_to_cpumask, cpu_mask,
^
kernel/irq/affinity.c:128:13: note: expected 'const struct cpumask (*)[1]' but argument is of type 'struct cpumask (*)[1]'
static void alloc_nodes_vectors(unsigned int numvecs,
^
Fixes: b1a5a73e64e9 ("genirq/affinity: Spread vectors on node according to nr_cpu ratio")
Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
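
The root cause, sketched with illustrative declarations (not the exact
diff):

	/* With CONFIG_CPUMASK_OFFSTACK=n: */
	typedef struct cpumask cpumask_var_t[1];

	/*
	 * A 'cpumask_var_t *' argument then has type
	 * 'struct cpumask (*)[1]'. C performs no implicit conversion
	 * from pointer-to-array to pointer-to-const-array, so passing
	 * the caller's non-const array to a parameter declared
	 * 'const cpumask_var_t *' triggers the warning above. Dropping
	 * the 'const' qualifier avoids it:
	 */
	static void alloc_nodes_vectors(unsigned int numvecs,
					cpumask_var_t *node_to_cpumask
					/* further parameters omitted */);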


Revision tags: v5.3-rc6, v5.3-rc5
# b1a5a73e 16-Aug-2019 Ming Lei <[email protected]>

genirq/affinity: Spread vectors on node according to nr_cpu ratio

Currently __irq_build_affinity_masks() spreads vectors evenly per node,
but when the NUMA nodes have different numbers of CPUs there are cases
where not all vectors get spread, which triggers the warning in the
spreading code.

Improve the spreading algorithm by

- assigning vectors according to the ratio of the number of CPUs on a node
to the number of remaining CPUs.

- running the assignment from smaller nodes to bigger nodes to guarantee
that every active node gets allocated at least one vector.

This ensures that all vectors are spread out. Aside from that, the
spread becomes fairer if the nodes have different numbers of CPUs.

For example, on the following machine:
CPU(s): 16
On-line CPU(s) list: 0-15
Thread(s) per core: 1
Core(s) per socket: 8
Socket(s): 2
NUMA node(s): 2
...
NUMA node0 CPU(s): 0,1,3,5-9,11,13-15
NUMA node1 CPU(s): 2,4,10,12

When a driver requests to allocate 8 vectors, the following spread results:

irq 31, cpu list 2,4
irq 32, cpu list 10,12
irq 33, cpu list 0-1
irq 34, cpu list 3,5
irq 35, cpu list 6-7
irq 36, cpu list 8-9
irq 37, cpu list 11,13
irq 38, cpu list 14-15

So Node 0 now has 6 and Node 1 has 2 vectors assigned. The original
algorithm assigned 4 vectors on each node, which was unfair to Node 0.

[ tglx: Massaged changelog ]

Reported-by: Jon Derrick <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Jon Derrick <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
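
The assignment rule, sketched in pseudo-C (the iterator and helpers are
hypothetical; the real logic lives in alloc_nodes_vectors()):

	unsigned int remaining_ncpus = total_cpus; /* CPUs not yet covered */
	unsigned int remaining_vecs  = numvecs;    /* vectors not yet assigned */

	for_each_node_by_ascending_ncpus(node) {      /* hypothetical iterator */
		unsigned int ncpus = ncpus_on(node);  /* hypothetical helper */
		unsigned int nvecs =
			max(1U, remaining_vecs * ncpus / remaining_ncpus);

		assign_vectors_to_node(node, nvecs);  /* hypothetical */
		remaining_ncpus -= ncpus;
		remaining_vecs  -= nvecs;
	}

Applied to the machine above: node 1 (4 of 16 CPUs) is handled first and
gets max(1, 8 * 4 / 16) = 2 vectors, leaving node 0 with the remaining 6,
which matches the irq list shown.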


# 53c1788b 16-Aug-2019 Ming Lei <[email protected]>

genirq/affinity: Improve __irq_build_affinity_masks()

One invariant of __irq_build_affinity_masks() is that all CPUs in the
specified masks (cpu_mask AND node_to_cpumask for each node) should be
covered during the spread. Even when the number of requested vectors has
been reached, it's still required to spread vectors among the remaining
CPUs. A similar policy is already applied in the 'numvecs <= nodes'
case.

So remove the following check inside the loop:

if (done >= numvecs)
break;

Meanwhile, assign at least one vector to the remaining nodes if
'numvecs' vectors have already been handled.

Also, if the specified cpumask for a numa node is empty, simply do not
spread vectors on that node.

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]


Revision tags: v5.3-rc4, v5.3-rc3
# 491beed3 05-Aug-2019 Ming Lei <[email protected]>

genirq/affinity: Create affinity mask for single vector

Since commit c66d4bd110a1f8 ("genirq/affinity: Add new callback for
(re)calculating interrupt sets"), irq_create_affinity_masks() returns
NULL in the single vector case. This change has caused regressions in
some drivers, such as lpfc.

The problem is that single vector requests can happen in some generic
cases:

1) kdump kernel

2) the irq vector resource is close to exhaustion

If in that situation the affinity mask for a single vector is not created,
every caller has to handle the special case.

There is no reason why the mask cannot be created, so remove the check for
a single vector and create the mask.

Fixes: c66d4bd110a1f8 ("genirq/affinity: Add new callback for (re)calculating interrupt sets")
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]


Revision tags: v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3
# 0e518330 02-Jun-2019 Minwoo Im <[email protected]>

genirq/affinity: Remove unused argument from [__]irq_build_affinity_masks()

The *affd argument is neither used in irq_build_affinity_masks() nor
__irq_build_affinity_masks(). Remove it.

Signed-off-by: Minwoo Im <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Cc: Minwoo Im <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]


Revision tags: v5.2-rc2, v5.2-rc1, v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5, v5.1-rc4, v5.1-rc3, v5.1-rc2, v5.1-rc1, v5.0, v5.0-rc8, v5.0-rc7
# a6a309ed 16-Feb-2019 Thomas Gleixner <[email protected]>

genirq/affinity: Remove the leftovers of the original set support

Now that the NVME driver is converted over to the calc_sets() callback,
the workarounds of the original set support can be removed.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: [email protected]
Cc: Sagi Grimberg <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Keith Busch <[email protected]>
Cc: Sumit Saxena <[email protected]>
Cc: Kashyap Desai <[email protected]>
Cc: Shivasharan Srikanteshwara <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]


# c66d4bd1 16-Feb-2019 Ming Lei <[email protected]>

genirq/affinity: Add new callback for (re)calculating interrupt sets

The interrupt affinity spreading mechanism supports spreading out
affinities for one or more interrupt sets. An interrupt set contains one
or more interrupts. Each set is mapped to a specific functionality of a
device, e.g. general I/O queues and read I/O queues of multiqueue block
devices.

The number of interrupts per set is defined by the driver. It depends on
the total number of available interrupts for the device, which is
determined by the PCI capabilities and the availability of underlying CPU
resources, and the number of queues which the device provides and the
driver wants to instantiate.

The driver passes initial configuration for the interrupt allocation via a
pointer to struct irq_affinity.

Right now the allocation mechanism is complex as it requires a loop in
the driver to determine the maximum number of interrupts which are
provided by the PCI capabilities and the underlying CPU resources. This
loop would have to be replicated in every driver which wants to utilize
this mechanism. That's unwanted code duplication and error-prone.

In order to move this into generic facilities it is required to have a
mechanism which allows the recalculation of the interrupt sets and their
size in the core code. As the core code does not have any knowledge about
the underlying device, a driver specific callback is required in struct
irq_affinity, which can be invoked by the core code. The callback gets
the number of available interrupts as an argument, so the driver can
calculate the corresponding number and size of interrupt sets.

At the moment the struct irq_affinity pointer which is handed in from the
driver and passed through to several core functions is marked 'const', but for
the callback to be able to modify the data in the struct it's required to
remove the 'const' qualifier.

Add the optional callback to struct irq_affinity, which allows drivers to
recalculate the number and size of interrupt sets and remove the 'const'
qualifier.

For simple invocations, which do not supply a callback, a default callback
is installed, which just sets nr_sets to 1 and transfers the number of
spreadable vectors to the set_size array at index 0.

This is for now guarded by a check for nr_sets != 0 to keep the NVME driver
working until it is converted to the callback mechanism.

To make sure that the driver configuration is correct under all
circumstances the callback is invoked even when there are no interrupts
for queues left, i.e. the pre/post requirements already exhaust the
number of available interrupts.

At the PCI layer irq_create_affinity_masks() has to be invoked even for the
case where the legacy interrupt is used. That ensures that the callback is
invoked and the device driver can adjust to that situation.

[ tglx: Fixed the simple case (no sets required). Moved the sanity check
for nr_sets after the invocation of the callback so it catches
broken drivers. Fixed the kernel doc comments for struct
irq_affinity and de-'This patch'-ed the changelog ]

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: [email protected]
Cc: Sagi Grimberg <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Keith Busch <[email protected]>
Cc: Sumit Saxena <[email protected]>
Cc: Kashyap Desai <[email protected]>
Cc: Shivasharan Srikanteshwara <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
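
A driver-side sketch of the callback (the two-set split and all names are
illustrative, not the NVME conversion):

	static void my_calc_sets(struct irq_affinity *affd, unsigned int nvecs)
	{
		/* nvecs = spreadable vectors the core actually obtained */
		affd->nr_sets = 2;
		affd->set_size[0] = nvecs - nvecs / 2; /* e.g. default queues */
		affd->set_size[1] = nvecs / 2;         /* e.g. read queues */
	}

	static struct irq_affinity affd = {
		.pre_vectors = 1,	/* e.g. one admin interrupt */
		.calc_sets   = my_calc_sets,
	};

	/* handed to e.g. pci_alloc_irq_vectors_affinity(..., &affd) */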


# 9cfef55b 16-Feb-2019 Ming Lei <[email protected]>

genirq/affinity: Store interrupt sets size in struct irq_affinity

The interrupt affinity spreading mechanism supports spreading out
affinities for one or more interrupt sets. An interrupt set contains one
or more interrupts. Each set is mapped to a specific functionality of a
device, e.g. general I/O queues and read I/O queues of multiqueue block
devices.

The number of interrupts per set is defined by the driver. It depends on
the total number of available interrupts for the device, which is
determined by the PCI capabilities and the availability of underlying CPU
resources, and the number of queues which the device provides and the
driver wants to instantiate.

The driver passes initial configuration for the interrupt allocation via
a pointer to struct irq_affinity.

Right now the allocation mechanism is complex as it requires a loop in
the driver to determine the maximum number of interrupts which are
provided by the PCI capabilities and the underlying CPU resources. This
loop would have to be replicated in every driver which wants to utilize
this mechanism. That's unwanted code duplication and error-prone.

In order to move this into generic facilities it is required to have a
mechanism which allows the recalculation of the interrupt sets and
their size in the core code. As the core code does not have any
knowledge about the underlying device, a driver specific callback will
be added to struct irq_affinity, which will be invoked by the core
code. The callback will get the number of available interrupts as an
argument, so the driver can calculate the corresponding number and size
of interrupt sets.

To support this, two modifications for the handling of struct irq_affinity
are required:

1) The (optional) interrupt sets size information is contained in a
separate array of integers and struct irq_affinity contains a
pointer to it.

This is cumbersome and as the maximum number of interrupt sets is small,
there is no reason to have separate storage. Moving the size array into
struct irq_affinity avoids indirections and makes the code simpler.

2) At the moment the struct irq_affinity pointer which is handed in from
the driver and passed through to several core functions is marked
'const'.

With the upcoming callback to recalculate the number and size of
interrupt sets, it's necessary to remove the 'const'
qualifier. Otherwise the callback would not be able to update the data.

Implement #1 and store the interrupt sets size in 'struct irq_affinity'.

No functional change.

[ tglx: Fixed the memcpy() size so it won't copy beyond the size of the
source. Fixed the kernel doc comments for struct irq_affinity and
de-'This patch'-ed the changelog ]

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: [email protected]
Cc: Sagi Grimberg <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Keith Busch <[email protected]>
Cc: Sumit Saxena <[email protected]>
Cc: Kashyap Desai <[email protected]>
Cc: Shivasharan Srikanteshwara <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
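
After this change the set sizes live directly in the struct; the shape,
per the changelog (member list abridged, constant name assumed):

	#define IRQ_AFFINITY_MAX_SETS	4

	struct irq_affinity {
		unsigned int	pre_vectors;
		unsigned int	post_vectors;
		unsigned int	nr_sets;
		unsigned int	set_size[IRQ_AFFINITY_MAX_SETS];
	};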


# 0145c30e 16-Feb-2019 Thomas Gleixner <[email protected]>

genirq/affinity: Code consolidation

All information and calculations in the interrupt affinity spreading
code are strictly unsigned int, though the code uses int all over the
place.

Convert it over to unsigned int.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: [email protected]
Cc: Sagi Grimberg <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Keith Busch <[email protected]>
Cc: Sumit Saxena <[email protected]>
Cc: Kashyap Desai <[email protected]>
Cc: Shivasharan Srikanteshwara <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]


Revision tags: v5.0-rc6, v5.0-rc5, v5.0-rc4
# 347253c4 25-Jan-2019 Ming Lei <[email protected]>

genirq/affinity: Move allocation of 'node_to_cpumask' to irq_build_affinity_masks()

'node_to_cpumask' is just a temporary variable for
irq_build_affinity_masks(), so move its allocation into
irq_build_affinity_masks().

No functional change.

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Bjorn Helgaas <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: [email protected]
Cc: Sagi Grimberg <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]


Revision tags: v5.0-rc3, v5.0-rc2, v5.0-rc1, v4.20, v4.20-rc7, v4.20-rc6
# c410abbb 04-Dec-2018 Dou Liyang <[email protected]>

genirq/affinity: Add is_managed to struct irq_affinity_desc

Devices which use managed interrupts usually have two classes of
interrupts:

- Interrupts for multiple device queues
- Interrupts for general device management

Currently both classes are treated the same way, i.e. as managed
interrupts. The general interrupts get the default affinity mask assigned
while the device queue interrupts are spread out over the possible CPUs.

Treating the general interrupts as managed is both a limitation and under
certain circumstances a bug. Assume the following situation:

default_irq_affinity = 4..7

So if CPUs 4-7 are offlined, then the core code will shut down the device
management interrupts because the last CPU in their affinity mask went
offline.

It's also a limitation because it's desired to allow manual placement of
the general device interrupts for various reasons. If they are marked
managed then the interrupt affinity setting from both user and kernel space
is disabled. That limitation was reported by Kashyap and Sumit.

Expand struct irq_affinity_desc with a new bit 'is_managed' which is set
for truly managed interrupts (queue interrupts) and cleared for the general
device interrupts.

[ tglx: Simplify code and massage changelog ]

Reported-by: Kashyap Desai <[email protected]>
Reported-by: Sumit Saxena <[email protected]>
Signed-off-by: Dou Liyang <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
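
The resulting descriptor, per the changelog:

	struct irq_affinity_desc {
		struct cpumask	mask;
		unsigned int	is_managed : 1;
	};

The spreading code sets 'is_managed' only for the queue (truly managed)
vectors; the general management vectors keep it clear, so their affinity
stays changeable from user and kernel space.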


# bec04037 04-Dec-2018 Dou Liyang <[email protected]>

genirq/core: Introduce struct irq_affinity_desc

The interrupt affinity management uses straight cpumask pointers to convey
the automatically assigned affinity masks for managed interrupts. The core
interrupt descriptor allocation also decides, based on the pointer being
non-NULL, whether an interrupt is managed or not.

Devices which use managed interrupts usually have two classes of
interrupts:

- Interrupts for multiple device queues
- Interrupts for general device management

Currently both classes are treated the same way, i.e. as managed
interrupts. The general interrupts get the default affinity mask assigned
while the device queue interrupts are spread out over the possible CPUs.

Treating the general interrupts as managed is both a limitation and under
certain circumstances a bug. Assume the following situation:

default_irq_affinity = 4..7

So if CPUs 4-7 are offlined, then the core code will shut down the device
management interrupts because the last CPU in their affinity mask went
offline.

It's also a limitation because it's desired to allow manual placement of
the general device interrupts for various reasons. If they are marked
managed then the interrupt affinity setting from both user and kernel space
is disabled.

To remedy that situation it's required to convey more information than the
cpumasks through various interfaces related to interrupt descriptor
allocation.

Instead of adding yet another argument, create a new data structure
'irq_affinity_desc' which for now just contains the cpumask. This struct
can be expanded to convey auxiliary information in the next step.

No functional change, just preparatory work.

[ tglx: Simplified logic and clarified changelog ]

Suggested-by: Thomas Gleixner <[email protected]>
Suggested-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Dou Liyang <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
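
The structure starts out as a plain wrapper around the cpumask, per the
changelog, so that bits like 'is_managed' above can be added later
without touching every interface again:

	struct irq_affinity_desc {
		struct cpumask	mask;
	};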


# c2899c34 18-Dec-2018 Thomas Gleixner <[email protected]>

genirq/affinity: Remove excess indentation

Plus other coding style issues which stood out while staring at that code.

Signed-off-by: Thomas Gleixner <[email protected]>


Revision tags: v4.20-rc5, v4.20-rc4, v4.20-rc3, v4.20-rc2, v4.20-rc1
# 6da4b3ab 02-Nov-2018 Jens Axboe <[email protected]>

genirq/affinity: Add support for allocating interrupt sets

A driver may need to allocate multiple sets of MSI/MSI-X interrupts,
and have them appropriately affinitized.

Add support for defining a number of sets in the irq_affinity structure, of
varying sizes, and get each set affinitized correctly across the machine.

[ tglx: Minor changelog tweaks ]

Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
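
The contemporaneous struct irq_affinity, sketched (assuming the
definition of that era; the driver-owned 'sets' pointer was later
replaced by the embedded set_size[] array, see the 2019 commits above):

	struct irq_affinity {
		int	pre_vectors;	/* don't spread these */
		int	post_vectors;	/* don't spread these either */
		int	nr_sets;	/* number of interrupt sets */
		int	*sets;		/* size of each set */
	};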


# 060746d9 02-Nov-2018 Ming Lei <[email protected]>

genirq/affinity: Pass first vector to __irq_build_affinity_masks()

No functional change.

Prepares for support of allocating and affinitizing sets of interrupts,
in which each set of interrupts needs a full two-stage spreading. The
first vector argument is necessary so that the affinitizing starts from
the first vector of each set.

[ tglx: Minor changelog tweaks ]

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: [email protected]
Cc: Hannes Reinecke <[email protected]>
Cc: Keith Busch <[email protected]>
Cc: Sagi Grimberg <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]


# 5c903e10 02-Nov-2018 Ming Lei <[email protected]>

genirq/affinity: Move two stage affinity spreading into a helper function

No functional change. Prepares for supporting allocating and affinitizing
interrupt sets.

[ tglx: Minor changelog tweaks ]

Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: [email protected]
Cc: Hannes Reinecke <[email protected]>
Cc: Keith Busch <[email protected]>
Cc: Sagi Grimberg <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
