History log of /linux-6.15/arch/x86/kernel/cpu/resctrl/ctrlmondata.c (Results 1 – 25 of 65)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7
# f62b4e45 11-Mar-2025 James Morse <[email protected]>

x86/resctrl: Move get_config_index() to a header

get_config_index() is used by the architecture specific code to map
a CLOSID+type pair to an index in the configuration arrays.

MPAM needs to do thi

x86/resctrl: Move get_config_index() to a header

get_config_index() is used by the architecture specific code to map
a CLOSID+type pair to an index in the configuration arrays.

MPAM needs to do this too to preserve the ABI to user-space, there is no
reason to do it differently.

Move the helper to a header file to allow all architectures that either
use or emulate CDP to use the same pattern of CLOSID values. Moving
this to a header file means it must be marked inline, which matches
the existing compiler choice for this static function.

Co-developed-by: Dave Martin <[email protected]>
Signed-off-by: Dave Martin <[email protected]>
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Reviewed-by: Babu Moger <[email protected]>
Tested-by: Carl Worth <[email protected]> # arm64
Tested-by: Shanker Donthineni <[email protected]> # arm64
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Peter Newman <[email protected]>
Tested-by: Amit Singh Tomar <[email protected]> # arm64
Tested-by: Babu Moger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


# d012b66a 11-Mar-2025 James Morse <[email protected]>

x86/resctrl: Move the is_mbm_*_enabled() helpers to asm/resctrl.h

The architecture specific parts of resctrl provide helpers like
is_mbm_total_enabled() and is_mbm_local_enabled() to hide accesses t

x86/resctrl: Move the is_mbm_*_enabled() helpers to asm/resctrl.h

The architecture specific parts of resctrl provide helpers like
is_mbm_total_enabled() and is_mbm_local_enabled() to hide accesses to the
rdt_mon_features bitmap.

Exposing a group of helpers between the architecture and filesystem code is
preferable to a single unsigned-long like rdt_mon_features. Helpers can be more
readable and have a well defined behaviour, while allowing architectures to hide
more complex behaviour.

Once the filesystem parts of resctrl are moved, these existing helpers can no
longer live in internal.h. Move them to include/linux/resctrl.h Once these are
exposed to the wider kernel, they should have a 'resctrl_arch_' prefix, to fit
the rest of the arch<->fs interface.

Move and rename the helpers that touch rdt_mon_features directly. is_mbm_event()
and is_mbm_enabled() are only called from rdtgroup.c, so can be moved into that
file.

Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Reviewed-by: Babu Moger <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Tested-by: Carl Worth <[email protected]> # arm64
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Peter Newman <[email protected]>
Tested-by: Amit Singh Tomar <[email protected]> # arm64
Tested-by: Shanker Donthineni <[email protected]> # arm64
Tested-by: Babu Moger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


# e3d5138c 11-Mar-2025 James Morse <[email protected]>

x86/resctrl: Move rdt_find_domain() to be visible to arch and fs code

rdt_find_domain() finds a domain given a resource and a cache-id. This is
used by both the architecture code and the filesystem

x86/resctrl: Move rdt_find_domain() to be visible to arch and fs code

rdt_find_domain() finds a domain given a resource and a cache-id. This is
used by both the architecture code and the filesystem code.

After the filesystem code moves to live in /fs/, this helper is either
duplicated by all architectures, or needs exposing by the filesystem code.

Add the declaration to the global header file. As it's now globally visible,
and has only a handful of callers, swap the 'rdt' for 'resctrl'. Move the
function to live with its caller in ctrlmondata.c as the filesystem code will
not have anything corresponding to core.c.

Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Reviewed-by: Babu Moger <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Tested-by: Peter Newman <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Amit Singh Tomar <[email protected]> # arm64
Tested-by: Shanker Donthineni <[email protected]> # arm64
Tested-by: Babu Moger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


# dbc58f7e 11-Mar-2025 James Morse <[email protected]>

x86/resctrl: Generate default_ctrl instead of sharing it

The struct rdt_resource default_ctrl is used by both the architecture code for
resetting the hardware controls, and sometimes by the filesyst

x86/resctrl: Generate default_ctrl instead of sharing it

The struct rdt_resource default_ctrl is used by both the architecture code for
resetting the hardware controls, and sometimes by the filesystem code as the
default value for the schema, unless the bandwidth software controller is in
use.

Having the default exposed by the architecture code causes unnecessary
duplication for each architecture as the default value must be specified, but
can be derived from other schema properties. Now that the maximum bandwidth is
explicitly described, resctrl can derive the default value from the schema
format and the other resource properties.

Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Reviewed-by: Babu Moger <[email protected]>
Tested-by: Carl Worth <[email protected]> # arm64
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Peter Newman <[email protected]>
Tested-by: Amit Singh Tomar <[email protected]> # arm64
Tested-by: Shanker Donthineni <[email protected]> # arm64
Tested-by: Babu Moger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


# 634ebb98 11-Mar-2025 James Morse <[email protected]>

x86/resctrl: Add max_bw to struct resctrl_membw

__rdt_get_mem_config_amd() and __get_mem_config_intel() both use the
default_ctrl property as a maximum value. This is because the MBA schema works
di

x86/resctrl: Add max_bw to struct resctrl_membw

__rdt_get_mem_config_amd() and __get_mem_config_intel() both use the
default_ctrl property as a maximum value. This is because the MBA schema works
differently between these platforms. Doing this complicates determining
whether the default_ctrl property belongs to the arch code, or can be derived
from the schema format.

Deriving the maximum or default value from the schema format would avoid the
architecture code having to tell resctrl such obvious things as the maximum
percentage is 100, and the maximum bitmap is all ones.

Maximum bandwidth is always going to vary per platform. Add max_bw as
a special case. This is currently used for the maximum MBA percentage on Intel
platforms, but can be removed from the architecture code if 'percentage'
becomes a schema format resctrl supports directly.

This value isn't needed for other schema formats.

This will allow the default_ctrl to be generated from the schema properties
when it is needed.

Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Reviewed-by: Babu Moger <[email protected]>
Tested-by: Carl Worth <[email protected]> # arm64
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Peter Newman <[email protected]>
Tested-by: Amit Singh Tomar <[email protected]> # arm64
Tested-by: Shanker Donthineni <[email protected]> # arm64
Tested-by: Babu Moger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


# 43312b8e 11-Mar-2025 James Morse <[email protected]>

x86/resctrl: Remove data_width and the tabular format

The resctrl architecture code provides a data_width for the controls of each
resource. This is used to zero pad all control values in the schema

x86/resctrl: Remove data_width and the tabular format

The resctrl architecture code provides a data_width for the controls of each
resource. This is used to zero pad all control values in the schemata file so
they appear in columns. The same is done with the resource names to complete
the visual effect. e.g.

| SMBA:0=2048
| L3:0=00ff

AMD platforms discover their maximum bandwidth for the MB resource from
firmware, but hard-code the data_width to 4. If the maximum bandwidth requires
more digits - the tabular format is silently broken. This is also broken when
the mba_MBps mount option is used as the field width isn't updated. If new
schema are added resctrl will need to be able to determine the maximum width.
The benefit of this pretty-printing is questionable.

Instead of handling runtime discovery of the data_width for AMD platforms,
remove the feature. These fields are always zero padded so should be harmless
to remove if the whole field has been treated as a number. In the above
example, this would now look like this:

| SMBA:0=2048
| L3:0=ff

Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Reviewed-by: Babu Moger <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Peter Newman <[email protected]>
Tested-by: Amit Singh Tomar <[email protected]> # arm64
Tested-by: Shanker Donthineni <[email protected]> # arm64
Tested-by: Babu Moger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


# bb9343c8 11-Mar-2025 James Morse <[email protected]>

x86/resctrl: Use schema type to determine the schema format string

Resctrl's architecture code gets to specify a format string that is
used when printing schema entries. This is expected to be one o

x86/resctrl: Use schema type to determine the schema format string

Resctrl's architecture code gets to specify a format string that is
used when printing schema entries. This is expected to be one of two
values that the filesystem code supports.

Setting this format string allows the architecture code to change
the ABI resctrl presents to user-space.

Instead, use the schema format enum to choose which format string to
use.

Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Reviewed-by: Babu Moger <[email protected]>
Tested-by: Carl Worth <[email protected]> # arm64
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Peter Newman <[email protected]>
Tested-by: Amit Singh Tomar <[email protected]> # arm64
Tested-by: Shanker Donthineni <[email protected]> # arm64
Tested-by: Babu Moger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


# c24f5eab 11-Mar-2025 James Morse <[email protected]>

x86/resctrl: Use schema type to determine how to parse schema values

Resctrl's architecture code gets to specify a function pointer that is used
when parsing schema entries. This is expected to be o

x86/resctrl: Use schema type to determine how to parse schema values

Resctrl's architecture code gets to specify a function pointer that is used
when parsing schema entries. This is expected to be one of two helpers from
the filesystem code.

Setting this function pointer allows the architecture code to change the ABI
resctrl presents to user-space, and forces resctrl to expose these helpers.

Instead, add a schema format enum to choose which schema parser to use. This
allows the helpers to be made static and the structs used for passing
arguments moved out of shared headers.

Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Reviewed-by: Babu Moger <[email protected]>
Tested-by: Carl Worth <[email protected]> # arm64
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Peter Newman <[email protected]>
Tested-by: Amit Singh Tomar <[email protected]> # arm64
Tested-by: Shanker Donthineni <[email protected]> # arm64
Tested-by: Babu Moger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


# 3c021531 11-Mar-2025 James Morse <[email protected]>

x86/resctrl: Add a helper to avoid reaching into the arch code resource list

Resctrl occasionally wants to know something about a specific resource, in
these cases it reaches into the arch code's rd

x86/resctrl: Add a helper to avoid reaching into the arch code resource list

Resctrl occasionally wants to know something about a specific resource, in
these cases it reaches into the arch code's rdt_resources_all[] array.

Once the filesystem parts of resctrl are moved to /fs/, this means it will
need visibility of the architecture specific struct rdt_hw_resource
definition, and the array of all resources. All architectures would also need
a r_resctrl member in this struct.

Instead, abstract this via a helper to allow architectures to do different
things here. Move the level enum to the resctrl header and add a helper to
retrieve the struct rdt_resource by 'rid'.

resctrl_arch_get_resource() should not return NULL for any value in the enum,
it may instead return a dummy resource that is !alloc_enabled && !mon_enabled.

Co-developed-by: Dave Martin <[email protected]>
Signed-off-by: Dave Martin <[email protected]>
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Reviewed-by: Babu Moger <[email protected]>
Tested-by: Peter Newman <[email protected]>
Tested-by: Carl Worth <[email protected]> # arm64
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Amit Singh Tomar <[email protected]> # arm64
Tested-by: Shanker Donthineni <[email protected]> # arm64
Tested-by: Babu Moger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


Revision tags: v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2
# 8e931105 06-Dec-2024 Tony Luck <[email protected]>

x86/resctrl: Add write option to "mba_MBps_event" file

The "mba_MBps" mount option provides an alternate method to control memory
bandwidth. Instead of specifying allowable bandwidth as a percentage

x86/resctrl: Add write option to "mba_MBps_event" file

The "mba_MBps" mount option provides an alternate method to control memory
bandwidth. Instead of specifying allowable bandwidth as a percentage of
maximum possible, the user provides a MiB/s limit value.

There is a file in each CTRL_MON group directory that shows the event
currently in use.

Allow writing that file to choose a different event.

A user can choose any of the memory bandwidth monitoring events listed in
/sys/fs/resctrl/info/L3_mon/mon_features independently for each CTRL_MON group
by writing to each of the "mba_MBps_event" files.

Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


# f5cd0e31 06-Dec-2024 Tony Luck <[email protected]>

x86/resctrl: Add "mba_MBps_event" file to CTRL_MON directories

The "mba_MBps" mount option provides an alternate method to control memory
bandwidth. Instead of specifying allowable bandwidth as a pe

x86/resctrl: Add "mba_MBps_event" file to CTRL_MON directories

The "mba_MBps" mount option provides an alternate method to control memory
bandwidth. Instead of specifying allowable bandwidth as a percentage of
maximum possible, the user provides a MiB/s limit value.

In preparation to allow the user to pick the memory bandwidth monitoring event
used as input to the feedback loop, provide a file in each CTRL_MON group
directory that shows the event currently in use. Note that this file is only
visible when the "mba_MBps" mount option is in use.

Suggested-by: Reinette Chatre <[email protected]>
Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


Revision tags: v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2
# 2b564841 01-Oct-2024 Martin Kletzander <[email protected]>

x86/resctrl: Avoid overflow in MB settings in bw_validate()

The resctrl schemata file supports specifying memory bandwidth associated with
the Memory Bandwidth Allocation (MBA) feature via a percent

x86/resctrl: Avoid overflow in MB settings in bw_validate()

The resctrl schemata file supports specifying memory bandwidth associated with
the Memory Bandwidth Allocation (MBA) feature via a percentage (this is the
default) or bandwidth in MiBps (when resctrl is mounted with the "mba_MBps"
option).

The allowed range for the bandwidth percentage is from
/sys/fs/resctrl/info/MB/min_bandwidth to 100, using a granularity of
/sys/fs/resctrl/info/MB/bandwidth_gran. The supported range for the MiBps
bandwidth is 0 to U32_MAX.

There are two issues with parsing of MiBps memory bandwidth:

* The user provided MiBps is mistakenly rounded up to the granularity
that is unique to percentage input.

* The user provided MiBps is parsed using unsigned long (thus accepting
values up to ULONG_MAX), and then assigned to u32 that could result in
overflow.

Do not round up the MiBps value and parse user provided bandwidth as the u32
it is intended to be. Use the appropriate kstrtou32() that can detect out of
range values.

Fixes: 8205a078ba78 ("x86/intel_rdt/mba_sc: Add schemata support")
Fixes: 6ce1560d35f6 ("x86/resctrl: Switch over to the resctrl mbps_val list")
Co-developed-by: Reinette Chatre <[email protected]>
Signed-off-by: Reinette Chatre <[email protected]>
Signed-off-by: Martin Kletzander <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Reviewed-by: Tony Luck <[email protected]>

show more ...


Revision tags: v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6
# c8c7d3d9 28-Jun-2024 Tony Luck <[email protected]>

x86/resctrl: Fill out rmid_read structure for smp_call*() to read a counter

mon_event_read() fills out most fields of the struct rmid_read that is passed
via an smp_call*() function to a CPU that is

x86/resctrl: Fill out rmid_read structure for smp_call*() to read a counter

mon_event_read() fills out most fields of the struct rmid_read that is passed
via an smp_call*() function to a CPU that is part of the correct domain to
read the monitor counters.

With Sub-NUMA Cluster (SNC) mode there are now two cases to handle:

1) Reading a file that returns a value for a single domain.
+ Choose the CPU to execute from the domain cpu_mask

2) Reading a file that must sum across domains sharing an L3 cache
instance.
+ Indicate to called code that a sum is needed by passing a NULL
rdt_mon_domain pointer.
+ Choose the CPU from the L3 shared_cpu_map.

Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


# 587edd70 28-Jun-2024 Tony Luck <[email protected]>

x86/resctrl: Initialize on-stack struct rmid_read instances

New semantics rely on some struct rmid_read members having NULL values to
distinguish between the SNC and non-SNC scenarios. resctrl can

x86/resctrl: Initialize on-stack struct rmid_read instances

New semantics rely on some struct rmid_read members having NULL values to
distinguish between the SNC and non-SNC scenarios. resctrl can thus no longer
rely on this struct not being initialized properly.

Initialize all on-stack declarations of struct rmid_read:

rdtgroup_mondata_show()
mbm_update()
mkdir_mondata_subdir()

to ensure that garbage values from the stack are not passed down to other
functions.

[ bp: Massage commit message. ]

Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


# cae2bcb6 28-Jun-2024 Tony Luck <[email protected]>

x86/resctrl: Split the rdt_domain and rdt_hw_domain structures

The same rdt_domain structure is used for both control and monitor
functions. But this results in wasted memory as some of the fields a

x86/resctrl: Split the rdt_domain and rdt_hw_domain structures

The same rdt_domain structure is used for both control and monitor
functions. But this results in wasted memory as some of the fields are
only used by control functions, while most are only used for monitor
functions.

Split into separate rdt_ctrl_domain and rdt_mon_domain structures with
just the fields required for control and monitoring respectively.

Similar split of the rdt_hw_domain structure into rdt_hw_ctrl_domain
and rdt_hw_mon_domain.

Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


# cd84f72b 28-Jun-2024 Tony Luck <[email protected]>

x86/resctrl: Prepare for different scope for control/monitor operations

Resctrl assumes that control and monitor operations on a resource are
performed at the same scope.

Prepare for systems that u

x86/resctrl: Prepare for different scope for control/monitor operations

Resctrl assumes that control and monitor operations on a resource are
performed at the same scope.

Prepare for systems that use different scope (specifically Intel needs
to split the RDT_RESOURCE_L3 resource to use L3 scope for cache control
and NODE scope for cache occupancy and memory bandwidth monitoring).

Create separate domain lists for control and monitor operations.

Note that errors during initialization of either control or monitor
functions on a domain would previously result in that domain being
excluded from both control and monitor operations. Now the domains are
allocated independently it is no longer required to disable both control
and monitor operations if either fail.

Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


# c103d4d4 28-Jun-2024 Tony Luck <[email protected]>

x86/resctrl: Prepare to split rdt_domain structure

The rdt_domain structure is used for both control and monitor features.
It is about to be split into separate structures for these two usages
becau

x86/resctrl: Prepare to split rdt_domain structure

The rdt_domain structure is used for both control and monitor features.
It is about to be split into separate structures for these two usages
because the scope for control and monitoring features for a resource
will be different for future resources.

To allow for common code that scans a list of domains looking for a
specific domain id, move all the common fields ("list", "id", "cpu_mask")
into their own structure within the rdt_domain structure.

Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


# f436cb69 28-Jun-2024 Tony Luck <[email protected]>

x86/resctrl: Prepare for new domain scope

Resctrl resources operate on subsets of CPUs in the system with the
defining attribute of each subset being an instance of a particular
level of cache. E.g.

x86/resctrl: Prepare for new domain scope

Resctrl resources operate on subsets of CPUs in the system with the
defining attribute of each subset being an instance of a particular
level of cache. E.g. all CPUs sharing an L3 cache would be part of the
same domain.

In preparation for features that are scoped at the NUMA node level,
change the code from explicit references to "cache_level" to a more
generic scope. At this point the only options for this scope are groups
of CPUs that share an L2 cache or L3 cache.

Clean up the error handling when looking up domains. Report invalid ids
before calling rdt_find_domain() in preparation for better messages when
scope can be other than cache scope. This means that rdt_find_domain()
will never return an error. So remove checks for error from the call sites.

Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


Revision tags: v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8
# bd4955d4 08-Mar-2024 Tony Luck <[email protected]>

x86/resctrl: Simplify call convention for MSR update functions

The per-resource MSR update functions cat_wrmsr(), mba_wrmsr_intel(),
and mba_wrmsr_amd() all take three arguments:

(struct rdt_doma

x86/resctrl: Simplify call convention for MSR update functions

The per-resource MSR update functions cat_wrmsr(), mba_wrmsr_intel(),
and mba_wrmsr_amd() all take three arguments:

(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)

struct msr_param contains pointers to both struct rdt_resource and struct
rdt_domain, thus only struct msr_param is necessary.

Pass struct msr_param as a single parameter. Clean up formatting and
fix some fir tree declaration ordering.

No functional change.

Suggested-by: Reinette Chatre <[email protected]>
Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Reviewed-by: Babu Moger <[email protected]>
Tested-by: Maciej Wieczor-Retman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


# e3ca96e4 08-Mar-2024 Tony Luck <[email protected]>

x86/resctrl: Pass domain to target CPU

reset_all_ctrls() and resctrl_arch_update_domains() use on_each_cpu_mask()
to call rdt_ctrl_update() on potentially one CPU from each domain.

But this means r

x86/resctrl: Pass domain to target CPU

reset_all_ctrls() and resctrl_arch_update_domains() use on_each_cpu_mask()
to call rdt_ctrl_update() on potentially one CPU from each domain.

But this means rdt_ctrl_update() needs to figure out which domain to
apply changes to. Doing so requires a search of all domains in a resource,
which can only be done safely if cpus_lock is held. Both callers do hold
this lock, but there isn't a way for a function called on another CPU
via IPI to verify this.

Commit

c0d848fcb09d ("x86/resctrl: Remove lockdep annotation that triggers
false positive")

removed the incorrect assertions.

Add the target domain to the msr_param structure and call
rdt_ctrl_update() for each domain separately using
smp_call_function_single(). This means that rdt_ctrl_update() doesn't
need to search for the domain and get_domain_from_cpu() can safely
assert that the cpus_lock is held since the remaining callers do not use
IPI.

Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Reviewed-by: James Morse <[email protected]>
Reviewed-by: Babu Moger <[email protected]>
Tested-by: Maciej Wieczor-Retman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

show more ...


Revision tags: v6.8-rc7, v6.8-rc6, v6.8-rc5
# fb700810 13-Feb-2024 James Morse <[email protected]>

x86/resctrl: Separate arch and fs resctrl locks

resctrl has one mutex that is taken by the architecture-specific code, and the
filesystem parts. The two interact via cpuhp, where the architecture co

x86/resctrl: Separate arch and fs resctrl locks

resctrl has one mutex that is taken by the architecture-specific code, and the
filesystem parts. The two interact via cpuhp, where the architecture code
updates the domain list. Filesystem handlers that walk the domains list should
not run concurrently with the cpuhp callback modifying the list.

Exposing a lock from the filesystem code means the interface is not cleanly
defined, and creates the possibility of cross-architecture lock ordering
headaches. The interaction only exists so that certain filesystem paths are
serialised against CPU hotplug. The CPU hotplug code already has a mechanism to
do this using cpus_read_lock().

MPAM's monitors have an overflow interrupt, so it needs to be possible to walk
the domains list in irq context. RCU is ideal for this, but some paths need to
be able to sleep to allocate memory.

Because resctrl_{on,off}line_cpu() take the rdtgroup_mutex as part of a cpuhp
callback, cpus_read_lock() must always be taken first.
rdtgroup_schemata_write() already does this.

Most of the filesystem code's domain list walkers are currently protected by
the rdtgroup_mutex taken in rdtgroup_kn_lock_live(). The exceptions are
rdt_bit_usage_show() and the mon_config helpers which take the lock directly.

Make the domain list protected by RCU. An architecture-specific lock prevents
concurrent writers. rdt_bit_usage_show() could walk the domain list using RCU,
but to keep all the filesystem operations the same, this is changed to call
cpus_read_lock(). The mon_config helpers send multiple IPIs, take the
cpus_read_lock() in these cases.

The other filesystem list walkers need to be able to sleep. Add
cpus_read_lock() to rdtgroup_kn_lock_live() so that the cpuhp callbacks can't
be invoked when file system operations are occurring.

Add lockdep_assert_cpus_held() in the cases where the rdtgroup_kn_lock_live()
call isn't obvious.

Resctrl's domain online/offline calls now need to take the rdtgroup_mutex
themselves.

[ bp: Fold in a build fix: https://lore.kernel.org/r/87zfvwieli.ffs@tglx ]

Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Reviewed-by: Babu Moger <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Peter Newman <[email protected]>
Tested-by: Babu Moger <[email protected]>
Tested-by: Carl Worth <[email protected]> # arm64
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov (AMD) <[email protected]>

show more ...


# 978fcca9 13-Feb-2024 James Morse <[email protected]>

x86/resctrl: Allow overflow/limbo handlers to be scheduled on any-but CPU

When a CPU is taken offline resctrl may need to move the overflow or limbo
handlers to run on a different CPU.

Once the off

x86/resctrl: Allow overflow/limbo handlers to be scheduled on any-but CPU

When a CPU is taken offline resctrl may need to move the overflow or limbo
handlers to run on a different CPU.

Once the offline callbacks have been split, cqm_setup_limbo_handler() will be
called while the CPU that is going offline is still present in the CPU mask.

Pass the CPU to exclude to cqm_setup_limbo_handler() and
mbm_setup_overflow_handler(). These functions can use a variant of
cpumask_any_but() when selecting the CPU. -1 is used to indicate no CPUs need
excluding.

Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Babu Moger <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Peter Newman <[email protected]>
Tested-by: Babu Moger <[email protected]>
Tested-by: Carl Worth <[email protected]> # arm64
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov (AMD) <[email protected]>

show more ...


# e557999f 13-Feb-2024 James Morse <[email protected]>

x86/resctrl: Allow arch to allocate memory needed in resctrl_arch_rmid_read()

Depending on the number of monitors available, Arm's MPAM may need to
allocate a monitor prior to reading the counter va

x86/resctrl: Allow arch to allocate memory needed in resctrl_arch_rmid_read()

Depending on the number of monitors available, Arm's MPAM may need to
allocate a monitor prior to reading the counter value. Allocating a
contended resource may involve sleeping.

__check_limbo() and mon_event_count() each make multiple calls to
resctrl_arch_rmid_read(), to avoid extra work on contended systems,
the allocation should be valid for multiple invocations of
resctrl_arch_rmid_read().

The memory or hardware allocated is not specific to a domain.

Add arch hooks for this allocation, which need calling before
resctrl_arch_rmid_read(). The allocated monitor is passed to
resctrl_arch_rmid_read(), then freed again afterwards. The helper
can be called on any CPU, and can sleep.

Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Reviewed-by: Babu Moger <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Peter Newman <[email protected]>
Tested-by: Babu Moger <[email protected]>
Tested-by: Carl Worth <[email protected]> # arm64
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov (AMD) <[email protected]>

show more ...


# 09909e09 13-Feb-2024 James Morse <[email protected]>

x86/resctrl: Queue mon_event_read() instead of sending an IPI

Intel is blessed with an abundance of monitors, one per RMID, that can be
read from any CPU in the domain. MPAMs monitors reside in the

x86/resctrl: Queue mon_event_read() instead of sending an IPI

Intel is blessed with an abundance of monitors, one per RMID, that can be
read from any CPU in the domain. MPAMs monitors reside in the MMIO MSC,
the number implemented is up to the manufacturer. This means when there are
fewer monitors than needed, they need to be allocated and freed.

MPAM's CSU monitors are used to back the 'llc_occupancy' monitor file. The
CSU counter is allowed to return 'not ready' for a small number of
micro-seconds after programming. To allow one CSU hardware monitor to be
used for multiple control or monitor groups, the CPU accessing the
monitor needs to be able to block when configuring and reading the
counter.

Worse, the domain may be broken up into slices, and the MMIO accesses
for each slice may need performing from different CPUs.

These two details mean MPAMs monitor code needs to be able to sleep, and
IPI another CPU in the domain to read from a resource that has been sliced.

mon_event_read() already invokes mon_event_count() via IPI, which means
this isn't possible. On systems using nohz-full, some CPUs need to be
interrupted to run kernel work as they otherwise stay in user-space
running realtime workloads. Interrupting these CPUs should be avoided,
and scheduling work on them may never complete.

Change mon_event_read() to pick a housekeeping CPU, (one that is not using
nohz_full) and schedule mon_event_count() and wait. If all the CPUs
in a domain are using nohz-full, then an IPI is used as the fallback.

This function is only used in response to a user-space filesystem request
(not the timing sensitive overflow code).

This allows MPAM to hide the slice behaviour from resctrl, and to keep
the monitor-allocation in monitor.c. When the IPI fallback is used on
machines where MPAM needs to make an access on multiple CPUs, the counter
read will always fail.

Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Peter Newman <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Reviewed-by: Babu Moger <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Peter Newman <[email protected]>
Tested-by: Babu Moger <[email protected]>
Tested-by: Carl Worth <[email protected]> # arm64
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov (AMD) <[email protected]>

show more ...


Revision tags: v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6
# 0e3cd31f 10-Oct-2023 Maciej Wieczor-Retman <[email protected]>

x86/resctrl: Enable non-contiguous CBMs in Intel CAT

The setting for non-contiguous 1s support in Intel CAT is
hardcoded to false. On these systems, writing non-contiguous
1s into the schemata file

x86/resctrl: Enable non-contiguous CBMs in Intel CAT

The setting for non-contiguous 1s support in Intel CAT is
hardcoded to false. On these systems, writing non-contiguous
1s into the schemata file will fail before resctrl passes
the value to the hardware.

In Intel CAT CPUID.0x10.1:ECX[3] and CPUID.0x10.2:ECX[3] stopped
being reserved and now carry information about non-contiguous 1s
value support for L3 and L2 cache respectively. The CAT
capacity bitmask (CBM) supports a non-contiguous 1s value if
the bit is set.

The exception are Haswell systems where non-contiguous 1s value
support needs to stay disabled since they can't make use of CPUID
for Cache allocation.

Originally-by: Fenghua Yu <[email protected]>
Signed-off-by: Maciej Wieczor-Retman <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Ilpo Järvinen <[email protected]>
Reviewed-by: Peter Newman <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Reviewed-by: Babu Moger <[email protected]>
Tested-by: Peter Newman <[email protected]>
Link: https://lore.kernel.org/r/1849b487256fe4de40b30f88450cba3d9abc9171.1696934091.git.maciej.wieczor-retman@intel.com

show more ...


123