|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7 |
|
| #
f62b4e45 |
| 11-Mar-2025 |
James Morse <[email protected]> |
x86/resctrl: Move get_config_index() to a header
get_config_index() is used by the architecture specific code to map a CLOSID+type pair to an index in the configuration arrays.
MPAM needs to do thi
x86/resctrl: Move get_config_index() to a header
get_config_index() is used by the architecture specific code to map a CLOSID+type pair to an index in the configuration arrays.
MPAM needs to do this too to preserve the ABI to user-space, there is no reason to do it differently.
Move the helper to a header file to allow all architectures that either use or emulate CDP to use the same pattern of CLOSID values. Moving this to a header file means it must be marked inline, which matches the existing compiler choice for this static function.
Co-developed-by: Dave Martin <[email protected]> Signed-off-by: Dave Martin <[email protected]> Signed-off-by: James Morse <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Shaopeng Tan <[email protected]> Reviewed-by: Tony Luck <[email protected]> Reviewed-by: Fenghua Yu <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Reviewed-by: Babu Moger <[email protected]> Tested-by: Carl Worth <[email protected]> # arm64 Tested-by: Shanker Donthineni <[email protected]> # arm64 Tested-by: Shaopeng Tan <[email protected]> Tested-by: Peter Newman <[email protected]> Tested-by: Amit Singh Tomar <[email protected]> # arm64 Tested-by: Babu Moger <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
d012b66a |
| 11-Mar-2025 |
James Morse <[email protected]> |
x86/resctrl: Move the is_mbm_*_enabled() helpers to asm/resctrl.h
The architecture specific parts of resctrl provide helpers like is_mbm_total_enabled() and is_mbm_local_enabled() to hide accesses t
x86/resctrl: Move the is_mbm_*_enabled() helpers to asm/resctrl.h
The architecture specific parts of resctrl provide helpers like is_mbm_total_enabled() and is_mbm_local_enabled() to hide accesses to the rdt_mon_features bitmap.
Exposing a group of helpers between the architecture and filesystem code is preferable to a single unsigned-long like rdt_mon_features. Helpers can be more readable and have a well defined behaviour, while allowing architectures to hide more complex behaviour.
Once the filesystem parts of resctrl are moved, these existing helpers can no longer live in internal.h. Move them to include/linux/resctrl.h Once these are exposed to the wider kernel, they should have a 'resctrl_arch_' prefix, to fit the rest of the arch<->fs interface.
Move and rename the helpers that touch rdt_mon_features directly. is_mbm_event() and is_mbm_enabled() are only called from rdtgroup.c, so can be moved into that file.
Signed-off-by: James Morse <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Tony Luck <[email protected]> Reviewed-by: Fenghua Yu <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Reviewed-by: Babu Moger <[email protected]> Reviewed-by: Shaopeng Tan <[email protected]> Tested-by: Carl Worth <[email protected]> # arm64 Tested-by: Shaopeng Tan <[email protected]> Tested-by: Peter Newman <[email protected]> Tested-by: Amit Singh Tomar <[email protected]> # arm64 Tested-by: Shanker Donthineni <[email protected]> # arm64 Tested-by: Babu Moger <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
e3d5138c |
| 11-Mar-2025 |
James Morse <[email protected]> |
x86/resctrl: Move rdt_find_domain() to be visible to arch and fs code
rdt_find_domain() finds a domain given a resource and a cache-id. This is used by both the architecture code and the filesystem
x86/resctrl: Move rdt_find_domain() to be visible to arch and fs code
rdt_find_domain() finds a domain given a resource and a cache-id. This is used by both the architecture code and the filesystem code.
After the filesystem code moves to live in /fs/, this helper is either duplicated by all architectures, or needs exposing by the filesystem code.
Add the declaration to the global header file. As it's now globally visible, and has only a handful of callers, swap the 'rdt' for 'resctrl'. Move the function to live with its caller in ctrlmondata.c as the filesystem code will not have anything corresponding to core.c.
Signed-off-by: James Morse <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Fenghua Yu <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Reviewed-by: Babu Moger <[email protected]> Reviewed-by: Shaopeng Tan <[email protected]> Tested-by: Peter Newman <[email protected]> Tested-by: Shaopeng Tan <[email protected]> Tested-by: Amit Singh Tomar <[email protected]> # arm64 Tested-by: Shanker Donthineni <[email protected]> # arm64 Tested-by: Babu Moger <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
dbc58f7e |
| 11-Mar-2025 |
James Morse <[email protected]> |
x86/resctrl: Generate default_ctrl instead of sharing it
The struct rdt_resource default_ctrl is used by both the architecture code for resetting the hardware controls, and sometimes by the filesyst
x86/resctrl: Generate default_ctrl instead of sharing it
The struct rdt_resource default_ctrl is used by both the architecture code for resetting the hardware controls, and sometimes by the filesystem code as the default value for the schema, unless the bandwidth software controller is in use.
Having the default exposed by the architecture code causes unnecessary duplication for each architecture as the default value must be specified, but can be derived from other schema properties. Now that the maximum bandwidth is explicitly described, resctrl can derive the default value from the schema format and the other resource properties.
Signed-off-by: James Morse <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Shaopeng Tan <[email protected]> Reviewed-by: Tony Luck <[email protected]> Reviewed-by: Fenghua Yu <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Reviewed-by: Babu Moger <[email protected]> Tested-by: Carl Worth <[email protected]> # arm64 Tested-by: Shaopeng Tan <[email protected]> Tested-by: Peter Newman <[email protected]> Tested-by: Amit Singh Tomar <[email protected]> # arm64 Tested-by: Shanker Donthineni <[email protected]> # arm64 Tested-by: Babu Moger <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
634ebb98 |
| 11-Mar-2025 |
James Morse <[email protected]> |
x86/resctrl: Add max_bw to struct resctrl_membw
__rdt_get_mem_config_amd() and __get_mem_config_intel() both use the default_ctrl property as a maximum value. This is because the MBA schema works di
x86/resctrl: Add max_bw to struct resctrl_membw
__rdt_get_mem_config_amd() and __get_mem_config_intel() both use the default_ctrl property as a maximum value. This is because the MBA schema works differently between these platforms. Doing this complicates determining whether the default_ctrl property belongs to the arch code, or can be derived from the schema format.
Deriving the maximum or default value from the schema format would avoid the architecture code having to tell resctrl such obvious things as the maximum percentage is 100, and the maximum bitmap is all ones.
Maximum bandwidth is always going to vary per platform. Add max_bw as a special case. This is currently used for the maximum MBA percentage on Intel platforms, but can be removed from the architecture code if 'percentage' becomes a schema format resctrl supports directly.
This value isn't needed for other schema formats.
This will allow the default_ctrl to be generated from the schema properties when it is needed.
Signed-off-by: James Morse <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Shaopeng Tan <[email protected]> Reviewed-by: Tony Luck <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Reviewed-by: Fenghua Yu <[email protected]> Reviewed-by: Babu Moger <[email protected]> Tested-by: Carl Worth <[email protected]> # arm64 Tested-by: Shaopeng Tan <[email protected]> Tested-by: Peter Newman <[email protected]> Tested-by: Amit Singh Tomar <[email protected]> # arm64 Tested-by: Shanker Donthineni <[email protected]> # arm64 Tested-by: Babu Moger <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
43312b8e |
| 11-Mar-2025 |
James Morse <[email protected]> |
x86/resctrl: Remove data_width and the tabular format
The resctrl architecture code provides a data_width for the controls of each resource. This is used to zero pad all control values in the schema
x86/resctrl: Remove data_width and the tabular format
The resctrl architecture code provides a data_width for the controls of each resource. This is used to zero pad all control values in the schemata file so they appear in columns. The same is done with the resource names to complete the visual effect. e.g.
| SMBA:0=2048 | L3:0=00ff
AMD platforms discover their maximum bandwidth for the MB resource from firmware, but hard-code the data_width to 4. If the maximum bandwidth requires more digits - the tabular format is silently broken. This is also broken when the mba_MBps mount option is used as the field width isn't updated. If new schema are added resctrl will need to be able to determine the maximum width. The benefit of this pretty-printing is questionable.
Instead of handling runtime discovery of the data_width for AMD platforms, remove the feature. These fields are always zero padded so should be harmless to remove if the whole field has been treated as a number. In the above example, this would now look like this:
| SMBA:0=2048 | L3:0=ff
Signed-off-by: James Morse <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Shaopeng Tan <[email protected]> Reviewed-by: Tony Luck <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Reviewed-by: Fenghua Yu <[email protected]> Reviewed-by: Babu Moger <[email protected]> Tested-by: Shaopeng Tan <[email protected]> Tested-by: Peter Newman <[email protected]> Tested-by: Amit Singh Tomar <[email protected]> # arm64 Tested-by: Shanker Donthineni <[email protected]> # arm64 Tested-by: Babu Moger <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
bb9343c8 |
| 11-Mar-2025 |
James Morse <[email protected]> |
x86/resctrl: Use schema type to determine the schema format string
Resctrl's architecture code gets to specify a format string that is used when printing schema entries. This is expected to be one o
x86/resctrl: Use schema type to determine the schema format string
Resctrl's architecture code gets to specify a format string that is used when printing schema entries. This is expected to be one of two values that the filesystem code supports.
Setting this format string allows the architecture code to change the ABI resctrl presents to user-space.
Instead, use the schema format enum to choose which format string to use.
Signed-off-by: James Morse <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Shaopeng Tan <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Reviewed-by: Tony Luck <[email protected]> Reviewed-by: Fenghua Yu <[email protected]> Reviewed-by: Babu Moger <[email protected]> Tested-by: Carl Worth <[email protected]> # arm64 Tested-by: Shaopeng Tan <[email protected]> Tested-by: Peter Newman <[email protected]> Tested-by: Amit Singh Tomar <[email protected]> # arm64 Tested-by: Shanker Donthineni <[email protected]> # arm64 Tested-by: Babu Moger <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
c24f5eab |
| 11-Mar-2025 |
James Morse <[email protected]> |
x86/resctrl: Use schema type to determine how to parse schema values
Resctrl's architecture code gets to specify a function pointer that is used when parsing schema entries. This is expected to be o
x86/resctrl: Use schema type to determine how to parse schema values
Resctrl's architecture code gets to specify a function pointer that is used when parsing schema entries. This is expected to be one of two helpers from the filesystem code.
Setting this function pointer allows the architecture code to change the ABI resctrl presents to user-space, and forces resctrl to expose these helpers.
Instead, add a schema format enum to choose which schema parser to use. This allows the helpers to be made static and the structs used for passing arguments moved out of shared headers.
Signed-off-by: James Morse <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Shaopeng Tan <[email protected]> Reviewed-by: Tony Luck <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Reviewed-by: Fenghua Yu <[email protected]> Reviewed-by: Babu Moger <[email protected]> Tested-by: Carl Worth <[email protected]> # arm64 Tested-by: Shaopeng Tan <[email protected]> Tested-by: Peter Newman <[email protected]> Tested-by: Amit Singh Tomar <[email protected]> # arm64 Tested-by: Shanker Donthineni <[email protected]> # arm64 Tested-by: Babu Moger <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
3c021531 |
| 11-Mar-2025 |
James Morse <[email protected]> |
x86/resctrl: Add a helper to avoid reaching into the arch code resource list
Resctrl occasionally wants to know something about a specific resource, in these cases it reaches into the arch code's rd
x86/resctrl: Add a helper to avoid reaching into the arch code resource list
Resctrl occasionally wants to know something about a specific resource, in these cases it reaches into the arch code's rdt_resources_all[] array.
Once the filesystem parts of resctrl are moved to /fs/, this means it will need visibility of the architecture specific struct rdt_hw_resource definition, and the array of all resources. All architectures would also need a r_resctrl member in this struct.
Instead, abstract this via a helper to allow architectures to do different things here. Move the level enum to the resctrl header and add a helper to retrieve the struct rdt_resource by 'rid'.
resctrl_arch_get_resource() should not return NULL for any value in the enum, it may instead return a dummy resource that is !alloc_enabled && !mon_enabled.
Co-developed-by: Dave Martin <[email protected]> Signed-off-by: Dave Martin <[email protected]> Signed-off-by: James Morse <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Shaopeng Tan <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Reviewed-by: Tony Luck <[email protected]> Reviewed-by: Fenghua Yu <[email protected]> Reviewed-by: Babu Moger <[email protected]> Tested-by: Peter Newman <[email protected]> Tested-by: Carl Worth <[email protected]> # arm64 Tested-by: Shaopeng Tan <[email protected]> Tested-by: Amit Singh Tomar <[email protected]> # arm64 Tested-by: Shanker Donthineni <[email protected]> # arm64 Tested-by: Babu Moger <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2 |
|
| #
8e931105 |
| 06-Dec-2024 |
Tony Luck <[email protected]> |
x86/resctrl: Add write option to "mba_MBps_event" file
The "mba_MBps" mount option provides an alternate method to control memory bandwidth. Instead of specifying allowable bandwidth as a percentage
x86/resctrl: Add write option to "mba_MBps_event" file
The "mba_MBps" mount option provides an alternate method to control memory bandwidth. Instead of specifying allowable bandwidth as a percentage of maximum possible, the user provides a MiB/s limit value.
There is a file in each CTRL_MON group directory that shows the event currently in use.
Allow writing that file to choose a different event.
A user can choose any of the memory bandwidth monitoring events listed in /sys/fs/resctrl/info/L3_mon/mon_features independently for each CTRL_MON group by writing to each of the "mba_MBps_event" files.
Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Tested-by: Babu Moger <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
f5cd0e31 |
| 06-Dec-2024 |
Tony Luck <[email protected]> |
x86/resctrl: Add "mba_MBps_event" file to CTRL_MON directories
The "mba_MBps" mount option provides an alternate method to control memory bandwidth. Instead of specifying allowable bandwidth as a pe
x86/resctrl: Add "mba_MBps_event" file to CTRL_MON directories
The "mba_MBps" mount option provides an alternate method to control memory bandwidth. Instead of specifying allowable bandwidth as a percentage of maximum possible, the user provides a MiB/s limit value.
In preparation to allow the user to pick the memory bandwidth monitoring event used as input to the feedback loop, provide a file in each CTRL_MON group directory that shows the event currently in use. Note that this file is only visible when the "mba_MBps" mount option is in use.
Suggested-by: Reinette Chatre <[email protected]> Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Tested-by: Babu Moger <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2 |
|
| #
2b564841 |
| 01-Oct-2024 |
Martin Kletzander <[email protected]> |
x86/resctrl: Avoid overflow in MB settings in bw_validate()
The resctrl schemata file supports specifying memory bandwidth associated with the Memory Bandwidth Allocation (MBA) feature via a percent
x86/resctrl: Avoid overflow in MB settings in bw_validate()
The resctrl schemata file supports specifying memory bandwidth associated with the Memory Bandwidth Allocation (MBA) feature via a percentage (this is the default) or bandwidth in MiBps (when resctrl is mounted with the "mba_MBps" option).
The allowed range for the bandwidth percentage is from /sys/fs/resctrl/info/MB/min_bandwidth to 100, using a granularity of /sys/fs/resctrl/info/MB/bandwidth_gran. The supported range for the MiBps bandwidth is 0 to U32_MAX.
There are two issues with parsing of MiBps memory bandwidth:
* The user provided MiBps is mistakenly rounded up to the granularity that is unique to percentage input.
* The user provided MiBps is parsed using unsigned long (thus accepting values up to ULONG_MAX), and then assigned to u32 that could result in overflow.
Do not round up the MiBps value and parse user provided bandwidth as the u32 it is intended to be. Use the appropriate kstrtou32() that can detect out of range values.
Fixes: 8205a078ba78 ("x86/intel_rdt/mba_sc: Add schemata support") Fixes: 6ce1560d35f6 ("x86/resctrl: Switch over to the resctrl mbps_val list") Co-developed-by: Reinette Chatre <[email protected]> Signed-off-by: Reinette Chatre <[email protected]> Signed-off-by: Martin Kletzander <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Reviewed-by: Tony Luck <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6 |
|
| #
c8c7d3d9 |
| 28-Jun-2024 |
Tony Luck <[email protected]> |
x86/resctrl: Fill out rmid_read structure for smp_call*() to read a counter
mon_event_read() fills out most fields of the struct rmid_read that is passed via an smp_call*() function to a CPU that is
x86/resctrl: Fill out rmid_read structure for smp_call*() to read a counter
mon_event_read() fills out most fields of the struct rmid_read that is passed via an smp_call*() function to a CPU that is part of the correct domain to read the monitor counters.
With Sub-NUMA Cluster (SNC) mode there are now two cases to handle:
1) Reading a file that returns a value for a single domain. + Choose the CPU to execute from the domain cpu_mask
2) Reading a file that must sum across domains sharing an L3 cache instance. + Indicate to called code that a sum is needed by passing a NULL rdt_mon_domain pointer. + Choose the CPU from the L3 shared_cpu_map.
Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Tested-by: Babu Moger <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
587edd70 |
| 28-Jun-2024 |
Tony Luck <[email protected]> |
x86/resctrl: Initialize on-stack struct rmid_read instances
New semantics rely on some struct rmid_read members having NULL values to distinguish between the SNC and non-SNC scenarios. resctrl can
x86/resctrl: Initialize on-stack struct rmid_read instances
New semantics rely on some struct rmid_read members having NULL values to distinguish between the SNC and non-SNC scenarios. resctrl can thus no longer rely on this struct not being initialized properly.
Initialize all on-stack declarations of struct rmid_read:
rdtgroup_mondata_show() mbm_update() mkdir_mondata_subdir()
to ensure that garbage values from the stack are not passed down to other functions.
[ bp: Massage commit message. ]
Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Tested-by: Babu Moger <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
cae2bcb6 |
| 28-Jun-2024 |
Tony Luck <[email protected]> |
x86/resctrl: Split the rdt_domain and rdt_hw_domain structures
The same rdt_domain structure is used for both control and monitor functions. But this results in wasted memory as some of the fields a
x86/resctrl: Split the rdt_domain and rdt_hw_domain structures
The same rdt_domain structure is used for both control and monitor functions. But this results in wasted memory as some of the fields are only used by control functions, while most are only used for monitor functions.
Split into separate rdt_ctrl_domain and rdt_mon_domain structures with just the fields required for control and monitoring respectively.
Similar split of the rdt_hw_domain structure into rdt_hw_ctrl_domain and rdt_hw_mon_domain.
Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Tested-by: Babu Moger <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
cd84f72b |
| 28-Jun-2024 |
Tony Luck <[email protected]> |
x86/resctrl: Prepare for different scope for control/monitor operations
Resctrl assumes that control and monitor operations on a resource are performed at the same scope.
Prepare for systems that u
x86/resctrl: Prepare for different scope for control/monitor operations
Resctrl assumes that control and monitor operations on a resource are performed at the same scope.
Prepare for systems that use different scope (specifically Intel needs to split the RDT_RESOURCE_L3 resource to use L3 scope for cache control and NODE scope for cache occupancy and memory bandwidth monitoring).
Create separate domain lists for control and monitor operations.
Note that errors during initialization of either control or monitor functions on a domain would previously result in that domain being excluded from both control and monitor operations. Now the domains are allocated independently it is no longer required to disable both control and monitor operations if either fail.
Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Tested-by: Babu Moger <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
c103d4d4 |
| 28-Jun-2024 |
Tony Luck <[email protected]> |
x86/resctrl: Prepare to split rdt_domain structure
The rdt_domain structure is used for both control and monitor features. It is about to be split into separate structures for these two usages becau
x86/resctrl: Prepare to split rdt_domain structure
The rdt_domain structure is used for both control and monitor features. It is about to be split into separate structures for these two usages because the scope for control and monitoring features for a resource will be different for future resources.
To allow for common code that scans a list of domains looking for a specific domain id, move all the common fields ("list", "id", "cpu_mask") into their own structure within the rdt_domain structure.
Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Tested-by: Babu Moger <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
f436cb69 |
| 28-Jun-2024 |
Tony Luck <[email protected]> |
x86/resctrl: Prepare for new domain scope
Resctrl resources operate on subsets of CPUs in the system with the defining attribute of each subset being an instance of a particular level of cache. E.g.
x86/resctrl: Prepare for new domain scope
Resctrl resources operate on subsets of CPUs in the system with the defining attribute of each subset being an instance of a particular level of cache. E.g. all CPUs sharing an L3 cache would be part of the same domain.
In preparation for features that are scoped at the NUMA node level, change the code from explicit references to "cache_level" to a more generic scope. At this point the only options for this scope are groups of CPUs that share an L2 cache or L3 cache.
Clean up the error handling when looking up domains. Report invalid ids before calling rdt_find_domain() in preparation for better messages when scope can be other than cache scope. This means that rdt_find_domain() will never return an error. So remove checks for error from the call sites.
Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Tested-by: Babu Moger <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8 |
|
| #
bd4955d4 |
| 08-Mar-2024 |
Tony Luck <[email protected]> |
x86/resctrl: Simplify call convention for MSR update functions
The per-resource MSR update functions cat_wrmsr(), mba_wrmsr_intel(), and mba_wrmsr_amd() all take three arguments:
(struct rdt_doma
x86/resctrl: Simplify call convention for MSR update functions
The per-resource MSR update functions cat_wrmsr(), mba_wrmsr_intel(), and mba_wrmsr_amd() all take three arguments:
(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
struct msr_param contains pointers to both struct rdt_resource and struct rdt_domain, thus only struct msr_param is necessary.
Pass struct msr_param as a single parameter. Clean up formatting and fix some fir tree declaration ordering.
No functional change.
Suggested-by: Reinette Chatre <[email protected]> Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Reviewed-by: Babu Moger <[email protected]> Tested-by: Maciej Wieczor-Retman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
e3ca96e4 |
| 08-Mar-2024 |
Tony Luck <[email protected]> |
x86/resctrl: Pass domain to target CPU
reset_all_ctrls() and resctrl_arch_update_domains() use on_each_cpu_mask() to call rdt_ctrl_update() on potentially one CPU from each domain.
But this means r
x86/resctrl: Pass domain to target CPU
reset_all_ctrls() and resctrl_arch_update_domains() use on_each_cpu_mask() to call rdt_ctrl_update() on potentially one CPU from each domain.
But this means rdt_ctrl_update() needs to figure out which domain to apply changes to. Doing so requires a search of all domains in a resource, which can only be done safely if cpus_lock is held. Both callers do hold this lock, but there isn't a way for a function called on another CPU via IPI to verify this.
Commit
c0d848fcb09d ("x86/resctrl: Remove lockdep annotation that triggers false positive")
removed the incorrect assertions.
Add the target domain to the msr_param structure and call rdt_ctrl_update() for each domain separately using smp_call_function_single(). This means that rdt_ctrl_update() doesn't need to search for the domain and get_domain_from_cpu() can safely assert that the cpus_lock is held since the remaining callers do not use IPI.
Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Reviewed-by: James Morse <[email protected]> Reviewed-by: Babu Moger <[email protected]> Tested-by: Maciej Wieczor-Retman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v6.8-rc7, v6.8-rc6, v6.8-rc5 |
|
| #
fb700810 |
| 13-Feb-2024 |
James Morse <[email protected]> |
x86/resctrl: Separate arch and fs resctrl locks
resctrl has one mutex that is taken by the architecture-specific code, and the filesystem parts. The two interact via cpuhp, where the architecture co
x86/resctrl: Separate arch and fs resctrl locks
resctrl has one mutex that is taken by the architecture-specific code, and the filesystem parts. The two interact via cpuhp, where the architecture code updates the domain list. Filesystem handlers that walk the domains list should not run concurrently with the cpuhp callback modifying the list.
Exposing a lock from the filesystem code means the interface is not cleanly defined, and creates the possibility of cross-architecture lock ordering headaches. The interaction only exists so that certain filesystem paths are serialised against CPU hotplug. The CPU hotplug code already has a mechanism to do this using cpus_read_lock().
MPAM's monitors have an overflow interrupt, so it needs to be possible to walk the domains list in irq context. RCU is ideal for this, but some paths need to be able to sleep to allocate memory.
Because resctrl_{on,off}line_cpu() take the rdtgroup_mutex as part of a cpuhp callback, cpus_read_lock() must always be taken first. rdtgroup_schemata_write() already does this.
Most of the filesystem code's domain list walkers are currently protected by the rdtgroup_mutex taken in rdtgroup_kn_lock_live(). The exceptions are rdt_bit_usage_show() and the mon_config helpers which take the lock directly.
Make the domain list protected by RCU. An architecture-specific lock prevents concurrent writers. rdt_bit_usage_show() could walk the domain list using RCU, but to keep all the filesystem operations the same, this is changed to call cpus_read_lock(). The mon_config helpers send multiple IPIs, take the cpus_read_lock() in these cases.
The other filesystem list walkers need to be able to sleep. Add cpus_read_lock() to rdtgroup_kn_lock_live() so that the cpuhp callbacks can't be invoked when file system operations are occurring.
Add lockdep_assert_cpus_held() in the cases where the rdtgroup_kn_lock_live() call isn't obvious.
Resctrl's domain online/offline calls now need to take the rdtgroup_mutex themselves.
[ bp: Fold in a build fix: https://lore.kernel.org/r/87zfvwieli.ffs@tglx ]
Signed-off-by: James Morse <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Shaopeng Tan <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Reviewed-by: Babu Moger <[email protected]> Tested-by: Shaopeng Tan <[email protected]> Tested-by: Peter Newman <[email protected]> Tested-by: Babu Moger <[email protected]> Tested-by: Carl Worth <[email protected]> # arm64 Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Borislav Petkov (AMD) <[email protected]>
show more ...
|
| #
978fcca9 |
| 13-Feb-2024 |
James Morse <[email protected]> |
x86/resctrl: Allow overflow/limbo handlers to be scheduled on any-but CPU
When a CPU is taken offline resctrl may need to move the overflow or limbo handlers to run on a different CPU.
Once the off
x86/resctrl: Allow overflow/limbo handlers to be scheduled on any-but CPU
When a CPU is taken offline resctrl may need to move the overflow or limbo handlers to run on a different CPU.
Once the offline callbacks have been split, cqm_setup_limbo_handler() will be called while the CPU that is going offline is still present in the CPU mask.
Pass the CPU to exclude to cqm_setup_limbo_handler() and mbm_setup_overflow_handler(). These functions can use a variant of cpumask_any_but() when selecting the CPU. -1 is used to indicate no CPUs need excluding.
Signed-off-by: James Morse <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Shaopeng Tan <[email protected]> Reviewed-by: Babu Moger <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Tested-by: Shaopeng Tan <[email protected]> Tested-by: Peter Newman <[email protected]> Tested-by: Babu Moger <[email protected]> Tested-by: Carl Worth <[email protected]> # arm64 Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Borislav Petkov (AMD) <[email protected]>
show more ...
|
| #
e557999f |
| 13-Feb-2024 |
James Morse <[email protected]> |
x86/resctrl: Allow arch to allocate memory needed in resctrl_arch_rmid_read()
Depending on the number of monitors available, Arm's MPAM may need to allocate a monitor prior to reading the counter va
x86/resctrl: Allow arch to allocate memory needed in resctrl_arch_rmid_read()
Depending on the number of monitors available, Arm's MPAM may need to allocate a monitor prior to reading the counter value. Allocating a contended resource may involve sleeping.
__check_limbo() and mon_event_count() each make multiple calls to resctrl_arch_rmid_read(), to avoid extra work on contended systems, the allocation should be valid for multiple invocations of resctrl_arch_rmid_read().
The memory or hardware allocated is not specific to a domain.
Add arch hooks for this allocation, which need calling before resctrl_arch_rmid_read(). The allocated monitor is passed to resctrl_arch_rmid_read(), then freed again afterwards. The helper can be called on any CPU, and can sleep.
Signed-off-by: James Morse <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Shaopeng Tan <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Reviewed-by: Babu Moger <[email protected]> Tested-by: Shaopeng Tan <[email protected]> Tested-by: Peter Newman <[email protected]> Tested-by: Babu Moger <[email protected]> Tested-by: Carl Worth <[email protected]> # arm64 Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Borislav Petkov (AMD) <[email protected]>
show more ...
|
| #
09909e09 |
| 13-Feb-2024 |
James Morse <[email protected]> |
x86/resctrl: Queue mon_event_read() instead of sending an IPI
Intel is blessed with an abundance of monitors, one per RMID, that can be read from any CPU in the domain. MPAMs monitors reside in the
x86/resctrl: Queue mon_event_read() instead of sending an IPI
Intel is blessed with an abundance of monitors, one per RMID, that can be read from any CPU in the domain. MPAMs monitors reside in the MMIO MSC, the number implemented is up to the manufacturer. This means when there are fewer monitors than needed, they need to be allocated and freed.
MPAM's CSU monitors are used to back the 'llc_occupancy' monitor file. The CSU counter is allowed to return 'not ready' for a small number of micro-seconds after programming. To allow one CSU hardware monitor to be used for multiple control or monitor groups, the CPU accessing the monitor needs to be able to block when configuring and reading the counter.
Worse, the domain may be broken up into slices, and the MMIO accesses for each slice may need performing from different CPUs.
These two details mean MPAMs monitor code needs to be able to sleep, and IPI another CPU in the domain to read from a resource that has been sliced.
mon_event_read() already invokes mon_event_count() via IPI, which means this isn't possible. On systems using nohz-full, some CPUs need to be interrupted to run kernel work as they otherwise stay in user-space running realtime workloads. Interrupting these CPUs should be avoided, and scheduling work on them may never complete.
Change mon_event_read() to pick a housekeeping CPU, (one that is not using nohz_full) and schedule mon_event_count() and wait. If all the CPUs in a domain are using nohz-full, then an IPI is used as the fallback.
This function is only used in response to a user-space filesystem request (not the timing sensitive overflow code).
This allows MPAM to hide the slice behaviour from resctrl, and to keep the monitor-allocation in monitor.c. When the IPI fallback is used on machines where MPAM needs to make an access on multiple CPUs, the counter read will always fail.
Signed-off-by: James Morse <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Shaopeng Tan <[email protected]> Reviewed-by: Peter Newman <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Reviewed-by: Babu Moger <[email protected]> Tested-by: Shaopeng Tan <[email protected]> Tested-by: Peter Newman <[email protected]> Tested-by: Babu Moger <[email protected]> Tested-by: Carl Worth <[email protected]> # arm64 Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Borislav Petkov (AMD) <[email protected]>
show more ...
|
|
Revision tags: v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6 |
|
| #
0e3cd31f |
| 10-Oct-2023 |
Maciej Wieczor-Retman <[email protected]> |
x86/resctrl: Enable non-contiguous CBMs in Intel CAT
The setting for non-contiguous 1s support in Intel CAT is hardcoded to false. On these systems, writing non-contiguous 1s into the schemata file
x86/resctrl: Enable non-contiguous CBMs in Intel CAT
The setting for non-contiguous 1s support in Intel CAT is hardcoded to false. On these systems, writing non-contiguous 1s into the schemata file will fail before resctrl passes the value to the hardware.
In Intel CAT CPUID.0x10.1:ECX[3] and CPUID.0x10.2:ECX[3] stopped being reserved and now carry information about non-contiguous 1s value support for L3 and L2 cache respectively. The CAT capacity bitmask (CBM) supports a non-contiguous 1s value if the bit is set.
The exception are Haswell systems where non-contiguous 1s value support needs to stay disabled since they can't make use of CPUID for Cache allocation.
Originally-by: Fenghua Yu <[email protected]> Signed-off-by: Maciej Wieczor-Retman <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Ilpo Järvinen <[email protected]> Reviewed-by: Peter Newman <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Reviewed-by: Babu Moger <[email protected]> Tested-by: Peter Newman <[email protected]> Link: https://lore.kernel.org/r/1849b487256fe4de40b30f88450cba3d9abc9171.1696934091.git.maciej.wieczor-retman@intel.com
show more ...
|