History log of /linux-6.15/include/uapi/drm/xe_drm.h (Results 1 – 25 of 136)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6
# 77613a2e 06-Mar-2025 Matthew Brost <[email protected]>

drm/xe/uapi: Add DRM_XE_QUERY_CONFIG_FLAG_HAS_CPU_ADDR_MIRROR

Add the DRM_XE_QUERY_CONFIG_FLAG_HAS_CPU_ADDR_MIRROR device query flag,
which indicates whether the device supports CPU address mirrorin

drm/xe/uapi: Add DRM_XE_QUERY_CONFIG_FLAG_HAS_CPU_ADDR_MIRROR

Add the DRM_XE_QUERY_CONFIG_FLAG_HAS_CPU_ADDR_MIRROR device query flag,
which indicates whether the device supports CPU address mirroring. The
intent is for UMDs to use this query to determine if a VM can be set up
with CPU address mirroring. This flag is implemented by checking if the
device supports GPU faults.

v7:
- Only report enabled if CONFIG_DRM_GPUSVM is selected (CI)

Signed-off-by: Matthew Brost <[email protected]>
Reviewed-by: Himal Prasad Ghimiray <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Reviewed-by: Tejas Upadhyay <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

show more ...


# b43e864a 06-Mar-2025 Matthew Brost <[email protected]>

drm/xe/uapi: Add DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR

Add the DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR flag, which is used to
create unpopulated virtual memory areas (VMAs) without memory backing or
GPU p

drm/xe/uapi: Add DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR

Add the DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR flag, which is used to
create unpopulated virtual memory areas (VMAs) without memory backing or
GPU page tables. These VMAs are referred to as CPU address mirror VMAs.
The idea is that upon a page fault or prefetch, the memory backing and
GPU page tables will be populated.

CPU address mirror VMAs only update GPUVM state; they do not have an
internal page table (PT) state, nor do they have GPU mappings.

It is expected that CPU address mirror VMAs will be mixed with buffer
object (BO) VMAs within a single VM. In other words, system allocations
and runtime allocations can be mixed within a single user-mode driver
(UMD) program.

Expected usage:

- Bind the entire virtual address (VA) space upon program load using the
DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR flag.
- If a buffer object (BO) requires GPU mapping (runtime allocation),
allocate a CPU address using mmap(PROT_NONE), bind the BO to the
mmapped address using existing bind IOCTLs. If a CPU map of the BO is
needed, mmap it again to the same CPU address using mmap(MAP_FIXED)
- If a BO no longer requires GPU mapping, munmap it from the CPU address
space and them bind the mapping address with the
DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR flag.
- Any malloc'd or mmapped CPU address accessed by the GPU will be
faulted in via the SVM implementation (system allocation).
- Upon freeing any mmapped or malloc'd data, the SVM implementation will
remove GPU mappings.

Only supporting 1 to 1 mapping between user address space and GPU
address space at the moment as that is the expected use case. uAPI
defines interface for non 1 to 1 but enforces 1 to 1, this restriction
can be lifted if use cases arrise for non 1 to 1 mappings.

This patch essentially short-circuits the code in the existing VM bind
paths to avoid populating page tables when the
DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR flag is set.

v3:
- Call vm_bind_ioctl_ops_fini on -ENODATA
- Don't allow DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR on non-faulting VMs
- s/DRM_XE_VM_BIND_FLAG_SYSTEM_ALLOCATOR/DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR (Thomas)
- Rework commit message for expected usage (Thomas)
- Describe state of code after patch in commit message (Thomas)
v4:
- Fix alignment (Checkpatch)

Signed-off-by: Matthew Brost <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Reviewed-by: Himal Prasad Ghimiray <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

show more ...


Revision tags: v6.14-rc5
# 5488bec9 28-Feb-2025 Tejas Upadhyay <[email protected]>

drm/xe/uapi: Use hint for guc to set GT frequency

Allow user to provide a low latency hint. When set, KMD sends a hint
to GuC which results in special handling for that process. SLPC will
ramp the G

drm/xe/uapi: Use hint for guc to set GT frequency

Allow user to provide a low latency hint. When set, KMD sends a hint
to GuC which results in special handling for that process. SLPC will
ramp the GT frequency aggressively every time it switches to this
process.

We need to enable the use of SLPC Compute strategy during init, but
it will apply only to processes that set this bit during process
creation.

Improvement with this approach as below:

Before,

:~$ NEOReadDebugKeys=1 EnableDirectSubmission=0 clpeak --kernel-latency
Platform: Intel(R) OpenCL Graphics
Device: Intel(R) Graphics [0xe20b]
Driver version : 24.52.0 (Linux x64)
Compute units : 160
Clock frequency : 2850 MHz
Kernel launch latency : 283.16 us

After,

:~$ NEOReadDebugKeys=1 EnableDirectSubmission=0 clpeak --kernel-latency
Platform: Intel(R) OpenCL Graphics
Device: Intel(R) Graphics [0xe20b]
Driver version : 24.52.0 (Linux x64)
Compute units : 160
Clock frequency : 2850 MHz

Kernel launch latency : 63.38 us

Compute PR: https://github.com/intel/compute-runtime/pull/794
Mesa PR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33214
IGT PR: https://patchwork.freedesktop.org/patch/639989/

V10(Lucas):
- Remove doc from drm-uapi.rst
v9(Vinay):
- remove extra line, align commit message
v8(Vinay):
- Add separate example for using low latency hint
v7(Jose):
- Update UMD PR
- applicable to all gpus
V6:
- init flags, remove redundant flags check (MAuld)
V5:
- Move uapi doc to documentation and GuC ABI specific change (Rodrigo)
- Modify logic to restrict exec queue flags (MAuld)
V4:
- To make it clear, dont use exec queue word (Vinay)
- Correct typo in description of flag (Jose/Vinay)
- rename set_strategy api and replace ctx with exec queue(Vinay)
- Start with 0th bit to indentify user flags (Jose)
V3:
- Conver user flag to kernel internal flag and use (Oak)
- Support query config for use to check kernel support (Jose)
- Dont need to take runtime pm (Vinay)
V2:
- DRM_XE_EXEC_QUEUE_LOW_LATENCY_HINT 1 planned for other hint(Szymon)
- Add motivation to description (Lucas)

Acked-by: Lucas De Marchi <[email protected]>
Reviewed-by: Vinay Belgaumkar <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Tejas Upadhyay <[email protected]>

show more ...


# cd5bbb25 26-Feb-2025 Harish Chegondi <[email protected]>

drm/xe/uapi: Add a device query to get EU stall sampling information

User space can get the EU stall data record size, EU stall capabilities,
EU stall sampling rates, and per XeCore buffer size with

drm/xe/uapi: Add a device query to get EU stall sampling information

User space can get the EU stall data record size, EU stall capabilities,
EU stall sampling rates, and per XeCore buffer size with query IOCTL
DRM_IOCTL_XE_DEVICE_QUERY with .query set to DRM_XE_DEVICE_QUERY_EU_STALL.
A struct drm_xe_query_eu_stall will be returned to the user space along
with an array of supported sampling rates sorted in the fastest sampling
rate first order. sampling_rates in struct drm_xe_query_eu_stall will
point to the array of sampling rates.

Any capabilities in EU stall sampling as of this patch are considered
as base capabilities. New capability bits will be added for any new
functionality added later.

v12: Rename has_eu_stall_sampling_support() to
xe_eu_stall_supported_on_platform() and move it to header file.
v11: Check if EU stall sampling is supported on the platform.
v10: Change comments and variable names as per feedback
v9: Move reserved fields above num_sampling_rates in
struct drm_xe_query_eu_stall.
v7: Change sampling_rates from a pointer to flexible array.
v6: Include EU stall sampling rates information and
per XeCore buffer size in the query information.

Reviewed-by: Ashutosh Dixit <[email protected]>
Signed-off-by: Harish Chegondi <[email protected]>
Signed-off-by: Ashutosh Dixit <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/67ba42796a5a99d648239c315694cd222812a49b.1740533885.git.harish.chegondi@intel.com

show more ...


# 1537ec85 26-Feb-2025 Harish Chegondi <[email protected]>

drm/xe/uapi: Introduce API for EU stall sampling

A new hardware feature first introduced in PVC gives capability to
periodically sample EU stall state and record counts for different stall
reasons,

drm/xe/uapi: Introduce API for EU stall sampling

A new hardware feature first introduced in PVC gives capability to
periodically sample EU stall state and record counts for different stall
reasons, on a per IP basis, aggregate across all EUs in a subslice and
record the samples in a buffer in each subslice. Eventually, the aggregated
data is written out to a buffer in the memory. This feature is also
supported in XE2 and later architecture GPUs.

Use an existing IOCTL - DRM_IOCTL_XE_OBSERVATION as the interface into the
driver from the user space to do initial setup and obtain a file descriptor
for the EU stall data stream. Input parameter to the IOCTL is a struct
drm_xe_observation_param in which observation_type should be set to
DRM_XE_OBSERVATION_TYPE_EU_STALL, observation_op should be
DRM_XE_OBSERVATION_OP_STREAM_OPEN and param should point to a chain of
drm_xe_ext_set_property structures in which each structure has a pair of
property and value. The EU stall sampling input properties are defined in
drm_xe_eu_stall_property_id enum.

With the file descriptor obtained from DRM_IOCTL_XE_OBSERVATION, user space
can enable and disable EU stall sampling with the IOCTLs:
DRM_XE_OBSERVATION_IOCTL_ENABLE and DRM_XE_OBSERVATION_IOCTL_DISABLE.
User space can also call poll() to check for availability of data in the
buffer. The data can be read with read(). Finally, the file descriptor
can be closed with close().

v11: Changed a couple of variables in struct eu_stall_open_properties
from unsigned int to int.
v10: Use extension number while parsing chain of extensions.
Remove function description for static functions.
Move code around as per review feedback.
v9: Changed some u32 to unsigned int.
Moved some code around as per review feedback from v8.
v8: Used div_u64 instead of / to fix 32-bit build issue.
Changed copyright year in xe_eu_stall.c/h to 2025.
v7: Renamed input property DRM_XE_EU_STALL_PROP_EVENT_REPORT_COUNT
to DRM_XE_EU_STALL_PROP_WAIT_NUM_REPORTS to be consistent with
OA. Renamed the corresponding internal variables.
Fixed some commit messages based on review feedback.
v6: Change the input sampling rate to GPU cycles instead of
GPU cycles multiplier.

Reviewed-by: Ashutosh Dixit <[email protected]>
Signed-off-by: Harish Chegondi <[email protected]>
Signed-off-by: Ashutosh Dixit <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/bb707a27975c33e4a912b9839b023acb7a1f9c90.1740533885.git.harish.chegondi@intel.com

show more ...


Revision tags: v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1
# 41a97c4a 29-Jan-2025 Daniele Ceraolo Spurio <[email protected]>

drm/xe/pxp/uapi: Add API to mark a BO as using PXP

The driver needs to know if a BO is encrypted with PXP to enable the
display decryption at flip time.
Furthermore, we want to keep track of the sta

drm/xe/pxp/uapi: Add API to mark a BO as using PXP

The driver needs to know if a BO is encrypted with PXP to enable the
display decryption at flip time.
Furthermore, we want to keep track of the status of the encryption and
reject any operation that involves a BO that is encrypted using an old
key. There are two points in time where such checks can kick in:

1 - at VM bind time, all operations except for unmapping will be
rejected if the key used to encrypt the BO is no longer valid. This
check is opt-in via a new VM_BIND flag, to avoid a scenario where a
malicious app purposely shares an invalid BO with a non-PXP aware
app (such as a compositor). If the VM_BIND was failed, the
compositor would be unable to display anything at all. Allowing the
bind to go through means that output still works, it just displays
garbage data within the bounds of the illegal BO.

2 - at job submission time, if the queue is marked as using PXP, all
objects bound to the VM will be checked and the submission will be
rejected if any of them was encrypted with a key that is no longer
valid.

Note that there is no risk of leaking the encrypted data if a user does
not opt-in to those checks; the only consequence is that the user will
not realize that the encryption key is changed and that the data is no
longer valid.

v2: Better commnnts and descriptions (John), rebase

v3: Properly return the result of key_assign up the stack, do not use
xe_bo in display headers (Jani)

v4: improve key_instance variable documentation (John)

Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Thomas Hellström <[email protected]>
Cc: John Harrison <[email protected]>
Cc: Jani Nikula <[email protected]>
Reviewed-by: John Harrison <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

show more ...


# bd98ac2e 29-Jan-2025 Daniele Ceraolo Spurio <[email protected]>

drm/xe/pxp/uapi: Add a query for PXP status

PXP prerequisites (SW proxy and HuC auth via GSC) are completed
asynchronously from driver load, which means that userspace can start
submitting before we

drm/xe/pxp/uapi: Add a query for PXP status

PXP prerequisites (SW proxy and HuC auth via GSC) are completed
asynchronously from driver load, which means that userspace can start
submitting before we're ready to start a PXP session. Therefore, we need
a query that userspace can use to check not only if PXP is supported but
also to wait until the prerequisites are done.

v2: Improve doc, do not report TYPE_NONE as supported (José)
v3: Better comments, remove unneeded copy_from_user (John)

Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: José Roberto de Souza <[email protected]>
Cc: John Harrison <[email protected]>
Reviewed-by: John Harrison <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

show more ...


# 72d47960 29-Jan-2025 Daniele Ceraolo Spurio <[email protected]>

drm/xe/pxp/uapi: Add userspace and LRC support for PXP-using queues

Userspace is required to mark a queue as using PXP to guarantee that the
PXP instructions will work. In addition to managing the P

drm/xe/pxp/uapi: Add userspace and LRC support for PXP-using queues

Userspace is required to mark a queue as using PXP to guarantee that the
PXP instructions will work. In addition to managing the PXP sessions,
when a PXP queue is created the driver will set the relevant bits in
its context control register.

On submission of a valid PXP queue, the driver will validate all
encrypted objects mapped to the VM to ensured they were encrypted with
the current key.

v2: Remove pxp_types include outside of PXP code (Jani), better comments
and code cleanup (John)

v3: split the internal PXP management to a separate patch for ease of
review. re-order ioctl checks to always return -EINVAL if parameters are
invalid, rebase on msix changes.

Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: John Harrison <[email protected]>
Reviewed-by: John Harrison <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

show more ...


Revision tags: v6.13
# a46ea12e 17-Jan-2025 Rodrigo Vivi <[email protected]>

drm/xe/uapi: Fix documentation indentation

Fix these issues:

Documentation/gpu/driver-uapi:29: include/uapi/drm/xe_drm.h:817: WARNING:
+Bullet list ends without a blank line; unexpected unindent.
D

drm/xe/uapi: Fix documentation indentation

Fix these issues:

Documentation/gpu/driver-uapi:29: include/uapi/drm/xe_drm.h:817: WARNING:
+Bullet list ends without a blank line; unexpected unindent.
Documentation/gpu/driver-uapi:29: include/uapi/drm/xe_drm.h:835: WARNING:
+Definition list ends without a blank line; unexpected unindent.

Fixes: 75d37750a753 ("drm/xe/mmap: Add mmap support for PCI memory barrier")
Reported-by: Stephen Rothwell <[email protected]>
Closes: https://lore.kernel.org/intel-xe/[email protected]/
Cc: Tejas Upadhyay <[email protected]>
Tested-by: Bagas Sanjaya <[email protected]>
Reviewed-by: Tejas Upadhyay <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>

show more ...


# 75d37750 13-Jan-2025 Tejas Upadhyay <[email protected]>

drm/xe/mmap: Add mmap support for PCI memory barrier

In order to avoid having userspace to use MI_MEM_FENCE,
we are adding a mechanism for userspace to generate a
PCI memory barrier with low overhea

drm/xe/mmap: Add mmap support for PCI memory barrier

In order to avoid having userspace to use MI_MEM_FENCE,
we are adding a mechanism for userspace to generate a
PCI memory barrier with low overhead (avoiding IOCTL call
as well as writing to VRAM will adds some overhead).

This is implemented by memory-mapping a page as uncached
that is backed by MMIO on the dGPU and thus allowing userspace
to do memory write to the page without invoking an IOCTL.
We are selecting the MMIO so that it is not accessible from
the PCI bus so that the MMIO writes themselves are ignored,
but the PCI memory barrier will still take action as the MMIO
filtering will happen after the memory barrier effect.

When we detect special defined offset in mmap(), We are mapping
4K page which contains the last of page of doorbell MMIO range
to userspace for same purpose.

For user to query special offset we are adding special flag in
mmap_offset ioctl which needs to be passed as follows,
struct drm_xe_gem_mmap_offset mmo = {
.handle = 0, /* this must be 0 */
.flags = DRM_XE_MMAP_OFFSET_FLAG_PCI_BARRIER,
};
igt_ioctl(fd, DRM_IOCTL_XE_GEM_MMAP_OFFSET, &mmo);
map = mmap(NULL, size, PROT_WRITE, MAP_SHARED, fd, mmo);

IGT : https://gitlab.freedesktop.org/drm/igt-gpu-tools/-/commit/b2dbc6f22815128c0dd5c737504f42e1f1a6ad62
UMD : https://github.com/intel/compute-runtime/pull/772

V7:
- Dgpu filter added
V6(MAuld)
- Move physical mmap to fault handler
- Modify kernel-doc and attach UMD PR when ready
V5(MAuld)
- Return invalid early in case of non 4K PAGE_SIZE
- Format kernel-doc and add note for 4K PAGE_SIZE HW limit
V4(MAuld)
- Add kernel-doc for uapi change
- Restrict page size to 4K
V3(MAuld)
- Remove offset defination from UAPI to be able to change later
- Edit commit message for special flag addition
V2(MAuld)
- Add fault handler with dummy page to handle unplug device
- Add Build check for special offset to be below normal start page
- Test d3hot, mapping seems to be valid in d3hot as well
- Add more info to commit message

Cc: Matthew Auld <[email protected]>
Acked-by: Michal Mrozek <[email protected]>
Reviewed-by: Matthew Auld <[email protected]>
Signed-off-by: Tejas Upadhyay <[email protected]>
Signed-off-by: Matthew Auld <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

show more ...


Revision tags: v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3
# 5637797a 12-Dec-2024 Ashutosh Dixit <[email protected]>

drm/xe/oa/uapi: Expose an unblock after N reports OA property

Expose an "unblock after N reports" OA property, to allow userspace threads
to be woken up less frequently.

Co-developed-by: Umesh Nerl

drm/xe/oa/uapi: Expose an unblock after N reports OA property

Expose an "unblock after N reports" OA property, to allow userspace threads
to be woken up less frequently.

Co-developed-by: Umesh Nerlige Ramappa <[email protected]>
Signed-off-by: Umesh Nerlige Ramappa <[email protected]>
Signed-off-by: Ashutosh Dixit <[email protected]>
Reviewed-by: Jonathan Cavitt <[email protected]>
Reviewed-by: Umesh Nerlige Ramappa <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

show more ...


Revision tags: v6.13-rc2
# 720f63a8 05-Dec-2024 Sai Teja Pottumuttu <[email protected]>

drm/xe/oa/uapi: Make OA buffer size configurable

Add a new property called DRM_XE_OA_PROPERTY_OA_BUFFER_SIZE to
allow OA buffer size to be configurable from userspace.

With this OA buffer size can

drm/xe/oa/uapi: Make OA buffer size configurable

Add a new property called DRM_XE_OA_PROPERTY_OA_BUFFER_SIZE to
allow OA buffer size to be configurable from userspace.

With this OA buffer size can be configured to any power of 2
size between 128KB and 128MB and it would default to 16MB in case
the size is not supplied.

v2:
- Rebase
v3:
- Add oa buffer size to capabilities [Ashutosh]
- Address several nitpicks [Ashutosh]
- Fix commit message/subject [Ashutosh]

BSpec: 61100, 61228
Signed-off-by: Sai Teja Pottumuttu <[email protected]>
Reviewed-by: Ashutosh Dixit <[email protected]>
Signed-off-by: Ashutosh Dixit <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

show more ...


Revision tags: v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5
# c8507a25 22-Oct-2024 Ashutosh Dixit <[email protected]>

drm/xe/oa/uapi: Define and parse OA sync properties

Now that we have laid the groundwork, introduce OA sync properties in the
uapi and parse the input xe_sync array as is done elsewhere in the
drive

drm/xe/oa/uapi: Define and parse OA sync properties

Now that we have laid the groundwork, introduce OA sync properties in the
uapi and parse the input xe_sync array as is done elsewhere in the
driver. Also add DRM_XE_OA_CAPS_SYNCS bit in OA capabilities for userspace.

v2: Fix and document DRM_XE_SYNC_TYPE_USER_FENCE for OA (Matt B)
Add DRM_XE_OA_CAPS_SYNCS bit to OA capabilities (Jose)

Acked-by: José Roberto de Souza <[email protected]>
Reviewed-by: Jonathan Cavitt <[email protected]>
Signed-off-by: Ashutosh Dixit <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

show more ...


Revision tags: v6.12-rc4, v6.12-rc3
# 9ab440a9 07-Oct-2024 Shekhar Chauhan <[email protected]>

drm/xe/ptl: L3bank mask is not available on the media GT

On PTL platforms with media version 30.00, the fuse registers for
reporting L3 bank availability to the GT just read out as ~0 and do not
pro

drm/xe/ptl: L3bank mask is not available on the media GT

On PTL platforms with media version 30.00, the fuse registers for
reporting L3 bank availability to the GT just read out as ~0 and do not
provide proper values. Xe does not use the L3 bank mask for anything
internally; it only passes the mask through to userspace via the GT
topology query.

Since we don't have any way to get the real L3 bank mask, we don't want
to pass garbage to userspace. Passing a zeroed mask or a copy of the
primary GT's L3 bank mask would also be inaccurate and likely to cause
confusion for userspace. The best approach is to simply not include L3
in the list of masks returned by the topology query in cases where we
aren't able to provide a meaningful value. This won't change the
behavior for any existing platforms (where we can always obtain L3 masks
successfully for all GTs), it will only prevent us from mis-reporting
bad information on upcoming platform(s).

There's a good chance this will become a formal workaround in the
future, but for now we don't have a lineage number so "no_media_l3" is
used in place of a lineage as the OOB workaround descriptor.

v2:
- Re-calculate query size to properly match data returned. (Gustavo)
- Update kerneldoc to clarify that the L3bank mask may not be included
in the query results if the hardware doesn't make it available.
(Gustavo)

Cc: Matt Atwood <[email protected]>
Cc: Gustavo Sousa <[email protected]>
Signed-off-by: Shekhar Chauhan <[email protected]>
Co-developed-by: Matt Roper <[email protected]>
Signed-off-by: Matt Roper <[email protected]>
Reviewed-by: Jonathan Cavitt <[email protected]>
Reviewed-by: Gustavo Sousa <[email protected]>
Acked-by: Francois Dugast <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

show more ...


Revision tags: v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2
# ad614a70 29-Jul-2024 Geert Uytterhoeven <[email protected]>

drm/xe/oa/uapi: Make bit masks unsigned

When building with gcc-5:

In function ‘decode_oa_format.isra.26’,
inlined from ‘xe_oa_set_prop_oa_format’ at drivers/gpu/drm/xe/xe_oa.c:1664:6:
././

drm/xe/oa/uapi: Make bit masks unsigned

When building with gcc-5:

In function ‘decode_oa_format.isra.26’,
inlined from ‘xe_oa_set_prop_oa_format’ at drivers/gpu/drm/xe/xe_oa.c:1664:6:
././include/linux/compiler_types.h:510:38: error: call to ‘__compiletime_assert_1336’ declared with attribute error: FIELD_GET: mask is not constant
[...]
./include/linux/bitfield.h:155:3: note: in expansion of macro ‘__BF_FIELD_CHECK’
__BF_FIELD_CHECK(_mask, _reg, 0U, "FIELD_GET: "); \
^
drivers/gpu/drm/xe/xe_oa.c:1573:18: note: in expansion of macro ‘FIELD_GET’
u32 bc_report = FIELD_GET(DRM_XE_OA_FORMAT_MASK_BC_REPORT, fmt);
^

Fixes: b6fd51c62119 ("drm/xe/oa/uapi: Define and parse OA stream properties")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Reviewed-by: Lucas De Marchi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
(cherry picked from commit f2881dfdaaa9ec873dbd383ef5512fc31e576cbb)
Signed-off-by: Rodrigo Vivi <[email protected]>

show more ...


# f2881dfd 29-Jul-2024 Geert Uytterhoeven <[email protected]>

drm/xe/oa/uapi: Make bit masks unsigned

When building with gcc-5:

In function ‘decode_oa_format.isra.26’,
inlined from ‘xe_oa_set_prop_oa_format’ at drivers/gpu/drm/xe/xe_oa.c:1664:6:
././

drm/xe/oa/uapi: Make bit masks unsigned

When building with gcc-5:

In function ‘decode_oa_format.isra.26’,
inlined from ‘xe_oa_set_prop_oa_format’ at drivers/gpu/drm/xe/xe_oa.c:1664:6:
././include/linux/compiler_types.h:510:38: error: call to ‘__compiletime_assert_1336’ declared with attribute error: FIELD_GET: mask is not constant
[...]
./include/linux/bitfield.h:155:3: note: in expansion of macro ‘__BF_FIELD_CHECK’
__BF_FIELD_CHECK(_mask, _reg, 0U, "FIELD_GET: "); \
^
drivers/gpu/drm/xe/xe_oa.c:1573:18: note: in expansion of macro ‘FIELD_GET’
u32 bc_report = FIELD_GET(DRM_XE_OA_FORMAT_MASK_BC_REPORT, fmt);
^

Fixes: b6fd51c62119 ("drm/xe/oa/uapi: Define and parse OA stream properties")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Reviewed-by: Lucas De Marchi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

show more ...


Revision tags: v6.11-rc1, v6.10
# 7108b4a5 10-Jul-2024 Lucas De Marchi <[email protected]>

drm/xe/uapi: Expose SIMD16 EU mask in topology query

PVC, Xe2 and later platforms have 16-wide EUs. We were implicitly
reporting for PVC the number of 16-wide EUs without giving userspace any
hint t

drm/xe/uapi: Expose SIMD16 EU mask in topology query

PVC, Xe2 and later platforms have 16-wide EUs. We were implicitly
reporting for PVC the number of 16-wide EUs without giving userspace any
hint that they were different than for other platforms. Xe2 and later
also have 16-wide, but in those cases the reported number would
correspond to the 8-wide count.

To avoid confusion and make sure the right number is used by userspace
depending on the platform, add a new item to the topology query and drop
the one that is not available. The new mask reported for both PVC and
Xe2 should now match the numbers reported via hwconfig.

v2: Use a different topo item with EU type in its name to report the
new mask instead of adding the type itself as the item (Matt Roper)

Reviewed-by: Matt Roper <[email protected]>
Acked-by: José Roberto de Souza <[email protected]>
Acked-by: Mateusz Jablonski <[email protected]>
Acked-by: Wenbin Lu <[email protected]>
Acked-by: Effie Yu <[email protected]>
Acked-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>

show more ...


Revision tags: v6.10-rc7
# 5207c393 05-Jul-2024 Thomas Hellström <[email protected]>

drm/xe: Use write-back caching mode for system memory on DGFX

The caching mode for buffer objects with VRAM as a possible
placement was forced to write-combined, regardless of placement.

However, w

drm/xe: Use write-back caching mode for system memory on DGFX

The caching mode for buffer objects with VRAM as a possible
placement was forced to write-combined, regardless of placement.

However, write-combined system memory is expensive to allocate and
even though it is pooled, the pool is expensive to shrink, since
it involves global CPU TLB flushes.

Moreover write-combined system memory from TTM is only reliably
available on x86 and DGFX doesn't have an x86 restriction.

So regardless of the cpu caching mode selected for a bo,
internally use write-back caching mode for system memory on DGFX.

Coherency is maintained, but user-space clients may perceive a
difference in cpu access speeds.

v2:
- Update RB- and Ack tags.
- Rephrase wording in xe_drm.h (Matt Roper)
v3:
- Really rephrase wording.

Signed-off-by: Thomas Hellström <[email protected]>
Fixes: 622f709ca629 ("drm/xe/uapi: Add support for CPU caching mode")
Cc: Pallavi Mishra <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: [email protected]
Cc: Joonas Lahtinen <[email protected]>
Cc: Effie Yu <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Cc: Jose Souza <[email protected]>
Cc: Michal Mrozek <[email protected]>
Cc: <[email protected]> # v6.8+
Acked-by: Matthew Auld <[email protected]>
Acked-by: José Roberto de Souza <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Fixes: 622f709ca629 ("drm/xe/uapi: Add support for CPU caching mode")
Acked-by: Michal Mrozek <[email protected]>
Acked-by: Effie Yu <[email protected]> #On chat
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 01e0cfc994be484ddcb9e121e353e51d8bb837c0)
Signed-off-by: Lucas De Marchi <[email protected]>

show more ...


# 63347fe0 03-Jul-2024 Ashutosh Dixit <[email protected]>

drm/xe/uapi: Rename xe perf layer as xe observation layer

In Xe, the perf layer allows capture of HW counter streams. These HW
counters are generally performance related but don't have to be necessa

drm/xe/uapi: Rename xe perf layer as xe observation layer

In Xe, the perf layer allows capture of HW counter streams. These HW
counters are generally performance related but don't have to be necessarily
so. Also, the name "perf" is a carryover from i915 and is not preferred.

Here we propose the name "observation" for this common layer which allows
capture of different types of these counter streams.

v2: Rename observability layer to observation layer (Lucas/Rodrigo)
v3: Rename sysctl file to "observation_paranoid" (Jose)

Fixes: 52c2e956dceb ("drm/xe/perf/uapi: "Perf" layer to support multiple perf counter stream types")
Fixes: fe8929bdf835 ("drm/xe/perf/uapi: Add perf_stream_paranoid sysctl")
Acked-by: Lucas De Marchi <[email protected]>
Acked-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Ashutosh Dixit <[email protected]>
Reviewed-by: Umesh Nerlige Ramappa <[email protected]>
Acked-by: José Roberto de Souza <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 8169b2097d88d99d7e4a72e20e4b549efe9eb8d7)
Signed-off-by: Rodrigo Vivi <[email protected]>

show more ...


# 01e0cfc9 05-Jul-2024 Thomas Hellström <[email protected]>

drm/xe: Use write-back caching mode for system memory on DGFX

The caching mode for buffer objects with VRAM as a possible
placement was forced to write-combined, regardless of placement.

However, w

drm/xe: Use write-back caching mode for system memory on DGFX

The caching mode for buffer objects with VRAM as a possible
placement was forced to write-combined, regardless of placement.

However, write-combined system memory is expensive to allocate and
even though it is pooled, the pool is expensive to shrink, since
it involves global CPU TLB flushes.

Moreover write-combined system memory from TTM is only reliably
available on x86 and DGFX doesn't have an x86 restriction.

So regardless of the cpu caching mode selected for a bo,
internally use write-back caching mode for system memory on DGFX.

Coherency is maintained, but user-space clients may perceive a
difference in cpu access speeds.

v2:
- Update RB- and Ack tags.
- Rephrase wording in xe_drm.h (Matt Roper)
v3:
- Really rephrase wording.

Signed-off-by: Thomas Hellström <[email protected]>
Fixes: 622f709ca629 ("drm/xe/uapi: Add support for CPU caching mode")
Cc: Pallavi Mishra <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: [email protected]
Cc: Joonas Lahtinen <[email protected]>
Cc: Effie Yu <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Cc: Jose Souza <[email protected]>
Cc: Michal Mrozek <[email protected]>
Cc: <[email protected]> # v6.8+
Acked-by: Matthew Auld <[email protected]>
Acked-by: José Roberto de Souza <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Fixes: 622f709ca629 ("drm/xe/uapi: Add support for CPU caching mode")
Acked-by: Michal Mrozek <[email protected]>
Acked-by: Effie Yu <[email protected]> #On chat
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

show more ...


# 8169b209 03-Jul-2024 Ashutosh Dixit <[email protected]>

drm/xe/uapi: Rename xe perf layer as xe observation layer

In Xe, the perf layer allows capture of HW counter streams. These HW
counters are generally performance related but don't have to be necessa

drm/xe/uapi: Rename xe perf layer as xe observation layer

In Xe, the perf layer allows capture of HW counter streams. These HW
counters are generally performance related but don't have to be necessarily
so. Also, the name "perf" is a carryover from i915 and is not preferred.

Here we propose the name "observation" for this common layer which allows
capture of different types of these counter streams.

v2: Rename observability layer to observation layer (Lucas/Rodrigo)
v3: Rename sysctl file to "observation_paranoid" (Jose)

Fixes: 52c2e956dceb ("drm/xe/perf/uapi: "Perf" layer to support multiple perf counter stream types")
Fixes: fe8929bdf835 ("drm/xe/perf/uapi: Add perf_stream_paranoid sysctl")
Acked-by: Lucas De Marchi <[email protected]>
Acked-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Ashutosh Dixit <[email protected]>
Reviewed-by: Umesh Nerlige Ramappa <[email protected]>
Acked-by: José Roberto de Souza <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

show more ...


Revision tags: v6.10-rc6
# 406d058d 26-Jun-2024 Ashutosh Dixit <[email protected]>

drm/xe/oa/uapi: Allow preemption to be disabled on the stream exec queue

Mesa VK_KHR_performance_query use case requires preemption and timeslicing
to be disabled for the stream exec queue. Implemen

drm/xe/oa/uapi: Allow preemption to be disabled on the stream exec queue

Mesa VK_KHR_performance_query use case requires preemption and timeslicing
to be disabled for the stream exec queue. Implement this functionality
here.

v2: Minor change to debug print to print both ret values (Umesh)

Acked-by: José Roberto de Souza <[email protected]>
Reviewed-by: Umesh Nerlige Ramappa <[email protected]>
Signed-off-by: Ashutosh Dixit <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>

show more ...


Revision tags: v6.10-rc5
# 7e5161da 23-Jun-2024 Ashutosh Dixit <[email protected]>

drm/xe/oa: Fix kernel doc in xe_drm.h

Fix kernel doc in xe_drm.h. Also eliminate private/non-abi enum
definitions.

v2: Remove __DRM_XE_PERF_TYPE_MAX since it is unused (Michal)
v3: Also remove DRM_

drm/xe/oa: Fix kernel doc in xe_drm.h

Fix kernel doc in xe_drm.h. Also eliminate private/non-abi enum
definitions.

v2: Remove __DRM_XE_PERF_TYPE_MAX since it is unused (Michal)
v3: Also remove DRM_XE_OA_PROPERTY_MAX since it can also be
eliminated (Michal)

Suggested-by: Michal Wajdeczko <[email protected]>
Signed-off-by: Ashutosh Dixit <[email protected]>
Reviewed-by: Michal Wajdeczko <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>

show more ...


# dd6b4718 18-Jun-2024 Ashutosh Dixit <[email protected]>

drm/xe/oa/uapi: Query OA unit properties

Implement query for properties of OA units present on a device.

v2: Clean up reserved/pad fields (Umesh)
Follow the same scheme as other query structs
v

drm/xe/oa/uapi: Query OA unit properties

Implement query for properties of OA units present on a device.

v2: Clean up reserved/pad fields (Umesh)
Follow the same scheme as other query structs
v3: Skip reporting reserved engines attached to OA units
v4: Expose oa_buf_size via DRM_XE_PERF_IOCTL_INFO (Umesh)
v5: Don't expose capabilities as OR of properties (Umesh)
v6: Add extensions to query output structs: drm_xe_oa_unit,
drm_xe_query_oa_units and drm_xe_oa_stream_info
v7: Change oa_units[] array to __u64 type

Acked-by: Rodrigo Vivi <[email protected]>
Reviewed-by: Umesh Nerlige Ramappa <[email protected]>
Signed-off-by: Ashutosh Dixit <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

show more ...


# efb315d0 18-Jun-2024 Ashutosh Dixit <[email protected]>

drm/xe/oa/uapi: Read file_operation

Implement the OA stream read file_operation. Both blocking and non-blocking
reads are supported. As part of read system call, the read copies OA perf
data from th

drm/xe/oa/uapi: Read file_operation

Implement the OA stream read file_operation. Both blocking and non-blocking
reads are supported. As part of read system call, the read copies OA perf
data from the OA buffer to the user buffer, after appending packet headers
for status and data packets.

v2: Drop OA report headers, implement DRM_XE_PERF_IOCTL_STATUS (Umesh)
v3: Introduce 'struct drm_xe_oa_stream_status'
v4: Define oa_status register bitfields (Umesh)
v5: Add extensions to 'struct drm_xe_oa_stream_status'
v6: Minor cleanup, eliminate report32 variable
v7: Use -EIO to signal to userspace to read OASTATUS using
DRM_XE_PERF_IOCTL_STATUS, change previous sites returning -EIO to
return -EINVAL
Make drm_xe_oa_stream_status bits contiguous (Jose, Umesh)
rmw oa_status bits (Umesh)

Acked-by: Rodrigo Vivi <[email protected]>
Acked-by: José Roberto de Souza <[email protected]>
Reviewed-by: Umesh Nerlige Ramappa <[email protected]>
Signed-off-by: Ashutosh Dixit <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

show more ...


123456