|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7 |
|
| #
459a3511 |
| 15-Mar-2025 |
Sean Christopherson <[email protected]> |
KVM: Allow building irqbypass.ko as a module when kvm.ko is a module
Convert HAVE_KVM_IRQ_BYPASS into a tristate so that selecting IRQ_BYPASS_MANAGER follows KVM={m,y}, i.e. doesn't force irqbypass.ko to be built-in.
Note, PPC allows building KVM as a module, but selects HAVE_KVM_IRQ_BYPASS from a boolean Kconfig, i.e. KVM PPC unnecessarily forces irqbypass.ko to be built-in. But that flaw is a longstanding PPC-specific issue.
Fixes: 61df71ee992d ("kvm: move "select IRQ_BYPASS_MANAGER" to common code") Cc: [email protected] Signed-off-by: Sean Christopherson <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
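A note on the mechanics, as a hedged illustration: once a Kconfig symbol is a tristate, the C macro CONFIG_HAVE_KVM_IRQ_BYPASS is only defined for =y builds (=m defines CONFIG_HAVE_KVM_IRQ_BYPASS_MODULE instead), so C consumers must test the symbol with IS_ENABLED() rather than #ifdef. A minimal sketch of such a consumer (the helper name is illustrative, not necessarily the patch's code):

  #include <linux/kconfig.h>

  /* Evaluates to true for both CONFIG_HAVE_KVM_IRQ_BYPASS=y and =m. */
  static inline bool kvm_has_irq_bypass(void)
  {
  	return IS_ENABLED(CONFIG_HAVE_KVM_IRQ_BYPASS);
  }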
|
|
Revision tags: v6.14-rc6, v6.14-rc5 |
|
| #
b2aba529 |
| 24-Feb-2025 |
Sean Christopherson <[email protected]> |
KVM: Drop kvm_arch_sync_events() now that all implementations are nops
Remove kvm_arch_sync_events() now that x86 no longer uses it (no other arch has ever used it).
No functional change intended.
Signed-off-by: Sean Christopherson <[email protected]> Acked-by: Claudio Imbrenda <[email protected]> Reviewed-by: Bibo Mao <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Revision tags: v6.14-rc4, v6.14-rc3, v6.14-rc2 |
|
| #
aa34b811 |
| 04-Feb-2025 |
James Houghton <[email protected]> |
KVM: Allow lockless walk of SPTEs when handling aging mmu_notifier event
It is possible to correctly do aging without taking the KVM MMU lock, or while taking it for read; add a Kconfig to let architectures do so. Architectures that select KVM_MMU_LOCKLESS_AGING are responsible for correctness.
Suggested-by: Yu Zhao <[email protected]> Signed-off-by: James Houghton <[email protected]> Reviewed-by: David Matlack <[email protected]> Link: https://lore.kernel.org/r/[email protected] [sean: massage shortlog+changelog, fix Kconfig goof and shorten name] Signed-off-by: Sean Christopherson <[email protected]>
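As a hedged sketch of what lockless aging means in practice (all names below are illustrative, not KVM's actual SPTE code): the Accessed bit can be tested and cleared with an atomic op, tolerating benign races, instead of taking mmu_lock for write.

  #include <linux/atomic.h>
  #include <linux/bits.h>
  #include <linux/types.h>

  #define EXAMPLE_SPTE_ACCESSED	BIT_ULL(8)	/* illustrative bit position */

  /* Returns true if the page was young; clears the bit without mmu_lock. */
  static bool example_age_spte_lockless(u64 *sptep)
  {
  	if (!(READ_ONCE(*sptep) & EXAMPLE_SPTE_ACCESSED))
  		return false;

  	/* A racing writer may re-set the bit; that only re-marks the page young. */
  	return atomic64_fetch_and(~EXAMPLE_SPTE_ACCESSED,
  				  (atomic64_t *)sptep) & EXAMPLE_SPTE_ACCESSED;
  }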
|
|
Revision tags: v6.14-rc1 |
|
| #
6f612694 |
| 24-Jan-2025 |
Paolo Bonzini <[email protected]> |
KVM: remove kvm_arch_post_init_vm
The only statement in a kvm_arch_post_init_vm implementation can be moved into the x86 kvm_arch_init_vm. Do so and remove all traces from architecture-independent code.
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Revision tags: v6.13, v6.13-rc7 |
|
| #
344315e9 |
| 11-Jan-2025 |
Sean Christopherson <[email protected]> |
KVM: x86: Drop double-underscores from __kvm_set_memory_region()
Now that there's no outer wrapper for __kvm_set_memory_region() and it's static, drop its double-underscore prefix.
No functional change intended.
Cc: Tao Su <[email protected]> Reviewed-by: Xiaoyao Li <[email protected]> Reviewed-by: Claudio Imbrenda <[email protected]> Acked-by: Christoph Schlameuss <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
|
| #
156bffdb |
| 11-Jan-2025 |
Sean Christopherson <[email protected]> |
KVM: Add a dedicated API for setting KVM-internal memslots
Add a dedicated API for setting internal memslots, and have it explicitly disallow setting userspace memslots. Setting a userspace memslot without a direct command from userspace would result in all manner of issues.
No functional change intended.
Cc: Tao Su <[email protected]> Cc: Claudio Imbrenda <[email protected]> Cc: Christian Borntraeger <[email protected]> Reviewed-by: Xiaoyao Li <[email protected]> Reviewed-by: Claudio Imbrenda <[email protected]> Acked-by: Christoph Schlameuss <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
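The shape such an API plausibly takes, as a sketch under the assumption that KVM-internal slots use ids at or above KVM_USER_MEM_SLOTS (not necessarily the literal patch):

  #include <linux/kvm_host.h>

  int kvm_set_internal_memslot(struct kvm *kvm,
  			     const struct kvm_userspace_memory_region2 *region)
  {
  	/* Reject any slot in the userspace-controlled id range. */
  	if (WARN_ON_ONCE(region->slot < KVM_USER_MEM_SLOTS))
  		return -EINVAL;

  	if (WARN_ON_ONCE(region->flags))
  		return -EINVAL;

  	return kvm_set_memory_region(kvm, region);
  }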
|
| #
f81a6d12 |
| 11-Jan-2025 |
Sean Christopherson <[email protected]> |
KVM: Open code kvm_set_memory_region() into its sole caller (ioctl() API)
Open code kvm_set_memory_region() into its sole caller in preparation for adding a dedicated API for setting internal memslots.
Opportunistically use the fancy new guard(mutex) to avoid a local 'r' variable.
Cc: Tao Su <[email protected]> Reviewed-by: Xiaoyao Li <[email protected]> Reviewed-by: Claudio Imbrenda <[email protected]> Acked-by: Christoph Schlameuss <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
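For readers unfamiliar with guard(mutex) from <linux/cleanup.h>: it unlocks automatically at every scope exit, so early returns need neither a 'goto out' label nor a local return-value variable. A minimal sketch of the pattern (the handler shape is illustrative, not the literal KVM code):

  #include <linux/cleanup.h>
  #include <linux/kvm_host.h>

  static int example_set_region(struct kvm *kvm,
  			      struct kvm_userspace_memory_region2 *mem)
  {
  	guard(mutex)(&kvm->slots_lock);		/* unlocked on every return */

  	if (mem->slot >= KVM_USER_MEM_SLOTS)
  		return -EINVAL;			/* no 'r = -EINVAL; goto out;' */

  	return kvm_set_memory_region(kvm, mem);
  }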
|
|
Revision tags: v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1 |
|
| #
dca6c885 |
| 18-Jul-2024 |
Isaku Yamahata <[email protected]> |
KVM: Add member to struct kvm_gfn_range to indicate private/shared
Add new members to struct kvm_gfn_range to indicate which mapping (private-vs-shared) to operate on: enum kvm_gfn_range_filter attr_filter. Update the core zapping operations to set them appropriately.
TDX utilizes two GPA aliases for the same memslots, one for private memory and one for shared memory. For private memory, KVM cannot always perform the same operations it does on memory for default VMs, such as zapping pages and having them be faulted back in, as this requires guest coordination. However, some operations, such as guest-driven conversion of memory between private and shared, should zap private memory.
Internally to the MMU, private and shared mappings are tracked on separate roots. Mapping and zapping operations will operate on the respective GFN alias for each root (private or shared). So zapping operations will by default zap both aliases. Add fields in struct kvm_gfn_range to allow callers to specify which aliases so they can only target the aliases appropriate for their specific operation.
There was feedback that target aliases should be specified such that the default value (0) is to operate on both aliases. Several options were considered: several variations of separate bools, defined such that the default behavior was to process both aliases, either allowed nonsensical configurations or were confusing for the caller; a simple enum came close, but was hard to process in the caller. Instead, use an enum with the default value (0) reserved as a disallowed value, and catch ranges that didn't have the target aliases specified by looking for that specific value.
Set target alias with enum appropriately for these MMU operations:
- For KVM's mmu notifier callbacks, zap shared pages only because private pages won't have a userspace mapping
- For setting memory attributes, kvm_arch_pre_set_memory_attributes() chooses the aliases based on the attribute.
- For guest_memfd invalidations, zap private only.
Link: https://lore.kernel.org/kvm/[email protected]/ Signed-off-by: Isaku Yamahata <[email protected]> Co-developed-by: Rick Edgecombe <[email protected]> Signed-off-by: Rick Edgecombe <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
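The resulting definitions plausibly look like this (a sketch; field layout abbreviated). Zero is deliberately not a valid filter value, so a caller that forgot to set attr_filter is easy to catch:

  enum kvm_gfn_range_filter {
  	KVM_FILTER_SHARED	= BIT(0),
  	KVM_FILTER_PRIVATE	= BIT(1),
  };

  struct kvm_gfn_range {
  	struct kvm_memory_slot *slot;
  	gfn_t start;
  	gfn_t end;
  	/* 0 is disallowed: a range with no target alias is a caller bug. */
  	enum kvm_gfn_range_filter attr_filter;
  	bool may_block;
  };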
|
| #
67b43038 |
| 04-Nov-2024 |
Yan Zhao <[email protected]> |
KVM: guest_memfd: Remove RCU-protected attribute from slot->gmem.file
Remove the RCU-protected attribute from slot->gmem.file. No need to use RCU primitives rcu_assign_pointer()/synchronize_rcu() to update this pointer.
- slot->gmem.file is updated in 3 places: kvm_gmem_bind(), kvm_gmem_unbind(), kvm_gmem_release(). All of them are protected by kvm->slots_lock.
- slot->gmem.file is read in 2 paths:
  (1) kvm_gmem_populate
        kvm_gmem_get_file
          __kvm_gmem_get_pfn
  (2) kvm_gmem_get_pfn
        kvm_gmem_get_file
          __kvm_gmem_get_pfn
Path (1) kvm_gmem_populate() requires holding kvm->slots_lock, so slot->gmem.file is protected by the kvm->slots_lock in this path.
Path (2) kvm_gmem_get_pfn() does not require holding kvm->slots_lock. However, it's also not guarded by rcu_read_lock() and rcu_read_unlock(). So synchronize_rcu() in kvm_gmem_unbind()/kvm_gmem_release() actually will not wait for the readers in kvm_gmem_get_pfn() due to lack of RCU read-side critical section.
The path (2) kvm_gmem_get_pfn() is safe without RCU protection because:
a) kvm_gmem_bind() is called on a new memslot, before the memslot is visible to kvm_gmem_get_pfn().
b) kvm->srcu ensures that kvm_gmem_unbind() and freeing of a memslot occur after the memslot is no longer visible to kvm_gmem_get_pfn().
c) get_file_active() ensures that kvm_gmem_get_pfn() will not access the stale file if kvm_gmem_release() sets it to NULL. This is because if kvm_gmem_release() occurs before kvm_gmem_get_pfn(), get_file_active() will return NULL; if get_file_active() does not return NULL, kvm_gmem_release() should not occur until after kvm_gmem_get_pfn() releases the file reference.
Signed-off-by: Yan Zhao <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
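Point (c) leans on get_file_active(), which only hands back the file if its refcount is still nonzero. A sketch of the reader side, paraphrasing the helper named above:

  #include <linux/fs.h>
  #include <linux/kvm_host.h>

  static struct file *kvm_gmem_get_file(struct kvm_memory_slot *slot)
  {
  	/*
  	 * Plain pointer read: every writer holds kvm->slots_lock, and
  	 * get_file_active() fails once kvm_gmem_release() has NULLed it.
  	 */
  	return get_file_active(&slot->gmem.file);
  }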
|
| #
0664dc74 |
| 09-Oct-2024 |
Sean Christopherson <[email protected]> |
KVM: Verify there's at least one online vCPU when iterating over all vCPUs
Explicitly check that there is at least one online vCPU before iterating over all vCPUs. Because the max index is an unsigned long, passing "0 - 1" in the online_vcpus==0 case results in xa_for_each_range() using an unlimited max, i.e. allows it to access vCPU0 when it shouldn't. This will allow KVM to safely _erase_ from vcpu_array if the last stages of vCPU creation fail, i.e. without generating a use-after-free if a different task happens to be concurrently iterating over all vCPUs.
Note, because xa_for_each_range() is a macro, kvm_for_each_vcpu() subtly reloads online_vcpus after each iteration, i.e. adding an extra load doesn't meaningfully impact the total cost of iterating over all vCPUs. And because online_vcpus is never decremented, there is no risk of a reload triggering a walk of the entire xarray.
Cc: Will Deacon <[email protected]> Cc: Michal Luczaj <[email protected]> Acked-by: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
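Paraphrased, the guarded iteration looks roughly like this (xa_for_each_range() takes an inclusive max, hence the underflow hazard the changelog describes):

  /* Bail before computing "0 - 1", which would walk the whole xarray. */
  #define kvm_for_each_vcpu(i, vcpup, kvm)				\
  	if (atomic_read(&kvm->online_vcpus))				\
  		xa_for_each_range(&kvm->vcpu_array, i, vcpup, 0,	\
  				  atomic_read(&kvm->online_vcpus) - 1)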
|
| #
1e7381f3 |
| 09-Oct-2024 |
Sean Christopherson <[email protected]> |
KVM: Explicitly verify target vCPU is online in kvm_get_vcpu()
Explicitly verify the target vCPU is fully online _prior_ to clamping the index in kvm_get_vcpu(). If the index is "bad", the nospec clamping will generate '0', i.e. KVM will return vCPU0 instead of NULL.
In practice, the bug is unlikely to cause problems, as it will only come into play if userspace or the guest is buggy or misbehaving, e.g. KVM may send interrupts to vCPU0 instead of dropping them on the floor.
However, returning vCPU0 when it shouldn't exist per online_vcpus is problematic now that KVM uses an xarray for the vCPUs array, as KVM needs to insert into the xarray before publishing the vCPU to userspace (see commit c5b077549136 ("KVM: Convert the kvm->vcpus array to a xarray")), i.e. before vCPU creation is guaranteed to succeed.
As a result, incorrectly providing access to vCPU0 will trigger a use-after-free if vCPU0 is dereferenced and kvm_vm_ioctl_create_vcpu() bails out of vCPU creation due to an error and frees vCPU0. Commit afb2acb2e3a3 ("KVM: Fix vcpu_array[0] races") papered over that issue, but in doing so introduced an unsolvable teardown conundrum. Preventing accesses to vCPU0 before it's fully online will allow reverting commit afb2acb2e3a3, without re-introducing the vcpu_array[0] UAF race.
Fixes: 1d487e9bf8ba ("KVM: fix spectrev1 gadgets") Cc: [email protected] Cc: Will Deacon <[email protected]> Cc: Michal Luczaj <[email protected]> Reviewed-by: Pankaj Gupta <[email protected]> Acked-by: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
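The fix's shape, per the changelog (a sketch, not the verbatim diff): do a real bounds check that can return NULL, and only then apply the speculation clamp.

  #include <linux/nospec.h>

  static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)
  {
  	int num_vcpus = atomic_read(&kvm->online_vcpus);

  	/* Without this, a bad index is clamped to 0 and vCPU0 leaks out. */
  	if (i < 0 || i >= num_vcpus)
  		return NULL;

  	i = array_index_nospec(i, num_vcpus);
  	return xa_load(&kvm->vcpu_array, i);
  }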
|
| #
d96c77bd |
| 08-Nov-2024 |
Paolo Bonzini <[email protected]> |
KVM: x86: switch hugepage recovery thread to vhost_task
kvm_vm_create_worker_thread() is meant to be used for kthreads that can consume significant amounts of CPU time on behalf of a VM or in response to how the VM behaves (for example how it accesses its memory). Therefore it wants to charge the CPU time consumed by that work to the VM's container.
However, because of these threads, cgroups which have kvm instances inside never complete freezing. This can be trivially reproduced:
root@test ~# mkdir /sys/fs/cgroup/test
root@test ~# echo $$ > /sys/fs/cgroup/test/cgroup.procs
root@test ~# qemu-system-x86_64 -nographic -enable-kvm
and in another terminal:
root@test ~# echo 1 > /sys/fs/cgroup/test/cgroup.freeze
root@test ~# cat /sys/fs/cgroup/test/cgroup.events
populated 1
frozen 0
The cgroup freezing happens in the signal delivery path but kvm_nx_huge_page_recovery_worker, while joining non-root cgroups, never calls into the signal delivery path and thus never gets frozen. Because the cgroup freezer determines whether a given cgroup is frozen by comparing the number of frozen threads to the total number of threads in the cgroup, the cgroup never becomes frozen and users waiting for the state transition may hang indefinitely.
Since the worker kthread is tied to a user process, it's better if it behaves similarly to user tasks as much as possible, including the ability to be sent SIGSTOP and SIGCONT. In fact, vhost_task is all that kvm_vm_create_worker_thread() wanted to be and more: not only does it inherit the userspace process's cgroups, it has other niceties like being parented properly in the process tree. Use it instead of the homegrown alternative.
Incidentally, the new code is also better behaved when you flip recovery back and forth to disabled and back to enabled. If your recovery period is 1 minute, it will run the next recovery after 1 minute independent of how many times you flipped the parameter.
(Commit message based on emails from Tejun).
Reported-by: Tejun Heo <[email protected]> Reported-by: Luca Boccassi <[email protected]> Acked-by: Tejun Heo <[email protected]> Tested-by: Luca Boccassi <[email protected]> Cc: [email protected] Reviewed-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
|
| #
948ccbd9 |
| 13-Nov-2024 |
Xianglai Li <[email protected]> |
LoongArch: KVM: Add iocsr and mmio bus simulation in kernel
Add iocsr and mmio memory read and write simulation to the kernel. When the VM accesses the device address space through iocsr instructions or mmio, it does not need to return to qemu user mode but can directly complete the access in kernel mode.
Signed-off-by: Tianrui Zhao <[email protected]> Signed-off-by: Xianglai Li <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
|
| #
3e7f4318 |
| 02-Aug-2024 |
Sean Christopherson <[email protected]> |
KVM: Protect vCPU's "last run PID" with rwlock, not RCU
To avoid jitter on KVM_RUN due to synchronize_rcu(), use a rwlock instead of RCU to protect vcpu->pid, a.k.a. the pid of the task last used to run a vCPU. When userspace is doing M:N scheduling of tasks to vCPUs, e.g. to run SEV migration helper vCPUs during post-copy, the synchronize_rcu() needed to change the PID associated with the vCPU can stall for hundreds of milliseconds, which is problematic for latency-sensitive post-copy operations.
In the directed yield path, do not acquire the lock if it's contended, i.e. if the associated PID is changing, as that means the vCPU's task is already running.
Reported-by: Steve Rutherford <[email protected]> Reviewed-by: Steve Rutherford <[email protected]> Acked-by: Oliver Upton <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
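A sketch of a reader under the new scheme (assuming the rwlock is named pid_lock). The trylock in the directed-yield path is the key trick: contention means the vCPU's PID is changing, i.e. its task is running, so the vCPU can simply be skipped.

  #include <linux/kvm_host.h>
  #include <linux/pid.h>

  static struct task_struct *example_vcpu_task(struct kvm_vcpu *vcpu)
  {
  	struct task_struct *task = NULL;

  	/* Contended => vcpu->pid is being changed => task is running. */
  	if (!read_trylock(&vcpu->pid_lock))
  		return NULL;

  	if (vcpu->pid)
  		task = get_pid_task(vcpu->pid, PIDTYPE_PID);
  	read_unlock(&vcpu->pid_lock);

  	return task;
  }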
|
| #
8b15c376 |
| 10-Oct-2024 |
Sean Christopherson <[email protected]> |
KVM: Don't grab reference on VM_MIXEDMAP pfns that have a "struct page"
Now that KVM no longer relies on an ugly heuristic to find its struct page references, i.e. now that KVM can't get false positives on VM_MIXEDMAP pfns, remove KVM's hack to elevate the refcount for pfns that happen to have a valid struct page. In addition to removing a long-standing wart in KVM, this allows KVM to map non-refcounted struct page memory into the guest, e.g. for exposing GPU TTM buffers to KVM guests.
Tested-by: Alex Bennée <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Tested-by: Dmitry Osipenko <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]>
|
| #
93b7da40 |
| 10-Oct-2024 |
Sean Christopherson <[email protected]> |
KVM: Drop APIs that manipulate "struct page" via pfns
Remove all kvm_{release,set}_pfn_*() APIs now that all users are gone.
No functional change intended.
Reviewed-by: Alex Bennée <[email protected]> Tested-by: Alex Bennée <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Tested-by: Dmitry Osipenko <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]>
|
| #
06cdaff8 |
| 10-Oct-2024 |
Sean Christopherson <[email protected]> |
KVM: Drop gfn_to_pfn() APIs now that all users are gone
Drop gfn_to_pfn() and all its variants now that all users are gone.
No functional change intended.
Tested-by: Alex Bennée <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Tested-by: Dmitry Osipenko <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]>
|
| #
f42e289a |
| 10-Oct-2024 |
Sean Christopherson <[email protected]> |
KVM: Add support for read-only usage of gfn_to_page()
Rework gfn_to_page() to support read-only accesses so that it can be used by arm64 to get MTE tags out of guest memory.
Opportunistically rewrite the comment to be even more stern about using gfn_to_page(), as there are very few scenarios where requiring a struct page is actually the right thing to do (though there are such scenarios). Add a FIXME to call out that KVM probably should be pinning pages, not just getting pages.
Tested-by: Alex Bennée <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Tested-by: Dmitry Osipenko <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]>
|
| #
dc061935 |
| 10-Oct-2024 |
Sean Christopherson <[email protected]> |
KVM: Move x86's API to release a faultin page to common KVM
Move KVM x86's helper that "finishes" the faultin process to common KVM so that the logic can be shared across all architectures. Note, not all architectures implement a fast page fault path, but the gist of the comment applies to all architectures.
Tested-by: Alex Bennée <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Tested-by: Dmitry Osipenko <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]>
|
| #
1fbee5b0 |
| 10-Oct-2024 |
Sean Christopherson <[email protected]> |
KVM: guest_memfd: Provide "struct page" as output from kvm_gmem_get_pfn()
Provide the "struct page" associated with a guest_memfd pfn as an output from __kvm_gmem_get_pfn() so that KVM guest page fault handlers can directly put the page instead of having to rely on kvm_pfn_to_refcounted_page().
Tested-by: Alex Bennée <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Tested-by: Dmitry Osipenko <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]>
|
| #
1c7b627e |
| 10-Oct-2024 |
Sean Christopherson <[email protected]> |
KVM: Add kvm_faultin_pfn() to specifically service guest page faults
Add a new dedicated API, kvm_faultin_pfn(), for servicing guest page faults, i.e. for getting pages/pfns that will be mapped into the guest via an mmu_notifier-protected KVM MMU. Keep struct kvm_follow_pfn buried in internal code, as having __kvm_faultin_pfn() take "out" params is actually cleaner for several architectures, e.g. it allows the caller to have its own "page fault" structure without having to marshal data to/from kvm_follow_pfn.
Long term, common KVM would ideally provide a kvm_page_fault structure, a la x86's struct of the same name. But all architectures need to be converted to a common API before that can happen.
Tested-by: Alex Bennée <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Tested-by: Dmitry Osipenko <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]>
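The API's rough shape as reconstructed from this changelog (a sketch: the exact signature and the release helper's name are assumptions, not confirmed by the quoted text):

  #include <linux/kvm_host.h>

  /* "Out" params instead of exposing struct kvm_follow_pfn to callers. */
  kvm_pfn_t kvm_faultin_pfn(struct kvm_vcpu *vcpu, gfn_t gfn, bool write,
  			  bool *writable, struct page **refcounted_page);

  static int example_fault_handler(struct kvm_vcpu *vcpu, gfn_t gfn, bool write)
  {
  	struct page *page;
  	bool writable;
  	kvm_pfn_t pfn;

  	pfn = kvm_faultin_pfn(vcpu, gfn, write, &writable, &page);
  	if (is_error_noslot_pfn(pfn))
  		return -EFAULT;

  	/* ... install the translation under mmu_lock ... */

  	kvm_release_faultin_page(vcpu->kvm, page, false, writable);
  	return 0;
  }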
|
| #
21dd8770 |
| 10-Oct-2024 |
Sean Christopherson <[email protected]> |
KVM: Move declarations of memslot accessors up in kvm_host.h
Move the memslot lookup helpers further up in kvm_host.h so that they can be used by inlined "to pfn" wrappers.
No functional change intended.
Tested-by: Alex Bennée <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Tested-by: Dmitry Osipenko <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]>
|
| #
365e3192 |
| 10-Oct-2024 |
Sean Christopherson <[email protected]> |
KVM: Pass in write/dirty to kvm_vcpu_map(), not kvm_vcpu_unmap()
Now that all kvm_vcpu_{,un}map() users pass "true" for @dirty, have them pass "true" as a @writable param to kvm_vcpu_map(), and thus create a read-only mapping when possible.
Note, creating read-only mappings can be theoretically slower, as they don't play nice with fast GUP due to the need to break CoW before mapping the underlying PFN. But practically speaking, creating a mapping isn't a super hot path, and getting a writable mapping for reading is weird and confusing.
Tested-by: Alex Bennée <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Tested-by: Dmitry Osipenko <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]>
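Sketched usage after the change, assuming a read-only variant exists alongside the writable-by-default map (the name kvm_vcpu_map_readonly is an assumption and may not match the tree exactly):

  #include <linux/kvm_host.h>

  static int example_peek_guest_page(struct kvm_vcpu *vcpu, gpa_t gpa)
  {
  	struct kvm_host_map map;

  	/* Read-only intent is declared at map time, not at unmap time. */
  	if (kvm_vcpu_map_readonly(vcpu, gpa_to_gfn(gpa), &map))
  		return -EFAULT;

  	/* ... read through map.hva ... */

  	kvm_vcpu_unmap(vcpu, &map);	/* no @dirty argument anymore */
  	return 0;
  }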
|
| #
2bcb52a3 |
| 10-Oct-2024 |
Sean Christopherson <[email protected]> |
KVM: Pin (as in FOLL_PIN) pages during kvm_vcpu_map()
Pin, as in FOLL_PIN, pages when mapping them for direct access by KVM. As per Documentation/core-api/pin_user_pages.rst, writing to a page that was gotten via FOLL_GET is explicitly disallowed.
Correct (uses FOLL_PIN calls):
    pin_user_pages()
    write to the data within the pages
    unpin_user_pages()

INCORRECT (uses FOLL_GET calls):
    get_user_pages()
    write to the data within the pages
    put_page()
Unfortunately, FOLL_PIN is a "private" flag, and so kvm_follow_pfn must use a one-off bool instead of being able to piggyback the "flags" field.
Link: https://lwn.net/Articles/930667 Link: https://lore.kernel.org/all/[email protected] Tested-by: Alex Bennée <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Tested-by: Dmitry Osipenko <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]>
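For reference, the correct sequence above in runnable form: a minimal sketch using the *_fast variant of the pin API (this is a generic example, separate from KVM's internal plumbing, which cannot pass FOLL_PIN in from outside mm/):

  #include <linux/highmem.h>
  #include <linux/mm.h>
  #include <linux/string.h>

  /* len must not cross a page boundary; kept simple for illustration. */
  static int example_write_user_page(unsigned long uaddr, const void *src,
  				   size_t len)
  {
  	struct page *page;
  	void *kaddr;

  	if (pin_user_pages_fast(uaddr, 1, FOLL_WRITE, &page) != 1)
  		return -EFAULT;

  	kaddr = kmap_local_page(page);
  	memcpy(kaddr, src, len);	/* write to the data within the page */
  	kunmap_local(kaddr);

  	/* FOLL_PIN counterpart of put_page(); also marks the page dirty. */
  	unpin_user_pages_dirty_lock(&page, 1, true);
  	return 0;
  }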
|
| #
2ff072ba |
| 10-Oct-2024 |
David Stevens <[email protected]> |
KVM: Migrate kvm_vcpu_map() to kvm_follow_pfn()
Migrate kvm_vcpu_map() to kvm_follow_pfn(), and have it track whether or not the map holds a refcounted struct page. Precisely tracking struct page references will eventually allow removing kvm_pfn_to_refcounted_page() and its various wrappers.
Signed-off-by: David Stevens <[email protected]> [sean: use a pointer instead of a boolean] Tested-by: Alex Bennée <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Tested-by: Dmitry Osipenko <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]>
|