| 3764e757 | 10-Feb-2026 |
Alex Crichton <[email protected]> |
Refactor borrow state tracking for async tasks (#12550)
* Refactor borrow state tracking for async tasks
This commit is a somewhat deep refactoring of how the state of `borrow<T>` is managed for b
Refactor borrow state tracking for async tasks (#12550)
* Refactor borrow state tracking for async tasks
This commit is a somewhat deep refactoring of how the state of `borrow<T>` is managed for both the host and the guest with respect to async tasks. This additionally refactors how some async task management is done for host-called functions.
The fundamental problem being tackled here is #12510. In that issue it was discovered that the way `CallContext`, the borrow tracking mechanism in Wasmtime, is managed is incompatible with async tasks. Specifically the previous assumption of the scope being mutated for a borrow is somewhere on the call stack is no longer true. It's possible for an async task to be suspended, for example, and then a sibling task drops a borrow which should update the scope of the suspended task. There were a number of other small issues I noticed here and there which this PR additionally has tests for, all of which failed before this change and pass afterwards.
The manner in which borrow state is manipulated is a pretty old part of the component model implementation dating back to the original implementation of resources. I decided to forgo any possible quick fix and have attempted to more deeply refactor and integrate async tasks into all of this infrastructure. A list of the changes made here are:
* The `CallContexts` structure, a stack of `CallContext`, was removed. Tasks now directly store a `CallContext` which is the source of truth for borrow tracking for that call, and it does not move from this location. The store `CallContexts` is now deleted in favor of updating the `Option<ConcurrentState>` in the store to be an `enum` of either concurrent state or a stack. In this manner the old stack-based structure is still used sometimes, but it's impossible to reach when concurrency is enabled.
* Entry to the host from guests now reliably pushes a `HostTask` into the store. Previously where a frame were always pushed into a `CallContext` a `HostTask` is pushed into the store. This is still expected to be a bit too expensive for cheap host calls, but it doesn't meaningfully change the performance profile of before.
* The `resource_enter_call` and `resource_exit_call` libcalls have been removed. These are now folded into the `enter_sync_call` and `exit_sync_call` libcalls. Emission of these hooks has been updated accordingly. The concept of entering a call more generally has been removed. This is more formally known in the async world as a task starting, so the task creation is now responsible for the demarcation of entering a call. Additionally this means that the concept of exiting a call has somewhat gone away. Instead this method was renamed to `validate_scope_exit` which double-checks that a borrow-scope can be exited but doesn't actually remove the task. Task removal is deferred to preexisting mechanisms.
* Management of a `GuestTask`'s previous `Option<CallContext>` field, for example taking/restoring and pushing/popping onto `CallContexts` is now all gone. All related code is outright deleted as the `GuestTask`'s now non-optional `CallContext` field is the source of truth.
* The `ConcurrentState` structure now stores a `CurrentThread` enum instead of `Option<QualifiedThreadId>`. This represents how the currently executing thread could be a host thread, not just a guest thread, which is required for borrow-tracking.
* `HostTask` creation in `poll_and_block` and `first_poll`, the two main entrypoints of async host tasks when called by the guest, is now externalized from these functions. Instead these functions assume that the currently running thread is already a `HostTask` of some kind.
* In `poll_and_block` the host's result is no longer stored in the guest task but in the host task instead.
Overall this enables the `*.wast` test for #12510 to fix the original issue. This then adds new tests to ensure that cleanup of various constructs happens appropriately, such as cancelling a host task should clean up its associated resources. Additionally synchronously calling an async host task no longer leaks resources in a `Store` and should properly clean up everything.
There is still more work to do in this area (e.g. #12544) but that's going to be deferred to a future PR at this point.
Closes #12510
prtest:full
* Review comments/CI fixes
show more ...
|