| 5764da5f | 04-Sep-2025 |
Joel Dice <[email protected]> |
Revamp component model stream/future host API (again) (#11515)
* Revamp component model stream/future host API (again)
This changes the host APIs for dealing with futures and streams from a "rendezvous"-style API to a callback-oriented one.
Previously you would create e.g. a `StreamReader`/`StreamWriter` pair and call their `read` and `write` methods, respectively, and those methods would return `Future`s that resolved when the operation was matched with a corresponding `write` or `read` operation on the other end.
With the new API, you instead provide a `StreamProducer` trait implementation when creating the stream, whose `produce` method will be called as soon as a read happens, giving the implementation a chance to respond immediately without making the reader wait for a rendezvous. Likewise, you can match the read end of a stream to a `StreamConsumer` to respond immediately to writes. This model should reduce scheduling overhead and make it easier to e.g. pipe items to/from `AsyncWrite`/`AsyncRead` or `Sink`/`Stream` implementations without needing to explicitly spawn background tasks. In addition, the new API provides direct access to guest read and write buffers for `stream<u8>` operations, enabling zero-copy operations.
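To make the callback-oriented shape concrete, here is a minimal, synchronous sketch. The `Producer` trait, `Destination` type, `Counter` producer, and `guest_read` driver are hypothetical stand-ins invented for illustration; they are not Wasmtime's actual `StreamProducer` API, which is asynchronous and has different signatures. The point is only that the producer is invoked when a read arrives and fills the reader's buffer directly, with no separate `write` call to rendezvous with.

```rust
// Hypothetical, simplified stand-ins; NOT Wasmtime's real traits or signatures.

/// Where a producer deposits items for a pending read.
struct Destination<'a, T> {
    buf: &'a mut Vec<T>,
    capacity: usize,
}

impl<T> Destination<'_, T> {
    /// How many more items the pending read can accept.
    fn remaining(&self) -> usize {
        self.capacity - self.buf.len()
    }

    /// Hand one item to the reader; panics if the read is already full.
    fn push(&mut self, item: T) {
        assert!(self.remaining() > 0, "destination is full");
        self.buf.push(item);
    }
}

/// Called by the (simulated) runtime as soon as a read happens.
trait Producer<T> {
    /// Fills `dst` and returns `true` while more items remain to be produced.
    fn produce(&mut self, dst: &mut Destination<'_, T>) -> bool;
}

/// Example producer: yields the integers in `next..end`.
struct Counter {
    next: u32,
    end: u32,
}

impl Producer<u32> for Counter {
    fn produce(&mut self, dst: &mut Destination<'_, u32>) -> bool {
        while dst.remaining() > 0 && self.next < self.end {
            dst.push(self.next);
            self.next += 1;
        }
        self.next < self.end
    }
}

/// Simulates one guest read of up to `capacity` items.
fn guest_read<T, P: Producer<T>>(producer: &mut P, capacity: usize) -> (Vec<T>, bool) {
    let mut buf = Vec::with_capacity(capacity);
    let mut dst = Destination { buf: &mut buf, capacity };
    let more = producer.produce(&mut dst);
    (buf, more)
}
```

In this sketch each call to `guest_read` is answered immediately by the producer, which uses `remaining` to decide how many items to hand over.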
Other changes:
- I've removed the `HostTaskOutput`; we were using it to run extra code with access to the store after a host task completes, but we can do that more elegantly inside the future using `tls::get`. This also allowed me to simplify `Instance::poll_until` a bit.
- I've removed the `watch_{reader,writer}` functionality; it's not needed now given that the runtime will automatically dispose of the producer or consumer when the other end of the stream or future is closed -- no need for embedder code to manage that.
- In order to make `UntypedWriteBuffer` `Send`, I had to wrap its raw pointer `buf` field in a `SendSyncPtr`.
- I've removed `{Future,Stream}Writer` entirely and moved `Instance::{future,stream}` to `{Future,Stream}Reader::new`, respectively.
- I've added a bounds check to the beginnings of `Instance::guest_read` and `Instance::guest_write` so that we need not do it later in `Guest{Source,Destination}::remaining`, meaning those functions can be infallible.
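The last item above follows a general pattern: validate once at the entry point so that later accessors cannot fail. A minimal sketch of that pattern, assuming a hypothetical `GuestWindow` type over a byte slice (this is not Wasmtime's `Guest{Source,Destination}` code):

```rust
// Hypothetical sketch; these names are illustrative and are not Wasmtime's types.

/// A window into guest linear memory that has already been validated.
struct GuestWindow<'m> {
    memory: &'m [u8],
    offset: usize,
    len: usize,
}

impl<'m> GuestWindow<'m> {
    /// The single fallible step: reject out-of-bounds windows up front.
    fn new(memory: &'m [u8], offset: usize, len: usize) -> Result<Self, String> {
        let end = offset
            .checked_add(len)
            .ok_or_else(|| "offset + len overflows".to_string())?;
        if end > memory.len() {
            return Err(format!(
                "window ends at {}, but memory has {} bytes",
                end,
                memory.len()
            ));
        }
        Ok(Self { memory, offset, len })
    }

    /// Infallible: the constructor already proved the range is in bounds.
    fn remaining(&self) -> &'m [u8] {
        &self.memory[self.offset..self.offset + self.len]
    }
}
```

Because `new` is the only way to build a `GuestWindow`, callers of `remaining` never need to handle an error.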
Note that I haven't updated `wasmtime-wasi` yet to match; that will happen in one or more follow-up commits.
Signed-off-by: Joel Dice <[email protected]>
* Add `Accessor::getter`, rename `with_data` to `with_getter`
* fixup bindgen invocation
Signed-off-by: Roman Volosatovs <[email protected]>
* add support for zero-length writes/reads to/from host
I've added a test to cover this; it also tests direct buffer access for `stream<u8>`, which I realized I forgot to cover earlier. And of course there was a bug :facepalm:.
Signed-off-by: Joel Dice <[email protected]>
* add `{Destination,Source}::remaining` methods
This can help `Stream{Producer,Consumer}` implementations determine how many items to write or read, respectively.
Signed-off-by: Joel Dice <[email protected]>
* wasi: migrate sockets to new API
Signed-off-by: Roman Volosatovs <[email protected]>
* tests: read the socket stream until EOF
Signed-off-by: Roman Volosatovs <[email protected]>
* p3-sockets: account for cancellation
Signed-off-by: Roman Volosatovs <[email protected]>
* p3-sockets: mostly ensure byte buffer cancellation-safety
Signed-off-by: Roman Volosatovs <[email protected]>
* p3-filesystem: switch to new API
Signed-off-by: Roman Volosatovs <[email protected]>
* fixup! p3-sockets: mostly ensure byte buffer cancellation-safety
* p3-cli: switch to new API
Signed-off-by: Roman Volosatovs <[email protected]>
* p3: limit maximum buffer size
Signed-off-by: Roman Volosatovs <[email protected]>
* p3-sockets: remove reuseaddr test loop workaround
Signed-off-by: Roman Volosatovs <[email protected]>
* p3: drive I/O in `when_ready`
Signed-off-by: Roman Volosatovs <[email protected]>
* fixup! p3: drive I/O in `when_ready`
* Refine `Stream{Producer,Consumer}` APIs
Per conversations last week with Roman, Alex, and Lann, I've updated these traits to present a lower-level API based on `poll_{consume,produce}` functions and have documented the implementation requirements for various scenarios which have come up in `wasmtime-wasi`, particularly around graceful cancellation. See the doc comments for those functions for details.
Signed-off-by: Joel Dice <[email protected]>
* begin integration of new API
Signed-off-by: Roman Volosatovs <[email protected]>
* update wasi/src/p3/filesystem to use new stream API
This is totally untested so far; I'll run the tests once we have everything else compiling.
Signed-off-by: Joel Dice <[email protected]>
* update wasi/src/p3/cli to use new stream API
This is totally untested and doesn't even compile yet due to a lifetime issue I don't have time to address yet. I'll follow up later with a fix.
Signed-off-by: Joel Dice <[email protected]>
* fix: remove `'a` bound on `&self`
Signed-off-by: Roman Volosatovs <[email protected]>
* finish `wasi:sockets` adaptation
Signed-off-by: Roman Volosatovs <[email protected]>
* finish `wasi:cli` adaptation
Note that this removes the read optimization; let's get the implementation complete first and optimize later.
Signed-off-by: Roman Volosatovs <[email protected]>
* remove redundant loop in sockets
Signed-off-by: Roman Volosatovs <[email protected]>
* wasi: buffer on 0-length reads
Signed-off-by: Roman Volosatovs <[email protected]>
* finish `wasi:filesystem` adaptation
Signed-off-by: Roman Volosatovs <[email protected]>
* remove `MAX_BUFFER_CAPACITY`
Signed-off-by: Roman Volosatovs <[email protected]>
* refactor `Cursor` usage
Signed-off-by: Roman Volosatovs <[email protected]>
* impl Default for VecBuffer
Signed-off-by: Roman Volosatovs <[email protected]>
* refactor: use consistent import styling
Signed-off-by: Roman Volosatovs <[email protected]>
* feature-gate fs Arc accessors
Signed-off-by: Roman Volosatovs <[email protected]>
* Update test expectations
---------
Signed-off-by: Joel Dice <[email protected]>
Signed-off-by: Roman Volosatovs <[email protected]>
Co-authored-by: Alex Crichton <[email protected]>
Co-authored-by: Roman Volosatovs <[email protected]>
|
| f6775a33 | 13-May-2025 |
Alex Crichton <[email protected]> |
Replace `GetHost` with a function pointer, add `HasData` (#10770)
* Replace `GetHost` with a function pointer, add `HasData`
This commit is a refactoring to the fundamentals of the `bindgen!` macro and the functions that it generates. Prior to this change the fundamental entrypoint generated by `bindgen!` was a function `add_to_linker_get_host` which takes a value of type `G: GetHost`. This `GetHost` implementation is effectively an alias for a closure whose return value is able to close over the parameter given lifetime-wise.
The `GetHost` abstraction was added to Wasmtime originally to enable using any type that implements `Host` traits, not just `&mut U` as was originally supported. The definition of `GetHost` was _just_ right to enable a type such as `MyThing<&mut T>` to implement `Host` and a closure could be provided that could return it. At the time that `GetHost` was added it was known to be problematic from an understandability point of view, namely:
* It has a non-obvious definition.
* It's pretty advanced Rust voodoo to understand what it's actually doing.
* Using `GetHost` required lots of `for<'a> ...` in places which is unfamiliar syntax for many.
* `GetHost` values couldn't be type-erased (e.g. put in a trait object) as we couldn't figure out the lifetime syntax to do so.
Despite these issues it was the only known solution at hand, so we landed it and kept the previous `add_to_linker` style (`&mut T -> &mut U`) as a convenience. While this has worked reasonably well (most folks just try to not look at `GetHost`), it has reached a breaking point in the WASIp3 work.
In the WASIp3 work it's effectively now going to be required that the `G: GetHost` value is packaged up and actually stored inside of accessors provided to host functions. This means that `GetHost` values now need to not only be taken in `add_to_linker` but additionally provided to the rest of the system through an "accessor". This was made possible in #10746 by moving the `GetHost` type into Wasmtime itself (as opposed to generated code where it lived prior).
While this worked with WASIp3 and it was possible to plumb `G: GetHost` safely around, this ended up surfacing more issues. Namely all "concurrent" host functions started getting significantly more complicated `where` clauses and type signatures. At the end of the day I felt that we had reached the end of the road to `GetHost` and wanted to search for alternatives, hence this change.
The fundamental purpose of `GetHost` was to be able to express, in a generic fashion:
* Give me a closure that takes `&mut T` and returns `D`.
* The `D` type can close over the lifetime in `&mut T`.
* The `D` type must implement `bindgen!`-generated traits.
A realization I had was that we could model this with a generic associated type in Rust. Rust support for generic associated types is relatively new and not something I've used much before, but it ended up being a perfect model for this. The definition of the new `HasData` trait is deceptively simple:
    trait HasData {
        type Data<'a>;
    }
What this enables us to do though is to generate `add_to_linker` functions that look like this:
    fn add_to_linker<T, D>(
        linker: &mut Linker<T>,
        getter: fn(&mut T) -> D::Data<'_>,
    ) -> Result<()>
    where
        D: HasData,
        for<'a> D::Data<'a>: Host;
This definition here models `G: GetHost` as a literal function pointer, and the ability to close over the `&mut T` lifetime with any type (not just `&mut U`) is expressed through the type constructor `type Data<'a>`. Ideally we could take a generic generic associated type (I'm not even sure what to call that), but that's not something Rust has today.
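As an illustration of how these pieces fit together, here is a small self-contained sketch. Only the `HasData` trait definition is taken from the text above; `Host`, `MyState`, `CounterView`, `HasCounter`, `view`, and `call_host` are hypothetical names invented for this example, and `call_host` merely stands in for a generated `add_to_linker` (it calls the getter immediately instead of registering anything with a real `Linker`).

```rust
// The trait as quoted in the commit message.
trait HasData {
    type Data<'a>;
}

// Hypothetical stand-in for a `bindgen!`-generated trait.
trait Host {
    fn bump(&mut self) -> u32;
}

// Hypothetical store state `T`.
struct MyState {
    counter: u32,
}

// A view that borrows from the state: the `MyThing<&mut T>` case.
struct CounterView<'a> {
    counter: &'a mut u32,
}

impl Host for CounterView<'_> {
    fn bump(&mut self) -> u32 {
        *self.counter += 1;
        *self.counter
    }
}

// Marker type whose only job is to name the type constructor `CounterView<'_>`.
struct HasCounter;

impl HasData for HasCounter {
    type Data<'a> = CounterView<'a>;
}

// A plain function usable as the `getter` function pointer.
fn view(state: &mut MyState) -> CounterView<'_> {
    CounterView { counter: &mut state.counter }
}

// Same generic shape as the `add_to_linker` signature above, but it invokes
// the getter right away rather than storing it (an assumption for brevity).
fn call_host<T, D>(state: &mut T, getter: fn(&mut T) -> D::Data<'_>) -> u32
where
    D: HasData,
    for<'a> D::Data<'a>: Host,
{
    getter(state).bump()
}
```

A call looks like `call_host::<MyState, HasCounter>(&mut state, view)`. The `D` parameter appears only inside the projection `D::Data<'_>`, so the compiler cannot infer it and it has to be written out explicitly.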
Overall this felt like a much simpler way of modeling `GetHost` and its requirements. This plumbed well throughout the WASIp3 work and the signatures for concurrent functions felt much more appropriate in terms of complexity after this change. Taking this change to the limit means that `GetHost` in its entirety could be purged since all usages of it could be replaced with `fn(&mut T) -> D::Data<'a>`, a hopefully much more understandable type.
This change is not all rainbows, however; some gotchas remain:
* One is that all `add_to_linker` generated functions have a `D: HasData` type parameter. This type parameter cannot be inferred and must always be explicitly specified, and it's not easy to know what to supply here without reading documentation. Actually supplying the type parameter is quite easy once you know what to do (and what to fill in), but it may involve defining a small struct with a custom `HasData` implementation which can be non-obvious.
* Another is that the `G: GetHost` value was previously a full Rust closure, but now it's transitioning to a function pointer. This is done in preparation for WASIp3 work where the function needs to be passed around, and doing that behind a generic parameter is more effort than it's worth. This means that embedders relying on the true closure-like nature here will have to update to using a function pointer instead.
* The function pointer is stored in locations that require `'static`, and while `fn(T)` might be expected to be `'static` regardless of what `T` is, it is, in fact, not. This means that practically `add_to_linker` requires `T: 'static`. Relative to just before this change this is a possible regression in functionality, but there are orthogonal reasons beyond just this that we want to start requiring `T: 'static` anyway. That means that this isn't actually a regression relative to #10760, a related change.
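The closure-versus-function-pointer gotcha can be seen in plain Rust, independent of Wasmtime. In the sketch below, `State`, `with_getter`, `hits`, and `demo` are hypothetical names; the example relies only on the language rule that named functions and non-capturing closures coerce to `fn` pointers, while capturing closures do not.

```rust
struct State {
    hits: u32,
}

// Accepts only a plain function pointer, like the new `getter` argument.
fn with_getter(state: &mut State, getter: fn(&mut State) -> &mut u32) -> u32 {
    let slot = getter(state);
    *slot += 1;
    *slot
}

// A named function coerces to a `fn` pointer.
fn hits(state: &mut State) -> &mut u32 {
    &mut state.hits
}

fn demo() -> u32 {
    let mut state = State { hits: 0 };

    // Named function: accepted.
    with_getter(&mut state, hits);

    // Closure that captures nothing: also coerces to a `fn` pointer.
    with_getter(&mut state, |s| &mut s.hits);

    // A closure that captures a local does NOT coerce; uncommenting the
    // following two lines is a compile error (mismatched types):
    // let extra = 1;
    // with_getter(&mut state, move |s| { s.hits += extra; &mut s.hits });

    state.hits
}
```

Embedders that previously captured values in the `GetHost` closure would need to move that data into the store state `T` so that the getter can be a capture-free function.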
The first point is partially ameliorated with WASIp3 work insofar that the `D` type parameter will start serving as a location to specify where concurrent implementations are found. These concurrent methods don't take `&mut self` but instead are implemented for `T: HasData` types. In that sense it's more justified to have this weird type parameter, but in the meantime without this support it'll feel a bit odd to have this little type parameter hanging off the side.
This change has been integrated into the WASIp3 prototyping repository with success. It has additionally been integrated into the Spin embedding, which has one of the more complicated reliances on `*_get_host` functions known. Given that, it's expected that while this is not necessarily a trivial change to rebase over, it should at least be possible.
Finally the `HasData` trait here has been included with what I'm hoping is a sufficient amount of documentation to at least give folks a spring board to understand it. If folks have confusion about this `D` type parameter my hope is they'll make their way to `HasData` which showcases various patterns for "librarifying" host implementations of WIT interfaces. These patterns are all used throughout Wasmtime and WASI currently in crates and tests and such.
* Update expanded test expectations
|