|
Revision tags: dev, v36.0.9, v44.0.1, v43.0.2, v36.0.8, v24.0.8, v44.0.0, v43.0.1, v42.0.2, v36.0.7, v24.0.7, v43.0.0, v42.0.1, v41.0.4, v42.0.0, v40.0.4, v36.0.6, v24.0.6, v41.0.3, v41.0.2 |
|
| #
bc4582c3 |
| 27-Jan-2026 |
Alex Crichton <[email protected]> |
Forbid rustdoc warnings in CI (#12420)
* Forbid rustdoc warnings in CI
This commit corrects our handling of rustdoc flags in CI to ensure that warnings indeed fire. Additionally this changes our flags to pass `-Dwarnings` to ensure that we have warning-free doc builds when all features are enabled at least.
There were quite a lot of preexisting issues to fix, so this additionally goes through and fixes all the warnings that cropped up.
* Update nightly toolchain again
prtest:full
* Update another nightly
* Fix a warning in generated code
|
|
Revision tags: v41.0.1, v36.0.5, v40.0.3, v41.0.0, v36.0.4, v39.0.2, v40.0.2, v40.0.1 |
|
| #
76911c29 |
| 07-Jan-2026 |
SSD <[email protected]> |
Partial support for no_std in cranelift_codegen (#12222)
* Move most things from std to core and alloc
* Port assembler_x64 to no_std
* before adding prelude to each file
* Most of the files now work with no_std
* update isle to use alloc and core
* some instances shouldn't have been renamed, fixes cargo test
* add cranelift-assembler-x64 (no_std) to CI
* fix codegen_meta, missed one spot with std::slice
* automatically remove prelude with cargo fix
* update isle changes
* update assembler changes
* update assembler changes
* use latest codegen changes + fix FxHash problem
* add imports
* fix floating issues with libm
* remove unused import
* temporarily remove OnceLock
* add no_std arm support and add it into CI
* Move most things from std to core and alloc
* Port assembler_x64 to no_std
* before adding prelude to each file
* Most of the files now work with no_std
* update isle to use alloc and core
* some instances shouldn't have been renamed, fixes cargo test
* add cranelift-assembler-x64 (no_std) to CI
* automatically remove prelude with cargo fix
* update isle changes
* update assembler changes
* update assembler changes
* use latest codegen changes + fix FxHash problem
* add imports
* fix floating issues with libm
* remove unused import
* temporarily remove OnceLock
* add no_std arm support and add it into CI
* Move most things from std to core and alloc
* Port assembler_x64 to no_std
* before adding prelude to each file
* Most of the files now work with no_std
* update isle to use alloc and core
* add cranelift-assembler-x64 (no_std) to CI
* automatically remove prelude with cargo fix
* update isle changes
* update assembler changes
* use latest codegen changes + fix FxHash problem
* add imports
* fix floating issues with libm
* temporarily remove OnceLock
* add no_std arm support and add it into CI
* revert Cargo.toml formatting
* remove prelude and fix cargo.toml
* cargo fmt
* remove empty lines
* bad renames
* macro_use only on no_std
* revert OnceLock change
* only use stable libm features
* update regalloc2
* update comment
* use continue instead
* Update vets
---------
Co-authored-by: Alex Crichton <[email protected]>
|
|
Revision tags: v40.0.0, v39.0.1, v39.0.0, v38.0.4, v37.0.3, v36.0.3, v24.0.5, v38.0.3, v38.0.2, v38.0.1, v37.0.2 |
|
| #
a3d6e407 |
| 06-Oct-2025 |
Chris Fallin <[email protected]> |
Cranelift: add debug tag infrastructure. (#11768)
* Cranelift: add debug tag infrastructure.
This PR adds *debug tags*, a kind of metadata that can attach to CLIF instructions and be lowered to VCode instructions and as metadata on the produced compiled code. It also adds opaque descriptor blobs carried with stackslots. Together, these two features allow decorating IR with first-class debug instrumentation that is properly preserved by the compiler, including across optimizations and inlining. (Wasmtime's use of these features will come in followup PRs.)
The key idea of a "debug tag" is to allow the Cranelift embedder to express whatever information it needs to, in a format that is opaque to Cranelift itself, except for the parts that need translation during lowering. In particular, the `DebugTag::StackSlot` variant gets translated to a physical offset into the stackframe in the compiled metadata output. So, for example, the embedder can emit a tag referring to a stackslot, and another describing an offset in that stackslot.
The debug tags exist as a *sequence* on any given instruction; the meaning of the sequence is known only to the embedder, *except* that during inlining, the tags for the inlining call instruction are prepended to the tags of inlined instructions. In this way, a canonical use-case of tags as describing original source-language frames can preserve the source-language view even when multiple functions are inlined into one.
The descriptor on a stackslot may look a little odd at first, but its purpose is to allow serializing some description of stackslot-contained runtime user-program data, in a way that is firmly attached to the stackslot. In particular, in the face of inlining, this descriptor is copied into the inlining (parent) function from the inlined function when the stackslot entity is copied; no other metadata outside Cranelift needs to track the identity of stackslots and know about that motion. This fits nicely with the ability of tags to refer to stackslots; together, the embedder can annotate instructions as having certain state in stackslots, and describe the format of that state per stackslot.
This infrastructure is tested with some compile-tests now; testing of the interpretation of the metadata output will come with end-to-end debug instrumentation tests in a followup PR.
* Review feedback: add back sequence points and enforce tags only on sequence points or calls.
* Use Vecs for debug metadata in MachBuffer to avoid SmallVec size penalty in not-used case.
* Review feedback: switch from inlined stackslot descriptor blobs to u64 keys.
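The inlining rule described in this commit (the tags of the inlining call instruction are prepended to the tags of each inlined instruction) can be sketched roughly as follows. This is only an illustration: the `DebugTag` enum here is a simplified stand-in with assumed payload types, and `tags_after_inlining` is a hypothetical helper, not part of Cranelift's API.

```rust
/// Simplified stand-in for a debug tag; the payload types are assumptions.
#[derive(Clone, Debug, PartialEq, Eq)]
enum DebugTag {
    /// Opaque value whose meaning is known only to the embedder.
    User(u32),
    /// Reference to a stack slot, modeled here as a plain index.
    StackSlot(u32),
}

/// Computes the tag sequence of an inlined instruction: the call site's
/// tags come first, followed by the instruction's own tags.
fn tags_after_inlining(call_tags: &[DebugTag], inst_tags: &[DebugTag]) -> Vec<DebugTag> {
    let mut out = Vec::with_capacity(call_tags.len() + inst_tags.len());
    out.extend_from_slice(call_tags);
    out.extend_from_slice(inst_tags);
    out
}
```

With nested inlining the same step would simply be applied once per level, so the outermost frame's tags end up first in the sequence.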
|
| #
4f2fa154 |
| 29-Sep-2025 |
Alex Crichton <[email protected]> |
Update nightly Rust used in CI (#11755)
* Update nightly Rust used in CI
Keeping it up-to-date
prtest:full
* Fix unused warnings on nightly
* Rename rustdoc feature
* Adjust some removals
|
|
Revision tags: v37.0.1, v37.0.0, v36.0.2, v36.0.1, v36.0.0 |
|
| #
4590076f |
| 26-Jul-2025 |
Chris Fallin <[email protected]> |
Cranelift: support dynamic contexts in exception-handler lists. (#11321)
In #11285, we realized that Wasm semantics require us to match on dynamic instances of exception tags, rather than static tag types. This fundamentally requires the unwinder to be able to resolve the current Wasm instance for each Wasm frame on the stack that has any handlers, and our frame format does not provide this today.
We discussed many options, some of which solve the more general problem (Wasm vmctx for any frame), but ultimately landed on a notion of "dynamic context for evaluating tags", specific to Cranelift's exception-catch metadata; and storing that context and carrying it through to a place that is named in the unwind metadata. The reasoning is fairly straightforward: we cannot afford a more general approach that stores vmctx in every frame (I measured this at 20% overhead for a recursive-Fibonacci benchmark that is call-intensive); and inlining means that we may have *multiple* contexts at any given program point, each associated with a different slice of the handler tags; so we need a mechanism that, *just for a try-call*, intersperses contexts with tags (or puts a context on each tag) and stores these somewhere that the exception-unwind ABI doesn't clobber (e.g., on the stack).
This PR implements "option 4" from that issue, namely, *dynamic exception contexts*. The idea is that this is the dual to exception payload: while payload lets the unwinder communicate state *to* the catching code, context lets the unwinder take state *from* the catching code that lets it decide whether the tag is a match. Because of inlining, we need to either associate (optional) context with every tag, or intersperse context-updates with handler tags. I've opted for the latter for efficiency at the CLIF level (in most cases there will be multiple tags per context), though they are isomorphic.
The new tag-matching semantics are: when walking up the stack, upon reaching a `try_call`, evaluate catch-clauses in listed order. A `context` clause sets the current context. A `tagN: block(...)` clause attempts to match the throwing exception against `tagN`, *evaluated in the current context*, and branches to the named block if it matches. A `default: block(...)` always branches to the named block.
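The matching order described above can be sketched as a small loop. This is a minimal model, not Cranelift's implementation: `CatchClause`, `find_handler`, and the use of plain integers for contexts, tags, and blocks are all assumptions made for illustration, and the `matches` predicate stands in for the embedder-chosen matching algorithm.

```rust
/// One entry in a `try_call` handler list (simplified; names are assumptions).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CatchClause {
    /// Sets the current dynamic context for the tag clauses that follow.
    Context(u64),
    /// Branch to `block` if the exception matches `tag` in the current context.
    Tag { tag: u32, block: u32 },
    /// Always branch to `block`.
    Default { block: u32 },
}

/// Evaluates the clauses in listed order and returns the target block, if any.
/// `matches` receives the current context (if one was set) and a static tag.
fn find_handler(
    clauses: &[CatchClause],
    mut matches: impl FnMut(Option<u64>, u32) -> bool,
) -> Option<u32> {
    let mut context = None;
    for clause in clauses {
        match *clause {
            CatchClause::Context(c) => context = Some(c),
            CatchClause::Tag { tag, block } => {
                if matches(context, tag) {
                    return Some(block);
                }
            }
            CatchClause::Default { block } => return Some(block),
        }
    }
    None
}
```

Because the same static tag index can appear under two different contexts, the first clause that matches in listed order wins, which is why the order from the CLIF has to be preserved.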
Note that this lets us assume less about tags than before, and this particularly manifests in the changes to the inliner. Whereas before, `tagN` is `tagN` and an inner handler for that tag shadows an outer handler (that is, tags always alias if identical indices); and whereas before, `tagN` is not `tagM` and so we can order the tags arbitrarily (that is, tags never alias if non-identical indices); now any two static tag indices may or may not alias depending on the dynamic context of each. Or, even in the same context, two may alias, because we leave the match-predicate as an unspecified (user-chosen) algorithm during unwinding. (This mirrors the reality that, for example, a Wasm instance may import two tags, and dynamically these tags may be equal or different at runtime, even instantiation-to-instantiation.) Cranelift's only job is to faithfully carry the list of contexts and tags through to the compiled-code metadata; and to ensure that they remain in the order they were specified in the CLIF.
This PR introduces the Cranelift-level feature, and it will be used in a subsequent PR that introduces Wasm exception handling. Because of that, I've opted not to update the clif-utils runtest "runtime" to read out contexts and do something with them -- we will have plenty of test coverage via a bunch of Wasm tests for corner cases such as the above. This PR does include filetests that show that contexts are carried through to spillslots and those appear in the metadata.
Fixes #11285.
|
|
Revision tags: v35.0.0, v24.0.4, v33.0.2, v34.0.2 |
|
| #
968952ab |
| 10-Jul-2025 |
Nick Fitzgerald <[email protected]> |
Cranelift: introduce a function inliner (#11210)
* Cranelift: introduce a function inliner
This commit adds "inlining as a library" to Cranelift; it does _not_ provide a complete, off-the-shelf inlining solution. Cranelift's compilation context is per-function and does not encompass the full call graph. It does not know which functions are hot and which are cold, which have been marked the equivalent of `#[inline(always)]` versus `#[inline(never)]`, etc. Only the Cranelift user can understand these aspects of the full compilation pipeline, and these things can be very different between (say) Wasmtime and `cg_clif`. Therefore, this infrastructure does not attempt to define heuristics for when inlining a particular call is likely beneficial. This module only provides hooks for the Cranelift user to tell Cranelift whether a given call should be inlined or not, and the mechanics to inline a callee into a particular call site when the user directs Cranelift to do so.
This commit also creates a new kind of filetest that will always inline calls to functions that have already been defined in the file. This lets us exercise the inliner in filetests.
Fixes https://github.com/bytecodealliance/wasmtime/issues/4127
* Address review feedback
* Require callee bodies are pre-legalized
|
| #
099102d9 |
| 07-Jul-2025 |
Alex Crichton <[email protected]> |
Remove `expect(clippy::allow_attributes_without_reason)` from cranelift-codegen (#11182)
* Remove `expect(clippy::allow_attributes_without_reason)` from cranelift-codegen
This commit gets around to migrating the `cranelift-codegen` crate to require a reason on lint directives and additionally switch to `#[expect]` where possible.
prtest:full
* Move x64-only item to x64 backend
|
|
Revision tags: v34.0.1, v33.0.1, v24.0.3, v32.0.1, v34.0.0 |
|
| #
cfe17cb1 |
| 19-Jun-2025 |
Nick Fitzgerald <[email protected]> |
Cranelift: Generate integer numeric ops and conversions for ISLE in the meta crate (#11065)
* Cranelift: Generate integer numeric ops and conversions for ISLE in the meta crate
This automatically generates operations and conversions for integer types for use in ISLE.
Supported types are: `{i,u}{8,16,32,64,128}`
We generate
  * Comparisons (eq, ne, lt, lt_eq, gt, gt_eq)
  * Arithmetic operations (add, sub, mul, div, neg)
    * These each have checked, wrapping, and unwrapping variants
  * Bitwise operations (and, or, xor, shifts, counting leading/trailing zeros/ones)
  * A variety of predicates (is_zero, is_power_of_two, is_odd, etc...)
    * These generate both partial constructors and a handful of extractors
  * Conversions
    * These come in a variety of flavors: fallible, infallible, truncating, unwrapping, sign-reinterpretation
    * Fallible conversions are also available as an extractor
* Fix copy paste
* Rename `x_reinterpret_as_y` to `x_cast_[un]signed`
* Collapse some fallible conversions in pulley lowering
* Clean up pulley iconst lowering, make sure narrowest `xconst*` instruction is always used
* Avoid an unnecessary truncation in riscv64 lowering
* Use extractor instead of partial constructor in x64 `imm` rule
* Clean up `op mem, imm` x64 lowering rules
* Use `(i64_eq a b)` instead of `(u64_eq (i64_cast_unsigned a) (i64_cast_unsigned b))`
* Rename `<ty>_unwrapping_<op>` to `<ty>_<op>`
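To make the operation flavors listed in this commit concrete, here is a sketch of what they mean in terms of Rust's standard integer methods. The function names only mimic the naming pattern mentioned in the commit (`<ty>_<op>`, `x_cast_unsigned`); they are illustrative assumptions and are not taken from the generated ISLE prelude.

```rust
/// Checked flavor: yields `None` on overflow (a partial constructor in ISLE terms).
fn u8_checked_add(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

/// Wrapping flavor: overflow wraps around modulo 2^8.
fn u8_wrapping_add(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

/// Unwrapping flavor: panics if the addition overflows.
fn u8_add(a: u8, b: u8) -> u8 {
    a.checked_add(b).expect("u8 addition overflowed")
}

/// Fallible conversion: succeeds only if the value fits in the target type.
fn u64_try_into_u8(x: u64) -> Option<u8> {
    u8::try_from(x).ok()
}

/// Truncating conversion: keeps only the low 8 bits.
fn u64_truncate_to_u8(x: u64) -> u8 {
    x as u8
}

/// Sign reinterpretation: same bits, viewed as an unsigned value.
fn i64_cast_unsigned(x: i64) -> u64 {
    x as u64
}
```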
|
| #
efa236e5 |
| 05-Jun-2025 |
Chris Fallin <[email protected]> |
Cranelift: implement an "unwinder" crate and exception throws in filetests. (#10919)
This commit introduces the next major piece of machinery (after the previously-landed `try_call` support) that we will eventually use to implement Wasm exceptions in Wasmtime. In particular, it implements a generic unwinder as a new crate that supports (i) walking a stack produced by Cranelift code, (ii) serializing Cranelift exception metadata to compact tables (in a way very similar to address maps in Wasmtime, so they will be mappable directly from disk), (iii) using these serialized tables to find handlers during a stack-walk, and (iv) jumping to handlers (i.e., actually unwinding). This crate is currently used in the filetests runner, and will next be used in Wasmtime.
The commit first performs code-motion: it moves stack-walking code from Wasmtime to `cranelift-unwinder`. This itself has no functional effect, but isolates the code that understands contiguous sequences of Cranelift frames ("activations") from that which is specific to Wasmtime's activation delimiters and metadata.
It then implements a compact exception-table format. This format uses the `object` crate's mechanisms for directly referencing in-memory arrays of little-endian `u32`s in a way that will allow us to find handlers when mapping exception metadata directly from an ELF section in a `.cwasm` (for example). The format consists of four sorted `u32` arrays in a way that allows us to look up a callsite first, then search its sorted array of handler offsets by tags.
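The two-step lookup (callsite first, then tag within that callsite's handlers) might look roughly like the sketch below. The exact roles of the four arrays are an assumption here: `callsites` holds sorted callsite offsets, `ranges[i]` is the index where callsite `i`'s handlers start, and `tags`/`handlers` are parallel arrays with tags sorted within each callsite. The real format reads little-endian `u32`s through the `object` crate; plain `u32` slices are used instead to keep the example self-contained.

```rust
/// Hypothetical in-memory view of an exception table (layout is assumed).
struct ExceptionTable<'a> {
    /// Sorted code offsets of callsites that have handlers.
    callsites: &'a [u32],
    /// For callsite `i`, the index in `tags`/`handlers` where its entries start.
    ranges: &'a [u32],
    /// Tags, sorted within each callsite's range.
    tags: &'a [u32],
    /// Handler code offsets, parallel to `tags`.
    handlers: &'a [u32],
}

impl ExceptionTable<'_> {
    /// Finds the handler offset for `tag` at the callsite with offset `pc`.
    fn lookup(&self, pc: u32, tag: u32) -> Option<u32> {
        // Step 1: binary-search the callsite.
        let site = self.callsites.binary_search(&pc).ok()?;
        let start = self.ranges[site] as usize;
        // The range ends where the next callsite's range begins.
        let end = self
            .ranges
            .get(site + 1)
            .map_or(self.tags.len(), |&e| e as usize);
        // Step 2: binary-search the tag within this callsite's range.
        let idx = self.tags[start..end].binary_search(&tag).ok()?;
        Some(self.handlers[start + idx])
    }
}
```

Keeping every array sorted is what allows both steps to be binary searches over data mapped directly from disk, with no deserialization pass.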
It next implements the actual unwind control flow: it contains an assembly stub for each supported architecture that transfers control to a PC, SP, and FP value "up the stack", with payload values placed in the payload registers we have defined per our exception ABI in Cranelift.
Finally, it puts these pieces together in the filetest runner. Note that the runtest does a lot "by hand": we don't have entry and exit trampolines as we do in Wasmtime, so the filetest contains three functions, with the middle one invoking the "throw hostcall" and entry and exit trampolines around it grabbing the appropriate entry/exit FPs and exit PC. The dance to call back to host code is also somewhat delicate, as we haven't done this before. The `JITModule`'s linking + relocation support does not seem sufficient to properly define a symbol, so instead we scan for `func_addr` instructions referencing a well-known name (`__cranelift_throw`) and replace them with `iconst`s with the function address at runtime, baking it in. This is somewhat ugly, but it works. All of these filetest-specific details will be handled much more nicely in the Wasmtime version of this functionality, as we have proper abstractions for entry/exit trampolines and hostcalls.
|
| #
2c1e1155 |
| 02-Jun-2025 |
Saúl Cabrera <[email protected]> |
winch(aarch64): Simplify constant handling, part 1/N (#10888)
* winch(aarch64): Simplify constant handling, part 1/N
This commit is the first step toward simplifying constant handling, particularly for the aarch64 backend.
The main highlights in this patch are:
  * Introduction of a `ConstantPool` implementation on top of Cranelift primitives. The implementation is identical to the existing one for x64; however, it's abstracted so that it can be easily consumed from any existing backend.
  * Usage of the constant pool from aarch64, which simplifies the loading of constants, particularly floating point constants.
The main motivation behind this change is to _eventually_ detach the implicit usage of the scratch register from constant loading as much as possible, reducing the possibility of subtle bugs (like the one described in https://github.com/bytecodealliance/wasmtime/pull/10829).
Note that I have a work-in-progress branch from which all these changes are cherry-picked, to make everything easier to review.
A side effect of this change is improved code generation for floating point constants. Prior to this change, multiple moves were involved; with this patch, at most one move is required and, at worst, one load.
* Update disassembly tests
* Apply refactored constant handling on top of shared float min/max implementation
* `fmt`
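The general idea of a backend-independent constant pool can be sketched as a deduplicating byte store. This is an assumption-laden illustration only: the `ConstantPool` in this commit is built on Cranelift primitives, whereas the version below uses a plain `HashMap`, ignores alignment, and hands out byte offsets rather than labels.

```rust
use std::collections::HashMap;

/// Hypothetical deduplicating constant pool (not the Winch implementation).
#[derive(Default)]
struct ConstantPool {
    /// Maps constant bytes to their offset in `data`.
    offsets: HashMap<Vec<u8>, usize>,
    /// Backing storage that would be emitted alongside the code.
    data: Vec<u8>,
}

impl ConstantPool {
    /// Returns the offset of `bytes`, adding them to the pool on first use.
    fn intern(&mut self, bytes: &[u8]) -> usize {
        if let Some(&offset) = self.offsets.get(bytes) {
            return offset;
        }
        let offset = self.data.len();
        self.data.extend_from_slice(bytes);
        self.offsets.insert(bytes.to_vec(), offset);
        offset
    }
}
```

A backend could then load a floating point constant with a single load from the pool entry instead of materializing it through a scratch register, which is the simplification the commit is after.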
|
|
Revision tags: v33.0.0 |
|
| #
90ac295e |
| 19-May-2025 |
Alex Crichton <[email protected]> |
Update Wasmtime to the 2024 Rust Edition (#10806)
* Update Wasmtime to the 2024 Rust Edition
Now that our MSRV supports the 2024 edition it's possible to make this switch. This commit moves Wasmtime to the 2024 Edition to keep up-to-date with Rust idioms and access many of the edition features exclusive to the 2024 edition.
prtest:full
* Reformat with the 2024 edition
|
|
Revision tags: v32.0.0 |
|
| #
0e0a60ae |
| 08-Apr-2025 |
Nick Fitzgerald <[email protected]> |
Define an RAII helper for generic take-and-replace borrow splitting (#10548)
* Define an RAII helper for generic take-and-replace borrow splitting
Follow up to https://github.com/bytecodealliance/wasmtime/pull/10524#discussion_r2032164529
* &mut T
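For readers unfamiliar with the pattern, take-and-replace borrow splitting can be sketched as an RAII guard: a field is moved out of its container so that both can be borrowed mutably at once, and the field is put back when the guard is dropped. The `TakeGuard` type and its methods below are hypothetical names for illustration; the helper added by this commit may differ in shape and API.

```rust
use std::mem;

/// Hypothetical guard: takes a `U` out of a `T` and restores it on drop.
struct TakeGuard<'a, T, U, F>
where
    U: Default,
    F: Fn(&mut T) -> &mut U,
{
    container: &'a mut T,
    value: U,
    project: F,
}

impl<'a, T, U, F> TakeGuard<'a, T, U, F>
where
    U: Default,
    F: Fn(&mut T) -> &mut U,
{
    /// Moves the projected field out, leaving `U::default()` in its place.
    fn new(container: &'a mut T, project: F) -> Self {
        let value = mem::take(project(&mut *container));
        TakeGuard { container, value, project }
    }

    /// Borrows the container and the taken value mutably at the same time.
    fn get(&mut self) -> (&mut T, &mut U) {
        (&mut *self.container, &mut self.value)
    }
}

impl<T, U, F> Drop for TakeGuard<'_, T, U, F>
where
    U: Default,
    F: Fn(&mut T) -> &mut U,
{
    /// Moves the value back into the container.
    fn drop(&mut self) {
        let value = mem::take(&mut self.value);
        *(self.project)(&mut *self.container) = value;
    }
}

/// Example container with a scratch buffer and a method needing `&mut self`.
#[derive(Default)]
struct Ctx {
    scratch: Vec<u32>,
    total: u32,
}

impl Ctx {
    fn add(&mut self, x: u32) {
        self.total += x;
    }
}

fn scratch_of(ctx: &mut Ctx) -> &mut Vec<u32> {
    &mut ctx.scratch
}

/// Iterates `ctx.scratch` while calling a `&mut self` method on `ctx`.
fn sum_scratch(ctx: &mut Ctx) -> u32 {
    {
        let mut guard = TakeGuard::new(&mut *ctx, scratch_of);
        let (rest, scratch) = guard.get();
        for &x in scratch.iter() {
            rest.add(x);
        }
    } // The guard drops here and moves `scratch` back into `ctx`.
    ctx.total
}
```

The guard makes the restore step impossible to forget, including on early returns, which is the main advantage over a manual `mem::take` followed by a later assignment.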
|
| #
3da7fc8e |
| 08-Apr-2025 |
SingleAccretion <[email protected]> |
[DI] Dump value label assignments in a table (#10549)
* Dump compilation start/end
* [DI] Log value label ranges in a table
Sample table:
|Inst    |IP  |VL0     |VL1      |VL3      |VL4     |VL5     |VL7     |VL10     |VL11    |VL4294967294|
|--------|----|--------|---------|---------|--------|--------|--------|---------|--------|------------|
|Inst 0  |53  |    |   |    |    |    |    |    |   |    |   |    |   |    |    |    |   |    |       |
|Inst 1  |53  |    |   |    |    |    |    |    |   |    |   |    |   |    |    |    |   |    |       |
|Inst 2  |60  |v194|p2i|v232|p12i|    |    |    |   |    |   |    |   |    |    |    |   |v192|p7i    |
|Inst 3  |64  |*   |p2i|*   |p12i|v231|p13i|    |   |    |   |    |   |    |    |    |   |*   |p7i    |
|Inst 4  |68  |*   |p2i|*   |p12i|*   |p13i|    |   |    |   |    |   |    |    |    |   |*   |p7i    |
|Inst 5  |72  |*   |p2i|*   |p12i|*   |p13i|    |   |    |   |    |   |    |    |    |   |*   |p7i    |
|Inst 6  |76  |*   |p2i|*   |p12i|*   |p13i|    |   |    |   |    |   |    |    |    |   |*   |p7i    |
|Inst 7  |87  |*   |   |*   |p12i|*   |p13i|    |   |    |   |    |   |    |    |    |   |*   |p7i    |
|Inst 8  |92  |*   |   |*   |p12i|*   |p13i|v227|p0i|    |   |    |   |    |    |    |   |*   |p15i   |
|Inst 9  |94  |*   |   |v204|    |v204|    |v204|   |v204|   |v204|   |v204|    |v204|   |*   |p15i   |
|Inst 10 |100 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |*   |    |*   |   |*   |p15i   |
|Inst 11 |105 |*   |   |*   |    |*   |    |*   |   |v226|p9i|*   |   |*   |    |*   |   |*   |p15i   |
|Inst 12 |109 |*   |   |*   |    |*   |    |*   |   |*   |   |v225|p9i|*   |    |*   |   |*   |p15i   |
|Inst 13 |114 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |*   |    |*   |   |*   |p15i   |
|Inst 14 |119 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |*   |    |*   |   |*   |p15i   |
|Inst 15 |125 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |*   |    |*   |   |*   |p15i   |
|Inst 16 |129 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |v223|p11i|*   |   |*   |p15i   |
|Inst 17 |134 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |*   |    |*   |   |*   |p15i   |
|Inst 18 |134 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |*   |    |*   |   |*   |p15i   |
|Inst 19 |139 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |*   |    |v222|p0i|*   |p15i   |
|Inst 20 |143 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |*   |    |*   |p0i|*   |p15i   |
|Inst 21 |143 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |*   |    |*   |p0i|*   |       |
This will make it much easier to diagnose problems with incomplete/missing live ranges.
|
| #
82c0a09b |
| 26-Mar-2025 |
Chris Fallin <[email protected]> |
Simplify aegraphs by removing union-find and canonical eclass IDs. (#10471)
I was recently re-thinking through some of the core data structure design in our aegraph implementation, and wondered: do we really need the union-find data structure, the notion of "canonical" ID for an eclass separate from its latest ID (root of union-node tree), and the hashcons-key canonicalization using all of this? It's an awful lot of complexity and has led to some fairly subtle bugs (e.g., #6126), and is generally unsatisfying.
I had the realization: the only case where the distinction between canonical and latest ID matters is when we expand an eclass after its initial (eager) rewriting, which happens before we see its uses. If we hypothesize that this happens rarely, then it should be fine to canonicalize based only on latest ID -- we shouldn't lose much (and we can measure this loss empirically).
The chief case where this kind of "late eclass expansion" still happens is: if we have some expression E1 that eventually rewrites to E2 via some simplification, and E2 already exists earlier in the program, then E1 will join the eclass. If we then have some `E3 := E1 + 1`, and later `E4 := E2 + 1`, we need the union-find canonicalization for E4 to GVN to E3. Otherwise, the latest ID for the eclass that eventually contains E1 and E2 is different at the time that E3 uses it (and is GVN'd and rewritten) and when E4 does. Put another way: E3 captures a snapshot of its operand's eclass before a new node joins it, and is never reprocessed when that happens, so E3 remains distinct.
But if the `E2 -> E1` rewrite is truly "directional" toward a better representation that we will always want to choose -- say, `x + 0 -> x`, or any constant-propagation in general -- then if the eager rewriting for E2 produces E1's eclass ID directly *without* adding E2 to its nodes, then all users will still canonicalize as before. This "only return the rewrite target, don't union with it" behavior is exactly our `subsume` operator.
Put another way: subsumption prevents growing eclasses later, so snapshots in time remain the latest, and everyone canonicalizes with the same ID. We move to a true immutable data structure, with simple hashconsing with no magic.
The rewrite semantics are then much simpler too: if any value is marked "subsume", we pick it (pick an arbitrary one if multiple) as our rewrite result; otherwise, create an eclass with the original rewrite input and all rewrite outputs (by creating union-nodes in the DFG). No auto-subsume or factoring in availability -- that's it. "Subsume" means: always pick the rewritten value, don't keep the original, and we use it whenever that's unambiguously true for better canonicalization.
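The effect of subsumption on hashconsing can be shown with a toy graph. This is a deliberately tiny sketch, not the aegraph implementation: it has no union nodes, a single hard-coded rewrite (`x + 0 -> x`), and all names (`Graph`, `Node`, `add`) are hypothetical. It only demonstrates that when a rewrite returns the target's ID directly, later users hash on the same ID and deduplicate without any union-find.

```rust
use std::collections::HashMap;

/// Value IDs are plain indices into `Graph::nodes`.
type Id = usize;

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum Node {
    Param(u32),
    Const(i64),
    Add(Id, Id),
}

/// Immutable-style node store with a plain hashcons map.
#[derive(Default)]
struct Graph {
    nodes: Vec<Node>,
    memo: HashMap<Node, Id>,
}

impl Graph {
    /// Inserts `node`, applying eager rewrites and deduplicating.
    fn add(&mut self, node: Node) -> Id {
        // Eager rewrite `x + 0 -> x`, treated as subsuming: return `x`
        // itself and never create a node for the original expression.
        if let Node::Add(x, y) = &node {
            if self.nodes[*y] == Node::Const(0) {
                return *x;
            }
        }
        // Plain hashconsing keyed on operand IDs; no canonicalization step.
        if let Some(&id) = self.memo.get(&node) {
            return id;
        }
        let id = self.nodes.len();
        self.nodes.push(node.clone());
        self.memo.insert(node, id);
        id
    }
}
```

For instance, after `e1 = x + 0` is rewritten to `x`, both `e1 + 1` and `x + 1` are keyed as `Add(x, one)` and therefore receive the same ID.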
Given that, it turns out that we can remove the union-find mechanism entirely. It also turns out that we can unify the scoped-hashmap for "effectful/idempotent ops" with the ordinary hashmap for pure ops; everything can be scoped. To get working LICM we do need to retain our "available block" mechanism, but only to insert values at a higher scope level in the scoped hashmap -- it is now heuristic, not load-bearing for correctness.
I suspect that the fixpoint loop computing analysis results can go away too, now that we truly don't update arguments -- we have a simple immutable data structure with everything snapshotted at creation -- but I haven't made that change yet.
This change appears to be "in the noise" in both runtime and compile time -- a Sightglass run on the default suite shows only
```
compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm
  Δ = 551234.50 ± 514580.62 (confidence = 99%)
  new.so is 1.00x to 1.01x faster than old.so!
  [61669181 72513567.85 98139932] new.so
  [60991071 73064802.35 120044089] old.so

execution :: cycles :: benchmarks/bz2/benchmark.wasm
  Δ = 232827.80 ± 204621.12 (confidence = 99%)
  old.so is 1.00x to 1.01x faster than new.so!
  [67208140 72812782.32 89996076] new.so
  [69531172 72579954.52 80530142] old.so
```
which seem like suitably small swings that are fine. Spot-checking the aegraph stats on the same function before-and-after shows the same optimizations happening in all functions I examined, and we see the compile-tests showing no movement except for a value renumbering in one case. So: no effect objectively, but deletes code and significantly simplifies the core algorithm.
|
|
Revision tags: v31.0.0, v30.0.2, v30.0.1, v30.0.0, v29.0.1, v29.0.0, v28.0.1 |
|
| #
1bb71d31 |
| 09-Jan-2025 |
amartosch <[email protected]> |
Compute dominator tree using semi-NCA algorithm (#9603)
* Add dominator tree computed using semi-NCA algorithm.
* Add dominator tree fuzz target
* Move previous version of dominator tree to a separate file
* Improve comments.
* Use the new dominator tree in verifier.
* Remove unused `iterators` module.
|
|
Revision tags: v28.0.0 |
|
| #
45b60bd6 |
| 02-Dec-2024 |
Alex Crichton <[email protected]> |
Start using `#[expect]` instead of `#[allow]` (#9696)
* Start using `#[expect]` instead of `#[allow]`
In Rust 1.81, our new MSRV, a new feature was added to Rust to use `#[expect]` to control lint levels. This new lint annotation will silence a lint but will itself cause a lint if it doesn't actually silence anything. This is quite useful to ensure that annotations don't get stale over time.
Another feature is the ability to use a `reason` directive on the attribute with a string explaining why the attribute is there. This string is then rendered in compiler messages if a warning or error happens.
This commit applies a few changes across the workspace:
  * Some `#[allow]` are changed to `#[expect]` with a `reason`.
  * Some `#[allow]` have a `reason` added if the lint conditionally fires (mostly related to macros).
  * Some `#[allow]` are removed since the lint doesn't actually fire.
  * The workspace configures `clippy::allow_attributes_without_reason = 'warn'` as a "ratchet" to prevent future regressions.
  * Many crates are annotated to allow `allow_attributes_without_reason` during this transitionary period.
The end-state is that all crates should use `#[expect(..., reason = "...")]` for any lint that unconditionally fires but is expected. The `#[allow(..., reason = "...")]` attribute should be used for conditionally firing lints, primarily in macro-related code. The `allow_attributes_without_reason = 'warn'` level is intended to be permanent, but the transitionary `#[expect(clippy::allow_attributes_without_reason)]` crate annotations are intended to go away over time.
* Fix adapter build
prtest:full
* Fix one-core build of icache coherence
* Use `allow` for missing_docs
Work around rust-lang/rust#130021 which was fixed in Rust 1.83 and isn't fixed for our MSRV at this time.
* More MSRV compat
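A short sketch of the two attribute forms discussed in this commit, assuming Rust 1.81 or newer. The functions and reason strings are made up for illustration and do not come from the Wasmtime sources.

```rust
/// `#[expect]` silences the lint here, and would itself produce a warning
/// if `factor` were ever used (that is, if the lint stopped firing).
#[expect(unused_variables, reason = "placeholder parameter kept for a planned API")]
fn scaled(value: u32, factor: u32) -> u32 {
    value
}

/// `#[allow]` with a `reason` suits lints that only fire in some
/// configurations, where `#[expect]` would be unfulfilled in the others.
#[allow(dead_code, reason = "only called when a hypothetical feature is enabled")]
fn helper() -> u32 {
    42
}
```

The practical difference is staleness detection: an unfulfilled `#[expect]` is reported by the compiler, while an `#[allow]` that no longer suppresses anything stays silent.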
|
|
Revision tags: v27.0.0, v26.0.1, v25.0.3, v24.0.2 |
|
| #
92cc0ad7 |
| 04-Nov-2024 |
SingleAccretion <[email protected]> |
Add very basic logging to the debug info transform (#9526)
* Add very basic logging to the debug info transform
The DI transform is a kind of compiler and logging is a very good way to gain insight into compilers.
* Fix C&P
* Bubble the "trace-log" feature up the dependency tree
And switch logging macros to always be enabled in debug.
Verified "trace-log" **does not** show up when running 'cargo tree -f "{p} {f}" -e features,normal,build'
* Fix dead code warnings
|
|
Revision tags: v26.0.0 |
|
| #
5b1a1edb |
| 22-Oct-2024 |
nihalpasham <[email protected]> |
Enable --all-features and display feature requirements in Cranelift docs on docs.rs (#9493)
* build docs for cranelift with all features enabled
* build docs for cranelift-codegen with the feature all-arch
* build docs for cranelift-codegen with the feature all-arch
|
|
Revision tags: v21.0.2, v22.0.1, v23.0.3, v25.0.2, v24.0.1, v25.0.1, v25.0.0 |
|
| #
ae92cb41 |
| 03-Sep-2024 |
Alex Crichton <[email protected]> |
Refactor `CallInfo` amongst Cranelift's backends (#9190)
* Refactor backends to use the same `CallInfo`
This commit refactors the various backends of cranelift, except for s390x, to use a shared definition of `CallInfo`. They were all already quite similar and the main change here is to push platform-specific pieces into the instructions outside of `CallInfo`. This is intended to make additions to `CallInfo` easier and require less refactoring in the future. Additionally this enables passing a `CallInfo` structure around instead of passing around all of its components which helps reduce the amount of arguments to various functions.
* s390x: Use the same `CallInfo` as other backends
This commit refactors s390x the same way as the previous commit to use the shared `CallInfo` that all other backends are using. This required more refactoring on the s390x side of things to notably extract a dedicated pseudo-instruction for `ElfTlsGetOffset` rather than bundling it within the `Call` instruction.
* Review comments and test fixes
* Fold `ExternalName` into `CallInfo`
As predicted instruction sizes got larger when outlining this on some platforms so apply the same fix across all platforms by changing to `CallInfo<T>` where the `T` will change depending on whether it's an indirect or direct call.
* Update test expectations
|
| #
c0c3a68c |
| 21-Aug-2024 |
Nick Fitzgerald <[email protected]> |
Cranelift: Remove the old stack maps implementation (#9159)
They are superseded by the new user stack maps implementation.
|
|
Revision tags: v24.0.0, v23.0.2, v23.0.1, v23.0.0, v22.0.0 |
|
| #
b3636ff6 |
| 18-Jun-2024 |
Nick Fitzgerald <[email protected]> |
Introduce the `cranelift-bitset` crate; use it for stack maps in both Cranelift and Wasmtime (#8826)
* Introduce the `cranelift-bitset` crate
The eventual goal is to deduplicate bitset types between Cranelift and Wasmtime, especially their use in stack maps.
* Use the `cranelift-bitset` crate inside both Cranelift and Wasmtime
Mostly for stack maps, also for a variety of other random things where `cranelift_codegen::bitset::BitSet` was previously used.
* Fix stack maps unit test in cranelift-codegen
* Uncomment `no_std` declaration
* Fix `CompoundBitSet::reserve` method
* Fix `CompoundBitSet::insert` method
* Keep track of the max in a `CompoundBitSet`
Makes a bunch of other stuff easier, and will be needed for replacing `cranelift_entity::EntitySet`'s bitset with this thing anyways.
* Add missing parens
* Fix a bug around insert and reserve
* Implement `with_capacity` in terms of `new` and `reserve`
* Rename `reserve` to `ensure_capacity`
|
|
Revision tags: v21.0.1 |
|
| #
cacfaf8b |
| 20-May-2024 |
Nick Fitzgerald <[email protected]> |
Cranelift: Split out dominator tree's depth-first traversal into a reusable iterator (#8640)
We intend to use this when computing liveness of GC references in `cranelift-frontend` to manually construct safepoints and ultimately remove `r{32,64}` reference types from CLIF, `cranelift-codegen`, and `regalloc2`.
Co-authored-by: Trevor Elliott <[email protected]>
|
|
Revision tags: v21.0.0 |
|
| #
b869b66b |
| 13-May-2024 |
Jamey Sharp <[email protected]> |
cranelift: Delete redundant DCE optimization pass (#8227)
The egraph pass and the dead-code elimination pass both remove instructions whose results are unused. If the optimization level is "none", neither pass runs, and if it's anything else both passes run. I don't think we should do this work twice.
Note that the DCE pass is different than the "eliminate unreachable code" pass, which removes entire blocks that are unreachable from the entry block. That pass might still be necessary.
|
|
Revision tags: v20.0.2, v20.0.1 |
|
| #
2c409535 |
| 02-May-2024 |
Jamey Sharp <[email protected]> |
cranelift: Compress vcode range-lists (#8506)
These lists of ranges always cover contiguous ranges of an index space, meaning the start of one range is the same as the end of the previous range, so we can cut storage in half by only storing one endpoint of each range.
This in turn means we don't have to keep track of the other endpoint while building these lists, reducing the state we need to keep while building vcode and simplifying the various build steps.
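The idea of storing one endpoint per range can be sketched as follows. This is an illustration only, not Cranelift's actual type: the `Ranges` name, its methods, and the assumption that the index space starts at 0 are all hypothetical.

```rust
use std::ops::Range;

/// Hypothetical container for contiguous ranges. Because each range starts
/// where the previous one ended, only the end of each range is stored.
#[derive(Default)]
struct Ranges {
    ends: Vec<u32>,
}

impl Ranges {
    /// Appends a range that begins at the previous end (or at 0 for the first
    /// range) and finishes at `end`.
    fn push_end(&mut self, end: u32) {
        debug_assert!(self.ends.last().map_or(true, |&prev| prev <= end));
        self.ends.push(end);
    }

    /// Number of ranges stored.
    fn len(&self) -> usize {
        self.ends.len()
    }

    /// Reconstructs the full range at `index` from two adjacent endpoints.
    fn get(&self, index: usize) -> Range<u32> {
        let start = if index == 0 { 0 } else { self.ends[index - 1] };
        start..self.ends[index]
    }
}
```

A builder using this shape only needs to call `push_end` with the current position when a range finishes, so it does not have to remember where the range began.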
|
| #
132ef1e4 |
| 29-Apr-2024 |
Kirpal Grewal <[email protected]> |
Fxhash to rustchash (#8498)
* move fx hash to workspace level dep
* change internal fxhash to use fxhash crate
* remove unneeded HashSet import
* change fxhash crate to rustc hash
* undo migration to rustc hash
* manually implement hash function from fxhash
* change to rustc hash
|