History log of /wasmtime-44.0.1/cranelift/codegen/src/lib.rs (Results 1 – 25 of 80)
Revision Date Author Comments
Revision tags: dev, v36.0.9, v44.0.1, v43.0.2, v36.0.8, v24.0.8, v44.0.0, v43.0.1, v42.0.2, v36.0.7, v24.0.7, v43.0.0, v42.0.1, v41.0.4, v42.0.0, v40.0.4, v36.0.6, v24.0.6, v41.0.3, v41.0.2
# bc4582c3 27-Jan-2026 Alex Crichton <[email protected]>

Forbid rustdoc warnings in CI (#12420)

* Forbid rustdoc warnings in CI

This commit corrects our handling of rustdoc flags in CI to ensure that
warnings indeed fire. Additionally this changes our flags to pass
`-Dwarnings` to ensure that we have warning-free doc builds when all
features are enabled at least.

There were quite a lot of preexisting issues to fix, so this
additionally goes through and fixes all the warnings that cropped up.

* Update nightly toolchain again

prtest:full

* Update another nightly

* Fix a warning in generated code


Revision tags: v41.0.1, v36.0.5, v40.0.3, v41.0.0, v36.0.4, v39.0.2, v40.0.2, v40.0.1
# 76911c29 07-Jan-2026 SSD <[email protected]>

Partial support for no_std in cranelift_codegen (#12222)

* Move most things from std to core and alloc

* Port assembler_x64 to no_std

* before adding prelude to each file

* Most of the files now work with no_std

* update isle to use alloc and core

* some instances shouldn't have been renamed, fixes cargo test

* add cranelift-assembler-x64 (no_std) to CI

* fix codegen_meta, missed one spot with std::slice

* automatically remove prelude with cargo fix

* update isle changes

* update assembler changes

* update assembler changes

* use latest codegen changes + fix FxHash problem

* add imports

* fix floating issues with libm

* remove unused import

* temporarily remove OnceLock

* add no_std arm support and add it into CI

* revert Cargo.toml formatting

* remove prelude and fix cargo.toml

* cargo fmt

* remove empty lines

* bad renames

* macro_use only on no_std

* revert OnceLock change

* only use stable libm features

* update regalloc2

* update comment

* use continue instead

* Update vets

---------

Co-authored-by: Alex Crichton <[email protected]>


Revision tags: v40.0.0, v39.0.1, v39.0.0, v38.0.4, v37.0.3, v36.0.3, v24.0.5, v38.0.3, v38.0.2, v38.0.1, v37.0.2
# a3d6e407 06-Oct-2025 Chris Fallin <[email protected]>

Cranelift: add debug tag infrastructure. (#11768)

* Cranelift: add debug tag infrastructure.

This PR adds *debug tags*, a kind of metadata that can attach to CLIF
instructions and be lowered to VCode instructions and as metadata on
the produced compiled code. It also adds opaque descriptor blobs
carried with stackslots. Together, these two features allow decorating
IR with first-class debug instrumentation that is properly preserved
by the compiler, including across optimizations and
inlining. (Wasmtime's use of these features will come in followup
PRs.)

The key idea of a "debug tag" is to allow the Cranelift embedder to
express whatever information it needs to, in a format that is opaque
to Cranelift itself, except for the parts that need translation during
lowering. In particular, the `DebugTag::StackSlot` variant gets
translated to a physical offset into the stackframe in the compiled
metadata output. So, for example, the embedder can emit a tag
referring to a stackslot, and another describing an offset in that
stackslot.

The debug tags exist as a *sequence* on any given instruction; the
meaning of the sequence is known only to the embedder, *except* that
during inlining, the tags for the inlining call instruction are
prepended to the tags of inlined instructions. In this way, a
canonical use-case of tags as describing original source-language
frames can preserve the source-language view even when multiple
functions are inlined into one.
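
The prepend-on-inline rule described above can be sketched as follows. This is only an illustration: the `Tag` enum and `tags_after_inlining` function are hypothetical stand-ins, not Cranelift's actual `DebugTag` type or inliner code.

```rust
/// Hypothetical stand-in for an opaque debug tag (not Cranelift's real type).
#[derive(Clone, Debug, PartialEq)]
pub enum Tag {
    /// Opaque value whose meaning is known only to the embedder.
    User(u32),
    /// Reference to a stack slot by index.
    StackSlot(u32),
}

/// Tag sequence for an instruction after it has been inlined: the tags of the
/// inlining call instruction come first, then the instruction's own tags.
pub fn tags_after_inlining(call_site: &[Tag], inlined_inst: &[Tag]) -> Vec<Tag> {
    let mut out = Vec::with_capacity(call_site.len() + inlined_inst.len());
    out.extend_from_slice(call_site);
    out.extend_from_slice(inlined_inst);
    out
}
```

Read front to back, the resulting sequence lists the outermost source-level frame first, which is what lets an embedder recover the source-language view after several levels of inlining.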

The descriptor on a stackslot may look a little odd at first, but its
purpose is to allow serializing some description of
stackslot-contained runtime user-program data, in a way that is firmly
attached to the stackslot. In particular, in the face of inlining,
this descriptor is copied into the inlining (parent) function from the
inlined function when the stackslot entity is copied; no other
metadata outside Cranelift needs to track the identity of stackslots
and know about that motion. This fits nicely with the ability of tags
to refer to stackslots; together, the embedder can annotate
instructions as having certain state in stackslots, and describe the
format of that state per stackslot.

This infrastructure is tested with some compile-tests now;
testing of the interpretation of the metadata output will come with
end-to-end debug instrumentation tests in a followup PR.

* Review feedback: add back sequence points and enforce tags only on sequence points or calls.

* Use Vecs for debug metadata in MachBuffer to avoid SmallVec size penalty in not-used case.

* Review feedback: switch from inlined stackslot descriptor blobs to u64 keys.


# 4f2fa154 29-Sep-2025 Alex Crichton <[email protected]>

Update nightly Rust used in CI (#11755)

* Update nightly Rust used in CI

Keeping it up-to-date

prtest:full

* Fix unused warnings on nightly

* Rename rustdoc feature

* Adjust some removals


Revision tags: v37.0.1, v37.0.0, v36.0.2, v36.0.1, v36.0.0
# 4590076f 26-Jul-2025 Chris Fallin <[email protected]>

Cranelift: support dynamic contexts in exception-handler lists. (#11321)

In #11285, we realized that Wasm semantics require us to match on
dynamic instances of exception tags, rather than static tag types. This
fundamentally requires the unwinder to be able to resolve the current
Wasm instance for each Wasm frame on the stack that has any handlers,
and our frame format does not provide this today.

We discussed many options, some of which solve the more general problem
(Wasm vmctx for any frame), but ultimately landed on a notion of
"dynamic context for evaluating tags", specific to Cranelift's
exception-catch metadata; and storing that context and carrying it
through to a place that is named in the unwind metadata. The reasoning
is fairly straightforward: we cannot afford a more general approach that
stores vmctx in every frame (I measured this at 20% overhead for a
recursive-Fibonacci benchmark that is call-intensive); and inlining
means that we may have *multiple* contexts at any given program point,
each associated with a different slice of the handler tags; so we need a
mechanism that, *just for a try-call*, intersperses contexts with tags
(or puts a context on each tag) and stores these somewhere that the
exception-unwind ABI doesn't clobber (e.g., on the stack).

This PR implements "option 4" from that issue, namely, *dynamic
exception contexts*. The idea is that this is the dual to exception
payload: while payload lets the unwinder communicate state *to* the
catching code, context lets the unwinder take state *from* the catching
code that lets it decide whether the tag is a match. Because of
inlining, we need to either associate (optional) context with every tag,
or intersperse context-updates with handler tags. I've opted for the
latter for efficiency at the CLIF level (in most cases there will be
multiple tags per context), though they are isomorphic.

The new tag-matching semantics are: when walking up the stack, upon
reaching a `try_call`, evaluate catch-clauses in listed order. A
`context` clause sets the current context. A `tagN: block(...)` clause
attempts to match the throwing exception against `tagN`, *evaluated in
the current context*, and branches to the named block if it matches. A
`default: block(...)` always branches to the named block.
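
A minimal sketch of these matching semantics, assuming a simplified model: the `Clause` enum, the `find_handler` function, and the `matches` predicate are illustrative names, not Cranelift or Wasmtime APIs. The predicate stands in for the embedder-chosen match algorithm.

```rust
/// One entry in a `try_call` handler list (hypothetical model).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Clause {
    /// Sets the current dynamic context for the tag clauses that follow.
    Context(u64),
    /// Branch to `block` if the thrown exception matches `tag`,
    /// evaluated in the current context.
    Tag { tag: u32, block: u32 },
    /// Always branch to `block`.
    Default { block: u32 },
}

/// Evaluate catch clauses in listed order and return the first handler block.
/// `matches(context, tag)` is the embedder-defined predicate.
pub fn find_handler(
    clauses: &[Clause],
    matches: impl Fn(Option<u64>, u32) -> bool,
) -> Option<u32> {
    let mut context = None;
    for clause in clauses {
        match *clause {
            Clause::Context(c) => context = Some(c),
            Clause::Tag { tag, block } => {
                if matches(context, tag) {
                    return Some(block);
                }
            }
            Clause::Default { block } => return Some(block),
        }
    }
    None
}
```

Because the same static tag index can appear under different contexts, order matters: the sketch never reorders or deduplicates clauses.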

Note that this lets us assume less about tags than before, and this
particularly manifests in the changes to the inliner. Whereas before,
`tagN` is `tagN` and an inner handler for that tag shadows an outer
handler (that is, tags always alias if identical indices); and whereas
before, `tagN` is not `tagM` and so we can order the tags arbitrarily
(that is, tags never alias if non-identical indices); now any two static
tag indices may or may not alias depending on the dynamic context of
each. Or, even in the same context, two may alias, because we leave the
match-predicate as an unspecified (user-chosen) algorithm during
unwinding. (This mirrors the reality that, for example, a Wasm instance
may import two tags, and dynamically these tags may be equal or
different at runtime, even instantiation-to-instantiation.) Cranelift's
only job is to faithfully carry the list of contexts and tags through to
the compiled-code metadata; and to ensure that they remain in the order
they were specified in the CLIF.

This PR introduces the Cranelift-level feature, and it will be used in
a subsequent PR that introduces Wasm exception handling. Because of
that, I've opted not to update the clif-utils runtest "runtime" to read
out contexts and do something with them -- we will have plenty of test
coverage via a bunch of Wasm tests for corner cases such as the above.
This PR does include filetests that show that contexts are carried
through to spillslots and those appear in the metadata.

Fixes #11285.


Revision tags: v35.0.0, v24.0.4, v33.0.2, v34.0.2
# 968952ab 10-Jul-2025 Nick Fitzgerald <[email protected]>

Cranelift: introduce a function inliner (#11210)

* Cranelift: introduce a function inliner

This commit adds "inlining as a library" to Cranelift; it does _not_ provide a
complete, off-the-shelf inlining solution. Cranelift's compilation context is
per-function and does not encompass the full call graph. It does not know which
functions are hot and which are cold, which have been marked the equivalent of
`#[inline(always)]` versus `#[inline(never)]`, etc... Only the Cranelift user
can understand these aspects of the full compilation pipeline, and these things
can be very different between (say) Wasmtime and `cg_clif`. Therefore, this
infrastructure does not attempt to define heuristics for when inlining a
particular call is likely beneficial. This module only provides hooks for the
Cranelift user to tell Cranelift whether a given call should be inlined or not,
and the mechanics to inline a callee into a particular call site when the user
directs Cranelift to do so.
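
The division of labor described above (the user decides, the compiler performs the mechanics) might look roughly like this. Every name here (`InlinePolicy`, `InlineDecision`, `SmallCallees`, `plan_inlining`) is hypothetical and chosen for illustration; the real hook in Cranelift may have a different shape.

```rust
/// What the embedder wants done with one call site.
#[derive(Debug, PartialEq)]
pub enum InlineDecision {
    KeepCall,
    Inline,
}

/// Hypothetical hook: the embedder, not the compiler, owns the heuristics.
pub trait InlinePolicy {
    fn decide(&mut self, callee: &str, callee_insts: usize) -> InlineDecision;
}

/// Example embedder policy: inline only small callees.
pub struct SmallCallees {
    pub max_insts: usize,
}

impl InlinePolicy for SmallCallees {
    fn decide(&mut self, _callee: &str, callee_insts: usize) -> InlineDecision {
        if callee_insts <= self.max_insts {
            InlineDecision::Inline
        } else {
            InlineDecision::KeepCall
        }
    }
}

/// Stand-in for the compiler side: ask the policy about each call site
/// (callee name, callee size) and return the callees that would be inlined.
pub fn plan_inlining<'a>(
    call_sites: &[(&'a str, usize)],
    policy: &mut dyn InlinePolicy,
) -> Vec<&'a str> {
    let mut out = Vec::new();
    for &(name, size) in call_sites {
        if policy.decide(name, size) == InlineDecision::Inline {
            out.push(name);
        }
    }
    out
}
```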

This commit also creates a new kind of filetest that will always inline calls to
functions that have already been defined in the file. This lets us exercise the
inliner in filetests.

Fixes https://github.com/bytecodealliance/wasmtime/issues/4127

* Address review feedback

* Require callee bodies are pre-legalized


# 099102d9 07-Jul-2025 Alex Crichton <[email protected]>

Remove `expect(clippy::allow_attributes_without_reason)` from cranelift-codegen (#11182)

* Remove `expect(clippy::allow_attributes_without_reason)` from cranelift-codegen

This commit gets around to migrating the `cranelift-codegen` crate to
require a reason on lint directives and additionally switch to
`#[expect]` where possible.

prtest:full

* Move x64-only item to x64 backend


Revision tags: v34.0.1, v33.0.1, v24.0.3, v32.0.1, v34.0.0
# cfe17cb1 19-Jun-2025 Nick Fitzgerald <[email protected]>

Cranelift: Generate integer numeric ops and conversions for ISLE in the meta crate (#11065)

* Cranelift: Generate integer numeric ops and conversions for ISLE in the meta crate

This automatically generates operations and conversions for integer types for
use in ISLE.

Supported types are: `{i,u}{8,16,32,64,128}`

We generate

* Comparisons (eq, ne, lt, lt_eq, gt, gt_eq)
* Arithmetic operations (add, sub, mul, div, neg)
  * These each have checked, wrapping, and unwrapping variants
* Bitwise operations (and, or, xor, shifts, counting leading/trailing zeros/ones)
* A variety of predicates (is_zero, is_power_of_two, is_odd, etc...)
  * These generate both partial constructors and a handful of extractors
* Conversions
  * These come in a variety of flavors: fallible, infallible, truncating,
    unwrapping, sign-reinterpretation
  * Fallible conversions are also available as an extractor
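
To make the flavors listed above concrete, here is a sketch of what each one means, written as plain Rust functions over the standard integer methods. The function names only imitate the naming pattern mentioned in this commit and are assumptions; they are not the generated ISLE declarations.

```rust
/// Checked: `None` on overflow.
pub fn u8_checked_add(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

/// Wrapping: overflow wraps around modulo 2^8.
pub fn u8_wrapping_add(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

/// Unwrapping: panics on overflow instead of returning an `Option`.
pub fn u8_add(a: u8, b: u8) -> u8 {
    a.checked_add(b).expect("u8 addition overflowed")
}

/// Sign reinterpretation: same bits, viewed as unsigned.
pub fn i64_cast_unsigned(x: i64) -> u64 {
    x as u64
}

/// Fallible conversion: `None` if the value does not fit.
pub fn u64_try_into_u8(x: u64) -> Option<u8> {
    u8::try_from(x).ok()
}

/// Truncating conversion: keeps only the low 8 bits.
pub fn u64_truncate_into_u8(x: u64) -> u8 {
    x as u8
}
```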

* Fix copy paste

* Rename `x_reinterpret_as_y` to `x_cast_[un]signed`

* Collapse some fallible conversions in pulley lowering

* Clean up pulley iconst lowering, make sure narrowest `xconst*` instruction is always used

* Avoid an unnecessary truncation in riscv64 lowering

* Use extractor instead of partial constructor in x64 `imm` rule

* Clean up `op mem, imm` x64 lowering rules

* Use `(i64_eq a b)` instead of `(u64_eq (i64_cast_unsigned a) (i64_cast_unsigned b))`

* Rename `<ty>_unwrapping_<op>` to `<ty>_<op>`


# efa236e5 05-Jun-2025 Chris Fallin <[email protected]>

Cranelift: implement an "unwinder" crate and exception throws in filetests. (#10919)

This commit introduces the next major piece of machinery (after the
previously-landed `try_call` support) that we will eventually use to
implement Wasm exceptions in Wasmtime. In particular, it implements a
generic unwinder as a new crate that supports (i) walking a stack
produced by Cranelift code, (ii) serializing Cranelift exception
metadata to compact tables (in a way very similar to address maps in
Wasmtime, so they will be mappable directly from disk), (iii) using
these serialized tables to find handlers during a stack-walk, and (iv)
jumping to handlers (i.e., actually unwinding). This crate is currently
used in the filetests runner, and will next be used in Wasmtime.

The commit first performs code-motion: it moves stack-walking code from
Wasmtime to `cranelift-unwinder`. This itself has no functional effect,
but isolates the code that understands contiguous sequences of Cranelift
frames ("activations") from that which is specific to Wasmtime's
activation delimiters and metadata.

It then implements a compact exception-table format. This format uses
the `object` crate's mechanisms for directly referencing in-memory
arrays of little-endian `u32`s in a way that will allow us to find
handlers when mapping exception metadata directly from an ELF section in
a `.cwasm` (for example). The format consists of four sorted `u32`
arrays in a way that allows us to look up a callsite first, then search
its sorted array of handler offsets by tags.
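
The two-step lookup (callsite first, then tag) can be sketched as below. The commit message does not spell out what the four arrays contain, so the layout here (`callsites`, `first_handler`, `tags`, `handlers`) is an assumption made purely for illustration, as is the `ExceptionTable` type itself.

```rust
/// Hypothetical table layout: four `u32` arrays borrowed from a mapped section.
pub struct ExceptionTable<'a> {
    /// Sorted code offsets of callsites that have handlers.
    pub callsites: &'a [u32],
    /// For callsite `i`, its handlers live at indices
    /// `first_handler[i]..first_handler[i + 1]` of `tags` and `handlers`.
    pub first_handler: &'a [u32],
    /// Tags, sorted within each callsite's range.
    pub tags: &'a [u32],
    /// Handler code offsets, parallel to `tags`.
    pub handlers: &'a [u32],
}

impl ExceptionTable<'_> {
    /// Binary-search the callsite, then binary-search its tag range.
    pub fn lookup(&self, callsite: u32, tag: u32) -> Option<u32> {
        let i = self.callsites.binary_search(&callsite).ok()?;
        let start = *self.first_handler.get(i)? as usize;
        let end = *self.first_handler.get(i + 1)? as usize;
        let tags = self.tags.get(start..end)?;
        let j = tags.binary_search(&tag).ok()?;
        self.handlers.get(start + j).copied()
    }
}
```

Since every field is a borrowed slice, a table like this can point straight into mapped file contents without a parsing or allocation step, which is the property the message is after.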

It next implements the actual unwind control flow: it contains an
assembly stub for each supported architecture that transfers control to
a PC, SP, and FP value "up the stack", with payload values placed in the
payload registers we have defined per our exception ABI in Cranelift.

Finally, it puts these pieces together in the filetest runner. Note that
the runtest does a lot "by hand": we don't have entry and exit
trampolines as we do in Wasmtime, so the filetest contains three
functions, with the middle one invoking the "throw hostcall" and entry
and exit trampolines around it grabbing the appropriate entry/exit FPs
and exit PC. The dance to call back to host code is also somewhat
delicate, as we haven't done this before. The `JITModule`'s linking +
relocation support does not seem sufficient to properly define a symbol,
so instead we scan for `func_addr` instructions referencing a well-known
name (`__cranelift_throw`) and replace them with `iconst`s with the
function address at runtime, baking it in. This is somewhat ugly, but it
works. All of these filetest-specific details will be handled much more
nicely in the Wasmtime version of this functionality, as we have proper
abstractions for entry/exit trampolines and hostcalls.


# 2c1e1155 02-Jun-2025 Saúl Cabrera <[email protected]>

winch(aarch64): Simplify constant handling, part 1/N (#10888)

* winch(aarch64): Simplify constant handling, part 1/N

This commit is the first step toward simplifying constant handling,
particularly for the aarch64 backend.

The main highlights in this patch are:

* Introduction of a `ConstantPool` implementation on top of Cranelift
  primitives. The implementation is identical to the existing one for
  x64; however, it's abstracted so that it can be easily consumed from
  any existing backend.
* Usage of the constant pool from aarch64, which simplifies the
  loading of constants, particularly floating point constants.

The main motivation behind this change is to _eventually_ detach the
implicit usage of the scratch register from constant loading as much as
possible, reducing the possibility of subtle bugs (like the one
described in https://github.com/bytecodealliance/wasmtime/pull/10829).

Note that I have a work-in-progress branch from which all these
changes are cherry-picked, to make everything easier to review.

A side effect of this change is improved code generation involving
floating point constants. Prior to this change, multiple moves were
involved; with this patch, at most one move is required and at worst
one load is required.

* Update disassembly tests

* Apply refactored constant handling on top of shared float min/max implementation

* `fmt`


Revision tags: v33.0.0
# 90ac295e 19-May-2025 Alex Crichton <[email protected]>

Update Wasmtime to the 2024 Rust Edition (#10806)

* Update Wasmtime to the 2024 Rust Edition

Now that our MSRV supports the 2024 edition it's possible to make this
switch. This commit moves Wasmtime to the 2024 Edition to keep
up-to-date with Rust idioms and access many of the edition features
exclusive to the 2024 edition.

prtest:full

* Reformat with the 2024 edition


Revision tags: v32.0.0
# 0e0a60ae 08-Apr-2025 Nick Fitzgerald <[email protected]>

Define an RAII helper for generic take-and-replace borrow splitting (#10548)

* Define an RAII helper for generic take-and-replace borrow splitting

Follow up to https://github.com/bytecodealliance/wasmtime/pull/10524#discussion_r2032164529

* &mut T
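
For readers unfamiliar with the pattern, a take-and-replace guard could look like the sketch below. It is an assumption about the general technique, not the helper this commit adds: the `TakeAndReplace` name, its type parameters, and the `get` method are illustrative. The guard moves a field out of its container (leaving `Default::default()` behind), hands out both the container and the field as separate `&mut` borrows, and puts the field back on drop.

```rust
use std::mem;

/// Hypothetical RAII guard for take-and-replace borrow splitting.
pub struct TakeAndReplace<'a, T, U, F>
where
    F: Fn(&mut T) -> &mut U,
    U: Default,
{
    container: &'a mut T,
    value: U,
    project: F,
}

impl<'a, T, U, F> TakeAndReplace<'a, T, U, F>
where
    F: Fn(&mut T) -> &mut U,
    U: Default,
{
    /// Take the field selected by `project` out of `container`.
    pub fn new(container: &'a mut T, project: F) -> Self {
        let value = mem::take(project(&mut *container));
        TakeAndReplace { container, value, project }
    }

    /// Borrow the container and the taken field at the same time.
    pub fn get(&mut self) -> (&mut T, &mut U) {
        (&mut *self.container, &mut self.value)
    }
}

impl<T, U, F> Drop for TakeAndReplace<'_, T, U, F>
where
    F: Fn(&mut T) -> &mut U,
    U: Default,
{
    /// Put the (possibly modified) field back where it came from.
    fn drop(&mut self) {
        let value = mem::take(&mut self.value);
        *(self.project)(&mut *self.container) = value;
    }
}
```

Compared with a manual `mem::take` followed by a write-back, the guard cannot forget the write-back on early returns.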


# 3da7fc8e 08-Apr-2025 SingleAccretion <[email protected]>

[DI] Dump value label assignments in a table (#10549)

* Dump compilation start/end

* [DI] Log value label ranges in a table

Sample table:

|Inst    |IP  |VL0     |VL1      |VL3      |VL4     |VL5     |VL7     |VL10     |VL11    |VL4294967294|
|--------|----|--------|---------|---------|--------|--------|--------|---------|--------|------------|
|Inst 0  |53  |    |   |    |    |    |    |    |   |    |   |    |   |    |    |    |   |    |       |
|Inst 1  |53  |    |   |    |    |    |    |    |   |    |   |    |   |    |    |    |   |    |       |
|Inst 2  |60  |v194|p2i|v232|p12i|    |    |    |   |    |   |    |   |    |    |    |   |v192|p7i    |
|Inst 3  |64  |*   |p2i|*   |p12i|v231|p13i|    |   |    |   |    |   |    |    |    |   |*   |p7i    |
|Inst 4  |68  |*   |p2i|*   |p12i|*   |p13i|    |   |    |   |    |   |    |    |    |   |*   |p7i    |
|Inst 5  |72  |*   |p2i|*   |p12i|*   |p13i|    |   |    |   |    |   |    |    |    |   |*   |p7i    |
|Inst 6  |76  |*   |p2i|*   |p12i|*   |p13i|    |   |    |   |    |   |    |    |    |   |*   |p7i    |
|Inst 7  |87  |*   |   |*   |p12i|*   |p13i|    |   |    |   |    |   |    |    |    |   |*   |p7i    |
|Inst 8  |92  |*   |   |*   |p12i|*   |p13i|v227|p0i|    |   |    |   |    |    |    |   |*   |p15i   |
|Inst 9  |94  |*   |   |v204|    |v204|    |v204|   |v204|   |v204|   |v204|    |v204|   |*   |p15i   |
|Inst 10 |100 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |*   |    |*   |   |*   |p15i   |
|Inst 11 |105 |*   |   |*   |    |*   |    |*   |   |v226|p9i|*   |   |*   |    |*   |   |*   |p15i   |
|Inst 12 |109 |*   |   |*   |    |*   |    |*   |   |*   |   |v225|p9i|*   |    |*   |   |*   |p15i   |
|Inst 13 |114 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |*   |    |*   |   |*   |p15i   |
|Inst 14 |119 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |*   |    |*   |   |*   |p15i   |
|Inst 15 |125 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |*   |    |*   |   |*   |p15i   |
|Inst 16 |129 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |v223|p11i|*   |   |*   |p15i   |
|Inst 17 |134 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |*   |    |*   |   |*   |p15i   |
|Inst 18 |134 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |*   |    |*   |   |*   |p15i   |
|Inst 19 |139 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |*   |    |v222|p0i|*   |p15i   |
|Inst 20 |143 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |*   |    |*   |p0i|*   |p15i   |
|Inst 21 |143 |*   |   |*   |    |*   |    |*   |   |*   |   |*   |   |*   |    |*   |p0i|*   |       |

This will make it much easier to diagnose problems with incomplete/missing live ranges.


# 82c0a09b 26-Mar-2025 Chris Fallin <[email protected]>

Simplify aegraphs by removing union-find and canonical eclass IDs. (#10471)

I was recently re-thinking through some of the core data structure
design in our aegraph implementation, and wondered: do we really need
the union-find data structure, the notion of "canonical" ID for an
eclass separate from its latest ID (root of union-node tree), and the
hashcons-key canonicalization using all of this? It's an awful lot of
complexity and has led to some fairly subtle bugs (e.g., #6126), and is
generally unsatisfying.

I had the realization: the only case where the distinction between
canonical and latest ID matters is when we expand an eclass after its
initial (eager) rewriting, which happens before we see its uses. If we
hypothesize that this happens rarely, then it should be fine to
canonicalize based only on latest ID -- we shouldn't lose much (and we
can measure this loss empirically).

The chief case where this kind of "late eclass expansion" still happens
is: if we have some expression E1 that eventually rewrites to E2 via
some simplification, and E2 already exists earlier in the program, then
E1 will join the eclass. If we then have some `E3 := E1 + 1`, and later
`E4 := E2 + 1`, we need the union-find canonicalization for E4 to GVN to
E3. Otherwise, the latest ID for the eclass that eventually contains E1
and E2 is different at the time that E3 uses it (and is GVN'd and
rewritten) and when E4 does. Put another way: E3 captures a snapshot of
its operand's eclass before a new node joins it, and is never
reprocessed when that happens, so E3 remains distinct.

But if the `E2 -> E1` rewrite is truly "directional" toward a better
representation that we will always want to choose -- say, `x + 0 -> x`,
or any constant-propagation in general -- then if the eager rewriting
for E2 produces E1's eclass ID directly *without* adding E2 to its
nodes, then all users will still canonicalize as before. This "only
return the rewrite target, don't union with it" behavior is exactly our
`subsume` operator.

Put another way: subsumption prevents growing eclasses later, so
snapshots in time remain the latest, and everyone canonicalizes with the
same ID. We move to a true immutable data structure, with simple
hashconsing with no magic.

The rewrite semantics are then much simpler too: if any value is
marked "subsume", we pick it (pick an arbitrary one if multiple) as our
rewrite result; otherwise, create an eclass with the original rewrite
input and all rewrite outputs (by creating union-nodes in the DFG). No
auto-subsume or factoring in availability -- that's it. "Subsume" means:
always pick the rewritten value, don't keep the original, and we use it
whenever that's unambiguously true for better canonicalization.
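
The rewrite rule in the paragraph above can be sketched as a small function. This is a simplified model with hypothetical names (`Rewrite`, `Outcome`, `resolve`) and plain `u32` value IDs; it is not the aegraph code itself. Where the text allows an arbitrary choice among multiple subsuming values, the sketch takes the first one.

```rust
/// One output of applying rewrite rules to a value (hypothetical model).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Rewrite {
    pub value: u32,
    /// True if the rule marked this output as subsuming the original.
    pub subsume: bool,
}

/// Result of resolving all rewrite outputs for one original value.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// Use this value alone; the original is not kept.
    Subsumed(u32),
    /// An eclass holding the original plus every rewrite output.
    Union(Vec<u32>),
}

/// If any output is marked "subsume", pick it; otherwise union the original
/// with all outputs.
pub fn resolve(original: u32, rewrites: &[Rewrite]) -> Outcome {
    if let Some(r) = rewrites.iter().find(|r| r.subsume) {
        return Outcome::Subsumed(r.value);
    }
    let mut members = vec![original];
    members.extend(rewrites.iter().map(|r| r.value));
    Outcome::Union(members)
}
```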

Given that, it turns out that we can remove the union-find mechanism
entirely. It also turns out that we can unify the scoped-hashmap for
"effectful/idempotent ops" with the ordinary hashmap for pure ops;
everything can be scoped. To get working LICM we do need to retain our
"available block" mechanism, but only to insert values at a higher scope
level in the scoped hashmap -- it is now heuristic, not load-bearing for
correctness.

I suspect that the fixpoint loop computing analysis results can go away
too, now that we truly don't update arguments -- we have a simple
immutable data structure with everything snapshotted at creation -- but
I haven't made that change yet.

This change appears to be "in the noise" in both runtime and compile
time -- a Sightglass run on the default suite shows only

```
compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm

Δ = 551234.50 ± 514580.62 (confidence = 99%)

new.so is 1.00x to 1.01x faster than old.so!

[61669181 72513567.85 98139932] new.so
[60991071 73064802.35 120044089] old.so

execution :: cycles :: benchmarks/bz2/benchmark.wasm

Δ = 232827.80 ± 204621.12 (confidence = 99%)

old.so is 1.00x to 1.01x faster than new.so!

[67208140 72812782.32 89996076] new.so
[69531172 72579954.52 80530142] old.so
```

which seem like suitably small swings that are fine. Spot-checking the
aegraph stats on the same function before-and-after shows the same
optimizations happening in all functions I examined, and we see the
compile-tests showing no movement except for a value renumbering in one
case. So: no effect objectively, but deletes code and significantly
simplifies the core algorithm.


Revision tags: v31.0.0, v30.0.2, v30.0.1, v30.0.0, v29.0.1, v29.0.0, v28.0.1
# 1bb71d31 09-Jan-2025 amartosch <[email protected]>

Compute dominator tree using semi-NCA algorithm (#9603)

* Add dominator tree computed using semi-NCA algorithm.

* Add dominator tree fuzz target

* Move previous version of dominator tree to a separate file

* Improve comments.

* Use the new dominator tree in verifier.

* Remove unused `iterators` module.


Revision tags: v28.0.0
# 45b60bd6 02-Dec-2024 Alex Crichton <[email protected]>

Start using `#[expect]` instead of `#[allow]` (#9696)

* Start using `#[expect]` instead of `#[allow]`

In Rust 1.81, our new MSRV, a new feature was added to Rust to use
`#[expect]` to control lint levels. This new lint annotation will
silence a lint but will itself cause a lint if it doesn't actually
silence anything. This is quite useful to ensure that annotations don't
get stale over time.

Another feature is the ability to use a `reason` directive on the
attribute with a string explaining why the attribute is there. This
string is then rendered in compiler messages if a warning or error
happens.
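
As a small illustration of the two features just described (requires Rust 1.81 or newer), the function and helper names below are made up for the example:

```rust
// `verbose` is deliberately unused, so the `unused_variables` lint fires and
// the expectation is fulfilled. If the parameter were later used, the
// compiler would warn that this `#[expect]` is no longer needed.
#[expect(unused_variables, reason = "parameter reserved for future logging")]
pub fn configure(verbose: bool) -> u32 {
    42
}

// `#[allow]` stays silent whether or not the lint fires, which suits lints
// that only trigger under some configurations.
#[allow(dead_code, reason = "only called on some targets")]
fn target_specific_helper() {}
```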

This commit applies a few changes across the workspace:

* Some `#[allow]` are changed to `#[expect]` with a `reason`.
* Some `#[allow]` have a `reason` added if the lint conditionally fires
  (mostly related to macros).
* Some `#[allow]` are removed since the lint doesn't actually fire.
* The workspace configures `clippy::allow_attributes_without_reason = 'warn'`
  as a "ratchet" to prevent future regressions.
* Many crates are annotated to allow `allow_attributes_without_reason`
  during this transitionary period.

The end-state is that all crates should use
`#[expect(..., reason = "...")]` for any lint that unconditionally fires
but is expected. The `#[allow(..., reason = "...")]` attribute should be used
for conditionally firing lints, primarily in macro-related code.
The `allow_attributes_without_reason = 'warn'` level is intended to be
permanent, but the transitionary
`#[expect(clippy::allow_attributes_without_reason)]` crate annotations
are intended to go away over time.

* Fix adapter build

prtest:full

* Fix one-core build of icache coherence

* Use `allow` for missing_docs

Work around rust-lang/rust#130021 which was fixed in Rust 1.83 and isn't
fixed for our MSRV at this time.

* More MSRV compat


Revision tags: v27.0.0, v26.0.1, v25.0.3, v24.0.2
# 92cc0ad7 04-Nov-2024 SingleAccretion <[email protected]>

Add very basic logging to the debug info transform (#9526)

* Add very basic logging to the debug info transform

The DI transform is a kind of compiler and logging
is a very good way to gain insight into compilers.

* Fix C&P

* Bubble the "trace-log" feature up the dependency tree

And switch logging macros to always be enabled in debug.

Verified "trace-log" **does not** show up when running
'cargo tree -f "{p} {f}" -e features,normal,build'

* Fix dead code warnings


Revision tags: v26.0.0
# 5b1a1edb 22-Oct-2024 nihalpasham <[email protected]>

Enable --all-features and display feature requirements in Cranelift docs on docs.rs (#9493)

* build docs for cranelift with all features enabled

* build docs for cranelift-codegen with the feature all-arch

* build docs for cranelift-codegen with the feature all-arch



Revision tags: v21.0.2, v22.0.1, v23.0.3, v25.0.2, v24.0.1, v25.0.1, v25.0.0
# ae92cb41 03-Sep-2024 Alex Crichton <[email protected]>

Refactor `CallInfo` amongst Cranelift's backends (#9190)

* Refactor backends to use the same `CallInfo`

This commit refactors the various backends of cranelift, except for
s390x, to use a shared definition of `CallInfo`. They were all already
quite similar and the main change here is to push platform-specific
pieces into the instructions outside of `CallInfo`. This is intended to
make additions to `CallInfo` easier and require less refactoring in the
future. Additionally this enables passing a `CallInfo` structure around
instead of passing around all of its components which helps reduce the
amount of arguments to various functions.

* s390x: Use the same `CallInfo` as other backends

This commit refactors s390x the same way as the previous commit to use
the shared `CallInfo` that all other backends are using. This required
more refactoring on the s390x side of things to notably extract a
dedicated pseudo-instruction for `ElfTlsGetOffset` rather than bundling
it within the `Call` instruction.

* Review comments and test fixes

* Fold `ExternalName` into `CallInfo`

As predicted instruction sizes got larger when outlining this on some
platforms so apply the same fix across all platforms by changing to
`CallInfo<T>` where the `T` will change depending on whether it's an
indirect or direct call.

* Update test expectations



# c0c3a68c 21-Aug-2024 Nick Fitzgerald <[email protected]>

Cranelift: Remove the old stack maps implementation (#9159)

They are superseded by the new user stack maps implementation.


Revision tags: v24.0.0, v23.0.2, v23.0.1, v23.0.0, v22.0.0
# b3636ff6 18-Jun-2024 Nick Fitzgerald <[email protected]>

Introduce the `cranelift-bitset` crate; use it for stack maps in both Cranelift and Wasmtime (#8826)

* Introduce the `cranelift-bitset` crate

The eventual goal is to deduplicate bitset types between Cranelift and Wasmtime,
especially their use in stack maps.

* Use the `cranelift-bitset` crate inside both Cranelift and Wasmtime

Mostly for stack maps, also for a variety of other random things where
`cranelift_codegen::bitset::BitSet` was previously used.

* Fix stack maps unit test in cranelift-codegen

* Uncomment `no_std` declaration

* Fix `CompoundBitSet::reserve` method

* Fix `CompoundBitSet::insert` method

* Keep track of the max in a `CompoundBitSet`

Makes a bunch of other stuff easier, and will be needed for replacing
`cranelift_entity::EntitySet`'s bitset with this thing anyways.

* Add missing parens

* Fix a bug around insert and reserve

* Implement `with_capacity` in terms of `new` and `reserve`

* Rename `reserve` to `ensure_capacity`
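
Several of the steps above (building `with_capacity` from `new` plus `ensure_capacity`, and tracking the maximum element) can be sketched as follows. This is an illustrative toy, not the `cranelift-bitset` implementation: the type `SketchBitSet`, its fields, and its exact method signatures are assumptions made for the example.

```rust
/// Hypothetical growable bitset; NOT the real `CompoundBitSet`.
#[derive(Default)]
struct SketchBitSet {
    words: Vec<u64>,
    /// Largest element inserted so far, if any.
    max: Option<usize>,
}

impl SketchBitSet {
    fn new() -> Self {
        Self::default()
    }

    /// Defined purely in terms of `new` and `ensure_capacity`.
    fn with_capacity(bits: usize) -> Self {
        let mut set = Self::new();
        set.ensure_capacity(bits);
        set
    }

    /// Grow the backing storage so that at least `bits` bits are addressable.
    fn ensure_capacity(&mut self, bits: usize) {
        let words = bits.div_ceil(64);
        if words > self.words.len() {
            self.words.resize(words, 0);
        }
    }

    /// Insert `i`, returning `true` if it was not already present.
    fn insert(&mut self, i: usize) -> bool {
        self.ensure_capacity(i + 1);
        let mask = 1u64 << (i % 64);
        let is_new = self.words[i / 64] & mask == 0;
        self.words[i / 64] |= mask;
        self.max = Some(self.max.map_or(i, |m| m.max(i)));
        is_new
    }

    fn contains(&self, i: usize) -> bool {
        let mask = 1u64 << (i % 64);
        self.words.get(i / 64).map_or(false, |&w| w & mask != 0)
    }

    fn max(&self) -> Option<usize> {
        self.max
    }
}
```

Having `insert` call `ensure_capacity` itself keeps the growth logic in one place, which is one plausible way to avoid the kind of insert/reserve mismatch the bullets mention.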



Revision tags: v21.0.1
# cacfaf8b 20-May-2024 Nick Fitzgerald <[email protected]>

Cranelift: Split out dominator tree's depth-first traversal into a reusable iterator (#8640)

We intend to use this when computing liveness of GC references in
`cranelift-frontend` to manually construct safepoints and ultimately remove
`r{32,64}` reference types from CLIF, `cranelift-codegen`, and `regalloc2`.

Co-authored-by: Trevor Elliott <[email protected]>



Revision tags: v21.0.0
# b869b66b 13-May-2024 Jamey Sharp <[email protected]>

cranelift: Delete redundant DCE optimization pass (#8227)

The egraph pass and the dead-code elimination pass both remove
instructions whose results are unused. If the optimization level is
"none", neither pass runs, and if it's anything else both passes run. I
don't think we should do this work twice.

Note that the DCE pass is different than the "eliminate unreachable
code" pass, which removes entire blocks that are unreachable from the
entry block. That pass might still be necessary.



Revision tags: v20.0.2, v20.0.1
# 2c409535 02-May-2024 Jamey Sharp <[email protected]>

cranelift: Compress vcode range-lists (#8506)

These lists of ranges always cover contiguous ranges of an index space,
meaning the start of one range is the same as the end of the previous
range, so we can cut storage in half by only storing one endpoint of
each range.

This in turn means we don't have to keep track of the other endpoint
while building these lists, reducing the state we need to keep while
building vcode and simplifying the various build steps.
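
The idea of storing only one endpoint per range can be sketched as below. This is a minimal illustration under assumptions: the type name `RangeList`, its methods, and the choice to start the index space at 0 are made up for the example and are not taken from the Cranelift source.

```rust
use std::ops::Range;

/// Contiguous ranges over an index space, stored as end points only.
/// Range `i` starts where range `i - 1` ended; the first range starts at 0.
#[derive(Default)]
struct RangeList {
    ends: Vec<u32>,
}

impl RangeList {
    /// Append a range that begins at the previous end and stops at `end`.
    /// The builder does not need to remember a separate start point.
    fn push_end(&mut self, end: u32) {
        debug_assert!(self.ends.last().map_or(true, |&last| last <= end));
        self.ends.push(end);
    }

    /// Number of ranges stored.
    fn len(&self) -> usize {
        self.ends.len()
    }

    /// Reconstruct range `index` from its own end and its predecessor's end.
    fn get(&self, index: usize) -> Range<u32> {
        let start = if index == 0 { 0 } else { self.ends[index - 1] };
        start..self.ends[index]
    }
}
```

Compared with a `Vec<Range<u32>>`, this stores one `u32` per range instead of two, and empty ranges are represented by repeating the previous end.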



# 132ef1e4 29-Apr-2024 Kirpal Grewal <[email protected]>

Fxhash to rustchash (#8498)

* move fx hash to workspace level dep

* change internal fxhash to use fxhash crate

* remove unneeded HashSet import

* change fxhash crate to rustc hash

* undo migration to rustc hash

* manually implement hash function from fxhash

* change to rustc hash


