History log of /wasmtime-44.0.1/cranelift/codegen/src/egraph/cost.rs (Results 1 – 12 of 12)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: dev, v36.0.9, v44.0.1, v43.0.2, v36.0.8, v24.0.8, v44.0.0, v43.0.1, v42.0.2, v36.0.7, v24.0.7, v43.0.0, v42.0.1, v41.0.4, v42.0.0, v40.0.4, v36.0.6, v24.0.6, v41.0.3, v41.0.2, v41.0.1, v36.0.5, v40.0.3, v41.0.0, v36.0.4, v39.0.2, v40.0.2, v40.0.1
# 50431b45 06-Jan-2026 Nick Fitzgerald <[email protected]>

Cranelift: Do not track expression depth in egraph cost function (#12248)

We don't have any rules that try to rebalance trees to make them shallower in
the mid-end (we determined it was the wrong pl

Cranelift: Do not track expression depth in egraph cost function (#12248)

We don't have any rules that try to rebalance trees to make them shallower in
the mid-end (we determined it was the wrong place to do that kind of thing,
since we don't know register pressure at that point) so there is no need to
track expression depth in the cost function.

show more ...


# 0889323a 03-Jan-2026 SSD <[email protected]>

cranelift-codegen: rename most uses of std to core and alloc (#12237)

* rename most std uses to core and alloc

* cargo fmt


Revision tags: v40.0.0, v39.0.1, v39.0.0, v38.0.4, v37.0.3, v36.0.3, v24.0.5
# d5ef5285 07-Nov-2025 Nick Fitzgerald <[email protected]>

Cranelift: Tweak cost function to make `imul` more expensive (#12006)

* Cranelift: Tweak cost function to pessimize `select` opcodes

`select`s cannot be speculated through on some of our targets (e

Cranelift: Tweak cost function to make `imul` more expensive (#12006)

* Cranelift: Tweak cost function to pessimize `select` opcodes

`select`s cannot be speculated through on some of our targets (e.g. x64) so
strongly prefer not choosing them.

Using a target-specific cost function, so that we only pessimize `select` when
it makes sense, is left for follow up work and is tracked in #12005.

* Don't pessimize `select`s that heavily

show more ...


Revision tags: v38.0.3, v38.0.2, v38.0.1, v37.0.2, v37.0.1, v37.0.0, v36.0.2, v36.0.1, v36.0.0, v35.0.0, v24.0.4, v33.0.2, v34.0.2, v34.0.1, v33.0.1, v24.0.3, v32.0.1, v34.0.0, v33.0.0, v32.0.0
# 7bf31723 08-Apr-2025 Nick Fitzgerald <[email protected]>

Cranelift: simplify some side-effectful instructions in ISLE (#10524)

* Cranelift: simplify some side-effectful instructions in ISLE

This commit adds a new top-level ISLE entrypoint specifically fo

Cranelift: simplify some side-effectful instructions in ISLE (#10524)

* Cranelift: simplify some side-effectful instructions in ISLE

This commit adds a new top-level ISLE entrypoint specifically for instructions
in the side-effectful skeleton: `simplify_skeleton`. While these rewrites are
processed during the egraph pass, values from skeleton instructions still do not
get inserted into the egraph. Indeed, `simplify_skeleton` operates
on *instructions* rather than *values* because we do not represent side effects
as values; values do not have side effects in CLIF, instructions do. Therefore,
rather than doing a whole dynamic-programming style extraction of the best
candidate simplification like we do with the egraph, we take an eager and greedy
approach.

Furthermore, `simplify_skeleton` is limited only to skeleton instructions that
do not involve control-flow or terminators right now. This is because changing
the control-flow graph can change whether a use is dominated by a def or not,
and we do not currently have the machinery to track and fix up invalidated
uses. Addressing this is left for future commits.

* fix `MIN / -1` cprop and add negative tests for things simplify_skeleton cannot handle yet

show more ...


Revision tags: v31.0.0
# 1104a838 17-Mar-2025 Chris Fallin <[email protected]>

Cranelift: avoid integer-shift optimization with invalid reduces on vector types. (#10413)

An optimization rule in our mid-end implemented the following
equivalence:

```
(x << N) >> N == x as T

Cranelift: avoid integer-shift optimization with invalid reduces on vector types. (#10413)

An optimization rule in our mid-end implemented the following
equivalence:

```
(x << N) >> N == x as T1 as T2
```

where `T1` is a smaller integer type and `T2` is a larger integer
type, and `N` is the difference in bitwidths between them. In other
words, when a left-then-right-shift clears the upper bits (or
sign-extends them), we can implement that by reducing to a smaller
type then extending back to the original type.

Unfortunately, `ireduce` is not valid on vector-of-integer types. A
fuzzbug reported in #10409 shows that this pattern results in invalid
CLIF when optimizing this pattern with an `i64x2` as input.

This PR tightens the rule's LHS to only apply to integer types.

Note that there is another optimization rule that can also implement
this behavior with a bitwise AND with a mask. Previously the two
resulting expressions had the same cost in the egraph, and whatever
perturbation occurred in the ISLE compilation of the rule (which is a
multi-constructor because this is the mid-end, so is returning
multiple results) caused that rule's result to be picked instead
during extraction. I've also tweaked the costs to make reduces/extends
cheaper than ordinary arithmetic as a result, to keep the same
optimizer outputs after extraction. One x64 test in particular showed
why this is correct: the mask-with-AND resulted in two
instructions (move, AND-with-immediate) while a uextend can be done
with one (zero-extending move, `movzbl`), and aarch64 has similar
builtin extends in many cases.

show more ...


Revision tags: v30.0.2, v30.0.1, v30.0.0, v29.0.1, v29.0.0, v28.0.1, v28.0.0, v27.0.0, v26.0.1, v25.0.3, v24.0.2, v26.0.0, v21.0.2, v22.0.1, v23.0.3, v25.0.2, v24.0.1, v25.0.1, v25.0.0, v24.0.0, v23.0.2, v23.0.1, v23.0.0, v22.0.0, v21.0.1, v21.0.0, v20.0.2, v20.0.1, v20.0.0, v17.0.3, v19.0.2, v18.0.4, v19.0.1, v19.0.0, v18.0.3, v18.0.2, v17.0.2, v18.0.1, v18.0.0, v17.0.1
# 5b2ae836 05-Feb-2024 Nick Fitzgerald <[email protected]>

Cranelift: Use a fixpoint loop to compute the best value for each eclass (#7859)

* Cranelift: Use a fixpoint loop to compute the best value for each eclass

Fixes #7857

* Remove fixpoint loop early

Cranelift: Use a fixpoint loop to compute the best value for each eclass (#7859)

* Cranelift: Use a fixpoint loop to compute the best value for each eclass

Fixes #7857

* Remove fixpoint loop early-continue optimization

* Add document describing optimization rule invariants

* Make select optimizations use subsume

* Remove invalid debug assert

* Remove now-unused methods

* Add commutative adds to cost tests

show more ...


Revision tags: v17.0.0, v16.0.0, v15.0.1, v15.0.0
# b9f2a306 07-Nov-2023 Nick Fitzgerald <[email protected]>

Cranelift: Break op cost ties with expression depth in egraphs (#7456)

* Cranelift: Switch egraph `Cost` to a struct with named fields

Mechanical change.

* Cranelift: Break op cost ties with expre

Cranelift: Break op cost ties with expression depth in egraphs (#7456)

* Cranelift: Switch egraph `Cost` to a struct with named fields

Mechanical change.

* Cranelift: Break op cost ties with expression depth in egraphs

This means that, when the opcode cost is the same, we prefer shallow and wide
expressions to narrow and deep. For example, `(a + b) + (c + d)` is preferred to
`((a + b) + c) + d`. This is beneficial because it exposes more
instruction-level parallelism and shortens live ranges.

Co-Authored-By: Trevor Elliott <[email protected]>

* Cranelift: Bitpack the egraph `Cost` structure

Co-Authored-By: Chris Fallin <[email protected]>
Co-Authored-By: Trevor Elliott <[email protected]>

* Make it so you can't construct `Cost::inifinity()` by accident

* Use fold to code golf

---------

Co-authored-by: Trevor Elliott <[email protected]>
Co-authored-by: Chris Fallin <[email protected]>

show more ...


# e39c6b76 07-Nov-2023 Nick Fitzgerald <[email protected]>

Cranelift: Fix union node bitpacking (#7465)

* Cranelift: Fix union node bitpacking

It turns out we have just been taking the newest rewrite's value for a eclass
union and never actually comparing

Cranelift: Fix union node bitpacking (#7465)

* Cranelift: Fix union node bitpacking

It turns out we have just been taking the newest rewrite's value for a eclass
union and never actually comparing costs and taking the value with the minimum
cost. Whoops!

Fixing this made some test expectations fail, which we resolved by tweaking the
cost function to give materializing constants nonzero cost. This way we prefer
`-x` to `0 - x`.

We also made elaboration function break ties between values with the same cost
with the value index. It prefers larger value indices, since the original
value's index will be lower than all of its rewritten values' indices. This
heuristically prefers rewritten values because we hope our rewrites are all
improvements even when the cost function can't show that.

Co-Authored-By: Chris Fallin <[email protected]>
Co-Authored-By: Trevor Elliott <[email protected]>

* Add more information to assertion message

* Fix off-by-one bug in assertion

* Limit number of matches consumed from ISLE

We generally want to clamp down and avoid runaway behavior here.

But there also seems to be some sort of rustc/llvm bug on Rust 1.71 that is
causing iteration to wild here. This commit avoids that bug.

* Update test expectation

* prtest:full

---------

Co-authored-by: Chris Fallin <[email protected]>
Co-authored-by: Trevor Elliott <[email protected]>

show more ...


Revision tags: v14.0.4, v14.0.3, v14.0.2, v13.0.1, v14.0.1, v14.0.0
# 04fcb6a1 19-Oct-2023 Trevor Elliott <[email protected]>

A couple of small refactorings to the egraph elaboration pass (#7304)

@jameysharp and I noticed a couple of refactoring opportunities while
reading through the elaboration pass:

* The elaboration l

A couple of small refactorings to the egraph elaboration pass (#7304)

@jameysharp and I noticed a couple of refactoring opportunities while
reading through the elaboration pass:

* The elaboration loop doesn't need to match on the top of the stack as
a reference, because each case pops it immediately. Instead we can pop
and match on the popped value.
* Computing the cost of a `Result` value that's not in the DFG was using
the block that contains the instruction to determine the level, but
since the instruction is already known to not be in the DFG, this
would default to `LoopLevel::root()` unconditionally. This also meant
that the `Cost::at_level` function turned into an identity function on
the cost given, making it unnecessary.

Co-authored-by: Jamey Sharp <[email protected]>

show more ...


Revision tags: minimum-viable-wasi-proxy-serve, v13.0.0, v12.0.2, v11.0.2, v10.0.2, v12.0.1, v12.0.0, v11.0.1, v11.0.0, v10.0.1, v10.0.0, v9.0.4, v9.0.3, v9.0.2, v9.0.1, v9.0.0, v6.0.2, v7.0.1, v8.0.1, v8.0.0, v7.0.0, v6.0.1, v5.0.1, v4.0.1, v6.0.0
# de0e0bea 06-Feb-2023 Alex Crichton <[email protected]>

Legalize `b{and,or,xor}_not` into component instructions (#5709)

* Remove trailing whitespace in `lower.isle` files

* Legalize the `band_not` instruction into simpler form

This commit legalize

Legalize `b{and,or,xor}_not` into component instructions (#5709)

* Remove trailing whitespace in `lower.isle` files

* Legalize the `band_not` instruction into simpler form

This commit legalizes the `band_not` instruction into `band`-of-`bnot`,
or two instructions. This is intended to assist with egraph-based
optimizations where the `band_not` instruction doesn't have to be
specifically included in other bit-operation-patterns.

Lowerings of the `band_not` instruction have been moved to a
specialization of the `band` instruction.

* Legalize `bor_not` into components

Same as prior commit, but for the `bor_not` instruction.

* Legalize bxor_not into bxor-of-bnot

Same as prior commits. I think this also ended up fixing a bug in the
s390x backend where `bxor_not x y` was actually translated as `bnot
(bxor x y)` by accident given the test update changes.

* Simplify not-fused operands for riscv64

Looks like some delegated-to rules have special-cases for "if this
feature is enabled use the fused instruction" so move the clause for
testing the feature up to the lowering phase to help trigger other rules
if the feature isn't enabled. This should make the riscv64 backend more
consistent with how other backends are implemented.

* Remove B{and,or,xor}Not from cost of egraph metrics

These shouldn't ever reach egraphs now that they're legalized away.

* Add an egraph optimization for `x^-1 => ~x`

This adds a simplification node to translate xor-against-minus-1 to a
`bnot` instruction. This helps trigger various other optimizations in
the egraph implementation and also various backend lowering rules for
instructions. This is chiefly useful as wasm doesn't have a `bnot`
equivalent, so it's encoded as `x^-1`.

* Add a wasm test for end-to-end bitwise lowerings

Test that end-to-end various optimizations are being applied for input
wasm modules.

* Specifically don't self-update rustup on CI

I forget why this was here originally, but this is failing on Windows
CI. In general there's no need to update rustup, so leave it as-is.

* Cleanup some aarch64 lowering rules

Previously a 32/64 split was necessary due to the `ALUOp` being different
but that's been refactored away no so there's no longer any need for
duplicate rules.

* Narrow a x64 lowering rule

This previously made more sense when it was `band_not` and rarely used,
but be more specific in the type-filter on this rule that it's only
applicable to SIMD types with lanes.

* Simplify xor-against-minus-1 rule

No need to have the commutative version since constants are already
shuffled right for egraphs

* Optimize band-of-bnot when bnot is on the left

Use some more rules in the egraph algebraic optimizations to
canonicalize band/bor/bxor with a `bnot` operand to put the operand on
the right. That way the lowerings in the backends only have to list the
rule once, with the operand on the right, to optimize both styles of
input.

* Add commutative lowering rules

* Update cranelift/codegen/src/isa/x64/lower.isle

Co-authored-by: Jamey Sharp <[email protected]>

---------

Co-authored-by: Jamey Sharp <[email protected]>

show more ...


# ffbcc67e 28-Jan-2023 Nick Fitzgerald <[email protected]>

Cranelift: Consider shifts as "simple" arithmetic in egraph cost model (#5646)


Revision tags: v5.0.0, v4.0.0
# f980defe 06-Dec-2022 Chris Fallin <[email protected]>

egraph support: rewrite to work in terms of CLIF data structures. (#5382)

* egraph support: rewrite to work in terms of CLIF data structures.

This work rewrites the "egraph"-based optimization f

egraph support: rewrite to work in terms of CLIF data structures. (#5382)

* egraph support: rewrite to work in terms of CLIF data structures.

This work rewrites the "egraph"-based optimization framework in
Cranelift to operate on aegraphs (acyclic egraphs) represented in the
CLIF itself rather than as a separate data structure to which and from
which we translate the CLIF.

The basic idea is to add a new kind of value, a "union", that is like an
alias but refers to two other values rather than one. This allows us to
represent an eclass of enodes (values) as a tree. The union node allows
for a value to have *multiple representations*: either constituent value
could be used, and (in well-formed CLIF produced by correct
optimization rules) they must be equivalent.

Like the old egraph infrastructure, we take advantage of acyclicity and
eager rule application to do optimization in a single pass. Like before,
we integrate GVN (during the optimization pass) and LICM (during
elaboration).

Unlike the old egraph infrastructure, everything stays in the
DataFlowGraph. "Pure" enodes are represented as instructions that have
values attached, but that are not placed into the function layout. When
entering "egraph" form, we remove them from the layout while optimizing.
When leaving "egraph" form, during elaboration, we can place an
instruction back into the layout the first time we elaborate the enode;
if we elaborate it more than once, we clone the instruction.

The implementation performs two passes overall:

- One, a forward pass in RPO (to see defs before uses), that (i) removes
"pure" instructions from the layout and (ii) optimizes as it goes. As
before, we eagerly optimize, so we form the entire union of optimized
forms of a value before we see any uses of that value. This lets us
rewrite uses to use the most "up-to-date" form of the value and
canonicalize and optimize that form.

The eager rewriting and acyclic representation make each other work
(we could not eagerly rewrite if there were cycles; and acyclicity
does not miss optimization opportunities only because the first time
we introduce a value, we immediately produce its "best" form). This
design choice is also what allows us to avoid the "parent pointers"
and fixpoint loop of traditional egraphs.

This forward optimization pass keeps a scoped hashmap to "intern"
nodes (thus performing GVN), and also interleaves on a per-instruction
level with alias analysis. The interleaving with alias analysis allows
alias analysis to see the most optimized form of each address (so it
can see equivalences), and allows the next value to see any
equivalences (reuses of loads or stored values) that alias analysis
uncovers.

- Two, a forward pass in domtree preorder, that "elaborates" pure enodes
back into the layout, possibly in multiple places if needed. This
tracks the loop nest and hoists nodes as needed, performing LICM as it
goes. Note that by doing this in forward order, we avoid the
"fixpoint" that traditional LICM needs: we hoist a def before its
uses, so when we place a node, we place it in the right place the
first time rather than moving later.

This PR replaces the old (a)egraph implementation. It removes both the
cranelift-egraph crate and the logic in cranelift-codegen that uses it.

On `spidermonkey.wasm` running a simple recursive Fibonacci
microbenchmark, this work shows 5.5% compile-time reduction and 7.7%
runtime improvement (speedup).

Most of this implementation was done in (very productive) pair
programming sessions with Jamey Sharp, thus:

Co-authored-by: Jamey Sharp <[email protected]>

* Review feedback.

* Review feedback.

* Review feedback.

* Bugfix: cprop rule: `(x + k1) - k2` becomes `x - (k2 - k1)`, not `x - (k1 - k2)`.

Co-authored-by: Jamey Sharp <[email protected]>

show more ...