History log of /llvm-project-15.0.7/llvm/lib/Target/ARM/ARMISelDAGToDAG.cpp (Results 1 – 25 of 688)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 4704da13 20-Jul-2022 David Green <[email protected]>

[ARM] Fix Thumb2 compare being emitted ExpandCMP_SWAP

Given a patch like D129506, using instructions not valid for the current
target feature set becomes an error. This fixes an issue in
ARMExpandPs

[ARM] Fix Thumb2 compare being emitted ExpandCMP_SWAP

Given a patch like D129506, using instructions not valid for the current
target feature set becomes an error. This fixes an issue in
ARMExpandPseudo::ExpandCMP_SWAP where Thumb2 compares were used in
Thumb1Only code, such as thumbv8m.baseline targets.

Differential Revision: https://reviews.llvm.org/D129695

show more ...


# cb806ce2 17-Jul-2022 David Green <[email protected]>

[ARM] Guard VMOVH and VINS patterns.

These instructions are only available when fp is available, so cannot be
used with just +mve. Add predicates to ensure we fall-back under the
right circumstances.


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5
# 07881861 03-Jun-2022 Guillaume Chatelet <[email protected]>

[Alignment][NFC] Remove usage of MemSDNode::getAlignment

I can't remove the function just yet as it is used in the generated .inc files.
I would also like to provide a way to compare alignment with

[Alignment][NFC] Remove usage of MemSDNode::getAlignment

I can't remove the function just yet as it is used in the generated .inc files.
I would also like to provide a way to compare alignment with TypeSize since it came up a few times.

Differential Revision: https://reviews.llvm.org/D126910

show more ...


Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2
# 440c4b70 21-Feb-2022 Craig Topper <[email protected]>

[SelectionDAG][RISCV][ARM][PowerPC][X86][WebAssembly] Change default abs expansion to use sra (X, size(X)-1); sub (xor (X, Y), Y).

Previous we used sra (X, size(X)-1); xor (add (X, Y), Y).

By placi

[SelectionDAG][RISCV][ARM][PowerPC][X86][WebAssembly] Change default abs expansion to use sra (X, size(X)-1); sub (xor (X, Y), Y).

Previous we used sra (X, size(X)-1); xor (add (X, Y), Y).

By placing sub at the end, we allow RISCV to combine sign_extend_inreg
with it to form subw.

Some X86 tests for Z - abs(X) seem to have improved as well.

Other targets look to be a wash.

I had to modify ARM's abs matching code to match from sub instead of
xor. Maybe instead ISD::ABS should be made legal. I'll try that in
parallel to this patch.

This is an alternative to D119099 which was focused on RISCV only.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D119171

show more ...


Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3
# d6b07348 19-Jan-2022 Jim Lin <[email protected]>

[NFC] Use Register instead of unsigned


Revision tags: llvmorg-13.0.1-rc2
# 2aed0813 07-Jan-2022 Kazu Hirata <[email protected]>

[llvm] Use true/false instead of 1/0 (NFC)

Identified with modernize-use-bool-literals.


# 69ccc961 01-Jan-2022 Kazu Hirata <[email protected]>

[llvm] Use the default constructor for SDValue (NFC)


# fbb61adb 25-Nov-2021 David Green <[email protected]>

[ARM] Convert fptoi.sat to fixed point multiply

This is a very small addition to the existing MVE fixed point vcvt code
to also create them from FP_TO_SINT_SAT and FP_TO_UINT_SAT nodes, which
should

[ARM] Convert fptoi.sat to fixed point multiply

This is a very small addition to the existing MVE fixed point vcvt code
to also create them from FP_TO_SINT_SAT and FP_TO_UINT_SAT nodes, which
should be equally valid for native saturating converts under MVE.

Differential Revision: https://reviews.llvm.org/D114360

show more ...


Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4
# 85b4b21c 21-Sep-2021 Kazu Hirata <[email protected]>

[llvm] Use make_early_inc_range (NFC)


Revision tags: llvmorg-13.0.0-rc3
# 9af8f1b1 09-Sep-2021 Craig Topper <[email protected]>

[SelectionDAG] Add isZero/isAllOnes methods to ConstantSDNode.

Soft deprecrate isNullValue/isAllOnesValue and update in tree
callers. This matches the changes to the APInt interface from
D109483.

R

[SelectionDAG] Add isZero/isAllOnes methods to ConstantSDNode.

Soft deprecrate isNullValue/isAllOnesValue and update in tree
callers. This matches the changes to the APInt interface from
D109483.

Reviewed By: lattner

Differential Revision: https://reviews.llvm.org/D109535

show more ...


# 9cb8f4d1 02-Sep-2021 David Green <[email protected]>

[ARM] Add a tail-predication loop predicate register

The semantics of tail predication loops means that the value of LR as an
instruction is executed determines the predicate. In other words:

mov r

[ARM] Add a tail-predication loop predicate register

The semantics of tail predication loops means that the value of LR as an
instruction is executed determines the predicate. In other words:

mov r3, #3
DLSTP lr, r3 // Start tail predication, lr==3
VADD.s32 q0, q1, q2 // Lanes 0,1 and 2 are updated in q0.
mov lr, #1
VADD.s32 q0, q1, q2 // Only first lane is updated.

This means that the value of lr cannot be spilled and re-used in tail
predication regions without potentially altering the behaviour of the
program. More lanes than required could be stored, for example, and in
the case of a gather those lanes might not have been setup, leading to
alignment exceptions.

This patch adds a new lr predicate operand to MVE instructions in order
to keep a reference to the lr that they use as a tail predicate. It will
usually hold the zeroreg meaning not predicated, being set to the LR phi
value in the MVETPAndVPTOptimisationsPass. This will prevent it from
being spilled anywhere that it needs to be used.

A lot of tests needed updating.

Differential Revision: https://reviews.llvm.org/D107638

show more ...


Revision tags: llvmorg-13.0.0-rc2
# 1e770f03 17-Aug-2021 Simon Pilgrim <[email protected]>

[ARM] ARMDAGToDAGISel::tryReadRegister/tryWriteRegister - don't dereference dyn_cast<> results.

dyn_cast<> can return nullptr if the cast is illegal, use cast<> instead which will assert that the ca

[ARM] ARMDAGToDAGISel::tryReadRegister/tryWriteRegister - don't dereference dyn_cast<> results.

dyn_cast<> can return nullptr if the cast is illegal, use cast<> instead which will assert that the cast is correct.

Fixes static analyser warnings.

show more ...


# 77e8f4ee 06-Aug-2021 David Green <[email protected]>

[ARM] Define ComplexPatternFuncMutatesDAG

Some of the Arm complex pattern functions call canExtractShiftFromMul,
which can modify the DAG in-place. For this to be valid and handled
successfully we n

[ARM] Define ComplexPatternFuncMutatesDAG

Some of the Arm complex pattern functions call canExtractShiftFromMul,
which can modify the DAG in-place. For this to be valid and handled
successfully we need to define ComplexPatternFuncMutatesDAG.

Differential Revision: https://reviews.llvm.org/D107476

show more ...


Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3
# 24d76419 21-Jun-2021 Sam Tebbs <[email protected]>

[ARM] Transform a floating-point to fixed-point conversion to a VCVT_fix

Much like fixed-point to floating-point conversion, the converse can
also be transformed into a fixed-point VCVT. This patch

[ARM] Transform a floating-point to fixed-point conversion to a VCVT_fix

Much like fixed-point to floating-point conversion, the converse can
also be transformed into a fixed-point VCVT. This patch transforms
multiplications of floating point numbers by 2^n into a VCVT_fix. The
exception is that a float to fixed conversion with 1 fractional bit
ends up being an FADD (FADD(x, x) emulates FMUL(x, 2)) rather than an FMUL so there is a special case for that. This patch also moves the code from https://reviews.llvm.org/D103903 into a separate function as fixed to float and float to fixed are very similar.

Differential Revision: https://reviews.llvm.org/D104793

show more ...


Revision tags: llvmorg-12.0.1-rc2
# bbe16b7a 07-Jun-2021 Sam Tebbs <[email protected]>

[ARM] Transform a fixed-point to floating-point conversion into a VCVT_fix

Conversion from a fixed-point number to a floating-point number is done by
multiplying the fixed-point number by 2^(-n) whe

[ARM] Transform a fixed-point to floating-point conversion into a VCVT_fix

Conversion from a fixed-point number to a floating-point number is done by
multiplying the fixed-point number by 2^(-n) where n is the number of
fractional bits. Currently this is lowered to a vcvt
(integer to floating-point) then a vmul, but it can instead be lowered
directly to a vcvt (fixed-point to floating-point). This patch enables
such transformations as long as the multiplication factor is a power of 2.

Differential Revision: https://reviews.llvm.org/D103903

show more ...


# 521d3732 20-Jun-2021 Fangrui Song <[email protected]>

Fix -Wunused-variable and -Wunused-but-set-variable in -DLLVM_ENABLE_ASSERTIONS=off build. NFC


# f6b9836b 02-Jun-2021 Kristina Bessonova <[email protected]>

[ARM][NEON] Combine base address updates for vld1Ndup intrinsics

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D103836


# 44843e2a 25-May-2021 Kristina Bessonova <[email protected]>

[ARM][NEON] Combine base address updates for vld1x intrinsics

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D102855


Revision tags: llvmorg-12.0.1-rc1
# d59a2a32 06-May-2021 Kristina Bessonova <[email protected]>

[ARM][NEON] Combine base address updates for vst1x intrinsics

Differential Revision: https://reviews.llvm.org/D102256


# 1011d4ed 13-May-2021 David Green <[email protected]>

[ARM] Constrain CMPZ shift combine to a single use

We currently prefer t2CMPrs over t2CMPri when the node contains a shift.
This can introduce more nodes if the shift has multiple uses though, as
va

[ARM] Constrain CMPZ shift combine to a single use

We currently prefer t2CMPrs over t2CMPri when the node contains a shift.
This can introduce more nodes if the shift has multiple uses though, as
value from the shift will be needed anyway, and in the case of a t2CMPri
compared with zero will more readily be removed entirely.

Differential Revision: https://reviews.llvm.org/D101688

show more ...


# 34c098b7 11-May-2021 Tomas Matheson <[email protected]>

[ARM] Prevent spilling between ldrex/strex pairs

Based on the same for AArch64: 4751cadcca45984d7671e594ce95aed8fe030bf1

At -O0, the fast register allocator may insert spills between the ldrex and

[ARM] Prevent spilling between ldrex/strex pairs

Based on the same for AArch64: 4751cadcca45984d7671e594ce95aed8fe030bf1

At -O0, the fast register allocator may insert spills between the ldrex and
strex instructions inserted by AtomicExpandPass when expanding atomicrmw
instructions in LL/SC loops. To avoid this, expand to cmpxchg loops and
therefore expand the cmpxchg pseudos after register allocation.

Required a tweak to ARMExpandPseudo::ExpandCMP_SWAP to use the 4-byte encoding
of UXT, since the pseudo instruction can be allocated a high register (R8-R15)
which the 2-byte encoding doesn't support. However, the 4-byte encodings
are not present for ARM v8-M Baseline. To enable this, two new pseudos are
added for Thumb which are only valid for v8mbase, tCMP_SWAP_8 and
tCMP_SWAP_16.

The previously committed attempt in D101164 had to be reverted due to runtime
failures in the test suites. Rather than spending time fixing that
implementation (adding another implementation of atomic operations and more
divergence between backends) I have chosen to follow the approach taken in
D101163.

Differential Revision: https://reviews.llvm.org/D101898

Depends on D101912

show more ...


# 9d86095f 03-May-2021 Tomas Matheson <[email protected]>

Revert "[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0"

This reverts commit 753185031d939711f8733639a77a6fdc3bdbad22.


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# 75318503 31-Mar-2021 Tomas Matheson <[email protected]>

[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0

atomicrmw instructions are expanded by AtomicExpandPass before register allocation
into cmpxchg loops. Register allocation can insert s

[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0

atomicrmw instructions are expanded by AtomicExpandPass before register allocation
into cmpxchg loops. Register allocation can insert spills between the exclusive loads
and stores, which invalidates the exclusive monitor and can lead to infinite loops.

To avoid this, reimplement atomicrmw operations as pseudo-instructions and expand them
after register allocation.

Floating point legalisation:
f16 ATOMIC_LOAD_FADD(*f16, f16) is legalised to
f32 ATOMIC_LOAD_FADD(*i16, f32) and then eventually
f32 ATOMIC_LOAD_FADD_16(*i16, f32)

Differential Revision: https://reviews.llvm.org/D101164

Originally submitted as 3338290c187b254ad071f4b9cbf2ddb2623cefc0.
Reverted in c7df6b1223d88dfd15248fbf7b7b83dacad22ae3.

show more ...


# d1bbe61d 03-May-2021 David Green <[email protected]>

[ARM] Memory operands for MVE gathers/scatters

Similarly to D101096, this makes sure that MMO operands get propagated
through from MVE gathers/scatters to the Machine Instructions. This
allows extra

[ARM] Memory operands for MVE gathers/scatters

Similarly to D101096, this makes sure that MMO operands get propagated
through from MVE gathers/scatters to the Machine Instructions. This
allows extra scheduling freedom, not forcing the instructions to act as
scheduling barriers. We create MMO's with an unknown size, specifying
that they can load from anywhere in memory, similar to the masked_gather
or X86 intrinsics.

Differential Revision: https://reviews.llvm.org/D101219

show more ...


# 15b5d1a5 02-May-2021 David Green <[email protected]>

[ARM] Transfer memory operands for VLDn

We create MMO's for the VLDn/VSTn intrinsics in ARMTargetLowering::
getTgtMemIntrinsic, but they do not currently make it ll the way through
ISel. This chang

[ARM] Transfer memory operands for VLDn

We create MMO's for the VLDn/VSTn intrinsics in ARMTargetLowering::
getTgtMemIntrinsic, but they do not currently make it ll the way through
ISel. This changes that in the various places it needs changing, making
sure that the MMO is propagate through to the final instruction. This
can help in scheduling, not treating the VLD2/VST2 as a scheduling
barrier.

Differential Revision: https://reviews.llvm.org/D101096

show more ...


12345678910>>...28