History log of /llvm-project-15.0.7/clang/utils/TableGen/MveEmitter.cpp (Results 1 – 25 of 39)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# b0270f6e 18-Mar-2022 Arthur Eubanks <[email protected]>

[clang] Remove Address::deprecated from MveEmitter

We have to keep track of pointer pointee types with opaque pointers.

Reviewed By: simon_tatham

Differential Revision: https://reviews.llvm.org/D1

[clang] Remove Address::deprecated from MveEmitter

We have to keep track of pointer pointee types with opaque pointers.

Reviewed By: simon_tatham

Differential Revision: https://reviews.llvm.org/D122046

show more ...


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2
# 50650766 16-Feb-2022 Nikita Popov <[email protected]>

[CodeGen] Rename deprecated Address constructor

To make uses of the deprecated constructor easier to spot, and to
ensure that no new uses are introduced, rename it to
Address::deprecated().

While d

[CodeGen] Rename deprecated Address constructor

To make uses of the deprecated constructor easier to spot, and to
ensure that no new uses are introduced, rename it to
Address::deprecated().

While doing the rename, I've filled in element types in cases
where it was relatively obvious, but we're still left with 135
calls to the deprecated constructor.

show more ...


Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3
# cb7f806a 13-Jan-2022 Kazu Hirata <[email protected]>

[clang] Remove redundant member initialization (NFC)

Identified with readability-redundant-member-init.


Revision tags: llvmorg-13.0.1-rc2
# 7485e6c7 10-Jan-2022 Kazu Hirata <[email protected]>

Revert "[clang] Remove redundant member initialization (NFC)"

This reverts commit 80e2c587498a7b2bf14dde47a33a058da6e88a9a.

The original patch causes a lot of warnings on gcc like:

llvm-project/

Revert "[clang] Remove redundant member initialization (NFC)"

This reverts commit 80e2c587498a7b2bf14dde47a33a058da6e88a9a.

The original patch causes a lot of warnings on gcc like:

llvm-project/clang/include/clang/Basic/Diagnostic.h:1329:3: warning:
base class ‘class clang::StreamingDiagnostic’ should be explicitly
initialized in the copy constructor [-Wextra]

show more ...


# 80e2c587 09-Jan-2022 Kazu Hirata <[email protected]>

[clang] Remove redundant member initialization (NFC)

Identified with readability-redundant-member-init.


# ab0c5cea 03-Dec-2021 David Green <[email protected]>

[ARM] Use v2i1 for MVE and CDE intrinsics

This adjusts all the MVE and CDE intrinsics now that v2i1 is a legal
type, to use a <2 x i1> as opposed to emulating the predicate with a
<4 x i1>. The v4i1

[ARM] Use v2i1 for MVE and CDE intrinsics

This adjusts all the MVE and CDE intrinsics now that v2i1 is a legal
type, to use a <2 x i1> as opposed to emulating the predicate with a
<4 x i1>. The v4i1 workarounds have been removed leaving the natural
v2i1 types, notably in vctp64 which now generates a v2i1 type.

AutoUpgrade code has been added to upgrade old IR, which needs to
convert the old v4i1 to a v2i1 be converting it back and forth to an
integer with arm.mve.v2i and arm.mve.i2v intrinsics. These should be
optimized away in the final assembly.

Differential Revision: https://reviews.llvm.org/D114455

show more ...


Revision tags: llvmorg-13.0.1-rc1
# c294715e 15-Oct-2021 Craig Topper <[email protected]>

[ARM] Don't use TARGET_HEADER_BUILTIN in arm_mve_builtins.inc or arm_cde_builtins.inc

The attributes string doesn't include 'f' or 'h'. I don't think
any code looks at the header name without those.

[ARM] Don't use TARGET_HEADER_BUILTIN in arm_mve_builtins.inc or arm_cde_builtins.inc

The attributes string doesn't include 'f' or 'h'. I don't think
any code looks at the header name without those.

Reviewed By: simon_tatham

Differential Revision: https://reviews.llvm.org/D111755

show more ...


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3
# a9fc44c5 25-Feb-2021 Paul C. Anagnostopoulos <[email protected]>

[TableGen] Improve handling of template arguments

This requires changes to TableGen files and some C++ files due to
incompatible multiclass template arguments that slipped through
before the improve

[TableGen] Improve handling of template arguments

This requires changes to TableGen files and some C++ files due to
incompatible multiclass template arguments that slipped through
before the improved handling.

show more ...


Revision tags: llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2
# 79689817 01-Jun-2020 Christopher Tetreault <[email protected]>

[SVE] Eliminate calls to default-false VectorType::get() from Clang

Reviewers: efriedma, david-arm, fpetrogalli, ddunbar, rjmccall

Reviewed By: fpetrogalli, rjmccall

Subscribers: tschuett, rkruppe

[SVE] Eliminate calls to default-false VectorType::get() from Clang

Reviewers: efriedma, david-arm, fpetrogalli, ddunbar, rjmccall

Reviewed By: fpetrogalli, rjmccall

Subscribers: tschuett, rkruppe, psnobl, dmgreen, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D80323

show more ...


Revision tags: llvmorg-10.0.1-rc1
# 6e0afb5f 29-Mar-2020 Benjamin Kramer <[email protected]>

[ARMMVE] Create fewer temporary SmallVectors

Shrinks clang by 40k.


Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6
# 969034b8 20-Mar-2020 Mikhail Maltsev <[email protected]>

[ARM,CDE] Implement CDE unpredicated Q-register intrinsics

Summary:
This patch implements the following intrinsics:

uint8x16_t __arm_vcx1q_u8 (int coproc, uint32_t imm);
T __arm_vcx1qa(int copr

[ARM,CDE] Implement CDE unpredicated Q-register intrinsics

Summary:
This patch implements the following intrinsics:

uint8x16_t __arm_vcx1q_u8 (int coproc, uint32_t imm);
T __arm_vcx1qa(int coproc, T acc, uint32_t imm);
T __arm_vcx2q(int coproc, T n, uint32_t imm);
uint8x16_t __arm_vcx2q_u8(int coproc, T n, uint32_t imm);
T __arm_vcx2qa(int coproc, T acc, U n, uint32_t imm);
T __arm_vcx3q(int coproc, T n, U m, uint32_t imm);
uint8x16_t __arm_vcx3q_u8(int coproc, T n, U m, uint32_t imm);
T __arm_vcx3qa(int coproc, T acc, U n, V m, uint32_t imm);

Most of them are polymorphic. Furthermore, some intrinsics are
polymorphic by 2 or 3 parameter types, such polymorphism is not
supported by the existing MVE/CDE tablegen backends, also we don't
really want to have a combinatorial explosion caused by 1000 different
combinations of 3 vector types. Because of this some intrinsics are
implemented as macros involving a cast of the polymorphic arguments to
uint8x16_t.

The IR intrinsics are even more restricted in terms of types: all MVE
vectors are cast to v16i8.

Reviewers: simon_tatham, MarkMurrayARM, dmgreen, ostannard

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76299

show more ...


# d22e6617 20-Mar-2020 Mikhail Maltsev <[email protected]>

[ARM,CDE] Implement CDE S and D-register intrinsics

Summary:
This patch implements the following ACLE intrinsics:

uint32_t __arm_vcx1_u32(int coproc, uint32_t imm);
uint32_t __arm_vcx1a_u32(int

[ARM,CDE] Implement CDE S and D-register intrinsics

Summary:
This patch implements the following ACLE intrinsics:

uint32_t __arm_vcx1_u32(int coproc, uint32_t imm);
uint32_t __arm_vcx1a_u32(int coproc, uint32_t acc, uint32_t imm);
uint32_t __arm_vcx2_u32(int coproc, uint32_t n, uint32_t imm);
uint32_t __arm_vcx2a_u32(int coproc, uint32_t acc, uint32_t n, uint32_t imm);
uint32_t __arm_vcx3_u32(int coproc, uint32_t n, uint32_t m, uint32_t imm);
uint32_t __arm_vcx3a_u32(int coproc, uint32_t acc, uint32_t n, uint32_t m, uint32_t imm);

uint64_t __arm_vcx1d_u64(int coproc, uint32_t imm);
uint64_t __arm_vcx1da_u64(int coproc, uint64_t acc, uint32_t imm);
uint64_t __arm_vcx2d_u64(int coproc, uint64_t m, uint32_t imm);
uint64_t __arm_vcx2da_u64(int coproc, uint64_t acc, uint64_t m, uint32_t imm);
uint64_t __arm_vcx3d_u64(int coproc, uint64_t n, uint64_t m, uint32_t imm);
uint64_t __arm_vcx3da_u64(int coproc, uint64_t acc, uint64_t n, uint64_t m, uint32_t imm);

Since the semantics of CDE instructions is opaque to the compiler, the
ACLE intrinsics require dedicated LLVM IR intrinsics. The 64-bit and
32-bit variants share the same IR intrinsic.

Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76298

show more ...


Revision tags: llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4
# d608fee8 12-Mar-2020 Simon Tatham <[email protected]>

[ARM,MVE] Fix user-namespace violation in arm_mve.h.

Summary:
We were generating the declarations of polymorphic intrinsics using
`__attribute__((overloadable))`. But `overloadable` is a valid
ident

[ARM,MVE] Fix user-namespace violation in arm_mve.h.

Summary:
We were generating the declarations of polymorphic intrinsics using
`__attribute__((overloadable))`. But `overloadable` is a valid
identifier for an end user to define as a macro in a C program, and if
they do that before including `<arm_mve.h>`, then we shouldn't cause a
compile error.

Fixed to spell the attribute name `__overloadable__` instead.

Reviewers: miyuki, MarkMurrayARM, ostannard

Reviewed By: miyuki

Subscribers: kristof.beyls, dmgreen, danielkiss, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75997

show more ...


# 47edf5ba 10-Mar-2020 Mikhail Maltsev <[email protected]>

[ARM,CDE] Generalize MVE intrinsics infrastructure to support CDE

Summary:
This patch generalizes the existing code to support CDE intrinsics
which will share some properties with existing MVE intri

[ARM,CDE] Generalize MVE intrinsics infrastructure to support CDE

Summary:
This patch generalizes the existing code to support CDE intrinsics
which will share some properties with existing MVE intrinsics
(some of the intrinsics will be polymorphic and accept/return values
of MVE vector types).
Specifically the patch:
* Adds new tablegen backends -gen-arm-cde-builtin-def,
-gen-arm-cde-builtin-codegen, -gen-arm-cde-builtin-sema,
-gen-arm-cde-builtin-aliases, -gen-arm-cde-builtin-header based on
existing MVE backends.
* Renames the '__clang_arm_mve_alias' attribute into
'__clang_arm_builtin_alias' (it will be used with CDE intrinsics as
well as MVE intrinsics)
* Implements semantic checks for the coprocessor argument of the CDE
intrinsics as well as the existing coprocessor intrinsics.
* Adds one CDE intrinsic __arm_cx1 to test the above changes

Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen

Reviewed By: simon_tatham

Subscribers: sdesmalen, mgorny, kristof.beyls, danielkiss, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75850

show more ...


Revision tags: llvmorg-10.0.0-rc3
# af450eab 29-Feb-2020 Reid Kleckner <[email protected]>

Avoid including FileSystem.h from MemoryBuffer.h

Lots of headers pass around MemoryBuffer objects, but very few open
them. Let those that do include FileSystem.h.

Saves ~250 includes of Chrono.h &

Avoid including FileSystem.h from MemoryBuffer.h

Lots of headers pass around MemoryBuffer objects, but very few open
them. Let those that do include FileSystem.h.

Saves ~250 includes of Chrono.h & FileSystem.h:

$ diff -u thedeps-before.txt thedeps-after.txt | grep '^[-+] ' | sort | uniq -c | sort -nr
254 - ../llvm/include/llvm/Support/FileSystem.h
253 - ../llvm/include/llvm/Support/Chrono.h
237 - ../llvm/include/llvm/Support/NativeFormatting.h
237 - ../llvm/include/llvm/Support/FormatProviders.h
192 - ../llvm/include/llvm/ADT/StringSwitch.h
190 - ../llvm/include/llvm/Support/FormatVariadicDetails.h
...

This requires duplicating the file_t typedef, which is unfortunate. I
sunk the choice of mapping mode down into the cpp file using variable
template specializations instead of class members in headers.

show more ...


# c32af444 17-Feb-2020 Simon Tatham <[email protected]>

[ARM,MVE] Add the vmovnbq,vmovntq intrinsic family.

Summary:
These are in some sense the inverse of vmovl[bt]q: they take a vector
of n wide elements and truncate each to half its width. So they onl

[ARM,MVE] Add the vmovnbq,vmovntq intrinsic family.

Summary:
These are in some sense the inverse of vmovl[bt]q: they take a vector
of n wide elements and truncate each to half its width. So they only
write half a vector's worth of output data, and therefore they also
take an 'inactive' parameter to provide the other half of the data in
the output vector. So vmovnb overwrites the even lanes of 'inactive'
with the narrowed values from the main input, and vmovnt overwrites
the odd lanes.

LLVM had existing codegen which generates these MVE instructions in
response to IR that takes two vectors of wide elements, or two vectors
of narrow ones. But in this case, we have one vector of each. So my
clang codegen strategy is to narrow the input vector of wide elements
by simply reinterpreting it as the output type, and then we have two
narrow vectors and can represent the operation as a vector shuffle
that interleaves lanes from both of them.

Even so, not all the cases I needed ended up being selected as a
single MVE instruction, so I've added a couple more patterns that spot
combinations of the 'MVEvmovn' and 'ARMvrev32' SDNodes which can be
generated as a VMOVN instruction with operands swapped.

This commit adds the unpredicated forms only.

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74337

show more ...


Revision tags: llvmorg-10.0.0-rc2
# f8d4afc4 31-Jan-2020 Simon Tatham <[email protected]>

[ARM,MVE] Add intrinsics for v[id]dupq and v[id]wdupq.

Summary:
These instructions generate a vector of consecutive elements starting
from a given base value and incrementing by 1, 2, 4 or 8. The `w

[ARM,MVE] Add intrinsics for v[id]dupq and v[id]wdupq.

Summary:
These instructions generate a vector of consecutive elements starting
from a given base value and incrementing by 1, 2, 4 or 8. The `wdup`
versions also wrap the values back to zero when they reach a given
limit value. The instruction updates the scalar base register so that
another use of the same instruction will continue the sequence from
where the previous one left off.

At the IR level, I've represented these instructions as a family of
target-specific intrinsics with two return values (the constructed
vector and the updated base). The user-facing ACLE API provides a set
of intrinsics that throw away the written-back base and another set
that receive it as a pointer so they can update it, plus the usual
predicated versions.

Because the intrinsics return two values (as do the underlying
instructions), the isel has to be done in C++.

This is the first family of MVE intrinsics that use the `imm_1248`
immediate type in the clang Tablegen framework, so naturally, I found
I'd given it the wrong C integer type. Also added some tests of the
check that the immediate has a legal value, because this is the first
time those particular checks have been exercised.

Finally, I also had to fix a bug in MveEmitter which failed an
assertion when I nested two `seq` nodes (the inner one used to extract
the two values from the pair returned by the IR intrinsic, and the
outer one put on by the predication multiclass).

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73357

show more ...


Revision tags: llvmorg-10.0.0-rc1
# adcd0268 28-Jan-2020 Benjamin Kramer <[email protected]>

Make llvm::StringRef to std::string conversions explicit.

This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly m

Make llvm::StringRef to std::string conversions explicit.

This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.

show more ...


# 98ea4b30 23-Jan-2020 Simon Tatham <[email protected]>

[ARM,MVE] Make the MVE intrinsics work in C++!

Summary:
Apparently nobody has tried this in months of development. It turns
out that `FunctionDecl::getBuiltinID` will never consider a function
to be

[ARM,MVE] Make the MVE intrinsics work in C++!

Summary:
Apparently nobody has tried this in months of development. It turns
out that `FunctionDecl::getBuiltinID` will never consider a function
to be a builtin if it is in C++ and not extern "C". So none of the
function declarations in <arm_mve.h> are recognized as builtins when
clang is compiling in C++ mode: it just emits calls to them as
ordinary functions, which then turn out not to exist at link time.

The trivial fix is to wrap most of arm_mve.h in an extern "C".

Added a test in clang/test/CodeGen/arm-mve-intrinsics which checks
basic functioning of the MVE header file in C++ mode. I've filled it
with copies of existing test functions from other files in that
directory, including a few moderately tricky cases of overloading (in
particular one that relies on the strict-polymorphism attribute added
in D72518).

(I considered making //every// test in that directory compile in both
C and C++ mode and check the code generation was identical. But I
think that would increase testing time by more than the value it adds,
and also update_cc_test_checks gets confused when the output function
name varies between RUN lines.)

Reviewers: LukeGeeson, MarkMurrayARM, miyuki, dmgreen

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73268

show more ...


# 4321c6af 23-Jan-2020 Simon Tatham <[email protected]>

[ARM,MVE] Support immediate vbicq,vorrq,vmvnq intrinsics.

Summary:
Immediate vmvnq is code-generated as a simple vector constant in IR,
and left to the backend to recognize that it can be created wi

[ARM,MVE] Support immediate vbicq,vorrq,vmvnq intrinsics.

Summary:
Immediate vmvnq is code-generated as a simple vector constant in IR,
and left to the backend to recognize that it can be created with an
MVE VMVN instruction. The predicated version is represented as a
select between the input and the same constant, and I've added a
Tablegen isel rule to turn that into a predicated VMVN. (That should
be better than the previous VMVN + VPSEL: it's the same number of
instructions but now it can fold into an adjacent VPT block.)

The unpredicated forms of VBIC and VORR are done by enabling the same
isel lowering as for NEON, recognizing appropriate immediates and
rewriting them as ARMISD::VBICIMM / ARMISD::VORRIMM SDNodes, which I
then instruction-select into the right MVE instructions (now that I've
also reworked those instructions to use the same MC operand encoding).
In order to do that, I had to promote the Tablegen SDNode instance
`NEONvorrImm` to a general `ARMvorrImm` available in MVE as well, and
similarly for `NEONvbicImm`.

The predicated forms of VBIC and VORR are represented as a vector
select between the original input vector and the output of the
unpredicated operation. The main convenience of this is that it still
lets me use the existing isel lowering for VBICIMM/VORRIMM, and not
have to write another copy of the operand encoding translation code.

This intrinsic family is the first to use the `imm_simd` system I put
into the MveEmitter tablegen backend. So, naturally, it showed up a
bug or two (emitting bogus range checks and the like). Fixed those,
and added a full set of tests for the permissible immediates in the
existing Sema test.

Also adjusted the isel pattern for `vmovlb.u8`, which stopped matching
because lowering started turning its input into a VBICIMM. Now it
recognizes the VBICIMM instead.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72934

show more ...


# f63d7637 18-Jan-2020 Reid Kleckner <[email protected]>

[TableGen] Use a table to lookup MVE intrinsic names

Summary:
Speeds up compilation of SemaDeclAttr.cpp by nine seconds:
0m49.555s - > 0m40.249s

Reviewers: simon_tatham, dmgreen, ostannard, MarkM

[TableGen] Use a table to lookup MVE intrinsic names

Summary:
Speeds up compilation of SemaDeclAttr.cpp by nine seconds:
0m49.555s - > 0m40.249s

Reviewers: simon_tatham, dmgreen, ostannard, MarkMurrayARM

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72984

show more ...


# ada01d1b 15-Jan-2020 Simon Tatham <[email protected]>

[clang] New __attribute__((__clang_arm_mve_strict_polymorphism)).

This is applied to the vector types defined in <arm_mve.h> for use
with the intrinsics for the ARM MVE vector architecture.

Its pur

[clang] New __attribute__((__clang_arm_mve_strict_polymorphism)).

This is applied to the vector types defined in <arm_mve.h> for use
with the intrinsics for the ARM MVE vector architecture.

Its purpose is to inhibit lax vector conversions, but only in the
context of overload resolution of the MVE polymorphic intrinsic
functions. This solves an ambiguity problem with polymorphic MVE
intrinsics that take a vector and a scalar argument: the scalar
argument can often have the wrong integer type due to default integer
promotions or unsuffixed literals, and therefore, the type of the
vector argument should be considered trustworthy when resolving MVE
polymorphism.

As part of the same change, I've added the new attribute to the
declarations generated by the MveEmitter Tablegen backend (and
corrected a namespace issue with the other attribute while I was
there).

Reviewers: aaron.ballman, dmgreen

Reviewed By: aaron.ballman

Subscribers: kristof.beyls, JDevlieghere, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72518

show more ...


Revision tags: llvmorg-11-init
# 31004809 08-Jan-2020 Simon Tatham <[email protected]>

[ARM,MVE] Intrinsics for partial-overwrite imm shifts.

This batch of intrinsics covers two sets of immediate shift
instructions, which have in common that they only overwrite part of
their output re

[ARM,MVE] Intrinsics for partial-overwrite imm shifts.

This batch of intrinsics covers two sets of immediate shift
instructions, which have in common that they only overwrite part of
their output register and so they need an extra input giving its
previous value.

The VSLI and VSRI instructions shift each lane of the input vector
left or right just as if they were normal immediate VSHL/VSHR, but
then they only overwrite the output bits that correspond to actual
shifted bits of the input. So VSLI will leave the low n bits of each
output lane unchanged, and VSRI the same with the top n bits.

The V[Q][R]SHR[U]N family are all narrowing shifts: they take an input
vector of 2n-bit integers, shift each lane right by a constant, and
then narrowing the shifted result to only n bits. So they only
overwrite half of the n-bit lanes in the output register, and the B/T
suffix indicates whether it's the bottom or top half of each 2n-bit
lane.

I've implemented the whole of the latter family using a single IR
intrinsic `vshrn`, which takes a lot of i32 parameters indicating
which instruction it expands to (by specifying signedness of the input
and output types, whether it saturates and/or rounds, etc).

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72328

show more ...


# 4978296c 06-Jan-2020 Simon Tatham <[email protected]>

[ARM,MVE] Support -ve offsets in gather-load intrinsics.

Summary:
The ACLE intrinsics with `gather_base` or `scatter_base` in the name
are wrappers on the MVE load/store instructions that take a vec

[ARM,MVE] Support -ve offsets in gather-load intrinsics.

Summary:
The ACLE intrinsics with `gather_base` or `scatter_base` in the name
are wrappers on the MVE load/store instructions that take a vector of
base addresses and an immediate offset. The immediate offset can be up
to 127 times the alignment unit, and it can be positive or negative.

At the MC layer, we got that right. But in the Sema error checking for
the wrapping intrinsics, the offset was erroneously constrained to be
positive.

To fix this I've adjusted the `imm_mem7bit` class in the Tablegen that
defines the intrinsics. But that causes integer literals like
`0xfffffffffffffe04` to appear in the autogenerated calls to
`SemaBuiltinConstantArgRange`, which provokes a compiler warning
because that's out of the non-overflowing range of an `int64_t`. So
I've also tweaked `MveEmitter` to emit that as `-0x1fc` instead.

Updated the tests of the Sema checks themselves, and also adjusted a
random sample of the CodeGen tests to actually use negative offsets
and prove they get all the way through code generation without causing
a crash.

Reviewers: dmgreen, miyuki, MarkMurrayARM

Reviewed By: dmgreen

Subscribers: kristof.beyls, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72268

show more ...


Revision tags: llvmorg-9.0.1, llvmorg-9.0.1-rc3
# bd0f271c 11-Dec-2019 Simon Tatham <[email protected]>

[ARM][MVE] Add intrinsics for immediate shifts. (reland)

This adds the family of `vshlq_n` and `vshrq_n` ACLE intrinsics, which
shift every lane of a vector left or right by a compile-time
immediate

[ARM][MVE] Add intrinsics for immediate shifts. (reland)

This adds the family of `vshlq_n` and `vshrq_n` ACLE intrinsics, which
shift every lane of a vector left or right by a compile-time
immediate. They mostly work by expanding to the IR `shl`, `lshr` and
`ashr` operations, with their second operand being a vector splat of
the immediate.

There's a fiddly special case, though. ACLE specifies that the
immediate in `vshrq_n` can take values up to //and including// the bit
size of the vector lane. But LLVM IR thinks that shifting right by the
full size of the lane is UB, and feels free to replace the `lshr` with
an `undef` half way through the optimization pipeline. Hence, to keep
this legal in source code, I have to detect it at codegen time.
Logical (unsigned) right shifts by the element size are handled by
simply emitting the zero vector; arithmetic ones are converted into a
shift of one bit less, which will always give the same output.

In order to do that check, I also had to enhance the tablegen
MveEmitter so that it can cope with converting a builtin function's
operand into a bare integer to pass to a code-generating subfunction.
Previously the only bare integers it knew how to handle were flags
generated from within `arm_mve.td`.

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: dmgreen, MarkMurrayARM

Subscribers: echristo, hokein, rdhindsa, kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71065

show more ...


12