History log of /llvm-project-15.0.7/llvm/lib/Target/X86/X86FastISel.cpp (Results 1 – 25 of 695)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6
# 655ba9c8 17-Jun-2022 Phoebe Wang <[email protected]>

Reland "Reland "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""""

This resolves problems reported in commit 1a20252978c76cf2518aa45b175a9e5d6d36c4f0.
1. Promot

Reland "Reland "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""""

This resolves problems reported in commit 1a20252978c76cf2518aa45b175a9e5d6d36c4f0.
1. Promote to float lowering for nodes XINT_TO_FP
2. Bail out f16 from shuffle combine due to vector type is not legal in the version

show more ...


# 1a202529 17-Jun-2022 Benjamin Kramer <[email protected]>

Revert "Reland "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""""

This reverts commit 04a3d5f3a1193fb87576425a385aa0a6115b1e7c.

I see two more issues:

- uito

Revert "Reland "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""""

This reverts commit 04a3d5f3a1193fb87576425a385aa0a6115b1e7c.

I see two more issues:

- uitofp/sitofp from i32/i64 to half now generates
__floatsihf/__floatdihf, which exists in neither compiler-rt nor
libgcc

- This crashes when legalizing the bitcast:
```
; RUN: llc < %s -mcpu=skx
define void @main.45(ptr nocapture readnone %retval, ptr noalias nocapture readnone %run_options, ptr noalias nocapture readnone %params, ptr noalias nocapture readonly %buffer_table, ptr noalias nocapture readnone %status, ptr noalias nocapture readnone %prof_counters) local_unnamed_addr {
entry:
%fusion = load ptr, ptr %buffer_table, align 8
%0 = getelementptr inbounds ptr, ptr %buffer_table, i64 1
%Arg_1.2 = load ptr, ptr %0, align 8
%1 = getelementptr inbounds ptr, ptr %buffer_table, i64 2
%Arg_0.1 = load ptr, ptr %1, align 8
%2 = load half, ptr %Arg_0.1, align 8
%3 = bitcast half %2 to i16
%4 = and i16 %3, 32767
%5 = icmp eq i16 %4, 0
%6 = and i16 %3, -32768
%broadcast.splatinsert = insertelement <4 x half> poison, half %2, i64 0
%broadcast.splat = shufflevector <4 x half> %broadcast.splatinsert, <4 x half> poison, <4 x i32> zeroinitializer
%broadcast.splatinsert9 = insertelement <4 x i16> poison, i16 %4, i64 0
%broadcast.splat10 = shufflevector <4 x i16> %broadcast.splatinsert9, <4 x i16> poison, <4 x i32> zeroinitializer
%broadcast.splatinsert11 = insertelement <4 x i16> poison, i16 %6, i64 0
%broadcast.splat12 = shufflevector <4 x i16> %broadcast.splatinsert11, <4 x i16> poison, <4 x i32> zeroinitializer
%broadcast.splatinsert13 = insertelement <4 x i16> poison, i16 %3, i64 0
%broadcast.splat14 = shufflevector <4 x i16> %broadcast.splatinsert13, <4 x i16> poison, <4 x i32> zeroinitializer
%wide.load = load <4 x half>, ptr %Arg_1.2, align 8
%7 = fcmp uno <4 x half> %broadcast.splat, %wide.load
%8 = fcmp oeq <4 x half> %broadcast.splat, %wide.load
%9 = bitcast <4 x half> %wide.load to <4 x i16>
%10 = and <4 x i16> %9, <i16 32767, i16 32767, i16 32767, i16 32767>
%11 = icmp eq <4 x i16> %10, zeroinitializer
%12 = and <4 x i16> %9, <i16 -32768, i16 -32768, i16 -32768, i16 -32768>
%13 = or <4 x i16> %12, <i16 1, i16 1, i16 1, i16 1>
%14 = select <4 x i1> %11, <4 x i16> %9, <4 x i16> %13
%15 = icmp ugt <4 x i16> %broadcast.splat10, %10
%16 = icmp ne <4 x i16> %broadcast.splat12, %12
%17 = or <4 x i1> %15, %16
%18 = select <4 x i1> %17, <4 x i16> <i16 -1, i16 -1, i16 -1, i16 -1>, <4 x i16> <i16 1, i16 1, i16 1, i16 1>
%19 = add <4 x i16> %18, %broadcast.splat14
%20 = select i1 %5, <4 x i16> %14, <4 x i16> %19
%21 = select <4 x i1> %8, <4 x i16> %9, <4 x i16> %20
%22 = bitcast <4 x i16> %21 to <4 x half>
%23 = select <4 x i1> %7, <4 x half> <half 0xH7E00, half 0xH7E00, half 0xH7E00, half 0xH7E00>, <4 x half> %22
store <4 x half> %23, ptr %fusion, align 16
ret void
}
```

llc: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:977: void (anonymous namespace)::SelectionDAGLegalize::LegalizeOp(llvm::SDNode *): Assertion `(TLI.getTypeAction(*DAG.getContext(), Op.getValueType()) == TargetLowering::TypeLegal || Op.getOpcode() == ISD::TargetConstant || Op.getOpcode() == ISD::Register) && "Unexpected illegal type!"' failed.

show more ...


# 04a3d5f3 17-Jun-2022 Phoebe Wang <[email protected]>

Reland "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"""

Fix the crash on lowering X86ISD::FCMP.


# 3cd5696a 15-Jun-2022 Frederik Gossen <[email protected]>

Revert "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"""

This reverts commit e1c5afa47d37012499467b5061fc42e50884d129.

This introduces crashes in the JAX back

Revert "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"""

This reverts commit e1c5afa47d37012499467b5061fc42e50884d129.

This introduces crashes in the JAX backend on CPU. A reproducer in LLVM is
below. Let me know if you have trouble reproducing this.

; ModuleID = '__compute_module'
source_filename = "__compute_module"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-grtev4-linux-gnu"

@0 = private unnamed_addr constant [4 x i8] c"\00\00\00?"
@1 = private unnamed_addr constant [4 x i8] c"\1C}\908"
@2 = private unnamed_addr constant [4 x i8] c"?\00\\4"
@3 = private unnamed_addr constant [4 x i8] c"%ci1"
@4 = private unnamed_addr constant [4 x i8] zeroinitializer
@5 = private unnamed_addr constant [4 x i8] c"\00\00\00\C0"
@6 = private unnamed_addr constant [4 x i8] c"\00\00\00B"
@7 = private unnamed_addr constant [4 x i8] c"\94\B4\C22"
@8 = private unnamed_addr constant [4 x i8] c"^\09B6"
@9 = private unnamed_addr constant [4 x i8] c"\15\F3M?"
@10 = private unnamed_addr constant [4 x i8] c"e\CC\\;"
@11 = private unnamed_addr constant [4 x i8] c"d\BD/>"
@12 = private unnamed_addr constant [4 x i8] c"V\F4I="
@13 = private unnamed_addr constant [4 x i8] c"\10\CB,<"
@14 = private unnamed_addr constant [4 x i8] c"\AC\E3\D6:"
@15 = private unnamed_addr constant [4 x i8] c"\DC\A8E9"
@16 = private unnamed_addr constant [4 x i8] c"\C6\FA\897"
@17 = private unnamed_addr constant [4 x i8] c"%\F9\955"
@18 = private unnamed_addr constant [4 x i8] c"\B5\DB\813"
@19 = private unnamed_addr constant [4 x i8] c"\B4W_\B2"
@20 = private unnamed_addr constant [4 x i8] c"\1Cc\8F\B4"
@21 = private unnamed_addr constant [4 x i8] c"~3\94\B6"
@22 = private unnamed_addr constant [4 x i8] c"3Yq\B8"
@23 = private unnamed_addr constant [4 x i8] c"\E9\17\17\BA"
@24 = private unnamed_addr constant [4 x i8] c"\F1\B2\8D\BB"
@25 = private unnamed_addr constant [4 x i8] c"\F8t\C2\BC"
@26 = private unnamed_addr constant [4 x i8] c"\82[\C2\BD"
@27 = private unnamed_addr constant [4 x i8] c"uB-?"
@28 = private unnamed_addr constant [4 x i8] c"^\FF\9B\BE"
@29 = private unnamed_addr constant [4 x i8] c"\00\00\00A"

; Function Attrs: uwtable
define void @main.158(ptr %retval, ptr noalias %run_options, ptr noalias %params, ptr noalias %buffer_table, ptr noalias %status, ptr noalias %prof_counters) #0 {
entry:
%fusion.invar_address.dim.1 = alloca i64, align 8
%fusion.invar_address.dim.0 = alloca i64, align 8
%0 = getelementptr inbounds ptr, ptr %buffer_table, i64 1
%Arg_0.1 = load ptr, ptr %0, align 8, !invariant.load !0, !dereferenceable !1, !align !2
%1 = getelementptr inbounds ptr, ptr %buffer_table, i64 0
%fusion = load ptr, ptr %1, align 8, !invariant.load !0, !dereferenceable !1, !align !2
store i64 0, ptr %fusion.invar_address.dim.0, align 8
br label %fusion.loop_header.dim.0

return: ; preds = %fusion.loop_exit.dim.0
ret void

fusion.loop_header.dim.0: ; preds = %fusion.loop_exit.dim.1, %entry
%fusion.indvar.dim.0 = load i64, ptr %fusion.invar_address.dim.0, align 8
%2 = icmp uge i64 %fusion.indvar.dim.0, 3
br i1 %2, label %fusion.loop_exit.dim.0, label %fusion.loop_body.dim.0

fusion.loop_body.dim.0: ; preds = %fusion.loop_header.dim.0
store i64 0, ptr %fusion.invar_address.dim.1, align 8
br label %fusion.loop_header.dim.1

fusion.loop_header.dim.1: ; preds = %fusion.loop_body.dim.1, %fusion.loop_body.dim.0
%fusion.indvar.dim.1 = load i64, ptr %fusion.invar_address.dim.1, align 8
%3 = icmp uge i64 %fusion.indvar.dim.1, 1
br i1 %3, label %fusion.loop_exit.dim.1, label %fusion.loop_body.dim.1

fusion.loop_body.dim.1: ; preds = %fusion.loop_header.dim.1
%4 = getelementptr inbounds [3 x [1 x half]], ptr %Arg_0.1, i64 0, i64 %fusion.indvar.dim.0, i64 0
%5 = load half, ptr %4, align 2, !invariant.load !0, !noalias !3
%6 = fpext half %5 to float
%7 = call float @llvm.fabs.f32(float %6)
%constant.121 = load float, ptr @29, align 4
%compare.2 = fcmp ole float %7, %constant.121
%8 = zext i1 %compare.2 to i8
%constant.120 = load float, ptr @0, align 4
%multiply.95 = fmul float %7, %constant.120
%constant.119 = load float, ptr @5, align 4
%add.82 = fadd float %multiply.95, %constant.119
%constant.118 = load float, ptr @4, align 4
%multiply.94 = fmul float %add.82, %constant.118
%constant.117 = load float, ptr @19, align 4
%add.81 = fadd float %multiply.94, %constant.117
%multiply.92 = fmul float %add.82, %add.81
%constant.116 = load float, ptr @18, align 4
%add.79 = fadd float %multiply.92, %constant.116
%multiply.91 = fmul float %add.82, %add.79
%subtract.87 = fsub float %multiply.91, %add.81
%constant.115 = load float, ptr @20, align 4
%add.78 = fadd float %subtract.87, %constant.115
%multiply.89 = fmul float %add.82, %add.78
%subtract.86 = fsub float %multiply.89, %add.79
%constant.114 = load float, ptr @17, align 4
%add.76 = fadd float %subtract.86, %constant.114
%multiply.88 = fmul float %add.82, %add.76
%subtract.84 = fsub float %multiply.88, %add.78
%constant.113 = load float, ptr @21, align 4
%add.75 = fadd float %subtract.84, %constant.113
%multiply.86 = fmul float %add.82, %add.75
%subtract.83 = fsub float %multiply.86, %add.76
%constant.112 = load float, ptr @16, align 4
%add.73 = fadd float %subtract.83, %constant.112
%multiply.85 = fmul float %add.82, %add.73
%subtract.81 = fsub float %multiply.85, %add.75
%constant.111 = load float, ptr @22, align 4
%add.72 = fadd float %subtract.81, %constant.111
%multiply.83 = fmul float %add.82, %add.72
%subtract.80 = fsub float %multiply.83, %add.73
%constant.110 = load float, ptr @15, align 4
%add.70 = fadd float %subtract.80, %constant.110
%multiply.82 = fmul float %add.82, %add.70
%subtract.78 = fsub float %multiply.82, %add.72
%constant.109 = load float, ptr @23, align 4
%add.69 = fadd float %subtract.78, %constant.109
%multiply.80 = fmul float %add.82, %add.69
%subtract.77 = fsub float %multiply.80, %add.70
%constant.108 = load float, ptr @14, align 4
%add.68 = fadd float %subtract.77, %constant.108
%multiply.79 = fmul float %add.82, %add.68
%subtract.75 = fsub float %multiply.79, %add.69
%constant.107 = load float, ptr @24, align 4
%add.67 = fadd float %subtract.75, %constant.107
%multiply.77 = fmul float %add.82, %add.67
%subtract.74 = fsub float %multiply.77, %add.68
%constant.106 = load float, ptr @13, align 4
%add.66 = fadd float %subtract.74, %constant.106
%multiply.76 = fmul float %add.82, %add.66
%subtract.72 = fsub float %multiply.76, %add.67
%constant.105 = load float, ptr @25, align 4
%add.65 = fadd float %subtract.72, %constant.105
%multiply.74 = fmul float %add.82, %add.65
%subtract.71 = fsub float %multiply.74, %add.66
%constant.104 = load float, ptr @12, align 4
%add.64 = fadd float %subtract.71, %constant.104
%multiply.73 = fmul float %add.82, %add.64
%subtract.69 = fsub float %multiply.73, %add.65
%constant.103 = load float, ptr @26, align 4
%add.63 = fadd float %subtract.69, %constant.103
%multiply.71 = fmul float %add.82, %add.63
%subtract.67 = fsub float %multiply.71, %add.64
%constant.102 = load float, ptr @11, align 4
%add.62 = fadd float %subtract.67, %constant.102
%multiply.70 = fmul float %add.82, %add.62
%subtract.66 = fsub float %multiply.70, %add.63
%constant.101 = load float, ptr @28, align 4
%add.61 = fadd float %subtract.66, %constant.101
%multiply.68 = fmul float %add.82, %add.61
%subtract.65 = fsub float %multiply.68, %add.62
%constant.100 = load float, ptr @27, align 4
%add.60 = fadd float %subtract.65, %constant.100
%subtract.64 = fsub float %add.60, %add.62
%multiply.66 = fmul float %subtract.64, %constant.120
%constant.99 = load float, ptr @6, align 4
%divide.4 = fdiv float %constant.99, %7
%add.59 = fadd float %divide.4, %constant.119
%multiply.65 = fmul float %add.59, %constant.118
%constant.98 = load float, ptr @3, align 4
%add.58 = fadd float %multiply.65, %constant.98
%multiply.64 = fmul float %add.59, %add.58
%constant.97 = load float, ptr @7, align 4
%add.57 = fadd float %multiply.64, %constant.97
%multiply.63 = fmul float %add.59, %add.57
%subtract.63 = fsub float %multiply.63, %add.58
%constant.96 = load float, ptr @2, align 4
%add.56 = fadd float %subtract.63, %constant.96
%multiply.62 = fmul float %add.59, %add.56
%subtract.62 = fsub float %multiply.62, %add.57
%constant.95 = load float, ptr @8, align 4
%add.55 = fadd float %subtract.62, %constant.95
%multiply.61 = fmul float %add.59, %add.55
%subtract.61 = fsub float %multiply.61, %add.56
%constant.94 = load float, ptr @1, align 4
%add.54 = fadd float %subtract.61, %constant.94
%multiply.60 = fmul float %add.59, %add.54
%subtract.60 = fsub float %multiply.60, %add.55
%constant.93 = load float, ptr @10, align 4
%add.53 = fadd float %subtract.60, %constant.93
%multiply.59 = fmul float %add.59, %add.53
%subtract.59 = fsub float %multiply.59, %add.54
%constant.92 = load float, ptr @9, align 4
%add.52 = fadd float %subtract.59, %constant.92
%subtract.58 = fsub float %add.52, %add.54
%multiply.58 = fmul float %subtract.58, %constant.120
%9 = call float @llvm.sqrt.f32(float %7)
%10 = fdiv float 1.000000e+00, %9
%multiply.57 = fmul float %multiply.58, %10
%11 = trunc i8 %8 to i1
%12 = select i1 %11, float %multiply.66, float %multiply.57
%13 = fptrunc float %12 to half
%14 = getelementptr inbounds [3 x [1 x half]], ptr %fusion, i64 0, i64 %fusion.indvar.dim.0, i64 0
store half %13, ptr %14, align 2, !alias.scope !3
%invar.inc1 = add nuw nsw i64 %fusion.indvar.dim.1, 1
store i64 %invar.inc1, ptr %fusion.invar_address.dim.1, align 8
br label %fusion.loop_header.dim.1

fusion.loop_exit.dim.1: ; preds = %fusion.loop_header.dim.1
%invar.inc = add nuw nsw i64 %fusion.indvar.dim.0, 1
store i64 %invar.inc, ptr %fusion.invar_address.dim.0, align 8
br label %fusion.loop_header.dim.0

fusion.loop_exit.dim.0: ; preds = %fusion.loop_header.dim.0
br label %return
}

; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn
declare float @llvm.fabs.f32(float %0) #1

; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn
declare float @llvm.sqrt.f32(float %0) #1

attributes #0 = { uwtable "denormal-fp-math"="preserve-sign" "no-frame-pointer-elim"="false" }
attributes #1 = { nocallback nofree nosync nounwind readnone speculatable willreturn }

!0 = !{}
!1 = !{i64 6}
!2 = !{i64 8}
!3 = !{!4}
!4 = !{!"buffer: {index:0, offset:0, size:6}", !5}
!5 = !{!"XLA global AA domain"}

show more ...


# e1c5afa4 15-Jun-2022 Phoebe Wang <[email protected]>

Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""

Fixed the missing SQRT promotion. Adding several missing operations too.


# 37455b1f 15-Jun-2022 Thomas Joerg <[email protected]>

Revert "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""

This reverts commit 6e02e27536b9de25a651cfc9c2966ce471169355.

This introduces a crash in the backend. Reproduc

Revert "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""

This reverts commit 6e02e27536b9de25a651cfc9c2966ce471169355.

This introduces a crash in the backend. Reproducer in MLIR's LLVM
dialect follows. Let me know if you have trouble reproducing this.

module {
llvm.func @malloc(i64) -> !llvm.ptr<i8>
llvm.func @_mlir_ciface_tf_report_error(!llvm.ptr<i8>, i32, !llvm.ptr<i8>)
llvm.mlir.global internal constant @error_message_2208944672953921889("failed to allocate memory at loc(\22-\22:3:8)\00")
llvm.func @_mlir_ciface_tf_alloc(!llvm.ptr<i8>, i64, i64, i32, i32, !llvm.ptr<i32>) -> !llvm.ptr<i8>
llvm.func @Rsqrt_CPU_DT_HALF_DT_HALF(%arg0: !llvm.ptr<i8>, %arg1: i64, %arg2: !llvm.ptr<i8>) -> !llvm.struct<(i64, ptr<i8>)> attributes {llvm.emit_c_interface, tf_entry} {
%0 = llvm.mlir.constant(8 : i32) : i32
%1 = llvm.mlir.constant(8 : index) : i64
%2 = llvm.mlir.constant(2 : index) : i64
%3 = llvm.mlir.constant(dense<0.000000e+00> : vector<4xf16>) : vector<4xf16>
%4 = llvm.mlir.constant(dense<[0, 1, 2, 3]> : vector<4xi32>) : vector<4xi32>
%5 = llvm.mlir.constant(dense<1.000000e+00> : vector<4xf16>) : vector<4xf16>
%6 = llvm.mlir.constant(false) : i1
%7 = llvm.mlir.constant(1 : i32) : i32
%8 = llvm.mlir.constant(0 : i32) : i32
%9 = llvm.mlir.constant(4 : index) : i64
%10 = llvm.mlir.constant(0 : index) : i64
%11 = llvm.mlir.constant(1 : index) : i64
%12 = llvm.mlir.constant(-1 : index) : i64
%13 = llvm.mlir.null : !llvm.ptr<f16>
%14 = llvm.getelementptr %13[%9] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
%15 = llvm.ptrtoint %14 : !llvm.ptr<f16> to i64
%16 = llvm.alloca %15 x f16 {alignment = 32 : i64} : (i64) -> !llvm.ptr<f16>
%17 = llvm.alloca %15 x f16 {alignment = 32 : i64} : (i64) -> !llvm.ptr<f16>
%18 = llvm.mlir.null : !llvm.ptr<i64>
%19 = llvm.getelementptr %18[%arg1] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%20 = llvm.ptrtoint %19 : !llvm.ptr<i64> to i64
%21 = llvm.alloca %20 x i64 : (i64) -> !llvm.ptr<i64>
llvm.br ^bb1(%10 : i64)
^bb1(%22: i64): // 2 preds: ^bb0, ^bb2
%23 = llvm.icmp "slt" %22, %arg1 : i64
llvm.cond_br %23, ^bb2, ^bb3
^bb2: // pred: ^bb1
%24 = llvm.bitcast %arg2 : !llvm.ptr<i8> to !llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>>
%25 = llvm.getelementptr %24[%10, 2] : (!llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>>, i64) -> !llvm.ptr<i64>
%26 = llvm.add %22, %11 : i64
%27 = llvm.getelementptr %25[%26] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%28 = llvm.load %27 : !llvm.ptr<i64>
%29 = llvm.getelementptr %21[%22] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
llvm.store %28, %29 : !llvm.ptr<i64>
llvm.br ^bb1(%26 : i64)
^bb3: // pred: ^bb1
llvm.br ^bb4(%10, %11 : i64, i64)
^bb4(%30: i64, %31: i64): // 2 preds: ^bb3, ^bb5
%32 = llvm.icmp "slt" %30, %arg1 : i64
llvm.cond_br %32, ^bb5, ^bb6
^bb5: // pred: ^bb4
%33 = llvm.bitcast %arg2 : !llvm.ptr<i8> to !llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>>
%34 = llvm.getelementptr %33[%10, 2] : (!llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>>, i64) -> !llvm.ptr<i64>
%35 = llvm.add %30, %11 : i64
%36 = llvm.getelementptr %34[%35] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%37 = llvm.load %36 : !llvm.ptr<i64>
%38 = llvm.mul %37, %31 : i64
llvm.br ^bb4(%35, %38 : i64, i64)
^bb6: // pred: ^bb4
%39 = llvm.bitcast %arg2 : !llvm.ptr<i8> to !llvm.ptr<ptr<f16>>
%40 = llvm.getelementptr %39[%11] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
%41 = llvm.load %40 : !llvm.ptr<ptr<f16>>
%42 = llvm.getelementptr %13[%11] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
%43 = llvm.ptrtoint %42 : !llvm.ptr<f16> to i64
%44 = llvm.alloca %7 x i32 : (i32) -> !llvm.ptr<i32>
llvm.store %8, %44 : !llvm.ptr<i32>
%45 = llvm.call @_mlir_ciface_tf_alloc(%arg0, %31, %43, %8, %7, %44) : (!llvm.ptr<i8>, i64, i64, i32, i32, !llvm.ptr<i32>) -> !llvm.ptr<i8>
%46 = llvm.bitcast %45 : !llvm.ptr<i8> to !llvm.ptr<f16>
%47 = llvm.icmp "eq" %31, %10 : i64
%48 = llvm.or %6, %47 : i1
%49 = llvm.mlir.null : !llvm.ptr<i8>
%50 = llvm.icmp "ne" %45, %49 : !llvm.ptr<i8>
%51 = llvm.or %50, %48 : i1
llvm.cond_br %51, ^bb7, ^bb13
^bb7: // pred: ^bb6
%52 = llvm.urem %31, %9 : i64
%53 = llvm.sub %31, %52 : i64
llvm.br ^bb8(%10 : i64)
^bb8(%54: i64): // 2 preds: ^bb7, ^bb9
%55 = llvm.icmp "slt" %54, %53 : i64
llvm.cond_br %55, ^bb9, ^bb10
^bb9: // pred: ^bb8
%56 = llvm.mul %54, %11 : i64
%57 = llvm.add %56, %10 : i64
%58 = llvm.add %57, %10 : i64
%59 = llvm.getelementptr %41[%58] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
%60 = llvm.bitcast %59 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
%61 = llvm.load %60 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>>
%62 = "llvm.intr.sqrt"(%61) : (vector<4xf16>) -> vector<4xf16>
%63 = llvm.fdiv %5, %62 : vector<4xf16>
%64 = llvm.getelementptr %46[%58] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
%65 = llvm.bitcast %64 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
llvm.store %63, %65 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>>
%66 = llvm.add %54, %9 : i64
llvm.br ^bb8(%66 : i64)
^bb10: // pred: ^bb8
%67 = llvm.icmp "ult" %53, %31 : i64
llvm.cond_br %67, ^bb11, ^bb12
^bb11: // pred: ^bb10
%68 = llvm.mul %53, %12 : i64
%69 = llvm.add %31, %68 : i64
%70 = llvm.mul %53, %11 : i64
%71 = llvm.add %70, %10 : i64
%72 = llvm.trunc %69 : i64 to i32
%73 = llvm.mlir.undef : vector<4xi32>
%74 = llvm.insertelement %72, %73[%8 : i32] : vector<4xi32>
%75 = llvm.shufflevector %74, %73 [0 : i32, 0 : i32, 0 : i32, 0 : i32] : vector<4xi32>, vector<4xi32>
%76 = llvm.icmp "slt" %4, %75 : vector<4xi32>
%77 = llvm.add %71, %10 : i64
%78 = llvm.getelementptr %41[%77] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
%79 = llvm.bitcast %78 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
%80 = llvm.intr.masked.load %79, %76, %3 {alignment = 2 : i32} : (!llvm.ptr<vector<4xf16>>, vector<4xi1>, vector<4xf16>) -> vector<4xf16>
%81 = llvm.bitcast %16 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
llvm.store %80, %81 : !llvm.ptr<vector<4xf16>>
%82 = llvm.load %81 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>>
%83 = "llvm.intr.sqrt"(%82) : (vector<4xf16>) -> vector<4xf16>
%84 = llvm.fdiv %5, %83 : vector<4xf16>
%85 = llvm.bitcast %17 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
llvm.store %84, %85 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>>
%86 = llvm.load %85 : !llvm.ptr<vector<4xf16>>
%87 = llvm.getelementptr %46[%77] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
%88 = llvm.bitcast %87 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
llvm.intr.masked.store %86, %88, %76 {alignment = 2 : i32} : vector<4xf16>, vector<4xi1> into !llvm.ptr<vector<4xf16>>
llvm.br ^bb12
^bb12: // 2 preds: ^bb10, ^bb11
%89 = llvm.mul %2, %1 : i64
%90 = llvm.mul %arg1, %2 : i64
%91 = llvm.add %90, %11 : i64
%92 = llvm.mul %91, %1 : i64
%93 = llvm.add %89, %92 : i64
%94 = llvm.alloca %93 x i8 : (i64) -> !llvm.ptr<i8>
%95 = llvm.bitcast %94 : !llvm.ptr<i8> to !llvm.ptr<ptr<f16>>
llvm.store %46, %95 : !llvm.ptr<ptr<f16>>
%96 = llvm.getelementptr %95[%11] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
llvm.store %46, %96 : !llvm.ptr<ptr<f16>>
%97 = llvm.getelementptr %95[%2] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
%98 = llvm.bitcast %97 : !llvm.ptr<ptr<f16>> to !llvm.ptr<i64>
llvm.store %10, %98 : !llvm.ptr<i64>
%99 = llvm.bitcast %94 : !llvm.ptr<i8> to !llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64, i64)>>
%100 = llvm.getelementptr %99[%10, 3] : (!llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64, i64)>>, i64) -> !llvm.ptr<i64>
%101 = llvm.getelementptr %100[%arg1] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%102 = llvm.sub %arg1, %11 : i64
llvm.br ^bb14(%102, %11 : i64, i64)
^bb13: // pred: ^bb6
%103 = llvm.mlir.addressof @error_message_2208944672953921889 : !llvm.ptr<array<42 x i8>>
%104 = llvm.getelementptr %103[%10, %10] : (!llvm.ptr<array<42 x i8>>, i64, i64) -> !llvm.ptr<i8>
llvm.call @_mlir_ciface_tf_report_error(%arg0, %0, %104) : (!llvm.ptr<i8>, i32, !llvm.ptr<i8>) -> ()
%105 = llvm.mul %2, %1 : i64
%106 = llvm.mul %2, %10 : i64
%107 = llvm.add %106, %11 : i64
%108 = llvm.mul %107, %1 : i64
%109 = llvm.add %105, %108 : i64
%110 = llvm.alloca %109 x i8 : (i64) -> !llvm.ptr<i8>
%111 = llvm.bitcast %110 : !llvm.ptr<i8> to !llvm.ptr<ptr<f16>>
llvm.store %13, %111 : !llvm.ptr<ptr<f16>>
%112 = llvm.getelementptr %111[%11] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
llvm.store %13, %112 : !llvm.ptr<ptr<f16>>
%113 = llvm.getelementptr %111[%2] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
%114 = llvm.bitcast %113 : !llvm.ptr<ptr<f16>> to !llvm.ptr<i64>
llvm.store %10, %114 : !llvm.ptr<i64>
%115 = llvm.call @malloc(%109) : (i64) -> !llvm.ptr<i8>
"llvm.intr.memcpy"(%115, %110, %109, %6) : (!llvm.ptr<i8>, !llvm.ptr<i8>, i64, i1) -> ()
%116 = llvm.mlir.undef : !llvm.struct<(i64, ptr<i8>)>
%117 = llvm.insertvalue %10, %116[0] : !llvm.struct<(i64, ptr<i8>)>
%118 = llvm.insertvalue %115, %117[1] : !llvm.struct<(i64, ptr<i8>)>
llvm.return %118 : !llvm.struct<(i64, ptr<i8>)>
^bb14(%119: i64, %120: i64): // 2 preds: ^bb12, ^bb15
%121 = llvm.icmp "sge" %119, %10 : i64
llvm.cond_br %121, ^bb15, ^bb16
^bb15: // pred: ^bb14
%122 = llvm.getelementptr %21[%119] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
%123 = llvm.load %122 : !llvm.ptr<i64>
%124 = llvm.getelementptr %100[%119] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
llvm.store %123, %124 : !llvm.ptr<i64>
%125 = llvm.getelementptr %101[%119] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
llvm.store %120, %125 : !llvm.ptr<i64>
%126 = llvm.mul %120, %123 : i64
%127 = llvm.sub %119, %11 : i64
llvm.br ^bb14(%127, %126 : i64, i64)
^bb16: // pred: ^bb14
%128 = llvm.call @malloc(%93) : (i64) -> !llvm.ptr<i8>
"llvm.intr.memcpy"(%128, %94, %93, %6) : (!llvm.ptr<i8>, !llvm.ptr<i8>, i64, i1) -> ()
%129 = llvm.mlir.undef : !llvm.struct<(i64, ptr<i8>)>
%130 = llvm.insertvalue %arg1, %129[0] : !llvm.struct<(i64, ptr<i8>)>
%131 = llvm.insertvalue %128, %130[1] : !llvm.struct<(i64, ptr<i8>)>
llvm.return %131 : !llvm.struct<(i64, ptr<i8>)>
}
llvm.func @_mlir_ciface_Rsqrt_CPU_DT_HALF_DT_HALF(%arg0: !llvm.ptr<struct<(i64, ptr<i8>)>>, %arg1: !llvm.ptr<i8>, %arg2: !llvm.ptr<struct<(i64, ptr<i8>)>>) attributes {llvm.emit_c_interface, tf_entry} {
%0 = llvm.load %arg2 : !llvm.ptr<struct<(i64, ptr<i8>)>>
%1 = llvm.extractvalue %0[0] : !llvm.struct<(i64, ptr<i8>)>
%2 = llvm.extractvalue %0[1] : !llvm.struct<(i64, ptr<i8>)>
%3 = llvm.call @Rsqrt_CPU_DT_HALF_DT_HALF(%arg1, %1, %2) : (!llvm.ptr<i8>, i64, !llvm.ptr<i8>) -> !llvm.struct<(i64, ptr<i8>)>
llvm.store %3, %arg0 : !llvm.ptr<struct<(i64, ptr<i8>)>>
llvm.return
}
}

show more ...


# 6e02e275 15-Jun-2022 Phoebe Wang <[email protected]>

Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"

Disabled 2 mlir tests due to the runtime doesn't support `_Float16`, see
the issue here https://github.com/llvm/llvm-pro

Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"

Disabled 2 mlir tests due to the runtime doesn't support `_Float16`, see
the issue here https://github.com/llvm/llvm-project/issues/55992

show more ...


# 5d8298a7 12-Jun-2022 Mehdi Amini <[email protected]>

Revert "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"

This reverts commit 2d2da259c8726fd5c974c01122a9689981a12196.

This breaks MLIR integration test (JIT crashing), reverti

Revert "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"

This reverts commit 2d2da259c8726fd5c974c01122a9689981a12196.

This breaks MLIR integration test (JIT crashing), reverting in the
meantime.

show more ...


# 2d2da259 11-Jun-2022 Phoebe Wang <[email protected]>

[X86][RFC] Enable `_Float16` type support on X86 following the psABI

GCC and Clang/LLVM will support `_Float16` on X86 in C/C++, following
the latest X86 psABI. (https://gitlab.com/x86-psABIs)

_Flo

[X86][RFC] Enable `_Float16` type support on X86 following the psABI

GCC and Clang/LLVM will support `_Float16` on X86 in C/C++, following
the latest X86 psABI. (https://gitlab.com/x86-psABIs)

_Float16 arithmetic will be performed using native half-precision. If
native arithmetic instructions are not available, it will be performed
at a higher precision (currently always float) and then truncated down
to _Float16 immediately after each single arithmetic operation.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D107082

show more ...


Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 3075e5d2 31-Mar-2022 Nikita Popov <[email protected]>

[X86][FastISel] Fix with.overflow + select eflags clobber (PR54369)

Don't try to directly use the with.overflow flag result in a cmov
if we need to materialize constants between the instruction
prod

[X86][FastISel] Fix with.overflow + select eflags clobber (PR54369)

Don't try to directly use the with.overflow flag result in a cmov
if we need to materialize constants between the instruction
producing the overflow flag and the cmov. The current code is
careful to check that there are no other instructions in between,
but misses the constant materialization case (which may clobber
eflags via xor or constant expression evaluation).

Fixes https://github.com/llvm/llvm-project/issues/54369.

Differential Revision: https://reviews.llvm.org/D122825

show more ...


# 674d52e8 27-Mar-2022 Phoebe Wang <[email protected]>

[X86] Refactor X86ScalarSSEf16/32/64 with hasFP16/SSE1/SSE2. NFCI

This is used for f16 emulation. We emulate f16 for SSE2 targets and
above. Refactoring makes the future code to be more clean.

Revi

[X86] Refactor X86ScalarSSEf16/32/64 with hasFP16/SSE1/SSE2. NFCI

This is used for f16 emulation. We emulate f16 for SSE2 targets and
above. Refactoring makes the future code to be more clean.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D122475

show more ...


# 076a9dc9 20-Mar-2022 Shengchen Kan <[email protected]>

[X86][NFC] Rename hasCMOV() to canUseCMOV(), hasLAHFSAHF() to canUseLAHFSAHF()

To make them less like other feature functions.
This is a follow-up patch for D121978.


# 920c2e57 18-Mar-2022 Shengchen Kan <[email protected]>

[X86][NFC] Rename target feature hasCMov->hasCMOV

This is a follow-up patch for D121975.


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1
# 671f0930 19-Nov-2021 Matt Morehouse <[email protected]>

[X86] Selective relocation relaxation for +tagged-globals

For tagged-globals, we only need to disable relaxation for globals that
we actually tag. With this patch function pointer relocations, whic

[X86] Selective relocation relaxation for +tagged-globals

For tagged-globals, we only need to disable relaxation for globals that
we actually tag. With this patch function pointer relocations, which
we do not instrument, can be relaxed.

This patch also makes tagged-globals work properly with LTO, as
-Wa,-mrelax-relocations=no doesn't work with LTO.

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D113220

show more ...


# d391e4fe 07-Nov-2021 Simon Pilgrim <[email protected]>

[X86] Update RET/LRET instruction to use the same naming convention as IRET (PR36876). NFC

Be more consistent in the naming convention for the various RET instructions to specify in terms of bitwidt

[X86] Update RET/LRET instruction to use the same naming convention as IRET (PR36876). NFC

Be more consistent in the naming convention for the various RET instructions to specify in terms of bitwidth.

Helps prevent future scheduler model mismatches like those that were only addressed in D44687.

Differential Revision: https://reviews.llvm.org/D113302

show more ...


# bd4dad87 07-Oct-2021 Jack Andersen <[email protected]>

[MachineInstr] Move MIParser's DBG_VALUE RegState::Debug invariant into MachineInstr::addOperand

Based on the reasoning of D53903, register operands of DBG_VALUE are
invariably treated as RegState::

[MachineInstr] Move MIParser's DBG_VALUE RegState::Debug invariant into MachineInstr::addOperand

Based on the reasoning of D53903, register operands of DBG_VALUE are
invariably treated as RegState::Debug operands. This change enforces
this invariant as part of MachineInstr::addOperand so that all passes
emit this flag consistently.

RegState::Debug is inconsistently set on DBG_VALUE registers throughout
LLVM. This runs the risk of a filtering iterator like
MachineRegisterInfo::reg_nodbg_iterator to process these operands
erroneously when not parsed from MIR sources.

This issue was observed in the development of the llvm-mos fork which
adds a backend that relies on physical register operands much more than
existing targets. Physical RegUnit 0 has the same numeric encoding as
$noreg (indicating an undef for DBG_VALUE). Allowing debug operands into
the machine scheduler correlates $noreg with RegUnit 0 (i.e. a collision
of register numbers with different zero semantics). Eventually, this
causes an assert where DBG_VALUE instructions are prohibited from
participating in live register ranges.

Reviewed By: MatzeB, StephenTozer

Differential Revision: https://reviews.llvm.org/D110105

show more ...


# c1e32b3f 02-Oct-2021 Kazu Hirata <[email protected]>

[Target] Migrate from getNumArgOperands to arg_size (NFC)

Note that getNumArgOperands is considered a legacy name. See
llvm/include/llvm/IR/InstrTypes.h for details.


# 4c72b10f 25-Sep-2021 Simon Pilgrim <[email protected]>

[X86] X86FastISel::fastMaterializeConstant - break if-else chain to fix llvm-else-after-return warning. NFCI

All previous if-else cases return


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2
# 6f7f5b54 10-Aug-2021 Wang, Pengfei <[email protected]>

[X86] AVX512FP16 instructions enabling 1/6

1. Enable FP16 type support and basic declarations used by following patches.
2. Enable new instructions VMOVW and VMOVSH.

Ref.: https://software.intel.co

[X86] AVX512FP16 instructions enabling 1/6

1. Enable FP16 type support and basic declarations used by following patches.
2. Enable new instructions VMOVW and VMOVSH.

Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D105263

show more ...


Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3
# 36003c20 22-Jun-2021 Luo, Yuanke <[email protected]>

[X86] Selecting fld0 for undefined value in fast ISEL.

When set opt-bisect-limit to some value that is less than ISel pass
in command line and CurBisectNum expired, "DAG to DAG" pass lower
its opt l

[X86] Selecting fld0 for undefined value in fast ISEL.

When set opt-bisect-limit to some value that is less than ISel pass
in command line and CurBisectNum expired, "DAG to DAG" pass lower
its opt level to O0. However "processimpdefs" and "X86 FP Stackifier"
is not stopped due to the CurBisectNum expiration. So undefined fp0
is generated. This cause crash in the "X86 FP Stackifier" pass,
because Stackifier doesn't expect any undefined fp value.

Here is the scenario that cause compiler crash.

successors: %bb.26
liveins: $r14
ST_FPrr $st0, implicit-def $fpsw, implicit $fpcw
renamable $rdi = MOV64ri @.str.3.16422
renamable $rdx = LEA64r %stack.6, 1, $noreg, 0, $noreg
ADJCALLSTACKDOWN64 0, 0, 0, implicit-def $rsp, implicit-def dead
$eflags, implicit-def $ssp, implicit $rsp, implicit $ssp
dead $esi = MOV32r0 implicit-def dead $eflags, implicit-def $rsi
CALL64pcrel32 @foo, implicit $rsp, implicit $ssp, implicit $rdi,
implicit $rsi, implicit $rdx, implicit-def dead $fp0
renamable $xmm0 = MOVSDrm_alt %stack.10, 1, $noreg, 0, $noreg :: (load 8
from %stack.10)
ADJCALLSTACKUP64 0, 0, implicit-def $rsp, implicit-def dead $eflags,
implicit-def $ssp, implicit $rsp, implicit $ssp
renamable $fp2 = CHS_Fp80 killed undef renamable $fp0, implicit-def
$fpsw
JMP_1 %bb.26
The CALL64pcrel32 mark fp0 dead, so llvm free the stack slot for fp0
and the stack become empty. In the late instruction CHS_Fp80, it use
undefined register fp0, the original code assume there must be a stack
slot for the src register (fp0) without respecting it is undefined,
so llvm report error.

We have some discussion in https://reviews.llvm.org/D104440 and we
decide to fix it in fast ISel. The fix is to lower undefined fp value to
zero value, so that it release the burden of "X86 FP Stackifier" pass.
Thank Craig for the suggestion and the initial patch to fix it.

Differential Revision: https://reviews.llvm.org/D104678

show more ...


Revision tags: llvmorg-12.0.1-rc2
# 06e7de79 05-Jun-2021 Fangrui Song <[email protected]>

Fix some -Wunused-but-set-variable in -DLLVM_ENABLE_ASSERTIONS=off build


Revision tags: llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1
# ba1509da 12-Jan-2021 Tim Northover <[email protected]>

Recommit X86: support Swift Async context

This adds support to the X86 backend for the newly committed swiftasync
function parameter. If such a (pointer) parameter is present it gets stored
into an

Recommit X86: support Swift Async context

This adds support to the X86 backend for the newly committed swiftasync
function parameter. If such a (pointer) parameter is present it gets stored
into an augmented frame record (populated in IR, but generally containing
enhanced backtrace for coroutines using lots of tail calls back and forth).

The context frame is identical to AArch64 (primarily so that unwinders etc
don't get extra complexity). Specfically, the new frame record is [AsyncCtx,
%rbp, ReturnAddr], and its presence is signalled by bit 60 of the stored %rbp
being set to 1. %rbp still points to the frame pointer in memory for backwards
compatibility (only partial on x86, but OTOH the weird AsyncCtx before the rest
of the record is because of x86).

Recommited with a fix for unwind info when i386 pc-rel thunks are
adjacent to a prologue.

show more ...


# 6791a6b3 17-May-2021 Mitch Phillips <[email protected]>

Revert "X86: support Swift Async context"

This reverts commit 747e5cfb9f5d944b47fe014925b0d5dc2fda74d7.

Reason: New frame layout broke the sanitizer unwinder. Not clear why,
but seems like some of

Revert "X86: support Swift Async context"

This reverts commit 747e5cfb9f5d944b47fe014925b0d5dc2fda74d7.

Reason: New frame layout broke the sanitizer unwinder. Not clear why,
but seems like some of the changes aren't always guarded by Swyft
checks. See
https://reviews.llvm.org/rG747e5cfb9f5d944b47fe014925b0d5dc2fda74d7 for
more information.

show more ...


# 747e5cfb 12-Jan-2021 Tim Northover <[email protected]>

X86: support Swift Async context

This adds support to the X86 backend for the newly committed swiftasync
function parameter. If such a (pointer) parameter is present it gets stored
into an augmented

X86: support Swift Async context

This adds support to the X86 backend for the newly committed swiftasync
function parameter. If such a (pointer) parameter is present it gets stored
into an augmented frame record (populated in IR, but generally containing
enhanced backtrace for coroutines using lots of tail calls back and forth).

The context frame is identical to AArch64 (primarily so that unwinders etc
don't get extra complexity). Specfically, the new frame record is [AsyncCtx,
%rbp, ReturnAddr], and its presence is signalled by bit 60 of the stored %rbp
being set to 1. %rbp still points to the frame pointer in memory for backwards
compatibility (only partial on x86, but OTOH the weird AsyncCtx before the rest
of the record is because of x86).

show more ...


Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1
# 82a0e808 19-Nov-2020 Tim Northover <[email protected]>

IR/AArch64/X86: add "swifttailcc" calling convention.

Swift's new concurrency features are going to require guaranteed tail calls so
that they don't consume excessive amounts of stack space. This wo

IR/AArch64/X86: add "swifttailcc" calling convention.

Swift's new concurrency features are going to require guaranteed tail calls so
that they don't consume excessive amounts of stack space. This would normally
mean "tailcc", but there are also Swift-specific ABI desires that don't
naturally go along with "tailcc" so this adds another calling convention that's
the combination of "swiftcc" and "tailcc".

Support is added for AArch64 and X86 for now.

show more ...


12345678910>>...28