[CodeGen] Use extendhfsf2 and truncsfhf2 by default #126880

nikic · 2025-02-12T10:04:34Z

The standard libcalls for half to float and float to half conversion are __extendhfsf2 and __truncsfhf2. However, LLVM currently uses __gnu_h2f_ieee and __gnu_f2h_ieee instead. As far as I can tell, these libcalls are an ARM-ism and only provided by libgcc on that platform. compiler-rt always provides both libcalls.

Use the standard libcalls by default, and only use the __gnu libcalls on ARM.

The standard libcalls for half to float and float to half conversion are __extendhfsf2 and __truncsfhf2. However, LLVM currently uses __gnu_h2f_ieee and __gnu_f2h_ieee instead. As far as I can tell, these libcalls are ARM-ism and only provided by libgcc on that platform. compiler-rt always provides both libcalls. Use the standard libcalls by default, and only use the __gnu libcalls on ARM to improve libgcc compatibilty. We encounted this issue with MLIR execution engine test failures on Power 8.

llvmbot · 2025-02-12T10:05:15Z

@llvm/pr-subscribers-llvm-ir
@llvm/pr-subscribers-backend-aarch64
@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-backend-hexagon

@llvm/pr-subscribers-backend-arm

Author: Nikita Popov (nikic)

Changes

The standard libcalls for half to float and float to half conversion are __extendhfsf2 and __truncsfhf2. However, LLVM currently uses __gnu_h2f_ieee and __gnu_f2h_ieee instead. As far as I can tell, these libcalls are ARM-ism and only provided by libgcc on that platform. compiler-rt always provides both libcalls.

Use the standard libcalls by default, and only use the __gnu libcalls on ARM to improve libgcc compatibility.

We encountered this issue with MLIR execution engine test failures on Power 8.

Patch is 223.28 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/126880.diff

35 Files Affected:

(modified) llvm/include/llvm/IR/RuntimeLibcalls.def (+2-2)
(modified) llvm/lib/IR/RuntimeLibcalls.cpp (-3)
(modified) llvm/lib/Target/ARM/ARMISelLowering.cpp (+3)
(modified) llvm/lib/Target/Hexagon/HexagonISelLowering.cpp (-5)
(modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (-3)
(modified) llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp (-5)
(modified) llvm/lib/Target/WebAssembly/WebAssemblyRuntimeLibcallSignatures.cpp (-4)
(modified) llvm/lib/Target/X86/X86ISelLowering.cpp (-3)
(modified) llvm/test/CodeGen/AArch64/16bit-float-promotion-with-nofp.ll (+1-1)
(modified) llvm/test/CodeGen/AArch64/atomicrmw-fadd.ll (+12-12)
(modified) llvm/test/CodeGen/AArch64/atomicrmw-fmax.ll (+12-12)
(modified) llvm/test/CodeGen/AArch64/atomicrmw-fmin.ll (+12-12)
(modified) llvm/test/CodeGen/AArch64/atomicrmw-fsub.ll (+12-12)
(modified) llvm/test/CodeGen/AArch64/strictfp_f16_abi_promote.ll (+20-20)
(modified) llvm/test/CodeGen/LoongArch/fp16-promote.ll (+46-46)
(modified) llvm/test/CodeGen/Mips/fp16-promote.ll (+48-48)
(modified) llvm/test/CodeGen/Mips/ldexp.ll (+2-2)
(modified) llvm/test/CodeGen/PowerPC/atomics.ll (+4-4)
(modified) llvm/test/CodeGen/PowerPC/handle-f16-storage-type.ll (+84-84)
(modified) llvm/test/CodeGen/PowerPC/pr48519.ll (+7-7)
(modified) llvm/test/CodeGen/PowerPC/pr49092.ll (+1-1)
(modified) llvm/test/CodeGen/PowerPC/vector-llrint.ll (+378-378)
(modified) llvm/test/CodeGen/PowerPC/vector-lrint.ll (+378-378)
(modified) llvm/test/CodeGen/SPARC/fp16-promote.ll (+38-38)
(modified) llvm/test/CodeGen/VE/Scalar/fp_extload_truncstore.ll (+14-14)
(modified) llvm/test/CodeGen/X86/cvt16.ll (+5-5)
(modified) llvm/test/CodeGen/X86/fmf-flags.ll (+3-3)
(modified) llvm/test/CodeGen/X86/fp-i129.ll (+2-2)
(modified) llvm/test/CodeGen/X86/fp128-cast-strict.ll (+2-2)
(modified) llvm/test/CodeGen/X86/fptosi-sat-scalar.ll (+10-10)
(modified) llvm/test/CodeGen/X86/fptoui-sat-scalar.ll (+10-10)
(modified) llvm/test/CodeGen/X86/frem.ll (+1-1)
(modified) llvm/test/CodeGen/X86/half-constrained.ll (+7-7)
(modified) llvm/test/CodeGen/X86/ldexp.ll (+2-2)
(modified) llvm/test/CodeGen/X86/llvm.frexp.ll (+5-5)

diff --git a/llvm/include/llvm/IR/RuntimeLibcalls.def b/llvm/include/llvm/IR/RuntimeLibcalls.def
index a7963543c4350..c6ac341d71a20 100644
--- a/llvm/include/llvm/IR/RuntimeLibcalls.def
+++ b/llvm/include/llvm/IR/RuntimeLibcalls.def
@@ -384,8 +384,8 @@ HANDLE_LIBCALL(FPEXT_F16_F128, "__extendhftf2")
 HANDLE_LIBCALL(FPEXT_F16_F80, "__extendhfxf2")
 HANDLE_LIBCALL(FPEXT_F32_F64, "__extendsfdf2")
 HANDLE_LIBCALL(FPEXT_F16_F64, "__extendhfdf2")
-HANDLE_LIBCALL(FPEXT_F16_F32, "__gnu_h2f_ieee")
-HANDLE_LIBCALL(FPROUND_F32_F16, "__gnu_f2h_ieee")
+HANDLE_LIBCALL(FPEXT_F16_F32, "__extendhfsf2")
+HANDLE_LIBCALL(FPROUND_F32_F16, "__truncsfhf2")
 HANDLE_LIBCALL(FPROUND_F64_F16, "__truncdfhf2")
 HANDLE_LIBCALL(FPROUND_F80_F16, "__truncxfhf2")
 HANDLE_LIBCALL(FPROUND_F128_F16, "__trunctfhf2")
diff --git a/llvm/lib/IR/RuntimeLibcalls.cpp b/llvm/lib/IR/RuntimeLibcalls.cpp
index e38fce764b640..1f94400f7c088 100644
--- a/llvm/lib/IR/RuntimeLibcalls.cpp
+++ b/llvm/lib/IR/RuntimeLibcalls.cpp
@@ -170,9 +170,6 @@ void RuntimeLibcallsInfo::initLibcalls(const Triple &TT) {
     // TODO: BridgeOS should be included in isOSDarwin.
     setLibcallName(RTLIB::EXP10_F32, "__exp10f");
     setLibcallName(RTLIB::EXP10_F64, "__exp10");
-  } else {
-    setLibcallName(RTLIB::FPEXT_F16_F32, "__gnu_h2f_ieee");
-    setLibcallName(RTLIB::FPROUND_F32_F16, "__gnu_f2h_ieee");
   }
 
   if (TT.isGNUEnvironment() || TT.isOSFuchsia() ||
diff --git a/llvm/lib/Target/ARM/ARMISelLowering.cpp b/llvm/lib/Target/ARM/ARMISelLowering.cpp
index 5c4fe9d922f4c..3a975e5db0e5e 100644
--- a/llvm/lib/Target/ARM/ARMISelLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMISelLowering.cpp
@@ -767,6 +767,9 @@ ARMTargetLowering::ARMTargetLowering(const TargetMachine &TM,
       setLibcallName(LC.Op, LC.Name);
       setLibcallCallingConv(LC.Op, LC.CC);
     }
+  } else if (!Subtarget->isTargetMachO()) {
+    setLibcallName(RTLIB::FPROUND_F32_F16, "__gnu_f2h_ieee");
+    setLibcallName(RTLIB::FPEXT_F16_F32, "__gnu_h2f_ieee");
   }
 
   if (Subtarget->isThumb1Only())
diff --git a/llvm/lib/Target/Hexagon/HexagonISelLowering.cpp b/llvm/lib/Target/Hexagon/HexagonISelLowering.cpp
index 1a7667fe42fbc..be1960db41479 100644
--- a/llvm/lib/Target/Hexagon/HexagonISelLowering.cpp
+++ b/llvm/lib/Target/Hexagon/HexagonISelLowering.cpp
@@ -1886,11 +1886,6 @@ HexagonTargetLowering::HexagonTargetLowering(const TargetMachine &TM,
     setLibcallName(RTLIB::SQRT_F32, "__hexagon_fast2_sqrtf");
   else
     setLibcallName(RTLIB::SQRT_F32, "__hexagon_sqrtf");
-
-  // Routines to handle fp16 storage type.
-  setLibcallName(RTLIB::FPROUND_F32_F16, "__truncsfhf2");
-  setLibcallName(RTLIB::FPROUND_F64_F16, "__truncdfhf2");
-  setLibcallName(RTLIB::FPEXT_F16_F32, "__extendhfsf2");
 }
 
 const char* HexagonTargetLowering::getTargetNodeName(unsigned Opcode) const {
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 7ca8482149eb9..1e06215fa79e6 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -1549,9 +1549,6 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
   if (Subtarget.useRVVForFixedLengthVectors())
     setTargetDAGCombine(ISD::BITCAST);
 
-  setLibcallName(RTLIB::FPEXT_F16_F32, "__extendhfsf2");
-  setLibcallName(RTLIB::FPROUND_F32_F16, "__truncsfhf2");
-
   // Disable strict node mutation.
   IsStrictFPEnabled = true;
   EnableExtLdPromotion = true;
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
index fedad25c775e2..3c918e8b675f0 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
@@ -377,11 +377,6 @@ WebAssemblyTargetLowering::WebAssemblyTargetLowering(
 
   setMaxAtomicSizeInBitsSupported(64);
 
-  // Override the __gnu_f2h_ieee/__gnu_h2f_ieee names so that the f32 name is
-  // consistent with the f64 and f128 names.
-  setLibcallName(RTLIB::FPEXT_F16_F32, "__extendhfsf2");
-  setLibcallName(RTLIB::FPROUND_F32_F16, "__truncsfhf2");
-
   // Define the emscripten name for return address helper.
   // TODO: when implementing other Wasm backends, make this generic or only do
   // this on emscripten depending on what they end up doing.
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyRuntimeLibcallSignatures.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyRuntimeLibcallSignatures.cpp
index b20a06b238c88..1fe0b1f2e0591 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyRuntimeLibcallSignatures.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyRuntimeLibcallSignatures.cpp
@@ -537,10 +537,6 @@ struct StaticLibcallNameMap {
         Map[NameLibcall.first] = NameLibcall.second;
       }
     }
-    // Override the __gnu_f2h_ieee/__gnu_h2f_ieee names so that the f32 name is
-    // consistent with the f64 and f128 names.
-    Map["__extendhfsf2"] = RTLIB::FPEXT_F16_F32;
-    Map["__truncsfhf2"] = RTLIB::FPROUND_F32_F16;
 
     Map["emscripten_return_address"] = RTLIB::RETURN_ADDRESS;
   }
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 91249f0bb009f..839c28ef643ef 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -736,9 +736,6 @@ X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
     setOperationAction(ISD::STRICT_FP_EXTEND, MVT::f32, Custom);
     setOperationAction(ISD::STRICT_FP_EXTEND, MVT::f64, Custom);
 
-    setLibcallName(RTLIB::FPROUND_F32_F16, "__truncsfhf2");
-    setLibcallName(RTLIB::FPEXT_F16_F32, "__extendhfsf2");
-
     // Lower this to MOVMSK plus an AND.
     setOperationAction(ISD::FGETSIGN, MVT::i64, Custom);
     setOperationAction(ISD::FGETSIGN, MVT::i32, Custom);
diff --git a/llvm/test/CodeGen/AArch64/16bit-float-promotion-with-nofp.ll b/llvm/test/CodeGen/AArch64/16bit-float-promotion-with-nofp.ll
index bfe9ab8424bb0..0bd7c1b10b123 100644
--- a/llvm/test/CodeGen/AArch64/16bit-float-promotion-with-nofp.ll
+++ b/llvm/test/CodeGen/AArch64/16bit-float-promotion-with-nofp.ll
@@ -7,7 +7,7 @@ define half @f2h(float %a) {
 ; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
 ; CHECK-NEXT:    .cfi_def_cfa_offset 16
 ; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    bl __gnu_f2h_ieee
+; CHECK-NEXT:    bl __truncsfhf2
 ; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
 ; CHECK-NEXT:    ret
 entry:
diff --git a/llvm/test/CodeGen/AArch64/atomicrmw-fadd.ll b/llvm/test/CodeGen/AArch64/atomicrmw-fadd.ll
index 0c3a40d93d640..21729b9dfd101 100644
--- a/llvm/test/CodeGen/AArch64/atomicrmw-fadd.ll
+++ b/llvm/test/CodeGen/AArch64/atomicrmw-fadd.ll
@@ -60,13 +60,13 @@ define half @test_atomicrmw_fadd_f16_seq_cst_align2(ptr %ptr, half %value) #0 {
 ; SOFTFP-NOLSE-NEXT:    // Child Loop BB0_3 Depth 2
 ; SOFTFP-NOLSE-NEXT:    mov w22, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w20, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w21, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w22, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w1, w21
 ; SOFTFP-NOLSE-NEXT:    bl __addsf3
-; SOFTFP-NOLSE-NEXT:    bl __gnu_f2h_ieee
+; SOFTFP-NOLSE-NEXT:    bl __truncsfhf2
 ; SOFTFP-NOLSE-NEXT:    mov w8, w0
 ; SOFTFP-NOLSE-NEXT:  .LBB0_3: // %cmpxchg.start
 ; SOFTFP-NOLSE-NEXT:    // Parent Loop BB0_2 Depth=1
@@ -148,13 +148,13 @@ define half @test_atomicrmw_fadd_f16_seq_cst_align4(ptr %ptr, half %value) #0 {
 ; SOFTFP-NOLSE-NEXT:    // Child Loop BB1_3 Depth 2
 ; SOFTFP-NOLSE-NEXT:    mov w22, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w20, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w21, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w22, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w1, w21
 ; SOFTFP-NOLSE-NEXT:    bl __addsf3
-; SOFTFP-NOLSE-NEXT:    bl __gnu_f2h_ieee
+; SOFTFP-NOLSE-NEXT:    bl __truncsfhf2
 ; SOFTFP-NOLSE-NEXT:    mov w8, w0
 ; SOFTFP-NOLSE-NEXT:  .LBB1_3: // %cmpxchg.start
 ; SOFTFP-NOLSE-NEXT:    // Parent Loop BB1_2 Depth=1
@@ -712,22 +712,22 @@ define <2 x half> @test_atomicrmw_fadd_v2f16_seq_cst_align4(ptr %ptr, <2 x half>
 ; SOFTFP-NOLSE-NEXT:    // =>This Loop Header: Depth=1
 ; SOFTFP-NOLSE-NEXT:    // Child Loop BB7_3 Depth 2
 ; SOFTFP-NOLSE-NEXT:    and w0, w19, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w24, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w23, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w1, w24
 ; SOFTFP-NOLSE-NEXT:    bl __addsf3
-; SOFTFP-NOLSE-NEXT:    bl __gnu_f2h_ieee
+; SOFTFP-NOLSE-NEXT:    bl __truncsfhf2
 ; SOFTFP-NOLSE-NEXT:    mov w24, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w21, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w25, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w22, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w1, w25
 ; SOFTFP-NOLSE-NEXT:    bl __addsf3
-; SOFTFP-NOLSE-NEXT:    bl __gnu_f2h_ieee
+; SOFTFP-NOLSE-NEXT:    bl __truncsfhf2
 ; SOFTFP-NOLSE-NEXT:    mov w8, w22
 ; SOFTFP-NOLSE-NEXT:    bfi w0, w24, #16, #16
 ; SOFTFP-NOLSE-NEXT:    bfi w8, w23, #16, #16
diff --git a/llvm/test/CodeGen/AArch64/atomicrmw-fmax.ll b/llvm/test/CodeGen/AArch64/atomicrmw-fmax.ll
index 24088998f36d1..9b5e48d2b4217 100644
--- a/llvm/test/CodeGen/AArch64/atomicrmw-fmax.ll
+++ b/llvm/test/CodeGen/AArch64/atomicrmw-fmax.ll
@@ -62,13 +62,13 @@ define half @test_atomicrmw_fmax_f16_seq_cst_align2(ptr %ptr, half %value) #0 {
 ; SOFTFP-NOLSE-NEXT:    // Child Loop BB0_3 Depth 2
 ; SOFTFP-NOLSE-NEXT:    mov w22, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w20, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w21, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w22, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w1, w21
 ; SOFTFP-NOLSE-NEXT:    bl fmaxf
-; SOFTFP-NOLSE-NEXT:    bl __gnu_f2h_ieee
+; SOFTFP-NOLSE-NEXT:    bl __truncsfhf2
 ; SOFTFP-NOLSE-NEXT:    mov w8, w0
 ; SOFTFP-NOLSE-NEXT:  .LBB0_3: // %cmpxchg.start
 ; SOFTFP-NOLSE-NEXT:    // Parent Loop BB0_2 Depth=1
@@ -150,13 +150,13 @@ define half @test_atomicrmw_fmax_f16_seq_cst_align4(ptr %ptr, half %value) #0 {
 ; SOFTFP-NOLSE-NEXT:    // Child Loop BB1_3 Depth 2
 ; SOFTFP-NOLSE-NEXT:    mov w22, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w20, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w21, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w22, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w1, w21
 ; SOFTFP-NOLSE-NEXT:    bl fmaxf
-; SOFTFP-NOLSE-NEXT:    bl __gnu_f2h_ieee
+; SOFTFP-NOLSE-NEXT:    bl __truncsfhf2
 ; SOFTFP-NOLSE-NEXT:    mov w8, w0
 ; SOFTFP-NOLSE-NEXT:  .LBB1_3: // %cmpxchg.start
 ; SOFTFP-NOLSE-NEXT:    // Parent Loop BB1_2 Depth=1
@@ -592,22 +592,22 @@ define <2 x half> @test_atomicrmw_fmax_v2f16_seq_cst_align4(ptr %ptr, <2 x half>
 ; SOFTFP-NOLSE-NEXT:    // =>This Loop Header: Depth=1
 ; SOFTFP-NOLSE-NEXT:    // Child Loop BB6_3 Depth 2
 ; SOFTFP-NOLSE-NEXT:    and w0, w19, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w24, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w23, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w1, w24
 ; SOFTFP-NOLSE-NEXT:    bl fmaxf
-; SOFTFP-NOLSE-NEXT:    bl __gnu_f2h_ieee
+; SOFTFP-NOLSE-NEXT:    bl __truncsfhf2
 ; SOFTFP-NOLSE-NEXT:    mov w24, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w21, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w25, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w22, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w1, w25
 ; SOFTFP-NOLSE-NEXT:    bl fmaxf
-; SOFTFP-NOLSE-NEXT:    bl __gnu_f2h_ieee
+; SOFTFP-NOLSE-NEXT:    bl __truncsfhf2
 ; SOFTFP-NOLSE-NEXT:    mov w8, w22
 ; SOFTFP-NOLSE-NEXT:    bfi w0, w24, #16, #16
 ; SOFTFP-NOLSE-NEXT:    bfi w8, w23, #16, #16
diff --git a/llvm/test/CodeGen/AArch64/atomicrmw-fmin.ll b/llvm/test/CodeGen/AArch64/atomicrmw-fmin.ll
index 65f1f4863c173..f6c542fe7d407 100644
--- a/llvm/test/CodeGen/AArch64/atomicrmw-fmin.ll
+++ b/llvm/test/CodeGen/AArch64/atomicrmw-fmin.ll
@@ -62,13 +62,13 @@ define half @test_atomicrmw_fmin_f16_seq_cst_align2(ptr %ptr, half %value) #0 {
 ; SOFTFP-NOLSE-NEXT:    // Child Loop BB0_3 Depth 2
 ; SOFTFP-NOLSE-NEXT:    mov w22, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w20, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w21, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w22, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w1, w21
 ; SOFTFP-NOLSE-NEXT:    bl fminf
-; SOFTFP-NOLSE-NEXT:    bl __gnu_f2h_ieee
+; SOFTFP-NOLSE-NEXT:    bl __truncsfhf2
 ; SOFTFP-NOLSE-NEXT:    mov w8, w0
 ; SOFTFP-NOLSE-NEXT:  .LBB0_3: // %cmpxchg.start
 ; SOFTFP-NOLSE-NEXT:    // Parent Loop BB0_2 Depth=1
@@ -150,13 +150,13 @@ define half @test_atomicrmw_fmin_f16_seq_cst_align4(ptr %ptr, half %value) #0 {
 ; SOFTFP-NOLSE-NEXT:    // Child Loop BB1_3 Depth 2
 ; SOFTFP-NOLSE-NEXT:    mov w22, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w20, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w21, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w22, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w1, w21
 ; SOFTFP-NOLSE-NEXT:    bl fminf
-; SOFTFP-NOLSE-NEXT:    bl __gnu_f2h_ieee
+; SOFTFP-NOLSE-NEXT:    bl __truncsfhf2
 ; SOFTFP-NOLSE-NEXT:    mov w8, w0
 ; SOFTFP-NOLSE-NEXT:  .LBB1_3: // %cmpxchg.start
 ; SOFTFP-NOLSE-NEXT:    // Parent Loop BB1_2 Depth=1
@@ -592,22 +592,22 @@ define <2 x half> @test_atomicrmw_fmin_v2f16_seq_cst_align4(ptr %ptr, <2 x half>
 ; SOFTFP-NOLSE-NEXT:    // =>This Loop Header: Depth=1
 ; SOFTFP-NOLSE-NEXT:    // Child Loop BB6_3 Depth 2
 ; SOFTFP-NOLSE-NEXT:    and w0, w19, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w24, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w23, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w1, w24
 ; SOFTFP-NOLSE-NEXT:    bl fminf
-; SOFTFP-NOLSE-NEXT:    bl __gnu_f2h_ieee
+; SOFTFP-NOLSE-NEXT:    bl __truncsfhf2
 ; SOFTFP-NOLSE-NEXT:    mov w24, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w21, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w25, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w22, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w1, w25
 ; SOFTFP-NOLSE-NEXT:    bl fminf
-; SOFTFP-NOLSE-NEXT:    bl __gnu_f2h_ieee
+; SOFTFP-NOLSE-NEXT:    bl __truncsfhf2
 ; SOFTFP-NOLSE-NEXT:    mov w8, w22
 ; SOFTFP-NOLSE-NEXT:    bfi w0, w24, #16, #16
 ; SOFTFP-NOLSE-NEXT:    bfi w8, w23, #16, #16
diff --git a/llvm/test/CodeGen/AArch64/atomicrmw-fsub.ll b/llvm/test/CodeGen/AArch64/atomicrmw-fsub.ll
index 0f1a2f03c98c3..82e0f14e68e26 100644
--- a/llvm/test/CodeGen/AArch64/atomicrmw-fsub.ll
+++ b/llvm/test/CodeGen/AArch64/atomicrmw-fsub.ll
@@ -60,13 +60,13 @@ define half @test_atomicrmw_fsub_f16_seq_cst_align2(ptr %ptr, half %value) #0 {
 ; SOFTFP-NOLSE-NEXT:    // Child Loop BB0_3 Depth 2
 ; SOFTFP-NOLSE-NEXT:    mov w22, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w20, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w21, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w22, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w1, w21
 ; SOFTFP-NOLSE-NEXT:    bl __subsf3
-; SOFTFP-NOLSE-NEXT:    bl __gnu_f2h_ieee
+; SOFTFP-NOLSE-NEXT:    bl __truncsfhf2
 ; SOFTFP-NOLSE-NEXT:    mov w8, w0
 ; SOFTFP-NOLSE-NEXT:  .LBB0_3: // %cmpxchg.start
 ; SOFTFP-NOLSE-NEXT:    // Parent Loop BB0_2 Depth=1
@@ -148,13 +148,13 @@ define half @test_atomicrmw_fsub_f16_seq_cst_align4(ptr %ptr, half %value) #0 {
 ; SOFTFP-NOLSE-NEXT:    // Child Loop BB1_3 Depth 2
 ; SOFTFP-NOLSE-NEXT:    mov w22, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w20, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w21, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w22, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w1, w21
 ; SOFTFP-NOLSE-NEXT:    bl __subsf3
-; SOFTFP-NOLSE-NEXT:    bl __gnu_f2h_ieee
+; SOFTFP-NOLSE-NEXT:    bl __truncsfhf2
 ; SOFTFP-NOLSE-NEXT:    mov w8, w0
 ; SOFTFP-NOLSE-NEXT:  .LBB1_3: // %cmpxchg.start
 ; SOFTFP-NOLSE-NEXT:    // Parent Loop BB1_2 Depth=1
@@ -712,22 +712,22 @@ define <2 x half> @test_atomicrmw_fsub_v2f16_seq_cst_align4(ptr %ptr, <2 x half>
 ; SOFTFP-NOLSE-NEXT:    // =>This Loop Header: Depth=1
 ; SOFTFP-NOLSE-NEXT:    // Child Loop BB7_3 Depth 2
 ; SOFTFP-NOLSE-NEXT:    and w0, w19, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w24, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w23, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w1, w24
 ; SOFTFP-NOLSE-NEXT:    bl __subsf3
-; SOFTFP-NOLSE-NEXT:    bl __gnu_f2h_ieee
+; SOFTFP-NOLSE-NEXT:    bl __truncsfhf2
 ; SOFTFP-NOLSE-NEXT:    mov w24, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w21, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w25, w0
 ; SOFTFP-NOLSE-NEXT:    and w0, w22, #0xffff
-; SOFTFP-NOLSE-NEXT:    bl __gnu_h2f_ieee
+; SOFTFP-NOLSE-NEXT:    bl __extendhfsf2
 ; SOFTFP-NOLSE-NEXT:    mov w1, w25
 ; SOFTFP-NOLSE-NEXT:    bl __subsf3
-; SOFTFP-NOLSE-NEXT:    bl __gnu_f2h_ieee
+; SOFTFP-NOLSE-NEXT:    bl __truncsfhf2
 ; SOFTFP-NOLSE-NEXT:    mov w8, w22
 ; SOFTFP-NOLSE-NEXT:    bfi w0, w24, #16, #16
 ; SOFTFP-NOLSE-NEXT:    bfi w8, w23, #16, #16
diff --git a/llvm/test/CodeGen/AArch64/strictfp_f16_abi_promote.ll b/llvm/test/CodeGen/AArch64/strictfp_f16_abi_promote.ll
index 3db802a2bc355..63b8a1cee27ae 100644
--- a/llvm/test/CodeGen/AArch64/strictfp_f16_abi_promote.ll
+++ b/llvm/test/CodeGen/AArch64/strictfp_f16_abi_promote.ll
@@ -22,7 +22,7 @@ define void @f16_arg(half %arg, ptr %ptr) #0 {
 ; NOFP16-NEXT:    .cfi_offset w30, -16
 ; NOFP16-NEXT:    and w0, w0, #0xffff
 ; NOFP16-NEXT:    mov x19, x1
-; NOFP16-NEXT:    bl __gnu_h2f_ieee
+; NOFP16-NEXT:    bl __extendhfsf2
 ; NOFP16-NEXT:    str w0, [x19]
 ; NOFP16-NEXT:    ldp x30, x19, [sp], #16 // 16-byte Folded Reload
 ; NOFP16-NEXT:    ret
@@ -44,10 +44,10 @@ define void @v2f16_arg(<2 x half> %arg, ptr %ptr) #0 {
 ; NOFP16-NEXT:    and w0, w0, #0xffff
 ; NOFP16-NEXT:    mov x19, x2
 ; NOFP16-NEXT:    mov w20, w1
-; NOFP16-NEXT:    bl __gnu_h2f_ieee
+; NOFP16-NEXT:    bl __extendhfsf2
 ; NOFP16-NEXT:    mov w21, w0
 ; NOFP16-NEXT:    and w0, w20, #0xffff
-; NOFP16-NEXT:    bl __gnu_h2f_ieee
+; NOFP16-NEXT:    bl __extendhfsf2
 ; NOFP16-NEXT:    stp w21, w0, [x19]
 ; NOFP16-NEXT:    ldp x20, x19, [sp, #16] // 16-byte Folded Reload
 ; NOFP16-NEXT:    ldp x30, x21, [sp], #32 // 16-byte Folded Reload
@@ -73,14 +73,14 @@ define void @v3f16_arg(<3 x half> %arg, ptr %ptr) #0 {
 ; NOFP16-NEXT:    and w0, w1, #0xffff
 ; NOFP16-NEXT:    mov x19, x3
 ; NOFP16-NEXT:    mov w20, w2
-; NOFP16-NEXT:    bl __gnu_h2f_ieee
+; NOFP16-NEXT:    bl __extendhfsf2
 ; NOFP16-NEXT:    mov w22, w0
 ; NOFP16-NEXT:    and w0, w21, #0xffff
-; NOFP16-NEXT:    bl __gnu_h2f_i...
[truncated]

nikic · 2025-02-12T13:50:17Z

Okay, looking a bit more deeply, libgcc only seems to actually enable __extendhfsf2 and __truncsfhf2 for i386 and riscv, if I'm reading the sources correctly.

So while I still think it makes sense to use these symbols instead of the ARM ones, it probably makes little practical difference...

llvmbot added backend:ARM backend:AArch64 backend:Hexagon backend:RISC-V backend:PowerPC backend:WebAssembly backend:X86 backend:loongarch llvm:ir labels Feb 12, 2025

nikic requested review from arsenm, topperc and efriedma-quic February 12, 2025 10:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[CodeGen] Use extendhfsf2 and truncsfhf2 by default #126880

[CodeGen] Use extendhfsf2 and truncsfhf2 by default #126880

nikic commented Feb 12, 2025 •

edited

Loading

llvmbot commented Feb 12, 2025 •

edited

Loading

nikic commented Feb 12, 2025

[CodeGen] Use __extendhfsf2 and __truncsfhf2 by default #126880

Are you sure you want to change the base?

[CodeGen] Use __extendhfsf2 and __truncsfhf2 by default #126880

Conversation

nikic commented Feb 12, 2025 • edited Loading

llvmbot commented Feb 12, 2025 • edited Loading

nikic commented Feb 12, 2025

[CodeGen] Use extendhfsf2 and truncsfhf2 by default #126880

[CodeGen] Use extendhfsf2 and truncsfhf2 by default #126880

nikic commented Feb 12, 2025 •

edited

Loading

llvmbot commented Feb 12, 2025 •

edited

Loading