Skip to content

[AArch64][SelectionDAG] Generate subs+csel for usub.sat#193203

Open
bojle wants to merge 3 commits intollvm:mainfrom
bojle:upstream_satsub
Open

[AArch64][SelectionDAG] Generate subs+csel for usub.sat#193203
bojle wants to merge 3 commits intollvm:mainfrom
bojle:upstream_satsub

Conversation

@bojle
Copy link
Copy Markdown
Contributor

@bojle bojle commented Apr 21, 2026

Fixes #191488

As this is a regression of
#170076, adds a check to avoid generic lowering of usub.sat to X - zext(X != 0) in case of aarch64 via a virtual hook in TargetLowering. All other backends will still receive generic lowering as implemented in the original patch.

Change-Id: I0a194bcc9e66819c12d0f9179464823301f0d7bf

@bojle bojle requested a review from efriedma-quic April 21, 2026 12:03
@llvmbot llvmbot added backend:AArch64 llvm:SelectionDAG SelectionDAGISel as well labels Apr 21, 2026
@llvmbot
Copy link
Copy Markdown
Member

llvmbot commented Apr 21, 2026

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-llvm-selectiondag

Author: Shreeyash Pandey (bojle)

Changes

Fixes #191488

As this is a regression of
#170076, adds a check to avoid generic lowering of usub.sat to X - zext(X != 0) in case of aarch64 via a virtual hook in TargetLowering. All other backends will still receive generic lowering as implemented in the original patch.

Change-Id: I0a194bcc9e66819c12d0f9179464823301f0d7bf


Full diff: https://github.com/llvm/llvm-project/pull/193203.diff

6 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/TargetLowering.h (+4)
  • (modified) llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp (+2-1)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+5)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.h (+2)
  • (modified) llvm/test/CodeGen/AArch64/and-mask-removal.ll (+3-3)
  • (modified) llvm/test/CodeGen/AArch64/usub_sat_plus.ll (+18)
diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h b/llvm/include/llvm/CodeGen/TargetLowering.h
index 59a0f2d2e0c2a..441a407e2edc1 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -3595,6 +3595,10 @@ class LLVM_ABI TargetLoweringBase {
     return false;
   }
 
+  /// Should usub.sat(X, 1) prefer the generic lowering X - zext(X != 0) over
+  /// the default overflow/select expansion?
+  virtual bool preferSubOfZextForUsubSatOne(EVT VT) const { return true; }
+
   /// True if target has some particular form of dealing with pointer arithmetic
   /// semantics for pointers with the given value type. False if pointer
   /// arithmetic should not be preserved for passes such as instruction
diff --git a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
index e6aa222425d13..be7401c0328d2 100644
--- a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -11475,7 +11475,8 @@ SDValue TargetLowering::expandAddSubSat(SDNode *Node, SelectionDAG &DAG) const {
   }
 
   // usub.sat(a, 1) -> sub(a, zext(a != 0))
-  if (Opcode == ISD::USUBSAT && isOneOrOneSplat(RHS)) {
+  if (Opcode == ISD::USUBSAT && isOneOrOneSplat(RHS) &&
+      preferSubOfZextForUsubSatOne(VT)) {
     LHS = DAG.getFreeze(LHS);
     SDValue Zero = DAG.getConstant(0, dl, VT);
     EVT BoolVT = getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), VT);
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 9b34d9b385b4e..ea4d5467c73d5 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -30978,6 +30978,11 @@ bool AArch64TargetLowering::shouldConvertFpToSat(unsigned Op, EVT FPVT,
   return TargetLowering::shouldConvertFpToSat(Op, FPVT, VT);
 }
 
+bool AArch64TargetLowering::preferSubOfZextForUsubSatOne(EVT /*VT*/) const {
+  // See https://github.com/llvm/llvm-project/issues/191488
+  return false;
+}
+
 bool AArch64TargetLowering::preferSelectsOverBooleanArithmetic(EVT VT) const {
   // Expand scalar and SVE operations using selects. Neon vectors prefer sub to
   // avoid vselect becoming bsl / unrolling.
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.h b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
index 58efdd3e18fc0..cdef09ef7013e 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -450,6 +450,8 @@ class AArch64TargetLowering : public TargetLowering {
 
   bool shouldConvertFpToSat(unsigned Op, EVT FPVT, EVT VT) const override;
 
+  bool preferSubOfZextForUsubSatOne(EVT VT) const override;
+
   bool preferSelectsOverBooleanArithmetic(EVT VT) const override;
 
   bool isComplexDeinterleavingSupported() const override;
diff --git a/llvm/test/CodeGen/AArch64/and-mask-removal.ll b/llvm/test/CodeGen/AArch64/and-mask-removal.ll
index 855fe5caf97b2..5046c0571ad2b 100644
--- a/llvm/test/CodeGen/AArch64/and-mask-removal.ll
+++ b/llvm/test/CodeGen/AArch64/and-mask-removal.ll
@@ -483,9 +483,9 @@ define i64 @pr58109(i8 signext %0) {
 ; CHECK-SD-LABEL: pr58109:
 ; CHECK-SD:       ; %bb.0:
 ; CHECK-SD-NEXT:    add w8, w0, #1
-; CHECK-SD-NEXT:    ands w8, w8, #0xff
-; CHECK-SD-NEXT:    cset w9, ne
-; CHECK-SD-NEXT:    sub w0, w8, w9
+; CHECK-SD-NEXT:    and w8, w8, #0xff
+; CHECK-SD-NEXT:    subs w8, w8, #1
+; CHECK-SD-NEXT:    csel w0, wzr, w8, lo
 ; CHECK-SD-NEXT:    ret
 ;
 ; CHECK-GI-LABEL: pr58109:
diff --git a/llvm/test/CodeGen/AArch64/usub_sat_plus.ll b/llvm/test/CodeGen/AArch64/usub_sat_plus.ll
index 2793aeb163c94..9f1e2eeb04781 100644
--- a/llvm/test/CodeGen/AArch64/usub_sat_plus.ll
+++ b/llvm/test/CodeGen/AArch64/usub_sat_plus.ll
@@ -8,6 +8,24 @@ declare i16 @llvm.usub.sat.i16(i16, i16)
 declare i32 @llvm.usub.sat.i32(i32, i32)
 declare i64 @llvm.usub.sat.i64(i64, i64)
 
+define i32 @sat_dec_i32(i32 %x) nounwind {
+; CHECK-SD-LABEL: sat_dec_i32:
+; CHECK-SD:       // %bb.0:
+; CHECK-SD-NEXT:    subs w8, w0, #1
+; CHECK-SD-NEXT:    csel w0, wzr, w8, lo
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: sat_dec_i32:
+; CHECK-GI:       // %bb.0:
+; CHECK-GI-NEXT:    subs w8, w0, #1
+; CHECK-GI-NEXT:    cset w9, lo
+; CHECK-GI-NEXT:    tst w9, #0x1
+; CHECK-GI-NEXT:    csel w0, wzr, w8, ne
+; CHECK-GI-NEXT:    ret
+  %tmp = call i32 @llvm.usub.sat.i32(i32 %x, i32 1)
+  ret i32 %tmp
+}
+
 define i32 @func32(i32 %x, i32 %y, i32 %z) nounwind {
 ; CHECK-SD-LABEL: func32:
 ; CHECK-SD:       // %bb.0:

@llvmbot
Copy link
Copy Markdown
Member

llvmbot commented Apr 21, 2026

@llvm/pr-subscribers-backend-aarch64

Author: Shreeyash Pandey (bojle)

Changes

Fixes #191488

As this is a regression of
#170076, adds a check to avoid generic lowering of usub.sat to X - zext(X != 0) in case of aarch64 via a virtual hook in TargetLowering. All other backends will still receive generic lowering as implemented in the original patch.

Change-Id: I0a194bcc9e66819c12d0f9179464823301f0d7bf


Full diff: https://github.com/llvm/llvm-project/pull/193203.diff

6 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/TargetLowering.h (+4)
  • (modified) llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp (+2-1)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+5)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.h (+2)
  • (modified) llvm/test/CodeGen/AArch64/and-mask-removal.ll (+3-3)
  • (modified) llvm/test/CodeGen/AArch64/usub_sat_plus.ll (+18)
diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h b/llvm/include/llvm/CodeGen/TargetLowering.h
index 59a0f2d2e0c2a..441a407e2edc1 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -3595,6 +3595,10 @@ class LLVM_ABI TargetLoweringBase {
     return false;
   }
 
+  /// Should usub.sat(X, 1) prefer the generic lowering X - zext(X != 0) over
+  /// the default overflow/select expansion?
+  virtual bool preferSubOfZextForUsubSatOne(EVT VT) const { return true; }
+
   /// True if target has some particular form of dealing with pointer arithmetic
   /// semantics for pointers with the given value type. False if pointer
   /// arithmetic should not be preserved for passes such as instruction
diff --git a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
index e6aa222425d13..be7401c0328d2 100644
--- a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -11475,7 +11475,8 @@ SDValue TargetLowering::expandAddSubSat(SDNode *Node, SelectionDAG &DAG) const {
   }
 
   // usub.sat(a, 1) -> sub(a, zext(a != 0))
-  if (Opcode == ISD::USUBSAT && isOneOrOneSplat(RHS)) {
+  if (Opcode == ISD::USUBSAT && isOneOrOneSplat(RHS) &&
+      preferSubOfZextForUsubSatOne(VT)) {
     LHS = DAG.getFreeze(LHS);
     SDValue Zero = DAG.getConstant(0, dl, VT);
     EVT BoolVT = getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), VT);
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 9b34d9b385b4e..ea4d5467c73d5 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -30978,6 +30978,11 @@ bool AArch64TargetLowering::shouldConvertFpToSat(unsigned Op, EVT FPVT,
   return TargetLowering::shouldConvertFpToSat(Op, FPVT, VT);
 }
 
+bool AArch64TargetLowering::preferSubOfZextForUsubSatOne(EVT /*VT*/) const {
+  // See https://github.com/llvm/llvm-project/issues/191488
+  return false;
+}
+
 bool AArch64TargetLowering::preferSelectsOverBooleanArithmetic(EVT VT) const {
   // Expand scalar and SVE operations using selects. Neon vectors prefer sub to
   // avoid vselect becoming bsl / unrolling.
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.h b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
index 58efdd3e18fc0..cdef09ef7013e 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -450,6 +450,8 @@ class AArch64TargetLowering : public TargetLowering {
 
   bool shouldConvertFpToSat(unsigned Op, EVT FPVT, EVT VT) const override;
 
+  bool preferSubOfZextForUsubSatOne(EVT VT) const override;
+
   bool preferSelectsOverBooleanArithmetic(EVT VT) const override;
 
   bool isComplexDeinterleavingSupported() const override;
diff --git a/llvm/test/CodeGen/AArch64/and-mask-removal.ll b/llvm/test/CodeGen/AArch64/and-mask-removal.ll
index 855fe5caf97b2..5046c0571ad2b 100644
--- a/llvm/test/CodeGen/AArch64/and-mask-removal.ll
+++ b/llvm/test/CodeGen/AArch64/and-mask-removal.ll
@@ -483,9 +483,9 @@ define i64 @pr58109(i8 signext %0) {
 ; CHECK-SD-LABEL: pr58109:
 ; CHECK-SD:       ; %bb.0:
 ; CHECK-SD-NEXT:    add w8, w0, #1
-; CHECK-SD-NEXT:    ands w8, w8, #0xff
-; CHECK-SD-NEXT:    cset w9, ne
-; CHECK-SD-NEXT:    sub w0, w8, w9
+; CHECK-SD-NEXT:    and w8, w8, #0xff
+; CHECK-SD-NEXT:    subs w8, w8, #1
+; CHECK-SD-NEXT:    csel w0, wzr, w8, lo
 ; CHECK-SD-NEXT:    ret
 ;
 ; CHECK-GI-LABEL: pr58109:
diff --git a/llvm/test/CodeGen/AArch64/usub_sat_plus.ll b/llvm/test/CodeGen/AArch64/usub_sat_plus.ll
index 2793aeb163c94..9f1e2eeb04781 100644
--- a/llvm/test/CodeGen/AArch64/usub_sat_plus.ll
+++ b/llvm/test/CodeGen/AArch64/usub_sat_plus.ll
@@ -8,6 +8,24 @@ declare i16 @llvm.usub.sat.i16(i16, i16)
 declare i32 @llvm.usub.sat.i32(i32, i32)
 declare i64 @llvm.usub.sat.i64(i64, i64)
 
+define i32 @sat_dec_i32(i32 %x) nounwind {
+; CHECK-SD-LABEL: sat_dec_i32:
+; CHECK-SD:       // %bb.0:
+; CHECK-SD-NEXT:    subs w8, w0, #1
+; CHECK-SD-NEXT:    csel w0, wzr, w8, lo
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: sat_dec_i32:
+; CHECK-GI:       // %bb.0:
+; CHECK-GI-NEXT:    subs w8, w0, #1
+; CHECK-GI-NEXT:    cset w9, lo
+; CHECK-GI-NEXT:    tst w9, #0x1
+; CHECK-GI-NEXT:    csel w0, wzr, w8, ne
+; CHECK-GI-NEXT:    ret
+  %tmp = call i32 @llvm.usub.sat.i32(i32 %x, i32 1)
+  ret i32 %tmp
+}
+
 define i32 @func32(i32 %x, i32 %y, i32 %z) nounwind {
 ; CHECK-SD-LABEL: func32:
 ; CHECK-SD:       // %bb.0:

Comment thread llvm/include/llvm/CodeGen/TargetLowering.h Outdated
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 22, 2026

🐧 Linux x64 Test Results

  • 194186 tests passed
  • 5119 tests skipped

✅ The build succeeded and all tests passed.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 22, 2026

🪟 Windows x64 Test Results

  • 133818 tests passed
  • 3136 tests skipped

✅ The build succeeded and all tests passed.

Copy link
Copy Markdown
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please regenerate x86/combine-sub-usat.ll

bojle added 3 commits April 23, 2026 01:27
Fixes llvm#191488

As this is a regression of
llvm#170076, adds a check to avoid
generic lowering of usub.sat to X - zext(X != 0) in case of aarch64 via
a virtual hook in TargetLowering. All other backends will still receive
generic lowering as implemented in the original patch.

Change-Id: I0a194bcc9e66819c12d0f9179464823301f0d7bf
Change-Id: I4575a604fe5a76be2e657c63707c5fe25d631b98
Change-Id: If34cdfe1eea8c30f6be502a63270ae5b6c34d141
@bojle bojle force-pushed the upstream_satsub branch from eb63508 to a5fc224 Compare April 23, 2026 09:29
@RKSimon RKSimon self-requested a review April 23, 2026 11:46
@bojle bojle requested a review from arsenm April 23, 2026 13:15
@bojle
Copy link
Copy Markdown
Contributor Author

bojle commented Apr 23, 2026

hmm... the CI failures seem unrelated to this patch. Are these the same failures as #193724?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[AArch64] Suboptimal code for saturating subtract of 1

4 participants