Skip to content

Comments

JIT: Fix fgFoldCondToReturnBlock for multi-statement return blocks#124642

Open
yykkibbb wants to merge 4 commits intodotnet:mainfrom
yykkibbb:fix/constant-folded-operand-codegen-123621
Open

JIT: Fix fgFoldCondToReturnBlock for multi-statement return blocks#124642
yykkibbb wants to merge 4 commits intodotnet:mainfrom
yykkibbb:fix/constant-folded-operand-codegen-123621

Conversation

@yykkibbb
Copy link

Summary

Fixes #123621

When a constant-folded operand appears after a non-constant operand in a short-circuit && expression (e.g., v == 2 && Environment.NewLine != "\r\n"), callee inlining can leave dead local stores in the return block. The isReturnBool lambda in fgFoldCondToReturnBlock required hasSingleStmt(), which caused the optimization to bail out when these dead stores were present, resulting in suboptimal branching codegen.

Changes

  • src/coreclr/jit/optimizebools.cpp: Relax the hasSingleStmt() constraint in isReturnBool to allow preceding statements as long as they have no globally visible side effects (GTF_GLOBALLY_VISIBLE_SIDE_EFFECTS). This enables fgFoldCondToReturnBlock to fold the conditional into a branchless return even when dead local stores from inlining remain in the block.

Before (ARM64, Inline_After)

      cmp     w0, #2
      bne     G_M4495_IG04
      mov     w0, #1
      ret
G_M4495_IG04:
      mov     w0, #0
      ret

After (ARM64, Inline_After)

      cmp     w0, #2
      cset    x0, eq
      ret

Test plan

  • Added regression test Runtime_123621 covering the original issue pattern
  • Verified Hoisted, Inline_Before, and Inline_After all produce identical branchless codegen (cset on ARM64)
  • Verified existing DevDiv_168744 regression test still passes
  • Verified side-effect-ful blocks are correctly excluded from the optimization

…otnet#123621)

When a constant-folded operand appears after a non-constant operand in a
short-circuit && expression, callee inlining may leave dead local stores
in the return block. The isReturnBool lambda in fgFoldCondToReturnBlock
required hasSingleStmt(), causing the optimization to bail out when these
dead stores were present.

Relax the single-statement constraint to allow preceding statements as
long as they have no globally visible side effects (e.g., dead local
stores left over from inlining).

Fixes dotnet#123621
@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 20, 2026
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Feb 20, 2026
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@@ -0,0 +1,53 @@
// Licensed to the .NET Foundation under one or more agreements.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not a regression to be under JIT/Regression

This is an optimization improvement, not a regression fix,
so the test belongs under JIT/opt/OptimizeBools.
@yykkibbb yykkibbb force-pushed the fix/constant-folded-operand-codegen-123621 branch from c2be40f to 8a2123e Compare February 20, 2026 15:39
@jakobbotsch
Copy link
Member

Do you know why the dead stores aren't being removed by dead code elimination? It would probably be better to fix that if possible.

@yykkibbb yykkibbb force-pushed the fix/constant-folded-operand-codegen-123621 branch from 09d46df to 4f539bb Compare February 20, 2026 17:19
The VN-based dead store removal phase only handled value-redundant
stores but had no mechanism to remove stores with zero remaining uses.
After Redundant branch opts removes JTRUE references, inlining temps'
dead stores were left behind, preventing fgFoldCondToReturnBlock from
producing branchless codegen.

- Add an SSA use recount pass to get accurate counts after earlier
  phases may have removed references without updating use counts
- Change the inner loop to start at defIndex=0 so inlining temps'
  first explicit def (at m_array[0], with no implicit entry def) is
  considered for removal
- Add a "no remaining uses" dead store removal path that removes the
  entire statement when data has no side effects, falling back to the
  COMMA approach otherwise
- Revert optimizebools.cpp to the original hasSingleStmt() check,
  since the dead stores are now properly removed before Optimize bools
@yykkibbb yykkibbb force-pushed the fix/constant-folded-operand-codegen-123621 branch from 4f539bb to 74defde Compare February 20, 2026 17:21
@yykkibbb
Copy link
Author

@jakobbotsch Thanks so much for taking the time to review — I really appreciate the suggestion. This is my first contribution to this repo, and I'm learning as I go, so I know there may be rough edges. Your feedback has been incredibly helpful, and I want to do my best to get this right.

The VN-based dead store removal phase only handles value-redundant stores (where the new value matches the previous def). It doesn't detect stores that simply have no remaining uses — which is exactly the case here after Redundant branch opts removes the referencing JTRUEs.

I've updated the PR to fix this at the DCE level:

  • Added an SSA use recount pass at the start of optVNBasedDeadStoreRemoval (use counts become stale after earlier phases remove references)
  • Added a "no remaining uses" removal path (GetNumUses() == 0)
  • Changed the loop to start at defIndex = 0 so inlining temps' first explicit def is also considered
  • Reverted optimizebools.cpp back to the original hasSingleStmt() check

Both dead stores are now properly removed, and fgFoldCondToReturnBlock produces branchless cset codegen.

@jakobbotsch
Copy link
Member

It was enough to say that this is dependent on RBO. I think this change is too invasive – I would go back to what you had before.

…t#123621)

Revert the DCE-level fix as it is too invasive. The original
optimizebools.cpp approach (allowing multi-statement return blocks
in isReturnBool) is sufficient since the dead stores are dependent
on Redundant branch opts.
@yykkibbb
Copy link
Author

@jakobbotsch Thank you for the feedback — I've reverted the DCE changes and restored the original approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Position of constant folded operand in or-statement affects codegen

3 participants