Skip to content

Fix perf regression urls#2638

Merged
dgageot merged 3 commits intodocker:mainfrom
dgageot:fix-perf-regression-urls
May 5, 2026
Merged

Fix perf regression urls#2638
dgageot merged 3 commits intodocker:mainfrom
dgageot:fix-perf-regression-urls

Conversation

@dgageot
Copy link
Copy Markdown
Member

@dgageot dgageot commented May 5, 2026

No description provided.

dgageot added 3 commits May 5, 2026 14:38
…single-pass scanner

fixHyperlinkWrapping previously routed every rendered output that contained at least one OSC 8 hyperlink through lipgloss.NewWrapWriter on every Render() call. Lipgloss's WrapWriter is a generic ANSI/SGR-aware stream rewriter; for our specific use case all we need is to close the active OSC 8 link before each '\n' and re-open it after, so the per-call overhead was disproportionate.

Replace it with a tight, OSC-8-only single-pass scanner: track the most recent open sequence, emit close-newline-reopen when a '\n' is seen inside a link, and bail early when the input has no '\x1b]8;' or no '\n'.

Benchstat (HEAD vs this commit, 6×300ms, M4 Max):

  FastRenderer            allocs/op  4939 -> 155      -96.86%

  FastRenderer            B/op       57.1Ki -> 46.0Ki -19.40%

  StreamingFastRenderer   allocs/op  100.55M -> 3.74M -96.28%

  StreamingFastRenderer   B/op       1.23Gi -> 1.03Gi  -16.76%

Allocation counts on URL-heavy renders are now back in line with the pre-PR-docker#2498 baseline (154 allocs/op for FastRenderer, 3.73M for the streaming bench).

All hyperlink/auto-link/OSC-8 tests still pass.

Assisted-By: docker-agent
findFirstURL is called on every inline-text segment processed by
renderInlineWithStyleTo, and again on every segment in
writeCodeSegmentsWithAutoLinks. The previous implementation always did up to
two strings.Index substring searches even on plain prose that contains no URLs
at all (the common case for status text, prompts, code, headings, table cells,
etc.).

Both supported URL prefixes start with 'h'. When 'h' isn't present anywhere in
the input, neither prefix can match, so a single strings.IndexByte scan can
short-circuit the whole function. IndexByte is heavily SIMD-optimised in the
standard library and is dramatically cheaper than two substring searches on
URL-free text.

Benchstat (commit 1 vs this commit, 6×300ms, M4 Max):

  FastRenderer            sec/op  825µs -> 199µs   -75.84%
  FastRendererSmall       sec/op  17.0µs -> 3.4µs  -80.02%
  FastRendererCodeBlock   sec/op  67.7µs -> 22.3µs -67.15%
  FastRendererTable       sec/op  139µs -> 43µs    -69.34%

Allocation counts unchanged (already at baseline after commit 1).

Assisted-By: docker-agent
Inside renderInlineWithStyleTo, every byte of plain prose runs through an
auto-link check that compares text[i:i+7] against "http://" and text[i:i+8]
against "https://". Even though Go's compiler turns these constant-string
comparisons into direct byte compares, every iteration still pays for two
slice-bounds checks plus the memequal call frame on 100% of plain-text bytes —
dwarfing the actual work done on the rare URL.

Both prefixes start with 'h'. Adding a single text[i] == 'h' guard
short-circuits the slice creation + comparison on the ~99.x% of plain prose
bytes that aren't 'h'.

Benchstat (commit 2 vs this commit, 12×500ms, M4 Max):

  FastRenderer        sec/op  192µs -> 172µs   -10.53% (p=0.010)
  FastRendererTable   sec/op  42.1µs -> 32.5µs -22.84% (p<0.001)
  geomean                                       -13.82%

Allocations unchanged. CodeBlock is unaffected because it routes through
writeCodeSegmentsWithAutoLinks (which uses findFirstURL, already fast-pathed
in commit 2).

Assisted-By: docker-agent
@dgageot dgageot requested a review from a team as a code owner May 5, 2026 12:41
Copy link
Copy Markdown

@docker-agent docker-agent Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🟢 APPROVE

The changes are clean performance optimizations with no functional regressions found.

Summary of changes reviewed:

  • Added a single-byte 'h' gate to the URL detection hot loop in renderInlineWithStyleTo — correct, since both http:// and https:// start with 'h'
  • Added an IndexByte('h') early-out fast path in findFirstURL — logically sound
  • Replaced lipgloss.NewWrapWriter-based fixHyperlinkWrapping with a single-pass scanner — the OSC 8 sequence parsing, close detection, and newline re-open logic are all correct

No CONFIRMED or LIKELY bugs found in the changed code.

@dgageot dgageot merged commit 3b04a65 into docker:main May 5, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants