Fix perf regression urls#2638
Merged
dgageot merged 3 commits intodocker:mainfrom May 5, 2026
Merged
Conversation
…single-pass scanner fixHyperlinkWrapping previously routed every rendered output that contained at least one OSC 8 hyperlink through lipgloss.NewWrapWriter on every Render() call. Lipgloss's WrapWriter is a generic ANSI/SGR-aware stream rewriter; for our specific use case all we need is to close the active OSC 8 link before each '\n' and re-open it after, so the per-call overhead was disproportionate. Replace it with a tight, OSC-8-only single-pass scanner: track the most recent open sequence, emit close-newline-reopen when a '\n' is seen inside a link, and bail early when the input has no '\x1b]8;' or no '\n'. Benchstat (HEAD vs this commit, 6×300ms, M4 Max): FastRenderer allocs/op 4939 -> 155 -96.86% FastRenderer B/op 57.1Ki -> 46.0Ki -19.40% StreamingFastRenderer allocs/op 100.55M -> 3.74M -96.28% StreamingFastRenderer B/op 1.23Gi -> 1.03Gi -16.76% Allocation counts on URL-heavy renders are now back in line with the pre-PR-docker#2498 baseline (154 allocs/op for FastRenderer, 3.73M for the streaming bench). All hyperlink/auto-link/OSC-8 tests still pass. Assisted-By: docker-agent
findFirstURL is called on every inline-text segment processed by renderInlineWithStyleTo, and again on every segment in writeCodeSegmentsWithAutoLinks. The previous implementation always did up to two strings.Index substring searches even on plain prose that contains no URLs at all (the common case for status text, prompts, code, headings, table cells, etc.). Both supported URL prefixes start with 'h'. When 'h' isn't present anywhere in the input, neither prefix can match, so a single strings.IndexByte scan can short-circuit the whole function. IndexByte is heavily SIMD-optimised in the standard library and is dramatically cheaper than two substring searches on URL-free text. Benchstat (commit 1 vs this commit, 6×300ms, M4 Max): FastRenderer sec/op 825µs -> 199µs -75.84% FastRendererSmall sec/op 17.0µs -> 3.4µs -80.02% FastRendererCodeBlock sec/op 67.7µs -> 22.3µs -67.15% FastRendererTable sec/op 139µs -> 43µs -69.34% Allocation counts unchanged (already at baseline after commit 1). Assisted-By: docker-agent
Inside renderInlineWithStyleTo, every byte of plain prose runs through an auto-link check that compares text[i:i+7] against "http://" and text[i:i+8] against "https://". Even though Go's compiler turns these constant-string comparisons into direct byte compares, every iteration still pays for two slice-bounds checks plus the memequal call frame on 100% of plain-text bytes — dwarfing the actual work done on the rare URL. Both prefixes start with 'h'. Adding a single text[i] == 'h' guard short-circuits the slice creation + comparison on the ~99.x% of plain prose bytes that aren't 'h'. Benchstat (commit 2 vs this commit, 12×500ms, M4 Max): FastRenderer sec/op 192µs -> 172µs -10.53% (p=0.010) FastRendererTable sec/op 42.1µs -> 32.5µs -22.84% (p<0.001) geomean -13.82% Allocations unchanged. CodeBlock is unaffected because it routes through writeCodeSegmentsWithAutoLinks (which uses findFirstURL, already fast-pathed in commit 2). Assisted-By: docker-agent
gtardif
approved these changes
May 5, 2026
There was a problem hiding this comment.
Assessment: 🟢 APPROVE
The changes are clean performance optimizations with no functional regressions found.
Summary of changes reviewed:
- Added a single-byte
'h'gate to the URL detection hot loop inrenderInlineWithStyleTo— correct, since bothhttp://andhttps://start with'h' - Added an
IndexByte('h')early-out fast path infindFirstURL— logically sound - Replaced
lipgloss.NewWrapWriter-basedfixHyperlinkWrappingwith a single-pass scanner — the OSC 8 sequence parsing, close detection, and newline re-open logic are all correct
No CONFIRMED or LIKELY bugs found in the changed code.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.