refactor(runtime): extract model-fallback chain into fallbackExecutor#2541
Merged
dgageot merged 1 commit intodocker:mainfrom Apr 28, 2026
Merged
refactor(runtime): extract model-fallback chain into fallbackExecutor#2541dgageot merged 1 commit intodocker:mainfrom
dgageot merged 1 commit intodocker:mainfrom
Conversation
melmennaoui
previously approved these changes
Apr 28, 2026
The model-fallback subsystem owned two fields on LocalRuntime that
no other code touched:
retryOnRateLimit bool
cooldowns *cooldownManager
…plus two methods on *LocalRuntime that exist purely to operate
on that state (tryModelWithFallback, handleModelError). The
boundary was already there implicitly; this commit makes it
explicit by introducing a fallbackExecutor type.
- Introduce fallbackExecutor in pkg/runtime/fallback.go. It owns
the cooldownManager, the retryOnRateLimit flag, the Telemetry
sink it forwards to handleStream, and all the methods listed
above plus three new helpers (chainStartIndex, recordSuccess,
classifyAttemptError) that tighten the inner retry loop.
- LocalRuntime.fallback *fallbackExecutor replaces the two
removed fields. Constructed once in NewLocalRuntime via
newFallbackExecutor() before opts run so WithRetryOnRateLimit()
can mutate it; cooldowns and telemetry are wired after opts so
WithClock and WithTelemetry are reflected.
- tryModelWithFallback renamed to fallbackExecutor.execute.
"fallback" was redundant once the method moved onto a type
whose name already says "fallback"; the call site reads
r.fallback.execute(...) which is clean.
- handleModelError moves to *fallbackExecutor as well — it
reads e.retryOnRateLimit.
- Free helpers (buildModelChain, logFallbackAttempt,
logRetryBackoff, getEffectiveCooldown, getEffectiveRetries)
stay free; they are stateless and have no reason to move.
- New per-iteration helpers on *fallbackExecutor:
chainStartIndex one-liner at the top of execute that
consults the cooldownManager and skips
the primary when a fallback is pinned.
recordSuccess replaces the cooldown set-or-clear switch
at the bottom of the success path.
classifyAttemptError consolidates the identical
ctx-cancellation + handleModelError +
retryDecisionReturn-vs-Break dispatch
block that used to follow both
CreateChatCompletionStream and
handleStream.
- The two callers of cooldown state in runtime.go
(getEffectiveModelID, agentDetailsFromTeam — both for the
cooldown-aware sidebar display) updated to
r.fallback.cooldowns.Get(...).
- Tests in fallback_test.go and clock_test.go that exercise
cooldown directly updated to rt.fallback.cooldowns.X.
End-to-end RunStream-based tests unchanged. Two stale
references to "tryModelWithFallback" in runtime_test.go
comments updated to "fallback.execute".
Net effect on LocalRuntime: −2 fields, +1 field. All fallback
state is now scoped to one type that can be reasoned about and
tested in isolation. Behaviour is unchanged: same cooldown
semantics, same retry policy, same events emitted, same error
shapes returned. The full lint and test suites pass.
Assisted-By: docker-agent
5fe7eca to
13f2686
Compare
rumpl
approved these changes
Apr 28, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The model-fallback subsystem owned three fields and a mutex on
LocalRuntimethat no other code touches:…plus five methods on
*LocalRuntimewhose only job is tooperate on that state. The boundary was already there implicitly.
This PR makes it explicit by introducing a
fallbackExecutortypethat owns the state and methods, and replacing the three runtime
fields with one (
fallback *fallbackExecutor).What changes
LocalRuntime.retryOnRateLimit boolfallbackExecutor.retryOnRateLimitLocalRuntime.fallbackCooldowns map[…]fallbackExecutor.cooldownsLocalRuntime.fallbackCooldownsMux sync.RWMutexfallbackExecutor.cooldownsMuxr.getCooldownState / setCooldownState / clearCooldownState*fallbackExecutorr.tryModelWithFallback(...)r.fallback.execute(...)r.handleModelError(...)e.handleModelError(...)(executor-private)Stateless helpers (
buildModelChain,logFallbackAttempt,logRetryBackoff,getEffectiveCooldown,getEffectiveRetries)stay free — they have no reason to move.
The two callers of
getCooldownStateoutsidefallback.go(
getEffectiveModelIDandagentDetailsFromTeam, both forcooldown-aware sidebar display) updated to
r.fallback.X.WithRetryOnRateLimit()now sets the flag onr.fallbackinstead of the runtime field — a one-line change.
Why this one and not the others
I evaluated several extraction candidates honestly. The strongest
case for this one:
LocalRuntimeloses 3 fields + a mutex. Noother candidate I considered (sub-session, stream chunker)
shrinks the runtime struct.
already de-facto private to fallback logic. Making the boundary
explicit doesn't bet on future callers — it just names what's
already true.
WithRetryOnRateLimit()is the onlypublic touchpoint and its signature is unchanged.
Diff size
5 files, +81 / −50:
pkg/runtime/fallback.go— most of the change (rewritingreceivers, adding the executor type, the constructor, doc
paragraph)
pkg/runtime/runtime.go— drop 3 fields, add 1, update 2call sites in cooldown-aware UI code, update one Opt
pkg/runtime/loop.go— one call site updatedpkg/runtime/fallback_test.go— 8 lines: update direct cooldowntest to call through
rt.fallbackpkg/runtime/runtime_test.go— 2 stale doc-comment referencesto "tryModelWithFallback" updated
Behaviour
Unchanged. Same cooldown semantics, same retry policy, same events
emitted (
ModelFallback,AgentInfo), same error wrapping(
ContextOverflowError), same logging.Validation
mise lint— 0 issues across 807 filesmise test— full suite passes (including the cooldown unittests that now go through
rt.fallback)Honest scope assessment
This is a structural improvement: the runtime struct shrinks,
fallback state lives in one named place, and the boundary is
documented. It is not a behaviour change, a feature, or
preparation for a specific future refactor. It pays for itself
on day one because today, anyone reading
LocalRuntimeno longersees three fields whose purpose is opaque without reading
fallback.go.A natural follow-up — moving
fallbackExecutorinto a sub-package(e.g.
pkg/runtime/internal/fallback) for compile-time isolation —is intentionally not done here. It would require inverting the
events-channel dependency (since
Eventis currently a runtimetype), which is a much bigger refactor and explicitly out of scope.