Skip to content

refactor(runtime): extract model-fallback chain into fallbackExecutor#2541

Merged
dgageot merged 1 commit intodocker:mainfrom
dgageot:extract-fallback-executor
Apr 28, 2026
Merged

refactor(runtime): extract model-fallback chain into fallbackExecutor#2541
dgageot merged 1 commit intodocker:mainfrom
dgageot:extract-fallback-executor

Conversation

@dgageot
Copy link
Copy Markdown
Member

@dgageot dgageot commented Apr 27, 2026

Summary

The model-fallback subsystem owned three fields and a mutex on
LocalRuntime that no other code touches:

retryOnRateLimit     bool
fallbackCooldowns    map[string]*fallbackCooldownState
fallbackCooldownsMux sync.RWMutex

…plus five methods on *LocalRuntime whose only job is to
operate on that state. The boundary was already there implicitly.
This PR makes it explicit by introducing a fallbackExecutor type
that owns the state and methods, and replacing the three runtime
fields with one (fallback *fallbackExecutor).

What changes

Before After
LocalRuntime.retryOnRateLimit bool fallbackExecutor.retryOnRateLimit
LocalRuntime.fallbackCooldowns map[…] fallbackExecutor.cooldowns
LocalRuntime.fallbackCooldownsMux sync.RWMutex fallbackExecutor.cooldownsMux
r.getCooldownState / setCooldownState / clearCooldownState methods on *fallbackExecutor
r.tryModelWithFallback(...) r.fallback.execute(...)
r.handleModelError(...) e.handleModelError(...) (executor-private)

Stateless helpers (buildModelChain, logFallbackAttempt,
logRetryBackoff, getEffectiveCooldown, getEffectiveRetries)
stay free — they have no reason to move.

The two callers of getCooldownState outside fallback.go
(getEffectiveModelID and agentDetailsFromTeam, both for
cooldown-aware sidebar display) updated to r.fallback.X.

WithRetryOnRateLimit() now sets the flag on r.fallback
instead of the runtime field — a one-line change.

Why this one and not the others

I evaluated several extraction candidates honestly. The strongest
case for this one:

  • Real shrinkage: LocalRuntime loses 3 fields + a mutex. No
    other candidate I considered (sub-session, stream chunker)
    shrinks the runtime struct.
  • Zero speculation: the cooldown map and retry flag are
    already de-facto private to fallback logic. Making the boundary
    explicit doesn't bet on future callers — it just names what's
    already true.
  • No public-API change: WithRetryOnRateLimit() is the only
    public touchpoint and its signature is unchanged.

Diff size

5 files, +81 / −50:

  • pkg/runtime/fallback.go — most of the change (rewriting
    receivers, adding the executor type, the constructor, doc
    paragraph)
  • pkg/runtime/runtime.go — drop 3 fields, add 1, update 2
    call sites in cooldown-aware UI code, update one Opt
  • pkg/runtime/loop.go — one call site updated
  • pkg/runtime/fallback_test.go — 8 lines: update direct cooldown
    test to call through rt.fallback
  • pkg/runtime/runtime_test.go — 2 stale doc-comment references
    to "tryModelWithFallback" updated

Behaviour

Unchanged. Same cooldown semantics, same retry policy, same events
emitted (ModelFallback, AgentInfo), same error wrapping
(ContextOverflowError), same logging.

Validation

  • mise lint — 0 issues across 807 files
  • mise test — full suite passes (including the cooldown unit
    tests that now go through rt.fallback)

Honest scope assessment

This is a structural improvement: the runtime struct shrinks,
fallback state lives in one named place, and the boundary is
documented. It is not a behaviour change, a feature, or
preparation for a specific future refactor. It pays for itself
on day one because today, anyone reading LocalRuntime no longer
sees three fields whose purpose is opaque without reading
fallback.go.

A natural follow-up — moving fallbackExecutor into a sub-package
(e.g. pkg/runtime/internal/fallback) for compile-time isolation —
is intentionally not done here. It would require inverting the
events-channel dependency (since Event is currently a runtime
type), which is a much bigger refactor and explicitly out of scope.

@dgageot dgageot requested a review from a team as a code owner April 27, 2026 15:28
melmennaoui
melmennaoui previously approved these changes Apr 28, 2026
The model-fallback subsystem owned two fields on LocalRuntime that
no other code touched:

  retryOnRateLimit bool
  cooldowns        *cooldownManager

…plus two methods on *LocalRuntime that exist purely to operate
on that state (tryModelWithFallback, handleModelError). The
boundary was already there implicitly; this commit makes it
explicit by introducing a fallbackExecutor type.

- Introduce fallbackExecutor in pkg/runtime/fallback.go. It owns
  the cooldownManager, the retryOnRateLimit flag, the Telemetry
  sink it forwards to handleStream, and all the methods listed
  above plus three new helpers (chainStartIndex, recordSuccess,
  classifyAttemptError) that tighten the inner retry loop.
- LocalRuntime.fallback *fallbackExecutor replaces the two
  removed fields. Constructed once in NewLocalRuntime via
  newFallbackExecutor() before opts run so WithRetryOnRateLimit()
  can mutate it; cooldowns and telemetry are wired after opts so
  WithClock and WithTelemetry are reflected.
- tryModelWithFallback renamed to fallbackExecutor.execute.
  "fallback" was redundant once the method moved onto a type
  whose name already says "fallback"; the call site reads
  r.fallback.execute(...) which is clean.
- handleModelError moves to *fallbackExecutor as well — it
  reads e.retryOnRateLimit.
- Free helpers (buildModelChain, logFallbackAttempt,
  logRetryBackoff, getEffectiveCooldown, getEffectiveRetries)
  stay free; they are stateless and have no reason to move.
- New per-iteration helpers on *fallbackExecutor:
    chainStartIndex      one-liner at the top of execute that
                         consults the cooldownManager and skips
                         the primary when a fallback is pinned.
    recordSuccess        replaces the cooldown set-or-clear switch
                         at the bottom of the success path.
    classifyAttemptError consolidates the identical
                         ctx-cancellation + handleModelError +
                         retryDecisionReturn-vs-Break dispatch
                         block that used to follow both
                         CreateChatCompletionStream and
                         handleStream.
- The two callers of cooldown state in runtime.go
  (getEffectiveModelID, agentDetailsFromTeam — both for the
  cooldown-aware sidebar display) updated to
  r.fallback.cooldowns.Get(...).
- Tests in fallback_test.go and clock_test.go that exercise
  cooldown directly updated to rt.fallback.cooldowns.X.
  End-to-end RunStream-based tests unchanged. Two stale
  references to "tryModelWithFallback" in runtime_test.go
  comments updated to "fallback.execute".

Net effect on LocalRuntime: −2 fields, +1 field. All fallback
state is now scoped to one type that can be reasoned about and
tested in isolation. Behaviour is unchanged: same cooldown
semantics, same retry policy, same events emitted, same error
shapes returned. The full lint and test suites pass.

Assisted-By: docker-agent
@dgageot dgageot merged commit f7bee6f into docker:main Apr 28, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants