Skip to content

core,opentelemetry: Fix server metric labels on early close#12774

Open
becomeStar wants to merge 3 commits intogrpc:masterfrom
becomeStar:fix/otel-server-early-method-resolution
Open

core,opentelemetry: Fix server metric labels on early close#12774
becomeStar wants to merge 3 commits intogrpc:masterfrom
becomeStar:fix/otel-server-early-method-resolution

Conversation

@becomeStar
Copy link
Copy Markdown
Contributor

This addresses the server-side OpenTelemetry metric labeling bug from #12117 where a generated method can be recorded as grpc.method="other" if streamClosed() happens before serverCallStarted().

What changed

  • add an internal StatsTraceContext.ServerCallMethodListener hook so tracers can consume an already-resolved primary-registry MethodDescriptor
  • resolve the immutable internal primary registry on the transport path and seed method classification before the async MethodLookup path runs
  • keep fallback registry lookup on the existing async path
  • update the OpenTelemetry server tracer to use the early-resolved method classification for close metrics

Why this shape

  • avoids tracer-side HandlerRegistry lookup
  • uses only the immutable internal primary registry for early transport-path lookup
  • keeps fallback registry lookup on the existing async path

Tests

  • primary generated method: early close preserves the generated method name
  • primary non-generated method: early close still records other
  • fallback generated method: fallback lookup remains on the existing async path and does not introduce early transport-path classification
  • tracer-level regression: serverCallMethodResolved() + streamClosed() records the generated method name without waiting for serverCallStarted()

Notes

  • ServerCallMethodListener is an internal hook that carries the resolved MethodDescriptor; tracers consume the resolved result instead of performing registry lookup themselves
  • ServerImpl uses InternalHandlerRegistry explicitly for the primary registry to make it clear that the early transport- path lookup is limited to the immutable internal primary registry
  • this PR intentionally does not widen transport-path lookup to the fallback registry

Ref #12117

Resolve generated-method classification before serverCallStarted() so close metrics do not fall back
 to "other" when streamClosed() happens first.

Keep fallback registry lookup on the existing async
 path and avoid tracer-side HandlerRegistry access to match the maintainer constraints from issue grpc#12117.

Add regressions for primary generated, primary non-generated, and fallback-generated server paths,
 plus the tracer-level early-resolution contract.
@kannanjgithub
Copy link
Copy Markdown
Contributor

Unit test coverage is missing for non-generated method name case. If a method is not generated (or isSampledToLocalTracing is false), it should be recorded as "other". This applies to both the new serverCallMethodResolved() path and the standard serverCallStarted() path.

Add unit tests for non-generated server method metrics.

Verify that both serverCallMethodResolved() and
serverCallStarted() record the method name as "other"
when isSampledToLocalTracing is false.
@becomeStar
Copy link
Copy Markdown
Contributor Author

Thanks for the review.

Added the missing non-generated coverage in OpenTelemetryMetricsModuleTest.

The new tests cover both:

  • the new serverCallMethodResolved() path
  • the standard serverCallStarted() path

They verify that when isSampledToLocalTracing is false, the method is recorded as other for server metrics.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants