Adding mutation batcher and row assembler in cdc data generator #3762

Open
shreyakhajanchi wants to merge 2 commits into GoogleCloudPlatform:main from shreyakhajanchi:mutation-batcher
shreyakhajanchi (Contributor) commented May 6, 2026

📝 Description

This PR introduces the MutationBatcher and RowAssembler components into the v2/cdc-data-generator module, separating row synthesis from transient buffering logic. MutationBatcher accumulates mutations in worker memory and chunks them by size limit, with buffers partitioned per table, shard, and operation type (INSERT, UPDATE, DELETE).
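
For readers who want a picture of the partitioned buffering, here is a minimal sketch. It is not the code in this PR: `BatcherSketch`, `Key`, and `maxRowsPerBatch` are hypothetical names, rows are plain strings rather than Beam Rows, and the threshold is assumed to be a row count, while the real size limits may be measured differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Illustrative only; not the MutationBatcher added in this PR. */
final class BatcherSketch {

  /** Hypothetical stand-in for BufferKey: one buffer per (table, shard, operation). */
  static final class Key {
    final String table;
    final String shard;
    final String op; // "INSERT", "UPDATE" or "DELETE"

    Key(String table, String shard, String op) {
      this.table = table;
      this.shard = shard;
      this.op = op;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Key)) {
        return false;
      }
      Key k = (Key) o;
      return table.equals(k.table) && shard.equals(k.shard) && op.equals(k.op);
    }

    @Override
    public int hashCode() {
      return Objects.hash(table, shard, op);
    }
  }

  private final int maxRowsPerBatch; // assumed threshold: number of rows per buffer
  private final Map<Key, List<String>> buffers = new HashMap<>();
  private final List<List<String>> flushed = new ArrayList<>();

  BatcherSketch(int maxRowsPerBatch) {
    this.maxRowsPerBatch = maxRowsPerBatch;
  }

  /** Buffers a row under its key and flushes that key once the threshold is reached. */
  void add(Key key, String row) {
    List<String> buffer = buffers.computeIfAbsent(key, k -> new ArrayList<>());
    buffer.add(row);
    if (buffer.size() >= maxRowsPerBatch) {
      flush(key);
    }
  }

  /** Moves one key's buffered rows to the flushed list (a real sink would write them). */
  void flush(Key key) {
    List<String> buffer = buffers.remove(key);
    if (buffer != null && !buffer.isEmpty()) {
      flushed.add(buffer);
    }
  }

  /** Flushes every remaining buffer, e.g. at the end of a bundle. */
  void flushAll() {
    for (Key key : new ArrayList<>(buffers.keySet())) {
      flush(key);
    }
  }

  List<List<String>> flushedBatches() {
    return flushed;
  }
}
```

Because the key includes the operation type, an INSERT and an UPDATE for the same table and shard never share a batch in this sketch, and one key can be flushed without touching the others.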

Additionally, it implements FailureRecord to uniformly format serialization/writer errors into JSON lines for dead-letter queue (DLQ) storage.
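
As a rough illustration of the JSON-lines idea, the sketch below formats one failure as a single-line JSON object. It is an assumption-laden example rather than the FailureRecord in this PR: the class name `FailureLineSketch` and the field names `table`, `error`, and `payload` are hypothetical, and the escaping is hand-rolled only to keep the example dependency-free (production code would more likely use a JSON library).

```java
/** Illustrative only; not the FailureRecord added in this PR. */
final class FailureLineSketch {

  /** Returns a JSON string literal for s, or the JSON literal null when s is null. */
  static String quote(String s) {
    if (s == null) {
      return "null";
    }
    StringBuilder sb = new StringBuilder("\"");
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      switch (c) {
        case '"':
          sb.append("\\\"");
          break;
        case '\\':
          sb.append("\\\\");
          break;
        case '\n':
          sb.append("\\n");
          break;
        case '\r':
          sb.append("\\r");
          break;
        case '\t':
          sb.append("\\t");
          break;
        default:
          if (c < 0x20) {
            // Remaining control characters must be escaped in JSON strings.
            sb.append(String.format("\\u%04x", (int) c));
          } else {
            sb.append(c);
          }
      }
    }
    return sb.append('"').toString();
  }

  /** Builds one DLQ line; escaping guarantees the result contains no raw newline. */
  static String toJsonLine(String table, String errorMessage, String payload) {
    return "{\"table\":" + quote(table)
        + ",\"error\":" + quote(errorMessage)
        + ",\"payload\":" + quote(payload) + "}";
  }
}
```

Keeping each record on one line lets a DLQ file be appended to and read back line by line without a streaming JSON parser.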

Changes

Core Production Classes

  • BufferKey.java [NEW]: AutoValue key structure for buffer group partitioning.
  • MutationBatcher.java [NEW]: Accumulates rows, drives local thresholds, and manages transactional flushes.
  • RowAssembler.java [NEW]: Pure-function helpers that assemble Beam Rows for updates and deletes.
  • FailureRecord.java [NEW]: Uniform serialization wrapper for DLQ errors.
  • DataGeneratorUtils.java [MODIFY]: Embedded hex conversion helpers (canonicalizeValue, bytesToHex).
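
For context on the hex helper named in the last item, a byte-to-hex conversion typically looks like the sketch below. Only the name `bytesToHex` comes from this PR; the signature, the lowercase output, and the null handling are assumptions, and the wrapper class `HexSketch` is hypothetical.

```java
/** Illustrative only; the actual DataGeneratorUtils.bytesToHex may differ. */
final class HexSketch {
  private static final char[] HEX = "0123456789abcdef".toCharArray();

  /** Encodes each byte as two lowercase hex digits; a null input yields null (assumed). */
  static String bytesToHex(byte[] bytes) {
    if (bytes == null) {
      return null;
    }
    char[] out = new char[bytes.length * 2];
    for (int i = 0; i < bytes.length; i++) {
      int v = bytes[i] & 0xFF; // treat the byte as unsigned
      out[2 * i] = HEX[v >>> 4];
      out[2 * i + 1] = HEX[v & 0x0F];
    }
    return new String(out);
  }
}
```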

Unit Test Suites

  • MutationBatcherTest.java [NEW]: Validates batch thresholds, selective update flushes, and custom connection constraints.
  • RowAssemblerTest.java [NEW]: Asserts field schema mapping rules.
  • FailureRecordTest.java [NEW]: Verifies JSON conversion attributes across raw byte arrays and nulls.

shreyakhajanchi added the addition label (New feature or request) May 6, 2026
codecov Bot commented May 6, 2026

Codecov Report

❌ Patch coverage is 82.52033% with 43 lines in your changes missing coverage. Please review.
✅ Project coverage is 53.28%. Comparing base (5874e81) to head (74d5796).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ud/teleport/v2/templates/dofn/MutationBatcher.java | 78.72% | 7 Missing and 13 partials ⚠️ |
| ...cloud/teleport/v2/templates/dofn/RowAssembler.java | 87.50% | 7 Missing and 6 partials ⚠️ |
| ...eleport/v2/templates/utils/DataGeneratorUtils.java | 63.15% | 5 Missing and 2 partials ⚠️ |
| ...oud/teleport/v2/templates/utils/FailureRecord.java | 88.88% | 2 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3762      +/-   ##
============================================
+ Coverage     53.16%   53.28%   +0.12%     
+ Complexity     6490     6162     -328     
============================================
  Files          1075     1079       +4     
  Lines         65233    65490     +257     
  Branches       7230     7289      +59     
============================================
+ Hits          34680    34896     +216     
- Misses        28223    28243      +20     
- Partials       2330     2351      +21     

| Components | Coverage Δ |
|---|---|
| spanner-templates | 72.85% <ø> (-0.01%) ⬇️ |
| spanner-import-export | 68.63% <ø> (-0.02%) ⬇️ |
| spanner-live-forward-migration | 80.98% <ø> (-0.02%) ⬇️ |
| spanner-live-reverse-replication | 77.19% <ø> (-0.02%) ⬇️ |
| spanner-bulk-migration | 91.13% <ø> (-0.01%) ⬇️ |
| gcs-spanner-dv | 85.81% <ø> (-0.02%) ⬇️ |

| Files with missing lines | Coverage Δ |
|---|---|
| ...e/cloud/teleport/v2/templates/model/BufferKey.java | 100.00% <100.00%> (ø) |
| ...oud/teleport/v2/templates/utils/FailureRecord.java | 88.88% <88.88%> (ø) |
| ...eleport/v2/templates/utils/DataGeneratorUtils.java | 81.25% <63.15%> (-5.64%) ⬇️ |
| ...cloud/teleport/v2/templates/dofn/RowAssembler.java | 87.50% <87.50%> (ø) |
| ...ud/teleport/v2/templates/dofn/MutationBatcher.java | 78.72% <78.72%> (ø) |

... and 5 files with indirect coverage changes

shreyakhajanchi marked this pull request as ready for review May 6, 2026 03:50
shreyakhajanchi requested a review from a team as a code owner May 6, 2026 03:50
@gemini-code-assist
Warning

Gemini encountered an error creating the summary. You can try again by commenting /gemini summary.

Labels

addition New feature or request size/XXL
