Fix inflate operator: Reset max output volume. Use size in bytes, not elements.#6283
Conversation
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Greptile SummaryThis PR fixes two bugs in the GPU LZ4 inflate operator: (1) Confidence Score: 5/5Safe to merge; both targeted bug fixes are correct and the only remaining finding is a P2 style hardening suggestion. All root-cause bugs (uninitialized max volume and element-vs-byte mismatch) are correctly addressed. The single comment is a P2 latent edge-case that the operator already guards against in RunImpl, so it does not block merge. dali/operators/decoder/inflate/inflate_params.h — the early-return path in ParseOutputShape still skips resetting the max-volume accumulators.
|
| Filename | Overview |
|---|---|
| dali/operators/decoder/inflate/inflate_params.h | Adds max_output_sample_vol_ = 0 reset alongside the existing max_output_vol_ = 0 reset, fixing a missing initialisation that could cause stale values on repeated calls; reset is still skipped on the num_samples == 0 early-return path. |
| dali/operators/decoder/inflate/inflate_gpu.cc | Correctly converts element-count volumes to byte sizes before passing to nvcompBatchedLZ4DecompressGetTempSizeAsync, fixing incorrect temp-buffer sizing for output types wider than uint8. |
| dali/operators/decoder/inflate/inflate.h | Introduces TypeInfo element_type_ member (initialised in constructor) to provide element byte-size to the GPU implementation; straightforward and correct. |
Sequence Diagram
sequenceDiagram
participant Inflate as "Inflate (GPU)"
participant Impl as "InflateOpGpuLZ4Impl"
participant Params as "ShapeParams"
participant nvcomp as "nvcompBatchedLZ4"
Inflate->>Impl: "SetupImpl(output_desc, ws)"
Impl->>Params: "ProcessInputArgs(ws, batch_size)"
Params->>Params: "ParseOutputShape() - reset max_output_vol_=0, max_output_sample_vol_=0"
Params-->>Impl: "output shape ready"
Inflate->>Impl: "RunImpl(ws)"
Impl->>Params: "GetMaxOutChunkVol() elements"
Impl->>Params: "GetMaxOutVol() elements"
Note over Impl: "multiply by element_type_.size() for bytes"
Impl->>nvcomp: "nvcompBatchedLZ4DecompressGetTempSizeAsync(num_chunks, max_chunk_bytes, params, tempSize, max_total_bytes)"
nvcomp-->>Impl: "tempSize"
Impl->>nvcomp: "nvcompBatchedLZ4DecompressAsync(...)"
nvcomp-->>Impl: "decompressed output"
Impl->>Impl: "FillTheTails(...)"
Comments Outside Diff (1)
-
dali/operators/decoder/inflate/inflate_params.h, line 351-356 (link)Stale max-volume on empty batch early-return
When
num_samples == 0the function returns on line 352 before the two reset assignments, so bothmax_output_vol_andmax_output_sample_vol_keep whatever values they held from the previous call. If a non-empty batch is processed first and then an empty batch follows,GetMaxOutChunkVol()/GetMaxOutVol()will return stale (oversized) values.RunImpldoes guard against empty batches with an early return, so this doesn't cause a crash today, but it is a latent correctness issue.
Reviews (1): Last reviewed commit: "Reset max output volume. Use size in byt..." | Re-trigger Greptile
|
CI MESSAGE: [48015281]: BUILD STARTED |
|
CI MESSAGE: [48015281]: BUILD FAILED |
|
CI MESSAGE: [48015281]: BUILD PASSED |
Category:
Bug fix (non-breaking change which fixes an issue)
Description:
This PR fixes intermittent failure of
inflateoperator tests caused by uninitialized variable and improper element size assumptions.Additional information:
Affected modules and functionalities:
Key points relevant for the review:
test_dali_stateless_operators.test_inflate_stateless
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: N/A