Use-after-free in to_array(map(each(arr).reverse(), lambda)) iterator-fusion chain #2505

@borisbat

Summary

A latent use-after-free lives in the iterator-fusion runtime. The pattern that triggers it (in daslib/aot_cpp.das describeCppTypeEx, the original site of the test_archive AOT codegen crash):

let args <- to_array(map(each(typeDecl.dim).reverse(), @(itd) { return ",{itd}>"; }))

Replacing the chain with a plain for-loop fully eliminates the crash (200/200 iterations clean). The chain itself crashes 1-in-N times depending on heap layout — repro rate observed climbing 1/96 → 1/51 → 1/4 across recent sessions as the heap-layout entropy shifted.

Symptom

EXCEPTION_ACCESS_VIOLATION (0xC0000005) reading a page-aligned freed address (DAS_TRACK_ALLOC freed-page sentinel — the suballocator unmaps freed pages, so a UAF becomes a hard AV instead of "happens to read garbage").

[ 0] register_fusion + 0x2f8f7   <-- fused SimNode crashes here, every time
[ 1] das::SimNode_BlockNF::eval + 0x43b1a
...
daslang stack:
  _lambda_aot_cpp_235_10`function
  _lambda_aot_cpp_58_15`function
  builtin`to_array`8900707383779332554 from daslib/aot_cpp.das:409:20
  invoke block ... aot_cpp.das:293:11
  describeCppTypeEx

The C++ stack always tops at the same register_fusion + 0x2f8f7 offset across repros, suggesting one specific fused SimNode is the culprit.

Diagnosis

  • TypeDecl itself is alive at every guard point. A verify_typedecl_gc(typeDecl) C++ binding (added in #2505 / e6ae32f82) was inserted at the top of describeCppTypeEx and immediately before the chain; gc_magic stayed 0x1ee70001 (alive) under DAS_GC_DEBUG=1 (memory-poisoning sweep mode) on every crash.
  • So the freed page is neither the TypeDecl nor, directly, its dim field; it is something the iterator chain or the fusion node generates and frees prematurely.

Suspect

One of these (or their fusion combination):

  1. each(arr).reverse() — reverse-iterator wrapper holding a back-pointer into a stack temporary that the fusion optimizer elides early.
  2. map(iter, lambda) — map iterator capturing a freed inner reference.
  3. to_array(iter) collecting strings — interpolation ",{itd}>" allocates a transient string, possibly on a context heap that to_array's fusion frees before consumption.

Repro

Plain Windows x64 Release build, DAS_TRACK_ALLOC=ON (default for the build folder per CMakeCache).

for i in $(seq 1 200); do
  rm -f tests/archive/_aot_generated/test_aot_archive_test_archive.das.cpp
  bin/Release/daslang.exe utils/aot/main.das -- -aot \
    "$(pwd)/tests/archive/test_archive.das" \
    "$(pwd)/tests/archive/_aot_generated/test_aot_archive_test_archive.das.cpp" \
    || break
done
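
The loop above stops at the first crash, which confirms the bug but does not measure how often it fires. A minimal sketch of a counting variant follows; measure_repro_rate is a hypothetical helper, not part of the daslang repo, and it assumes the command under test signals a crash with a non-zero exit status.

```shell
# Hypothetical helper (not part of the daslang repo): run a command N times
# and print how many runs exited non-zero, formatted as "failures/total".
measure_repro_rate() {
  _mrr_total=$1; shift
  _mrr_fail=0
  _mrr_i=0
  while [ "$_mrr_i" -lt "$_mrr_total" ]; do
    # Discard the command's own output so only the summary line is printed.
    "$@" >/dev/null 2>&1 || _mrr_fail=$((_mrr_fail + 1))
    _mrr_i=$((_mrr_i + 1))
  done
  echo "$_mrr_fail/$_mrr_total"
}

# Example (same paths as the loop above; put the rm -f and the daslang.exe
# call into a small script so the generated file is removed on every run):
#   measure_repro_rate 200 ./run_aot_once.sh
```

run_aot_once.sh is likewise a placeholder name. Reporting a rate instead of breaking on the first failure makes it possible to compare runs before and after each bisect step.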

Crashes in 1/4 to 1/96 iterations on master HEAD. Pre-fix CI repros:

Workaround

PR replacing the chain with a plain for-loop in describeCppTypeEx. The double-reverse in the original was a no-op anyway, so the for-loop is also faster and clearer.

Suggested investigation

  1. Build RelWithDebInfo locally (DAS_TRACK_ALLOC auto-on) so register_fusion + 0x2f8f7 resolves to a symbol — the fused SimNode name will identify the carrier.
  2. Bisect: drop .reverse(), then drop to_array(...), then drop each(). With 1/4 repro rate this is fast.
  3. Once isolated, audit the iterator's destructor / fusion-node lifetime for who owns the captured iterator-state buffer.
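
For the bisect step, it helps to know how many consecutive clean runs are needed before concluding that a variant no longer crashes. A rough sketch, assuming runs are independent and the crash rate p is constant (neither is guaranteed here, since the rate drifts with heap layout): the chance of n clean runs despite the bug being present is (1 - p)^n, so the smallest n with (1 - p)^n <= alpha bounds the false "fixed" rate by alpha. clean_runs_needed is a hypothetical helper name.

```shell
# Hypothetical helper: smallest n with (1 - p)^n <= alpha, i.e. the number of
# consecutive clean runs after which "the crash is gone" is wrong with
# probability at most alpha. Assumes independent runs with crash rate p.
clean_runs_needed() {
  awk -v p="$1" -v alpha="$2" 'BEGIN {
    n = log(alpha) / log(1 - p)
    c = int(n)
    if (c < n) c++        # round up to a whole number of runs
    print c
  }'
}

clean_runs_needed 0.25 0.01        # repro rate 1/4          -> 17
clean_runs_needed 0.0104167 0.01   # repro rate about 1/96   -> 440
```

Under the same model, 200 clean runs would still happen by chance roughly 12% of the time at a 1/96 rate, and essentially never at 1/4, so bisecting while the rate is high is much cheaper.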

This bug is orthogonal to any specific test or codegen path — describeCppTypeEx was the trigger because it runs the chain on TypeDecl::dim, but every other site combining each() + reverse() + map() + to_array() (or the same fusion combo with different operators) is at risk.

Labels: bug
