Currently, Spark Structured Streaming via the DSv2 API does not push down predicates. As a result, more data is scanned than necessary and then filtered out at the engine layer, which causes excessive I/O, driver bottlenecks, and increased latency.
Relevant Iceberg issue - apache/iceberg#15692
PR on Spark side - #55679