perf(duckdb): push down list length expressions #8544
Signed-off-by: Matt Katz <mhkatz97@gmail.com>
Merging this PR will not alter performance
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | Simulation | bitwise_not_vortex_buffer_mut[128] | 186.1 ns | 215.3 ns | -13.55% |
| ❌ | Simulation | bitwise_not_vortex_buffer_mut[1024] | 246.4 ns | 275.6 ns | -10.58% |
| ⚡ | Simulation | eq_i64_constant | 318.3 µs | 288.4 µs | +10.35% |
Comparing mk/duckdb-list-length-pushdown (9d94fe5) with mk/list-length (51b1a4c)
Pushes DuckDB's list-length scalar function into the Vortex scan as the `list_length` expression, so lengths are computed from list offsets/sizes without materializing element values.

Stacked on #8495.
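To illustrate why element values never need to be read, here is a minimal sketch of computing list lengths from an offsets buffer alone. It is not the Vortex implementation: the function names and the plain `&[u64]` offsets layout (one more entry than there are lists, monotonically non-decreasing) are assumptions made for the example.

```rust
/// Hypothetical helper: lengths of a variable-size list column, computed
/// from its offsets buffer alone. List `i` spans `offsets[i]..offsets[i + 1]`
/// in the element values, which are never touched here.
/// Assumes offsets are monotonically non-decreasing.
pub fn list_lengths_from_offsets(offsets: &[u64]) -> Vec<u64> {
    offsets.windows(2).map(|w| w[1] - w[0]).collect()
}

/// Hypothetical helper: a fixed-size list needs no buffer at all, because
/// every row has the same length.
pub fn fixed_size_list_lengths(list_size: u64, row_count: usize) -> Vec<u64> {
    vec![list_size; row_count]
}
```

With offsets `[0, 2, 2, 5]` the three lists have lengths 2, 0 and 3, regardless of how large or expensive to decode the element values are.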
What
- Projection (`SELECT len(list)` / `length(list)` / `array_length(list)`): handled in `try_from_projection_expression`, gated on the projected column being `List`/`FixedSizeList`.
- Filter (`WHERE array_length(list) >= k`, also `len`/`length`): handled in `try_from_bound_function` + `can_push_expression`.
- Both paths lower to `cast(list_length(col), i64)`, mirroring the existing `strlen` → `byte_length` pushdown. The cast is needed because DuckDB's `len`/`array_length` return `BIGINT` while `list_length` returns `u64`.
- `len`/`length` are overloaded with strings/bits, so the filter path needs the argument type to disambiguate. Added a small FFI accessor `duckdb_vx_expr_get_return_type` plus `ExpressionRef::return_type()`, and gate `len`/`length`/`array_length` on the bound child being `LIST`/`ARRAY`.
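The gating and lowering described above can be sketched roughly as follows. This is a toy model, not the code in this PR: `LogicalType`, `Expr` and `try_push_length` are hypothetical stand-ins for the DuckDB bound-expression types and the Vortex expression tree, and only the shape of the decision is meant to carry over.

```rust
/// Hypothetical stand-in for the DuckDB logical type of the bound child.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
pub enum LogicalType {
    List,
    Array,
    Varchar,
    Bit,
}

/// Hypothetical stand-in for a pushed-down scan expression.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    ListLength(Box<Expr>),
    Cast(Box<Expr>, &'static str),
}

/// Push down `len`/`length`/`array_length` only when the argument is
/// list-like; string and bit overloads are left for DuckDB to evaluate.
pub fn try_push_length(name: &str, column: &str, child_type: &LogicalType) -> Option<Expr> {
    let is_length_fn = matches!(name, "len" | "length" | "array_length");
    let is_list_like = matches!(child_type, LogicalType::List | LogicalType::Array);
    if !(is_length_fn && is_list_like) {
        return None;
    }
    // list_length yields u64, while DuckDB expects BIGINT, so wrap in a cast.
    let length = Expr::ListLength(Box::new(Expr::Column(column.to_string())));
    Some(Expr::Cast(Box::new(length), "i64"))
}
```

In this sketch `try_push_length("len", "tags", &LogicalType::Varchar)` returns `None`, so a string `len` is never mistaken for a list length, while the same call with `LogicalType::List` yields the cast-wrapped `ListLength` expression.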