perf(duckdb): push down list length expressions#8544

Open: mhk197 wants to merge 1 commit into mk/list-length from mk/duckdb-list-length-pushdown

@mhk197 (Contributor) commented Jun 22, 2026
Pushes DuckDB's list-length scalar function into the Vortex scan as the list_length expression, so lengths are computed from list offsets/sizes without materializing element values.

Stacked on #8495.

What

  • Projection (SELECT len(list) / length(list) / array_length(list)): handled in try_from_projection_expression, gated on the projected column being List/FixedSizeList.
  • Filter (WHERE array_length(list) >= k, also len/length): handled in try_from_bound_function + can_push_expression.
  • Each maps to cast(list_length(col), i64), mirroring the existing strlen → byte_length pushdown. The cast is needed because DuckDB's len/array_length return BIGINT while list_length returns u64.
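
To illustrate why this avoids materializing element values, here is a minimal sketch (not the Vortex implementation) of deriving lengths from an offsets buffer and then casting to i64 to match BIGINT. The function names `list_lengths_from_offsets` and `to_bigint` are hypothetical, and the sketch assumes an Arrow-style layout where list `i` spans `offsets[i]..offsets[i + 1]` with monotonically non-decreasing offsets.

```rust
/// Lengths of variable-size lists, computed from the offsets buffer alone.
/// Assumption: `offsets` has one more entry than there are lists and is
/// monotonically non-decreasing. The elements buffer is never read.
fn list_lengths_from_offsets(offsets: &[u64]) -> Vec<u64> {
    offsets.windows(2).map(|w| w[1] - w[0]).collect()
}

/// Cast u64 lengths to i64 (the BIGINT representation).
/// Returns None if any length does not fit in an i64.
fn to_bigint(lengths: &[u64]) -> Option<Vec<i64>> {
    lengths.iter().map(|&n| i64::try_from(n).ok()).collect()
}
```

For a FixedSizeList the length is a constant per row, so no offsets are needed at all.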

len/length are also overloaded for string and bit arguments, so the filter path needs the argument type to disambiguate. This PR adds a small FFI accessor, duckdb_vx_expr_get_return_type, plus ExpressionRef::return_type(), and gates len/length/array_length on the bound child being LIST/ARRAY.
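
The gating logic can be sketched roughly as follows. This is an illustration only: the `ArgType` enum and `is_list_length_call` function are hypothetical stand-ins, not the types or functions used in the PR, which reads the child's type through ExpressionRef::return_type().

```rust
/// Hypothetical stand-in for the logical type of the bound child expression.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ArgType {
    List,
    Array,
    Varchar,
    Bit,
}

/// Returns true only when the function name is one of the list-length
/// spellings AND the argument is a list-like type, so that `len('abc')`
/// or `length(bits)` is not mistaken for a list length.
fn is_list_length_call(name: &str, arg: ArgType) -> bool {
    let name_matches = matches!(name, "len" | "length" | "array_length");
    let arg_is_list = matches!(arg, ArgType::List | ArgType::Array);
    name_matches && arg_is_list
}
```

Checking the name alone would be unsafe for the filter path, since the same name on a VARCHAR argument has string semantics.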

Signed-off-by: Matt Katz <mhkatz97@gmail.com>
@mhk197 mhk197 added the changelog/performance A performance improvement label Jun 22, 2026
@codspeed-hq (Bot) commented Jun 22, 2026

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

⚡ 1 improved benchmark
❌ 2 regressed benchmarks
✅ 1586 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | bitwise_not_vortex_buffer_mut[128] | 186.1 ns | 215.3 ns | -13.55% |
| Simulation | bitwise_not_vortex_buffer_mut[1024] | 246.4 ns | 275.6 ns | -10.58% |
| Simulation | eq_i64_constant | 318.3 µs | 288.4 µs | +10.35% |

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing mk/duckdb-list-length-pushdown (9d94fe5) with mk/list-length (51b1a4c)

@mhk197 mhk197 added the action/benchmark-sql Trigger SQL benchmarks to run on this PR label Jun 22, 2026
@mhk197 mhk197 changed the title feat[duckdb]: push down list length expression feat(duckdb): push down list length expression Jun 22, 2026
@mhk197 mhk197 changed the title feat(duckdb): push down list length expression perf(duckdb): push down list length expression Jun 22, 2026
@mhk197 mhk197 changed the title perf(duckdb): push down list length expression perf(duckdb): push down list length expressions Jun 22, 2026
@mhk197 mhk197 added action/benchmark-sql Trigger SQL benchmarks to run on this PR and removed action/benchmark-sql Trigger SQL benchmarks to run on this PR labels Jun 22, 2026