group ids as array ref, multi encoding kernel lookup by onursatici · Pull Request #8550 · vortex-data/vortex

onursatici · 2026-06-23T10:39:07Z

Signed-off-by: Onur Satici onur@spiraldb.com

Summary

Closes: #000

Testing

Signed-off-by: Onur Satici <onur@spiraldb.com>

codspeed-hq · 2026-06-23T10:52:18Z

Merging this PR will not alter performance

⚠️

Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️

Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 28 improved benchmarks
❌ 20 regressed benchmarks
✅ 1479 untouched benchmarks
⏩ 9 skipped benchmarks¹
🗄️ 1 archived benchmark run²

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

	Mode	Benchmark	`BASE`	`HEAD`	Efficiency
❌	Simulation	`count_varbinview`	22.8 µs	115.6 µs	-80.3%
❌	Simulation	`sum_f64_all_valid`	51.4 µs	119.8 µs	-57.09%
❌	Simulation	`count_i32_clustered_nulls`	35 µs	76.1 µs	-54.04%
❌	Simulation	`sum_i32_clustered_nulls`	65.5 µs	135.9 µs	-51.78%
❌	Simulation	`sum_f64_clustered_nulls`	68.6 µs	139.2 µs	-50.73%
❌	Simulation	`sum_i32_nullable_all_valid`	105 µs	177.5 µs	-40.87%
❌	Simulation	`compare[31]`	157.9 µs	213.3 µs	-25.99%
❌	Simulation	`compare[30]`	155.4 µs	208.9 µs	-25.6%
❌	Simulation	`compare[29]`	153.6 µs	205.2 µs	-25.16%
❌	Simulation	`compare[28]`	150.9 µs	200.7 µs	-24.83%
❌	Simulation	`compare[27]`	149.5 µs	197.5 µs	-24.28%
❌	Simulation	`compare[26]`	147.1 µs	193.3 µs	-23.87%
❌	Simulation	`compare[25]`	145.3 µs	189.4 µs	-23.31%
❌	Simulation	`compare[24]`	141.1 µs	183.5 µs	-23.12%
❌	Simulation	`compare[23]`	140.9 µs	181.4 µs	-22.32%
❌	Simulation	`compare[22]`	138.5 µs	177.1 µs	-21.79%
❌	Simulation	`compare[21]`	136.7 µs	173.3 µs	-21.15%
❌	Simulation	`compare[20]`	133.5 µs	168.5 µs	-20.74%
❌	Simulation	`compare[19]`	132.3 µs	165.4 µs	-19.97%
❌	Simulation	`compare[18]`	129.8 µs	161 µs	-19.35%
...	...	...	...	...	...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.

_{Comparing os/grouped-aggregate (59d7be7) with ngates/grouped-aggregate (2ef64b2)}

9 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩
1 benchmark was run, but is now archived. If it was deleted in another branch, consider rebasing to remove it from the report. Instead if it was added back, click here to restore it. ↩

gatesn · 2026-06-24T15:39:07Z

+#[derive(Debug)]
+pub(crate) struct CountGroupedKernel;
+
+impl DynGroupedAggregateKernel for CountGroupedKernel {


Don't we typically have non-object-safe typed kernels with associated types, and then lift them into the DynTrait when we register?

gatesn · 2026-06-24T15:41:06Z

+            return Ok(None);
+        };
+
+        let columnar = batch.clone().execute::<Columnar>(ctx)?;


Why execute to Columnar if we never match on Constant?

gatesn · 2026-06-24T15:42:08Z

+                num_groups
+            )
+        })?;
+        Self::from_buffer((0..=last).collect(), num_groups)


Wouldn't we want a sequence array?

gatesn · 2026-06-24T15:42:46Z


-    pub fn group_ids(&self) -> &[u32] {
-        self.group_ids.as_ref()
+    pub fn dense(partials: ArrayRef, num_groups: usize) -> VortexResult<Self> {


Is num_groups == partials.len()?

gatesn · 2026-06-24T15:43:04Z


 type AggregateKernelKey = (ArrayId, Option<AggregateFnId>);
-type GroupedEncodingKernelKey = (ArrayId, AggregateFnId);
+type GroupedAggregateKernelKey = (AggregateFnId, Option<ArrayId>, Option<ArrayId>);


Might be worth a struct at this point

group ids as array ref, multi encoding kernel lookup

59d7be7

Signed-off-by: Onur Satici <onur@spiraldb.com>

gatesn requested changes Jun 24, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

group ids as array ref, multi encoding kernel lookup#8550

group ids as array ref, multi encoding kernel lookup#8550
onursatici wants to merge 1 commit into
ngates/grouped-aggregatefrom
os/grouped-aggregate

onursatici commented Jun 23, 2026

Uh oh!

codspeed-hq Bot commented Jun 23, 2026

Uh oh!

gatesn Jun 24, 2026

Uh oh!

gatesn Jun 24, 2026

Uh oh!

gatesn Jun 24, 2026

Uh oh!

gatesn Jun 24, 2026

Uh oh!

gatesn Jun 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

onursatici commented Jun 23, 2026

Summary

Testing

Uh oh!

codspeed-hq Bot commented Jun 23, 2026

Merging this PR will not alter performance

Performance Changes

Footnotes

Uh oh!

gatesn Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

gatesn Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

gatesn Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

gatesn Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

gatesn Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants