Skip to content

group ids as array ref, multi encoding kernel lookup#8550

Open
onursatici wants to merge 1 commit into
ngates/grouped-aggregatefrom
os/grouped-aggregate
Open

group ids as array ref, multi encoding kernel lookup#8550
onursatici wants to merge 1 commit into
ngates/grouped-aggregatefrom
os/grouped-aggregate

Conversation

@onursatici

Copy link
Copy Markdown
Contributor

Signed-off-by: Onur Satici onur@spiraldb.com

Summary

Closes: #000

Testing

Signed-off-by: Onur Satici <onur@spiraldb.com>
@codspeed-hq

codspeed-hq Bot commented Jun 23, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 28 improved benchmarks
❌ 20 regressed benchmarks
✅ 1479 untouched benchmarks
⏩ 9 skipped benchmarks1
🗄️ 1 archived benchmark run2

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Mode Benchmark BASE HEAD Efficiency
Simulation count_varbinview 22.8 µs 115.6 µs -80.3%
Simulation sum_f64_all_valid 51.4 µs 119.8 µs -57.09%
Simulation count_i32_clustered_nulls 35 µs 76.1 µs -54.04%
Simulation sum_i32_clustered_nulls 65.5 µs 135.9 µs -51.78%
Simulation sum_f64_clustered_nulls 68.6 µs 139.2 µs -50.73%
Simulation sum_i32_nullable_all_valid 105 µs 177.5 µs -40.87%
Simulation compare[31] 157.9 µs 213.3 µs -25.99%
Simulation compare[30] 155.4 µs 208.9 µs -25.6%
Simulation compare[29] 153.6 µs 205.2 µs -25.16%
Simulation compare[28] 150.9 µs 200.7 µs -24.83%
Simulation compare[27] 149.5 µs 197.5 µs -24.28%
Simulation compare[26] 147.1 µs 193.3 µs -23.87%
Simulation compare[25] 145.3 µs 189.4 µs -23.31%
Simulation compare[24] 141.1 µs 183.5 µs -23.12%
Simulation compare[23] 140.9 µs 181.4 µs -22.32%
Simulation compare[22] 138.5 µs 177.1 µs -21.79%
Simulation compare[21] 136.7 µs 173.3 µs -21.15%
Simulation compare[20] 133.5 µs 168.5 µs -20.74%
Simulation compare[19] 132.3 µs 165.4 µs -19.97%
Simulation compare[18] 129.8 µs 161 µs -19.35%
... ... ... ... ... ...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing os/grouped-aggregate (59d7be7) with ngates/grouped-aggregate (2ef64b2)

Open in CodSpeed

Footnotes

  1. 9 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

  2. 1 benchmark was run, but is now archived. If it was deleted in another branch, consider rebasing to remove it from the report. Instead if it was added back, click here to restore it.

#[derive(Debug)]
pub(crate) struct CountGroupedKernel;

impl DynGroupedAggregateKernel for CountGroupedKernel {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we typically have non-object-safe typed kernels with associated types, and then lift them into the DynTrait when we register?

return Ok(None);
};

let columnar = batch.clone().execute::<Columnar>(ctx)?;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why execute to Columnar if we never match on Constant?

num_groups
)
})?;
Self::from_buffer((0..=last).collect(), num_groups)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't we want a sequence array?


pub fn group_ids(&self) -> &[u32] {
self.group_ids.as_ref()
pub fn dense(partials: ArrayRef, num_groups: usize) -> VortexResult<Self> {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is num_groups == partials.len()?


type AggregateKernelKey = (ArrayId, Option<AggregateFnId>);
type GroupedEncodingKernelKey = (ArrayId, AggregateFnId);
type GroupedAggregateKernelKey = (AggregateFnId, Option<ArrayId>, Option<ArrayId>);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth a struct at this point

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants