Support session-time source configuration in DataFusion, and clear up precedence between config sources #8575

Open
AdamGS wants to merge 4 commits into develop from adamg/support-settings-config-session


AdamGS (Contributor) commented Jun 24, 2026

Rationale for this change

Makes it possible to configure Vortex FileSource-based tables using SET statements on an appropriately configured DataFusion session. This is mostly useful for testing or interactive workflows.

What changes are included in this PR?

  1. Use DataFusion's ExtensionOptions and ConfigExtension APIs instead of config_namespace to get configuration handling for free.
  2. Add sqllogictest (SLT) based tests for table and session configuration.
  3. Better document the order of precedence between the various sources of config, trying to mirror the built-in file formats.
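
To make the precedence point concrete, here is a minimal, dependency-free sketch of layered option resolution. It is not the code in this PR: the `LayerOptions` and `ResolvedOptions` types, the `batch_size` and `prefetch` fields, and the default values are all hypothetical, and the order shown (table-level options override session-level options, which override built-in defaults) is an assumption about what "mirroring the built-in file formats" means.

```rust
// Hypothetical illustration only: none of these names come from Vortex or DataFusion.

/// Options set at a single configuration layer; `None` means "not set at this layer".
#[derive(Debug, Clone, Default, PartialEq)]
struct LayerOptions {
    batch_size: Option<usize>,
    prefetch: Option<bool>,
}

/// Fully resolved options, with every field filled in.
#[derive(Debug, Clone, PartialEq)]
struct ResolvedOptions {
    batch_size: usize,
    prefetch: bool,
}

impl Default for ResolvedOptions {
    /// Built-in defaults, used when neither the table nor the session sets a value.
    fn default() -> Self {
        Self {
            batch_size: 8192,
            prefetch: false,
        }
    }
}

/// Resolve each field independently, assuming the order: table > session > defaults.
fn resolve(session: &LayerOptions, table: &LayerOptions) -> ResolvedOptions {
    let defaults = ResolvedOptions::default();
    ResolvedOptions {
        batch_size: table
            .batch_size
            .or(session.batch_size)
            .unwrap_or(defaults.batch_size),
        prefetch: table
            .prefetch
            .or(session.prefetch)
            .unwrap_or(defaults.prefetch),
    }
}

fn main() {
    // The session sets both options (for example via SET statements).
    let session = LayerOptions {
        batch_size: Some(4096),
        prefetch: Some(true),
    };
    // The table overrides only `batch_size`.
    let table = LayerOptions {
        batch_size: Some(1024),
        prefetch: None,
    };

    let resolved = resolve(&session, &table);
    println!("{resolved:?}");
}
```

Resolving field by field, rather than picking one whole layer, lets a table override a single option while still inheriting the rest from the session.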

What APIs are changed? Are there any user-facing changes?

For users of VortexSource and VortexFormat, some APIs might change slightly. If they expose Vortex through an interactive interface, the configuration options available there might change slightly as well.

AdamGS added 3 commits June 24, 2026 15:45
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
.
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
AdamGS requested review from a team, a10y and onursatici June 24, 2026 15:10
AdamGS added the labels changelog/feature (A new feature) and ext/datafusion (Relates to the DataFusion integration) Jun 24, 2026
AdamGS changed the title from "Adamg/support settings config session" to "Support session-time source configuration in DataFusion, and clear up precedence between config sources" Jun 24, 2026
codspeed-hq (Bot) commented Jun 24, 2026

Merging this PR will improve performance by 11.9%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 4 improved benchmarks
✅ 1585 untouched benchmarks
⏩ 4 skipped benchmarks [1]

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | encode_varbin[(1000, 4)] | 157 µs | 139.8 µs | +12.34% |
| Simulation | encode_varbin[(1000, 32)] | 162.5 µs | 144.8 µs | +12.22% |
| Simulation | encode_varbin[(1000, 8)] | 157.3 µs | 140.4 µs | +12.05% |
| Simulation | encode_varbin[(1000, 2)] | 156.1 µs | 140.7 µs | +10.98% |

Tip

Curious why this is faster? Comment @codspeedbot explain why this is faster on this PR, or directly use the CodSpeed MCP with your agent.


Comparing adamg/support-settings-config-session (2ab0b57) with develop (2a19323)


Footnotes

  1. 4 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Signed-off-by: Adam Gutglick <adam@spiraldb.com>