feat(isthmus): make unquoted identifier casing configurable in ConverterProvider #983

Draft
nielspardon wants to merge 1 commit into substrait-io:main from nielspardon:feat/configurable-unquoted-casing

nielspardon commented Jul 2, 2026

Summary

Adds constructor-based configuration of unquoted SQL identifier casing to ConverterProvider, so that isthmus consumers can control how unquoted identifiers are cased during parsing. The default remains Casing.TO_UPPER (no behaviour change).

Previously the only way to change this was to subclass ConverterProvider and override getSqlParserConfig() — as IsthmusEntryPoint already did with an anonymous class. That workaround is now replaced by a first-class constructor parameter.

Changes

ConverterProvider

  • unquotedCasing is a new final field, consistent with executionBehavior
  • getUnquotedCasing() is a new getter that returns the configured casing
  • getSqlParserConfig() reads unquotedCasing instead of hard-coding Casing.TO_UPPER
  • New constructors: ConverterProvider(Casing) and ConverterProvider(extensions, typeFactory, Casing) for the common cases; the existing 7-arg all-components constructor gains Casing as an 8th parameter. All narrower constructors default to Casing.TO_UPPER.
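The constructor pattern described above can be sketched in isolation. This is a minimal model, not the real class: ConverterProviderSketch, its nested Casing enum (a stand-in for Calcite's Casing, which has the same three constants), and the String returned by getSqlParserConfig() are all assumptions made so that the example compiles without isthmus or Calcite on the classpath.

```java
/** Hypothetical, simplified model of the configuration pattern; not the real ConverterProvider. */
class ConverterProviderSketch {
    /** Stand-in for Calcite's Casing enum, which has these three constants. */
    enum Casing { UNCHANGED, TO_UPPER, TO_LOWER }

    // Set once at construction, like the unquotedCasing field described above.
    private final Casing unquotedCasing;

    /** Narrower constructor: keeps the TO_UPPER default. */
    ConverterProviderSketch() {
        this(Casing.TO_UPPER);
    }

    /** Constructor that lets the caller choose the casing. */
    ConverterProviderSketch(Casing unquotedCasing) {
        if (unquotedCasing == null) {
            throw new IllegalArgumentException("unquotedCasing must not be null");
        }
        this.unquotedCasing = unquotedCasing;
    }

    Casing getUnquotedCasing() {
        return unquotedCasing;
    }

    /**
     * Models getSqlParserConfig(): reads the field instead of a hard-coded value.
     * The real method returns a Calcite parser config; a String stands in here.
     */
    String getSqlParserConfig() {
        return "unquotedCasing=" + unquotedCasing;
    }

    public static void main(String[] args) {
        // Old workaround: an anonymous subclass overriding the config method.
        ConverterProviderSketch viaSubclass = new ConverterProviderSketch() {
            @Override
            String getSqlParserConfig() {
                return "unquotedCasing=" + Casing.UNCHANGED;
            }
        };
        // New style: the casing is passed as a constructor argument.
        ConverterProviderSketch viaConstructor = new ConverterProviderSketch(Casing.UNCHANGED);

        System.out.println(viaSubclass.getSqlParserConfig());
        System.out.println(viaConstructor.getSqlParserConfig());
        System.out.println(new ConverterProviderSketch().getSqlParserConfig());
    }
}
```

In this model the subclass workaround changes the parser config but leaves getUnquotedCasing() at the default, whereas the constructor argument keeps the getter and the config in agreement.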

Propagation through the pipeline

The casing setting is applied consistently across both CREATE TABLE parsing and query parsing, so that the table name stored in a NamedScan matches the configured casing end-to-end.

| Class | Change |
| --- | --- |
| SubstraitSqlToCalcite | New convertQueries(sql, catalog, ConverterProvider, operatorTable) overload; passes getSqlParserConfig() down to the statement parser |
| SqlToSubstrait | convert(sql, catalog) now uses the ConverterProvider overload of convertQueries |
| SubstraitCreateStatementParser | New processCreateStatements(ConverterProvider, sql) and processCreateStatementsToCatalog(ConverterProvider, ...) overloads; SqlParser.Config stays an internal detail |
| SqlExpressionToSubstrait | Uses processCreateStatements(converterProvider, tableDef) |
| IsthmusEntryPoint | Uses new ConverterProvider(unquotedCasing); anonymous ConverterProvider subclass removed |
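To illustrate why the same casing has to reach both parsing steps, here is a rough model that assumes a case-sensitive catalog lookup. All names in it (CasingPropagationSketch, applyCasing, registerTable, resolveScanName) are hypothetical and are not isthmus APIs; the set of strings stands in for the real catalog.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

/** Hypothetical model of casing propagation; none of these names are isthmus APIs. */
class CasingPropagationSketch {
    /** Stand-in for Calcite's Casing enum. */
    enum Casing { UNCHANGED, TO_UPPER, TO_LOWER }

    /** Applies the configured casing to an unquoted identifier. */
    static String applyCasing(String unquotedIdentifier, Casing casing) {
        switch (casing) {
            case TO_UPPER:
                return unquotedIdentifier.toUpperCase(Locale.ROOT);
            case TO_LOWER:
                return unquotedIdentifier.toLowerCase(Locale.ROOT);
            default:
                return unquotedIdentifier;
        }
    }

    /** Models CREATE TABLE parsing: the table is stored under its cased name. */
    static void registerTable(Set<String> catalog, String tableName, Casing ddlCasing) {
        catalog.add(applyCasing(tableName, ddlCasing));
    }

    /** Models query parsing: returns the scan name only if the cased name resolves. */
    static Optional<String> resolveScanName(Set<String> catalog, String tableName, Casing queryCasing) {
        String cased = applyCasing(tableName, queryCasing);
        return catalog.contains(cased) ? Optional.of(cased) : Optional.empty();
    }

    public static void main(String[] args) {
        Set<String> catalog = new HashSet<>();
        registerTable(catalog, "employees", Casing.UNCHANGED);

        // Same casing on both sides: the table resolves under the configured name.
        System.out.println(resolveScanName(catalog, "employees", Casing.UNCHANGED));
        // Different casing on the query side: the cased name is not in the catalog.
        System.out.println(resolveScanName(catalog, "employees", Casing.TO_UPPER));
    }
}
```

Under that assumption, a single setting read by both steps keeps the stored table name and the scanned table name in agreement.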

Test

UnquotedCasingTest verifies:

  • The default casing is TO_UPPER and is reflected in getSqlParserConfig()
  • new ConverterProvider(Casing) sets the casing correctly for all three Casing values
  • End-to-end: with TO_UPPER a plan built from CREATE TABLE employees … / SELECT … FROM employees produces a NamedScan with name EMPLOYEES; with UNCHANGED it produces employees
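The expected names for each casing value can be worked through with a small stand-alone model. It is not the test from this PR: UnquotedCasingValuesSketch and normalize are hypothetical names, and the treatment of double-quoted identifiers (spelling preserved, quotes stripped) is an assumption based on Calcite handling quoted identifiers through a separate setting.

```java
import java.util.Locale;

/** Hypothetical model of how each casing value treats an identifier. */
class UnquotedCasingValuesSketch {
    /** Stand-in for Calcite's Casing enum. */
    enum Casing { UNCHANGED, TO_UPPER, TO_LOWER }

    /**
     * Returns the name an identifier would be stored under.
     * Assumption: double-quoted identifiers keep their spelling and only lose the quotes.
     */
    static String normalize(String rawIdentifier, Casing unquotedCasing) {
        boolean quoted = rawIdentifier.length() >= 2
                && rawIdentifier.startsWith("\"")
                && rawIdentifier.endsWith("\"");
        if (quoted) {
            return rawIdentifier.substring(1, rawIdentifier.length() - 1);
        }
        switch (unquotedCasing) {
            case TO_UPPER:
                return rawIdentifier.toUpperCase(Locale.ROOT);
            case TO_LOWER:
                return rawIdentifier.toLowerCase(Locale.ROOT);
            default:
                return rawIdentifier;
        }
    }

    public static void main(String[] args) {
        // Print the resulting name for an unquoted and a quoted identifier under each value.
        for (Casing casing : Casing.values()) {
            System.out.println(casing + ": " + normalize("employees", casing)
                    + " / " + normalize("\"Employees\"", casing));
        }
    }
}
```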

Notes

SubstraitSqlStatementParser keeps SqlParser.Config as its parameter type — it is a low-level parse primitive used by multiple callers with different configs. ConverterProvider awareness belongs one level up, which is where it already sits.

nielspardon force-pushed the feat/configurable-unquoted-casing branch from 85f5d3a to 648b0e1 on July 2, 2026 19:15