feat(isthmus): make unquoted identifier casing configurable in ConverterProvider#983
Draft
nielspardon wants to merge 1 commit into
Draft
Conversation
…terProvider Add constructor-based configuration of unquoted SQL identifier casing to ConverterProvider, so that isthmus consumers can control how unquoted identifiers are cased during parsing. The default remains Casing.TO_UPPER (no behaviour change). Previously the only way to change this was to subclass ConverterProvider and override getSqlParserConfig() — as IsthmusEntryPoint already did with an anonymous class. That workaround is now replaced by a first-class constructor parameter. Changes to ConverterProvider: - unquotedCasing is a new final field, consistent with executionBehavior - getUnquotedCasing() getter - getSqlParserConfig() reads unquotedCasing instead of hard-coding TO_UPPER - new ConverterProvider(Casing) and ConverterProvider(extensions, typeFactory, Casing) for the common cases; the existing 7-arg constructor gains Casing as an 8th parameter; all narrower constructors default to Casing.TO_UPPER Propagation through the pipeline — casing is applied consistently across both CREATE TABLE parsing and query parsing: - SubstraitSqlToCalcite: new convertQueries(sql, catalog, ConverterProvider, operatorTable) overload passes getSqlParserConfig() to the statement parser - SqlToSubstrait: convert(sql, catalog) uses the ConverterProvider overload - SubstraitCreateStatementParser: new processCreateStatements(ConverterProvider, sql) and processCreateStatementsToCatalog(ConverterProvider, ...) overloads; SqlParser.Config stays an internal detail - SqlExpressionToSubstrait: uses processCreateStatements(converterProvider, tableDef) - IsthmusEntryPoint: uses new ConverterProvider(unquotedCasing); anonymous ConverterProvider subclass removed
85f5d3a to
648b0e1
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds constructor-based configuration of unquoted SQL identifier casing to
ConverterProvider, so that isthmus consumers can control how unquoted identifiers are cased during parsing. The default remainsCasing.TO_UPPER(no behaviour change).Previously the only way to change this was to subclass
ConverterProviderand overridegetSqlParserConfig()— asIsthmusEntryPointalready did with an anonymous class. That workaround is now replaced by a first-class constructor parameter.Changes
ConverterProviderunquotedCasingis a newfinalfield, consistent withexecutionBehaviorgetUnquotedCasing()— gettergetSqlParserConfig()readsunquotedCasinginstead of hard-codingCasing.TO_UPPERConverterProvider(Casing)andConverterProvider(extensions, typeFactory, Casing)for the common cases; the existing 7-arg all-components constructor gainsCasingas an 8th parameter. All narrower constructors default toCasing.TO_UPPER.Propagation through the pipeline
The casing setting is applied consistently across both CREATE TABLE parsing and query parsing, so that the table name stored in a
NamedScanmatches the configured casing end-to-end.SubstraitSqlToCalciteconvertQueries(sql, catalog, ConverterProvider, operatorTable)overload; passesgetSqlParserConfig()down to the statement parserSqlToSubstraitconvert(sql, catalog)now uses theConverterProvideroverload ofconvertQueriesSubstraitCreateStatementParserprocessCreateStatements(ConverterProvider, sql)andprocessCreateStatementsToCatalog(ConverterProvider, ...)overloads;SqlParser.Configstays an internal detailSqlExpressionToSubstraitprocessCreateStatements(converterProvider, tableDef)IsthmusEntryPointnew ConverterProvider(unquotedCasing); anonymousConverterProvidersubclass removedTest
UnquotedCasingTestverifies:TO_UPPERand is reflected ingetSqlParserConfig()new ConverterProvider(Casing)sets the casing correctly for all threeCasingvaluesTO_UPPERa plan built fromCREATE TABLE employees … / SELECT … FROM employeesproduces aNamedScanwith nameEMPLOYEES; withUNCHANGEDit producesemployeesNotes
SubstraitSqlStatementParserkeepsSqlParser.Configas its parameter type — it is a low-level parse primitive used by multiple callers with different configs.ConverterProviderawareness belongs one level up, which is where it already sits.