feat: add queryMetadata config hook with BigQuery job-label support #11065

Open
darrenjl wants to merge 2 commits into cube-js:master from darrenjl:feat/query-metadata-hook

@darrenjl

Closes #11054, references #7753.

Adds a generic queryMetadata configuration hook that Cube evaluates once per REST/GraphQL request and threads through to the driver as options.queryMetadata. BigQuery is the first consumer: it maps the entries to BigQuery job labels, which surface in INFORMATION_SCHEMA.JOBS for cost attribution and trace correlation.

Usage

// cube.js
module.exports = {
  queryMetadata: ({ requestId, securityContext }) => ({
    tenant:     securityContext.tenantId,
    team:       securityContext.team,
    request_id: requestId,   // correlate BigQuery jobs back to the upstream session
  }),
};

Set x-request-id in your upstream app or AI agent; Cube already uses that header verbatim as requestId (with traceparent as fallback), so it flows straight through to the label. Query cost by session:

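SELECT job_id, total_bytes_billed, total_slot_ms, labels
FROM `region-us`.INFORMATION_SCHEMA.JOBS
-- labels is an array of (key, value) structs, so match it via UNNEST
WHERE EXISTS (SELECT 1 FROM UNNEST(labels) AS l
              WHERE l.key = 'request_id' AND l.value = 'session-tenant42-20240611-abc123')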
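
On the calling side, setting the header might look like the sketch below. The helper name `buildCubeHeaders` and the `sessionId`/`token` parameters are hypothetical and not part of this PR; the only fact taken from the description is that Cube uses `x-request-id` verbatim as `requestId`.

```typescript
// Hypothetical helper (not part of the PR): builds headers for a Cube REST call.
function buildCubeHeaders(sessionId: string, token: string): Record<string, string> {
  return {
    // Placeholder auth token; how it is obtained is outside this sketch.
    Authorization: token,
    "Content-Type": "application/json",
    // Per the PR description, Cube uses this header verbatim as requestId.
    "x-request-id": sessionId,
  };
}

// The ID is already lowercase, limited to [a-z0-9_-], and under 63 chars,
// so label sanitization should leave it unchanged and the WHERE clause matches.
const headers = buildCubeHeaders("session-tenant42-20240611-abc123", "<JWT>");
```

Choosing an ID that already satisfies the label constraints listed under "BigQuery driver" keeps the value stored in the job label identical to the one the upstream app logged.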

Why a generic hook rather than driver-specific options

Two requirements can't be satisfied by static driver config:

  1. requestId is dynamic per request — it can't live in driverFactory config set at startup.
  2. securityContext doesn't reach the driver's query() today — the only workaround is driverFactory(context) building a fresh driver per tenant, which forces contextToOrchestratorId and makes per-request metadata impossible.

A hook evaluated in contextByReq — where both requestId and securityContext are in scope — solves both and serves every driver without further core changes.
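
As a rough illustration of the per-request evaluation described above, the following sketch resolves a hook that may be sync or async and attaches the result to the context. `RequestContextSketch`, `QueryMetadataHook`, and `withQueryMetadata` are hypothetical names; the real `contextByReq` and `RequestContext` in Cube carry more fields and may be structured differently.

```typescript
type QueryMetadata = Record<string, string>;

// Simplified stand-in for Cube's RequestContext (assumption: only these two fields).
interface RequestContextSketch {
  requestId: string;
  securityContext: Record<string, unknown>;
}

// Assumed hook shape: may return the metadata directly or a promise of it.
type QueryMetadataHook = (
  context: RequestContextSketch
) => QueryMetadata | Promise<QueryMetadata>;

// Hypothetical helper: evaluates the hook once and attaches the result.
async function withQueryMetadata(
  context: RequestContextSketch,
  hook?: QueryMetadataHook
): Promise<RequestContextSketch & { queryMetadata?: QueryMetadata }> {
  if (!hook) {
    return context; // no hook configured: context passes through unchanged
  }
  // `await` handles both plain objects and promises.
  const queryMetadata = await hook(context);
  return { ...context, queryMetadata };
}
```

Because both `requestId` and `securityContext` are fields of the same context object, a single hook call can combine them, which static driver config cannot do.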

Changes

New config hook (cross-package)

  • queryMetadata?: (context: RequestContext) => Record<string,string> | Promise<...> added to CreateOptions (server-core), ApiGatewayOptions (api-gateway), validated with Joi.func()
  • Evaluated in ApiGateway.contextByReq alongside extendContext
  • Propagated as queryMetadata on QueryBody → CacheQueryResultOptions → queryWithRetryAndRelease → the _query object passed to client.query(req.query, req.values, req), mirroring requestId exactly
  • QueryOptions in @cubejs-backend/base-driver gains an explicit queryMetadata? field

BigQuery driver

  • buildJobLabels(options) reads options.queryMetadata and sanitizes values to BigQuery label constraints (lowercase, [a-z0-9_-] only, max 63 chars)
  • Applied in runQueryJob() — the single chokepoint for query() and loadPreAggregationIntoTable()
  • stream() unchanged (out of scope for this PR)
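
The sanitization rules above (lowercase, `[a-z0-9_-]` only, max 63 chars) can be sketched roughly as follows. This is not the PR's `buildJobLabels`; the names `sanitizeLabelValue` and `buildJobLabelsSketch` are hypothetical, and replacing disallowed characters with `_` is an assumption (the actual implementation might drop them instead).

```typescript
const MAX_LABEL_LENGTH = 63;

// Hypothetical sanitizer: lowercase, replace anything outside [a-z0-9_-]
// with "_" (assumed substitution), then truncate to 63 characters.
function sanitizeLabelValue(value: string): string {
  return value
    .toLowerCase()
    .replace(/[^a-z0-9_-]/g, "_")
    .slice(0, MAX_LABEL_LENGTH);
}

// Hypothetical counterpart to buildJobLabels(options): reads options.queryMetadata
// and sanitizes each value. Keys are passed through unchanged in this sketch.
function buildJobLabelsSketch(
  options?: { queryMetadata?: Record<string, string> }
): Record<string, string> {
  const labels: Record<string, string> = {};
  for (const [key, value] of Object.entries(options?.queryMetadata ?? {})) {
    labels[key] = sanitizeLabelValue(String(value));
  }
  return labels;
}
```

One consequence worth noting: because values are rewritten, a lookup in INFORMATION_SCHEMA.JOBS has to use the sanitized form of the value, not the raw one.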

Tests

  • New packages/cubejs-bigquery-driver/test/unit/labels.test.ts — 8 cases, mocked @google-cloud/bigquery, no GCP credentials required

Docs

  • docs-mintlify/reference/configuration/config.mdx — new query_metadata section
  • docs-mintlify/admin/connect-to-data/data-sources/google-bigquery.mdx — updated Job Labels section

Known limitations (future PRs)

  • Applies to REST/GraphQL request path only (contextByReq). Refresh-worker pre-aggregation builds and the SQL API are not request-scoped and won't carry metadata — semantically correct (no upstream request).
  • stream() is not labelled in this PR.
  • Other drivers (Snowflake QUERY_TAG, SQL comments, etc.) can adopt options.queryMetadata without further core changes.
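
To illustrate the last point, a driver without a native labelling mechanism could render `options.queryMetadata` as a leading SQL comment. The sketch below is purely hypothetical (`toSqlComment` and `annotateQuery` are not part of this PR or of any existing Cube driver), and the character allow-list is an assumption chosen so that a value cannot close the comment early.

```typescript
// Hypothetical: renders metadata as a leading SQL comment.
function toSqlComment(metadata: Record<string, string>): string {
  // Keep only conservative characters so "*/" in a value cannot end the comment.
  const safe = (s: string): string => s.replace(/[^A-Za-z0-9_.:-]/g, "_");
  const pairs = Object.entries(metadata).map(
    ([key, value]) => `${safe(key)}=${safe(value)}`
  );
  return pairs.length > 0 ? `/* ${pairs.join(", ")} */ ` : "";
}

// Hypothetical driver-side helper: prefixes the SQL when metadata is present.
function annotateQuery(
  sql: string,
  options?: { queryMetadata?: Record<string, string> }
): string {
  return toSqlComment(options?.queryMetadata ?? {}) + sql;
}
```

A Snowflake driver would more likely set QUERY_TAG on the session or statement instead; the comment form is only the lowest common denominator.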

Checklist

  • Tests added / updated
  • Docs updated
  • Linter passes
  • TypeScript compiles cleanly for all modified packages

…etadata (e.g. BigQuery job labels) to every data-source query for cost attribution and trace correlation
@darrenjl darrenjl requested review from a team and keydunov as code owners June 11, 2026 12:47
@github-actions github-actions bot added the driver:bigquery, javascript, data source driver, and pr:community labels Jun 11, 2026

Development

Successfully merging this pull request may close these issues.

Feature request: first-class hook to attach BigQuery job labels to generated queries