Complex SQL with JooQ
Adhoc speaks SQL through JooQ. The simplest case — a single table — is
already covered in Tables: hand JooqTableWrapperParameters a Table<?> and you're
done. This article is about the next step up: snowflake / star schemas, multi-key joins, diamond
joins, join pruning, and the column-alias mechanism — everything JooqTableSupplierBuilder and
its pruning subclass PrunedJoinsJooqTableSupplierBuilder exist for.
When you need it
Use the builder when any of the following is true:
- Your table is a multi-arm snowflake (fact joined to several dimensions, dimensions
themselves joined to deeper dimensions). Hand-rolling the JooQ
FROMclause becomes verbose and error-prone fast. - You expose caller-friendly column names that don't match the underlying SQL columns —
e.g. you want callers to query
countryeven though the actual column iscustomers.customer_country. - Most queries touch only a subset of the joins, and you want unused arms pruned out of
the SQL
FROMclause to reduce engine work (see Join pruning below).
If none of those apply, the plain JooqTableWrapperParameters.builder().table(DSL.table(...))
path is enough — don't reach for the builder.
Basic snowflake
PrunedJoinsJooqTableSupplierBuilder schema = PrunedJoinsJooqTableSupplierBuilder.prunedBuilder()
.baseTable(DSL.table("orders"))
.baseTableAlias("f")
.dslSupplier(dslSupplier)
.build()
// f → customers
.leftJoin(j -> j.table(DSL.table("customers")).alias("c").on("customer_id", "id"))
// f → products → categories (snowflake leg: `.from("p")` switches the parent)
.leftJoin(j -> j.table(DSL.table("products")).alias("p").on("product_id", "id"))
.leftJoin(j -> j.table(DSL.table("categories"))
.alias("cat")
.from("p")
.on("category_id", "id"));
JooqTableWrapperParameters params = JooqTableWrapperParameters.builder()
.dslSupplier(dslSupplier)
.tableSupplier(PrunedJoinsJooqTableSupplier.builder().schema(schema).build())
.build();
ITableWrapper table = new JooqTableWrapper("orders", params);
The tableSupplier(IJooqTableSupplier) setter wires both .table(...) (used for schema
introspection — getColumns) and the per-query pruning supplier in one go. You no longer
have to hand .table(...) separately.
Diamond joins
Adhoc inherits the full SQL join expressiveness — diamond joins, asOf joins, window-function joins, multi-key composite ON-clauses are all supported. This is one of the major differences with Atoti, where the schema is restricted to a single-parent star/snowflake tree.
A diamond join is one where the joined table depends on columns from multiple parent tables.
Classic shape: pricing_rules keyed by (region_id, segment), where region_id lives on the
fact table and segment lives on the customers dimension.
.leftJoin(j -> j.table(DSL.table("customers")).alias("b").on("customer_id", "id"))
.leftJoin(j -> j.table(DSL.table("pricing_rules"))
.alias("c")
.on("region_id", "region_id") // c.region_id = f.region_id (LEFT side defaults to base)
.on("b.segment", "segment")); // c.segment = b.segment (LEFT side fully qualified)
The fully-qualified b.segment LEFT side declares an ON-clause dependency on b. The pruning
closure picks this up automatically: a query that asks only for c.discount_pct will keep both
b AND c in the materialised FROM (without that, b would be pruned and the SQL would
reference a dropped alias).
parentAlias is therefore a convenience shortcut for the common case where ON-clauses don't
need qualification, not a structural constraint of the join graph.
Join pruning {#join-pruning}
The big win. When most queries touch only a subset of the joins, eagerly composing the full
baseTable.leftJoin(...).leftJoin(...)chain pays a per-join cost in the SQL engine even when the join contributes no columns to theSELECT. DuckDB, for example, pays a meaningful cost perLEFT JOINeven on rows that don't survive any filter.PrunedJoinsJooqTableSupplierprunes theFROMclause per query to the minimum set of joins reachable from the columns the query actually references.
How the closure works:
- Collect referenced columns from the query — group-by columns, aggregator source columns,
and every column appearing in a filter (shared
WHEREor per-aggregatorFILTER (WHERE …)). - Look each column up in the supplier's column→join index (built lazily from each
JoinNode'scolumnsOverrideor — by default — aSELECT * LIMIT 0probe of the joined table). - Mark the owning join needed, then walk the parent chain transitively (snowflake ancestors come along) plus every alias the ON-clause references (for diamond joins).
- Force-include any join declared with
.prunable(false)— for joins whose presence changes row cardinality (an INNER-style join used as a filter, for instance) and must therefore not be elided. - Materialise the resulting subset back into a
Table<Record>.
The decision is cached per referenced-column set (bounded LRU), so repeated queries with the same column footprint don't re-walk the closure.
Things that subvert pruning
expressionColumnExtractor-misses: when a column reference is wrapped in a SQL expression the extractor doesn't recognise (custom dialect functions, deeply nested expressions), the supplier raises a strict-mode error rather than silently dropping the join. Configure a customIExpressionColumnExtractoron the supplier when this bites.prunable(false)declared on a join opts it out of pruning entirely. Use this sparingly — only when the join's presence changes the result set semantically.- Calculated / non-pushdown filters force the table layer to hoist the filter columns into every grouping set so the post-filter has per-row values to evaluate against — these columns show up as ON-clause leftovers and the closure correctly keeps the providing join alive.
Column aliases
JooqTableSupplierBuilder records caller-facing column aliases via withAlias(...) — both as a
fluent setter on the join builder (per-join scope) and as a top-level setter on the schema
builder (post-join, scoped to the most recently declared join):
.leftJoin(j -> j.table(DSL.table("customers"))
.alias("c")
.on("customer_id", "id")
// Caller-facing alias `country` redirects to the actual table column `customer_country`.
.withAlias("country", "customer_country"));
The alias is stored as "c"."customer_country" (jOOQ-escaped two-part Name) so the form is
unambiguous — it survives column names that contain dots or whitespace.
Important: aliases must be combined with the cube's ColumnsManager
Read this twice.
JooqTableSupplierBuilder'saliasToOriginalmap is consumed by the table layer (pruning,JooqTableWrapper.getColumns) but is NOT automatically picked up by the cube'sColumnsManager.aliaser— the component that translatesqueried → underlyingat SQL render time. If you don't wire the alias map intoColumnsManager, every alias-named column reaches the SQL renderer untranslated and the database rejects it with a "column not found" error.
The wiring you need at cube-build time:
MapTableAliaser tableAliaser = MapTableAliaser.builder()
.aliasToOriginals(schema.getAliasToOriginal())
.build();
// If you have additional aliases beyond what the schema knows (e.g. cube-level renames,
// multi-table federation), compose them with the table-level aliaser. For a single source,
// pass `tableAliaser` directly; for multiple, use a `CompositeTableTranscoder`.
ITableAliaser combined = composeWithCustomAliasers(tableAliaser, /* user-provided aliasers */);
CubeWrapper cube = CubeWrapper.builder()
.table(table)
.forest(forest)
.columnsManager(ColumnsManager.builder().aliaser(combined).build())
.build();
The reverse direction — turning a ColumnMetadata back into the joined column it came from —
is handled automatically: JooqTableWrapper.getColumns() reads getAliasToOriginal() from the
supplier and attaches the caller-facing alias names to the corresponding ColumnMetadata.aliases
set, and renames any column whose bare name is shadowed by an alias to its qualified form
(e.g. b.country rather than country when an alias claims country for a different join).
Bare-name collisions
When an alias claims a bare name already used by a real column on a different join, the
shadowed column is re-exposed in the column list under its qualified form <owner>.<col>. This
is intentional: the bare name is unambiguously the alias's target, and the shadowed column
stays reachable via its qualified form. The auto-registered ON-clause "aliases" (e.g. the bare
name productId registered as the canonical form on a natural-key join) are detected and do
NOT trigger renaming — they're identity mappings, not user-declared shadowings.
Pitfalls and edge cases
- Plain
DSL.table("name")carries no declared fields. The defaultIJooqColumnsResolveris aSELECT * LIMIT 0probe over the joined table — that works fine, but if you want the builder to not hit the DB for column resolution, either set.providedColumns(Set.of(...))on each join or use a derived table built viaDSL.values(...).asTable(alias, fieldNames...)whosefields()carries the declared columns. - Identifiers needing quotes. Column names with spaces, mixed case (when the dialect folds
identifiers), or other special characters must be created with
DSL.quotedName(...)so the underlying engine preserves them. The supplier's alias map then stores the jOOQ-escaped form ("c"."country group") and queries through the alias work end-to-end. Direct column references can also use the bare-dotted formc."country group"if your dialect supports it, but the alias path is more portable. leftJoinConditions(...)registers a join with rawConditionobjects — the supplier has no column-contract knowledge for that join, so it is forced toprunable=false. Prefer.leftJoin(Consumer<JooqJoinBuilder>)whenever possible.- Late
.leftJoin(...)s registered after queries have flowed needPrunedJoinsJooqTableSupplier#invalidateAll()to drop the stale column→alias index.
Related reading
- Tables — the simpler single-table path, where this article picks up.
- CubeQueryEngine — how the cube DAG and the table DAG interact, so you
can understand where the
aliasertranslation step happens in the pipeline. - Optimisations — broader catalogue of query-time optimisations beyond join pruning.