← All projects

yardstick

Measures in SQL. Semantic layer primitives for DuckDB.

An experimental DuckDB extension implementing measure-aware SQL. Define aggregations that know how to re-aggregate themselves, enabling percent-of-total and year-over-year calculations without CTEs or window functions.

Installation

INSTALL yardstick FROM community;
LOAD yardstick;

Defining Measures

Define measures in views using the AS MEASURE keyword:

CREATE VIEW sales_v AS
SELECT
    year,
    region,
    SUM(amount) AS MEASURE revenue,
    COUNT(*) AS MEASURE order_count
FROM sales;

Querying Measures

Query measures with SEMANTIC SELECT and AGGREGATE():

SEMANTIC SELECT
    year,
    region,
    AGGREGATE(revenue) AS revenue
FROM sales_v;

AT Modifiers

AT modifiers change aggregation context, enabling powerful analytics:

SEMANTIC SELECT
    year,
    region,
    AGGREGATE(revenue) AS revenue,
    -- Grand total across all dimensions
    AGGREGATE(revenue) AT (ALL) AS grand_total,
    -- Percent of total
    100.0 * AGGREGATE(revenue) / AGGREGATE(revenue) AT (ALL) AS pct_of_total,
    -- Remove region dimension (year total)
    AGGREGATE(revenue) AT (ALL region) AS year_total,
    -- Percent of year
    100.0 * AGGREGATE(revenue) / AGGREGATE(revenue) AT (ALL region) AS pct_of_year
FROM sales_v;

Year-over-Year Comparisons

SEMANTIC SELECT
    region,
    AGGREGATE(revenue) AT (SET year = 2024) AS rev_2024,
    AGGREGATE(revenue) AT (SET year = 2023) AS rev_2023,
    AGGREGATE(revenue) AT (SET year = 2024) -
      AGGREGATE(revenue) AT (SET year = 2023) AS yoy_growth
FROM sales_v;

Filtering with AT (WHERE)

SEMANTIC SELECT
    year,
    AGGREGATE(revenue) AS total,
    AGGREGATE(revenue) AT (WHERE region = 'North') AS north_only,
    AGGREGATE(revenue) AT (WHERE region IN ('North', 'South')) AS north_south
FROM sales_v;

How Measures Work

Traditional SQL requires CTEs or window functions for analytics like percent-of-total. Measures solve this by understanding aggregation context:

  • Context-aware: Measures know which dimensions they're grouped by
  • Correlation: AT modifiers correlate on remaining dimensions
  • Automatic grouping: GROUP BY is inferred from measure definitions

Based on Julian Hyde's research paper "Measures in SQL".