Sidequery

Configuration

YAML file structure, database connections, models, dimensions, metrics, and CLI configuration for Sidemantic

Sidemantic configuration is specified in YAML files and can be overridden via CLI flags.

YAML File Structure

Basic semantic layer YAML file:

# semantic_layer.yml

# Database connection (optional, defaults to in-memory DuckDB)
# Supports environment variable substitution
connection: duckdb:///data/analytics.duckdb

# Models define your tables and metrics
models:
  - name: orders
    table: orders
    primary_key: order_id

    dimensions:
      - name: status
        type: categorical
        sql: status

      - name: order_date
        type: time
        sql: created_at
        granularity: day

    metrics:
      - name: revenue
        agg: sum
        sql: amount

      - name: order_count
        agg: count

    relationships:
      - name: customers
        type: many_to_one
        foreign_key: customer_id

# Optional: Graph-level metrics
metrics:
  - name: total_revenue
    sql: orders.revenue

Connection Configuration

Specify database connection in YAML:

# DuckDB (default)
connection: duckdb:///:memory:
connection: duckdb:///path/to/database.duckdb

# PostgreSQL
connection: postgres://user:pass@host:5432/database

# BigQuery
connection: bigquery://project-id/dataset-id

# Snowflake
connection: snowflake://user:pass@account/database/schema?warehouse=wh

# ClickHouse
connection: clickhouse://user:pass@host:8123/database

# Databricks
connection: databricks://token@server/http-path?catalog=main

# Spark SQL
connection: spark://host:10000/database

Environment Variables in YAML

Use environment variables for sensitive credentials:

# Use ${ENV_VAR} syntax
connection: postgres://${DB_USER}:${DB_PASSWORD}@${DB_HOST}:5432/${DB_NAME}

# With default values
connection: duckdb:///${DB_FILE:-/tmp/default.duckdb}

# Simple form (uppercase vars only)
connection: $DATABASE_URL

Supported syntax:

  • ${ENV_VAR} - Substituted with environment variable value
  • ${ENV_VAR:-default} - Use default if variable not set
  • $ENV_VAR - Simple form (uppercase variables only)

Example:

# Set environment variables
export DB_USER=analyst
export DB_PASSWORD=secret
export DB_HOST=localhost
export DB_NAME=analytics
# semantic_layer.yml
connection: postgres://${DB_USER}:${DB_PASSWORD}@${DB_HOST}:5432/${DB_NAME}

models:
  - name: orders
    table: ${SCHEMA_NAME:-public}.orders
    primary_key: order_id

See Database Connections for complete connection string reference.

Override via CLI

# Use --connection flag to override YAML
sidemantic query models.yml \
  --connection "postgres://localhost:5432/analytics" \
  --sql "SELECT revenue FROM orders"

Model Configuration

Models define tables, dimensions, metrics, and relationships.

Required Fields

models:
  - name: orders              # Model name (required)
    table: orders             # Table name or SQL (required)
    primary_key: order_id     # Primary key column (required)

Optional Fields

models:
  - name: orders
    table: orders
    primary_key: order_id

    # Optional: Description
    description: "Order transactions"

    # Optional: SQL instead of table name
    sql: |
      SELECT *
      FROM raw_orders
      WHERE deleted_at IS NULL

    # Optional: Dimensions for grouping
    dimensions: [...]

    # Optional: Metrics for aggregation
    metrics: [...]

    # Optional: Relationships to other models
    relationships: [...]

    # Optional: Reusable filters
    segments: [...]

See Models for complete model configuration.

Dimension Configuration

Dimensions define columns for grouping and filtering.

dimensions:
  # Categorical dimension
  - name: status
    type: categorical
    sql: status
    description: "Order status"

  # Time dimension
  - name: order_date
    type: time
    sql: created_at
    granularity: day

  # Numeric dimension
  - name: amount
    type: number
    sql: amount

See Models for dimension types and options.

Metric Configuration

Metrics define aggregations and calculations.

Model-Level Metrics

Simple aggregations defined within a model:

models:
  - name: orders
    # ...
    metrics:
      # Sum
      - name: revenue
        agg: sum
        sql: amount

      # Count
      - name: order_count
        agg: count
        sql: order_id

      # Average
      - name: avg_order_value
        agg: avg
        sql: amount

      # Count distinct
      - name: customer_count
        agg: count_distinct
        sql: customer_id

      # With filter
      - name: completed_revenue
        agg: sum
        sql: amount
        filters: ["{model}.status = 'completed'"]

Graph-Level Metrics

Complex metrics defined at the top level:

# Graph-level metrics
metrics:
  # Simple reference
  - name: total_revenue
    sql: orders.revenue

  # Ratio
  - name: conversion_rate
    type: ratio
    numerator: orders.completed_revenue
    denominator: orders.revenue

  # Derived formula
  - name: profit_margin
    type: derived
    sql: "(revenue - cost) / revenue"

  # Cumulative
  - name: running_total
    type: cumulative
    sql: orders.revenue
    window: "7 days"

See Metrics for metric types and advanced features.

Relationship Configuration

Relationships define joins between models.

models:
  - name: orders
    # ...
    relationships:
      # Many-to-one (most common)
      - name: customers
        type: many_to_one
        foreign_key: customer_id

      # One-to-many
      - name: line_items
        type: one_to_many
        foreign_key: order_id  # In line_items table

      # One-to-one
      - name: invoice
        type: one_to_one
        foreign_key: order_id

See Relationships for join configuration.

Segment Configuration

Segments are reusable named filters.

models:
  - name: orders
    # ...
    segments:
      - name: completed
        sql: "{model}.status = 'completed'"
        description: "Only completed orders"

      - name: high_value
        sql: "{model}.amount > 100"

      - name: recent
        sql: "{model}.created_at >= CURRENT_DATE - 30"

Use in queries:

sidemantic query models.yml \
  --sql "SELECT revenue FROM orders WHERE {segments.completed}"

CLI Configuration

Global Flags

Available for most commands:

sidemantic COMMAND [OPTIONS]

Options:
  --connection TEXT    Database connection string
  --help              Show help message

Query Command

sidemantic query PATH [OPTIONS]

Options:
  --sql, -q TEXT           SQL query to execute
  --connection TEXT        Database connection override
  --output, -o TEXT        Output file (defaults to stdout)

Examples:
  # Query to stdout
  sidemantic query models/ --sql "SELECT revenue FROM orders"

  # Save to file
  sidemantic query models/ -q "SELECT * FROM orders" -o results.csv

  # Override connection
  sidemantic query models/ \
    --connection "postgres://localhost:5432/db" \
    --sql "SELECT revenue FROM orders"

Workbench Command

sidemantic workbench PATH [OPTIONS]

Options:
  --connection TEXT    Database connection override
  --demo              Run with demo data

Examples:
  # Local models
  sidemantic workbench models/

  # Demo mode
  sidemantic workbench --demo

  # Custom connection
  sidemantic workbench models/ \
    --connection "bigquery://project/dataset"

Serve Command

sidemantic serve PATH [OPTIONS]

Options:
  --connection TEXT    Database connection override
  --port INTEGER       Port number (default: 5432)
  --host TEXT          Host address (default: 127.0.0.1)
  --username TEXT      Authentication username
  --password TEXT      Authentication password
  --demo              Run with demo data

Examples:
  # Basic server
  sidemantic serve models/ --port 5433

  # With auth
  sidemantic serve models/ \
    --username admin \
    --password secret

  # Custom connection
  sidemantic serve models/ \
    --connection "snowflake://account/db/schema"

Validate Command

sidemantic validate PATH

Examples:
  # Validate all models
  sidemantic validate models/

  # Validate specific file
  sidemantic validate semantic_layer.yml

Info Command

sidemantic info PATH

Examples:
  # Show summary
  sidemantic info models/

See CLI for complete CLI reference.

File Organization

Single File

Simple setup with one YAML file:

semantic_layer.yml
# semantic_layer.yml
connection: duckdb:///data.duckdb

models:
  - name: orders
    # ...

  - name: customers
    # ...

metrics:
  - name: total_revenue
    # ...

Multiple Files

Organize models into separate files:

models/
├── orders.yml
├── customers.yml
└── products.yml

Each file can contain one or more models:

# models/orders.yml
models:
  - name: orders
    table: orders
    primary_key: order_id
    # ...

Load directory with CLI:

sidemantic query models/ --sql "SELECT revenue FROM orders"

Mixed Formats

Combine different semantic layer formats:

semantic_models/
├── cube/
│   └── Orders.yml          # Cube format
├── dbt/
│   └── metrics.yml         # MetricFlow format
└── native/
    └── customers.yml       # Sidemantic format

Sidemantic auto-detects format:

sidemantic query semantic_models/ --sql "SELECT revenue FROM orders"

Environment Variables

Store sensitive values in environment variables:

# .env file
DATABASE_URL=postgres://user:pass@host:5432/analytics
SNOWFLAKE_WAREHOUSE=COMPUTE_WH

Reference in CLI:

sidemantic query models/ \
  --connection "$DATABASE_URL" \
  --sql "SELECT revenue FROM orders"

Validation

Sidemantic validates YAML files automatically:

# Validate before querying
sidemantic validate models/

# Common validation errors:
# - Missing required fields (name, table, primary_key)
# - Invalid metric aggregation types
# - Undefined relationship references
# - Invalid dimension types

Fix validation errors before running queries.

Best Practices

Use Descriptive Names

models:
  - name: orders              # Good: clear, singular
    # vs
  - name: ord                 # Bad: unclear abbreviation

Document Your Models

models:
  - name: orders
    description: "Customer orders from the e-commerce platform"
    table: prod.orders
    # ...

    metrics:
      - name: revenue
        description: "Total order revenue excluding refunds"
        # ...

Organize by Domain

models/
├── sales/
│   ├── orders.yml
│   ├── line_items.yml
│   └── returns.yml
├── customers/
│   ├── customers.yml
│   └── segments.yml
└── products/
    ├── products.yml
    └── categories.yml

Use Segments for Common Filters

# Define once
segments:
  - name: completed
    sql: "{model}.status = 'completed'"

# Reuse everywhere

Keep Connection Strings Secure

# Bad: Hardcoded credentials
connection: postgres://admin:password123@prod-db:5432/analytics

# Good: Environment variable
# Then use CLI flag: --connection "$DATABASE_URL"

Related Documentation