Models define your data sources and their structure.

Sidemantic supports **three syntaxes**: YAML, SQL, and Python. Examples show all three.

## Runnable Example (Executed in CI)

<details>
<summary>Doctest fixture (models/semantic_layer.yml)</summary>

```yaml doc-file=models/semantic_layer.yml
models:
  - name: orders
    sql: |
      select * from (values
        (1, 120.00),
        (2, 80.00)
      ) as t(id, amount)
    primary_key: id
    metrics:
      - name: revenue
        agg: sum
        sql: amount
```

</details>

```bash doc-test
uvx sidemantic query "SELECT orders.revenue FROM orders" --models ./models
```

```csv doc-expected
revenue
200.00
```

## Basic Model

<Tabs>
<TabItem value="yaml" label="YAML">

```yaml doc-validate=semantic
models:
  - name: orders
    table: orders
    primary_key: order_id
```

</TabItem>
<TabItem value="sql" label="SQL">

```sql doc-validate=lint
MODEL (
    name orders,
    table orders,
    primary_key order_id
);
```

</TabItem>
<TabItem value="python" label="Python">

```python doc-test
from sidemantic import Model

orders = Model(
    name="orders",
    table="orders",
    primary_key="order_id"
)
```

</TabItem>
</Tabs>

## Properties

### Required

- **name**: Unique identifier for the model
- **table**: Physical table name (or use `sql` for derived tables)
- **primary_key**: Primary key column (required for joins and symmetric aggregates)

### Optional

- **sql**: SQL query instead of table name (for derived models)
- **description**: Human-readable description
- **relationships**: Relationships to other models (see [Relationships](relationships))
- **dimensions**: Attributes for grouping and filtering
- **metrics**: Model-level aggregations
- **default_time_dimension**: Default time dimension to auto-include in queries
- **default_grain**: Default granularity for the time dimension (`hour`, `day`, `week`, `month`, `quarter`, `year`)

## Dimensions

Dimensions are attributes for grouping and filtering:

<Tabs>
<TabItem value="yaml" label="YAML">

```yaml doc-validate=semantic
models:
  - name: orders
    dimensions:
      - name: status
        type: categorical
        sql: status

      - name: order_date
        type: time
        sql: created_at
        granularity: day

      - name: is_weekend
        type: boolean
        sql: "DAYOFWEEK(order_date) IN (6, 7)"

      - name: amount
        type: numeric
        sql: total_amount
```

</TabItem>
<TabItem value="sql" label="SQL">

```sql doc-validate=lint
MODEL (name orders, table orders, primary_key order_id);

DIMENSION (
    name status,
    type categorical,
    sql status
);

DIMENSION (
    name order_date,
    type time,
    sql created_at,
    granularity day
);

DIMENSION (
    name is_weekend,
    type boolean,
    sql DAYOFWEEK(order_date) IN (6, 7)
);

DIMENSION (
    name amount,
    type numeric,
    sql total_amount
);
```

</TabItem>
<TabItem value="python" label="Python">

```python doc-test
from sidemantic import Model, Dimension

orders = Model(
    name="orders",
    table="orders",
    primary_key="order_id",
    dimensions=[
        Dimension(name="status", type="categorical", sql="status"),
        Dimension(name="order_date", type="time", sql="created_at", granularity="day"),
        Dimension(name="is_weekend", type="boolean", sql="DAYOFWEEK(order_date) IN (6, 7)"),
        Dimension(name="amount", type="numeric", sql="total_amount"),
    ]
)
```

</TabItem>
</Tabs>

### Dimension Types

- **categorical**: Text/enum values (status, region, product)
- **time**: Dates/timestamps with granularity (day, week, month, year)
- **boolean**: True/false values
- **numeric**: Numeric dimensions (price tier, quantity bucket)

### Time Granularity

```yaml doc-validate=lint
- name: order_date
  type: time
  sql: created_at
  granularity: day  # hour, day, week, month, quarter, year
```

Query with: `orders.order_date__month` to automatically roll up to month.

## Metrics (Model-Level)

Model-level metrics define aggregations on a single model:

<Tabs>
<TabItem value="yaml" label="YAML">

```yaml doc-validate=semantic
models:
  - name: orders
    metrics:
      # Simple aggregations
      - name: revenue
        agg: sum
        sql: amount

      - name: order_count
        agg: count

      - name: avg_order_value
        agg: avg
        sql: amount

      # With filters
      - name: completed_revenue
        agg: sum
        sql: amount
        filters: ["status = 'completed'"]

      # SQL expressions
      - name: total_value
        agg: sum
        sql: "quantity * price"
```

</TabItem>
<TabItem value="sql" label="SQL">

```sql doc-validate=lint
MODEL (name orders, table orders, primary_key order_id);

-- Simple aggregations
METRIC (
    name revenue,
    agg sum,
    sql amount
);

METRIC (
    name order_count,
    agg count
);

METRIC (
    name avg_order_value,
    agg avg,
    sql amount
);

-- With filters
METRIC (
    name completed_revenue,
    agg sum,
    sql amount,
    filters status = 'completed'
);

-- SQL expressions
METRIC (
    name total_value,
    agg sum,
    sql quantity * price
);
```

</TabItem>
<TabItem value="python" label="Python">

```python doc-test
from sidemantic import Model, Metric

orders = Model(
    name="orders",
    table="orders",
    primary_key="order_id",
    metrics=[
        Metric(name="revenue", agg="sum", sql="amount"),
        Metric(name="order_count", agg="count"),
        Metric(name="avg_order_value", agg="avg", sql="amount"),
        Metric(name="completed_revenue", agg="sum", sql="amount",
               filters=["status = 'completed'"]),
        Metric(name="total_value", agg="sum", sql="quantity * price"),
    ]
)
```

</TabItem>
</Tabs>


### Aggregation Types

- **sum**: Sum values
- **count**: Count rows
- **count_distinct**: Count unique values
- **avg**: Average
- **min**: Minimum value
- **max**: Maximum value
- **median**: Median value

See [Metrics](metrics) for graph-level metrics (ratios, derived, cumulative).

## Relationships

Define how models join to each other:

<Tabs>
<TabItem value="yaml" label="YAML">

```yaml doc-validate=semantic
models:
  - name: orders
    table: orders
    primary_key: order_id
    relationships:
      - name: customer
        type: many_to_one
        foreign_key: customer_id

  - name: customers
    table: customers
    primary_key: customer_id
    relationships:
      - name: orders
        type: one_to_many
        foreign_key: customer_id
```

</TabItem>
<TabItem value="sql" label="SQL">

```sql doc-validate=lint
-- orders.sql
MODEL (name orders, table orders, primary_key order_id);

RELATIONSHIP (
    name customer,
    type many_to_one,
    foreign_key customer_id
);

-- customers.sql
MODEL (name customers, table customers, primary_key customer_id);

RELATIONSHIP (
    name orders,
    type one_to_many,
    foreign_key customer_id
);
```

</TabItem>
<TabItem value="python" label="Python">

```python doc-test
from sidemantic import Model, Relationship

orders = Model(
    name="orders",
    table="orders",
    primary_key="order_id",
    relationships=[
        Relationship(name="customer", type="many_to_one", foreign_key="customer_id")
    ]
)

customers = Model(
    name="customers",
    table="customers",
    primary_key="customer_id",
    relationships=[
        Relationship(name="orders", type="one_to_many", foreign_key="customer_id")
    ]
)
```

</TabItem>
</Tabs>

See [Relationships](relationships) for details.

## Derived Models

Use SQL instead of a table:

```yaml doc-validate=semantic
models:
  - name: weekly_summary
    sql: |
      SELECT
        DATE_TRUNC('week', order_date) as week,
        SUM(amount) as total
      FROM orders
      GROUP BY 1
    primary_key: week
```

## Complete Example

<Tabs>
<TabItem value="yaml" label="YAML">

```yaml doc-validate=semantic
models:
  - name: orders
    table: orders
    primary_key: order_id
    description: "Customer orders"

    relationships:
      - name: customer
        type: many_to_one
        foreign_key: customer_id

    dimensions:
      - name: status
        type: categorical
        sql: status

      - name: order_date
        type: time
        sql: created_at
        granularity: day

      - name: customer_tier
        type: categorical
        sql: |
          CASE
            WHEN total_amount > 1000 THEN 'premium'
            WHEN total_amount > 100 THEN 'standard'
            ELSE 'basic'
          END

    metrics:
      - name: revenue
        agg: sum
        sql: amount
        description: "Total order revenue"

      - name: order_count
        agg: count
        description: "Number of orders"

      - name: avg_order_value
        agg: avg
        sql: amount

      - name: completed_revenue
        agg: sum
        sql: amount
        filters: ["status = 'completed'"]

      - name: large_orders
        agg: count
        filters: ["amount > 1000"]
```

</TabItem>
<TabItem value="sql" label="SQL">

See `examples/sql/orders.sql` for the complete SQL version:

```sql doc-validate=lint
MODEL (
    name orders,
    table orders,
    primary_key order_id,
    description 'Customer orders'
);

-- Relationships
RELATIONSHIP (
    name customer,
    type many_to_one,
    foreign_key customer_id
);

-- Dimensions
DIMENSION (
    name status,
    type categorical,
    sql status
);

DIMENSION (
    name order_date,
    type time,
    sql created_at,
    granularity day
);

DIMENSION (
    name customer_tier,
    type categorical,
    sql CASE WHEN total_amount > 1000 THEN 'premium' WHEN total_amount > 100 THEN 'standard' ELSE 'basic' END
);

-- Metrics
METRIC (
    name revenue,
    agg sum,
    sql amount,
    description 'Total order revenue'
);

METRIC (
    name order_count,
    agg count,
    description 'Number of orders'
);

METRIC (
    name avg_order_value,
    agg avg,
    sql amount
);

METRIC (
    name completed_revenue,
    agg sum,
    sql amount,
    filters status = 'completed'
);

METRIC (
    name large_orders,
    agg count,
    filters amount > 1000
);
```

</TabItem>
<TabItem value="python" label="Python">

```python doc-test
from sidemantic import Model, Dimension, Relationship, Metric

orders = Model(
    name="orders",
    table="orders",
    primary_key="order_id",
    description="Customer orders",
    relationships=[
        Relationship(name="customer", type="many_to_one", foreign_key="customer_id")
    ],
    dimensions=[
        Dimension(name="status", type="categorical", sql="status"),
        Dimension(name="order_date", type="time", sql="created_at", granularity="day"),
        Dimension(
            name="customer_tier",
            type="categorical",
            sql="""CASE
                WHEN total_amount > 1000 THEN 'premium'
                WHEN total_amount > 100 THEN 'standard'
                ELSE 'basic'
            END"""
        ),
    ],
    metrics=[
        Metric(name="revenue", agg="sum", sql="amount", description="Total order revenue"),
        Metric(name="order_count", agg="count", description="Number of orders"),
        Metric(name="avg_order_value", agg="avg", sql="amount"),
        Metric(name="completed_revenue", agg="sum", sql="amount", filters=["status = 'completed'"]),
        Metric(name="large_orders", agg="count", filters=["amount > 1000"]),
    ]
)
```

</TabItem>
</Tabs>


## Best Practices

<Callout type="tip" title="Naming Conventions">
- Models: plural nouns (`orders`, `customers`)
- Dimensions: descriptive names (`status`, `order_date`, `customer_region`)
- Metrics: what they measure (`revenue`, `order_count`, `avg_order_value`)
</Callout>

<Callout type="warning" title="Primary Keys Required">
Always set `primary_key` - it's required for:
- Joins between models
- Symmetric aggregates (fan-out handling)
- Proper deduplication
</Callout>

<Callout type="tip" title="SQL Expressions">
Use SQL expressions in `sql` fields:
- `sql: "UPPER(status)"` - transform dimension
- `sql: "quantity * price"` - calculated metric
- Supports any SQL your database supports
</Callout>