j17 - Event sourcing for the rest of us

Projections are read-only views that combine data from multiple aggregates into a single response. Your frontend needs an order with customer name, product details, and shipping info -- that's four aggregates. Without projections, you either make 4 API calls or build a custom endpoint. Projections solve this: define the shape you need in JSON, and the platform composes it for you.

When to use projections

Good for: - Dashboards combining multiple data sources - API responses needing denormalized data - Complex queries that would need multiple round-trips

Not for: - Simple single-aggregate reads (use GET directly) - Write operations (projections are read-only)

Defining projections

Projections are deployed via the admin API. Define them in JSON:

{
  "name": "order_with_details",
  "for_each": "order",
  "sources": {
    "order": "$key",
    "customer": {"type": "customer", "id": "$.order.customer_id"},
    "products": {"type": "product", "id": "$.order.line_items[*].product_id", "mode": "array"}
  },
  "include": {
    "pricing": "pricing_rules:global"
  },
  "select": {
    "order_id": "$.order.id",
    "total": "$.order.total",
    "customer_name": "$.customer.name",
    "tax_rate": "$.pricing.default_tax_rate",
    "items": {
      "$map": "$.order.line_items",
      "as": "$item",
      "to": {
        "product_id": "$item.product_id",
        "product_name": {
          "$lookup": "$.products",
          "where": {"id": "$item.product_id"},
          "select": "$.name"
        }
      }
    }
  }
}

Fields

Field	Required	Purpose
`name`	Yes	Projection identifier (lowercase, alphanumeric with underscores)
`for_each`	Yes	Trigger aggregate type -- one projection instance per aggregate of this type (bare streams only by default)
`include_substreams`	No	If `true`, also include tagged UUID variants (e.g., `ledger:uuid:2025`). Default: `false`.
`sources`	Yes	Aggregates derived from the trigger's state tree
`include`	No	Global singletons via `type:global` syntax
`select`	Yes	Output shape with JSONPath expressions or typed column objects

Sources

Sources define which aggregates to fetch. The first source can use "$key" shorthand to reference the trigger aggregate:

"sources": {
  "order": "$key",
  "customer": {"type": "customer", "id": "$.order.customer_id"}
}

Source options

Option	Default	Description
`type`	Required	Aggregate type to fetch
`id`	Required	ID expression (`$key`, literal, or JSONPath)
`mode`	`"single"`	`"single"` or `"array"` for fetching multiple

Array mode

Use mode: "array" when the ID path returns multiple values:

"products": {
  "type": "product",
  "id": "$.order.line_items[*].product_id",
  "mode": "array"
}

This fetches all products referenced by the order's line items.

Include

The include field is for global singleton aggregates that are not in the trigger's state tree:

"include": {
  "config": "settings:global",
  "pricing": "pricing_rules:global"
}

include is restricted to :global IDs to guide users toward proper patterns. For non-global aggregates, use sources with appropriate ID resolution.

Select expressions

The select field defines the output shape using JSONPath expressions.

Basic paths

Reference any source or include binding:

"select": {
  "customer_name": "$.customer.name",
  "order_total": "$.order.total"
}

$map transform

Map over arrays to transform each element:

"items": {
  "$map": "$.order.line_items",
  "as": "$item",
  "to": {
    "name": "$item.name",
    "qty": "$item.quantity"
  }
}

$lookup transform

Find and extract from arrays:

"product_name": {
  "$lookup": "$.products",
  "where": {"id": "$item.product_id"},
  "select": "$.name"
}

Both $lookup and typed path entries accept an optional "type" annotation for SQL materialization:

"price": {
  "$lookup": "$.products",
  "where": {"id": "$item.product_id"},
  "select": "$.price",
  "type": "real"
}

API endpoints

Admin (headnode, JWT authentication)

POST   /api/instances/:id/projections        # Deploy definition
GET    /api/instances/:id/projections        # List all
DELETE /api/instances/:id/projections/:name  # Remove

The deploy body wraps the definition under a projection key:

{
  "name": "order_summary",
  "environment": "prod",
  "projection": {
    "for_each": "order",
    "sources": {"order": "$key"},
    "select": {"total": "$.order.total"}
  }
}

The dashboard upload expects the same format — use the same JSON file you'd use with curl.

Data (API key authentication)

GET /_projections/:name/:id              # Read cached
GET /_projections/:name/:id?synchronous  # Compute fresh

Use ?synchronous when you need guaranteed fresh data. Otherwise, projections automatically refresh when any source aggregate changes.

Example: User dashboard

{
  "name": "user_dashboard",
  "for_each": "user",
  "sources": {
    "user": "$key",
    "org": {"type": "organization", "id": "$.user.org_id"},
    "recent_orders": {"type": "order", "id": "$.user.recent_order_ids", "mode": "array"}
  },
  "include": {
    "features": "feature_flags:global"
  },
  "select": {
    "user_name": "$.user.name",
    "org_name": "$.org.name",
    "orders": "$.recent_orders",
    "dark_mode_enabled": "$.features.dark_mode"
  }
}

Request:

curl https://myapp.j17.dev/_projections/user_dashboard/abc123 \
  -H "Authorization: Bearer $API_KEY"

Response:

{
  "ok": true,
  "data": {
    "user_name": "Alice",
    "org_name": "Acme Corp",
    "orders": [],
    "dark_mode_enabled": true
  },
  "cached_at": 1706745600
}

Deployment example

# Deploy a projection (headnode)
curl -X POST "https://console.j17.dev/api/instances/$INSTANCE_ID/projections" \
  -H "Authorization: Bearer $JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "order_summary",
    "environment": "prod",
    "projection": {
      "for_each": "order",
      "sources": {
        "order": "$key",
        "customer": {"type": "customer", "id": "$.order.customer_id"}
      },
      "select": {
        "id": "$.order.id",
        "total": "$.order.total",
        "customer_name": "$.customer.name"
      }
    }
  }'

# Read a projection (worker, API key auth)
curl https://myapp.j17.dev/_projections/order_summary/ord_12345 \
  -H "Authorization: Bearer $API_KEY"

Performance

Projections automatically cache and refresh when source data changes. No manual invalidation required.

Guidelines: - Keep to 3-4 sources max - Use array mode sparingly (each ID is a separate fetch) - For high-traffic projections, use the cached endpoint (default) rather than ?synchronous

Limitations

No joins across instances
No aggregation across all aggregates (use analytics export)
Max 10 sources per projection
include is restricted to :global IDs

Compared to read models

Traditional event sourcing uses "read models" -- separate databases updated by event handlers. Projections are simpler: - No separate data store - No eventual consistency lag (computed from live data) - Computed on demand with automatic caching

But projections are not a substitute for heavy analytics. If you need complex cross-aggregate queries, export to a data warehouse.

CSV Export

Download all rows of a projection as CSV:

curl https://myapp.j17.dev/_projections/order_summary/export.csv \
  -H "Authorization: Bearer $API_KEY" \
  -o orders.csv

Columns: _key_id (aggregate ID) followed by select keys alphabetically. Nested values are JSON-serialized. A trailer comment reports row counts for completeness verification.

100,000 row limit
Subject to instance read rate limits
Stale-cache rows omitted (reported in trailer)

SQL Queries

Projection data is automatically materialized into queryable tables. Query them with read-only SQL:

curl -X POST https://myapp.j17.dev/_projections/query \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "sql": "SELECT name, email FROM user_summary WHERE status = ?1 ORDER BY name LIMIT 50",
    "params": ["active"]
  }'

{
  "ok": true,
  "columns": ["name", "email"],
  "rows": [["Alice", "alice@example.com"], ["Bob", "bob@example.com"]],
  "types": ["text", "text"]
}

Each row is a positional array aligned with columns (and types). columns and types are always present — including when the query matches no rows — so you can read a projection's shape from an empty result.

Column types

Every projection table has two metadata columns plus your select columns:

Column	SQLite type	`types` value	Notes
`_key_id`	TEXT	`"text"`	The aggregate ID (primary key). When `include_substreams` is true, tagged variants appear as separate rows with the full ID including the tag (e.g., `abc123...:2025`).
`_updated_at`	INTEGER	`"integer"`	Unix seconds when the row was last upserted — returned as a JSON number
unannotated `select` columns	TEXT	`"text"`	Default for plain string paths
annotated `select` columns	per declaration	declared type	See type annotations below

Type annotations

By default all user columns are TEXT. To use a native SQLite type, replace the plain path string with an object:

"select": {
  "name":      "$.user.name",
  "age":       {"path": "$.user.age",    "type": "integer"},
  "score":     {"path": "$.user.score",  "type": "real"},
  "active":    {"path": "$.user.active", "type": "boolean"},
  "tags":      {"path": "$.user.tags",   "type": "json"}
}

Valid types:

Type	SQLite affinity	`types` value	Notes
`text`	TEXT	`"text"`	Default. `to_string()` coercion.
`integer`	INTEGER	`"integer"`	Numeric comparisons and `ORDER BY` work correctly. Values that can't parse as integer are stored `NULL`.
`real`	REAL	`"real"`	Same as integer for floats.
`boolean`	INTEGER	`"boolean"`	Stored as `0`/`1`. Truthy: `true`, `"true"`, `1`. Falsy: `false`, `"false"`, `0`. Anything else → `NULL`.
`json`	TEXT	`"json"`	Maps and arrays JSON-encoded. Signals the client to `JSON.parse()`. `$map` columns default to this type.

The types array in the query response is parallel to columns — one entry per selected column. Use it to decode the columns your client needs:

{
  "ok": true,
  "columns": ["name", "tags", "age"],
  "rows": [
    ["Alice", "[\"admin\",\"billing\"]", 32],
    ["Bob",   "[\"viewer\"]",            28]
  ],
  "types": ["text", "json", "integer"]
}

tags arrives as the string "[\"admin\",\"billing\"]" — parse it with JSON.parse() / Jason.decode!(). age arrives as a JSON number (no parsing needed). An unannotated date field arrives as whatever string the source event used (typically ISO 8601).

Range queries on text columns

Text columns sort lexicographically. That's chronological for two common shapes:

ISO 8601 strings ("2026-05-04T13:22:11Z") sort chronologically by string comparison: WHERE started_at >= '2026-05-01T00:00:00Z'
Fixed-width Unix seconds ("1714521600") sort numerically because all values have the same digit count for any timestamp from 2001 to 2286

If you control the event shape, prefer ISO 8601 — it's the default JSON Schema format: "date-time" representation and stays human-readable in query results.

The _updated_at column is INTEGER, so it takes numeric comparison directly: WHERE _updated_at >= 1714521600.

Other restrictions

SELECT only, no semicolons, 10,000 row default limit, 5-second timeout. Subject to instance read rate limits.

Not for OLAP: For analytical queries across large datasets, pipe events into ClickHouse, DuckDB, or BigQuery. SQL queries cover operational needs — active users, orders by status, recent activity.

Substreams and tagged UUIDs

By default, for_each only matches bare aggregate streams — ledger:uuid, not ledger:uuid:2025. Set "include_substreams": true to include tagged variants:

{
  "name": "ledger_summary",
  "for_each": "ledger",
  "include_substreams": true,
  "sources": { "ledger": "$key" },
  "select": { "total": "$.ledger.total", "period": "$.ledger.period" }
}

Each tagged variant becomes its own row. Filter by period in SQL: WHERE _key_id LIKE '%:2025'.

When to use include_substreams: the events in the substream share the same aggregate type, the same schema, and the same handlers as the base stream. A yearly ledger partition (ledger:uuid:2026) that carries forward balances and posts entries is the right use case.

When not to use include_substreams: if you're tagging to store related-but-different data under a user's ID — a webhook log, an audit trail, an inbox — those events have different schemas and likely no meaningful handlers. That's not a substream, that's a separate aggregate type. Use a deterministic v5 UUID instead:

// Stable ID derived from the user — no colons, genuinely separate type
const emailLogId = uuidv5(userId, EMAIL_LOG_NAMESPACE)
await j17.post(`email_log/${emailLogId}/webhook_received`, { ... })

This keeps the user projection clean and gives the email_log type its own spec, schema, and handlers.

Projections