---
title: Metering
description: How to define billable metrics, configure metering rules with flat/per-unit/tiered pricing, and record usage events that debit credits.
order: 3
---

# Metering

Metering is how QuotaStack maps real-world actions to credit costs. You define what you charge for (billable metrics), how much it costs (metering rules), and then report when it happens (usage events). The system handles the rest: it looks up the rule, computes the cost, and debits credits from the customer's account.

> **Mental Model:** Metering is the **bridge between "something happened" and "credits got spent"**. You define the price list (metering rules), report what happened (usage events), and QuotaStack does the math.

## Quick Take

- Define **billable metrics** (what you charge for) and **metering rules** (how much it costs)
- Three cost types: **flat**, **per-unit**, and **tiered** (graduated or volume)
- Record **usage events** — the system looks up the rule, computes cost, and debits credits
- Batch ingestion supported for high-throughput scenarios

## Diagram

A Usage Event is ingested, matched to a Metering Rule, passes through a Cost Calculation (one of three cost types: flat, per-unit, or tiered, where tiered runs in graduated or volume mode), and results in a Credit Debit.

```mermaid
flowchart TD
    A[Usage Event] --> B[Metering Rule]
    B --> C[Cost Calculation]
    C --> D[Credit Debit]
    C -.- E[flat]
    C -.- F[per-unit]
    C -.- G["tiered (graduated)"]
    C -.- H["tiered (volume)"]
```

## Billable metrics

A billable metric is a named thing you charge for. It is identified by a unique key within your tenant.

Examples:

| Key | Description |
|---|---|
| `chat_message` | A single chat message sent. |
| `look` | An AI-generated outfit look. |
| `api_call` | One API request to your service. |
| `storage_gb` | One gigabyte-month of storage. |
| `plan_purchase` | Purchasing or renewing a subscription plan. |

Metric keys are strings. Use lowercase with underscores. The key is what you reference in entitlement checks, usage events, and metering rules.

### Listing billable metrics

```
GET /v1/billable-metrics
```

```bash
curl https://api.quotastack.io/v1/billable-metrics \
  -H "X-API-Key: qs_live_..."
```

This returns all **billable metrics** defined for your tenant — the named units you charge for (e.g. `chat_message`, `look`). Each metric's pricing lives in a separate **metering rule** (see below). A billable metric without an active metering rule cannot be charged for.

### Getting a single metric

```
GET /v1/billable-metrics/{key}
```

```bash
curl https://api.quotastack.io/v1/billable-metrics/look \
  -H "X-API-Key: qs_live_..."
```

## Metering rules

A metering rule maps a billable metric key to a credit cost. Each tenant has at most one **active** rule per metric key at any time.

### Creating a metering rule

```
POST /v1/metering-rules
```

```bash
curl -X POST https://api.quotastack.io/v1/metering-rules \
  -H "X-API-Key: qs_live_..." \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: rule-look-v1" \
  -d '{
    "billable_metric_key": "look",
    "cost_type": "per_unit",
    "unit_cost": 1000
  }'
```

### Rule fields

| Field | Type | Required | Description |
|---|---|---|---|
| `billable_metric_key` | string | yes | The metric this rule prices. |
| `cost_type` | string | yes | `flat`, `per_unit`, or `tiered`. |
| `base_cost` | int64 | for `flat` | Fixed millicredit cost (ignores unit count). |
| `unit_cost` | int64 | for `per_unit` | Millicredits per unit. |
| `tier_config` | object | for `tiered` | Tier definitions (see below). |
| `metadata` | object | no | Arbitrary key-value pairs. |

### Listing metering rules

```
GET /v1/metering-rules
```

Supports optional query parameters:

| Parameter | Description |
|---|---|
| `billable_metric_key` | Filter by a specific metric key. |
| `active_only=true` | Only return currently active rules (no `effective_until`). |

## Cost types

### Flat

A fixed cost regardless of how many units are consumed. Useful for plan purchases or one-time fees.

```json
{
  "billable_metric_key": "plan_purchase",
  "cost_type": "flat",
  "base_cost": 99000
}
```

Whether a request covers 1 unit or 100 units, the cost is always 99,000 millicredits (mc).

### Per-unit

Cost scales linearly with units. The most common type for usage-based billing.

```json
{
  "billable_metric_key": "chat_message",
  "cost_type": "per_unit",
  "unit_cost": 1000
}
```

| Units | Cost |
|---|---|
| 1 | 1,000 mc |
| 5 | 5,000 mc |
| 100 | 100,000 mc |
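
As a rough illustration of the two simple cost types, the Python sketch below computes a cost in millicredits from a rule dictionary shaped like the JSON above. The function name `simple_cost` is hypothetical and not part of the QuotaStack API; it only mirrors the arithmetic described in this section.

```python
def simple_cost(rule: dict, units: int) -> int:
    """Return the cost in millicredits for a flat or per_unit rule."""
    if units <= 0:
        raise ValueError("units must be positive")
    if rule["cost_type"] == "flat":
        # Flat rules ignore the unit count entirely.
        return rule["base_cost"]
    if rule["cost_type"] == "per_unit":
        # Per-unit rules scale linearly with the unit count.
        return rule["unit_cost"] * units
    raise ValueError(f"unsupported cost_type: {rule['cost_type']}")


flat_rule = {"billable_metric_key": "plan_purchase", "cost_type": "flat", "base_cost": 99000}
per_unit_rule = {"billable_metric_key": "chat_message", "cost_type": "per_unit", "unit_cost": 1000}

print(simple_cost(flat_rule, 100))    # 99000
print(simple_cost(per_unit_rule, 5))  # 5000
```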

### Tiered

Tiered pricing supports two modes: **graduated** and **volume**.

#### Graduated

Each tier prices its own slice of units. The total cost is the sum across all tiers.

```json
{
  "billable_metric_key": "api_call",
  "cost_type": "tiered",
  "tier_config": {
    "mode": "graduated",
    "tiers": [
      { "up_to": 100,  "unit_cost": 500, "flat_cost": 0 },
      { "up_to": 1000, "unit_cost": 300, "flat_cost": 0 },
      { "up_to": null, "unit_cost": 100, "flat_cost": 0 }
    ]
  }
}
```

For 250 API calls:

| Tier | Units | Unit cost | Subtotal |
|---|---|---|---|
| 1-100 | 100 | 500 mc | 50,000 mc |
| 101-250 | 150 | 300 mc | 45,000 mc |
| **Total** | **250** | | **95,000 mc** |
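
The graduated calculation above can be sketched in a few lines of Python. This is an illustrative reimplementation of the arithmetic, not QuotaStack's own code; `graduated_cost` is a hypothetical helper, and it assumes each tier's `flat_cost` is added once for every tier the unit count reaches (it is 0 in this example).

```python
def graduated_cost(tiers: list[dict], units: int) -> int:
    """Sum the cost of each tier's slice of units, in millicredits."""
    total = 0
    lower = 0  # upper bound of the previous tier
    for tier in tiers:
        if units <= lower:
            break  # no units reach this tier
        upper = tier["up_to"]
        # Units that fall inside this tier's slice.
        in_tier = (units if upper is None else min(units, upper)) - lower
        total += tier["flat_cost"] + in_tier * tier["unit_cost"]
        if upper is None:
            break  # the unbounded tier is always last
        lower = upper
    return total


tiers = [
    {"up_to": 100, "unit_cost": 500, "flat_cost": 0},
    {"up_to": 1000, "unit_cost": 300, "flat_cost": 0},
    {"up_to": None, "unit_cost": 100, "flat_cost": 0},
]
print(graduated_cost(tiers, 250))  # 95000
```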

#### Volume

All units are priced at the rate of the tier they fall into. The tier is determined by the total unit count.

```json
{
  "billable_metric_key": "api_call",
  "cost_type": "tiered",
  "tier_config": {
    "mode": "volume",
    "tiers": [
      { "up_to": 100,  "unit_cost": 500, "flat_cost": 0 },
      { "up_to": 1000, "unit_cost": 300, "flat_cost": 0 },
      { "up_to": null, "unit_cost": 100, "flat_cost": 0 }
    ]
  }
}
```

For 250 API calls: all 250 fall into the second tier (up_to 1000), so total cost = 250 * 300 = 75,000 mc.
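
For comparison, volume mode can be sketched as a single tier lookup. As before, `volume_cost` is a hypothetical helper written only to illustrate the arithmetic; it assumes the matching tier's `flat_cost` is added once.

```python
def volume_cost(tiers: list[dict], units: int) -> int:
    """Price all units at the rate of the one tier the total falls into."""
    for tier in tiers:
        if tier["up_to"] is None or units <= tier["up_to"]:
            return tier["flat_cost"] + units * tier["unit_cost"]
    raise ValueError("no tier covers this unit count")


tiers = [
    {"up_to": 100, "unit_cost": 500, "flat_cost": 0},
    {"up_to": 1000, "unit_cost": 300, "flat_cost": 0},
    {"up_to": None, "unit_cost": 100, "flat_cost": 0},
]
print(volume_cost(tiers, 250))  # 75000
```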

#### Tier fields

| Field | Type | Description |
|---|---|---|
| `up_to` | int64 or null | Upper bound for this tier. `null` means infinity (must be the last tier). |
| `unit_cost` | int64 | Millicredits per unit within this tier. |
| `flat_cost` | int64 | Fixed millicredit fee added once per tier entered (graduated) or once per event (volume). See below. |

#### `flat_cost` behavior

- **Graduated:** added once when the customer's cumulative usage crosses into the tier.
- **Volume:** added once, alongside the per-unit cost at the single tier all units fall into.

Concrete example — two tiers:

- Tier 1: 0–100 units, `unit_cost: 10`, `flat_cost: 100`
- Tier 2: 101+ units, `unit_cost: 5`, `flat_cost: 200`

For 150 units:

| Mode | Calculation | Total |
|---|---|---|
| Graduated | `100 (tier-1 flat) + 100 × 10 + 200 (tier-2 flat) + 50 × 5` | `1,550 mc` |
| Volume | `200 (tier-2 flat) + 150 × 5` | `950 mc` |

## Rule versioning

Creating a new metering rule for a `billable_metric_key` that already has an active rule automatically deactivates the old one. The old rule gets an `effective_until` timestamp set to now; the new rule becomes the active rule with `effective_until = null`.

This means you can update pricing without downtime:

1. Create a new rule for the same metric key with the new cost.
2. The old rule is deactivated immediately.
3. All subsequent usage events and entitlement checks use the new rule.

Historical ledger entries retain the cost they were computed with at the time of the event. Changing a rule does not retroactively alter past charges.
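
To make the versioning behavior concrete, here is a small in-memory model in Python. It is a sketch under stated assumptions: the `create_rule` and `active_rule` helpers and the dates are hypothetical, and only the `effective_until` field comes from this page.

```python
from datetime import datetime, timezone


def create_rule(rules: list[dict], new_rule: dict, now: datetime) -> None:
    """Add new_rule as the active rule, closing any active rule for the same key."""
    for rule in rules:
        same_key = rule["billable_metric_key"] == new_rule["billable_metric_key"]
        if same_key and rule["effective_until"] is None:
            rule["effective_until"] = now  # the old rule is deactivated immediately
    rules.append({**new_rule, "effective_until": None})


def active_rule(rules: list[dict], key: str):
    """Return the single active rule for a metric key, or None."""
    for rule in rules:
        if rule["billable_metric_key"] == key and rule["effective_until"] is None:
            return rule
    return None


rules: list[dict] = []
create_rule(rules, {"billable_metric_key": "look", "cost_type": "per_unit", "unit_cost": 1000},
            datetime(2024, 1, 1, tzinfo=timezone.utc))
create_rule(rules, {"billable_metric_key": "look", "cost_type": "per_unit", "unit_cost": 1500},
            datetime(2024, 6, 1, tzinfo=timezone.utc))
print(active_rule(rules, "look")["unit_cost"])  # 1500
```

The first rule stays in the list with its `effective_until` set, which is what lets historical charges keep the price they were computed with.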

## Usage events

Usage events are how you tell QuotaStack that something billable happened. Recording a usage event triggers the pipeline that debits credits.

### Recording a usage event

```
POST /v1/usage
```

```bash
curl -X POST https://api.quotastack.io/v1/usage \
  -H "X-API-Key: qs_live_..." \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: req-abc-123" \
  -d '{
    "external_customer_id": "user_abc",
    "billable_metric_key": "look",
    "units": 1,
    "idempotency_key": "look-gen-xyz-789",
    "metadata": {
      "outfit_id": "outfit_456"
    }
  }'
```

| Field | Type | Required | Description |
|---|---|---|---|
| `external_customer_id` | string | yes* | Your own ID for the customer (the identifier your system uses). See [Customer identification](/docs/concepts/customer-identification). |
| `customer_id` | string | yes* | Alternative: QuotaStack UUID. Exactly one of the two is required. |
| `billable_metric_key` | string | yes | The metric key being consumed. |
| `units` | int64 | yes | Number of units consumed (must be positive). |
| `idempotency_key` | string | yes | Unique key for this event. Prevents double-charging on retries. |
| `occurred_at` | timestamp | no | When the event happened. Defaults to now. Must not be in the future. |
| `metadata` | object | no | Arbitrary key-value pairs attached to the event. |

The response returns immediately with status `202 Accepted`:

```json
{
  "event_id": "evt_01HXY...",
  "idempotency_key": "look-gen-xyz-789",
  "status": "accepted",
  "estimated_cost": 1000,
  "duplicate": false
}
```
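
The field rules in the table above can be checked client-side before sending. The Python sketch below is one possible pre-flight validator, not an official SDK function: `validate_usage_event` is a hypothetical name, `occurred_at` is assumed to be a timezone-aware `datetime`, and the error wording is illustrative.

```python
from datetime import datetime, timezone
from typing import Optional


def validate_usage_event(event: dict, now: Optional[datetime] = None) -> list[str]:
    """Return a list of problems; an empty list means the event looks valid."""
    now = now or datetime.now(timezone.utc)
    errors = []
    has_external = bool(event.get("external_customer_id"))
    has_internal = bool(event.get("customer_id"))
    if has_external == has_internal:  # both set or both missing
        errors.append("exactly one of external_customer_id or customer_id is required")
    if not event.get("billable_metric_key"):
        errors.append("billable_metric_key is required")
    units = event.get("units")
    if not isinstance(units, int) or units <= 0:
        errors.append("units must be positive")
    if not event.get("idempotency_key"):
        errors.append("idempotency_key is required")
    occurred_at = event.get("occurred_at")
    if occurred_at is not None and occurred_at > now:
        errors.append("occurred_at must not be in the future")
    return errors


event = {
    "external_customer_id": "user_abc",
    "billable_metric_key": "look",
    "units": 1,
    "idempotency_key": "look-gen-xyz-789",
}
print(validate_usage_event(event))  # []
```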

### Two-layer idempotency

Usage events have two levels of idempotency protection:

1. **HTTP-level:** The `Idempotency-Key` header prevents duplicate HTTP requests. If you retry the same POST, you get the same response.
2. **Event-level:** The `idempotency_key` field in the body prevents duplicate business events. Even if two different HTTP requests carry the same event idempotency key, the event is processed at most once.
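
The interaction of the two layers can be pictured with a toy in-memory model. This is only a sketch of the behavior described above, not QuotaStack's implementation: the `ToyUsageEndpoint` class is hypothetical, and it assumes a repeated event is reported with `"duplicate": true`.

```python
class ToyUsageEndpoint:
    """Toy model of the two idempotency layers (illustrative only)."""

    def __init__(self) -> None:
        self.http_responses: dict[str, dict] = {}  # Idempotency-Key header -> cached response
        self.seen_event_keys: set[str] = set()     # body idempotency_key values already handled
        self.processed = 0                         # number of events actually processed

    def post(self, header_key: str, event: dict) -> dict:
        # Layer 1: a retried HTTP request gets the original response back.
        if header_key in self.http_responses:
            return self.http_responses[header_key]
        # Layer 2: a repeated business event is flagged and not processed again.
        duplicate = event["idempotency_key"] in self.seen_event_keys
        if not duplicate:
            self.seen_event_keys.add(event["idempotency_key"])
            self.processed += 1
        response = {
            "idempotency_key": event["idempotency_key"],
            "status": "accepted",
            "duplicate": duplicate,
        }
        self.http_responses[header_key] = response
        return response


api = ToyUsageEndpoint()
event = {"idempotency_key": "look-gen-xyz-789"}
first = api.post("req-abc-123", event)
retry = api.post("req-abc-123", event)  # same header: cached response
other = api.post("req-abc-124", event)  # new header, same event key
print(first["duplicate"], retry["duplicate"], other["duplicate"], api.processed)  # False False True 1
```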

### Batch usage events

Record multiple events in a single request:

```
POST /v1/usage/batch
```

```bash
curl -X POST https://api.quotastack.io/v1/usage/batch \
  -H "X-API-Key: qs_live_..." \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: batch-abc-001" \
  -d '{
    "events": [
      {
        "external_customer_id": "user_abc",
        "billable_metric_key": "chat_message",
        "units": 1,
        "idempotency_key": "msg-001"
      },
      {
        "external_customer_id": "user_abc",
        "billable_metric_key": "chat_message",
        "units": 1,
        "idempotency_key": "msg-002"
      }
    ]
  }'
```

Each event in the batch is validated independently. The response includes a per-event status, with one entry per input event in the same order. The example below shows the response for a batch in which the second event failed validation:

```json
{
  "accepted": 1,
  "rejected": 1,
  "results": [
    {
      "event_id": "019d...",
      "idempotency_key": "msg-001",
      "status": "accepted",
      "estimated_cost": 1000,
      "duplicate": false,
      "index": 0
    },
    {
      "idempotency_key": "msg-002",
      "status": "rejected",
      "error": "billable_metric_key is required; units must be positive",
      "index": 1
    }
  ]
}
```

Rejected events omit `event_id` and `estimated_cost` and include an `error` string (multiple errors joined with `; `). The overall HTTP response is always `202 Accepted` when the batch is well-formed — per-event rejections don't fail the whole request.
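
Because rejections arrive inside a successful `202` response, callers need to inspect `results` themselves. A minimal Python sketch of that handling follows; the helper names `split_batch_results` and `rejection_reasons` are hypothetical, and the sample data is copied from the response above.

```python
def split_batch_results(response: dict) -> tuple[list[dict], list[dict]]:
    """Separate accepted and rejected entries from a batch response."""
    accepted = [r for r in response["results"] if r["status"] == "accepted"]
    rejected = [r for r in response["results"] if r["status"] == "rejected"]
    return accepted, rejected


def rejection_reasons(result: dict) -> list[str]:
    """Split the joined error string of a rejected entry into individual errors."""
    return result["error"].split("; ")


response = {
    "accepted": 1,
    "rejected": 1,
    "results": [
        {"event_id": "019d...", "idempotency_key": "msg-001", "status": "accepted",
         "estimated_cost": 1000, "duplicate": False, "index": 0},
        {"idempotency_key": "msg-002", "status": "rejected",
         "error": "billable_metric_key is required; units must be positive", "index": 1},
    ],
}

accepted, rejected = split_batch_results(response)
for entry in rejected:
    # "index" points back at the position of the event in the request.
    print(entry["index"], rejection_reasons(entry))
```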

### Event-level `idempotency_key` scope

The `idempotency_key` field inside the event body is scoped **per-tenant + per-environment + per-customer**, so two different customers within the same tenant can share the same `idempotency_key` value without collision. In practice, prefer a globally unique key derived from your business event (message ID, request ID) so your own logs and reconciliation stay clean.

This is separate from the HTTP `Idempotency-Key` **header**, which is per-tenant-scoped and deduplicates the whole POST request. See [Idempotency](/docs/concepts/idempotency) for details.

## The usage event pipeline

Usage events are processed asynchronously. The POST enqueues the event and returns immediately; a background pipeline then applies the debit in order:

<div style="display:flex;flex-direction:column;align-items:center;gap:0;margin:20px 0;">
<div style="background:#dbeafe;color:#1e40af;padding:10px 24px;border-radius:8px;font-size:13px;font-weight:600;font-family:var(--font-mono,monospace);">POST /v1/usage</div>
<div style="color:#9CA3AF;font-size:16px;padding:4px 0;">&darr;</div>
<div style="background:#ccfbf1;color:#115e59;padding:10px 24px;border-radius:8px;font-size:13px;font-weight:600;">Enqueued (per-tenant, ordered)</div>
<div style="color:#9CA3AF;font-size:16px;padding:4px 0;">&darr;</div>
<div style="background:#f3f4f6;color:#374151;padding:10px 24px;border-radius:8px;font-size:13px;font-weight:500;">Lookup active metering rule for billable_metric_key</div>
<div style="color:#9CA3AF;font-size:16px;padding:4px 0;">&darr;</div>
<div style="background:#f3e8ff;color:#6b21a8;padding:10px 24px;border-radius:8px;font-size:13px;font-weight:600;">Compute credit cost using the rule</div>
<div style="color:#9CA3AF;font-size:16px;padding:4px 0;">&darr;</div>
<div style="background:#fee2e2;color:#991b1b;padding:10px 24px;border-radius:8px;font-size:13px;font-weight:600;">Debit credits from customer's account (burn-down order)</div>
<div style="color:#9CA3AF;font-size:16px;padding:4px 0;">&darr;</div>
<div style="background:#fef3c7;color:#92400e;padding:10px 24px;border-radius:8px;font-size:13px;font-weight:600;">Refresh entitlement summary</div>
</div>

The POST returns `202 Accepted` immediately after enqueue. Events for a given tenant are processed in order, and the event-level idempotency key ensures that credits are debited at most once per event.

If processing a particular event fails with a transient error, it's retried automatically. Events are only acknowledged after successful processing.
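
The processing steps can be approximated with the following Python sketch. It is a simplified model, not the actual pipeline: `process_event` and the in-memory `active_rules` and `balances` dictionaries are hypothetical, tiered rules and the entitlement refresh are left out, and the debit is assumed to be skipped when the balance is too low.

```python
def process_event(event: dict, active_rules: dict, balances: dict) -> bool:
    """Apply one queued usage event; return True if credits were debited."""
    # 1. Look up the active metering rule for the metric key.
    rule = active_rules[event["billable_metric_key"]]
    # 2. Compute the cost in millicredits (flat and per_unit only in this sketch).
    if rule["cost_type"] == "flat":
        cost = rule["base_cost"]
    else:
        cost = rule["unit_cost"] * event["units"]
    # 3. Debit, but never write a negative balance.
    customer = event["external_customer_id"]
    if balances[customer] < cost:
        return False
    balances[customer] -= cost
    return True


active_rules = {"look": {"cost_type": "per_unit", "unit_cost": 1000}}
balances = {"user_abc": 1500}
event = {"external_customer_id": "user_abc", "billable_metric_key": "look", "units": 1}

print(process_event(event, active_rules, balances))  # True
print(process_event(event, active_rules, balances))  # False
print(balances["user_abc"])                          # 500
```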

### Typical latency

Pipeline processing is typically **single-digit milliseconds to a few seconds** end-to-end. There is no strict SLA — treat it as near-real-time, not guaranteed-real-time.

If your flow requires **strict ordering** or **synchronous insufficient-credit errors**, use [reservations](/docs/concepts/reservations) instead of usage events.

### Insufficient balance during async processing

The `202 Accepted` confirms the event is enqueued, not that the debit succeeded. At processing time, if the customer's `effective_balance` is less than the computed cost, the debit **fails silently** — no negative balance is written, the customer is not charged, and no webhook is emitted today.

To avoid this, always do one of the following:

1. **Pre-check** via the single-metric entitlement endpoint before sending the event, or
2. Use `/v1/reserve` for expensive operations where you need a synchronous 402 on insufficient credits.

Overage policy (`allow`, `notify`) does **not** automatically cause silent overdrafts. If you want to permit overage, you must check the entitlement's `allowed: true` response (which reflects the policy) before posting the usage event; the backend pipeline itself will not write a negative balance even under `allow`.

Periodic reconciliation — comparing the customer balance from `GET /v1/customers/{id}/credits` against your own expected totals — is the safest way to detect skew.
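
A reconciliation check can be as small as the Python sketch below. It assumes you track a starting balance and the debits you expected, all in millicredits; `balance_skew` is a hypothetical helper, and fetching the reported balance from the API is left out because the response shape is not covered on this page.

```python
def balance_skew(starting_balance: int, expected_debits: list[int], reported_balance: int) -> int:
    """Return reported minus expected balance, in millicredits.

    A positive result means fewer credits were debited than you expected,
    for example because a debit was skipped for insufficient balance.
    """
    expected_balance = starting_balance - sum(expected_debits)
    return reported_balance - expected_balance


# All three expected debits were applied: no skew.
print(balance_skew(10000, [1000, 1000, 5000], 3000))  # 0
# The 9,500 mc debit was skipped because only 9,000 mc remained.
print(balance_skew(10000, [1000, 9500], 9000))        # 9500
```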

## Common Mistakes

**✗ Don't look for a `/debit` endpoint — it doesn't exist**

Debiting is intentionally not directly exposed. You record a usage event and the async pipeline handles rule lookup, cost calculation, and debit atomically.

**✗ Don't change a billable_metric_key mid-flight**

Metering rules are keyed by the metric key. Renaming it orphans historical rules and breaks in-flight events. Create a new metric instead.
