Metering

How to define billable metrics, configure metering rules with flat/per-unit/tiered pricing, and record usage events that debit credits.

Mental Model

Metering is the bridge between "something happened" and "credits got spent". You define the price list (metering rules), report what happened (usage events), and QuotaStack does the math.

Quick Take
  • Define billable metrics (what you charge for) and metering rules (how much it costs).
  • Three cost types: flat, per-unit, and tiered (graduated or volume).
  • Record usage events: the system looks up the rule, computes the cost, and debits credits.
  • Batch ingestion is supported for high-throughput scenarios.

Flow diagram: a usage event is ingested, matched to a metric rule, the rule applies a cost calculation (flat, per-unit, or tiered), and the result is a credit debit.

Metering

Metering is how QuotaStack maps real-world actions to credit costs. You define what you charge for (billable metrics), how much it costs (metering rules), and then report when it happens (usage events). The system handles the rest — looking up the rule, computing the cost, and debiting credits from the customer’s account.

Billable metrics

A billable metric is a named thing you charge for. It is identified by a unique key within your tenant.

Examples:

| Key | Description |
| --- | --- |
| chat_message | A single chat message sent. |
| look | An AI-generated outfit look. |
| api_call | One API request to your service. |
| storage_gb | One gigabyte-month of storage. |
| plan_purchase | Purchasing or renewing a subscription plan. |

Metric keys are strings. Use lowercase with underscores. The key is what you reference in entitlement checks, usage events, and metering rules.

Listing billable metrics

GET /v1/billable-metrics
curl https://api.quotastack.io/v1/billable-metrics \
  -H "X-API-Key: qs_live_..."

This returns all billable metrics defined for your tenant — the named units you charge for (e.g. chat_message, look). Each metric’s pricing lives in a separate metering rule (see below). A billable metric without an active metering rule cannot be charged for.

Getting a single metric

GET /v1/billable-metrics/{key}
curl https://api.quotastack.io/v1/billable-metrics/look \
  -H "X-API-Key: qs_live_..."

Metering rules

A metering rule maps a billable metric key to a credit cost. Each tenant has at most one active rule per metric key at any time.

Creating a metering rule

POST /v1/metering-rules
curl -X POST https://api.quotastack.io/v1/metering-rules \
  -H "X-API-Key: qs_live_..." \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: rule-look-v1" \
  -d '{
    "billable_metric_key": "look",
    "cost_type": "per_unit",
    "unit_cost": 1000
  }'

Rule fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| billable_metric_key | string | yes | The metric this rule prices. |
| cost_type | string | yes | flat, per_unit, or tiered. |
| base_cost | int64 | for flat | Fixed millicredit cost (ignores unit count). |
| unit_cost | int64 | for per_unit | Millicredits per unit. |
| tier_config | object | for tiered | Tier definitions (see below). |
| metadata | object | no | Arbitrary key-value pairs. |
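
The conditional requirements in the table can be checked client-side before calling POST /v1/metering-rules. The following is a minimal Python sketch under that reading of the table; validate_rule and its error messages are hypothetical helpers, not part of QuotaStack, and the API's own validation remains authoritative.

```python
# Which pricing field each cost_type requires, per the rule fields table.
REQUIRED_FIELD = {"flat": "base_cost", "per_unit": "unit_cost", "tiered": "tier_config"}


def validate_rule(rule: dict) -> list[str]:
    """Return the problems found in a metering-rule payload (empty list if none)."""
    errors = []
    if not rule.get("billable_metric_key"):
        errors.append("billable_metric_key is required")
    cost_type = rule.get("cost_type")
    if cost_type not in REQUIRED_FIELD:
        errors.append("cost_type must be flat, per_unit, or tiered")
    elif rule.get(REQUIRED_FIELD[cost_type]) is None:
        # A value of 0 is allowed; only a missing field is reported.
        errors.append(f"{REQUIRED_FIELD[cost_type]} is required for cost_type {cost_type}")
    return errors
```

Running this before the POST catches the most common payload mistake, a cost_type that does not match the pricing field supplied.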

Listing metering rules

GET /v1/metering-rules

Supports optional query parameters:

| Parameter | Description |
| --- | --- |
| billable_metric_key | Filter by a specific metric key. |
| active_only=true | Only return currently active rules (no effective_until). |

Cost types

Flat

A fixed cost regardless of how many units are consumed. Useful for plan purchases or one-time fees.

{
  "billable_metric_key": "plan_purchase",
  "cost_type": "flat",
  "base_cost": 99000
}

Whether you check or record 1 unit or 100 units, the cost is always 99,000 mc (millicredits).

Per-unit

Cost scales linearly with units. The most common type for usage-based billing.

{
  "billable_metric_key": "chat_message",
  "cost_type": "per_unit",
  "unit_cost": 1000
}

| Units | Cost |
| --- | --- |
| 1 | 1,000 mc |
| 5 | 5,000 mc |
| 100 | 100,000 mc |
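
The two simple cost types reduce to one-line functions. This is an illustrative Python sketch of the arithmetic described above, not QuotaStack's implementation; the function names are hypothetical and all amounts are in millicredits.

```python
def flat_cost(base_cost: int, units: int) -> int:
    """Flat rule: the same cost no matter how many units are reported."""
    return base_cost


def per_unit_cost(unit_cost: int, units: int) -> int:
    """Per-unit rule: cost grows linearly with the unit count."""
    return unit_cost * units


print(flat_cost(99000, 100))     # 99000
print(per_unit_cost(1000, 5))    # 5000
```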

Tiered

Tiered pricing supports two modes: graduated and volume.

Graduated

Each tier prices its own slice of units. The total cost is the sum across all tiers.

{
  "billable_metric_key": "api_call",
  "cost_type": "tiered",
  "tier_config": {
    "mode": "graduated",
    "tiers": [
      { "up_to": 100,  "unit_cost": 500, "flat_cost": 0 },
      { "up_to": 1000, "unit_cost": 300, "flat_cost": 0 },
      { "up_to": null, "unit_cost": 100, "flat_cost": 0 }
    ]
  }
}

For 250 API calls:

| Tier | Units | Unit cost | Subtotal |
| --- | --- | --- | --- |
| 1-100 | 100 | 500 mc | 50,000 mc |
| 101-250 | 150 | 300 mc | 45,000 mc |
| Total | 250 | | 95,000 mc |
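
The breakdown above can be reproduced with a short function. This is a sketch in Python, assuming up_to is an inclusive upper bound (as the 1-100 row suggests) and ignoring flat_cost, which is 0 in this example; graduated_cost is a hypothetical name, not a QuotaStack API.

```python
def graduated_cost(units: int, tiers: list[dict]) -> int:
    """Price each slice of units at its own tier's unit_cost and sum the slices."""
    total = 0
    lower = 0  # units already priced by earlier tiers
    for tier in tiers:
        if units <= lower:
            break  # nothing left to price
        # up_to = None (JSON null) means the tier has no upper bound.
        upper = units if tier["up_to"] is None else min(units, tier["up_to"])
        total += (upper - lower) * tier["unit_cost"]
        lower = upper
    return total


tiers = [
    {"up_to": 100, "unit_cost": 500},
    {"up_to": 1000, "unit_cost": 300},
    {"up_to": None, "unit_cost": 100},
]
print(graduated_cost(250, tiers))  # 95000
```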

Volume

All units are priced at the rate of the tier they fall into. The tier is determined by the total unit count.

{
  "billable_metric_key": "api_call",
  "cost_type": "tiered",
  "tier_config": {
    "mode": "volume",
    "tiers": [
      { "up_to": 100,  "unit_cost": 500, "flat_cost": 0 },
      { "up_to": 1000, "unit_cost": 300, "flat_cost": 0 },
      { "up_to": null, "unit_cost": 100, "flat_cost": 0 }
    ]
  }
}

For 250 API calls: all 250 fall into the second tier (up_to 1000), so total cost = 250 * 300 = 75,000 mc.
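
In code, volume pricing is a single tier lookup followed by one multiplication. The sketch below is illustrative Python under the same assumptions as the example (inclusive up_to, flat_cost of 0); volume_cost is a hypothetical helper name.

```python
def volume_cost(units: int, tiers: list[dict]) -> int:
    """Price every unit at the rate of the one tier the total count falls into."""
    for tier in tiers:
        # up_to = None (JSON null) is the open-ended last tier.
        if tier["up_to"] is None or units <= tier["up_to"]:
            return units * tier["unit_cost"]
    raise ValueError("no tier covers this unit count; the last tier should have up_to = null")


tiers = [
    {"up_to": 100, "unit_cost": 500},
    {"up_to": 1000, "unit_cost": 300},
    {"up_to": None, "unit_cost": 100},
]
print(volume_cost(250, tiers))  # 75000
```

Note the discontinuity this mode creates: with these tiers, 100 units cost 50,000 mc while 101 units cost 30,300 mc.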

Tier fields

| Field | Type | Description |
| --- | --- | --- |
| up_to | int64 or null | Upper bound for this tier. null means infinity (must be the last tier). |
| unit_cost | int64 | Millicredits per unit within this tier. |
| flat_cost | int64 | Fixed millicredit fee added once per tier entered (graduated) or once per event (volume). See below. |

flat_cost behavior

  • Graduated: added once when the customer’s cumulative usage crosses into the tier.
  • Volume: added once, alongside the per-unit cost at the single tier all units fall into.

Concrete example — two tiers:

  • Tier 1: 0–100 units, unit_cost: 10, flat_cost: 100
  • Tier 2: 101+ units, unit_cost: 5, flat_cost: 200

For 150 units:

| Mode | Calculation | Total |
| --- | --- | --- |
| Graduated | 100 (tier-1 flat) + 100 × 10 + 200 (tier-2 flat) + 50 × 5 | 1,550 mc |
| Volume | 200 (tier-2 flat) + 150 × 5 | 950 mc |
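
Both rows of the table can be checked with one function that adds flat_cost in the mode-specific way. This is a Python sketch, not the service's code: tiered_cost is a hypothetical name, up_to is assumed inclusive, and the units are treated as a single quantity starting from zero, as in the worked example.

```python
def tiered_cost(units: int, tiers: list[dict], mode: str) -> int:
    """Compute a tiered cost in millicredits, including each tier's flat_cost."""
    if mode == "volume":
        # One tier applies to all units; its flat_cost is added once.
        for tier in tiers:
            if tier["up_to"] is None or units <= tier["up_to"]:
                return tier["flat_cost"] + units * tier["unit_cost"]
        raise ValueError("no tier covers this unit count")
    if mode == "graduated":
        total, lower = 0, 0
        for tier in tiers:
            if units <= lower:
                break
            upper = units if tier["up_to"] is None else min(units, tier["up_to"])
            # flat_cost is added once for every tier the usage reaches.
            total += tier["flat_cost"] + (upper - lower) * tier["unit_cost"]
            lower = upper
        return total
    raise ValueError(f"unknown mode: {mode}")


tiers = [
    {"up_to": 100, "unit_cost": 10, "flat_cost": 100},
    {"up_to": None, "unit_cost": 5, "flat_cost": 200},
]
print(tiered_cost(150, tiers, "graduated"))  # 1550
print(tiered_cost(150, tiers, "volume"))     # 950
```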

Rule versioning

Creating a new metering rule for a billable_metric_key that already has an active rule automatically deactivates the old one. The old rule gets an effective_until timestamp set to now; the new rule becomes the active rule with effective_until = null.

This means you can update pricing without downtime:

  1. Create a new rule for the same metric key with the new cost.
  2. The old rule is deactivated immediately.
  3. All subsequent usage events and entitlement checks use the new rule.

Historical ledger entries retain the cost they were computed with at the time of the event. Changing a rule does not retroactively alter past charges.
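
The versioning behavior can be modeled with a small in-memory store. This Python sketch only illustrates the semantics described above (one active rule per key, old rule closed with effective_until); Rule and RuleStore are hypothetical stand-ins, not QuotaStack types, and the real deactivation happens server-side.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Rule:
    billable_metric_key: str
    unit_cost: int
    effective_until: Optional[datetime] = None  # None means the rule is active


class RuleStore:
    """In-memory stand-in for the server-side rule table (illustrative only)."""

    def __init__(self) -> None:
        self.rules: list[Rule] = []

    def create(self, key: str, unit_cost: int) -> Rule:
        now = datetime.now(timezone.utc)
        # Close the currently active rule for this key, if there is one.
        for rule in self.rules:
            if rule.billable_metric_key == key and rule.effective_until is None:
                rule.effective_until = now
        new_rule = Rule(key, unit_cost)
        self.rules.append(new_rule)
        return new_rule

    def active(self, key: str) -> Optional[Rule]:
        for rule in self.rules:
            if rule.billable_metric_key == key and rule.effective_until is None:
                return rule
        return None
```

Old rules stay in the list with their effective_until set, which mirrors how historical charges keep the price they were computed with.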

Usage events

Usage events are how you tell QuotaStack that something billable happened. Recording a usage event triggers the pipeline that debits credits.

Recording a usage event

POST /v1/usage
curl -X POST https://api.quotastack.io/v1/usage \
  -H "X-API-Key: qs_live_..." \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: req-abc-123" \
  -d '{
    "external_customer_id": "user_abc",
    "billable_metric_key": "look",
    "units": 1,
    "idempotency_key": "look-gen-xyz-789",
    "metadata": {
      "outfit_id": "outfit_456"
    }
  }'

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| external_customer_id | string | yes* | Your own (tenant-side) ID for the customer. See Customer identification. |
| customer_id | string | yes* | Alternative: QuotaStack UUID. Exactly one of the two is required. |
| billable_metric_key | string | yes | The metric key being consumed. |
| units | int64 | yes | Number of units consumed (must be positive). |
| idempotency_key | string | yes | Unique key for this event. Prevents double-charging on retries. |
| occurred_at | timestamp | no | When the event happened. Defaults to now. Must not be in the future. |
| metadata | object | no | Arbitrary key-value pairs attached to the event. |
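
The constraints in the table can be checked locally before an event is sent. The following is a hedged Python sketch: validate_usage_event is a hypothetical helper, occurred_at is assumed to be a timezone-aware datetime on the client side, and the error strings are illustrative rather than the exact messages the API returns.

```python
from datetime import datetime, timezone
from typing import Optional


def validate_usage_event(event: dict, now: Optional[datetime] = None) -> list[str]:
    """Return the problems found in a usage-event payload (empty list if none)."""
    now = now or datetime.now(timezone.utc)
    errors = []
    has_external = bool(event.get("external_customer_id"))
    has_internal = bool(event.get("customer_id"))
    if has_external == has_internal:  # both present or both missing
        errors.append("exactly one of external_customer_id or customer_id is required")
    if not event.get("billable_metric_key"):
        errors.append("billable_metric_key is required")
    units = event.get("units")
    if not isinstance(units, int) or isinstance(units, bool) or units <= 0:
        errors.append("units must be positive")
    if not event.get("idempotency_key"):
        errors.append("idempotency_key is required")
    occurred_at = event.get("occurred_at")
    if occurred_at is not None and occurred_at > now:
        errors.append("occurred_at must not be in the future")
    return errors
```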

The response returns immediately with status 202 Accepted:

{
  "event_id": "evt_01HXY...",
  "idempotency_key": "look-gen-xyz-789",
  "status": "accepted",
  "estimated_cost": 1000,
  "duplicate": false
}

Two-layer idempotency

Usage events have two levels of idempotency protection:

  1. HTTP-level: The Idempotency-Key header prevents duplicate HTTP requests. If you retry the same POST, you get the same response.
  2. Event-level: The idempotency_key field in the body prevents duplicate business events. Even if two different HTTP requests carry the same event idempotency key, the event is processed at most once.
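
To make the two layers concrete, the sketch below builds (but does not send) the same request as the curl example using Python's standard library. build_usage_request is a hypothetical helper; the URL, headers, and body fields are the ones shown above.

```python
import json
import urllib.request


def build_usage_request(api_key: str, event: dict, http_idempotency_key: str) -> urllib.request.Request:
    """Build a POST /v1/usage request carrying both idempotency layers."""
    return urllib.request.Request(
        "https://api.quotastack.io/v1/usage",
        data=json.dumps(event).encode("utf-8"),  # event-level key travels in the body
        method="POST",
        headers={
            "X-API-Key": api_key,
            "Content-Type": "application/json",
            "Idempotency-Key": http_idempotency_key,  # HTTP-level key travels in a header
        },
    )


event = {
    "external_customer_id": "user_abc",
    "billable_metric_key": "look",
    "units": 1,
    "idempotency_key": "look-gen-xyz-789",
}
request = build_usage_request("qs_live_...", event, "req-abc-123")
```

When retrying after a timeout, reuse both keys unchanged: the header lets the API replay the original response, and the body key guards against a second debit if the retry arrives as a new HTTP request.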

Batch usage events

Record multiple events in a single request:

POST /v1/usage/batch
curl -X POST https://api.quotastack.io/v1/usage/batch \
  -H "X-API-Key: qs_live_..." \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: batch-abc-001" \
  -d '{
    "events": [
      {
        "external_customer_id": "user_abc",
        "billable_metric_key": "chat_message",
        "units": 1,
        "idempotency_key": "msg-001"
      },
      {
        "external_customer_id": "user_abc",
        "billable_metric_key": "chat_message",
        "units": 1,
        "idempotency_key": "msg-002"
      }
    ]
  }'

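Each event in the batch is validated independently. The response reports a per-event status, with one entry per input event in the same order. The example below illustrates a partially rejected batch (it is not the response to the valid request above):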

{
  "accepted": 1,
  "rejected": 1,
  "results": [
    {
      "event_id": "019d...",
      "idempotency_key": "msg-001",
      "status": "accepted",
      "estimated_cost": 1000,
      "duplicate": false,
      "index": 0
    },
    {
      "idempotency_key": "msg-002",
      "status": "rejected",
      "error": "billable_metric_key is required; units must be positive",
      "index": 1
    }
  ]
}

Rejected events omit event_id and estimated_cost and include an error string (multiple errors are joined with "; "). When the batch is well-formed, the overall HTTP response is always 202 Accepted; per-event rejections do not fail the whole request.
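
Because the HTTP status does not signal per-event failures, callers need to inspect the results array. A minimal Python sketch of that handling, assuming the response shape shown above (split_batch_results is a hypothetical helper name):

```python
def split_batch_results(response: dict) -> tuple[list[dict], list[dict]]:
    """Separate accepted results from rejected ones and split joined error strings."""
    accepted = [r for r in response["results"] if r["status"] == "accepted"]
    rejected = [
        {
            "index": r["index"],  # position of the event in the submitted batch
            "idempotency_key": r.get("idempotency_key"),
            "errors": r["error"].split("; "),
        }
        for r in response["results"]
        if r["status"] == "rejected"
    ]
    return accepted, rejected
```

The index field maps each rejection back to the event you submitted, so you can fix and resend only the failed entries.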

Event-level idempotency_key scope

The idempotency_key field inside the event body is scoped per-tenant + per-environment + per-customer. Two different customers within the same tenant can therefore share the same idempotency_key value without collision. In practice, still prefer a globally unique key derived from your business event (message ID, request ID) so your own logs and reconciliation stay clean.

This is separate from the HTTP Idempotency-Key header, which is per-tenant-scoped and deduplicates the whole POST request. See Idempotency for details.

The usage event pipeline

Usage events are processed asynchronously. The POST enqueues the event and returns immediately; a background pipeline then applies the debit in order:

  1. POST /v1/usage
  2. Enqueued (per-tenant, ordered)
  3. Look up the active metering rule for billable_metric_key
  4. Compute the credit cost using the rule
  5. Debit credits from the customer's account (burn-down order)
  6. Refresh the entitlement summary

The POST returns 202 Accepted immediately after enqueue. Events for a given tenant are processed in order, and the event-level idempotency key ensures that credits are debited no more than once per event.

If processing a particular event fails with a transient error, it’s retried automatically. Events are only acknowledged after successful processing.
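
The pipeline stages can be pictured as a loop over an ordered queue. The Python sketch below is a deliberately simplified model, not the actual implementation: process_queue is hypothetical, rules are reduced to a per-unit cost per metric key, deduplication is assumed to key on customer plus event idempotency key, and burn-down order, retries, and insufficient-balance handling are left out.

```python
from collections import deque


def process_queue(queue: deque, rules: dict[str, int], balances: dict[str, int], processed: set) -> None:
    """Drain events in FIFO order, debiting each distinct event at most once."""
    while queue:
        event = queue.popleft()
        customer = event["external_customer_id"]
        dedupe_key = (customer, event["idempotency_key"])
        if dedupe_key in processed:
            continue  # duplicate event: already debited
        # Look up the active rule (assumed to exist) and compute the cost.
        unit_cost = rules[event["billable_metric_key"]]
        cost = unit_cost * event["units"]
        balances[customer] -= cost
        processed.add(dedupe_key)


rules = {"chat_message": 1000}
balances = {"user_abc": 5000}
queue = deque([
    {"external_customer_id": "user_abc", "billable_metric_key": "chat_message", "units": 1, "idempotency_key": "msg-001"},
    {"external_customer_id": "user_abc", "billable_metric_key": "chat_message", "units": 2, "idempotency_key": "msg-002"},
    {"external_customer_id": "user_abc", "billable_metric_key": "chat_message", "units": 1, "idempotency_key": "msg-001"},
])
process_queue(queue, rules, balances, set())
print(balances["user_abc"])  # 2000
```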

Typical latency

Pipeline processing is typically single-digit milliseconds to a few seconds end-to-end. There is no strict SLA — treat it as near-real-time, not guaranteed-real-time.

If your flow requires strict ordering or synchronous insufficient-credit errors, use reservations instead of usage events.

Insufficient balance during async processing

The 202 Accepted confirms the event is enqueued, not that the debit succeeded. At processing time, if the customer’s effective_balance is less than the computed cost, the debit fails silently — no negative balance is written, the customer is not charged, and no webhook is emitted today.

To avoid this, always:

  1. Pre-check via the single-metric entitlement endpoint before sending the event, or
  2. Use /v1/reserve for expensive operations where you need a synchronous 402 on insufficient credits.

Overage policy (allow, notify) does not automatically cause silent overdrafts. If you want to permit overage, you must check the entitlement’s allowed: true response (which reflects the policy) before posting the usage event; the backend pipeline itself will not write a negative balance even under allow.

Periodic reconciliation — comparing the customer balance from GET /v1/customers/{id}/credits against your own expected totals — is the safest way to detect skew.
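
A reconciliation pass can be as simple as diffing two maps. In this Python sketch, find_balance_skew is a hypothetical helper; it assumes you have already extracted each customer's balance (in millicredits) from GET /v1/customers/{id}/credits, whose response shape is not covered here, and that you track expected balances yourself.

```python
def find_balance_skew(expected: dict[str, int], reported: dict[str, int]) -> dict[str, int]:
    """Return {customer_id: reported - expected} for every customer whose balance differs."""
    skew = {}
    for customer_id, expected_balance in expected.items():
        if customer_id not in reported:
            continue  # no reported balance fetched for this customer
        diff = reported[customer_id] - expected_balance
        if diff != 0:
            skew[customer_id] = diff
    return skew


expected = {"user_abc": 5000, "user_def": 2000}
reported = {"user_abc": 5000, "user_def": 3000}
print(find_balance_skew(expected, reported))  # {'user_def': 1000}
```

A positive difference means the customer holds more credits than your records predict, which is the signature of a debit that was skipped at processing time.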

Common Mistakes

The mistakes developers typically make with this concept — and what to do instead.

  • Don't look for a /debit endpoint: it doesn't exist.
    Why: Debiting is intentionally not directly exposed. You record a usage event and the async pipeline handles rule lookup, cost calculation, and debit atomically.
  • Don't change a billable_metric_key mid-flight.
    Why: Metering rules are keyed by the metric key. Renaming it orphans historical rules and breaks in-flight events. Create a new metric instead.