---
title: "API Platform: Postpaid Metered Billing"
description: "How to model a usage-based API platform with postpaid billing, tiered pricing, usage summaries, and contract lifecycle management."
order: 5
---

# API Platform

**Pattern:** POSTPAID · METERED · TIERED

*Inspired by: Stripe API, Twilio, OpenAI API, most usage-based developer infra*

> **Mental Model:** Think of this as a **utility bill for an API**: customers consume requests throughout the month, QuotaStack prices usage at tiered rates, and at period close you receive a subscription.renewed webhook carrying a usage_summary to invoice from. No prepayment, and no spending caps unless you want them.

## Quick Take

- Postpaid **billing_mode** — usage accumulates, you invoice in arrears
- **Tiered metering** (graduated or volume) for volume discounts on heavy users
- Each cycle ends with a **subscription.renewed webhook** carrying a usage_summary
- Optional **contracts** for fixed-term commitments and renewals

## Diagram

Customer signs up via a postpaid plan variant — no upfront credits required (or a small budget grant for spending caps). Every API request emits POST /v1/usage with the billable metric. The metering rule applies tiered pricing in real time. At period close, QuotaStack auto-advances the cycle and fires subscription.renewed with a usage_summary — your code generates the invoice, charges the customer, and reconciles.

```mermaid
flowchart TD
    A[Customer signs up] --> B[POST /v1/subscriptions<br/>postpaid variant]
    B --> C[Optional: budget grant<br/>spending ceiling]
    D[Each API request] --> E[POST /v1/usage<br/>or /v1/usage/batch]
    E --> F[Tiered cost calculation]
    F --> G[Usage tracked,<br/>no synchronous debit]
    H[Cycle end] --> I[Auto-advance + webhook<br/>subscription.renewed]
    I --> J[usage_summary payload]
    J --> K[Tenant generates invoice<br/>+ charges customer]
```

## The problem

You run an API platform where customers make API calls and are billed monthly on actual usage, as with Stripe's API pricing or OpenAI's usage-based billing. Customers receive a credit budget at the start of each cycle, consume it through API calls, and are invoiced at cycle end for what they used.

You need:

- Postpaid billing: usage first, invoice later.
- Tiered pricing: the first 10,000 API calls are charged at the base rate, the next 90,000 cost less per call, and calls beyond 100,000 cost less still.
- Automatic cycle advancement with usage summaries for invoicing.
- Contract lifecycle: annual contracts with auto-renewal and ending-soon alerts.
- Graceful handling of overdue invoices -- do not cut off a paying customer over a delayed ACH.

## Credit structure

In postpaid mode, credits function as a **budget**, not a prepayment. The customer does not pay for them upfront. QuotaStack grants them at cycle start to set the usage ceiling, and the tenant invoices based on actual consumption at cycle end.

| Block type | Priority | Expiry | Source |
|---|---|---|---|
| Monthly budget grant | 10 | End of billing cycle | Credit grant template on plan variant |
| Included credits (optional) | 0 | None | One-time grant on activation |

**Budget vs. prepaid:** The credit blocks work identically at the API level. The difference is business logic: in prepaid, credits are paid for before use; in postpaid, credits are a spending limit and the tenant invoices after the fact. QuotaStack does not care which model you use -- it tracks the credits either way.

## Billable metrics and tiered metering

### Single metric: `api_call`

```bash
POST /v1/billable-metrics
Idempotency-Key: metric:api_call

{ "key": "api_call", "name": "API Call" }
```

### Tiered metering rule

Graduated tiered pricing applies different credit costs at different usage volumes within a cycle.

```bash
POST /v1/metering-rules
Idempotency-Key: rule:api_call

{
  "billable_metric_key": "api_call",
  "cost_type": "tiered",
  "tiers": [
    { "up_to": 10000,  "credit_cost": 1000 },
    { "up_to": 100000, "credit_cost": 500 },
    { "up_to": null,   "credit_cost": 100 }
  ],
  "unit_cost": 1000
}
```

This defines graduated tiers:

| Tier | Range | Cost per call (mc) | Cost per call (credits) |
|---|---|---|---|
| 1 | First 10,000 calls | 1000mc | 1 credit |
| 2 | Next 90,000 calls | 500mc | 0.5 credits |
| 3 | Beyond 100,000 calls | 100mc | 0.1 credits |

**Graduated mode:** Each tier applies only to the units within its range. A customer making 15,000 calls in a cycle pays:

```
Tier 1: 10,000 calls x 1000mc = 10,000,000mc (10,000 credits)
Tier 2:  5,000 calls x  500mc =  2,500,000mc  (2,500 credits)
Total:                           12,500,000mc (12,500 credits)
```

The tiered computation happens inside QuotaStack's metering rule engine. Your app just posts `units: 1` per API call.
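
The graduated arithmetic above can be reproduced locally, for example to forecast a customer's credit consumption. The following is a minimal sketch, not QuotaStack's implementation: the `TIERS` list mirrors the metering rule above, and the function name `graduated_cost_mc` is illustrative.

```python
# Tiers mirror the metering rule above: (cumulative upper bound, cost per call in mc).
# None means "no upper bound".
TIERS = [(10_000, 1000), (100_000, 500), (None, 100)]


def graduated_cost_mc(units, tiers=TIERS):
    """Return the total cost in mc, pricing each unit at the rate of its own tier."""
    total = 0
    lower = 0  # units already priced by earlier tiers
    for upper, cost_per_unit in tiers:
        if units <= lower:
            break
        in_tier = units - lower if upper is None else min(units, upper) - lower
        total += in_tier * cost_per_unit
        if upper is None:
            break
        lower = upper
    return total


print(graduated_cost_mc(15_000))  # 12500000
print(graduated_cost_mc(85_000))  # 47500000
```

Keeping the tiers in one list means a pricing change touches a single definition rather than a chain of conditionals.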

## Plan catalog

### Plan and variant

```bash
POST /v1/plans
Idempotency-Key: plan:api-platform

{ "name": "API Platform", "description": "Usage-based API access" }
```

```bash
POST /v1/plans/{plan_id}/variants
Idempotency-Key: variant:api-monthly-postpaid

{
  "name": "Monthly Postpaid",
  "billing_cycle": "monthly",
  "billing_mode": "postpaid",
  "allow_usage_while_overdue": true
}
```

Key difference from prepaid: `billing_mode: "postpaid"`. QuotaStack auto-advances the cycle at `current_period_end` without waiting for a renew call. No `renewal_due_days` or `grace_period_days` -- those are prepaid concepts.

### Credit grant template (budget)

```bash
POST /v1/plans/{plan_id}/variants/{variant_id}/credit-grants
Idempotency-Key: grant:api-monthly-budget

{
  "credits": 100000000,
  "grant_interval": "billing_cycle",
  "grant_type": "recurring",
  "source": "plan_grant",
  "expires_after_seconds": 2592000,
  "rollover_percentage": 0,
  "accumulation_cap": null
}
```

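100,000,000mc = 100,000 credits per cycle. This is the budget ceiling, not a prepayment. At a flat 1 credit per API call (the tier 1 rate), this would cover 100,000 calls. Because the tiers above are graduated, later calls cost less, so the budget actually covers 550,000 calls (10,000 + 45,000 + 45,000 credits across the three tiers) before overage kicks in.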

For customers expected to exceed the budget, set `overage_policy: "allow"` on the customer so the balance can go negative.
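
How far a budget stretches depends on the tier boundaries, since each call is priced at the rate of the tier it falls in. At the flat tier 1 rate the 100,000-credit budget covers 100,000 calls; under graduated tiers the same budget covers more. The sketch below works this out, assuming the tier list from the metering rule above. `calls_covered` is an illustrative helper, not a QuotaStack API.

```python
# (cumulative upper bound, cost per call in mc); None means "no upper bound".
TIERS = [(10_000, 1000), (100_000, 500), (None, 100)]


def calls_covered(budget_mc, tiers=TIERS):
    """Return how many whole calls a budget (in mc) pays for under graduated tiers."""
    calls = 0
    lower = 0
    remaining = budget_mc
    for upper, cost_per_call in tiers:
        affordable = remaining // cost_per_call
        if upper is None or affordable < upper - lower:
            # The budget runs out inside this tier, or the tier is unbounded.
            return calls + affordable
        tier_size = upper - lower
        calls += tier_size
        remaining -= tier_size * cost_per_call
        lower = upper
    return calls


print(calls_covered(100_000_000))  # 550000
```

A calculation like this is useful when sizing the budget grant for a customer whose expected call volume you already know.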

## Integration flow

### 1. Customer subscribes

After contract signing and payment setup:

```bash
POST /v1/customers
Idempotency-Key: signup:<customer_id>

{
  "external_id": "acme_corp",
  "overage_policy": "allow"
}
```

```bash
POST /v1/subscriptions
Idempotency-Key: sub:<contract_id>

{
  "customer_id": "cus_...",
  "plan_variant_id": "pvr_api_monthly_postpaid",
  "contract_end": "2027-04-13T00:00:00Z",
  "contract_ending_soon_days": 30,
  "external_subscription_id": "your_contract_ref_123",
  "metadata": {
    "account_manager": "jane@yourcompany.com",
    "annual_contract_value": "120000"
  }
}
```

QuotaStack:
1. Creates the subscription with status `active`.
2. Sets the billing cycle: `current_period_start` = now, `current_period_end` = now + 1 month.
3. Sets the contract window: `contract_start` = now, `contract_end` = 2027-04-13.
4. Applies the credit grant template, creating a 100,000-credit budget block.
5. Fires `subscription.created` webhook.

### 2. Recording API usage

Each time a customer makes an API call, record usage:

```bash
POST /v1/usage
Idempotency-Key: usage:<request_id>

{
  "customer_id": "cus_...",
  "billable_metric_key": "api_call",
  "units": 1,
  "metadata": {
    "endpoint": "/v1/generate",
    "request_id": "req_abc123",
    "response_status": 200
  }
}
```

Response: `202 Accepted`. The usage event is enqueued and processed asynchronously. Your API responds to the customer without waiting for the credit debit to land.
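
From application code, the same request might be assembled as in the sketch below. It only builds the request object so it can be handed to a background sender. The base URL, the bearer-token header, and the helper name `build_usage_request` are assumptions for illustration and are not taken from the QuotaStack reference.

```python
import json
import urllib.request

# Assumptions: placeholder base URL and bearer-token auth; substitute your real values.
BASE_URL = "https://quotastack.example.com"
API_KEY = "sk_test_placeholder"


def build_usage_request(customer_id, request_id, endpoint, status):
    """Build (but do not send) the POST /v1/usage request for one API call."""
    body = {
        "customer_id": customer_id,
        "billable_metric_key": "api_call",
        "units": 1,
        "metadata": {
            "endpoint": endpoint,
            "request_id": request_id,
            "response_status": status,
        },
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/usage",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
            # Derived from the request ID so a retry reuses the same key.
            "Idempotency-Key": f"usage:{request_id}",
        },
    )


req = build_usage_request("cus_123", "req_abc123", "/v1/generate", 200)
# Send from a background worker, e.g. urllib.request.urlopen(req, timeout=2)
```

Deriving the idempotency key from your own request ID, rather than generating a fresh one per attempt, is what makes retries safe.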

**Batch usage for high-throughput APIs:**

```bash
POST /v1/usage/batch
Idempotency-Key: batch:<batch_id>

{
  "events": [
    {
      "customer_id": "cus_...",
      "billable_metric_key": "api_call",
      "units": 1,
      "idempotency_key": "usage:req_001",
      "metadata": { "endpoint": "/v1/generate" }
    },
    {
      "customer_id": "cus_...",
      "billable_metric_key": "api_call",
      "units": 1,
      "idempotency_key": "usage:req_002",
      "metadata": { "endpoint": "/v1/analyze" }
    }
  ]
}
```

Up to 100 events per batch. Each event has its own idempotency key for deduplication.
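
If you accumulate more than 100 events between sends, split them before posting. A minimal sketch, with `chunk_events` as an illustrative helper name:

```python
def chunk_events(events, max_batch=100):
    """Split a list of usage events into lists of at most max_batch events."""
    return [events[i:i + max_batch] for i in range(0, len(events), max_batch)]


# 250 pending events become three batch payloads of 100, 100, and 50 events.
pending = [
    {
        "customer_id": "cus_123",
        "billable_metric_key": "api_call",
        "units": 1,
        "idempotency_key": f"usage:req_{n:04d}",
    }
    for n in range(250)
]
batches = chunk_events(pending)
print([len(b) for b in batches])  # [100, 100, 50]
```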

### 3. Cycle end -- usage summary and invoicing

At `current_period_end`, QuotaStack automatically advances the subscription and fires `subscription.renewed` with a usage summary:

```json
{
  "event": "subscription.renewed",
  "subscription_id": "sub_...",
  "customer_id": "cus_...",
  "billing_mode": "postpaid",
  "prior_period": {
    "start": "2026-04-13T00:00:00Z",
    "end": "2026-05-13T00:00:00Z"
  },
  "new_period": {
    "start": "2026-05-13T00:00:00Z",
    "end": "2026-06-13T00:00:00Z"
  },
  "credits_granted": 100000000,
  "usage_summary": {
    "total_credits_consumed": 12500000,
    "net_balance_at_cycle_start": 100000000,
    "net_balance_at_cycle_end": 87500000,
    "by_billable_metric": {
      "api_call": {
        "units": 15000,
        "credits": 12500000
      }
    }
  }
}
```

Your billing system reads this webhook and computes the invoice:

```python
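from decimal import Decimal, ROUND_HALF_UP

import stripe


def handle_subscription_renewed(event):
    # `event` is the parsed webhook payload (a dict shaped like the JSON above).
    if event["billing_mode"] != "postpaid":
        return

    summary = event["usage_summary"]

    # Map usage back to fiat using your pricing
    api_calls = summary["by_billable_metric"]["api_call"]["units"]

    # Your graduated fiat pricing (separate from QuotaStack's credit tiers).
    # Decimal avoids float rounding errors on money.
    if api_calls <= 10000:
        invoice_amount = api_calls * Decimal("0.01")  # $0.01 per call
    elif api_calls <= 100000:
        invoice_amount = 10000 * Decimal("0.01") + (api_calls - 10000) * Decimal("0.005")
    else:
        invoice_amount = (10000 * Decimal("0.01") + 90000 * Decimal("0.005")
                          + (api_calls - 100000) * Decimal("0.001"))
    amount_cents = int((invoice_amount * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP))

    # Create the invoice via your PSP. Stripe invoices take their amount from
    # invoice items, so create a draft invoice and attach one line item to it.
    # lookup_stripe_customer_id is a hypothetical helper that maps the
    # QuotaStack customer to your Stripe customer ID.
    stripe_customer = lookup_stripe_customer_id(event["customer_id"])
    invoice = stripe.Invoice.create(customer=stripe_customer, auto_advance=True)
    stripe.InvoiceItem.create(
        customer=stripe_customer,
        invoice=invoice.id,
        amount=amount_cents,
        currency="usd",
        description=f"API usage: {api_calls} calls",
    )

    # Optionally: zero the ledger for the prior period
    # (only if the balance went negative due to overage)
    if summary["net_balance_at_cycle_end"] < 0:
        quotastack.adjust_credits(
            customer_id=event["customer_id"],
            delta=-summary["net_balance_at_cycle_end"],  # positive adjustment
            idempotency_key=f"settle:{event['subscription_id']}:{event['prior_period']['end']}",
            metadata={"source": "postpaid_settlement"},
        )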
```

The new cycle's budget is granted automatically. The customer keeps using the API without interruption.
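
Before invoicing from the payload, a handler might sanity-check that the summary's totals agree with each other. The sketch below uses the field names from the example payload. The balance check assumes no extra grants or manual adjustments landed mid-cycle, which is an assumption rather than a documented guarantee, and `summary_is_consistent` is an illustrative name.

```python
def summary_is_consistent(summary):
    """Check that the usage_summary totals agree with each other."""
    per_metric_total = sum(
        metric["credits"] for metric in summary["by_billable_metric"].values()
    )
    # Assumes nothing but usage changed the balance during the cycle.
    balance_delta = (
        summary["net_balance_at_cycle_start"] - summary["net_balance_at_cycle_end"]
    )
    consumed = summary["total_credits_consumed"]
    return consumed == per_metric_total and consumed == balance_delta


example = {
    "total_credits_consumed": 12500000,
    "net_balance_at_cycle_start": 100000000,
    "net_balance_at_cycle_end": 87500000,
    "by_billable_metric": {"api_call": {"units": 15000, "credits": 12500000}},
}
print(summary_is_consistent(example))  # True
```

A failed check is a reason to hold the invoice for review rather than to reject the webhook.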

### 4. Contract lifecycle

**30 days before contract end:** QuotaStack fires `subscription.contract_ending_soon`:

```json
{
  "event": "subscription.contract_ending_soon",
  "subscription_id": "sub_...",
  "customer_id": "cus_...",
  "contract_end": "2027-04-13T00:00:00Z",
  "days_remaining": 30
}
```

Your account management team uses this to start renewal negotiations.

**If the customer renews:** Extend the contract:

```bash
POST /v1/subscriptions/sub_.../extend
Idempotency-Key: extend:<amendment_id>

{
  "contract_end": "2028-04-13T00:00:00Z",
  "reason": "annual renewal signed",
  "metadata": {
    "amendment_id": "AMD-2027-001",
    "new_annual_value": "144000"
  }
}
```

The subscription continues auto-advancing at each cycle boundary.

**If the customer does not renew:** At `contract_end`, QuotaStack transitions to `contract_ended`:

```json
{
  "event": "subscription.contract_ended",
  "subscription_id": "sub_...",
  "customer_id": "cus_...",
  "contract_end": "2027-04-13T00:00:00Z"
}
```

No more auto-renewals. No more credit grants. Generate the final usage summary and invoice.
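
On the receiving side, the two contract events can be routed through a small dispatcher. This is a sketch only: the handler bodies are placeholders for your own account-management and invoicing logic, and names such as `on` and `dispatch` are illustrative.

```python
HANDLERS = {}


def on(event_name):
    """Register a handler function for one webhook event name."""
    def register(fn):
        HANDLERS[event_name] = fn
        return fn
    return register


@on("subscription.contract_ending_soon")
def notify_account_manager(payload):
    # Placeholder: open a renewal task for the account manager.
    return f"renewal outreach: {payload['subscription_id']} ({payload['days_remaining']} days left)"


@on("subscription.contract_ended")
def close_out_contract(payload):
    # Placeholder: trigger the final invoice run.
    return f"final invoice: {payload['subscription_id']}"


def dispatch(payload):
    """Route a parsed webhook payload to its handler; ignore unknown events."""
    handler = HANDLERS.get(payload["event"])
    return handler(payload) if handler else None


print(dispatch({
    "event": "subscription.contract_ending_soon",
    "subscription_id": "sub_123",
    "days_remaining": 30,
}))  # renewal outreach: sub_123 (30 days left)
```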

**Re-activating a lapsed contract:** If the customer comes back, extend the old subscription:

```bash
POST /v1/subscriptions/sub_.../extend
Idempotency-Key: extend:<new_amendment_id>

{
  "contract_end": "2028-04-13T00:00:00Z",
  "reason": "customer returned after lapse"
}
```

Extend works on `contract_ended` subscriptions -- it flips them back to `active` and resumes auto-advancing.

## Usage event pipeline

For high-throughput API platforms, the usage pipeline must be fast and resilient.

### Async ingestion

`POST /v1/usage` returns `202 Accepted` immediately. The event is queued and processed asynchronously, so your API response time stays unaffected by credit-debit latency.

In the background, QuotaStack:
1. Looks up the metering rule for `api_call`.
2. Computes the credit cost (applying tiered pricing if applicable).
3. Debits the customer's credit blocks in burn-down order.
4. Writes a credit ledger entry.
5. Optionally fires a `credit.consumed` webhook.
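
To make the debit step concrete, here is a toy in-memory model of burning down credit blocks. It is not QuotaStack's implementation. It assumes the blocks are passed in already sorted in burn-down order, and that any overage is charged to the last block, which may go negative. Both are assumptions made for illustration.

```python
def debit_blocks(blocks, cost_mc, allow_overage=True):
    """Debit cost_mc from blocks and return ledger entries as (block_id, amount_mc).

    `blocks` is a list of dicts with "id" and "balance_mc", in burn-down order.
    """
    available = sum(max(block["balance_mc"], 0) for block in blocks)
    if cost_mc > available and (not allow_overage or not blocks):
        # Reject before mutating anything so a failed debit leaves no trace.
        raise ValueError("insufficient credits")

    ledger = []
    remaining = cost_mc
    for block in blocks:
        if remaining == 0:
            break
        take = min(max(block["balance_mc"], 0), remaining)
        if take:
            block["balance_mc"] -= take
            ledger.append((block["id"], take))
            remaining -= take

    if remaining:
        # Overage: push the last block negative by the uncovered amount.
        blocks[-1]["balance_mc"] -= remaining
        ledger.append((blocks[-1]["id"], remaining))
    return ledger


blocks = [{"id": "blk_a", "balance_mc": 300}, {"id": "blk_b", "balance_mc": 500}]
print(debit_blocks(blocks, 400))  # [('blk_a', 300), ('blk_b', 100)]
```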

### Idempotency

Every usage event must have a unique idempotency key. Use your internal request ID:

```bash
Idempotency-Key: usage:<your_request_id>
```

If the same event is posted twice (network retry, queue replay), the second is a no-op. This gives at-least-once delivery semantics with exactly-once processing.
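
The effect of key-based deduplication can be seen in a toy in-memory stand-in for the endpoint. `IdempotentLedger` is purely illustrative and says nothing about how QuotaStack stores keys.

```python
class IdempotentLedger:
    """In-memory stand-in for a usage endpoint that deduplicates by key."""

    def __init__(self):
        self.seen = set()
        self.total_units = 0

    def post_usage(self, idempotency_key, units):
        """Apply the event once; return False if the key was already processed."""
        if idempotency_key in self.seen:
            return False
        self.seen.add(idempotency_key)
        self.total_units += units
        return True


ledger = IdempotentLedger()
ledger.post_usage("usage:req_abc123", 1)  # first delivery is applied
ledger.post_usage("usage:req_abc123", 1)  # retry with the same key is a no-op
print(ledger.total_units)  # 1
```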

### Batching

For APIs handling thousands of requests per second, batch usage events rather than posting one at a time. Collect events in a local buffer (e.g., 100 events or 5 seconds, whichever comes first) and send them via `POST /v1/usage/batch`.

```python
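# Sketch: thread-safe usage event buffer. `quotastack` is your API client.
import threading
import time
import uuid


class UsageBuffer:
    def __init__(self, max_size=100, flush_interval_seconds=5):
        self.buffer = []
        self.max_size = max_size
        self.flush_interval = flush_interval_seconds
        self.last_flush = time.monotonic()
        self.lock = threading.Lock()

    def add(self, customer_id, metric_key, units, request_id, metadata):
        with self.lock:
            self.buffer.append({
                "customer_id": customer_id,
                "billable_metric_key": metric_key,
                "units": units,
                "idempotency_key": f"usage:{request_id}",
                "metadata": metadata
            })
            # Flush on size or age, whichever comes first.
            due = (len(self.buffer) >= self.max_size
                   or time.monotonic() - self.last_flush >= self.flush_interval)
        if due:
            self.flush()

    def flush(self):
        # Also call this from a periodic timer and at shutdown so an idle buffer drains.
        with self.lock:
            events = self.buffer[:self.max_size]
            self.buffer = self.buffer[self.max_size:]
            self.last_flush = time.monotonic()
        if not events:
            return
        try:
            quotastack.post_usage_batch(
                events=events,
                idempotency_key=f"batch:{uuid.uuid4()}"
            )
        except Exception:
            # Requeue for the next flush; per-event keys make the retry safe.
            with self.lock:
                self.buffer = events + self.buffer
            raise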
```

### Entitlement gating for API calls

For prepaid-like behavior where you want to reject calls when the budget is exhausted, add an entitlement check to your API gateway:

```python
# API gateway middleware
def check_api_entitlement(customer_id):
    ent = quotastack.get_entitlement(customer_id, "api_call", units=1)
    if not ent.allowed:
        return http_response(
            status=429,
            body={
                "error": "usage_limit_exceeded",
                "message": "Monthly API call budget exhausted",
                "balance": ent.balance,
                "affordable_units": ent.affordable_units
            }
        )
    return None  # proceed
```

The entitlement check responds in under 1 ms on the fast path, so it is safe to put on the hot path. See [Entitlements: latency and freshness](/docs/concepts/entitlements#latency-and-freshness) for the full semantics.

For postpaid customers with `overage_policy: "allow"`, skip the entitlement check or use it only for reporting -- the balance can go negative and the customer will be invoiced.

## Handling overdue invoices

When a postpaid customer has an unpaid invoice, you may want to throttle or block their API access. QuotaStack does not enforce this automatically -- it is your business decision.

Option 1: **Set overage_policy to "block"** when the invoice is overdue:

```bash
PATCH /v1/customers/cus_...
Idempotency-Key: block:<invoice_id>

{
  "overage_policy": "block"
}
```

Now the entitlement check returns `allowed: false` when the balance reaches 0. Restore to `"allow"` when the invoice is paid.

Option 2: **Use `allow_usage_while_overdue: false`** on the plan variant. This blocks usage while the subscription is in `overdue` status. Postpaid subscriptions never enter `overdue` (that state is prepaid-only), so this option applies only if you also use prepaid elements.

Option 3: **Handle it entirely in your API gateway.** Check your own invoice status and reject requests independently of QuotaStack. This is the most common approach for API platforms.
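
A gateway-side check for Option 3 might look like the sketch below. It blocks only once an unpaid invoice is past its due date plus a grace window, so a delayed payment does not cut off a paying customer. The function name and the 7-day default are illustrative assumptions, and the invoice data comes from your own billing system rather than from QuotaStack.

```python
from datetime import datetime, timedelta, timezone


def should_block(invoice_due_at, paid, now, grace_days=7):
    """Block only when an unpaid invoice is past its due date plus a grace window."""
    if paid:
        return False
    return now > invoice_due_at + timedelta(days=grace_days)


due = datetime(2026, 5, 20, tzinfo=timezone.utc)
print(should_block(due, paid=False, now=datetime(2026, 5, 23, tzinfo=timezone.utc)))  # False
print(should_block(due, paid=False, now=datetime(2026, 6, 1, tzinfo=timezone.utc)))   # True
```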

## Example: full cycle with tiered usage

```
Day 1:   Subscription created (postpaid, monthly, annual contract).
         100,000-credit budget granted.

Day 1-30: Customer makes 85,000 API calls.
          Tier 1: 10,000 calls x 1 credit   = 10,000 credits
          Tier 2: 75,000 calls x 0.5 credits = 37,500 credits
          Total consumed: 47,500 credits.
          Balance: 100,000 - 47,500 = 52,500 credits.

Day 30:  Cycle auto-advances.
         subscription.renewed webhook fires with usage_summary:
           total_credits_consumed: 47,500,000 mc (47,500 credits)
           by_billable_metric:
             api_call: { units: 85,000, credits: 47,500,000 }
         
         Your billing system computes:
           85,000 calls at your fiat pricing = $475
           (10,000 x $0.01 + 75,000 x $0.005)
           Generates Stripe invoice.

         New 100,000-credit budget granted for next cycle.
         Customer keeps calling the API without interruption.

Month 11: subscription.contract_ending_soon fires.
          Sales contacts customer for renewal.

Month 12: Customer signs renewal.
          POST /v1/subscriptions/{id}/extend
          Contract extends to year 2.
```

## Tips

- **Postpaid auto-advances.** Unlike prepaid where you must call `/renew`, postpaid subscriptions advance automatically at `current_period_end`. You do not need to call anything. The `subscription.renewed` webhook delivers the usage summary for invoicing.

- **Usage summary is your invoice source.** Do not re-query the credit ledger to compute invoices. The `usage_summary` in the `subscription.renewed` webhook payload has everything: total credits consumed, per-metric breakdown. Map credits back to fiat using your own pricing table.

- **Credit tiers are not fiat tiers.** QuotaStack's tiered metering rules define credit costs per unit. Your fiat pricing tiers may differ. A common pattern: set QuotaStack's credit cost to 1000mc per unit at all tiers (flat), and apply tiered fiat pricing in your invoicing code. Alternatively, mirror your fiat tiers in credit costs so the usage summary directly reflects tiered consumption.

- **Overage with negative balance.** When `overage_policy: "allow"`, the balance can go negative. This is normal for postpaid. The negative balance represents unbilled usage. At cycle end, settle the ledger by posting a positive adjustment after invoicing.

- **Contract lifecycle is separate from billing cycle.** A 12-month contract with monthly billing has 12 cycles. Each cycle auto-advances. The contract boundary only matters for `contract_ending_soon` and `contract_ended` events. Billing continues as normal within the contract.

- **Idempotency is critical at scale.** At thousands of requests per second, network retries happen. Every usage event must carry a unique, deterministic idempotency key (your request ID is ideal). Duplicate events are silently dropped. This gives you exactly-once credit accounting with at-least-once delivery.

See also: [Subscriptions](/docs/concepts/subscriptions), [Billing Modes](/docs/concepts/subscriptions#billing-modes), [Metering Rules](/docs/concepts/metering#metering-rules), [Usage Events](/docs/concepts/metering#usage-events).

## Concepts Used

- [Metering](/docs/concepts/metering)
- [Subscriptions](/docs/concepts/subscriptions)
- [Webhooks](/docs/concepts/webhooks)
- [Idempotency](/docs/concepts/idempotency)
