POSTPAID · METERED · TIERED

API Platform: Postpaid Metered Billing

How to model a usage-based API platform with postpaid billing, tiered pricing, usage summaries, and contract lifecycle management.

Inspired by: Stripe API, Twilio, OpenAI API, most usage-based developer infra

Mental Model

Think of this as a utility bill for an API: customers consume requests throughout the month, QuotaStack meters usage at tiered rates, and at period close you receive a subscription.renewed webhook whose usage_summary you use to invoice them. No prepayment, and no spending caps unless you want them.

Quick Take
  • Postpaid billing_mode: usage accumulates, you invoice in arrears
  • Tiered metering (graduated or volume) for volume discounts on heavy users
  • Each cycle ends with a subscription.renewed webhook carrying a usage_summary
  • Optional contracts for fixed-term commitments and renewals
Diagram: Subscription (postpaid, billed in arrears) → Usage Event (every request, POST /v1/usage, 202) → Tiered Cost Calc (graduated or volume tiers) → Invoice (generated by the tenant at cycle end from the usage_summary webhook)

API Platform

Pattern: postpaid metered API billing.

The problem

You run an API platform. Customers make API calls and are billed monthly based on actual usage, like Stripe’s API pricing or OpenAI’s usage-based billing. Customers get a credit budget at the start of each cycle, consume it through API calls, and receive an invoice at cycle end for what they used.

You need:

  • Postpaid billing: usage first, invoice later.
  • Tiered pricing: the first 10,000 API calls are billed at the base rate, and the next 90,000 cost less per call.
  • Automatic cycle advancement with usage summaries for invoicing.
  • Contract lifecycle: annual contracts with auto-renewal and ending-soon alerts.
  • Graceful handling of overdue invoices — do not cut off a paying customer over a delayed ACH.

Credit structure

In postpaid mode, credits function as a budget, not a prepayment. The customer does not pay for them upfront. QuotaStack grants them at cycle start to set the usage ceiling, and the tenant invoices based on actual consumption at cycle end.

Block type | Priority | Expiry | Source
Monthly budget grant | 10 | End of billing cycle | Credit grant template on plan variant
Included credits (optional) | 0 | None | One-time grant on activation

Budget vs. prepaid: The credit blocks work identically at the API level. The difference is business logic: in prepaid, credits are paid for before use; in postpaid, credits are a spending limit and the tenant invoices after the fact. QuotaStack does not care which model you use — it tracks the credits either way.

Billable metrics and tiered metering

Single metric: api_call

POST /v1/billable-metrics
Idempotency-Key: metric:api_call

{ "key": "api_call", "name": "API Call" }

Tiered metering rule

For graduated tiered pricing: different credit costs at different usage volumes within a cycle.

POST /v1/metering-rules
Idempotency-Key: rule:api_call

{
  "billable_metric_key": "api_call",
  "cost_type": "tiered",
  "tiers": [
    { "up_to": 10000,  "credit_cost": 1000 },
    { "up_to": 100000, "credit_cost": 500 },
    { "up_to": null,   "credit_cost": 100 }
  ],
  "unit_cost": 1000
}

This defines graduated tiers:

Tier | Range | Credit cost per call | Per-call in credits
1 | First 10,000 calls | 1000mc | 1 credit
2 | Next 90,000 calls | 500mc | 0.5 credits
3 | Beyond 100,000 calls | 100mc | 0.1 credits

Graduated mode: Each tier applies only to the units within its range. A customer making 15,000 calls in a cycle pays:

Tier 1: 10,000 calls x 1000mc = 10,000,000mc (10,000 credits)
Tier 2:  5,000 calls x  500mc =  2,500,000mc  (2,500 credits)
Total:                           12,500,000mc (12,500 credits)

The tiered computation happens inside QuotaStack’s metering rule engine. Your app just posts units: 1 per API call.
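To make the graduated arithmetic concrete, here is a minimal sketch that reproduces the worked example from the tiers in the metering rule. It only illustrates the pricing math; it is not QuotaStack's engine, and the function name and tier representation are assumptions made for this example.

```python
from typing import Optional

# Tiers copied from the metering rule above: (up_to, credit_cost in mc per call).
TIERS: list[tuple[Optional[int], int]] = [
    (10_000, 1000),
    (100_000, 500),
    (None, 100),
]

def graduated_cost_mc(units: int, tiers=TIERS) -> int:
    """Total cost in mc for `units` calls in one cycle, graduated mode."""
    total = 0
    lower = 0  # cumulative units already priced by earlier tiers
    for up_to, credit_cost in tiers:
        if units <= lower:
            break
        # Price only the units that fall inside this tier's range.
        upper = units if up_to is None else min(units, up_to)
        total += (upper - lower) * credit_cost
        lower = upper
    return total

print(graduated_cost_mc(15_000))  # the 15,000-call example above
```

Because each tier prices only the units inside its own range, crossing a tier boundary never re-prices earlier calls.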

Plan catalog

Plan and variant

POST /v1/plans
Idempotency-Key: plan:api-platform

{ "name": "API Platform", "description": "Usage-based API access" }

POST /v1/plans/{plan_id}/variants
Idempotency-Key: variant:api-monthly-postpaid

{
  "name": "Monthly Postpaid",
  "billing_cycle": "monthly",
  "billing_mode": "postpaid",
  "allow_usage_while_overdue": true
}

Key difference from prepaid: billing_mode: "postpaid". QuotaStack auto-advances the cycle at current_period_end without waiting for a renew call. No renewal_due_days or grace_period_days — those are prepaid concepts.

Credit grant template (budget)

POST /v1/plans/{plan_id}/variants/{variant_id}/credit-grants
Idempotency-Key: grant:api-monthly-budget

{
  "credits": 100000000,
  "grant_interval": "billing_cycle",
  "grant_type": "recurring",
  "source": "plan_grant",
  "expires_after_seconds": 2592000,
  "rollover_percentage": 0,
  "accumulation_cap": null
}

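100,000,000mc = 100,000 credits per cycle. This is the budget ceiling, not a prepayment. At 1 credit per API call (tier 1 rate), this covers at least 100,000 calls before overage kicks in; with the graduated tiers above, the first 100,000 calls consume only 55,000 credits, so the budget stretches further.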
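When sizing a budget, it can help to ask how many calls a given grant actually covers once cheaper tiers kick in. The sketch below answers that for graduated tiers. The function name is hypothetical, the tiers are copied from the metering rule earlier on this page, and it assumes every tier has a positive cost.

```python
# (up_to, credit_cost in mc per call), as in the metering rule.
TIERS = [(10_000, 1000), (100_000, 500), (None, 100)]

def calls_covered(budget_mc: int, tiers=TIERS) -> int:
    """Largest number of calls whose graduated cost fits within budget_mc."""
    covered = 0
    remaining = budget_mc
    lower = 0  # lower bound of the current tier
    for up_to, credit_cost in tiers:
        if up_to is None:
            # Unbounded last tier: spend whatever is left at this rate.
            covered += remaining // credit_cost
            break
        capacity = up_to - lower
        affordable = min(capacity, remaining // credit_cost)
        covered += affordable
        remaining -= affordable * credit_cost
        if affordable < capacity:
            break  # budget ran out inside this tier
        lower = up_to
    return covered

print(calls_covered(100_000_000))  # calls covered by the 100,000-credit budget
```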

For customers expected to exceed the budget, set overage_policy: "allow" on the customer so the balance can go negative.

Integration flow

1. Customer subscribes

After contract signing and payment setup:

POST /v1/customers
Idempotency-Key: signup:<customer_id>

{
  "external_id": "acme_corp",
  "overage_policy": "allow"
}

POST /v1/subscriptions
Idempotency-Key: sub:<contract_id>

{
  "customer_id": "cus_...",
  "plan_variant_id": "pvr_api_monthly_postpaid",
  "contract_end": "2027-04-13T00:00:00Z",
  "contract_ending_soon_days": 30,
  "external_subscription_id": "your_contract_ref_123",
  "metadata": {
    "account_manager": "jane@yourcompany.com",
    "annual_contract_value": "120000"
  }
}

QuotaStack:

  1. Creates the subscription with status active.
  2. Sets the billing cycle: current_period_start = now, current_period_end = now + 1 month.
  3. Sets the contract window: contract_start = now, contract_end = 2027-04-13.
  4. Fires the credit grant template: creates a 100,000-credit budget block.
  5. Fires subscription.created webhook.

2. Recording API usage

Each time a customer makes an API call, record usage:

POST /v1/usage
Idempotency-Key: usage:<request_id>

{
  "customer_id": "cus_...",
  "billable_metric_key": "api_call",
  "units": 1,
  "metadata": {
    "endpoint": "/v1/generate",
    "request_id": "req_abc123",
    "response_status": 200
  }
}

Response: 202 Accepted. The usage event is enqueued and processed asynchronously. Your API responds to the customer without waiting for the credit debit to land.
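As a rough sketch, the request above can be assembled by a small helper so that the idempotency key is always derived from your request ID. The helper name is hypothetical and the Content-Type header is an assumption; the base URL and authentication are left out because this page does not specify them.

```python
import json

def build_usage_request(customer_id: str, request_id: str, endpoint: str, status: int):
    """Return (method, path, headers, body) for one usage event."""
    headers = {
        "Content-Type": "application/json",  # assumed, not stated on this page
        # Deterministic key: retries of the same request reuse it.
        "Idempotency-Key": f"usage:{request_id}",
    }
    body = json.dumps({
        "customer_id": customer_id,
        "billable_metric_key": "api_call",
        "units": 1,
        "metadata": {
            "endpoint": endpoint,
            "request_id": request_id,
            "response_status": status,
        },
    })
    return "POST", "/v1/usage", headers, body

method, path, headers, body = build_usage_request(
    "cus_123", "req_abc123", "/v1/generate", 200
)
print(method, path, headers["Idempotency-Key"])
```

Hand the result to whatever HTTP client you already use, and send it after your own API response is on its way so usage recording never adds latency for the customer.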

Batch usage for high-throughput APIs:

POST /v1/usage/batch
Idempotency-Key: batch:<batch_id>

{
  "events": [
    {
      "customer_id": "cus_...",
      "billable_metric_key": "api_call",
      "units": 1,
      "idempotency_key": "usage:req_001",
      "metadata": { "endpoint": "/v1/generate" }
    },
    {
      "customer_id": "cus_...",
      "billable_metric_key": "api_call",
      "units": 1,
      "idempotency_key": "usage:req_002",
      "metadata": { "endpoint": "/v1/analyze" }
    }
  ]
}

Up to 100 events per batch. Each event has its own idempotency key for deduplication.
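If you accumulate more than 100 events before sending, split them into chunks that respect the limit. A minimal sketch, with a hypothetical helper name:

```python
def chunk_events(events: list, max_batch: int = 100) -> list:
    """Split events into consecutive batches of at most max_batch items."""
    return [events[i:i + max_batch] for i in range(0, len(events), max_batch)]

events = [{"idempotency_key": f"usage:req_{i:03d}"} for i in range(250)]
for batch in chunk_events(events):
    print(len(batch))  # each batch would go to POST /v1/usage/batch
```

Since every event carries its own idempotency key, re-sending a whole chunk after a failed request is safe.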

3. Cycle end — usage summary and invoicing

At current_period_end, QuotaStack automatically advances the subscription and fires subscription.renewed with a usage summary:

{
  "event": "subscription.renewed",
  "subscription_id": "sub_...",
  "customer_id": "cus_...",
  "billing_mode": "postpaid",
  "prior_period": {
    "start": "2026-04-13T00:00:00Z",
    "end": "2026-05-13T00:00:00Z"
  },
  "new_period": {
    "start": "2026-05-13T00:00:00Z",
    "end": "2026-06-13T00:00:00Z"
  },
  "credits_granted": 100000000,
  "usage_summary": {
    "total_credits_consumed": 12500000,
    "net_balance_at_cycle_start": 100000000,
    "net_balance_at_cycle_end": 87500000,
    "by_billable_metric": {
      "api_call": {
        "units": 15000,
        "credits": 12500000
      }
    }
  }
}
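Before invoicing from a payload like this, a cheap consistency check can catch surprises early. The sketch below assumes two relationships that hold in the example payload but are not stated as guarantees here: the per-metric credits add up to total_credits_consumed, and the end balance equals the start balance minus consumption (which would not hold if extra grants or adjustments landed mid-cycle). The function name is hypothetical.

```python
def check_usage_summary(summary: dict) -> list:
    """Return a list of human-readable inconsistencies (empty if none)."""
    problems = []
    per_metric = sum(m["credits"] for m in summary["by_billable_metric"].values())
    if per_metric != summary["total_credits_consumed"]:
        problems.append("per-metric credits do not sum to total_credits_consumed")
    expected_end = (
        summary["net_balance_at_cycle_start"] - summary["total_credits_consumed"]
    )
    if expected_end != summary["net_balance_at_cycle_end"]:
        # May be legitimate if grants or adjustments happened mid-cycle.
        problems.append("end balance differs from start minus consumption")
    return problems

example = {
    "total_credits_consumed": 12500000,
    "net_balance_at_cycle_start": 100000000,
    "net_balance_at_cycle_end": 87500000,
    "by_billable_metric": {"api_call": {"units": 15000, "credits": 12500000}},
}
print(check_usage_summary(example))
```

Treat a non-empty result as a prompt for manual review rather than a hard failure.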

Your billing system reads this webhook and computes the invoice:

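import stripe

def handle_subscription_renewed(event):
    # `event` is the parsed webhook JSON (a dict).
    if event["billing_mode"] != "postpaid":
        return

    summary = event["usage_summary"]

    # Map credit consumption back to fiat using your pricing
    api_calls = summary["by_billable_metric"].get("api_call", {}).get("units", 0)

    # Your tiered fiat pricing (separate from QuotaStack's credit tiers).
    # Computed in hundredths of a cent so the arithmetic stays in integers.
    if api_calls <= 10_000:
        amount = api_calls * 100  # $0.01 per call
    elif api_calls <= 100_000:
        amount = 10_000 * 100 + (api_calls - 10_000) * 50  # then $0.005 per call
    else:
        amount = 10_000 * 100 + 90_000 * 50 + (api_calls - 100_000) * 10  # then $0.001
    invoice_cents = (amount + 50) // 100  # round to the nearest cent

    # Create the invoice via your PSP. Stripe takes amounts on invoice items,
    # not on the invoice itself. lookup_stripe_customer_id is your own mapping
    # from the QuotaStack customer ID to a Stripe customer ID.
    stripe_customer = lookup_stripe_customer_id(event["customer_id"])
    stripe.InvoiceItem.create(
        customer=stripe_customer,
        amount=invoice_cents,
        currency="usd",
        description=f"API usage: {api_calls} calls",
    )
    stripe.Invoice.create(
        customer=stripe_customer,
        pending_invoice_items_behavior="include",
    )

    # Optionally: zero the ledger for the prior period
    # (only if the balance went negative due to overage).
    # `quotastack` stands for your QuotaStack API client.
    end_balance = summary["net_balance_at_cycle_end"]
    if end_balance < 0:
        quotastack.adjust_credits(
            customer_id=event["customer_id"],
            delta=-end_balance,  # positive adjustment
            idempotency_key=f"settle:{event['subscription_id']}:{event['prior_period']['end']}",
            metadata={"source": "postpaid_settlement"},
        )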

The new cycle’s budget is granted automatically. The customer keeps using the API without interruption.

4. Contract lifecycle

30 days before contract end: QuotaStack fires subscription.contract_ending_soon:

{
  "event": "subscription.contract_ending_soon",
  "subscription_id": "sub_...",
  "customer_id": "cus_...",
  "contract_end": "2027-04-13T00:00:00Z",
  "days_remaining": 30
}

Your account management team uses this to start renewal negotiations.
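If you want to show the same countdown in your own tooling, the remaining days can be derived from contract_end. This is a sketch for your side of the integration, not a description of how QuotaStack schedules the event; the helper names are hypothetical and `now` must be timezone-aware.

```python
from datetime import datetime, timezone

def days_remaining(contract_end: str, now: datetime) -> int:
    """Whole days from `now` until `contract_end` (ISO 8601 with a 'Z' suffix)."""
    end = datetime.fromisoformat(contract_end.replace("Z", "+00:00"))
    return (end - now).days

def in_ending_soon_window(contract_end: str, now: datetime, window_days: int = 30) -> bool:
    """True while the contract is still running and within the alert window."""
    remaining = days_remaining(contract_end, now)
    return 0 <= remaining <= window_days

now = datetime(2027, 3, 14, tzinfo=timezone.utc)
print(days_remaining("2027-04-13T00:00:00Z", now))
print(in_ending_soon_window("2027-04-13T00:00:00Z", now))
```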

If the customer renews: Extend the contract:

POST /v1/subscriptions/sub_.../extend
Idempotency-Key: extend:<amendment_id>

{
  "contract_end": "2028-04-13T00:00:00Z",
  "reason": "annual renewal signed",
  "metadata": {
    "amendment_id": "AMD-2027-001",
    "new_annual_value": "144000"
  }
}

The subscription continues auto-advancing at each cycle boundary.

If the customer does not renew: At contract_end, QuotaStack transitions to contract_ended:

{
  "event": "subscription.contract_ended",
  "subscription_id": "sub_...",
  "customer_id": "cus_...",
  "contract_end": "2027-04-13T00:00:00Z"
}

No more auto-renewals. No more credit grants. Generate the final usage summary and invoice.

Re-activating a lapsed contract: If the customer comes back, extend the old subscription:

POST /v1/subscriptions/sub_.../extend
Idempotency-Key: extend:<new_amendment_id>

{
  "contract_end": "2028-04-13T00:00:00Z",
  "reason": "customer returned after lapse"
}

Extend works on contract_ended subscriptions — it flips them back to active and resumes auto-advancing.

Usage event pipeline

For high-throughput API platforms, the usage pipeline must be fast and resilient.

Async ingestion

POST /v1/usage returns 202 Accepted immediately. The event is queued and processed asynchronously, so your API response time stays unaffected by credit-debit latency.

In the background, QuotaStack:

  1. Looks up the metering rule for api_call.
  2. Computes the credit cost (applying tiered pricing if applicable).
  3. Debits the customer’s credit blocks in burn-down order.
  4. Writes a credit ledger entry.
  5. Optionally fires a credit.consumed webhook.

Idempotency

Every usage event must have a unique idempotency key. Use your internal request ID:

Idempotency-Key: usage:<your_request_id>

If the same event is posted twice (network retry, queue replay), the second is a no-op. This gives at-least-once delivery semantics with exactly-once processing.
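The effect of the idempotency key can be pictured with a small in-memory model. This is only an illustration of the semantics described above, not QuotaStack's implementation; the class and attribute names are made up for the example.

```python
class DedupingLedger:
    """Toy model: each idempotency key is applied at most once."""

    def __init__(self):
        self.seen_keys = set()
        self.total_units = 0

    def record(self, idempotency_key: str, units: int) -> bool:
        """Apply the event and return True, or return False for a duplicate."""
        if idempotency_key in self.seen_keys:
            return False  # retry or replay: nothing is debited again
        self.seen_keys.add(idempotency_key)
        self.total_units += units
        return True

ledger = DedupingLedger()
ledger.record("usage:req_001", 1)
ledger.record("usage:req_001", 1)  # network retry of the same request
ledger.record("usage:req_002", 1)
print(ledger.total_units)
```

The key must come from something stable such as your request ID. A key generated fresh on every attempt would defeat the deduplication.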

Batching

For APIs handling thousands of requests per second, batch usage events rather than posting one at a time. Collect events in a local buffer (e.g., 100 events or 5 seconds, whichever comes first) and send them via POST /v1/usage/batch.

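# Sketch: usage event buffer. `quotastack` stands for your QuotaStack API client.
import threading
import time
import uuid

class UsageBuffer:
    def __init__(self, max_size=100, flush_interval_seconds=5):
        self.buffer = []
        self.max_size = max_size
        self.flush_interval = flush_interval_seconds
        self.last_flush = time.monotonic()
        self.lock = threading.Lock()

    def add(self, customer_id, metric_key, units, request_id, metadata):
        with self.lock:
            self.buffer.append({
                "customer_id": customer_id,
                "billable_metric_key": metric_key,
                "units": units,
                "idempotency_key": f"usage:{request_id}",
                "metadata": metadata,
            })
            full = len(self.buffer) >= self.max_size
            stale = time.monotonic() - self.last_flush >= self.flush_interval
        # The interval is only checked here; call flush() from a timer as well
        # so an idle buffer still drains.
        if full or stale:
            self.flush()

    def flush(self):
        # Take at most max_size events under the lock so concurrent add()
        # calls are not lost and the batch limit is respected.
        with self.lock:
            events = self.buffer[:self.max_size]
            self.buffer = self.buffer[self.max_size:]
            self.last_flush = time.monotonic()
        if not events:
            return
        try:
            quotastack.post_usage_batch(
                events=events,
                idempotency_key=f"batch:{uuid.uuid4()}",
            )
        except Exception:
            # Re-queue for the next flush; per-event keys make the retry safe.
            with self.lock:
                self.buffer = events + self.buffer
            raise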

Entitlement gating for API calls

For prepaid-like behavior where you want to reject calls when the budget is exhausted, add an entitlement check to your API gateway:

# API gateway middleware
def check_api_entitlement(customer_id):
    ent = quotastack.get_entitlement(customer_id, "api_call", units=1)
    if not ent.allowed:
        return http_response(
            status=429,
            body={
                "error": "usage_limit_exceeded",
                "message": "Monthly API call budget exhausted",
                "balance": ent.balance,
                "affordable_units": ent.affordable_units
            }
        )
    return None  # proceed

The entitlement check responds in sub-1ms on the fast path — safe to put on the hot path. See Entitlements: latency and freshness for the full semantics.

For postpaid customers with overage_policy: "allow", skip the entitlement check or use it only for reporting — the balance can go negative and the customer will be invoiced.

Handling overdue invoices

When a postpaid customer has an unpaid invoice, you may want to throttle or block their API access. QuotaStack does not enforce this automatically — it is your business decision.

Option 1: Set overage_policy to "block" when the invoice is overdue:

PATCH /v1/customers/cus_...
Idempotency-Key: block:<invoice_id>

{
  "overage_policy": "block"
}

Now the entitlement check returns allowed: false when the balance reaches 0. Restore to "allow" when the invoice is paid.

Option 2: Use allow_usage_while_overdue: false on the plan variant. This blocks usage when the subscription enters overdue status. But postpaid subscriptions do not have an overdue state (that is prepaid-only), so this option only applies if you also use prepaid elements.

Option 3: Handle it entirely in your API gateway. Check your own invoice status and reject requests independently of QuotaStack. This is the most common approach for API platforms.
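For Option 3, the gateway decision can be as small as the sketch below. It is entirely on your side of the integration: the function name, the invoice fields, and the 7-day grace period are assumptions chosen for illustration, with the grace period there so a delayed ACH payment does not cut off a paying customer.

```python
from datetime import datetime, timedelta, timezone

def should_block(invoice_due_at: datetime, paid: bool, now: datetime,
                 grace_days: int = 7) -> bool:
    """Block API access only once an unpaid invoice is past due plus grace."""
    if paid:
        return False
    return now > invoice_due_at + timedelta(days=grace_days)

due = datetime(2026, 6, 1, tzinfo=timezone.utc)
print(should_block(due, paid=False, now=datetime(2026, 6, 5, tzinfo=timezone.utc)))
print(should_block(due, paid=False, now=datetime(2026, 6, 10, tzinfo=timezone.utc)))
```

Usage recording can continue unchanged while requests are blocked, since this check never touches QuotaStack.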

Example: full cycle with tiered usage

Day 1:   Subscription created (postpaid, monthly, annual contract).
         100,000-credit budget granted.

Day 1-30: Customer makes 85,000 API calls.
          Tier 1: 10,000 calls x 1 credit   = 10,000 credits
          Tier 2: 75,000 calls x 0.5 credits = 37,500 credits
          Total consumed: 47,500 credits.
          Balance: 100,000 - 47,500 = 52,500 credits.

Day 30:  Cycle auto-advances.
         subscription.renewed webhook fires with usage_summary:
           total_credits_consumed: 47,500,000 mc (47,500 credits)
           by_billable_metric:
             api_call: { units: 85,000, credits: 47,500,000 }
         
         Your billing system computes:
           85,000 calls at your fiat pricing = $475
           (10,000 x $0.01 + 75,000 x $0.005)
           Generates Stripe invoice.

         New 100,000-credit budget granted for next cycle.
         Customer keeps calling the API without interruption.

Month 11: subscription.contract_ending_soon fires.
          Sales contacts customer for renewal.

Month 12: Customer signs renewal.
          POST /v1/subscriptions/{id}/extend
          Contract extends to year 2.

Tips

  • Postpaid auto-advances. Unlike prepaid where you must call /renew, postpaid subscriptions advance automatically at current_period_end. You do not need to call anything. The subscription.renewed webhook delivers the usage summary for invoicing.

  • Usage summary is your invoice source. Do not re-query the credit ledger to compute invoices. The usage_summary in the subscription.renewed webhook payload has everything: total credits consumed, per-metric breakdown. Map credits back to fiat using your own pricing table.

  • Credit tiers are not fiat tiers. QuotaStack’s tiered metering rules define credit costs per unit. Your fiat pricing tiers may differ. A common pattern: set QuotaStack’s credit cost to 1000mc per unit at all tiers (flat), and apply tiered fiat pricing in your invoicing code. Alternatively, mirror your fiat tiers in credit costs so the usage summary directly reflects tiered consumption.

  • Overage with negative balance. When overage_policy: "allow", the balance can go negative. This is normal for postpaid. The negative balance represents unbilled usage. At cycle end, settle the ledger by posting a positive adjustment after invoicing.

  • Contract lifecycle is separate from billing cycle. A 12-month contract with monthly billing has 12 cycles. Each cycle auto-advances. The contract boundary only matters for contract_ending_soon and contract_ended events. Billing continues as normal within the contract.

  • Idempotency is critical at scale. At thousands of requests per second, network retries happen. Every usage event must carry a unique, deterministic idempotency key (your request ID is ideal). Duplicate events are silently dropped. This gives you exactly-once credit accounting with at-least-once delivery.

See also: Subscriptions, Billing Modes, Metering Rules, Usage Events.

