Now in production

Usage-based billing API for AI products

Credit-native metering, real-time entitlement checks, and full subscription lifecycle management. Ship in a day, not a quarter.

6 API calls to integrate
<1ms entitlement checks
Zero payment lock-in
Sandbox included
AI Chatbots · API Platforms · LLM Wrappers · SaaS Add-ons · Dev Tools

The problem

Building usage billing from scratch is a trap

  • You build a credit ledger, then realize you need atomic transactions
  • You add metering, then need idempotency to prevent double-charges
  • You want to ask "can this user do X?" but answering it takes a 5-table JOIN
  • You need reservations for async AI tasks, but Stripe doesn't offer them
  • Weeks turn into months. You're building billing, not your product

The solution

QuotaStack handles it in 6 API calls

  • Credit ledger with immutable transactions, block burn-down, and expiry
  • Usage metering with idempotency baked in — no double-counting
  • Real-time entitlement checks in <1ms — one GET request
  • Credit reservations for long-running AI inference and batch jobs
  • Full subscription lifecycle — upgrade, downgrade, pause, cancel, renew

How it works

Three steps to usage billing

Set up plans, check entitlements, record usage. That's it.

1. Subscribe users to plans

POST /v1/subscriptions
{
  "end_user_id": "user_123",
  "plan_variant_id": "var_..."
}

2. Check entitlements in real time

GET /v1/entitlements/check
    ?end_user_id=user_123
    &feature_key=api_calls

{ "allowed": true, "balance": 49000 }

3. Record usage, credits deducted

POST /v1/usage
{
  "end_user_id": "user_123",
  "feature_key": "api_calls",
  "units": 1
}
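
The three calls above can be sketched as small request builders. This is a minimal illustration, not an official client: the base URL is a placeholder, the function names are made up for this example, and authentication headers are left out because the page does not describe them.

```typescript
// Hypothetical base URL; the real host is not stated on this page.
const BASE_URL = "https://api.example.com";

interface ApiRequest {
  method: "GET" | "POST";
  url: string;
  body?: string;
}

// Step 1: subscribe a user to a plan variant.
function subscribeRequest(endUserId: string, planVariantId: string): ApiRequest {
  return {
    method: "POST",
    url: `${BASE_URL}/v1/subscriptions`,
    body: JSON.stringify({ end_user_id: endUserId, plan_variant_id: planVariantId }),
  };
}

// Step 2: ask whether the user may use a feature right now.
function checkRequest(endUserId: string, featureKey: string): ApiRequest {
  const query =
    `end_user_id=${encodeURIComponent(endUserId)}` +
    `&feature_key=${encodeURIComponent(featureKey)}`;
  return { method: "GET", url: `${BASE_URL}/v1/entitlements/check?${query}` };
}

// Step 3: record consumed units so credits are deducted.
function usageRequest(endUserId: string, featureKey: string, units: number): ApiRequest {
  return {
    method: "POST",
    url: `${BASE_URL}/v1/usage`,
    body: JSON.stringify({ end_user_id: endUserId, feature_key: featureKey, units }),
  };
}
```

Each builder returns a plain description of the request, so it can be handed to whatever HTTP client your stack already uses.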

Features

Everything you need for credit-based billing

Built for AI, API, and SaaS products. Nothing you don't need.

Real-time entitlements

"Can this user do this?" answered in <1ms. Results are cached, and the cache is invalidated on every credit mutation. One API call replaces complex authorization logic.

Credit ledger

Immutable transaction history. Block-based burn-down with priority, expiry, and accumulation caps. Every credit accounted for.
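
To make block-based burn-down concrete, here is a small in-memory sketch. It assumes that a lower priority number burns first, that ties go to the block expiring soonest, and that a debit either succeeds in full or not at all. These rules and all names are illustrative assumptions, not QuotaStack's documented behavior.

```typescript
interface CreditBlock {
  id: string;
  remaining: number;
  priority: number;  // assumption: lower number is consumed first
  expiresAt: number; // epoch milliseconds
}

interface Debit {
  blockId: string;
  amount: number;
}

// Plans which blocks to draw from, without mutating the input blocks.
// Returns null when unexpired credits cannot cover the full amount.
function burnDown(blocks: CreditBlock[], amount: number, now: number): Debit[] | null {
  const usable = blocks
    .filter((b) => b.expiresAt > now && b.remaining > 0)
    .sort((a, b) => a.priority - b.priority || a.expiresAt - b.expiresAt);

  const total = usable.reduce((sum, b) => sum + b.remaining, 0);
  if (total < amount) return null;

  const debits: Debit[] = [];
  let left = amount;
  for (const block of usable) {
    if (left === 0) break;
    const take = Math.min(block.remaining, left);
    debits.push({ blockId: block.id, amount: take });
    left -= take;
  }
  return debits;
}
```

Returning a list of debits instead of editing balances in place mirrors the idea of an immutable ledger: each debit can be appended as its own transaction.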

Reservations

Reserve credits before long-running tasks like AI inference. Commit actual cost after. Unused credits released automatically.
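
The reserve, commit, release cycle can be pictured with a toy in-memory ledger. This is only a sketch of the idea: the class and method names are hypothetical, and capping the committed cost at the reserved amount is an assumption made for the example.

```typescript
class ReservationLedger {
  private balance: number;
  private held = new Map<string, number>();

  constructor(initialBalance: number) {
    this.balance = initialBalance;
  }

  available(): number {
    return this.balance;
  }

  // Hold an estimated cost before the task starts.
  reserve(id: string, estimate: number): boolean {
    if (this.held.has(id) || estimate > this.balance) return false;
    this.balance -= estimate;
    this.held.set(id, estimate);
    return true;
  }

  // Charge the actual cost and return the unused part of the hold.
  commit(id: string, actual: number): void {
    const hold = this.held.get(id);
    if (hold === undefined) throw new Error(`unknown reservation ${id}`);
    this.held.delete(id);
    this.balance += hold - Math.min(actual, hold);
  }

  // Task failed or was cancelled: return the whole hold.
  release(id: string): void {
    const hold = this.held.get(id) ?? 0;
    this.held.delete(id);
    this.balance += hold;
  }
}
```

With a starting balance of 100, reserving 40 and then committing 25 leaves 75 available, because the 15 unused credits are returned.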

Subscription lifecycle

Create, upgrade, downgrade, pause, resume, cancel, renew. Full state machine with automatic free-plan fallback.

Multi-tenant isolation

Row-level security in PostgreSQL. Sandbox and live environments on the same instance. Scoped API keys per tenant.

Idempotent everything

Every mutation accepts an idempotency key. Safe retries, no double-charging. Webhook delivery with exponential backoff retry.
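
Why an idempotency key makes retries safe can be shown with a few lines of in-memory code. This is a conceptual sketch with hypothetical names, not the service's implementation: a repeated key replays the stored result instead of deducting again.

```typescript
class UsageRecorder {
  private balance: number;
  // Maps each idempotency key to the balance returned by its first call.
  private seen = new Map<string, number>();

  constructor(initialBalance: number) {
    this.balance = initialBalance;
  }

  // Deducts units once per key; a retry with the same key is a no-op replay.
  record(idempotencyKey: string, units: number): number {
    const previous = this.seen.get(idempotencyKey);
    if (previous !== undefined) return previous;
    this.balance -= units;
    this.seen.set(idempotencyKey, this.balance);
    return this.balance;
  }
}
```

A client that times out and resends the same request gets the same answer back, and the user is charged once.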

Developer experience

Integrate in minutes, not months

Your entire billing integration in ~15 lines of code.

billing.ts
// Assumes `quotastack` is an initialized client and
// `generateResponse` is your own model call.
async function handleRequest(prompt: string, requestId: string) {
  // Check entitlement before each request
  const { allowed, balance } = await quotastack
    .checkEntitlement("user_123", "ai_request")

  if (!allowed) {
    return { error: "upgrade_required" }
  }

  // Process the request...
  const result = await generateResponse(prompt)

  // Record usage (credits are deducted asynchronously)
  await quotastack.recordUsage("user_123", {
    feature_key: "ai_request",
    units: 1,
    idempotency_key: requestId
  })

  return { result }
}

Compare

How QuotaStack compares

Stop rebuilding billing infrastructure. Focus on your product.

Feature                     | QuotaStack | Build Yourself | Stripe Billing | Orb / Amberflo
Time to integrate           | 1 day      | 2-4 weeks      | 1-2 weeks      | 1 week
Credit-native ledger        |            | Build it       |                |
Real-time entitlements      |            | Build it       |                |
Reservations (hold credits) |            | Build it       |                |
Usage metering              |            | Build it       |                |
Plan management             |            | Build it       |                |
Payment processing          | BYO        | BYO            |                | BYO
Open API / no lock-in       |            |                |                |
Pricing                     | Simple     | Eng time       | 0.7% rev       | Enterprise

Use cases

Built for how AI products actually work

From free-tier gating to complex reservation flows.

AI chatbots & companions

Give users free messages, gate with a paywall, sell credit packs. Track per-character or per-conversation usage.

Free tier gating · Credit packs · Per-message metering

API platforms

Rate-limit API calls by credit balance. Reserve credits before expensive operations, commit actual cost after.

Reservations · Tiered pricing · Overage handling

SaaS with usage add-ons

Base subscription with metered features. Monthly credit grants with rollover caps, top-up packages for burst usage.

Subscription + usage · Accumulation caps · Top-ups
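
One plausible reading of a monthly grant with a rollover cap is simple arithmetic: unused credits carry over, but the balance after each grant never exceeds the cap. The function below is a sketch under that assumption, with a hypothetical name.

```typescript
// Applies a monthly grant to the carried-over balance, clamped at the cap.
function applyMonthlyGrant(carriedOver: number, grant: number, cap: number): number {
  return Math.min(carriedOver + grant, cap);
}
```

With a 1,000-credit grant and a 1,200 cap, a user carrying over 300 credits starts the month at 1,200 rather than 1,300, while a user carrying over 100 starts at 1,100.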

FAQ

Frequently asked questions

Does QuotaStack process payments?
No. QuotaStack handles credits, entitlements, and usage metering. You use your own payment provider (Stripe, DodoPayments, etc.) and call our API after payment confirmation. This means zero payment lock-in.
What does "credit-native" mean?
Credits are the core primitive, not an afterthought on top of subscriptions. Every operation — grants, debits, reservations, entitlements — runs through the credit ledger. This gives you a single source of truth for usage and billing.
How fast are entitlement checks?
Sub-millisecond with Redis caching. The cache is automatically invalidated on every credit mutation, so entitlement results are always fresh. Safe to call on every API request or user action.
Can I use QuotaStack for free-tier gating?
Yes. Create a free plan with a credit grant (e.g., 50 API calls). When credits run out, the entitlement check returns allowed: false. Show your paywall and upgrade the user when they pay.
What happens if QuotaStack is down?
You decide. Your integration can fail open (allow the request) or fail closed (block it). For most products, fail open with async reconciliation is the right choice. We target 99.9% uptime.
How is sandbox different from live?
Same instance, different API key. Sandbox keys (qs_test_) access isolated data from live keys (qs_live_). Develop and test without affecting production data.
What about reservations?
Reservations let you hold credits before a long-running task (like AI inference). You reserve an estimated cost upfront, then commit the actual cost when done. If the task fails, credits are released automatically. No other billing platform offers this.
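
The fail-open versus fail-closed choice from the outage question above can live in one small wrapper on your side. This is a sketch with hypothetical names: `check` stands in for whatever call you use to ask for an entitlement, and any thrown error is treated as "service unavailable".

```typescript
type EntitlementCheck = () => Promise<boolean>;

// Runs the check and falls back to the chosen policy if it throws.
// failOpen = true allows the request; failOpen = false blocks it.
async function checkWithFallback(check: EntitlementCheck, failOpen: boolean): Promise<boolean> {
  try {
    return await check();
  } catch {
    return failOpen;
  }
}
```

In practice you would probably also put a timeout around the check so a slow response is handled the same way as an error, and log fail-open decisions for later reconciliation.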

Get your API keys

Tell us about your use case. We'll set up your tenant with sandbox + live keys within 24 hours.