
Reservations

How to hold credits during long-running operations using the reserve/commit/release lifecycle, with TTL-based auto-expiry and concurrency safety.

Mental Model

Think of reservations like putting items in a hotel safe. The credits are held, not spent. When you're done: commit (you really used them), release (you didn't), or let the TTL auto-release so nothing leaks.

Quick Take
  • Reserve credits before a long-running operation starts
  • Three outcomes: commit (consume), release (return), or TTL auto-expiry
  • Prevents concurrent requests from overspending the same balance
  • Partial commits return the difference automatically
[Diagram: Reserve leads to the Held state (credits held). From Held: success → Commit (credits consumed), cancel → Release (credits returned), timeout → TTL Expiry (auto-released).]

Some operations take time. An AI image generation might run for 30 seconds. A batch processing job might run for 10 minutes. During that time, you need to guarantee that the customer’s credits are not spent by other concurrent requests.

Reservations solve this by holding credits against the customer’s effective balance for the duration of the operation. When the operation completes, you either commit (finalize the charge) or release (return the credits).

The lifecycle

The diagram above shows how a reservation flows from Reserve into the held state, then out through one of three terminal transitions. Each state in detail:

| State | Description |
| --- | --- |
| active | Credits are held. The operation is in progress. |
| committed | The operation succeeded. Credits were debited based on actual usage. |
| released | The operation was cancelled. All held credits were returned. |
| expired | The reservation’s TTL elapsed without a commit or release. Credits are returned automatically. |
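
The table can be read as a small state machine in which active is the only non-terminal state. The following is a minimal Python sketch of that reading; the `Status` enum, the `ACTIONS` map, and the `transition` helper are illustrative names, not part of the QuotaStack API.

```python
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    COMMITTED = "committed"
    RELEASED = "released"
    EXPIRED = "expired"


# Hypothetical mapping from each lifecycle action to the terminal state it produces.
ACTIONS = {
    "commit": Status.COMMITTED,
    "release": Status.RELEASED,
    "expire": Status.EXPIRED,
}


def transition(current: Status, action: str) -> Status:
    """Return the next state, or raise if the reservation is already terminal."""
    if current is not Status.ACTIVE:
        # Mirrors the 409 "Reservation not active" error listed under Error states.
        raise ValueError(f"reservation is {current.value}, cannot {action}")
    return ACTIONS[action]
```

Every transition starts from active, so a second commit or release on the same reservation is rejected rather than applied twice.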

Reserve

Hold credits before starting a long-running operation.

POST /v1/reserve
curl -X POST https://api.quotastack.io/v1/reserve \
  -H "X-API-Key: qs_live_..." \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: reserve-gen-xyz" \
  -d '{
    "external_customer_id": "user_abc",
    "billable_metric_key": "look",
    "estimated_units": 1,
    "ttl_seconds": 120,
    "metadata": {
      "outfit_id": "outfit_456"
    }
  }'
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| external_customer_id | string | yes* | Your tenant ID for the customer. See Customer identification. |
| customer_id | string | yes* | Alternative: QuotaStack UUID. Exactly one of the two is required. |
| billable_metric_key | string | yes | The metric being consumed. |
| estimated_units | int64 | yes | How many units you estimate the operation will consume. Must be positive. |
| ttl_seconds | int | no | How long the reservation lives. Default: 1800 (30 minutes). Max: 86400 (24 hours). |
| metadata | object | no | Arbitrary key-value pairs. |

The estimated cost is computed from the active metering rule for the given metric key, just like an entitlement check.
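
The field rules in the table can also be checked on the client before the request is sent. The sketch below is one way to do that in Python; `build_reserve_payload` is a hypothetical helper, and the server remains the source of truth for validation.

```python
from typing import Optional


def build_reserve_payload(
    billable_metric_key: str,
    estimated_units: int,
    external_customer_id: Optional[str] = None,
    customer_id: Optional[str] = None,
    ttl_seconds: Optional[int] = None,
    metadata: Optional[dict] = None,
) -> dict:
    """Build a POST /v1/reserve body, enforcing the documented field rules."""
    # Exactly one of the two customer identifiers is required.
    if (external_customer_id is None) == (customer_id is None):
        raise ValueError("provide exactly one of external_customer_id or customer_id")
    if estimated_units <= 0:
        raise ValueError("estimated_units must be positive")

    payload = {
        "billable_metric_key": billable_metric_key,
        "estimated_units": estimated_units,
    }
    if external_customer_id is not None:
        payload["external_customer_id"] = external_customer_id
    else:
        payload["customer_id"] = customer_id
    # Optional fields are omitted so the server applies its own defaults.
    if ttl_seconds is not None:
        payload["ttl_seconds"] = ttl_seconds
    if metadata is not None:
        payload["metadata"] = metadata
    return payload
```

Leaving `ttl_seconds` out of the payload, rather than sending a client-side default, keeps the documented 1800-second default under the server's control.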

Response

{
  "id": "rsv_01HXY...",
  "tenant_id": "t_01...",
  "customer_id": "019d6258-07ba-7418-83be-58f5fde53e4e",
  "external_customer_id": "user_abc",
  "billable_metric_key": "look",
  "estimated_units": 1,
  "estimated_cost": 1000,
  "environment": "live",
  "status": "active",
  "expires_at": "2025-01-15T10:32:00Z",
  "account": {
    "balance": 150000,
    "reserved_balance": 11000,
    "effective_balance": 139000
  },
  "metadata": {
    "outfit_id": "outfit_456"
  },
  "created_at": "2025-01-15T10:30:00Z"
}

Notice that reserved_balance increased by the estimated_cost (1,000 mc) and effective_balance decreased by the same amount. The credits are held, not debited — balance is unchanged.
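
The three account fields are related by effective_balance = balance - reserved_balance, which can be checked against the response above. A minimal Python sketch follows; the `Account` class is a hypothetical model, and the 10,000 mc held before this call is inferred from the response rather than stated by it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    balance: int           # total credits, in mc
    reserved_balance: int  # credits currently held by active reservations

    @property
    def effective_balance(self) -> int:
        return self.balance - self.reserved_balance

    def hold(self, estimated_cost: int) -> "Account":
        """Return the account as it looks after a reservation holds estimated_cost."""
        return Account(self.balance, self.reserved_balance + estimated_cost)


# Assumed starting point: 10,000 mc was already held by other reservations.
before = Account(balance=150_000, reserved_balance=10_000)
after = before.hold(1_000)
print(after.balance, after.reserved_balance, after.effective_balance)
# prints: 150000 11000 139000
```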

Insufficient credits

If the customer’s effective_balance is less than the estimated cost, the reserve call returns 402 Payment Required:

{
  "type": "https://api.quotastack.io/errors/insufficient-credits",
  "title": "Insufficient Credits",
  "status": 402,
  "detail": "Insufficient credits for reservation."
}
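
A caller usually wants to tell this error apart from other failures so it can prompt a top-up instead of retrying. The sketch below shows one way to recognize it in Python from a status code and raw body; `is_insufficient_credits` is a hypothetical helper, and matching on the `type` URL is a design choice assumed here, not something the API mandates.

```python
import json

INSUFFICIENT_CREDITS_TYPE = "https://api.quotastack.io/errors/insufficient-credits"


def is_insufficient_credits(status_code: int, body: str) -> bool:
    """True when a reserve response is the 402 insufficient-credits error."""
    if status_code != 402:
        return False
    try:
        problem = json.loads(body)
    except json.JSONDecodeError:
        # A 402 without a parseable JSON body is treated as some other failure.
        return False
    return isinstance(problem, dict) and problem.get("type") == INSUFFICIENT_CREDITS_TYPE
```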

Commit

After the operation completes successfully, commit the reservation with the actual units consumed.

POST /v1/reserve/{id}/commit
curl -X POST https://api.quotastack.io/v1/reserve/rsv_01HXY.../commit \
  -H "X-API-Key: qs_live_..." \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: commit-gen-xyz" \
  -d '{
    "actual_units": 1
  }'
FieldTypeRequiredDescription
actual_unitsint64yesThe actual number of units consumed. Can be 0 (no charge). Must be >= 0.
metadataobjectnoAdditional metadata to attach.

Response

{
  "reservation_id": "rsv_01HXY...",
  "status": "committed",
  "estimated_units": 1,
  "actual_units": 1,
  "estimated_cost": 1000,
  "actual_cost": 1000,
  "released": 0,
  "transaction": {
    "id": "txn_01...",
    "delta": -1000,
    "type": "consumption",
    "created_at": "2025-01-15T10:30:45Z"
  },
  "account": {
    "balance": 149000,
    "reserved_balance": 10000,
    "effective_balance": 139000
  }
}

When actual_units < estimated_units, the actual cost is lower than the estimated cost. The difference between the held estimated_cost and the computed actual_cost is released back to the customer’s account.

Example: reserve with estimated_units: 10 against a per_unit: 1000 mc rule, and 10,000 mc is held. Commit with actual_units: 7, and actual_cost = 7,000 mc is debited while released = 3,000 mc returns to the effective balance.

When actual_units = 0, no credits are debited and the full estimated cost is released. This is equivalent to calling release.
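
The partial-commit arithmetic can be written out as a short Python sketch. It assumes a flat per-unit metering rule like the one in the example; `settle` is a hypothetical name, and real metering rules may compute cost differently.

```python
from typing import Tuple


def settle(estimated_units: int, actual_units: int, per_unit_mc: int) -> Tuple[int, int]:
    """Return (actual_cost, released) for a commit at or below the estimate."""
    if not 0 <= actual_units <= estimated_units:
        raise ValueError("this sketch only covers 0 <= actual_units <= estimated_units")
    estimated_cost = estimated_units * per_unit_mc
    actual_cost = actual_units * per_unit_mc
    # Whatever was held but not used goes back to the effective balance.
    return actual_cost, estimated_cost - actual_cost


print(settle(10, 7, 1000))  # (7000, 3000)
print(settle(10, 0, 1000))  # (0, 10000)
```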

Over-consumption

If actual_units > estimated_units, the commit debits the excess beyond the held amount:

  • The excess debit follows the standard burn-down order (priority → expiry → free-before-paid → FIFO).
  • Overage policy is not consulted on commit. If the excess would push the balance negative, the commit auto-caps the debit to available balance — the extra units are effectively free.
  • The commit is atomic: either the reservation moves to committed with the cost debited, or nothing changes.

If you need strict overage semantics on expensive operations, either pre-check the expected cost via entitlement or reserve a larger estimate so commit doesn’t need to draw extra credits.
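
The capping behavior can be illustrated with a small Python model. This is a sketch under a stated assumption: "available balance" is interpreted here as the credits not held by any reservation (balance minus reserved_balance), which the text above does not spell out. `commit_over_estimate` is a hypothetical function, and it ignores burn-down order entirely.

```python
from typing import Tuple


def commit_over_estimate(
    balance: int, reserved_balance: int, held: int, actual_cost: int
) -> Tuple[int, int, int]:
    """Return (debited, new_balance, new_reserved_balance) for actual_cost >= held."""
    if actual_cost < held:
        raise ValueError("this sketch only covers over-consumption")
    excess = actual_cost - held
    # Assumption: only credits not held by any reservation can cover the excess.
    available = max(balance - reserved_balance, 0)
    debited = held + min(excess, available)
    return debited, balance - debited, reserved_balance - held


# 10,000 mc held, 2,000 mc unheld, actual cost 15,000 mc: 3,000 mc goes uncharged.
print(commit_over_estimate(12_000, 10_000, 10_000, 15_000))  # (12000, 0, 0)
```

Under this reading, the uncharged 3,000 mc is the "effectively free" usage that a larger up-front estimate would have caught as a 402 at reserve time.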

Release

If the operation fails or is cancelled, release the reservation to return all held credits.

POST /v1/reserve/{id}/release
curl -X POST https://api.quotastack.io/v1/reserve/rsv_01HXY.../release \
  -H "X-API-Key: qs_live_..." \
  -H "Idempotency-Key: release-gen-xyz"

No request body is needed.

Response

{
  "reservation_id": "rsv_01HXY...",
  "status": "released",
  "estimated_cost": 1000,
  "released": 1000,
  "transaction": {
    "id": "txn_02...",
    "delta": 1000,
    "type": "release",
    "reference_id": "rsv_01HXY...",
    "created_at": "2025-01-15T10:30:30Z"
  },
  "account": {
    "balance": 150000,
    "reserved_balance": 10000,
    "effective_balance": 140000
  }
}

The reserved_balance decreases by the estimated cost and effective_balance increases accordingly. No credits are debited.

Checking reservation status

Inspect a single reservation:

GET /v1/reserve/{id}
curl https://api.quotastack.io/v1/reserve/019d8a20-4ff5-7be0-81da-e1454b3d6f64 \
  -H "X-API-Key: qs_live_..."

Response:

{
  "id": "019d8a20-4ff5-7be0-81da-e1454b3d6f64",
  "customer_id": "019d6258-07ba-7418-83be-58f5fde53e4e",
  "billable_metric_key": "look",
  "estimated_units": 1,
  "estimated_cost": 1000,
  "status": "active",
  "expires_at": "2026-04-14T12:15:00Z",
  "reference_id": "019d8a20-4ff5-7be0-81da-e1454b3d6f63",
  "metadata": {},
  "created_at": "2026-04-14T12:10:00Z"
}

Useful for confirming a reservation is still active before calling commit (avoiding 409s on already-expired reservations) or for debugging when a worker process crashed mid-operation.

Listing reservations for a customer

GET /v1/customers/{customer_id}/reservations
GET /v1/customer-by-external-id/{external_id}/reservations

Query parameters:

| Parameter | Description |
| --- | --- |
| status | Filter by active, committed, released, or expired. Omit to list all. |
| cursor, limit | Standard pagination (see API Conventions). |

curl "https://api.quotastack.io/v1/customer-by-external-id/user_abc/reservations?status=active" \
  -H "X-API-Key: qs_live_..."

Useful for finding stuck reservations after a deploy or for surfacing in-progress operations to a customer.

TTL and auto-expiry

Every reservation has a TTL. If neither commit nor release is called before the TTL expires, the reservation is automatically expired by a background sweep and all held credits are returned.

| Setting | Value |
| --- | --- |
| Default TTL | 1,800 seconds (30 minutes) |
| Maximum TTL | 86,400 seconds (24 hours) |
| Minimum TTL | 1 second |

Set ttl_seconds in the reserve request to override the default. Values above the maximum are clamped to 86,400.

The auto-expiry sweep runs periodically. There may be a brief delay (seconds, not minutes) between when a reservation’s TTL elapses and when the sweep processes it.

No webhook is fired when a reservation auto-expires today. If you need to react to expired reservations (e.g., to unblock a stuck user in your UI), rely on your own TTL timer and reconcile against the customer balance if needed.
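
A client-side TTL timer can be as simple as a deadline recorded when the reserve call is made. The Python sketch below is one possible shape; `LocalExpiryTimer` is a hypothetical class, and the optional `now` arguments exist only to make it testable. The `expires_at` field in the reserve response is another source for the same deadline.

```python
import time
from typing import Optional

DEFAULT_TTL_SECONDS = 1800  # documented default when ttl_seconds is omitted


class LocalExpiryTimer:
    """Tracks when a reservation should be treated as expired on the client side."""

    def __init__(self, ttl_seconds: int = DEFAULT_TTL_SECONDS, now: Optional[float] = None):
        # A monotonic clock avoids surprises from wall-clock adjustments.
        start = time.monotonic() if now is None else now
        self.deadline = start + ttl_seconds

    def is_expired(self, now: Optional[float] = None) -> bool:
        current = time.monotonic() if now is None else now
        return current >= self.deadline
```

Starting the timer just before the reserve request is sent makes the local deadline slightly earlier than the server's, which errs on the side of treating a reservation as expired too soon rather than too late.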

Concurrency safety

Reservations interact with the credit account’s reserved_balance. When a reserve request is made:

  1. The system reads the customer’s effective_balance (balance - reserved_balance).
  2. If effective_balance >= estimated_cost, the reservation is created and reserved_balance increases.
  3. If not, the request is rejected with 402.

Two parallel reserve requests for the same customer are serialized server-side. If a customer has 10,000 mc effective balance and two requests each try to reserve 8,000 mc simultaneously:

  • Request A acquires the lock, checks effective_balance (10,000 >= 8,000), succeeds. Reserved balance is now 8,000.
  • Request B acquires the lock after A releases it, checks effective_balance (10,000 - 8,000 = 2,000 < 8,000), and fails with 402.

There is no window for double-spending.
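
The scenario above can be reproduced with an in-memory model. This Python sketch uses a `threading.Lock` to stand in for the server-side serialization; `SerializedAccount` is a hypothetical class, and the real service's locking mechanism is not described in this page.

```python
import threading


class SerializedAccount:
    def __init__(self, balance: int):
        self.balance = balance
        self.reserved_balance = 0
        self._lock = threading.Lock()  # stands in for server-side serialization

    def reserve(self, estimated_cost: int) -> bool:
        """Return True if the hold succeeds, False for the 402 case."""
        with self._lock:
            # The check and the update happen under the same lock, so no other
            # request can observe the balance between them.
            effective = self.balance - self.reserved_balance
            if effective < estimated_cost:
                return False
            self.reserved_balance += estimated_cost
            return True


account = SerializedAccount(balance=10_000)
results = []
threads = [
    threading.Thread(target=lambda: results.append(account.reserve(8_000)))
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))  # [False, True]
```

Whichever thread takes the lock first wins; the other always sees the updated reserved_balance and is rejected.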

Example flow: AI outfit generation

A fashion SaaS uses QuotaStack to meter AI-generated outfit looks at 1,000 mc per look.

1. Reserve before starting generation

curl -X POST https://api.quotastack.io/v1/reserve \
  -H "X-API-Key: qs_live_..." \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: reserve-outfit-456" \
  -d '{
    "external_customer_id": "user_abc",
    "billable_metric_key": "look",
    "estimated_units": 1,
    "ttl_seconds": 120
  }'

Save the id from the response. This is the reservation ID used in the commit and release URLs.

2a. Commit on success

The AI pipeline completes. Commit with the actual units consumed:

curl -X POST https://api.quotastack.io/v1/reserve/rsv_01HXY.../commit \
  -H "X-API-Key: qs_live_..." \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: commit-outfit-456" \
  -d '{ "actual_units": 1 }'

Credits are debited. The customer sees the charge.

2b. Release on failure

The AI pipeline crashes. Release the reservation:

curl -X POST https://api.quotastack.io/v1/reserve/rsv_01HXY.../release \
  -H "X-API-Key: qs_live_..." \
  -H "Idempotency-Key: release-outfit-456"

Credits are returned. The customer is not charged.

2c. TTL safety net

If your server crashes and neither commit nor release is called, the reservation expires automatically after the TTL (120 seconds in this example). Credits are returned. The customer is not charged.
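
The three steps can be combined into one wrapper around the job. The Python sketch below assumes a caller-supplied `send` function that performs the HTTP request and returns the parsed JSON body; `send`, `run_job`, and `generate_with_reservation` are hypothetical names, and the idempotency key scheme simply follows the curl examples.

```python
from typing import Callable, Optional

# Hypothetical transport: (method, path, json_body, idempotency_key) -> parsed response.
Transport = Callable[[str, str, Optional[dict], str], dict]


def generate_with_reservation(
    send: Transport, job_id: str, run_job: Callable[[], int]
) -> dict:
    """Reserve, run the job, then commit on success or release on failure."""
    reservation = send(
        "POST",
        "/v1/reserve",
        {
            "external_customer_id": "user_abc",
            "billable_metric_key": "look",
            "estimated_units": 1,
            "ttl_seconds": 120,
        },
        f"reserve-{job_id}",
    )
    reservation_id = reservation["id"]  # the reserve response names this field "id"
    try:
        actual_units = run_job()  # expected to return the units actually consumed
    except Exception:
        # Step 2b: return the held credits, then surface the original error.
        send("POST", f"/v1/reserve/{reservation_id}/release", None, f"release-{job_id}")
        raise
    # Step 2a: finalize the charge with what was actually used.
    return send(
        "POST",
        f"/v1/reserve/{reservation_id}/commit",
        {"actual_units": actual_units},
        f"commit-{job_id}",
    )
```

If the process dies before either branch runs, or the release call itself fails, nothing in this wrapper recovers the hold; that is the case step 2c covers.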

When to use reservations vs. direct usage events

| Scenario | Approach |
| --- | --- |
| Instant operations (send a message, make an API call) | Record a usage event directly. No reservation needed. |
| Operations that take seconds to minutes (AI generation, file processing) | Use a reservation. Reserve before starting, commit on success, release on failure. |
| Operations where cost depends on output (variable token count, variable file size) | Use a reservation with an estimated upper bound. Commit with actual units. The excess is returned. |

Error states

| Error | HTTP status | When |
| --- | --- | --- |
| Insufficient credits | 402 | Customer’s effective_balance cannot cover the estimated cost. |
| Reservation not found | 404 | The reservation ID does not exist or belongs to a different tenant. |
| Reservation not active | 409 | Attempting to commit or release a reservation that is already committed, released, or expired. |
| Reservation expired | 409 | Attempting to commit a reservation whose TTL has elapsed. |

Common Mistakes

The mistakes developers typically make with this concept — and what to do instead.

  • Don't reserve for every call, only for long or fallible operations.
    Why: Reservations add a round-trip. For fast, reliable operations (sub-second, no external API), a direct usage event is simpler and cheaper.
  • Don't set very long TTLs to "be safe".
    Why: If a worker crashes before committing or releasing, a 1-hour TTL keeps the customer's credits held for the full hour, even when the job normally finishes in 30 seconds. Tune TTL to your operation's p99 duration, not your worst-case.
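
One way to tune a TTL to p99 is to derive it from observed job durations. The Python sketch below uses the nearest-rank percentile; `ttl_from_durations` and its `headroom` multiplier are assumptions made for illustration, not a QuotaStack recommendation. The result is kept within the documented 1 to 86,400 second range.

```python
import math
from typing import Sequence


def ttl_from_durations(
    durations_seconds: Sequence[float], headroom: float = 1.2, max_ttl: int = 86_400
) -> int:
    """Suggest ttl_seconds from observed job durations using the nearest-rank p99."""
    if not durations_seconds:
        raise ValueError("need at least one observed duration")
    ordered = sorted(durations_seconds)
    rank = math.ceil(0.99 * len(ordered))  # nearest-rank percentile, 1-indexed
    p99 = ordered[rank - 1]
    # Round up to whole seconds and keep the result within the documented TTL range.
    return min(max(1, math.ceil(p99 * headroom)), max_ttl)
```

With 100 samples of 1 to 100 seconds, the p99 is 99 seconds and the suggested TTL is 119 seconds, far shorter than a blanket 1-hour value.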