Reservations
How to hold credits during long-running operations using the reserve/commit/release lifecycle, with TTL-based auto-expiry and concurrency safety.
Think of reservations like putting items in a hotel safe. The credits are held, not spent. When you're done: commit (you really used them), release (you didn't), or let the TTL auto-release so nothing leaks.
Some operations take time. An AI image generation might run for 30 seconds. A batch processing job might run for 10 minutes. During that time, you need to guarantee that the customer’s credits are not spent by other concurrent requests.
Reservations solve this by holding credits against the customer’s effective balance for the duration of the operation. When the operation completes, you either commit (finalize the charge) or release (return the credits).
The lifecycle
The diagram above shows how a reservation flows from Reserve into the held state, then out through one of three terminal transitions. Each state in detail:
| State | Description |
|---|---|
| active | Credits are held. The operation is in progress. |
| committed | The operation succeeded. Credits were debited based on actual usage. |
| released | The operation was cancelled. All held credits were returned. |
| expired | The reservation’s TTL elapsed without a commit or release. Credits are returned automatically. |
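The lifecycle table can be read as a small state machine in which only an active reservation may move. The sketch below is an illustration of that reading, not QuotaStack code; the names `ReservationStatus`, `ALLOWED_TRANSITIONS`, and `can_transition` are hypothetical.

```python
from enum import Enum


class ReservationStatus(Enum):
    ACTIVE = "active"
    COMMITTED = "committed"
    RELEASED = "released"
    EXPIRED = "expired"


# Only an active reservation can move; the other three states are terminal.
ALLOWED_TRANSITIONS = {
    ReservationStatus.ACTIVE: {
        ReservationStatus.COMMITTED,
        ReservationStatus.RELEASED,
        ReservationStatus.EXPIRED,
    },
    ReservationStatus.COMMITTED: set(),
    ReservationStatus.RELEASED: set(),
    ReservationStatus.EXPIRED: set(),
}


def can_transition(current: ReservationStatus, target: ReservationStatus) -> bool:
    """Return True if the lifecycle table permits moving from current to target."""
    return target in ALLOWED_TRANSITIONS[current]
```

A client could use a check like this to skip commit or release calls that the API would reject because the reservation is no longer active.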
Reserve
Hold credits before starting a long-running operation.
POST /v1/reserve
curl -X POST https://api.quotastack.io/v1/reserve \
-H "X-API-Key: qs_live_..." \
-H "Content-Type: application/json" \
-H "Idempotency-Key: reserve-gen-xyz" \
-d '{
"external_customer_id": "user_abc",
"billable_metric_key": "look",
"estimated_units": 1,
"ttl_seconds": 120,
"metadata": {
"outfit_id": "outfit_456"
}
}'
| Field | Type | Required | Description |
|---|---|---|---|
| external_customer_id | string | yes* | Your tenant ID for the customer. See Customer identification. |
| customer_id | string | yes* | Alternative: QuotaStack UUID. Exactly one of the two is required. |
| billable_metric_key | string | yes | The metric being consumed. |
| estimated_units | int64 | yes | How many units you estimate the operation will consume. Must be positive. |
| ttl_seconds | int | no | How long the reservation lives. Default: 1800 (30 minutes). Max: 86400 (24 hours). |
| metadata | object | no | Arbitrary key-value pairs. |
The estimated cost is computed from the active metering rule for the given metric key, just like an entitlement check.
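The field rules in the table can be enforced client-side before the request is sent. The following is a minimal sketch, assuming you build the JSON body yourself; `build_reserve_payload` is a hypothetical helper, not part of any QuotaStack SDK.

```python
import json
from typing import Optional


def build_reserve_payload(
    billable_metric_key: str,
    estimated_units: int,
    external_customer_id: Optional[str] = None,
    customer_id: Optional[str] = None,
    ttl_seconds: Optional[int] = None,
    metadata: Optional[dict] = None,
) -> str:
    """Validate the documented field rules and return the JSON request body."""
    # Exactly one of the two customer identifiers must be supplied.
    if (external_customer_id is None) == (customer_id is None):
        raise ValueError("provide exactly one of external_customer_id or customer_id")
    if estimated_units <= 0:
        raise ValueError("estimated_units must be positive")

    payload = {
        "billable_metric_key": billable_metric_key,
        "estimated_units": estimated_units,
    }
    if external_customer_id is not None:
        payload["external_customer_id"] = external_customer_id
    else:
        payload["customer_id"] = customer_id
    # Optional fields are omitted so the server-side defaults apply.
    if ttl_seconds is not None:
        payload["ttl_seconds"] = ttl_seconds
    if metadata:
        payload["metadata"] = metadata
    return json.dumps(payload)
```

Catching these mistakes locally saves a round trip; the server remains the authority on what it accepts.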
Response
{
"id": "rsv_01HXY...",
"tenant_id": "t_01...",
"customer_id": "019d6258-07ba-7418-83be-58f5fde53e4e",
"external_customer_id": "user_abc",
"billable_metric_key": "look",
"estimated_units": 1,
"estimated_cost": 1000,
"environment": "live",
"status": "active",
"expires_at": "2025-01-15T10:32:00Z",
"account": {
"balance": 150000,
"reserved_balance": 11000,
"effective_balance": 139000
},
"metadata": {
"outfit_id": "outfit_456"
},
"created_at": "2025-01-15T10:30:00Z"
}
Notice that reserved_balance increased by the estimated_cost (1,000 mc) and effective_balance decreased by the same amount. The credits are held, not debited — balance is unchanged.
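The balance arithmetic behind a hold can be sketched in a few lines. This is a local model for illustration only; the `Account` class and `hold` function are hypothetical, and the starting reserved_balance of 10,000 mc is inferred from the response above (11,000 mc after a 1,000 mc hold).

```python
from dataclasses import dataclass


@dataclass
class Account:
    balance: int           # credits owned, in mc
    reserved_balance: int  # credits held by active reservations, in mc

    @property
    def effective_balance(self) -> int:
        # Credits still available to new reservations.
        return self.balance - self.reserved_balance


def hold(account: Account, estimated_cost: int) -> Account:
    """Return the account after a hold; balance itself is not touched."""
    if account.effective_balance < estimated_cost:
        raise ValueError("insufficient credits")  # the API answers 402 here
    return Account(account.balance, account.reserved_balance + estimated_cost)


before = Account(balance=150_000, reserved_balance=10_000)
after = hold(before, 1_000)
print(after.balance, after.reserved_balance, after.effective_balance)
# 150000 11000 139000
```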
Insufficient credits
If the customer’s effective_balance is less than the estimated cost, the reserve call returns 402 Payment Required:
{
"type": "https://api.quotastack.io/errors/insufficient-credits",
"title": "Insufficient Credits",
"status": 402,
"detail": "Insufficient credits for reservation."
}
Commit
After the operation completes successfully, commit the reservation with the actual units consumed.
POST /v1/reserve/{id}/commit
curl -X POST https://api.quotastack.io/v1/reserve/rsv_01HXY.../commit \
-H "X-API-Key: qs_live_..." \
-H "Content-Type: application/json" \
-H "Idempotency-Key: commit-gen-xyz" \
-d '{
"actual_units": 1
}'
| Field | Type | Required | Description |
|---|---|---|---|
| actual_units | int64 | yes | The actual number of units consumed. Can be 0 (no charge). Must be >= 0. |
| metadata | object | no | Additional metadata to attach. |
Response
{
"reservation_id": "rsv_01HXY...",
"status": "committed",
"estimated_units": 1,
"actual_units": 1,
"estimated_cost": 1000,
"actual_cost": 1000,
"released": 0,
"transaction": {
"id": "txn_01...",
"delta": -1000,
"type": "consumption",
"created_at": "2025-01-15T10:30:45Z"
},
"account": {
"balance": 149000,
"reserved_balance": 10000,
"effective_balance": 139000
}
}
When actual_units < estimated_units, the actual cost is lower than the estimated cost. The difference between the held estimated_cost and the computed actual_cost is released back to the customer’s account.
Example: reserve with estimated_units: 10 against a per_unit: 1000 mc rule — 10,000 mc is held. Commit with actual_units: 7 → actual_cost = 7,000 mc is debited, released = 3,000 mc returns to the effective balance.
When actual_units = 0, no credits are debited and the full estimated cost is released. This is equivalent to calling release.
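The split between the debited and released portions can be worked through with a short function. This sketch assumes a flat per-unit metering rule, as in the example above; `settle` and `per_unit_mc` are hypothetical names, and over-consumption is deliberately left out.

```python
def settle(estimated_units: int, actual_units: int, per_unit_mc: int) -> dict:
    """Split a hold into debited and released portions when actual <= estimated."""
    if actual_units < 0:
        raise ValueError("actual_units must be >= 0")
    if actual_units > estimated_units:
        raise ValueError("over-consumption is not modeled in this sketch")
    estimated_cost = estimated_units * per_unit_mc
    actual_cost = actual_units * per_unit_mc
    return {
        "estimated_cost": estimated_cost,
        "actual_cost": actual_cost,
        # Whatever was held but not used goes back to the effective balance.
        "released": estimated_cost - actual_cost,
    }


print(settle(10, 7, 1_000))
# {'estimated_cost': 10000, 'actual_cost': 7000, 'released': 3000}
```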
Over-consumption
If actual_units > estimated_units, the commit debits the excess beyond the held amount:
- The excess debit follows the standard burn-down order (priority → expiry → free-before-paid → FIFO).
- Overage policy is not consulted on commit. If the excess would push the balance negative, the commit auto-caps the debit to available balance — the extra units are effectively free.
- The commit is atomic: either the reservation moves to committed with the cost debited, or nothing changes.
If you need strict overage semantics on expensive operations, either pre-check the expected cost via entitlement or reserve a larger estimate so commit doesn’t need to draw extra credits.
Release
If the operation fails or is cancelled, release the reservation to return all held credits.
POST /v1/reserve/{id}/release
curl -X POST https://api.quotastack.io/v1/reserve/rsv_01HXY.../release \
-H "X-API-Key: qs_live_..." \
-H "Idempotency-Key: release-gen-xyz"
No request body is needed.
Response
{
"reservation_id": "rsv_01HXY...",
"status": "released",
"estimated_cost": 1000,
"released": 1000,
"transaction": {
"id": "txn_02...",
"delta": 1000,
"type": "release",
"reference_id": "rsv_01HXY...",
"created_at": "2025-01-15T10:30:30Z"
},
"account": {
"balance": 150000,
"reserved_balance": 10000,
"effective_balance": 140000
}
}
The reserved_balance decreases by the estimated cost and effective_balance increases accordingly. No credits are debited.
Checking reservation status
Inspect a single reservation:
GET /v1/reserve/{id}
curl https://api.quotastack.io/v1/reserve/019d8a20-4ff5-7be0-81da-e1454b3d6f64 \
-H "X-API-Key: qs_live_..."
Response:
{
"id": "019d8a20-4ff5-7be0-81da-e1454b3d6f64",
"customer_id": "019d6258-07ba-7418-83be-58f5fde53e4e",
"billable_metric_key": "look",
"estimated_units": 1,
"estimated_cost": 1000,
"status": "active",
"expires_at": "2026-04-14T12:15:00Z",
"reference_id": "019d8a20-4ff5-7be0-81da-e1454b3d6f63",
"metadata": {},
"created_at": "2026-04-14T12:10:00Z"
}
Useful for confirming a reservation is still active before calling commit (avoiding 409s on already-expired reservations) or for debugging when a worker process crashed mid-operation.
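A pre-commit check on the fetched reservation might look like the sketch below. It assumes the response fields shown above (status and expires_at); `is_committable` is a hypothetical helper. The check is advisory only: the reservation can still expire between this check and the commit call, so the 409 path must still be handled.

```python
from datetime import datetime, timezone


def is_committable(reservation: dict, now: datetime) -> bool:
    """True if the reservation is active and its expires_at is still in the future."""
    if reservation.get("status") != "active":
        return False
    # Timestamps end in "Z"; rewrite it so fromisoformat also works on older Pythons.
    expires_at = datetime.fromisoformat(reservation["expires_at"].replace("Z", "+00:00"))
    return now < expires_at


reservation = {"status": "active", "expires_at": "2026-04-14T12:15:00Z"}
print(is_committable(reservation, datetime(2026, 4, 14, 12, 12, tzinfo=timezone.utc)))
# True
```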
Listing reservations for a customer
GET /v1/customers/{customer_id}/reservations
GET /v1/customer-by-external-id/{external_id}/reservations
Query parameters:
| Parameter | Description |
|---|---|
| status | Filter by active, committed, released, or expired. Omit to list all. |
| cursor, limit | Standard pagination (see API Conventions). |
curl "https://api.quotastack.io/v1/customer-by-external-id/user_abc/reservations?status=active" \
-H "X-API-Key: qs_live_..."
Useful for finding stuck reservations after a deploy or for surfacing in-progress operations to a customer.
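Building the list URL with the documented query parameters can be sketched as follows. `reservations_url` is a hypothetical helper; the value formats for cursor and limit are defined in API Conventions and are not assumed here beyond being passed through as given.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE_URL = "https://api.quotastack.io"
VALID_STATUSES = {"active", "committed", "released", "expired"}


def reservations_url(
    external_id: str,
    status: Optional[str] = None,
    cursor: Optional[str] = None,
    limit: Optional[int] = None,
) -> str:
    """Build the list-reservations URL for a customer identified by external ID."""
    if status is not None and status not in VALID_STATUSES:
        raise ValueError(f"unknown status filter: {status}")
    # Only include parameters the caller actually set.
    candidates = (("status", status), ("cursor", cursor), ("limit", limit))
    params = {key: value for key, value in candidates if value is not None}
    path = f"/v1/customer-by-external-id/{quote(external_id, safe='')}/reservations"
    url = BASE_URL + path
    return f"{url}?{urlencode(params)}" if params else url


print(reservations_url("user_abc", status="active"))
# https://api.quotastack.io/v1/customer-by-external-id/user_abc/reservations?status=active
```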
TTL and auto-expiry
Every reservation has a TTL. If neither commit nor release is called before the TTL expires, the reservation is automatically expired by a background sweep and all held credits are returned.
| Setting | Value |
|---|---|
| Default TTL | 1,800 seconds (30 minutes) |
| Maximum TTL | 86,400 seconds (24 hours) |
| Minimum TTL | 1 second |
Set ttl_seconds in the reserve request to override the default. Values above the maximum are clamped to 86,400.
The auto-expiry sweep runs periodically. There may be a brief delay (seconds, not minutes) between when a reservation’s TTL elapses and when the sweep processes it.
No webhook is fired when a reservation auto-expires today. If you need to react to expired reservations (e.g., to unblock a stuck user in your UI), rely on your own TTL timer and reconcile against the customer balance if needed.
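If you keep your own TTL timer, the effective TTL and local deadline can be derived from the settings table. The sketch below is one way to do that; `resolve_ttl` and `local_deadline` are hypothetical helpers. Rejecting values below the minimum is an assumption, since the documentation only states how values above the maximum are treated.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

DEFAULT_TTL_SECONDS = 1_800
MAX_TTL_SECONDS = 86_400
MIN_TTL_SECONDS = 1


def resolve_ttl(ttl_seconds: Optional[int]) -> int:
    """Mirror the documented default and upper clamp for ttl_seconds."""
    if ttl_seconds is None:
        return DEFAULT_TTL_SECONDS
    if ttl_seconds < MIN_TTL_SECONDS:
        # Assumption: server behavior below the minimum is not documented here.
        raise ValueError("ttl_seconds must be at least 1")
    return min(ttl_seconds, MAX_TTL_SECONDS)


def local_deadline(created_at: datetime, ttl_seconds: Optional[int]) -> datetime:
    """When your own timer should treat the reservation as expired."""
    return created_at + timedelta(seconds=resolve_ttl(ttl_seconds))


created = datetime(2025, 1, 15, 10, 30, tzinfo=timezone.utc)
print(local_deadline(created, 120).isoformat())
# 2025-01-15T10:32:00+00:00
```

Because the sweep may lag the TTL by a few seconds, treat the local deadline as the earliest moment the credits could be returned and confirm against the reservation status or customer balance before acting on it.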
Concurrency safety
Reservations interact with the credit account’s reserved_balance. When a reserve request is made:
- The system reads the customer’s effective_balance (balance - reserved_balance).
- If effective_balance >= estimated_cost, the reservation is created and reserved_balance increases.
- If not, the request is rejected with 402.
Two parallel reserve requests for the same customer are serialized server-side. If a customer has 10,000 mc effective balance and two requests each try to reserve 8,000 mc simultaneously:
- Request A acquires the lock, checks effective_balance (10,000 >= 8,000), succeeds. Reserved balance is now 8,000.
- Request B acquires the lock once A's reserve request has finished (this is unrelated to committing the reservation), checks effective_balance (10,000 - 8,000 = 2,000 < 8,000), and fails with 402.
There is no window for double-spending.
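The serialization described above can be imitated locally with a lock around the check-and-update step. This is a toy model, not how QuotaStack implements it (the server-side mechanism is not described here); `LockedAccount` and `race` are hypothetical names.

```python
import threading


class LockedAccount:
    def __init__(self, balance: int) -> None:
        self.balance = balance
        self.reserved_balance = 0
        self._lock = threading.Lock()

    def reserve(self, estimated_cost: int) -> bool:
        # The check and the update happen under one lock, so no other request
        # can observe the balance between the two steps.
        with self._lock:
            if self.balance - self.reserved_balance >= estimated_cost:
                self.reserved_balance += estimated_cost
                return True
            return False  # the API would answer 402


def race(account: LockedAccount, cost: int, requests: int = 2) -> list:
    """Fire several reserve calls from separate threads and collect the outcomes."""
    results = []
    threads = [
        threading.Thread(target=lambda: results.append(account.reserve(cost)))
        for _ in range(requests)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


account = LockedAccount(balance=10_000)
outcomes = race(account, 8_000)
print(sorted(outcomes), account.reserved_balance)
# [False, True] 8000
```

Whichever thread wins the lock succeeds; the other always sees the reduced effective balance and is rejected.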
Example flow: AI outfit generation
A fashion SaaS uses QuotaStack to meter AI-generated outfit looks at 1,000 mc per look.
1. Reserve before starting generation
curl -X POST https://api.quotastack.io/v1/reserve \
-H "X-API-Key: qs_live_..." \
-H "Content-Type: application/json" \
-H "Idempotency-Key: reserve-outfit-456" \
-d '{
"external_customer_id": "user_abc",
"billable_metric_key": "look",
"estimated_units": 1,
"ttl_seconds": 120
}'
Save the id from the response. It is the reservation ID used in the commit and release URLs.
2a. Commit on success
The AI pipeline completes. Commit with the actual units consumed:
curl -X POST https://api.quotastack.io/v1/reserve/rsv_01HXY.../commit \
-H "X-API-Key: qs_live_..." \
-H "Content-Type: application/json" \
-H "Idempotency-Key: commit-outfit-456" \
-d '{ "actual_units": 1 }'
Credits are debited. The customer sees the charge.
2b. Release on failure
The AI pipeline crashes. Release the reservation:
curl -X POST https://api.quotastack.io/v1/reserve/rsv_01HXY.../release \
-H "X-API-Key: qs_live_..." \
-H "Idempotency-Key: release-outfit-456"
Credits are returned. The customer is not charged.
2c. TTL safety net
If your server crashes and neither commit nor release is called, the reservation expires automatically after the TTL (120 seconds in this example). Credits are returned. The customer is not charged.
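The three outcomes above can be wrapped in a single helper so that every code path either commits or releases. The sketch below assumes a client object exposing reserve, commit, and release methods; `FakeClient` is an in-memory stand-in for illustration, not a real QuotaStack SDK, and `run_with_reservation` is a hypothetical name.

```python
class FakeClient:
    """In-memory stand-in that records calls; not a real QuotaStack SDK."""

    def __init__(self) -> None:
        self.calls = []

    def reserve(self, **fields) -> dict:
        self.calls.append("reserve")
        return {"id": "rsv_test", "status": "active"}

    def commit(self, reservation_id: str, actual_units: int) -> None:
        self.calls.append(f"commit:{actual_units}")

    def release(self, reservation_id: str) -> None:
        self.calls.append("release")


def run_with_reservation(client, operation, **reserve_fields):
    """Reserve, run operation(), then commit its unit count or release on error."""
    reservation = client.reserve(**reserve_fields)
    try:
        actual_units = operation()
    except Exception:
        # Step 2b: the operation failed, so hand the held credits back.
        client.release(reservation["id"])
        raise
    # Step 2a: the operation succeeded, so finalize the charge.
    client.commit(reservation["id"], actual_units)
    return actual_units


client = FakeClient()
run_with_reservation(
    client,
    lambda: 1,  # stands in for the AI pipeline; returns units consumed
    external_customer_id="user_abc",
    billable_metric_key="look",
    estimated_units=1,
    ttl_seconds=120,
)
print(client.calls)
# ['reserve', 'commit:1']
```

If the process dies before either branch runs, nothing in this helper fires, which is the case the TTL safety net covers.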
When to use reservations vs. direct usage events
| Scenario | Approach |
|---|---|
| Instant operations (send a message, make an API call) | Record a usage event directly. No reservation needed. |
| Operations that take seconds to minutes (AI generation, file processing) | Use a reservation. Reserve before starting, commit on success, release on failure. |
| Operations where cost depends on output (variable token count, variable file size) | Use a reservation with an estimated upper bound. Commit with actual units. The excess is returned. |
Error states
| Error | HTTP status | When |
|---|---|---|
| Insufficient credits | 402 | Customer’s effective_balance cannot cover the estimated cost. |
| Reservation not found | 404 | The reservation ID does not exist or belongs to a different tenant. |
| Reservation not active | 409 | Attempting to commit or release a reservation that is already committed, released, or expired. |
| Reservation expired | 409 | Attempting to commit a reservation whose TTL has elapsed. |
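A client can map these statuses to its own error kinds before deciding how to react. The sketch below is one possible mapping; `ERROR_KINDS` and `classify_error` are hypothetical names. It assumes 404 and 409 responses carry the same JSON shape as the 402 body shown earlier, which this page does not confirm. Both 409 rows share one status code, so they are not distinguished here.

```python
import json

ERROR_KINDS = {
    402: "insufficient_credits",
    404: "reservation_not_found",
    409: "reservation_not_active_or_expired",
}


def classify_error(status: int, body: str) -> tuple:
    """Return (kind, detail) for an error response from a reservation endpoint."""
    kind = ERROR_KINDS.get(status, "unexpected")
    try:
        detail = json.loads(body).get("detail", "")
    except (ValueError, AttributeError):
        # Body was not JSON, or not a JSON object; fall back to an empty detail.
        detail = ""
    return kind, detail


body = '{"type": "https://api.quotastack.io/errors/insufficient-credits", "title": "Insufficient Credits", "status": 402, "detail": "Insufficient credits for reservation."}'
print(classify_error(402, body))
# ('insufficient_credits', 'Insufficient credits for reservation.')
```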
Common Mistakes
The mistakes developers typically make with this concept — and what to do instead.