---
title: Reservations
description: How to hold credits during long-running operations using the reserve/commit/release lifecycle, with TTL-based auto-expiry and concurrency safety.
order: 4
---

# Reservations

Some operations take time. An AI image generation might run for 30 seconds. A batch processing job might run for 10 minutes. During that time, you need to guarantee that the customer's credits are not spent by other concurrent requests.

Reservations solve this by **holding** credits against the customer's effective balance for the duration of the operation. When the operation completes, you either **commit** (finalize the charge) or **release** (return the credits).

> **Mental Model:** Think of reservations like **putting items in a hotel safe**. The credits are held, not spent. When you're done: **commit** (you really used them), **release** (you didn't), or let the **TTL** auto-release so nothing leaks.

## Quick Take

- **Reserve** credits before a long-running operation starts
- Three outcomes: **commit** (consume), **release** (return), or **TTL auto-expiry**
- Prevents concurrent requests from overspending the same balance
- Partial commits return the difference automatically

## Diagram

Reserve holds credits and moves the reservation into a Held state. From Held there are three outcomes: success leads to Commit (credits consumed), cancel leads to Release (credits returned), and timeout leads to TTL Expiry (auto-released).

```mermaid
stateDiagram-v2
    [*] --> Reserve
    Reserve --> Held: credits held
    Held --> Commit: success
    Held --> Release: cancel
    Held --> TTLExpiry: timeout
    Commit --> [*]: credits consumed
    Release --> [*]: credits returned
    TTLExpiry --> [*]: auto-released
```

## The lifecycle

The diagram above shows how a reservation flows from `Reserve` into the held state, then out through one of three terminal transitions. The held state is reported as `active` in API responses. Each state in detail:

| State | Description |
|---|---|
| `active` | Credits are held. The operation is in progress. |
| `committed` | The operation succeeded. Credits were debited based on actual usage. |
| `released` | The operation failed or was cancelled. All held credits were returned. |
| `expired` | The reservation's TTL elapsed without a commit or release. Credits are returned automatically. |
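
The table implies a small state machine: only an `active` reservation can change state, and the other three states are terminal. A minimal local sketch of that rule follows. The names `Status`, `ALLOWED`, `NotActiveError`, and `transition` are hypothetical and are not part of the QuotaStack API; the exception only stands in for the `409` the API returns.

```python
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    COMMITTED = "committed"
    RELEASED = "released"
    EXPIRED = "expired"


# Only an active reservation can move; the other three states are terminal.
ALLOWED = {
    Status.ACTIVE: {Status.COMMITTED, Status.RELEASED, Status.EXPIRED},
    Status.COMMITTED: set(),
    Status.RELEASED: set(),
    Status.EXPIRED: set(),
}


class NotActiveError(Exception):
    """Hypothetical local stand-in for the API's 409 response."""


def transition(current: Status, target: Status) -> Status:
    """Return the new status, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise NotActiveError(f"cannot move from {current.value} to {target.value}")
    return target


print(transition(Status.ACTIVE, Status.COMMITTED).value)  # committed
```

Calling `transition(Status.EXPIRED, Status.COMMITTED)` raises `NotActiveError`, which mirrors committing after the TTL has elapsed.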

## Reserve

Hold credits before starting a long-running operation.

```
POST /v1/reserve
```

```bash
curl -X POST https://api.quotastack.io/v1/reserve \
  -H "X-API-Key: qs_live_..." \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: reserve-gen-xyz" \
  -d '{
    "external_customer_id": "user_abc",
    "billable_metric_key": "look",
    "estimated_units": 1,
    "ttl_seconds": 120,
    "metadata": {
      "outfit_id": "outfit_456"
    }
  }'
```

| Field | Type | Required | Description |
|---|---|---|---|
| `external_customer_id` | string | yes* | Your tenant ID for the customer. See [Customer identification](/docs/concepts/customer-identification). |
| `customer_id` | string | yes* | Alternative: QuotaStack UUID. Exactly one of the two is required. |
| `billable_metric_key` | string | yes | The metric being consumed. |
| `estimated_units` | int64 | yes | How many units you estimate the operation will consume. Must be positive. |
| `ttl_seconds` | int | no | How long the reservation lives. Default: 1800 (30 minutes). Max: 86400 (24 hours). |
| `metadata` | object | no | Arbitrary key-value pairs. |

The estimated cost is computed from the active metering rule for the given metric key, just like an [entitlement check](/docs/concepts/entitlements).

### Response

```json
{
  "id": "rsv_01HXY...",
  "tenant_id": "t_01...",
  "customer_id": "019d6258-07ba-7418-83be-58f5fde53e4e",
  "external_customer_id": "user_abc",
  "billable_metric_key": "look",
  "estimated_units": 1,
  "estimated_cost": 1000,
  "environment": "live",
  "status": "active",
  "expires_at": "2025-01-15T10:32:00Z",
  "account": {
    "balance": 150000,
    "reserved_balance": 11000,
    "effective_balance": 139000
  },
  "metadata": {
    "outfit_id": "outfit_456"
  },
  "created_at": "2025-01-15T10:30:00Z"
}
```

Notice that `reserved_balance` increased by the `estimated_cost` (1,000 mc) and `effective_balance` decreased by the same amount. The credits are held, not debited -- `balance` is unchanged.
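
The arithmetic behind those three fields can be sketched locally. This is an illustrative model, not QuotaStack's implementation: the `Account` class and its `hold` method are hypothetical, and the starting `reserved_balance` of 10,000 mc is inferred from the response above.

```python
from dataclasses import dataclass


@dataclass
class Account:
    balance: int           # credits owned by the customer, in mc
    reserved_balance: int  # credits held by active reservations, in mc

    @property
    def effective_balance(self) -> int:
        # What remains available once existing holds are subtracted.
        return self.balance - self.reserved_balance

    def hold(self, estimated_cost: int) -> None:
        # A reserve call raises the hold; it never touches `balance`.
        self.reserved_balance += estimated_cost


account = Account(balance=150_000, reserved_balance=10_000)
account.hold(1_000)
print(account.balance, account.reserved_balance, account.effective_balance)
# prints: 150000 11000 139000
```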

### Insufficient credits

If the customer's `effective_balance` is less than the estimated cost, the reserve call returns `402 Payment Required`:

```json
{
  "type": "https://api.quotastack.io/errors/insufficient-credits",
  "title": "Insufficient Credits",
  "status": 402,
  "detail": "Insufficient credits for reservation."
}
```
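
Client code usually needs to tell this error apart from other failures. The sketch below checks the status code and the `type` URI shown above. The helper name `is_insufficient_credits` is hypothetical, and it assumes you already have the status code and raw response body from whatever HTTP library you use.

```python
import json


def is_insufficient_credits(status_code: int, body: str) -> bool:
    """Return True when a response matches the insufficient-credits error."""
    if status_code != 402:
        return False
    problem = json.loads(body)
    # Match on the machine-readable `type`, not the human-readable `detail`.
    return problem.get("type", "").endswith("/errors/insufficient-credits")


body = json.dumps({
    "type": "https://api.quotastack.io/errors/insufficient-credits",
    "title": "Insufficient Credits",
    "status": 402,
    "detail": "Insufficient credits for reservation.",
})
print(is_insufficient_credits(402, body))  # True
```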

## Commit

After the operation completes successfully, commit the reservation with the actual units consumed.

```
POST /v1/reserve/{id}/commit
```

```bash
curl -X POST https://api.quotastack.io/v1/reserve/rsv_01HXY.../commit \
  -H "X-API-Key: qs_live_..." \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: commit-gen-xyz" \
  -d '{
    "actual_units": 1
  }'
```

| Field | Type | Required | Description |
|---|---|---|---|
| `actual_units` | int64 | yes | The actual number of units consumed. Can be 0 (no charge). Must be >= 0. |
| `metadata` | object | no | Additional metadata to attach. |

### Response

```json
{
  "reservation_id": "rsv_01HXY...",
  "status": "committed",
  "estimated_units": 1,
  "actual_units": 1,
  "estimated_cost": 1000,
  "actual_cost": 1000,
  "released": 0,
  "transaction": {
    "id": "txn_01...",
    "delta": -1000,
    "type": "consumption",
    "created_at": "2025-01-15T10:30:45Z"
  },
  "account": {
    "balance": 149000,
    "reserved_balance": 10000,
    "effective_balance": 139000
  }
}
```

When `actual_units < estimated_units`, the actual cost is lower than the estimated cost. The difference between the held `estimated_cost` and the computed `actual_cost` is released back to the customer's account.

Example: reserve with `estimated_units: 10` against a `per_unit: 1000 mc` rule — 10,000 mc is held. Commit with `actual_units: 7` → `actual_cost = 7,000 mc` is debited, `released = 3,000 mc` returns to the effective balance.

When `actual_units = 0`, no credits are debited and the full estimated cost is released. This is equivalent to calling [release](#release).
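
The partial-commit arithmetic can be sketched as follows, assuming a simple per-unit rule. The function `settle` and its parameter names are hypothetical, and the sketch only covers commits at or below the estimate.

```python
def settle(estimated_units: int, actual_units: int, per_unit_mc: int) -> tuple[int, int]:
    """Return (actual_cost, released) for a commit at or below the estimate."""
    if not 0 <= actual_units <= estimated_units:
        raise ValueError("this sketch only covers 0 <= actual_units <= estimated_units")
    estimated_cost = estimated_units * per_unit_mc  # amount held at reserve time
    actual_cost = actual_units * per_unit_mc        # amount debited at commit time
    return actual_cost, estimated_cost - actual_cost


print(settle(10, 7, 1000))  # (7000, 3000)
print(settle(10, 0, 1000))  # (0, 10000)
```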

### Over-consumption

If `actual_units > estimated_units`, the commit debits the excess beyond the held amount:

- The excess debit follows the standard [burn-down order](/docs/concepts/credits#burn-down-order) (priority → expiry → free-before-paid → FIFO).
- **Overage policy is not consulted on commit.** If the excess would push the balance negative, the commit auto-caps the debit to available balance — the extra units are effectively free.
- The commit is atomic: either the reservation moves to `committed` with the cost debited, or nothing changes.

If you need strict overage semantics on expensive operations, either pre-check the expected cost with an [entitlement check](/docs/concepts/entitlements) or reserve a larger estimate so that the commit doesn't need to draw extra credits.
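
A rough sketch of the auto-cap behavior described above. It rests on one labeled assumption: `available_mc` stands for whatever credits the excess may draw on, which this sketch takes to be the effective balance outside this reservation's hold. The source does not spell out that definition, and the function name `commit_debit` is hypothetical.

```python
def commit_debit(held_mc: int, actual_cost_mc: int, available_mc: int) -> int:
    """Return the total debited by a commit, capping any excess over the hold."""
    # The held part of the cost is always covered by the reservation.
    from_hold = min(actual_cost_mc, held_mc)
    # Anything beyond the hold is the excess.
    excess = max(0, actual_cost_mc - held_mc)
    # Assumption: the excess is capped at `available_mc`, never going negative.
    covered_excess = min(excess, available_mc)
    return from_hold + covered_excess


print(commit_debit(10_000, 15_000, 8_000))  # 15000
print(commit_debit(10_000, 15_000, 2_000))  # 12000
```

In the second call, 3,000 mc of usage goes unbilled, which is the "effectively free" case.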

## Release

If the operation fails or is cancelled, release the reservation to return all held credits.

```
POST /v1/reserve/{id}/release
```

```bash
curl -X POST https://api.quotastack.io/v1/reserve/rsv_01HXY.../release \
  -H "X-API-Key: qs_live_..." \
  -H "Idempotency-Key: release-gen-xyz"
```

No request body is needed.

### Response

```json
{
  "reservation_id": "rsv_01HXY...",
  "status": "released",
  "estimated_cost": 1000,
  "released": 1000,
  "transaction": {
    "id": "txn_02...",
    "delta": 1000,
    "type": "release",
    "reference_id": "rsv_01HXY...",
    "created_at": "2025-01-15T10:30:30Z"
  },
  "account": {
    "balance": 150000,
    "reserved_balance": 10000,
    "effective_balance": 140000
  }
}
```

The `reserved_balance` decreases by the estimated cost and `effective_balance` increases accordingly. No credits are debited.

## Checking reservation status

Inspect a single reservation:

```
GET /v1/reserve/{id}
```

```bash
curl https://api.quotastack.io/v1/reserve/019d8a20-4ff5-7be0-81da-e1454b3d6f64 \
  -H "X-API-Key: qs_live_..."
```

Response:

```json
{
  "id": "019d8a20-4ff5-7be0-81da-e1454b3d6f64",
  "customer_id": "019d6258-07ba-7418-83be-58f5fde53e4e",
  "billable_metric_key": "look",
  "estimated_units": 1,
  "estimated_cost": 1000,
  "status": "active",
  "expires_at": "2026-04-14T12:15:00Z",
  "reference_id": "019d8a20-4ff5-7be0-81da-e1454b3d6f63",
  "metadata": {},
  "created_at": "2026-04-14T12:10:00Z"
}
```

Useful for confirming a reservation is still active before calling commit (avoiding 409s on already-expired reservations) or for debugging when a worker process crashed mid-operation.

## Listing reservations for a customer

```
GET /v1/customers/{customer_id}/reservations
GET /v1/customer-by-external-id/{external_id}/reservations
```

Query parameters:

| Parameter | Description |
|---|---|
| `status` | Filter by `active`, `committed`, `released`, or `expired`. Omit to list all. |
| `cursor`, `limit` | Standard pagination (see [API Conventions](/docs/concepts/conventions#pagination)). |

```bash
curl "https://api.quotastack.io/v1/customer-by-external-id/user_abc/reservations?status=active" \
  -H "X-API-Key: qs_live_..."
```

Useful for finding stuck reservations after a deploy or for surfacing in-progress operations to a customer.

## TTL and auto-expiry

Every reservation has a TTL. If neither commit nor release is called before the TTL expires, the reservation is automatically expired by a background sweep and all held credits are returned.

| Setting | Value |
|---|---|
| Default TTL | 1,800 seconds (30 minutes) |
| Maximum TTL | 86,400 seconds (24 hours) |
| Minimum TTL | 1 second |

Set `ttl_seconds` in the reserve request to override the default. Values above the maximum are clamped to 86,400.
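
The default and clamping rules can be mirrored client-side, for example to predict `expires_at` before sending a request. This is a sketch, not the server's code. The function `resolve_ttl` is hypothetical, and rejecting values below the minimum is an assumption, since the source does not say how they are handled.

```python
from typing import Optional

DEFAULT_TTL_SECONDS = 1_800
MAX_TTL_SECONDS = 86_400
MIN_TTL_SECONDS = 1


def resolve_ttl(ttl_seconds: Optional[int] = None) -> int:
    """Return the TTL a reservation would get for the requested value."""
    if ttl_seconds is None:
        return DEFAULT_TTL_SECONDS
    if ttl_seconds < MIN_TTL_SECONDS:
        # Assumption: sub-minimum values are treated as invalid in this sketch.
        raise ValueError("ttl_seconds must be at least 1")
    # Values above the maximum are clamped rather than rejected.
    return min(ttl_seconds, MAX_TTL_SECONDS)


print(resolve_ttl())         # 1800
print(resolve_ttl(120))      # 120
print(resolve_ttl(200_000))  # 86400
```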

The auto-expiry sweep runs periodically. There may be a brief delay (seconds, not minutes) between when a reservation's TTL elapses and when the sweep processes it.

No webhook is fired when a reservation auto-expires today. If you need to react to expired reservations (e.g., to unblock a stuck user in your UI), rely on your own TTL timer and reconcile against the customer balance if needed.

## Concurrency safety

Reservations interact with the credit account's `reserved_balance`. When a reserve request is made:

1. The system reads the customer's `effective_balance` (balance - reserved_balance).
2. If `effective_balance >= estimated_cost`, the reservation is created and `reserved_balance` increases.
3. If not, the request is rejected with `402`.

Two parallel reserve requests for the same customer are serialized server-side. If a customer has 10,000 mc effective balance and two requests each try to reserve 8,000 mc simultaneously:

- Request A acquires the lock, checks `effective_balance` (10,000 >= 8,000), and succeeds. `reserved_balance` is now 8,000.
- Request B acquires the lock only after A's reservation has been written, checks `effective_balance` (10,000 - 8,000 = 2,000 < 8,000), and fails with `402`.

There is no window for double-spending.
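
The scenario above can be reproduced locally with two threads and a lock. This is only an illustration of check-and-update under serialization. The `LockedAccount` class is hypothetical, and the in-process lock merely stands in for whatever mechanism QuotaStack uses server-side.

```python
import threading


class LockedAccount:
    def __init__(self, balance: int) -> None:
        self.balance = balance
        self.reserved_balance = 0
        self._lock = threading.Lock()  # stand-in for server-side serialization

    def reserve(self, estimated_cost: int) -> bool:
        # The check and the update run under one lock, so no other request
        # can read the balance between them.
        with self._lock:
            effective = self.balance - self.reserved_balance
            if effective < estimated_cost:
                return False  # the API would answer 402 here
            self.reserved_balance += estimated_cost
            return True


account = LockedAccount(balance=10_000)
results = []


def worker() -> None:
    results.append(account.reserve(8_000))


threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))           # [False, True]
print(account.reserved_balance)  # 8000
```

Whichever thread wins the lock first succeeds. The other always sees the reduced effective balance and is rejected.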

## Example flow: AI outfit generation

A fashion SaaS uses QuotaStack to meter AI-generated outfit looks at 1,000 mc per look.

### 1. Reserve before starting generation

```bash
curl -X POST https://api.quotastack.io/v1/reserve \
  -H "X-API-Key: qs_live_..." \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: reserve-outfit-456" \
  -d '{
    "external_customer_id": "user_abc",
    "billable_metric_key": "look",
    "estimated_units": 1,
    "ttl_seconds": 120
  }'
```

Save the `id` from the response. It is the reservation ID you pass in the commit and release URLs.

### 2a. Commit on success

The AI pipeline completes. Commit with the actual units consumed:

```bash
curl -X POST https://api.quotastack.io/v1/reserve/rsv_01HXY.../commit \
  -H "X-API-Key: qs_live_..." \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: commit-outfit-456" \
  -d '{ "actual_units": 1 }'
```

Credits are debited. The customer sees the charge.

### 2b. Release on failure

The AI pipeline crashes. Release the reservation:

```bash
curl -X POST https://api.quotastack.io/v1/reserve/rsv_01HXY.../release \
  -H "X-API-Key: qs_live_..." \
  -H "Idempotency-Key: release-outfit-456"
```

Credits are returned. The customer is not charged.

### 2c. TTL safety net

If your server crashes and neither commit nor release is called, the reservation expires automatically after the TTL (120 seconds in this example). Credits are returned. The customer is not charged.
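
In application code, steps 1, 2a, and 2b usually collapse into one wrapper around the operation. The sketch below shows that shape. `FakeReservations` is an in-memory stand-in for illustration only (it is not a QuotaStack SDK), and its method names and signatures are assumptions. In practice each method would issue the corresponding HTTP request shown above.

```python
class FakeReservations:
    """In-memory stand-in for an HTTP client; not a real QuotaStack SDK."""

    def __init__(self) -> None:
        self.status = {}

    def reserve(self, customer: str, metric: str, units: int, ttl_seconds: int) -> str:
        reservation_id = f"rsv_{len(self.status) + 1}"
        self.status[reservation_id] = "active"
        return reservation_id

    def commit(self, reservation_id: str, actual_units: int) -> None:
        self.status[reservation_id] = "committed"

    def release(self, reservation_id: str) -> None:
        self.status[reservation_id] = "released"


def generate_with_reservation(client, customer: str, operation):
    """Reserve, run the operation, then commit on success or release on failure."""
    reservation_id = client.reserve(customer, "look", units=1, ttl_seconds=120)
    try:
        result = operation()
    except Exception:
        client.release(reservation_id)  # step 2b: return the held credits
        raise
    client.commit(reservation_id, actual_units=1)  # step 2a: finalize the charge
    return result


def failing_pipeline():
    raise RuntimeError("pipeline crashed")


client = FakeReservations()
generate_with_reservation(client, "user_abc", lambda: "outfit.png")
try:
    generate_with_reservation(client, "user_abc", failing_pipeline)
except RuntimeError:
    pass

print(client.status)  # {'rsv_1': 'committed', 'rsv_2': 'released'}
```

If the process dies before either branch runs, nothing in this wrapper executes and the TTL (step 2c) is what returns the credits.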

## When to use reservations vs. direct usage events

| Scenario | Approach |
|---|---|
| Instant operations (send a message, make an API call) | Record a [usage event](/docs/concepts/metering) directly. No reservation needed. |
| Operations that take seconds to minutes (AI generation, file processing) | Use a reservation. Reserve before starting, commit on success, release on failure. |
| Operations where cost depends on output (variable token count, variable file size) | Use a reservation with an estimated upper bound. Commit with actual units. The excess is returned. |

## Error states

| Error | HTTP status | When |
|---|---|---|
| Insufficient credits | 402 | Customer's effective_balance cannot cover the estimated cost. |
| Reservation not found | 404 | The reservation ID does not exist or belongs to a different tenant. |
| Reservation not active | 409 | Attempting to commit or release a reservation that is already committed, released, or expired. |
| Reservation expired | 409 | Attempting to commit a reservation whose TTL has elapsed. |

## Common Mistakes

**✗ Don't reserve for every call — only long or fallible operations**

Reservations add a round-trip. For fast, reliable operations (sub-second, no external API), a direct usage event is simpler and cheaper.

**✗ Don't set very long TTLs to "be safe"**

A 1-hour TTL keeps the customer's credits held for up to an hour whenever a worker crashes before it can commit or release, even if the job normally finishes in 30 seconds. Tune the TTL to your operation's p99 duration, not your worst case.
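
One way to pick a TTL from observed job durations is sketched below. It uses a nearest-rank 99th percentile. The function names and the `headroom` multiplier are assumptions for illustration, not QuotaStack guidance. The result is capped at the documented 86,400-second maximum.

```python
import math


def p99(durations_seconds: list[float]) -> float:
    """Nearest-rank 99th percentile of the observed durations."""
    if not durations_seconds:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_seconds)
    # Integer form of ceil(0.99 * n), avoiding floating-point rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def suggest_ttl(durations_seconds: list[float], headroom: float = 1.5) -> int:
    """Return a TTL in seconds: p99 times a safety margin, capped at 24 hours."""
    return min(math.ceil(p99(durations_seconds) * headroom), 86_400)


durations = list(range(1, 101))  # sample: jobs that took 1..100 seconds
print(p99(durations))            # 99
print(suggest_ttl(durations))    # 149
```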
