---
title: "AI Generation App: Packs + Reserve/Commit/Release"
description: "How to model an AI outfit generation app with look packs, free credits on signup, and credit reservations for long-running AI calls."
order: 4
---

# AI Generation App

**Pattern:** PACKS · RESERVE / COMMIT / RELEASE

*Inspired by: Midjourney, Runway, ClosetNow-style generation apps*

> **Mental Model:** Think of this as **buying outfit looks in packs**: customers purchase weekly or monthly packs of generation credits, plus a few free looks on signup. Each AI generation reserves credits up front, commits on success, releases on failure — no concurrent overspending, no charges for failed runs.

## Quick Take

- **Credit packs** with expiry (weekly = 7 days, monthly = 30 days), priority 10
- A small **free-credit grant** on signup (priority 0, no expiry) as a try-before-you-buy
- **Reserve before generation, commit on success, release on failure.** A reservation TTL acts as the safety net if a worker crashes
- Burn order auto-prefers expiring packs over free credits

## Diagram

A customer signs up and a small free-credit grant lands. When the customer buys a pack, call `POST /v1/topup/grant` with `expires_at`. For each generation, `POST /v1/reserve` holds the estimated cost, the generation runs (seconds to minutes), and then you call `POST /v1/reserve/{id}/commit` on success or `POST /v1/reserve/{id}/release` on failure. If the worker crashes, the TTL expires the reservation automatically and the credits return to the customer.

```mermaid
flowchart TD
    A[Sign up] --> B[POST /v1/topup/grant<br/>free credits, P0]
    C[Customer buys pack] --> D[POST /v1/topup/grant<br/>pack credits, P10, expires]
    E[Start generation] --> F[POST /v1/reserve<br/>hold estimated cost]
    F --> G{Job result}
    G -->|success| H[POST /reserve/id/commit<br/>actual_units]
    G -->|failure| I[POST /reserve/id/release<br/>return credits]
    G -->|worker crash| J[TTL auto-expiry<br/>credits returned]
```

## The problem

You run an AI outfit generation app. Users upload a photo, the AI generates outfit "looks." Each look costs 1 credit. The AI call takes 30-90 seconds. Users get free looks on signup and buy look packs (weekly, monthly) with time-limited expiry.

You need:

- Free looks granted at signup that never expire.
- Paid look packs with 7-day or 30-day expiry, purchased after external payment.
- Credit reservation during the AI call so two parallel requests cannot overdraw the account.
- Clean handling of AI failures -- if generation fails, the user is not charged.
- A profile screen showing remaining looks with block-level breakdown.

## Credit structure

| Block type | Priority | Expiry | Source |
|---|---|---|---|
| Free looks (signup) | 0 | None | Topup grant on signup |
| Weekly pack | 10 | 7 days | Topup grant after payment |
| Monthly pack | 10 | 30 days | Topup grant after payment |

**Burn-down order:** Pack looks (priority 10) burn first. Within priority 10, the soonest-expiring pack burns first. Free looks (priority 0, no expiry) are the last resort.

This is the correct behavior: paid time-limited credits should be consumed before free permanent ones to minimize waste from expiry.
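
The ordering rule above can be sketched as a sort key. This is only an illustration of the rule as described (higher priority first, then soonest expiry, non-expiring blocks last), not QuotaStack's implementation; the block dictionaries and the `name` field are made up for the example, and any tie-break by block age is left out.

```python
from datetime import datetime, timezone

# Sentinel that sorts after every real expiry timestamp.
NEVER = datetime.max.replace(tzinfo=timezone.utc)

def burn_order_key(block):
    """Sort key: higher priority first, then soonest expiry, no-expiry last."""
    expires = block["expires_at"]
    if expires is None:
        when = NEVER
    else:
        when = datetime.fromisoformat(expires.replace("Z", "+00:00"))
    return (-block["priority"], when)

# Hypothetical blocks: one free grant and two paid packs.
blocks = [
    {"name": "free", "priority": 0, "expires_at": None},
    {"name": "monthly", "priority": 10, "expires_at": "2026-05-13T08:00:00Z"},
    {"name": "weekly", "priority": 10, "expires_at": "2026-04-20T08:00:00Z"},
]

ordered = [b["name"] for b in sorted(blocks, key=burn_order_key)]
print(ordered)  # ['weekly', 'monthly', 'free']
```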

## Billable metric

One metric, one rule:

```bash
POST /v1/billable-metrics
Idempotency-Key: metric:look

{ "key": "look", "name": "Look Generation" }
```

```bash
POST /v1/metering-rules
Idempotency-Key: rule:look

{
  "billable_metric_key": "look",
  "cost_type": "per_unit",
  "credit_cost": 1000,
  "unit_cost": 1000
}
```

1 look = 1000mc (millicredits) = 1 credit.

## Integration flow

### 1. Signup -- grant free looks

When a new user signs up, create the customer and grant 3 free looks:

```bash
POST /v1/customers
Idempotency-Key: signup:<your_user_id>

{
  "external_id": "user_56789"
}
```

```bash
POST /v1/topup/grant
Idempotency-Key: topup:signup-grant-<your_user_id>

{
  "customer_id": "cus_...",
  "credits": 3000,
  "metadata": {
    "source": "signup_free",
    "looks": 3
  }
}
```

3000mc = 3 looks. No expiry, priority 0 (default). These are the safety net.

### 2. Pack purchase after payment

User buys a weekly pack (24 looks) via your payment provider (Cashfree, Stripe, etc.). After payment confirmation:

```bash
POST /v1/topup/grant
Idempotency-Key: topup:<cashfree_order_id>

{
  "customer_id": "cus_...",
  "credits": 24000,
  "expires_at": "2026-04-20T08:00:00Z",
  "external_payment_id": "cashfree_order_abc",
  "priority": 10,
  "metadata": {
    "source": "weekly_pack",
    "looks": 24
  }
}

24000mc = 24 looks. Expires in 7 days. Priority 10 ensures these burn before the free looks.
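
The `credits` and `expires_at` values in that request are derived from the pack size and the purchase time. A minimal sketch of that arithmetic follows; `pack_grant_fields` is a hypothetical helper, and the purchase timestamp is an example value chosen to reproduce the numbers above.

```python
from datetime import datetime, timedelta, timezone

MC_PER_LOOK = 1000  # 1 look = 1000mc, per the metering rule

def pack_grant_fields(looks, valid_days, purchased_at):
    """Return the credits and expires_at fields for a pack grant."""
    expires_at = purchased_at.astimezone(timezone.utc) + timedelta(days=valid_days)
    return {
        "credits": looks * MC_PER_LOOK,
        "expires_at": expires_at.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }

# Example: a 24-look weekly pack bought on 13 April 2026 at 08:00 UTC.
purchased = datetime(2026, 4, 13, 8, 0, 0, tzinfo=timezone.utc)
fields = pack_grant_fields(looks=24, valid_days=7, purchased_at=purchased)
print(fields)  # {'credits': 24000, 'expires_at': '2026-04-20T08:00:00Z'}
```

Computing the expiry from the payment confirmation time, rather than from when your worker happens to run, keeps retries of the same grant consistent.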

If you have pre-configured topup packages, reference them by ID instead:

```bash
POST /v1/topup/grant
Idempotency-Key: topup:<cashfree_order_id>

{
  "customer_id": "cus_...",
  "package_id": "pkg_weekly_24",
  "external_payment_id": "cashfree_order_abc"
}
```

The package carries the credits amount, expiry duration, and metadata. You only need to pass the customer and payment reference.

### 3. Generate a look -- reserve, call AI, commit or release

This is where reservations matter.

**Step 1: Check entitlement.**

```bash
GET /v1/customers/cus_.../entitlements/look?units=1
```

```json
{
  "allowed": true,
  "balance": 27000,
  "effective_balance": 27000,
  "cost_per_unit": 1000,
  "cost_total": 1000,
  "affordable_units": 27
}
```

If `allowed` is false, show the user a "buy more looks" prompt. Do not proceed.

**Step 2: Reserve credits.**

```bash
POST /v1/reserve
Idempotency-Key: reservation:<your_request_id>

{
  "customer_id": "cus_...",
  "billable_metric_key": "look",
  "estimated_units": 1,
  "ttl_seconds": 300,
  "metadata": {
    "request_id": "req_gen_001"
  }
}
```

Response:
```json
{
  "id": "res_...",
  "status": "active",
  "estimated_units": 1,
  "estimated_cost": 1000,
  "expires_at": "2026-04-13T08:05:00Z",
  "effective_balance_after": 26000
}
```

The reservation immediately decrements the effective balance by 1000mc. Any concurrent request now sees `effective_balance: 26000` and can only reserve against that.

**Step 3: Run the AI.**

Call your AI model (Gemini, GPT, Stable Diffusion, etc.). This takes 30-90 seconds.

**Step 4a: AI succeeds -- commit.**

```bash
POST /v1/reserve/res_.../commit
Idempotency-Key: commit:res_...

{
  "actual_units": 1,
  "metadata": {
    "model": "gemini-pro",
    "generation_time_ms": 45000
  }
}
```

Response:
```json
{
  "id": "res_...",
  "status": "committed",
  "actual_units": 1,
  "actual_cost": 1000,
  "released": 0,
  "balance_after": 26000
}
```

The 1000mc is permanently debited from the first block in burn-down order, which here is the weekly pack (priority 10).

**Step 4b: AI fails -- release.**

```bash
POST /v1/reserve/res_.../release
Idempotency-Key: release:res_...

{}
```

Response:
```json
{
  "id": "res_...",
  "status": "released"
}
```

The 1000mc hold is released. The user's effective balance returns to 27000mc. No charge.

### 4. Profile screen -- show balance with block breakdown

```bash
GET /v1/customers/cus_.../credits?include_blocks=true
```

Response:
```json
{
  "customer_id": "cus_...",
  "balance": 26000,
  "reserved_balance": 0,
  "effective_balance": 26000,
  "blocks": [
    {
      "id": "blk_...",
      "remaining_amount": 23000,
      "original_amount": 24000,
      "source": "topup",
      "priority": 10,
      "expires_at": "2026-04-20T08:00:00Z",
      "metadata": { "source": "weekly_pack", "looks": 24 }
    },
    {
      "id": "blk_...",
      "remaining_amount": 3000,
      "original_amount": 3000,
      "source": "topup",
      "priority": 0,
      "expires_at": null,
      "metadata": { "source": "signup_free", "looks": 3 }
    }
  ]
}
```

From this, render:

- **Weekly pack:** 23 looks remaining, expires Apr 20
- **Free looks:** 3 looks remaining, no expiry
- **Total:** 26 looks available

Divide `remaining_amount` by 1000 to get look counts (since 1 look = 1000mc).
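
As a rough sketch, the rendering step could look like the following. The `LABELS` mapping and `render_profile` function are hypothetical names, and the sample dictionary is a trimmed copy of the response above.

```python
from datetime import datetime

MC_PER_LOOK = 1000  # 1 look = 1000mc

# Hypothetical display names keyed by the metadata "source" value.
LABELS = {"weekly_pack": "Weekly pack", "signup_free": "Free looks"}

def render_profile(credits_response):
    """Turn a credits response with blocks into display lines."""
    lines = []
    for block in credits_response["blocks"]:
        looks = block["remaining_amount"] // MC_PER_LOOK
        source = block["metadata"].get("source", block["source"])
        label = LABELS.get(source, source)
        if block["expires_at"] is None:
            expiry = "no expiry"
        else:
            dt = datetime.fromisoformat(block["expires_at"].replace("Z", "+00:00"))
            expiry = f"expires {dt.strftime('%b')} {dt.day}"
        lines.append(f"{label}: {looks} looks remaining, {expiry}")
    total = credits_response["effective_balance"] // MC_PER_LOOK
    lines.append(f"Total: {total} looks available")
    return lines

sample = {
    "effective_balance": 26000,
    "blocks": [
        {"remaining_amount": 23000, "source": "topup",
         "expires_at": "2026-04-20T08:00:00Z",
         "metadata": {"source": "weekly_pack"}},
        {"remaining_amount": 3000, "source": "topup",
         "expires_at": None,
         "metadata": {"source": "signup_free"}},
    ],
}

for line in render_profile(sample):
    print(line)
```

The total uses `effective_balance` rather than `balance`, so looks held by an in-flight reservation are not shown as available.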

## Why reservations matter

Without reservations, this race condition is possible:

1. User has 1 look remaining (1000mc).
2. User taps "Generate" on two devices simultaneously.
3. Both requests check entitlement: both see `balance: 1000, allowed: true`.
4. Both requests call the AI.
5. Both requests try to debit 1000mc.
6. One succeeds. The other either fails (bad UX after a 60-second wait) or overdraws the account.

With reservations:

1. User has 1 look remaining.
2. Request A reserves 1000mc. Effective balance drops to 0.
3. Request B checks entitlement: sees `effective_balance: 0, allowed: false`. Fails fast.
4. Request A runs AI, commits. Clean.

The reservation is a pessimistic lock on the credit balance. It costs one extra API call but eliminates the double-spend problem entirely.
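
To make the mechanics concrete, here is a toy in-memory model of a balance with holds. It is not QuotaStack's implementation: the `Wallet` class and its method names are invented for illustration, and it is single-threaded, whereas a real service has to make the check-and-hold step atomic.

```python
class InsufficientCredits(Exception):
    """Raised when a hold would exceed the effective balance."""

class Wallet:
    """Toy model: balance minus active holds gives the effective balance."""

    def __init__(self, balance):
        self.balance = balance
        self.holds = {}
        self._next_id = 1

    @property
    def effective_balance(self):
        return self.balance - sum(self.holds.values())

    def reserve(self, amount):
        if amount > self.effective_balance:
            raise InsufficientCredits(
                f"need {amount}, have {self.effective_balance}"
            )
        rid = f"res_{self._next_id}"
        self._next_id += 1
        self.holds[rid] = amount
        return rid

    def commit(self, rid, actual):
        held = self.holds.pop(rid)
        if actual > held:
            raise ValueError("actual cost exceeds the held amount")
        self.balance -= actual  # any unused part of the hold is simply dropped

    def release(self, rid):
        self.holds.pop(rid)

wallet = Wallet(balance=1000)      # one look left
a = wallet.reserve(1000)           # request A holds the last look
try:
    wallet.reserve(1000)           # request B arrives while A is running
    b_result = "reserved"
except InsufficientCredits:
    b_result = "rejected"
wallet.commit(a, actual=1000)
print(b_result, wallet.balance, wallet.effective_balance)  # rejected 0 0
```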

## Failure policy: fail-closed

Never run the AI call without a confirmed reservation. The correct order is:

1. Check entitlement (fast, sub-1ms on the hot path).
2. Reserve (creates the hold).
3. Run AI (only if reservation succeeded).
4. Commit or release.

If the reservation returns 402 (insufficient credits), stop. Do not proceed to the AI call. This is fail-closed: absent proof of available credits, the expensive operation does not run.

## Auto-expiry of reservations

Every reservation carries a TTL (set via `ttl_seconds`, default 300 seconds, max 3600). If your server crashes after reserving but before committing or releasing, QuotaStack's reservation reaper automatically releases the hold once the TTL expires, and the credits return to the effective balance.

This means:

- Set `ttl_seconds` well above your worst-case AI generation time. If generation takes up to 90 seconds, a TTL of 300 (5 minutes) leaves a comfortable safety margin.
- You do not need to build your own cleanup job for abandoned reservations.
- The auto-release writes a ledger entry, so you have an audit trail.
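
Conceptually, a reaper of this kind scans for holds whose TTL has elapsed, drops them, and records each release. The sketch below only illustrates that idea; the `HoldTable` class, the numeric clock, and the `reservation.expired` ledger type are assumptions for the example, not QuotaStack internals.

```python
from dataclasses import dataclass, field

@dataclass
class Hold:
    amount: int
    expires_at: float  # seconds on whatever clock the caller uses

@dataclass
class HoldTable:
    holds: dict = field(default_factory=dict)
    ledger: list = field(default_factory=list)

    def reserve(self, rid, amount, now, ttl_seconds=300):
        self.holds[rid] = Hold(amount, now + ttl_seconds)

    def reap(self, now):
        """Release every hold whose TTL has elapsed; return the amount freed."""
        expired = [rid for rid, h in self.holds.items() if h.expires_at <= now]
        freed = 0
        for rid in expired:
            hold = self.holds.pop(rid)
            freed += hold.amount
            # Hypothetical ledger entry shape, for the audit trail.
            self.ledger.append(
                {"type": "reservation.expired",
                 "reservation_id": rid,
                 "amount": hold.amount}
            )
        return freed

table = HoldTable()
table.reserve("res_1", 1000, now=0)   # worker crashes after this call
print(table.reap(now=120))  # 0, the hold is still within its TTL
print(table.reap(now=300))  # 1000, the hold expired and was released
```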

## Full generation handler

```python
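import uuid

# `quotastack`, `ai_service`, `error`, `success`, InsufficientCreditsError, and
# AIServiceError are assumed to be defined elsewhere in your application.
def handle_generate_look(customer_id, photo, style_preferences):
    # One ID per generation request; it seeds the reservation's idempotency key.
    request_id = str(uuid.uuid4())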

    # 1. Entitlement check
    ent = quotastack.get_entitlement(customer_id, "look", units=1)
    if not ent.allowed:
        return error(
            "No looks remaining. Buy a pack to continue.",
            affordable_units=ent.affordable_units
        )

    # 2. Reserve
    try:
        reservation = quotastack.reserve(
            customer_id=customer_id,
            billable_metric_key="look",
            estimated_units=1,
            ttl_seconds=300,
            idempotency_key=f"reservation:{request_id}",
            metadata={"request_id": request_id}
        )
    except InsufficientCreditsError:
        # Race condition: someone else consumed between check and reserve
        return error("No looks remaining. Buy a pack to continue.")

    # 3. Run AI (30-90 seconds)
    try:
        result = ai_service.generate_outfit(photo, style_preferences)
    except AIServiceError as e:
        # 4b. AI failed -- release reservation
        quotastack.release(
            reservation_id=reservation.id,
            idempotency_key=f"release:{reservation.id}"
        )
        return error("Generation failed. You were not charged.", detail=str(e))

    # 4a. AI succeeded -- commit
    quotastack.commit(
        reservation_id=reservation.id,
        actual_units=1,
        idempotency_key=f"commit:{reservation.id}",
        metadata={
            "model": result.model,
            "generation_time_ms": result.duration_ms
        }
    )

    return success(result.images)
```
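
The handler above issues a single commit call. Because the commit idempotency key is derived from the reservation ID, a commit that times out can be retried with the same key. A minimal retry wrapper might look like this; `commit_with_retry` and the `commit_fn` parameter are hypothetical, with `commit_fn` standing in for whatever client call you use, and `TimeoutError` standing in for your client's timeout exception.

```python
import time

def commit_with_retry(commit_fn, reservation_id, actual_units,
                      attempts=3, base_delay=0.5, sleep=time.sleep):
    """Retry a commit on timeout, reusing one idempotency key throughout."""
    key = f"commit:{reservation_id}"
    last_error = None
    for attempt in range(attempts):
        try:
            return commit_fn(
                reservation_id=reservation_id,
                actual_units=actual_units,
                idempotency_key=key,
            )
        except TimeoutError as exc:
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # exponential backoff
    raise last_error

# Fake commit that times out twice, then succeeds.
calls = []

def flaky_commit(reservation_id, actual_units, idempotency_key):
    calls.append(idempotency_key)
    if len(calls) < 3:
        raise TimeoutError("simulated timeout")
    return {"id": reservation_id, "status": "committed",
            "actual_units": actual_units}

result = commit_with_retry(flaky_commit, "res_123", 1, sleep=lambda s: None)
print(result["status"], len(calls))  # committed 3
```

If every attempt fails, the reservation is still covered by its TTL, so the hold is eventually released rather than leaked.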

## Pack expiry and re-purchase

When a pack expires, QuotaStack fires a `credit.expired` webhook:

```json
{
  "event": "credit.expired",
  "customer_id": "cus_...",
  "credit_block_id": "blk_...",
  "amount": 5000,
  "expired_at": "2026-04-20T08:00:00Z"
}
```

Use this to send a push notification: "Your weekly pack expired. 5 looks were unused. Buy a new pack to keep generating."
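
A sketch of turning that payload into a notification is shown below. The `expiry_notification` function is hypothetical, and it covers only the message-building step. The sample payload carries no pack name, so naming the pack (weekly or monthly) would need a lookup of the block's metadata, which this sketch skips. Whether and how webhooks are authenticated is not covered here.

```python
import json

MC_PER_LOOK = 1000  # 1 look = 1000mc

def expiry_notification(raw_body):
    """Build a push message from a credit.expired payload, else return None."""
    event = json.loads(raw_body)
    if event.get("event") != "credit.expired":
        return None
    looks = event["amount"] // MC_PER_LOOK
    noun = "look was" if looks == 1 else "looks were"
    return {
        "customer_id": event["customer_id"],
        "message": (
            f"Your pack expired. {looks} {noun} unused. "
            "Buy a new pack to keep generating."
        ),
    }

body = json.dumps({
    "event": "credit.expired",
    "customer_id": "cus_example",
    "credit_block_id": "blk_example",
    "amount": 5000,
    "expired_at": "2026-04-20T08:00:00Z",
})
note = expiry_notification(body)
print(note["message"])
```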

## Tips

- **One metric, one rule.** Unlike the chat app pattern where plan purchases are modeled as separate metrics, here there is only one billable metric (`look`). Pack purchases happen outside QuotaStack's metering -- you grant credits directly via `POST /v1/topup/grant` after confirming payment.

- **Idempotency key conventions.** Use `reservation:<request_id>`, `commit:<reservation_id>`, `release:<reservation_id>`. These deterministic keys make retries safe. If your commit call times out, retry with the same key -- QuotaStack returns the original response.

- **TTL tuning.** Set reservation TTL to 2-3x your expected AI generation time. Too short and the reservation expires mid-generation, causing a commit failure. Too long and abandoned reservations lock credits unnecessarily. 300 seconds (5 minutes) is a good default for 30-90 second AI calls.

- **Blocks are returned in burn-down order.** The `blocks` array from `GET /credits?include_blocks=true` is sorted by the same priority/expiry/age order used for debits. The first block in the array is the one that will be consumed next.

- **No subscriptions needed.** This entire pattern works without QuotaStack subscriptions. Customers exist, credits are granted via topups, usage is metered via reservations and commits. Subscriptions are optional in QuotaStack -- use them when you need recurring billing cycles, skip them when your model is pure pack-based.

See also: [Reservations](/docs/concepts/reservations), [Credits](/docs/concepts/credits), [Topup Packages](/docs/concepts/topups-and-wallets#topup-packages).

## Concepts Used

- [Reservations](/docs/concepts/reservations)
- [Topups & Wallets](/docs/concepts/topups-and-wallets)
- [Credits](/docs/concepts/credits)
- [Idempotency](/docs/concepts/idempotency)