AI Generation App: Packs + Reserve/Commit/Release

How to model an AI outfit generation app with look packs, free credits on signup, and credit reservations for long-running AI calls.

Inspired by: Midjourney, Runway, ClosetNow-style generation apps

Mental Model

Think of this as buying outfit looks in packs: customers purchase weekly or monthly packs of generation credits and receive a few free looks on signup. Each AI generation reserves credits up front, commits on success, and releases on failure, so there is no concurrent overspending and no charge for failed runs.

Quick Take
  • Credit packs with expiry (weekly = 7 days, monthly = 30 days) at priority 10.
  • A small free-credit grant on signup (priority 0, no expiry) as a try-before-you-buy.
  • Reserve before generation, commit on success, release on failure; a TTL auto-releases holds left behind by crashed workers.
  • Burn order automatically prefers expiring packs over free credits.
[Diagram: Pack Purchase (priority 10, 7- or 30-day expiry) → generation starts → Reserve (holds estimated cost) → on success: Commit (actual_units); on failure: Release (credits returned); on worker crash: TTL Expiry (auto-released).]

AI Generation App

Pattern: packs + reserve/commit/release.

The problem

You run an AI outfit generation app. Users upload a photo, the AI generates outfit “looks.” Each look costs 1 credit. The AI call takes 30-90 seconds. Users get free looks on signup and buy look packs (weekly, monthly) with time-limited expiry.

You need:

  • Free looks granted at signup that never expire.
  • Paid look packs with 7-day or 30-day expiry, purchased after external payment.
  • Credit reservation during the AI call so two parallel requests cannot overdraw the account.
  • Clean handling of AI failures — if generation fails, the user is not charged.
  • A profile screen showing remaining looks with block-level breakdown.

Credit structure

Block type           | Priority | Expiry  | Source
Free looks (signup)  | 0        | None    | Topup grant on signup
Weekly pack          | 10       | 7 days  | Topup grant after payment
Monthly pack         | 10       | 30 days | Topup grant after payment

Burn-down order: Pack looks (priority 10) burn first. Within priority 10, the soonest-expiring pack burns first. Free looks (priority 0, no expiry) are the last resort.

This is the correct behavior: paid time-limited credits should be consumed before free permanent ones to minimize waste from expiry.
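
The ordering rule above can be sketched as a sort key. This is an illustrative model only, not QuotaStack's implementation: `burn_order` is a hypothetical helper, the `name` field is added for readability, and the `priority` and `expires_at` fields mirror the block fields shown in the credits response later on this page.

```python
from datetime import datetime, timezone

def burn_order(blocks):
    """Sort blocks so the first element is the one debited next (illustrative model)."""
    far_future = datetime.max.replace(tzinfo=timezone.utc)

    def key(block):
        expires = block["expires_at"]
        # Blocks without expiry sort after every expiring block of equal priority.
        expires_at = (
            datetime.fromisoformat(expires.replace("Z", "+00:00"))
            if expires
            else far_future
        )
        # Higher priority first (hence the negation), then soonest expiry.
        return (-block["priority"], expires_at)

    return sorted(blocks, key=key)

blocks = [
    {"name": "free", "priority": 0, "expires_at": None},
    {"name": "monthly", "priority": 10, "expires_at": "2026-05-13T08:00:00Z"},
    {"name": "weekly", "priority": 10, "expires_at": "2026-04-20T08:00:00Z"},
]
ordered_names = [b["name"] for b in burn_order(blocks)]
```

With these three blocks, the weekly pack comes first, then the monthly pack, then the free looks.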

Billable metric

One metric, one rule:

POST /v1/billable-metrics
Idempotency-Key: metric:look

{ "key": "look", "name": "Look Generation" }

POST /v1/metering-rules
Idempotency-Key: rule:look

{
  "billable_metric_key": "look",
  "cost_type": "per_unit",
  "credit_cost": 1000,
  "unit_cost": 1000
}

1 look = 1000mc (millicredits) = 1 credit. All credit amounts in the examples below are expressed in mc.

Integration flow

1. Signup — grant free looks

When a new user signs up, create the customer and grant 3 free looks:

POST /v1/customers
Idempotency-Key: signup:<your_user_id>

{
  "external_id": "user_56789"
}

POST /v1/topup/grant
Idempotency-Key: topup:signup-grant-<your_user_id>

{
  "customer_id": "cus_...",
  "credits": 3000,
  "metadata": {
    "source": "signup_free",
    "looks": 3
  }
}

3000mc = 3 looks. No expiry, priority 0 (default). These are the safety net.

2. Pack purchase after payment

User buys a weekly pack (24 looks) via your payment provider (Cashfree, Stripe, etc.). After payment confirmation:

POST /v1/topup/grant
Idempotency-Key: topup:<cashfree_order_id>

{
  "customer_id": "cus_...",
  "credits": 24000,
  "expires_at": "2026-04-20T08:00:00Z",
  "external_payment_id": "cashfree_order_abc",
  "priority": 10,
  "metadata": {
    "source": "weekly_pack",
    "looks": 24
  }
}

24000mc = 24 looks. Expires in 7 days. Priority 10 ensures these burn before the free looks.
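
As a rough sketch, the `credits` and `expires_at` values for a pack grant could be derived like this. `pack_grant_fields` and the `PACK_DAYS` table are hypothetical names, and the `monthly_pack` key is an assumed label; only the 7-day and 30-day durations and the 1000mc-per-look rate come from this page.

```python
from datetime import datetime, timedelta, timezone

MC_PER_LOOK = 1000                                 # 1 look = 1000mc
PACK_DAYS = {"weekly_pack": 7, "monthly_pack": 30}  # assumed pack labels

def pack_grant_fields(pack, looks, purchased_at=None):
    """Return the credits/expires_at fields to merge into the topup grant body."""
    purchased_at = purchased_at or datetime.now(timezone.utc)
    expires_at = purchased_at + timedelta(days=PACK_DAYS[pack])
    return {
        "credits": looks * MC_PER_LOOK,
        # UTC timestamp in the same format as the examples on this page.
        "expires_at": expires_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }

fields = pack_grant_fields(
    "weekly_pack", 24, datetime(2026, 4, 13, 8, 0, tzinfo=timezone.utc)
)
```

Computing the expiry from the payment confirmation time, rather than from when the user opened the checkout, keeps the full 7 or 30 days available to the customer.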

If you have pre-configured topup packages, reference them by ID instead:

POST /v1/topup/grant
Idempotency-Key: topup:<cashfree_order_id>

{
  "customer_id": "cus_...",
  "package_id": "pkg_weekly_24",
  "external_payment_id": "cashfree_order_abc"
}

The package carries the credits amount, expiry duration, and metadata. You only need to pass the customer and payment reference.

3. Generate a look — reserve, call AI, commit or release

This is where reservations matter.

Step 1: Check entitlement.

GET /v1/customers/cus_.../entitlements/look?units=1

Response:

{
  "allowed": true,
  "balance": 27000,
  "effective_balance": 27000,
  "cost_per_unit": 1000,
  "cost_total": 1000,
  "affordable_units": 27
}

If allowed is false, show the user a “buy more looks” prompt. Do not proceed.

Step 2: Reserve credits.

POST /v1/reserve
Idempotency-Key: reservation:<your_request_id>

{
  "customer_id": "cus_...",
  "billable_metric_key": "look",
  "estimated_units": 1,
  "ttl_seconds": 300,
  "metadata": {
    "request_id": "req_gen_001"
  }
}

Response:

{
  "id": "res_...",
  "status": "active",
  "estimated_units": 1,
  "estimated_cost": 1000,
  "expires_at": "2026-04-13T08:05:00Z",
  "effective_balance_after": 26000
}

The reservation immediately decrements the effective balance by 1000mc. Any concurrent request now sees effective_balance: 26000 and can only reserve against that.

Step 3: Run the AI.

Call your AI model (Gemini, GPT, Stable Diffusion, etc.). This takes 30-90 seconds.

Step 4a: AI succeeds — commit.

POST /v1/reserve/res_.../commit
Idempotency-Key: commit:res_...

{
  "actual_units": 1,
  "metadata": {
    "model": "gemini-pro",
    "generation_time_ms": 45000
  }
}

Response:

{
  "id": "res_...",
  "status": "committed",
  "actual_units": 1,
  "actual_cost": 1000,
  "released": 0,
  "balance_after": 26000
}

The 1000mc is permanently debited from the block that is first in burn-down order: the highest-priority, soonest-expiring block (here, the weekly pack).

Step 4b: AI fails — release.

POST /v1/reserve/res_.../release
Idempotency-Key: release:res_...

{}

Response:

{
  "id": "res_...",
  "status": "released"
}

The 1000mc hold is released. The user’s effective balance returns to 27000mc. No charge.

4. Profile screen — show balance with block breakdown

GET /v1/customers/cus_.../credits?include_blocks=true

Response:

{
  "customer_id": "cus_...",
  "balance": 26000,
  "reserved_balance": 0,
  "effective_balance": 26000,
  "blocks": [
    {
      "id": "blk_...",
      "remaining_amount": 23000,
      "original_amount": 24000,
      "source": "topup",
      "priority": 10,
      "expires_at": "2026-04-20T08:00:00Z",
      "metadata": { "source": "weekly_pack", "looks": 24 }
    },
    {
      "id": "blk_...",
      "remaining_amount": 3000,
      "original_amount": 3000,
      "source": "topup",
      "priority": 0,
      "expires_at": null,
      "metadata": { "source": "signup_free", "looks": 3 }
    }
  ]
}

From this, render:

  • Weekly pack: 23 looks remaining, expires Apr 20
  • Free looks: 3 looks remaining, no expiry
  • Total: 26 looks available

Divide remaining_amount by 1000 to get look counts (since 1 look = 1000mc).
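
A minimal sketch of that rendering step, assuming the response shape shown above. `profile_lines` is a hypothetical helper, and using the `metadata.source` value as the label is a design choice for this example; a real app would map it to display text.

```python
MC_PER_LOOK = 1000  # 1 look = 1000mc

def profile_lines(credits_response):
    """Turn a credits response (with blocks) into display lines for the profile screen."""
    lines = []
    for block in credits_response["blocks"]:
        looks = block["remaining_amount"] // MC_PER_LOOK
        # Prefer the label stored in metadata at grant time; fall back to the block source.
        label = block.get("metadata", {}).get("source", block["source"])
        expiry = block["expires_at"]
        suffix = f"expires {expiry[:10]}" if expiry else "no expiry"
        lines.append(f"{label}: {looks} looks remaining, {suffix}")
    # effective_balance excludes credits held by in-flight reservations.
    total = credits_response["effective_balance"] // MC_PER_LOOK
    lines.append(f"Total: {total} looks available")
    return lines
```

Using `effective_balance` for the total means a look that is currently being generated is not shown as available.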

Why reservations matter

Without reservations, this race condition is possible:

  1. User has 1 look remaining (1000mc).
  2. User taps “Generate” on two devices simultaneously.
  3. Both requests check entitlement: both see balance: 1000, allowed: true.
  4. Both requests call the AI.
  5. Both requests try to debit 1000mc.
  6. One succeeds. The other either fails (bad UX after a 60-second wait) or overdraws the account.

With reservations:

  1. User has 1 look remaining.
  2. Request A reserves 1000mc. Effective balance drops to 0.
  3. Request B checks entitlement: sees effective_balance: 0, allowed: false. Fails fast.
  4. Request A runs AI, commits. Clean.

The reservation is a pessimistic lock on the credit balance. It costs one extra API call but eliminates the double-spend problem entirely.
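
To make the mechanics concrete, here is a toy in-memory model of holds. It is a sketch for intuition only: `ToyLedger` is a hypothetical class, not the QuotaStack API, and it ignores blocks, priorities, and TTLs.

```python
import threading

class ToyLedger:
    """In-memory stand-in for a credit balance with holds (illustration only)."""

    def __init__(self, balance):
        self.balance = balance   # credits owned, in mc
        self.reserved = 0        # credits currently held by reservations
        self.holds = {}          # reservation id -> held amount
        self._lock = threading.Lock()
        self._next_id = 0

    @property
    def effective_balance(self):
        return self.balance - self.reserved

    def reserve(self, amount):
        with self._lock:  # the check and the hold happen atomically
            if amount > self.balance - self.reserved:
                return None  # insufficient credits: fail fast
            self._next_id += 1
            res_id = f"res_{self._next_id}"
            self.holds[res_id] = amount
            self.reserved += amount
            return res_id

    def commit(self, res_id):
        with self._lock:
            amount = self.holds.pop(res_id)
            self.reserved -= amount
            self.balance -= amount  # the debit becomes permanent

    def release(self, res_id):
        with self._lock:
            self.reserved -= self.holds.pop(res_id)  # credits return untouched

ledger = ToyLedger(1000)
request_a = ledger.reserve(1000)  # request A gets the hold
request_b = ledger.reserve(1000)  # request B is refused before any AI call
ledger.commit(request_a)
```

The key property is that the availability check and the hold are a single atomic step, so the second request can never observe the credits as free.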

Failure policy: fail-closed

Never run the AI call without a confirmed reservation. The correct order is:

  1. Check entitlement (fast, sub-1ms on the hot path).
  2. Reserve (creates the hold).
  3. Run AI (only if reservation succeeded).
  4. Commit or release.

If the reservation returns 402 (insufficient credits), stop. Do not proceed to the AI call. This is fail-closed: absent proof of available credits, the expensive operation does not run.

Auto-expiry of reservations

Every reservation carries a TTL, set via ttl_seconds (default 300 seconds, max 3600). If your server crashes after reserving but before committing or releasing, QuotaStack’s reservation reaper automatically releases the hold once the TTL expires, and the credits return to the effective balance.

This means:

  • Set ttl_seconds comfortably above your worst-case AI generation time. If generation takes up to 90 seconds, a TTL of 300 (5 minutes) leaves a safe margin.
  • You do not need to build your own cleanup job for abandoned reservations.
  • The auto-release writes a ledger entry, so you have an audit trail.
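
One way to turn the TTL guidance above into code is a small helper like the following. It is a sketch under assumptions: `pick_ttl_seconds` is a hypothetical name, the 3x multiplier is one possible safety margin, and using the 300-second default as a floor is a choice made for this example. Only the default (300) and maximum (3600) values come from this page.

```python
import math

DEFAULT_TTL = 300  # documented default for ttl_seconds
MAX_TTL = 3600     # documented maximum for ttl_seconds

def pick_ttl_seconds(worst_case_seconds, multiplier=3.0):
    """Return multiplier x worst case, floored at the default and capped at the max."""
    ttl = math.ceil(worst_case_seconds * multiplier)
    return min(MAX_TTL, max(DEFAULT_TTL, ttl))

ttl_for_looks = pick_ttl_seconds(90)  # worst-case 90-second generation
```

If the worst-case run time itself approaches the 3600-second cap, the cap leaves little or no margin, and a reservation could expire mid-generation.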

Full generation handler

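import uuid

# Assumes `quotastack` (API client), `ai_service`, `AIServiceError`,
# `InsufficientCreditsError`, and the `error`/`success` response helpers
# are provided by your application.
def handle_generate_look(customer_id, photo, style_preferences):
    # One ID per user action; reuse the same ID if this request is retried.
    request_id = str(uuid.uuid4())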

    # 1. Entitlement check
    ent = quotastack.get_entitlement(customer_id, "look", units=1)
    if not ent.allowed:
        return error(
            "No looks remaining. Buy a pack to continue.",
            affordable_units=ent.affordable_units
        )

    # 2. Reserve
    try:
        reservation = quotastack.reserve(
            customer_id=customer_id,
            billable_metric_key="look",
            estimated_units=1,
            ttl_seconds=300,
            idempotency_key=f"reservation:{request_id}",
            metadata={"request_id": request_id}
        )
    except InsufficientCreditsError:
        # Race condition: someone else consumed between check and reserve
        return error("No looks remaining. Buy a pack to continue.")

    # 3. Run AI (30-90 seconds)
    try:
        result = ai_service.generate_outfit(photo, style_preferences)
    except AIServiceError as e:
        # 4b. AI failed -- release reservation
        quotastack.release(
            reservation_id=reservation.id,
            idempotency_key=f"release:{reservation.id}"
        )
        return error("Generation failed. You were not charged.", detail=str(e))

    # 4a. AI succeeded -- commit
    quotastack.commit(
        reservation_id=reservation.id,
        actual_units=1,
        idempotency_key=f"commit:{reservation.id}",
        metadata={
            "model": result.model,
            "generation_time_ms": result.duration_ms
        }
    )

    return success(result.images)
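
The handler above releases only on AIServiceError; any other exception leaves the hold in place until the TTL expires. A possible variation wraps the commit-or-release decision in a context manager so that every exception releases immediately. This is a sketch: `held_reservation` is a hypothetical helper, and `client` is assumed to expose the same `commit` and `release` calls used in the handler.

```python
from contextlib import contextmanager

@contextmanager
def held_reservation(client, reservation_id, actual_units=1):
    """Commit the reservation if the body succeeds; release it on any exception."""
    try:
        yield
    except Exception:
        # Any failure inside the `with` body returns the held credits right away.
        client.release(
            reservation_id=reservation_id,
            idempotency_key=f"release:{reservation_id}",
        )
        raise
    else:
        client.commit(
            reservation_id=reservation_id,
            actual_units=actual_units,
            idempotency_key=f"commit:{reservation_id}",
        )
```

Usage would look like `with held_reservation(quotastack, reservation.id): result = ai_service.generate_outfit(photo, style_preferences)`. If the release call itself fails, the reservation TTL remains the backstop.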

Pack expiry and re-purchase

When a pack expires, QuotaStack fires a credit.expired webhook:

{
  "event": "credit.expired",
  "customer_id": "cus_...",
  "credit_block_id": "blk_...",
  "amount": 5000,
  "expired_at": "2026-04-20T08:00:00Z"
}

Use this to send a push notification: “Your weekly pack expired. 5 looks were unused. Buy a new pack to keep generating.”
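
A minimal sketch of building that message from the webhook payload. It assumes `amount` is the unused remainder in mc, which matches the example above (5000 mapping to 5 looks). `expiry_notification` is a hypothetical helper, and because the payload shown does not say which pack expired, the wording stays generic.

```python
MC_PER_LOOK = 1000  # 1 look = 1000mc

def expiry_notification(event):
    """Return push-notification text for a credit.expired event, or None for other events."""
    if event.get("event") != "credit.expired":
        return None
    unused = event["amount"] // MC_PER_LOOK
    if unused <= 0:
        return "Your pack expired. Buy a new pack to keep generating."
    noun = "look was" if unused == 1 else "looks were"
    return f"Your pack expired. {unused} {noun} unused. Buy a new pack to keep generating."
```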

Tips

  • One metric, one rule. Unlike the chat app pattern where plan purchases are modeled as separate metrics, here there is only one billable metric (look). Pack purchases happen outside QuotaStack’s metering — you grant credits directly via POST /v1/topup/grant after confirming payment.

  • Idempotency key conventions. Use reservation:<request_id>, commit:<reservation_id>, release:<reservation_id>. These deterministic keys make retries safe. If your commit call times out, retry with the same key — QuotaStack returns the original response.

  • TTL tuning. Set reservation TTL to 2-3x your expected AI generation time. Too short and the reservation expires mid-generation, causing a commit failure. Too long and abandoned reservations lock credits unnecessarily. 300 seconds (5 minutes) is a good default for 30-90 second AI calls.

  • Blocks are returned in burn-down order. The blocks array from GET /credits?include_blocks=true is sorted by the same priority/expiry/age order used for debits. The first block in the array is the one that will be consumed next.

  • No subscriptions needed. This entire pattern works without QuotaStack subscriptions. Customers exist, credits are granted via topups, usage is metered via reservations and commits. Subscriptions are optional in QuotaStack — use them when you need recurring billing cycles, skip them when your model is pure pack-based.
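
The idempotency-key tip above can be sketched as a small retry wrapper that reuses one key across all attempts. `call_with_retries` is a hypothetical helper, and treating `TimeoutError` as the retryable failure is an assumption; substitute whatever exception your HTTP client raises on timeout.

```python
import time

def call_with_retries(call, idempotency_key, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Retry `call` on TimeoutError, passing the same idempotency key every time."""
    for attempt in range(1, attempts + 1):
        try:
            return call(idempotency_key=idempotency_key)
        except TimeoutError:
            if attempt == attempts:
                raise  # out of attempts: surface the timeout to the caller
            sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
```

For a commit, the call might be `call_with_retries(lambda idempotency_key: quotastack.commit(reservation_id=res_id, actual_units=1, idempotency_key=idempotency_key), f"commit:{res_id}")`. Because the key never changes between attempts, a retry after a timed-out but successful commit should return the original response rather than debit twice.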

See also: Reservations, Credits, Topup Packages.
