AI Generation App: Packs + Reserve/Commit/Release
How to model an AI outfit generation app with look packs, free credits on signup, and credit reservations for long-running AI calls.
Inspired by: Midjourney, Runway, ClosetNow-style generation apps
Think of this as buying outfit looks in packs: customers purchase weekly or monthly packs of generation credits, plus a few free looks on signup. Each AI generation reserves credits up front, commits on success, releases on failure — no concurrent overspending, no charges for failed runs.
The problem
You run an AI outfit generation app. Users upload a photo, the AI generates outfit “looks.” Each look costs 1 credit. The AI call takes 30-90 seconds. Users get free looks on signup and buy look packs (weekly, monthly) with time-limited expiry.
You need:
- Free looks granted at signup that never expire.
- Paid look packs with 7-day or 30-day expiry, purchased after external payment.
- Credit reservation during the AI call so two parallel requests cannot overdraw the account.
- Clean handling of AI failures — if generation fails, the user is not charged.
- A profile screen showing remaining looks with block-level breakdown.
Credit structure
| Block type | Priority | Expiry | Source |
|---|---|---|---|
| Free looks (signup) | 0 | None | Topup grant on signup |
| Weekly pack | 10 | 7 days | Topup grant after payment |
| Monthly pack | 10 | 30 days | Topup grant after payment |
Burn-down order: Pack looks (priority 10) burn first. Within priority 10, the soonest-expiring pack burns first. Free looks (priority 0, no expiry) are the last resort.
This is the correct behavior: paid time-limited credits should be consumed before free permanent ones to minimize waste from expiry.
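The burn-down order can be sketched as a sort over blocks. This is a minimal illustration, not QuotaStack's implementation: the `name` field and the sample dates are made up for the example, and it assumes ties are broken only by priority and then expiry.

```python
from datetime import datetime, timezone

def burn_down_order(blocks):
    """Highest priority first; within a priority, soonest expiry first; no expiry last."""
    far_future = datetime.max.replace(tzinfo=timezone.utc)

    def sort_key(block):
        # Blocks without an expiry sort after every dated block of the same priority.
        expires_at = block["expires_at"] or far_future
        return (-block["priority"], expires_at)

    return sorted(blocks, key=sort_key)

# Hypothetical sample blocks (names and dates are illustrative only).
blocks = [
    {"name": "free", "priority": 0, "expires_at": None},
    {"name": "monthly", "priority": 10,
     "expires_at": datetime(2026, 5, 13, 8, 0, tzinfo=timezone.utc)},
    {"name": "weekly", "priority": 10,
     "expires_at": datetime(2026, 4, 20, 8, 0, tzinfo=timezone.utc)},
]
order = [b["name"] for b in burn_down_order(blocks)]
print(order)
```

The weekly pack sorts ahead of the monthly pack because it expires sooner, and the free looks come last.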
Billable metric
One metric, one rule:
POST /v1/billable-metrics
Idempotency-Key: metric:look
{ "key": "look", "name": "Look Generation" }

POST /v1/metering-rules
Idempotency-Key: rule:look
{
  "billable_metric_key": "look",
  "cost_type": "per_unit",
  "credit_cost": 1000,
  "unit_cost": 1000
}
1 look = 1000mc = 1 credit.
Integration flow
1. Signup — grant free looks
When a new user signs up, create the customer and grant 3 free looks:
POST /v1/customers
Idempotency-Key: signup:<your_user_id>
{
  "external_id": "user_56789"
}

POST /v1/topup/grant
Idempotency-Key: topup:signup-grant-<your_user_id>
{
  "customer_id": "cus_...",
  "credits": 3000,
  "metadata": {
    "source": "signup_free",
    "looks": 3
  }
}
3000mc = 3 looks. No expiry, priority 0 (default). These are the safety net.
2. Pack purchase after payment
User buys a weekly pack (24 looks) via your payment provider (Cashfree, Stripe, etc.). After payment confirmation:
POST /v1/topup/grant
Idempotency-Key: topup:<cashfree_order_id>
{
  "customer_id": "cus_...",
  "credits": 24000,
  "expires_at": "2026-04-20T08:00:00Z",
  "external_payment_id": "cashfree_order_abc",
  "metadata": {
    "source": "weekly_pack",
    "looks": 24,
    "priority": 10
  }
}
24000mc = 24 looks. Expires in 7 days. Priority 10 ensures these burn before the free looks.
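A small helper can build this grant from the payment confirmation so the credit amount and expiry are always derived the same way. This is a sketch under assumptions: `build_pack_grant` and its parameters are hypothetical names, the body simply mirrors the request shown above, and the payment timestamp is assumed to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone

MC_PER_LOOK = 1000  # 1 look = 1000mc, per the metering rule

def build_pack_grant(customer_id, order_id, looks, valid_days, source, paid_at):
    """Return (headers, body) for POST /v1/topup/grant, mirroring the example request."""
    expires_at = paid_at.astimezone(timezone.utc) + timedelta(days=valid_days)
    headers = {"Idempotency-Key": f"topup:{order_id}"}
    body = {
        "customer_id": customer_id,
        "credits": looks * MC_PER_LOOK,
        "expires_at": expires_at.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "external_payment_id": order_id,
        "metadata": {"source": source, "looks": looks, "priority": 10},
    }
    return headers, body

headers, body = build_pack_grant(
    customer_id="cus_...",
    order_id="cashfree_order_abc",
    looks=24,
    valid_days=7,
    source="weekly_pack",
    paid_at=datetime(2026, 4, 13, 8, 0, 0, tzinfo=timezone.utc),
)
print(headers["Idempotency-Key"], body["credits"], body["expires_at"])
```

Keying the grant on the payment provider's order ID means a replayed payment webhook maps to the same idempotency key rather than a second grant.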
If you have pre-configured topup packages, reference them by ID instead:
POST /v1/topup/grant
Idempotency-Key: topup:<cashfree_order_id>
{
  "customer_id": "cus_...",
  "package_id": "pkg_weekly_24",
  "external_payment_id": "cashfree_order_abc"
}
The package carries the credits amount, expiry duration, and metadata. You only need to pass the customer and payment reference.
3. Generate a look — reserve, call AI, commit or release
This is where reservations matter.
Step 1: Check entitlement.
GET /v1/customers/cus_.../entitlements/look?units=1
Response:
{
  "allowed": true,
  "balance": 27000,
  "effective_balance": 27000,
  "cost_per_unit": 1000,
  "cost_total": 1000,
  "affordable_units": 27
}
If allowed is false, show the user a “buy more looks” prompt. Do not proceed.
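As a rough sketch, the gate in your backend can be a single function over the entitlement response. The function name and return shape are hypothetical; only the `allowed` and `affordable_units` fields come from the response above. A missing `allowed` field is treated as a refusal.

```python
def gate_generation(entitlement):
    """Decide whether to continue to the reserve step or show the purchase prompt."""
    if entitlement.get("allowed") is True:
        return {"proceed": True, "message": None}
    return {
        "proceed": False,
        "message": "No looks remaining. Buy a pack to continue.",
        "looks_left": entitlement.get("affordable_units", 0),
    }

ok = gate_generation({"allowed": True, "effective_balance": 27000, "affordable_units": 27})
blocked = gate_generation({"allowed": False, "effective_balance": 0, "affordable_units": 0})
print(ok["proceed"], blocked["proceed"])
```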
Step 2: Reserve credits.
POST /v1/reserve
Idempotency-Key: reservation:<your_request_id>
{
  "customer_id": "cus_...",
  "billable_metric_key": "look",
  "estimated_units": 1,
  "ttl_seconds": 300,
  "metadata": {
    "request_id": "req_gen_001"
  }
}
Response:
{
  "id": "res_...",
  "status": "active",
  "estimated_units": 1,
  "estimated_cost": 1000,
  "expires_at": "2026-04-13T08:05:00Z",
  "effective_balance_after": 26000
}
The reservation immediately decrements the effective balance by 1000mc. Any concurrent request now sees effective_balance: 26000 and can only reserve against that.
Step 3: Run the AI.
Call your AI model (Gemini, GPT, Stable Diffusion, etc.). This takes 30-90 seconds.
Step 4a: AI succeeds — commit.
POST /v1/reserve/res_.../commit
Idempotency-Key: commit:res_...
{
  "actual_units": 1,
  "metadata": {
    "model": "gemini-pro",
    "generation_time_ms": 45000
  }
}
Response:
{
  "id": "res_...",
  "status": "committed",
  "actual_units": 1,
  "actual_cost": 1000,
  "released": 0,
  "balance_after": 26000
}
The 1000mc is permanently debited from the highest-priority block.
Step 4b: AI fails — release.
POST /v1/reserve/res_.../release
Idempotency-Key: release:res_...
{}
Response:
{
  "id": "res_...",
  "status": "released"
}
The 1000mc hold is released. The user’s effective balance returns to 27000mc. No charge.
4. Profile screen — show balance with block breakdown
GET /v1/customers/cus_.../credits?include_blocks=true
Response:
{
  "customer_id": "cus_...",
  "balance": 26000,
  "reserved_balance": 0,
  "effective_balance": 26000,
  "blocks": [
    {
      "id": "blk_...",
      "remaining_amount": 23000,
      "original_amount": 24000,
      "source": "topup",
      "priority": 10,
      "expires_at": "2026-04-20T08:00:00Z",
      "metadata": { "source": "weekly_pack", "looks": 24 }
    },
    {
      "id": "blk_...",
      "remaining_amount": 3000,
      "original_amount": 3000,
      "source": "topup",
      "priority": 0,
      "expires_at": null,
      "metadata": { "source": "signup_free", "looks": 3 }
    }
  ]
}
From this, render:
- Weekly pack: 23 looks remaining, expires Apr 20
- Free looks: 3 looks remaining, no expiry
- Total: 26 looks available
Divide remaining_amount by 1000 to get look counts (since 1 look = 1000mc).
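One possible way to turn that response into the three display lines is sketched below. The `LABELS` mapping and the `monthly_pack` source value are assumptions for illustration, and the total is taken from `effective_balance` on the assumption that held credits should not be shown as spendable.

```python
from datetime import datetime

MC_PER_LOOK = 1000  # 1 look = 1000mc
# Hypothetical display labels keyed by the metadata "source" set at grant time.
LABELS = {"weekly_pack": "Weekly pack", "monthly_pack": "Monthly pack", "signup_free": "Free looks"}

def render_profile(credits):
    """Build display lines from a GET .../credits?include_blocks=true response."""
    lines = []
    for block in credits["blocks"]:
        looks = block["remaining_amount"] // MC_PER_LOOK
        source = block["metadata"].get("source", block["source"])
        label = LABELS.get(source, source)
        if block["expires_at"] is None:
            expiry = "no expiry"
        else:
            when = datetime.strptime(block["expires_at"], "%Y-%m-%dT%H:%M:%SZ")
            expiry = f"expires {when.strftime('%b')} {when.day}"
        lines.append(f"{label}: {looks} looks remaining, {expiry}")
    total = credits["effective_balance"] // MC_PER_LOOK
    lines.append(f"Total: {total} looks available")
    return lines

# Trimmed copy of the response above.
sample = {
    "effective_balance": 26000,
    "blocks": [
        {"remaining_amount": 23000, "source": "topup",
         "expires_at": "2026-04-20T08:00:00Z", "metadata": {"source": "weekly_pack"}},
        {"remaining_amount": 3000, "source": "topup",
         "expires_at": None, "metadata": {"source": "signup_free"}},
    ],
}
profile_lines = render_profile(sample)
print("\n".join(profile_lines))
```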
Why reservations matter
Without reservations, this race condition is possible:
- User has 1 look remaining (1000mc).
- User taps “Generate” on two devices simultaneously.
- Both requests check entitlement: both see balance: 1000, allowed: true.
- Both requests call the AI.
- Both requests try to debit 1000mc.
- One succeeds. The other either fails (bad UX after a 60-second wait) or overdraws the account.
With reservations:
- User has 1 look remaining.
- Request A reserves 1000mc. Effective balance drops to 0.
- Request B checks entitlement: sees effective_balance: 0, allowed: false. Fails fast.
- Request A runs AI, commits. Clean.
The reservation is a pessimistic lock on the credit balance. It costs one extra API call but eliminates the double-spend problem entirely.
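The two scenarios can be reproduced with a toy in-memory account. This is only a model of the hold semantics described here, not QuotaStack's implementation: the class, its method names, and the use of a thread lock to make the check-and-hold step atomic are all assumptions.

```python
import threading

class InsufficientCredits(Exception):
    pass

class CreditAccount:
    """Toy account: effective balance = balance minus active holds."""

    def __init__(self, balance):
        self.balance = balance
        self.holds = {}          # reservation id -> held amount (mc)
        self._next_id = 1
        self._lock = threading.Lock()

    @property
    def effective_balance(self):
        return self.balance - sum(self.holds.values())

    def reserve(self, amount):
        # Check and hold under one lock so two requests cannot both pass the check.
        with self._lock:
            if amount > self.effective_balance:
                raise InsufficientCredits(f"need {amount}, have {self.effective_balance}")
            rid = f"res_{self._next_id}"
            self._next_id += 1
            self.holds[rid] = amount
            return rid

    def commit(self, rid):
        with self._lock:
            self.balance -= self.holds.pop(rid)   # hold becomes a permanent debit

    def release(self, rid):
        with self._lock:
            self.holds.pop(rid)                   # hold disappears, balance untouched

account = CreditAccount(balance=1000)   # one look left
request_a = account.reserve(1000)       # effective balance drops to 0
try:
    account.reserve(1000)               # request B
    request_b_allowed = True
except InsufficientCredits:
    request_b_allowed = False           # fails fast, before any AI call
account.commit(request_a)
print(request_b_allowed, account.balance)
```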
Failure policy: fail-closed
Never run the AI call without a confirmed reservation. The correct order is:
- Check entitlement (fast, sub-1ms on the hot path).
- Reserve (creates the hold).
- Run AI (only if reservation succeeded).
- Commit or release.
If the reservation returns 402 (insufficient credits), stop. Do not proceed to the AI call. This is fail-closed: absent proof of available credits, the expensive operation does not run.
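In code, fail-closed means the AI call is gated on positive proof of a hold, and everything else is a refusal. The sketch below assumes a successful reserve returns some 2xx status with an `id` and `status: "active"` as in the example response; the function name and the use of `None` to represent a timeout are hypothetical.

```python
def may_run_ai(status_code, body):
    """Fail closed: only a confirmed, active reservation lets the AI call proceed."""
    if status_code is None or body is None:
        return False              # timeout or network error: no proof of a hold
    if status_code == 402:
        return False              # insufficient credits
    return (
        200 <= status_code < 300
        and body.get("status") == "active"
        and bool(body.get("id"))
    )

print(may_run_ai(201, {"id": "res_1", "status": "active"}))
print(may_run_ai(402, {}))
```

Note that a 5xx or a timeout is treated the same as a 402 here: without a confirmed reservation, the expensive call does not run.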
Auto-expiry of reservations
If your server crashes after reserving but before committing or releasing, the reservation has a TTL (set via ttl_seconds, default 300 seconds, max 3600). When the TTL expires, QuotaStack’s reservation reaper automatically releases the hold. The credits return to the effective balance.
This means:
- Set ttl_seconds comfortably longer than your worst-case AI generation time. If generation takes up to 90 seconds, a TTL of 300 (5 minutes) leaves a safety margin.
- You do not need to build your own cleanup job for abandoned reservations.
- The auto-release writes a ledger entry, so you have an audit trail.
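To make the effect of the TTL concrete, here is a toy model of what a reaper pass does to holds and to the effective balance. It is not QuotaStack's reaper: timestamps are plain seconds, and the hold and ledger field names (including the `auto_release` type) are invented for the example.

```python
def effective_balance(balance, holds):
    """Balance minus every active hold."""
    return balance - sum(h["amount"] for h in holds.values())

def reap_expired(holds, now):
    """Release holds whose TTL has elapsed; return ledger entries for the audit trail."""
    ledger = []
    expired_ids = [rid for rid, hold in holds.items() if hold["expires_at"] <= now]
    for rid in expired_ids:
        hold = holds.pop(rid)
        ledger.append({"reservation_id": rid, "type": "auto_release", "amount": hold["amount"]})
    return ledger

# Reservation made at t=0 with ttl_seconds=300; the server crashed before commit/release.
holds = {"res_1": {"amount": 1000, "expires_at": 300}}
before = effective_balance(27000, holds)      # hold still active
early = reap_expired(holds, now=200)          # TTL not reached, nothing released
late = reap_expired(holds, now=300)           # TTL reached, hold released
after = effective_balance(27000, holds)
print(before, early, late, after)
```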
Full generation handler
import uuid

# Assumes `quotastack` (API client), `ai_service`, `InsufficientCreditsError`,
# `AIServiceError`, and the `error` / `success` response helpers are provided by your app.

def handle_generate_look(customer_id, photo, style_preferences):
    request_id = str(uuid.uuid4())

    # 1. Entitlement check
    ent = quotastack.get_entitlement(customer_id, "look", units=1)
    if not ent.allowed:
        return error(
            "No looks remaining. Buy a pack to continue.",
            affordable_units=ent.affordable_units
        )

    # 2. Reserve
    try:
        reservation = quotastack.reserve(
            customer_id=customer_id,
            billable_metric_key="look",
            estimated_units=1,
            ttl_seconds=300,
            idempotency_key=f"reservation:{request_id}",
            metadata={"request_id": request_id}
        )
    except InsufficientCreditsError:
        # Race condition: someone else consumed between check and reserve
        return error("No looks remaining. Buy a pack to continue.")

    # 3. Run AI (30-90 seconds)
    try:
        result = ai_service.generate_outfit(photo, style_preferences)
    except AIServiceError as e:
        # 4b. AI failed -- release reservation
        quotastack.release(
            reservation_id=reservation.id,
            idempotency_key=f"release:{reservation.id}"
        )
        return error("Generation failed. You were not charged.", detail=str(e))

    # 4a. AI succeeded -- commit
    quotastack.commit(
        reservation_id=reservation.id,
        actual_units=1,
        idempotency_key=f"commit:{reservation.id}",
        metadata={
            "model": result.model,
            "generation_time_ms": result.duration_ms
        }
    )
    return success(result.images)
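The handler releases only on AIServiceError; any other exception during generation would leave the hold in place until its TTL expires. One possible variation is to wrap the AI call in a context manager that commits on success and releases on any exception. The sketch below uses a stand-in client that only records calls; `FakeQuotaClient` and `settle_reservation` are hypothetical names, not part of any SDK.

```python
from contextlib import contextmanager

class FakeQuotaClient:
    """Stand-in that records calls so the sketch runs without a real client."""

    def __init__(self):
        self.calls = []

    def commit(self, reservation_id, actual_units, idempotency_key):
        self.calls.append(("commit", reservation_id, actual_units, idempotency_key))

    def release(self, reservation_id, idempotency_key):
        self.calls.append(("release", reservation_id, idempotency_key))

@contextmanager
def settle_reservation(client, reservation_id, units=1):
    """Commit if the wrapped block succeeds; release and re-raise if it raises."""
    try:
        yield
    except Exception:
        client.release(reservation_id=reservation_id,
                       idempotency_key=f"release:{reservation_id}")
        raise
    else:
        client.commit(reservation_id=reservation_id, actual_units=units,
                      idempotency_key=f"commit:{reservation_id}")

client = FakeQuotaClient()

with settle_reservation(client, "res_ok"):
    pass  # the AI call would go here

try:
    with settle_reservation(client, "res_bad"):
        raise RuntimeError("model crashed")
except RuntimeError:
    pass  # caller turns this into "Generation failed. You were not charged."

print(client.calls)
```

If the release call itself fails, the TTL still acts as the backstop, so the worst case is a hold that lingers until it expires.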
Pack expiry and re-purchase
When a pack expires, QuotaStack fires a credit.expired webhook:
{
  "event": "credit.expired",
  "customer_id": "cus_...",
  "credit_block_id": "blk_...",
  "amount": 5000,
  "expired_at": "2026-04-20T08:00:00Z"
}
Use this to send a push notification: “Your weekly pack expired. 5 looks were unused. Buy a new pack to keep generating.”
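A webhook consumer for this event might look like the following sketch. The payload above does not name the pack type, so the `pack_label` argument is an assumption (you would look it up from the block's metadata in your own records); the function name and return shape are also hypothetical.

```python
MC_PER_LOOK = 1000  # 1 look = 1000mc

def expiry_notification(event, pack_label="pack"):
    """Turn a credit.expired payload into push notification text, or None to skip."""
    if event.get("event") != "credit.expired":
        return None
    unused = event["amount"] // MC_PER_LOOK
    if unused <= 0:
        return None  # nothing was wasted, so there is nothing to announce
    noun, verb = ("look", "was") if unused == 1 else ("looks", "were")
    return {
        "customer_id": event["customer_id"],
        "body": (
            f"Your {pack_label} expired. {unused} {noun} {verb} unused. "
            "Buy a new pack to keep generating."
        ),
    }

event = {
    "event": "credit.expired",
    "customer_id": "cus_...",
    "credit_block_id": "blk_...",
    "amount": 5000,
    "expired_at": "2026-04-20T08:00:00Z",
}
note = expiry_notification(event, pack_label="weekly pack")
print(note["body"])
```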
Tips
- One metric, one rule. Unlike the chat app pattern where plan purchases are modeled as separate metrics, here there is only one billable metric (look). Pack purchases happen outside QuotaStack’s metering: you grant credits directly via POST /v1/topup/grant after confirming payment.
- Idempotency key conventions. Use reservation:<request_id>, commit:<reservation_id>, release:<reservation_id>. These deterministic keys make retries safe. If your commit call times out, retry with the same key and QuotaStack returns the original response.
- TTL tuning. Set reservation TTL to 2-3x your expected AI generation time. Too short and the reservation expires mid-generation, causing a commit failure. Too long and abandoned reservations lock credits unnecessarily. 300 seconds (5 minutes) is a good default for 30-90 second AI calls.
- Blocks are returned in burn-down order. The blocks array from GET /credits?include_blocks=true is sorted by the same priority/expiry/age order used for debits. The first block in the array is the one that will be consumed next.
- No subscriptions needed. This entire pattern works without QuotaStack subscriptions. Customers exist, credits are granted via topups, usage is metered via reservations and commits. Subscriptions are optional in QuotaStack: use them when you need recurring billing cycles, skip them when your model is pure pack-based.
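To see why deterministic keys make retries safe, consider a toy model of an idempotent endpoint: the first call with a given key runs the operation and stores the response, and later calls with the same key replay it. This only illustrates the behavior described in the tips and is not how QuotaStack is implemented; the class and handler names are hypothetical.

```python
class IdempotentEndpoint:
    """Toy server side: replay the stored response for a repeated Idempotency-Key."""

    def __init__(self, handler):
        self._handler = handler
        self._responses = {}

    def call(self, idempotency_key, payload):
        if idempotency_key not in self._responses:
            self._responses[idempotency_key] = self._handler(payload)
        return self._responses[idempotency_key]

debits = []  # every real debit performed by the handler

def commit_handler(payload):
    debits.append(payload["actual_units"] * 1000)
    return {"status": "committed", "actual_cost": debits[-1]}

commit = IdempotentEndpoint(commit_handler)
key = "commit:res_123"                           # deterministic: derived from the reservation ID
first = commit.call(key, {"actual_units": 1})
retry = commit.call(key, {"actual_units": 1})    # e.g. the first response timed out
print(first == retry, debits)
```

A random key generated per attempt would defeat this: the retry would look like a new request and the user would be debited twice.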
See also: Reservations, Credits, Topup Packages.