---
title: "AI Chat App: Wallet + Plans + Per-Message Metering"
description: "How to model an AI companion chat app with wallet-based credits, time-limited plans purchased from the wallet, and per-message metering."
order: 3
---

# AI Chat App

**Pattern:** WALLET · PLANS · PER-MESSAGE METERING

*Inspired by: Character.AI, Replika, OurVibe-style companion apps*

> **Mental Model:** Think of this as a **chat app with a prepaid wallet**: customers top up with real money, optionally buy time-limited plan packs (1-hour or weekly bundles) from that wallet, and every message debits credits. Plan packs burn first; the wallet is the safety net.

## Quick Take

- A **persistent wallet** funded by topups -- never expires, lowest priority
- **Time-limited plan packs** (1hr, weekly) bought from the wallet, burn first
- Per-message metering with **variable cost** by companion or model
- Auto-buy logic: when plan expires mid-conversation, purchase the next from the wallet without breaking flow

## Diagram

Customer tops up wallet via your payment provider; QuotaStack grants wallet credits (priority 0, no expiry). Customer optionally buys a plan pack — your code debits the wallet via /v1/usage and grants plan credits (priority 10, time-limited) via /v1/topup/grant. Each message hits /v1/usage which debits the highest-priority block first (plan, then wallet). Auto-buy fires on plan expiry if the wallet has funds.

```mermaid
flowchart TD
    A[Customer pays] --> B[POST /v1/topup/grant<br/>wallet credits, P0, no expiry]
    B --> C[Buy plan pack]
    C --> D[POST /v1/usage<br/>debit wallet for plan price]
    D --> E[POST /v1/topup/grant<br/>plan credits, P10, expires]
    F[Send message] --> G[POST /v1/usage<br/>burn plan first, then wallet]
    H[Plan expired,<br/>wallet has funds] --> C
```

## The problem

You run an AI companion chat app. Users chat with AI companions, and each message costs credits. Users maintain a wallet topped up with real money. They can buy time-limited plans (1-hour, weekly) from their wallet that give a bundle of message credits at higher priority. When plan credits run out or expire, the wallet balance kicks in.

You need:

- A persistent wallet balance that never expires.
- Time-limited plan bundles that burn before the wallet.
- Free conversation credits granted when a user starts a new chat.
- Variable pricing per companion (some companions cost more than others).
- Auto-buy logic: when a plan expires mid-session, automatically purchase a new one from the wallet.

## Credit structure

QuotaStack models this with three types of credit blocks, all stacking on a single customer:

| Block type | Priority | Expiry | Source |
|---|---|---|---|
| Wallet | 0 | None | Topup grant after payment |
| Plan credits (1hr, weekly) | 10 | Time-limited (1hr or 7 days) | Topup grant after wallet debit |
| Free conversation credits | 0 | None | Topup grant on new conversation |

**Burn-down order:** Plan credits (priority 10) burn first. When they expire or run out, free conversation credits and wallet balance (both priority 0) share the same tier. Within a tier, blocks burn soonest-expiring first, then oldest-created first. Since neither has an expiry, the oldest-created block burns first.

This is the right behavior: time-limited credits should be consumed before permanent ones to minimize waste.
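The ordering rule described above can be illustrated with a small local simulation. This is a sketch of the rule as stated in this guide, not QuotaStack's implementation; the `Block` dataclass, its `created_at` field, and the `debit` helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class Block:
    name: str
    priority: int
    remaining_mc: int              # millicredits left in this block
    created_at: datetime           # assumed field, used as the tie-breaker
    expires_at: Optional[datetime] = None

# Blocks without an expiry sort after every block that has one.
_FAR_FUTURE = datetime.max.replace(tzinfo=timezone.utc)

def burn_order(blocks):
    """Highest priority first, then soonest expiry, then oldest created."""
    return sorted(
        blocks,
        key=lambda b: (-b.priority, b.expires_at or _FAR_FUTURE, b.created_at),
    )

def debit(blocks, amount_mc):
    """Debit amount_mc across blocks in burn order; return (name, taken) pairs."""
    if sum(b.remaining_mc for b in blocks) < amount_mc:
        raise ValueError("insufficient credits")
    taken = []
    for b in burn_order(blocks):
        take = min(b.remaining_mc, amount_mc)
        if take > 0:
            b.remaining_mc -= take
            amount_mc -= take
            taken.append((b.name, take))
    return taken

def at(hour):
    return datetime(2026, 4, 13, hour, tzinfo=timezone.utc)

blocks = [
    Block("wallet", 0, 500_000, created_at=at(8)),
    Block("free", 0, 50_000, created_at=at(9)),
    Block("plan", 10, 2_000, created_at=at(10), expires_at=at(11)),
]
print(debit(blocks, 3_000))  # [('plan', 2000), ('wallet', 1000)]
```

The plan block drains first even though it was created last; the remainder falls through to the wallet because it is the oldest priority-0 block.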

## Billable metrics

Set up three billable metrics and their metering rules:

```bash
POST /v1/billable-metrics
{ "key": "chat_message", "name": "Chat Message" }

POST /v1/billable-metrics
{ "key": "plan_purchase_1hr", "name": "1-Hour Plan Purchase" }

POST /v1/billable-metrics
{ "key": "plan_purchase_weekly", "name": "Weekly Plan Purchase" }
```

Metering rules (all `per_unit` at 1000 millicredits, written 1000mc, per unit):

```bash
POST /v1/metering-rules
{
  "billable_metric_key": "chat_message",
  "cost_type": "per_unit",
  "credit_cost": 1000,
  "unit_cost": 1000
}

POST /v1/metering-rules
{
  "billable_metric_key": "plan_purchase_1hr",
  "cost_type": "per_unit",
  "credit_cost": 1000,
  "unit_cost": 1000
}

POST /v1/metering-rules
{
  "billable_metric_key": "plan_purchase_weekly",
  "cost_type": "per_unit",
  "credit_cost": 1000,
  "unit_cost": 1000
}
```

Each unit = 1000mc = 1 credit. A message costs 1 unit (1 credit). A plan purchase costs N units, where N is the plan price in credits -- passed dynamically via the `units` field.
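The unit arithmetic above can be written out as a couple of helpers. This is a minimal sketch; the function names are hypothetical and only the 1000mc-per-credit ratio comes from this guide.

```python
MC_PER_CREDIT = 1000  # 1 credit = 1000 millicredits (mc), per this guide

def credits_to_mc(credits: int) -> int:
    """Convert whole credits to millicredits."""
    return credits * MC_PER_CREDIT

def usage_cost_mc(units: int, unit_cost_mc: int = 1000) -> int:
    """Cost of a per_unit usage event: units times the rule's unit cost."""
    return units * unit_cost_mc

print(usage_cost_mc(1))    # 1000   (one chat message)
print(usage_cost_mc(100))  # 100000 (a plan priced at 100 credits)
print(credits_to_mc(50))   # 50000  (a 50-credit grant)
```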

## Integration flow

### 1. Customer signup and wallet recharge

Create the customer when they sign up:

```bash
POST /v1/customers
Idempotency-Key: signup:<your_user_id>

{
  "external_id": "user_12345"
}
```

When the user pays via your payment provider (Stripe, Razorpay, etc.), grant wallet credits:

```bash
POST /v1/topup/grant
Idempotency-Key: topup:<your_payment_id>

{
  "customer_id": "cus_...",
  "credits": 500000,
  "metadata": {
    "source": "wallet_recharge",
    "payment_id": "pay_abc123",
    "amount_paid": "499"
  }
}
```

This grants 500 credits (500,000mc) with no expiry and no priority specified (defaults to 0). This is the wallet. The `metadata` field stores your fiat payment reference for audit -- QuotaStack never reads it.

### 2. New conversation -- grant free credits

Each time a user starts a new conversation, grant 50 free message credits:

```bash
POST /v1/topup/grant
Idempotency-Key: convo-grant:<conversation_id>

{
  "customer_id": "cus_...",
  "credits": 50000,
  "metadata": {
    "source": "free_conversation",
    "conversation_id": "conv_789"
  }
}
```

50 credits = 50,000mc. No expiry, priority 0. These stack with the wallet and burn in creation order alongside wallet blocks.

### 3. Buy a plan from the wallet

Purchasing a plan is a two-step operation: debit the plan cost from the wallet, then grant the plan credits with an expiry.

**Step 1: Debit the plan cost.**

```bash
POST /v1/usage
Idempotency-Key: usage:plan-buy-<purchase_id>

{
  "customer_id": "cus_...",
  "billable_metric_key": "plan_purchase_1hr",
  "units": 100,
  "metadata": {
    "companion_id": "companion_42",
    "plan_type": "1hr"
  }
}
```

Here `units: 100` means 100 credits (100,000mc) -- the price of a 1-hour plan for this companion. Different companions can have different prices; you pass the companion-specific price as the `units` value.

**Step 2: Grant plan credits.**

```bash
POST /v1/topup/grant
Idempotency-Key: topup:plan-grant-<purchase_id>

{
  "customer_id": "cus_...",
  "credits": 200000,
  "priority": 10,
  "expires_at": "2026-04-13T11:00:00Z",
  "metadata": {
    "source": "plan_1hr",
    "companion_id": "companion_42"
  }
}
```

200 credits (200,000mc) with a 1-hour expiry and priority 10. These burn before wallet credits.

**Variable pricing per companion:** The plan cost (units on the debit) and the credits granted can both vary by companion. Your app stores companion pricing; QuotaStack just executes the credit math.
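One way to keep companion pricing in your app is a small catalog that yields the three values the two requests need. This is a sketch under stated assumptions: `PlanOffer`, `OFFERS`, `quote`, and the `companion_7` entry are hypothetical; only the `companion_42` numbers mirror the example above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass(frozen=True)
class PlanOffer:
    price_credits: int     # debited from the wallet (the `units` on /v1/usage)
    granted_credits: int   # plan credits granted on /v1/topup/grant
    duration: timedelta    # how long the plan credits last

# Hypothetical catalog stored by your app, keyed by (companion, plan type).
OFFERS = {
    ("companion_42", "1hr"): PlanOffer(100, 200, timedelta(hours=1)),
    ("companion_7", "1hr"): PlanOffer(150, 200, timedelta(hours=1)),
}

def quote(companion_id: str, plan_type: str, now: datetime) -> dict:
    """Return the debit units, grant amount in mc, and expiry for a purchase."""
    offer = OFFERS[(companion_id, plan_type)]
    return {
        "debit_units": offer.price_credits,
        "grant_mc": offer.granted_credits * 1000,
        "expires_at": (now + offer.duration).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }

now = datetime(2026, 4, 13, 10, 0, tzinfo=timezone.utc)
print(quote("companion_42", "1hr", now))
# {'debit_units': 100, 'grant_mc': 200000, 'expires_at': '2026-04-13T11:00:00Z'}
```

Keeping the price and the granted amount as separate fields lets a pricier companion charge more for the same bundle without touching any metering rule.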

### 4. Send a message

Before sending a message to the AI, check entitlement and then record usage:

```bash
GET /v1/customers/cus_.../entitlements/chat_message?units=1
```

Response:
```json
{
  "allowed": true,
  "balance": 245000,
  "effective_balance": 245000,
  "cost_per_unit": 1000,
  "cost_total": 1000,
  "affordable_units": 245
}
```

If `allowed` is true, proceed:

```bash
POST /v1/usage
Idempotency-Key: usage:msg-<message_id>

{
  "customer_id": "cus_...",
  "billable_metric_key": "chat_message",
  "units": 1,
  "metadata": {
    "conversation_id": "conv_789",
    "companion_id": "companion_42"
  }
}
```

The usage consumer debits 1000mc from the highest-priority block (the plan credits, if active). Returns 202 -- processing is async.
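The fields in the sample entitlement response appear to be related by simple arithmetic. The sketch below reproduces that relationship locally; it is inferred from the sample values only and is not a description of the server's logic. The function name is hypothetical.

```python
def explain_entitlement(effective_balance_mc: int, cost_per_unit_mc: int, units: int) -> dict:
    """Recompute the derived fields of an entitlement check from its inputs."""
    cost_total = cost_per_unit_mc * units
    return {
        "allowed": effective_balance_mc >= cost_total,
        "cost_total": cost_total,
        "affordable_units": effective_balance_mc // cost_per_unit_mc,
    }

print(explain_entitlement(245_000, 1000, 1))
# {'allowed': True, 'cost_total': 1000, 'affordable_units': 245}
```

The same arithmetic explains why a plan purchase check passes `units` equal to the plan price: the request is allowed only when the balance covers `units * 1000` mc.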

### 5. Plan stacking

When a user buys a second plan while one is still active, set `expires_at` relative to the existing plan's expiry so the new plan extends the window rather than overlapping.

To compute the correct expiry:

```bash
GET /v1/customers/cus_.../credits?include_blocks=true
```

Find the latest `expires_at` among active plan blocks (blocks with priority 10 and a non-null expiry). Set the new plan's `expires_at` to that value plus the plan duration.

```python
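# Pseudocode
blocks = get_credit_blocks(customer_id)
plan_blocks = [b for b in blocks if b.priority == 10 and b.expires_at]

# Latest plan expiry, or now() when there are no plan blocks.
latest_expiry = max((b.expires_at for b in plan_blocks), default=now())

# Clamp to now(): an already-expired block may still be listed, and the
# new window must never start in the past.
new_expires_at = max(latest_expiry, now()) + plan_duration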
```

This prevents wasted overlap. The user effectively extends their plan window.
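A worked example with concrete timestamps may help. This is a runnable sketch, assuming blocks expose `priority` and `expires_at` as attributes holding timezone-aware datetimes; `stacked_expiry` is a hypothetical helper, and `SimpleNamespace` stands in for real block objects.

```python
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace

def stacked_expiry(blocks, plan_duration, now):
    """Expiry for a new plan: extend the latest active plan, else start now."""
    active = [
        b.expires_at
        for b in blocks
        if b.priority == 10 and b.expires_at and b.expires_at > now
    ]
    base = max(active, default=now)
    return base + plan_duration

utc = timezone.utc
now = datetime(2026, 4, 13, 10, 30, tzinfo=utc)
plan = SimpleNamespace(priority=10, expires_at=datetime(2026, 4, 13, 11, 0, tzinfo=utc))
wallet = SimpleNamespace(priority=0, expires_at=None)

# An active plan ends at 11:00, so a new 1-hour plan runs until 12:00.
print(stacked_expiry([plan, wallet], timedelta(hours=1), now))  # 2026-04-13 12:00:00+00:00

# With no active plan, the new window starts now.
print(stacked_expiry([wallet], timedelta(hours=1), now))        # 2026-04-13 11:30:00+00:00
```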

## Full message handler with auto-buy

Here is the complete pseudocode for handling a message send, including auto-buy logic:

```python
def handle_send_message(customer_id, companion_id, conversation_id, message):
    # 1. Check entitlement
    ent = quotastack.get_entitlement(customer_id, "chat_message", units=1)

    if not ent.allowed:
        # Try auto-buy if wallet has funds for a plan
        bought = try_auto_buy_plan(customer_id, companion_id)
        if not bought:
            return error("Insufficient credits. Please recharge your wallet.")
        # Re-check after auto-buy
        ent = quotastack.get_entitlement(customer_id, "chat_message", units=1)
        if not ent.allowed:
            return error("Insufficient credits after plan purchase.")

    # 2. Record usage (async, returns 202)
    quotastack.record_usage(
        customer_id=customer_id,
        billable_metric_key="chat_message",
        units=1,
        idempotency_key=f"usage:msg-{message.id}",
        metadata={
            "conversation_id": conversation_id,
            "companion_id": companion_id,
        }
    )

    # 3. Call AI and stream response
    response = ai_service.chat(companion_id, message)
    return response

def try_auto_buy_plan(customer_id, companion_id):
    """Attempt to purchase the user's preferred plan from wallet balance."""
    companion = get_companion(companion_id)
    plan_cost = companion.default_plan_cost      # e.g. 100 credits
    plan_credits = companion.default_plan_credits # e.g. 200 credits
    plan_duration = companion.default_plan_duration # e.g. 1 hour

    # Check if wallet can afford the plan
    ent = quotastack.get_entitlement(
        customer_id,
        "plan_purchase_1hr",
        units=plan_cost
    )
    if not ent.allowed:
        return False

    purchase_id = generate_uuid()

    # Compute stacked expiry; clamp to now() so a stale (already expired)
    # plan block never starts the new window in the past
    blocks = quotastack.get_credits(customer_id, include_blocks=True).blocks
    plan_blocks = [b for b in blocks if b.priority == 10 and b.expires_at]
    latest_expiry = max((b.expires_at for b in plan_blocks), default=now())
    base_time = max(latest_expiry, now())
    expires_at = base_time + plan_duration

    # Step 1: Debit plan cost from wallet
    quotastack.record_usage(
        customer_id=customer_id,
        billable_metric_key="plan_purchase_1hr",
        units=plan_cost,
        idempotency_key=f"usage:plan-buy-{purchase_id}",
        metadata={
            "companion_id": companion_id,
            "auto_buy": True,
        }
    )

    # Step 2: Grant plan credits with expiry + high priority
    quotastack.topup_grant(
        customer_id=customer_id,
        credits=plan_credits * 1000,  # convert to millicredits
        priority=10,  # top-level field: metadata is opaque to QuotaStack
        expires_at=expires_at,
        idempotency_key=f"topup:plan-grant-{purchase_id}",
        metadata={
            "source": "plan_1hr",
            "companion_id": companion_id,
            "auto_buy": True,
        }
    )

    return True
```

## Plan expiry and auto-buy trigger

QuotaStack fires a `credit.expired` webhook when plan blocks expire. Use this to trigger auto-buy:

```python
def handle_webhook(event):
    if event.type == "credit.expired":
        block = event.credit_block
        # Only auto-buy for plan blocks
        if block.priority == 10 and block.metadata.get("source") in ["plan_1hr", "plan_weekly"]:
            customer = get_customer(block.customer_id)
            if customer.settings.auto_buy_enabled:
                companion_id = block.metadata.get("companion_id")
                try_auto_buy_plan(block.customer_id, companion_id)
```

## Showing balance in the UI

Use the balance endpoint to display remaining credits:

```bash
GET /v1/customers/cus_.../credits?include_blocks=true
```

From the response, compute:

- **Wallet balance:** Sum `remaining_amount` of blocks with no expiry and source `"wallet_recharge"`.
- **Plan credits remaining:** Sum `remaining_amount` of blocks with priority 10.
- **Plan expires at:** Latest `expires_at` among priority-10 blocks.
- **Free credits remaining:** Sum `remaining_amount` of blocks with source `"free_conversation"`.
- **Total messages available:** `balance / 1000` (since 1 message = 1000mc).
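The computations listed above can be sketched as a single function. The block shape here (dicts with `remaining_amount`, `priority`, `expires_at`, and `metadata.source`) is assumed from the field names used in this guide, and `summarize_balance` is a hypothetical helper. Expiry strings are compared as text, which is only safe when they all use the same UTC ISO-8601 format.

```python
def summarize_balance(blocks, balance_mc):
    """Split a block list into the figures a balance screen typically shows."""
    def source(block):
        return (block.get("metadata") or {}).get("source")

    plan_blocks = [b for b in blocks if b.get("priority") == 10]
    return {
        "wallet_mc": sum(
            b["remaining_amount"] for b in blocks
            if b.get("expires_at") is None and source(b) == "wallet_recharge"
        ),
        "plan_mc": sum(b["remaining_amount"] for b in plan_blocks),
        "plan_expires_at": max(
            (b["expires_at"] for b in plan_blocks if b.get("expires_at")),
            default=None,
        ),
        "free_mc": sum(
            b["remaining_amount"] for b in blocks
            if source(b) == "free_conversation"
        ),
        "messages_available": balance_mc // 1000,  # 1 message = 1000mc
    }

blocks = [
    {"remaining_amount": 150_000, "priority": 0, "expires_at": None,
     "metadata": {"source": "wallet_recharge"}},
    {"remaining_amount": 45_000, "priority": 10,
     "expires_at": "2026-04-13T11:00:00Z", "metadata": {"source": "plan_1hr"}},
    {"remaining_amount": 50_000, "priority": 0, "expires_at": None,
     "metadata": {"source": "free_conversation"}},
]
summary = summarize_balance(blocks, balance_mc=245_000)
print(summary["wallet_mc"], summary["plan_mc"], summary["free_mc"])  # 150000 45000 50000
print(summary["plan_expires_at"], summary["messages_available"])     # 2026-04-13T11:00:00Z 245
```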

## Tips

- **Idempotency keys matter.** The two-step plan purchase (debit + grant) must use deterministic idempotency keys derived from a single `purchase_id`. If step 1 succeeds but step 2 fails, retrying with the same keys is safe -- step 1 replays as a no-op, step 2 executes.

- **Usage is async.** `POST /v1/usage` returns 202 immediately; the credit debit is applied in the background. For chat messages this is fine -- the entitlement check already confirmed the user can afford it.

- **Priority is the key lever.** Setting plan credits to priority 10 and wallet credits to priority 0 ensures plans burn first. This is the entire mechanism for "plans are consumed before wallet."

- **Variable companion pricing.** QuotaStack's metering rules are tenant-scoped, not companion-scoped. All companions share the same `chat_message` metric at 1000mc per unit. Price differences are expressed as different `units` values on the plan purchase debit, not as different metering rules.

- **Plan purchase is not a subscription.** This pattern uses topup grants with expiry, not QuotaStack subscriptions. There is no renewal cycle, no grace period, no subscription state machine. The plan is a credit block with a clock on it. This keeps the model simple for consumer apps where users buy plans ad hoc.
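The retry behavior described in the idempotency tip above can be demonstrated with an in-memory stand-in. `FakeLedger` is a hypothetical test double that deduplicates by idempotency key; it illustrates the pattern and makes no claim about how QuotaStack stores keys.

```python
class FakeLedger:
    """In-memory stand-in that ignores requests whose key was already applied."""

    def __init__(self, wallet_mc):
        self.wallet_mc = wallet_mc
        self.plan_mc = 0
        self.seen = set()

    def debit(self, key, amount_mc):
        if key in self.seen:
            return  # replay of a completed request: no-op
        self.seen.add(key)
        self.wallet_mc -= amount_mc

    def grant(self, key, amount_mc, fail=False):
        if key in self.seen:
            return
        if fail:
            raise ConnectionError("simulated network failure")
        self.seen.add(key)
        self.plan_mc += amount_mc

def buy_plan(ledger, purchase_id, price_mc, grant_mc, fail_grant=False):
    """Two-step purchase with keys derived from a single purchase_id."""
    ledger.debit(f"usage:plan-buy-{purchase_id}", price_mc)
    ledger.grant(f"topup:plan-grant-{purchase_id}", grant_mc, fail=fail_grant)

ledger = FakeLedger(wallet_mc=500_000)
try:
    buy_plan(ledger, "p1", 100_000, 200_000, fail_grant=True)  # step 2 fails
except ConnectionError:
    buy_plan(ledger, "p1", 100_000, 200_000)  # retry with the same purchase_id

print(ledger.wallet_mc, ledger.plan_mc)  # 400000 200000
```

The wallet is debited exactly once even though `buy_plan` ran twice, because the retry reused the same keys.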

See also: [Credits](/docs/concepts/credits), [Burn-Down Order](/docs/concepts/credits#burn-down-order), [Entitlements](/docs/concepts/entitlements).

## Concepts Used

- [Topups & Wallets](/docs/concepts/topups-and-wallets)
- [Credits](/docs/concepts/credits)
- [Metering](/docs/concepts/metering)
- [Entitlements](/docs/concepts/entitlements)
