Entitlements
How to check whether a customer can perform an action, how entitlement results are computed, and how caching keeps checks fast on the hot path.
An entitlement check is the bouncer at the door. Before your app starts an expensive operation, it asks QuotaStack "can this customer afford this?" and the answer comes back in milliseconds.
An entitlement check answers one question: “Can this customer perform this action right now?”
You call it before starting work — before generating an image, sending a message, making an API call. The response tells you whether to proceed, how much it will cost, and how many more units the customer can afford.
The check endpoint
Two URL forms — pick the one that matches the ID you have (see Customer identification):
GET /v1/customers/{customer_id}/entitlements/{billable_metric_key}?units=N
GET /v1/customer-by-external-id/{external_id}/entitlements/{billable_metric_key}?units=N
| Parameter | Location | Required | Default | Description |
|---|---|---|---|---|
| customer_id / external_id | path | yes | n/a | The customer to check, in either ID form. |
| billable_metric_key | path | yes | n/a | The metric key (e.g. chat_message, look, api_call). |
| units | query | no | 1 | How many units the customer wants to consume. |
Example request
Check if a customer can generate 1 outfit look (which costs 1,000 mc per the metering rule). Using the external-id form:
curl "https://api.quotastack.io/v1/customer-by-external-id/user_abc/entitlements/look?units=1" \
-H "X-API-Key: qs_live_..."
Example response
{
  "allowed": true,
  "customer_id": "019d6258-07ba-7418-83be-58f5fde53e4e",
  "external_customer_id": "user_abc",
  "billable_metric_key": "look",
  "units": 1,
  "balance": 150000,
  "reserved_balance": 10000,
  "effective_balance": 140000,
  "estimated_cost": 1000,
  "balance_after": 139000,
  "subscription_status": "active",
  "overage_policy": "block"
}
Response fields
| Field | Type | Description |
|---|---|---|
| allowed | boolean | Whether the customer can perform the action. |
| balance | int64 | Total millicredits in the account. |
| reserved_balance | int64 | Millicredits held by active reservations. |
| effective_balance | int64 | balance - reserved_balance. The usable amount. |
| estimated_cost | int64 | Millicredits this operation would cost, computed from the active metering rule. |
| balance_after | int64 | effective_balance - estimated_cost. Can be negative if the overage policy allows it. |
| subscription_status | string or null | The customer’s subscription status (active, trialing, overdue, etc.), or null if no subscription. |
| overage_policy | string | Tenant-level policy: block (deny when insufficient), allow (permit overage), or notify (permit but flag). |
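The two derived fields can be checked against the example response with a few lines of Python. This is only a sketch of the arithmetic the table describes; the helper name `derived_balances` is made up for illustration and is not part of any QuotaStack SDK.

```python
def derived_balances(balance: int, reserved_balance: int, estimated_cost: int) -> dict:
    """Recompute the derived response fields (all values in millicredits)."""
    # effective_balance is what remains after active reservations are set aside.
    effective_balance = balance - reserved_balance
    # balance_after may go negative when the overage policy permits it.
    balance_after = effective_balance - estimated_cost
    return {"effective_balance": effective_balance, "balance_after": balance_after}


# Values taken from the example response above.
print(derived_balances(balance=150_000, reserved_balance=10_000, estimated_cost=1_000))
```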
When is allowed true?
- If effective_balance >= estimated_cost, the customer has enough credits. allowed = true.
- If the overage policy is allow or notify, allowed = true regardless of balance. When the balance is insufficient, the balance_after field will be negative, signaling overage.
- If the customer’s subscription is overdue and the plan variant has allow_usage_while_overdue = false, allowed = false even if balance is sufficient.
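The three rules above can be sketched as a single function. This is an illustrative model, not QuotaStack's implementation. It assumes the overdue rule is evaluated first and therefore also overrides an allow or notify overage policy; the text only states that it overrides a sufficient balance. The function name `is_allowed` is hypothetical.

```python
from typing import Optional


def is_allowed(
    effective_balance: int,
    estimated_cost: int,
    overage_policy: str,
    subscription_status: Optional[str],
    allow_usage_while_overdue: bool,
) -> bool:
    """Model of the allowed decision described in the rules above."""
    # Assumption: the overdue block takes precedence over every other rule.
    if subscription_status == "overdue" and not allow_usage_while_overdue:
        return False
    # allow and notify permit the action regardless of balance.
    if overage_policy in ("allow", "notify"):
        return True
    # block: the customer must be able to cover the full estimated cost.
    return effective_balance >= estimated_cost
```

For the example response (effective balance 140,000 mc, cost 1,000 mc, policy block, status active) this returns True.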
Configuring overage policy
overage_policy is set at the tenant level and can be overridden per-customer:
# Tenant default
curl -X PATCH https://api.quotastack.io/v1/tenants/{tenant_id}/config \
-H "X-API-Key: qs_live_..." \
-H "Idempotency-Key: config-overage:{tenant_id}" \
-H "Content-Type: application/json" \
-d '{ "overage_policy": "block" }'
# Per-customer override
curl -X PATCH https://api.quotastack.io/v1/customer-by-external-id/user_abc \
-H "X-API-Key: qs_live_..." \
-H "Content-Type: application/json" \
-d '{ "overage_policy": "allow" }'
When a customer’s override is null, the tenant default applies. Overage policy applies to consumption (usage events) and entitlement checks — it does not apply to manual grant or adjust operations, which are always strict.
Customers without any active subscription still have an overage_policy; it’s a tenant/customer property independent of subscriptions.
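The fallback from customer override to tenant default amounts to a null check. A minimal sketch, with a hypothetical function name:

```python
from typing import Optional


def effective_overage_policy(tenant_default: str, customer_override: Optional[str]) -> str:
    """Return the policy applied to a customer's usage events and entitlement checks."""
    # A null (None) override means the tenant default applies.
    return tenant_default if customer_override is None else customer_override
```

With the two PATCH calls above applied, `effective_overage_policy("block", "allow")` yields "allow" for user_abc, while every customer without an override keeps "block".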
How cost is computed
The entitlement check looks up the active metering rule for the given billable_metric_key and computes cost based on the rule’s cost_type:
| Cost type | Formula | Example |
|---|---|---|
| flat | base_cost (fixed, ignores units) | Plan purchase: base_cost = 99000 mc. Checking 1 unit costs 99,000 mc. |
| per_unit | unit_cost * units | Chat message: unit_cost = 1000 mc. Checking 5 units costs 5,000 mc. |
| tiered | Graduated or volume pricing across tiers | See metering rules for details. |
If no active metering rule exists for the metric key, the check returns a 404.
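As a rough illustration of the flat and per_unit formulas in the table, the sketch below computes the cost in millicredits. The function and its parameters are hypothetical. Tiered pricing is deliberately left out because its details live in the metering rules documentation.

```python
def estimated_cost(cost_type: str, units: int, base_cost: int = 0, unit_cost: int = 0) -> int:
    """Compute the cost in millicredits for the flat and per_unit cost types."""
    if cost_type == "flat":
        # Fixed cost: the units argument is ignored.
        return base_cost
    if cost_type == "per_unit":
        return unit_cost * units
    if cost_type == "tiered":
        # Not modeled here; see the metering rules documentation.
        raise NotImplementedError("tiered pricing is not covered by this sketch")
    raise ValueError(f"unknown cost_type: {cost_type!r}")


print(estimated_cost("flat", units=1, base_cost=99_000))
print(estimated_cost("per_unit", units=5, unit_cost=1_000))
```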
Bulk entitlement check
Retrieve entitlements for all active metrics at once. Two URL forms:
GET /v1/customers/{customer_id}/entitlements
GET /v1/customer-by-external-id/{external_id}/entitlements
curl https://api.quotastack.io/v1/customer-by-external-id/user_abc/entitlements \
-H "X-API-Key: qs_live_..."
Response:
{
  "customer_id": "019d6258-07ba-7418-83be-58f5fde53e4e",
  "external_customer_id": "user_abc",
  "environment": "live",
  "balance": 150000,
  "reserved_balance": 10000,
  "effective_balance": 140000,
  "subscription_status": "active",
  "plan_name": "Pro",
  "entitlements": {
    "look": {
      "billable_metric_key": "look",
      "allowed": true,
      "estimated_cost_per_unit": 1000,
      "affordable_units": 140,
      "cost_type": "per_unit"
    },
    "chat_message": {
      "billable_metric_key": "chat_message",
      "allowed": true,
      "estimated_cost_per_unit": 500,
      "affordable_units": 280,
      "cost_type": "per_unit"
    }
  },
  "cached_at": "2025-01-15T10:30:00Z"
}
Each entry in the entitlements map includes:
| Field | Type | Description |
|---|---|---|
| allowed | boolean | Whether the customer can perform at least 1 unit. |
| estimated_cost_per_unit | int64 | Millicredits for a single unit of this metric. For tiered rules, this is the cost of the next unit at the customer’s current tier position (see caveat below). |
| affordable_units | int64 | How many units the customer can afford at the current balance, computed as effective_balance / estimated_cost_per_unit. For flat rules: either 1 or 0. For a free metric (cost 0): int64 max. |
| cost_type | string | The metering rule type: flat, per_unit, or tiered. |
Caveat for tiered rules: affordable_units assumes the per-unit cost stays constant. Crossing a tier boundary during consumption changes the rate, so the actual number of affordable units may be higher (cheaper upper tier) or lower (more expensive upper tier). Treat it as a guide, not a guarantee.
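The affordable_units rules can be reproduced with integer arithmetic. The sketch below makes three assumptions that the text does not confirm: the division rounds down, a negative effective balance yields 0, and for flat rules the per-unit cost equals the rule's base cost. The function name is hypothetical.

```python
INT64_MAX = 2**63 - 1


def affordable_units(effective_balance: int, cost_per_unit: int, cost_type: str) -> int:
    """Estimate how many units a customer can afford (costs in millicredits)."""
    if cost_per_unit == 0:
        # Free metric: reported as int64 max.
        return INT64_MAX
    if cost_type == "flat":
        # Flat rules are all or nothing.
        return 1 if effective_balance >= cost_per_unit else 0
    # Assumption: floor division, clamped at zero for negative balances.
    return max(0, effective_balance // cost_per_unit)


# Values from the bulk response above.
print(affordable_units(140_000, 1_000, "per_unit"))
print(affordable_units(140_000, 500, "per_unit"))
```

The two calls reproduce the 140 and 280 shown for look and chat_message. For tiered rules the same caveat applies: the result assumes a constant per-unit cost.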
Latency and freshness
Entitlement checks are designed for the hot path. The two endpoints trade staleness for speed differently:
| Endpoint | Freshness | Typical latency |
|---|---|---|
| Single-metric (/entitlements/{metric}) | Always live; reflects the balance at request time | 5-15ms |
| Bulk (/entitlements) | Up to 30 seconds stale | sub-1ms on the fast path, 5-15ms otherwise |
Use the single-metric endpoint on the hot path — usage-gating, reserve→check→commit flows, anywhere a few-second-stale answer would be wrong. Use the bulk endpoint for dashboards, profile screens, and other places where 30-second staleness is acceptable.
Freshness after balance changes
Any credit mutation — a usage event, topup, grant, reservation, block expiry, or adjustment — immediately refreshes the customer’s bulk-endpoint result. Your next check after a balance change sees the up-to-date numbers without waiting for the staleness window to elapse.
Forcing a fresh result
To force the bulk endpoint to skip its staleness window, send the Cache-Control: no-cache header:
curl https://api.quotastack.io/v1/customer-by-external-id/user_abc/entitlements \
-H "X-API-Key: qs_live_..." \
-H "Cache-Control: no-cache"
The single-metric endpoint always computes live — no header needed.
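For callers not using curl, the same bulk request can be built with Python's standard library. This is a sketch under assumptions: the helper name and its `fresh` flag are invented for illustration, and only the URL and headers come from the examples above. Sending the request is left to the caller.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.quotastack.io/v1"


def bulk_entitlements_request(external_id: str, api_key: str, fresh: bool = False) -> urllib.request.Request:
    """Build (but do not send) a bulk entitlements request for an external ID."""
    url = f"{BASE_URL}/customer-by-external-id/{quote(external_id, safe='')}/entitlements"
    headers = {"X-API-Key": api_key}
    if fresh:
        # Ask the bulk endpoint to skip its staleness window.
        headers["Cache-Control"] = "no-cache"
    return urllib.request.Request(url, headers=headers, method="GET")


req = bulk_entitlements_request("user_abc", "qs_live_...", fresh=True)
# To send it: urllib.request.urlopen(req)
```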
Using entitlements on the hot path
Entitlement checks are designed for the hot path. Common patterns:
Gate UI elements. Before rendering a “Generate” button, check if the user is entitled. If allowed is false, show a disabled button with an upgrade prompt.
Pre-check before expensive operations. Before kicking off an AI generation that will cost compute resources, verify the user has credits. This avoids wasting infrastructure on work you cannot charge for.
Display affordable units. Use affordable_units to show the user how many actions they have remaining: “You have 140 looks left this month.”
Determine upsell moments. When affordable_units drops below a threshold, prompt the user to purchase more credits or upgrade their plan.
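As one way to spot upsell moments, the sketch below scans a parsed bulk response for metrics whose affordable_units has fallen below a threshold. The function name and the threshold are illustrative choices; the response shape follows the bulk example earlier on this page.

```python
from typing import List


def metrics_needing_upsell(bulk_response: dict, threshold: int) -> List[str]:
    """Return metric keys whose affordable_units is below the threshold, sorted by key."""
    return sorted(
        key
        for key, entry in bulk_response["entitlements"].items()
        if entry["affordable_units"] < threshold
    )


# Trimmed copy of the bulk response shown earlier.
sample = {
    "entitlements": {
        "look": {"affordable_units": 140},
        "chat_message": {"affordable_units": 280},
    }
}
print(metrics_needing_upsell(sample, threshold=200))
```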
Example: checking if a user can generate an outfit
A fashion SaaS charges 1,000 mc (1 credit) per outfit look. Before starting the AI pipeline:
curl "https://api.quotastack.io/v1/customer-by-external-id/user_xyz/entitlements/look?units=1" \
-H "X-API-Key: qs_live_..."
If allowed is true, proceed with generation. After generation completes, record the usage event to debit the credits. If you need to hold credits during the generation, use a reservation instead.
If allowed is false, return an error to the user and suggest they purchase a credit pack or upgrade.
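The check, generate, record sequence might look like the following in application code. Everything here is a sketch: `check`, `generate`, and `record_usage` are hypothetical callables supplied by your application (for example, thin wrappers around the entitlement and usage event endpoints), and the error shape is made up.

```python
from typing import Callable


def generate_if_entitled(
    check: Callable[[], dict],
    generate: Callable[[], str],
    record_usage: Callable[[], None],
) -> dict:
    """Run the expensive operation only when the entitlement check allows it."""
    result = check()
    if not result["allowed"]:
        # Do not start work that cannot be charged for.
        return {"ok": False, "error": "insufficient_credits"}
    output = generate()
    # Debit the credits only after generation has completed.
    record_usage()
    return {"ok": True, "output": output}
```

If credits must be held for the duration of the generation, this simple flow is not enough; a reservation is the better fit, as noted above.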
Common Mistakes
The mistakes developers typically make with this concept — and what to do instead.
allowed: true as a binding promise