Spend cap
How the hard monthly spend cap is enforced and how to recover when you hit it.
The spend cap is a hard ceiling on month-to-date (MTD) spend, enforced on every chat call. You cannot accidentally overshoot it from concurrent traffic, and you cannot disable it. Picking a cap is part of onboarding.
Cap basics
- Range: $1 to $10,000 per month.
- Set during onboarding; editable any time from the console.
- Resets at the start of each billing period.
- Free credit (default $25) burns down before paid spend. The cap counts total committed cost (credit plus paid), so a $5 cap means $5 total this period, including credit.
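The credit and cap arithmetic above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the $25 default is taken from the list above.

```python
from decimal import Decimal


def split_committed(committed: Decimal,
                    free_credit: Decimal = Decimal("25")) -> tuple[Decimal, Decimal]:
    """Split committed cost into (credit_used, paid); credit burns down first."""
    credit_used = min(committed, free_credit)
    return credit_used, committed - credit_used


def headroom(cap: Decimal, committed: Decimal) -> Decimal:
    """Remaining room under the cap; credit-funded spend counts toward the cap too."""
    return max(cap - committed, Decimal("0"))
```

With a $5 cap and $5 committed, `headroom` is zero even though the whole $5 came out of free credit.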
What gets enforced per call
Each chat call:
- Is rejected with 402 onboarding_incomplete if you haven't picked a cap yet.
- Is rejected with 402 and a specific reason if your account is currently blocked (spend_cap_exceeded, payment_failed, subscription_canceled).
- Otherwise, an upper-bound cost for the call is checked against your remaining headroom. If the call would push you past the cap, you get 402 spend_cap_exceeded and the call never runs.
The upper bound is calculated from the input length plus the maximum output size you've requested. Two practical implications:
- Output size is capped at 4,096 tokens per call. A larger max_tokens in your request is silently clamped down.
- Input above 32,768 tokens rejects with 413 input_too_large before any spend is reserved.
After the call completes, the actual cost (which never exceeds the reservation) is settled against your MTD spend.
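A minimal sketch of the pre-call check described in this section, useful for estimating on the client side whether a request is likely to be accepted. The token limits come from the text; the per-token prices and the function name are hypothetical placeholders, and the onboarding and blocked-account checks are left out.

```python
from decimal import Decimal
from typing import Optional

MAX_OUTPUT_TOKENS = 4_096    # output clamp stated in the text
MAX_INPUT_TOKENS = 32_768    # input limit stated in the text

# Hypothetical per-token prices, for illustration only.
INPUT_PRICE = Decimal("0.000001")
OUTPUT_PRICE = Decimal("0.000002")


def precheck(input_tokens: int, max_tokens: int,
             cap: Decimal, committed_mtd: Decimal) -> Optional[tuple[int, str]]:
    """Return (status, code) if the call would be rejected, else None."""
    if input_tokens > MAX_INPUT_TOKENS:
        return 413, "input_too_large"  # rejected before any spend is reserved
    clamped = min(max_tokens, MAX_OUTPUT_TOKENS)  # oversized max_tokens is clamped
    upper_bound = input_tokens * INPUT_PRICE + clamped * OUTPUT_PRICE
    if committed_mtd + upper_bound > cap:
        return 402, "spend_cap_exceeded"
    return None
```

Because the check uses the requested maximum output rather than the eventual output, lowering max_tokens is the simplest way to fit a call into a small amount of remaining headroom.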
Mid-stream cap abort
For streaming completions, we re-check your MTD against the cap during
the response. If you cross the cap mid-stream, the SSE stream emits an
error event and closes:
event: error
data: {"code":"spend_cap_exceeded","message":"Spend cap reached during this request. Raise your cap to continue.","docs_url":"https://docs.olava.dev/billing/spend_cap_exceeded"}
Chunks emitted before the abort are billable. Subsequent chat calls
return 402 spend_cap_exceeded immediately.
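One way a client might detect the mid-stream abort is sketched below. It assumes the stream has already been decoded into text lines; the exception class, the function name, and the shape of ordinary data chunks are assumptions, and only the error event format is taken from the example above.

```python
import json
from itertools import chain
from typing import Iterable, Iterator


class SpendCapReached(Exception):
    """Raised when the stream ends with a spend_cap_exceeded error event."""


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event; raise on an error event."""
    event, data = "message", []
    # A trailing blank line flushes a final event that was not terminated.
    for raw in chain(lines, [""]):
        line = raw.rstrip("\r\n")
        if line == "":
            if data:
                payload = "\n".join(data)
                if event == "error":
                    err = json.loads(payload)
                    if err.get("code") == "spend_cap_exceeded":
                        raise SpendCapReached(err.get("message", ""))
                    raise RuntimeError(f"stream error: {payload}")
                yield payload
            event, data = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)
```

Payloads yielded before the exception correspond to chunks that were already emitted, which the text above says are billable, so a caller would typically keep them rather than discard the partial response.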
What 402 spend_cap_exceeded looks like
{
  "error": {
    "type": "billing_error",
    "code": "spend_cap_exceeded",
    "reason": "spend_cap_exceeded",
    "message": "Your monthly spend cap has been reached. Raise your cap to continue.",
    "docs_url": "https://docs.olava.dev/billing/spend_cap_exceeded",
    "resolution_url": "https://console.olava.dev/billing",
    "portal_url": "https://console.olava.dev/billing"
  }
}
The response includes a Retry-After header giving the number of seconds until your current billing period ends. Well-behaved clients should back off until then or until you raise the cap.
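A client could recognize this response and pick up the backoff hint roughly as follows. This is a sketch, not an official client: the function name and return shape are hypothetical, and it assumes the status code, headers, and raw body have already been extracted from whatever HTTP library is in use.

```python
import json
from typing import Mapping, Optional


def parse_cap_block(status: int, headers: Mapping[str, str],
                    body: str) -> Optional[dict]:
    """Return backoff details for a spend-cap 402, or None for anything else."""
    if status != 402:
        return None
    error = json.loads(body).get("error", {})
    if error.get("code") != "spend_cap_exceeded":
        return None  # other 402 reasons, e.g. payment_failed
    # Header names are case-insensitive, so normalize before the lookup.
    retry_after = {k.lower(): v for k, v in headers.items()}.get("retry-after", "")
    return {
        "retry_after": int(retry_after) if retry_after.isdigit() else None,
        "resolution_url": error.get("resolution_url"),
    }
```

Retrying before `retry_after` seconds have passed only makes sense if the cap has been raised in the meantime.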
Recovering
Two ways out:
- Raise the cap from the console. When the new cap is above your committed MTD, the block is cleared immediately so the next chat call goes through.
- Wait for the period rollover. Caps reset at the start of each billing period; Retry-After tells you when.
Editing your cap
You edit the cap from console.olava.dev/billing. A few rules:
- You can change the cap up to 10 times per 24 hours.
- For large raises (more than 2× the current cap, or more than $1,000), the console will ask you to sign in again before the change takes effect.
- Lowering the cap below your already-committed MTD blocks new chat calls immediately.
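The rules above can be expressed as a small check, sketched here under stated assumptions: the function name is hypothetical, "more than $1,000" is read as the size of the increase (the text could also mean the new cap itself), and the limit of 10 changes per 24 hours is not modeled.

```python
from decimal import Decimal

MIN_CAP, MAX_CAP = Decimal("1"), Decimal("10000")  # range stated under Cap basics


def review_cap_change(current: Decimal, new: Decimal,
                      committed_mtd: Decimal) -> dict:
    """Describe the side effects of changing the cap from current to new."""
    if not (MIN_CAP <= new <= MAX_CAP):
        raise ValueError("cap must be between $1 and $10,000")
    increase = new - current  # negative when the cap is lowered
    return {
        # Assumption: the $1,000 threshold applies to the increase amount.
        "needs_reauth": new > 2 * current or increase > Decimal("1000"),
        # Lowering below committed MTD blocks new chat calls immediately.
        "blocks_calls": new < committed_mtd,
    }
```

For example, raising a $100 cap to $250 requires signing in again because the new cap is more than twice the old one, while lowering it to $30 with $40 already committed blocks new calls.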
Cap warning emails
You'll get warning emails at the 50%, 80%, and 95% thresholds of your cap, plus a notification the moment service is paused for cap reasons. Raising the cap also triggers a confirmation email; lowering does not.
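To anticipate which warning emails a given amount of spend would trigger, the thresholds can be modeled as below. The function name is hypothetical, and treating a threshold as crossed when MTD spend reaches it exactly is an assumption the text does not confirm.

```python
from decimal import Decimal

THRESHOLDS = (Decimal("0.50"), Decimal("0.80"), Decimal("0.95"))  # from the text


def thresholds_crossed(previous_mtd: Decimal, new_mtd: Decimal,
                       cap: Decimal) -> list[int]:
    """Return the warning thresholds (in percent) crossed by this spend update."""
    return [int(t * 100) for t in THRESHOLDS
            if previous_mtd < t * cap <= new_mtd]
```

A single large call can cross more than one threshold at once; going from $40 to $85 on a $100 cap crosses both the 50% and 80% marks.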