Your first chat call

The chat completions endpoint is OpenAI-compatible. If your code already talks to https://api.openai.com/v1/chat/completions, point the base URL at https://api.olava.dev/v1, swap in an Olava API key, and your existing client should work.

Non-streaming

curl https://api.olava.dev/v1/chat/completions \
  -H "Authorization: Bearer olv_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "olava-extract",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

The response shape matches OpenAI's, including the usage block:

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "olava-extract",
  "choices": [{"message": {"role": "assistant", "content": "Hi!"}}],
  "usage": {
    "prompt_tokens": 7,
    "completion_tokens": 3,
    "total_tokens": 10
  }
}

Streaming (SSE)

Set "stream": true in the request body to receive the response as Server-Sent Events (SSE). A usage summary chunk arrives just before the final [DONE] event, so token counts are available for streamed responses too.

curl https://api.olava.dev/v1/chat/completions \
  -H "Authorization: Bearer olv_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "olava-extract",
    "messages": [{"role": "user", "content": "Stream me a haiku"}],
    "stream": true
  }'
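If you consume the stream without an SDK, the event lines need to be parsed by hand. The following is a minimal sketch, not a reference implementation. It assumes the chunks follow OpenAI's chat.completion.chunk layout (text under choices[].delta.content) and that the usage summary chunk carries a top-level "usage" object; neither detail is spelled out on this page. The function name read_sse_stream is hypothetical.

```python
import json
from typing import Iterable, Optional, Tuple


def read_sse_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Collect the streamed text and the usage summary from SSE lines."""
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        # SSE payload lines start with "data:"; skip blank separators and comments.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        # The stream ends with a literal [DONE] sentinel, which is not JSON.
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        # Assumption: the usage summary chunk has a top-level "usage" object.
        if chunk.get("usage"):
            usage = chunk["usage"]
        # Assumption: text deltas live under choices[].delta.content.
        for choice in chunk.get("choices") or []:
            delta = choice.get("delta") or {}
            if delta.get("content"):
                text_parts.append(delta["content"])
    return "".join(text_parts), usage


# Illustrative input only; these lines are made up, not captured from the API.
sample_lines = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    'data: {"choices": [], "usage": {"total_tokens": 10}}',
    "",
    "data: [DONE]",
]
text, usage = read_sse_stream(sample_lines)
```

Any iterable of decoded lines works as input, for example the line iterator of an HTTP client's streaming response.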

Python (OpenAI SDK)

import os

from openai import OpenAI

# Read the key from the environment rather than hard-coding it.
client = OpenAI(
    base_url="https://api.olava.dev/v1",
    api_key=os.environ["OLAVA_API_KEY"],
)

resp = client.chat.completions.create(
    model="olava-extract",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)  # assistant reply text
print(resp.usage.total_tokens)  # prompt + completion tokens
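Streaming works through the same SDK call with stream=True. The helper below is a sketch under assumptions: stream_chat is a hypothetical name, and the code assumes the usage summary chunk may arrive with an empty choices list, as it does with OpenAI's own API, so the index is guarded.

```python
def stream_chat(client, prompt, model="olava-extract"):
    """Stream a reply through an OpenAI-style client; return (text, usage)."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    parts = []
    usage = None
    for chunk in stream:
        # Assumption: the usage summary chunk may have no choices, so guard the index.
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
        # Keep the most recent usage object seen; most chunks carry none.
        if getattr(chunk, "usage", None) is not None:
            usage = chunk.usage
    return "".join(parts), usage
```

With the client constructed above, a call would look like text, usage = stream_chat(client, "Stream me a haiku"). To print tokens as they arrive instead of collecting them, replace parts.append(...) with a print(..., end="", flush=True).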

Node (OpenAI SDK)

import OpenAI from "openai";

// Point the SDK at Olava and read the key from the environment.
const client = new OpenAI({
  baseURL: "https://api.olava.dev/v1",
  apiKey: process.env.OLAVA_API_KEY,
});

const resp = await client.chat.completions.create({
  model: "olava-extract",
  messages: [{ role: "user", content: "Hello" }],
});

// Print the assistant's reply text.
console.log(resp.choices[0].message.content);

What can fail

401 invalid_api_key — bad or revoked key. See Authentication.
402 onboarding_incomplete — onboarding isn't finished.
402 spend_cap_exceeded — monthly cap reached. See Spend cap.
402 payment_failed — last invoice failed; fix payment from the console.
413 input_too_large — input above 32,768 tokens.
429 api_key_rate_limited — request rate exceeded. Honor Retry-After.
503 — service temporarily unavailable. Honor Retry-After and retry.

See Errors for the full catalogue.
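The 429 and 503 entries above ask clients to honor Retry-After. One way to do that is sketched below; it is an illustration, not an official client. The names call_with_retry and retry_delay are hypothetical, as are the attempt limit and the fallback delay. The sketch assumes Retry-After is given in seconds and falls back to a fixed delay when the header is missing or uses the HTTP-date form. The other errors in the list are returned to the caller without retrying.

```python
import time

# Status codes from the list above that ask the client to wait and retry.
RETRYABLE = {429, 503}


def retry_delay(headers, default_delay):
    """Return the Retry-After value in seconds, or default_delay if unusable."""
    value = headers.get("Retry-After")
    try:
        return max(0.0, float(value))
    except (TypeError, ValueError):
        # Header missing or in HTTP-date form: use the fallback delay.
        return default_delay


def call_with_retry(send, max_attempts=4, default_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send is any zero-argument callable returning (status, headers, body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        # Return immediately on success, on non-retryable errors, or on the last try.
        if status not in RETRYABLE or attempt == max_attempts:
            return status, body
        sleep(retry_delay(headers, default_delay))
```

The send callable wraps whatever HTTP client you use, which keeps the retry logic independent of the transport. Note that the header lookup here is case-sensitive when headers is a plain dict; most HTTP libraries expose a case-insensitive mapping.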