Tokens: the LLM unit
A token is a chunk of text that is usually smaller than a word but larger than a single character, although short common words are often a single token. As a working rule of thumb for English, 1 token ≈ 0.75 words, or equivalently 1,000 tokens ≈ 750 words. A typical page of English text runs about 500–600 words, so it costs roughly 670–800 tokens to process.
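The rule of thumb is easy to apply by hand, but a tiny helper makes it harder to flip the ratio by accident. This is a minimal sketch: `estimate_tokens` and `WORDS_PER_TOKEN` are illustrative names, and the 0.75 ratio is only the approximation quoted above. Real tokenizers vary by model and language.

```python
import math

# Rule-of-thumb ratio from the text; an approximation, not a tokenizer.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(word_count: int) -> int:
    """Rough token estimate for English text, rounded up."""
    return math.ceil(word_count / WORDS_PER_TOKEN)


for words in (500, 600, 750):
    print(f"{words} words -> about {estimate_tokens(words)} tokens")
```

For exact counts, use the tokenizer that belongs to the specific model you are calling. An estimate like this is only good enough for budgeting.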
Most LLM APIs split pricing into two separate rates: input tokens (what you send: your prompt, any documents, conversation history) and output tokens (what the model generates back). Output almost always costs more than input, often several times more (ratios of around 3x to 6x are common), because the model generates output one token at a time, whereas it can process the whole input in parallel.
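To see how the two rates combine, here is a small sketch of a per-request cost calculation. The rates used ($3 per million input tokens, $15 per million output tokens) are hypothetical placeholders, not any vendor's actual price, and `request_cost` is an illustrative name.

```python
def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_rate_per_million: float,
    output_rate_per_million: float,
) -> float:
    """Dollar cost of one request, billed separately for input and output."""
    input_cost = input_tokens / 1_000_000 * input_rate_per_million
    output_cost = output_tokens / 1_000_000 * output_rate_per_million
    return input_cost + output_cost


# Hypothetical rates for illustration only.
INPUT_RATE = 3.00    # dollars per million input tokens (assumed)
OUTPUT_RATE = 15.00  # dollars per million output tokens (assumed)

cost = request_cost(2_000, 500, INPUT_RATE, OUTPUT_RATE)
print(f"2,000 tokens in, 500 tokens out: ${cost:.4f}")
```

With these assumed rates, the 500 output tokens cost more than the 2,000 input tokens, which is why long responses tend to dominate the bill.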
Credits: a vendor-specific wrapper around the same idea
Several vendors (ElevenLabs is the clearest example) bill in "credits" instead of dollars or tokens directly. Credits are just a conversion layer: underneath, they still map to a real unit like characters of text or minutes of audio. Vendors do this partly so that one credit pool can cover multiple product types (text-to-speech, dubbing, sound effects) that would otherwise need separate billing systems.
The practical move is to find the vendor's stated conversion rate (e.g. "1 credit = 1 character") and convert back to a real unit yourself, so you can compare it against competitors who bill in plain dollars-per-minute.
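The conversion back to a real unit can be sketched as follows. Every number here is an assumption for illustration: a hypothetical plan priced at $22 for 100,000 credits, a stated rate of 1 credit per character, and a rough figure of 900 characters per minute of narration. Substitute the vendor's published figures and your own script's pacing.

```python
def dollars_per_minute(
    plan_price_usd: float,
    plan_credits: int,
    credits_per_char: float,
    chars_per_minute: float,
) -> float:
    """Convert a credit-based plan into an effective dollars-per-minute rate."""
    usd_per_credit = plan_price_usd / plan_credits
    return usd_per_credit * credits_per_char * chars_per_minute


# All values below are hypothetical placeholders, not a real price list.
rate = dollars_per_minute(
    plan_price_usd=22.0,
    plan_credits=100_000,
    credits_per_char=1.0,
    chars_per_minute=900.0,  # assumed speaking pace
)
print(f"Effective rate: ${rate:.3f} per minute of audio")
```

The resulting per-minute figure can then be set next to a competitor's plain dollars-per-minute rate.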
Per-second and per-clip: how video generation bills
Video generation APIs typically charge per second of output footage, though a few price by the clip regardless of exact length. Per-second rates vary enormously — roughly $0.05 to $0.75 across major providers — driven mostly by resolution and whether audio is generated natively alongside the video.
The costly mistake here isn't picking an expensive model; it's generating more seconds of AI video than a project actually needs. Mixing a few seconds of AI-generated hero footage with stock visuals or static images is, in practice, how most cost-conscious creators keep video bills manageable.
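A quick comparison shows how much the number of generated seconds matters. This sketch assumes a hypothetical rate of $0.50 per second, which sits inside the range quoted above, and a 60-second finished video. `video_cost` is an illustrative helper.

```python
def video_cost(ai_seconds: float, rate_per_second: float) -> float:
    """Cost of AI-generated footage billed per second of output."""
    return ai_seconds * rate_per_second


RATE = 0.50  # dollars per second (assumed, for illustration)

all_ai = video_cost(60, RATE)  # every second of a 60-second video generated
mixed = video_cost(8, RATE)    # 8 seconds of hero footage, the rest stock or stills

print(f"All AI footage: ${all_ai:.2f}")
print(f"Mixed approach: ${mixed:.2f}")
```

The per-second rate is identical in both cases. Only the number of billed seconds changes.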
Subscriptions vs. pay-as-you-go API access
Most vendors offer both a consumer subscription (flat monthly fee, bundled usage quota) and a developer API (metered, pay only for what you use). Subscriptions tend to be cheaper per-unit if you reliably use most of your monthly quota; API billing tends to be cheaper if your usage is bursty, unpredictable, or needs to be triggered programmatically rather than through a manual interface.
A common trap: comparing a subscription's flat fee directly against a competitor's per-unit API rate. They're not the same kind of number. Always normalize to a per-unit cost before comparing: divide the subscription price by the number of units it includes or, more realistically, by the number of units you actually expect to use.
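The normalization can be sketched in a few lines. The figures are hypothetical: a $30 per month subscription that includes 100 units, compared with an API charging $0.50 per unit. The helper divides by units actually used, since unused quota still costs money.

```python
def effective_unit_cost(monthly_fee: float, units_used: int) -> float:
    """Per-unit cost of a flat-fee subscription at a given usage level."""
    if units_used <= 0:
        raise ValueError("units_used must be positive")
    return monthly_fee / units_used


MONTHLY_FEE = 30.0    # hypothetical subscription price
INCLUDED_UNITS = 100  # hypothetical bundled quota
API_RATE = 0.50       # hypothetical pay-as-you-go price per unit

full_use = effective_unit_cost(MONTHLY_FEE, INCLUDED_UNITS)
light_use = effective_unit_cost(MONTHLY_FEE, 40)
break_even_units = MONTHLY_FEE / API_RATE

print(f"Subscription at full quota: ${full_use:.2f} per unit")
print(f"Subscription at 40 units:   ${light_use:.2f} per unit")
print(f"API becomes cheaper below {break_even_units:.0f} units per month")
```

Under these assumptions the subscription wins at full quota but loses once usage drops below the break-even point.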
Free tiers: read the fine print on commercial use
Free tiers exist almost everywhere, but two restrictions show up repeatedly and catch people off guard: no commercial use (the vendor's terms forbid using the output in anything monetized) and mandatory attribution (you must credit the vendor publicly). If you're building anything you intend to earn from, even indirectly, like a YouTube channel or a paid newsletter, check this before relying on a free tier for production content.
The cost that doesn't show up on the pricing page
Headline per-unit rates are only part of the real cost. Three things routinely double a project's effective spend without changing the rate card at all: iteration (generating 3–4 versions to get one usable result), retries on failure (some vendors don't refund failed generations), and post-processing (cheaper models often need upscaling, color correction, or manual cleanup that costs time even when it doesn't cost API dollars).
When budgeting a project, it's worth padding any headline per-unit estimate by 50–100% to account for this — the gap between a "should cost" estimate and the real bill almost always comes from here, not from the vendor changing prices.
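As a worked example of the padding rule, the sketch below turns a headline estimate into a budget range using the 50–100% margin. The project size (100 units at $0.20 each) is a made-up figure, and `padded_budget` is an illustrative name.

```python
def padded_budget(
    units: int,
    rate_per_unit: float,
    pad_low: float = 0.5,
    pad_high: float = 1.0,
) -> tuple[float, float, float]:
    """Return (headline, low, high) budget figures in dollars.

    pad_low and pad_high are the fractional margins added to cover
    iteration, failed generations, and similar hidden costs.
    """
    headline = units * rate_per_unit
    return headline, headline * (1 + pad_low), headline * (1 + pad_high)


# Hypothetical project: 100 billable units at $0.20 each.
headline, low, high = padded_budget(100, 0.20)
print(f"Headline estimate: ${headline:.2f}")
print(f"Budget range:      ${low:.2f} to ${high:.2f}")
```

Post-processing time is not captured here, since it usually costs hours rather than API dollars.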
See the current numbers
This page covers the concepts, which don't change often. For the actual current rates — which do change often — see the live pricing tables, broken out by video, voice, LLM, and image generation.