What is a token in AI?
Tokens are the fundamental unit of text in large language models. Understanding them helps you write better prompts, control costs, and avoid hitting context limits.
A token is a chunk of text that an AI language model reads and generates one piece at a time. Tokens are not the same as words, characters, or syllables. They are sub-word units produced by the model's tokenizer, which maps text onto a vocabulary of roughly 50,000–200,000 common fragments learned during training.
For most English text, one token is roughly four characters or three-quarters of a word. Common words like “the”, “is”, and “and” are each a single token. Longer or rarer words split into multiple tokens: “tokenization” becomes “token” + “ization”.
Different model families use different tokenizers with distinct vocabularies. This is why the same sentence can produce a different token count on GPT-4o (using OpenAI's o200k_base) versus Claude (using Anthropic's tokenizer) versus Gemini. The difference is not rounding — it reflects genuinely different vocabularies, and it matters when you are budgeting API costs.
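To see why vocabulary choice changes the count, here is a minimal sketch using a greedy longest-match splitter and two made-up vocabularies. Both vocabularies and the function `greedy_tokenize` are hypothetical, and real tokenizers apply learned merge rules rather than plain longest-match, so treat this only as an illustration of the effect.

```python
def greedy_tokenize(text: str, vocab: set[str]) -> list[str]:
    """Split text by repeatedly taking the longest vocabulary entry that matches.

    Characters not covered by the vocabulary fall back to single-character tokens.
    This is a simplification, not how any production tokenizer works.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until something matches.
        for end in range(len(text), i, -1):
            piece = text[i:end]
            if piece in vocab or end == i + 1:
                tokens.append(piece)
                i = end
                break
    return tokens


# Two invented vocabularies standing in for two model families.
VOCAB_A = {"token", "ization", "the", " "}
VOCAB_B = {"tok", "en", "iz", "ation", "the", " "}

text = "the tokenization"
tokens_a = greedy_tokenize(text, VOCAB_A)
tokens_b = greedy_tokenize(text, VOCAB_B)

print(len(tokens_a), tokens_a)
print(len(tokens_b), tokens_b)
```

The same string yields a different number of pieces under each vocabulary, which is the effect you see when comparing counts across providers.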
How are tokens counted? (with examples)
Tokenizers convert raw text into integer IDs from a fixed vocabulary using byte-pair encoding (BPE) or similar algorithms. The count of those IDs is the token count. Here are five practical benchmarks across different text types:
| Text | Approximate tokens |
|---|---|
| “Hello world” | ~2 tokens |
| 100-word paragraph | ~133 tokens |
| Python function (20 lines) | ~120 tokens |
| JSON object (10 fields) | ~50 tokens |
| 1,000-word article | ~1,333 tokens |
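The byte-pair encoding idea can be sketched in a few lines. This toy version starts from single characters, repeatedly merges the most frequent adjacent pair, and then maps the resulting pieces to integer IDs. It is an illustration only: production tokenizers operate on bytes, learn their merges from a large corpus ahead of time, and ship a fixed vocabulary. The function names here are hypothetical.

```python
from collections import Counter


def most_frequent_pair(tokens: list[str]) -> tuple[str, str]:
    """Return the adjacent pair of tokens that occurs most often."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0]


def merge_pair(tokens: list[str], pair: tuple[str, str]) -> list[str]:
    """Replace every non-overlapping occurrence of `pair` with one joined token."""
    merged = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def toy_bpe(text: str, num_merges: int) -> list[str]:
    """Start from characters and apply up to `num_merges` greedy pair merges."""
    tokens = list(text)
    for _ in range(num_merges):
        if len(tokens) < 2:
            break
        tokens = merge_pair(tokens, most_frequent_pair(tokens))
    return tokens


pieces = toy_bpe("low lower lowest", num_merges=3)

# Assign each distinct piece an integer ID; the token count is the number of IDs.
vocab = {piece: idx for idx, piece in enumerate(sorted(set(pieces)))}
ids = [vocab[piece] for piece in pieces]
print(pieces, ids, len(ids))
```

Each merge shortens the sequence, which is why frequent fragments end up as single tokens while rare strings stay split into several.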
Why do tokens cost money?
Language models are compute-intensive. Each token in your prompt must be attended to by every layer of the transformer on every forward pass — and generating each output token requires a separate forward pass through the entire model. Running these operations at scale requires thousands of high-end GPUs or TPUs.
API providers therefore charge per million tokens processed, split into two components:
- Input (prompt) tokens — the text you send, including any system prompt and conversation history. These are cheaper because they can be processed in a single batched forward pass.
- Output (completion) tokens — the text the model generates, one token at a time in an autoregressive loop. Generating tokens is typically 3–10 times more expensive per token than reading them.
Some models also bill for thinking tokens (internal reasoning chains) that never appear in the visible response. OpenAI's o-series models charge for these reasoning tokens at the output rate on top of the visible completion; DeepSeek R1 likewise counts its reasoning chain toward billed output tokens. These hidden tokens can double or triple the cost of a reasoning-heavy request if you are not accounting for them.
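The per-million pricing described above reduces to simple arithmetic. The sketch below assumes thinking tokens are billed at the output rate; the `Pricing` class, the `request_cost` function, and the dollar figures are all hypothetical placeholders, not any provider's actual rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per one million tokens. Values used below are made up."""
    input_per_million: float
    output_per_million: float


def request_cost(
    pricing: Pricing,
    input_tokens: int,
    output_tokens: int,
    thinking_tokens: int = 0,
) -> float:
    """Estimate the USD cost of one request.

    Assumption: thinking tokens are charged at the output-token rate.
    """
    billed_output = output_tokens + thinking_tokens
    total = (
        input_tokens * pricing.input_per_million
        + billed_output * pricing.output_per_million
    )
    return total / 1_000_000


# Hypothetical rates: $2 per million input tokens, $8 per million output tokens.
EXAMPLE = Pricing(input_per_million=2.00, output_per_million=8.00)

plain = request_cost(EXAMPLE, input_tokens=10_000, output_tokens=1_000)
reasoning = request_cost(
    EXAMPLE, input_tokens=10_000, output_tokens=1_000, thinking_tokens=4_000
)
print(f"plain: ${plain:.3f}  with thinking: ${reasoning:.3f}")
```

With these placeholder numbers, 4,000 thinking tokens more than double the cost of the request even though the visible answer is unchanged.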
How many tokens is my text?
The fastest way to find out is to paste your text into the Calculate Tokens calculator. It runs each model's actual tokenizer in your browser using WebAssembly — your text never leaves your device.
If you need a quick mental estimate:
- English prose: divide word count by 0.75 (or multiply by 1.33) to get a rough token count.
- Characters: divide character count by 4 for a rough heuristic. This is what models without a dedicated tokenizer report (marked with a ~ prefix in the calculator).
- Code: expect slightly more tokens per character than prose, since indentation, punctuation, and operator symbols each consume tokens.
Remember that the heuristic significantly underestimates token counts for non-Latin scripts (Chinese, Arabic, Hindi) and, because code uses more tokens per character than prose, tends to underestimate for dense code as well. Only an exact tokenizer call gives you a reliable number for cost estimation.
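The two rules of thumb above can be written as small helpers. The function names are hypothetical, and the results are rough estimates for English prose only, not tokenizer output.

```python
def estimate_tokens_from_words(word_count: int) -> int:
    """Rough estimate: one token is about three-quarters of an English word."""
    return round(word_count / 0.75)


def estimate_tokens_from_chars(char_count: int) -> int:
    """Rough estimate: one token is about four characters of English text."""
    return round(char_count / 4)


def estimate_tokens(text: str) -> dict[str, int]:
    """Apply both heuristics to a string so the two estimates can be compared."""
    return {
        "by_words": estimate_tokens_from_words(len(text.split())),
        "by_chars": estimate_tokens_from_chars(len(text)),
    }


print(estimate_tokens_from_words(100))
print(estimate_tokens_from_words(1_000))
print(estimate_tokens("Tokens are the fundamental unit of text in large language models."))
```

When the two estimates disagree noticeably, that is usually a sign the text is not typical English prose and an exact tokenizer count is worth the extra step.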
Token limits by model
Every model has a context window — the maximum number of tokens it can process in a single request, counting both input and output together. Exceeding this limit produces an error.
| Model | Provider | Context window |
|---|---|---|
| GPT-4o | OpenAI | 128K |
| GPT-4.1 | OpenAI | 1M |
| o4-mini | OpenAI | 200K |
| Claude Sonnet 4.6 | Anthropic | 200K |
| Claude Haiku 4.5 | Anthropic | 200K |
| Gemini 2.5 Pro | Google | 1M |
| DeepSeek V3 | DeepSeek | 128K |
| DeepSeek R1 | DeepSeek | 128K |
| Llama 4 Scout | Meta | 10M |
Context windows grow as models are updated. Check the calculator for the latest verified figures.
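Because input and output share the same window, a quick pre-flight check can catch oversized requests before they fail. This is a minimal sketch with hypothetical function names; it treats "128K" as 128,000 tokens for illustration, and the exact limit for a given model should be taken from the provider's documentation.

```python
def fits_context(input_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Return True if the prompt plus the reserved output fits in the window."""
    return input_tokens + max_output_tokens <= context_window


def max_output_budget(input_tokens: int, context_window: int) -> int:
    """Tokens left for the response after the prompt is counted (never negative)."""
    return max(0, context_window - input_tokens)


# Illustrative window size only; "128K" is assumed here to mean 128,000 tokens.
WINDOW = 128_000

print(fits_context(120_000, 4_000, WINDOW))   # prompt plus reply fits
print(fits_context(126_000, 4_000, WINDOW))   # prompt plus reply is too large
print(max_output_budget(126_000, WINDOW))     # room left for the reply
```

Reserving the output budget up front matters because a prompt that fits on its own can still fail once the requested completion length is added.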
Token cost calculator
Paste any text into the calculator to see token counts and USD costs across all major models simultaneously. Models with an available tokenizer run it directly in your browser, and heuristic estimates are marked with a ~ prefix. There is no server and no data collection.