Word & Token Counter

Count words, characters, sentences, paragraphs and reading time in real time. Estimate LLM token usage for GPT-4, Claude and Gemini — essential for prompt engineering, content planning and API cost estimation.

📊 Text Statistics

Words, Characters, Chars (no spaces), Sentences, Paragraphs, Lines, Unique Words, Avg Word Length

🤖 LLM Token Estimates

Token counts are estimates using the ~4 chars/token heuristic. Actual counts vary by model and tokenizer. Use OpenAI's tokenizer or Anthropic's token counter for exact values.

💸 Estimated API Input Cost

Cost if this text were sent as input/prompt to each API (as of 2025 pricing). Output tokens are billed separately.

📖 How to Use This Tool

Type or paste text — stats update live

View words, chars, sentences, reading time

Check LLM token estimates for 8 models

See API costs and word frequency

📝 Examples

Blog post

Input: 500-word article

Output: ~650 tokens, ~2.5 min read

Counting Words, Tokens, and Everything Between

The word and character counts update from straightforward string operations: words are found by splitting the input on whitespace and discarding empty results, while character counts are simply the string's length, computed once including whitespace and once after stripping it out. Sentence detection uses a regular expression that looks for terminal punctuation (., !, ?) followed by whitespace or the end of the string, and paragraph counting splits the text on blank lines.
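The counting rules described above can be sketched in a few lines of Python. This is an illustration of the approach as described, not the tool's actual source; the function name `text_stats` and the exact regular expressions are assumptions.

```python
import re


def text_stats(text: str) -> dict:
    """Basic text statistics using plain string operations."""
    # str.split() with no arguments splits on any whitespace run
    # and never returns empty strings.
    words = text.split()
    # Terminal punctuation followed by whitespace or the end of the string.
    sentence_ends = re.findall(r"[.!?]+(?=\s|$)", text)
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    return {
        "words": len(words),
        "characters": len(text),
        "characters_no_spaces": len(re.sub(r"\s", "", text)),
        "sentences": len(sentence_ends),
        "paragraphs": len(paragraphs),
    }


stats = text_stats("Hello world. How are you?\n\nFine!")
print(stats)
```

For the sample string, this reports 6 words, 32 characters (26 without whitespace), 3 sentences and 2 paragraphs.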

Token estimation works differently, because tokens aren't a property of the text itself; they're an artifact of whichever tokenizer a specific language model uses. Rather than implementing eight different tokenizers in the browser, this tool applies the widely cited heuristic that English text averages roughly four characters per token, adjusting that ratio slightly per model family to reflect published differences in tokenizer efficiency. Reading time follows the same pattern of informed approximation: word count divided by an assumed reading speed (200 words per minute as the default, with alternate speeds shown for slower and faster readers). It is a planning estimate, not a measurement of how long any specific person will actually take.
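A minimal sketch of both estimates follows. The family names and per-family ratios in `CHARS_PER_TOKEN` are placeholders invented for illustration; the text above does not state the adjustments the tool actually applies.

```python
import math

# Hypothetical characters-per-token ratios. Only the ~4.0 default comes
# from the text above; the other entries are made-up placeholders.
CHARS_PER_TOKEN = {"default": 4.0, "family_a": 3.5, "family_b": 4.5}


def estimate_tokens(text: str, family: str = "default") -> int:
    """Estimate tokens by dividing character count by a per-family ratio."""
    ratio = CHARS_PER_TOKEN.get(family, CHARS_PER_TOKEN["default"])
    return math.ceil(len(text) / ratio)


def reading_minutes(word_count: int, wpm: int = 200) -> float:
    """Estimate reading time as word count divided by words per minute."""
    return word_count / wpm


sample = "a" * 400
print(estimate_tokens(sample))        # 400 chars at 4 chars/token
print(reading_minutes(500))           # default 200 WPM
print(reading_minutes(500, wpm=300))  # a faster reader
```

Rounding up with `math.ceil` is a deliberate choice here: when the estimate feeds a budget check, overcounting by one token is safer than undercounting.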

Where Teams Reach for a Counter

Sizing a prompt before an LLM API call: checking the estimated token count against a model's context window before submitting a large document, codebase excerpt, or long conversation history, to avoid a silent truncation or a rejected request.
Forecasting API spend before a batch job runs: estimating token counts across a large corpus to project the input cost of an LLM-based pipeline before committing to running it against the full dataset.
Keeping runbook steps and alert descriptions readable under pressure: checking that an incident-response document's individual steps stay short enough for an on-call engineer to parse quickly at 3am, without over-trimming the document as a whole.
Meeting a minimum content-length requirement before publishing: confirming a documentation page or blog post clears a word-count threshold set by an editorial or SEO policy before it goes live.
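The spend-forecasting case above might look like the following sketch. The function name and the price used in the usage line are hypothetical; substitute the provider's current published input price.

```python
import math


def forecast_input_cost(documents, usd_per_million_tokens, chars_per_token=4.0):
    """Project total input tokens and input cost for a corpus.

    Uses the character-based heuristic, so treat the result as a rough
    planning figure, not a quote.
    """
    total_tokens = sum(math.ceil(len(doc) / chars_per_token) for doc in documents)
    cost = total_tokens / 1_000_000 * usd_per_million_tokens
    return total_tokens, cost


# 250 documents of 4,000 characters each, at a made-up price of
# 2.00 USD per million input tokens.
corpus = ["a" * 4000] * 250
tokens, cost = forecast_input_cost(corpus, usd_per_million_tokens=2.00)
print(tokens, cost)
```

Output tokens are billed separately, so a full forecast would also need an estimate of response length per document.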

The Prompt That Silently Got Cut in Half

A team building an internal tool that summarized long customer support threads using an LLM API noticed that summaries for the longest threads were consistently missing context from the early part of the conversation — the model seemed to only remember the last third of what happened. The bug wasn't in the summarization logic at all: the thread history was being concatenated into a single prompt without any token accounting, and once a thread grew past the model's context window, the API was silently truncating the beginning of the prompt to fit, rather than raising an error the team would have noticed immediately.

Because the truncation happened server-side and the API still returned a normal-looking response with a plausible summary, nothing in the application's error handling ever flagged it — the only symptom was summaries that felt subtly wrong. The fix was to estimate the token count of the assembled prompt before sending it, and either intelligently truncate or chunk-and-summarize threads that exceeded a safe margin below the actual context limit, rather than discovering the limit only when the model's output started looking suspicious.
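One way to apply that fix is to group thread messages into chunks that each stay under a token budget, then summarize chunk by chunk. The sketch below is an assumption about how such a check could be structured, not the team's actual code; `chunk_messages`, `context_limit` and `reserve_tokens` are illustrative names.

```python
import math


def estimate_tokens(text, chars_per_token=4.0):
    """Character-based token estimate, rounded up."""
    return math.ceil(len(text) / chars_per_token)


def chunk_messages(messages, context_limit, reserve_tokens):
    """Group messages, in order, into chunks that fit under the budget.

    The budget is the context limit minus a reserve that covers the
    instructions, the model's reply, and the heuristic's error.
    """
    budget = context_limit - reserve_tokens
    chunks, current, used = [], [], 0
    for message in messages:
        cost = estimate_tokens(message)
        if cost > budget:
            # Fail loudly instead of letting anything be cut silently.
            raise ValueError("a single message exceeds the token budget")
        if current and used + cost > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(message)
        used += cost
    if current:
        chunks.append(current)
    return chunks


thread = ["a" * 400] * 5  # five messages of about 100 tokens each
chunks = chunk_messages(thread, context_limit=300, reserve_tokens=50)
print([len(c) for c in chunks])
```

Raising on an oversized message is the important design choice: the failure becomes visible in error handling, which is exactly what the silent truncation never was.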

Character Heuristics vs. Real Tokenizers

The ~4-characters-per-token estimate this tool uses is deliberately a fast approximation, and it's worth knowing when that approximation is good enough and when it isn't. For rough sizing ("will this roughly fit in a 128K context window?") the heuristic is close enough to be useful instantly, with no dependency to install. For anything cost- or limit-sensitive in production, the heuristic's 20-30% margin of error (worse for code, non-English text, or heavily punctuated content) is too wide to trust, and the right tool is the model provider's actual tokenizer. OpenAI's tiktoken library gives an exact count for GPT models using the same byte-pair encoding the API itself applies, and Anthropic's API exposes a token-counting endpoint that returns Anthropic's own count for a given input without running inference.

The practical pattern is to use a character-based estimate like this tool during early development and design, then swap in the provider's real tokenizer once the pipeline is close to shipping and the margin of error actually matters.
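The "good enough or not" decision can be made mechanical by widening the estimate by its margin of error and checking both ends against the limit. This is a sketch under the 30% worst-case margin quoted above; the function name and return strings are illustrative.

```python
def heuristic_verdict(char_count, token_limit, chars_per_token=4.0, margin=0.30):
    """Decide whether the character heuristic alone settles a fit check."""
    estimate = char_count / chars_per_token
    low = estimate * (1 - margin)
    high = estimate * (1 + margin)
    if high <= token_limit:
        return "fits"  # fits even if the estimate is 30% too low
    if low > token_limit:
        return "exceeds"  # too big even if the estimate is 30% too high
    return "verify with the provider's tokenizer"


limit = 128_000
print(heuristic_verdict(200_000, limit))
print(heuristic_verdict(400_000, limit))
print(heuristic_verdict(1_000_000, limit))
```

Only the middle band, where the limit falls inside the error range, needs an exact count; inputs far from the limit are settled by the estimate alone.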

Counting Mistakes That Skew the Numbers

Pasting raw Markdown or HTML instead of plain text: link syntax, heading markers, and tags inflate the character count and skew reading-time estimates — strip markup first if you want a reader-facing number rather than a source-file number.
Treating the token estimate as exact when it's a heuristic: shipping a production prompt budget based on the ~4-chars-per-token approximation without validating against the real tokenizer risks silent truncation the estimate never warned about.
Letting abbreviations and version numbers inflate the sentence count: "e.g." followed by a space looks exactly like a sentence boundary to a terminal-punctuation heuristic, and a simpler heuristic that counts every period will also miscount strings like "v1.2.3" or "Node.js".
Assuming reading time scales linearly for dense technical content: code samples, tables, and diagrams take meaningfully longer to actually absorb than the word-count-based estimate suggests, since the underlying assumption is continuous prose reading speed.
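The abbreviation pitfall above is easy to reproduce. The sketch below compares a terminal-punctuation regex with a version that skips a small, illustrative abbreviation list; both function names and the list itself are assumptions, and a real abbreviation list would need to be much longer.

```python
import re

# Terminal punctuation followed by whitespace or the end of the string.
BOUNDARY = re.compile(r"[.!?]+(?=\s|$)")
# Illustrative only, not exhaustive.
ABBREVIATIONS = {"e.g.", "i.e.", "etc.", "vs."}


def count_sentences_naive(text):
    """Count every terminal-punctuation match as a sentence end."""
    return len(BOUNDARY.findall(text))


def count_sentences_guarded(text):
    """Skip matches whose whitespace-delimited token is a known abbreviation."""
    count = 0
    for match in BOUNDARY.finditer(text):
        preceding = text[:match.end()].split()
        token = preceding[-1].lower() if preceding else ""
        if token in ABBREVIATIONS:
            continue
        count += 1
    return count


sample = "Use a linter, e.g. ESLint, on Node.js code. Pin v1.2.3 today."
print(count_sentences_naive(sample), count_sentences_guarded(sample))
```

On the sample, the naive count is 3 and the guarded count is 2. Because the regex requires whitespace or end-of-string after the punctuation, the periods inside "Node.js" and "v1.2.3" are not counted by either version.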

Frequently Asked Questions

How are LLM tokens estimated?

Token counts are estimated using the ~4 characters per token heuristic, which holds approximately true for English prose with most modern tokenizers. The actual token count depends heavily on the specific model's tokenizer: GPT-4 uses OpenAI's cl100k_base encoding, Claude uses Anthropic's own tokenizer, and other models use their own variants. Technical text with many punctuation characters, code with brackets and operators, or non-English languages can have significantly different character-to-token ratios. For authoritative token counts when building production LLM applications, use the model provider's official tooling: OpenAI's tiktoken for GPT models, or the Anthropic API's token counting endpoint for Claude.

What is the difference between words and tokens?

Words are the human-intuitive units of text separated by spaces and punctuation — the count you would get by manually reading through a document. Tokens are sub-word units that AI language models use internally, produced by byte-pair encoding (BPE) or similar tokenisation algorithms that learn the most frequent character sequences in a training corpus. Common, short English words like "the", "is", and "cat" are typically single tokens. Rare or long words, technical terms, or words in non-English languages are often split into multiple tokens — "Kubernetes" might tokenise as Kube + rnetes, for example. On average, one English word corresponds to approximately 1.3 tokens, which is why a 1,000-word document typically contains 1,200–1,400 tokens.

How is reading time calculated?

Reading time is estimated by dividing the word count by an average reading speed expressed in words per minute (WPM). This tool uses 200 WPM as the baseline. Studies suggest that adults read prose at 200-300 WPM but slow to 100-150 WPM for highly technical content requiring close attention, so 200 WPM sits at the conservative end for prose and may still underestimate dense technical material. For a 500-word runbook entry, the estimate would be approximately 2.5 minutes. The displayed estimate should be treated as a planning guide for content strategy and readability assessment rather than a precise measurement, since individual reading speeds vary significantly based on technical familiarity with the subject matter and the density of jargon in the text.