A poster on LinkedIn highlighted the Xenova Tokenizer Playground to compare Tokenizer efficiency.
I remarked:
There is, however, a difference between what this Playground calculates and what the relevant APIs report as actually consumed (and therefore billed) input tokens. For a short German sentence:
- Xenova Playground "Claude": 13 tokens
- Claude 2.1 API: 22 tokens
- Claude 3 API: 20 tokens
- Xenova Playground "gpt-4/…": 11 tokens
- OpenAI API: 28 tokens
Note that tokenization appears to work differently for Claude 2 and Claude 3, which the Xenova Playground doesn't account for either.
(German sentence: "Was ist das englische Wort für Dokument?" — "What is the English word for document?")