tok
Fast Text Tokenization
Interfaces with the 'Hugging Face' tokenizers library to provide implementations of today's most widely used tokenizers, such as the 'Byte-Pair Encoding' algorithm <https://huggingface.co/docs/tokenizers/index>. It is extremely fast both at training new vocabularies and at tokenizing text.
Versions across snapshots
| Version | Repository | File | Size |
|---|---|---|---|
| 0.2.2 | rolling linux/jammy R-4.5 | tok_0.2.2.tar.gz | 2.4 MiB |
| 0.2.2 | rolling linux/noble R-4.5 | tok_0.2.2.tar.gz | 2.4 MiB |
| 0.2.2 | rolling source/ R- | tok_0.2.2.tar.gz | 7.7 MiB |
| 0.2.2 | latest linux/jammy R-4.5 | tok_0.2.2.tar.gz | 2.4 MiB |
| 0.2.2 | latest linux/noble R-4.5 | tok_0.2.2.tar.gz | 2.4 MiB |
| 0.2.2 | latest source/ R- | tok_0.2.2.tar.gz | 7.7 MiB |
| 0.2.2 | 2026-04-26 source/ R- | tok_0.2.2.tar.gz | 7.7 MiB |
| 0.2.2 | 2026-04-23 source/ R- | tok_0.2.2.tar.gz | 7.7 MiB |
| 0.2.1 | 2026-04-09 windows/windows R-4.5 | tok_0.2.1.zip | 2.4 MiB |