wordpiece
R Implementation of Wordpiece Tokenization
Apply 'Wordpiece' (<arXiv:1609.08144>) tokenization to input text, given an appropriate vocabulary. The 'BERT' (<arXiv:1810.04805>) tokenization conventions are used by default.
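To make the description concrete, here is a minimal sketch of greedy longest-match-first subword splitting in the style of BERT's WordPiece. It is written in Python as a self-contained illustration and is not this package's API. The function names, the toy vocabulary, the `##` continuation prefix, and the `[UNK]` token are all assumptions chosen for the example. The sketch lowercases and splits on whitespace only; punctuation splitting and other normalization steps are left out.

```python
def wordpiece_tokenize_word(word, vocab, unk_token="[UNK]", prefix="##"):
    """Greedy longest-match-first split of one word into vocabulary pieces.

    Hypothetical helper for illustration; not part of any library.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, shrinking from the right.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that continue a word carry the continuation prefix.
                candidate = prefix + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits at this position: the whole word becomes unknown.
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces


def wordpiece_tokenize(text, vocab):
    """Lowercase, split on whitespace, then tokenize each word (simplified)."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(wordpiece_tokenize_word(word, vocab))
    return tokens


# Toy vocabulary, assumed for this example only.
toy_vocab = {"un", "##aff", "##able", "token", "##ize", "##s", "[UNK]"}

print(wordpiece_tokenize("Unaffable tokenizes xyz", toy_vocab))
# ['un', '##aff', '##able', 'token', '##ize', '##s', '[UNK]']
```

The word `xyz` maps to `[UNK]` because no prefix of it appears in the toy vocabulary. A real vocabulary normally contains single-character pieces, so unknown tokens are much rarer in practice.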
Versions across snapshots
| Version | Repository | File | Size |
|---|---|---|---|
| 2.1.3 | rolling linux/jammy R-4.5 | wordpiece_2.1.3.tar.gz | 50.2 KiB |
| 2.1.3 | rolling linux/noble R-4.5 | wordpiece_2.1.3.tar.gz | 50.1 KiB |
| 2.1.3 | rolling source/ R- | wordpiece_2.1.3.tar.gz | 17.6 KiB |
| 2.1.3 | latest linux/jammy R-4.5 | wordpiece_2.1.3.tar.gz | 50.2 KiB |
| 2.1.3 | latest linux/noble R-4.5 | wordpiece_2.1.3.tar.gz | 50.1 KiB |
| 2.1.3 | latest source/ R- | wordpiece_2.1.3.tar.gz | 17.6 KiB |
| 2.1.3 | 2026-04-26 source/ R- | wordpiece_2.1.3.tar.gz | 17.6 KiB |
| 2.1.3 | 2026-04-23 source/ R- | wordpiece_2.1.3.tar.gz | 17.6 KiB |
| 2.1.3 | 2026-04-09 windows/windows R-4.5 | wordpiece_2.1.3.zip | 56.0 KiB |
| 2.1.3 | 2025-04-20 source/ R- | wordpiece_2.1.3.tar.gz | 17.6 KiB |