Crandore Hub

wordpiece

R Implementation of Wordpiece Tokenization

Apply 'Wordpiece' (<arXiv:1609.08144>) tokenization to input text, given an appropriate vocabulary. The 'BERT' (<arXiv:1810.04805>) tokenization conventions are used by default.
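
As a rough illustration of what the description above refers to, the following is a minimal Python sketch of greedy longest-match-first wordpiece splitting using BERT-style conventions (a "##" prefix on continuation pieces and "[UNK]" for words that cannot be split). It is not the package's R implementation or API. The function names, the toy vocabulary, and the simplified lowercase-and-split-on-whitespace preprocessing are all assumptions made for this example.

```python
def wordpiece_split(word, vocab, unk_token="[UNK]", prefix="##"):
    """Split one word into vocabulary pieces, longest match first (illustrative)."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it appears in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that do not start the word carry the continuation prefix.
                candidate = prefix + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits at this position: the whole word maps to the unknown token.
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces


def tokenize(text, vocab):
    """Lowercase, split on whitespace, then wordpiece-split each word (simplified)."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(wordpiece_split(word, vocab))
    return tokens


# Hypothetical toy vocabulary, for illustration only.
toy_vocab = {"[UNK]", "un", "##aff", "##able", "token", "##ize", "##r"}

print(tokenize("Unaffable tokenizer", toy_vocab))
```

A real vocabulary, such as the one shipped with a BERT checkpoint, contains tens of thousands of entries, and full BERT preprocessing also separates punctuation before the wordpiece step. Both are omitted here to keep the sketch short.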

Versions across snapshots

Version  Repository                        File                    Size
2.1.3    rolling linux/jammy R-4.5         wordpiece_2.1.3.tar.gz  50.2 KiB
2.1.3    rolling linux/noble R-4.5         wordpiece_2.1.3.tar.gz  50.1 KiB
2.1.3    rolling source/ R-                wordpiece_2.1.3.tar.gz  17.6 KiB
2.1.3    latest linux/jammy R-4.5          wordpiece_2.1.3.tar.gz  50.2 KiB
2.1.3    latest linux/noble R-4.5          wordpiece_2.1.3.tar.gz  50.1 KiB
2.1.3    latest source/ R-                 wordpiece_2.1.3.tar.gz  17.6 KiB
2.1.3    2026-04-26 source/ R-             wordpiece_2.1.3.tar.gz  17.6 KiB
2.1.3    2026-04-23 source/ R-             wordpiece_2.1.3.tar.gz  17.6 KiB
2.1.3    2026-04-09 windows/windows R-4.5  wordpiece_2.1.3.zip     56.0 KiB
2.1.3    2025-04-20 source/ R-             wordpiece_2.1.3.tar.gz  17.6 KiB

Dependencies (latest)

Imports

Suggests